Sunday, September 20, 2009

Google App Engine & Cloud Computing: Making the world flat for developers everywhere.
The playing field for developers all over the world has been levelled with the advent of cloud computing & web services. It's pretty much free for any developer to develop and deploy an application over the Internet. There is no start-up cost. Scaling up can happen incrementally based on business requirements, without upfront investment, & I think that's revolutionary.
Here are my thoughts on Google App Engine, after deploying my first sample application.

Salient features of Google App Engine
  • Google started supporting Java on their engine in April 2009 - Ref
  • Support for Google ID and sign-in
  • Automatic persistence - JDO or JPA (a standards-based approach: standard Java APIs are implemented on top of App Engine where possible, so instead of using the underlying App Engine datastore API, developers can program against Java Data Objects or the Java Persistence API)
  • Local development - remote deployment
  • Scalability and free use - pretty cheap
  • Monitoring - nice overview
  • Eclipse plug-in - I did not face any problem while deploying.
  • Limited support for JDK classes - when I tried converting my Swing application to a GWT one, I found that many of the classes I used, like java.util.Timer, were not supported
  • Native threads can't be spawned
Google unleashed App Engine with support for Python in April 2008. This was Google's first entry into on-demand application development and deployment. Developers were able to build and easily deploy apps using Python, and I think it has been successful in that. The economic impact is that cloud computing disrupts the data center world by slashing the capital and skills required to deploy a web application. I am betting on the success of this model.
About the current application: Java Typing Tutor
I created this sample application within an hour with GWT & published it. Initially I tried porting one of my Swing applications, "Java Typing Tutor", which I created long back after reading Steve Yegge's blog post Programming's Dirtiest Little Secret - BTW I don't believe that a fast typist is in general also a better programmer, but it was an interesting write-up. Whenever I get time I do intend to bring this application on par with my feature-rich Swing application.
GWT - Although I feel very comfortable coding with this framework (maybe because I spent a lot of time coding Swing apps), its API limitations are quite annoying (java.util.Timer doesn't work etc...). Initially I had the impression that converting a Swing application to a GWT application is straightforward & I even had the ambition to write a converter! (paint() for GWT widgets). But now I realize that it's not possible: the programming style and approach are quite different. Once I used the wingS framework to convert one of my Swing applications to the web - it was straightforward because of its deep integration with Swing, but apparently that project has now died. It definitely requires a different mindset to develop a web application than to develop a Swing application.
Finally some happy notes for swing programmers:
Wicket & GWT are great saviours for struggling Swing developers: with Swing GUIs losing their relevance, these two frameworks provide a great pathway to move into web development.
Overall I feel GWT & Wicket are great frameworks for Swing developers to develop web applications in terms of productivity. Rails/Grails (or any request/response framework) guys can never match component developers in terms of productivity :-)

The advantage of GWT clients (or any fat client) is that they can tap into the resources of the client PC they are running on (such as memory to store state) and thus scale much better for large numbers of users. GWT produces something that is more akin to applets: independent applications that happen to be hosted on a web page.
Wicket, on the other hand, still assumes that you want to build at least part of your application the 'old-fashioned' way, so that it will work without JavaScript, turn up in search engines, can be bookmarked and so on...
So component developers can satisfy both worlds.


Sunday, September 13, 2009

Elements of Java Coding Styles

Java Coding Styles: Here are some of the tricks which I have found useful for Java developers & have been using a lot. Again, these are personal preferences; take what you like & discard what you don't. At the end of the day the software profession is all about getting "good working software, quickly and at low cost" that can be sustained over a long time & solves the business problem at hand (with the YAGNI caveat). I want to keep updating this whenever I see interesting ones.

Prefer smaller methods
They look beautiful and are easy to understand, reuse, change & test.

'As I did 20 years ago, I still fervently believe that the only way to make software secure, reliable, and fast is to make it small.' - Andrew Tanenbaum

Dr. Venkat gives a wonderful explanation of how to convince your fellow developers to write short methods (see references).

Use JUnit test cases, assert statements and method/variable names as documentation; writing comments in Javadoc is obsolete and useless many a time.

The purpose of the assert statement is to give you a way to catch program errors early. Using assertions to state things you know (or think you know) about your program can improve readability & they are great at serving as comments. They are intended to be cheap to write - just drop them into your code any time you think of them. They are a great tool of communication for a programmer & can get rid of the dumb English stuff in the code with // & /** */.

"There is nothing as useless as doing efficiently, which should not be done at all." - Peter Drucker.

I hate to hear someone telling me to add Javadoc comments, especially for private methods. The method and variable names should be descriptive enough to reveal the intent. Fluent interfaces & design patterns can help a lot in naming variables, methods & classes. Over the years my variable & function names have become more verbose. It's so much easier to understand code from years ago when it reads like a sentence.

Big caveat with respect to assertions,

Assertions should be used to specify things that should be true at various points in your program, for providing documentation, & definitely NOT for error checking or any active code. Assertions are disabled ("turned off") by default. Assertions were introduced in Java 1.4 & can be turned on with the -enableassertions (or -ea) flag on the java command line.
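To make this concrete, here is a minimal, self-contained sketch (my own illustration, not code from the original post) of assertions used as executable comments, together with the -ea flag needed to turn them on:

public class AssertAsDocumentation {

    // Assertions state what we believe to be true at a given point,
    // replacing English comments like "list is already sorted here".
    static int binarySearch(int[] sorted, int key) {
        assert isSorted(sorted) : "caller guarantees a sorted array";
        int low = 0, high = sorted.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            if (sorted[mid] < key) low = mid + 1;
            else if (sorted[mid] > key) high = mid - 1;
            else return mid;
        }
        return -1;
    }

    private static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Assertions are off by default; run with "java -ea AssertAsDocumentation" to enable them.
        System.out.println(binarySearch(new int[]{1, 3, 5, 7}, 5)); // prints 2
    }
}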

& for JavaDoc,

Code that isn't fully documented is unfinished and potentially useless. Javadoc comments on methods and classes should normally indicate what the method or class does.

It was necessary to document the type of keys and values in a Map, as well as the Map's purpose, but with Java 5 Generics even that is no longer required. Prior to Generics I always used to code List /*String*/ list = new ArrayList();
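A tiny before/after illustration (hypothetical names, my own sketch) of how generics replace that comment-based style:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GenericsAsDocumentation {
    public static void main(String[] args) {
        // Pre-Java-5 style: the element type lives only in a comment (or the Javadoc).
        List /*String*/ namesOld = new ArrayList();
        namesOld.add("Suresh");

        // Java 5 style: the types of keys and values document themselves
        // and are checked by the compiler, so the comment is no longer needed.
        Map<Integer, String> idToName = new HashMap<Integer, String>();
        idToName.put(1, "Suresh");

        System.out.println(namesOld + " " + idToName);
    }
}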

Some Java 5 features are smart; look for smarter new libraries that exploit these features well

Use basic infrastructure libraries like Google Collections (I prefer it over Apache Commons Collections), slf4j, Guice etc… as much as possible. We learn to appreciate the value of Java 5's new features by looking into the source code of these libraries; old coding styles have to be abandoned in favor of the new features.
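As a rough sketch of that style (assuming Google Collections is on the classpath), static factory methods and immutable collections exploit Java 5 generics so the types are written only once:

import com.google.common.collect.ImmutableList;
import com.google.common.collect.Lists;
import com.google.common.collect.Maps;

import java.util.List;
import java.util.Map;

public class NewerLibraryStyle {
    public static void main(String[] args) {
        // Static factory methods infer the type parameters (a Java 5 feature),
        // so the generic types are written only once.
        List<String> names = Lists.newArrayList("Suresh", "Joshua");
        Map<String, Integer> ages = Maps.newHashMap();
        ages.put("Suresh", 30);

        // Immutable collections make the "cannot be modified" intent explicit.
        List<String> frozen = ImmutableList.of("one", "two", "three");

        System.out.println(names + " " + ages + " " + frozen);
    }
}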

Commenting out code with "if(true) return;"

Many a time we want to use a different implementation for testing or comment out code temporarily. If the method is very long it's painful to comment out the whole block. Let us say we have a method like this.

public void sendMail() {
    if (true) return;

    // Lot of code for adding the event to the database
    ...

    // Transforming text with XSL etc...
    ....
}

if (true) return; -> this one line has the same effect as commenting out everything below it :)

Use double brace initialization (for non-performance-intensive code); if possible use Google Collections in all cases.

As a Swing programmer I have been using this for quite a long time, and I have seen that quite a number of experienced Java programmers aren't aware of it. It is really useful while testing: it's concise, requires less typing & is more readable, but it comes at a cost, since each use creates an extra anonymous subclass. Google Collections provides the same feature in a better way without the performance cost.

Map<Integer, String> map = new HashMap<Integer, String>() {{
    put(1, "one");
    put(2, "two");
    put(3, "three");
    put(4, "four");
    put(5, "five");
}};
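For comparison, a sketch of the same map built with Google Collections' ImmutableMap.of (no anonymous subclass, and the result is immutable):

import com.google.common.collect.ImmutableMap;

import java.util.Map;

public class MapWithoutDoubleBrace {
    public static void main(String[] args) {
        // Same data as the double-brace example above, but without creating
        // an anonymous HashMap subclass per call site.
        Map<Integer, String> map = ImmutableMap.of(
                1, "one",
                2, "two",
                3, "three",
                4, "four",
                5, "five");
        System.out.println(map);
    }
}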

Use protected & final wherever possible

Using final does not make much sense from a performance point of view, as a modern JVM is intelligent enough to optimise on its own, but I still feel it has a lot of value in making the intent clear & in keeping multi-threading simple with immutable state.

Usage of "private" is over-rated & I think it's better to have it has protected making setting the code simpler.

Person p = new Person() {{
    age = 30;
    name = "Suresh";
}};

where age & name are protected variables.

The more you start using inner classes, the more you will get frustrated with Java's lack of "closures" & will eventually move to Groovy or Scala (like me :-))

Use the Null Object pattern & throw IllegalArgumentException wherever applicable

The Null Object provides intelligent do-nothing behaviour that helps avoid problems with null references (NPEs), the so-called "billion dollar mistake".
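A minimal sketch of the idea (the MailNotifier interface is a made-up example, not from the post): a do-nothing Null Object for an optional collaborator, and IllegalArgumentException for genuinely illegal inputs:

public class NullObjectExample {

    // Hypothetical notification abstraction used only to illustrate the pattern.
    interface MailNotifier {
        void notify(String message);

        // The Null Object: intelligent "do nothing" behaviour instead of null.
        MailNotifier NONE = new MailNotifier() {
            public void notify(String message) { /* intentionally do nothing */ }
        };
    }

    static void process(String order, MailNotifier notifier) {
        // Fail fast on a genuinely illegal argument instead of silently accepting null.
        if (order == null) {
            throw new IllegalArgumentException("order must not be null");
        }
        // ... do the real work ...
        notifier.notify("processed " + order);
    }

    public static void main(String[] args) {
        // Callers that don't care about notifications pass the Null Object,
        // so process() never needs an "if (notifier != null)" check.
        process("order-42", MailNotifier.NONE);
    }
}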

Prefer Unchecked Exception over Checked Exception

90% of the time an unchecked exception makes more sense than a checked exception. This is one of the most significant attributes of successful libraries like Spring & Hibernate. Use checked exceptions for recoverable errors (business exceptions) & unchecked for the remaining stuff.
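A small illustration of that split (the class and method names are hypothetical): a checked exception for a recoverable business condition, and a checked IOException wrapped in an unchecked exception for an infrastructure failure:

import java.io.IOException;

public class ExceptionStyle {

    // Recoverable business condition: a checked exception the caller is expected to handle.
    static class InsufficientBalanceException extends Exception {
        InsufficientBalanceException(String msg) { super(msg); }
    }

    static void withdraw(long balance, long amount) throws InsufficientBalanceException {
        if (amount > balance) {
            throw new InsufficientBalanceException("short by " + (amount - balance));
        }
    }

    // Infrastructure failure: wrap the checked IOException in an unchecked one,
    // the style popularised by libraries like Spring and Hibernate.
    static String loadTemplate(String name) {
        try {
            throw new IOException("simulated disk failure for " + name);
        } catch (IOException e) {
            throw new RuntimeException("could not load template " + name, e);
        }
    }

    public static void main(String[] args) {
        try {
            withdraw(100, 500);
        } catch (InsufficientBalanceException e) {
            System.out.println("recoverable, handled: " + e.getMessage());
        }
        try {
            loadTemplate("welcome-mail");
        } catch (RuntimeException e) {
            System.out.println("unchecked: " + e.getMessage());
        }
    }
}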

Understand Soft, Weak and Phantom references in Java

A WeakReference is a reference which doesn't have enough force to prevent the Garbage Collector (GC) from deleting the object - useful for caching.

A SoftReference behaves like a weak reference, except that the GC only clears softly reachable objects when memory is running low, so it can be used to help avoid OutOfMemoryError.

Phantom references are enqueued when objects are deleted from memory, and their get() method always returns null to prevent resurrecting the object. So phantom references are good for determining exactly when an object is deleted from memory.
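A short, runnable illustration of the first two kinds (my own sketch; behaviour after System.gc() is only a hint and may vary by JVM):

import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceKinds {
    public static void main(String[] args) {
        // A weak reference does not stop the GC from collecting the object,
        // which makes it handy for caches keyed by objects you don't own.
        WeakReference<byte[]> weak = new WeakReference<byte[]>(new byte[1024]);

        // A soft reference is only cleared when memory gets tight,
        // so it can back a memory-sensitive cache and help avoid OutOfMemoryError.
        SoftReference<byte[]> soft = new SoftReference<byte[]>(new byte[1024]);

        System.out.println("weak still alive? " + (weak.get() != null));
        System.out.println("soft still alive? " + (soft.get() != null));

        System.gc(); // only a hint; after GC the weak referent may already be gone
        System.out.println("weak after GC: " + weak.get());
    }
}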

__________________

Interesting quotes from Rock star programmers that I collected.

"Write lots of code. Have fun with it!" — Joshua Bloch

"Learn to use your tools. And I don't mean just enough to get by. I mean really learn how to use your tools." — Tor Norbye

"Don't use line numbers. Don't put your entire application in one method." — Chet Haase

"Don't be overwhelmed by the language or the platform." — Raghavan Srinivas (In the same line Neal Ford says "When you were hired by your current employer, you may think it's because of your winning personality, your dazzling smile, or your encyclopedic knowledge of Java. But it's not. You were hired for your ability to sit and concentrate for long periods of time to solve problems")

"Millions of people have been employed because someone at Sun Microsystems invented Java." - Masood Mortazavi. Master it & you will never regret for that

"There will always be opportunities for great engineers, but as I said earlier, I think the number of these opportunities will shrink as other, less technical personnel play larger roles in the software-development process, using more productive, higher-level tools and frameworks than we have used in the past." - Ben Galbraith

"Google makes finding information easier than ever, but nothing beats interacting with an expert."- Ben Galbraith. So always associate with people from whom you think you can learn.

From Pragmatic Thinking & Learning

There is no expertise without experience. It takes something on the order of ten years / 10,000 hours of practice to be an expert in a field. Deliberate, thoughtful practice is what makes the difference - not just going through the motions. Practice doesn't make perfect, but it does make permanent: neuroplasticity will cause your brain to re-wire itself according to what you do. You may not become what you dream, or what you aspire to be, but you will become what you do. Unfortunately there is no substitute for hard work.

References:

http://java.sun.com/developer/technicalArticles/Interviews/studentdevs/index.html?intcmp=2225

http://www.agiledeveloper.com/blog/PermaLink.aspx?guid=8a745e85-2a34-4d9c-8c25-ca371530e281 - How to convince your fellow developer to write short methods?

http://weblog.raganwald.com/2007/04/rails-style-creators-in-java-or-how-i.html - Rails style initializers

http://www.refactory.org/s/double_brace_initialisation/view/latest

http://c2.com/cgi/wiki?DoubleBraceInitialization

http://bwinterberg.blogspot.com/2009/09/introduction-to-google-collections.html

http://thestrangeloop.com/sessions/ghost-virtual-machine-reference-references

http://juixe.com/techknow/index.php/2009/11/18/favorite-programming-quotes-2009/

http://www.readwriteweb.com/archives/top_10_software_engineer_traits.php



Friday, June 26, 2009

ALL ABOUT UNIT TESTING

"First about Boring Theory"

"Unit Test is the smallest piece of testable part of an application" In computer programming, unit testing is a software verification and validation method where the programmer gains confidence those individual units of source code is fit for use. A unit is the smallest testable part of an application. The primary goal of unit testing is to take the smallest piece of testable software in the application, isolate it from the remainder of the code, and determine whether it behaves exactly as you expect. Each unit is tested separately before integrating them into modules to test the interfaces between modules. Unit testing has proven its value in that a large percentage of defects are identified during its use.

"In Java Unit Test cases means JUnit test cases, the single most importance of Spring & Guice (or any dependency injection framework) is to make unit testing easier"

JUnit is the de-facto framework for unit testing in the Java world. JUnit is a simple library; although there are mock objects, test code generators, behavioural test designs & many other tools based on dynamic languages, JUnit remains a viable option, especially when testing libraries or APIs where the developer is the end user. Bob Lee (author of Guice) stresses the point that the single most important benefit of a dependency injection framework, or of interface-driven design for that matter, is easier testability. Google's great applications like Gmail, Google AdSense and Calendar are great testimony to this fact.
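For readers new to it, a minimal JUnit 4 style test might look like the sketch below (StringCalculator is a made-up class under test, inlined only to keep the example self-contained):

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class StringCalculatorTest {

    // Hypothetical class under test, kept inline so the example compiles on its own.
    static class StringCalculator {
        int add(String csv) {
            if (csv == null || csv.length() == 0) return 0;
            int sum = 0;
            for (String part : csv.split(",")) sum += Integer.parseInt(part.trim());
            return sum;
        }
    }

    @Test
    public void emptyInputSumsToZero() {
        assertEquals(0, new StringCalculator().add(""));
    }

    @Test
    public void commaSeparatedNumbersAreSummed() {
        assertEquals(6, new StringCalculator().add("1, 2, 3"));
    }
}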

"Developers don't like writing unit test cases; Management needs to understand the technical debt associated with un-availability of test cases"

Let's face it, developers don't like writing unit tests & writing documentation. Kent Beck says software, like golf, is both a long and a short game. JUnit is an example of a long-game project - lots of users, stable revenue, where the key goal is just to stay ahead of the needs of the users. So it's hard to sell writing JUnit cases for small projects that don't have a long life; it's clearly avoidable overhead in such cases (most web applications). It may not make economic sense to write extensive test cases for short-lived & small applications.

In some cases developers hate to be embarrassed & look stupid when someone finds a mistake, or highly technical guys think they don't need to test their solid code. The first case can be handled through management, as it's purely a competence issue which can be sorted out through training & other means; in the second case it's hard to convince them, as these guys are very much correct in their assertions in their own way. It's an attitude problem; the best way would be to deploy someone else to write the unit test cases. A "high level of quality code" is great, yet most software lives on and on, and people expect to add/modify features in that software or debug it, so it's an economic requirement that superstars prove their code works with unit test cases. How much will the maintenance costs be without unit tests? How much more risk does that add?

"Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?" -- Brian Kernighan

"Unit testing seems to a lot of managers and developers like pure overhead, but professionally responsible developers know that it is one of the keys to quality." - Neal Ford

"Solid test cases with 100% coverage provides the courage to refactor the code & the reduces the testing effort in future"

We need to educate ourselves that it economically makes sense to have solid test cases, especially for pure library (API) providers, as the cost associated with testing & testers can be (almost) completely avoided: applications will test the APIs anyway & we can avoid duplicating the testers' effort in testing the APIs. Unfortunately these far-sighted approaches are difficult to sell to management, and it becomes a thankless job when unit-tested solid code can't be differentiated from merely working code. Developing APIs is a marathon & not a 100-metre race; stamina & perseverance play a very important role. In most cases a committed API can't be taken back - I guess the deprecated JDK APIs must be haunting and humiliating their initial designers. One of the benefits of JUnit test cases is that developers get first-hand experience as users of their own APIs.

As time goes on, there will be cases where the code works a bit less well, some minor bugs appear, and some dirty quick fix (or hack) happens. Since you don't want to touch that code, you'll put fixes/enhancements/workarounds in other parts of the code, slightly but constantly degrading the quality of your design. You won't even upgrade a dependent library, since you can't easily run regression tests over it. In a shorter time than you expect, that well-designed and well-implemented project will turn into a nightmare. So it's not just about changing code, it's about changing environment - RDBMS vendor, JDK version, library versions, OS versions…

"JUnit can be used to write End to End functional Tests & Unit test cases needs to be reviewed"

From definition, a test is not a unit test if:

  • It talks to the database
  • It communicates across the network
  • It touches the file system
  • It can't run at the same time as any of your other unit tests
  • You have to do special things to your environment (such as editing config files) to run it.

If we go by the above principle, 90% of our JUnit test cases don't pass these rules. Although POJO-driven frameworks like Spring try to solve this, I don't think we can use JUnit in its pure form. It's OK to use JUnit for functional and integration test cases as well.

"Test Before Code, or perhaps Test Before Design." – That means unit test cases need to be reviewed before the design or coding. These reviews probably should be more thorough than the code reviews itself.

"JUnit test cases can serve as great tool to document"

Probably writing documentation through Javadoc is a bad idea. Using verbose class names, method names & JUnit test cases is a more scalable & efficient way of documenting classes. Communicating through the code is the best form of communication.

"JUnit test cases have to be efficient & succinct"

Manual testing hurts both economically & in manageability. But at the end of the day, if the test cases are not capturing the correct scenarios, and worse, if we have repetitive tests, JUnit really doesn't help. The best people involved with software development need to do this type of unit testing. The garbage-in, garbage-out rule is perfectly applicable here.

I am done with all my legal points to sell unit testing. Do you guys buy this argument? :-)

My next topic on unit testing would be on patterns and anti-patterns while writing test cases.

Resource:

http://www.artima.com/weblogs/viewpost.jsp?thread=126923

http://www.theserverside.com/news/thread.tss?thread_id=51615

http://www.junit.org

http://c2.com/cgi/wiki?WhoIsUsingJunit


http://c2.com/cgi/wiki?FunctionalTest

http://www.davenicolette.net/articles/functional_tdd.html

http://www.logigear.com/newsletter/api_vs_unit.asp

http://www.exubero.com/junit/antipatterns.html

http://www.infoq.com/news/2009/06/test-or-not

http://www.agitar.com/solutions/why_unit_testing.html




Friday, April 24, 2009

Search Domain Basics: Search is probably the most pervasive technology domain of this century. Here I have tried to cover some basic concepts & some software implementation details with Java as the focus. I thought this information could help newcomers to this domain & provide enough starting pointers to dig into more details.

"Content is King" - Content is what that drives the web & "search" is the engine for that. All the services providing the content has to employ search techniques to fetch the required information with minimum possible user inputs in fastest possible way. Google/Yahoo which have become synonyms with search is implementing all the applications in relation with search one way or the other. Apparently unstructured content forms the major portion of the content available in the universe. Unstructured data (or unstructured information) refers to (usually) computerized information that either does not have a data model or has one that is not easily usable by a computer program or in simple words any data that is not represented in terms of column names in RDBMS table schema. Parallel computing, Data sharding, Schema definition, Scale of data & nature of the content acquisition related with un-structured content makes it unsuitable to be solved from 100% RDBMS solution, although full text indexing does exist in the RDBMS world.

Raw data with context is called 'information', & information search and retrieval is all about locating relevant material from a collection of raw data in the fastest way, taking the minimum possible input from the users. The ability to aid and assist a user in finding relevant information is the primary goal of information engineers & Information Retrieval (IR) libraries.
Major parts of a search engine:
Fetching/loading the documents - downloading content (lists of pages) that has been referenced.
Analysis - analyzing the database to assign priority scores to pages (PageRank) and to prioritize fetching.
Indexing - combines content from the fetcher, incoming information from the built-in data sources, and link analysis scores into a data structure that is quickly reachable, usually using cache services.
Searching - returns the set of content that ranks pages against a query using an index.
Database - keeps track of the documents along with various context information, helping to rank better.
To scale to billions of documents, all of these must be distributable, i.e., each must be able to run in parallel on multiple machines. Scaling up should happen by throwing more hardware into the pool, without massive reconfiguration. As we cannot afford to have the failure of any single component cause a major hiccup, a search solution must be able to scale easily by adding hardware, and things should largely fix themselves without human intervention. This is only possible with a stateless implementation of the software services.

Search engines with Java for unstructured content: studying an information retrieval library's APIs gives better insight into what service a search engine is capable of providing, & I am taking Lucene as the sample for that.
The Lucene library has become the de-facto IR library in the Java world; Lucene has now been ported to almost all the major languages, showcasing the popularity & capability of this small library.
Lucene is a search and retrieval library providing keyword and fielded search. It can use boolean AND, OR, and NOT to formulate complex queries & can use fuzzy logic, which is useful when searching text created by optical character recognition. Unstructured content far exceeds structured content in the web world. Lucene mainly deals with unstructured content & can effectively search structured content as well using field tags. Lucene provides the minimum required information retrieval functionality; we can call it a "search kernel" library that provides full-text search and indexing functionality. Instead of an out-of-the-box application, Lucene offers a usable API for programmers and operates at a lower level. There are off-the-shelf libraries (Compass, Nutch, Solr...) providing monitoring and transaction utilities over Lucene. Commercial enterprise search offerings include those from vendors such as Autonomy, Google, Oracle & FAST (MS).

Lucene does not search file by file. The search space is analysed first and translated into a normalised representation - the index. Lucene uses a reverse (inverted) index. All words in the index are unique, which means the index is a compressed representation of the search space. Lucene only supports plain-text files; however, a variety of free open source document parsers are available for document types such as RTF, PDF, HTML, XML, Word etc. Depending on the nature of the text content, various analysers are on offer. For example, text can be analysed with a whitespace analyzer which breaks the text into tokens separated by whitespace. To keep the response time short, the process of generating and optimising the index is separated. The index gets normalized by applying stemming and lemmatisation algorithms.

Lucene beats the RDBMS in full-text search in terms of processing speed, a manageable, reduced index footprint (about 25 percent of the source documents' size), easy incremental updates, support for index partitions, price & flexibility (index methodology, deployment options & schema evolution). The RDBMS way of searching with a WHERE clause & LIKE '%...%' is neither scalable nor efficient; although RDBMSs like Oracle include full-text indexing capabilities, they have not been as popular as independent solutions such as Lucene, & it is also not easy to implement parallel processing (map/reduce) involving multiple machines & terabytes of data. It is the dynamic, context-understanding nature of search that is bringing huge changes to the search domain, with navigating a hierarchy becoming old-fashioned. With the technical problems in this domain pretty much solved, the key thing to remember now is that "search methodology is much more important than the underlying technology".
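To give a feel for the API, here is a rough sketch of indexing and searching against the Lucene 2.x API of that era (class names and constructors vary slightly between versions, so treat this as illustrative rather than definitive):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class LuceneSketch {
    public static void main(String[] args) throws Exception {
        Directory index = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer();

        // Indexing: analyse the text and add it to the inverted index.
        IndexWriter writer = new IndexWriter(index, analyzer, true,
                IndexWriter.MaxFieldLength.UNLIMITED);
        Document doc = new Document();
        doc.add(new Field("content", "Lucene is a search and retrieval library",
                Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        writer.close();

        // Searching: parse the user query with the same analyzer and rank the hits.
        Query query = new QueryParser("content", analyzer).parse("retrieval");
        IndexSearcher searcher = new IndexSearcher(index);
        TopDocs hits = searcher.search(query, null, 10);
        System.out.println("matching documents: " + hits.totalHits);
        searcher.close();
    }
}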

Search Terminologies:
Proximity search: a search where users can specify that documents returned should have the words near each other.
Concept Search: A search for documents related conceptually to a word, rather than specifically containing the word itself. Involves parallel computing.
Boolean search: A search allowing the inclusion or exclusion of documents containing certain words through the use of operators such as AND, NOT and OR.
Stemming: The ability for a search to include the "stem" of words. For example, stemming allows a user to enter "running" and get back results also for the stem word "run."
Lemmatisation: the process of grouping together the different inflected forms of a word so they can be analysed as a single item. Lemmatisation is closely related to stemming. The difference is that a stemmer operates on a single word without knowledge of the context, and therefore cannot discriminate between words which have different meanings depending on part of speech.
Noise or stop words: conjunctions, prepositions, articles and other words such as AND, TO and A that appear often in documents yet alone may contain little meaning.
Thesaurus: A list of synonyms a search engine can use to find matches for particular words if the words themselves don't appear in documents.
Index: a normalized representation of the words in the search space.
Semantic Search: is a process used to improve online searching by using data from semantic networks to disambiguate queries and web text in order to generate more relevant results.
Web search: content is public & generic. Uses keywords and links (relevance) based on some kind of historic traffic.
Enterprise search: also contains private documents that are domain specific. Content should be of the highest quality & not necessarily popular. Information/metadata needs to be secure with role-based access to the content. It has to support security (realms, roles), SLAs and many other requirements. Google & Yahoo do not provide enterprise search.

As of now I am interested in researching the following topic in the search and retrieval field, so I will keep updating this blog with my findings.
"Automatic annotation/summary addition for content": lengthy documents/text are boring to read. It would be great if someone, or a computer, could automatically create the gist of the content. Automatic creation of annotations is a tough task, especially for non-domain-specific topics, but can be predictable in domain-specific cases. For example, it might be easier to extract the information from a judgment copy automatically (at least in routine cases that hardly require special knowledge from legal experts to annotate), or through a workflow for review with the content/documents annotated automatically. I see this as an interesting area.

Summary:
IR & the search domain is a pretty complex subject requiring mastery over algorithms & data structures. I hope I have been able to assimilate the information related to search, taking Lucene as the sample search engine library.

References:

Lucene Book : http://www.manning.com/hatcher3/

Powered by Lucene:http://wiki.apache.org/lucene-java/PoweredBy/

Google Search:http://en.wikipedia.org/wiki/Google_search

Useful wrapper libraries over Lucene.

http://lucene.apache.org/solr/features.html - Solr Features

http://www.compass-project.org/overview.html - Compass Features

Friday, April 17, 2009

Some statistics about the Java source code of popular open source libraries. Currently I am looking at/learning a big system that has multi-million lines of source code. I wrote a simple utility to extract information about Java source code, just for fun, and ran it on the "src" directories of many open source libraries as well. This Java utility takes a source code directory as input & traverses all the Java code recursively inside that directory. It collects the total number of active lines of code excluding comments, the package count...
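The original utility isn't published here, but a rough reconstruction of the idea might look like the sketch below (it only approximates comment skipping):

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.HashSet;
import java.util.Set;

// Walks a source directory, counting non-blank, non-comment-looking lines in .java files
// and the number of distinct package declarations.
public class SourceStats {
    private long lines;
    private long files;
    private final Set<String> packages = new HashSet<String>();

    void visit(File dir) throws Exception {
        File[] children = dir.listFiles();
        if (children == null) return;
        for (File f : children) {
            if (f.isDirectory()) {
                visit(f);
            } else if (f.getName().endsWith(".java")) {
                files++;
                BufferedReader in = new BufferedReader(new FileReader(f));
                String line;
                while ((line = in.readLine()) != null) {
                    String t = line.trim();
                    // Skip blank lines and lines that look like comments.
                    if (t.length() == 0 || t.startsWith("//") || t.startsWith("*") || t.startsWith("/*")) continue;
                    if (t.startsWith("package ")) packages.add(t);
                    lines++;
                }
                in.close();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        SourceStats stats = new SourceStats();
        stats.visit(new File(args[0]));
        System.out.println("Total # Lines = " + stats.lines);
        System.out.println("Total # of Files = " + stats.files);
        System.out.println("Avg # Lines per file = " + (stats.files == 0 ? 0 : stats.lines / stats.files));
        System.out.println("Total # of packages = " + stats.packages.size());
    }
}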

& here is the result.

********* JDK1.5 ********
Total # Lines = 850918
Total # of Files = 6556
Avg # Lines per file = 129
Total # of packages = 368
******** Hadoop 0.18.3 ******
Total # Lines = 129742
Total # of Files = 926
Avg # Lines per file = 140
Total # of packages = 66
******* Lucene 2.4.1 *********
Total # Lines = 69606
Total # of Files = 528
Avg # Lines per file = 131
Total # of packages = 16
******** Struts *********
Total # Lines = 63707
Total # of Files = 1040
Avg # Lines per file = 61
Total # of packages = 120
********* iText 2.1.5 ***********
Total # Lines = 96830
Total # of Files = 544
Avg # Lines per file = 177
Total # of packages = 55
******* Tapestry 5.1.0.3 ******
Total # Lines = 96395
Total # of Files = 1937
Avg # Lines per file = 49
Total # of packages = 103
It's not at all surprising that one of the most prolific Java coders ("Howard") comes out best in the Java world when it comes to modularity.
********* iBatis *********
Total # Lines = 14132
Total # of Files = 202
Avg # Lines per file = 69
Total # of packages = 45
***** Hibernate 3.3.1.GA *******
Total # Lines = 173698
Total # of Files = 2102
Avg # Lines per file = 82
Total # of packages = 292

I guess these figures can be considered a baseline while reviewing the modularity of any library. Do let me know if anyone is interested in the code & in having an ANT target to analyze the source code between releases.

Tuesday, March 10, 2009

Interview with one of the best API designers in the world - Joshua Bloch.



Adding some other interesting, wise quotes on API design that I have read...
When you design user interfaces, it's a good idea to keep two principles in mind:
Users don't have the manual, and if they did, they wouldn't read it. In fact, users can't read anything, and if they could, they wouldn't want to.
The same rule applies for API designers as well:
Developers don't have the Javadocs, and if they did, they wouldn't read them. In fact, developers can't read anything other than pressing "." against an object reference in the IDE & waiting for something to select, and if they could, they wouldn't want to.
Learnability, efficiency, memorability, errors & satisfaction remain the core of good interface design.
APIs should emerge from the needs of real applications, and they should make common tasks super-easy, as the demand for quality, validated designs far exceeds our capacity to create them. In 1996 it wasn't clear we could create a sufficiently fast language without primitive types and arrays, and it wasn't clear how much boilerplate code would be required by anonymous callback classes or checked exceptions. So Java couldn't resist including primitive types, excluding closures in favour of anonymous classes & over-using checked exceptions.
Grady Booch:
"The great thing about objects is that they can be replaced." The great thing about Spring is that it helps you replace them. Flexibility is much more important than re-use.
Joshua Bloch says,
  • Public APIs are forever - one chance to get it right
  • Good code is modular–each module has an API
  • Thinking in terms of APIs improves code quality
  • Easy to use, even without documentation
  • Don’t let implementation details “leak” into API
  • Make classes and members as private as possible 
  • Make variables "final" wherever possible
Now some economics:
Economic reality suggests that buying more memory can be easier and cheaper than paying someone to debug code, & it pays more in the long run to have understandable slow code than super-fast cryptic code.
Once I heard an architect at an IBM conference (don't remember the name) give this reason for choosing Java over C++:
"The most compelling reason for adopting Java over C++ is automatic memory management. It protects the application from mediocre programmers. It eliminates many embarrassments of memory leaks & crashes that randomly occur in production." He went on to say: "So as a result we are trading inexplicable crashes for slower performance (automatic memory management & database-centric storage). This makes sure that the application at least works anyway."
"Rushing is at the root of all lack of quality" - Peter Calthorpe, architect


Monday, March 09, 2009

javaiq.in - a playground for learning Java RIAs. I am planning to expose my experimental applications with GWT, JavaFX & Flex. I will also be creating applications mixing multiple open services from various vendors (Google, Yahoo, Amazon, eBay...) to create a combined value. I guess this is the area that has huge scope.
I created this site & made it public 2 months back. It was the result of my experimenting with new techniques through sample applications/code snippets. I have always felt that the best way to sell any new technique is with a real application. It's great to have useful apps while learning new technologies/frameworks; I hate writing throw-away examples. I have been following web frameworks for the past few years and have spent a large amount of time validating/learning them. I am a Swing programmer & was always trying out samples from the Internet/books, learning (or stealing) the good code snippets that I liked & thinking of exposing them as useful apps. Having spent a considerable amount of time and energy with Swing, it was a queasy feeling in my stomach to see all my Swing programming heroes (like Chet, Romain Guy...) either go to Adobe/Google or move the JavaFX/Flex way. I have also lost faith in Swing; I don't expect to develop new pure Swing apps any more. But I guess I was able to grasp the GWT, Echo and Wicket frameworks much better than pure web MVC framework (Struts, Spring MVC...) developers who had no exposure to Swing. I guess that's the advantage I can still leverage.
Chasing entrepreneurship dream- "When you are not into inheriting money and because you do not belong to a rich family and when you aren’t an individual blessed with the talent of a sportsperson or an actor, the next best thing to do is to become an entrepreneur" - Anand Morzaria
A man is a success if he gets up in the morning and gets to bed at night, and in between he does what he wants to do - Bob Dylan
Now experts say it has become relatively cheaper to start a new web application or a startup. Moore's law has made hardware cheap; open source has made software free; the web has made marketing and distribution free; and more powerful programming languages & techniques are making development teams smaller & powerful. But actual life it is not that simpler, especially if you are fighting a lone battle.
It's a really pretty time-consuming process to create & manage applications. I went with cheap shared Java hosting, which was unable to run my "Stripes" web-framework-based application, & no one was there to help me out by providing the Tomcat logs, so I had to settle for a trimmed version with a few JSPs. It's a pretty costly affair to run Java apps in a proper manner, and a big roadblock for anybody who wants to develop/expose web applications in Java today. Sun has to fix this problem (shared hosting) if it wants to replace PHP/Ruby on Rails apps for small-scale applications. (I will have separate notes on Java hosting.) Enthusiasm to code is very hard to maintain; actually I wrote most of the applications in a few days, but it was difficult to keep the momentum. Perseverance, relentless resourcefulness (as Paul Graham calls it) is difficult to achieve without full-time dedication. Anyway it was a good exercise to learn all these limitations.
If I look back, I seriously doubt that my spending countless hours browsing blogs and learning tens of Java web frameworks has had a significant impact on how I carry out my day-to-day work or think. In fact my guess is that I have wasted >30% of the total time gleaning through useless/marketing literature links. Hopefully from now on I will be able to channel all my energy into building better context for my day-to-day work rather than learning all the Java web frameworks under the sun. BTW, all my web framework heroes (Wicket, AppFuse, Stripes, Tapestry...) whom I have been following for the past year are apparently looking for jobs themselves & are investing time in Groovy, Scala & Clojure. :-( Not really a happy situation; I guess innovation with pure Java in this space has reached a dead end. Now I want to learn/invest more in my current work so that I can become better at what I am doing currently - that's my priority #1 - & I will resist experimenting with what blog writers say about the latest technologies. One of the troubles I had disclosing this site was the public embarrassment of people knowing that I created this half-baked application that I hesitated to own. But now it's quite a decent set of useful applications that I started as my weekend project, & I have good ideas to make it still better. This will always be my low-priority work, and I will also make sure to extract the best out of what I am currently working on. Oh! Now I can call myself CTO of javaiq. Nice feeling. :-)


Friday, March 06, 2009

Popular techie words: we hear the words below a lot in discussions & blogs. I thought it would be good to have their definitions (the easier ones) & be able to explain if anyone asks "what's that?", as I also use them for buzzword compliance.

Technical Debt:
Technical Debt is a wonderful metaphor developed by Ward Cunningham to help us think about this problem. In this metaphor, doing things the quick and dirty way sets us up with a technical debt, which is similar to a financial debt. Like a financial debt, the technical debt incurs interest payments, which come in the form of the extra effort that we have to do in future development because of the quick and dirty design choice. We can choose to continue paying the interest, or we can pay down the principal by refactoring the quick and dirty design into the better design. Although it costs to pay down the principal, we gain by reduced interest payments in the future.
The metaphor also explains why it may be sensible to do the quick and dirty approach. Just as a business incurs some debt to take advantage of a market opportunity developers may incur technical debt to hit an important deadline. The all too common problem is that development organizations let their debt get out of control and spend most of their future development effort paying crippling interest payments.
Reference:
http://www.c2.com/cgi/wiki?TechnicalDebt
http://martinfowler.com/bliki/TechnicalDebt.html
http://www.youtube.com/watch?v=pqeJFYwnkjE

MapReduce:
MapReduce is a hierarchical scatter/gather operation.
MapReduce is a library that lets you adopt a particular, stylized way of programming that's easy to split among a bunch of machines. The basic idea is that you divide the job into two parts: a Map, and a Reduce. Map basically takes the problem, splits it into sub-parts, and sends the sub-parts to different machines - so all the pieces run at the same time. Reduce takes the results from the sub-parts and combines them back together to get a single answer.
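A toy, single-JVM illustration of that split (my own sketch; no Hadoop or distribution involved, purely to show the map and reduce phases of a word count):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {

    // "Map" step: turn one line into (word, 1) pairs; each line can be processed independently.
    static List<String[]> map(String line) {
        List<String[]> pairs = new ArrayList<String[]>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (word.length() > 0) pairs.add(new String[]{word, "1"});
        }
        return pairs;
    }

    // "Reduce" step: combine the partial results into one answer per word.
    static Map<String, Integer> reduce(List<String[]> allPairs) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String[] pair : allPairs) {
            Integer current = counts.get(pair[0]);
            counts.put(pair[0], (current == null ? 0 : current) + Integer.parseInt(pair[1]));
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] lines = {"the quick brown fox", "the lazy dog"};
        List<String[]> pairs = new ArrayList<String[]>();
        for (String line : lines) pairs.addAll(map(line)); // map phase (parallelisable per line)
        System.out.println(reduce(pairs));                 // reduce phase combines the results
    }
}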
Reference:
http://scienceblogs.com/goodmath/2008/01/databases_are_hammers_mapreduc.php

Cloud Computing:
With cloud computing, everything is web-based instead of desktop-based; you can access all your programs and documents from any computer that's connected to the Internet. Cloud computing makes this easier than ever before.

Wikipedia defines cloud computing as "a style of computing in which resources are provided as a service over the internet". A cloud computing user need not worry about managing a machine or service at the physical level (machine & location). Amazon SimpleDB is an example of this, as it handles operating system and database maintenance functions, SLAs & operational issues.





Sharding:
Sharding or horizontal partitioning is about splitting up data sets. If data doesn't fit on one machine then split it up into pieces, each piece is called a shard.
Sharding is used when you have too much data to fit in one single relational database.

Reference:
http://highscalability.com/sharding-hibernate-way
http://highscalability.com/unorthodox-approach-database-design-coming-shard
http://lethargy.org/~jesus/archives/95-Partitioning-vs.-Federation-vs.-Sharding.html

Refactoring:
Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior. Its heart is a series of small behavior preserving transformations

Reference:
http://www.refactoring.com/
http://en.wikipedia.org/wiki/Code_refactoring

Code Smell:
In computer programming, code smell is any symptom in the source code of a program that possibly indicates a deeper problem.
Reference:
http://en.wikipedia.org/wiki/Code_smell

Saturday, February 28, 2009

DHH - The best talk I heard on startups



I was quite impressed by DHH's (creator of the Ruby on Rails framework) speech on "The secret to making money online" when I viewed this video again today.

There are just 3 steps for this.
1. Create a great product that people like & find useful.
2. Put a price on the product & ask people to pay for it or its usage.
3. Make profits!

This is the ridiculously simple grand old rule! He goes on to say that you don't need to be a f***ing genius to achieve this.

He also came up with a nice probability analysis showing that the strike rate of success following the above simple model is much higher than with the Yahoo, Google, Facebook & YouTube way (setting up billboards/hoardings) of attracting users by providing free services & making use of web page real estate to show ads & make money. He also suggests going slow on implementing the idea, as "Finding a good cause is incredibly hard & time consuming" - Craig Newmark.

Simply superb.

I have been following many startup blogs, including Paul Graham's, but have never been impressed like this before - such simple words, but telling you the hard truth :-)
