Friday, April 17, 2009

Some statistics about Java source code of popular open source libraries

I am currently studying a big system with multiple millions of lines of source code. Just for fun, I wrote a simple utility to extract information about Java source code, and I also ran it on the "src" directories of many open source libraries. The utility takes a source directory as input and recursively traverses all the Java code inside it. It collects the total number of active lines of code (excluding comments), the package count...
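For anyone curious what such a utility could look like, here is a minimal sketch. It is not the original code: the class name SourceStats, the helper methods, and the comment handling are all assumptions made for illustration. It treats a line as "active" if it is neither blank nor entirely comment, and it does not understand comment markers that appear inside string literals. The average uses integer division, which is consistent with the averages listed in this post.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch of a source statistics utility (not the author's code).
public class SourceStats {

    private static final Pattern PACKAGE =
            Pattern.compile("^\\s*package\\s+([\\w.]+)\\s*;");

    /** Counts lines that are neither blank nor made up only of comments. */
    public static int countActiveLines(List<String> lines) {
        int count = 0;
        boolean inBlock = false; // true while inside an unclosed /* ... */
        for (String raw : lines) {
            String line = raw.trim();
            if (inBlock) {
                int end = line.indexOf("*/");
                if (end < 0) continue;          // whole line is comment
                inBlock = false;
                line = line.substring(end + 2).trim();
            }
            // Strip block comments that start the (remaining) line.
            while (line.startsWith("/*")) {
                int end = line.indexOf("*/", 2);
                if (end < 0) { inBlock = true; line = ""; break; }
                line = line.substring(end + 2).trim();
            }
            if (line.isEmpty() || line.startsWith("//")) continue;
            count++;
            // Naive check: a block comment opened after code and left unclosed.
            int lineComment = line.indexOf("//");
            if (lineComment >= 0) line = line.substring(0, lineComment);
            int open = line.lastIndexOf("/*");
            if (open >= 0 && line.indexOf("*/", open + 2) < 0) inBlock = true;
        }
        return count;
    }

    /** Returns the declared package name, or "" for the default package. */
    public static String packageOf(List<String> lines) {
        for (String line : lines) {
            Matcher m = PACKAGE.matcher(line);
            if (m.find()) return m.group(1);
        }
        return "";
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "src");
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) { // recursive traversal
            files = walk.filter(Files::isRegularFile)
                        .filter(p -> p.toString().endsWith(".java"))
                        .collect(Collectors.toList());
        }
        long totalLines = 0;
        Set<String> packages = new HashSet<>();
        for (Path file : files) {
            // ISO-8859-1 maps every byte to a character, so decoding never fails.
            List<String> lines = Files.readAllLines(file, StandardCharsets.ISO_8859_1);
            totalLines += countActiveLines(lines);
            packages.add(packageOf(lines));
        }
        long average = files.isEmpty() ? 0 : totalLines / files.size();
        System.out.println("Total # Lines = " + totalLines);
        System.out.println("Total # of Files = " + files.size());
        System.out.println("Avg # Lines per file = " + average);
        System.out.println("Total # of packages = " + packages.size());
    }
}
```

Run it as "java SourceStats path/to/src"; with no argument it falls back to a directory named "src" in the current working directory.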

And here are the results.

********* JDK1.5 ********
Total # Lines = 850918
Total # of Files = 6556
Avg # Lines per file = 129
Total # of packages = 368
******** Hadoop 0.18.3 ******
Total # Lines = 129742
Total # of Files = 926
Avg # Lines per file = 140
Total # of packages = 66
******* Lucene 2.4.1 *********
Total # Lines = 69606
Total # of Files = 528
Avg # Lines per file = 131
Total # of packages = 16
******** Struts *********
Total # Lines = 63707
Total # of Files = 1040
Avg # Lines per file = 61
Total # of packages = 120
********* iText 2.1.5 ***********
Total # Lines = 96830
Total # of Files = 544
Avg # Lines per file = 177
Total # of packages = 55
******* Tapestry 5.1.0.3 ******
Total # Lines = 96395
Total # of Files = 1937
Avg # Lines per file = 49
Total # of packages = 103
It's not at all surprising that one of the most prolific Java coders ("Howard") comes out best in the Java world when it comes to modularity.
********* iBatis *********
Total # Lines = 14132
Total # of Files = 202
Avg # Lines per file = 69
Total # of packages = 45
***** Hibernate 3.3.1.GA *******
Total # Lines = 173698
Total # of Files = 2102
Avg # Lines per file = 82
Total # of packages = 292

I guess these figures can be used as a rough benchmark when reviewing the modularity of any library. Do let me know if anyone is interested in the code, or in having an Ant target to analyze the source code between releases.
