Praveen Manvi's Technical Diary: 2011

Friday, September 30, 2011

Backward and forward compatibility with the Java serialization protocol

Java offers a convenient protocol for storing and exchanging persisted objects. In a distributed deployment where we cannot control which server versions are running at any moment, the server code has to cope with both old and new data. It has to be backward compatible as well as forward compatible, i.e. VERSION N should be able to read both VERSION N-1 and VERSION N+1 objects.

Maintaining backward compatibility is easy: Version N code can read Version N-1 data by checking a version attribute. The version check in readExternal() makes sure the reader does not try to read information that the older writer never wrote. This is straightforward with the "java.io.Externalizable" interface, which gives us full control over what we save and in which order. It becomes tricky when we try to read information created by a future version. For forward compatibility, the deserializing code has to know both what new data has been added and where in the stream it has been added.
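The backward-compatible half described above can be sketched as follows. This is a minimal illustration, not the author's code: TestData is a hypothetical stand-in for the "Test" class, the new field is appended at the end for simplicity, and old data is simulated by setting the version field to 0 before writing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class VersionedExternalSketch {

    // Hypothetical stand-in for the article's "Test" class, written as VERSION 1 code.
    public static class TestData implements Externalizable {
        static final int CURRENT_VERSION = 1;

        int version = CURRENT_VERSION;
        int number;
        String name;
        String version1Data = "n/a"; // default kept when the stream predates VERSION 1

        public TestData() { } // Externalizable requires a public no-arg constructor

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(version);
            out.writeInt(number);
            out.writeObject(name);
            if (version >= 1) { // lets the demo emit the old VERSION 0 layout
                out.writeObject(version1Data);
            }
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            version = in.readInt();
            number = in.readInt();
            name = (String) in.readObject();
            if (version >= 1) { // only read what the writer actually wrote
                version1Data = (String) in.readObject();
            }
        }
    }

    // Serializes and deserializes in memory; checked exceptions are wrapped for brevity.
    static TestData roundTrip(TestData original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TestData) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TestData old = new TestData();
        old.version = 0; // simulate data written by VERSION 0 code
        old.number = 10;
        old.name = "praveen";
        TestData read = roundTrip(old);
        // prints "0 10 praveen n/a"
        System.out.println(read.version + " " + read.number + " " + read.name + " " + read.version1Data);
    }
}
```

The same check does nothing for the opposite direction: VERSION 0 code has no branch for data it has never heard of, which is the problem the rest of this post addresses.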

The solution explained below tries to solve the problem in pure Java with little overhead. It suits systems that already rely on Java serialization through the "java.io.Externalizable" interface.

The explanation is top-down: it starts with how the API usage should look in VERSION 0, 1 and 2, and then shows how that can be achieved in code.
The challenge is to read VERSION-1 (N+1) data with the VERSION-0 (N) class. The solution only targets data written by the next version (N+1), not by arbitrary future versions; that is the case that matters for 24x7 systems when a release is backed out.
Let's take the example of a class "Test" with 2 data attributes in VERSION-0. VERSION-1 introduces a new attribute "versionData1" and VERSION-2 introduces one more, "versionData2" (the fields version1Data and version2Data in the code). To simulate the harder case, the new data is inserted in between the existing data rather than appended. In a real system there would be many classes with readExternal()/writeExternal() methods spread all over the code, but usually controlled by a single parent object.
SKIP_START and SKIP_END are marker classes used to delimit the new information, and ObjectInputWrapper is the new class that moves the cursor past the delimited data when an old class reads newer information.
"Test" is an Externalizable class that writes 3 values: VERSION, number and name. We will include a new attribute in the second version and see how a "Test" object at VERSION-0 (N) reads a VERSION-1 (N+1) object.

This plug-in architecture is quite extensible. ObjectInputWrapper can be backed by different serialization protocols such as Google Protocol Buffers or Oracle Coherence's POF, using a factory such as ObjectInputOutputProviderFactory that supplies different implementations, including custom logic to store and retrieve the information efficiently.

VERSION = 0

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
       out.writeInt(VERSION);
       out.writeInt(number);
       out.writeObject(name);
    }
    @Override
    public void readExternal(ObjectInput _in) throws IOException, ClassNotFoundException {
        ObjectInputWrapper in = new ObjectInputWrapper(_in);
        VERSION = in.readInt();
        number = in.readInt();
        name = (String) in.readObject();
    }
"Test{VERSION=0, number=10, name=praveen}"

VERSION = 1

This class has a new String attribute, version1Data, with the value "versionData1".
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
       out.writeInt(VERSION);
       out.writeInt(number);
       // Added for version1 in between
       out.writeObject(new SKIP_START());
       out.writeObject(version1Data);
       out.writeObject(new SKIP_END());
       out.writeObject(name);
    }
The SKIP_START and SKIP_END marker objects delimit the new information so that the older version can safely ignore it.

    @Override
    public void readExternal(ObjectInput _in) throws IOException, ClassNotFoundException {
        ObjectInputWrapper in = new ObjectInputWrapper(_in);
        VERSION = in.readInt();
        number = in.readInt();
        if (VERSION >= 1) {
            in.readObject(); // SKIP_START marker, safe to ignore in this version
            version1Data = (String) in.readObject();
            in.readObject(); // SKIP_END marker
        }
        name = (String) in.readObject();
    }

"Test{VERSION=1, number=10, versionData1=versionData1, name=praveen}"

VERSION = 2

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
       out.writeInt(VERSION);
       out.writeInt(number);
       out.writeObject(version1Data);
       // Added for version2 data in between
       out.writeObject(new SKIP_START());
       out.writeObject(version2Data);
       out.writeObject(new SKIP_END());
       out.writeObject(name);
    }
    @Override
    public void readExternal(ObjectInput _in) throws IOException, ClassNotFoundException {
        ObjectInputWrapper in = new ObjectInputWrapper(_in);
        VERSION = in.readInt();
        number = in.readInt();
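        if (VERSION >= 1) {
            // Only VERSION 1 streams wrap version1Data in skip markers
            if (VERSION == 1) {
                in.readObject(); // SKIP_START
            }
            version1Data = (String) in.readObject();
            if (VERSION == 1) {
                in.readObject(); // SKIP_END
            }
        }
        if (VERSION >= 2) {
            in.readObject(); // SKIP_START
            version2Data = (String) in.readObject();
            in.readObject(); // SKIP_END
        }
        name = (String) in.readObject();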
    }
"Test{VERSION=2, number=10, versionData1=versionData1, versionData2=versionData2, name=praveen}"
The skip markers around a field are only needed for one release and can be dropped in future versions. They are removed from the write side first (as VERSION 2 does for version1Data) and from the read side in subsequent versions.

The 2 key classes used in the implementation are explained here.

The ObjectInputWrapper class has a read method for each basic data type. The sample shows just two methods, readInt and readObject, for illustration. A real implementation needs final ObjectInputReaderTemplate classes for each basic data type, reused in the read methods of ObjectInputWrapper; they can be cached to avoid creating lots of objects. Profiling of this code showed that these lightweight objects are cheap to handle, and the JVM keeps the memory and CPU overhead negligible by inlining them automatically.

This class makes use of generics and anonymous inner classes to move the cursor past the new information that version N was unaware of while reading.
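The skipping idea can be sketched in a much simpler form than the class described above. The code below is an assumption-laden illustration, not the published ObjectInputWrapper: SkipStart, SkipEnd and SkippingInput are hypothetical names, only readObject() knows how to skip, the skipped data must consist of objects rather than primitives, and no generics or reader templates are involved. The two static methods stand in for a VERSION 1 writeExternal() and a VERSION 0 readExternal().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SkipMarkerSketch {

    // Hypothetical marker classes; every version, including the oldest, must ship them.
    static final class SkipStart implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    static final class SkipEnd implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Minimal stand-in for ObjectInputWrapper: only readObject() skips marked regions.
    static final class SkippingInput {
        private final ObjectInput in;

        SkippingInput(ObjectInput in) {
            this.in = in;
        }

        int readInt() throws IOException {
            return in.readInt();
        }

        Object readObject() throws IOException, ClassNotFoundException {
            Object value = in.readObject();
            while (value instanceof SkipStart) {
                Object skipped;
                do {
                    skipped = in.readObject(); // discard data this version does not know
                } while (!(skipped instanceof SkipEnd));
                value = in.readObject();
            }
            return value;
        }
    }

    // What a VERSION 1 writeExternal() would emit.
    static byte[] writeVersion1(int number, String version1Data, String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(1);
            out.writeInt(number);
            out.writeObject(new SkipStart());
            out.writeObject(version1Data);
            out.writeObject(new SkipEnd());
            out.writeObject(name);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // What a VERSION 0 readExternal() would do: it knows nothing about version1Data.
    static String readAsVersion0(byte[] data) {
        try (ObjectInputStream raw = new ObjectInputStream(new ByteArrayInputStream(data))) {
            SkippingInput in = new SkippingInput(raw);
            int version = in.readInt();
            int number = in.readInt();
            String name = (String) in.readObject();
            return "Test{VERSION=" + version + ", number=" + number + ", name=" + name + "}";
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints "Test{VERSION=1, number=10, name=praveen}"
        System.out.println(readAsVersion0(writeVersion1(10, "versionData1", "praveen")));
    }
}
```

This simplified wrapper always skips, so a reader that understands the new field would have to read the markers through the raw input instead. How the original ObjectInputWrapper makes that distinction is not covered by this sketch.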

Sample code with all the classes is published on GitHub.

Saturday, August 20, 2011

Caching exceptions in server applications

In web applications it is common to cache certain response pages for a given HTTP request. A similar approach can be used to cache exceptions in server-side applications. If we know up front that the cause of an exception will remain valid for a given amount of time, there is an opportunity to cache the exception and reduce the load on resources. This can be a huge gain where a particular piece of code does heavy work such as CPU-intensive tasks, remote calls, or database/file manipulation.

Here is a solution making use of Java generics.

For example:

    public static int mockHeavyOperation(String someData) throws FileNotFoundException, Exception {
        Thread.sleep(3000); // simulating some heavy operation
        if (true) throw new FileNotFoundException();
        return 10;
    }

can be converted into:

    public static int mockHeavyOperationWithCacheExceptionHandler(String someData) throws FileNotFoundException {
        return new ExceptionCacheTemplate<Integer, FileNotFoundException>() {
            @Override
            public Integer handle() throws Exception {
                Thread.sleep(3000); // simulating some heavy operation
                if (true) throw new FileNotFoundException("Checked Exception");
                return 10;
            }
        }.runIn(new ExceptionCacheTemplate.ExceptionKey(someData, 40000)); // cache key plus expiry interval
    }

As we can see from the code above, the 4 extra lines do the trick.
If we run the first operation 10 times it takes 30000 millis, whereas the second version takes only about 3000 millis because only the first call does the heavy work. That's a huge gain, I suppose.
1. The input that resulted in the exception is stored as part of the ExceptionKey class. The key can be any object, including a String as shown above, as long as it implements equals()/hashCode(); the runIn method works directly with a String as well. If the exception recurs within the interval (40000 above), it is thrown from the cache.
2. Only exceptions of the types declared in the generic parameters are cached.
3. A custom map can be provided to manage the expiry of cached exceptions. The default implementation manages expiry through the value provided when creating the cache key.
"public Map getCache()" can be overridden to provide a different implementation.
For example, Google's Guava library provides an awesome fluent interface to manage the expiry of keys (shown here with the MapMaker API as it was in 2011; later Guava releases moved these features to CacheBuilder).

Map<Key,Graph> graphs = new MapMaker()
       .concurrencyLevel(4)
       .weakKeys()
       .maximumSize(10000)
       .expireAfterWrite(10, TimeUnit.MINUTES)
       .makeComputingMap(
           new Function<Key, Graph>() {
             public Graph apply(Key key) {
               return createExpensiveGraph(key);
             }
           });

It can be handled at the framework level making it completely transparent to the applications.
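To make the mechanism concrete, here is a minimal sketch of how such a template could work. It is not the published ExceptionCacheTemplate: CachingTemplate, its constructor argument, the shared static map and the millisecond TTL are all assumptions made for illustration. The exception type is passed explicitly as a Class object because generic type parameters are erased at runtime.

```java
import java.io.FileNotFoundException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class ExceptionCacheSketch {

    // Hypothetical, simplified take on the idea; not the author's class.
    abstract static class CachingTemplate<T, E extends Exception> {

        private static final class Entry {
            final Exception error;
            final long expiresAtMillis;

            Entry(Exception error, long expiresAtMillis) {
                this.error = error;
                this.expiresAtMillis = expiresAtMillis;
            }
        }

        // Shared across instances because each call creates a new anonymous template.
        private static final ConcurrentMap<Object, Entry> CACHE = new ConcurrentHashMap<>();

        private final Class<E> cachedType;

        CachingTemplate(Class<E> cachedType) {
            this.cachedType = cachedType;
        }

        protected abstract T handle() throws Exception;

        T runIn(Object key, long ttlMillis) throws E {
            Entry cached = CACHE.get(key);
            if (cached != null) {
                if (cached.expiresAtMillis > System.currentTimeMillis()
                        && cachedType.isInstance(cached.error)) {
                    throw cachedType.cast(cached.error); // fail fast, skip the heavy work
                }
                CACHE.remove(key, cached); // expired, or cached for another exception type
            }
            try {
                return handle();
            } catch (Exception e) {
                if (cachedType.isInstance(e)) {
                    CACHE.put(key, new Entry(e, System.currentTimeMillis() + ttlMillis));
                    throw cachedType.cast(e);
                }
                if (e instanceof RuntimeException) {
                    throw (RuntimeException) e; // not cached, passed through unchanged
                }
                throw new IllegalStateException("Unexpected checked exception", e);
            }
        }
    }

    static final AtomicInteger heavyCalls = new AtomicInteger();

    static int heavyOperation(final String someData) throws FileNotFoundException {
        return new CachingTemplate<Integer, FileNotFoundException>(FileNotFoundException.class) {
            @Override
            protected Integer handle() throws Exception {
                heavyCalls.incrementAndGet(); // stands in for the expensive work
                throw new FileNotFoundException("missing: " + someData);
            }
        }.runIn(someData, 40000);
    }

    // Calls heavyOperation repeatedly and reports how often the heavy body really ran.
    static int heavyRunsFor(String key, int attempts) {
        int before = heavyCalls.get();
        for (int i = 0; i < attempts; i++) {
            try {
                heavyOperation(key);
            } catch (FileNotFoundException expected) {
                // thrown fresh on the first attempt, from the cache afterwards
            }
        }
        return heavyCalls.get() - before;
    }

    public static void main(String[] args) {
        System.out.println(heavyRunsFor("report.csv", 10)); // prints 1
    }
}
```

One consequence of this design is that a rethrown exception is the same object that was cached, so its stack trace points at the first failing call rather than the current one.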

The ExceptionCacheTemplate class is available on GitHub.

Sunday, July 17, 2011

Notes on TDD and Unit testing in general.
Most of the benefits of TDD are hard to quantify; it is tough to attribute better design or easier refactoring directly to the practice of TDD. It takes real practice to experience and appreciate the productivity benefits. When we are working with legacy code that is not easily testable, advocating TDD is counterproductive: refactoring such code to suit TDD is a hard sell, and taking up such a task is usually a thankless exercise. For greenfield projects TDD should be applied without a second thought; to my mind it's not debatable anymore.

Here are some notes I collected that can be used to sell TDD during discussions. For me, TDD as an effective tool of "COMMUNICATION" is the single most important point, and it should be enough on its own to adopt TDD. Fewer defects as a result of clear, executable communication should make all the stakeholders happy: developers, managers, testers, clients and finally the real users of the software.
Tests prove that your code actually works, resulting in fewer bugs, provided we cover all the scenarios. Writing tests before the code is a design activity. These tests can’t replace system and acceptance testing, but they do supplement it, and fewer bugs make it to QA.
We can improve the design without breaking it. With unit tests in place, we can do powerful refactorings that untangle the most challenging of system psychoses. Refactoring not only becomes cheaper, it changes the developer mindset to strive for better quality. When refactoring becomes cheaper, the quality of the software continuously improves. Fixing PMD and FindBugs errors should not require management approval.
Unit tests are a way to get programmers to produce documentation, since they hate writing MS Word documents, and they are also a project management technique. When we can’t remember how to use a class's APIs, we can read the unit tests to find out. MS Word documents are not meant to be compiled and deployed; the more we can avoid them, the better for everyone. Unit tests reduce the communication pain points. They are a shared language. You know what your code needs to do. Then you make it do it. Even if you don’t have a working system, you can see your code actually run and actually work. You get that great “I’ve done it!” feeling. Developers can avoid the nagging questions "Have you written it?", "Have you tested it?" and finally "Have you really tested it?". If the tests are written and published, a developer can always go and check them from a web tool.
Just try test-first if you want to be high on endorphins, proud of your work, and motivated to do more. Unit tests demonstrate concrete progress. You don’t have to wait a month for all the pieces of the system to come together. You can show progress even without a working system. Not only can you say you've written the code, you can actually demonstrate success. Of course, this is another distinction that traditional programming teaches us to ignore. “Done” doesn’t mean you’ve written the code and checked it in. “Done” means the code actually runs in the system without bugs. Running unit tests is a step closer to the latter.
Unit tests are a form of sample code. We all encounter library functions and classes we don’t know how to use, and one of the first places we go is the sample code. Sample code is documentation. But we don’t usually have samples for internal code. So we’re left sifting through the source or through the rest of the system.
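As a small illustration of tests doubling as sample code, consider the sketch below. Everything in it is hypothetical: slugify is an invented internal utility, and plain static methods are used instead of a test framework such as JUnit only to keep the example self-contained. Each test name states one rule of the API, and each body shows a call with its expected result.

```java
import java.util.Locale;

class SlugTestSketch {

    // Hypothetical internal utility with no other documentation.
    static String slugify(String title) {
        return title.toLowerCase(Locale.ROOT)
                    .replaceAll("[^a-z0-9]+", "-")   // any run of other characters becomes one dash
                    .replaceAll("^-+|-+$", "");      // no leading or trailing dashes
    }

    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected '" + expected + "' but got '" + actual + "'");
        }
    }

    // The tests below read like usage examples for slugify().
    static void lowercasesAndJoinsWordsWithDashes() {
        check("my-technical-goals", slugify("My Technical Goals"));
    }

    static void collapsesPunctuationRuns() {
        check("tdd-unit-testing", slugify("TDD & Unit testing!"));
    }

    static void blankTitleGivesEmptySlug() {
        check("", slugify("   "));
    }

    static void runAll() {
        lowercasesAndJoinsWordsWithDashes();
        collapsesPunctuationRuns();
        blankTitleGivesEmptySlug();
    }

    public static void main(String[] args) {
        runAll();
        System.out.println("all tests passed");
    }
}
```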

Test-first forces you to plan before you code. Writing the test first forces you to think through your design and what it must accomplish before you write the code, which results in better code. This not only keeps you focused, it makes for better designs.
Test-first reduces the cost of bugs. Bugs detected earlier are easier to fix. Bugs detected later are usually the result of many changes, and we don’t know which one caused the bug. So first we have to hunt for and find the bug. Then we have to refresh our memories on how the code is supposed to work, because we haven’t seen it for months. Then finally we understand enough to propose a solution.
Anything that reduces the time between when we code the bug and when we detect it seems like an obvious win. We consider ourselves lucky to find out about bugs within a few days, before the code is shipped to QA or to customers. But how about catching them within a few minutes? That’s what test-first accomplishes with the bugs it catches.
It’s even better than code inspections. Code inspections, they say, are better than testing, because using them to detect and fix bugs is cheaper than testing. After the code ships, it’s much more expensive to fix the bugs. The earlier we can detect and fix bugs, the easier and cheaper and better. That’s the advantage of having code reviews. Code inspections catch more bugs within a few days, rather than a few months.
It virtually eliminates coder’s block. Ever wonder what statement to write next? Like writer’s block, coder’s block can be a real problem. But test-first systematizes the structured part of coding, allowing you to concentrate on the creative part. You may get stuck on how to test the next bit or how to make the test pass, but you’ll never be left puzzling over where to go next. In fact, usually you’re left with the opposite problem: you know you need to take a break before you burn out, but you’re on a roll and don’t want to stop.
Failed tests make better designs. Testing a piece of code forces you to define what that code is responsible for. If you can do this easily, that means the code’s responsibility is well-defined and therefore that it has high cohesion. And if you can unit-test your code, that means you can bind it as easily to the test as to the rest of the system. Therefore, it has loose coupling to the pieces around it. High cohesion and loose coupling is the definition of good, maintainable design. Code that is easy to unit-test is also easy to maintain.
It’s faster than writing code without tests! Or to put it another way, skipping unit tests is faster, unless you actually need the code to work.
Most of the effort we spend on code, we spend fixing it after we’ve checked it in to the source-code repository. But test-first eliminates much of that waste by allowing us to get more of it right to start with and by making bugs easier to fix.
One of the real value propositions of unit tests is that we get a low-level regression-test suite: we can go back at any time and see not only what broke but where the bug is. Arguably, with the many frameworks around, it’s a low-effort way to catch bugs before the build goes off to QA. Whenever a bug comes in, it makes life a lot easier, often without the need for a debugger. Test-first catches some bugs within a few minutes instead of a few days. It is even cheaper than code inspections and code reviews.

Monday, January 17, 2011

2011 - My Technical Goals

Make JavaIQ.in useful & interesting to Java developers

My childhood ambition was to become a writer and editor of a magazine. My obsession with politics, history and technology hasn't, I guess, helped me make money, but it has always made me feel complete and happy. I am not sure that by the end of 2011 I will have made JavaIQ.in a respectable online magazine, but I will definitely give it my best shot. I would also like to refine the applications I created as weekend projects and publish them on this site.

Android - Develop mobile applications

My guess is that developing applications for smartphones and tablets such as the iPad will be the "big bet" of the coming decade, and Android is likely to lead in this area of innovation. My decade of Java programming experience should help. By the way, I bought my Android phone this year.

HTML5/JavaScript - Learn its effective usage

Somehow I never liked Flash/Flex. Although I tried JavaFX (which apparently was a waste of effort), I never spent significant time with Flash. I hope to create awesome applications with HTML5. JavaScript has been an excellent language, and I hope to do plenty of coding using YUI, jQuery and GWT/JSNI.

Play/Grails - Developing web applications in simpler way.

The dumb Struts and similar web frameworks will have to be replaced by these awesome frameworks. I have never had such a pleasant first experience with any framework as I had with Play. I intend to teach it and develop a few web applications with it this year.

JVM languages - continue to invest in learning

I will continue to invest in learning Groovy and Scala and keep looking for places where I can use them.

A happy and prosperous new year to all.
