Sunday, August 02, 2015

Java-8 For Python Developers


Abstract

I wrote this write-up for a session on Java programming semantics for Python programmers. The theme was to convey that Java (especially Java 8) is not as verbose as we (Python programmers) think :)

The Java 8 release is touted as the biggest since the language's inception in terms of paradigm shift, because it adds support for functional programming. Given the burden of backward compatibility, the Java engineering team has done a brilliant job of integrating functional programming seamlessly. Most importantly, it adds power to all the existing single-method classes and interfaces and gives the Java collections stream support. Boilerplate code should shrink drastically. This write-up, "Java 8 for Python developers", is for developers who are well versed in Python and familiar with basic Java syntax. It should provide enough orientation to navigate the Java ecosystem's libraries, tools and programming semantics.

As this is the most significant release, adding support for functional programming, many earlier assumptions about the Java language (especially its verbosity) deserve a second look. Java is a platform that supports multiple languages, and its ecosystem is surely bigger than Python's, so the focus here is only on language semantics. It is really hard for any language to stay popular for a long time. I think Java has reinvented itself well and is giving other languages a tough fight despite its age and baggage.

This post gives an overview. Later posts in the series will show how new Java 8 features such as lambdas result in less boilerplate code (method references, concurrency), how the new annotations improve decorator-style coding, and how many popular external libraries (Guava, Apache Commons, GS Collections, Joda-Time) can be dropped because most of the features they provide are now available in the JDK itself. The improved JavaScript engine that ships with the core JDK helps reuse existing JavaScript code, and in most cases it can bring Java code closer to its Python counterpart.
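As a small taste of the boilerplate reduction, here is a minimal sketch that sorts words by length twice: once with a pre-Java 8 anonymous class and once with a method reference. The class and method names (LambdaSketch, sortByLengthOldStyle, sortByLength) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class LambdaSketch {

    // Pre-Java 8 style: an anonymous class implementing the
    // single-method Comparator interface.
    static List<String> sortByLengthOldStyle(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }

    // Java 8 style: the same comparison expressed with a method reference.
    static List<String> sortByLength(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("python", "java", "go");
        System.out.println(sortByLengthOldStyle(words)); // [go, java, python]
        System.out.println(sortByLength(words));         // [go, java, python]
    }
}
```

The Python counterpart would be sorted(words, key=len); the Java 8 version is still longer, but the gap is much smaller than it used to be.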

Topics

------------
History
------------
Java:
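Created by the Canadian James Gosling and first released in 1995 (JDK 1.0 followed in January 1996), it grew out of the Oak project and was influenced by C++. It is backed by a corporation (Sun, now Oracle) and is often regarded as the #1 language/platform for developing all kinds of applications.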

Python:
Created by the Dutch programmer Guido van Rossum, first released in 1991 (version 1.0 followed in 1994) and influenced by ABC. It is a purely community-driven project, although its creator is called the "Benevolent Dictator For Life", and it is regarded as one of the most developer-friendly languages.

Both are object oriented (classes, encapsulation, inheritance, polymorphism), but Python is the more scripting- and functional-friendly language. Python is more object oriented in the sense that it has no primitives and treats everything, including functions, as an object. Java still deals with low-level primitives and is hence more performant.

---------------------
General Mapping
---------------------


Python                      Java
------                      ----
.py                         .java
.pyc                        .class
python.exe                  java.exe + javac.exe
PyPI                        Maven Central repo
PyCharm                     IntelliJ IDEA
pip                         Maven
requirements.txt            pom.xml/gradle
PYTHONPATH                  CLASSPATH
CPython                     HotSpot/JIT
Jython/IronPython/PyPy      JRockit/Oracle JDK/Dalvik



Java is a statement-oriented language: you write a statement, and when it executes, it has an effect. REPLs, by contrast, are expression-oriented: you write an expression and the REPL shows the result, like a calculator. Hence a REPL was never very useful with Java. (I remember hearing this from Scala's creator, Martin Odersky.)


Python developers will definitely miss the REPL (Read-Eval-Print Loop). Jython or groovysh serve the purpose, though, and give the same feel of interactive development.
The REPL planned for Java 9 also looks impressive.

For example:
D:\jython2.7.0\bin>jython
Jython 2.7.0 (default:9987c746f838, Apr 29 2015, 02:25:11)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_25
Type "help", "copyright", "credits" or "license" for more information.
>>> 1+1
2
>>> from java.util import Arrays as A
>>> numbers = A.asList(3,4,2,5,1)
>>> print numbers
[3, 4, 2, 5, 1]
>>> from java.util import Collections as C
>>> C.sort(numbers)
>>> print numbers
[1, 2, 3, 4, 5]
>>>

In general, Python has a Linux orientation, and in my opinion its code quality tends to be better than that of the corresponding Java code because of the people involved. It is a favorite of sysadmins and the scientific community. Python has a great many awesome libraries, more than most other languages. It is also a great glue language: performance-intensive parts can easily be moved to native code (C++), and Python has one of the best mechanisms for interfacing with it. As I heard from Jessica, a great Python evangelist, in a talk on Python's future, the focus is to get Python working equally well on Windows and on mobile.

-------------------------
Language semantics
-------------------------
In Java, all variable names (along with their types) must be explicitly declared. Attempting to assign an object of the wrong type to a variable is rejected by the compiler. That’s what it means to say that Java is a statically typed language. In Java you declare variables with their types and use '{' and '}' for scope. Python does not require types on parameters (Python 3 allows optional annotations), and apparently Google is moving its legacy code to Python 3 mainly because of types. If we look carefully, type checking is actually winning, and people are dying to add types. "You don't have to type" is not really a cool thing.

In Python, if a name is assigned to an object of one type, it may later be assigned to an object of a different type. That’s what it means to say that Python is a dynamically typed language.

In Java there are access specifiers (private, protected, public) for encapsulation. Python has nothing like them; it follows a convention of prefixing a name with '_' to tell developers to stay away from it.
Because of private fields, Java classes carry a lot of verbose get/set code. Python solves this problem pretty well by letting you add accessors on demand (properties, __getattr__). On the other hand, sealing fields and methods with 'final' is a great tool for communicating design.
In Java, getters and setters are needed because a public field gives no opportunity to go back and change the implementation later. To be fair, there are builder patterns and libraries such as Lombok and Immutables that keep this type-safe and make it faster to implement.
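To make the role of 'private', 'final' and getters concrete, here is a minimal hand-written sketch. The Point class and its withX method are hypothetical names chosen for illustration, not something from a library.

```java
public class PointSketch {

    // 'final' on the class forbids subclassing;
    // 'final' on the fields forbids reassignment after construction.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Getters keep the fields private, so the internal
        // representation can change later without breaking callers.
        int getX() { return x; }
        int getY() { return y; }

        // Instead of a setter, return a modified copy.
        Point withX(int newX) { return new Point(newX, y); }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = p.withX(10);
        System.out.println(p.getX() + "," + p.getY()); // 1,2
        System.out.println(q.getX() + "," + q.getY()); // 10,2
    }
}
```

This is the kind of code that libraries such as Lombok or Immutables generate for you; the sketch only shows what the generated shape roughly looks like.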

Java uses method overloading heavily. Python does not support overloading of method names; with default argument values it is rarely necessary.
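A common Java idiom is to emulate a Python default argument such as def greet(name, greeting="Hello") with a pair of overloads. The sketch below uses made-up names (OverloadSketch, greet) purely for illustration.

```java
public class OverloadSketch {

    // Overload with fewer parameters supplies the default value.
    static String greet(String name) {
        return greet(name, "Hello");
    }

    // Full version that does the actual work.
    static String greet(String name, String greeting) {
        return greeting + ", " + name + "!";
    }

    public static void main(String[] args) {
        System.out.println(greet("Guido"));          // Hello, Guido!
        System.out.println(greet("James", "Howdy")); // Howdy, James!
    }
}
```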

Java supports a 'switch' statement. It is a great replacement for a huge if/else chain, or for the dictionary Python developers use to declare and dispatch multiple alternative flows.
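Where Python code would typically look a command up in a dict, a Java switch can do the dispatch directly (switching on strings works since Java 7). The command names below are invented for the example.

```java
public class SwitchSketch {

    // Maps a command name to an action description.
    static String action(String command) {
        switch (command) {
            case "start":
            case "run":               // two labels sharing one branch
                return "starting";
            case "stop":
                return "stopping";
            default:                  // plays the role of dict.get(key, default)
                return "unknown command: " + command;
        }
    }

    public static void main(String[] args) {
        System.out.println(action("run"));   // starting
        System.out.println(action("stop"));  // stopping
        System.out.println(action("dance")); // unknown command: dance
    }
}
```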

-------------------------------------
Basic Data structures and usage
--------------------------------------

Some of the basic data types in Python are numbers, strings and collections. Collections are natively supported, which makes them a breeze to work with.
Java has a huge ecosystem of collection types (interfaces, abstract classes and concrete classes) and a separate set of classes to support primitives (long, int, double, ...).

List
numbers = [1, 2, 3, 4, 5]
ArrayList is the usual replacement for mutable lists:
List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));

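Immutable List
tuple = (1, 2, 3, 4, 5)
There is no direct equivalent of a tuple, but there are restricted list implementations: Arrays.asList() returns a fixed-size list (elements can still be replaced with set()), and Collections.unmodifiableList() returns a read-only view. Illegal mutations are stopped at runtime with an UnsupportedOperationException, not at compile time.
List<Integer> tuple = Collections.unmodifiableList(Arrays.asList(1, 2, 3, 4, 5));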
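The runtime behaviour of these two wrappers can be checked with a small sketch. The helper names (rejectsAdd, rejectsSet) are made up for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ImmutableListSketch {

    // Returns true if appending to the list is rejected at runtime.
    static boolean rejectsAdd(List<Integer> list) {
        try {
            list.add(6);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Returns true if replacing an element is rejected at runtime.
    static boolean rejectsSet(List<Integer> list) {
        try {
            list.set(0, 99);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> fixedSize = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> readOnly =
                Collections.unmodifiableList(Arrays.asList(1, 2, 3, 4, 5));

        System.out.println(rejectsAdd(fixedSize)); // true: the size is fixed
        System.out.println(rejectsSet(fixedSize)); // false: set() is allowed
        System.out.println(rejectsAdd(readOnly));  // true
        System.out.println(rejectsSet(readOnly));  // true
    }
}
```

So Collections.unmodifiableList() is the closer match for a Python tuple, with the caveat that the compiler will not stop you from calling add() or set().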

Map
Called a dictionary in Python:
numberToText = {1: "one", 2: "two", 3: "three"}
HashMap is the equivalent of the jack-of-all-trades dict:

Map<Integer, String> numberToText = new HashMap<Integer, String>() {{ put(1, "one"); put(2, "two"); put(3, "three"); }};

Similarly, Python's set() translates to new HashSet<>().

"Java has a huge set of data structure classes in the JDK and in external libraries that deal with concurrency and performance, probably unmatched by any other language, including Python."

But working with maps (dictionaries) could be the single most important feature Python developers miss. The dictionary is probably Python's most important and most used data structure, and Python makes it a breeze, while Java definitely makes it painful.
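Java 8 does soften some of the pain with new default methods on Map such as merge, getOrDefault and forEach. Here is a minimal word-count sketch; the class and method names (MapSketch, countWords) are invented for illustration, and a TreeMap is used only so that the printed order is predictable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapSketch {

    // Roughly: counts[w] = counts.get(w, 0) + 1 in Python.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                countWords(Arrays.asList("a", "b", "a"));

        System.out.println(counts);                      // {a=2, b=1}
        System.out.println(counts.getOrDefault("z", 0)); // 0

        // Roughly: for word, n in counts.items(): ...
        counts.forEach((word, n) -> System.out.println(word + " -> " + n));
    }
}
```

There is still no literal syntax for maps in Java 8, so declaring one remains more verbose than in Python.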

String
As in Python, strings are immutable in Java. In Python, string values can be wrapped in either single or double quotes. To differentiate between plain ASCII strings and Unicode strings, Python 2 uses the u prefix to denote the latter.

Python programmers will greatly miss multiline strings in Java. It is painful and ugly to define large strings in Java.
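One workaround in Java 8 is String.join, which at least avoids a chain of + "\n" + concatenations. This is only a sketch of one option; the query text and the names (MultilineSketch, query) are made up.

```java
public class MultilineSketch {

    // Joins the individual lines with a newline separator.
    static String query() {
        return String.join("\n",
                "SELECT name, age",
                "FROM users",
                "WHERE age > 18");
    }

    public static void main(String[] args) {
        System.out.println(query());
    }
}
```

It is still clumsier than a triple-quoted Python string, since every line needs its own quotes and a trailing comma.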

Here is a list of features that are missing in Java (or how they could be achieved in Java):

------------------------
comprehension (List ...)
------------------------

>>> numbers = [1,2,3,4,5]
>>> print numbers
[1, 2, 3, 4, 5]
>>> even_numbers = [n for n in numbers if n%2==0]
>>> print even_numbers
[2, 4]

We could divide the above into the following blocks.

|final_data|  = [|conversion|   |enumeration|   |predicates| ]
Java doesn't have anything like native list comprehensions. With lambdas it comes close with a one-liner, but it is nowhere near its Python counterpart.
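A minimal sketch of the closest Java 8 equivalent uses the stream API: stream() plays the role of the enumeration, filter() the predicate, map() the conversion, and collect() builds the final data. The class and method names (ComprehensionSketch, evens, evenSquares) are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ComprehensionSketch {

    // Python: [n for n in numbers if n % 2 == 0]
    static List<Integer> evens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    // Python: [n * n for n in numbers if n % 2 == 0]
    static List<Integer> evenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(evens(numbers));       // [2, 4]
        System.out.println(evenSquares(numbers)); // [4, 16]
    }
}
```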





[TODOs]

--------------------------------------------------
Returning multiple values from a function
--------------------------------------------------

------------------------------------------------
sequential substitution of actual/formal parameters
-------------------------------------------------

-------------------------------------------
Defining functions inside a function
-------------------------------------------

-------------------------------------------------
@staticmethod, @classmethod, decorator
--------------------------------------------------

-----------------------
Meta programming
------------------------

-------------
Generators
-------------


References:
https://dzone.com/storage/assets/4018-rc193-010d-python_2.pdf
Oracle's documentation is comprehensive compared to that of most other languages.
Java collections overview:


