Designing Good API and Its importance

Recently I gave a technical session on "Designing Good API and its importance" at BASIS SoftExpo 2012. I was introduced to this topic by a Tech Talk by Joshua Bloch; my presentation is heavily influenced by his talk and is a humble tribute to his great, inspiring, motivating and illuminating session. The slides are as follows:

The original TechTalk of Joshua Bloch is as follows:

Criticisms of my presentation are most welcome.

RESTful Web Services

I have come a long way in learning, understanding and implementing RESTful (or at least RESTlike) systems in real life. Drawing on that experience, I recently gave a technical presentation at BASIS SoftExpo 2011. The response seems to have been good. The slides are as follows. Criticisms and suggestions are most welcome.

Using GNOME Blog

I was looking for a client that would let me write entries offline and post them, rather than writing them in a browser. I did a simple apt-cache search and found GNOME Blog (gnome-blog), so now I am testing it out. Let's see how it turns out; my first impression is that its rich text editing capabilities are very limited!

RESTful Architecture

The title is definitely making some readers wonder why someone is writing on this topic yet again. What inspired me to write after a long time is the confusion between "RESTful Web Services" and "XML over HTTP". I came across this confusion while working on a framework of mine, and later I will mention how it came to interest me. So I would like to clear up the confusion, as per my understanding, about RESTful architecture.

First, let's see how Dr. Roy Thomas Fielding defines REST in his PhD dissertation:

REST is defined by four interface constraints: identification of resources; manipulation of resources through representations; self-descriptive messages; and, hypermedia as the engine of application state.

In this writeup I would like to illustrate how Dr. Fielding explains these constraints, along with my understanding (as a scholar of REST) of the matter.

As the REST Wikipedia page mentions, at the center of the RESTful architecture is the "Resource", and I fully agree with it. So I would like to start by defining what a resource is. Dr. Fielding defines it as:

...any concept that might be the target of an author's hypertext reference must fit within the definition of a resource. A resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time.
Now let's pick a common concept to build up the idea of resources. Let's take a book sales aggregator service (similar to Google Checkout): it sells books that are written by one or more authors, published by one or more publishing houses, and sold by stores at different prices, in different states (new or old) and with different descriptions.
The obvious resources here are Book, Store, Author and Publisher. I would make BookState a resource as well; it comprises a price and a description. As for cardinality, a Book has many Stores, Authors and Publishers, and a Store has many BookStates. I would also like to make BookExcerpt a weak entity of Book. So, according to Dr. Fielding's definition and my understanding of a resource, all the OOP objects named above are resources.

Before going into the nature of representations, let's see what Dr. Fielding has to say about the constituents of a resource -

The values in the set may be resource representations and/or resource identifiers. A resource can map to the empty set, which allows references to be made to a concept before any realization of that concept exists. ....... The only thing that is required to be static for a resource is the semantics of the mapping, since the semantics is what distinguishes one resource from another.

The first part of the statement signifies an advantage of RESTful architecture: one may develop individual resource components independently. The second part places a constraint on the definition of how resources are inter-related: it must stay available and unchanged. So in our example the cardinality semantics between the resources/objects should be static. Now that we know about resources, the next thing to visit is the representation of resources. Dr. Fielding's take on this is:

A representation consists of data, metadata describing the data, and, on occasion, metadata to describe the metadata (usually for the purpose of verifying message integrity). Metadata is in the form of name-value pairs, where the name corresponds to a standard that defines the value's structure and semantics. Response messages may include both representation metadata and resource metadata: information about the resource that is not specific to the supplied representation.
Control data defines the purpose of a message between components, such as the action being requested or the meaning of a response. It is also used to parameterize requests and override the default behavior of some connecting elements. For example, cache behavior can be modified by control data included in the request or response message.

The representation of a resource in a particular media type format (the representation being the data, while the media type itself is metadata) is just one part of the architecture and not the architecture itself. The messages should be self-descriptive, the representation format of a resource should respect other resources as they are, and all manipulation of resources and of their relations to other resources should be done through state transitions. So if I want to change the cover image of a Book resource, I should be able to do it using the HTTP PUT method.
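As a rough illustration of that last point, the request below would replace a book's cover image with a single PUT. This is only a sketch: the host, the /books/{ISBN}/cover path, the sample ISBN and the image/png media type are assumptions made up for the example, and the request is built but never sent.

```python
from urllib.request import Request


def build_cover_update(base_url, isbn, image_bytes):
    """Build (but do not send) a PUT that replaces a book's cover image."""
    # Hypothetical URI layout: the cover is addressed as its own resource.
    url = f"{base_url}/books/{isbn}/cover"
    return Request(
        url,
        data=image_bytes,
        method="PUT",
        # The media type is metadata describing the representation being sent.
        headers={"Content-Type": "image/png"},
    )


# Placeholder values, for illustration only.
request = build_cover_update("http://example.org", "1234567890", b"fake-image-bytes")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would actually send it; whether the server accepts it depends entirely on how the real service defines its resources.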

Now here is where the confusion lies. Let's say I am using Java on the server to respond to requests. If I am using the Jersey + JAX-RS stack with application/xml as the media type, there is a good chance that I am also using JAXB to convert my resource objects to XML (not that it is mandatory; one can write one's own converters, known as Producer and Consumer). I will then see that a Book is converted to an XML document which contains the representations of its authors and other related resources, and of the sub-resources within them. This creates multiple problems, e.g.,

  • Why do I have to receive all the entity sets of a resource and of its sub-resources or related resources when I do not need them? They add unnecessary payload to every transfer over the network.
  • How do I manipulate a related resource of a book, e.g. its author(s), without transmitting the whole book?
  • How do I send different control data for the different resources linked within the Book?

IMHO, this is where the principle of REST is violated, unless questions like the above have satisfactory answers. So if we are to design a RESTful API we should consider not only the representation part but also the manipulation part. That said, IMHO, using different HTTP methods (or their equivalents in other protocols) is not sufficient; designing the resource semantics so that a resource can be manipulated both partially and as a whole (depending on requirements and other factors) is a must as well.
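One way to address those questions is to let the Book representation link to its related resources instead of embedding them. The sketch below builds such an XML document with XLink references; the element names, the /authors/ URI layout and the sample values are all assumptions made for illustration, not the output of JAXB or of any real service.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK_NS)


def book_representation(isbn, title, author_emails):
    """Return an XML string in which authors are linked, not embedded."""
    book = ET.Element("book", {"isbn": isbn})
    ET.SubElement(book, "title").text = title
    authors = ET.SubElement(book, "authors")
    for email in author_emails:
        # Each author stays a separate resource, reachable through its own URI.
        href = "/authors/" + quote(email, safe="")
        ET.SubElement(authors, "author", {f"{{{XLINK_NS}}}href": href})
    return ET.tostring(book, encoding="unicode")


# Placeholder data, for illustration only.
xml_text = book_representation("1234567890", "Sample Book", ["alice@example.com"])
print(xml_text)
```

With this shape a client fetches an author only when it needs one, and each linked resource can be updated or cached on its own terms.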

Another important component of the architecture, which is not mentioned in the definition of REST (probably because it was the reason REST came into existence), is the Uniform Interface. The one thing that is not explicitly stated as a constraint is that the representation format should also be uniform, or conform to established standards, so that clients can read it in a uniform manner, e.g. how resources are linked: in HTML it should be <link>, in XML it should be XLink, etc.

So here is my checklist for building a RESTful system:

  • Identify the resources and their identifiers. Analyze the problem domain well so as to divide it into as many independent components as possible.
  • Choose well how transitions between resource states map onto the specific protocol. E.g. if the protocol is HTTP you might map the methods as mentioned here; that is not enough, though, as HTTP headers and response codes should be chosen appropriately too, for example for cache control and compression.
  • Resources should refer to other individual resources by their URIs; state transitions for the relations between resources might (I am not sure, but IMHO must) be implemented as well.
  • The representation format should follow established norms. E.g. an XML representation format should have a DTD or XSD as metadata for the representation.

In the context of our book sales aggregator example I want to elaborate on my understanding of some of the checkpoints above. Firstly, we have (tentatively) identified our resources; next I would want to create useful identifiers for them. For Book it would be , for Author it would be the email address, for Publisher it would be their name formatted like a blog or wiki page title, and so on. In my XML representation of a book, I would have an element encapsulating all authors, namely <authors>, and every <author> in it would refer to the author resource using XLink. I would use the following for updating the author relation of a book: /books/{ISBN}/authors/[slot/{index}]?. Now let's say I want to get the 1st Author of Book A; then my URI would be /books/A/authors/slot/0. The response message could either be an XML document pointing to the Author's URI or (my choice) use HTTP response status code 302; I would also make use of HTTP compression via the headers. IMHO the rest of the checkpoints are trivial.
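The slot lookup described above could be sketched roughly as follows. This is not tied to any framework: the in-memory BOOK_AUTHORS store, the /authors/{email} target URI and the resolve_author_slot helper are all hypothetical names introduced for the example, and a real service would sit behind an HTTP server.

```python
from urllib.parse import quote

# Hypothetical in-memory store: book identifier -> ordered author e-mail addresses.
BOOK_AUTHORS = {"A": ["alice@example.com", "bob@example.com"]}


def resolve_author_slot(path):
    """Map /books/{ISBN}/authors/slot/{index} to a (status, headers) pair."""
    parts = path.strip("/").split("/")
    if len(parts) != 5 or parts[0] != "books" or parts[2:4] != ["authors", "slot"]:
        return 404, {}
    isbn, index = parts[1], parts[4]
    authors = BOOK_AUTHORS.get(isbn)
    if authors is None or not index.isdecimal() or int(index) >= len(authors):
        return 404, {}
    # Redirect to the author resource instead of embedding its representation.
    location = "/authors/" + quote(authors[int(index)], safe="")
    return 302, {"Location": location}


print(resolve_author_slot("/books/A/authors/slot/0"))
```

The client follows the Location header to reach the Author resource, so the book never has to carry the author's representation itself.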

I would like to end this writeup by stating why I came up with this post. I am working on the framework mentioned at the beginning, which aims to make infrastructure code for common tasks, such as versioning, full text search, DAOs and representations of domain objects in various media types, available out of the box. For versioning and representing domain objects I could not find a better way than REST. I am also working on another project which needs Web Services to interface with clients; my past experience with SOAP was sour enough for me to look into alternatives, and I chose RESTful WS, but later realized that what I had was nothing but RESTlike WS, or XML/JSON over HTTP. Furthermore, this writeup just expresses my understanding of, and in some cases my thoughts on, RESTful architecture, so if you feel I have misunderstood something or made any mistake, feel free to correct me :). If you are interested in more details of implementing REST over HTTP, please check this out.

Upgrading from Ubuntu Feisty Fawn (7.04) to Hardy Heron (8.04)

Since I procured my laptop I have been using Ubuntu Feisty Fawn (7.04) on it, and I was more than satisfied until its support period expired in Jan 2009. I was compelled to upgrade, and I did not want the hassle of upgrading again in the lifetime of this laptop, so I decided to upgrade to Hardy Heron (8.04 LTS) via Gutsy Gibbon. This writeup describes my experience with the upgrade procedure.

My laptop is an Acer 5585WXMi with a GeForce Go 7300 VGA card, 802.11a/b/g WLAN, a 5-in-1 card reader and built-in Bluetooth. In Feisty all devices were recognized perfectly other than the card reader; I did not spend much time on it since I rarely used it.

The first hitch I faced was that the Feisty repositories were not there anymore, so I disabled them in the APT source list (/etc/apt/sources.list or the package manager). Then I started upgrading to Gutsy. During package installation the upgrade tried to modify some configuration files that I had earlier edited for my own use, and also some which I had never touched :), so I looked at the differences (both the console and the UI upgrade offer this; it is 'd' on the console) and took appropriate action. In most cases I let it overwrite my version; the exceptions were MySQL and some other packages. The first problem I faced after upgrading to Gutsy was that the display was stuck at 640x480 resolution, which was quite frustrating, but before spending more time on it I moved on to upgrading to Hardy.

During the upgrade to Hardy all went smoothly, except that the slapd upgrade failed because the slapd database had data from an unconfigured domain, which had been okay in Feisty and Gutsy. So after installation I manually restored the data from the backed-up LDIF file, and the slapd configuration was successful.

But VGA was a pain as usual :(. To cut a long story short, all I had to do to get the VGA to work was enable the restricted NVidia driver and delete /etc/X11/xorg.conf (obviously after backing it up), followed by an OS reboot, and guess what, now I can enjoy 1280x800 resolution :). As a side note, I got the 'vesa' driver to work at 1024x768, but everything was vertically squeezed, which was depressing, especially when viewing the photographs I took myself.

I also ran into a problem using MSN & GTalk with Pidgin because of an SSL library issue (I am not certain why, but it could be related to my attempt to install Pidgin on Feisty from source without SSL). So I uninstalled the Pidgin-related packages and installed it from source with the help of this post; please add the '--with-system-ssl-certs=/etc/ssl/certs' option during the build, else you will face unwanted trouble with the certificate chain.

Another problem was that the Flash plugin of Firefox was not producing sound. For that I simply set all devices at System -> Preferences -> Sound to use ALSA (by default it is PulseAudio).

Finally, I needed support for 'docx', and for that I needed OpenOffice 3. So I uninstalled all OpenOffice packages from my installation, downloaded the DEB packages and installed them using 'dpkg'. After the main DEBs are installed, check the DEB folder for another DEB at 'desktop-integration/'; that will create the menu shortcuts for you.

For photographers and users interested in graphics, the 'ubuntustudio-graphics' package will be useful; one might also find the other ubuntustudio packages (apt-cache search ubuntustudio) useful, as they cover audio, video, desktop etc. for enthusiasts and professionals.

In a nutshell, my Hardy is now totally under control and I am enjoying using it :), so if anyone has not upgraded to Hardy yet, please do so :).

Why Python?

For a long time I have been wondering "Why Python?" - why is Python gaining popularity? Why do many prefer Python for all sorts of work? Being a Java fan myself (some of my friends say fan-atic), I find it all the more interesting to understand the reason behind its rise.

I am a software engineer working mainly on enterprise systems with Java. One of my hobbies is writing scripts for anything that I (might) have to repeat in the near future. I usually prefer shell scripting for this. But for some particular tasks, like converting seconds to an ISO 8601 formatted date string, or multi-threaded HTTP request and response handling with file I/O, I needed something more, as the shell scripts were becoming overly complicated.

With all the buzz about Python, I told myself to give it a shot and check it out. I have to admit that I was astounded, stupefied and awed at how simple and powerful it was to achieve all that I needed for my use case. It took me just about 3 hours to get Python 2.5 and IDLE installed and write my first 2 programs, which do the following tasks -
  1. Take seconds as input from command line argument and print its ISO8601 equivalent.
  2. Take URLs as input and take 'n' samples for their response-time and total duration while all URLs should perform these tasks in parallel.
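The first task can be sketched in a few lines. This is not the original Python 2.5 script (which is not reproduced in this post) but a minimal present-day Python 3 version; it assumes the input is seconds since the Unix epoch and that the output should be in UTC, and the function names are made up for the example.

```python
from datetime import datetime, timezone


def seconds_to_iso8601(seconds):
    """Convert seconds since the Unix epoch to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


def main(argv):
    """Print the ISO 8601 equivalent of every command line argument."""
    for arg in argv:
        print(seconds_to_iso8601(float(arg)))


# A script would call main(sys.argv[1:]); a fixed value is used here instead.
main(["86400"])  # prints 1970-01-02T00:00:00+00:00
```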
I have learnt new languages before, and I was also a Lab Instructor and Teaching Assistant for programming courses. From that experience, getting these things done within 2 hours without having read ANY article on Python or knowing anything about it beforehand is, I have to say, awesome (please let me know if you feel I was slow). It came down to the simplicity of the language, compounded by its fluent syntax and excellent documentation. Within the time mentioned I learnt and used objects, collections, classes, control flow (if, for), exception handling, file I/O and threads. If you are interested in what I achieved, you can check the programs out here.
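For the second task, a rough modern equivalent might look like the sketch below. It is an assumption-laden illustration rather than the original program: it uses concurrent.futures, which did not exist in Python 2.5, one worker thread per URL, and function names (fetch, sample_url, sample_all) invented for the example. The demo at the end swaps in a fake fetcher so that it runs without a network.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url):
    """Download the URL body; replace this to test without a network."""
    with urlopen(url, timeout=10) as response:
        response.read()


def sample_url(url, n, fetcher=fetch):
    """Time n sequential requests to one URL."""
    timings = []
    for _ in range(n):
        start = time.perf_counter()
        fetcher(url)
        timings.append(time.perf_counter() - start)
    return {"url": url, "samples": timings, "total": sum(timings)}


def sample_all(urls, n, fetcher=fetch):
    """Sample every URL in its own worker thread; results keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        return list(pool.map(lambda u: sample_url(u, n, fetcher), urls))


# Demo with a fake fetcher that just sleeps, so no real request is made.
results = sample_all(["http://a.example/", "http://b.example/"], 3,
                     fetcher=lambda url: time.sleep(0.01))
for result in results:
    print(result["url"], len(result["samples"]), round(result["total"], 3))
```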

From what I read later, I learnt that it's equally simple to build UIs with Python, and with some Googling I learnt of Google App Engine, which just makes developers' lives easy (at least that's what it's there for).

I plan to do more work with Python in the near future, to learn it and master it. If you have not yet tried Python, I recommend you do so ASAP and enjoy it.....

Setting up firewall in Ubuntu

Though I am not a server or network administrator, I have always been interested in learning how to secure a network. From some initial reading I learned that a firewall is the starting point, and trust me when I say that with Ubuntu 8.04 Server (code-named 'Hardy Heron') it's seriously easy to set up a firewall. This article will attempt to show beginners like myself how easy it is.

I am assuming that readers will have Hardy Heron installed before embarking on testing out the firewall. Once it is installed, install the firewall front end 'ufw' using the following command -
sudo apt-get install ufw
For detailed reading on it one may try the Ubuntu wiki or 'man ufw'. So now I can get started on what I wanted to do.

We have a network at my place, and I want to refuse SSH from IPs other than mine; not only that, I also want to ensure that pinging my servers returns nothing. As a newcomer to network firewalling, I would find it quite nice to achieve this. In general, what I have seen for SSH is that there is only one gateway through which the outside world can SSH into a network, and from there one can SSH to the servers one is permitted to. SSH'ing to the gateway could be made even more challenging by allowing only an IP that one can obtain solely by being connected to the network's VPN. Does it sound complicated to achieve? After using UFW I am pretty confident it's not that difficult to set something like this up, and hopefully you will feel the same.

I will skip the VPN part, as that is a topic in itself, and hopefully I will have a writeup on how to set it up sometime soon; setting up a VPN server is not that difficult either, thanks to OpenVPN, so interested readers can jump into it if required. My target is to block ping and to block SSH from any IP outside my designated range.

Once one has UFW installed, the first step is to enable it; to do that, use the following command -

sudo ufw enable

Once it is enabled and one wants to check the status, one can use the following to see it -

sudo ufw status

If one wants to enable logging one can do -

sudo ufw logging on

I suppose one can easily guess how to turn logging 'off'.
The next step is to instruct the firewall to allow SSH from a particular IP or IP range. One can use the following commands, respectively, to allow it for the 2 cases mentioned above -

sudo ufw allow from <source IP> to <server IP> port 22
sudo ufw allow from <source IP range> to <server IP> port 22

Now the obvious question could be: why mention the 'to' IP address? It's because a server may have more than one IP address, and the 'to' IP address is required to specify which of them the rule applies to. If you are wondering how to calculate an IP range, you might want to have a look at the Wikipedia page and the IP Address range calculator.
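If an online calculator is not at hand, the range arithmetic can also be checked with Python's standard ipaddress module. The sketch below is only an illustration: the 192.168.1.0/28 block is an arbitrary private range picked for the example, and the helper names are made up.

```python
import ipaddress


def describe_range(cidr):
    """Return the first address, last address and size of a CIDR block."""
    network = ipaddress.ip_network(cidr)
    return str(network[0]), str(network[-1]), network.num_addresses


def range_to_cidr(first, last):
    """Express an inclusive address range as the CIDR blocks that cover it."""
    blocks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(block) for block in blocks]


print(describe_range("192.168.1.0/28"))               # ('192.168.1.0', '192.168.1.15', 16)
print(range_to_cidr("192.168.1.0", "192.168.1.15"))   # ['192.168.1.0/28']
```

A range that does not line up with a single block comes back as several CIDR entries, each of which would need its own firewall rule.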
Now one will need to ensure that the default policy is deny; to achieve that, issue the following command -

sudo ufw default deny

At this point I feel that I should also mention how to delete a rule: simply add 'delete' before the start of the rule definition. For example, for the SSH rules one can issue the following commands to delete them -

sudo ufw delete allow from <source IP> to <server IP> port 22
sudo ufw delete allow from <source IP range> to <server IP> port 22

At this point I was thinking that, since the default is deny and I had opened only port 22 for a particular IP range, ping should not work. With that in mind I pinged the server, but to my astonishment I got a reply. Then I started to Google and found this. Following it, I commented out the following line in /etc/ufw/before.rules -

-A ufw-before-input -p icmp --icmp-type echo-request -j ACCEPT

After that, pinging again returned nothing, just as I wanted.
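For anyone who would rather script that edit than make it by hand, the change amounts to prefixing one line with '#'. The sketch below works on the text of the rules only and does not touch /etc/ufw/before.rules itself; the helper name and the sample input are made up for illustration.

```python
ECHO_RULE = "-A ufw-before-input -p icmp --icmp-type echo-request -j ACCEPT"


def disable_echo_request(rules_text):
    """Comment out the ICMP echo-request ACCEPT rule, leaving other lines alone."""
    lines = []
    for line in rules_text.splitlines():
        if line.strip() == ECHO_RULE:
            line = "# " + line
        lines.append(line)
    return "\n".join(lines) + "\n"


# Made-up excerpt standing in for the real file contents.
sample = "# ok icmp codes\n" + ECHO_RULE + "\n# end of excerpt\n"
print(disable_echo_request(sample))
```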

Now, with a combination of OpenVPN and UFW one can easily achieve a somewhat more secure environment; that said, I actually loved the statement of Linus Torvalds, in his talk at Google about Git, that security is built on a network of trust. I am also a newbie in the secure networking domain, so please feel free to drop your comments on the issue. If you are wondering why I would use UFW, you can have a look at the small discussion in the comments section of this posting.

Code generation made easy using patterns

Since I started developing some code generation plugins (a toString() generator and a Java Util Logger generator) using/for NetBeans 6.1 Beta I have learnt a lot, and the more I worked on them the more I kept asking myself how I could make the task easier. Once both plugins had taken rudimentary form, I discovered that patterns could make the task a whole lot easier for me. So I started re-implementing the Logger generator plugin (if you are interested in getting the resources, please check the logger blog).

Next I wanted the plugin to search for System.out.print(ln), and for out.print(ln) if a static import exists. To do so I had to walk the parsed tree of a Java source file. Just to give myself the feeling that I was implementing something cool, I called the walker JavaSourceTreeParser. Basically, it breaks each and every Java statement down into a form which cannot be decomposed further. In the initial version the conversion was done in the parser itself (Revision 21). Then I felt that I could easily use the Observer pattern for it, and after implementing that (Revision 31) I saw that I was right and had achieved something which every code generator can use. (As I use Git as my SCM, I usually work offline and push a bunch of commits together, so don't wonder how I make multiple commits at one go :)).

After implementing it, I felt the need for a session to share state information, both for performance and for communication between the various listeners, and I decided to use the Composite pattern for the purpose. It also worked like a charm and, as a result, improved the overall performance of the plugins. The following diagram might give an idea of how to use the API I designed.

What I implemented is simply that a listener gets notified whenever the parser comes across a particular type of tree; for System.out.println it would be MethodInvocationTree (Kind.METHOD_INVOCATION). The listener then handles the Tree according to its implementation; in my case I replaced the method invocation with a logger invocation. For the case where I inserted a log call at the beginning of every method body, I listened to Kind.METHOD. Now I am implementing log insertion for throw, return and the end of a method block. In the process I will also add a parsing life cycle listener to the API so that the modification to the class is done only once, when parsing is completed. I hope this API helps users interested in code generation. I am very interested to learn what users think about it.
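To give a feel for the listener idea without the NetBeans APIs, here is a small analogue written in Python. It is not the JavaSourceTreeParser API: it walks Python source with the standard ast module instead of a Java tree, it only finds print calls rather than rewriting them, and the class and function names are invented for this sketch.

```python
import ast


class SourceTreeWalker:
    """Walks a parsed tree and notifies listeners registered per node kind."""

    def __init__(self):
        self._listeners = {}

    def add_listener(self, kind, callback):
        # kind is the node class name, e.g. "Call" or "FunctionDef".
        self._listeners.setdefault(kind, []).append(callback)

    def walk(self, source):
        for node in ast.walk(ast.parse(source)):
            for callback in self._listeners.get(type(node).__name__, ()):
                callback(node)


def print_call_collector(found):
    """Build a listener that records the line numbers of print(...) calls."""
    def on_call(node):
        if isinstance(node.func, ast.Name) and node.func.id == "print":
            found.append(node.lineno)
    return on_call


SAMPLE = "def greet(name):\n    print('hello', name)\n    return len(name)\n"

found = []
walker = SourceTreeWalker()
walker.add_listener("Call", print_call_collector(found))
walker.walk(SAMPLE)
print(found)  # [2]
```

The walker knows nothing about logging; each generator only registers for the node kinds it cares about, which is the property that makes the Observer approach reusable across plugins.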