Saturday, January 29, 2011

Spring transactions readOnly - What's it all about

When using spring transactions, it's stated that marking a transaction as 'readOnly' allows the underlying data layers to perform optimisations.
When using spring transactions in conjunction with hibernate through the HibernateTransactionManager, this translates to an optimisation applied on the hibernate Session. When persisting data, a hibernate session works based on the FlushMode set on the session. A FlushMode defines the flushing strategy that synchronizes database state with session state.
Let's look at a class with transactions demarcated as follows:
@Transactional(propagation = Propagation.REQUIRED)
public class FooService {
    
    public void doFoo(){
        //doing foo
    }

    public void doBar() {
        // doing bar
    }
}
The class FooService is demarcated with a transaction boundary, and all operations in FooService will have the transactional attributes specified by the @Transactional annotation at the class level. An operation within a transaction (or starting a transaction) would set the session FlushMode to AUTO and is also identified as a read-write transaction. This means that the session state is sometimes flushed to make sure the transaction doesn't suffer from stale state.
However, if in the above example, we'd have the doBar() simply performing some read operations through hibernate, we wouldn't want hibernate trying to flush the session state. And the way to tell hibernate not to do this is through the FlushMode. In this instance the above example turns out as follows
@Transactional(propagation = Propagation.REQUIRED)
public class FooService {
    public void doFoo() {
        // do foo
    }

    @Transactional (readOnly = true)
    public void doBar() {
        // do bar 
    }
}
The above change to doBar() forces the session FlushMode to be set to NEVER. This means we won't have hibernate trying to synchronise the session state within the scope of the session used in this method. After all, it would be a waste to perform session synchronisation for a read operation. One thing to note in this configuration is that we are indeed spawning a new transaction; this happens by applying the @Transactional annotation.
However, this is only true if doBar() is called from a client who has not initiated or participated in a transaction. In other words, if doBar() is called within doFoo() (which has started a transaction), then the readOnly aspect wouldn't have any effect on the FlushMode. This is because @Transactional uses Propagation.REQUIRED as the default propagation strategy, and in this instance doBar() would participate in the same transaction started by doFoo(), thereby not overriding any of its transaction attributes.
If for some reason doBar() still needs readOnly applied within an existing read-write transaction, then the propagation strategy for doBar() would need to be set to Propagation.REQUIRES_NEW. This forces the existing transaction to be suspended and a new transaction to be created, which also sets the FlushMode to NEVER. However, once it exits the new transaction, it would continue the first transaction with the FlushMode set back to AUTO (following the transaction propagation model). I can't think of a scenario which would need such a configuration though.
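As a sketch, that (admittedly contrived) configuration could look like the following, assuming the same hypothetical FooService:

```java
@Transactional(propagation = Propagation.REQUIRED)
public class FooService {

    public void doFoo() {
        // read-write work; session FlushMode is AUTO here
    }

    // Suspends any transaction in progress and starts a fresh one,
    // so readOnly (and FlushMode NEVER) applies even when the caller
    // is already inside a read-write transaction.
    @Transactional(propagation = Propagation.REQUIRES_NEW, readOnly = true)
    public void doBar() {
        // pure read work
    }
}
```

Note that for the new transaction to be started, doBar() has to be invoked through the Spring proxy (i.e. from another bean), not as a plain self-invocation from doFoo().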

While the readOnly attribute can also provide hints to underlying jdbc drivers where supported, the implications of this attribute can vary based on the underlying persistence framework in use (Hibernate, Spring JPA, Toplink or even raw JDBC). The above details aren't necessarily true for all approaches; they're only valid in spring+hibernate land.

Another way of looking at all of this is probably to ask 'Do we really need to spawn a transaction and then mark it as read-only for a pure read operation at all?'. The answer is probably 'No', and yes, it's best to avoid transactions for pure read operations. However, readOnly can come in handy depending on the approach one picks to demarcate transactions. For example, take a service implementation that has its transactional properties defined at the class level, configured as read-write transactions. If the majority of the operations of this service implementation share these transactional properties, it makes sense to have them applied at the class level. Now if there are a couple of operations that are actually read operations, the readOnly attribute comes in handy for configuring only those operations as @Transactional (readOnly = true). This is still not perfect, given the premise of creating a transaction for a read operation. So for this example, another configuration might be @Transactional (propagation = Propagation.SUPPORTS, readOnly = true), which means a new transaction will not be created if one does not exist (relatively more efficient) while readOnly is still applied for other optimisations (possibly on the drivers).
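Putting the pieces together, the mixed configuration described above could be sketched as follows (again using the hypothetical FooService):

```java
// Class-level default: read-write transactions for the majority of operations
@Transactional(propagation = Propagation.REQUIRED)
public class FooService {

    public void doFoo() {
        // read-write work, runs with the class-level attributes
    }

    // Pure read: join a transaction if the caller already has one,
    // otherwise run without creating one, and hint readOnly either way
    @Transactional(propagation = Propagation.SUPPORTS, readOnly = true)
    public void doBar() {
        // read-only work
    }
}
```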

It all boils down to the fact that spring offers the readOnly attribute as an option to use. But the when, where and why of using it is entirely up to the developer, depending on the design to which an application is built. So it is quite useful to know what the 'readOnly' attribute is all about in order to put it to proper use.

Monday, January 24, 2011

Too many open files

I've had my share of dreaded moments of the 'Too many open files' exception. When you see 'java.io.IOException[java.net.SocketException]: Too many open files' it can be somewhat tricky finding out what exactly is causing it. But not so tricky if you know what needs to be done to get to the root cause.

First up though what's this exception really telling us?
When a file/socket is accessed in linux, a file descriptor is created for the process that deals with the operation. This information is available under /proc/process_id/fd. The number of file descriptors allowed is, however, restricted. Now if a file/socket is accessed and the stream used to access it is not closed properly, then we run the danger of exhausting the limit of open files. This is when we see 'Too many open files' cropping up in our logs.
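On a Linux box you can inspect both sides of this equation directly. As a quick sketch (using the current shell's pid here; substitute the pid of your java process):

```shell
# Count the file descriptors currently open for a process.
# $$ is this shell's pid; replace it with your java process id.
fd_count=$(ls /proc/$$/fd | wc -l)
echo "open fds: ${fd_count}"

# The per-process limit that 'Too many open files' runs up against
ulimit -n
```

If the first number keeps climbing towards the second as the application runs, a descriptor leak is likely.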
However, the fix will vary based on what's uncovered as the root cause. It's easy if it's an error in your own code base (simply because you can fix your own code easily - at least in theory) and harder if it's in a third-party library, or worse, the jdk (not so bad if it's documented though).

So what do we do when this is upon us?
As with any other thing, find the root cause and cure it.
In order to find the root cause of this exception, the first thing to find out is which files are open and how many of them are open when the exception occurs. The 'lsof' command is your friend here.
shell>lsof -p [process_id]
(To state the obvious, the process id is the pid of your java process)
The output of the above can be 'grep'd to find out which files are repeated and whether their count increases as the application runs.
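One way to do that grouping is to pipe the last column of the lsof output through sort and uniq (a sketch; column layout can vary between lsof versions, so adjust the awk field if needed):

```shell
# Rank the names in lsof output by how often they appear;
# a name whose count keeps climbing points at the leak.
# pid=12345   # replace with your java process id
# lsof -p "$pid" | awk '{print $NF}' | sort | uniq -c | sort -rn | head

# The same sort | uniq -c | sort -rn idiom, demonstrated on sample data:
printf 'file.txt\nfile.txt\nfile.txt\nother.log\n' | sort | uniq -c | sort -rn
```

Running the sample pipeline shows file.txt at the top with a count of 3.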

Once we know the file(s), and if that file/socket is accessed within our code, it's a no brainer: go fix the code. This could be something as simple as not closing a stream properly, like so;
public void doFoo() {
    Properties props = new Properties();
    props.load(Thread.currentThread().getContextClassLoader().getResourceAsStream("file.txt"));
}
The stream above is never closed. This results in an open file handle against file.txt each time doFoo() is called.
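A fixed-up version holds the stream in a local variable and closes it in a finally block, so the descriptor is released even if load() throws. A self-contained sketch (the class and method names here are hypothetical, and a temp file stands in for file.txt so the example runs on its own):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class PropsLoader {

    // The corrected pattern: close the stream in finally so the
    // file descriptor is released on both success and failure.
    public static Properties loadProps(File file) throws IOException {
        Properties props = new Properties();
        InputStream in = new FileInputStream(file);
        try {
            props.load(in);
        } finally {
            in.close(); // releases the underlying file descriptor
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway properties file so the example is self-contained
        File tmp = File.createTempFile("file", ".txt");
        tmp.deleteOnExit();
        FileWriter w = new FileWriter(tmp);
        try {
            w.write("foo=bar\n");
        } finally {
            w.close();
        }
        System.out.println(loadProps(tmp).getProperty("foo")); // prints "bar"
    }
}
```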
If it's third-party code and you have access to the source, you have to put yourself through the somewhat painful process of finding the buggy code in order to work out what can be done about it.
In some situations third party software would require increasing the hard and soft limits applicable to the number of file descriptors. This can be done in /etc/security/limits.conf by changing the figures for the hard and soft values like so;
*   hard   nofile   1024
*   soft   nofile   1024
Usage of '*' is best replaced by the specific user or group to which the change applies.
Some helpful information about lsof