ASM gains ground

CGLib now uses ASM instead of BCEL.

CGLib really is pretty cool. One of my colleagues, Chris, had a go at writing an enhancing classloader using CGLib, but ran into difficulties with final classes and classloader recursion. It ought to be possible, and that seems to me to be the most logical place to inject the enhanced bytecode. The rest of the application would then be completely unaware of the enhancement and could just instantiate classes in the usual way.

Thanks for reminding me

From the BileBlog:

The major flaw with dynamic mocks is that the return values are for all intents and purposes static. So for example, you can specify something like myMock.expectAndReturn(“getBlah”, “blah”); but cannot express something where the return value is not static. Why might you want to do that, you might wonder. Well, lets say you want to create a mock object for ServletContext, you’d want to be able to specify a Map that methods like getInitParameter and get/setAttribute work off of. Alas, this is impossible without much arm twisting and ugly hackery. You basically have to implement your own CallFactory which has all the extra magic in it. Maybe I’m missing something, but a less braindead solution to me would be to have the possibility of return values implementing some kind of interface, and in that special case, use a callback to determine what the actual return should be.

I have done exactly this, for the purposes of erecting a fake servlet engine around some code that needed a more dynamic interaction with its environment than a straight mock object could provide. I’ve even been given permission to open source it; I just haven’t gotten around to it yet. I’ll see what I can do about getting it out there.
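To make the callback idea concrete, here is a minimal sketch using a JDK dynamic proxy. It is not the code I mentioned above, and FakeContext is a hypothetical cut-down interface standing in for the few ServletContext methods the quote names, so the example stays self-contained. The point is only that each return value is computed per call from a backing Map rather than fixed up front.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the ServletContext methods mentioned in the quote.
interface FakeContext {
    Object getAttribute(String name);
    void setAttribute(String name, Object value);
    String getInitParameter(String name);
}

class CallbackMockDemo {
    // Builds a proxy whose answers are looked up in maps on every call.
    static FakeContext newFakeContext(Map<String, String> initParams) {
        Map<String, Object> attributes = new HashMap<>();
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "getAttribute":
                    return attributes.get((String) args[0]);
                case "setAttribute":
                    attributes.put((String) args[0], args[1]);
                    return null;
                case "getInitParameter":
                    return initParams.get((String) args[0]);
                default:
                    // Anything else (including toString) is deliberately unsupported.
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (FakeContext) Proxy.newProxyInstance(
                FakeContext.class.getClassLoader(),
                new Class<?>[] { FakeContext.class },
                handler);
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("mode", "test");
        FakeContext ctx = newFakeContext(params);
        ctx.setAttribute("user", "chris");
        System.out.println(ctx.getAttribute("user"));
        System.out.println(ctx.getInitParameter("mode"));
    }
}
```

The code under test just sees a FakeContext; whatever it stores with setAttribute comes back from getAttribute, which is the behaviour a static expectAndReturn cannot express.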

Concurrent coding for fun

Every now and then I like to get stuck into some hairy coding, just for fun. Concurrency and multithreaded objects are always good for a challenge. While waiting for a build yesterday I had a go at writing a Work Queue implementation, which is kind of like a service-orientated approach to thread pooling.

A work queue is simply an object that allows Runnables to be scheduled for execution and contains a number of threads internally that run in a loop, popping tasks off the queue and running them. Synchronisation is fun, as you have to properly manage multiple threads adding and removing tasks from the queue, as well as preventing both busy waiting (where threads spin in a tight loop, taking all your CPU but doing nothing) and excessive blocking (and of course deadlock). Proper use of wait() and notify() is essential.
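For illustration, a work queue along these lines might look something like the sketch below. This is not the implementation I wrote while waiting for that build; the class and method names are made up for the example, and the demo class borrows CountDownLatch and AtomicInteger purely to check the result.

```java
import java.util.LinkedList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class WorkQueue {
    private final LinkedList<Runnable> tasks = new LinkedList<>();
    private boolean shutdown = false; // guarded by the 'tasks' monitor

    WorkQueue(int workerCount) {
        for (int i = 0; i < workerCount; i++) {
            Thread worker = new Thread(this::runLoop, "worker-" + i);
            worker.setDaemon(true);
            worker.start();
        }
    }

    void schedule(Runnable task) {
        synchronized (tasks) {
            if (shutdown) throw new IllegalStateException("queue is shut down");
            tasks.addLast(task);
            tasks.notify(); // let one waiting worker through the turnstile
        }
    }

    void shutdown() {
        synchronized (tasks) {
            shutdown = true;
            tasks.notifyAll(); // wake every worker so each can exit
        }
    }

    private void runLoop() {
        while (true) {
            Runnable task;
            synchronized (tasks) {
                // wait() releases the lock, so no busy waiting and no blocking of producers
                while (tasks.isEmpty() && !shutdown) {
                    try {
                        tasks.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                if (tasks.isEmpty()) return; // shut down and drained
                task = tasks.removeFirst();
            }
            try {
                task.run(); // run outside the lock so other workers can proceed
            } catch (RuntimeException e) {
                // swallow so that one bad task does not kill the worker
            }
        }
    }
}

class WorkQueueDemo {
    // Schedules taskCount trivial tasks and returns how many actually ran.
    static int runAndCount(int workers, int taskCount) {
        WorkQueue queue = new WorkQueue(workers);
        AtomicInteger done = new AtomicInteger();
        CountDownLatch latch = new CountDownLatch(taskCount);
        for (int i = 0; i < taskCount; i++) {
            queue.schedule(() -> { done.incrementAndGet(); latch.countDown(); });
        }
        try {
            latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        queue.shutdown();
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndCount(4, 100) + " tasks completed");
    }
}
```

Note that the worker re-checks its condition in a while loop after every wait(), and that tasks run outside the synchronized block, so a slow task never holds up the queue itself.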

Some thoughts that occurred: Threads live in methods. All method-level variables are per-thread. So while there might be many threads running the same method, from their perspective they are in glorious isolation. The places where threads go to meet other threads are an object’s fields, and wait() and notify() calls. This is where the fun lies.

I tend to imagine calls to wait() as turnstiles, where threads queue up, waiting to be allowed through. When a thread calls notify() on an object other threads are waiting on, one of them is allowed to go through the turnstile and continue executing. Calling notifyAll() is like starting a melee, where all the waiting threads make a dive for the door. Only one can get through at a time, but they’re all going to try.

The other important point is that although a waiting thread may have been notified (it’s pushing at the turnstile), it can’t get through until the thread that called notify() leaves the synchronized block. This is why calls to notify() are usually found at the end of synchronized blocks, or right before a call to wait(). One other thing I forgot to mention: when a thread hits a wait() call, it gives up the lock it’s holding on that object (remember that wait() can only be called in a synchronized block), thus allowing other threads to run in that block (or any other block synchronized on the same object). Clear as mud?

Some things to remember when writing multi-threaded classes:-

If you call wait(), you must call at least one notify() on the same object.

If you have to wait(long) to avoid deadlock, you probably forgot to notify().

Figuring out which object to synchronise on is important.

Nested synchronized blocks are really easy to deadlock. Avoid if possible.

Don’t synchronise your methods unless it’s to prevent concurrent modification of your fields. A synchronised instance method can only lock on its own object (a static one locks on the class). Synchronised blocks can use any variable or field, which can get really hairy, but is much more flexible.

When thinking about threads, time is a variable. Threads (at the same priority) can run at the same speed, overtake one another, one could run from start to finish before the other gets moving, and anything in between. The only way to coordinate the activities of multiple threads is by careful use of synchronized, wait() and notify().

Oh, and forget that ‘synchronisation is slow’ myth. I once wrote a partially synchronized Map of Maps implementation where the main map was a regular HashMap and the submaps were synchronized. The theory was that in a multithreaded system that would allow greater concurrency. Turns out hashing twice cost more than using a simple synchronized map and letting the threads compete for it. Go figure.
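As a small illustration of the point above about synchronised methods versus synchronised blocks, here is a sketch with hypothetical class names. One counter is guarded by the object’s own lock via a synchronized method, the other by a private lock object via a synchronized block.

```java
class Counters {
    private final Object lock = new Object(); // private monitor, invisible to callers
    private int viaMethod;
    private int viaBlock;

    // A synchronized instance method always locks on 'this'.
    synchronized void incrementViaMethod() {
        viaMethod++;
    }

    // A synchronized block can lock on any object; here, a private one.
    void incrementViaBlock() {
        synchronized (lock) {
            viaBlock++;
        }
    }

    synchronized int viaMethod() {
        return viaMethod;
    }

    int viaBlock() {
        synchronized (lock) {
            return viaBlock;
        }
    }
}

class CountersDemo {
    // Runs several threads against one Counters instance and returns both totals.
    static int[] hammer(int threads, int perThread) {
        Counters counters = new Counters();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counters.incrementViaMethod();
                    counters.incrementViaBlock();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] { counters.viaMethod(), counters.viaBlock() };
    }

    public static void main(String[] args) {
        int[] totals = hammer(4, 1000);
        System.out.println(totals[0] + " " + totals[1]);
    }
}
```

Because the two counters use different locks, a thread in incrementViaMethod() never blocks one in incrementViaBlock(). That is the extra flexibility, and also where the hairiness starts once the two pieces of state need to change together.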

Multicast multithreaded mayhem

Javagroups is cool. Multicast peer to peer messaging and remote procedure calls in a nice easy package. The documentation could be better – figuring out how to make it do stuff invariably involves a short stroll through the source. Good thing it’s open source, really.

One thing that did give me trouble was the supplied RpcDispatcher. This handy utility class makes it really easy to do RPC over multicast, upon which can be built such things as grid computing, or distributed failover. The flaw is that if you happen to make an RPC call that blocks, all subsequent RPC calls to that node will also block, and probably time out. Javagroups handles client-side timeout quite well, but the server side can potentially block indefinitely.

Fixing this meant I got to play with the deep joys of inter-thread communication and synchronisation. Even deeper joy was found when trying to test it. The only means of doing this I’ve found so far is to liberally sprinkle log statements all over the code, and watch the order they appear in, and when they stopped appearing as my code proceeded to block in all the wrong places. Maybe a log4j JUnitAppender class is required, that can assert that log calls were made with the right messages in the right order? Hmm.
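A rough sketch of what such an appender might look like follows. To keep it self-contained it uses java.util.logging from the JDK as a stand-in for log4j, so RecordingHandler and its sawInOrder method are hypothetical names for illustration, not an existing log4j or JUnit class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Records every log message so a test can assert on the order afterwards.
class RecordingHandler extends Handler {
    private final List<String> messages = new ArrayList<>();

    @Override
    public synchronized void publish(LogRecord record) {
        messages.add(record.getMessage());
    }

    @Override
    public void flush() { }

    @Override
    public void close() { }

    // True if the expected messages appeared in this order;
    // other messages may be interleaved between them.
    synchronized boolean sawInOrder(String... expected) {
        int next = 0;
        for (String message : messages) {
            if (next < expected.length && expected[next].equals(message)) {
                next++;
            }
        }
        return next == expected.length;
    }
}

class RecordingHandlerDemo {
    // Attaches a fresh RecordingHandler to the logger and silences console output.
    static RecordingHandler attachTo(Logger logger) {
        RecordingHandler handler = new RecordingHandler();
        logger.setUseParentHandlers(false);
        logger.setLevel(Level.ALL);
        logger.addHandler(handler);
        return handler;
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("rpc.demo");
        RecordingHandler handler = attachTo(log);
        log.info("request received");
        log.info("dispatching");
        log.info("reply sent");
        System.out.println(handler.sawInOrder("request received", "reply sent"));
    }
}
```

In a JUnit test the final check would become an assertion. It only proves the order in which log calls were made, which is exactly the information I was reading off the console by eye.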

O/R Mapping for fun and profit

Back on this old chestnut again…

  • Always use a synthetic primary key.
  • Never use a primary key with business meaning (in case my last point wasn’t clear).
  • Call your primary key something that can’t be mistaken for something with business meaning. ID is usually good. PK works too.
  • Don’t include the table name in any of your column names. They already know which table they belong to.
  • Use Longs as primary keys, and generate them from a sequence.
  • Use unique and ‘NOT NULL’ constraints to identify secondary keys (the ones that DO have business meaning).
  • Use Hibernate.

O/R mapping is a compromise. Bending both your DB and your object layer slightly to acknowledge this fact is far better than bending one of them over backwards to avoid changing the other. Using synthetic primary keys is almost always a good move anyway, and helps object persistence a lot. Having a consistent primary key name and datatype makes automated persistence much easier (i.e. code generation).
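To show why the consistent key helps, here is a deliberately tiny sketch. It assumes every table has a Long primary key column called ID, as recommended above; CrudSql and the table names are hypothetical, and this is not how Hibernate itself generates SQL.

```java
class CrudSql {
    // With one key name and one key type, these need nothing but the table name.
    static String selectById(String table) {
        return "SELECT * FROM " + table + " WHERE ID = ?";
    }

    static String deleteById(String table) {
        return "DELETE FROM " + table + " WHERE ID = ?";
    }

    public static void main(String[] args) {
        System.out.println(selectById("CUSTOMER"));
        System.out.println(deleteById("INVOICE"));
    }
}
```

If each table had its own key name or a composite business key, every one of these statements would need per-table knowledge, which is exactly the work a generator is supposed to save.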

XML is dead, long live… Lisp?

So I had a go at writing a Jython equivalent of an ANT script at the weekend. I reckon it should be possible, but the syntax and structure would have to change, which I wanted to minimize (cos if I can do that, it’s conceivable that I could parse ANT’s XML and generate Jython automatically, which would be cool). It might be possible to do something with Jython’s variable argument ‘*args’ syntax, but the output could look very odd.

The issue is in the nesting. How to make

<project>
  <target name="dostuff">
    <copy>
      <!-- copy stuff -->
    </copy>
    <javac>
      <!-- javac stuff -->
    </javac>
  </target>
</project>

look like Jython? The closest representation would probably look a bit like:

foo = project(
    target("dostuff",
        copy(
            srcdir="src", destdir="foo"
        ),
        javac(
            srcdir="foo", destdir="bar"
        )
    )
)

but it would get hairy with all of ANT’s optional attributes. I’m loath to admit it, but I think the closest direct language representation would be obtained by using Lisp or Scheme. Now that I know about SISC, that might be worth a try.

Alternatives to XML config files

Charles writes here about using SISC to configure Java applications at runtime. SISC is a Java-based implementation of Scheme.

I speculated (at around the same time) about using Jython for the same purpose. Not only do these approaches allow very powerful runtime configuration of your application, but you are not limited by a schema or DTD. If a method is exposed, you can call it from your script.