On the nature of time in distributed applications

Tracking the sequence of events in distributed applications is tricky, especially when queues are involved. Messages can back up at certain points, land on dead-letter queues and be requeued later, or arrive out of sequence. A side effect is that you may not be able to report accurately on all the transactions in flight at a given moment until some time after that moment.

Meanwhile, your application may have internal counters and time-based identifiers that roll over periodically (at midnight, for example). So how do you reconcile temporally decoupled events with time-sensitive counters?

One approach is to go back to first principles and use timestamps to tag messages at critical points (for example, the time they are created). Then, when they arrive at their destinations, the timestamp on the message can be reconciled to the internal counter value that was in force at that time. This way, even messages that languish on dead-letter queues and are later requeued can be slotted into the correct place when they finally arrive.
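As a rough sketch of the idea, assuming the counter rolls over at midnight UTC and that each message carries a timezone-aware creation timestamp (the names `period_for` and `slot_messages` are made up for illustration, not taken from any real system):

```python
from datetime import datetime, timezone, timedelta


def period_for(timestamp: datetime) -> str:
    """Return the id of the daily counter period in force at `timestamp`.

    Assumption: the counter rolls over at midnight UTC, so the period id
    is simply the UTC calendar date.
    """
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return timestamp.astimezone(timezone.utc).strftime("%Y%m%d")


def slot_messages(messages):
    """Group (created_at, payload) pairs by the period they were created in.

    Arrival order is ignored: messages are ordered by creation time, so a
    requeued message still lands in the period it originally belonged to.
    """
    slots = {}
    for created_at, payload in sorted(messages, key=lambda m: m[0]):
        slots.setdefault(period_for(created_at), []).append(payload)
    return slots
```

A message created at 23:59:58 that spends an hour on a dead-letter queue is still filed under the previous day's period, because only its creation timestamp is consulted.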

This approach only works if your systems agree very precisely about what time it is. Thankfully there is a protocol that exists for exactly this purpose, and it is built into both Windows and every flavour of Unix-like system: Network Time Protocol (NTP). In Windows it's sometimes controlled via the ‘NET TIME’ command or ‘w32tm.exe’ or ‘w32time.exe’, depending on your Windows version.

NTP is a client-server protocol built on hierarchical tiers of servers (NTP calls them strata), with the authoritative time servers being machines with atomic clocks or other highly accurate time sources. There are many public NTP servers out there, links to which can be found at the site mentioned below.

It's really pretty straightforward to set up an internal NTP server group, and because a consistent enterprise-wide picture of time is so useful, it should really be done as a matter of course when setting up distributed enterprise systems.

More information:

Notes to self about DB systems

Mostly for myself this one:

UPDATE and DELETE considered harmful for OLTP. Everything can be captured with INSERTion of deltas. (Ref: Journaling).

Timelines. Very useful.

Revisionism is bad. See note about UPDATE and DELETE.

Interesting parallel with source code control systems.

To recapitulate: Everything can be captured as a delta and a reason, including correction of errors. Don’t use UPDATE.
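A minimal sketch of what that might look like, using Python's built-in sqlite3 module. The `account_delta` table, its columns and the helper functions are all invented for illustration:

```python
import sqlite3


def open_ledger():
    """Create an in-memory database with a single insert-only table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account_delta ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " account TEXT NOT NULL,"
        " delta INTEGER NOT NULL,"
        " reason TEXT NOT NULL,"
        " recorded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def record(conn, account, delta, reason):
    """Every change, including a correction, is a new row with a reason."""
    conn.execute(
        "INSERT INTO account_delta (account, delta, reason) VALUES (?, ?, ?)",
        (account, delta, reason),
    )
    conn.commit()


def balance(conn, account):
    """Current state is derived by summing the deltas, never stored."""
    row = conn.execute(
        "SELECT COALESCE(SUM(delta), 0) FROM account_delta WHERE account = ?",
        (account,),
    ).fetchone()
    return row[0]


def history(conn, account):
    """The timeline: every delta and its reason, in insertion order."""
    return conn.execute(
        "SELECT delta, reason FROM account_delta WHERE account = ? ORDER BY id",
        (account,),
    ).fetchall()
```

Correcting a mistaken withdrawal of 30 means recording a delta of +30 with a reason like "correction", so the mistake and its fix both stay on the timeline.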

You’re welcome, Sun Microsystems

My first official Java bug:

From: Sun Microsystems
To: Darren Hobbs

Hi Darren Hobbs,

Thank you for using our bug submit page.

We have determined that this report is a new bug and entered the bug into our
internal bug tracking system under Bug Id: 5018137.

You can monitor this bug and look for related issues on The Java Developer
Connection Bug Database at:
http://developer.java.sun.com/developer/bugParade/bugs/5018137.html

Later… looks like it got fixed already in an internal build.

Anagrams

Spent a wet Saturday afternoon trawling internet anagram generators for humorous rearrangements of my name.

The best I could find with just my first and last name was ‘Horned Barbs’, which I didn’t like very much. Including my middle name produced ‘Washboard Brine Mill’, which is suitably surreal and far preferable.

Attention retailers

Why are your opening hours so well coordinated to ensure that those of us who work for a living to earn enough money to buy your stuff can’t do so because when you are open we are working for a living to earn enough money to buy your stuff?