Musing about Markup

With regard to Joe’s recent post about the deficiencies of XML, I have something of a counterpoint to offer. XML was invented as an attempt to unify and simplify data interchange between disparate systems. This had been attempted before, but the efforts never gained sufficient momentum to achieve general acceptance.

XML is a subset of SGML, which has been around for a number of years. SGML is also the language from which HTML is derived. SGML itself is very complex, as it includes all sorts of mechanisms for defining domain-specific dialects (such as HTML and XML). XML was released on the back of the general and massive uptake of HTML, and was similar enough to HTML to be explained as ‘HTML that computers can understand’. Part of the reason for XML’s success is the huge surge in popularity of the internet and its promise of global connectivity; part is due to its design. XML is simple and formal enough to be relatively easy to design parsers for, while being flexible enough to describe most types of data. Developers were also used to dealing with HTML-style markup. This combination of factors probably accounts for XML’s huge popularity. The biggest hurdle for any attempt to standardise on a data interchange format was always going to be garnering enough general support to make it the ‘de facto’ standard.

There is always more than one way to do things, and XML may not be the prettiest or the best, but the details of its design are probably less important than the fact that it succeeded in its goal of achieving a standard means of describing data that was easy to pass around between otherwise incompatible systems. Now that we have come to expect easy data exchange, we are free to explore improvements, but we wouldn’t be in this happy position were it not for XML.

Self referential meta blogging

Skimming over some of my old posts, I can usually spot which ones were written from home and which ones from work (it helps that I also remember writing them!). Generally speaking, the more ‘off-topic’ and emotive posts tended to be written from home. It seems that being at work causes me to put on my ‘professional’ hat, while at home I’m more likely to just bash out whatever’s on my mind at the time. Interesting.

More interesting matters

As promised…

XDoclet 1.2 beta looks good. I’ve been trying out the Castor and servlet tags today. They make it a lot easier to evolve the design when you don’t have to keep changing your mapping file. It was the work of moments to throw together a couple of beans and collection objects and dump them out to XML. Nice.

Still not sure what to do with regard to persistence. I don’t need a relational database (or the hassle), just something quick’n’easy. I toyed with the idea of just storing the raw XML in Lucene, indexed on the various fields and attributes, but I need to look into the querying side a bit more to see how easy (or even possible) it would be to construct queries like ‘select * from documents where date is between 10-OCT-2002 and 20-OCT-2002’. I have a feeling this may be difficult, and probably isn’t the best use of a search engine anyway.
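A rough sketch of the idea behind that date query, with the caveat that it does not touch Lucene at all. As far as I know, Lucene's range queries compare indexed terms as plain strings, so the usual trick is to store dates in a form such as yyyyMMdd, which sorts lexicographically in date order. Below, a `TreeMap` stands in for the sorted term index; the class name `DateRangeSketch`, its methods and the document ids are all made up for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class DateRangeSketch {
    // yyyyMMdd keys sort as strings in the same order as the dates they encode.
    private static final DateTimeFormatter KEY = DateTimeFormatter.BASIC_ISO_DATE;

    // Hypothetical stand-in for an index on a "date" field: sorted term -> document ids.
    private final TreeMap<String, List<String>> dateIndex = new TreeMap<>();

    public void add(String docId, LocalDate date) {
        dateIndex.computeIfAbsent(date.format(KEY), k -> new ArrayList<>()).add(docId);
    }

    // Inclusive range lookup: every document whose date key lies in [from, to].
    public List<String> between(LocalDate from, LocalDate to) {
        List<String> hits = new ArrayList<>();
        for (List<String> ids
                : dateIndex.subMap(from.format(KEY), true, to.format(KEY), true).values()) {
            hits.addAll(ids);
        }
        return hits;
    }

    public static void main(String[] args) {
        DateRangeSketch index = new DateRangeSketch();
        index.add("doc-a", LocalDate.of(2002, 10, 9));
        index.add("doc-b", LocalDate.of(2002, 10, 10));
        index.add("doc-c", LocalDate.of(2002, 10, 15));
        index.add("doc-d", LocalDate.of(2002, 10, 21));
        // Documents dated between 10-OCT-2002 and 20-OCT-2002, inclusive.
        System.out.println(index.between(LocalDate.of(2002, 10, 10), LocalDate.of(2002, 10, 20)));
        // prints [doc-b, doc-c]
    }
}
```

If the real thing works the way I think it does, the whole question reduces to choosing a sortable string encoding for each field worth querying by range. Whether that is a sensible use of a search engine is another matter.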

Things to check out further:

Any other suggestions out there?

Bugs

It turns out my Radio problem is a known bug. Don’t know why I got so annoyed, other than the fact that most of the time when I break something out of curiosity the only person who suffers is me, and most of the time I can dig around the source until I know enough to fix it. Or simply roll back. It’s a little hard to hide it if you break your blog.

Found a fix on Google, so this will be my last rant on that subject. Back to more interesting matters.

Why I hate proprietary formats

I’ve broken Radio. All the so-called ‘dynamic’ links have frozen pointing to ‘www.darrenhobbs.com’ after I played around with the upstream via FTP option yesterday. I aborted that idea after finding out that turning on FTP switched off the normal upstreaming to my Radio account. Now I’ve found out that if you ever use the FTP option you apparently can’t ever go back. Thank you Userland.

This wouldn’t have annoyed me if I could go in and fix the problem, but I can’t find the reference that the macros are using, leading me to believe it’s buried somewhere in one of the mysterious .root files, which are, naturally, binary.

Now nobody new will be able to subscribe to my RSS feed until I hard code the links back to what they should be.

And to think this morning I’d just about decided to stick with Radio for the time being, to minimise the inconvenience to my RSS subscribers. Thank you for making the decision for me, Radio.

Looks like I will be moving blog software after all.

Why leave radio?

Another leaves radio land. It’s looks as if more people are leaving Radio behind. The Desktop Fishbowl has left its old Radio shanty to… [kev’s catalogue of this and that.]

I’m not sure what the incentive is (even though I’m now feeling it myself) to leave Radio. Maybe it’s developer masochism, that ‘not invented here’ feeling that because I didn’t write it / it’s not in my pet language, it must be bad for me. Maybe we feel vaguely guilty about using a ‘consumer’ product. We’re geeks, we’re supposed to use two or three arcane command line utilities and a dodgy Perl script to achieve the same thing as a normal person using a pointy-clicky GUI app. Or something.

Maybe once the blogging addiction bites, you start wanting to add your own features, which is far easier to do when you have the source in front of you, in a programming language you’re comfortable with.

Going live

Woohoo, www.darrenhobbs.com is up and running. I can feel my ego expanding. Currently it’s just a static copy of my Radio blog, and there may be some to-ing and fro-ing, but now I’ve got Orion running, almost anything could happen. First plan is to put all my Lucene studies to work and get some of that full-text search action going on.

As soon as I get my head out of this door.