Backups

My hard disk started emitting a highly irritating whine today. I panicked and decided to back up my stuff. Windows XP Professional? Hardly. It took me several minutes of disbelief and some googling before I could accept that this supposedly flagship OS can’t back up to CD-R. Oh no, you have to back up to a file, then burn it onto the CD manually. Not great if the reason for the backup is that you’re afraid your HD is dodgy.

On a related note, there seems to be a dearth of affordable consumer backup products. Most tape drives appear to be aimed at the server market, and seem to cost as much as two hard disks. It seems I might as well buy the second disk and a RAID controller card (in my case it’s built into the motherboard already) and use RAID 1 mirroring instead. That way you don’t have to remember to do backups.

It’s all metadata

It seems there are many more readers of blogs than there are writers and everyone has email these days but relatively few people use RSS aggregators; so having a bridge to email so folks can keep in touch with RSS feeds makes lots of sense.

Then if we have an RSS to email gateway we can run ZOË off it to search & sort blogs.

I guess users might want different kinds of email feed;

  • 1 email for all Java blogs per day
  • 1 email per day per feed that they are interested in (so that’s like each RSS feed being its own email list with daily digest)
  • 1 email per hour of what’s new
  • 1 email per change per RSS feed

Still from a basic RSS to email feed it should be pretty easy to do.

[James Strachan’s Radio Weblog]

I have several email addresses, most of which end up in different folders of the same mail program. Something like ZOË should mean that I can forget about that sort of detail upfront, and simply slice my data dynamically. The same applies to all the other attributes: Read/Unread, From, To, Subject etc. They’re all sources of metadata that I can use to organise my messages. Being able to index message bodies is very cool, as would be being able to add my own keywords for easy retrieval later. Queries like ‘find me all the emails from James about Jelly that I have tagged as interesting’ should then be possible. Email folders would (hopefully) become redundant, or at least be generated on the fly in response to ad-hoc queries.
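To make that kind of query concrete, here is a minimal sketch in plain Python. It is not how ZOË works internally; the Message class, the query function and the sample inbox are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """A mail message reduced to the metadata discussed above (hypothetical model)."""
    sender: str
    subject: str
    body: str
    read: bool = False
    tags: set = field(default_factory=set)  # my own keywords


def query(messages, sender=None, text=None, tag=None):
    """Return the messages that match every criterion that was supplied."""
    results = []
    for m in messages:
        if sender is not None and sender.lower() not in m.sender.lower():
            continue
        if text is not None:
            haystack = (m.subject + " " + m.body).lower()
            if text.lower() not in haystack:
                continue
        if tag is not None and tag not in m.tags:
            continue
        results.append(m)
    return results


# Invented sample data, purely to exercise the query.
inbox = [
    Message("James", "Jelly tag libraries", "Notes on a new Jelly release", tags={"interesting"}),
    Message("James", "Lunch?", "Thursday works for me"),
    Message("Jon", "Jelly and Ant", "Comparing Jelly scripts with Ant", tags={"interesting"}),
]

# 'Find me all the emails from James about Jelly that I have tagged as interesting'
hits = query(inbox, sender="James", text="jelly", tag="interesting")
print([m.subject for m in hits])
```

A "folder" in this picture is just a saved set of arguments to query, which is why fixed folders start to look redundant.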

It’s all about context. Data without a context is data without meaning.

Googling your email

Googling your email. [Jon’s Radio]

Such a simple, but great idea. Funny how things happen together – here’s me playing around with Lucene, and Jon posts about how ZOË hooks a search engine into email. Does ZOË use Lucene perchance? It does!

Information now comes at us through so many channels that traditional structured storage (email folders, filtering rules etc.) is feeling the strain. I have so many mail filtering rules that apparently non-deterministic effects are starting to appear. Collating all my inputs such as email, RSS feeds etc. into a giant pot and slicing it every which way with a search engine sounds like a superb idea. No longer caring about where my data is, but simply what it’s about.

Like Lucid Lookups? Lucene!

Finally got around to actually doing something with Lucene. Now that I have, I wish I’d done it sooner. It’s almost intuitively easy to use once you get the terminology. Almost everything I tried ‘just worked’. Pure Java full-text indexing, and just about every piece is extensible. Very nice.
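For anyone who hasn’t met the terminology, the core idea of full-text indexing can be sketched in a few lines of plain Python. This is a toy inverted index, not Lucene’s API: the TinyIndex class and its methods are invented for illustration, and the analyze step is only a rough stand-in for what Lucene calls an analyzer.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: maps each term to the ids of the documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.docs = {}                    # document id -> original text

    @staticmethod
    def analyze(text):
        # Lower-case the text and split it on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id, text):
        """Store the document and record it under each of its terms."""
        self.docs[doc_id] = text
        for term in self.analyze(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return the sorted ids of documents containing every term in the query."""
        terms = self.analyze(query)
        if not terms:
            return []
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return sorted(result)


index = TinyIndex()
index.add(1, "Googling your email with a search engine")
index.add(2, "RSS to email gateway")
index.add(3, "Querying XML with XPath")

print(index.search("email"))          # [1, 2]
print(index.search("email gateway"))  # [2]
```

The real thing adds fields, scoring and a proper query syntax on top, but the term-to-documents lookup is the part that makes searching fast.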

I actually started off looking at Xindice for querying XML with XPath, which looks promising, but for my purposes the flexibility (and simplicity) of Lucene’s indexing might be the better option.

Hopefully I’ll get some time to look further into Xindice later in the week.

XML Xtravaganza

Today has been a day of much messing with markup. Put my Safari account to good use delving through the O’Reilly books on Java and XML data binding, XML Schema and (just out of curiosity) Java NIO.

Digester is a very handy utility for quickly populating Java objects from XML. Brilliant for config files. I’m experimenting with using it to put together a quick declarative framework for turning CSV files into XML. I define my CSV-to-XML mappings as XML and use Digester to build the mapping objects. Once the data has been turned into XML, I should be able to use Castor to build my business objects. And all without having to go near a parser.
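The shape of the declarative mapping idea can be sketched with nothing but the Python standard library. This is an illustration rather than the Digester-based framework itself: the mapping format (the mapping and column elements and their attributes) and the function names are assumptions invented for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping document: which CSV column becomes which XML element.
MAPPING_XML = """
<mapping root="customers" record="customer">
  <column index="0" element="name"/>
  <column index="1" element="email"/>
</mapping>
"""


def load_mapping(xml_text):
    """Parse the mapping into (root name, record name, [(column index, element name)])."""
    m = ET.fromstring(xml_text)
    columns = [(int(c.get("index")), c.get("element")) for c in m.findall("column")]
    return m.get("root"), m.get("record"), columns


def csv_to_xml(csv_text, mapping_xml):
    """Turn each CSV row into one record element, as described by the mapping."""
    root_name, record_name, columns = load_mapping(mapping_xml)
    root = ET.Element(root_name)
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        record = ET.SubElement(root, record_name)
        for index, element in columns:
            child = ET.SubElement(record, element)
            child.text = row[index] if index < len(row) else ""
    return ET.tostring(root, encoding="unicode")


sample = "Alice,alice@example.com\nBob,bob@example.com\n"
print(csv_to_xml(sample, MAPPING_XML))
```

Changing the output format then means editing the mapping document rather than the code, which is the whole attraction of doing it declaratively.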

CVS Tools

CVS Tools. I find the following combination of tools work very well together:

  • CVS (duh) on the server.
  • TortoiseCVS integrated Windows client.
  • Pageant for SSH key management and passwordless connection.
  • ViewCVS for a web browsable view of the repository.
  • Syncmail for doing automated diffs-by-email on commits.

[Pushing the envelope]

I have a similar list, with some additional items:

  • CVS command line client (because Tortoise makes 98% of your daily activities easy – the cmd line is still required for that last 2%).
  • WinMerge as a slightly more user friendly diff tool.
  • CVSQuery to enhance ViewCVS slightly by allowing quick searches/reports (e.g. show me all changes in the last 2 weeks for a specific module).
  • Nice Tortoise icon sets (I like Timo) – let’s face it, the default ones suck 🙂

[Joe’s Jelly]

Doh! Forgot to mention WinMerge. Yes I use it as well 🙂

CVS Stuff

CVS Tools. Been doing my technology evangelist / mentor bit at work this week, trying to encourage migration away from SourceSafe onto a half-decent SCC system. Not having cross-platform access to our source is hampering us in so many ways. We have no budget, so I’ve been focussing on ways to make CVS as attractive as possible. Shamelessly stealing ideas from SourceForge, I find the following combination of tools work very well together:

  • CVS (duh) on the server.
  • TortoiseCVS integrated Windows client.
  • Pageant for SSH key management and passwordless connection.
  • ViewCVS for a web browsable view of the repository.
  • Syncmail for doing automated diffs-by-email on commits.

[Pushing the envelope]

Nice collection of CVS links. I wasn’t aware the TortoiseCVS did ssh CVS auth. Nothing on the website. Does it actually work? It would be HellaCool if it did 🙂 [Brett Morgan’s Insanity Weblog Zilla]

It works fine. Just download the latest version of Pageant, set up your SSH keypair on your CVS server, load your private key into Pageant and use the :ext: protocol within Tortoise to access your repository. Tortoise uses a modified version of Plink (part of the PuTTY suite) so it all just works.

Subversion

Source Code Control.

My main complaint with CVS is that it doesn’t support file renames (!!) without some kind of wacky hacks if you want to preserve the revision history of a renamed file. I find rename support to be essential when refactoring. I’m keeping a close eye on Subversion (still in alpha) which aims to address the shortcomings of CVS.[Otiose Cognitions]

I want Subversion to succeed. The trouble with something like source control is that it has to be stable and reliable first and foremost – that’s its job. So overcoming the initial ‘new (and potentially buggy) stuff’ hurdle is that much harder. Attaining critical mass will be, erm, critical.