
Xapian and Lucene - Kudos to Xapian from YouSport.com

Posted on Mon, 26 Jan 2009 05:53:00 GMT

A few years ago I picked Xapian after evaluating a number of solutions. More recently, the surge in Lucene's popularity made me curious to learn about it. I needed to rip and replace MySQL full-text search because of scaling issues, so I decided to check out CLucene. I quickly found that its API was not as up to date as Lucene's (a fast-moving target) and that the mailing list had had only 4 posts in the last year or so. That led me to move away from CLucene. After that, I was told to check out Solr as an easy way to use Lucene without needing to write Java. I replaced MySQL with Xapian but still had Solr in the back of my mind to check out.

Recently, an email from Jonathan Drake, Senior Developer at YouSport.com, came across the xapian-discuss mailing list that said:

We were using Solr before but it was constantly causing headaches in terms of scalability and complexity. I gave Xapian a go and so far I'm blown away by how awesome it is. Its incredibly lightweight, its scaled a 100 times better and everyone involved is happier.

I'm curious to hear what scaling and complexity problems they faced, but it's good to hear a strong endorsement of Xapian from a former Solr developer. That, and a quick check of the current users page, which lists del.icio.us with over 100 million documents, seem to indicate that Xapian remains a strong contender in the search space. That being said, I work with very scalable Lucene-based solutions as well, just in Java projects.



Encoding Hashed UIDs: Base64 vs. Hex vs. Base32

Posted on Mon, 02 Oct 2006 08:08:00 GMT

I recently looked at various encodings for hashed UIDs, that is, UIDs generated by a cryptographic hash algorithm such as SHA-1 or MD5. These are often useful when the UID does not need to carry human meaning but should be uniform in character set and length. I considered Base64 and hexadecimal first because they are commonly used by crypto libraries, and then settled on Base64 and Base32 where appropriate. Base36 is actually the most compact case-insensitive encoding (using Arabic numerals and Roman letters), but it is not an option for me at the moment because there is no Perl module for it that takes arbitrary text and binary input. Math::Base36 exists but only handles numbers.
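To make the size trade-off concrete, here is a minimal sketch in Python (not the Perl I actually use) that hashes a string and encodes the raw digest three ways. The function name `encoded_uids` and the sample input are made up for illustration, and stripping the `=` padding and using the URL-safe Base64 alphabet are assumptions on my part rather than requirements.

```python
import base64
import hashlib


def encoded_uids(text, algorithm="sha1"):
    """Hash `text` and return the digest in hex, Base32 and Base64.

    `encoded_uids` is a hypothetical helper for illustration only.
    Padding ("=") is stripped because a fixed-length digest makes it
    redundant; that is a choice, not a requirement.
    """
    digest = hashlib.new(algorithm, text.encode("utf-8")).digest()
    return {
        # 4 bits per character, case insensitive
        "hex": digest.hex(),
        # 5 bits per character, case insensitive (A-Z, 2-7)
        "base32": base64.b32encode(digest).decode("ascii").rstrip("="),
        # 6 bits per character, case sensitive (URL-safe alphabet)
        "base64": base64.urlsafe_b64encode(digest).decode("ascii").rstrip("="),
    }


if __name__ == "__main__":
    # Sample input is arbitrary; any string works.
    for name, uid in encoded_uids("user@example.com").items():
        print(f"{name:7} {len(uid):2} chars  {uid}")
```

For a 20-byte SHA-1 digest this gives 40 characters in hex, 32 in Base32 and 27 in unpadded Base64. Base32 is the middle ground: shorter than hex while still safe wherever case may be folded.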
