Asynchronous PostgreSQL Candy

I've put the first (intended) use of the pg_amqp setup to the test. So far I'm very pleased. While none of the code for the usage is open source the real "magic sauce" is open and has had a few bug fixes and bits of robustness added since I last posted. I can, however, describe the use and let your imagination run while.

The parts: PostgreSQL, RabbitMQ, and a bit of Java around Lucene. Our java program actually uses jetty to expose a tiny servlet that accepts search queries and returns json results making AJAX-style searches easy and fast.

The obvious question we all face in these sort of configurations is "how do keep our search indexes up-to-date?" Usually, we relax constraints (wisely so) and determine exactly "how up-to-date does out index needs to be." If you can answer this question with a number sufficiently far from zero, you've won. In our situation, the user experience must be able to search and find the updated items in database via search immediately after insert or update (within a 100ms or so).

Enter message queueing. The first thing we do is make our Java process connect to RabbitMQ, bind a queue and consume. The messages it consumes from the queue have instructions on the precise element that has changed which then causes an query against the database retrieving all the new data to be reindexed (in our case, it is much more than can be easily witnessed from a trigger on update) and updates the Lucene indexes.

The second updated we made is to PostgreSQL by installing pg_amqp. On the tables whose changes should induce reindexing, we add a (or augment the existing) trigger to call: amqp.publish('', 'searchstuff', E'reindex\t' || NEW.rowid). Assuming we have a column called rowid, this will queue a message that looks like "reindex<tab>328432"

The neat part is that if the update is rolled back in the database, the message is never sent. Otherwise, we see the message from the our indexing and search app and viola.

I think this technique is really useful for people that are currently leveraging Lucene-based systems to provide powerful search functionality yet keep their data safe and secure in PostgreSQL.


comments powered by Disqus
Copyright © 2013 - Theo Schlossnagle - Powered by Hexo
- Ported theme GreyShade -