Monthly Archives: August 2009

B+Tree compression and buffering

We’ve been working on big performance improvements for the B+Tree. The goal is to drastically increase the amount of data that bigdata(R) can buffer in memory so we can really take advantage of those 64-bit JVMs while making search faster, reducing the in-memory footprint and GC churn, and reducing the network size of RMI payloads.

First, we have a new binary image for the B+Tree nodes and leaves with pluggable compression schemes that support search and value retrieval directly on the coded (compressed) representation. This will reduce the footprint for data on the disk, in memory, and on the wire, and should boost search performance. We will roll out with prefix compression (front-coding) for keys and with canonical Huffman coding options for keys and values. These compression schemes are all based on fastutils[1] and dsiutils[2], which are fantastic libraries.
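To make the front-coding idea concrete, here is a minimal sketch in plain Java. The class and method names are illustrative, not the bigdata(R) API: each key is coded as the length of the prefix it shares with its predecessor plus the remaining suffix, which is what lets search walk the coded record by accumulating prefixes instead of decompressing it wholesale.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of prefix compression (front-coding) over sorted keys.
// Names are illustrative, not the bigdata(R) API.
public class FrontCoding {

    // Encode each key as (sharedPrefixLength, suffix) relative to its predecessor.
    public static List<Object[]> encode(List<String> sortedKeys) {
        List<Object[]> coded = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int shared = 0;
            int max = Math.min(prev.length(), key.length());
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            coded.add(new Object[] { shared, key.substring(shared) });
            prev = key;
        }
        return coded;
    }

    // Decoding reverses the process by accumulating prefixes; a search can
    // proceed the same way directly on the coded form.
    public static List<String> decode(List<Object[]> coded) {
        List<String> keys = new ArrayList<>();
        String prev = "";
        for (Object[] entry : coded) {
            int shared = (Integer) entry[0];
            String key = prev.substring(0, shared) + (String) entry[1];
            keys.add(key);
            prev = key;
        }
        return keys;
    }
}
```

Since B+Tree keys are stored in sorted order, adjacent keys tend to share long prefixes, which is what makes this scheme effective here.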

Second, we are moving away from a per-B+Tree buffer for persistent nodes and leaves to a shared buffer using a mixture of an LRU and a hash map with weak reference values. Combined with the smaller in-memory footprint for nodes and leaves, this will allow bigdata(R) to buffer vastly more data in RAM.
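A rough sketch of that combination, under the stated design (bounded LRU holding strong references to hot records, backed by a map of weak references so anything still reachable stays findable and the GC can reclaim cold entries). The class and names below are illustrative, not the bigdata(R) implementation:

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a shared record cache: a bounded access-order LRU keeps strong
// references to hot nodes/leaves, while a weak-reference map lets records
// that fall out of the LRU remain retrievable until the GC reclaims them.
public class SharedRecordCache<K, V> {

    private final Map<K, V> lru;
    private final Map<K, WeakReference<V>> weak = new HashMap<>();

    public SharedRecordCache(final int capacity) {
        this.lru = new LinkedHashMap<K, V>(16, 0.75f, true /* access order */) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict from the LRU only; the weak map may still hold it.
                return size() > capacity;
            }
        };
    }

    public synchronized void put(K key, V value) {
        lru.put(key, value);
        weak.put(key, new WeakReference<>(value));
    }

    public synchronized V get(K key) {
        V v = lru.get(key); // touches the entry, keeping it hot
        if (v != null) return v;
        WeakReference<V> ref = weak.get(key);
        v = (ref == null) ? null : ref.get();
        if (v != null) lru.put(key, v); // re-promote on a weak hit
        return v;
    }
}
```

The point of sharing one such cache across all B+Trees is that memory is budgeted globally rather than fragmented per index.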

Finally, we are examining options for record-level compression to reduce the on-disk footprint even further. Our thinking here is to offer a pluggable scheme, with initial support for Deflate, zip, gzip and bmz[3]. We are already coding (compressing) the keys and values, but there is an additional win to be had from a fast compression pass over the entire B+Tree node or leaf. Most compression libraries are accessed via JNI and generally do better on larger records, so we may wind up increasing the default branching factor for the index segments to compensate. You should be able to choose a record-level compression provider for the Journal, another for the index segments, and have the option to specify the compression provider separately for all shards of a specific index. Different compression schemes make sense for the journal and the index segments because the journal is geared towards fast write absorption while the index segments are geared towards read-only data. Likewise, some indices may derive more benefit from specific compression schemes than others.
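For the Deflate case, a record-level pass needs nothing beyond the JDK. The sketch below shows a compress/decompress round trip over a serialized node or leaf image; a pluggable provider would hide this behind an interface so the Journal and the index segments can each choose their own scheme (the class name here is hypothetical):

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Sketch of record-level Deflate compression for a serialized B+Tree
// node/leaf image, using only java.util.zip from the JDK.
public class DeflateRecordCompressor {

    public static byte[] compress(byte[] record) {
        // BEST_SPEED fits the journal's fast-write-absorption profile;
        // a read-only index segment might prefer BEST_COMPRESSION.
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(record);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream(record.length);
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static byte[] decompress(byte[] coded) {
        Inflater inflater = new Inflater();
        inflater.setInput(coded);
        ByteArrayOutputStream out = new ByteArrayOutputStream(coded.length * 2);
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                out.write(buf, 0, inflater.inflate(buf));
            }
        } catch (DataFormatException e) {
            throw new RuntimeException("corrupt coded record", e);
        }
        inflater.end();
        return out.toByteArray();
    }
}
```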

We should have some new performance results based on these changes in a few weeks.



“Mapped” RDF data loader

We’ve just introduced a new “mapped” data loader. The previous data loader assumed that the files were located on specific hosts in a known directory. It was designed with a map/reduce task in mind, where the RDF files were being dumped into the local directory by the reduce task.

With the mapped data loader you specify a scanner which is executed by the master. The default scanner knows how to identify files to be processed in the file system, but it would be easy enough to write a scanner that consumed an HDFS block-structured file whose contents were RDF data. This is more like the “map” stage of a map/reduce job, which is why we call it the “mapped” data loader. Regardless of how the scanner identifies the resources to be processed, the master “maps” those resources across client tasks running on the cluster.
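The shape of that pattern can be sketched as follows. The interface, the default file-system scanner, and the round-robin assignment are all illustrative assumptions, not the bigdata(R) API; an HDFS-backed scanner would simply be another implementation of the same interface.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Sketch of the "mapped" loader pattern: a pluggable scanner enumerates
// resources and the master maps them across client tasks on the cluster.
public class MappedLoaderSketch {

    /** Identifies the resources to be processed (files, HDFS blocks, ...). */
    interface ResourceScanner {
        List<String> scan();
    }

    /** Default-style scanner: recursively lists RDF files under a directory. */
    static class FileSystemScanner implements ResourceScanner {
        private final File dir;
        FileSystemScanner(File dir) { this.dir = dir; }
        public List<String> scan() {
            List<String> found = new ArrayList<>();
            collect(dir, found);
            return found;
        }
        private void collect(File f, List<String> found) {
            if (f.isDirectory()) {
                File[] children = f.listFiles();
                if (children != null)
                    for (File c : children) collect(c, found);
            } else if (f.getName().endsWith(".rdf")) {
                found.add(f.getPath());
            }
        }
    }

    /** The master "maps" the scanned resources across nClients client tasks. */
    static List<List<String>> map(ResourceScanner scanner, int nClients) {
        List<List<String>> perClient = new ArrayList<>();
        for (int i = 0; i < nClients; i++) perClient.add(new ArrayList<>());
        int i = 0;
        for (String resource : scanner.scan()) {
            perClient.get(i++ % nClients).add(resource);
        }
        return perClient;
    }
}
```

Because only the scanner knows where the data lives, swapping the source (local directory, HDFS, or something else) never touches the mapping or the client tasks.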

Let us know if you are interested in loading data from HDFS or a Hadoop map/reduce job into a massive distributed semantic web graph and we can help you work through the integration glue.