Runtime query optimization, scale-out quads query, 100% native SPARQL, and the RWStore.

This is a list of some things we’ve been up to lately:

RWStore. This is a new persistence engine for single-machine databases. It uses the same interfaces as the existing WORM Journal, but provides sustained high throughput for billions of triples. For example, it loads 2B triples of UniProt data at 30,000 triples per second. The RWStore also has a much smaller on-disk footprint and makes it possible to use much larger branching factors on the B+Tree indices, which translates into great query performance as well.
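To put those numbers in perspective, here is a rough back-of-the-envelope sketch. The load rate and triple count come from the figures above; the branching factors 32 and 1024 are purely illustrative assumptions (not RWStore or WORM defaults), and the height estimate idealizes every B+Tree node as completely full.

```python
def load_hours(triples, triples_per_second):
    """Hours needed to load `triples` at a sustained rate."""
    return triples / triples_per_second / 3600


def btree_levels(entries, branching_factor):
    """Smallest number of levels whose fan-out covers `entries`,
    assuming every node is completely full (an idealization)."""
    levels, capacity = 1, branching_factor
    while capacity < entries:
        capacity *= branching_factor
        levels += 1
    return levels


# 2B triples at 30,000 triples per second.
print(round(load_hours(2_000_000_000, 30_000), 1), "hours")

# Hypothetical branching factors, chosen only to show the trend.
for m in (32, 1024):
    print("branching factor", m, "->", btree_levels(2_000_000_000, m), "levels")
```

Under these assumptions the load takes about 18.5 hours, and the larger branching factor cuts the index from 7 levels to 4, which means fewer node reads per lookup.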

New query engine. We have developed a new query engine which offers increased flexibility and extensibility. It is based on a bigdata operator model (bops). The engine supports pipelined, vectored evaluation for triples and quads, both on a single machine and on a cluster (previously we did not support quads in scale-out), and provides the basis for several new query features (analytic query, runtime query optimization, etc.).
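For readers unfamiliar with the idea, pipelined vectored evaluation can be sketched roughly as follows: each operator consumes chunks ("vectors") of intermediate solutions and emits chunks to the next operator, so downstream joins can start before upstream ones finish. This is a toy illustration in Python, not the bops API; every name here (`pipeline_join`, `run_plan`, the `?x` variable convention) is a hypothetical stand-in, and a linear scan over a list replaces the real index lookups.

```python
from itertools import islice


def chunked(iterable, size):
    """Group an iterable into lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, else None."""
    result = dict(binding)
    for p, v in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return result


def pipeline_join(chunks, pattern, triples, chunk_size=2):
    """One operator: lazily join incoming chunks against `pattern`."""
    def solutions():
        for chunk in chunks:          # one vector of bindings at a time
            for binding in chunk:
                for t in triples:     # stand-in for an index scan
                    extended = match(pattern, t, binding)
                    if extended is not None:
                        yield extended
    return chunked(solutions(), chunk_size)


def run_plan(plan, triples):
    """Chain one operator per pattern and drain the pipeline."""
    chunks = iter([[{}]])             # a single empty starting solution
    for pattern in plan:
        chunks = pipeline_join(chunks, pattern, triples)
    return [b for chunk in chunks for b in chunk]


triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "acme"),
]
print(run_plan([("?x", "knows", "?y"), ("?y", "worksAt", "?z")], triples))
```

Because every stage is a generator, nothing is computed until the final consumer pulls a chunk, and no stage ever materializes its full output.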

Runtime query optimizer. We have implemented a new query optimizer which uses sampling to identify correlations in the data and rapidly converges on the join paths which are optimal for a specific query and data set.
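The general idea of sampling-driven join ordering can be sketched as below. This is a deliberately simplified greedy illustration, not the bigdata optimizer: it takes a small sample of the current partial solutions, probes each remaining pattern with that sample, and picks the pattern that produces the fewest results per sampled input. The function names, the `sample_size` parameter, and the "first few results" sampling (used here instead of random sampling so the run is reproducible) are all assumptions made for the example.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, else None."""
    result = dict(binding)
    for p, v in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return result


def join(bindings, pattern, triples):
    """Nested-loop join of `bindings` with the matches of `pattern`."""
    out = []
    for b in bindings:
        for t in triples:
            extended = match(pattern, t, b)
            if extended is not None:
                out.append(extended)
    return out


def greedy_join_order(patterns, triples, sample_size=2):
    """Order patterns by sampled join cost (a toy heuristic)."""
    remaining = list(patterns)
    order = []
    sample = [{}]                     # start from one empty solution
    while remaining:
        if not sample:                # nothing left to probe with
            order.extend(remaining)
            break
        # Estimated fan-out: results produced per sampled input.
        probes = {p: join(sample, p, triples) for p in remaining}
        best = min(remaining, key=lambda p: len(probes[p]) / len(sample))
        order.append(best)
        remaining.remove(best)
        sample = probes[best][:sample_size]
    return order


triples = [(f"p{i}", "type", "Person") for i in range(10)] + [
    ("p0", "knows", "p1"),
    ("p0", "knows", "p2"),
    ("p1", "knows", "p2"),
    ("p3", "knows", "p4"),
    ("p2", "worksAt", "acme"),
]
query = [
    ("?x", "type", "Person"),
    ("?x", "knows", "?y"),
    ("?y", "worksAt", "acme"),
]
for pattern in greedy_join_order(query, triples):
    print(pattern)
```

On this toy data the order comes out as `worksAt`, then `knows`, then `type`: the most selective pattern runs first, and the `type` pattern is deferred because probing it early would not constrain anything and would fan out to all ten people.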

We will be bringing the RWStore and the new query engine into the trunk shortly. We have a bit more work to do to integrate the runtime query optimizer fully.

Here are some things which we are looking at right now:

Analytic query operators for scale-out using multi-block IO against the index segments. This should provide terrific throughput rates for high volume queries. It is being designed to work in tandem with the runtime query optimizer, since having the right join plan becomes even more important as the data scale increases.

100% native evaluation for SPARQL. Right now we delegate some query patterns to Sesame, which can be between one and three orders of magnitude slower than native bigdata evaluation. We plan to close those gaps in query performance and add several preview features for SPARQL 1.1.

Embedding a lightweight Prolog engine implemented in Java. We have several plans for this.

Dramatically simplified install.

