Monthly Archives: September 2011

Bigdata 1.0.2 Release

This is a minor version release of bigdata(R). Bigdata is a horizontally-scaled, open-source architecture for indexed data with an emphasis on RDF, capable of loading 1B triples in under one hour on a 15-node cluster. Bigdata operates in both a single-machine mode (Journal) and a cluster mode (Federation). The Journal provides fast, scalable ACID indexed storage for very large data sets, up to 50 billion triples/quads. The Federation provides fast, scalable, shard-wise parallel indexed storage using dynamic sharding, shard-wise ACID updates, and incremental cluster size growth. Both platforms support fully concurrent readers with snapshot isolation.

Distributed processing offers greater throughput but does not reduce query or update latency. Choose the Journal when the anticipated scale and throughput requirements permit. Choose the Federation when the administrative and machine overhead of operating a cluster is an acceptable trade-off for essentially unlimited data scaling and throughput.

See [1,2,8] for instructions on installing bigdata(R), [4] for the javadoc, and [3,5,6] for news, questions, and the latest developments. For more information about SYSTAP, LLC and bigdata, see [7].

Starting with the 1.0.0 release, we offer a WAR artifact [8] for easy installation of the single-machine RDF database. For custom development and cluster installations, we recommend checking out the code from SVN using the tag for this release. The code builds automatically under Eclipse, or you can build it using the ant script. The cluster installer requires the ant script.

You can download the WAR from:

You can checkout this release from:

Feature summary:

– Single machine data storage to ~50B triples/quads (RWStore);
– Clustered data storage is essentially unlimited;
– Simple embedded and/or webapp deployment (NanoSparqlServer);
– Triples, quads, or triples with provenance (SIDs);
– 100% native SPARQL 1.0 evaluation with lots of query optimizations;
– Fast RDFS+ inference and truth maintenance;
– Fast statement level provenance mode (SIDs).
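As a concrete example of the NanoSparqlServer deployment, SPARQL queries are issued over plain HTTP. The sketch below only builds a SPARQL protocol GET request URL rather than sending it; the endpoint path (`/bigdata/sparql` on localhost:8080) is an assumption for a local WAR installation, so check your deployment for the actual URL.

```python
from urllib.parse import urlencode

# Assumed endpoint for a local NanoSparqlServer WAR deployment;
# adjust host, port, and context path for your installation.
ENDPOINT = "http://localhost:8080/bigdata/sparql"

query = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"

# SPARQL protocol GET request: the query text goes in the 'query' parameter.
url = ENDPOINT + "?" + urlencode({"query": query})
print(url)

# To actually execute it (requires a running server):
#   from urllib.request import urlopen
#   print(urlopen(url).read().decode())
```

Any SPARQL protocol client (curl, a Sesame HTTP repository, etc.) can speak to the same endpoint.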

The road map [3] for the next releases includes:

– High-volume analytic query and SPARQL 1.1 query, including aggregations;
– Simplified deployment, configuration, and administration for clusters; and
– High availability for the journal and the cluster.

Change log:


– Query time expansion of (foo rdf:type rdfs:Resource) drags in SPORelation for scale-out.
– Scale-out LUBM “how to” in wiki and build.xml are out of date.
– Query not terminated by error.
– NamedGraph pattern fails to bind graph variable if only one binding exists.
– IRunningQuery not closed promptly.
– DataLoader fails to load resources available from the classpath.
– Support for the streaming of bigdata IBindingSets into a SPARQL query.
– ClosedByInterruptException during heavy query mix.
– NotSerializableException for SPOAccessPath.
– Change dependencies to Apache River 2.2.0.


– Unicode clean schema names in the sparse row store.
– TermIdEncoder should use more bits for scale-out.
– OSX requires specialized performance counter collection classes.
– BigdataValueFactory.asValue() must return a new instance when DummyIV is used.
– TermIdEncoder limits Journal to 2B distinct RDF Values per triple/quad store instance.
– SPO not Serializable exception in SIDs mode (scale-out).
– ClassCastException when querying with binding-values that are not known to the database.
– UnsupportedOperatorException for some SPARQL queries.
– Query failure when comparing with a non-materialized value.
– RWStore reports “FixedAllocator returning null address, with freeBits”.
– NamedGraph pattern fails to bind graph variable if only one binding exists.
– log4j – slf4j bridge.

Note: Some of these bug fixes in the 1.0.1 release require data migration.
For details, see

For more information about bigdata, please see the following links:


About bigdata:



The Bigdata Mini Cluster

Over the weekend I set up our new “mini” cluster. This is a bit of a play on words: the cluster is actually (8) 2011 Mac Minis. The Mini has one great advantage, especially when you drop an SSD into it. It is quiet, and therefore suitable for running a bunch of them in the same room. The same cannot be said for server-grade hardware. To compensate for the relatively “light” resources in the Mini, we had them outfitted with 8G of RAM and purchased a 256G SSD to be installed in each one. The SSD should nicely compensate for the limited RAM, since one of the main uses of RAM is to buffer the disk. We went back and forth on which Mini to get, but finally settled on:

2.7GHz Dual-Core Intel Core i7 (4 hardware threads).
8GB 1333MHz DDR3 SDRAM – 2x4GB
AMD Radeon HD 6630M graphics processor (480 stream processors and 256MB of GDDR5 memory; integrated with the 2.7GHz Mini)

The quad-core “server” Mini was an interesting alternative, but it has the same 1.5MB cache per core as the non-server version, and its cores are significantly slower. The clincher for us was actually the AMD GPU in the 2.7GHz Mini. We plan to try out some interesting parallel acceleration concepts on the GPUs in the cluster. Like the Mini itself, this is a lightweight GPU, but it should be sufficient to test out some new ideas.

This totals out to:

8 x 4 = 32 hardware threads @ 2.7GHz
8 x 8G RAM = 64G RAM
8 x 480 stream processors = 3840 GPU cores (OpenCL 1.1)
8 x 256MB GDDR5 memory = 2GB GPU RAM
8 x 256G SSD = 2TB SSD
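As a quick sanity check, the cluster totals are straightforward products of the per-node specs from the build list above:

```python
# Per-node specs for the (8) Mac Minis (from the build list above).
nodes = 8
threads_per_node = 4        # hardware threads on the dual-core i7
ram_gb_per_node = 8
gpu_cores_per_node = 480    # Radeon HD 6630M stream processors
gpu_ram_mb_per_node = 256
ssd_gb_per_node = 256

totals = {
    "threads": nodes * threads_per_node,
    "ram_gb": nodes * ram_gb_per_node,
    "gpu_cores": nodes * gpu_cores_per_node,
    "gpu_ram_gb": nodes * gpu_ram_mb_per_node / 1024,
    "ssd_tb": nodes * ssd_gb_per_node / 1024,
}
print(totals)
```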

This works out to a modest cluster, but it has some interesting aspects with the relatively fast SSD and the capabilities of per-node GPU computing. The full installation procedure for the mini cluster is on the wiki.

One of the things that we want to explore with this cluster is a hybrid shared-nothing / shared-disk architecture suitable for cloud deployments of bigdata. I’ve written about this idea elsewhere, but the main concept is to maintain a compute/storage divide. The compute nodes will use local disk to cache the shards that they are actively using. The storage layer will provide the long-term durability guarantee for the read-only journal and index segment files which make up the shards.

The great advantage of this approach is that you can ramp up and shut down the compute nodes at will, since the durable data is all on the storage layer. This also simplifies the HA design, since we can leave most of that to the storage layer. Each time a data service goes through a synchronous overflow, we will copy the old journal to the storage layer. Each time we build an index segment, we will copy it to the storage layer as well. The main HA concern for bigdata is reduced to the write replication chain for the mutable journals, basically making sure that the system is durable if a data service quorum member is lost before the next synchronous overflow event. During normal service shutdown, we just do a synchronous overflow and ALL persistent state is on the storage layer. If you were on EC2, you could just shut off the compute nodes without losing any data. This approach works just fine with a SAN, NAS, or a parallel file system as well.

The OpenStack project is providing an open-source version of the EC2/S3 environment. We will probably start by dropping the 5400 RPM HDDs from the Minis into some HP MicroServers and running the OpenStack Swift object store on the MicroServers. There are several development steps to get us to this hybrid architecture, but they will all add significant capabilities to the platform (including exabyte scale).

We will be using this cluster to shake down our 1.1 release. I’ll post some performance numbers soon.


Security Models for RDF

We do not specify a single security model. Our position is that handling security for the semantic web depends on the information architecture of the application (how it models things using RDF), the choice of the data model (triples, triples + statement level provenance, or quads), and the access control model to be imposed (e.g., user/group versus roles).

Many applications, especially those which use the triple store as a schema-fluid database for a web application, can explicitly model security in their information architecture and use a triples-mode deployment, which has half the number of statement indices of a quads-mode deployment. Mike has just finished a security design along these lines for a customer.
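To make the index-count claim concrete, here is a small illustrative sketch. The triples-mode orderings (SPO, POS, OSP) are the usual covering set for triple patterns; the six quads-mode orderings shown are representative access-path orderings for covering all quad patterns, not necessarily bigdata's exact internal set:

```python
# Triples mode: three covering index orderings are enough to answer
# any triple pattern with a prefix scan on one of the indices.
triples_indices = ["SPO", "POS", "OSP"]

# Quads mode: covering all quad patterns takes six orderings, i.e.,
# twice the index footprint (storage, write amplification, maintenance).
quads_indices = ["SPOC", "POCS", "OCSP", "CSPO", "PCSO", "SOPC"]

print(len(triples_indices), len(quads_indices))
```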

If it makes sense to collect statements into named graphs and associate permissions with those named graphs, then that is a different architecture and you would use the quads-mode of bigdata. This is efficient IF you can group statements together into named graphs. However, if your named graphs tend to have only one or two statements each then you are much better off with the statement level provenance approach.

bigdata has a data model specifically for this statement level provenance. It associates a unique statement identifier with each (s,p,o) triple. The statement identifier looks like a blank node and can be used in SPARQL queries and interchanged as RDF/XML. This is a great choice if you need statement level provenance and/or security. You can read more about statement level provenance support here [1].
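To illustrate the idea (this is a generic sketch of statement identifiers, not bigdata's internal representation), each asserted (s,p,o) triple receives an identifier, and provenance or access-control statements are then made about that identifier:

```python
import itertools

# Each asserted triple gets a fresh statement identifier (SID),
# rendered here like a blank node label.
_sid_counter = itertools.count(1)
sids = {}

def assert_triple(s, p, o):
    """Assert a triple and return its SID (stable across re-assertion)."""
    if (s, p, o) not in sids:
        sids[(s, p, o)] = f"_:sid{next(_sid_counter)}"
    return sids[(s, p, o)]

# Provenance/security statements use the SID as their subject.
provenance = []
sid = assert_triple("ex:Mike", "ex:worksFor", "ex:SYSTAP")
provenance.append((sid, "dc:source", "ex:companyDirectory"))
provenance.append((sid, "ex:accessGroup", "ex:staffOnly"))

print(sid, provenance)
```

The key point is that per-statement metadata costs one identifier per triple rather than one named graph per one or two statements.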

[1] Using Statement Identifiers to Manage Provenance