Blazegraph 1.5.2 to Support Hybrid Search using External Solr Indices

While graph databases are a perfect fit for storing and querying structured data, they are not primarily designed to deal efficiently with unstructured data and keyword queries. Therefore, such unstructured data is often kept in dedicated systems that are laid out to tackle the specific challenges for evaluating keyword queries in an efficient way — including advanced techniques such as stemming, TF-IDF indexing, support for complex keyword search requests, scoring, etc.

Graph databases, on the other hand, are about connecting things, so in many scenarios we want to combine the capabilities of structured queries with those of queries against a fulltext index. To give just one simple example, assume we have a structured graph database with data about historical characters and a complementary keyword index over a corpus of historical texts (which may or may not be under our control). Assume we now want to combine structured queries  — asking, e.g., for persons that fall into certain categories such as epochs or countries they lived in  — with historical texts from the index that prominently feature these persons.

The upcoming Blazegraph 1.5.2 release will support such hybrid queries against external Solr fulltext search indices. The fulltext search feature has been implemented as a Blazegraph custom service: using a standard-compliant SPARQL SERVICE call with a reserved service URI, you can now easily combine structured search capabilities over the graph database with information held in an external Solr index.


Blazegraph’s hybrid search capabilities are currently used by the British Museum in the ResearchSpace project, which aims at building a collaborative environment for humanities and cultural heritage research using knowledge representation and Semantic Web technologies. In this context, Blazegraph’s hybrid search feature supports users in expressing complex search requests for cultural heritage objects. Hybrid SPARQL queries utilizing a Solr index are used to support semantic autocompletion: as the user types a keyword, hybrid queries are issued in real time to match keywords against entities in a cultural heritage knowledge graph. Depending on the current context of the search, persons, objects or places are suggested, providing a user-friendly means to disambiguate terms as the user types. If you’re going to be in San Jose for the Smart Data conference, we’re giving a tutorial on the approach.

To illustrate the new hybrid search feature by example, consider the following SPARQL query:

PREFIX rdf: <>
PREFIX rdfs: <>
PREFIX ex: <>
PREFIX fts: <>
SELECT ?person ?kwDoc ?snippet WHERE {
  ?person rdf:type ex:Artist .
  ?person ex:associatedWith ex:TheBlueRider .
  ?person ex:bornIn ex:Germany .
  ?person rdfs:label ?label .
  SERVICE <> {
    ?kwDoc fts:search ?label .
    ?kwDoc fts:endpoint "http://my.solr.server/solr/select" .
    ?kwDoc fts:params "fl=id,score,snippet" .
    ?kwDoc fts:scoreField "score" .
    ?kwDoc fts:score ?score .
    ?kwDoc fts:snippetField "snippet" .
    ?kwDoc fts:snippet ?snippet .
  }
} ORDER BY ?person ?score

Evaluating this single query will

  • first extract all artists associated with the group “The Blue Rider” that were born in Germany, then
  • take the labels of these persons as search strings and send requests against a Solr server, in order to extract a ranked list of articles for the respective persons (including text snippets where these persons are mentioned), next
  • order the results by person and relevance as requested by the ORDER BY, and finally
  • return the identified person URIs (variable ?person, from the graph database), the ID of the keyword index document (variable ?kwDoc, from the fulltext index), and the associated text snippet provided by the keyword index (variable ?snippet).

As the example illustrates, the keyword search is parameterized via a reserved “magic vocabulary”: for instance, within the SERVICE block, the object linked through fts:search identifies the search string to be submitted against the keyword index, while fts:endpoint refers to the address of the Solr server.
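For applications that generate such queries programmatically, the SERVICE block can be assembled from a handful of parameters. The following is only a minimal sketch in Java: the class and method names are made up for illustration, the reserved service URI is deliberately left to the caller (use the one given in the Blazegraph documentation), and only the fts:search, fts:endpoint and fts:params predicates from the example are emitted.

```java
final class FtsServiceClauseSketch {

    /** Quotes a Java string as a SPARQL double-quoted literal. */
    static String literal(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /**
     * Renders a SERVICE block like the one in the example query.
     * serviceUri is supplied by the caller (hypothetical parameter);
     * docVar and searchVar are SPARQL variable names without the leading '?'.
     */
    static String serviceClause(String serviceUri, String docVar, String searchVar,
                                String endpoint, String params) {
        String doc = "?" + docVar;
        StringBuilder sb = new StringBuilder();
        sb.append("SERVICE <").append(serviceUri).append("> {\n");
        sb.append("  ").append(doc).append(" fts:search ?").append(searchVar).append(" .\n");
        sb.append("  ").append(doc).append(" fts:endpoint ").append(literal(endpoint)).append(" .\n");
        sb.append("  ").append(doc).append(" fts:params ").append(literal(params)).append(" .\n");
        sb.append("}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // "urn:example:fts-service" is a placeholder, not the real reserved URI.
        System.out.println(serviceClause("urn:example:fts-service", "kwDoc", "label",
                "http://my.solr.server/solr/select", "fl=id,score,snippet"));
    }
}
```

Keeping the literal quoting in one helper means an endpoint or parameter string that contains a quote cannot break out of the generated query text.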

Of course, the hybrid search feature is not domain-dependent: no matter what data has been loaded into your database and no matter what the keyword index looks like, you can now post hybrid search queries against your data and the external index. The implementation even allows you to query multiple keyword indices within one query and, by the use of SPARQL 1.1 federation, combine this with requests against multiple SPARQL endpoints at a time. The search string can be dynamically extracted from the database (as in the example above, where we bind variable ?label through a structured query and use it as a search string) or can be a static search string. Moreover, nothing prevents you from using more complex Solr keyword search strings with boolean connectives such as AND, OR, or negation: in SPARQL, these complex search strings can easily be composed by the use of BIND in combination with string concatenation. For instance, we may modify the first part of our example as

        ?person ex:associatedWith ex:TheBlueRider .
        ?person ex:bornIn ex:Germany .
        ?person rdfs:label ?label .
        BIND(CONCAT("\"", ?label, "\" AND -\"expressionism\"") AS ?search)
        SERVICE <> {
                ?kwDoc fts:search ?search .

in order to search for keyword index documents mentioning these persons without explicitly mentioning “expressionism” (the “-” in Solr is used to express negation).
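If the search string is built on the client side rather than inside the query, the same composition can be sketched in Java. This is an illustrative helper, not part of Blazegraph or Solr: the class and method names are hypothetical, and unlike the plain CONCAT above it also escapes backslashes and double quotes so that a label containing a quote does not break the phrase.

```java
final class SolrSearchStringSketch {

    /** Wraps text in double quotes, escaping backslashes and embedded quotes. */
    static String phrase(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Mirrors the CONCAT above: match the label phrase, exclude the other phrase. */
    static String phraseWithout(String label, String excluded) {
        return phrase(label) + " AND -" + phrase(excluded);
    }

    public static void main(String[] args) {
        // "Franz Marc" is just an illustrative label value.
        System.out.println(phraseWithout("Franz Marc", "expressionism"));
    }
}
```

For the label Franz Marc this yields the search string "Franz Marc" AND -"expressionism", which is what the BIND expression produces for the same label.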

If you want to learn more about Blazegraph’s upcoming Solr index support, please check out the documentation in our Wiki.

We’d love to hear from you.

Do you have a cool new application using Blazegraph or are interested in understanding how to make Blazegraph work best for your application?    Get in touch or send us an email at blazegraph at


Blazegraph 1.5.2 Preview and Bloor Group Briefing Room

Blazegraph 1.5.2 Release Preview

We’re in the final stages of the Blazegraph 1.5.2 release. This is a major release for Blazegraph and brings exciting new features as well as important performance and query optimizations. You can see the full set of tickets for the release here (it’s our first one since we’ve migrated to JIRA).

Check it out when the release goes final (we’ll send a note to the mailing list), but here are some of the new features and improvements:

  • External Solr Search Integration: Use the SPARQL SERVICE keyword to search content stored externally in Solr.
  • Substantial refactoring of the Query Plan Generator in close collaboration with our partner, Metaphacts.  This both fixes a number of bugs in the query optimizer and improves performance.
  • Fixes and improvements for embedded and remote clients including Blueprints and Rexster.
  • Online Backup for the non-HA Blazegraph Server.

Did you see Blazegraph featured in the Bloor Group Briefing room?

Blazegraph was recently featured in the Bloor Group Briefing room. Check out the video presentation below, if you missed it.


Migration of issue tracker (trac => jira)

All, we have finally completed the migration from trac to jira. Please visit and use the new jira instance for Blazegraph.

Many thanks to Brad for making this happen!

Note: trac will remain online in a read-only mode. A cross-walk of trac to jira tickets is available.

All trac tickets have been updated with a comment containing the link to the corresponding JIRA ticket.



What pairs well with Veuve Clicquot and Big Data for Graphs?

Have you ever wondered what to pair with Veuve Clicquot and your Big Data graph challenge? The 2015 Big Data Innovations Summit has the answer:  Blazegraph and Mapgraph.   We won the Big Data Startup Award at the 2015 Big Data Innovations Summit in San Jose.

SYSTAP wins Big Data Startup award at the 2015 Big Data Innovations conference.

Our solutions for scalable graph technologies were recognized with the award. Naturally, the champagne pairs perfectly with our Blazegraph database platform and Mapgraph technology for GPU-accelerated graph analytics. Don’t forget our Blazegraph SWAG (must be present to appreciate…) because in graphs, size matters.

SYSTAP's Brad Bebee receives the Big Data Innovation 2015 award for Big Data Startup.

Blazegraph™ is our ultra high-performance graph database supporting Blueprints and RDF/SPARQL APIs. It supports up to 50 Billion edges on a single machine and has a High Availability and Scale-out architecture. It is in production use for Fortune 500 customers such as EMC, Autodesk, and many others.  The Wikimedia Foundation recently chose Blazegraph to power the Wikidata Query Service.  Mapgraph™ is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics up to 10,000X faster than other approaches. It can traverse billions of edges in milliseconds.

Whether you need an embedded graph database, a 1 Trillion Edge Graph Database, or the ability to traverse billions of edges in milliseconds, SYSTAP’s award-winning graph solutions can meet your needs.  We’d love to hear about your success with our technology or talk to you about how we can help scale your graph challenge.  Via Twitter:  @Blazegraph or via email.



CPU, Disk, Main Memory for Graphs

Bryan and I were chatting on the train back from NYC about CPU memory bandwidth. Here’s a quick write-up of the discussion.

15 years ago, database researchers recognized that CPU memory bandwidth was the limiting factor for relational database performance. This observation was made in the context of relatively wide tables in RDBMS platforms that were heavily oriented to key-range scans on a primary key. The resulting architectures are similar to the structure of arrays (SoA) pattern used by the high performance computing community and within our Mapgraph platform.

Graphs are commonly modeled as 3-column tables. These tables intrinsically have a very narrow stride, similar to column stores for relational data. Whether organized for main memory or disk, the goal of the query planner is to generate an execution strategy that is the most efficient for a given query.

Graph stores that are organized for disk maintain multiple indices with a logarithmic access cost (e.g., a B+Tree) that allow them to jump to the page on the disk that has the relevant tuples for any given access path. Due to the relative cost of memory and disk, main memory systems (such as SPARQLcity) sometimes choose a single index and resort to full table scans when the default index is not useful for a query or access pattern. In making such decisions, main memory databases are trading off memory for selectivity. These designs can consume fewer resources, but it can be much more efficient to have a better index for a complex query plan. Thus, a disk-based database can often do as well as or better than a memory-based database if the disk-based system has a more appropriate index or family of indices.

In fact, 90%+ of the performance of a database platform comes from the query optimizer. The difference in performance between a good query plan and a bad query plan for the same database and hardware can easily be 10x, 100x, or 10000x depending on the query. A main memory system with a bad query plan can easily be beaten by a disk-based system with a good query plan. This is why we like to work closely with our customers. For example, one of our long-term customers recently upgraded from 1.2.x (under a long-term support contract) to 1.5.x and obtained a 100% performance improvement without changing a single line of their code.
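To make the index point concrete, here is a small sketch in Java of how an access path might be chosen for a triple pattern. It is an illustration under stated assumptions, not the Blazegraph query planner: it assumes a 3-column (subject, predicate, object) table, names index orders by their key order such as "SPO", and scores an index by how many leading key components the pattern binds. A score of zero means the index cannot narrow the search and a full scan is needed.

```java
final class AccessPathSketch {

    /** Number of leading key components of an index order (e.g. "POS") that the pattern binds. */
    static int boundPrefix(String order, boolean s, boolean p, boolean o) {
        int n = 0;
        for (char c : order.toCharArray()) {
            boolean bound = (c == 'S' && s) || (c == 'P' && p) || (c == 'O' && o);
            if (!bound) {
                break; // the usable key prefix ends at the first unbound component
            }
            n++;
        }
        return n;
    }

    /** Picks the available index with the longest bound key prefix (first one wins on ties). */
    static String choose(String[] available, boolean s, boolean p, boolean o) {
        String best = available[0];
        for (String order : available) {
            if (boundPrefix(order, s, p, o) > boundPrefix(best, s, p, o)) {
                best = order;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] three = {"SPO", "POS", "OSP"};
        String[] one = {"SPO"};
        // Pattern (?s, p, o): only predicate and object are bound.
        System.out.println(choose(three, false, true, true));
        System.out.println(boundPrefix(choose(one, false, true, true), false, true, true));
    }
}
```

With three indices the pattern (?s, p, o) is served by POS with a two-component key prefix. With SPO alone the bound prefix is empty, which is the full-scan case described above.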

Main memory systems become critical when queries must touch a substantial portion of the data. This is true for most graph algorithms that are not hop constrained. For example, an unconstrained breadth-first search on a scale-free graph will tend to visit all vertices in the graph during the traversal. A PageRank or connected components computation will tend to visit all vertices on each iteration and may require up to 50 iterations to converge PageRank to a satisfactory epsilon. In such cases, CPU memory architectures will spend most of the wall clock time blocked on memory fetches due to the inherent non-local access patterns during graph traversal.

Architectures such as the XMT/XMT-2 (the Urika appliance) handle this problem by using very slow cores, zero latency thread switching, a fast interconnect and hash partitioned memory allocations. The bet of the XMT architecture is that non-locality dominates, so you might as well spread all data out everywhere and hide the latency by having a large number of memory transactions in flight. We take a different approach with GPUs and achieve a 10x price/performance benefit over the XMT-2 and a 3x cost savings. This savings will increase substantially when the Pascal GPU is released in Q1 2016 due to an additional 4x gain in memory bandwidth driven by the breadth of the commodity market for GPUs. We obtain this dramatic price/performance and actual performance advantage using zero overhead context switching, fast memory, 1000s of threads to get a large number of in-flight memory transactions, and paying attention to locality. The XMT-2 is a beautiful architecture, but locality *always* matters at every level of the memory hierarchy.
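The remark about PageRank touching every vertex on every iteration can be seen in a textbook power-iteration sketch. The Java below is a plain single-threaded illustration, not Mapgraph or Blazegraph code; the adjacency-array representation, the damping factor and the L1 stopping rule are all assumptions chosen for brevity.

```java
import java.util.Arrays;

final class PageRankSketch {

    static final class Result {
        final double[] rank;
        final int iterations;
        Result(double[] rank, int iterations) {
            this.rank = rank;
            this.iterations = iterations;
        }
    }

    /** outEdges[v] lists the targets of v; stops when the L1 change drops below epsilon. */
    static Result pageRank(int[][] outEdges, double damping, double epsilon, int maxIter) {
        int n = outEdges.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);
        int iter = 0;
        while (iter < maxIter) {
            double[] next = new double[n];
            double dangling = 0.0;
            // Every vertex and every edge is visited once per iteration.
            for (int v = 0; v < n; v++) {
                if (outEdges[v].length == 0) {
                    dangling += rank[v]; // mass of sink vertices is spread uniformly
                    continue;
                }
                double share = rank[v] / outEdges[v].length;
                for (int w : outEdges[v]) {
                    next[w] += share; // scattered write: the non-local access pattern
                }
            }
            double delta = 0.0;
            for (int v = 0; v < n; v++) {
                next[v] = (1.0 - damping) / n + damping * (next[v] + dangling / n);
                delta += Math.abs(next[v] - rank[v]);
            }
            rank = next;
            iter++;
            if (delta < epsilon) {
                break;
            }
        }
        return new Result(rank, iter);
    }

    public static void main(String[] args) {
        int[][] graph = { {1, 2}, {2}, {0} };
        Result r = pageRank(graph, 0.85, 1e-6, 50);
        System.out.println(Arrays.toString(r.rank) + " after " + r.iterations + " iterations");
    }
}
```

The inner loop writes to next[w] for arbitrary w, so on a large scale-free graph almost every update is a cache miss. That is the memory-bound behavior the paragraph describes.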


Blazegraph 1.5.1 Released!

Blazegraph 1.5.1 is released! This is a major release of Blazegraph™. The official release is made into the Sourceforge Git repository. Releases after 1.4.0 will no longer be made into SVN.

The full feature matrix is here.


You can download the WAR (standalone), JAR (executable), or HA artifacts from sourceforge.

You can checkout this release from:

git clone -b BLAZEGRAPH_RELEASE_1_5_1 --single-branch git:// BLAZEGRAPH_RELEASE_1_5_1

Feature summary:

– Highly Available Replication Clusters (HAJournalServer [10])
– Single machine data storage to ~50B triples/quads (RWStore);
– Clustered data storage is essentially unlimited (BigdataFederation);
– Simple embedded and/or webapp deployment (NanoSparqlServer);
– Triples, quads, or triples with provenance (RDR/SIDs);
– Fast RDFS+ inference and truth maintenance;
– Fast 100% native SPARQL 1.1 evaluation;
– Integrated “analytic” query package;
– 100% Java memory manager leverages the JVM native heap (no GC);
– RDF Graph Mining Service (GASService) [12].
– Reification Done Right (RDR) support [11].
– RDF/SPARQL workbench.
– Blueprints API.

Road map [3]:

– Column-wise indexing;
– Runtime Query Optimizer for quads;
– New scale-out platform based on MapGraph (100x => 10000x faster)

Change log:

Note: Versions with (*) MAY require data migration. For details, see [9].

New features:
– BigdataSailFactory moved to client package (
– This release includes significant performance gains for property paths.
– Both correctness and performance gains for complex join group and optional patterns.
– Support for concurrent writers and group commit. This is a beta feature in 1.5.1 and must be explicitly enabled for the database. Group commit for HA is also working in master, but was not ready for the 1.5.1 QA and hence is not in the 1.5.1 release branch.


– Concurrent unisolated operations against multiple KBs on the same Journal
– Adding Optional removes solutions
– Query solutions are duplicated and increase by adding graph patterns
– Property path operator should output solutions incrementally
– Using a bound variable to refer to a graph
– NPE if remote http server fails to provide a Content-Type header
– problems with UNIONs + complex OPTIONAL groups
– Executable Jar should bundle the BuildInfo class
– SPARQL UPDATE should have nice error messages when namespace does not support named graphs
– NSS startup error: java.lang.IllegalArgumentException: URI is not hierarchical
– Data race in
– GPLv2 license header update with new contact information
– Add hook to override the DefaultOptimizerList
– startHAServices no longer respects environment variables
– Build version in SF GIT master is wrong
– needs updating for Blazegraph transition
– Optimized variable projection into subqueries/subgroups
– OSX vm_stat output has changed
– Concurrent modification problem with group commit
– ClocksNotSynchronizedException (HA, GROUP_COMMIT)
– GlobalRowStoreHelper can hold hard reference to GSR index (GROUP COMMIT)
– Code review on “instanceof Journal”
– BigdataSailFactory.connect()
– Isolation broken in NSS when groupCommit disabled
– GROUP_COMMIT environment variable
– SPARQL Federated Query uses too many HttpClient objects
– DELETE DATA must not allow blank nodes
– BigdataSailFactory must be moved to the client package

Full release notes are here.



Blazegraph™ Selected by Wikimedia Foundation to Power the Wikidata Query Service

Blazegraph™ has been selected by the Wikimedia Foundation to be the graph database platform for the Wikidata Query Service. Read the Wikidata announcement here. Blazegraph™ was chosen over Titan, Neo4j, Graph-X, and others by Wikimedia in their evaluation. There’s a spreadsheet link in the selection message, which has quite an interesting comparison of graph database platforms.

Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wikisource, and others.  The Wikidata Query Service is a new capability being developed to allow users to be able to query and curate the knowledge base contained in Wikidata.

We’re super-psyched to be working with Wikidata and think it will be a great thing for Wikidata and Blazegraph™.


Mapgraph™ GPU Acceleration for Blazegraph™: Launch Preview

We’re going to be formally launching our GPU-based graph analytics acceleration products, Mapgraph™ Accelerator and Mapgraph™ HPC, at the NVIDIA GTC conference in San Jose the week of 16 March.   We will also be competing as one of 12 finalists for NVIDIA’s early stage competition for a $100,000 prize.  If you’re in the area, come to GTC on Wednesday, March 18 and vote for us!

Mapgraph™ Accelerator (Beta) serves as a single-GPU graph accelerator for Blazegraph™. We believe it will provide the world’s first and best platform for building graph applications with GPU acceleration. It will bridge the gap between our Blazegraph™ database platform and GPU acceleration for graph analytics. Users of the Blazegraph™ platform will be able to leverage GPU-accelerated graph analytics via a Java Native Interface (JNI) and via predicates in SPARQL queries, similar to our current RDF GAS API, which provides Breadth-First Search (BFS), Single Source Shortest Path (SSSP), Connected Components (CC), and PageRank (PR) implementations.

Mapgraph™ HPC is a new and disruptive technology for organizations that need to process very large graphs in near-real time. It uses GPU clusters to deliver High Performance Computing (HPC) for your organization’s biggest and most time critical graph challenges.

  • Up to 10,000X Faster for graph analytics than Hadoop technologies
  • 10X Price-Performance advantage over supercomputer solutions
  • Familiar Vertex-Centric Graph Programming Model
  • Demonstrated performance of 32 billion traversed edges per second (32 GTEPS) using 64 NVIDIA K40s on scale-free graphs

We are currently enrolling Beta customers for Mapgraph™ Accelerator and Mapgraph™ HPC. Chesapeake Technologies International has already accelerated a military planning application, seeing computation times drop from minutes for a single solution to seconds for the generation of multiple scenarios. We’re doing a session on it at GTC. Contact us if you’re interested in finding out more.


Blazegraph 1.5.1 Feature Preview

Starting with 1.5.1, Blazegraph supports task-oriented concurrent writers. This support is based on the pre-existing support for task-based concurrency control in Blazegraph. Those mechanisms were previously used only in the scale-out architecture. They are now incorporated into the REST API and can even be used by embedded applications that are written to be aware of them.

This is a beta feature — make backups!

There are two primary benefits from group commit.

First, you can have multiple tenants in the same database instance and the updates for one tenant will no longer block the updates for the other tenants. Thus, one tenant can be safely running a long running update and other tenants can still enjoy low latency updates.

Second, group commit automatically combines a sequence of updates on one (or more) tenant(s) into a single commit point on the disk. This provides higher potential throughput. It also means that it is no longer as important for applications to batch their updates since group commit will automatically perform some batching.
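The batching effect can be pictured with a toy model. The Java sketch below is not how Blazegraph implements group commit: it is a simplified, in-memory illustration with made-up names, in which finished tasks hand over their write sets, wait on a future, and are all released by a single commit point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class GroupCommitSketch {
    private final List<String> pendingWrites = new ArrayList<>();
    private final List<CompletableFuture<Long>> waiters = new ArrayList<>();
    private final List<String> durable = new ArrayList<>();
    private long commitCounter = 0;

    /** A finished task joins the next commit group and gets a future for its commit point. */
    synchronized CompletableFuture<Long> join(List<String> writeSet) {
        pendingWrites.addAll(writeSet);
        CompletableFuture<Long> done = new CompletableFuture<>();
        waiters.add(done);
        return done;
    }

    /** Melds all pending write sets into one commit point and releases every waiting task. */
    synchronized long commit() {
        if (waiters.isEmpty()) {
            return commitCounter; // nothing joined since the last commit
        }
        durable.addAll(pendingWrites);
        pendingWrites.clear();
        commitCounter++;
        for (CompletableFuture<Long> waiter : waiters) {
            waiter.complete(commitCounter);
        }
        waiters.clear();
        return commitCounter;
    }

    synchronized int durableSize() {
        return durable.size();
    }

    public static void main(String[] args) {
        GroupCommitSketch db = new GroupCommitSketch();
        CompletableFuture<Long> tenantA = db.join(List.of("a1", "a2"));
        CompletableFuture<Long> tenantB = db.join(List.of("b1"));
        db.commit(); // one commit point covers both tenants' updates
        System.out.println(tenantA.join() + " " + tenantB.join() + " " + db.durableSize());
    }
}
```

Two updates from different tenants end up in the same commit point, so the cost of reaching the disk is paid once rather than once per update.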

Early adopters are encouraged to enable this using the following symbolic property. While the Journal has always supported group commit at the AbstractTask layer, we have added support for hierarchical locking and modified the REST API to use group commit when this feature is enabled. Therefore this feature is a “beta” in 1.5.1 while we work out any new kinks.

# Note: Default is false.

If you are using the REST API, then that is all you need to do. Group commit will be automatically enabled. This can even be done with an existing Journal since there are no differences in the manner in which the data are stored on the disk.

Embedded Applications and Group Commit

If you are using the internal APIs (Sail, AbstractTripleStore, stored queries, etc.) then you need to understand what is happening when group commit is enabled and make a slight change to your code.

  • When you set this property to true, you are asserting that your application will submit all tasks for evaluation to the IConcurrencyManager associated with the Journal and you are agreeing to let the database decide when it will perform a commit.
  • When you set this property to false (the default), you are asserting that your application will control when the database performs a commit. This is how embedded applications have been written historically.
  • Any mutation operations must use the following incantation. This incantation will submit a task that obtains the necessary locks and the task will then run. If the task exits normally (versus by throwing an exception) then it will join the next commit group. The Future.get() call will return either when the task fails or when its write set has been melded into a commit point.

    AbstractApiTask.submitApiTask(IIndexManager indexManager, IApiTask task).get();

    There are a few “gotchas” with the group commit support. This is because commits are decided by IApiTask completion and tasks are scheduled by the concurrency manager, lock manager, and write executor service.

  • Mutation tasks that do not complete normally MUST throw an exception!
  • Applications MUST NOT call Journal.commit(). Instead, they submit an IApiTask using AbstractApiTask.submitApiTask(). The database will meld the write set of the task into a group commit sometime after the task completes successfully.
  • Servlets exposing mutation methods MUST NOT flush the response inside of their AbstractRestApiTask. This is because ServletOutputStream.flush() is interpreted as committing the http response to the client. As soon as this is done the client is unblocked and may issue new operations under the assumption that the data has been committed. However, the ACID commit point for the task is *after* it terminates normally. Thus the servlet must flush the response only after the task is done executing and NOT within the task body. The BigdataServlet.submitApiTask() method handles this for you so your code looks like this:

    // Example of task execution from within a BigdataServlet
    try {
        submitApiTask(new MyTask(req, resp, namespace, timestamp, ...)).get();
    } catch (Throwable t) {
        launderThrowable(t, resp, ...);
    }

  • BigdataSailConnection.commit() no longer causes the database to go through a commit point. You MUST still call conn.commit(). It will still flush out the assertion buffers (for asserted and retracted statements) to the indices, which is necessary for your writes to become visible. When your task ends and the indices go through a checkpoint, it does not actually trigger a commit. Thus, in order to use group commit, you must obtain your connection from within an IApiTask, invoke conn.commit() if things are successful and otherwise throw an exception. The following template shows what this looks like.

    // Example of a concurrent writer task using group commit APIs.
    public class MyWriteTask extends AbstractApiTask {
        public Void call() throws Exception {
            BigdataSailRepositoryConnection conn = null;
            boolean success = false;
            try {
                conn = getUnisolatedConnection();
                // Apply the mutations on the connection here.
                conn.commit(); // Commit the mutation.
                success = true;
                return (Void) null;
            } finally {
                if (conn != null) {
                    // The source listing breaks off after this condition;
                    // rollback-then-close is the assumed completion.
                    if (!success)
                        conn.rollback();
                    conn.close();
                }
            }
        }
    }

    How it works.

    The group commit mechanisms are based on hierarchical locking and pre-declared locks. Tasks pre-declare their locks. The lock manager orders the lock requests to avoid deadlocks. Once a task owns its locks, it is executed by the WriteExecutorService. Logic in AbstractTask is responsible for isolating its index views, checkpointing the modified indices after the task has finished its work, and handshaking with the WriteExecutorService around group commits.

    Most tasks just need to declare the namespace on which they want to operate. This will automatically obtain a lock for all indices in that namespace. Some special kinds of tasks (those that create and destroy namespaces) must also obtain a lock on the global row store (aka the GRS). This is an internal key-value store where Blazegraph stores the namespace declarations.
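Why pre-declared locks help against deadlock can be shown with a small sketch. The Java below uses the classic rule of acquiring all declared locks in one global (sorted) order; this is a stand-in technique for illustration and may differ from what Blazegraph’s lock manager actually does. The class, its methods, and the resource names "kb1", "kb2" and "GRS" are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

final class OrderedLockSketch {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** All tasks acquire in the same sorted order, so no cycle of waiting tasks can form. */
    static List<String> acquisitionOrder(Collection<String> declared) {
        return new ArrayList<>(new TreeSet<>(declared));
    }

    /** Runs the task while holding every pre-declared lock, then releases in reverse order. */
    <T> T runWithLocks(Collection<String> declared, Supplier<T> task) {
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (String name : acquisitionOrder(declared)) {
                ReentrantLock lock = locks.computeIfAbsent(name, k -> new ReentrantLock());
                lock.lock();
                held.add(lock);
            }
            return task.get();
        } finally {
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }

    /** True if some thread currently holds the named lock. */
    boolean isLocked(String name) {
        ReentrantLock lock = locks.get(name);
        return lock != null && lock.isLocked();
    }

    public static void main(String[] args) {
        OrderedLockSketch manager = new OrderedLockSketch();
        // A task that creates a namespace declares both the namespace and the GRS.
        String result = manager.runWithLocks(Arrays.asList("kb1", "GRS"), () -> "created kb1");
        System.out.println(result);
    }
}
```

Because the full lock set is known before the task starts, two tasks that declare the same resources in different orders still acquire them in the same sequence and cannot block each other in a cycle.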


    Announcing Blazegraph Release 1.5.0

    Starting with the 1.5.0 release, SYSTAP’s graph database platform will be called Blazegraph™.   It is built on the same platform and maintains 100% binary and API compatibility with Bigdata®.   SYSTAP will be fully integrating the Blazegraph™ brand over the course of 2015 and all of the existing wiki, blog, and other related pages will be updated.

    This is a major release of Blazegraph™. This is the initial release made into the Sourceforge Git repository. Releases after 1.4.0 will no longer be made into SVN [14].

    Blazegraph™ is specifically designed to support big graphs, offering both Semantic Web (RDF/SPARQL) and Graph Database (TinkerPop, Blueprints, vertex-centric) APIs. It features robust, scalable, fault-tolerant, enterprise-class storage and query and high-availability with online backup, failover and self-healing. It is in production use with enterprises such as Autodesk, EMC, Yahoo7!, and many others. Blazegraph™ provides both embedded and standalone modes of operation.

    Blazegraph™ is a horizontally-scaled, open-source architecture for indexed data with an emphasis on RDF, capable of loading 1B triples in under one hour on a 15 node cluster. It operates in a single machine mode (Journal), a highly available replication cluster mode (HAJournalServer), and a horizontally sharded cluster mode (Federation). The Journal provides fast scalable ACID indexed storage for very large data sets, up to 50 billion triples / quads. The HAJournalServer adds replication, online backup, horizontal scaling of query, and high availability. The federation provides fast scalable shard-wise parallel indexed storage using dynamic sharding and shard-wise ACID updates and incremental cluster size growth. Both platforms support fully concurrent readers with snapshot isolation.

    Distributed processing offers greater throughput but does not reduce query or update latency.  Choose the Journal when the anticipated scale and throughput requirements permit.  Choose the HAJournalServer for high availability and linear scaling in query throughput.  Choose the BigdataFederation when the administrative and machine overhead associated with operating a cluster is an acceptable tradeoff to have essentially unlimited data scaling and throughput.

    See [1,2,8] for instructions on installing bigdata(R), [4] for the javadoc, and [3,5,6] for news, questions, and the latest developments. For more information about SYSTAP, LLC and bigdata, see [7].

    Starting with the 1.0.0 release, we offer a WAR artifact [8] for easy installation of the single machine RDF database.  For custom development and cluster installations we recommend checking out the code from SVN using the tag for this release. The code will build automatically under eclipse.  You can also build the code using the ant script.  The cluster installer requires the use of the ant script.

    Starting with the 1.3.0 release, we offer a tarball artifact [10] for easy installation of the HA replication cluster.

    Starting with the 1.5.0 release, we offer an executable jar file [13] for getting started quickly with minimal setup.

    You can download the WAR (standalone), JAR (executable), or HA artifacts from:

    You can checkout this release from:

    git clone -b BIGDATA_RELEASE_1_5_0 --single-branch git:// BIGDATA_RELEASE_1_5_0

    Feature summary:

    – Highly Available Replication Clusters (HAJournalServer [10])

    – Single machine data storage to ~50B triples/quads (RWStore);

    – Clustered data storage is essentially unlimited (BigdataFederation);

    – Simple embedded and/or webapp deployment (NanoSparqlServer);

    – Triples, quads, or triples with provenance (RDR/SIDs);

    – Fast RDFS+ inference and truth maintenance;

    – Fast 100% native SPARQL 1.1 evaluation;

    – Integrated “analytic” query package;

    – 100% Java memory manager leverages the JVM native heap (no GC);

    – RDF Graph Mining Service (GASService) [12].

    – Reification Done Right (RDR) support [11].

    – RDF/SPARQL workbench.

    – Blueprints API.


    Road map [3]:

    – Column-wise indexing;

    – Runtime Query Optimizer for quads;

    – New scale-out platform based on MapGraph (100x => 10000x faster)


    Change log:

    Note: Versions with (*) MAY require data migration. For details, see [9].


    New features:

    • Simplified deployer (Executable Jar)
    • Replaced apache client with jetty client (fixes http protocol layer errors in apache)
    • Arbitrary Length Path (ALP) Service (
    • Updated banner.
    • Updated logo.
    • new splash page.
    • Several bug fixes.
    • Several query optimizations.


    Tickets for this release:

    • Slow query with BIND
    • GRAPH ?g { FILTER NOT EXISTS { ?s ?p ?o } } not respecting ?g
    • Graph filter works on different graph that selected one
    • COUNT(DISTINCT) returns no rows rather than ZERO.
    • GRAPH ignored by FILTER NOT EXISTS
    • Replace Apache Http Components with jetty http client
    • double filter error
    • (CONNEG using URL Query Parameter for json or xml results)
    • GROUP BY optimization using distinct-term-scan and fast-range-count
    • 1.4.0 pom references incorrect openrdf version
    • Add a streaming API for construct queries on BigdataGraph
    • Connection management with Blueprints
    • ALP Service (custom property paths)
    • SPARQL UPDATE QUADS DATA error with literals (SES-2063)
    • Eclipse project in git repository is broken
    • LaunderThrowable should not always throw an exception
    • JVMNamedSubqueryOp throws ExecutionException with OPTIONAL and FILTER query
    • Snapshot mechanism breaks with metabit demi-spaces
    • Problem with IPV4 support
    • Add ability to dump threads to status page
    • Loading quads data into a triple store should strip out the context
    • Named subquery results not referenced within query (bottom-up evaluation)
    • expose version information in workbench or endpoint
    • Set query timeout and response buffer length on jetty response listener
    • (Configuration option for jetty request buffer size)
    • (DELETE WITH ACCESS PATH fails if more than one named graph is specified)




    –  (Migrate to openrdf 2.7)

    –  (BackgroundTupleResult overrides final method close)

    –  (explicit bindings get ignored in subselect (duplicate of #714))

    –  (Documentation on BigData Reasoning)

    –  (workbench does not display errors well)

    – (DISTINCT PREDICATEs query is slow)

    – (SELECT COUNT(…) (DISTINCT|REDUCED) {single-triple-pattern} is slow)

    – (RDR RDF parsers are not always discovered)

    – (ORDER_BY ordering not preserved by projection operator)

    – (NQuadsParser hangs when loading latest dbpedia dump.)

    – (ASTComplexOptionalOptimizer did not account for Values clauses)

    – (BigdataGraphFactory create method cannot be invoked from the gremlin command line due to a Boolean vs boolean type mismatch.)

    – (update RDR documentation on wiki)

    – (Server does not generate RDR aware JSON for RDF/SPARQL RESULTS)




    –  (Empty PROJECTION causes IllegalArgumentException)

    – (Journal leaks storage with SPARQL UPDATE and REST API)

    – (remote service queries should put parameters in the request body when using POST)




    –  (Object position of query hint is not a Literal (partial resolution – see #1028 as well))

    – (Add the ability to track and cancel all queries issued through a BigdataSailRemoteRepositoryConnection)

    – (Add critical section protection to AbstractJournal.abort() and BigdataSailConnection.rollback())

    – (GregorianCalendar does weird things before 1582)

    – (SPARQL UPDATE with runtime errors causes problems with lexicon indices)

    – (very rare NotMaterializedException: XSDBoolean(true))

    – (RWStore commit state not correctly rolled back if abort fails on empty journal)

    – (RWStorage stats cleanup)




    – (Jetty/LBS issues when deployed as WAR under tomcat)

    – (Upgrade apache http components to 1.3.1 (security))

    – (Invalidate BTree objects if error occurs during eviction)

    – (Concurrent binding problem)

    – (Concurrency issues in JVMHashJoinUtility caused by MAX_PARALLEL query hint override)

    – (Add configuration option to turn off bottom-up evaluation)

    –  (Extend BigdataSailFactory to take arbitrary properties)

    –  (SPARQL Update through BigdataGraph)

    –  (Add custom prefix support for query results)

    –  (Allow general purpose SPARQL queries through BigdataGraph)

    –  (Deadlock between AbstractRunningQuery.cancel(), QueryLog.log(), and ArbitraryLengthPathTask)

    –  (Query hints not recognized in FILTERs)

    –  (Stored query service)

    –  (Bad performance for FILTER EXISTS)

    –  (maven build is broken)

    –  (Improve locality for small allocation slots)

    –  (Deadlock in BigdataTriplePatternMaterializer)

    –  (HA Health Status Page)

    –  (Name2Addr.indexNameScan(prefix) uses scan + filter)

    –  (RWStore.commit() should be more defensive)

    –  (Clarify HTTP Status codes for CREATE NAMESPACE operation)

    –  (no link to wiki from workbench)

    –  (Failed to get namespace under concurrent update)

    –  (Can not run LBS mode with HA1 setup)

    –  (Clone/modify namespace to create a new one)

    –  (Export namespace properties in XML/Java properties text format)

    –  (HA Load Balancer)

    –  (Support larger metabits allocations)

    –  (Bigdata/Rexster integration)

    –  (Formatted Layout for Status pages)

    –  (REST API Query Cancellation)

    –  (Panels do not appear on startup in Firefox)

    –  (Executing a new query should clear the old query results from the console)

    –  (Abbreviate URIs that can be namespaced with one of the defined common namespaces)

    –  (Can’t explore an absolute URI with < >)

    –  (Explore page looks weird when empty)

    –  (Allow user to use browser back & forward buttons to view explore history)

    –  (OutOfMemoryError instead of Timeout for SPARQL Property Paths)

    –  (Change explore URLs to include URI being clicked so user can see what they’ve clicked on before)

    –  (AssertionError: Child does not have persistent identity)

    –  (Search functionality in workbench)

    –  (Query results panel should recognize well known namespaces for easier reading)

    –  (Display the properties for a namespace)

    –  (Create new tabs for status & performance counters, and add per namespace service/VoID description links)

    –  (Configurator for new namespaces)

    –  (Allow user to create namespace in the workbench)

    –  (Output RDF data from queries in table format)

    –  (Export query results)

    –  (Save selected namespace in browser)

    –  (Explore tab in workbench)

    –  (Create shortcut to execute load/query)

    –  (Disable textarea when a large file is selected)

    –  (Allow non-file:// URLs to be loaded)

    –  (Retrieve default namespace on page load)

    –  (Query timeout only checked at operator start/stop)

    –  (order by expr skips invalid expressions)

    –  (JSP page to configure KBs)

    –  (Stochastic assert in AbstractBTree#writeNodeOrLeaf() in CI)




    –   (Deadlines do not play well with GROUP_BY, ORDER_BY, etc.)

    –   (Amortize RTO cost)

    –   (Support BOP fragments in the RTO.)

    –   (Integrate RTO into SAIL)

    –   (Dynamically increase RTO sampling limit.)

    –   (Reification done right)

    –   (Problem with the bigdata RDF/XML parser with sids)

    –   (NSS using jetty+windows can lose connections (windows only; jdk 6/7 bug))

    –   (HA Load Balancer)

    –   (Graph processing API)

    –   (Support HA1 configurations)

    –   (Allow configuration of embedded NSS jetty server using jetty-web.xml)

    –   (multiple filters interfere)

    –   (Stochastic results with Analytic Query Mode)

    –   (Converge on Java 7.)

    –   (Resynchronization of socket level write replication protocol (HA))

    –   (Incremental or asynchronous purge of HALog files)

    –   (Wrong serialization version)

    –   (Describe Limit/offset don’t work as expected)

    –   (Update documentations and samples, they are OUTDATED)

    –   (Name2Addr does not report all root causes if the commit fails.)

    –   (ant task to build sesame fails, docs for setting up bigdata for sesame are ancient)

    –   (should not be pruning any children)

    –   (Clean up query hints)

    –   (Explain reports incorrect value for opCount)

    –   (Filter assigned to sub-query by query generator is dropped from evaluation)

    –   (add sbt setup to getting started wiki)

    –   (Solution order not always preserved)

    –   (mis-optimization of quad pattern vs triple pattern)

    –   (Optimize DatatypeFactory instantiation in DateTimeExtension)

    –   (prefixMatch does not work in full text search)

    –   (update bug deleting quads)

    –   (Incorrect AST generated for OPTIONAL { SELECT })

    –   (Wildcard search in bigdata for type suggestions)

    –   (Expose GAS API as SPARQL SERVICE)

    –   (RDR query does too much work)

    –   (Wildcard projection ignores variables inside a SERVICE call.)

    –   (Unexplained increase in journal size)

    –   (Reject large files, rather than storing them in a hidden variable)

    –   (UNION with filter issue)

    –   (Using “VALUES” in a query returns lexical error)

    –   (Fix SPARQL Results JSON writer to write the RDR syntax)

    –   (Create writers that support the RDR syntax)

    –   (RDR GAS interface)

    –   (RemoteRepository.cancel() does not consume the HTTP response entity.)

    –   (Follower does not accept POST of idempotent operations (HA))

    –   (Allow override of maximum length before converting an HTTP GET to an HTTP POST)

    –   (AssertionError: Child does not have persistent identity)

    –   (Create parser for JSON SPARQL Results)

    –   (HA1 commit failure)

    –   (Batch remove API for the SAIL)

    –   (NSS concurrency problem with list namespaces and create namespace)

    –   (HA5 test suite)

    –   (Full text index range count optimization)

    –   (FILTER not applied when there is UNION in the same join group)

    –   (When I upload a file I want to see the filename.)

    –   (RDF Format selector is invisible)

    –   (CANCEL Query fails on non-default kb namespace on HA follower.)

    –   (Provide workaround for bad reverse DNS setups.)

    –   (BIND is leaving a variable unbound)

    –   (HAJournalServer does not die if zookeeper is not running)

    –   (large sparql insert optimization slow?)

    –   (unnecessary synchronization)

    –   (stack overflow in populateStatsMap)

    –   (Update Basic Bigdata Chef Cookbook)

    –   (AssertionError:  PropertyPathNode got to ASTJoinOrderByType.optimizeJoinGroup)

    –   (unsound combo query optimization: union + filter)

    –   (DC Prefix Button Appends “</li>”)

    –   (Add a quick-start ant task for the BD Server “ant start”)

    –   (Provide a configurable IAnalyzerFactory)

    –   (Blueprints API Implementation)

    –   (Settable timeout on SPARQL Query (REST API))

    –   (DefaultAnalyzerFactory issues)

    –   (Content negotiation orders accept header scores in reverse)

    –   (NSS does not start from command line: bigdata-war/src not found.)

    –   (ProxyServlet in web.xml breaks tomcat WAR (HA LBS))




    – (Journal HA)

    – (Coalesce write cache records and install reads in cache)

    – (HA TXS)

    – (Remove triple-buffering in RWStore)

    – (HA backup)

    – (River not compatible with newer 1.6.0 and 1.7.0 JVMs)

    – (Add a custom function to use full text index for filtering.)

    – (RWS test failure)

    – (Compress write cache blocks for replication and in HALogs)

    – (Latency on followers during commit on leader)

    – (Issue with OPTIONAL blocks)

    – (RWStore needs post-commit protocol)

    – (HA3 LOAD non-responsive with node failure)

    – (Occasional CI deadlock in HALogWriter testConcurrentRWWriterReader)

    – (Accumulating HALog files cause latency for HA commit)

    – (Query on follower fails during UPDATE on leader)

    – (DGC in release time consensus protocol causes native thread leak in HAJournalServer at each commit)

    – (WCS write cache compaction causes errors in RWS postHACommit())

    – (Bad patterns for timeout computations)

    – (HA deadlock under UPDATE + QUERY)

    – (DGC Thread and Open File Leaks: sendHALogForWriteSet())

    – (HAJournalServer can not restart due to logically empty log file)

    – (HAJournalServer deadlock: pipelineRemove() and getLeaderId())

    – (Optimization with skos altLabel)

    – (Consensus protocol does not detect clock skew correctly)

    – (HAJournalServer Cache not populated)

    – (Missing URL encoding in RemoteRepositoryManager)

    – (Error when using the alias “a” instead of rdf:type for a multipart insert)

    – (Failed to re-interrupt thread in HAJournalServer)

    – (Failed to re-interrupt thread)

    – (OneOrMorePath SPARQL property path expression ignored)

    – (Transparently cancel update/query in RemoteRepository)

    – (HAJournalServer reports “follower” but is in SeekConsensus and is not participating in commits.)

    – (Problems in BackgroundTupleResult)

    – (InvocationTargetException on /namespace call)

    – (ask does not return json)

    – (Race between QueryEngine.putIfAbsent() and shutdownNow())

    – (MultiSourceSequentialCloseableIterator.nextSource() can throw NPE)

    – (BlockingBuffer.close() does not unblock threads)

    – (BIND heisenbug – race condition on select query with BIND)

    – (sparql protocol: mime type application/sparql-query)

    – (SELECT ?x { OPTIONAL { ?x eg:doesNotExist eg:doesNotExist } } incorrect)

    – (Interrupt of thread submitting a query for evaluation does not always terminate the AbstractRunningQuery)

    – (Verify that IRunningQuery instances (and nested queries) are correctly cancelled when interrupted)

    – (HAJournalServer needs to handle ZK client connection loss)

    – (HA3 simultaneous service start failure)

    – (HA asynchronous tasks must be canceled when invariants are changed)

    – (FILTER EXISTS in subselect)

    – (Logically empty HALog for committed transaction)

    – (DELETE/INSERT fails with OPTIONAL non-matching WHERE)

    – (Refactor to create HAClient)

    – (ant bundleJar not working)

    – (CBD and Update leads to 500 status code)

    – (describe statement limit does not work)

    – (Range optimizer not optimizing Slice service)

    – (two property paths interfere)

    – (MIN() malfunction)

    – (class cast exception)

    – (Inconsistent treatment of bind and optional property path)

    – (ctc-striterators should build as independent top-level project (Apache2))

    – (AbstractTripleStore.destroy() does not filter for correct prefix)

    – (Assertion error)

    – (BOUND bug)

    – (incorrect join with subselect renaming vars)

    – (Failure to setup SERVICE hook and changeLog for Unisolated and Read/Write connections)

    – (Concurrent QuorumActors can interfere leading to failure to progress)

    – (order by and group_concat)

    – (Code review on 2-phase commit protocol)

    – (RESYNC failure (HA))

    – (alpp ordering)

    – (Query timeout only checked at operator start/stop.)

    – (Closed as duplicate of #490)

    – (HA Leader fail results in transient problem with allocations on other services)

    – (Operator Alerts (HA))




    – (ConcurrentModificationException in ASTComplexOptionalOptimizer)




    – (Maven Build)

    – (Journal leaks memory).

    – (Occasional deadlock in CI runs in

    – (CI (mock) quorums deadlock)

    – (Optimize hash join for subgroups with no incoming bound vars.)

    – (StaticAnalysis#getDefinitelyBound() ignores exogenous variables.)

    – (RDFS Plus Profile)

    – (SPARQL 1.1 Property Paths)

    – (Negative parser tests)


    – (Optimize JOIN VARS for Sub-Selects)

    – (Support PSOutputStream/InputStream at IRawStore)

    – (Use RDFFormat.NQUADS as the format identifier for the NQuads parser)

    – (MemoryManager Journal does not implement all methods).

    – (NSS Admin API)

    – (DESCRIBE with OFFSET/LIMIT needs to use sub-select)

    – (Concise Bounded Description (CBD))

    – (CONSTRUCT should use distinct SPO filter)

    – (VoID in ServiceDescription)

    – (RWStore immedateFree() not removing Checkpoint addresses from the historical index cache.)

    – (nxparser fails with uppercase language tag)

    – (Optimize RWStore allocator sizes)

    – (Upgrade to Sesame 2.6.10)

    – (WAR was deployed using TRIPLES rather than QUADS by default)

    – (Change web.xml parameter names to be consistent with Jini/River)


    – (B+Tree branching factor and HTree addressBits are confused in their NodeSerializer implementations)

    – (BlobIV for blank node : NotMaterializedException)

    – (BlobIV collision counter hits false limit.)

    – (Log uncaught exceptions)

    – (RWStore does not discard logged deletes on reset())

    – (History service / index)

    – (LOG BlockingBuffer not progressing at INFO or lower level)

    – (bigdata-ganglia is required dependency for Journal)

    – (The code that processes SPARQL Update has a typo)

    – (Bigdata scale-up depends on zookeeper)

    – (SPARQL UPDATE response inlines large DELETE or INSERT triple graphs)

    – (static join optimizer does not get ordering right when multiple tails share vars with ancestry)

    – (AST2BOpUtility wraps UNION with an unnecessary hash join)

    – (Row store read/update not isolated on Journal)

    – (Concurrent KB create fails with “No axioms defined?”)

    – (DirectBufferPool.poolCapacity maximum of 2GB)

    – (RemoteRepository class should use application/x-www-form-urlencoded for large POST requests)

    – (UpdateServlet fails to parse MIMEType when doing conneg.)

    – (Expose performance counters for read-only indices)

    – (Environment variable override for NSS properties file)

    – (Create a bigdata-client jar for the NSS REST API)

    – (ClassCastException in SIDs mode query)

    – (NotMaterializedException when a SERVICE call needs variables that are provided as query input bindings)

    – (ClassCastException when binding non-uri values to a variable that occurs in predicate position)

    – (Change DEFAULT_MIN_RELEASE_AGE to 1ms)

    – (Conditionally rollback() BigdataSailConnection if dirty)

    – (Property paths do not work inside of exists/not exists filters)

    – (Add web.xml parameters to lock down public NSS end points)

    – (Bigdata2Sesame2BindingSetIterator can fail to notice asynchronous close())

    – (Can not POST RDF to a graph using REST API)

    – (Rare AssertionError in WriteCache.clearAddrMap())

    – (SPARQL REGEX operator does not perform case-folding correctly for Unicode data)

    – (InFactory bug when IN args consist of a single literal)

    – (SIDs mode creates unnecessary hash join for GRAPH group patterns)

    – (Provide NanoSparqlServer initialization hook)

    – (Doubly nested subqueries yield no results with LIMIT)

    – (Flush indices in parallel during checkpoint to reduce IO latency)

    – (AtomicRowFilter UnsupportedOperationException)




    – (RWStore immedateFree() not removing Checkpoint addresses from the historical index cache.)

    – (RWStore does not discard logged deletes on reset())

    – (Prepare critical maintenance release as branch of 1.2.1)




    – (Review materialization for inline IVs)

    – (NotMaterializedException with REGEX and Vocab)

    – (SPARQL UPDATE using NSS via index.html)

    – (MemoryManaged backed Journal mode)

    – (Index cache for Journal)

    – (BTree can not be cast to Name2Addr (MemStore recycler))

    – (NPE in Leaf.getKey() : root cause was user error)

    – (SPARQL INSERT not working in same request after INSERT DATA)

    – (Sub-select in INSERT cause NPE in UpdateExprBuilder)


    – (Failure to set cached value on IV results in incorrect behavior for complex UPDATE operation)

    – (DELETE WHERE fails with Java AssertionError)

    – (LOAD-CREATE-LOAD using virgin journal fails with “Graph exists” exception)

    – (DELETE/INSERT WHERE handling of blank nodes)

    – (NullPointerException when attempting to INSERT DATA containing a blank node)


    1.2.0 (*)


    –  (Monitoring webapp)

    – (Support evaluation of 3rd party operators)

    – (Compact and efficient movement of binding sets between nodes.)

    – (Cluster leaks threads under read-only index operations: DGC thread leak)

    – (Thread-local cache combined with unbounded thread pools causes effective memory leak: termCache memory leak & thread-local buffers)

    – (KeyBeforePartitionException on cluster)

    – (Class loader problem)

    – (Ganglia integration)

    – (Logger for RWStore transaction service and recycler)

    – (SPARQL query can fail to notice when IRunningQuery.isDone() on cluster)

    – (RWStore does not track tx release correctly)

    – (HTTP Repository broken with bigdata 1.1.0)

    – (SPARQL 1.1 UPDATE)

    – (SPARQL 1.1 Federation extension)

    – (Serialization error in SIDs mode on cluster)

    – (Global Row Store Read on Cluster uses Tx)

    – (IExtension implementations do point lookups on lexicon)

    – (“No such index” on cluster under concurrent query workload)

    – (Java level deadlock in DS)

    – (Uncaught interrupt resolving RDF terms)

    – (KeyAfterPartitionException / KeyBeforePartitionException on cluster)

    – (NoSuchVocabularyItem with LUBMVocabulary for DerivedNumericsExtension)

    – (Query statistics do not update correctly on cluster)

    – (Too many GRS reads on cluster)

    – (Sail does not flush assertion buffers before query)

    – (acceptTaskService pool size on cluster)

    – (Optimize serialization for query messages on cluster)

    – (Test suite for writeCheckpoint() and recycling for BTree/HTree)

    – (Cluster does not map input solution(s) across shards)

    – (Error releasing deferred frees using 1.0.6 against a 1.0.4 journal)

    – (PhysicalAddressResolutionException against 1.0.6)

    – (RWStore reset() should be thread-safe for concurrent readers)

    – (Java API for NanoSparqlServer REST API)

    – (AbstractTripleStore.destroy() does not clear the locator cache)

    – (Empty chunk in ThickChunkMessage (cluster))

    – (Virtual Graphs)

    – (Sesame 2.6.3)

    – (Implement STRBEFORE, STRAFTER, and REPLACE)

    – (Bring bigdata RDF/XML parser up to openrdf 2.6.3.)

    – (SPARQL 1.1 Service Description)

    – (Aggregation with an solution set as input should produce an empty solution as output)

    – (Incorrect error handling for SPARQL aggregation; fix in 2.6.1)

    – (Order the same Blank Nodes together in ORDER BY)

    – (SPARQL 1.1 BINDINGS are ignored)

    – (Bigdata2Sesame2BindingSetIterator throws QueryEvaluationException where it should throw NoSuchElementException)

    – (UNION with Empty Group Pattern)

    – (Exception when using SPARQL sort & statement identifiers)

    – (Load, closure and query performance in 1.1.x versus 1.0.x)

    – (LIMIT causes hash join utility to log errors)

    – (Expose the LexiconConfiguration to Function BOPs)

    – (Query with two “FILTER NOT EXISTS” expressions returns no results)

    – (REGEXBOp should cache the Pattern when it is a constant)

    – (Java 7 Compiler Compatibility)

    – (Review function bop subclass hierarchy, optimize datatype bop, etc.)

    – (CONSTRUCT WHERE shortcut)

    – (Incremental materialization of Tuple and Graph query results)

    – (Modify the IChangeLog interface to support multiple agents)

    – (Expose timestamp of LexiconRelation to function bops)

    – (ClassCastException during hash join (can not be cast to TermId))

    – (Review materialization for inline IVs)

    – (BSBM BI Q5 error using MERGE JOIN)


    1.1.0 (*)


    –  (Lexicon joins)

    – (Store large literals as “blobs”)

    – (Scale-out LUBM “how to” in wiki and build.xml are out of date.)

    – (Implement a persistence capable hash table to support analytic query)

    – (AccessPath should visit binding sets rather than elements for high level query.)

    – (SliceOp appears to be necessary when operator plan should suffice without)

    – (Bottom-up evaluation semantics).

    – (Derived xsd numeric data types must be inlined as extension types.)

    – (Revisit pruning of intermediate variable bindings during query execution)

    – (Lift conditions out of subqueries.)

    – (Native ORDER BY)

    – (Inline predeclared URIs and namespaces in 2-3 bytes)

    – (NanoSparqlServer does not locate “html” resources when run from jar)

    – (Support inlining of unicode data in the statement indices.)

    – (Scalable default graph evaluation)

    – (Prune variable bindings during query evaluation)

    – (Direct translation of openrdf AST to bigdata AST)

    – (Fix StrBOp and other IValueExpressions)

    – (Optimize OPTIONALs with multiple statement patterns.)

    – (Native SPARQL evaluation on cluster)

    – (Cluster does not compute closure)

    – (HTree hash join performance)

    – (inline xsd:unsigned datatypes)

    – (xsd:string cast fails for non-numeric data)

    – (New query hints model.)

    – (Use of read-only tx per query defeats cache on cluster)




    – (BTreeCounters does not track bytes released)

    – (Refactor performance counters using accessor interface)

    – (B+Tree should delete bloom filter when it is disabled.)

    – (RWStore does not prune the CommitRecordIndex)

    – (Persistent memory leaks (RWStore/DISK))

    – (FastRDFValueCoder2: ArrayIndexOutOfBoundsException)

    – (Release age advanced on WORM mode journal)

    – (Add a DELETE by access path method to the NanoSparqlServer)

    – (Add “context-uri” request parameter to specify the default context for INSERT in the REST API)

    – (log4j configuration error message in WAR deployment)

    – (Add a fast range count method to the REST API)

    – (Support temp triple store wrapped by a BigdataSail)

    – (NQuads support for NanoSparqlServer)

    – (Bug fix to DEFAULT_RDF_FORMAT for bulk data loader in scale-out)

    – (Support either lockfile (procmail) or dotlockfile (liblockfile1) in scale-out)

    – (BigdataSail#getReadOnlyConnection() race condition with concurrent commit)

    – (Address is 0L)

    – (TestMROWTransactions failure in CI)



    –  (Query time expansion of (foo rdf:type rdfs:Resource) drags in SPORelation for scale-out.)

    – (Scale-out LUBM “how to” in wiki and build.xml are out of date.)

    – (Query not terminated by error.)

    – (NamedGraph pattern fails to bind graph variable if only one binding exists.)

    – (IRunningQuery not closed promptly.)

    – (DataLoader fails to load resources available from the classpath.)

    – (Support for the streaming of bigdata IBindingSets into a sparql query.)

    – (ClosedByInterruptException during heavy query mix.)

    – (NotSerializableException for SPOAccessPath.)

    – (Change dependencies to Apache River 2.2.0)


    1.0.1 (*)


    – (Unicode clean schema names in the sparse row store).

    – (TermIdEncoder should use more bits for scale-out).

    – (OSX requires specialized performance counter collection classes).

    – (BigdataValueFactory.asValue() must return new instance when DummyIV is used).

    – (TermIdEncoder limits Journal to 2B distinct RDF Values per triple/quad store instance).

    – (SPO not Serializable exception in SIDS mode (scale-out)).

    – (ClassCastException when querying with binding-values that are not known to the database).

    – (UnsupportedOperatorException for some SPARQL queries).

    – (Query failure when comparing with non materialized value).

    – (RWStore reports “FixedAllocator returning null address, with freeBits”.)

    – (NamedGraph pattern fails to bind graph variable if only one binding exists.)

    – (log4j – slf4j bridge.)


    For more information about bigdata(R), please see the following links:

















    About Blazegraph™:


    Blazegraph™ is a horizontally scaled, general-purpose storage and computing fabric for ordered data (B+Trees), designed to operate on either a single server or a cluster of commodity hardware. Blazegraph™ uses dynamically partitioned key-range shards to remove any realistic scaling limits: in principle, Blazegraph™ may be deployed on tens, hundreds, or even thousands of machines, and new capacity may be added incrementally without requiring a full reload of all data. The Blazegraph™ RDF database supports RDFS and OWL Lite reasoning, high-level query (SPARQL), and datum-level provenance.