Parallel materialization of the RDFS closure

There were two excellent presentations [1, 2] yesterday at ISWC 2009 on using parallel techniques to materialize the RDFS closure at extremely high rates (for example, the RDFS closure of U8000 in 15 minutes). This is something that we are going to try out as soon as possible. Unlike either of the systems described in these papers, bigdata uses automatic dynamic sharding based on key-ranges of the data. We expect these techniques can be adapted by mapping the computation onto the POS index shards, bringing each shard to fixed point, and then reusing our high-throughput data loader to quickly relocate the entailments onto the distributed indices. There is clearly a surge in parallel and distributed algorithms for the semantic web, which is extremely exciting.
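To make the "map onto shards, bring to fixed point" idea concrete, here is a minimal sketch in Python. It is not bigdata code and not the algorithm from either paper. It assumes that the schema triples are small enough to replicate to every shard, that only four RDFS rules matter (rdfs2, rdfs3, rdfs7, rdfs9), and that a shard is simply a contiguous slice of the triples sorted in POS order. The names `shard_by_pos`, `close_shard` and `materialize` are made up for illustration.

```python
# Hypothetical sketch: per-shard RDFS closure with a replicated schema.
# Terms are plain strings; a triple is an (s, p, o) tuple.

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def index_schema(schema):
    """Index schema triples as (predicate, subject) -> set of objects."""
    idx = {}
    for s, p, o in schema:
        idx.setdefault((p, s), set()).add(o)
    return idx


def shard_by_pos(triples, n_shards):
    """Assumption: a shard is a contiguous key-range of the POS ordering."""
    ordered = sorted(triples, key=lambda t: (t[1], t[2], t[0]))
    size = max(1, -(-len(ordered) // n_shards))  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def close_shard(shard, schema_idx):
    """Apply the rules to one shard until no new triples appear."""
    closed = set(shard)
    frontier = set(shard)
    while frontier:
        derived = set()
        for s, p, o in frontier:
            for q in schema_idx.get((SUBPROP, p), ()):   # rdfs7
                derived.add((s, q, o))
            for c in schema_idx.get((DOMAIN, p), ()):    # rdfs2
                derived.add((s, TYPE, c))
            for c in schema_idx.get((RANGE, p), ()):     # rdfs3 (literals not filtered)
                derived.add((o, TYPE, c))
            if p == TYPE:
                for c in schema_idx.get((SUBCLASS, o), ()):  # rdfs9
                    derived.add((s, TYPE, c))
        frontier = derived - closed
        closed |= frontier
    return closed


def materialize(instance, schema, n_shards):
    """Close each shard independently, then merge the entailments."""
    schema_idx = index_schema(schema)
    result = set()
    for shard in shard_by_pos(instance, n_shards):
        # Shards share no state, so this loop could run in parallel.
        result |= close_shard(shard, schema_idx)
    return result


schema = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Professor", SUBCLASS, "ex:Person"),
    ("ex:advisor", DOMAIN, "ex:Student"),
    ("ex:advisor", RANGE, "ex:Professor"),
    ("ex:phdAdvisor", SUBPROP, "ex:advisor"),
}
instance = {
    ("ex:alice", "ex:phdAdvisor", "ex:bob"),
    ("ex:carol", TYPE, "ex:Student"),
}
closure = materialize(instance, schema, n_shards=2)
```

Each of these four rules joins exactly one instance triple against schema triples, so a shard can reach its own fixed point without seeing any other shard. The final union stands in for the step where the entailments would be bulk-loaded back onto the distributed indices.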

[1] Jesse Weaver, James A. Hendler. Parallel Materialization of the Finite RDFS Closure for Hundreds of Millions of Triples. In Proceedings of the 8th International Semantic Web Conference, pp. 682–697, 2009.

[2] Jacopo Urbani, Spyros Kotoulas, Eyal Oren, Frank van Harmelen (Department of Computer Science, Vrije Universiteit Amsterdam, the Netherlands). Scalable Distributed Reasoning using MapReduce. In Proceedings of the 8th International Semantic Web Conference, 2009.


One thought on “Parallel materialization of the RDFS closure”

  1. Jesse

    Glad to hear that you liked my presentation (the "Parallel Materialization …" one). You may also be interested in Greg's and my SSWS paper "Scalable RDF query processing on clusters and supercomputers." You can find a link to this paper at my website.
