⚠️ Please Note: This repository documents a tool developed to generate RDF data dumps from an independently run instance of Wikibase. Today, Rhizome recommends that you either run the Wikibase Docker distribution or get a managed Wikibase at WbStack. If you are specifically looking to get an RDF dump of your Wikibase, you should use the dumpRdf.php script that is supplied with Wikibase.


# Bringing ✨ to Wikibase

This Python 3 script converts entities from a local Wikibase to RDF (Turtle format), ready to be imported into a Blazegraph graph database. It attempts to use the same data structure as Wikidata.

## Environment

The script expects the login credentials for the MySQL/MariaDB instance used by the local Wikibase to be supplied through environment variables:

| environment variable | expected value |
| --- | --- |
| `wbdbhost` | MySQL/MariaDB host |
| `wbdbuser` | MySQL/MariaDB user name |
| `wbdbpasswd` | MySQL/MariaDB user password |
| `wbdbdb` | name of database |
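For example, the variables can be exported in the shell before running the script. The values below are placeholders; reading the password interactively is just one way to keep it out of shell history and checked-in files:

```bash
# placeholder values; adjust to your MySQL/MariaDB setup
export wbdbhost=localhost
export wbdbuser=wikiuser
export wbdbdb=wiki

# read the password without echoing it, then export it
read -r -s -p "MySQL/MariaDB password: " wbdbpasswd
echo
export wbdbpasswd
```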

## Usage

```bash
./wb2ttl.py [-e <exactMatchProperty>] <localBase> <outfile>
```

The optional switch `-e` designates a local Wikibase property to be treated as `skos:exactMatch`. This makes it possible to match local properties with Wikidata properties, or with those of any other graph.

The option `-e P2`, for example, would use the local property P2 for this purpose. In Rhizome's Wikibase, the local property P3 (instance of) is matched in this way with P31 (instance of) on Wikidata and with rdf:instance of.
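As a rough illustration (not the verbatim output of wb2ttl.py), such a mapping could surface in the Turtle dump along these lines; the entity URI layout shown here is an assumption based on Wikibase defaults and depends on your configuration:

```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# local property P3 (instance of) declared equivalent to Wikidata's P31
<http://catalog.rhizome.org/entity/P3> skos:exactMatch <http://www.wikidata.org/entity/P31> .
```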

The `localBase` parameter defines the local URI prefix. In Rhizome's case, this is `http://catalog.rhizome.org/`.

`outfile` is the file to which the RDF output will be written.

## Updating Blazegraph

Assuming that Wikibase and Blazegraph are running on the same host, a script like the following will export the RDF from Wikibase, clear any existing data in Blazegraph, and then import the new graph:

```bash
#!/bin/bash

export wbdbhost=localhost
export wbdbuser=alice       # MySQL/MariaDB user name
export wbdbpasswd=sikrit    # password
export wbdbdb=wiki          # name of database

# export the local Wikibase to Turtle, treating local property P2 as skos:exactMatch
./wb2ttl.py -e P2 http://catalog.rhizome.org/ db-export.ttl

# make the dump readable for the Blazegraph process
chmod a+r db-export.ttl

# Blazegraph's LOAD statement needs an absolute file:// path
ABSFILE=$(readlink -f db-export.ttl)

# drop the existing data and load the new dump via SPARQL UPDATE
curl "http://localhost:9999/blazegraph/namespace/kb/sparql" --data-urlencode "update=DROP ALL; LOAD <file:///$ABSFILE>;"
```

This script could be run as a cron job for regular updates.
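One way to set this up, assuming the script above is saved as `/usr/local/bin/update-blazegraph.sh` (a placeholder path), is to append a nightly entry to the crontab:

```bash
# add a nightly 03:00 run of the update script to the current user's crontab
# (script and log paths are placeholders)
( crontab -l 2>/dev/null; echo '0 3 * * * /usr/local/bin/update-blazegraph.sh >> /var/log/wb2ttl-update.log 2>&1' ) | crontab -
```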

## Deployment at Rhizome

See the presentation (Google Slides) from WikidataCon 2017.