Using Vagrant is by far the easiest way to spin up a development environment and get started with contributing to the Open Cultuur Data API.
Clone the OCD git repository:
$ git clone https://github.com/openstate/open-cultuur-data.git
$ cd open-cultuur-data/
Select and link the correct Vagrantfile (depending on the Vagrant provider you use):
$ ln -s Vagrantfile.virtualbox Vagrantfile
Start the Vagrant box and SSH into it:
$ vagrant up && vagrant ssh
Vagrant automatically syncs your project directory (the directory containing the Vagrantfile) between the host and the guest machine. It also runs a bootstrap script that installs the project dependencies. In the guest, the project directory can be found under /vagrant. For more information, see the Vagrant documentation on Synced Folders.
To set up the project without Vagrant, the following dependencies are required:
- Redis
- Elasticsearch >= 1.1
- Python(-dev) 2.7
- libxml2
- libxslt
- pip
- virtualenv (optional)
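Before installing anything, it can help to see which of these tools are already present. The following is a minimal sketch, not part of the project: `missing_tools` is a hypothetical helper, and the command names passed to it are assumptions about what the packages install on your system.

```shell
# Print the name of every command in the argument list that is not on PATH.
# Hypothetical helper for illustration; it only inspects PATH and installs nothing.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call (command names are assumptions, adjust them to your system):
#   missing_tools redis-server python pip virtualenv
```

An empty result means every listed command was found. Libraries such as libxslt do not ship a command of their own and need to be checked through the package manager instead.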
Install Redis:
$ sudo add-apt-repository ppa:rwky/redis
$ sudo apt-get update
$ sudo apt-get install redis-server
Install Java (if it isn't already):
$ sudo apt-get install openjdk-7-jre-headless
Install Elasticsearch:
$ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.2.0.deb
$ sudo dpkg -i elasticsearch-1.2.0.deb
Install libxml2, libxslt and python-dev:
$ sudo apt-get install libxml2-dev libxslt1-dev python-dev
Install pip and virtualenv:
$ sudo easy_install pip
$ sudo pip install virtualenv
Create an OCD virtualenv and source it:
$ virtualenv ocd
$ source ocd/bin/activate
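Packages installed with pip end up system-wide if the virtualenv is not active, so a quick check before installing requirements can be useful. This is a small sketch: `in_virtualenv` is a hypothetical helper that relies on the `VIRTUAL_ENV` variable, which the virtualenv `activate` script exports.

```shell
# Succeed (exit status 0) only when a virtualenv is active in the current shell.
# Relies on VIRTUAL_ENV, which is exported by the virtualenv activate script.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Example use:
#   in_virtualenv && echo "using $VIRTUAL_ENV" || echo "no virtualenv active"
```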
Clone the OCD git repository and install the required Python packages:
$ git clone https://github.com/openstate/open-cultuur-data.git
$ cd open-cultuur-data/
$ pip install -r requirements.txt
First, add the OCD template to the running Elasticsearch instance:
$ ./manage.py elasticsearch put_template
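The template can only be added when Elasticsearch is actually reachable. As a rough sketch, the check below assumes that curl is installed and that Elasticsearch listens on its default HTTP port 9200 on localhost. `es_is_up` is a hypothetical helper, not part of the OCD code base.

```shell
# Succeed when an HTTP request to the given URL returns a successful response.
# The default URL assumes a stock Elasticsearch listening on localhost:9200.
es_is_up() {
    url="${1:-http://localhost:9200}"
    curl --silent --fail --max-time 5 "$url" >/dev/null 2>&1
}

# Example use:
#   es_is_up || echo "Elasticsearch does not respond, start the service first"
```

The function only reports reachability. It does not start the service or verify the Elasticsearch version.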
Make the necessary changes to the 'sources' settings file (ocd_backend/sources.json). For example, fill out your API key for retrieving data from the Rijksmuseum.
Start the extraction process:
$ ./manage.py extract start openbeelden
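If the extraction does not start after editing ocd_backend/sources.json, a syntax error in that file is one possible cause. The sketch below checks only that a file parses as JSON, using the `json.tool` module from the Python standard library. `json_is_valid` is a hypothetical helper, and it says nothing about whether the settings themselves are correct.

```shell
# Succeed when the given file parses as JSON.
# Returns 2 when no Python interpreter is found on PATH.
json_is_valid() {
    py="$(command -v python || command -v python3)" || return 2
    "$py" -m json.tool "$1" >/dev/null 2>&1
}

# Example use:
#   json_is_valid ocd_backend/sources.json || echo "sources.json is not valid JSON"
```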
You can get an overview of the available sources by running ./manage.py extract list_sources.
Simultaneously, start a worker process:
$ celery --app=ocd_backend:celery_app worker --loglevel=info --concurrency=2