brat (brat rapid annotation tool) is based on the stav visualiser, which was originally made to visualise BioNLP'11 Shared Task data. brat is currently under heavy development, but it is already used by Genia and several external groups for ongoing annotation projects. brat aims to provide an intuitive and fast way to create text-bound and relational annotations.
brat aims to overcome shortcomings of previous annotation tools, such as:
- De-centralisation of configurations and data, causing synchronisation issues
- Annotations and related text not being visually adjacent
- Complexity of set-up for annotators
- Etc.
brat does this by:
- Keeping data and configurations on a central web server (as Mark Twain said: "Put all your eggs in one basket, and then guard that basket!")
- Presenting text as it would appear to a reader and keeping annotations close to the text
- Requiring zero set-up from annotators, leaving configuration and server/data maintenance to other staff
If you do make use of brat or components from brat for annotation purposes, please cite the following publication:
@inproceedings{,
author = {Stenetorp, Pontus and Pyysalo, Sampo and Topi\'{c}, Goran
and Ohta, Tomoko and Ananiadou, Sophia and Tsujii, Jun'ichi},
title = {{brat}: a Web-based Tool
for {NLP}-Assisted Text Annotation},
booktitle = {Proceedings of the Demonstrations Session
at {EACL} 2012},
month = {April},
year = {2012},
address = {Avignon, France},
publisher = {Association for Computational Linguistics},
note = {(to appear)},
}
If you make use of brat or its components solely for visualisation purposes, please cite the following publication:
@InProceedings{stenetorp2011supporting,
author = {Stenetorp, Pontus and Topi\'{c}, Goran and Pyysalo, Sampo
and Ohta, Tomoko and Kim, Jin-Dong and Tsujii, Jun'ichi},
title = {BioNLP Shared Task 2011: Supporting Resources},
booktitle = {Proceedings of BioNLP Shared Task 2011 Workshop},
month = {June},
year = {2011},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {112--120},
url = {http://www.aclweb.org/anthology/W11-1816}
}
Lastly, if you have enough space we would be very happy if you also link to the brat repository:
...the brat rapid annotation tool\footnote{
\url{http://brat.nlplab.org}
}
Please see LICENSE.
This section describes how to install brat and its third-party dependencies. Note that brat is served through a web server and that we currently develop against Apache 2.x. These instructions assume that you use the same, but the repository also contains lighttpd configuration files if you feel like trying that out.
Since brat (and this document) is very much under development, the information in this document may not be up to date. If it isn't, bash the developers ASAP. You are also very welcome to file issues for bugs that you find or features that you would like to see in brat; do so at our GitHub page or by mailing the authors. We appreciate it if you supply the version you are working on and as many details about your system as possible (we have at times received tar-balls of whole installations, since an installation can be self-contained in a single directory).
Extract brat somewhere convenient where your webserver can reach it:
tar xzf TsujiiLaboratory-brat-${VERSION}-${HASH}.tar.gz
Enter the brat directory:
cd brat
While running, brat needs to read and write data in several directories. Let's create them:
mkdir data work
We now need to set the permissions of the directories so that they can be read and written by the webserver. The command-line below is likely to give you the Apache 2 group if you have Apache currently running. If not, see the "Finding Your Apache 2 Group" section of this document:
groups `ps aux | grep apache | grep -v 'grep' \
| cut -d ' ' -f 1 | grep -v 'root' | head -n 1` \
| cut -d : -f 2 | sed 's|\ ||g'
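If the one-liner above prints nothing or its output looks odd, the same lookup can be sketched in separate steps. This is only a rough alternative, not part of brat: the helper name `first_apache_user` is made up for illustration, and it assumes a `ps` that accepts the POSIX `-o` option and that the Apache worker processes are named `apache2` or `httpd`.

```shell
#!/bin/sh
# Hypothetical helper (not part of brat): reads "USER COMMAND" pairs on stdin
# and prints the first non-root user whose command ends in apache2 or httpd.
first_apache_user() {
    awk '$1 != "root" && $2 ~ /(apache2|httpd)$/ { print $1; exit }'
}

# Query the live process table; the result is empty if Apache is not running.
apache_user=$(ps -e -o user= -o comm= 2>/dev/null | first_apache_user)

if [ -n "$apache_user" ]; then
    # id -gn prints the name of the user's primary group.
    id -gn "$apache_user"
else
    echo "No non-root Apache process found; is Apache running?" >&2
fi
```

Unlike the one-liner, this prints only the primary group of the Apache user, which is usually the one you want for `chgrp`.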
Then simply change the group of the directories (replace ${YOUR_APACHE_GROUP} with the output you got above) and set the correct permissions:
sudo chgrp -R ${YOUR_APACHE_GROUP} data work
chmod -R g+rwx data work
If the above does not work, or if you are not concerned with security (say, on a single-user system), you can run the command below instead. It makes the directories readable and writable by every user on your system. Alternatively, refer to your operating system manual and look up Apache 2 for instructions on how to find the Apache group.
chmod 777 data work
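Whichever route you take, it can be worth confirming that the group bits on both directories really are set. The snippet below is a minimal sketch, assuming it is run from the brat directory; `group_can_rwx` is a hypothetical helper name, and it only inspects the permission bits, not whether the owning group is actually the Apache group.

```shell
#!/bin/sh
# Hypothetical helper (not part of brat): succeeds if the group permission
# bits of the given directory are rwx (or rws when setgid is set).
group_can_rwx() {
    # Characters 5-7 of the ls -ld mode string are the group bits.
    bits=$(ls -ld "$1" 2>/dev/null | cut -c5-7)
    [ "$bits" = "rwx" ] || [ "$bits" = "rws" ]
}

for dir in data work; do
    if [ ! -d "$dir" ]; then
        echo "$dir: missing, create it with mkdir first"
    elif group_can_rwx "$dir"; then
        echo "$dir: group has read, write and execute access"
    else
        echo "$dir: group access is incomplete, re-run chmod"
    fi
done
```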
Extract all the library dependencies.
( cd server/lib && tar xfz simplejson-2.1.5.tar.gz )
Put a configuration in place.
cp config_template.py config.py
Edit the configuration to suit your environment.
vim config.py
Your installation should now be ready. Just place your data in the data directory and make sure it has the right permissions using chmod, as you did above. If your data consists only of .txt files with no prior annotations, create the annotation files (.ann) as below:
( find data -name '*.txt' | sed -e 's|\.txt$||' \
| xargs -I {} touch '{}.ann' )
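For data sets whose file names contain spaces, a plain shell loop may be easier to reason about than the pipeline above. The following is a sketch under a few assumptions: `create_ann_files` is a hypothetical helper name, file names do not contain newlines, and existing .ann files should be left exactly as they are.

```shell
#!/bin/sh
# Hypothetical helper (not part of brat): create an empty .ann file next to
# every .txt file under the given directory, leaving existing files untouched.
create_ann_files() {
    find "$1" -type f -name '*.txt' | while IFS= read -r txt; do
        # ${txt%.txt} strips only the trailing .txt suffix.
        ann="${txt%.txt}.ann"
        [ -e "$ann" ] || : > "$ann"
    done
}

# Only act if a data directory exists in the current directory.
if [ -d data ]; then
    create_ann_files data
fi
```

Remember that the new .ann files still need the group permissions described earlier.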
This part largely focuses on Ubuntu, but use your *NIX-fu to turn it into what you need if you don't have the misfortune to have the "Brown Lunix Distribution".
brat supports FastCGI, which can speed up your installation roughly tenfold since the Python interpreter no longer has to be invoked for every request. If you want to use FastCGI as opposed to CGI, keep an eye out for configuration comments regarding it. For FastCGI you need the flup Python library:
( cd server/lib/ && tar xfz flup-1.0.2.tar.gz )
Install Apache 2.x if you don't have it already:
sudo apt-get install apache2
Let's edit the httpd.conf.
sudo vim /etc/apache2/httpd.conf
If you are installing brat into your home directory, add the following lines.
<Directory /home/*/public_html>
AllowOverride Options Indexes FileInfo
AddType application/xhtml+xml .xhtml
AddType font/ttf .ttf
# For CGI support
AddHandler cgi-script .cgi
# Comment out the line above and uncomment the line below for FastCGI
#AddHandler fastcgi-script fcgi
</Directory>
# For FastCGI, Single user installs should be fine with anything over 8
#FastCgiConfig -maxProcesses 16
If you are not installing into your home directory, adjust the above lines accordingly. If you installed into your home directory, make sure that you have the userdir module enabled.
sudo a2enmod userdir
For FastCGI you also want to install its module and then enable both it and the rewrite module, which we use to redirect the CGI requests to FastCGI:
sudo apt-get install libapache2-mod-fastcgi
sudo a2enmod fastcgi
sudo a2enmod rewrite
The final FastCGI step is detailed in .htaccess in the brat installation directory; it involves uncommenting and configuring the rewrite module.
Finally tell Apache 2.x to load your new configuration.
sudo /etc/init.d/apache2 reload
Ideally you should set all permissions as needed for the Apache 2 group, but finding it can be painful.
Find out what the Apache group name is. It is usually apache or www-data, and it can be found in apache2.conf or httpd.conf under /etc/apache2/ or /etc/httpd/. Let's assume it's www-data. Then:
sudo chgrp -R www-data data work
chmod -R g+rwx data work
Actually, due to the joy of Linux segmentation you can find the group elsewhere as well. Here is a small heuristic that works on at least two Linux distributions:
locate --regex '(apache2|httpd)\.conf$' | xargs cat | grep 'Group'
If what you get from this looks funky, say with a leading $, try this:
locate envvars | grep -E '(apache2|httpd)' | grep '/etc/' \
| xargs cat | grep -i 'group'
If this doesn't work either, dive into /etc/group and hope that you can find something that at least looks like apache or www-data:
cat /etc/group | cut -d : -f 1 | sort
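To avoid reading the whole list by eye, the output can be narrowed to names that look web-server related. This is a rough sketch only: `likely_web_groups` is a hypothetical helper, and the pattern (apache, httpd, www) is a guess based on the group names mentioned in this document, so a group with an unrelated name would be missed.

```shell
#!/bin/sh
# Hypothetical helper (not part of brat): reads group(5)-formatted lines on
# stdin and prints the group names that look web-server related.
likely_web_groups() {
    cut -d : -f 1 | grep -i -E 'apache|httpd|www' | sort
}

if [ -r /etc/group ]; then
    likely_web_groups < /etc/group
fi
```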
On a Mac, Apache configuration is quite different, and Aptitude is not available. Instead, we recommend Homebrew if there are packages you need to install (you will also need a compiler -- either Xcode, or this alternative).
Enable Apache via System Preferences -> Sharing -> Web Sharing.
Clone this repository into ~/Sites. Edit /private/etc/apache2/users/$USER.conf. Then invoke sudo apachectl restart.
The default user and group name for Apache is _www (as found in /private/etc/apache2/httpd.conf), for use in chgrp.
brat had a very flaky back-up system (actually my fault for my early morning hack), but there is an alternative script that you can use with good old cron. Do as follows.
Add a line like the one below to the crontab for your Apache user:
0 5 * * * ${PATH_TO_BRAT_INSTALLATION}/tools/backup.py
You will now have a back-up made into your work directory every morning at five o'clock.
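If you want to move the back-up to another time, it helps to see how cron reads the line above. The snippet below is an illustrative sketch, not part of brat: `describe_cron_line` is a hypothetical helper that labels the five schedule fields of a crontab entry and prints the command that follows them.

```shell
#!/bin/sh
# Hypothetical helper (not part of brat): label the five schedule fields and
# the command of a single crontab line.
describe_cron_line() {
    # set -f stops the shell from expanding the * fields as file globs.
    set -f
    set -- $1
    set +f
    echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
    shift 5
    echo "command=$*"
}

describe_cron_line '0 5 * * * ${PATH_TO_BRAT_INSTALLATION}/tools/backup.py'
```

Changing the second field from 5 to 23, for example, would run the back-up at eleven in the evening instead.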
For help and feedback please contact the authors below, preferably with all of them on CC since their responsibilities and availability may vary:
- Goran Topić <goran is s u-tokyo ac jp>
- Sampo Pyysalo <smp is s u-tokyo ac jp>
- Pontus Stenetorp <pontus is s u-tokyo ac jp>