The liac-arff module implements functions to read and write ARFF files in Python.
Attribute-Relation File Format (ARFF) is the text file format used by Weka to store datasets. This module is an ARFF file handler modeled on other Python parser modules (such as json and yaml).
NOTE: You can clone the arff-datasets repository for a large set of ARFF files.
- Reads and writes ARFF files using Python built-in structures;
- Supports NUMERIC, REAL, INTEGER, STRING and NOMINAL attribute types;
- Supports names with spaces;
- Reads and writes the description of the file;
- MIT license.
Via easy_install:
$ easy_install liac-arff
Manually:
$ python setup.py install
You can read an ARFF file as follows:
>>> import arff
>>> data = arff.load(open('wheater.arff', 'rb'))
Which results in:
>>> data
{
    'attributes': [
        ('outlook', ['sunny', 'overcast', 'rainy']),
        ('temperature', 'REAL'),
        ('humidity', 'REAL'),
        ('windy', ['TRUE', 'FALSE']),
        ('play', ['yes', 'no'])],
    'data': [
        ['sunny', 85.0, 85.0, 'FALSE', 'no'],
        ['sunny', 80.0, 90.0, 'TRUE', 'no'],
        ['overcast', 83.0, 86.0, 'FALSE', 'yes'],
        ['rainy', 70.0, 96.0, 'FALSE', 'yes'],
        ['rainy', 68.0, 80.0, 'FALSE', 'yes'],
        ['rainy', 65.0, 70.0, 'TRUE', 'no'],
        ['overcast', 64.0, 65.0, 'TRUE', 'yes'],
        ['sunny', 72.0, 95.0, 'FALSE', 'no'],
        ['sunny', 69.0, 70.0, 'FALSE', 'yes'],
        ['rainy', 75.0, 80.0, 'FALSE', 'yes'],
        ['sunny', 75.0, 70.0, 'TRUE', 'yes'],
        ['overcast', 72.0, 90.0, 'TRUE', 'yes'],
        ['overcast', 81.0, 75.0, 'FALSE', 'yes'],
        ['rainy', 71.0, 91.0, 'TRUE', 'no']],
    'description': u'',
    'relation': 'weather'
}
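Because the loaded object is a plain dictionary, ordinary Python code is enough to work with it. The sketch below is illustrative only: the helper names `attribute_names` and `rows_as_dicts` are hypothetical and not part of liac-arff, and the sample dictionary is a shortened copy of the structure shown above.

```python
# Shortened copy of the structure returned by arff.load() above.
data = {
    'relation': 'weather',
    'description': '',
    'attributes': [
        ('outlook', ['sunny', 'overcast', 'rainy']),
        ('temperature', 'REAL'),
        ('humidity', 'REAL'),
        ('windy', ['TRUE', 'FALSE']),
        ('play', ['yes', 'no']),
    ],
    'data': [
        ['sunny', 85.0, 85.0, 'FALSE', 'no'],
        ['overcast', 83.0, 86.0, 'FALSE', 'yes'],
        ['rainy', 65.0, 70.0, 'TRUE', 'no'],
    ],
}


def attribute_names(dataset):
    """Return the attribute names in column order (hypothetical helper)."""
    return [name for name, _type in dataset['attributes']]


def rows_as_dicts(dataset):
    """Pair every value in a row with its attribute name (hypothetical helper)."""
    names = attribute_names(dataset)
    return [dict(zip(names, row)) for row in dataset['data']]


records = rows_as_dicts(data)
print(attribute_names(data))
print(records[0]['outlook'], records[0]['temperature'])
```

Keeping the rows as lists, as the module does, preserves column order; converting to dictionaries is only a convenience when values are looked up by attribute name.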
You can write an ARFF file with this structure:
>>> print(arff.dumps(data))
@RELATION weather
@ATTRIBUTE outlook {sunny, overcast, rainy}
@ATTRIBUTE temperature REAL
@ATTRIBUTE humidity REAL
@ATTRIBUTE windy {TRUE, FALSE}
@ATTRIBUTE play {yes, no}
@DATA
sunny,85.0,85.0,FALSE,no
sunny,80.0,90.0,TRUE,no
overcast,83.0,86.0,FALSE,yes
rainy,70.0,96.0,FALSE,yes
rainy,68.0,80.0,FALSE,yes
rainy,65.0,70.0,TRUE,no
overcast,64.0,65.0,TRUE,yes
sunny,72.0,95.0,FALSE,no
sunny,69.0,70.0,FALSE,yes
rainy,75.0,80.0,FALSE,yes
sunny,75.0,70.0,TRUE,yes
overcast,72.0,90.0,TRUE,yes
overcast,81.0,75.0,FALSE,yes
rainy,71.0,91.0,TRUE,no
%
%
%