Split IATI activity files to limit the number of activities in each one. Also does some basic filtering.

matmaxgeds/iatisplit

 
 

iatisplit - split IATI activity files into smaller chunks

This is an early beta version of a Python command-line utility that splits IATI activity files into smaller chunks, limiting the maximum number of activities in each file.

Usage

Split into files containing a maximum of 100 activities each:

$ iatisplit -n 100 input-data.xml

Command-line options

The only required option is --max-activities / -n.

--max-activities NUMBER

-n NUMBER

Required. Maximum number of IATI activities to include in each output file.
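The chunking idea behind this option can be sketched with the standard library alone. This is a minimal illustration, not iatisplit's actual implementation: the function name chunk_activities and the sample identifiers are hypothetical, and the sketch assumes the activities are direct iati-activity children of the root element.

```python
import xml.etree.ElementTree as ET


def chunk_activities(xml_text, max_activities):
    """Return a list of XML strings, each holding at most max_activities activities.

    Hypothetical helper for illustration; not part of iatisplit.
    """
    if max_activities < 1:
        raise ValueError("max_activities must be at least 1")
    root = ET.fromstring(xml_text)
    activities = root.findall("iati-activity")
    chunks = []
    for start in range(0, len(activities), max_activities):
        # Reuse the root tag and its attributes (e.g. version) for every chunk
        new_root = ET.Element(root.tag, root.attrib)
        new_root.extend(activities[start:start + max_activities])
        chunks.append(ET.tostring(new_root, encoding="unicode"))
    return chunks


# Made-up sample document with five activities
SAMPLE = "<iati-activities version='2.03'>" + "".join(
    "<iati-activity><iati-identifier>XM-EX-%d</iati-identifier></iati-activity>" % i
    for i in range(5)
) + "</iati-activities>"

chunks = chunk_activities(SAMPLE, 2)
print(len(chunks))  # five activities in groups of two give three chunks
```

The last chunk may hold fewer activities than the maximum, which is why the option sets a limit rather than an exact size.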

--output-directory DIRECTORY

-d DIRECTORY

Output directory for split IATI documents (defaults to ".", which may fail on non-Unix systems). The directory must already exist. iatisplit will overwrite existing files in the directory.

--output-stub FILENAME

-o FILENAME

Base filename for all output files. If not provided, iatisplit tries to guess it from the input filename or URL.

--start-date YYYY-MM-DD

-s YYYY-MM-DD

Include only IATI activities that start on or after this date. Uses the actual start date if present, then falls back to the planned start date.

--end-date YYYY-MM-DD

-e YYYY-MM-DD

Include only IATI activities that end on or before this date. Uses the actual end date if present, then falls back to the planned end date.
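The "actual date first, then planned date" fallback used by both date options might look roughly like the sketch below. It is an illustration only: the helper names are hypothetical, it assumes IATI 2.x numeric activity-date type codes (1 planned start, 2 actual start, 3 planned end, 4 actual end), and it assumes that an activity with no usable date is excluded once a filter is set, which the text above does not specify.

```python
import xml.etree.ElementTree as ET


def effective_date(activity, actual_code, planned_code):
    """Return the actual date if present, otherwise the planned date, otherwise None."""
    dates = {d.get("type"): d.get("iso-date") for d in activity.findall("activity-date")}
    return dates.get(actual_code) or dates.get(planned_code)


def in_date_range(activity, start_date=None, end_date=None):
    """Hypothetical filter mirroring --start-date / --end-date."""
    start = effective_date(activity, "2", "1")
    end = effective_date(activity, "4", "3")
    # ISO 8601 dates (YYYY-MM-DD) sort correctly as plain strings.
    # Assumption: a missing date fails the filter.
    if start_date and (start is None or start < start_date):
        return False
    if end_date and (end is None or end > end_date):
        return False
    return True


# Made-up activity: planned and actual start, planned end only
ACTIVITY = ET.fromstring(
    "<iati-activity>"
    "<activity-date type='1' iso-date='2020-01-01'/>"
    "<activity-date type='2' iso-date='2020-03-01'/>"
    "<activity-date type='3' iso-date='2021-12-31'/>"
    "</iati-activity>"
)

print(effective_date(ACTIVITY, "2", "1"))  # the actual start takes precedence
print(effective_date(ACTIVITY, "4", "3"))  # no actual end, so the planned end is used
```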

--humanitarian-only

-H

Include only IATI activities with the humanitarian marker on the activity or one of its transactions (IATI 2.02 and above).
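As a rough sketch of what this check involves, the snippet below looks for the humanitarian attribute on an activity and on its transactions. The function name is hypothetical and the code is not taken from iatisplit; it assumes the attribute uses the XML boolean spellings "1" or "true".

```python
import xml.etree.ElementTree as ET

TRUE_VALUES = {"1", "true"}  # XML Schema boolean spellings for true


def is_humanitarian(activity):
    """Hypothetical check: marker on the activity or on any of its transactions."""
    if activity.get("humanitarian") in TRUE_VALUES:
        return True
    return any(
        t.get("humanitarian") in TRUE_VALUES
        for t in activity.findall("transaction")
    )


# Made-up activities covering the three cases
ON_ACTIVITY = ET.fromstring("<iati-activity humanitarian='1'/>")
ON_TRANSACTION = ET.fromstring(
    "<iati-activity><transaction humanitarian='true'/></iati-activity>"
)
UNMARKED = ET.fromstring("<iati-activity><transaction/></iati-activity>")

print(is_humanitarian(ON_ACTIVITY), is_humanitarian(ON_TRANSACTION), is_humanitarian(UNMARKED))
```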

--transaction-type TYPE

Include only activities with at least one transaction of the specified type.

--transaction-start-date YYYY-MM-DD

Include only activities with at least one transaction on or after the specified date.

--transaction-end-date YYYY-MM-DD

Include only activities with at least one transaction before or on the specified date.
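The three transaction options can be pictured as a search for at least one matching transaction. The sketch below is illustrative only and uses a hypothetical function name. It assumes that, when several transaction options are combined, a single transaction must satisfy all of them; the descriptions above do not say whether iatisplit behaves this way. The type codes in the sample data follow the IATI 2.x codelist (2 is an outgoing commitment, 3 a disbursement).

```python
import xml.etree.ElementTree as ET


def has_matching_transaction(activity, transaction_type=None,
                             start_date=None, end_date=None):
    """Return True if at least one transaction meets every criterion given."""
    for transaction in activity.findall("transaction"):
        type_el = transaction.find("transaction-type")
        date_el = transaction.find("transaction-date")
        code = type_el.get("code") if type_el is not None else None
        date = date_el.get("iso-date") if date_el is not None else None
        if transaction_type and code != transaction_type:
            continue
        # ISO 8601 dates compare correctly as strings; undated transactions never match
        if start_date and (date is None or date < start_date):
            continue
        if end_date and (date is None or date > end_date):
            continue
        return True
    return False


# Made-up activity with one commitment and one disbursement
ACTIVITY = ET.fromstring(
    "<iati-activity>"
    "<transaction><transaction-type code='2'/>"
    "<transaction-date iso-date='2019-05-01'/></transaction>"
    "<transaction><transaction-type code='3'/>"
    "<transaction-date iso-date='2020-07-15'/></transaction>"
    "</iati-activity>"
)

print(has_matching_transaction(ACTIVITY, transaction_type="3"))
print(has_matching_transaction(ACTIVITY, transaction_type="3", end_date="2019-12-31"))
```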

--verbose

Print detailed debugging information about processing.

--quiet

Print only error messages.

--version

Print program version and exit.

--help

-h

Print usage information and exit.

Output

The output will appear in a number of files in the output directory (the current working directory by default), each with an additional 3-digit number before the original extension. For example, splitting the input file input-data.xml will produce the following output files:

  • input-data.001.xml
  • input-data.002.xml
  • input-data.003.xml

etc.
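The naming scheme above can be reproduced in a few lines. This is a sketch of the pattern only: output_name and guess_stub are hypothetical helpers, and the way iatisplit itself guesses a stub from a filename or URL may differ.

```python
import os
from urllib.parse import urlparse


def guess_stub(file_or_url):
    """Hypothetical guess: take the last path component of a filename or URL."""
    return os.path.basename(urlparse(file_or_url).path)


def output_name(stub, index):
    """Insert a zero-padded 3-digit counter before the extension."""
    base, ext = os.path.splitext(stub)
    return "%s.%03d%s" % (base, index, ext)


stub = guess_stub("https://example.org/data/input-data.xml")
names = [output_name(stub, i) for i in range(1, 4)]
print(names)
```

With the example URL above (a placeholder, not a real data source), the list matches the three filenames shown earlier.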

Calling from Python code

Python code can call the function iatisplit.split.split directly. It has the following signature (echoing the command-line parameters):

def split(
  file_or_url, 
  max, 
  output_dir=".", 
  output_stub=None, 
  start_date=None, 
  end_date=None, 
  humanitarian_only=False,
  transaction_type=None,
  transaction_start_date=None,
  transaction_end_date=None
)
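A call using only the parameters listed in that signature might look like the sketch below. The wrapper name split_humanitarian is hypothetical, the file and directory names are placeholders, and the return value of split is not documented here, so the sketch ignores it. Because the output directory must already exist, the wrapper creates it first.

```python
import os


def split_humanitarian(file_or_url, output_dir, max_activities=100):
    """Split a file into chunks, keeping only humanitarian activities.

    Hypothetical convenience wrapper; requires iatisplit to be installed.
    """
    # Imported lazily so this module still loads where iatisplit is absent
    from iatisplit.split import split

    # iatisplit expects the output directory to exist already
    os.makedirs(output_dir, exist_ok=True)
    split(
        file_or_url,
        max_activities,
        output_dir=output_dir,
        humanitarian_only=True,
    )


# Example call (placeholder names):
# split_humanitarian("input-data.xml", "out")
```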

Requirements

Requires Python 3 and the requests library. See requirements.txt and setup.py. (The pip utility will install requirements automatically.)

Installation

  1. From PyPI:
$ pip install iatisplit

or (if you have both Python2 and Python3 on your system)

$ pip3 install iatisplit
  2. From the source code:
$ python setup.py install

or (if you have both Python2 and Python3 on your system)

$ python3 setup.py install

Source code and bug reporting

The source code is available at https://github.com/davidmegginson/iatisplit/

Please report bugs or feature requests at https://github.com/davidmegginson/iatisplit/issues

Author and license

This code was started by David Megginson, and is released into the Public Domain with no warranty of any kind. See UNLICENSE.md for details.
