dpyp

A convenience tool for small-scale data pipelines in Python

About

dpyp is a convenience tool for data pipelines in Python. It provides functionality for reading and writing batches, cleaning data, diagnosing pipelines, manipulating text, and calculating fields.

PyPI

Usage

  • dpyp consists of seven modules: 'calculate', 'clean', 'diagnose', 'read', 'text', 'write', and 'transform'.
  • Designed for small-scale Python pipelines, with an emphasis on batch processing via 'data-dictionaries'.
  • Processing a batch as a dictionary lets a single function iterate over every dataset in the batch, which improves readability and ease of use.
  • Built on a combination of base Python and pandas to support robust small-scale pipelines with text-manipulation capabilities.
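
The 'data-dictionary' idea above can be illustrated with plain pandas. The sketch below assumes a data-dictionary is an ordinary dict mapping dataset names to pandas DataFrames, which this README does not spell out. The helper `clean_batch` is hypothetical and is not part of the dpyp API; it only shows how one function can be applied across a whole batch.

```python
import pandas as pd


def clean_batch(data_dict):
    """Hypothetical helper (not a dpyp function): tidy every DataFrame in a batch.

    Normalises column names (strip whitespace, lower-case) and drops rows
    in which every value is missing.
    """
    cleaned = {}
    for name, df in data_dict.items():
        df = df.rename(columns=lambda col: col.strip().lower())
        df = df.dropna(how="all")
        cleaned[name] = df
    return cleaned


# Assumed shape of a data-dictionary: dataset name -> DataFrame.
data_dict = {
    "orders": pd.DataFrame(
        {" Order_ID ": [1, 2, None], "Amount ": [9.5, 20.0, None]}
    ),
    "customers": pd.DataFrame(
        {"Name": ["Ann", "Bob"], " City": ["Oslo", "Lima"]}
    ),
}

cleaned = clean_batch(data_dict)

for name, df in cleaned.items():
    print(name, list(df.columns), len(df))
```

Keying the batch by name keeps each dataset addressable after the step runs, so later steps can work on `cleaned["orders"]` or loop over the whole batch again.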

Dependencies

  • pandas
  • pyarrow
  • numpy

Installation

pip install dpyp

License

See LICENSE.md

Contributing

See CONTRIBUTING.md