HistomicsTK is a Python and REST API for the analysis of histopathology images in association with clinical and genomic data.
Histopathology, which involves the examination of thin slices of diseased tissue at cellular resolution using a microscope, is regarded as the gold standard in clinical diagnosis, staging, and prognosis of several diseases, including most types of cancer. The recent emergence and increased clinical adoption of whole-slide imaging systems, which capture large digital images of an entire tissue section at high magnification, has resulted in an explosion of data. Compared to the related areas of radiology and genomics, there is a dearth of mature open-source tools for the management, visualization, and quantitative analysis of the massive and rapidly growing collections of data in the domain of digital pathology. This is precisely the gap that we aim to fill with the development of HistomicsTK.
Developed in coordination with the Digital Slide Archive and large_image, HistomicsTK aims to serve the needs of both pathologists/biologists interested in using state-of-the-art algorithms to analyze their data, and algorithm researchers interested in developing new/improved algorithms and disseminating them for wider use by the community.
HistomicsTK can be used in two ways:
As a pure Python package: This is intended to enable algorithm researchers to use and/or extend the analytics functionality within HistomicsTK in Python. HistomicsTK provides algorithms for fundamental image analysis tasks such as color normalization, color deconvolution, cell-nuclei segmentation, and feature extraction. Please see the api-docs and examples for more information.
This can be installed on Linux via pip install histomicstk.
HistomicsTK uses the large_image library to read various microscopy image formats. Depending on your exact system, installing the necessary libraries to support these formats can be complex. There are some unofficial prebuilt libraries available for Linux that can be included as part of the installation by specifying pip install histomicstk --find-links https://manthey.github.io/large_image_wheels. Note that if you previously installed HistomicsTK or large_image without these, you may need to add --force-reinstall --no-cache-dir to the pip install command to force it to use the find-links option.
The system versions of various libraries are used if the --find-links option is not specified. In that case, you will need to use your package manager to install the appropriate libraries (on Ubuntu, for instance, you'll need libopenslide-dev and libtiff-dev).
As a server-side Girder plugin for web-based analysis: This is intended to allow pathologists/biologists to apply analysis modules/pipelines containerized in HistomicsTK's Docker plugins to data over the web. Girder is a Python-based framework (under active development by Kitware) for building web applications that store, aggregate, and process scientific data. It is built on CherryPy and provides functionality for authentication, access control, customizable metadata association, easy upload/download of data, an abstraction layer that exposes data stored on multiple backends (e.g. native file system, Amazon S3, MongoDB GridFS) through a uniform RESTful API, and, most importantly, an extensible plugin framework for building server-side analytics apps. To inherit all these capabilities, HistomicsTK is being developed to act as a Girder plugin in addition to its use as a pure Python package. To further support web-based analysis, HistomicsTK depends on three other Girder plugins: (i) girder_worker for distributed task execution and monitoring, (ii) large_image for displaying, serving, and reading large multi-resolution images produced by whole-slide imaging systems, and (iii) slicer_cli_web to provide web-based RESTful access to image analysis pipelines developed as slicer execution model CLIs and containerized using Docker.
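As a rough illustration of what access through a uniform RESTful API looks like from a client, the sketch below builds (without sending) a GET request against a Girder server using only the Python standard library. It assumes Girder's default /api/v1 prefix, the item resource with its folderId and limit query parameters, and the Girder-Token authentication header. The server URL, folder id, token, and the helper name girder_request are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_ROOT = "/api/v1"  # assumed default prefix of the Girder REST API


def girder_request(base_url, resource, params=None, token=None):
    """Build (but do not send) a GET request for a Girder REST resource."""
    url = base_url.rstrip("/") + API_ROOT + "/" + resource.strip("/")
    if params:
        url += "?" + urlencode(params)
    headers = {"Accept": "application/json"}
    if token:
        # Header carrying the session token for authenticated calls
        headers["Girder-Token"] = token
    return Request(url, headers=headers, method="GET")


if __name__ == "__main__":
    # Hypothetical server, folder id, and token, for illustration only
    req = girder_request("https://dsa.example.org", "item",
                         params={"folderId": "FOLDER_ID", "limit": 10},
                         token="TOKEN")
    print(req.full_url)
```

Passing the resulting object to urllib.request.urlopen would perform the call against a real server. The request is built separately from sending it so that the URL and headers can be inspected offline.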
Please refer to https://digitalslidearchive.github.io/HistomicsTK/ for more information.
For questions, comments, or to get in touch with the maintainers, head to our Discourse forum, or use our Gitter Chatroom.
This work is funded by the NIH grant U24-CA194362-01.