contrastive

A Python library for performing unsupervised learning (e.g. PCA) in contrastive settings, where one is interested in patterns (e.g. clusters or clines) that exist in one dataset but not the other.

Applications include discovering subgroups in biological and medical data.

Installation

$ pip install contrastive

Examples

The basic functions of this library are shown below. In general there are two datasets: foreground_data, the dataset in which we want to discover patterns and directions, and background_data, a dataset that does not contain the patterns or directions of interest. In some cases both datasets contain the signal of interest, but it is enriched in the foreground relative to the background. The analysis has a contrast parameter, alpha, which can be treated as a hyperparameter.

from contrastive import CPCA

mdl = CPCA()
projected_data, alphas = mdl.fit_transform(foreground_data, background_data)

# Returns 2-dimensional projections of the foreground data in the list 'projected_data', one for each value of 'alpha'; the alpha values are chosen automatically (4 by default)
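To make the role of alpha more concrete, here is a minimal NumPy sketch of the idea behind contrastive PCA, assuming the standard formulation in which the directions are the top eigenvectors of the foreground covariance minus alpha times the background covariance. It is an illustration only, not this library's implementation: the function contrastive_projection and the synthetic fg_data, bg_data, and labels are hypothetical names introduced for this example.

```python
import numpy as np


def contrastive_projection(foreground, background, alpha, n_components=2):
    """Illustrative sketch (not the library's code): project the foreground
    onto the top eigenvectors of cov(foreground) - alpha * cov(background)."""
    fg = foreground - foreground.mean(axis=0)
    bg = background - background.mean(axis=0)
    cov_fg = fg.T @ fg / (fg.shape[0] - 1)
    cov_bg = bg.T @ bg / (bg.shape[0] - 1)

    # Larger alpha penalizes directions that also vary strongly in the background.
    contrast = cov_fg - alpha * cov_bg

    # The contrast matrix is symmetric, so eigh applies; sort eigenvalues descending.
    eigvals, eigvecs = np.linalg.eigh(contrast)
    top = np.argsort(eigvals)[::-1][:n_components]
    directions = eigvecs[:, top]
    return fg @ directions, directions


# Synthetic data (an assumption for illustration): features 0-4 carry large
# nuisance variance shared by both datasets; features 5-9 carry two subgroups
# that exist only in the foreground.
rng = np.random.default_rng(0)
n, n_noise, n_signal = 400, 5, 5
bg_data = np.hstack([
    rng.normal(0, 10, (n, n_noise)),
    rng.normal(0, 1, (n, n_signal)),
])
labels = np.repeat([0, 1], n // 2)
shift = np.where(labels[:, None] == 1, 3.0, -3.0)
fg_data = np.hstack([
    rng.normal(0, 10, (n, n_noise)),
    rng.normal(0, 1, (n, n_signal)) + shift,
])

for alpha in (0.0, 2.0):
    projected, directions = contrastive_projection(fg_data, bg_data, alpha)
    # Share of the first direction's weight that falls on the subgroup features.
    signal_weight = np.sum(directions[n_noise:, 0] ** 2)
    print(f"alpha={alpha}: weight on subgroup features = {signal_weight:.2f}")
```

With alpha set to 0 the sketch reduces to ordinary PCA on the foreground, so the first direction follows the shared nuisance variance. Raising alpha cancels that variance and shifts the first direction toward the features that separate the foreground subgroups.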

Built-in plotting: to quickly see the contrastive PCA results, set the plot parameter to True:

from contrastive import CPCA

mdl = CPCA()
projected_data, alphas = mdl.fit_transform(foreground_data, background_data, plot=True)

Interactive GUI: if you are running these analyses inside a Jupyter notebook, you can launch an interactive GUI as shown here:

from contrastive import CPCA

mdl = CPCA()
projected_data, alphas = mdl.fit_transform(foreground_data, background_data, gui=True)