Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #3 from datamol-io/refactoring
API Refactoring
Showing 91 changed files with 14,809 additions and 23,094 deletions.
Empty file.
This file was deleted.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# `medchem.catalogs` | ||
|
||
::: medchem.catalogs.list_named_catalogs | ||
::: medchem.catalogs.merge_catalogs | ||
::: medchem.catalogs.catalog_from_smarts | ||
::: medchem.catalogs.NamedCatalogs |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
# `medchem.rules` | ||
# `medchem.complexity` | ||
|
||
::: medchem.complexity.complexity_filter | ||
|
||
--- | ||
|
||
::: medchem.complexity._complexity_calc | ||
::: medchem.complexity.ComplexityFilter | ||
::: medchem.complexity.WhitlockCT | ||
::: medchem.complexity.BaroneCT | ||
::: medchem.complexity.SMCM | ||
::: medchem.complexity.TWC |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# `medchem.constraints` | ||
|
||
::: medchem.constraints.Constraints |
This file was deleted.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# `medchem.functional` | ||
|
||
::: medchem.functional.alert_filter | ||
::: medchem.functional.nibr_filter | ||
::: medchem.functional.catalog_filter | ||
::: medchem.functional.chemical_group_filter | ||
::: medchem.functional.rules_filter | ||
::: medchem.functional.complexity_filter | ||
::: medchem.functional.bredt_filter | ||
::: medchem.functional.molecular_graph_filter | ||
::: medchem.functional.lilly_demerit_filter | ||
::: medchem.functional.protecting_groups_filter | ||
::: medchem.functional.macrocycle_filter | ||
::: medchem.functional.atom_list_filter | ||
::: medchem.functional.ring_infraction_filter | ||
::: medchem.functional.num_atom_filter | ||
::: medchem.functional.num_stereo_center_filter | ||
::: medchem.functional.halogenicity_filter | ||
::: medchem.functional.symmetry_filter |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,6 @@ | ||
# `medchem.groups` | ||
|
||
::: medchem.groups | ||
::: medchem.groups.list_default_chemical_groups | ||
::: medchem.groups.list_functional_group_names | ||
::: medchem.groups.get_functional_group_map | ||
::: medchem.groups.ChemicalGroup |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,132 +1,6 @@ | ||
# `medchem.query` | ||
|
||
This module builds filters from a query language that can be parsed. | ||
Unless another parser is specified, the default query parser is used. It supports the instructions below, which can be combined using boolean operators (`or`, `and`, `not`) and parentheses. | ||
|
||
## Example | ||
|
||
```python | ||
import datamol as dm | ||
from medchem.query.eval import QueryFilter | ||
|
||
query = """HASPROP("tpsa" < 120) AND HASSUBSTRUCTURE("[OH]", True)""" | ||
chemical_filter = QueryFilter(query, parser="lalr") | ||
mols = dm.data.cdk2().mol[:10] | ||
chemical_filter(mols, n_jobs=-1) # [False, False, False, False, False, True, True, True, False, False] | ||
``` | ||
|
||
## Syntax | ||
|
||
Every string argument inside a query must be quoted (as in JSON) to avoid ambiguity in parsing. | ||
* An example of a valid query is `"""(HASPROP("tpsa" > 120 ) | HASSUBSTRUCTURE("c1ccccc1")) AND NOT HASALERT("pains") OR HASSUBSTRUCTURE("[OH]", max, 2)"""`. | ||
* Examples of invalid queries are: | ||
    * `"""HASPROP("tpsa" > 120) OR HASSUBSTRUCTURE("[OH]", True, >, 3)"""` : unexpected operator `>` | ||
    * `"""HASPROP(tpsa > 120)"""` : `tpsa` is not quoted | ||
    * `"""HASPROP("tpsa") > 120"""` : this form is not part of the language specification | ||
    * `"""(HASPROP("tpsa" > 120) AND HASSUBSTRUCTURE("[OH]", True, max, 3 )"""` : unbalanced parenthesis `(` | ||
|
||
* `"""HASPROP("tpsa" > 120) OR HASSUBSTRUCTURE("CO")"""`, `"""(HASPROP("tpsa" > 120)) OR (HASSUBSTRUCTURE("CO"))"""` and `"""(HASPROP("tpsa" > 120) OR HASSUBSTRUCTURE("CO"))"""` are equivalent | ||
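The unbalanced-parenthesis case above can be caught before a query ever reaches the parser. The following is a minimal sketch of such a pre-check, not part of `medchem`: the helper name `has_balanced_parentheses` is hypothetical, and it assumes that double-quoted strings never contain escaped quotes. Parentheses inside quoted strings (for example in a SMILES) are ignored.

```python
def has_balanced_parentheses(query: str) -> bool:
    """Return True if parentheses outside double-quoted strings are balanced."""
    depth = 0
    in_string = False
    for char in query:
        if char == '"':
            # Toggle the string state; escaped quotes are not handled in this sketch.
            in_string = not in_string
        elif not in_string:
            if char == "(":
                depth += 1
            elif char == ")":
                depth -= 1
                if depth < 0:
                    # A closing parenthesis appeared before its opening one.
                    return False
    # Balanced only if every parenthesis and every quote was closed.
    return depth == 0 and not in_string


valid = '(HASPROP("tpsa" > 120) OR HASSUBSTRUCTURE("CO"))'
invalid = '(HASPROP("tpsa" > 120) AND HASSUBSTRUCTURE("[OH]", True, max, 3 )'
print(has_balanced_parentheses(valid))    # True
print(has_balanced_parentheses(invalid))  # False
```

A check like this only rejects structurally broken input early; it does not replace the grammar, which remains the authority on what is a valid query.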
|
||
|
||
### HASALERT | ||
Check whether a molecule has an `alert` from a catalog. | ||
```python | ||
# alert is one of the alert catalogs supported by `medchem`, for example `pains` | ||
HASALERT(alert:str) | ||
``` | ||
|
||
### HASGROUP | ||
Check whether a molecule has a specific functional group from a catalog. | ||
|
||
```python | ||
# group is one of the functional groups provided by `medchem` | ||
HASGROUP(group:str) | ||
``` | ||
|
||
|
||
### MATCHRULE | ||
Check whether a molecule matches a predefined drug-likeness `rule` from a catalog. | ||
```python | ||
# rule is one of the rules provided by `medchem`, for example `rule_of_five` | ||
MATCHRULE(rule:str) | ||
``` | ||
|
||
### HASSUPERSTRUCTURE | ||
Check whether a molecule has `query` as a superstructure. | ||
```python | ||
# query is a SMILES | ||
HASSUPERSTRUCTURE(query:str) | ||
``` | ||
|
||
### HASSUBSTRUCTURE | ||
Check whether a molecule has `query` as a substructure. | ||
**Note that providing the comma separator `,` is _mandatory_ here as each variable is an argument.** | ||
|
||
```python | ||
# query is a SMILES or a SMARTS, operator is defined below, is_smarts is a boolean | ||
|
||
HASSUBSTRUCTURE(query:str, is_smarts:Optional[bool], operator:Optional[str], limit:Optional[int]) | ||
|
||
# which corresponds to setting these default values | ||
HASSUBSTRUCTURE(query:str, is_smarts=False, operator="min", limit=1) | ||
# same as | ||
HASSUBSTRUCTURE(query:str, is_smarts=None, operator=None, limit=None) | ||
``` | ||
|
||
Optional arguments may be omitted, but those that are given must appear in the exact order shown above. Thus: | ||
|
||
* `HASSUBSTRUCTURE("CO")` | ||
* `HASSUBSTRUCTURE("CO", False)` | ||
* `HASSUBSTRUCTURE("CO", False, min)` | ||
* `HASSUBSTRUCTURE("CO", False, min, 1)` | ||
|
||
are all valid and equivalent (given the default values). | ||
|
||
Furthermore, because the argument mapping can be inferred when there is no ambiguity, the following are valid but discouraged: | ||
|
||
* `HASSUBSTRUCTURE("CO", False, 1)` | ||
* `HASSUBSTRUCTURE("CO", min, 1)` | ||
|
||
However, this is invalid: | ||
* `HASSUBSTRUCTURE("CO", min, False, 1)` | ||
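The ordering rule above can be pictured as filling three ordered slots (`is_smarts`, `operator`, `limit`) from left to right, where a value may skip slots it does not fit but can never go back. The sketch below illustrates that idea only; it is not the parser's actual implementation. The helper `resolve_substructure_args` and the slot table are hypothetical, and operators are passed as Python strings here even though the query language also accepts them unquoted.

```python
from typing import Any, Dict

# Hypothetical slot table, not medchem code. Names and defaults follow the text above.
_SLOTS = (
    ("is_smarts", lambda v: isinstance(v, bool)),
    ("operator", lambda v: isinstance(v, str) and v.lower() in ("min", "max")),
    # bool is a subclass of int in Python, so it is excluded explicitly.
    ("limit", lambda v: isinstance(v, int) and not isinstance(v, bool)),
)
_DEFAULTS = {"is_smarts": False, "operator": "min", "limit": 1}


def resolve_substructure_args(query: str, *optional: Any) -> Dict[str, Any]:
    """Map optional positional values onto the ordered slots, or raise ValueError."""
    resolved = {"query": query, **_DEFAULTS}
    slot_index = 0
    for value in optional:
        # Skip forward to the first remaining slot that accepts this value.
        while slot_index < len(_SLOTS) and not _SLOTS[slot_index][1](value):
            slot_index += 1
        if slot_index == len(_SLOTS):
            raise ValueError(f"argument {value!r} is out of order or of an unexpected type")
        name = _SLOTS[slot_index][0]
        resolved[name] = value.lower() if name == "operator" else value
        slot_index += 1
    return resolved


print(resolve_substructure_args("CO", False, 1))
try:
    resolve_substructure_args("CO", "min", False, 1)
except ValueError as error:
    print("invalid:", error)
```

Under this model, `("CO", False, 1)` and `("CO", "min", 1)` resolve without ambiguity, while `("CO", "min", False, 1)` fails because `False` arrives after the `operator` slot has already been consumed.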
|
||
|
||
### HASPROP | ||
Check whether a molecule has `prop` as property within a defined limit. | ||
**Any comma `,` provided between arguments will be ignored** | ||
|
||
```python | ||
# prop is a valid datamol.descriptors property; comparator is a required comparison operator, defined below | ||
HASPROP(prop:str comparator:str limit:float) | ||
``` | ||
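To make the role of `comparator` concrete, here is a minimal sketch of how the comparator symbols listed under "Basic operators" could be mapped onto Python's standard `operator` functions. This is an illustration under assumptions, not the library's code: the table `COMPARATORS` and the function `compare` are hypothetical names, and the property value is passed in directly rather than computed from a molecule.

```python
import operator
from typing import Callable, Dict

# Hypothetical lookup table: `=` and `==` are treated as the same equality test.
COMPARATORS: Dict[str, Callable[[float, float], bool]] = {
    "=": operator.eq,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def compare(value: float, comparator: str, limit: float) -> bool:
    """Evaluate `value <comparator> limit`, rejecting unknown comparators."""
    try:
        func = COMPARATORS[comparator]
    except KeyError:
        raise ValueError(f"unsupported comparator: {comparator!r}") from None
    return func(value, limit)


# A molecule whose (precomputed) tpsa is 95.0, tested against `"tpsa" < 120`.
print(compare(95.0, "<", 120))  # True
```

The same table would serve `LIKE`, which takes a comparator and a limit in the same positions.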
|
||
### LIKE | ||
Check whether a molecule is similar enough to another molecule. | ||
**Any comma `,` provided between arguments will be ignored** | ||
|
||
```python | ||
# query is a SMILES | ||
LIKE(query:str comparator:str limit:float) | ||
``` | ||
|
||
### Basic operators: | ||
|
||
* comparator: one of `=`, `==`, `!=`, `<`, `>`, `<=`, `>=` | ||
* misc: the following boolean literals are accepted and parsed: `true`, `false`, `True`, `False`, `TRUE`, `FALSE` | ||
* operator (can be quoted or unquoted): | ||
    * MIN: `min`, `MIN` | ||
    * MAX: `max`, `MAX` | ||
* boolean operator: | ||
    * AND operator : `AND` or `&` or `&&` or `and` | ||
    * OR operator : `OR` or `|` or `||` or `or` | ||
    * NOT operator : `NOT` or `!` or `~` or `not` | ||
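Since each boolean operator has several spellings, one way to think about them is as aliases that collapse to a single canonical token. The sketch below shows this on an already tokenized query. It is an assumption-laden illustration rather than the parser's implementation: `BOOLEAN_ALIASES` and `normalize_boolean_tokens` are hypothetical names, and real tokenization (for instance telling `!` apart from `!=`) is left to the grammar.

```python
from typing import Dict, List

# Hypothetical alias table built from the list of boolean operators above.
BOOLEAN_ALIASES: Dict[str, str] = {
    "AND": "AND", "and": "AND", "&": "AND", "&&": "AND",
    "OR": "OR", "or": "OR", "|": "OR", "||": "OR",
    "NOT": "NOT", "not": "NOT", "!": "NOT", "~": "NOT",
}


def normalize_boolean_tokens(tokens: List[str]) -> List[str]:
    """Replace every boolean operator alias with its canonical form."""
    return [BOOLEAN_ALIASES.get(token, token) for token in tokens]


tokens = ['HASALERT("pains")', "&&", "~", 'HASGROUP("alcohol")', "|", 'MATCHRULE("rule_of_five")']
print(normalize_boolean_tokens(tokens))
```

Tokens that are not boolean operators pass through unchanged, so the function can be applied to a whole token stream. The group name `"alcohol"` is only a placeholder for illustration.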
|
||
|
||
|
||
## API | ||
|
||
::: medchem.query.parser | ||
|
||
--- | ||
|
||
::: medchem.query.eval | ||
::: medchem.query.QueryFilter | ||
::: medchem.query.QueryOperator | ||
::: medchem.query.EvaluableQuery | ||
::: medchem.query.QueryParser |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,16 @@ | ||
# `medchem.rules` | ||
|
||
::: medchem.rules.RuleFilters | ||
|
||
## Basic Rules | ||
|
||
::: medchem.rules.basic_rules | ||
|
||
--- | ||
## Utilities | ||
|
||
::: medchem.rules.rule_filter | ||
::: medchem.rules.in_range | ||
::: medchem.rules.n_heavy_metals | ||
::: medchem.rules.has_spider_chains | ||
::: medchem.rules.n_fused_aromatic_rings | ||
::: medchem.rules.fraction_atom_in_scaff | ||
::: medchem.rules.list_descriptors |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# `medchem.structural` | ||
|
||
::: medchem.structural.CommonAlertsFilters | ||
::: medchem.structural.NIBRFilters | ||
::: medchem.structural.lilly_demerits.LillyDemeritsFilters |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,12 +4,8 @@ | |
|
||
--- | ||
|
||
::: medchem.utils.matches | ||
|
||
--- | ||
|
||
::: medchem.utils.loader | ||
|
||
--- | ||
|
||
::: medchem.utils.graph | ||
::: medchem.utils.graph |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# Medchem CLI | ||
|
||
Medchem provides CLI commands for filtering directly from file paths. CSV, JSON, Excel, Parquet and SDF are supported. | ||
|
||
Available commands can be found with: | ||
|
||
```bash | ||
medchem --help | ||
``` | ||
|
||
To learn more about a specific command: | ||
|
||
```bash | ||
medchem common-alerts --help | ||
``` |