good-bad-names-for-GN

Classify entries in the scientific names database as 'trusted' or 'not trusted'

How to run it?

requirements

  • MySQL
  • Python 2.7
  • Java
  • NetiNeti
  • TaxonFinder
  • Parser
  • Data
    • GN database
    • VertNet data
    • datasource authority

steps to produce the results

1. generate the feature table (name_string_refinery)

 Feature explanation:
 https://docs.google.com/document/d/1mblzmi1o0dm70OSvR0qR7vrQ69KBONw_wroArRMeCPc/edit

2. build the good-bad classifier

3. run the classifier

4. write the predictions back into the table
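
The four steps can be sketched end to end as below. This is only a minimal illustration, not the project's actual code: it uses an in-memory SQLite database as a stand-in for MySQL, a hypothetical table `name_features` with two made-up features (`word_count`, `has_digit`), and a toy nearest-centroid classifier in place of whatever model the project really trains. The real features are described in the linked document.

```python
import sqlite3

# Hypothetical sample data: (name string, known label or None if unlabelled).
SAMPLE = [
    ("Homo sapiens", "trusted"),
    ("Puma concolor", "trusted"),
    ("sp. 123 unknown", "not trusted"),
    ("xx9", "not trusted"),
    ("Felis catus", None),
    ("abc 42", None),
]


def build_feature_table(conn, rows):
    """Step 1: create the (hypothetical) feature table and fill it."""
    conn.execute(
        "CREATE TABLE name_features ("
        "name TEXT PRIMARY KEY, word_count INTEGER, has_digit INTEGER, "
        "label TEXT, prediction TEXT)"
    )
    for name, label in rows:
        word_count = len(name.split())
        has_digit = int(any(ch.isdigit() for ch in name))
        conn.execute(
            "INSERT INTO name_features VALUES (?, ?, ?, ?, NULL)",
            (name, word_count, has_digit, label),
        )


def train_centroids(conn):
    """Step 2: average the feature vectors of each labelled class."""
    sums = {}
    query = ("SELECT word_count, has_digit, label FROM name_features "
             "WHERE label IS NOT NULL")
    for word_count, has_digit, label in conn.execute(query):
        acc = sums.setdefault(label, [0.0, 0.0, 0])
        acc[0] += word_count
        acc[1] += has_digit
        acc[2] += 1
    return dict((label, (a[0] / a[2], a[1] / a[2])) for label, a in sums.items())


def predict(centroids, features):
    """Step 3: return the label whose centroid is closest (squared distance)."""
    def distance(label):
        return sum((x - c) ** 2 for x, c in zip(features, centroids[label]))
    return min(sorted(centroids), key=distance)


def write_back(conn, centroids):
    """Step 4: store a prediction for every row of the feature table."""
    rows = conn.execute(
        "SELECT name, word_count, has_digit FROM name_features").fetchall()
    for name, word_count, has_digit in rows:
        conn.execute(
            "UPDATE name_features SET prediction = ? WHERE name = ?",
            (predict(centroids, (word_count, has_digit)), name),
        )


def run_pipeline(rows):
    conn = sqlite3.connect(":memory:")
    build_feature_table(conn, rows)
    centroids = train_centroids(conn)
    write_back(conn, centroids)
    return dict(conn.execute("SELECT name, prediction FROM name_features"))


if __name__ == "__main__":
    for name, prediction in sorted(run_pipeline(SAMPLE).items()):
        print("%s -> %s" % (name, prediction))
```

Keeping the predictions in the same table as the features mirrors step 4; against a real MySQL instance the connection and the SQL parameter style would differ.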