NLP-Filter

Experimenting with an offensive-speech filter, using synthetic data generated from templates.

GOAL

Trying out various simple classification algorithms.

Working language: German.

  • NOTE 1: since all the data is generated from templates, it contains clear patterns and none of the "noise" encountered in the wild.
  • NOTE 2: it might be interesting to extend this using (in order of complexity) a) word embeddings or b) language models, and then test its performance on a publicly available dataset.
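
As a rough illustration of the idea, here is a minimal sketch of template-based data generation followed by one simple classifier (multinomial naive Bayes with add-one smoothing, standard library only). The templates, word lists, labels, and the `generate`, `tokenize`, and `NaiveBayes` names are all hypothetical and invented for this example; they are not taken from the repository's code or data.

```python
import math
import random
from collections import Counter

# Hypothetical templates and word lists, invented for illustration only.
TEMPLATES = ["Du bist ein {w}.", "Er ist wirklich ein {w}.", "Was für ein {w}!"]
OFFENSIVE = ["Idiot", "Dummkopf", "Trottel"]
NEUTRAL = ["Freund", "Lehrer", "Nachbar"]


def generate(templates, words, label):
    """Fill every template with every word and attach the label."""
    return [(t.format(w=w), label) for t in templates for w in words]


def tokenize(text):
    """Lowercase, strip basic punctuation, and split on whitespace."""
    return text.lower().replace(".", " ").replace("!", " ").split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples):
        self.word_counts = {}          # label -> Counter of token counts
        self.class_counts = Counter()  # label -> number of training sentences
        self.vocab = set()
        for text, label in samples:
            self.class_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


data = generate(TEMPLATES, OFFENSIVE, "offensive") + generate(TEMPLATES, NEUTRAL, "ok")
random.Random(0).shuffle(data)  # fixed seed so runs are reproducible
model = NaiveBayes().fit(data)
print(model.predict("Du bist ein Trottel."))
print(model.predict("Er ist ein Lehrer."))
```

Because every training sentence comes from a handful of templates, the classifier effectively learns only which filler words belong to which class. A sentence made up entirely of unseen words is scored on the class priors alone, which is one concrete way the lack of real-world "noise" mentioned in NOTE 1 shows up.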