Text Classification

Supervised Learning

Notes:

What?

  • Assign a document to one or more classes
  • Based on document features:
    • Contents
    • Age
    • Popularity
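
The three kinds of features above can be sketched as a plain feature dictionary. This is only an illustration: the `Document` fields, the `word=` key prefix, and the use of inbound links as a stand-in for popularity are assumptions, not part of the original slides.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical document record; the field names are assumptions."""
    text: str
    age_days: int
    inbound_links: int  # assumed proxy for popularity


def extract_features(doc):
    """Turn a document into a flat feature dictionary."""
    # Contents: one count per lowercased word (bag of words).
    word_counts = Counter(doc.text.lower().split())
    features = {f"word={word}": count for word, count in word_counts.items()}
    # Age and popularity: plain numeric features.
    features["age_days"] = doc.age_days
    features["inbound_links"] = doc.inbound_links
    return features


example = extract_features(Document("Buy cheap cheap pills", 3, 0))
```

A learning method would consume dictionaries like `example` together with a class label for each document.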

Notes:

  • What are use cases?

Why?

  • Standing queries (Google Alerts)
  • Spam filter
  • Webspam detection
  • Language detection
  • Sentiment detection

Notes:

How?

  • Make humans do it
    • Needs domain experts
    • Does not scale with the amount of data
  • Make machines do it
    • Needs lots of labeled training data

Notes:

  • How?
  • What's the issue with humans doing it?

Supervised learning

[Figure: Supervised Learning]

Notes:

Supervised Learning Example

[Figure: Man Woman Classifier]

Notes:

Nomenclature

Learning method    Computes the classifier
Classifier         Determines the class of a document
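
To make the two terms concrete, here is a minimal sketch in which `learn` plays the role of the learning method and the function it returns is the classifier. The slides do not name a specific algorithm; multinomial Naive Bayes with add-one smoothing is used here purely as an assumption for illustration, and the training data is made up.

```python
import math
from collections import Counter, defaultdict


def learn(training_data):
    """Learning method: takes (text, label) pairs, returns a classifier."""
    class_counts = Counter()            # documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocabulary = set()
    for text, label in training_data:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    total_docs = sum(class_counts.values())

    def classify(text):
        """Classifier: returns the most probable class for a text."""
        scores = {}
        for label in class_counts:
            # Log prior of the class.
            score = math.log(class_counts[label] / total_docs)
            total_words = sum(word_counts[label].values())
            for word in text.lower().split():
                if word not in vocabulary:
                    continue  # ignore words never seen in training
                # Add-one smoothed log likelihood of the word.
                count = word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocabulary)))
            scores[label] = score
        return max(scores, key=scores.get)

    return classify


# Made-up labeled examples, for illustration only.
training_data = [
    ("win cheap pills now", "spam"),
    ("cheap offer win money", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
classifier = learn(training_data)
```

Calling `classifier("cheap pills")` then assigns a class to an unseen document. Keeping the learning method and the classifier as separate functions mirrors the nomenclature: training happens once, classification happens per document.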

Class membership

Type                 Example
Single / one-of      Spam
Multiple / any-of    Blog categories
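
The difference between the two membership types can be sketched as two decision rules over per-class scores. The scores, the class names, and the `0.5` threshold below are hypothetical values chosen for illustration.

```python
def one_of(scores):
    """Single / one-of: pick exactly one class, the highest-scoring one."""
    return max(scores, key=scores.get)


def any_of(scores, threshold=0.5):
    """Multiple / any-of: pick every class whose score reaches the threshold."""
    return sorted(label for label, score in scores.items() if score >= threshold)


# Hypothetical classifier scores for one email and one blog post.
email_scores = {"spam": 0.9, "ham": 0.1}
blog_scores = {"tech": 0.8, "travel": 0.6, "food": 0.2}

email_class = one_of(email_scores)
blog_classes = any_of(blog_scores)
```

With one-of, the classes compete and a document always gets exactly one label. With any-of, each class is decided independently, so a document may get several labels or none.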

Classifiers

Type           Example
One class      Spam
Two class      Sentiment
Multi class    Blog categories

Notes: Examples?