- Assign document to class(es)
- Based on document features:
  - Contents
  - Age
  - Popularity
  - …
Notes:
- What are use cases?
  - Standing queries (Google Alerts)
  - Spam filter
  - Webspam detection
  - Language detection
  - Sentiment detection
- Make humans do it
  - Needs domain experts
  - Does not scale with amount of data
- Make machines do it
  - Needs lots of data
Notes:
- How?
- What's the issue with humans doing it?
- Learning Method
  - Computes classifier
- Classifier
  - Determines class of document
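
The split between learning method and classifier can be sketched in Python. This is a minimal illustration, not a method prescribed by these slides: it assumes a multinomial Naive Bayes learner with add-one smoothing, and the names `learn` and `classify` as well as the toy training data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def learn(labeled_docs):
    """Learning method: takes (text, label) pairs and computes a classifier."""
    class_counts = Counter()            # number of training documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocabulary = set()
    for text, label in labeled_docs:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    total_docs = sum(class_counts.values())

    def classify(text):
        """Classifier: determines the most probable class of a document."""
        scores = {}
        for label in class_counts:
            # Log prior of the class
            score = math.log(class_counts[label] / total_docs)
            total_words = sum(word_counts[label].values())
            for word in text.lower().split():
                if word not in vocabulary:
                    continue  # ignore words never seen during training
                # Add-one smoothed log likelihood of the word given the class
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocabulary))
                )
            scores[label] = score
        return max(scores, key=scores.get)

    return classify


# Toy training data (made up for illustration)
training = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
classifier = learn(training)
print(classifier("buy cheap money"))  # prints "spam"
```

The learning method runs once over labeled data; the returned classifier can then be applied to any number of new documents.
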

| Type | Example |
|---|---|
| Single / one-of | Spam |
| Multiple / any-of | Blog categories |
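
The difference between the two types can be sketched in Python as two ways of turning per-class scores into a decision. The function names, the scores, and the threshold of 0.5 are assumptions made for illustration only.

```python
def one_of(scores):
    """Single / one-of: assign exactly one class, the one with the highest score."""
    return max(scores, key=scores.get)


def any_of(scores, threshold=0.5):
    """Multiple / any-of: assign every class whose score reaches the threshold."""
    return sorted(label for label, score in scores.items() if score >= threshold)


# Hypothetical per-class scores for one blog post
scores = {"tech": 0.8, "travel": 0.6, "food": 0.1}
print(one_of(scores))  # prints "tech"
print(any_of(scores))  # prints ['tech', 'travel']
```

In the any-of case each class is decided independently, so a document may end up with several classes or with none.
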

| Type | Example |
|---|---|
| One class | Spam |
| Two class | Sentiment |
| Multi class | Blog categories |
Notes: Examples?