Skip to content

Latest commit

 

History

History
40 lines (20 loc) · 1.4 KB

File metadata and controls

40 lines (20 loc) · 1.4 KB

13. You want to build a new email anti-spam system. Your team has several ideas:

->

  • Collect a huge training set of spam email. For example, set up a “honeypot”: deliberately send fake email addresses to known spammers, so that you can automatically harvest the spam messages they send to those addresses. ->

  • Develop features for understanding the text content of the email. ->

  • Develop features for understanding the email envelope/header features to show what set ->

of internet servers the message went through. ->

  • and more. ->

Even though I have worked extensively on anti-spam, I would still have a hard time picking one of these directions. It is even harder if you are not an expert in the application area. ->

So don’t start off trying to design and build the perfect system. Instead, build and train a basic system quickly—perhaps in just a few days [5]. Even if the basic system is far from the “best” system you can build, it is valuable to examine how the basic system functions: you will quickly find clues that show you the most promising directions in which to invest your time. These next few chapters will show you how to read these clues. ->

img


[5] This advice is meant for readers wanting to build AI applications, rather than those whose goal is to publish academic papers. I will later return to the topic of doing research. ->