● News Feed: An admin should be able to map each publication to its country, categories and RSS feed
● Crawl: The system should be able to pull the latest articles from a publication's RSS feed or sitemap
● Support for deduplication: The system should not re-crawl articles that have already been crawled
● Parser: The system should be able to extract title, body and media links from an HTML page
● News Update: The user should be able to see news from subscribed publications only
● Search: The user should be able to search for articles by keyword
● User Preference: The user should be able to subscribe to the publications and categories of interest
● Low Latency
● High Availability
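
One possible way to meet the deduplication requirement is to normalize each article URL and remember a fingerprint of it, so that the crawler skips links it has already fetched. The sketch below is illustrative only: the names normalize_url, url_fingerprint, SeenStore and filter_new, and the list of tracking parameters, are assumptions rather than part of the requirements, and the in-memory set stands in for whatever persistent store a real deployment would use.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters to ignore; adjust per publication.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}


def normalize_url(url: str) -> str:
    """Canonicalize a URL so trivially different links compare equal."""
    parts = urlsplit(url.strip())
    # Drop tracking parameters and sort the rest for a stable ordering.
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in TRACKING_PARAMS
    )
    # Treat "/a" and "/a/" as the same path; keep "/" for the site root.
    path = parts.path.rstrip("/") or "/"
    # The fragment is discarded because it does not change the fetched page.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), ""))


def url_fingerprint(url: str) -> str:
    """Return a fixed-size key for the normalized URL."""
    return hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()


class SeenStore:
    """In-memory stand-in for a persistent 'already crawled' store."""

    def __init__(self):
        self._seen = set()

    def add_if_new(self, url: str) -> bool:
        """Record the URL and return True only if it was not seen before."""
        fingerprint = url_fingerprint(url)
        if fingerprint in self._seen:
            return False
        self._seen.add(fingerprint)
        return True


def filter_new(urls, store):
    """Keep only the URLs that have not been crawled yet, in input order."""
    return [u for u in urls if store.add_if_new(u)]
```

This only catches repeated URLs. Detecting the same article republished under a different URL would need a content-based fingerprint, which is outside this sketch.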
https://www.youtube.com/watch?v=xRBLqs6Gij4&list=PLmtNcpUq3YIJequI5FneNkiEGiHmwm3_o
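
The parser requirement (extract title, body and media links from an HTML page) could be approached along the following lines. This is a minimal sketch using only Python's standard html.parser module. The names ArticleParser and parse_article are hypothetical, and the choices to treat text inside <p> tags as the body and the src attribute of img, video, audio and source tags as media links are simplifying assumptions. Real publication pages usually need per-site rules or a dedicated extraction library.

```python
from html.parser import HTMLParser

# Assumed set of tags whose src attribute counts as a media link.
MEDIA_TAGS = ("img", "video", "audio", "source")


class ArticleParser(HTMLParser):
    """Collects the <title>, paragraph text and media URLs of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self.media_links = []
        self._in_title = False
        self._in_paragraph = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            self._in_paragraph = True
            self._buffer = []
        elif tag in MEDIA_TAGS and attributes.get("src"):
            self.media_links.append(attributes["src"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_paragraph:
            self._in_paragraph = False
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_paragraph:
            self._buffer.append(data)


def parse_article(html: str) -> dict:
    """Return the title, body text and media links found in the page."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "body": "\n".join(parser.paragraphs),
        "media": parser.media_links,
    }
```

Media links are returned exactly as they appear in the page. Relative links would still need to be resolved against the article URL, for example with urllib.parse.urljoin.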