Asura Enkhbayar edited this page Apr 12, 2021 · 9 revisions

Research Log

  • 01.03.2021 -- started the data collection
  • 09.03.2021 -- finally got the cron job working; the automatic collection script now runs at 6pm every day
  • 15.03.2021 -- noticed that two of the sources (sciblogs, sciline) haven't been publishing frequently enough.
  • 19.03.2021 -- added three new sources (HealthDay, News Medical, and MedPageToday) and started collecting data from them
  • 29.03.2021 -- changed the popsci RSS feed URL because the old one no longer worked
  • 31.03.2021 -- increased the RSS collection frequency to every 3 hours to determine whether feeds are capped at 10 entries
  • 31.03.2021 -- newsmed was still maxing out at 10 articles; collecting every hour now
  • 12.04.2021 -- popsci seems broken. call with Juan: decided to write Twitter crawlers for popsci and iflscience
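
The 31.03.2021 entries describe polling more often to find out whether a feed is capped at 10 entries. A minimal sketch of that check follows; it is not the project's collection script. The names `SUSPECTED_CAP`, `entry_ids`, and `merge_poll` are hypothetical, and the sketch assumes plain RSS 2.0 feeds whose `<item>` elements carry a `<guid>` or `<link>`. The idea: if a poll returns a full feed in which every entry is new, older entries may have been pushed out between polls, so the interval is probably too long.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

SUSPECTED_CAP = 10  # assumption: the cap suspected in the 31.03.2021 entries


def entry_ids(rss_xml: str) -> list[str]:
    """Return one identifier per <item>, preferring <guid> over <link>."""
    root = ET.fromstring(rss_xml)
    ids = []
    for item in root.iter("item"):
        ident = item.findtext("guid") or item.findtext("link")
        if ident:
            ids.append(ident.strip())
    return ids


def merge_poll(seen: set[str], rss_xml: str) -> dict:
    """Add one poll's entries to `seen` and summarise what the poll showed."""
    had_history = bool(seen)  # the very first poll is always "all new"
    ids = entry_ids(rss_xml)
    new = [i for i in ids if i not in seen]
    seen.update(new)
    at_cap = len(ids) >= SUSPECTED_CAP
    return {
        "total": len(ids),
        "new": len(new),
        "at_cap": at_cap,
        # Full feed with no overlap with earlier polls: entries may have
        # dropped off the feed before they were collected.
        "possible_loss": had_history and at_cap and len(new) == len(ids),
    }
```

Calling `merge_poll` once per collection run with the same `seen` set gives a per-poll summary; a feed that repeatedly reports `possible_loss` is a candidate for a shorter polling interval.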