IndexByDate.txt Support
Richard Thomson edited this page Oct 25, 2019
·
2 revisions
Manx can assist in ingesting new documents from a site by fetching a file named IndexByDate.txt from the site and treating each line in the file as a document hosted at the site. This format is based on the structure of the bitsavers archive. The IndexByDate.txt file has a structure like this:
2019-10-25 16:24:59 microdata/periph/reflex/Reflex_II_Video_Training_Workbook.pdf
2019-10-25 16:24:58 microdata/periph/reflex/81-1091B_Reflex_II_Technical_Manual_1981.pdf
2019-10-25 03:50:29 ibm/370/CICS/SC33-0096-2_IBM_3270_Data_Stream_Device_Guide_Jul1987.pdf
2019-10-25 03:49:47 ibm/datacomm/GA27-3093-4_SDLC_Concepts_May1992.pdf
Each line consists of a date/time stamp followed by the path to the document, relative to the location of the IndexByDate.txt file. Each document corresponds to one line in this file. Manx can parse this file to identify documents that are not yet in its database and provide a browsing interface to simplify adding these documents. Manx periodically polls the IndexByDate.txt file and, when the file has been updated, downloads a copy for local processing.
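The parsing step described above can be illustrated with a minimal Python sketch. This is not Manx's actual code: the names IndexEntry, parse_index and new_documents are hypothetical, and the sketch assumes every line has the form "YYYY-MM-DD HH:MM:SS path", as in the sample, and that the set of already-known paths is supplied by the caller.

```python
from datetime import datetime
from typing import Iterable, Iterator, List, NamedTuple, Set


class IndexEntry(NamedTuple):
    """One line of IndexByDate.txt (hypothetical helper type)."""
    timestamp: datetime
    path: str  # relative to the location of IndexByDate.txt


def parse_index(lines: Iterable[str]) -> Iterator[IndexEntry]:
    """Yield an entry for each well-formed line; skip anything else."""
    for line in lines:
        # Split off the date and the time; the rest is the path,
        # which is kept whole even if it contains spaces.
        parts = line.strip().split(None, 2)
        if len(parts) != 3:
            continue
        date_text, time_text, path = parts
        try:
            stamp = datetime.strptime(
                f"{date_text} {time_text}", "%Y-%m-%d %H:%M:%S"
            )
        except ValueError:
            continue  # not a date/time stamp in the assumed format
        yield IndexEntry(stamp, path)


def new_documents(
    entries: Iterable[IndexEntry], known_paths: Set[str]
) -> List[IndexEntry]:
    """Return the entries whose path is not already known.

    known_paths stands in for a database lookup (an assumption).
    """
    return [entry for entry in entries if entry.path not in known_paths]
```

For example, calling parse_index on the lines of a downloaded copy of the file and passing the result to new_documents, together with the set of paths already recorded, gives the list of candidate documents to present for adding.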