This repository has been archived by the owner on May 28, 2024. It is now read-only.

DCT detection from filename #7

Open
leondz opened this issue Jun 10, 2011 · 2 comments

Comments

@leondz

leondz commented Jun 10, 2011

It's possible to extract the DCT (at day granularity) from filenames. Is this attempted?

From TimeBank:

VOA19980331.1700.1533.tml
WARNING: Could not determine document creation time, use -c to override
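
For illustration, the filename above carries the date as an eight-digit YYYYMMDD run, so day-level extraction could look roughly like the sketch below. This is not code from the project: the function name `dct_from_filename` and the regular expression are assumptions made for the example, and other corpora may name their files differently.

```python
import re
from datetime import date

# Hypothetical pattern: exactly eight consecutive digits, read as YYYYMMDD,
# with no digit immediately before or after the run.
_DATE_RUN = re.compile(r"(?<!\d)(\d{4})(\d{2})(\d{2})(?!\d)")


def dct_from_filename(filename):
    """Return the first valid calendar date found in filename, or None."""
    for match in _DATE_RUN.finditer(filename):
        year, month, day = (int(group) for group in match.groups())
        try:
            return date(year, month, day)
        except ValueError:
            # Eight digits that do not form a real date (e.g. month 13).
            continue
    return None


print(dct_from_filename("VOA19980331.1700.1533.tml"))
```

The four-digit groups after the date (`1700`, `1533`) are ignored here, since only day granularity is being recovered.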

@cnorthwood
Owner

No, it's not attempted. Could be useful though.

@leondz
Author

leondz commented Jul 27, 2011

I've written a small module for managing DCT detection. It works flawlessly on the ~240 docs in TimeBank + ATC, as well as on the 1.8 million in the TAC KBP source collection (save one file which really has no explicit DCT information, and very little inferable either). I'll work on integrating this and adding a fallback "guess DCT from filename / doc content" option, used when no value for -c is specified and no other information is available.
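
The fallback order described here (explicit -c value first, then any other available information, then a guess) could be sketched as follows. This is only an assumed shape, not the module mentioned in the comment: `resolve_dct`, its parameters, and the idea of passing guessers as callables are all hypothetical.

```python
from datetime import date


def resolve_dct(explicit, detected, guessers, filename, text):
    """Pick a DCT, preferring the most reliable source available.

    explicit -- value supplied by the user (the -c override), or None
    detected -- value found by ordinary document-level detection, or None
    guessers -- hypothetical callables taking (filename, text) and
                returning a date or None; tried only as a last resort
    Returns a (value, source) pair so callers can tell a guess from a fact.
    """
    if explicit is not None:
        return explicit, "explicit"
    if detected is not None:
        return detected, "detected"
    for guess in guessers:
        value = guess(filename, text)
        if value is not None:
            return value, "guessed"
    return None, "unknown"


# Usage with a stand-in guesser that always proposes the same date.
stand_in = lambda filename, text: date(1998, 3, 31)
print(resolve_dct(None, None, [stand_in], "VOA19980331.1700.1533.tml", ""))
```

Returning the source alongside the value keeps a guessed DCT distinguishable from one the user or the document supplied, which matters if the existing warning should still be shown for guesses.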
