📑 Paper: https://arxiv.org/abs/2501.00888
🌏 Chinese Web Demo: https://modelscope.cn/studios/vickywu1022/CHRONOS
- We propose CHRONOS, a novel retrieval-based approach to Timeline Summarization (TLS) by iteratively posing questions about the topic and the retrieved documents to generate chronological summaries.
- We construct an up-to-date dataset for open-domain TLS, which surpasses existing public datasets in terms of both size and the duration of timelines.
- Experiments demonstrate that our method is effective on open-domain TLS and achieves results comparable to state-of-the-art closed-domain TLS methods, with significant improvements in efficiency and scalability.
We release our Open-TLS dataset for open-domain Timeline Summarization.
The target news queries are listed in `news_keywords.py`, and the ground-truth timelines are stored in `data/open/{NEWS_KEYWORD}/timelines.jsonl` in the following format:
[["YYYY-MM-DDT00:00:00", ["", "", ""]]]
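As a rough illustration of the format above, the sketch below parses a timeline into date-sorted `(date, summaries)` pairs. It assumes each line of `timelines.jsonl` holds one timeline; the helper names `parse_timeline` and `load_timelines` and the sample events are hypothetical and not part of this repository.

```python
import json
from datetime import datetime


def parse_timeline(line):
    """Parse one JSONL line into (date, summaries) pairs sorted by date."""
    entries = json.loads(line)
    timeline = []
    for timestamp, summaries in entries:
        # Timestamps look like "YYYY-MM-DDT00:00:00"; keep only the date part.
        day = datetime.fromisoformat(timestamp).date()
        timeline.append((day, list(summaries)))
    return sorted(timeline, key=lambda pair: pair[0])


def load_timelines(path):
    """Read every non-empty line of a timelines.jsonl file (assumed layout)."""
    with open(path, encoding="utf-8") as f:
        return [parse_timeline(line) for line in f if line.strip()]


# Made-up sample in the documented format, for illustration only.
sample = (
    '[["2024-03-02T00:00:00", ["Second event."]], '
    '["2024-03-01T00:00:00", ["First event.", "More detail."]]]'
)
timeline = parse_timeline(sample)
for day, summaries in timeline:
    print(day.isoformat(), summaries)
```

With a real keyword directory, `load_timelines("data/open/{NEWS_KEYWORD}/timelines.jsonl")` (with the placeholder filled in) would return one parsed timeline per line, provided the one-timeline-per-line assumption holds.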
pip install -r requirements.txt
The second step is to construct a topic-questions example pool for the datasets in `data/`.
python question_exampler.py
Alternatively, you can use the provided `data/question_examples.json`, which contains examples for the crisis, T17, and Open-TLS datasets.
🔥 To be continued...
@article{wu2025unfoldingheadlineiterativeselfquestioning,
title={Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization},
author={Weiqi Wu and Shen Huang and Yong Jiang and Pengjun Xie and Fei Huang and Hai Zhao},
year={2025},
eprint={2501.00888},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.00888},
}