Infrastructure Ombudsman

Dataset

The dataset is provided in JSONL format in the dataset folder.
Each record has three fields: id, label, and source.
id is the unique comment ID, which can be used to retrieve the text through the Reddit or YouTube API.
label is the final label assigned to the text.
source indicates whether the comment is from Reddit or YouTube.
If you need the full dataset including the text, please email the corresponding author of the paper at [email protected]
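
As a rough illustration of working with the three fields described above, the sketch below reads a JSONL file and groups the comment ids by source, so that each group can later be passed to the matching API. The helper names (load_records, ids_by_source, label_counts) are hypothetical and are not part of this repository, and the sketch does not assume any particular encoding of the label or source values.

```python
import json
from collections import Counter
from pathlib import Path


def load_records(path):
    """Read one JSON object per line, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def ids_by_source(records):
    """Group comment ids under whatever value each record's source field holds."""
    grouped = {}
    for record in records:
        grouped.setdefault(record["source"], []).append(record["id"])
    return grouped


def label_counts(records):
    """Count how many records carry each label value."""
    return Counter(record["label"] for record in records)
```

To use it, call load_records with the path of the JSONL file inside the dataset folder (the exact file name is not stated here, so check the folder), then pass the result to ids_by_source or label_counts.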

Code

The code and configuration parameters for training the large language models Llama 2 and Mistral are in the llm.py file under the training folder.
Usage examples and experiments, in their raw form, are in the notebooks folder.

Citation

If you use the dataset or code, please cite the following paper:

@inproceedings{chowdhury2024infrastructure,
  title={Infrastructure Ombudsman: Mining Future Failure Concerns from Structural Disaster Response},
  author={Chowdhury, Md Towhidul Absar and Datta, Soumyajit and Sharma, Naveen and KhudaBukhsh, Ashiqur R.},
  booktitle={Proceedings of the ACM Web Conference 2024},
  year={2024}
}
