Document similarity and topic clues: A historiographical study case

DH2022 files

data_structured

encpos_structured_sents.json

JSON dump of the structured corpus: a dictionary keyed by position (ENCPOS_ID), then by chapter title, with each chapter mapping to its list of sentences.

{
  "ENCPOS_ID": {
    "metadata": [
    ],
    "chapter_title": [
      "first sentence",
      "second sentence",
      ""
    ],
    "chapter_title": [
    ]
  },
  "ENCPOS_ID": {
    "metadata": [
    ],
    "chapter_title": [
      "first sentence",
      "second sentence",
      ""
    ],
    "chapter_title": [
    ]
  }
}
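
The nested layout above can be walked with a few lines of Python. This is a minimal sketch, not the project's own loader: the helper names `load_corpus` and `iter_sentences`, the sample identifier `ENCPOS_EXAMPLE_1`, and the sample chapter titles are hypothetical, and it assumes that every key other than "metadata" is a chapter title and that empty strings in a sentence list can be skipped.

```python
import json
from pathlib import Path


def load_corpus(path):
    """Read the JSON dump into a plain dict (path is supplied by the caller)."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def iter_sentences(corpus):
    """Yield (encpos_id, chapter_title, sentence) for every non-empty sentence.

    Assumption: "metadata" is the only non-chapter key in each entry.
    """
    for encpos_id, document in corpus.items():
        for chapter_title, sentences in document.items():
            if chapter_title == "metadata":
                continue
            for sentence in sentences:
                if sentence.strip():  # skip empty placeholder strings
                    yield encpos_id, chapter_title, sentence


# Hypothetical in-memory sample mirroring the structure shown above.
sample = {
    "ENCPOS_EXAMPLE_1": {
        "metadata": [],
        "Introduction": ["first sentence", "second sentence", ""],
        "Chapter 1": [],
    }
}

rows = list(iter_sentences(sample))
for encpos_id, chapter_title, sentence in rows:
    print(encpos_id, chapter_title, sentence, sep=" | ")
```

To run it on the real dump, replace `sample` with `load_corpus("encpos_structured_sents.json")`, adjusting the path to wherever the file sits locally.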

Tutorial (Google Colab/Drive)

A Jupyter Notebook is available for a demo run. Check out the tutorial on Google Colab: Open In Colab