Fix document_index notebook
NickyHavoc committed Oct 31, 2023
1 parent 549d498 commit 2834772
Showing 1 changed file with 15 additions and 24 deletions.
39 changes: 15 additions & 24 deletions src/examples/document_index.ipynb
@@ -73,9 +73,17 @@
  "outputs": [],
  "source": [
  "# change this value if you want to use a collection of a different name\n",
+ "from intelligence_layer.connectors.document_index.document_index import CollectionPath\n",
+ "\n",
+ "\n",
  "COLLECTION = \"demo\"\n",
  "\n",
- "document_index.create_collection(namespace=NAMESPACE, collection=COLLECTION)"
+ "collection_path = CollectionPath(\n",
+ "    namespace=NAMESPACE,\n",
+ "    collection=COLLECTION\n",
+ ")\n",
+ "\n",
+ "document_index.create_collection(collection_path)"
  ]
  },
  {
@@ -170,8 +178,12 @@
  "metadata": {},
  "outputs": [],
  "source": [
+ "from intelligence_layer.connectors.document_index.document_index import DocumentContents, DocumentPath\n",
+ "\n",
+ "\n",
  "for doc in documents:\n",
- "    document_index.add_document(namespace=NAMESPACE, collection=COLLECTION, name=doc[\"name\"], content=doc[\"content\"])"
+ "    document_path = DocumentPath(collection_path=collection_path, document_name=doc[\"name\"])\n",
+ "    document_index.add_document(document_path, contents=DocumentContents.from_text(doc[\"content\"]))"
  ]
  },
  {
Expand All @@ -187,28 +199,7 @@
"metadata": {},
"outputs": [],
"source": [
"document_index.list_documents(namespace=NAMESPACE, collection=COLLECTION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once the documents are uploaded, they are split into chunks.\n",
"Chunks are the subparts of the original texts, usually a few hundred tokens long.\n",
"You can think of a chunk as a paragraph of text.\n",
"While uploading, the DI splits each document into chunks and generates one embedding each.\n",
"\n",
"Let's see what a chunked document looks like:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"document_index.get_document(namespace=NAMESPACE, collection=COLLECTION, name=\"robert_moses\", get_chunks=True)"
"document_index.list_documents(collection_path)"
]
},
{
Expand Down
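
The notebook cells in this commit move from separate `namespace`/`collection`/`name` keyword arguments to path objects (`CollectionPath`, `DocumentPath`) and wrap raw text with `DocumentContents.from_text`. As a rough illustration of that calling pattern only, the sketch below defines stand-in dataclasses with the same names and the fields visible in the diff. They are not the `intelligence_layer` classes: the field named `contents` and the in-memory `FakeDocumentIndex` are hypothetical and exist only so the example runs without a Document Index service.

```python
from __future__ import annotations

from dataclasses import dataclass


# Stand-ins that mirror the names used in the notebook cells.
# Assumption: only the fields visible in the diff are modelled here.
@dataclass(frozen=True)
class CollectionPath:
    namespace: str
    collection: str


@dataclass(frozen=True)
class DocumentPath:
    collection_path: CollectionPath
    document_name: str


@dataclass(frozen=True)
class DocumentContents:
    contents: tuple  # hypothetical field name; holds the text parts

    @classmethod
    def from_text(cls, text: str) -> DocumentContents:
        # Wrap a single string as the only content item.
        return cls(contents=(text,))


class FakeDocumentIndex:
    """Hypothetical in-memory index, used only to exercise the path objects."""

    def __init__(self) -> None:
        self._collections: dict = {}

    def create_collection(self, collection_path: CollectionPath) -> None:
        # Frozen dataclasses are hashable, so a path can serve as a dict key.
        self._collections.setdefault(collection_path, {})

    def add_document(self, document_path: DocumentPath, contents: DocumentContents) -> None:
        collection = self._collections[document_path.collection_path]
        collection[document_path.document_name] = contents

    def list_documents(self, collection_path: CollectionPath) -> list:
        names = sorted(self._collections[collection_path])
        return [DocumentPath(collection_path, name) for name in names]


# Usage following the same steps as the updated notebook cells.
document_index = FakeDocumentIndex()
collection_path = CollectionPath(namespace="my-namespace", collection="demo")
document_index.create_collection(collection_path)

documents = [{"name": "robert_moses", "content": "Robert Moses was an urban planner."}]
for doc in documents:
    document_path = DocumentPath(collection_path=collection_path, document_name=doc["name"])
    document_index.add_document(document_path, contents=DocumentContents.from_text(doc["content"]))
```

Bundling the namespace and collection into one immutable value means each call takes a single location argument instead of repeating two or three keyword arguments, which is the shape of the calls after this commit.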
