From 02854a5e4f269a6825e4548460b0b7330bcdb0fe Mon Sep 17 00:00:00 2001
From: maks-operlejn-ds
Date: Thu, 12 Oct 2023 12:37:05 +0000
Subject: [PATCH] Small fixes to docs

---
 .../presidio_data_anonymization/qa_privacy_protection.ipynb | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/docs/guides/privacy/presidio_data_anonymization/qa_privacy_protection.ipynb b/docs/docs/guides/privacy/presidio_data_anonymization/qa_privacy_protection.ipynb
index 6ea392d93343a..19d58ba9f5c96 100644
--- a/docs/docs/guides/privacy/presidio_data_anonymization/qa_privacy_protection.ipynb
+++ b/docs/docs/guides/privacy/presidio_data_anonymization/qa_privacy_protection.ipynb
@@ -16,7 +16,7 @@
   "source": [
    "# QA with private data protection\n",
    "\n",
-   "[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/use_cases/question_answering/qa_privacy_protection.ipynb)\n",
+   "[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/guides/privacy/presidio_data_anonymization/qa_privacy_protection.ipynb)\n",
    "\n",
    "\n",
    "In this notebook, we will look at building a basic system for question answering, based on private data. Before feeding the LLM with this data, we need to protect it so that it doesn't go to an external API (e.g. OpenAI, Anthropic). Then, after receiving the model output, we would like the data to be restored to its original form. Below you can observe an example flow of this QA system:\n",
@@ -643,6 +643,8 @@
    "from langchain.vectorstores import FAISS\n",
    "\n",
    "# 2. Load the data: In our case data's already loaded\n",
+   "documents = [Document(page_content=document_content)]\n",
+   "\n",
    "# 3. Anonymize the data before indexing\n",
    "for doc in documents:\n",
    "    doc.page_content = anonymizer.anonymize(doc.page_content)\n",
@@ -839,6 +841,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
+   "documents = [Document(page_content=document_content)]\n",
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)\n",
    "chunks = text_splitter.split_documents(documents)\n",
    "\n",
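For reviewers: the pattern this patch restores — build `documents` from `document_content`, anonymize each document *before* splitting/indexing, and keep a reversible mapping for later de-anonymization — can be sketched without LangChain or Presidio installed. `ToyReversibleAnonymizer` below is a hypothetical stand-in for the notebook's `PresidioReversibleAnonymizer`, and plain dicts stand in for `Document` objects; it only illustrates the ordering the added lines enforce, not the real API.

```python
import re

class ToyReversibleAnonymizer:
    """Hypothetical stand-in for PresidioReversibleAnonymizer: masks
    e-mail addresses and remembers the mapping so model output can be
    restored to its original form."""

    def __init__(self):
        self.mapping = {}

    def anonymize(self, text):
        def repl(match):
            key = f"<EMAIL_{len(self.mapping)}>"
            self.mapping[key] = match.group(0)
            return key
        return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", repl, text)

    def deanonymize(self, text):
        # Reverse the substitution using the stored mapping.
        for key, value in self.mapping.items():
            text = text.replace(key, value)
        return text

anonymizer = ToyReversibleAnonymizer()

document_content = "Contact John at john.doe@example.com for details."

# The patch's fix: construct `documents` before anonymizing/splitting,
# so the loop below (and the later text splitter) has data to work on.
documents = [{"page_content": document_content}]

# Anonymize the data before indexing, mirroring the notebook's loop.
for doc in documents:
    doc["page_content"] = anonymizer.anonymize(doc["page_content"])

print(documents[0]["page_content"])
print(anonymizer.deanonymize(documents[0]["page_content"]))
```

The key design point the notebook relies on is that anonymization happens before any text leaves the process (indexing, embedding, LLM calls), while the mapping stays local so answers can be de-anonymized afterwards.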