
Dev v0.0.3 #25

Merged · 3 commits · May 31, 2024
7 changes: 2 additions & 5 deletions docs/README.md
@@ -17,11 +17,8 @@ We welcome contributions and feedback from the community and recommend a few bes
* PRs should be titled descriptively, and be opened with a brief description of the scope and intent of the new contribution.
* New features should have appropriate documentation added alongside them.
* Aim for code maintainability, and minimize code copying.
* Minimal test are required before submit a PR, run `script/minimal_test.py` and all test cases are required to be passed.
* Please make sure the code style is checked and aligned:
```bash
pre-commit run --all-files
```
<!-- * Minimal test are required before submit a PR, run `script/minimal_test.py` and all test cases are required to be passed. -->
* Please make sure the code style is checked and aligned, see [Code Style](#code-style) for more details.

### For Feature Requests

15 changes: 15 additions & 0 deletions docs/RELEASE_LOG.md
@@ -1,5 +1,20 @@
# Release Log

## v0.0.3

### New Features
1. **Keep Original Text:** Add a mapping from each claim to its position in the original text, and add a `restore_claims` function to the **decomposer** to restore the decomposed claims to the original user input.
2. **Data Structure:** Define the data structures for several intermediate processing functions and the final output in `utils/data_class.py`.
3. **Speed Up:** Parallelize the `restore_claims`, `identify_checkworthiness`, and `query_generation` functions to speed up the pipeline.
4. **Token Count:** Add token counting for all components.
5. **Evidence-wise Verification:** Change the verification logic from passing all evidence to a single LLM call to verifying the claim against each piece of evidence in a separate LLM call.
6. **Factuality Value:** Remove the deterministic output; factuality is now a number in the range [0, 1], calculated from the judgment against each single piece of evidence.
7. **Webpage:** Redesign the webpage.
8. **Default LLM:** Change the default model to GPT-4o.
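Taken together, items 5 and 6 imply a per-evidence aggregation step. A minimal sketch of how such a factuality score could be computed is shown below; the function name and label scheme are illustrative assumptions, not Loki's actual API.

```python
def aggregate_factuality(judgments: list[str]) -> float:
    """Combine per-evidence judgments into a factuality score in [0, 1].

    Each judgment is one of "support", "refute", or "irrelevant";
    irrelevant evidence is ignored (hypothetical label scheme).
    """
    relevant = [j for j in judgments if j in ("support", "refute")]
    if not relevant:
        return 0.5  # no informative evidence: undecided
    return sum(j == "support" for j in relevant) / len(relevant)


# e.g. two supporting and one refuting piece of evidence
score = aggregate_factuality(["support", "refute", "support", "irrelevant"])
```

Averaging per-evidence verdicts like this is what turns the old deterministic true/false output into a continuous value.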

### Bug Fixes
1. **Serper Max Queries:** The Serper API allows at most 100 queries per request; we now split the queries into multiple requests when their number exceeds 100.
2. **Evidence and URL:** Link each piece of evidence to its corresponding URL.
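The batching behind the Serper fix can be sketched in a few lines; `batch_queries` is an illustrative helper, not the project's actual function.

```python
def batch_queries(queries: list[str], batch_size: int = 100) -> list[list[str]]:
    """Split a query list into chunks of at most batch_size queries,
    so each chunk fits within a single Serper API request."""
    return [queries[i : i + batch_size] for i in range(0, len(queries), batch_size)]


# 250 queries -> three requests of 100, 100, and 50 queries
batches = batch_queries([f"query {n}" for n in range(250)])
```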

## v0.0.2

14 changes: 7 additions & 7 deletions docs/development_guide.md
@@ -1,6 +1,6 @@
# Development Guide

This documentation page provides a guide for developers who want to contribute to the Loki project, for versions v0.0.2 and later.
This documentation page provides a guide for developers who want to contribute to the Loki project, for versions v0.0.3 and later.

- [Development Guide](#development-guide)
- [Framework Introduction](#framework-introduction)
@@ -11,11 +11,11 @@ This documentation page provides a guide for developers who want to contribute to

Loki leverages state-of-the-art language models to verify the veracity of textual claims. The pipeline is designed to be modular in `factcheck/core/`, which includes the following components:

- **Decomposer:** Breaks down extensive texts into digestible, independent claims, setting the stage for detailed analysis.
- **Checkworthy:** Assesses each claim's potential significance, filtering out vague or ambiguous statements to focus on those that truly matter. For example, vague claims like "MBZUAI has a vast campus" are considered unworthy because of the ambiguous nature of "vast."
- **Query Generator:** Transforms check-worthy claims into precise queries, ready to navigate the vast expanse of the internet in search of truth.
- **Evidence Retriever:** Ventures into the digital realm, retrieving relevant evidence that forms the foundation of informed verification.
- **ClaimVerify:** Examines the gathered evidence, determining the veracity of each claim to uphold the integrity of information.
- **Decomposer:** Breaks down extensive texts into digestible, independent claims, setting the stage for detailed analysis, and provides the mapping between the original text and the decomposed claims.
- **Checkworthy:** Assesses each claim's checkworthiness, filtering out vague or ambiguous statements as well as statements of opinion. For example, vague claims like "MBZUAI has a vast campus" are considered unworthy because of the ambiguous nature of "vast."
- **Query Generator:** Transforms check-worthy claims into precise queries, ready to navigate the vast expanse of the internet in search of evidence.
- **Evidence Retriever:** Retrieves relevant evidence that forms the foundation of informed verification; currently, for open-domain questions, we use Google Search (via the Serper API).
- **ClaimVerify:** Judges each piece of evidence against the claim, determining whether it supports, refutes, or is irrelevant to the claim.
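The five components above chain together into a single pipeline. The sketch below illustrates that flow with toy stand-ins; every function here is a simplified assumption for illustration, not Loki's real implementation in `factcheck/core/`.

```python
def decompose(text: str) -> list[str]:
    # Toy decomposer: treat each sentence as one independent claim.
    return [c.strip() for c in text.split(".") if c.strip()]

def filter_checkworthy(claims: list[str]) -> list[str]:
    # Toy checkworthiness filter: drop claims containing vague adjectives.
    vague_words = {"vast", "huge", "many"}
    return [c for c in claims if not vague_words & set(c.lower().split())]

def generate_queries(claims: list[str]) -> dict[str, list[str]]:
    # Toy query generator: one search query per check-worthy claim.
    return {c: [c] for c in claims}

claims = decompose("MBZUAI is located in Abu Dhabi. MBZUAI has a vast campus")
queries = generate_queries(filter_checkworthy(claims))
```

In the real pipeline each stage is a class backed by an LLM prompt, but the data flow — claims in, filtered claims out, queries per claim — follows this shape.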

To support each component's functionality, Loki relies on the following utils:
- **Language Model:** Currently, 4 out of 5 components (including: Decomposer, Checkworthy, Query Generator, and ClaimVerify) use language models (LLMs) to perform their tasks. The supported LLMs are defined in `factcheck/core/utils/llmclient/` and can be easily extended to support more LLMs.
@@ -71,7 +71,7 @@ As Loki continues to evolve, our development plan focuses on broadening capabili
- **Dockerization:**
- Packaging Loki into Docker containers to simplify deployment and scale-up operations, ensuring Loki can be easily set up and maintained across different environments.

### 5. Multi-language Support
### 5. Multi-lingual Support
- **Language Expansion:**
- Support for additional languages beyond English, including Chinese, Arabic, etc., to cater to a global user base.

3 changes: 2 additions & 1 deletion factcheck/__init__.py
@@ -30,10 +30,11 @@ def __init__(
checkworthy_model: str = None,
query_generator_model: str = None,
evidence_retrieval_model: str = None,
claim_verify_model: str = None, # "gpt-3.5-turbo",
claim_verify_model: str = "gpt-3.5-turbo",
api_config: dict = None,
num_seed_retries: int = 3,
):
# TODO: better handle raw token count
self.encoding = tiktoken.get_encoding("cl100k_base")

self.prompt = prompt_mapper(prompt_name=prompt)
1 change: 0 additions & 1 deletion factcheck/core/CheckWorthy.py
@@ -25,7 +25,6 @@ def identify_checkworthiness(self, texts: list[str], num_retries: int = 3, promp
list[str]: a list of checkworthy claims, pairwise outputs
"""
checkworthy_claims = texts
# TODO: better handle checkworthiness
joint_texts = "\n".join([str(i + 1) + ". " + j for i, j in enumerate(texts)])

if prompt is None:
42 changes: 23 additions & 19 deletions factcheck/core/Retriever/serper_retriever.py
@@ -62,50 +62,54 @@ def _retrieve_evidence_4_all_claim(
evidences = [[] for _ in query_list]

# get the response from serper
# TODO: Can send up to 100 queries once
serper_response = self._request_serper_api(query_list)

if serper_response is None:
logger.error("Serper API request error!")
return evidences
serper_responses = []
for i in range(0, len(query_list), 100):
batch_query_list = query_list[i : i + 100]
batch_response = self._request_serper_api(batch_query_list)
if batch_response is None:
logger.error("Serper API request error!")
return evidences
else:
serper_responses += batch_response.json()

# get the results for queries with an answer box
# get the responses for queries with an answer box
query_url_dict = {}
url_to_date = {} # TODO: decide whether to use date
_snippet_to_check = []
for i, (query, result) in enumerate(zip(query_list, serper_response.json())):
if query != result.get("searchParameters").get("q"):
logger.error("Serper change query from {} TO {}".format(query, result.get("searchParameters").get("q")))
for i, (query, response) in enumerate(zip(query_list, serper_responses)):
if query != response.get("searchParameters").get("q"):
logger.error("Serper change query from {} TO {}".format(query, response.get("searchParameters").get("q")))

if "answerBox" in result:
if "answer" in result["answerBox"]:
# TODO: provide the link for the answer box
if "answerBox" in response:
if "answer" in response["answerBox"]:
evidences[i] = [
{
"text": f"{query}\nAnswer: {result['answerBox']['answer']}",
"text": f"{query}\nAnswer: {response['answerBox']['answer']}",
"url": "Google Answer Box",
}
]
else:
evidences[i] = [
{
"text": f"{query}\nAnswer: {result['answerBox']['snippet']}",
"text": f"{query}\nAnswer: {response['answerBox']['snippet']}",
"url": "Google Answer Box",
}
]
# TODO: currently, if there is a Google answer box we only get 1 piece of evidence; otherwise we get multiple, which diminishes the value of the Google answer box.
else:
results = result.get("organic", [])[:top_k] # Choose top 5 result
topk_results = response.get("organic", [])[:top_k]  # Choose top_k results

if (len(_snippet_to_check) == 0) or (not snippet_extend_flag):
evidences[i] += [
{"text": re.sub(r"\n+", "\n", _result["snippet"]), "url": _result["link"]} for _result in results
{"text": re.sub(r"\n+", "\n", _result["snippet"]), "url": _result["link"]} for _result in topk_results
]

# Save date for each url
url_to_date.update({result.get("link"): result.get("date") for result in results})
url_to_date.update({_result.get("link"): _result.get("date") for _result in topk_results})
# Save query-url pair, 1 query may have multiple urls
query_url_dict.update({query: [result.get("link") for result in results]})
_snippet_to_check += [result["snippet"] if "snippet" in result else "" for result in results]
query_url_dict.update({query: [_result.get("link") for _result in topk_results]})
_snippet_to_check += [_result["snippet"] if "snippet" in _result else "" for _result in topk_results]

# return if there is no snippet to check or snippet_extend_flag is False
if (len(_snippet_to_check) == 0) or (not snippet_extend_flag):