Commit

Update README.en.md
truskovskiyk authored Dec 2, 2024
1 parent 03faa36 commit ad7ffb5
Showing 1 changed file with 2 additions and 2 deletions.
`ai-search-demo/README.en.md` — 2 additions, 2 deletions
````diff
@@ -6,7 +6,7 @@ This is a small demo showing how to build AI search on top of visual data (PDFs,
 
 ## Why
 
-The classic way to handle visual documents (PDFs, forms, images, etc.) is to use OCR, Layout Detection, Table Recognition, etc. See [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) [Tesseract](https://github.com/tesseract-ocr/tesseract) or [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) for example. However, we are going to split PDFs by page and embed each as an image to avoid complexity. The main models we are going to use are [Qwen2-VL](https://arxiv.org/abs/2409.12191) for visual understanding and ColPali.
+The classic way to handle visual documents (PDFs, forms, images, etc.) is to use OCR, Layout Detection, Table Recognition, etc. See [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) [Tesseract](https://github.com/tesseract-ocr/tesseract) or [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) for example. However, we are going to split PDFs by page and embed each as an image to avoid complexity. The main models we are going to use are [Qwen2-VL](https://arxiv.org/abs/2409.12191) for visual understanding and [ColPali](https://github.com/illuin-tech/colpali).
 
 
 ## Evaluation
@@ -162,4 +162,4 @@ Deploy models
 ```
 modal deploy llm-inference/llm_serving.py
 modal deploy llm-inference/llm_serving_colpali.py
-```
+```
````
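The ColPali model linked in this change retrieves pages by late-interaction (MaxSim) scoring: each query token vector is compared against every page-patch vector, the per-token maxima are summed, and pages are ranked by that score. A minimal sketch of the scoring step with toy vectors (not code from this repository; real ColPali embeddings come from the model itself):

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    # Late-interaction (MaxSim) scoring as used by ColBERT/ColPali:
    # for each query token vector, take its max similarity over all
    # document patch vectors, then sum across query tokens.
    sims = query_vecs @ doc_vecs.T          # (n_query, n_doc) dot products
    return float(sims.max(axis=1).sum())

# Toy example: 2 query token vectors, 3 document patch vectors.
q = np.array([[1.0, 0.0], [0.0, 1.0]])
d = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
print(maxsim_score(q, d))  # 2.0
```

Each query vector finds its best-matching patch (similarity 1.0 for both here), so the page score is 2.0; ranking pages by this score is what makes per-page image embeddings searchable.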
