This tool automatically converts batch of images containing structured data (tables, formulas, graphs, diagrams, flowcharts, etc.) into markdown format. Markdown files suitable for RAG pipeline. It uses Anthropic's models via API to analyze images and create detailed markdown descriptions based on included robust system prompt.
Before you start, you need to have:
- Python installed on your computer (version 3.7 or higher)
- An Anthropic API key (get it from Anthropic's console)
If you don't have Python installed:
- Go to Python's official website
- Download the latest version for your operating system
- Run the installer
- On Windows: Make sure to check "Add Python to PATH" during installation
- On Mac: Follow the standard installation process
- Open Terminal (Mac) or Command Prompt (Windows)
- Navigate to where you want to save the project:
cd Documents
- Clone the repository:
git clone https://github.com/PetrAPConsulting/image2md.git
- It creates folder image2md with cloned files
- Click the green "Code" button on this page
- Click on the sheet "Local"
- Select "Download ZIP"
- Extract the ZIP file to your desired location
Open Terminal (Mac) or Command Prompt (Windows) in the project folder and run:
pip install anthropic
If that doesn't work, try:
pip3 install anthropic
- Open the
images.py
file in a text editor - Find this line:
self.client = anthropic.Anthropic(api_key="insert_api_key_here")
- Replace
"insert_api_key_here"
with your Anthropic API key - Follow development of Anthropic models and make adjustments in the script when new version is realised. Only models with vision capabilities are supported.
def __init__(self, model: str = "claude-3-5-sonnet-20241022")
def main():
available_models = [
"claude-3-5-sonnet-20241022",
"claude-3-opus-20240229",
"claude-3-haiku-20240307"
]
- Copy your images (.jpg, .jpeg, or .png) to the same folder as the script. Keep images around 1000px x 1000px for token consumption optimalization
- Open Terminal (Mac) or Command Prompt (Windows)
- Navigate to the script's folder:
cd path/to/your/folder
- Run the script:
python images.py
- Select a model when prompted (1-3)
- The script will create markdown (.md) files for each image in the same folder
.jpg
.jpeg
.png
- Automatically detects tables, formulas, graphs, flowcharts, etc.
- Creates markdown tables from image tables
- Converts mathematical formulas to LaTeX format
- Provides detailed analysis of graphs with key values
- Creates nice clear markdown mermaid from flowcharts and process diagrams
- Preserves anotations and tables with measurements
- Generates log files for troubleshooting
- IMPORTANT: If you need output in different language than ENG you need to translate system prompt in python script. Even though Anthropic models are multilingual, language of sytem prompt determinates language of output except data it directly transfers from source, like content of tables.
-
"No module named 'anthropic'"
- Run
pip install anthropic
again - Make sure you're using the correct Python version
- Run
-
"Invalid API key"
- Check if you've correctly inserted your API key in the script
- Verify your API key is active on Anthropic's website
-
"Python not found"
- Make sure Python is installed
- Try using
python3
instead ofpython
First, try your good friend ChatGPT, Claude or Gemini. All three of them are able to help you if you give them occured errors. Or create an issue in this repository with:
- The error message you're seeing
- Your operating system
- Steps you've tried
This project is licensed under the MIT License - see the LICENSE file for details.
- Uses Anthropic's API for image analysis
- Inspired by Petr's need for automated structured data extraction from images