elliotBraem/ai-rachnid

AI-rachnid (Web Scraper)

A FastAPI service for efficient, AI-driven web scraping using ScrapeGraphAI.

Note

This project is a fork of scrapegraph-ai-fastapi, fixed and adapted to support multi-page scraping.

Getting Started

Environment Setup

Copy .env.example to .env and configure your API keys. The Gemini model is configured separately, through GOOGLE_API_KEY and GOOGLE_API_ENDPOINT. All other models are configured through API_KEY and API_BASE_URL.

GOOGLE_API_KEY=
GOOGLE_API_ENDPOINT=
API_KEY=
API_BASE_URL=
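
As a rough illustration of how the two groups of variables could be told apart at runtime, here is a minimal sketch. It assumes the variables are already present in the process environment (for example, passed in by Docker). The names `ProviderSettings` and `load_settings` are hypothetical and are not taken from this project's code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProviderSettings:
    """Credentials for one provider (hypothetical helper type); both fields may be absent."""
    api_key: Optional[str]
    base_url: Optional[str]


def _clean(value: Optional[str]) -> Optional[str]:
    """Treat unset and blank variables (e.g. `API_KEY=`) the same way."""
    value = (value or "").strip()
    return value or None


def load_settings(provider: str, env: Mapping[str, str] = os.environ) -> ProviderSettings:
    """Use the Gemini-specific variables for google_genai, the generic ones otherwise."""
    if provider == "google_genai":
        return ProviderSettings(
            api_key=_clean(env.get("GOOGLE_API_KEY")),
            base_url=_clean(env.get("GOOGLE_API_ENDPOINT")),
        )
    return ProviderSettings(
        api_key=_clean(env.get("API_KEY")),
        base_url=_clean(env.get("API_BASE_URL")),
    )
```

Calling `load_settings("openai")` reads `API_KEY` and `API_BASE_URL`, while `load_settings("google_genai")` reads the two `GOOGLE_*` variables. Blank entries left over from `.env.example` come back as `None`.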

Running with Docker

Ensure you have a Docker instance running. On macOS, I recommend OrbStack.

Available commands:

  • npm run docker:build - Build the Docker image
  • npm run docker:dev - Run the container in development mode
  • npm run dev - Build and run in one command
  • npm run docker:stop - Stop running containers
  • npm run docker:clean - Clean up Docker resources

Available Models

The API supports multiple model providers and models through LangChain's init_chat_model.

  • Google Gemini

    • Provider: google_genai
    • Model: google_genai/gemini-1.5-flash-latest // or other model
    • Requires: GOOGLE_API_KEY or GOOGLE_API_ENDPOINT in .env
  • OpenAI

    • Provider: openai
    • Model: gpt-4o-mini // or other model
    • Requires: API_KEY or API_BASE_URL in .env
  • Ollama

    • Provider: ollama
    • Model: ollama/llama3.1 // or other model

More supported models are listed in the LangChain documentation for init_chat_model.
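
The provider and model values listed above could be passed to `init_chat_model` roughly as follows. This is a sketch under assumptions, not this project's implementation: `split_model` and `build_chat_model` are hypothetical helpers, and whether the service strips a `provider/` prefix before calling LangChain is a guess. The `init_chat_model(model, model_provider=...)` call itself is LangChain's documented entry point and needs `langchain` plus the matching provider integration package installed.

```python
from typing import Optional, Tuple


def split_model(model: str, provider: Optional[str] = None) -> Tuple[str, str]:
    """Return (provider, model_name) from 'provider/name', or from a bare name plus provider."""
    prefix, sep, name = model.partition("/")
    # Accept the prefix as the provider when none is given or when both agree.
    if sep and (provider is None or prefix == provider):
        return prefix, name
    if provider is None:
        raise ValueError(f"no provider given for model {model!r}")
    # Bare model name (or a slash that is part of the name): keep it unchanged.
    return provider, model


def build_chat_model(model: str, provider: Optional[str] = None, **kwargs):
    """Create a chat model via LangChain; extra kwargs are forwarded unchanged."""
    # Imported lazily so split_model stays usable without langchain installed.
    from langchain.chat_models import init_chat_model

    resolved_provider, name = split_model(model, provider)
    return init_chat_model(name, model_provider=resolved_provider, **kwargs)
```

For example, `split_model("gpt-4o-mini", "openai")` keeps the bare name and pairs it with the given provider, while `split_model("ollama/llama3.1")` takes the provider from the prefix.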

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you're interested in contributing to this project, please read the contribution guide.
