Is it possible to provide a PDF in a prompt? #308

Closed
barapa opened this issue Aug 19, 2024 · 3 comments

Comments

barapa commented Aug 19, 2024

I have tried supplying a PDF via the UserImageMessage type, but that fails with the following error from openai:

openai.BadRequestError: Error code: 400 - {'error': {'message': "You uploaded an unsupported image. Please make sure your image is below 20 MB in size and is of one the following formats: ['png', 'jpeg', 'gif', 'webp'].", 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_image_format'}}

Is there a way to do this with magentic?

barapa changed the title from "Is it possible to provide a PDF?" to "Is it possible to provide a PDF in a prompt?" on Aug 19, 2024
@jackmpcollins
Owner

@barapa I think this is out of scope for magentic. You will have to use pdf2image (https://github.com/Belval/pdf2image) or another library to convert the PDF to one of those supported image formats.
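
For anyone landing here, a minimal sketch of that conversion step, under some assumptions: the helper names (`sniff_format`, `pdf_to_png_pages`, `to_image_bytes`) are hypothetical and not part of magentic, the format check relies only on well-known file signatures, and the PDF rendering path needs pdf2image plus a poppler install. It has not been tested against magentic itself.

```python
import io

# Image formats named in the OpenAI error message, keyed by their magic bytes.
_IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_format(data: bytes) -> str:
    """Best-effort guess of the file format from its leading bytes."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    for signature, name in _IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return "unknown"


def pdf_to_png_pages(pdf_bytes: bytes, dpi: int = 150) -> list[bytes]:
    """Render each PDF page to PNG bytes with pdf2image (requires poppler)."""
    # Imported lazily so sniff_format works without the optional dependency.
    from pdf2image import convert_from_bytes

    pages = []
    for page in convert_from_bytes(pdf_bytes, dpi=dpi):
        buffer = io.BytesIO()
        page.save(buffer, format="PNG")
        pages.append(buffer.getvalue())
    return pages


def to_image_bytes(data: bytes) -> list[bytes]:
    """Hypothetical helper: return a list of image bytes, one entry per page."""
    kind = sniff_format(data)
    if kind == "pdf":
        return pdf_to_png_pages(data)
    if kind in {"png", "jpeg", "gif", "webp"}:
        return [data]
    raise ValueError(f"Unsupported input format: {kind}")
```

Each element returned by `to_image_bytes` could then be passed as the image bytes for a `UserImageMessage`, one message per page. Rendering every page as an image can get expensive for long documents, so limiting the page range may be worth considering.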

@jackmpcollins
Owner

Anthropic now allows sending PDF document bytes, and magentic v0.35.0 (https://github.com/jackmpcollins/magentic/releases/tag/v0.35.0) supports this using the DocumentBytes object. See https://magentic.dev/vision/#documentbytes


DocumentBytes is used to provide a document as bytes to the LLM. This is currently only supported by some Anthropic models.

from pathlib import Path

from magentic import chatprompt, DocumentBytes, Placeholder, UserMessage
from magentic.chat_model.anthropic_chat_model import AnthropicChatModel


@chatprompt(
    UserMessage(
        [
            "Repeat the contents of this document.",
            Placeholder(DocumentBytes, "document_bytes"),
        ]
    ),
    model=AnthropicChatModel("claude-3-5-sonnet-20241022"),
)
def read_document(document_bytes: bytes) -> str: ...


document_bytes = Path("...").read_bytes()
read_document(document_bytes)
# 'This is a test PDF.'

@jackmpcollins
Owner

Closing this issue as doing the PDF -> image conversion is out of scope for magentic. Hopefully OpenAI adds PDF support too. This would likely be mentioned in their Vision docs here: https://platform.openai.com/docs/guides/vision
