Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(file_upload): validate mimetype as configured #1459

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

qvalentin
Copy link
Contributor

@qvalentin qvalentin commented Oct 22, 2024

@qvalentin qvalentin force-pushed the fix/mimetype-validation-file-upload branch 2 times, most recently from 2e4689b to f094ebc Compare October 22, 2024 09:27
@qvalentin qvalentin marked this pull request as ready for review October 22, 2024 14:37
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. backend Pertains to the Python backend. labels Oct 22, 2024
@dokterbob dokterbob added evaluate-with-priority What's needed to address this one? unit-tests Has unit tests. security labels Oct 31, 2024
Copy link
Collaborator

@dokterbob dokterbob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests look very clean. Thanks this already great as it is, but I would prefer some slight changes to validation architecture -- separating HTTP logic from validation logic. Important as we'll add more such functionality going forward, good to have the API clean now.

file_response = await session.persist_file(
name=file.filename, content=content, mime=file.content_type
)

return JSONResponse(content=file_response)


def validate_file_upload(file: UploadFile):
if config.features.spontaneous_file_upload is None:
return # TODO: if it is not configured what should happen?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most elegant solution is to change the default value for config.features.spontaneous_file_upload to always be populated with SpontaneousFileUploadFeature then to have enabled default to False and move max_files and max_size_mb default values from the frontend to the backend config. This would remove the if ... is None in favour of more explicit code. But I get it if perhaps that's out of scope and left for a later PR.

The second best would be to follow the frontend in this and not allow uploads unless explicitly enabled. Definitely, we should try to avoid having a TODO in a PR. ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current behaviour for the e2e tests is that it is None. So rejecting upload when it is not configured will break the tests and might also break things for users.

I agree that the best would be to not have None as the default, but this indeed seems out of scope.

file_response = await session.persist_file(
name=file.filename, content=content, mime=file.content_type
)

return JSONResponse(content=file_response)


def validate_file_upload(file: UploadFile):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to have validation functions raise a well-defined Exception and to have a docstring specifying it's behaviour.

For example, in pydantic this is a ValueError and in Django it's a ValidationError. It should definitely not be a HTTPException as the file validation code should not have anything to do with HTTP, it's just validating an UploadFile.

We can catch validation errors and then re-raise them in a HTTP request handler (upload_file).

backend/chainlit/server.py Show resolved Hide resolved
backend/chainlit/server.py Show resolved Hide resolved
backend/chainlit/server.py Outdated Show resolved Hide resolved
backend/tests/test_server.py Outdated Show resolved Hide resolved
@dokterbob dokterbob added the blocked Awaiting update or feedback from user after initial review/comments. label Nov 6, 2024
@qvalentin
Copy link
Contributor Author

Tests look very clean. Thanks this already great as it is, but I would prefer some slight changes to validation architecture -- separating HTTP logic from validation logic. Important as we'll add more such functionality going forward, good to have the API clean now.

Where would you put the validation logic?
It could be a method of SpontaneousFileUploadFeature or just some function somewhere else?

@dokterbob
Copy link
Collaborator

Tests look very clean. Thanks this already great as it is, but I would prefer some slight changes to validation architecture -- separating HTTP logic from validation logic. Important as we'll add more such functionality going forward, good to have the API clean now.

Where would you put the validation logic? It could be a method of SpontaneousFileUploadFeature or just some function somewhere else?

This is what I referred to in that comment: #1459 (comment)

@qvalentin qvalentin force-pushed the fix/mimetype-validation-file-upload branch from d0f41f7 to ddbe7ab Compare November 21, 2024 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend Pertains to the Python backend. blocked Awaiting update or feedback from user after initial review/comments. evaluate-with-priority What's needed to address this one? security size:L This PR changes 100-499 lines, ignoring generated files. unit-tests Has unit tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants