[WIP] use dataset.map in pipeline #179
base: main
2f72673 to a983f7c
I'm sorry, but as discussed yesterday, I'm not very happy with that. As far as I'm concerned, this adds very little in terms of functionality but a lot of unknown factors regarding HF's dataset caching internals.
ee2d35f to e8d0d1d
I will respond soon. EDIT: I finally got some time to respond. The points I had in mind are:
As usual, this is just my impression of the issue, which may be the result of a lack of knowledge :)
1a65c49 to ec3a3ea
4259649 to 6d63f08
If `documents` of type `Dataset` is passed to the pipeline, use `documents.map` to add the predictions. In this case, a `Dataset` is returned instead of `Sequence[Document]`.

Note: Builds on top of #178, which should be reviewed first. `tests_no_local_datasets` is passing.

EDIT: For now, an error is thrown when `datasets` attempts to cache pipeline results, see ee2d35f.
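
For readers skimming the thread, the dispatch described above might look roughly like the following. This is only a sketch, not the code in this PR: `Document`, `Dataset`, `add_predictions`, and `pipeline` are hypothetical stand-ins written so the example runs without dependencies. In particular, the toy `Dataset.map` here applies a function per document and does no caching, whereas the real Hugging Face `Dataset.map` operates on examples and has its own caching behavior (the concern raised in the review).

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Sequence, Union


@dataclass
class Document:
    """Hypothetical minimal document type holding text and predictions."""
    text: str
    predictions: List[str] = field(default_factory=list)


class Dataset:
    """Hypothetical stand-in for a dataset of documents (no caching)."""

    def __init__(self, documents: Sequence[Document]) -> None:
        self._documents = list(documents)

    def __iter__(self) -> Iterator[Document]:
        return iter(self._documents)

    def __len__(self) -> int:
        return len(self._documents)

    def map(self, function: Callable[[Document], Document]) -> "Dataset":
        # Return a new dataset instead of mutating this one.
        return Dataset([function(doc) for doc in self._documents])


def add_predictions(doc: Document) -> Document:
    # Toy "model": the prediction is simply the upper-cased text.
    return Document(text=doc.text, predictions=[doc.text.upper()])


def pipeline(
    documents: Union[Dataset, Sequence[Document]],
) -> Union[Dataset, Sequence[Document]]:
    # A Dataset input goes through `map`, so a Dataset comes back.
    if isinstance(documents, Dataset):
        return documents.map(add_predictions)
    # Any other sequence is processed eagerly and returned as a list.
    return [add_predictions(doc) for doc in documents]


if __name__ == "__main__":
    docs = [Document("first"), Document("second")]
    as_list = pipeline(docs)
    as_dataset = pipeline(Dataset(docs))
    print(type(as_list).__name__, type(as_dataset).__name__)
```

The point of the sketch is only the return-type contract: callers that pass a `Dataset` get a `Dataset` back, while callers that pass a plain sequence keep the existing behavior.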