Improve data performance #4
Comments
Are the inputs CSV files? Might be worth converting to
@gnperdue I believe we need both: data reformatting and data processing (pre-fetching, etc.).
@ommiaa - I have some code for converting HDF5 to TFRecords. It was written for a neutrino experiment and is probably hard to fully grok, but I can dig it out if you'd like to look at it.
The TF documentation also has examples for CSV to TFRecord...
@ommiaa actually, even better than my neutrino stuff, look at this: https://github.com/gnperdue/RandomData/tree/master/TensorFlow It is an HDF5 -> TFRecord converter for the "fashion MNIST" dataset. It uses TF1.X-era code (it has been a while since I touched it), but the TF folks have nice TF1 -> TF2 conversion utilities, so if it doesn't work, you can try those to update it for TF2. The HDF5 inputs are in the same repo, here: https://github.com/gnperdue/RandomData/tree/master/hdf5 IIRC, the official TF documentation for CSV -> TFRecord conversion is pretty good. That is a use case Google cares about (CSV and JPG/PNG images -> TFRecord is easy, but they don't care about HDF5, so you have to do a bit more "by hand").
@gnperdue, @schr476: I am ramping up with Jason's help. Please confirm the following assumptions (I know they might seem obvious)
@ommiaa yes, that is correct. We want to convert HDF5 to TFRecords. Actually, there is an upstream step: we first go CSV to HDF5. You could skip that and go right from CSV to TFRecord.
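For reference, going straight from CSV to TFRecord could look roughly like the sketch below. It is not the project's converter. It assumes a CSV with a header row, one integer label column (named `label` here, a hypothetical name) and float values in every other column. The feature keys `x` and `y` and the function names are also made up for illustration.

```python
import csv

import tensorflow as tf


def csv_row_to_example(row, label_column="label"):
    """Pack one CSV row (a dict of strings) into a tf.train.Example."""
    label = int(row[label_column])
    # Assumption: every column other than the label holds a float feature.
    values = [float(v) for k, v in row.items() if k != label_column]
    return tf.train.Example(features=tf.train.Features(feature={
        "x": tf.train.Feature(float_list=tf.train.FloatList(value=values)),
        "y": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }))


def csv_to_tfrecord(csv_path, tfrecord_path, label_column="label"):
    """Stream a CSV file into a TFRecord file and return the record count."""
    count = 0
    with open(csv_path, newline="") as f, \
            tf.io.TFRecordWriter(tfrecord_path) as writer:
        for row in csv.DictReader(f):
            example = csv_row_to_example(row, label_column)
            writer.write(example.SerializeToString())
            count += 1
    return count
```

Rows are streamed one at a time, so the whole CSV never has to fit in memory. For large inputs it may be worth splitting the output across several TFRecord files so they can be read in parallel later.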
Develop a TF data object (`tf.data.Dataset`) following https://www.tensorflow.org/guide/data_performance.
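A minimal sketch of what such an input pipeline might look like, using the parallel `map` and `prefetch` steps that the linked guide describes. It assumes each TFRecord holds a fixed-length float vector under the key `x` and an integer label under `y`. Those keys, the function name `make_dataset`, and the default sizes are hypothetical and would need to match the real data.

```python
import tensorflow as tf


def make_dataset(tfrecord_paths, num_features, batch_size=32,
                 shuffle_buffer=1000):
    """Build a batched, prefetched tf.data.Dataset from TFRecord files."""
    # Assumed record layout: float vector "x" and scalar int64 label "y".
    feature_spec = {
        "x": tf.io.FixedLenFeature([num_features], tf.float32),
        "y": tf.io.FixedLenFeature([], tf.int64),
    }

    def parse(serialized):
        parsed = tf.io.parse_single_example(serialized, feature_spec)
        return parsed["x"], parsed["y"]

    ds = tf.data.TFRecordDataset(tfrecord_paths)
    # Decode records in parallel; AUTOTUNE lets tf.data pick the parallelism.
    ds = ds.map(parse, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.shuffle(shuffle_buffer)
    ds = ds.batch(batch_size)
    # Overlap input preparation with training on the previous batch.
    ds = ds.prefetch(tf.data.AUTOTUNE)
    return ds
```

The returned dataset can be passed directly to `model.fit` or iterated in a custom training loop. Whether caching or interleaved reads help on top of this depends on the dataset size and storage, so that is left out here.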