[FEATURE REQUEST] use of dataset in tfsim.callbacks.EvalCallback #293

Lunatik00 · 2022-09-22T16:21:05Z

Hi, I have a dataset that is relatively large compared to the RAM I have available. I currently have access to machines that can handle it, so this is not a problem for me, but because the RAM use is high I checked whether EvalCallback could take a dataset (a tf.data.Dataset, the same way model.fit() accepts one as input), and it cannot. Supporting this could help people with fewer compute resources use the callback with their datasets. (I read the dataset with tf.keras.utils.image_dataset_from_directory(); it can be batched or unbatched.)

owenvallis · 2022-09-23T20:03:01Z

We do provide the TFRecord sampler for handling datasets that are too large to fit in memory. There are some quirks to setting up the TFRecords: the sampler requires that each TFRecord file contain contiguous blocks of classes, where the size of each block is a multiple of example_per_class.
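To make the layout constraint concrete, here is a minimal sketch of how example indices might be ordered before writing them to a single TFRecord file. It is only one reading of the requirement: the function name `order_for_shard` is hypothetical, and dropping the leftover examples of a class (rather than padding or moving them to another file) is an assumption made for illustration.

```python
from collections import defaultdict


def order_for_shard(labels, example_per_class):
    """Return example indices arranged as contiguous per-class blocks.

    Each class keeps only as many examples as fit a multiple of
    `example_per_class`; leftovers are dropped (an assumption of this sketch).
    """
    # Group example indices by their class label, preserving input order.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    ordered = []
    for label in sorted(by_class):
        idxs = by_class[label]
        # Largest multiple of example_per_class that fits this class.
        usable = len(idxs) - len(idxs) % example_per_class
        ordered.extend(idxs[:usable])
    return ordered
```

The returned indices give the order in which examples would be serialized into one file, so that every class appears as a single contiguous block.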

Regarding the EvalCallback: it was meant to hold a smaller subset of the data in memory, as we need to rebuild the index every time the callback is called. Since this is pretty expensive, the expectation is that this is a small eval set.
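One way to get such a small in-memory eval set out of a larger streamed dataset is to cap the number of examples kept per class while iterating. The sketch below assumes integer labels and an iterable of (x, y) pairs; the helper name `take_eval_subset` and its parameters are hypothetical and not part of the library.

```python
import numpy as np


def take_eval_subset(batches, max_per_class, batched=True):
    """Collect at most `max_per_class` examples per label.

    `batches` is any iterable of (x, y) pairs. Set `batched=False` when each
    pair holds a single example rather than a batch.
    """
    counts = {}
    xs, ys = [], []
    for x, y in batches:
        x, y = np.asarray(x), np.asarray(y)
        if not batched:
            # Add a leading batch axis so both cases share one code path.
            x, y = x[None, ...], y[None, ...]
        for xi, yi in zip(x, y):
            label = int(yi)  # assumes integer (non one-hot) labels
            if counts.get(label, 0) < max_per_class:
                counts[label] = counts.get(label, 0) + 1
                xs.append(xi)
                ys.append(label)
    if not xs:
        raise ValueError("no examples were collected")
    return np.stack(xs), np.asarray(ys)
```

With a tf.data.Dataset, `dataset.as_numpy_iterator()` could be passed as `batches`; the returned arrays could then be split into the query and target sets that the callback keeps in memory. Only the capped subset is materialized, although the iterator is still read through to the end.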
