how to prepare my own dataset #1
Comments
Hi, if you want to test your methodology first, the RVL-CDIP dataset is easier to use, since all 400K documents have already been manually classified into 16 classes. If you want to extend the classification granularity of RVL-CDIP to 48 classes, you have several options.
Hope this is somewhat useful.
Really helpful, thanks.
Hi, can you tell me how many GPUs were required for training?
Let's say I want to classify domain documents into about 48 categories. Should I build a dataset like RVL-CDIP? What is the proper DPI for the document images? Should I convert them to grayscale?
400,000 grayscale images in 16 classes, with 25,000 images per class
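For anyone preparing a custom dataset along the same lines, here is a minimal sketch of one way to index it. It assumes a one-folder-per-class layout and writes a plain-text "path label" index; the folder names, file extensions, and the helper names `build_label_index` and `write_label_file` are all hypothetical and not taken from this repository.

```python
import os

# Assumed set of image extensions; adjust to whatever your scans use.
IMAGE_EXTENSIONS = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def build_label_index(root):
    """Scan `root`, where each subfolder is one class (an assumed layout).

    Returns (class_names, samples): class_names is the sorted list of
    subfolder names, and samples is a list of (relative_path, label_index).
    """
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    samples = []
    for label, name in enumerate(class_names):
        class_dir = os.path.join(root, name)
        for fname in sorted(os.listdir(class_dir)):
            # Skip anything that is not an image file.
            if os.path.splitext(fname)[1].lower() in IMAGE_EXTENSIONS:
                samples.append((f"{name}/{fname}", label))
    return class_names, samples


def write_label_file(samples, out_path):
    """Write one 'relative/path label' line per sample."""
    with open(out_path, "w", encoding="utf-8") as fh:
        for rel_path, label in samples:
            fh.write(f"{rel_path} {label}\n")
```

With 48 categories you would end up with 48 subfolders and label indices 0 to 47. Counting the samples per label is a quick way to check whether your classes are as balanced as the 25,000 images per class in RVL-CDIP.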