Finetuning for OCR #1
Comments
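Hi SSam! For OCR, you just need to pass the OCR task prompt; the predictions can then be drawn with:

    import random

    import numpy as np
    from PIL import ImageDraw

    # Assumed palette; the original comment does not show how `colormap` is defined.
    colormap = ["blue", "orange", "green", "purple", "red"]

    def draw_ocr_bboxes(image, prediction):
        # Draw each predicted quad box and its label onto the image.
        scale = 1
        draw = ImageDraw.Draw(image)
        bboxes, labels = prediction['quad_boxes'], prediction['labels']
        for box, label in zip(bboxes, labels):
            color = random.choice(colormap)
            # Each box is a flat list [x1, y1, ..., x4, y4].
            new_box = (np.array(box) * scale).tolist()
            draw.polygon(new_box, width=3, outline=color)
            draw.text((new_box[0]+8, new_box[1]+2),
                      "{}".format(label),
                      align="right",
                      fill=color)
        return image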
Thanks for the answer! But when creating the training dataset, should the model output be a dictionary or text?
Don't use this: the model limits the max length to 1024... A simple paper can exceed 2048 characters.
@SSamDav @andimarafioti
Hi,
Do you know how I should change the dataset creation for the OCR task?
Is it just the concatenation of bbox special tokens with the text, or do I need to do more?
Thanks for the finetuning code!
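
For reference, here is a rough sketch of what "concatenating bbox special tokens with the text" could look like when building a training target. It is not taken from the repository: the `<loc_N>` token format, the 1000-bin quantization, the label-then-box ordering, and the helper names `quantize`, `quad_to_tokens`, and `build_ocr_target` are all assumptions for illustration, so check them against the tokenizer and processor of the model you are fine-tuning.

```python
from typing import Dict, List, Sequence

NUM_BINS = 1000  # assumption: coordinates are quantized into 1000 bins


def quantize(value: float, size: int, num_bins: int = NUM_BINS) -> int:
    """Map a pixel coordinate in [0, size] to an integer bin in [0, num_bins - 1]."""
    bin_index = int(value * num_bins / size)
    return max(0, min(num_bins - 1, bin_index))


def quad_to_tokens(quad: Sequence[float], width: int, height: int) -> str:
    """Turn a flat quad [x1, y1, ..., x4, y4] into hypothetical <loc_N> tokens."""
    tokens = []
    for i, value in enumerate(quad):
        size = width if i % 2 == 0 else height  # even index = x, odd index = y
        tokens.append(f"<loc_{quantize(value, size)}>")
    return "".join(tokens)


def build_ocr_target(prediction: Dict[str, List], width: int, height: int) -> str:
    """Concatenate each label with the location tokens of its quad box."""
    parts = []
    for quad, label in zip(prediction["quad_boxes"], prediction["labels"]):
        parts.append(label + quad_to_tokens(quad, width, height))
    return "".join(parts)


# Made-up annotation in the same dictionary shape used by draw_ocr_bboxes.
example = {
    "quad_boxes": [[10, 20, 110, 20, 110, 40, 10, 40]],
    "labels": ["Hello"],
}
target = build_ocr_target(example, width=200, height=100)
print(target)
```

Under these assumptions the training target is a plain string, and the dictionary form is only what you get back after parsing the model output.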