Finetuning for OCR #1

Open
SSamDav opened this issue Jun 24, 2024 · 4 comments

Comments

@SSamDav

SSamDav commented Jun 24, 2024

Hi,

Do you know how I should change the dataset creation for the OCR task?

Is it just the concatenation of the bbox special tokens with the text, or do I need to do more?

Thanks for the finetuning code!

@andimarafioti
Owner

Hi SSam!

For OCR, you just need to pass <OCR> as a task_prompt.
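
As a rough sketch of what that could look like on the dataset side, assuming each sample is an image path plus its transcription. The class name `OCRDataset`, the `samples` format, and the `load_image` hook are made up for illustration and are not taken from the finetuning code. The class follows the map-style protocol (`__len__`/`__getitem__`), so it could subclass `torch.utils.data.Dataset` without other changes.

```python
from typing import Any, Callable, List, Tuple


class OCRDataset:
    """Map-style dataset yielding (task_prompt, target_text, image) triples.

    Hypothetical sketch: `samples` is a list of (image_path, text) pairs and
    `load_image` is any callable that turns a path into an image object,
    e.g. `lambda p: PIL.Image.open(p).convert("RGB")`.
    """

    TASK_PROMPT = "<OCR>"

    def __init__(self, samples: List[Tuple[str, str]],
                 load_image: Callable[[str], Any]):
        self.samples = samples
        self.load_image = load_image

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, idx: int) -> Tuple[str, str, Any]:
        image_path, text = self.samples[idx]
        # For plain OCR the target is just the transcription, with no boxes.
        return self.TASK_PROMPT, text, self.load_image(image_path)
```

A collate function would then presumably hand the prompts and images to the processor and tokenize the target texts as labels.
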
There is another task called "OCR with region", which needs a bit more dataset preparation. Your task_prompt would be <OCR_WITH_REGION>, and your labels should be dictionaries with the keys "labels" and "quad_boxes". You can then redraw the boxes like this:

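import random
import numpy as np
from PIL import ImageDraw

# Assumed palette: the snippet relies on a `colormap` list defined elsewhere.
colormap = ['blue', 'orange', 'green', 'purple', 'red']

def draw_ocr_bboxes(image, prediction):
  # prediction: {'quad_boxes': [...], 'labels': [...]}
  scale = 1  # change this if the image was resized after prediction
  draw = ImageDraw.Draw(image)
  bboxes, labels = prediction['quad_boxes'], prediction['labels']
  for box, label in zip(bboxes, labels):
    color = random.choice(colormap)
    # Each box is a flat list of 8 numbers: x1, y1, ..., x4, y4
    new_box = (np.array(box) * scale).tolist()
    draw.polygon(new_box, width=3, outline=color)
    # Draw the label just inside the first corner of the box
    draw.text((new_box[0] + 8, new_box[1] + 2), str(label), align="right", fill=color)
  return image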
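
For training, a dictionary label with "labels" and "quad_boxes" still has to become a target string. Here is a minimal sketch, assuming the model is Florence-2 (which these task prompts suggest) and that it expects each text span followed by eight `<loc_N>` tokens, with coordinates quantized into 1000 bins relative to the image size. Neither assumption is confirmed in this thread, so check it against the processor's post-processing before relying on it. `quad_boxes_to_text` is a hypothetical helper, not part of any library.

```python
from typing import Dict, List

NUM_BINS = 1000  # assumed quantization resolution


def quad_boxes_to_text(label: Dict[str, List], width: int, height: int) -> str:
    """Flatten {'labels': [...], 'quad_boxes': [...]} into one target string."""
    parts = []
    for text, box in zip(label["labels"], label["quad_boxes"]):
        tokens = []
        for i, coord in enumerate(box):
            # Even indices are x (scaled by width), odd indices are y (by height).
            size = width if i % 2 == 0 else height
            bin_idx = min(int(coord / size * NUM_BINS), NUM_BINS - 1)
            tokens.append(f"<loc_{max(bin_idx, 0)}>")
        parts.append(text + "".join(tokens))
    return "".join(parts)
```

For example, on a 200x100 image, the label `{"labels": ["Hi"], "quad_boxes": [[0, 0, 100, 0, 100, 50, 0, 50]]}` would become `Hi<loc_0><loc_0><loc_500><loc_0><loc_500><loc_500><loc_0><loc_500>`.
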

@SSamDav
Author

SSamDav commented Jun 26, 2024

Thanks for the answer! But when creating the training dataset, should the model output be a dictionary or text?

@MonolithFoundation

Don't use it, this model limits the max length to 1024...

A simple paper can exceed 2048 characters.

@ctgushiwei

@SSamDav @andimarafioti
Hello! For the OCR task, how should I label the data, and how should I define the Dataset class?
