negative region coordinates and empty regions #19

Open
jahtz opened this issue Oct 31, 2023 · 2 comments

@jahtz

jahtz commented Oct 31, 2023

After running the script, I noticed that the output contains some negative coordinates, some Coords elements with too few points, and some empty regions.

negative coordinates:

<TextLine id="r1l31" custom="readingOrder {index:28;}">
    <Coords points="304,4432 2797,4482 2799,4365 1058,4337 -1323,4351"/>
    <Baseline points="320,4410 443,4412 566,4414 689,4416 812,4418 935,4421 1058,4422 1181,4424 1304,4426 1427,4428 1550,4430 1673,4434 1796,4436 1919,4438 2042,4442 2165,4446 2288,4450 2411,4454 2534,4460 2657,4464 2780,4470"/>
    <TextEquiv>
        ...

-> Value '304,4432 2797,4482 2799,4365 1058,4337 -1323,4351' is not facet-valid with respect to pattern '([0-9]+,[0-9]+ )+([0-9]+,[0-9]+)' for type 'PointsType'.
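
To make the validator message easier to reproduce, here is a minimal sketch that checks a `points` string against the pattern quoted in the error above. It assumes the schema pattern has to match the whole value, so it uses `re.fullmatch`. The helper name `is_valid_points` is made up for illustration and is not part of the script or of any PAGE tooling.

```python
import re

# Pattern copied from the validator message for type 'PointsType'.
POINTS_PATTERN = re.compile(r"([0-9]+,[0-9]+ )+([0-9]+,[0-9]+)")


def is_valid_points(points: str) -> bool:
    """Return True if the whole string matches the PointsType pattern.

    Hypothetical helper for illustration only.
    """
    return POINTS_PATTERN.fullmatch(points) is not None


# The two values reported in this issue, plus one well-formed value.
negative = "304,4432 2797,4482 2799,4365 1058,4337 -1323,4351"
single_point = "206,1554"
well_formed = "304,4432 2797,4482 2799,4365"

print(is_valid_points(negative))      # False: '-' is not allowed by [0-9]+
print(is_valid_points(single_point))  # False: the pattern needs at least two points
print(is_valid_points(well_formed))   # True
```

Both reported values fail for different reasons: the first because of the minus sign, the second because the pattern requires at least two `x,y` pairs.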

not enough coordinates and empty regions:

<TextRegion id="region_1535370511662_1" custom="readingOrder {index:1;}">
    <Coords points="206,1554"/>
    <TextEquiv>
        <Unicode/>
    </TextEquiv>
</TextRegion>

-> Value '206,1554' is not facet-valid with respect to pattern '([0-9]+,[0-9]+ )+([0-9]+,[0-9]+)' for type 'PointsType'.
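
A rough way to find every affected element in a file is to walk the XML tree and test each `Coords` element against the same pattern. The sketch below uses only the Python standard library. The function name `find_invalid_coords` and the trimmed sample document are assumptions for illustration; real PAGE files carry a namespace, which the code strips before comparing tag names.

```python
import re
import xml.etree.ElementTree as ET

# Pattern copied from the validator message for type 'PointsType'.
POINTS_PATTERN = re.compile(r"([0-9]+,[0-9]+ )+([0-9]+,[0-9]+)")


def find_invalid_coords(xml_text):
    """Return (parent id, points) for each Coords whose points fail the pattern.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    problems = []
    for parent in root.iter():
        for child in parent:
            # Strip any namespace, e.g. '{...}Coords' -> 'Coords'.
            if child.tag.rsplit("}", 1)[-1] != "Coords":
                continue
            points = child.get("points", "")
            if POINTS_PATTERN.fullmatch(points) is None:
                problems.append((parent.get("id"), points))
    return problems


# Trimmed, namespace-free sample built from the fragments in this issue.
sample = """<Page>
  <TextRegion id="region_1535370511662_1">
    <Coords points="206,1554"/>
  </TextRegion>
  <TextRegion id="r1">
    <Coords points="0,0 10,0 10,10"/>
    <TextLine id="r1l31">
      <Coords points="304,4432 2797,4482 2799,4365 1058,4337 -1323,4351"/>
    </TextLine>
  </TextRegion>
</Page>"""

for element_id, points in find_invalid_coords(sample):
    print(element_id, "->", points)
```

This only reports the problem elements; whether negative values should be clamped to 0 or degenerate regions dropped is a separate decision for the maintainers.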

ty!

@stweil
Collaborator

stweil commented Oct 31, 2023

Could you please attach an example PAGE file that can be used to reproduce the issue?

@jahtz
Author

jahtz commented Nov 7, 2023

Sorry for the delay. The files are attached below:
negative_l218.xml: negative coordinates at line 218
empty_line_l821.xml: empty region at line 821
Thank you very much!
xmls.zip
