SOPN Parsing: Page Extraction Errors #1726
Comments
Page matching error: #1426 (comment)
https://candidates.democracyclub.org.uk/elections/local.west-lothian.livingston-south.2022-05-05/sopn/ (and other SOPNs for that election) don't match pages. Chances are this is because the ward names are in the table header.
Hi, it seems the Fife Council one has problems as each table is spread over two pages in the PDF. https://candidates.democracyclub.org.uk/elections/local.fife.burntisland-kinghorn-and-western-kirkcaldy.2022-05-05/sopn/
Wigan strangeness - the correct pages have been used by the parser for all the LA (so far) but the link in the Ashton ward goes to another ward's SoPN
It's also joined the Hindley and Hindley Green wards, suggesting it's not strict enough when considering if a ward stretches onto two pages of a SoPN. ...it then processed the Hindley Green page (again) for that ward without issue
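To illustrate the Hindley / Hindley Green problem, here is a rough sketch of a stricter page matcher. It is not the parser's actual code: `normalise`, `ward_on_page`, `match_pages` and the `heading` string are all hypothetical names, and it assumes each page's text has already been extracted. The idea is to prefer the longest ward name found on a page, and to treat a page as a continuation only when it has no ward name and no fresh SOPN heading.

```python
import re
from typing import Dict, List, Optional


def normalise(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def ward_on_page(page_text: str, ward_names: List[str]) -> Optional[str]:
    """Return the longest ward name that appears as a whole phrase on the page."""
    haystack = f" {normalise(page_text)} "
    hits = [w for w in ward_names if f" {normalise(w)} " in haystack]
    # Longest match wins, so "Hindley Green" beats its prefix "Hindley".
    return max(hits, key=lambda w: len(normalise(w)), default=None)


def match_pages(
    pages: List[str],
    ward_names: List[str],
    heading: str = "statement of persons nominated",  # assumed page heading
) -> Dict[str, List[int]]:
    """Map each ward name to the 1-based page numbers assigned to it."""
    matches: Dict[str, List[int]] = {w: [] for w in ward_names}
    current: Optional[str] = None
    for number, text in enumerate(pages, start=1):
        ward = ward_on_page(text, ward_names)
        if ward is not None:
            current = ward
        elif heading in normalise(text):
            # A new heading with no recognised ward: stop rather than guess.
            current = None
        # Otherwise the page is treated as a continuation of the current ward.
        if current is not None:
            matches[current].append(number)
    return matches
```

A page that names no ward and carries no heading is attached to the previous ward, which would cover tables spread over two pages; a page that names a different ward always starts a new match.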
Wigan Winstanley ward - it offered the wrong candidate names and linked to the wrong (page of the) SoPN
https://candidates.democracyclub.org.uk/elections/local.oxford.cowley.2022-05-05/
https://candidates.democracyclub.org.uk/elections/local.wigan.ashton.2022-05-05/
This 4-page single ward PDF incorrectly generated a "Watch out! The original document contains candidate info for 2 areas." warning https://candidates.democracyclub.org.uk/elections/local.tower-hamlets.bethnal-green-west.2022-05-05/sopn/
Same with https://candidates.democracyclub.org.uk/elections/local.tower-hamlets.bethnal-green-east.2022-05-05/sopn/
local.lichfield.boney-hay-central.2023-05-04 - the pages for Boney Hay & Central and Bourne Vale wards have been combined
Exeter SOPNs don't appear to have been parsed by the bot - I've looked at the first 3 so far.
DocX file for Torbay Council doesn't appear to have been understood by the bot. [Edit] Later Wards within this SOPN document have not been page matched by the bot and required manual (Ctrl + F) Searching to even find the correct page of the SOPN to manually add the candidates.
Sandwell St. Paul’s is in a limbo half-broken state. The page extraction failed but the table parsing succeeded (albeit in a slightly janky format). The SOPN uploaded is the entire combined PDF file. The suspect for this strange breakage was the backtick in the ward name although Virginia has checked this out and can’t see a problem with it. https://candidates.democracyclub.org.uk/elections/local.sandwell.st-pauls.2023-05-04/sopn/
This issue is exclusively to track issues with SOPN Page Extraction.
For SOPN Parsing: Table Parsing Errors, go here: #1728
For SOPN Parsing: Table Extraction Errors, go here: #1727
Page extraction errors typically occur when uploading a SOPN. Most common errors include:
Please add these types of issues in the comments below with a