SOPN Parsing: Table Extraction Errors #1727
Comments
Missing first names #1426 (comment)
Having looked into #1728 (comment) a little locally, I think this is a bug in the table extraction. It seems that the "surname" and "other name" values are parsed together, as if they were in the same column. Debug print of the row parsed for this candidate, where you can see that "surname" holds both values and "other name" is blank (need to scroll to see):
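As a rough illustration of the symptom described above (not the project's actual parser code), a check like the following could flag rows where the two name columns appear to have been merged. The column keys `surname` and `other names`, the function name `looks_merged`, and the sample rows are all hypothetical.

```python
def looks_merged(row, surname_key="surname", other_key="other names"):
    """Heuristic: the 'other names' cell is blank while the 'surname'
    cell holds more than one whitespace-separated token."""
    surname = (row.get(surname_key) or "").strip()
    other = (row.get(other_key) or "").strip()
    return other == "" and len(surname.split()) > 1


# Hypothetical rows: the first shows the merged-column symptom.
rows = [
    {"surname": "SMITH John Paul", "other names": ""},
    {"surname": "JONES", "other names": "Mary"},
]

flagged = [row for row in rows if looks_merged(row)]
print(flagged)
```

This is only a heuristic: a genuine multi-word surname with no other names recorded would also be flagged, so it is better suited to surfacing rows for manual review than to automatic correction.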
Example of a SOPN that was published as HTML. I printed it to PDF and the bot failed to parse it: https://candidates.democracyclub.org.uk/elections/local.dorset.lyme-charmouth.by.2022-04-07/sopn/ We don't get many of these, but I'm posting it here in case it is useful.
https://candidates.democracyclub.org.uk/elections/local.watford.central.2022-05-05/sopn/
I think this is actually a good thing: the table on the second page includes the header row again, but the parser skips it because it can't find any party, and the rest of the people are still parsed. It would be better, though, if we could remove that row entirely earlier on.
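A minimal sketch of what removing the repeated header row earlier on might look like, assuming the extracted table is available as a list of rows (each a list of cell strings) with the header as the first row. The function names and sample data are hypothetical and not taken from the project's code.

```python
def normalise(cell):
    """Lower-case a cell and collapse runs of whitespace for comparison."""
    return " ".join(str(cell).lower().split())


def drop_repeated_headers(rows):
    """Keep the first row as the header and drop any later row that
    matches it after normalisation (e.g. a header repeated on page 2)."""
    if not rows:
        return []
    header = [normalise(cell) for cell in rows[0]]
    kept = [rows[0]]
    for row in rows[1:]:
        if [normalise(cell) for cell in row] == header:
            continue  # repeated header row, skip it
        kept.append(row)
    return kept


# Hypothetical table with the header repeated part-way through.
table = [
    ["Surname", "Other names", "Description"],
    ["SMITH", "John", "Example Party"],
    ["Surname", "Other  names", "Description"],
    ["JONES", "Mary", "Another Party"],
]

cleaned = drop_repeated_headers(table)
print(cleaned)
```

Comparing normalised cells means small spacing or capitalisation differences between pages do not stop the repeated header from being recognised.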
Looking at this SOPN, we unfortunately have no hope of parsing it.
https://candidates.democracyclub.org.uk/elections/local.cambridgeshire.arbury.by.2023-05-04/sopn/ Marked |
This issue is exclusively to track issues with SOPN Table Extraction.
For SOPN Parsing: Table Parsing Errors, go here: #1728
For SOPN Parsing: Page Extraction Errors, go here: #1726
Table extraction errors are typically found after a successful SOPN upload, during a bot parse. The bot fails to parse completely and the result is no pre-filled info in the bulk add form.
Please add these types of issues in the comments below with a