Scrambles the contents of some outline/hierarchical PDF documents #77

steveisakson · 2024-08-10T17:32:08Z

Generate a PDF of this page: https://www.ecfr.gov/current/title-14/chapter-I/subchapter-G/part-139

Convert to MD with pdf-to-markdown.

Compare the PDF with the MD. In the MD, headings end up several lines before the paragraph text that directly follows them in the PDF. Start at the end of the document to find more pronounced differences.
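In case it helps with triage, here is a rough sketch of how the comparison could be automated. It is not part of pdf-to-markdown. It assumes you already have the PDF's text in reading order as plain text (from any extractor, or copy/paste) plus the generated MD. The function names `normalize` and `out_of_order_lines` are made up for this example, and it only matches lines that are identical after stripping Markdown markers.

```python
import re


def normalize(line: str) -> str:
    """Strip a leading Markdown heading/list marker and collapse whitespace."""
    line = re.sub(r"^\s*(#{1,6}|[-*+])\s+", "", line)
    return re.sub(r"\s+", " ", line).strip().lower()


def out_of_order_lines(reference_text: str, markdown_text: str) -> list[str]:
    """Return reference lines that appear earlier in the Markdown than a
    line that precedes them in the reference (i.e. they were moved up)."""
    md_position = {}
    for index, raw in enumerate(markdown_text.splitlines()):
        key = normalize(raw)
        if key and key not in md_position:
            md_position[key] = index  # remember the first occurrence only

    displaced = []
    last_seen = -1
    for raw in reference_text.splitlines():
        key = normalize(raw)
        if key not in md_position:
            continue  # missing or reworded line; this sketch ignores it
        if md_position[key] < last_seen:
            displaced.append(key)
        else:
            last_seen = md_position[key]
    return displaced


if __name__ == "__main__":
    # Placeholder text, not taken from the eCFR document.
    reference = "Heading A\nParagraph for A\nHeading B\nParagraph for B"
    markdown = "# Heading A\n# Heading B\n\nParagraph for A\n\nParagraph for B"
    print(out_of_order_lines(reference, markdown))
```

With the placeholder input, the second heading is reported because it sits above the first paragraph in the MD. A real document would need fuzzier matching, since the converter may join or split lines.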

I haven't examined the PDF contents, so this might have more to do with the PDFs themselves or with how the doc-to-PDF conversion is configured on eCFR.gov. OTOH, the PDFs are automatically generated by a (presumably) commercial package, and eCFR has millions of users.

PS - It's not all bad. Your PDF parsing knocks the socks off a lot of other online tools. And the translation to MD is great — thanks!
