Releases: weblyzard/inscriptis

Use the requests library for URL fetching

31 Jan 13:49
  • use requests for URL fetching (this addresses #17 and prevents 403 responses from some web servers).

Fixed handling of negative margins.

21 Dec 14:35
  • correctly parse negative margins in CSS definitions.
  • this fixes a bug that, on some pages, produced a very large number (>1000) of newlines between content.
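
As a rough illustration of the kind of parsing involved, the sketch below converts a CSS margin given in em units into a count of blank lines and clamps negative values to zero. The helper name `margin_to_lines` and the clamping rule are assumptions made for this example; they are not taken from the library's actual implementation.

```python
import re

# Matches values such as "2em", "-1.5em" or " 0em " (em units only).
_EM_LENGTH = re.compile(r"^\s*(-?\d+(?:\.\d+)?)em\s*$")


def margin_to_lines(value: str) -> int:
    """Convert a CSS margin in em units to a non-negative line count.

    Hypothetical helper: unparsable values and negative margins both
    yield 0, so they can never inflate the number of emitted newlines.
    """
    match = _EM_LENGTH.match(value)
    if match is None:
        return 0
    return max(0, round(float(match.group(1))))


print(margin_to_lines("2em"))
print(margin_to_lines("-3em"))
```

Clamping at zero is one simple policy; a converter could equally choose to ignore negative margins altogether.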

Use the server encoding, if available, in the inscript.py client.

11 Dec 19:50

This prevents encoding errors when using inscript.py for converting HTML pages to text.
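
The idea of preferring the encoding announced by the server can be sketched with the standard library alone. The functions `charset_from_content_type` and `decode_body`, and the UTF-8 fallback, are hypothetical names and choices for this example and may differ from what inscript.py does internally.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type header, if present."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decode_body(raw: bytes, content_type: Optional[str],
                fallback: str = "utf-8") -> str:
    """Decode a response body, preferring the server-declared charset."""
    charset = charset_from_content_type(content_type) or fallback
    try:
        return raw.decode(charset, errors="replace")
    except LookupError:
        # The server announced a charset label Python does not know.
        return raw.decode(fallback, errors="replace")


body = "Grüße".encode("latin-1")
print(decode_body(body, "text/html; charset=ISO-8859-1"))
```

Falling back to UTF-8 when no charset is declared is an assumption of this sketch; other defaults are possible.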

Decode HTML entities

15 Nov 15:26

Decode HTML entities such as &Auml;, &Ouml;, and &Uuml; prior to returning the plain text version of the HTML page.
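
For readers unfamiliar with entity decoding, the snippet below shows the effect using Python's standard `html.unescape`. It illustrates the behaviour only; whether the library uses this exact function is an assumption, and the sample string is made up for the example.

```python
from html import unescape

# Sample text containing named entities for German umlauts.
raw_text = "&Auml;pfel, &Ouml;l und &Uuml;bung"

# unescape() replaces named and numeric character references
# with the corresponding Unicode characters.
plain_text = unescape(raw_text)
print(plain_text)
```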

Improved parsing and PyPI metadata

17 Apr 11:19
  • improved handling of highly nested tables
  • more comprehensive PyPI metadata

Flask web service and more reliable parsing

24 Nov 08:47

Changelog

  1. optional Flask web service for converting HTML to text
  2. bug fixes
    • allow infinitely nested lists
    • fix a CSS parsing bug
    • correctly handle empty documents