Bump org.jsoup:jsoup from 1.18.1 to 1.18.3 #363

Closed

dependabot bot commented on behalf of github Dec 2, 2024

Bumps org.jsoup:jsoup from 1.18.1 to 1.18.3.

Release notes

Sourced from org.jsoup:jsoup's releases.

jsoup-1.18.3

1.18.3 is a quick release to fix #2235 in 1.18.2.

Please see also the full release notes for jsoup 1.18.2 if you are coming from an earlier release.

Bug Fixes

  • When serializing to XML, attribute names containing -, ., or digits were incorrectly considered invalid and removed. #2235
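
To illustrate why names such as data-id or ns:attr2 should survive XML serialization, here is a minimal sketch of a name check. It is not jsoup's code: XmlAttributeNames is a hypothetical helper, and the pattern is an ASCII-only simplification of the XML 1.0 Name production, in which -, . and digits are allowed anywhere except the first character.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of jsoup: ASCII-only approximation of the XML 1.0 Name rule. */
final class XmlAttributeNames {
    // First character: a letter, '_' or ':'. Later characters may also be digits, '-' or '.'.
    private static final Pattern NAME = Pattern.compile("[A-Za-z_:][A-Za-z0-9_:.\\-]*");

    private XmlAttributeNames() {}

    /** Returns true if the name would be acceptable as an XML attribute name under this simplified rule. */
    static boolean isValid(String name) {
        return name != null && NAME.matcher(name).matches();
    }
}
```

Under this rule, XmlAttributeNames.isValid("data-id") holds, while a name that starts with a digit or a hyphen is rejected.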

jsoup 1.18.2

Improvements

  • Optimized the throughput and memory use throughout the input read and parse flows, with heap allocations and GC reduced by between 6% and 89%, and throughput improved by up to 143% for small inputs. Most input sizes will see throughput increases of ~20%. These performance improvements come from recycling the backing byte[] and char[] arrays used to read and parse the input. 2186
  • Speed optimized html() and Entities.escape() when the input contains UTF characters in a supplementary plane, by around 49%. 2183
  • The form-associated elements returned by FormElement.elements() now reflect changes made to the DOM after the original parse. 2140
  • In the TreeBuilder, the onNodeInserted() and onNodeClosed() events are now also fired for the outermost / root Document node. This enables source position tracking on the Document node (which was previously unset). And it also enables the node traversor to see the outer Document node. 2182
  • Selected Elements can now be position swapped inline using Elements#set(). 2212
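
The array recycling mentioned in the first improvement can be pictured with a small buffer pool. This is only a sketch of the general technique under simple assumptions (single-threaded use, fixed buffer size). CharBufferPool and its methods are hypothetical names, and jsoup's actual implementation may differ.

```java
import java.util.ArrayDeque;

/** Hypothetical single-threaded pool of char[] buffers; not jsoup's actual implementation. */
final class CharBufferPool {
    private final ArrayDeque<char[]> free = new ArrayDeque<>();
    private final int bufferSize;
    private final int maxPooled;

    CharBufferPool(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    /** Reuses a pooled buffer when one is available, otherwise allocates a new one. */
    char[] borrow() {
        char[] buf = free.pollFirst();
        return buf != null ? buf : new char[bufferSize];
    }

    /** Returns a buffer to the pool; wrongly sized or surplus buffers are left to the GC. */
    void release(char[] buf) {
        if (buf.length == bufferSize && free.size() < maxPooled) {
            free.addFirst(buf);
        }
    }

    /** Number of buffers currently waiting for reuse. */
    int pooled() {
        return free.size();
    }
}
```

A parser written this way would call borrow() before reading input and release() when done, so repeated parses of small inputs avoid allocating a fresh array each time.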

Bug Fixes

  • Element.cssSelector() would fail if the element's class contained a * character. 2169
  • When tracking source ranges, a text node following an invalid self-closing element may be left untracked. 2175
  • When a document has no doctype, or a doctype not named html, it should be parsed in Quirks Mode. 2197
  • With a selector like div:has(span + a), the has() component was not working correctly, as the inner combining query caused the evaluator to match those against the outer's siblings, not children. 2187
  • A selector query that included multiple :has() components in a nested :has() might incorrectly execute. 2131
  • When cookie names in a response are duplicated, the simple view of cookies available via Connection.Response#cookies() will provide the last one set. Generally it is better to use the Jsoup.newSession method to maintain a cookie jar, as that applies appropriate path selection on cookies when making requests. 1831
  • When parsing named HTML entities, base entities should resolve if they are a prefix of the input token (and not in an attribute). 2207
  • Fixed incorrect tracking of source ranges for attributes merged from late-occurring elements that were implicitly created (html or body). 2204
  • Follow the current HTML specification in the tokenizer to allow < as part of a tag name, instead of emitting it as a character node. 2230
  • Similarly, allow a < as the start of an attribute name, vs creating a new element. The previous behavior was intended to parse closer to what we anticipated the author's intent to be, but that does not align to the spec or to how browsers behave. 1483
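
The duplicate-cookie behavior described above (the simple view provides the last cookie set) can be sketched with a plain map. This is not jsoup's code: SimpleCookieView and fromSetCookieHeaders are hypothetical names, and the parsing is deliberately minimal.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of a "last one set wins" cookie view; not jsoup's implementation. */
final class SimpleCookieView {
    private SimpleCookieView() {}

    /** Builds a name-to-value map from raw Set-Cookie header values, ignoring attributes such as Path. */
    static Map<String, String> fromSetCookieHeaders(List<String> headers) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String header : headers) {
            String pair = header.split(";", 2)[0]; // keep only "name=value"
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed entries with no name
            }
            // put() overwrites an earlier value with the same name, so the last header wins
            cookies.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return cookies;
    }
}
```

Because the Path attribute is discarded, a flat map like this cannot tell apart two cookies with the same name and different paths, which is the reason the release note recommends a session with a cookie jar instead.
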
Changelog

Sourced from org.jsoup:jsoup's changelog.

1.18.3 (PENDING)

Bug Fixes

  • When serializing to XML, attribute names containing -, ., or digits were incorrectly marked as invalid and removed. 2235

1.18.2 (2024-Nov-27)

Improvements and bug fixes are the same as those listed under jsoup 1.18.2 in the release notes above.

Commits
  • 7c56eb2 [maven-release-plugin] prepare release jsoup-1.18.3
  • bf13b49 Assert namespaced attribute with digit
  • 0a4b830 Fix XML attribute validation
  • f6e82f2 Note 1.18.2 release date
  • 2a174dc [maven-release-plugin] prepare for next development iteration
  • 71063c3 [maven-release-plugin] prepare release jsoup-1.18.2
  • 1a91aac Use the incoming node's parent if outgoing has already been removed
  • df404cf test case for Issue #2212
  • 28db617 Test for #1938
  • d27370a Follow spec so < can start an attribute name
  • Additional commits viewable in compare view

Bumps [org.jsoup:jsoup](https://github.com/jhy/jsoup) from 1.18.1 to 1.18.3.
- [Release notes](https://github.com/jhy/jsoup/releases)
- [Changelog](https://github.com/jhy/jsoup/blob/master/CHANGES.md)
- [Commits](jhy/jsoup@jsoup-1.18.1...jsoup-1.18.3)

---
updated-dependencies:
- dependency-name: org.jsoup:jsoup
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
dependabot bot added the dependencies label ("Pull requests that update a dependency file") Dec 2, 2024

sonarqubecloud bot commented Dec 2, 2024

github-actions bot commented Dec 2, 2024

Test Results

 89 files  ±0   89 suites  ±0   1m 36s ⏱️ -21s
420 tests ±0  419 ✅ ±0  0 💤 ±0   1 ❌ ±0 
428 runs  ±0  415 ✅ ±0  0 💤 ±0  13 ❌ ±0 

For more details on these failures, see this check.

Results for commit 2f211e0. ± Comparison against base commit 636a74b.

This pull request removes 60 and adds 23 tests. Note that renamed tests count towards both.
org.aim42.htmlsanitycheck.check.ImageMapsCheckerSpec ‑ find image map issues [nrOfFindings: 1, imageMapStr: <img src="image1.jpg" usemap="#map1"><map name="map1">
    <area shape="rect" coords="0,0,1,1" href="#id1" >
</map>
<h2 id="foo" >bad header</h2>, msg: ImageMap "map1" refers to missing link "id1"., #4]
org.aim42.htmlsanitycheck.check.ImageMapsCheckerSpec ‑ find image map issues [nrOfFindings: 1, imageMapStr: <img src="image1.jpg" usemap="#map1"><map name="map1">
    <area shape="rect" coords="0,0,1,1" href="#id1" >
</map>
<map name="map1">
    <area shape="rect" coords="0,0,1,1" href="#id1" >
</map>
<h2 id="id1">aim42 header</h2>, msg: 2 imagemaps with identical name "map1" exist., #1]
org.aim42.htmlsanitycheck.check.ImageMapsCheckerSpec ‑ find image map issues [nrOfFindings: 1, imageMapStr: <img src="image1.jpg" usemap="#map1"><map name="map1">
</map>
, msg: ImageMap "map1" has no area tags., #3]
org.aim42.htmlsanitycheck.check.ImageMapsCheckerSpec ‑ find image map issues [nrOfFindings: 1, imageMapStr: <map name="map1">
    <area shape="rect" coords="0,0,1,1" href="#id1" >
</map>
<h2 id="id1">aim42 header</h2>, msg: ImageMap "map1" not referenced by any image., #2]
org.aim42.htmlsanitycheck.html.HtmlPageSpec ‑ can extract alt attributes from imageTag ' <img alt="1" >
                                <img src="" alt="2">
                                <img src="t.doc" alt="r"> '
org.aim42.htmlsanitycheck.html.HtmlPageSpec ‑ detect correct number of external http links in anchors '<a href="http://arc42.org">arc42</a> and some text
                                <a href="http://arc42.de">arc42.de</a> and some more text
                                <a href="https://arc42.org">arc42 over https</a> even more
                                <a href="local-file.jpg">local file</a> again, text
                                <a href="http://aim.org">improve</a>' 
org.aim42.htmlsanitycheck.html.HtmlPageSpec ‑ detect missing alt attributes in imageTag ' <img alt="1" >
                                <img src="" alt="2">
                                <img src="t.doc" alt="r"> '
org.aim42.htmlsanitycheck.html.ImageMapParserSpec ‑ find all areas within map [nrOfAreas: 1, mapName: mymap, htmlBody: <img src="image.gif" usemap="#mymap">
<map name="mymap">
    <area shape="rect" coords="0,0,1,1" href="#test1" >
</map> , #1]
org.aim42.htmlsanitycheck.html.ImageMapParserSpec ‑ find all areas within map [nrOfAreas: 2, mapName: mymap, htmlBody: <img src="image.gif" usemap="#mymap">
<map name="mymap">
    <area shape="rect" coords="0,0,1,1" href="#test1" >
    <area shape="circle" coords="0,1,1" href="#test2">
</map> , #0]
org.aim42.htmlsanitycheck.html.ImageMapParserSpec ‑ find all hrefs within map [nrOfHrefs: 1, mapName: mymap, htmlBody: <img src="image.gif" usemap="#mymap">
<map name="mymap">
    <area shape="rect" coords="0,0,1,1" href="#test1" >
</map> , hrefs: [#test1], #1]
…

♻️ This comment has been updated with latest results.

dependabot bot commented on behalf of github Dec 13, 2024

Looks like org.jsoup:jsoup is up-to-date now, so this is no longer needed.

dependabot bot closed this Dec 13, 2024
dependabot bot deleted the dependabot/gradle/develop/org.jsoup-jsoup-1.18.3 branch December 13, 2024 17:57