Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs2-data-xml streaming parser #25

Closed
wants to merge 11 commits into from
5 changes: 3 additions & 2 deletions build.sbt
Original file line number Diff line number Diff line change
@@ -1,5 +1,4 @@
ThisBuild / tlBaseVersion := "0.23"
ThisBuild / tlMimaPreviousVersions ++= (0 to 11).map(y => s"0.23.$y").toSet
ThisBuild / tlBaseVersion := "0.24"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't decide whether or not to bump the base version. We can make this change binary-compatibly, it's more about semantics.

ThisBuild / developers := List(
tlGitHubDev("rossabaker", "Ross A. Baker")
)
Expand All @@ -12,6 +11,7 @@ lazy val root = project.in(file(".")).aggregate(scalaXml).enablePlugins(NoPublis

val http4sVersion = "0.23.12"
val scalaXmlVersion = "2.1.0"
val fs2DataVersion = "1.4.0"
val munitVersion = "0.7.29"
val munitCatsEffectVersion = "1.0.7"

Expand All @@ -24,6 +24,7 @@ lazy val scalaXml = project
libraryDependencies ++= Seq(
"org.http4s" %%% "http4s-core" % http4sVersion,
"org.scala-lang.modules" %%% "scala-xml" % scalaXmlVersion,
"org.gnieh" %%% "fs2-data-xml-scala" % fs2DataVersion,
"org.scalameta" %%% "munit-scalacheck" % munitVersion % Test,
"org.typelevel" %%% "munit-cats-effect-3" % munitCatsEffectVersion % Test,
"org.http4s" %%% "http4s-laws" % http4sVersion % Test,
Expand Down
37 changes: 21 additions & 16 deletions scala-xml/src/main/scala/org/http4s/scalaxml/ElemInstances.scala
Original file line number Diff line number Diff line change
Expand Up @@ -18,19 +18,19 @@ package org.http4s
package scalaxml

import cats.effect.Concurrent
import cats.syntax.all._
import fs2.data.xml.XmlException
import fs2.data.xml.scalaXml._
import org.http4s.Charset.`UTF-8`
import org.http4s.headers.`Content-Type`

import java.io.ByteArrayInputStream
import java.io.StringWriter
import javax.xml.parsers.SAXParserFactory
import scala.util.control.NonFatal
import scala.xml.Elem
import scala.xml.InputSource
import scala.xml.SAXParseException
import scala.xml.XML

trait ElemInstances {
@deprecated("0.23.12", "This factory is no longer used")
protected def saxFactory: SAXParserFactory
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the (semantically) breaking change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing SAX will also change the types of errors that are raised.


implicit def xmlEncoder[F[_]](implicit charset: Charset = `UTF-8`): EntityEncoder[F, Elem] =
Expand All @@ -52,18 +52,23 @@ trait ElemInstances {
implicit def xml[F[_]](implicit F: Concurrent[F]): EntityDecoder[F, Elem] = {
import EntityDecoder._
decodeBy(MediaType.text.xml, MediaType.text.html, MediaType.application.xml) { msg =>
val source = new InputSource()
msg.charset.foreach(cs => source.setEncoding(cs.nioCharset.name))

collectBinary(msg).flatMap[DecodeFailure, Elem] { chunk =>
source.setByteStream(new ByteArrayInputStream(chunk.toArray))
val saxParser = saxFactory.newSAXParser()
try DecodeResult.successT[F, Elem](XML.loadXML(source, saxParser))
catch {
case e: SAXParseException =>
DecodeResult.failureT(MalformedMessageBodyFailure("Invalid XML", Some(e)))
case NonFatal(e) => DecodeResult(F.raiseError[Either[DecodeFailure, Elem]](e))
}
DecodeResult {
msg.bodyText
.through(fs2.data.xml.events())
.through(fs2.data.xml.dom.documents)
.head
.compile
.lastOrError
.map {
_.docElem match {
case e: Elem => Right(e)
case _ => Left(MalformedMessageBodyFailure("Invalid XML"))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this charade, maybe this should become an EntityEncoder[F, Document] instead?

}
}
.recover { case ex: XmlException =>
Left(MalformedMessageBodyFailure("Invalid XML", Some(ex)))
}
.widen
}
}
}
Expand Down