Paul Crovella edited this page Jan 13, 2019 · 9 revisions

When dealing with documents large enough that memory becomes an issue, you'll want a streaming parser: either a pull parser such as this one, or a SAX-style push parser. For an overview of the differences between push and pull parsers, see "XML reader models: SAX versus XML pull parser". The article focuses on XML, but the same concepts apply to JSON.

Another case for streaming parsers is unusual JSON that repeats a name within a single object. The JSON specification permits this, but json_decode() (like the majority of other implementations) will clobber properties as their keys collide, leaving only one value per name. A streaming parser lets you access one element at a time and retrieve data that would otherwise disappear.
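The clobbering is easy to reproduce in most JSON libraries. As a minimal sketch, the snippet below uses Python's standard `json` module as a stand-in for one of those "other implementations"; the sample document and variable names are made up for illustration. It is not a streaming parser and still loads the whole document into memory, but it shows what is lost by default and what a pair-by-pair view of the same object retains.

```python
import json

# Hypothetical sample document with a duplicated "tag" name.
doc = '{"id": 1, "tag": "first", "tag": "second"}'

# Default decoding builds a dict, so the later "tag" overwrites the earlier one.
clobbered = json.loads(doc)

# object_pairs_hook receives every (name, value) pair in document order;
# returning the list unchanged keeps the duplicates.
pairs = json.loads(doc, object_pairs_hook=lambda items: items)

print(clobbered)
print(pairs)
```

A streaming parser gives you the same pair-by-pair access without first materialising the whole object, which is what matters once the document no longer fits comfortably in memory.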
