Entity Size Limit Met when Using Default Java Options #2

Open
Barbarrosa opened this issue Nov 19, 2016 · 1 comment

Comments

@Barbarrosa
Collaborator

When attempting to process a Wikipedia dump with the default Java settings, I run into this error:

$ java -jar target/Wiki7ZipXmlDumpReader-0.0.1.jar > ../results.out
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[44771354,588]
Message: JAXP00010004: The accumulated size of entities is "50,000,001" that exceeded the "50,000,000" limit set by "FEATURE_SECURE_PROCESSING".
        at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:596)
        at us.elephanthunter.Wiki7ZipXmlDumpReader.Wiki7ZipXmlDumpReader.Read7ZipStream$1.run(Read7ZipStream.java:103)
        at java.lang.Thread.run(Thread.java:745)

Based on this thread about the same problem, I can set the jdk.xml.totalEntitySizeLimit system property to raise this limit:

java -Djdk.xml.totalEntitySizeLimit=500000000 -jar target/Wiki7ZipXmlDumpReader-0.0.1.jar

I would like to pre-configure our application to use such a limit if possible, or else find a way to avoid the need for this higher limit.
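
One way to pre-configure this might be to set the same system property from code before the StAX factory is created, so nobody has to remember the -D flag. The sketch below is only an illustration: the class `EntityLimitConfig` and its methods are hypothetical names, not part of Wiki7ZipXmlDumpReader, and it assumes the JDK's built-in parser (the one in the stack trace) reads the property when the factory is instantiated. It uses 0, which the JDK documents as "no limit" for this property, instead of picking a larger number.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of Wiki7ZipXmlDumpReader.
class EntityLimitConfig {

    // Same property name as the -D flag used on the command line.
    static final String TOTAL_ENTITY_SIZE_LIMIT = "jdk.xml.totalEntitySizeLimit";

    // Returns a StAX factory created after the limit has been disabled.
    static XMLInputFactory newFactoryWithoutTotalEntityLimit() {
        // A value of 0 (or less) is documented by the JDK as "no limit".
        // Assumption: the property must be set before the factory is created
        // so that the parser picks it up during initialization.
        System.setProperty(TOTAL_ENTITY_SIZE_LIMIT, "0");
        return XMLInputFactory.newInstance();
    }

    // Walks the whole document and counts start tags, to exercise the reader.
    static int countStartElements(XMLInputFactory factory, String xml)
            throws XMLStreamException {
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        XMLInputFactory factory = newFactoryWithoutTotalEntityLimit();
        String sample = "<page><title>A &amp; B</title></page>";
        System.out.println(countStartElements(factory, sample));
    }
}
```

The limit exists as a guard against entity-expansion attacks, so turning it off seems reasonable only for trusted input such as an official Wikipedia dump.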

@Barbarrosa
Collaborator Author

Raising the limit to 500,000,000 was not enough. The parser hit the new limit as well:

javax.xml.stream.XMLStreamException: ParseError at [row,col]:[489817854,234]
Message: JAXP00010004: The accumulated size of entities is "500,000,001" that exceeded the "500,000,000" limit set by "system property".
        at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:596)
        at us.elephanthunter.Wiki7ZipXmlDumpReader.Wiki7ZipXmlDumpReader.Read7ZipStream$1.run(Read7ZipStream.java:103)
        at java.lang.Thread.run(Thread.java:745)
