Got repeated datainputStream directory name. #27

Open
jye0829 opened this issue Aug 22, 2024 · 4 comments

Comments

@jye0829

jye0829 commented Aug 22, 2024

I have two different regressor models, which I converted to .jar format using pmml-transpiler and imported into my project as two JAR files.
The build fails with this error message:
2 files found with path 'com/zhenzhe/javacv/PMML$1266265220.data'.
Adding a packagingOptions block may help, please refer to
https://developer.android.com/reference/tools/gradle-api/8.2/com/android/build/api/dsl/ResourcesPackagingOptions
for more information.
It appears that the transpiler generated the same data resource path in both JAR files. Do you have any idea why they are the same?

@vruusmann vruusmann transferred this issue from jpmml/jpmml-evaluator Aug 22, 2024
@vruusmann
Member

2 files found with path 'com/zhenzhe/javacv/PMML$1266265220.data'.

The identifiers are generated using the IdentifierUtil#create(String, PMMLObject) utility method:
https://github.com/jpmml/jpmml-transpiler/blob/1.3.5/pmml-transpiler/src/main/java/org/jpmml/translator/IdentifierUtil.java#L112-L115

You get two identical identifiers if the second argument (a PMMLObject object) yields the same "system identity hashcode" (SIH) in both cases.

IIRC, the SIH relates to the location of the Java object in JVM memory space, and it is set automatically by the JVM. I cannot provide any "location hints" in advance, or move the Java object to a different location afterwards.
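To illustrate the mechanism, here is a minimal sketch. It assumes the identifier is simply the prefix, a `$` sign and the SIH, which is inferred from the shape of the resource path in the error message rather than copied from the linked source. The `IdentitySketch` class and its `create` method are hypothetical stand-ins, and a plain `Object` stands in for the `PMMLObject` instance.

```java
public class IdentitySketch {

    // Hypothetical stand-in for IdentifierUtil#create(String, PMMLObject).
    // Assumption: the identifier has the shape "<prefix>$<identity hash code>".
    public static String create(String prefix, Object object) {
        return prefix + "$" + System.identityHashCode(object);
    }

    public static void main(String[] args) {
        // Stand-in for an org.dmg.pmml.PMMLObject instance
        Object model = new Object();

        // The number is assigned by the JVM, not by the application code.
        // It stays the same for the lifetime of this object.
        System.out.println(create("PMML", model) + ".data");
    }
}
```

Running this class several times, each time in a freshly started JVM, is a quick way to check whether the printed number repeats on a particular JVM vendor/version.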

@vruusmann
Member

What's your JVM vendor/version?

Again, IIRC, SIHs are often semi-deterministic, meaning that if you play out the same sequence of events (on a newly created JVM), you are likely to get the same SIHs for your Java objects.

My guess is that you were generating these JAR files using the JPMML-Transpiler command-line application in two consecutive but independent sessions. Hence the JVM is placing the org.dmg.pmml.PMMLObject instance at the same memory address in both sessions, which yields the same SIH.

Try to change something about your JVM configuration between these two sessions. For example, change the size of the JVM memory, so that Java objects get deposited at a different memory location.

@vruusmann
Member

All things considered, there's not enough information here for me to suggest a definite workaround/fix.

I don't see this kind of resource identifier collision as a major bug. In fact, I'd consider it more like a feature, meaning that it's possible to get reproducible JAR builds!

@vruusmann
Member

The alternative to SIHs would be random number generation. Feel free to implement this code change locally.

Perhaps there could be a configuration option for choosing the identifier style.
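A local code change along these lines might look like the following sketch. Everything in it is hypothetical: the `IdentifierStyleSketch` class, the `IdentifierStyle` enum and the three-argument `create` method are illustration-only names, not part of JPMML-Transpiler, and the `<prefix>$<number>` shape is assumed from the resource path in the error message.

```java
import java.security.SecureRandom;

public class IdentifierStyleSketch {

    // Hypothetical configuration option; not an existing JPMML-Transpiler setting.
    public enum IdentifierStyle {
        IDENTITY_HASH, // reproducible across identical sessions, but may collide
        RANDOM         // differs between sessions, but builds are not reproducible
    }

    // SecureRandom seeds itself from the operating system,
    // so two independent JVM sessions do not replay the same sequence.
    private static final SecureRandom RANDOM = new SecureRandom();

    public static String create(IdentifierStyle style, String prefix, Object object) {
        switch (style) {
            case IDENTITY_HASH:
                return prefix + "$" + System.identityHashCode(object);
            case RANDOM:
                // Mask off the sign bit to keep the number non-negative
                return prefix + "$" + (RANDOM.nextLong() & Long.MAX_VALUE);
            default:
                throw new IllegalArgumentException("Unknown style: " + style);
        }
    }

    public static void main(String[] args) {
        Object model = new Object(); // stand-in for a PMMLObject instance
        System.out.println(create(IdentifierStyle.IDENTITY_HASH, "PMML", model) + ".data");
        System.out.println(create(IdentifierStyle.RANDOM, "PMML", model) + ".data");
    }
}
```

The trade-off is the one mentioned above: the random style avoids resource path collisions between independently generated JAR files, at the cost of reproducible builds.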
