-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance hog class ResourceSerializer needs refactoring #62
Comments
From quick look, wrt suggested change: absolutely yes one should just do:
instead of using While this makes sense to do and should be easy improvement, I would not have assumed it to have huge performance implications... always interesting to see where changes matter, where not. :) I can't comment much about surrounding design as I do not really know the implementation. |
@jbaiter @bitzl @morpheus-87 @datazuul |
Thanks for the excellent in-depth analysis @mtravnicek and @cowtowncoder!
Yes, that's exactly the purpose. The exploded form is e.g. used in collections where you want the type, the identifier and the labels of a given resource.
I agree that the code is an unholy mess and should be replaced with something better, but I'm not exactly sure how to go about it. The IIIF spec has a really large surface area, and for determining whether to unwrap a resource or not, we have to know if any of the allowed fields is set to a non-null or non-empty value. This is information that is only known at runtime, so this this means either:
I'll be doing some measurements on how much of this reflection-stuff is hurting performance and decide based on that whether to find a better way. |
So I did some JMH benchmarking with a simple test case that creates a manifest with 128 canvases (and should hit the line 128 times), here are the results: With
With
There is no difference outside of the statistical margin with my benchmark :-/ Can you maybe provide one of your pathological samples so I can do more profiling? As for the reflection stuff, that is a pretty huge performance hog indeed! I'll try to refactor it with one of the above solutions (or another one, if you have a good idea!). edit: So I experimented a bit with both solutions, and neither of them is really satisfying as they both result in TONS of boilerplate code. I think the whole API is up for a more large-scale refactoring. Ideally we'd have something like a |
Ok. I am not entirely surprised by change not resulting in statistically meaningful improvement -- I know |
Sorry for later response.... We are using your codebase (thanks) in this digital library app for generating iiif manifests: https://github.com/moravianlibrary/krameriusiiif (Here is some page explaining this tool) When i tried to optimize related serializer class, then Content Download value went back to almost 0 which is normal on localhost and overall latency decreased, which was confirmed by in app measuring. So please replace this oneliner of gen.writeObject to gen.writeString it wont hurt your code (getIdenifier.toString()) and may help me :) and maybe others.... |
When producing moderately small IIIF manfests (monograph with about 35 pages/images) using REST controller in Spring, added latency about 15ms can be noticed in process of serializing using Jackson. At first I thought that this was Jackson being slow... but slowdown can be traced down to custom serializer ResourceSerializer namely line 110 where replacing gen.writeObject with gen.writeString fixes most of the latency. This can be blatantly seen in Chrome response view as blue tail of time to download content.
iiif-apis/src/main/java/de/digitalcollections/iiif/model/jackson/serialization/ResourceSerializer.java
Line 110 in cd92524
Rest of processing in this class namely ModelUtilities.getCompleteness and following decoding/if statements also should be rethinked. Because you are calling this code (reflection!) on every processed element in JSON serialization again and again - it is not cached anywhere. What is the purpose of this code? I reckon that you try to determine if elements should be unwrapped to list or not, but for example "on" #25 , do you ever want it written out in exploded form? Does it have to be defined as (anonymous?) resource in contrast to context represented as simple string (no resource) #36 ? So the whole model approach should be reanalyzed and these flags to unwrap element or not - set it in form or some static flags in Resource instance itself, do not reverse engineer it at runtime!
Perhaps even authors of Jackson can look into/share some opinion on this: @cowtowncoder
The text was updated successfully, but these errors were encountered: