Test and improve performance #19

Open
proycon opened this issue Sep 20, 2023 · 2 comments
@proycon
Contributor

proycon commented Sep 20, 2023

Good performance is one of the design goals of this STAM implementation. I
consider this to include both efficient run-time execution (CPU time) and low
resource consumption (memory); there is often a trade-off between the two.

The library implements a fair number of benchmarks (run `cargo bench`) to quantify this.
Most insightful, however, are comparisons with other systems. The comparison with TextFabric is the
most notable and informative one, especially with regard to searching/querying.

Performance is constrained by the way the STAM model itself is designed. Its
aim to be flexible in supporting multiple annotation paradigms, and possibly
to act as a pivot model between them, implies that the implementation cannot
always be optimised as much as others. I'm again comparing with TextFabric here,
where for instance certain queries may map more directly onto internal
structures.

Further tests and possibly further refactoring rounds are needed to make performance as good as we can.

@proycon
Contributor Author

proycon commented Oct 14, 2023

  • An extra DataKey -> Annotations reverse index might be implemented to improve performance for searches by key. Right now such a search has to go via DataKey -> AnnotationData and then AnnotationData -> Annotations. A direct map would save collecting the annotations into a new vector, as we could then return a direct reference into the reverse index.
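
To illustrate the idea, here is a minimal sketch in Rust. It is not taken from the STAM code base: the `Indices` struct, its fields, the method names and the plain `u32` handles are all hypothetical stand-ins, chosen only to contrast the current two-hop lookup with a direct DataKey -> Annotations map.

```rust
use std::collections::HashMap;

// Hypothetical handle types for illustration only; the real library's
// handles and store layout are assumed to differ.
type KeyId = u32;
type DataId = u32;
type AnnotationId = u32;

#[derive(Default)]
struct Indices {
    /// DataKey -> AnnotationData (first hop of the existing route)
    key_to_data: HashMap<KeyId, Vec<DataId>>,
    /// AnnotationData -> Annotations (second hop of the existing route)
    data_to_annotations: HashMap<DataId, Vec<AnnotationId>>,
    /// Proposed extra reverse index: DataKey -> Annotations
    key_to_annotations: HashMap<KeyId, Vec<AnnotationId>>,
}

impl Indices {
    /// Record that `annotation` uses `data`, which belongs to `key`.
    /// All three maps are updated, so the extra index costs memory and
    /// insertion time. Deduplication via `contains` is a simplification;
    /// sorted vectors or sets would scale better.
    fn insert(&mut self, key: KeyId, data: DataId, annotation: AnnotationId) {
        let data_ids = self.key_to_data.entry(key).or_default();
        if !data_ids.contains(&data) {
            data_ids.push(data);
        }
        self.data_to_annotations
            .entry(data)
            .or_default()
            .push(annotation);
        let direct = self.key_to_annotations.entry(key).or_default();
        if !direct.contains(&annotation) {
            direct.push(annotation);
        }
    }

    /// Two-hop lookup: gathers results into a freshly allocated vector.
    fn annotations_by_key_two_hop(&self, key: KeyId) -> Vec<AnnotationId> {
        let mut out = Vec::new();
        if let Some(data_ids) = self.key_to_data.get(&key) {
            for data in data_ids {
                if let Some(annotations) = self.data_to_annotations.get(data) {
                    out.extend_from_slice(annotations);
                }
            }
        }
        // An annotation may reach the same key via several data items.
        out.sort_unstable();
        out.dedup();
        out
    }

    /// Direct lookup: borrows a slice from the reverse index, no allocation.
    fn annotations_by_key_direct(&self, key: KeyId) -> &[AnnotationId] {
        self.key_to_annotations
            .get(&key)
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

In this sketch both lookups yield the same set of annotations for a key; the direct one avoids the intermediate vector at the price of a third map that must be kept in sync on every insertion, which is the CPU versus memory trade-off mentioned at the top of this issue.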

@proycon
Contributor Author

proycon commented Oct 14, 2023

Other proposed features related to performance are #22 and #23

@proycon moved this from In Progress to On hold in STAM: Stand-off Text Annotation Model on Dec 12, 2023