-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Standardize third party libraries for generating string hash and unique ids #6493
Comments
|
@eatyourgreens I'm confused about the classifier's use of Unless I'm missing something obvious, I don't see that behavior. For instance, let's say I make a classification on Daily Minor Planet, and the session storage object is
More than 5 minutes later, I make another classification in the same browser tab, and the session storage object's
I think my question is two fold: Are javascript dates being used as intended in session.js, and why does session need to be a cryptographically secure hash? |
That's old code that Ed originally wrote, I think. It should reset after 5 minutes of inactivity, but I forget what calls it to refresh the TTL. The original code is here: https://github.com/zooniverse/Panoptes-Front-End/blob/master/app/lib/session.coffee I think the monorepo code might be a direct JS copy of the original. |
There's a test for the case where the TTL has expired but it's a bad test. It tests that a stub has been called, rather than testing that the session ID has changed. Unfortunately, the classifier is full of tests that use stubs like this. The problem with them is that testing if a method was called is not the same as testing if a method behaved as expected. |
Agreed it's a bad test. The condition It's probably worth removing the "five minutes" logic from both PFE and FEM. |
To answer your original question, though, I would expect that I don't think it matters if the value is secure. As far as I'm aware, project teams use it to group classifications by anonymous volunteers, so it only needs to be unique to a session. |
You're right! I don't see how that code has ever worked. Which means any analysis based on PFE session IDs is basically flawed. Good catch! |
Any papers that were published based on that code might be wrong. 😳 |
Maybe 🤔 When someone looks at a classification's session id, without a signed-in user, do they assume it's a metadata value that refreshes every 5 minutes though? When I think "session id" as a web concept, I think of it as an individual browser tab id that refreshes when a new tab or window is opened. |
It doesn't refresh every 5 minutes. It's supposed to refresh if there's no activity for 5 minutes. If you're classifying rapidly, each classification should have the same session ID, because the TTL updates every time you press Done. I'm sure it's been used to measure session length in papers. Off the top of my head, maybe the Microsoft experiment that I helped with, which looked at the effect of intervention messages on session length. |
The per-tab sessions that you're referring to are recorded by Sugar, and counted on https://activity.zooniverse.org. |
That's a good clarification about 5 minutes of inactivity and interesting connection to https://activity.zooniverse.org/. I opened #6499 for FEM and will do the same for PFE. |
I think that session code needs to be fixed, rather than deleted, since classification sessions are used in published papers. A classification session is distinct from a browser session, in that a volunteer can keep a tab open for hours or days and have several classification sessions in one browser session. There might even be code in https://github.com/zooniverse/data-digging that uses classification session IDs to estimate session durations. That'll be wrong now. |
Similar to my comment on your PR, I'm in favor of not changing the classifier's behavior. Session ids have never refreshed after 5min of activity in the PFE classifier or the FEM classifier. Regarding data digging and publications, I'll address the potential incorrect assumptions about session ids with the zoo team and document our decision on the appropriate Issues. |
Package
lib-classifier
cuid
is deprecated (npm link)cuid
is also used throughout the annotation code to generate an id for default annotation objects and new marks in the drawing tools.front-end-monorepo/packages/lib-classifier/src/store/ClassificationStore.js
Lines 57 to 59 in 496e883
hash.js
is a popular npm package last published 6 yrs ago, but it might replicate behavior available in modern browserssessionUtils.getSessionID()
. Sessions are stored in browser session storage, so they're unique to individual tabs.front-end-monorepo/packages/lib-classifier/src/store/utils/session.js
Lines 12 to 14 in 496e883
lib-react-components
cuid
is deprecated. Same third party library as lib-classifier.visx/group
:front-end-monorepo/packages/lib-react-components/src/Media/components/Data/components/DataSeriesPlot/DataSeriesPlot.js
Line 165 in 496e883
app-project
@sindresorhus/string-hash
is not a popular npm package. Last published 3 yrs ago.announcement
(which project teams can change any time) hence a random unique id is not attached to the cookie preference in favor of a hashed string of the announcement's content.stringHash
in UI.js:front-end-monorepo/packages/app-project/stores/UI.js
Lines 60 to 64 in 496e883
lib-user
uuid
: a popular, maintained npm package.front-end-monorepo/packages/lib-user/src/components/MyGroups/components/GroupCreateFormContainer/GroupCreateFormContainer.js
Lines 10 to 16 in 496e883
Describe the solution you'd like
Standardize libraries for the three main use cases:
SHA256 is now included in current browsers via
crypto.subtle.digest
uuid
is a maintained library. We could use it in place ofcuid
throughout lib-classifier and lib-react-components.uuid
an over-large package for our use case and if tree-shaking is working correctly to grab onlyv4
The text was updated successfully, but these errors were encountered: