-
Notifications
You must be signed in to change notification settings - Fork 747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Priority Bidder Ejection #2952
Priority Bidder Ejection #2952
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request rolls back the optimization in your previous PR #2830. p.PriorityGroups
is a 2D matrix, so let c
be its number of columns and r
its number of rows. Let M = rc
be the total number of elements in p.PriorityGroups
and define N
as the number of entries in cookie.uids
. Under these assumptions, I think the time complexity of Choose()
is T(N*M + N + M)
if we exit in line 46 below:
35 func (p *PriorityBidderEjector) Choose(uids map[string]UIDEntry) (string, error) {
36 p.OldestEjector.nonPriorityKeys = getNonPriorityKeys(uids, p.PriorityGroups) // T(N*M)
37 if len(p.OldestEjector.nonPriorityKeys) > 0 {
38 return p.OldestEjector.Choose(uids)
39 }
40
41 if isSyncerPriority(p.SyncerKey, p.PriorityGroups) { // T(M)
42 // Eject Oldest Element from Lowest Priority Group and Update Priority Group
43 lowestPriorityGroup := p.PriorityGroups[len(p.PriorityGroups)-1]
44 oldestElement := getOldestElement(lowestPriorityGroup, uids) // T(N*M)
45 if oldestElement == "" {
46 return p.FallbackEjector.Choose(uids) // T(N)
47 }
48
49 updatedPriorityGroup := removeElementFromPriorityGroup(lowestPriorityGroup, oldestElement)
50 if updatedPriorityGroup == nil {
51 p.PriorityGroups = p.PriorityGroups[:len(p.PriorityGroups)-1]
52 } else {
53 p.PriorityGroups[len(p.PriorityGroups)-1] = updatedPriorityGroup
54 }
55
56 return oldestElement, nil
57 }
58 return "", errors.New("syncer key " + p.SyncerKey + " is not in priority groups")
59 } // T(N*M + M + N*M + N) ~> O(N*M)
usersync/ejector.go
And given that Choose()
runs N
times inside a for
loop in PrepareCookieForWrite()
, I believe we're looking at O(N^2*M)
59 // PrepareCookieForWrite ejects UIDs as long as the cookie is too full
60 func (cookie *Cookie) PrepareCookieForWrite(cfg *config.HostCookie, encoder Encoder, ejector Ejector) (string, error) {
61 for len(cookie.uids) > 0 { // T(N)
62 encodedCookie, err := encoder.Encode(cookie)
63 if err != nil {
64 return encodedCookie, nil
65 }
66
67 // Convert to HTTP Cookie to Get Size
68 httpCookie := &http.Cookie{
69 Name: uidCookieName,
70 Value: encodedCookie,
71 Expires: time.Now().Add(cfg.TTLDuration()),
72 Path: "/",
73 }
74 cookieSize := len([]byte(httpCookie.String()))
75
76 isCookieTooBig := cookieSize > cfg.MaxCookieSizeBytes && cfg.MaxCookieSizeBytes > 0
77 if !isCookieTooBig {
78 return encodedCookie, nil
79 }
80
81 uidToDelete, err := ejector.Choose(cookie.uids) // T(N*M + M + N*M + N)
82 if err != nil {
83 return encodedCookie, err
84 }
85 delete(cookie.uids, uidToDelete)
86 }
87 return "", nil
88 } // T(N*(N*M + M + N*M + N)) -> T(N^2*M + N*M + N^2*M + N*M)) ~> O(N^2*M)
If we reintroduce sort()
and arrange PriorityGroups
into an array of maps that allow for constant lookup time, I beleive we could cut this down to O(N^2 + M)
at the expense of O(N+M)
space:
60 func (cookie *Cookie) PrepareCookieForWrite(cfg *config.HostCookie, encoder Encoder, ejector Ejector) (string, error) {
+ sortedUuidKeys := sortUIDs(cookie.uids) // T(N*logN), Space(N)
+
+ priorityGroupsPutIntoHashTable := make([]map[string]struct{}, len(ejector.PriorityGroups) // Space(M)
+ for i, group := ejector.PriorityGroups { // T(r)
+ priorityGroupsPutIntoHashTable[i] = make(map[string]struct{}, len(group))
+ for _, bidder := range group { // T(c)
+ priorityGroupsPutIntoHashTable[i][bidder] = struct{}{}
+ }
+ } // T(r*c) -> T(M)
+
+ i := 0
61 for len(cookie.uids) > 0 { // T(N)
62 encodedCookie, err := encoder.Encode(cookie)
63 if err != nil {
64 return encodedCookie, nil
65 }
66
67 // Convert to HTTP Cookie to Get Size
68 httpCookie := &http.Cookie{
69 Name: uidCookieName,
70 Value: encodedCookie,
71 Expires: time.Now().Add(cfg.TTLDuration()),
72 Path: "/",
73 }
74 cookieSize := len([]byte(httpCookie.String()))
75
76 isCookieTooBig := cookieSize > cfg.MaxCookieSizeBytes && cfg.MaxCookieSizeBytes > 0
77 if !isCookieTooBig {
78 return encodedCookie, nil
79 }
80
81 - uidToDelete, err := ejector.Choose(cookie.uids)
+ hashedGroupWithLessPriority = priorityGroupsPutIntoHashTable[len(priorityGroupsPutIntoHashTable) - 1]
+
+ // If we pass i, sortedUuidKeys, and the hashedGroupWithLessPriority, I think we could refactor `Choose()` to a time
+ // complexity of O(N) because in N time we check each element of cookie.uids against the contents of hashedGroupWithLessPriority.
+ // > If an element of cookie.uids matches against hashedGroupWithLessPriority, return the match.
+ // > If no element of cookie.uids is found in hashedGroupWithLessPriorit, we go back to simply returning the oldest entry, which is uuidKeys[i]
+ uidToDelete, err := ejector.Choose(cookie.uids, i, sortedUuidKeys, hashedGroupWithLessPriority) // T(N)
82 if err != nil {
83 return encodedCookie, err
84 }
85 delete(cookie.uids, uidToDelete)
+
+ i++
86 }
87 return "", nil
+ // New time complexity: sorting + hashing the priority groups + traversing cookie.uids in this for loop times traversing cookie.uids in Choose()
+ // T(N*logN + M + N^2) ~> O(N^2 + M)
88 }
usersync/cookie.go
Does this make sense? Am I missing something? Consider this pseudocode that's just trying to draft an idea and by no means encompasses all the scenarios here. Please let me know your thoughts.
@guscarreon Just wanted to respond to your proposal of optimizing the ejection process through sorting uids and converting the priority groups into a hash table. I was able to follow all of your suggestions, but my one question was when we set Thank you for taking the time to write out this suggestion, very helpful! :) |
After attempting to implement @guscarreon Choose() runtime optimization changes, I came to the conclusion that it would be too messy to the current code to introduce this update. To introduce this change the following would occur:
|
The latest push removes the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In one of my comments below I mention the assumption that in practice, the majority of times we won't need to eject entries from the cookie.uids
map. But, to be honest, we don't have any metric for this. Do you think it'll be complicated to add a metric that counts the number of setuid
requests that make us remove at least one element from cookie.uids
?
If you think this PR is big enough, maybe we dshould do it in a separate PR if we decide this would be a worthy metric to follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Manually tested several scenarios, and it looks good. I didn't find any holes in the implementation. The only question I have is how the server should respond if the uid cannot fit in the cookie and isn't a priority. Right now, a 400 is returned. Does this match PBS-Java? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR addresses the Priority Bidder Ejection portion of this issue: #2173
The goal of this PR is to change how we eject UIDs from a cookie when a cookie is too full.
Currently, we simply eject the oldest UID from the cookie
The Ejector Object
In this PR, we introduce an ejector object with a
Choose()
method that can be implemented in various ways to adopt different strategist for how we Choose which UID to eject.For the
PriorityEjector
, the host company can now define a list of priority bidders at this pathusersync.priority_groups
. The typing of this list is[][]string
, where the last group of the list, represents the lowest priority elements. So this priority ejection strategy ejects the uid that is the lowest priority and the oldest element. Uids that are non priority elements will be ejected first before anything else. We utilize theOldestEjector
to carry out the ejection in the cases where there are non priority elements present, or there are multiple uids that are of the lowest priority. The PriorityEjector will first filter out the elements in these cases, and pass just the relevant uids to theOldestEjector
OldestEjector
simply eliminates the oldestuid
from theuids
it is given.The actual ejection used to take place in
usersync/cookie.go/SetCookieOnResponse()
, but after the refactor, it now takes place inusersync/cookie.go/PrepareCookieForWrite()