Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IP scrubber #2990

Merged
merged 21 commits into from
Sep 5, 2023
Merged

IP scrubber #2990

merged 21 commits into from
Sep 5, 2023

Conversation

VeronikaSolovei9
Copy link
Contributor

Added anonymizeIpv6 function to transmitPreciseGeo activitiy to match PBS-Java implementation

Addressed issue #2981

privacy/scrubber.go Show resolved Hide resolved
privacy/scrubber_test.go Outdated Show resolved Hide resolved
privacy/scrubber.go Outdated Show resolved Hide resolved
@SyntaxNode SyntaxNode self-assigned this Jul 31, 2023
config/account.go Outdated Show resolved Hide resolved
config/account.go Outdated Show resolved Hide resolved
config/account.go Show resolved Hide resolved
}

func (ipMasking *IpMasking) Validate(errs []error) []error {
errs = ipMasking.IpV6.validate(errs, Ipv6Bits, "ipv6")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to handle the case of ipMasking being nil.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically ipMasking is never nil here. Should I modify it to a non-pointer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please. Let's add some defensive coding since nil is allowed here. Alternatively, you may be able to change this to a value type receiver.

privacy/enforcement.go Outdated Show resolved Hide resolved
@@ -76,14 +77,18 @@ type Scrubber interface {
ScrubUser(user *openrtb2.User, strategy ScrubStrategyUser, geo ScrubStrategyGeo) *openrtb2.User
}

type scrubber struct{}
type scrubber struct {
ipMasking *config.IpMasking
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be passed as a pointer?

Copy link
Contributor Author

@VeronikaSolovei9 VeronikaSolovei9 Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main idea here was to load these values to a non-pointer config struct, because we always need them. And then pass it as a pointer downstream to avoid copying the struct.
But I can see now why this was not a good idea. I replaced it to non-pointer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then pass it as a pointer downstream to avoid copying the struct.

That is't a surefire premature optimization. Go has optimizations for passing arguments as values which may be much more efficient than using a pointer. It's a bit more reliable for large data structures like the BidRequest, but this struct is relatively small.

privacy/scrubber.go Outdated Show resolved Hide resolved
privacy/scrubber.go Outdated Show resolved Hide resolved
privacy/scrubber.go Outdated Show resolved Hide resolved
config/account.go Outdated Show resolved Hide resolved
config/account.go Outdated Show resolved Hide resolved
config/account.go Show resolved Hide resolved
@@ -103,6 +103,11 @@ func GetAccount(ctx context.Context, cfg *config.Configuration, fetcher stored_r
return nil, errs
}

errs = account.Privacy.IPMasking.Validate(errs)
if len(errs) > 0 {
return nil, errs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reasonable way to add a test that throws this error? Or is it challenging with this Validate function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, added!

@@ -910,3 +910,61 @@ func TestAccountPriceFloorsValidate(t *testing.T) {
})
}
}

func TestIPMaskingValidate(t *testing.T) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Could you remove this blank line?

@@ -40,7 +45,7 @@ type Account struct {
Validations Validations `mapstructure:"validations" json:"validations"`
DefaultBidLimit int `mapstructure:"default_bid_limit" json:"default_bid_limit"`
BidAdjustments *openrtb_ext.ExtRequestPrebidBidAdjustments `mapstructure:"bidadjustments" json:"bidadjustments"`
Privacy *AccountPrivacy `mapstructure:"privacy" json:"privacy"`
Privacy AccountPrivacy `mapstructure:"privacy" json:"privacy"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding, why is this converted to remove its pointer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to non-pointer because we no have a new IPMasking field inside of the Privacy:

type AccountPrivacy struct {
	AllowActivities *AllowActivities `mapstructure:"allowactivities" json:"allowactivities"`
	IPMasking       IPMasking        `mapstructure:"ip_masking" json:"ip_masking"` 
}

We need to have IPMasking to be always present and to be populated with default values from configs.
PBS host account_defaults configs and/or request.account config can overwrite these values.

This is why config.config.go has these lines:

v.SetDefault("account_defaults.privacy.ip_masking.ipv6.left_mask_bits", 56)
v.SetDefault("account_defaults.privacy.ip_masking.ipv4.left_mask_bits", 24)

Copy link
Contributor

@AlexBVolcy AlexBVolcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just left one little nitpick

assert.Equal(t, test.expected, result, test.description)
})
}
}

func TestScrubIPV4(t *testing.T) {
func TestScrubIp(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Could you capitalize the Ip here so it looks like TestScrubIP

@VeronikaSolovei9 VeronikaSolovei9 changed the title IPv6 scrubber IP scrubber Aug 8, 2023
AlexBVolcy
AlexBVolcy previously approved these changes Aug 8, 2023
Copy link
Contributor

@AlexBVolcy AlexBVolcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@AlexBVolcy AlexBVolcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left one comment. Could you also resolve the merge conflicts?

Comment on lines 310 to 311
// For the follow up PR, always keep first 64 bits
// AlwaysKeepBits int `mapstructure:"always-keep-bits" json:"always-keep-bits"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to confirm that we want to keep this commented code out here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO better to remove the comment and add as a note to the associated issue.

# Conflicts:
#	exchange/utils.go
#	privacy/enforcer.go
config/account.go Outdated Show resolved Hide resolved
Comment on lines 310 to 311
// For the follow up PR, always keep first 64 bits
// AlwaysKeepBits int `mapstructure:"always-keep-bits" json:"always-keep-bits"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO better to remove the comment and add as a note to the associated issue.

AnonKeepBits int `mapstructure:"anon-keep-bits" json:"anon-keep-bits"`
}

func (IPv6Config *IPv6) Validate(errs []error) []error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variables names should start with lower case characters. It's common to use a single character for a function receiver reference. Consider renaming to simply ip?

}

func (IPv6Config *IPv6) Validate(errs []error) []error {
if IPv6Config.AnonKeepBits > IPv6BitSize || IPv6Config.AnonKeepBits < 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a pointer function receiver, you need to handle the nil case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IPv6 and IPv4 are not pointers in AccountPrivacy:

type AccountPrivacy struct {
	AllowActivities *AllowActivities `mapstructure:"allowactivities" json:"allowactivities"`
	IPv6Config      IPv6             `mapstructure:"ipv6" json:"ipv6"`
	IPv4Config      IPv4             `mapstructure:"ipv4" json:"ipv4"`
}

technically they will be always initialized. Should I add a nil check anyways?

config/account_test.go Show resolved Hide resolved
config/account.go Outdated Show resolved Hide resolved
@@ -38,12 +39,12 @@ func TestScrubDevice(t *testing.T) {
MACMD5: "",
IFA: "",
IP: "1.2.3.0",
IPv6: "2001:0db8:0000:0000:0000:ff00:0:0",
IPv6: "2001:db8:2233:4400::",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this segment still be 0db8?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case 0db8 and db8 are identical. Because we use net.ParseIP(ip) and then ipMasked.String() returns it in this form.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.ciscopress.com/articles/article.asp?p=2803866 -> paragraph "Rule 1: Omit Leading 0s"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Maybe just add a the 0 to keep this straight forward?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooof, I see. This value is returned by the go net.IP functionality. In order to make this change we will need to compare net.IPs, not strings. This will require some assertions refactoring. We now assert the whole expected: &openrtb2.Device with the result.

We will need to change this(current implementation):

assert.Equal(t, test.expected, result, test.description)

to something like this:

assert.Equal(t, net.ParseIP(test.expected.IP).String(), net.ParseIP(result.IP).String(), test.description)
assert.Equal(t, net.ParseIP(test.expected.IPv6).String(), net.ParseIP(result.IPv6).String(), test.description)
assert.Equal(t, test.expected.DIDMD5, result.DIDMD5, test.description)
assert.Equal(t, test.expected.IFA, result.IFA, test.description)
...

Are we good to go with this format?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we just change 0 to 1? My only intention here was to keep the test straight forward and not confuse the reader with the additional dimension of leading 0 optimization.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we change to 0 to a 1 to remove this issue entirely?

account/account.go Show resolved Hide resolved
@@ -199,14 +203,12 @@ func (scrubber) ScrubDevice(device *openrtb2.Device, id ScrubStrategyDeviceID, i

switch ipv4 {
case ScrubStrategyIPV4Lowest8:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we still need ScrubStrategyIPV4Lowest8?

Copy link
Contributor Author

@VeronikaSolovei9 VeronikaSolovei9 Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this to indicate we want to scrub IPv4. There are just 2 options:

const (
	// ScrubStrategyIPV4None does not remove any part of an IPV4 address.
	ScrubStrategyIPV4None ScrubStrategyIPV4 = iota

	// ScrubStrategyIPV4Lowest8 zeroes out the last 8 bits of an IPV4 address.
	ScrubStrategyIPV4Lowest8
)

I think we should rename it to something like ScrubStrategyIPV4Default or better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok. Yes, renaming would be good. Perhaps ScrubStrategyIPV4Subnet to match IPv6 naming?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right!

ipv6Errs = account.Privacy.IPv6Config.Validate(ipv6Errs)
if len(ipv6Errs) > 0 {
account.Privacy.IPv6Config.AnonKeepBits = iputil.IPv6BitSize
// record invalid ipv6 metric
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you commented asking if this can be done in a separate PR, if so can this comment be removed?

ipv4Errs = account.Privacy.IPv4Config.Validate(ipv4Errs)
if len(ipv4Errs) > 0 {
account.Privacy.IPv4Config.AnonKeepBits = iputil.IPv4BitSize
// record invalid ipv4 metric
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same point about removing metric related comment

Copy link
Contributor

@SyntaxNode SyntaxNode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great. Left a few small nitpicks.

ipv4Errs = account.Privacy.IPv4Config.Validate(ipv4Errs)
if len(ipv4Errs) > 0 {
account.Privacy.IPv4Config.AnonKeepBits = iputil.IPv4BitSize
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Could simplify the syntax...

if errs := account.Privacy.IPv6Config.Validate(nil); len(errs) > 0 {
	account.Privacy.IPv6Config.AnonKeepBits = iputil.IPv6BitSize
}

if errs := account.Privacy.IPv4Config.Validate(nil); len(errs) > 0 {
	account.Privacy.IPv4Config.AnonKeepBits = iputil.IPv4BitSize
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines show as having test coverage, but there doesn't seem to an assertion on setting their default values. I made a type locally where IPv4Config.AnonKeepBits = iputil.IPv6BitSize which is setting the ipv6 value for the ipv4 default and this mistake wasn't caught by a test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is tested in a general GetAccount test.
I added extra flag to assert IP config for account id "invalid_acct_ipv6_ipv4".
I also realized I set both ipv6 and ipv4 to max value, instead of default mask bits. I added 2 new constants:

IPv4DefaultMaskingBitSize = 24
IPv6DefaultMaskingBitSize = 56

For the simplified syntax I added new variables for errors to show explicitly we don't want to add IP config validation errors to errs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also tested this update end to end including test with publisher account id.


type IPv4 struct {
AnonKeepBits int `mapstructure:"anon-keep-bits" json:"anon-keep-bits"`
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Let's use underscores to be consistent with the rest of this object: anon-keep-bits -> anon_keep_bits

@@ -38,12 +39,12 @@ func TestScrubDevice(t *testing.T) {
MACMD5: "",
IFA: "",
IP: "1.2.3.0",
IPv6: "2001:0db8:0000:0000:0000:ff00:0:0",
IPv6: "2001:db8:2233:4400::",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we change to 0 to a 1 to remove this issue entirely?

Copy link
Contributor

@AlexBVolcy AlexBVolcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hhhjort hhhjort merged commit 89f79e3 into master Sep 5, 2023
3 checks passed
@SyntaxNode SyntaxNode deleted the anonymizeIpv6 branch October 2, 2023 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants