Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a command to explicitly purge the metastore #59

Merged
merged 15 commits into from
Aug 18, 2024

Conversation

eric-maynard
Copy link
Contributor

@eric-maynard eric-maynard commented Aug 1, 2024

Description

This adds a new command, purge, that must be run in order to purge entities from the metastore.

Additionally, bootstrapping will now fail if entities exist in the metastore instead of purging them.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Previous behavior:

java -jar .app/polaris-service-1.0.0-all.jar bootstrap polaris-server.yml
. . .
(Succeeds)

New behavior:

java -jar .app/polaris-service-1.0.0-all.jar bootstrap polaris-server.yml
. . .
It appears this metastore manager has already been bootstrapped...

New behavior + purge:

java -jar .app/polaris-service-1.0.0-all.jar purge polaris-server.yml
. . .
(Succeeds)
. . .
java -jar .app/polaris-service-1.0.0-all.jar bootstrap polaris-server.yml
(Succeeds)

Checklist:

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes

@eric-maynard eric-maynard requested a review from a team as a code owner August 1, 2024 17:17
Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on the fence with some --overwrite flag that purges (some) state.

bootstrap is a one-off operation that, I think, should fail when the system's already bootstrapped.

I think, it's better to have an explicit command or tool to purge / reset to avoid accidentally resetting principals. WDYT?

@eric-maynard
Copy link
Contributor Author

Thanks for taking a look @snazy -- to be clear, the current behavior is that bootstrap will silently purge the state. This PR proposes a flag you have to set in order to confirm that you really want to do this. And without that flag bootstrap will fail when the system is already bootstrapped.

@snazy
Copy link
Member

snazy commented Aug 1, 2024

Oh - right - that "bark at you" code part is new. Can we add the "barking" part and have a "purge root-principals" command or so?

@snazy
Copy link
Member

snazy commented Aug 1, 2024

I feel that's a bit safer - humans tend to be lazy (and forget that they have such --overwrite flags set) and wonder why things break.

// While bootstrapping we need to act as a fake privileged context since the real
// CallContext hasn't even been resolved yet.
PolarisCallContext polarisContext =
new PolarisCallContext(
sessionSupplierMap.get(realmContext.getRealmIdentifier()).get(), diagServices);
CallContext.setCurrentContext(CallContext.of(realmContext, polarisContext));

if (!overwrite) {
PolarisMetaStoreManager.EntityResult preliminaryRootPrincipalLookup =
metaStoreManager.readEntityByName(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be version safe in the future? For now I assume this is safe but in the future could readEntityByName fail for a Polaris catalog bootstrapped from an earlier version?

This seems like a good check to have for now, I'm just more worried about what happens if this call fails because the current version of Polaris just doesn't understand how to look up the Principal in the past.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an important concern and it's why we don't rely on a check like this to auto-bootstrap... but it seems to me that this check is much less critical. That scenario would be very bad for other reasons, but here you would only be losing the protection of the --overwrite flag.

RussellSpitzer
RussellSpitzer previously approved these changes Aug 5, 2024
@RussellSpitzer
Copy link
Member

@snazy I think we probably can't safely always validate (or require that an implementation can validate) that existing information is in the destination. I like this though because it is at least safer than the current behavior.

@snazy
Copy link
Member

snazy commented Aug 5, 2024

I think we probably can't safely always validate (or require that an implementation can validate) that existing information is in the destination. I like this though because it is at least safer than the current behavior.

Sure, having this check is definitely a good thing. What I'm thinking of is:

  • not having an --overwrite flag, but let bootstrap always fail with that check but
  • have a new command to explicitly purge the principal

Thinking of humans leaving the --overwrite in for all deployments - (and complain that authn no longer works after a restart).

@eric-maynard
Copy link
Contributor Author

I don't think we want to break bootstrap into purge + bootstrap. Not all metastore managers will necessarily purge on bootstrap. We want to just gate bootstrapping behind an additional flag on a best-effort basis in the case that we detect bootstrapping may have already been performed.

--overwrite doesn't appear in the docs and only appears in this error message, so for now it's hard to imagine someone always leaving --overwrite on their bootstrap command without knowing what it does. And if we can imagine such a state, it's not such a stretch to also imagine that they leave the purge command in there too.

@collado-mike
Copy link
Contributor

collado-mike commented Aug 5, 2024

IDK, I like the option to purge because I feel it is very explicit and we should expect the implementation to be thorough. bootstrap is necessarily an unauthenticated command, whereas the purge operation ought to require root credentials (there's a question of how does someone bootstrap if they lost their root credentials, at which point I think they should require direct database access).

If bootstrap --overwrite merely focuses on replacing the root principal/secrets, there's a risk that someone could bootstrap an existing installation but not delete the pre-existing data. In my mind, purge means delete all the things, so purge then bootstrap should result in the expected behavior.

@eric-maynard eric-maynard changed the title Add a flag to verify existing entities should be purged by bootstrapping Add a command to explicitly purge the metastore Aug 6, 2024
@eric-maynard
Copy link
Contributor Author

Thanks for taking a look @collado-mike & @snazy -- I have implemented a purge command based on my understanding of the discussion above. Please take a look at let me know if this matches what you had in mind. Thanks!

@eric-maynard eric-maynard requested a review from snazy August 6, 2024 01:57
Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit, but overall LGTM.
Deferring review to @collado-mike, because this related code base a bit better ;)

polaris-server.yml Outdated Show resolved Hide resolved
@collado-mike collado-mike enabled auto-merge (squash) August 9, 2024 22:39
auto-merge was automatically disabled August 13, 2024 20:54

Head branch was pushed to by a user without write access

@MonkeyCanCode
Copy link
Contributor

@flyrain can we have this merge if not other blockers so the bootstrap won't wipe the backend if accidently got ran more than once. Then I can test if anything additional will be needed on the helm side as well to handle those.

Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor comments.

Comment on lines +64 to +65
Map<String, PolarisMetaStoreManager.PrincipalSecretsResult> results =
metaStoreManagerFactory.bootstrapRealms(configuration.getDefaultRealms());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: we can use var to simplify a bit.

Comment on lines +69 to +70
for (Map.Entry<String, PolarisMetaStoreManager.PrincipalSecretsResult> result :
results.entrySet()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We can use var to simplify and keep them in one line

@eric-maynard
Copy link
Contributor Author

eric-maynard commented Aug 17, 2024 via email

@flyrain
Copy link
Contributor

flyrain commented Aug 17, 2024

I think it's fine to use var as Polaris requires jdk 11 for core and jdk21 for others. var was introduced in jdk 10. To be clear, they are not blockers. I will merge it tonight if no more comments.

@flyrain flyrain merged commit d6278e2 into apache:main Aug 18, 2024
3 checks passed
@flyrain
Copy link
Contributor

flyrain commented Aug 18, 2024

Thanks @eric-maynard for the change. Thanks @collado-mike @snazy @RussellSpitzer @MonkeyCanCode for the review.

@flyrain flyrain mentioned this pull request Aug 18, 2024
2 tasks
@MonkeyCanCode MonkeyCanCode mentioned this pull request Aug 18, 2024
13 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants