Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CA-398341: Populate fingerprints of CA certificates on startup #6006

Merged
merged 2 commits into from
Oct 22, 2024

Conversation

snwoods
Copy link
Contributor

@snwoods snwoods commented Sep 18, 2024

Also CP-51527: Add --force option to pool-uninstall-ca-certificate.

Addresses the issues raised here by @stormi #5955

@@ -378,8 +378,10 @@ let rec cmdtable_data : (string * cmd_spec) list =
; ( "pool-certificate-uninstall"
, {
reqd= ["name"]
; optn= []
; help= "Uninstall a pool-wide TLS CA certificate."
; optn= ["force"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pool-certificate-uninstall is deprecated, so should I be updating it or leaving it as-is? My initial instinct was the latter, but they both use the same underlying function Cli_operations.pool_uninstall_ca_certificate and so the --force option functionality is included in pool-certificate-uninstall regardless.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please leave it as-is, same as the api call

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed force from both this and certificate_uninstall in datamodel_pool.ml, but when testing pool-certificate-uninstall, force still works. This is expected because as I say, certificate-uninstall is using Cli_operations.pool_uninstall_ca_certificate, the same as pool-uninstall-ca-certificate. Is this what you intended? It adds functionality to the deprecated function but doesn't change the documentation.

(datamodel_pool.certificate_uninstall is completed unused AFAICT)

; optn= ["force"]
; help= "Uninstall a pool-wide TLS CA certificate. The optional parameter \
'--force' will remove the DB entry even if the certificate file is \
non-existent"
; implementation= No_fd Cli_operations.pool_uninstall_ca_certificate
; flags= [Deprecated ["Use pool-uninstall-ca-certificate"]]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this Deprecated flag actually do? It doesn't show when you do xe help pool-certificate-uninstall and it didn't show when I ran the command itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably because there's very little autogenerated code. The cli server should let the client print a message, probably. (will it break the internal regression tests, though?)

@@ -1397,9 +1397,9 @@ let certificate_install ~__context ~name ~cert =

let install_ca_certificate = certificate_install

let certificate_uninstall ~__context ~name =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand how the legacy names work. Why is the legacy name "certificate_uninstall" still needed (here and in message_forwarding.ml and datamodel_pool.ml if the deprecated xe command doesn't even use it? It calls pool_install_ca_certificate which calls uninstall_ca_certificate, not certificate_uninstall. What is it providing compatibility with, is this for upgrades from much older xapis?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API clients might still use this function, this is why it needs to be kept.

( Eq (Field "fingerprint_sha256", Literal "")
, Eq (Field "fingerprint_sha1", Literal "")
)
, Eq (Field "type", Literal "ca")
Copy link
Contributor

@gangj gangj Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I know do we also need to upgrade for the "host" certificate, thank you~

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we do, but I added this already in certificates_sync.sync (https://github.com/xapi-project/xen-api/blob/master/ocaml/xapi/certificates_sync.ml#L96-L104). This PR is because I originally missed updating the user/ca certificates too 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see, thank you!
Then can we also cover the "host" certificate upgrade here and then we don't need the same functionality in two places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wondered about that, but certificates_sync.sync already existed (it does more than just update the fingerprint_sha* fields) so it makes sense for host certificates to be updated there. However, it currently only updates host certs. Would it make sense to expand certificates_sync.sync to include user certificates too so that we have it in one place? What do you think @psafont ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it makes sense as it is, currently. It's used for host certificates, exclusively, when the before the stunnel component gets started.

These certificates gets used by the client part, so the update needs to happen at a time that's independent (or risk introducing races) Maybe we can consolidate some of the modules, because the certificate handling is a bit disperse now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, leave it as it is in this PR for now but with the potential to consolidate the certificate handling code in the future?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right

let sha256 = pp_fingerprint ~hash_type:`SHA256 certificate in
Ok (sha1, sha256)
with Unix.Unix_error (Unix.ENOENT, _, _) ->
Error (`Msg (Printf.sprintf "filename %s does not exist" filename))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add another catch for any exception? This is out of cautiosness. Because the function is called on startup, it cannot raise any exception.

  | exn ->
      Error (`Msg (Printexc.to_string exn))

(String, "name", "The certificate name")
; ( Bool
, "force"
, "If true, remove the DB entry even if the file is non-existent"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call is deprecated, please don't modify it. (we have to change the lifecycle, but the last entry is the deprecation!)

; ( Bool
, "force"
, "If true, remove the DB entry even if the file is non-existent"
)
]
~allowed_roles:_R_LOCAL_ROOT_ONLY
~lifecycle:[(Published, "1.290.0", "Uninstall TLS CA certificate")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an entry to the lifecycle stating it has been modified

, "force"
, "If true, remove the DB entry even if the file is non-existent"
)
]
~allowed_roles:(_R_POOL_OP ++ _R_CLIENT_CERT)
~lifecycle:[(Published, "1.290.0", "Uninstall TLS CA certificate")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an entry to the lifecycle, as in the other call

ocaml/xapi/certificates.ml Outdated Show resolved Hide resolved
ocaml/xapi/certificates.ml Outdated Show resolved Hide resolved
[
(Published, "1.290.0", "Uninstall TLS CA certificate")
; ( Changed
, "24.29.0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@psafont Is this the correct thing to do? This is currently the correct version but it could change before release. Does this just need to be updated again before release or is there some way to auto-fill it? This only happens when you have lifecycle:[] right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These versions are painful to encode because they cannot be calculated atomatically, like the case of lyfecycle:[]. Right now, this should be 24.31.0 (because 24.30.0 has been released)

@@ -1147,6 +1147,10 @@ let server_init () =
, []
, fun () -> report_tls_verification ~__context
)
; ( "Update shared certificate's metadata"
, [Startup.OnlyMaster]
, fun () -> Certificates.Db_util.upgrade_ca_fingerprints ~__context
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a one-off upgrade case? That is, once you have the xapi version with this change in it and it has run once, you'll never need it anymore? Then it would better fit in xapi_db_upgrade.ml (bump the schema version and add a < rule).

(Ref _host, "host", "The host"); (String, "name", "The certificate name")
(Ref _host, "host", "The host")
; (String, "name", "The certificate name")
; ( Bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would break existing clients, because the new param does not have a default. Use ~versioned_params for that.

Copy link
Member

@robhoes robhoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

@lindig
Copy link
Contributor

lindig commented Oct 4, 2024

Can we pick this up again? @snwoods

@snwoods snwoods force-pushed the private/stevenwo/CA-398341 branch 2 times, most recently from 69cfd55 to a210e11 Compare October 8, 2024 11:07
ocaml/idl/datamodel_host.ml Outdated Show resolved Hide resolved
let upgrade_ca_fingerprints =
{
description= "Upgrade the fingerprint fields for ca certificates"
; version= (fun x -> x < (5, 779))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 779? You have used 783 in the datamodel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah 779 was when the fingerprint_sha256 and fingerprint_sha1 fields were added, so I was thinking anything from that version should already have those fields populated, but of course if they'd previously upgraded from <779 they wouldn't have it, sorry.

@@ -904,6 +904,66 @@ let upgrade_update_guidance =
)
}

let upgrade_ca_fingerprints =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is dead code until you add the new function to the rules list below.

Copy link
Member

@psafont psafont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This contains an unrelated commit from #6045

let's wait until that PR is merged

@snwoods
Copy link
Contributor Author

snwoods commented Oct 8, 2024

Tested on a toolstack-next host and it worked as expected (updated the fingerprints when updating to this code, but didn't run the upgrade code from then on)

@psafont
Copy link
Member

psafont commented Oct 9, 2024

Let's wait until the metrics fix is in to merge this #6043

benjamreis added a commit to xcp-ng-rpms/xapi that referenced this pull request Oct 10, 2024
See: xapi-project/xen-api#6006

`sdn-controller-ca.pem` could have been lost in a previous upgrade,
failing reading it would prevent XAPI from starting.

Signed-off-by: Benjamin Reis <[email protected]>
@lindig
Copy link
Contributor

lindig commented Oct 21, 2024

I believe @psafont the PR you referenced was merged; so we can merge this now?

@psafont
Copy link
Member

psafont commented Oct 21, 2024

The versions in the datamodel need to be updated to 24.35.0, otherwise it's good to go

psafont and others added 2 commits October 21, 2024 16:57
SHA256 and SHA1 certificates' fingerprints do not get populated when the
database is upgraded, so empty values need to be detected and amended on
startup.

Signed-off-by: Pau Ruiz Safont <[email protected]>
Signed-off-by: Steven Woods <[email protected]>
This allows the CA certificate to be removed from the DB even if the
certificate file does not exist.

Signed-off-by: Steven Woods <[email protected]>
@psafont psafont added this pull request to the merge queue Oct 22, 2024
Merged via the queue into xapi-project:master with commit 97aa03f Oct 22, 2024
15 checks passed
@snwoods snwoods deleted the private/stevenwo/CA-398341 branch November 29, 2024 11:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants