Docs: add a triage/fix scenario to the wiki for when a KRM service (such as enabling monitoring in client-landing-zone) times out intermittently during reconcile and requires re-running kpt apply so the dependency tree can continue #865

Open
fmichaelobrien opened this issue Feb 29, 2024 · 0 comments
Assignees
Labels
automation developer-experience documentation Improvements or additions to documentation

fmichaelobrien commented Feb 29, 2024

Both a customer and I ran into this one, which requires an out-of-band fix. It happens periodically (on only one of my 2 recent orgs).
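The re-apply workaround named in the issue title can be sketched as a bounded retry loop. This is a minimal sketch, not the toolkit's documented procedure: the helper name `retry_apply`, the attempt count, and the sleep interval are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kpt or the toolkit):
# re-run a command until it succeeds or the attempts run out.
# Usage: retry_apply <max_attempts> <sleep_seconds> <command> [args...]
retry_apply() {
  local max_attempts="$1" sleep_seconds="$2"
  shift 2
  local attempt=1
  while true; do
    if "$@"; then
      echo "succeeded on attempt ${attempt}"
      return 0
    fi
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "${sleep_seconds}"
  done
}
```

A possible invocation, assuming the package lives in a directory named `core-landing-zone`, would be `retry_apply 3 60 kpt live apply core-landing-zone --reconcile-timeout=15m`. Bounding the attempts keeps a genuine permission failure from looping forever.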

Example of a working project where the permission works:

michael@cloudshell:~ (client-project-cso3)$ gcloud services list --enabled | grep NAME
NAME: cloudbilling.googleapis.com
NAME: cloudresourcemanager.googleapis.com
NAME: compute.googleapis.com
NAME: iam.googleapis.com
NAME: iamcredentials.googleapis.com
NAME: logging.googleapis.com
NAME: monitoring.googleapis.com
NAME: oslogin.googleapis.com
NAME: serviceusage.googleapis.com
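The same check can be scripted so it returns a pass/fail status instead of requiring a visual scan of the list. A minimal sketch, assuming `gcloud` is authenticated; the helper name `service_enabled` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if SERVICE is enabled in PROJECT.
# Usage: service_enabled <project_id> <service_name>
service_enabled() {
  local project="$1" service="$2"
  # value(config.name) prints one bare service name per line;
  # grep -x matches the whole line, -F treats the name literally.
  gcloud services list --enabled --project "${project}" \
    --format="value(config.name)" | grep -qxF "${service}"
}
```

For the project above this might be called as `service_enabled client-project-cso3 monitoring.googleapis.com && echo enabled`.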

See also #801 and #807.

One org, obrien.industries, is working with the log sinks.
The other, newer org, cloud-setup, is not.

The issue is likely missing IAM permissions on the clean account cloud-setup.org, whereas the older org obrien.industries (below), which even had an older hub-env, is OK.

Update: same issue on the 2nd org. It looks like logging-sa needs roles/storage.admin.
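An out-of-band grant of that role could look like the sketch below. Several things here are assumptions: the issue does not say at which level (project, folder, or org) the role is needed, so binding it on the logging project is a guess; the helper name `grant_storage_admin` is hypothetical; and the service account email must be supplied by the caller because it is redacted in this issue.

```shell
#!/usr/bin/env bash
# Hypothetical helper: bind roles/storage.admin to a service account
# on a project. Project-level scope is an assumption, not confirmed above.
# Usage: grant_storage_admin <project_id> <service_account_email>
grant_storage_admin() {
  local project="$1" sa_email="$2"
  gcloud projects add-iam-policy-binding "${project}" \
    --member="serviceAccount:${sa_email}" \
    --role="roles/storage.admin"
}
```

With the setters shown later in this issue, a call might be `grant_storage_admin logging-project-oi0130 "${LOGGING_SA_EMAIL}"`, where `LOGGING_SA_EMAIL` is a placeholder variable for the logging-sa address.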

https://github.com/GoogleCloudPlatform/pubsec-declarative-toolkit/blob/main/solutions/core-landing-zone/lz-folder/audits/logging-project/cloud-storage-buckets.yaml#L20
missing permissions that are already set on
https://github.com/GoogleCloudPlatform/pubsec-declarative-toolkit/blob/main/solutions/core-landing-zone/namespaces/logging.yaml#L82

[Screenshot: 2024-01-31 at 13:41:34]

Both have logging-sa as Logging Admin at the org level:

[email protected] | logging-sa | Logging Admin
-- | -- | --

and as Monitoring Admin at the kcc project level:

[email protected] | logging-sa | Logging Admin, Monitoring Admin
-- | -- | --
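The role assignments in the tables above can also be read from the command line, which makes it easier to compare the two orgs. A minimal sketch, assuming project-level bindings; the helper names `project_roles_for` and `has_project_role` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the roles bound to MEMBER in a project's IAM policy.
# Usage: project_roles_for <project_id> <member>
#   (member is e.g. serviceAccount:<email>)
project_roles_for() {
  local project="$1" member="$2"
  # Flatten to one row per (role, member) pair, keep rows for this member,
  # and print only the role name.
  gcloud projects get-iam-policy "${project}" \
    --flatten="bindings[].members" \
    --filter="bindings.members:${member}" \
    --format="value(bindings.role)"
}

# Succeed only if MEMBER holds ROLE on PROJECT.
# Usage: has_project_role <project_id> <member> <role>
has_project_role() {
  project_roles_for "$1" "$2" | grep -qxF "$3"
}
```

For example, `has_project_role logging-project-oi0130 "serviceAccount:${LOGGING_SA_EMAIL}" roles/storage.admin || echo missing`, with `LOGGING_SA_EMAIL` as a placeholder. For the org-level binding, `gcloud organizations get-iam-policy` accepts the same flatten, filter, and format flags.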

setters.yaml

apiVersion: v1
kind: ConfigMap
metadata: # kpt-merge: /setters
  name: setters
  annotations:
    config.kubernetes.io/local-config: "true"
    internal.kpt.dev/upstream-identifier: '|ConfigMap|default|setters'
data:
  org-id: "45..2144"
  lz-folder-id: "388..43"
  billing-id: "014...F85"
  management-project-id: "kcc-oi-7970"
  management-project-number: "729..84"
  management-namespace: config-control
  allowed-trusted-image-projects: |
    - "projects/cos-cloud"
  allowed-contact-domains: |
    - "@obrien.industries"
  allowed-policy-domain-members: |
    - "C03kdhrkc"
  allowed-vpc-peering: |
    - "under:organizations/459...44"
  logging-project-id: logging-project-oi0130
  security-log-bucket: security-log-bucket-oi0130
  platform-and-component-log-bucket: platform-and-component-log-bucket-oi0130
  retention-locking-policy: "false"
  retention-in-days: "1"
  dns-project-id: dns-project-oi0130
  dns-name: "obrien.industries."

Single-service IAM issue:

michael@cloudshell:~/kcc-oi-20231206/kpt (kcc-oi-7970)$ kubectl get gcp -n logging
NAME                                                                                      AGE   READY   STATUS     STATUS AGE
logginglogbucket.logging.cnrm.cloud.google.com/platform-and-component-log-bucket-oi0130   17h   True    UpToDate   17h
logginglogbucket.logging.cnrm.cloud.google.com/security-log-bucket                        17h   True    UpToDate   17h

NAME                                                                                                AGE   READY   STATUS     STATUS AGE
logginglogsink.logging.cnrm.cloud.google.com/logging-project-oi0130-data-access-sink                17h   True    UpToDate   17h
logginglogsink.logging.cnrm.cloud.google.com/mgmt-project-cluster-platform-and-component-log-sink   17h   True    UpToDate   17h
logginglogsink.logging.cnrm.cloud.google.com/org-log-sink-data-access-logging-project-oi0130        17h   True    UpToDate   17h
logginglogsink.logging.cnrm.cloud.google.com/org-log-sink-security-logging-project-oi0130           17h   True    UpToDate   17h
logginglogsink.logging.cnrm.cloud.google.com/platform-and-component-services-infra-log-sink         17h   True    UpToDate   17h
logginglogsink.logging.cnrm.cloud.google.com/platform-and-component-services-log-sink               17h   True    UpToDate   17h

NAME                                                                      AGE   READY   STATUS     STATUS AGE
monitoringmonitoredproject.monitoring.cnrm.cloud.google.com/kcc-oi-7970   17h   True    UpToDate   17h

NAME                                                                       AGE   READY   STATUS         STATUS AGE
storagebucket.storage.cnrm.cloud.google.com/security-incident-log-bucket   17h   False   UpdateFailed   17h
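On a larger namespace the one failing row is easy to miss. A small filter over the listing above, as a sketch; the helper name `not_ready` is hypothetical, and it assumes the column layout shown here (READY in the third column, STATUS in the fourth).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get gcp` output on stdin and print
# "<resource> <status>" for every row whose READY column is not "True".
# Header rows and blank separator lines are skipped.
not_ready() {
  awk 'NF >= 4 && $1 != "NAME" && $3 != "True" { print $1, $4 }'
}
```

Usage would be `kubectl get gcp -n logging | not_ready`.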


michael@cloudshell:~/kcc-oi-20231206/kpt (kcc-oi-7970)$ kubectl describe storagebucket.storage.cnrm.cloud.google.com/security-incident-log-bucket -n logging
Name:         security-incident-log-bucket
Namespace:    logging
Labels:       <none>
Annotations:  cnrm.cloud.google.com/blueprint: kpt-pkg-fn-live
              cnrm.cloud.google.com/management-conflict-prevention-policy: none
              cnrm.cloud.google.com/project-id: logging-project-oi0130
              cnrm.cloud.google.com/state-into-spec: merge
              config.k8s.io/owning-inventory: 8bee7142b357086a1a649139f252ed0f59791b0e-1706652642396373649
              config.kubernetes.io/depends-on: resourcemanager.cnrm.cloud.google.com/namespaces/projects/Project/logging-project-oi0130
              internal.kpt.dev/upstream-identifier: storage.cnrm.cloud.google.com|StorageBucket|logging|security-incident-log-bucket
API Version:  storage.cnrm.cloud.google.com/v1beta1
Kind:         StorageBucket
Metadata:
  Creation Timestamp:  2024-01-30T22:17:21Z
  Generation:          1
  Resource Version:    45866
  UID:                 72c94615-c235-4c08-beda-7bb823eaea08
Spec:
  Autoclass:
    Enabled:                 true
  Location:                  northamerica-northeast1
  Public Access Prevention:  enforced
  Retention Policy:
    Is Locked:                  false
    Retention Period:           86400
  Uniform Bucket Level Access:  true
Status:
  Conditions:
    Last Transition Time:  2024-01-30T22:17:21Z
    Message:               Update call failed: error fetching live state: error reading underlying resource: summary: Error when reading or editing Storage Bucket "security-incident-log-bucket": googleapi: Error 403: [email protected] does not have storage.buckets.get access to the Google Cloud Storage bucket. Permission 'storage.buckets.get' denied on resource (or it may not exist)., forbidden
    Reason:                UpdateFailed
    Status:                False
    Type:                  Ready
  Observed Generation:     1
Events:
  Type     Reason        Age                    From                      Message
  ----     ------        ----                   ----                      -------
  Warning  UpdateFailed  4m17s (x531 over 17h)  storagebucket-controller  Update call failed: error fetching live state: error reading underlying resource: summary: Error when reading or editing Storage Bucket "security-incident-log-bucket": googleapi: Error 403: [email protected] does not have storage.buckets.get access to the Google Cloud Storage bucket. Permission 'storage.buckets.get' denied on resource (or it may not exist)., forbidden
michael@cloudshell:~/kcc-oi-20231206/kpt (kcc-oi-7970)$ 
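For triage notes it can help to pull out just the Ready condition message rather than the full describe output. A minimal sketch; the helper name `ready_message` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the message of a resource's Ready condition.
# Usage: ready_message <namespace> <kind/name>
ready_message() {
  local namespace="$1" resource="$2"
  # The JSONPath filter selects the condition whose type is "Ready".
  kubectl get "${resource}" -n "${namespace}" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}'
}
```

For the bucket above: `ready_message logging storagebucket.storage.cnrm.cloud.google.com/security-incident-log-bucket`. After a permission fix, `kubectl wait --for=condition=Ready storagebucket.storage.cnrm.cloud.google.com/security-incident-log-bucket -n logging --timeout=300s` is one way to block until the controller reconciles; the 300s timeout is an arbitrary choice.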

@fmichaelobrien fmichaelobrien added documentation Improvements or additions to documentation developer-experience automation labels Feb 29, 2024
@fmichaelobrien fmichaelobrien self-assigned this Feb 29, 2024