Purging backups using MedusaTask is not cleaning backups from Minio which contains SSTables with empty "bti-Rows.db" files #1318

Open
Arun-Trichy opened this issue May 22, 2024 · 0 comments
Labels
bug Something isn't working

Arun-Trichy commented May 22, 2024

What happened?
I created several backups using the MedusaBackupSchedule CRD, then created a MedusaTask with "operation: purge" to remove old backups based on the max_backup_age and max_backup_count settings.

Did you expect to see something different?
All old backups should be removed according to the max_backup_age and max_backup_count settings. Instead, some backups are still present in Minio, and every backup that remains contains a "bti-Rows.db" file of size 0.
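
To make the expectation concrete, here is a rough sketch of the retention rule as I understand it from the two settings. It is not Medusa's actual implementation. The function names are hypothetical, and both the assumption that a value of 0 disables a limit and the assumption that the trailing number in a scheduled backup name is an epoch timestamp in seconds are mine.

```python
from datetime import datetime, timedelta, timezone


def backup_epoch(name: str) -> int:
    """Return the trailing number of a scheduled backup name (assumed epoch seconds)."""
    return int(name.rsplit("-", 1)[1])


def backups_to_purge(names, now, max_backup_age_days, max_backup_count):
    """Split backup names into (keep, purge) under the assumed retention rule.

    Assumption: a limit of 0 means "no limit" for that setting.
    """
    newest_first = sorted(names, key=backup_epoch, reverse=True)
    keep = newest_first
    if max_backup_age_days > 0:
        # Drop everything older than the age cutoff.
        cutoff = now - timedelta(days=max_backup_age_days)
        keep = [
            n for n in keep
            if datetime.fromtimestamp(backup_epoch(n), tz=timezone.utc) >= cutoff
        ]
    if max_backup_count > 0:
        # Of what is left, keep only the newest max_backup_count backups.
        keep = keep[:max_backup_count]
    purge = [n for n in newest_first if n not in keep]
    return keep, purge


if __name__ == "__main__":
    # Backup names as they appear in the medusabackupjob listing of this issue.
    names = [
        f"medusa-{mode}-backup-2m-schedule-{ts}"
        for ts in range(1716292080, 1716293281, 120)
        for mode in ("differ", "full")
    ]
    now = datetime(2024, 5, 21, 12, 20, 58, tzinfo=timezone.utc)
    keep, purge = backups_to_purge(names, now, max_backup_age_days=1, max_backup_count=2)
    print("keep:", keep)
    print("purge:", len(purge), "backups")
```

With maxBackupAge: 1 and maxBackupCount: 2, this keeps only the two backups with the newest timestamp, which matches the two MedusaBackup objects that remain after the purge. The problem is that the files in Minio do not follow the same rule.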

How to reproduce it (as minimally and precisely as possible):
Create backups using k8ssandra-operator and make sure some SSTables contain empty (0-byte) files. Then run a MedusaTask to purge old backups and check the bucket contents in Minio.
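
One way to confirm the precondition before taking the backup is to scan a copy of the node's data directory for 0-byte files. This is a minimal sketch using only the Python standard library. The function name is hypothetical, the directory you pass in depends on your setup, and the default suffix "Rows.db" is chosen only because that is the file that shows up empty in this report.

```python
from pathlib import Path


def empty_sstable_components(data_dir, suffix="Rows.db"):
    """Return sorted paths under data_dir whose name ends with suffix and whose size is 0."""
    root = Path(data_dir)
    return sorted(
        p for p in root.rglob(f"*{suffix}")
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    import sys

    # Usage: python find_empty.py <path-to-data-directory>
    for path in empty_sstable_components(sys.argv[1]):
        print(path)
```

If this prints nothing, the node has no empty "Rows.db" files and the problem will probably not reproduce on that node.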

Environment
Minio is used to store the backups, set up following https://docs.k8ssandra.io/tasks/backup-restore/

kc get k8ssandracluster demo -n medusa -o yaml
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  annotations:
    k8ssandra.io/initial-system-replication: '{"dc1":3}'
  creationTimestamp: "2024-05-21T11:37:58Z"
  finalizers:
  - k8ssandracluster.k8ssandra.io/finalizer
  generation: 3
  name: demo
  namespace: medusa
  resourceVersion: "1440488"
  uid: 7d8fa01a-a543-41f6-b2ce-6f8a6d113b94
spec:
  auth: true
  cassandra:
    datacenters:
    - config:
        jvmOptions:
          gc: G1GC
          heapSize: 4096M
      metadata:
        name: dc1
      size: 3
      storageConfig:
        cassandraDataVolumeClaimSpec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 5Gi
          storageClassName: standard
    serverType: dse
    serverVersion: 6.8.41
    superuserSecretRef:
      name: demo-superuser
  medusa:
    storageProperties:
      backupGracePeriodInDays: 0
      bucketName: medusafull
      concurrentTransfers: 1
      host: minio.minio.svc.cluster.local
      maxBackupAge: 1
      maxBackupCount: 2
      multiPartUploadThreshold: 104857600
      port: 9000
      prefix: test
      secure: false
      storageProvider: s3_compatible
      storageSecretRef:
        name: medusa-bucket-key
      transferMaxBandwidth: 50MB/s
  secretsProvider: internal
status:...


kc get medusabackup
NAME                                          STARTED   FINISHED   NODES   FILES   SIZE        COMPLETED   STATUS
medusa-differ-backup-2m-schedule-1716293280   18h       18h        3       892     1.17 MB     3           SUCCESS
medusa-full-backup-2m-schedule-1716293280     18h       18h        3       836     914.94 KB   3           SUCCESS

kc get medusabackupjob
NAME                                          STARTED   FINISHED
medusa-differ-backup-2m-schedule-1716292080   18h       18h
medusa-differ-backup-2m-schedule-1716292200   18h       18h
medusa-differ-backup-2m-schedule-1716292320   18h       18h
medusa-differ-backup-2m-schedule-1716292440   18h       18h
medusa-differ-backup-2m-schedule-1716292560   18h       18h
medusa-differ-backup-2m-schedule-1716292680   18h       18h
medusa-differ-backup-2m-schedule-1716292800   18h       18h
medusa-differ-backup-2m-schedule-1716292920   18h       18h
medusa-differ-backup-2m-schedule-1716293040   18h       18h
medusa-differ-backup-2m-schedule-1716293160   18h       18h
medusa-differ-backup-2m-schedule-1716293280   18h       18h
medusa-full-backup-2m-schedule-1716292080     18h       18h
medusa-full-backup-2m-schedule-1716292200     18h       18h
medusa-full-backup-2m-schedule-1716292320     18h       18h
medusa-full-backup-2m-schedule-1716292440     18h       18h
medusa-full-backup-2m-schedule-1716292560     18h       18h
medusa-full-backup-2m-schedule-1716292680     18h       18h
medusa-full-backup-2m-schedule-1716292800     18h       18h
medusa-full-backup-2m-schedule-1716292920     18h       18h
medusa-full-backup-2m-schedule-1716293040     18h       18h
medusa-full-backup-2m-schedule-1716293160     18h       18h
medusa-full-backup-2m-schedule-1716293280     18h       18h

kc get medusatask
NAME                   AGE
purge-backups-1        17h
purge-backups-1-sync   17h
purge-backups-3        17h
purge-backups-3-sync   17h

kc get medusatask purge-backups-3 -o yaml
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaTask
metadata:
  creationTimestamp: "2024-05-21T12:20:43Z"
  generation: 1
  name: purge-backups-3
  namespace: medusa
  ownerReferences:
  - apiVersion: cassandra.datastax.com/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: CassandraDatacenter
    name: dc1
    uid: 186f42eb-a4e4-4351-b7cb-15fb42de81fe
  resourceVersion: "1272569"
  uid: 56dbc707-1537-4a68-9432-56dc179d8bd7
spec:
  cassandraDatacenter: dc1
  operation: purge
status:
  finishTime: "2024-05-21T12:21:13Z"
  finished:
  - podName: demo-dc1-default-sts-0
    totalObjectsWithinGcGrace: 693
  - podName: demo-dc1-default-sts-2
    totalObjectsWithinGcGrace: 631
  - podName: demo-dc1-default-sts-1
    totalObjectsWithinGcGrace: 673
  startTime: "2024-05-21T12:20:58Z"
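
To compare the per-pod numbers in the purge task status more easily, the same object can be fetched with "-o json" and summarized. This is only a sketch. It reads just the fields visible in the output above (status.finished, podName, totalObjectsWithinGcGrace), and the function name is hypothetical.

```python
import json


def objects_within_gc_grace(task_json: str) -> dict:
    """Map pod name to totalObjectsWithinGcGrace from a MedusaTask JSON document."""
    task = json.loads(task_json)
    finished = task.get("status", {}).get("finished", [])
    return {
        entry["podName"]: entry.get("totalObjectsWithinGcGrace", 0)
        for entry in finished
    }


if __name__ == "__main__":
    import sys

    # Usage: kubectl get medusatask purge-backups-3 -n medusa -o json | python summarize.py
    per_pod = objects_within_gc_grace(sys.stdin.read())
    for pod, count in sorted(per_pod.items()):
        print(pod, count)
    print("total", sum(per_pod.values()))
```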

kc get medusatask purge-backups-3-sync -o yaml
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaTask
metadata:
  creationTimestamp: "2024-05-21T12:21:13Z"
  generation: 1
  name: purge-backups-3-sync
  namespace: medusa
  ownerReferences:
  - apiVersion: cassandra.datastax.com/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: CassandraDatacenter
    name: dc1
    uid: 186f42eb-a4e4-4351-b7cb-15fb42de81fe
  resourceVersion: "1272615"
  uid: b02ea9d0-ca79-402e-9025-99f806826cee
spec:
  cassandraDatacenter: dc1
  operation: sync
status:
  finishTime: "2024-05-21T12:21:28Z"
  finished:
  - podName: demo-dc1-default-sts-2
  startTime: "2024-05-21T12:21:28Z"

Minio contents of one of the Cassandra nodes:
[two screenshots]
Medusa sidecar container logs showing the purge operation removing all files except "bti-Rows.db":
[screenshot]

┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: K8OP-23
