Originally published on LinkedIn on 2023-05-24.
Hashicorp Vault
Vault Enterprise has an integrated solution for uploading backups to S3-compatible storage. The open-source version of Vault is missing this feature, so I had to create a solution myself.
Most of the time I use the integrated Raft storage inside the Vault cluster in Kubernetes. The data of this StatefulSet is stored in PersistentVolumeClaims. (I also patched the default persistentVolumeReclaimPolicy to Retain so the volumes survive deletion of the StatefulSet, but that's a different story.)
So I searched for inspiration on the internet and found some sources which I more or less followed.
Embedded: GitHub - adfinis/vault-raft-backup-agent: Vault Raft Integrated Storage Snapshot Automation
Embedded: HashiCorp Vault Backup and Restore Raft Snapshots from Kubernetes to AWS S3
Build a vault-backup container
I decided to build a container based on Alpine 3.18. Up-to-date vault and aws-cli packages are available there, and they worked well in my case. The storage footprint of the image is around 90 MB. And we already have GitLab pipelines to build and test containers, so the effort was minimal.
FROM alpine:3.18
LABEL version="1.0.0" \
      maintainer="Gerhard Sulzberger <********@*****>" \
      description="Image used for vault-backups"
RUN apk --no-cache add \
      vault=1.13.2-r0 \
      aws-cli=2.11.21-r0 \
      libcap=2.69-r0 \
    && addgroup vaultbackup \
    && adduser -G vaultbackup -g "Backup User" -s /bin/ash -D vaultbackup \
    && setcap cap_ipc_lock= /usr/sbin/vault
USER vaultbackup
WORKDIR /home/vaultbackup
COPY backupVault.sh /home/vaultbackup/backupVault.sh
ENTRYPOINT ["/bin/sh", "-e", "/home/vaultbackup/backupVault.sh"]
The main magic happens inside the backupVault.sh script:
#!/bin/sh
VAULT_TOKEN=$(vault write -field=token auth/approle/login role_id="$VAULT_APPROLE_ROLE_ID" secret_id="$VAULT_APPROLE_SECRET_ID")
export VAULT_TOKEN
DATE=$(date +%Y%m%d-%H%M%S)
vault operator raft snapshot save /tmp/vault-raft-"$DATE".snap
/usr/bin/aws --endpoint-url "$AWS_ENDPOINT_URL" s3 cp /tmp/vault-raft-"$DATE".snap s3://"$AWS_BUCKET"/vault/
rm /tmp/vault-raft-"$DATE".snap
echo "Completed the backup - " "$DATE"
The Vault CLI creates a short-lived token via AppRole authentication. It takes the snapshot, and the aws-cli uploads the snapshot to the object storage, which in my case is not standard AWS S3.
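For the AppRole login to be allowed to take snapshots, its Vault policy needs read access to the raft snapshot endpoint. A minimal policy could look like this (a sketch; the actual policy attached to the role may contain more):

```hcl
# Minimal policy for the snapshot AppRole
path "sys/storage/raft/snapshot" {
  capabilities = ["read"]
}
```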
The script certainly has to be improved, because there is zero safety net in case of any failure. But it is a starting point which I can extend with some "TRY/CATCH" functionality.
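A first step toward that safety net could look like this (a runnable sketch; `set -eu` aborts on the first failing command, and an EXIT trap plays the role of the "CATCH" block — the real vault/aws calls are replaced by placeholders here):

```shell
#!/bin/sh
# Abort on the first failing command or unset variable (busybox ash compatible).
set -eu

SNAP=$(mktemp /tmp/vault-raft-XXXXXX)

# The "CATCH": runs on every exit, removes the local snapshot,
# and reports a non-zero exit code.
cleanup() {
    rc=$?
    rm -f "$SNAP"
    if [ "$rc" -ne 0 ]; then
        echo "backup FAILED with exit code $rc" >&2
    fi
}
trap cleanup EXIT

# The real steps (approle login, `vault operator raft snapshot save`,
# `aws s3 cp`) would go here; echo placeholders keep the sketch runnable.
echo "step 1: authenticate via approle"
echo "step 2: save raft snapshot to $SNAP"
echo "step 3: upload snapshot to object storage"

echo "Completed the backup - $(date +%Y%m%d-%H%M%S)"
```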
Kubernetes CronJob
A Kubernetes CronJob resource can be used to schedule when the vault-backup pod runs.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: vault-snapshot-cronjob
spec:
  schedule: "15 */12 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: vault-backup
              image: registry******/******/vault-backup:1.0.0
              imagePullPolicy: IfNotPresent
              envFrom:
                - secretRef:
                    name: vault-snapshot-agent-token
                - secretRef:
                    name: vault-snapshot-s3
              env:
                - name: VAULT_ADDR
                  value: https://vault-active.vault.svc.cluster.local:8200
          imagePullSecrets:
            - name: registry-cred
          restartPolicy: Never
      backoffLimit: 4
There is a bit of magic in how the secrets are managed. The external-secrets-operator creates the secrets that are used as environment variables; the imagePullSecrets are managed the same way. The external-secrets-operator authenticates via Vault's Kubernetes auth, and given the right policy and permissions it creates a Kubernetes Secret out of a Vault secret.
Good documentation on how this works can be found here.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: vault-snapshot-s3
spec:
  dataFrom:
    - extract:
        # AWS_ACCESS_KEY_ID
        # AWS_BUCKET
        # AWS_ENDPOINT_URL
        # AWS_SECRET_ACCESS_KEY
        key: kvstorage/path/to/secret/vault-snapshot-s3
        conversionStrategy: Default
  refreshInterval: 30s
  secretStoreRef:
    kind: ClusterSecretStore
    name: secrets
  target:
    creationPolicy: Owner
    deletionPolicy: Retain
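Given the right permissions, the operator renders the four keys from the KV entry into a plain Kubernetes Secret, which the CronJob consumes via envFrom. The result looks roughly like this (a sketch; the base64 values are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: vault-snapshot-s3
type: Opaque
data:
  AWS_ACCESS_KEY_ID: PHBsYWNlaG9sZGVyPg==      # placeholder base64 values
  AWS_BUCKET: PHBsYWNlaG9sZGVyPg==
  AWS_ENDPOINT_URL: PHBsYWNlaG9sZGVyPg==
  AWS_SECRET_ACCESS_KEY: PHBsYWNlaG9sZGVyPg==
```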
Vault is the perfect place to store credentials.
Antipattern?
I misused it a bit for other environment variables. I see that now and will have to change it later and remove the AWS_ENDPOINT_URL / AWS_BUCKET variables.

They should probably be configured like any other environment variable inside the manifest, like VAULT_ADDR.
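Moving AWS_ENDPOINT_URL and AWS_BUCKET out of the secret and into the container spec could look like this (a sketch; the endpoint and bucket values are placeholders):

```yaml
# Non-secret settings as plain env vars in the CronJob container
env:
  - name: VAULT_ADDR
    value: https://vault-active.vault.svc.cluster.local:8200
  - name: AWS_ENDPOINT_URL
    value: https://s3.example.internal   # placeholder endpoint
  - name: AWS_BUCKET
    value: vault-backups                 # placeholder bucket name
```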
Deployment
The deployment is done with Argo CD using the app-of-apps pattern. This way I can just check everything into Git, and Argo CD will do the work.
argocd application overview of vault-backup
Restore
Embedded: Standard Procedure for Restoring a Vault Cluster | Vault | HashiCorp Developer
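In short, restoring means fetching the newest snapshot from the bucket and feeding it to `vault operator raft snapshot restore`. Since the timestamped names sort lexicographically, picking the latest one is a plain `sort`. A sketch (two example object names stand in for the bucket listing; the aws/vault calls are commented out because they need a live cluster):

```shell
#!/bin/sh
# In the real script the list would come from:
#   aws --endpoint-url "$AWS_ENDPOINT_URL" s3 ls "s3://$AWS_BUCKET/vault/"
# Example names stand in for that listing here; YYYYmmdd-HHMMSS sorts
# lexicographically, so the newest snapshot is the last name in sorted order.
LATEST=$(printf '%s\n' \
    vault-raft-20230522-031500.snap \
    vault-raft-20230524-151500.snap | sort | tail -n 1)
echo "would restore: $LATEST"

# Against a live cluster the restore itself would be:
#   aws --endpoint-url "$AWS_ENDPOINT_URL" s3 cp "s3://$AWS_BUCKET/vault/$LATEST" .
#   vault operator raft snapshot restore -force "$LATEST"
```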
Conclusion
With a bit of effort it is possible to automate backups of open-source Vault in Kubernetes to S3-compatible object storage.
There are still todos and things not mentioned in this article.
- Improve backup script & entrypoint
- A daily routine to test restore procedures
- Monitoring and alerts with kube-prometheus operator
It would be interesting to hear how other IT folks have solved backups for HashiCorp Vault's open-source version. If you are willing to share insights or criticism, please add a comment.
Offtopic
As we work so much behind screens, we sometimes get tired and think in circles. Then it is time for a break. As much as I like tech, I will always enjoy the ancient technique of mowing grass with a scythe.
Have a nice day and don’t forget to enjoy life.
Take a break