Author: Stephan Fuhrmann

January 11, 2024

Migrating data to Hashicorp Vault

I’m an Expert Network Software & CI/CD at IONOS. My responsibilities include the network software landscape. I recently migrated passwords from a legacy system to Hashicorp Vault. In this article I want to share the workflow and the insights I gained.

Introduction

One best practice in IT is to store secrets in a centralized system to avoid the so-called ‘secret sprawl’. This article gives a short summary of my journey of migrating data from a legacy system to a Hashicorp Vault KV2 secrets engine.

Preparations

There are multiple ways to enter data into Hashicorp Vault: a web frontend, a REST API and a command line client. I chose the Vault command line client because it seemed well suited to batch processing.

If you want to test your migration without interfering with a production system, a good starting point is a Docker container running an ephemeral Vault server.
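A minimal sketch of such a sandbox, assuming Docker and the official hashicorp/vault image, whose default command starts a dev-mode server with in-memory storage. The function names, the container name, the port mapping and the token value are placeholders chosen for illustration:

```shell
# Start a throwaway Vault dev server in Docker. All data is lost when it stops.
# Assumptions: image hashicorp/vault, container name "vault-dev", port 8200 and
# the token "dev-only-token" are illustrative choices, not part of the article.
start_dev_vault() {
  docker run --rm --detach --name vault-dev \
    --cap-add=IPC_LOCK \
    -e VAULT_DEV_ROOT_TOKEN_ID=dev-only-token \
    -p 8200:8200 \
    hashicorp/vault
}

# Point the Vault command line client at the dev server instead of production.
use_dev_vault() {
  export VAULT_ADDR=http://127.0.0.1:8200
  export VAULT_TOKEN=dev-only-token
}

# Stop the container again; --rm above removes it afterwards.
stop_dev_vault() {
  docker stop vault-dev
}
```

After start_dev_vault and use_dev_vault, ‘vault status’ should report an unsealed server. In dev mode a KV2 engine is mounted at secret/, so that mount can serve as $MOUNTPOINT for a trial run.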

Extraction

The data to be imported first needs to be extracted from the legacy system. I recommend transforming it into a minimal machine-readable format, for example CSV or JSON.

Minimal means that the file(s) contain only the secret coordinates and nothing else. This helps you focus on the migration process and makes the results easier to validate.

Besides that, I consider normalization a best practice. Take IPv6 addresses as keys: 0000:0000:0000:0000:0000:0000:0000:0001 and ::1 are two notations for the same IPv6 address, but they are syntactically different and would end up as two different keys in Vault, creating duplicates and conflicts. I chose the fully expanded lower-case form of the IPv6 address to avoid conflicts between the different notations that are in use.
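The normalization step could look roughly like the sketch below. It is not the script I used, just an illustration with a hypothetical helper name, and it assumes syntactically valid input without zone IDs or embedded IPv4 parts:

```shell
# Expand an IPv6 address to its full, lower-case, eight-group notation.
# Assumption: the input is a valid IPv6 address without a zone ID ("%eth0")
# and without an embedded IPv4 part ("::ffff:1.2.3.4").
expand_ipv6() {
  printf '%s\n' "$1" | awk '
  {
    addr = tolower($0)
    pos = index(addr, "::")
    if (pos > 0) {
      # Split around "::" and fill the gap with as many zero groups as needed.
      head = substr(addr, 1, pos - 1)
      tail = substr(addr, pos + 2)
      nh = split(head, h, ":")
      nt = split(tail, t, ":")
      n = 0
      for (i = 1; i <= nh; i++) g[++n] = h[i]
      for (i = nh + nt; i < 8; i++) g[++n] = "0"
      for (i = 1; i <= nt; i++) g[++n] = t[i]
    } else {
      n = split(addr, g, ":")
    }
    # Anything that does not end up with eight groups is rejected.
    if (n != 8) exit 1
    out = ""
    for (i = 1; i <= 8; i++) {
      # Left-pad every group with zeros to four hex digits.
      out = out (i > 1 ? ":" : "") substr("0000" g[i], length(g[i]) + 1)
    }
    print out
  }'
}
```

Calling expand_ipv6 ::1 prints 0000:0000:0000:0000:0000:0000:0000:0001, so both notations from the example end up as the same key.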

The Hashicorp Vault command line client has an option to import JSON files containing objects. I chose this as the final format of my extraction. One JSON file maps to one directory entry in the Vault KV2 engine. The files look like this:

{
  "0000:0000:0000:0000:0000:0000:0000:0001": "foo",
  "127.0.0.1": "bar"
}

Here the IP addresses are the keys, and ‘foo’ and ‘bar’ are the secrets.

Login

Before you can use the Vault command line client, you need to log in to your Vault server with your credentials. This can be done with these commands:

$ export VAULT_ADDR=https://YOUR_SERVER_ADDRESS/
$ vault login -method=ldap username=$USERNAME

This logs you in with the LDAP authentication method, provided your Vault instance is configured to use it. The ‘export’ stores the address of the Vault server in the environment variable VAULT_ADDR so that subsequent ‘vault’ commands (see below) can use it.
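Before starting a long batch run it can help to verify that the login actually worked. A small sketch, assuming only that ‘vault token lookup’ fails when there is no valid token; the function name is hypothetical:

```shell
# Pre-flight check: is the server address set and is the current token valid?
# "vault token lookup" shows details of the current token and returns a
# non-zero exit status if no valid token is available.
vault_ready() {
  if [ -z "${VAULT_ADDR:-}" ]; then
    echo "VAULT_ADDR is not set" >&2
    return 1
  fi
  if ! vault token lookup >/dev/null 2>&1; then
    echo "no valid Vault token, run 'vault login' first" >&2
    return 1
  fi
}
```

A call such as ‘vault_ready && echo ready’ before the migration loop avoids a long list of identical permission errors.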

Migration

Once the preparation is done and you have a directory full of JSON files in the format described above, you can run your migration with a bash for-loop. In this loop, $i is the name of the JSON document you are uploading:

$ for i in *; do echo "$i"; vault kv put -mount="$MOUNTPOINT" "$VAULTDIR/$i" @"$i"; done

For every ‘put’, this prints the status of the vault call on the console. It is also a good idea to redirect the output of all vault commands to a single file, so that nothing is lost when you need to investigate errors.
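One way to collect that output is sketched below. The function name, the default log location and the "FAILED:" marker are assumptions made for this example; MOUNTPOINT and VAULTDIR are expected to be set as in the loop above:

```shell
# Upload every file in the current directory and keep one log for all calls.
# Assumption: the log file lives outside the data directory (default: the
# parent directory), so the "*" glob does not pick it up as a secret file.
migrate_all() {
  log=${1:-../migration.log}
  : > "$log"
  failures=0
  for i in *; do
    [ -f "$i" ] || continue
    echo "== $i" >> "$log"
    # Both stdout and stderr of the vault call go to the log file.
    if ! vault kv put -mount="$MOUNTPOINT" "$VAULTDIR/$i" @"$i" >> "$log" 2>&1; then
      echo "FAILED: $i" >> "$log"
      failures=$((failures + 1))
    fi
  done
  echo "$failures upload(s) failed, see $log" >&2
  # Non-zero exit status if at least one upload failed.
  [ "$failures" -eq 0 ]
}
```

Afterwards, ‘grep "^FAILED:" ../migration.log’ lists the entries that need another look.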

Validating the successful migration

To check whether errors occurred during the migration, I read all transferred data back from Vault and validated the coverage.

$ for i in $(vault kv list -mount=$MOUNTPOINT $VAULTDIR); do vault kv get -format=json -mount=$MOUNTPOINT $VAULTDIR/$i | jq .data.data; done

This returns JSON objects such as:

{
    "0000:0000:0000:0000:0000:0000:0000:0001": "foo",
    "127.0.0.1": "bar"
}

Without the ‘jq’ filter you would also get the metadata from Vault, which is not relevant in this use case.

Counting the key-value pairs gives you the number of secrets in your KV2 engine directory, which you can compare with the number of secrets you extracted. This can be done with the help of ‘jq’ and ‘awk’:

$ for i in $(vault kv list -mount=$MOUNTPOINT $VAULTDIR); do vault kv get -format=json -mount=$MOUNTPOINT $VAULTDIR/$i | jq ".data.data | length"; done | awk '{s+=$1} END {print s}'

This expression prints a single integer: the total number of key-value pairs across all Vault entries in the folder $VAULTDIR.

Note that the ‘vault kv list’ command, in its default table output, always prints an extra “Keys” and “----” row before the payload data. I chose not to filter these rows and to ignore the errors they cause in the loop.
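If you would rather filter the header rows and check the coverage by name, a sketch could look like this. It relies on the two header rows described above; both function names are hypothetical, and MOUNTPOINT and VAULTDIR are expected to be set as before:

```shell
# Print the keys below $VAULTDIR, one per line, without the table header.
# Assumption: the table output starts with exactly two header rows
# ("Keys" and "----"), as described above.
list_vault_keys() {
  vault kv list -mount="$MOUNTPOINT" "$VAULTDIR" | awk 'NR > 2'
}

# Print the names of local files that have no matching key in Vault.
# Run this inside the directory that holds the JSON files.
missing_in_vault() {
  tmp=$(mktemp -d) || return 1
  printf '%s\n' * | sort > "$tmp/local"
  list_vault_keys | sort > "$tmp/remote"
  # comm -23 keeps only the lines that appear in the first file alone.
  comm -23 "$tmp/local" "$tmp/remote"
  rm -rf "$tmp"
}
```

An empty result from missing_in_vault means every local file name was found in the Vault listing; it does not compare the secret values themselves.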

Deleting Vault files … really

If you want to delete files in Vault after a failed migration, you can do so by deleting their metadata:

$ for i in *; do echo "$i"; vault kv metadata delete -mount="$MOUNTPOINT" "$VAULTDIR/$i"; done

Deleting the metadata removes the whole file, including all of its versions.
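Because this deletion cannot be undone, a dry run first may be worth the extra step. A sketch, in which the function name and the DRY_RUN variable are hypothetical and introduced only for this example:

```shell
# Delete the Vault entries that correspond to the local files.
# Assumption: DRY_RUN is a hypothetical switch for this sketch. It defaults
# to 1, so nothing is deleted unless DRY_RUN=0 is set explicitly.
delete_all() {
  for i in *; do
    [ -f "$i" ] || continue
    if [ "${DRY_RUN:-1}" = 1 ]; then
      echo "would delete $VAULTDIR/$i"
    else
      vault kv metadata delete -mount="$MOUNTPOINT" "$VAULTDIR/$i"
    fi
  done
}
```

Running delete_all on its own only prints what would be removed; ‘DRY_RUN=0 delete_all’ performs the deletion.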

Discussion

We’ve walked through the approach I have used to migrate secrets to Hashicorp Vault.

What’s your opinion on my thoughts and decisions?

Do you know more efficient ways to migrate data to a Hashicorp Vault KV2 engine? Let me know in the comments!