When deploying infrastructure and applications, more often than not we have to deal with passwords, API keys, private keys and the like. There are several approaches to handling them.

For the purpose of this blogpost, I’ll be focusing on secrets required by machines - more specifically for automatically provisioned and maintained ones, as one must.

I believe we have pretty decent human-friendly tools for passwords (namely password managers, which can even be combined with MFA; several of the tools described here can also be used by humans), so I will leave that as an exercise to the reader. Human access to machines (SSH and friends) is a more delicate topic, which I want to cover in another post.

Let’s play utopia

When handling secrets, there are some broad features desired for secrets management:

  • Encrypted at rest (*)
  • Encrypted in transit
  • Change auditing (who changed a secret and why)
  • Easy to create new secrets (including by developers, for application secrets**)
  • Auditable and automated secrets deployments
  • Auditable usage (for each usage per secret)
  • Reliable (the secret sauce has to be available for the applications when required)
  • Easy rotation, without causing downtime
  • Limited access and opportunity to decrypt or recover secrets
  • Easy to develop applications consuming those secrets
  • They are not printed on the logs
  • They are not part of the command line.

* Relying on hard disk encryption here is cheating. Ideally we don’t want secrets decrypted on the filesystem at all.

** If you make it harder on developers to do the right thing, they won’t use it. We need to help developers to make the best decision for the business, not become a burden to them.

We aim to have the easiest possible way to deploy and rotate secrets, limiting as much as possible when and where the secrets are in clear text. On the other hand, you want auditability on every step of the way.
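As one illustration of the "not printed on the logs" requirement, here is a minimal sketch of a log filter that masks known secret values before a record is written. It assumes the application already holds the secret values in memory; the class name RedactSecretsFilter and the "[REDACTED]" placeholder are made up for this example.

```python
import logging


class RedactSecretsFilter(logging.Filter):
    """Replace known secret values with a placeholder before a record is emitted."""

    def __init__(self, secrets):
        super().__init__()
        # Ignore empty strings: replacing "" would corrupt every message.
        self._secrets = [s for s in secrets if s]

    def filter(self, record):
        # Render the final message first so secrets passed as %-args are caught too.
        message = record.getMessage()
        for secret in self._secrets:
            message = message.replace(secret, "[REDACTED]")
        record.msg = message
        record.args = None
        return True  # keep the record, just with the redacted text
```

Attaching the filter to a handler (handler.addFilter(RedactSecretsFilter([...]))) rather than to a single logger means records from every logger that reaches that handler are covered. It only masks values it has been told about, so it is a safety net and not a replacement for keeping secrets out of log statements in the first place.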

Coming back to earth

But this is the real world.

The utopia gives us the general direction, but implementing all those requirements for every single secret would be extremely expensive. While we should all value security, it means nothing if it causes the business to go bankrupt. As anyone wearing the security engineer hat knows, we have to evaluate the assets, risks and threat model to analyse the trade-offs in each individual case. The password to the JIRA database doesn’t necessarily need to be treated the same as the passwords to the PCI-compliant production database.

Spoiler alert: you will never reach utopia. No single tool will ever give you everything. Some tools will help with some requirements, some tools will help with other requirements; you’ll need to write quite a lot of glue code and, in some cases, you might have to change the applications themselves.

That doesn’t mean you should give up on security altogether because you can’t be perfect. Perfect is the enemy of good.

Encryption to the rescue

To help us with secrets, we are going to use encryption a lot. Encryption is what allows us to safely store and transmit a message so that it can only be read by the intended recipient. Anyone else watching the transmission or seeing the message will not be able to understand it.

For the purpose of this blogpost, let me give a very oversimplified view of the encryption algorithms available:

  • Symmetric: the same key is used to encrypt and to decrypt. So you need to share this key beforehand with the message’s recipient. You will see sentences like ‘shared password’, ‘shared secret’ or ‘pre-shared secret’.
  • Asymmetric: the message is encrypted with the public key; the message can only be decrypted by the private key. Together they are known as the ‘key pair’. Examples would be keys generated by GPG, OpenSSL. And before I get any GNU nerd talking about PGP vs GPG, just don’t.


The keys used to encrypt/decrypt data can either be generated on the local filesystem (e.g. when you run gpg/openssl commands locally) or come from a key management system (like AWS KMS or HSM appliances). For the purpose of this blogpost, they will be treated the same, and I won’t talk about key rotation.

Can I mention that Base64 is not encryption? Also, friends don’t let friends write their own cryptography - let the experts do their thing here. Just use well-tested, established cryptography.
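To make the Base64 point concrete, here is a tiny sketch using Python’s standard library. The sample value is made up; the point is that decoding needs no key at all.

```python
import base64

original = b"db_password=hunter2"

# Encoding only changes the representation of the bytes.
encoded = base64.b64encode(original)

# No key is involved anywhere: anyone holding the encoded text can reverse it.
decoded = base64.b64decode(encoded)
```

Anything that can be reversed without a key is an encoding, so a Base64 value in a repository should be treated exactly like a clear text secret.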

If the biggest problem you are facing with secrets right now is which encryption algorithm to use and how best to handle key management and rotation, I assume your secrets management is extremely good already and you don’t need this blogpost (or you are just bikeshedding).

Bring us the hammers

There are heaps of different tools handling secrets encryption, so I propose classifying them into two types: pushed secrets tools (that push clear text secrets to a trusted server or service) and pulled secrets tools (applications have to pull secrets from something external).

Another spoiler: you might use multiple tools to achieve the desired result, sometimes even more than one tool of the same type.


Pushed Secrets

If you are committing secrets in clear text to a git repository or maintaining secrets manually, you start from here.

The tools described here either push the secrets to their end location using the orchestrator (configuration management tool or CI/CD) or push the secrets to one of the pulled secrets services. They are offline tools: you run them only once, during deployment.

Even when used without other tools, this class of tools helps us achieve a lot of interesting outcomes:

  • Encrypted at rest and in transit (partial): secrets stay encrypted in the git repos and, for some tools, on developers’ workstations. Only the orchestrator and end nodes will have secrets in clear text.
  • Change auditing: via SCM commits
  • Easy to create new secrets: via SCM commits
  • Auditable and automated secrets deployments: via orchestrator
  • Reliable: the secrets are readily available for the application in clear text when it starts.
  • Easy rotation: via SCM commit; but it does require downtime
  • Limited access and opportunity to decrypt or recover secrets: only on the orchestrator or nodes
  • Easy to develop applications consuming those secrets: they are regular files on the filesystem.

On the other hand, secret rotation requires downtime and there’s no way to audit usage. The orchestrator and the machines have to be considered ‘trusted’.
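Since pushed secrets usually end up as regular files, the consuming side can be as simple as the sketch below. It assumes a POSIX filesystem and that the orchestrator writes the file readable by its owner only; the function name read_secret_file is hypothetical.

```python
import os
import stat


def read_secret_file(path):
    """Return the secret stored at `path`, refusing files accessible by group or others."""
    with open(path, "rb") as handle:
        # fstat on the open handle avoids a race between checking and reading.
        mode = os.fstat(handle.fileno()).st_mode
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            raise PermissionError(f"{path} is accessible by group or others")
        return handle.read().decode("utf-8").rstrip("\n")
```

The permission check does not make the file any less clear text; it only catches the common mistake of deploying a secret with overly permissive modes.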

There are several popular open source tools in this space. Let me give some examples.

Configuration Management specific tools

  • Ansible vault: available for Ansible users. Relies on a symmetric key; the secret is decrypted on the machine running ansible, not on the nodes. Encrypts the whole file, and all files in an ansible run should be encrypted using the same key.
  • Chef Data Bags and Chef vault: available for Chef users. Data Bags uses symmetric keys; the secret is decrypted in the node only, and they should have the shared secret. Chef vault restricts which nodes can decrypt a certain Data Bag. Encrypts the whole file.
  • hiera-eyaml: available in puppet or standalone. Uses asymmetric keys (GPG*** and PKCS7), but gpg-agent appears to be supported in standalone mode only. When used with puppet, the secrets will be decrypted by the puppet server. Only supports yaml files, and encryption is done on specific yaml values (not the whole file). As most of the file is preserved in clear text, it gives great visibility and lets people edit the file even without access to the private key.
  • Blackbox for puppet (see below)

SCM based tools

  • git-crypt: Uses asymmetric keys (GPG***). Encrypts the whole file. Transparent support in git; files are kept in clear text locally for those who have access to the private key, but are pushed encrypted to upstream. Relies on the user following the right convention to avoid secrets being pushed to git upstream unencrypted.
  • Transcrypt: Very similar to git-crypt in behaviour, but uses OpenSSL’s symmetric cipher instead of GPG.
  • git-secret: Uses asymmetric keys (GPG***). Encrypts the whole file. It’s not as transparent as git-crypt and transcrypt, as the user has to add files individually via the command line.
  • Blackbox: Uses asymmetric keys (GPG***). Encrypts the whole file. All files in the repository are encrypted using the same key. Supported in puppet as well. It’s possible to never have clear text secrets locally using this tool.
  • SOPS: It works well with GPG*** (asymmetric) and AWS KMS (symmetric). Encrypts parts of the file (yaml and json values), like eyaml. Supports json, yaml and binary files. Relies on the fact that the people editing the file always have access to the key, as the file cannot be modified outside sops. Different files can have different keys, and you can have both KMS and GPG on the same file.
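The value-level approach used by hiera-eyaml and SOPS can be pictured with the sketch below: keys stay readable and only leaf values are replaced. This is not how either tool is implemented; transform_values and placeholder_encrypt are hypothetical names, and the placeholder deliberately performs no encryption at all. A real tool would call GPG or KMS at that point.

```python
def transform_values(document, transform):
    """Apply `transform` to every leaf value of a nested dict/list, leaving keys untouched."""
    if isinstance(document, dict):
        return {key: transform_values(value, transform) for key, value in document.items()}
    if isinstance(document, list):
        return [transform_values(item, transform) for item in document]
    return transform(document)


def placeholder_encrypt(value):
    # NOT encryption: a stand-in marking where a real tool would call GPG or KMS.
    return "ENC[placeholder]"


config = {"database": {"user": "app", "password": "hunter2"}, "api_keys": ["k1", "k2"]}
protected = transform_values(config, placeholder_encrypt)
```

Because the structure and key names survive, a reviewer can still see in a diff which setting changed without being able to read its value, which is the visibility benefit mentioned above.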

We also see quite a few in-house bash/python solutions (using GPG, KMS or any source of keys) or usage of CI environment variables during build time.

*** gpg-agent is required to have private keys with passphrase. Because of the way GPG works, it’s possible to have multiple private keys allowed to open the same file (as long as they are all included as recipients). All tools mentioned here appear to be taking advantage of that.

Bootstrapping

Regardless of which tool you decide to use, you still need to solve how to bootstrap it: the machine which will decrypt the secret needs to have a certain private key or shared secret. And you might even want to rotate those keys every so often.

Because of that, it’s common to use at least two of these tools.

There’s no easy way of solving this, and usually the solution is either human intervention (e.g. secrets are stored in another secure location and copied manually when the puppet server starts) or some other sort of trusting mechanism (e.g. using IAM roles).

Access management for secrets is entrusted to the orchestrator, which by definition has access to all secrets and deploys them only to registered and trusted nodes.


Pulled Secrets

The secrets are either available to the application in encrypted form or need to be retrieved from somewhere external. These are online tools: they need to be working while the applications are running or being restarted.

This class of tools has the power to help you achieve audit for usage, better access control for secrets (granularity per secret), transparent secret rotation (as well as using ephemeral credentials) and, when used with a pushed secrets tool, full encryption at rest. But the tools themselves don’t give you anything for free: you will have to change applications and deployments to achieve that.
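The kind of application change involved can be sketched as follows: instead of reading a secret once at start-up, the application keeps it in memory for a short time and asks the secrets service again afterwards, so a rotated value is picked up without a restart. The class name CachedSecret and the fetch callable are assumptions for this example; fetch stands in for whatever client call your secrets service provides.

```python
import time


class CachedSecret:
    """Hold a secret in memory and re-fetch it once it is older than `ttl_seconds`."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch      # callable that asks the secrets service
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the behaviour can be tested
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        expired = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if expired:
            # Every refresh is a request the secrets service can audit.
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

The TTL is a trade-off: a short one gives quicker rotation and finer usage auditing, while a long one reduces load on, and dependence on, the secrets service.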

This external service also becomes a big single point of failure for the whole infrastructure. The application or appliance needs to be extremely reliable and highly available, otherwise your whole production environment will go down pretty soon. The ongoing maintenance cost is a lot higher.

Defining secrets in these tools is much more complicated by definition. As they are more complicated, there’s a natural tendency to create silos (those who create secrets, those who consume them).

This last mile requires a lot more effort than the previous class of tools to get any benefit, so do not take this decision lightly.

Tools

Some examples:

  • KeyWhiz: it’s a key management system and secret management system.
  • Knox
  • CredStash, Biscuit, Sneaker: tools using KMS as the encryption backend.
  • AWS SSM Parameter store: allows AWS EC2 and ECS instances to request specific secrets.
  • Vault: it’s a key management system, access management system, secret management system, and certificate authority. It would replace AWS IAM/STS, AWS KMS, AWS SSM Parameter Store (AWS Certificate Manager can’t generate certificates so easily for internal servers yet). HA is officially supported only on consul clusters.


The tools in this space have a lot more features, including auditability, interfaces, APIs and node authentication. A good comparison of them can be found in this gist.

It’s also common to see applications talking directly to an HSM to handle decryption, but the effort and price can be prohibitive. The logs and configuration for HSMs tend not to be so readily accessible, and error messages are pretty cryptic.

I’ve also seen some bad ideas about putting a ‘proxy’ in place to inject all secrets needed by requests. Given that you’d need to implement authentication and authorization per service per secret, make sure it’s highly available and extremely well protected, and provide an automatic way to deploy secrets to this server from code, I’d just suggest you don’t reinvent the wheel and use one of the tools that already exist.

Please note that Docker Secrets and Kubernetes Secrets were not included. They only provide a secure way for the cluster managers to share secrets with the containers, using a RAM disk (still clear text on the filesystem). They don’t offer any auditing or granular access control. Also, there’s no versioning concept for secrets yet. We have to wait and see in which direction they will go, if any.

Lightweight pulled secrets

Due to the nature of security tools, sometimes teams implement complex tools without a lot of appreciation for the security outcome expected. Some teams implement AWS SSM/HSM/Vault, but the secrets are decrypted during boot time (or docker container initialisation), and written in plain text to the filesystem.
Effectively, you just lost most of the advantages of these tools (auditability on when a secret was needed, ability to rotate keys without downtime, and encryption at rest), but you still have to handle the maintenance cost. IMO, this class of tooling should only be used if you decide to go all the way in.

Bootstrapping

Note that none of the tools mentioned here handle how the secrets end up in the external secret management system; usually that tends to be one of the pushed secrets tools described before.

Also, the nodes have to authenticate with the secret management system. There are several ways of doing it (certificates or IAM roles, for example), depending on the tool.


Deployment Strategies

Regardless of which combination of tools you are using, there are several ways of deploying secrets and pipelines.

Daniel Somerfield gave an amazing talk about secrets, and came up with a classification of secrets deployment strategies:

  • Orchestrator decryption: The secrets are encrypted in source code, the orchestrator (Puppet, Chef, Ansible - let me stretch that to CI tools) will deploy the secret, in clear text, to the instances running the applications.
  • Application decryption: The secrets are encrypted in source code, the orchestrator will deploy them encrypted. It’s the application’s responsibility to decrypt the secret before using it.
  • Operational compartmentalisation: The secrets lifecycle is completely independent of the application deployment; the secrets are not deployed together with the application, but by a different team/pipeline, and they are just exposed to the applications.
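The application decryption strategy from the list above could look like the sketch below, where the application shells out to the gpg command line tool and keeps the plaintext in memory only. It assumes gpg is installed on the node and that the node’s private key is already available to it (the bootstrap problem discussed earlier); the function name decrypt_in_memory and the injectable run parameter are choices made for this example.

```python
import subprocess


def decrypt_in_memory(path, run=subprocess.run):
    """Decrypt `path` with the gpg CLI and return the plaintext without writing it to disk."""
    command = ["gpg", "--batch", "--quiet", "--decrypt", path]
    # check=True raises CalledProcessError if gpg cannot decrypt the file.
    result = run(command, check=True, capture_output=True)
    return result.stdout.decode("utf-8").rstrip("\n")
```

Note that the encrypted file path is the only thing on the command line; the secret itself only ever travels through the child process’s standard output.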


At some point we need human intervention, but humans are forgetful, unreliable and prone to be run over by buses.


Pushed secrets tools, when used by themselves, would typically be deployed using orchestrator decryption or operational compartmentalisation.


Pulled secrets tools would typically be used in application decryption and operational compartmentalisation deployment strategies. Docker and Kubernetes secrets could also be used to achieve operational compartmentalisation deployments.

Credential-less applications

You are probably now thinking this is just too hard; can’t you just get rid of secrets altogether? Sometimes you can. Sometimes your access management system (e.g. AWS IAM roles, security groups) can be enough for some services. IAM roles are extremely powerful, as we don’t run into bootstrap problems.

You can achieve similar effects using Vault on premises for certain services, but you are left to solve how to bootstrap the clients to retrieve passwords from Vault.

But there will always be some secrets. I think it would be naive to believe otherwise.

This is fine

Secrets are bad, but we just have to deal with the fact that they do exist.

There are several tools (open source and proprietary) which can fit some parts of your workflow, but you need to find the right balance for each case. It’s fine and expected to use multiple tools at the same time, as long as they complement each other for your use case.

Don’t commit clear text secrets to repositories. You are a single git clone away from bad things.
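A cheap guard against this is to scan what is about to be committed. The sketch below checks text against two well-known shapes (PEM private key headers and AWS access key IDs); the pattern list and the function name find_secrets are illustrative only, and real scanners ship far larger rule sets.

```python
import re

# Illustrative patterns only; this is nowhere near an exhaustive list.
SECRET_PATTERNS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_secrets(text):
    """Return (line_number, rule_name) pairs for lines that look like secrets."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings
```

Wired into a pre-commit hook or a CI step, a check like this catches accidents, but it cannot recognise arbitrary passwords, so it complements the tools above and does not replace them.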

Don’t go overboard and install vault + consul cluster + deployment if you don’t absolutely need the detailed usage auditing or if you are not willing to change the applications to use it. If you are in AWS, ask yourself what exactly you are getting out of it that is better for the company than AWS IAM/STS + SSM Parameter Store + KMS + CloudWatch.

Be just as paranoid as needed. Solve the problems you have, not the problems you want to solve.