
It's All Your Vault

@jstange jstange released this 04 Aug 17:25
· 3204 commits to master since this release

Original PR: #66
Known Migration Hiccups

Hashicorp Consul and Vault

These are now bundled as part of your Mu Master's service suite. The original idea was to use this as a groomer-agnostic replacement for Chef Vault. However, it's entirely too complicated for automata to use in the way we currently do. In general it seems intended to be used like a backup safe for extremely sensitive data. That's still useful, as is bundling Consul, which supports other useful Hashicorp products, so we're rolling with it.

See the recipe mu-master::vault, which relies on some of Hashicorp's community cookbooks. Note that both Consul and Vault support clustering, so multi-node configurations are theoretically possible.

More on using Hashicorp Vault

Rewrite of mu-configure and installation process

The important work for this branch. The new toolchain is best illustrated by stepping through the build process for a new master:

  1. The administrator downloads and executes install/installer from this repo. This is now a very simple stub, which installs a current release of Chef Client, and then...
  2. Runs chef-apply against the standalone recipe mu-master::init. When doing a fresh install it does so by fetching it straight from this repo. This recipe installs Chef Server, clones the cloudamatic/mu Github repository, installs our custom Ruby package, handles the installation of Gem bundles, and other tasks in order to get a minimally functional Mu tooling environment working.
  3. Once that recipe completes and we have a minimally functional Chef Server and set of Mu tools, it hands off to mu-configure. As in the past it prompts the administrator for sundry required configuration parameters, though these can also be provided as arguments for unattended installs. Once specified, it applies the configuration, installs all of the other software required for a fully-functioning master, and creates the mu Chef organization and user, associated with the root account.

mu-configure has been completely rewritten as a Ruby script. As before, it has a menu-driven interface, but is capable of much more complicated logic and validation than before. Also, every menu option has a corresponding command-line switch (run mu-configure --help for details), so that installations or reconfigurations can be scripted. Our installer script will pass these arguments to mu-configure unaltered.
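To make the menu/switch duality concrete, here's a minimal sketch of how a Ruby tool maps configuration options to command-line switches with the standard OptionParser library. The flag names below are hypothetical illustrations, not mu-configure's actual switches; run mu-configure --help for the real list.

```ruby
require 'optparse'

# Hypothetical sketch of mapping configuration parameters to CLI switches.
# Flag names are made up for illustration; see `mu-configure --help`.
def parse_configure_args(argv)
  options = {}
  OptionParser.new do |o|
    o.on("--public-address ADDR", "Public IP or DNS name of the master") do |v|
      options[:public_address] = v
    end
    o.on("--mu-admin-email EMAIL", "Admin contact address") do |v|
      options[:mu_admin_email] = v
    end
    o.on("--unattended", "Apply arguments without showing the menu") do
      options[:unattended] = true
    end
  end.parse!(argv)
  options
end
```

Every switch left unset would fall back to the interactive menu (or a stored value), which is what lets the same script drive both attended and unattended installs.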

Certain software and services which were configured by hand-crafted Bourne shell code have now been moved into reasonably sane Chef recipes, notably mu-master::389ds and mu-master::ssl-certs. mu-master::init, used during the earliest installation phase, is also part of the regular run list, though on an already-bootstrapped node it mostly just enforces permissions and manages local Ruby installations.

Our crusty old mu_setup script has been renamed to deprecated-bash-library.sh so that it's clear what it is. Its only remaining use is as a library for mu-upload-chef-artifacts, which should be slated for a revamp in the future. We are also no longer dependent on shell environment variables such as MU_INSTALLDIR, though these continue to be honored in certain places. The old mu.rc configuration files are no longer needed or acknowledged; all configuration should be derived from mu.yaml (systemwide) and ~/.mu.yaml (individual non-root users).
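A minimal sketch of how a tool might derive its configuration from those two files. The file paths and the shallow-merge semantics here are assumptions for illustration, not mu's actual implementation:

```ruby
require 'yaml'

# Load the systemwide config, then let a user's ~/.mu.yaml override
# individual keys. Paths and merge behavior are assumptions here;
# consult the real mu tooling for its exact semantics.
def load_mu_config(system_path: "/opt/mu/etc/mu.yaml",
                   user_path: File.join(Dir.home, ".mu.yaml"))
  config = File.exist?(system_path) ? YAML.load_file(system_path) : {}
  config = config.merge(YAML.load_file(user_path)) if File.exist?(user_path)
  config
end
```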

An intentional corollary of this work is that it should now be possible to build this software into an AMI, scrubbed of certificates and other identifying information, and build new Mu Masters by simply spinning up said AMI and running mu-configure again. This should be much faster and more reliable than building from scratch.

mu-ssh

Handy little utility. It's just a wrapper around mu-node-manage that passes arguments as a search pattern, and then tries to ssh to the resulting nodes, each in turn. So I can do something like mu-ssh drupal and get a quick interactive shell on whatever node or nodes have that string in their name.
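The wrapper logic amounts to something like this sketch. The real mu-ssh gets its node list from mu-node-manage; the helper names and the hardcoded list below are stand-ins:

```ruby
# Illustrative sketch of the mu-ssh idea: treat the argument as a
# substring pattern, find matching nodes, and ssh to each in turn.
def matching_nodes(all_nodes, pattern)
  all_nodes.select { |name| name.include?(pattern) }
end

def mu_ssh(pattern, all_nodes, dry_run: false)
  matching_nodes(all_nodes, pattern).each do |node|
    next if dry_run
    system("ssh", node)  # interactive shell on each match, one at a time
  end
end
```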

Internal SSL CA

Besides now originating from a Chef recipe, we are also attempting to use SAN fields to build in all the names by which our automated services might refer to our Mu Master. This gets around the idiosyncrasies of several software packages, notably Chef's libraries and utilities.

Load Balancer improvements

We can now reference Amazon Certificate Manager certificates for SSL listeners, as well as the IAM certificates we've always supported. The Basket of Kittens syntax is unchanged, so it will only be necessary to alter BoKs if there is a name collision across multiple certificates.

We can also now set SSL listener policies with the tls_policy parameter. This only seems to work on Application Load Balancers (an AWS issue), for now. Our new default is to use only TLS1.2 and known-good encryption algorithms, which is standard industry practice.

mu_master_user Chef resource

Simple little resource interface for user management from Chef, for cookbooks which reference mu-master as a dependency. Only valid when the recipe is running on a Mu Master. You can do stuff like this:

mu_master_user "someuser" do
  realname "Some Guy"
  email "[email protected]"
end

The complete set of valid parameters for this resource is:

attribute :username, :kind_of => String, :name_attribute => true, :required => true
attribute :realname, :kind_of => String, :required => true
attribute :email, :kind_of => String, :required => true
attribute :password, :kind_of => String, :required => false
attribute :admin, :kind_of => Boolean, :required => false, :default => false
attribute :orgs, :kind_of => Array, :required => false
attribute :remove_orgs, :kind_of => Array, :required => false

Berkshelf

We have a new skeletal platform repository Berksfile, which is recommended for all projects. It will honor cookbook version constraints set in the global Mu Berksfile, so that we don't get those fun "surprise upgrades." See lib/extras/platform_berksfile_base.
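One common way a platform Berksfile can honor a global Berksfile's constraints is to evaluate the global file inline. This is a hypothetical sketch of that pattern, not the contents of lib/extras/platform_berksfile_base; the global path and cookbook names are made up:

```ruby
# Hypothetical platform-repo Berksfile sketch. See
# lib/extras/platform_berksfile_base for the real skeleton.
source "https://supermarket.chef.io"

# Pull in the global Mu Berksfile's version constraints so platform
# repos don't get "surprise upgrades" of shared cookbooks.
# (Path is an assumption for illustration.)
instance_eval(File.read("/opt/mu/lib/Berksfile"))

# Platform-specific cookbooks layer on top.
cookbook "my-platform-cookbook", path: "cookbooks/my-platform-cookbook"
```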

Bug fixes, minor enhancements

  • Multiple issues with rsyslog centralization
  • Multiple issues with the mu-jenkins cookbook
  • User allocation could sometimes result in duplicate UIDs
  • Many, many workarounds for Opscode problems with Chef Server installations, upgrades, and restarts
  • A number of nuisance problems with mu-user-manage (or rather, its support libraries)
  • Peculiarities for CentOS7 and RHEL7
  • Reduced dependence on being run in Amazon Web Services. We're about an inch from being OK on Google Cloud Platform, bare metal, etc. Branch the_goog should carry this work the rest of the way.
  • Ironed over most of the Chef 13 deprecation warnings. No functional changes here, but less noise in Chef runs, which makes debugging easier.
  • Nagios alerts stick the local hostname in the subject and body of emails so you can tell what Mu Master they came from
  • Auto-subnetting now behaves correctly in accounts with 5+ AZs
  • mu-utility::nat may be fixed (only partially tested)
  • Minor behavioral cleanups for mu-upload-chef-artifacts
  • Retired some defunct community cookbooks
  • Added Momma Cat logs to logrotate, and enabled compression on rotated logs.
  • Workarounds for irritating edge case bugs in chef-vault
  • Retired some weak SSL ciphers from generic Apache, Nginx, and Splunk configs.


Known Migration Issues

As you update existing Mu masters to the new master branch, you may trip over something dumb like this:

.git/hooks/post-checkout: line 9: .git/hooks/../../install/deprecated-bash-library.sh: No such file or directory

It's safe to just nuke /opt/mu/lib/.git/hooks/* and then run mu-self-update -b master again.

Chef Server seems to have gotten incredibly fragile over time, even in the latest release. An upgrade, reconfigure, or occasionally even a restart seems to implode uncomfortably often. This is ultimately an Opscode issue, not related to our work, but it may impact anyone doing an update.

For example, sometimes you have to turn off iptables for a moment to get RabbitMQ to start (whatever port it's trying to poke isn't pertinent once it's up, and it isn't documented).

Another one I've seen but been unable to reproduce is Berkshelf uploads of edited cookbooks melting down with an internal Chef Server error that makes no sense. It's definitely not our fault, whatever it is.

There's not much we can do about that that we're not already doing. Only @rpattcorner has masters that need updating, so if a mu-self-update -b master on those guys faceplants, just call me in to massage them instead of worrying about adding more dopey workarounds.


Hashicorp Vault basics

Meanwhile, from my fact-finding mission with Hashicorp Vault on our customer's behalf:

Usage of this thing is pretty complicated, by design. I think it’s best used for secrets that need to be ultra-protected, and written only infrequently and manually. If you were an SSL vendor, for example, you might stash your root CA key with something like this. When you need a piece of data to be managed with pedantically correct security, this is what you use.

I think it might actually be the wrong solution for day-to-day operational storage of, say, passwords that a Rails application needs to log into a third-party API. Chef Vault should probably continue as our go-to for low-rent stuff like that.

The magic I’ve rigged up in Mu will automatically build a working, uninitialized Vault server local to the master. It’s using Consul as a backing store, which is the preferred method. We’re running it with a single node, but it’s got intelligent clustering built in and could be pretty easily expanded across multiple nodes.
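For reference, a single-node Vault-on-Consul setup boils down to a server config of roughly this shape. Addresses and paths here are example values, and the exact stanza names vary somewhat across Vault versions:

```hcl
# Example Vault server config (e.g. vault server -config=server.hcl).
# Consul as the storage backend, local listener on the API port.
storage "consul" {
  address = "127.0.0.1:8500"
  path    = "vault/"
}

listener "tcp" {
  address = "127.0.0.1:8200"
}
```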

vault init

The first thing you do when you’ve got a working instance up is run vault init. This will dump something like the following, which I’ve lifted from one of my test instances so as not to expose anything of value:

Unseal Key 1: TR9H49mP40APxUkuwmsmhb81npus7OjlozrWvvkG6zcB
Unseal Key 2: 7L4WSZ6KjUBkcEmZA5qlc9cajucNRwY/C7haI7gatawC
Unseal Key 3: 1if+07uWcsBrK6PaE8jqzY1iIJRCCj/OepfT7yjNReID
Unseal Key 4: y7IVZZgl7oE7IvhNgni5F69Jnn//ncYnl7VltVW/E/cE
Unseal Key 5: 8Sv9/705EQE0eRIOkir2qfUxMAyw0P/W5prsecVo47kF
Initial Root Token: a9d53520-84d1-76df-4119-154c95ba7205

Vault initialized with 5 keys and a key threshold of 3. Please securely distribute the above keys. When the Vault is re-sealed, restarted, or stopped, you must provide at least 3 of these keys to unseal it again.

Vault does not store the master key. Without at least 3 keys, your Vault will remain permanently sealed.

There’s a fair bit to unpack here. First, the Unseal Keys. As I understand it, a Vault server spends most of its life in “sealed” mode, which is to say locked up tight (no read or write access). Like the message says, you need at least three of those keys to unseal it. It will auto-seal itself every time something breathes on it. You can stash them on personal equipment, keep them on a USB key in a safe, distribute them to the four corners of the globe… the degree of crazy here is malleable.

Those keys get lost… you’re done. That thing’s not coming open. The above is an example set.
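The 3-of-5 arrangement is Shamir's Secret Sharing: the master key is split along a random polynomial, and any threshold-sized subset of shares reconstructs it while fewer reveal nothing. Here's a toy Ruby illustration of the math; this is emphatically not Vault's implementation, just the underlying idea:

```ruby
# Toy Shamir's Secret Sharing demo, working modulo a prime.
P = 2_147_483_647  # prime field large enough for a demo secret

# Split +secret+ into +n+ shares, any +k+ of which reconstruct it.
# The secret is the constant term of a random degree-(k-1) polynomial;
# each share is a point (x, f(x)) on that polynomial.
def split_secret(secret, n, k)
  coeffs = [secret] + Array.new(k - 1) { rand(P) }
  (1..n).map do |x|
    y = coeffs.each_with_index.sum { |c, i| c * x.pow(i, P) } % P
    [x, y]
  end
end

# Lagrange interpolation at x = 0 recovers the constant term (the
# secret) from any k shares.
def recover_secret(shares)
  shares.sum do |xi, yi|
    num = den = 1
    shares.each do |xj, _|
      next if xj == xi
      num = num * (P - xj) % P         # (0 - xj) mod P
      den = den * ((xi - xj) % P) % P
    end
    yi * num % P * den.pow(P - 2, P)   # divide via modular inverse
  end % P
end
```

With fewer than k shares the interpolation is underdetermined, which is why two leaked unseal keys out of five get an attacker exactly nothing.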

vault unseal

To unseal a sealed vault, you run vault unseal and it prompts you for a key. Then it tells you how many more keys you need to feed it before it’ll unlock. Repeat until you make the threshold.

# vault unseal
Key (will be hidden):
Sealed: false
Key Shares: 5
Key Threshold: 3
Unseal Progress: 0
Unseal Nonce:

vault auth

Now that the Vault is unsealed… guess what, you’re still expected to authenticate! That Initial Root Token we got earlier? That’s how you do it, at least the first time.

# vault auth
Token (will be hidden):
Successfully authenticated! You are now logged in.
token: 45c22be3-1c42-e21f-a07d-d5be2d5127e8
token_duration: 0
token_policies: [root]

You get something much like a session when you do that. It times out eventually and you have to reauth. Obviously, having everyone use the root token isn't what you're supposed to do; the vault token-create command is used to create other tokens, which can be associated (I think) with different policies, which we'll get into in a minute. There's also the generate-root command to spit out a new root token.

vault write

Now that we’re authenticated, let’s actually store something!

# vault write secret/thingy  'somepassword=LOUDNOISES!!1!'
Success! Data written to: secret/thingy

The secret part of that path isn't arbitrary. It's the default generic key/value store. I think you can mount and create other paths, but my flailing with the vault mount command only led me to dead ends.

vault read

I wanna read it back!

# vault read secret/thingy
Key                     Value
---                     -----
refresh_interval        768h0m0s
somepassword            LOUDNOISES!!1!

…in a machine-readable format!

# vault read -format=json secret/thingy
{
        "request_id": "3686fe31-9660-b278-5d04-e31f445cd2c3",
        "lease_id": "",
        "lease_duration": 2764800,
        "renewable": false,
        "data": {
                "somepassword": "LOUDNOISES!!1!"
        },
        "warnings": null
}

How would we use this in code?

Well, I don’t think we should. This is a huge PITA, by design. I can in theory automate some of these setup steps, at the expense of security correctness and a great deal of my sanity. We’d have to let a recipe or script meddle with the unseal keys and root token, then stash them someplace at best quasi-secure for a human admin to get at later. We’d probably also want to automatically install the Vault client onto machines we generate (easy enough), and manufacture and manage the deployment of auth keys on a per-node basis (aggravating) so that they can authenticate from inside Chef recipes and fetch secrets. So, basically, re-implement Chef Vault, which already does all this in an intelligent fashion. I don’t wanna.

This is useful as-is for the things Hashicorp Vault is meant to do, which is provide viciously-guarded storage for important secrets. We can very easily sling together a demo BoK that adds replicas, building a Vault cluster. I’m not seeing the value in using it as a Chef Vault replacement until such time as someone asks us to implement a Groomer layer that doesn’t have its own similar solution.

Addendum: vault policies

There are ways to pare down access to various things. I think in a world where we need segregated access, we’d create or mount more paths, then use a mechanism like the following to grant access. In this case I just meddle with the default policy, but in theory you’d create other policies and associate them with authentication tokens, judiciously passed around.

# vault policies default

# Allow tokens to look up their own properties
path "auth/token/lookup-self" {
    capabilities = ["read"]
}

# Allow tokens to revoke themselves
path "auth/token/revoke-self" {
    capabilities = ["update"]
}

I used the above command to dump the generic default policy to a file, manually edited it to add the following stanza, and then used vault policy-write default - < default_policy to write it back.

path "theoreticalotherpath/*" {
    capabilities = ["create", "read", "update", "delete", "list"]
}