WinRM? More like "rm windows" #74

Merged
merged 68 commits into from
Dec 8, 2017

@jstange commented Oct 19, 2017

WinRM

All bootstrap communication with Windows nodes now takes place over WinRM instead of Cygwin sshd. We use the HTTPS listener and certificate-based authentication. Cygwin is no longer set up via userdata (but is still in play, see below).

Re-grooming of nodes, whether with mu-node-manage or MommaCat call-ins, will attempt WinRM first. If it doesn't work, it will also mix in attempts over ssh.

Internal SSL cert enhancements

These are partly to support WinRM functionality. We're now using some of the X.509 v3 extensions, adding Subject Alternative Names and the like. Conveniently, this seems to fix some new SSL issues that have cropped up on older branches, which were probably(?) triggered by new OpenSSL releases.

mu-node-manage -m certs will invoke the cert-generation method to create node certificates that don't yet exist, which is mostly for adding -winrm authentication certs for existing Windows nodes. It will also regenerate node certificates that have expired.
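For illustration, here is a minimal sketch of the two ideas above: a certificate with a Subject Alternative Name extension, and a check for whether a node certificate needs to be created or regenerated. It uses Ruby's standard openssl library. The helper names `make_cert` and `needs_regeneration?` are hypothetical, and the sketch self-signs for brevity, whereas Mu's real cert-generation method would sign with its internal CA.

```ruby
require "openssl"

# Hypothetical helper: build a self-signed certificate with a
# subjectAltName extension. Not Mu's actual implementation.
def make_cert(common_name, alt_names, valid_for: 365 * 24 * 3600)
  key = OpenSSL::PKey::RSA.new(2048)
  cert = OpenSSL::X509::Certificate.new
  cert.version = 2 # zero-based field, so 2 means X.509 v3
  cert.serial = 1
  cert.subject = OpenSSL::X509::Name.parse("/CN=#{common_name}")
  cert.issuer = cert.subject # self-signed in this sketch
  cert.public_key = key.public_key
  cert.not_before = Time.now - 60
  cert.not_after = Time.now + valid_for

  # v3 extensions are created through an ExtensionFactory.
  ef = OpenSSL::X509::ExtensionFactory.new
  ef.subject_certificate = cert
  ef.issuer_certificate = cert
  san = alt_names.map { |n| "DNS:#{n}" }.join(",")
  cert.add_extension(ef.create_extension("subjectAltName", san, false))

  cert.sign(key, OpenSSL::Digest.new("SHA256"))
  [cert, key]
end

# A certificate should be (re)generated if it is missing or has expired.
def needs_regeneration?(cert, now: Time.now)
  cert.nil? || cert.not_after <= now
end
```

Something like `make_cert("node1", ["node1", "node1.example.com"])` would yield a cert whose SAN lists both names; the hostnames here are placeholders.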

The disposition of Cygwin

We haven't gotten rid of it, just taken it out of the critical path for bootstrapping of new nodes. It's still being installed and configured as an alternate node access method.

mu-node-manage will attempt to use WinRM first when connecting to Windows nodes. If that fails, it will attempt ssh. This has the side benefit of maintaining backwards-compatibility for existing Windows nodes that were not bootstrapped with WinRM.
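The WinRM-then-ssh ordering can be sketched roughly as below. This is not the code in this branch; `connect_with_fallback` and the connector lambdas are hypothetical stand-ins for whatever actually opens a WinRM or ssh session.

```ruby
# Hypothetical sketch: try each access method in the order given and
# return the name and session of the first one that succeeds.
def connect_with_fallback(methods)
  errors = {}
  methods.each do |name, connector|
    begin
      return [name, connector.call]
    rescue StandardError => e
      # Remember why this method failed, then move on to the next one.
      errors[name] = e.message
    end
  end
  raise "all connection methods failed: #{errors.inspect}"
end

# Example: WinRM is unreachable (e.g. a node bootstrapped before this
# change), so the ssh connector is used instead.
name, session = connect_with_fallback(
  winrm: -> { raise "WinRM listener unreachable" },
  ssh: -> { :ssh_session }
)
```

Because Ruby hashes preserve insertion order, listing `winrm` before `ssh` is what makes WinRM the preferred method here.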

We looked at other ssh implementations for Windows, but the ones available for 2012r2 still seem to be sketchy. This is something to revisit with things from the Windows 10/2016 ecosystem, which may have more robust native support.

knife-windows

knife-windows didn't actually support certificate authentication with WinRM, but its underlying gem is the same one we're using elsewhere. It was a trivial port, so I added the functionality and submitted a pull request. For the time being our bundle pulls the gem from our fork of the main repo.

Fork: https://github.com/eGT-Labs/knife-windows
Pull request: chef/knife-windows#438

I don't know how to properly massage the Chef people's internal development processes, so who knows whether they'll ever merge it. I included some helpful documentation on how we're doing the certificate magic as bait.

Known issues

Building Windows nodes continues to be unnecessarily difficult, even with Cygwin out of the way. I've done yet another round of tightening around Windows' random bootstrap idiosyncrasies. Maybe we've accounted for all of the weird edge cases, maybe not.

WinRM connections between Mu and Windows nodes aren't actually verifying the node's SSL certificate on connect. This appears to be an issue with the WinRM gem: even if we specify the appropriate trusted signing cert, the connection still blows up on verify, so disabling verification is the only available workaround. It's a low security risk in a controlled environment, but still a target for later correction. It impacts Mu's library connections over WinRM as well as Chef's via knife-windows, which uses the same gem.
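For reference, the connection settings in question look something like the sketch below. The helper `winrm_options` and the file paths are hypothetical, and the option names are the WinRM gem's connection options as I understand them from its README; check them against the gem version in your bundle before relying on this.

```ruby
# Hypothetical helper: assemble options for a certificate-authenticated
# WinRM connection over the HTTPS listener (5986 is the default HTTPS port).
def winrm_options(host, cert_path, key_path)
  {
    endpoint: "https://#{host}:5986/wsman",
    transport: :ssl,
    client_cert: cert_path,
    client_key: key_path,
    # The workaround described above: skip peer verification on connect.
    no_ssl_peer_verification: true
  }
end

opts = winrm_options("winnode.example.com", "/path/to/node-winrm.crt", "/path/to/node-winrm.key")

# With the winrm gem loaded, these would be handed to the connection:
#   conn = WinRM::Connection.new(opts)
```

Once the verify problem is sorted out, the fix should amount to dropping `no_ssl_peer_verification` and pointing the connection at the trusted signing cert instead.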

We don't have a good mechanism for cleaning defunct certificates out of Windows' certificate stores, e.g. WinRM client certificates inherited from source machine images. This should probably be some convoluted PowerShell that gets added to userdata.

It is currently not possible to invoke the Cygwin installer/package manager from Chef, so the initial installation is happening during some pre-Chef magic buried in Mu::Cloud. We do the followup, e.g. enabling LSA and configuring sshd, in mu-tools::windows-client. The installer issue is either Opscode's bug or Cygwin's. If it ever gets fixed, we should shift the remaining bits of our Cygwin installation into mu-tools::windows-client, and consider incorporating the cygwin community cookbook to do real package maintenance.

The AWS layer's "build an AMI" code uses SSH to log in and clean up the node it's about to image. That still works, but to be canonically correct for Windows it should probably do that over WinRM. More importantly, this logic should probably get factored away from the cloud-specific implementation.

Chef's user resource seems to have stopped being able to set passwords on Windows. I added a workaround in the windows_users resource in mu-tools to deal with it until they fix their bug. Sprinkle elsewhere as needed.

(([adsi]('WinNT://./#{usr}, user')).psbase.invoke('SetPassword', '#{pass}'))
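One caveat if you sprinkle this elsewhere: the interpolated values land inside single-quoted PowerShell strings, so a password containing a single quote would break the command. A minimal sketch of building the same command with quotes escaped follows; `set_password_command` is a hypothetical helper, not something in mu-tools.

```ruby
# Hypothetical helper: build the ADSI SetPassword command shown above,
# doubling single quotes so they stay literal inside PowerShell's
# single-quoted strings.
def set_password_command(user, pass)
  esc = ->(s) { s.gsub("'", "''") }
  "(([adsi]('WinNT://./#{esc.call(user)}, user')).psbase.invoke('SetPassword', '#{esc.call(pass)}'))"
end

cmd = set_password_command("bob", "pa'ss")
```

The resulting string can then be handed to whatever runs PowerShell on the node, for example the code block of a Chef powershell_script resource.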

Upgrade Notes

Chef server and client versions are bumped with this release. Chef Server continues to have upgrade issues that are not of our making. If you encounter a hang or other problem with a chef-server-ctl operation during mu-self-update, I suggest the following steps:

  1. rpm -e chef-server-core && rm -rf /opt/opscode
  2. Reboot the machine, as ludicrous as that sounds (Chef Server often fails to stop its own daemons)
  3. chef-apply /opt/mu/lib/cookbooks/mu-master/recipes/init.rb to reinstall Chef Server cleanly.
  4. Re-run your mu-self-update

Testing

Basic regression is fine here. Apart from the Chef updates, the non-bugfix stuff in this branch is all around Windows, and so is only relevant in FEMA DMSE, where it's already in use. We can merge and tag as soon as we've validated that we didn't break anything else in the process.

Mu Master added 30 commits August 29, 2017 10:23
… cert set; put those certs in S3 and grant nodes access
…ing cloud metadata on a pattern match; deploy updates to node profile IAM policies
@jstange jstange merged commit d3570e4 into master Dec 8, 2017