RFC - Cloud and Infrastructure Automation Support
Octopus has a number of customers doing infrastructure automation in cloud IaaS environments such as Amazon EC2 and Azure. We know a lot of customers are using Octopus Deploy as a central component to this, but we also know from talking to people that there are a few areas where this causes a bit of friction. We've got some ideas for things we can add to the product that will make this easier, but it is a huge area and we know everybody is doing it slightly differently, so we'd like your feedback.
What we know is that there's no way we can provide an easy interface that will suit the majority of people; there are simply too many variables and different patterns. We also know that the customers who want to go down this road are a pretty cluey bunch and not afraid of getting their hands dirty when it comes to writing some scripts to manage machines.
Finally, we know that customers would like to use Octopus Deploy to do more of the orchestration of this process, and that's what we want to solve.
Two schools of thought
We see two distinct patterns when it comes to infrastructure automation. The difference is whether the environment is changed as part of a deployment or managed separately and more loosely coupled.
Scenario 1 - Tie the environment to the deployment
A new release is ready to go live. Brand new instances are created and configured, and Tentacles are installed and registered with the Octopus server. The release is deployed to the new machines, the new machines are put in the load-balanced pool of servers, and the instances running the old release are removed from the pool, de-registered from Octopus and destroyed.
The chief problem with this currently is that Octopus evaluates the deployment target machines at the beginning of the deployment. So if you have a PowerShell step in your deployment that creates new machines, they wouldn't be part of the deployment. Because of this, the orchestration of the deployment needs to take place outside of Octopus. Many people use their build server for this: it runs the script to create the environment, then invokes the deployment via the Octopus API. This works, but it makes it somewhat counter-intuitive where control over the deployment lives.
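To make the build-server workaround concrete, here is a minimal sketch of that orchestration in Python. It assumes the deployments endpoint (`/api/deployments`) and the `X-Octopus-ApiKey` header as we understand the Octopus REST API; check both against the API documentation for your server version. The server URL, API key, IDs and the `create_environment` and `send` callables are hypothetical placeholders, and nothing is sent over the network unless you pass a real `send` function.

```python
import json
import urllib.request


def build_deployment_request(server_url, api_key, release_id, environment_id):
    """Build (but do not send) a request asking Octopus to deploy a release.

    Assumption: the endpoint path and header name below match your
    Octopus server version.
    """
    body = json.dumps(
        {"ReleaseId": release_id, "EnvironmentId": environment_id}
    ).encode("utf-8")
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/api/deployments",
        data=body,
        method="POST",
        headers={
            "X-Octopus-ApiKey": api_key,
            "Content-Type": "application/json",
        },
    )


def orchestrate(create_environment, send, server_url, api_key,
                release_id, environment_id):
    """Create the machines first, then trigger the deployment.

    create_environment: hypothetical callable that provisions instances and
        registers their Tentacles, so they exist before Octopus evaluates
        the deployment targets.
    send: callable that submits the request, e.g. urllib.request.urlopen.
    """
    create_environment()
    request = build_deployment_request(
        server_url, api_key, release_id, environment_id
    )
    return send(request)
```

The ordering is the whole point: the environment script has to finish before the deployment is requested, which is why control ends up living in the build server rather than in Octopus.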
Scenario 2 - Manage environment separately
The alternate approach is that an environment is defined and managed separately. This is often the case for very elastic environments where machines can be added or removed automatically according to demand.
As an example, an AWS CloudFormation template is stored in source control. This file defines the shape of the environment and its behaviours and policies around automatic scaling. When the configuration is changed, scripts are run to push the new configuration to AWS causing EC2 to add, remove or change machines as needed.
While this works for people today, it requires extra scripts to be deployed and run so that new instances can request any deployments they need. Additionally, Octopus will continue to poll instances after they are destroyed in order to carry out health checks, so people end up creating scheduled tasks to de-register deprovisioned machines.
Enter Environment Policies
A solution to the problems in Scenario #2 would be to have some behaviour rules on a per-environment basis.
Allowing dynamic registration for an environment would allow machines to be de-registered and "forgotten" if they don't respond to health checks or can't be contacted at deployment time. If a machine in a dynamic environment isn't responding, Octopus will remove it from the environment and stop trying to reach it. Octopus could keep a list of "Recently Removed" machines on the environment page as a visual indication.
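As a rough illustration of how the dynamic registration rule might behave, here is a small Python model. It is a sketch of the proposal, not of anything that exists in Octopus today; the `Environment` class, its field names and the `is_healthy` callable are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """Hypothetical model of an environment with a registration policy."""
    name: str
    dynamic_registration: bool = False
    machines: set = field(default_factory=set)
    recently_removed: list = field(default_factory=list)


def apply_health_check(env, is_healthy):
    """Run one health-check pass over an environment.

    In a dynamic environment, unresponsive machines are removed and
    remembered in recently_removed. In a static environment they stay
    registered and are returned so they can be reported as unreachable.
    """
    still_unreachable = []
    # Iterate over a sorted copy so the set can be modified safely.
    for machine in sorted(env.machines):
        if is_healthy(machine):
            continue
        if env.dynamic_registration:
            env.machines.discard(machine)
            env.recently_removed.append(machine)
        else:
            still_unreachable.append(machine)
    return still_unreachable
```

With `dynamic_registration=True`, a machine that fails the check simply drops out of `machines` and appears in `recently_removed`, which is the list the environment page could display.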
Enabling Automatic Deployments for an environment would initiate a new deployment whenever a machine came online. If a new machine instance is created as the result of an auto-scaling or machine recycling policy, then when the Tentacle is registered Octopus would evaluate what releases it should have and create a deployment to it. This would also be useful in environments that are intermittently connected and can miss releases (e.g. we have some customers who need to deploy to client machines over VPNs that are often offline).
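One way to picture the "evaluate what releases it should have" step is the following Python sketch. It is only an illustration of the idea; the function name, the shape of `current_releases` and the `installed` argument are assumptions made for the example, not a description of how Octopus stores this information.

```python
def plan_auto_deployments(machine, environment, current_releases, installed=None):
    """Work out which releases a newly online machine is missing.

    current_releases: dict mapping (project, environment) to the release
        currently deployed to that environment (hypothetical shape).
    installed: dict mapping project to the release already on the machine;
        empty for a brand new instance, partially filled for a machine
        that was offline and missed some releases.
    """
    installed = installed or {}
    plan = []
    for (project, env_name), release in sorted(current_releases.items()):
        if env_name != environment:
            continue  # release belongs to a different environment
        if installed.get(project) == release:
            continue  # machine is already up to date for this project
        plan.append({"project": project, "release": release, "machine": machine})
    return plan
```

A freshly scaled-out instance gets every current release for its environment, while a client machine that reconnects over a VPN only gets the releases it missed.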
Re-evaluate Machines Step
A simple solution to scenario #1 would be for Octopus to have the ability to re-evaluate the deployment targets midway through a deployment. A PowerShell script which creates new machines could set the new instance names in a variable which could be used to re-target the deployment.
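To show what re-evaluating targets midway through might look like, here is a toy Python model of a deployment run. It is a sketch of the idea under our own assumptions: the step dictionaries, the `reevaluate` step kind and the `NewInstanceNames` variable used in the usage note are invented for illustration and do not correspond to an existing Octopus feature.

```python
def run_deployment(steps, initial_targets):
    """Simulate a deployment whose targets can change part-way through.

    Returns a list of (step name, targets) pairs for each deploy step,
    so you can see which machines each step would have run against.
    """
    targets = list(initial_targets)  # evaluated once, at the start
    variables = {}
    log = []
    for step in steps:
        kind = step["kind"]
        if kind == "script":
            # A script step may create machines and record their names.
            step["run"](variables)
        elif kind == "reevaluate":
            # Re-target the deployment from a comma-separated variable.
            names = variables.get(step["variable"], "")
            new_targets = [n.strip() for n in names.split(",") if n.strip()]
            if new_targets:
                targets = new_targets
        elif kind == "deploy":
            log.append((step["name"], list(targets)))
        else:
            raise ValueError("unknown step kind: " + kind)
    return log
```

For example, a script step that sets `variables["NewInstanceNames"] = "web-3,web-4"`, followed by a `reevaluate` step reading that variable, means later deploy steps run against `web-3` and `web-4` instead of the machines known at the start. Without the `reevaluate` step, the new machines are ignored, which is the behaviour described in Scenario 1.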
Direct support for IaaS platforms
The features outlined above are some basic pieces of functionality which will enable more scenarios for lots of our customers. For deeper hooks into the various IaaS platforms we think that we'll be best served by releasing Step Templates in our library and potentially some open source scripts, tools and sample projects to help you best tailor an Octopus Deploy project to your needs.
We need your feedback
If you are doing infrastructure automation, would these options make your life and/or scripts easier? If you're wondering how to do it, would this help? If you're doing something that isn't helped by these ideas at all, what are we missing?