Sitecore and DevOps: Continuous Integration

Continuous integration (CI) is the practice of automatically building and testing code every time a developer pushes to a source code repository.  In a perfect world, this means that every time a developer changes or adds some code, the full suite of regression tests runs, including any new tests, and if they all pass there is a good level of confidence that no obvious new bugs have been introduced.  Extreme examples of CI lead to Continuous Deployment, where the CI process kicks off further steps to promote the code through environments all the way to production without any more action by a developer or release engineer.

In reality, many organizations are not equipped for this level of Continuous Deployment, and so an interim level of automation is implemented.

In this post I’ll cover a branching model that we have used to great effect at my current company, and how it ties into our CI tool to provide fast feedback and reduce development effort.

The branching model

We use a slight variation on the popular Git Flow model.  Why not pure Git Flow?  Because of our historical use of SVN and the pain of merging that existed within it.  Since then we haven’t had a need to fold back into pure Git Flow, and the differences are generally minimal.

At its core, our model has the following branches:

  • master – this is always the current state of live code.  If we need to do a hotfix in a pinch, this is the branch we cut from.
  • develop – this is our integration branch.  Individual features are merged into this branch, and this branch is used to compile code to deploy to our integration environment.
  • release – this is the code that is ready to deploy to pre-production and production.
  • n feature branches – there can be any number of these, cut from the release branch; they can be merged into both develop and release.
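Sketched as plain git commands against a throwaway repository, the day-to-day flow looks roughly like this (the repository setup and branch names here are illustrative, not our actual project):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the default branch name to master
git config user.email "dev@example.com"
git config user.name "Dev"
git commit -q --allow-empty -m "current state of live code"

git branch release   # code that is ready to deploy
git branch develop   # integration branch

# Feature branches are cut from release, not develop:
git checkout -q -b feature/hello-world release
git commit -q --allow-empty -m "add hello world feature"

# When the work is pushed, the feature is merged into develop
# so it reaches the integration environment:
git checkout -q develop
git merge -q --no-ff -m "merge feature into develop" feature/hello-world

# Only when a production deploy is due is it merged into release:
git checkout -q release
git merge -q --no-ff -m "merge feature into release" feature/hello-world

git branch
```

In practice the merge into develop is performed automatically by our CI tool (covered in the next section), so developers rarely run these merge commands by hand.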

While we take some shortcuts with this model when doing a greenfield site build, we shift to this model close to launch, and maintain it post-launch as we continue to deliver features and fixes for our clients.

One of the great things about this model is that it allows us a large amount of flexibility to continue to deliver work to the integration environment, while still being choosy about what gets to production.  Typically the feature branches are merged into develop when they are pushed to source control, but are only merged into release when we are due to do a production deploy.

TeamCity and Automated Merges

As I’ve mentioned before, TeamCity is a wonderful tool for implementing a CI pipeline – there are other commercial offerings such as Bamboo, as well as open source tools such as Jenkins, that do much the same work; however, TeamCity is the tool I’m most familiar with.

One of the nicest features of TeamCity, when using Git or Mercurial, is the ability to attempt an automatic merge of one branch into another under certain conditions.  We use this feature to merge a feature branch into our integration branch, which in turn triggers a build to deploy to the integration environment.  This means that a developer is not required to manually merge in their changes, so they are able to focus more on development and less on managing branches of code.

It is also possible to configure rules around which branches will be merged, and when.  For example, we can configure TeamCity not to merge any branch that matches the pattern /incomplete-feature/*.  If a developer creates a new branch called /incomplete-feature/helloworld, they get the benefit of having the code off their machine, without having to worry about whether it is complete enough to maintain the stability of the integration environment.  If there are unit tests within the solution, it is also possible to run them as a precursor to the merge, and to abort the merge if any tests fail, thus maintaining the integrity of the integration branch.
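As a rough sketch, the branch filter for such a merge configuration uses TeamCity’s `+:`/`-:` pattern syntax.  The exact patterns depend on how your VCS root maps branch names, but conceptually it looks like:

```
+:feature/*
-:incomplete-feature/*
```

Anything matching an excluded pattern is simply ignored by the automatic merge, so unfinished work can live safely in source control.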


Takeaways

Continuous integration is a great way to speed up the development cycle, get faster feedback on failed tests and increase overall productivity.  However, with Sitecore this level of continuous deployment can come with some disadvantages.  Sitecore can be notoriously slow at starting up after a change to the /bin directory – and if you have an active development team, this problem can be exacerbated to the point where it is impossible for the dev or QA teams to verify anything on the integration environment.

Due to this, we have trialed two different approaches that have had similar levels of success:

  1. Do not build to your integration environment after every commit; instead, schedule builds every two hours or so.  This gives enough time for most features to be tested without the site slowing to a crawl, but still provides the developers several opportunities during the day to validate their new code.
  2. Do builds more often to your integration environment to enable developers to validate their work, but set up a separate environment that reads from the same branch of code and deploys less often, for the QA team to use.  This allows developers to move faster – as they are generally doing high level “happy path” testing – but gives the QA team the stability to do their more in-depth testing.

We have also found that, as the release and develop branches drift further apart, merge conflicts become more likely.  While these conflicts are not a show stopper, they can interrupt a developer’s flow if they need to go back to a task to merge it manually, thus reducing productivity.  The best way to prevent those merge conflicts is to keep develop and release (or master) more in sync by merging and deploying to production more often.  If you have embraced automated regression testing in your organization, there are few reasons to hold back on deploying changes more often – more deploys mean smaller change sets, so it is easier to identify where a bug crept in, and it also means faster ROI for your client on their change.

Sitecore and DevOps: Deploying Sitecore Changes

One of the most challenging things when dealing with an ongoing Sitecore project is how to ensure the correct Sitecore item changes are deployed along with the right code.  Ensuring the correct items are being promoted through environments along with the associated code can be both time-consuming and fraught with disaster if something goes wrong.

As someone who is continually trying to drive out inefficiencies and create repeatable processes, I really struggled with the manual way my company tracked and pushed items between environments!  The process was as follows:

  • Developer makes changes locally and notes all the item changes in the Jira issue.
  • Deployment engineer takes that list and packages all the items from the integration environment.
  • Deployment engineer installs the package to the upper environment.

If all goes well, the same package can be used again for further environments.

NARRATOR: Things rarely went well

Between the human error of not noting all items required, and then not packaging everything in the list, we found we were spending hours a week troubleshooting issues relating to Sitecore items missing from environments.  Worse still was when the same item was modified for multiple unrelated issues, but only one was to be promoted to the next environment.

After some investigation we calculated actual numbers for how long we were spending on these manual processes, and some pretty good estimates on how long we spent trying to fix errors.  The numbers were terrifying – tens of hours a week during busy periods (I work for an agency with some very active clients)!

We use TDS to manage Sitecore item changes, and there is an option within TDS to generate packages based on what has been synced.  However, we found that the time taken to install that package with every deploy became an inconvenience – we wanted fast builds to integration environments to get the shortest feedback loop, and these packages of 1000+ items were not cutting it.

The solution

I will prefix this by saying this solution is not for everyone.  In many cases the built-in functionality of TDS will be sufficient; however, we had some specific requirements that led us down this path.  As with most software development, if we could go back and do it again, perhaps we would do it differently! Maybe we’d even use Unicorn instead, who knows…

In the future I will try to expand on each of these steps with specific blog posts to detail some of the specifics we implemented.

Step 1 – Source Control is the Source Of Truth

In a tightly controlled deployment pipeline, if something is not in Source Control or your Application Lifecycle Management (ALM) tool, it shouldn’t be going anywhere near your production server.  We pulled that same logic back all the way to our integration environment.  This meant that for a Sitecore item change to be made anywhere other than a developer’s local instance, it had to be checked into git (our source control provider of choice).  This wasn’t anything new, but for that very reason, we had an accurate ledger of Sitecore changes we could peek into at any given moment.  This will come in very useful down the road.

Step 2 – Automate everything

It used to be that to deploy code to the integration environment, a developer had to merge their code into the integration branch, RDP into the server, pull the integration branch, open it in Visual Studio, run the appropriate build, THEN manually sync TDS items.

Between the time wasted merging, the money wasted on additional licenses and the general opportunity for mistakes (or nefarious “let me just fix this bug directly on the server.. oops I forgot to commit it” moments), it was not a pretty process.

Every single part of this could be automated; it just needed a change in approach.  Enter: TeamCity.  TeamCity is a fantastic tool that is essentially a build runner that knows a bunch of stuff, but nothing specific.  By this I mean that it comes with out-of-the-box support for all the major version control systems, it knows about a lot of build runners (think MSBuild, make, NAnt, PowerShell etc.) and it can start builds on demand or via triggers.

What we did was configure builds that would, when a developer pushes a change in a feature branch, automatically merge that branch into the integration branch.  That will then fire another build that runs all required compilation (.NET via MSBuild and front end assets via grunt), builds a Sitecore Update package of changed items and combines everything together in one NuGet compatible artifact to send to Octopus deploy.

All this took about a week of two people’s time to build out, test and finalize, but in that time we eliminated approximately 1.5 hours per day of wasted developer time!

Step 3 – Wait… Building a Sitecore update package?

Yes!  By utilizing some code from the fantastic Unicorn and Sitecore Courier projects, it is possible to read TDS serialized items and generate a Sitecore Update package from them.

Earlier I noted that Source Control becomes the source of truth for Sitecore items, and this is where that comes into play.  We can use the TeamCity REST API to get a list of all the changes made between two builds (which directly correlates to two commits in source control).  That way we can get a definitive list of Sitecore items that have changed, run some logic on those items (to exclude content items that may have been committed, for example) then build an update package.
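As an illustration of that filtering step, here is a small shell sketch.  The file paths are hypothetical, and in reality the change list would come from TeamCity’s REST API rather than a hard-coded sample:

```shell
set -e
work=$(mktemp -d)

# In practice the change list comes from TeamCity's REST API, e.g. a call like:
#   curl -u user:pass "https://teamcity.example.com/app/rest/changes?locator=build:(id:1234)"
# followed by extracting the changed file paths from the response.
# Hard-coded sample of changed paths (hypothetical):
cat > "$work/changed_files.txt" <<'EOF'
src/Project.Web/Web.config
serialization/master/sitecore/templates/Feature/Promo.item
serialization/master/sitecore/content/Home/Promo Page.item
serialization/core/sitecore/layout/Renderings/PromoRendering.item
EOF

# Keep serialized .item files, but exclude content items that may have been
# committed along the way; the survivors feed the update package generation.
grep '\.item$' "$work/changed_files.txt" \
  | grep -v '/sitecore/content/' > "$work/items_to_package.txt"

cat "$work/items_to_package.txt"
```

Our real logic has a few more exclusion rules than this, but the principle is the same: source control tells us exactly which items changed, and a small amount of filtering decides which of them belong in the update package.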

Step 4 – So you have your artifact, now what?

For the longest time we utilized TeamCity as our deployment tool as well – it does the job, but it’s not really what it is designed for.  Over time we migrated to a tool called Octopus Deploy to actually push our code into the various environments.  This way we could have a tool with a true audit trail that was designed to deploy web applications, and that did so over secure HTTP connections to remote agents, rather than via the UNC paths we had to use with TeamCity.

Our deploys via Octopus run as a sequence of PowerShell scripts that we have custom-written to handle some of our specific environment setups.  One of the key steps in this process is to push the update package to the CM server, install it and publish those items.

To do this, we utilize a tool by an old colleague, Kevin Obee, called Sitecore.Ship.  This acts as a CI helper within Sitecore to do everything we need it to.  We customized the standard version of Ship (thank you, Open Source) to remove some third-party dependencies and to make it possible to drop it into a running application with as little fuss as possible.

After pushing our update package into Sitecore, we use a JSON manifest that is generated as part of the Update package generation process to inform Ship of what items to publish.  This means we don’t have to do a manual publish, and also that we don’t have to do a full site publish to ensure we get every item that was installed.
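The manifest format itself is specific to our implementation (the shape below is illustrative, not Ship’s API), but conceptually it is just the list of installed item paths to publish, plus the publishing targets and languages:

```json
{
  "items": [
    "/sitecore/templates/Feature/Promo",
    "/sitecore/layout/Renderings/PromoRendering"
  ],
  "targets": ["web"],
  "languages": ["en"]
}
```

Because the manifest is generated from the same change list used to build the update package, the publish is always scoped to exactly what was installed.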


Summary

This was a high level overview of how we can push changes to our clients’ environments much quicker, much cheaper, and with lower rates of errors (those errors still happen, but they are generally caused by items not being checked in, and are therefore found on the integration environment very quickly).  If people are using other solutions to get their item changes up through environments, I’d love to hear about them!