Restoring Standardized Backups

As I’ve mentioned before, enabling scrum teams to be self-sufficient is vital to increasing velocity – if a new environment is needed, creating one should be a trivial task, not a red-tape-filled nightmare where knowledge is centralized in a handful of people.  However, it is also unrealistic to expect every developer to have all the knowledge required for this task – they would have to be familiar with IIS, SQL, DNS, and whatever cloud offering is in use (if any), not to mention how all these elements fit together, in order to troubleshoot anything that isn’t working.

Fortunately, with a little standardization and access to a PowerShell prompt, it is possible to automate almost all of the steps required.  In this post I’ll go over the main parts of what is required to get a new Sitecore site configured and running.

Breaking down the entire process, we will need to do the following things (at least, there may be more for your specific scenario):

  • Create the website folders under inetpub and set permissions
  • Create the website definition in IIS
  • Restore the database backups
  • Update connection strings
  • Apply a patch file to update the data folder
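Tied together, the whole restore can be driven from a single script.  The sketch below is purely illustrative – the function names are placeholders for the steps above, not names from the actual scripts:

```powershell
# Skeleton only - each placeholder function maps to one of the steps above;
# real implementations for each step follow in the rest of the post.
$siteName    = "mysite.local"
$inetpubRoot = "C:\inetpub"

function New-SiteFolders          { "Creating folders under $inetpubRoot\$siteName" }
function New-IisSiteDefinition    { "Creating IIS site and app pool for $siteName" }
function Restore-DatabaseBackups  { "Restoring database backups" }
function Update-ConnectionStrings { "Updating connection strings" }
function Update-DataFolderPatch   { "Applying data folder patch file" }

# Run the steps in order, collecting a simple log of what happened
$steps = @(
    (New-SiteFolders),
    (New-IisSiteDefinition),
    (Restore-DatabaseBackups),
    (Update-ConnectionStrings),
    (Update-DataFolderPatch)
)
$steps
```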

Creating website folders

New-Item -ItemType "Directory" -Path "$inetpubRoot\$siteName"

$Acl = Get-Acl "$inetpubRoot\$siteName"
$Ar = New-Object System.Security.AccessControl.FileSystemAccessRule("BUILTIN\IIS_IUSRS", "FullControl", "ContainerInherit,ObjectInherit", "None", "Allow")

$Acl.SetAccessRule($Ar)
Set-Acl "$inetpubRoot\$siteName" $Acl

if((Test-Path "$inetpubRoot\$siteName\Website") -eq $false) {
     New-Item -ItemType "Directory" -Path "$inetpubRoot\$siteName\Website"
}
if((Test-Path "$inetpubRoot\$siteName\Data") -eq $false) {
     New-Item -ItemType "Directory" -Path "$inetpubRoot\$siteName\Data"
}

In the code above there are two pre-defined variables – $inetpubRoot, which is the path to where you want the website created – C:\inetpub\wwwroot when working locally – and $siteName, which is the folder you want created.

There are also a couple of lines giving the built-in IIS_IUSRS account Full Control over the folder we just created.  Full Control gives Sitecore the ability to create any required folders, as well as log and index files (among many others).

Creating IIS definitions

New-Item IIS:\AppPools\$siteName -Force 
New-Item IIS:\Sites\$siteName -bindings @{protocol="http";bindingInformation="*:80:$siteName"} -physicalPath "$inetpubRoot\$siteName\Website" -Force 
Set-ItemProperty IIS:\Sites\$siteName -name applicationPool -value $siteName -Force

Here we make use of the IIS PowerShell cmdlets to create a new application pool, create a new site definition with the desired hostname bound (also taken from the $siteName variable, for consistency between the folder structure and IIS), and finally associate the site definition with the application pool.

Restoring database backups

This step is likely to be quite specific to your particular setup.  In this example we are restoring .bacpac files to Azure SQL PaaS; however, you may be restoring to an on-premises instance of SQL Server, or restoring .bak files.  You could even take this further and, if using Azure SQL, associate the restored database with an Elastic Pool.

# $storageKey, $sqlAdminUser and $sqlAdminPassword (a SecureString) are assumed
# to be defined earlier in the script
if((Get-AzureRmSqlDatabase -ResourceGroupName $resourceGroupName -ServerName $sqlServer | Where-Object {$_.DatabaseName -eq $dbName}).count -eq 1) {
     Remove-AzureRmSqlDatabase -ResourceGroupName $resourceGroupName -ServerName $sqlServer -DatabaseName $dbName -Force | Out-Null
}
New-AzureRmSqlDatabaseImport -ResourceGroupName $resourceGroupName -ServerName $sqlServer -DatabaseName $dbName -StorageKey $storageKey -StorageKeyType "StorageAccessKey" -StorageUri $path -Edition Premium -ServiceObjectiveName P4 -DatabaseMaxSizeBytes 300000000 -AdministratorLogin $sqlAdminUser -AdministratorLoginPassword $sqlAdminPassword

Working with Azure is a simple task most of the time, and restoring backups is no exception, albeit with one quirk – it is not possible to overwrite a database; you have to remove it and re-import.  The code above gets a list of all databases on the provided server and checks whether you’re trying to overwrite something that already exists.  If you are, it removes the existing database first, then moves on to the import.

Updating connection strings

This is probably one of the quirkiest parts of the restore process – it requires that you pass the password of your SQL user in as plain text so it can be placed in the connection string, and managing the different types of connection strings (‘vanilla’ SQL, Entity Framework, Mongo) can also be a challenge.

Part of the solution to managing the different connection string types is this block of PowerShell:

if($currentValue -match "^User Id\=")
{
     #Set standard connection string
}
elseif($currentValue -match "^mongodb:")
{
     #Set mongo connection string
}
elseif($currentValue -match "^metadata\=res:")
{
     #Set Entity Framework connection string
}

Again, standardizing your database names really helps here to link each connection string node to its database.
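As a sketch of what the first branch might do – the naming convention (siteName_nodeName), the variables and the inline XML here are all assumptions for illustration, not from the original scripts:

```powershell
# Assumed inputs - in a real script these would be parameters
$siteName    = "mysite"
$sqlServer   = "myserver.database.windows.net"
$sqlUser     = "restore-user"
$sqlPassword = "plain-text-password"   # the quirk: supplied as plain text

# Inline sample standing in for a real ConnectionStrings.config
$xml = [xml]'<connectionStrings><add name="core" connectionString="User Id=old;Password=old;Data Source=old;Database=old" /></connectionStrings>'

foreach ($node in $xml.SelectNodes("/connectionStrings/add")) {
    $currentValue = $node.GetAttribute("connectionString")
    if ($currentValue -match "^User Id\=") {
        # Standardized database name: <siteName>_<node name>, e.g. mysite_core
        $dbName = "$($siteName)_$($node.GetAttribute('name'))"
        $node.SetAttribute("connectionString", "User Id=$sqlUser;Password=$sqlPassword;Data Source=$sqlServer;Database=$dbName")
    }
}
```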

Patch file

The final piece of the puzzle to restoring a Sitecore instance is to create a patch file that contains the new data folder, and potentially set a hostName attribute for the default site entry.  Again, depending on your specific setup, this may be harder to accomplish, but taking a simple Sitecore instance with one site defined, we can use a template patch file and a few lines of PowerShell to complete the task.
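A template patch file for this might look something like the following – a hedged example with placeholder values for the PowerShell that follows to overwrite; your real template will depend on your Sitecore version and site definitions:

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <sc.variable name="dataFolder">
      <patch:attribute name="value">REPLACED_BY_SCRIPT</patch:attribute>
    </sc.variable>
    <sites>
      <site name="website">
        <patch:attribute name="hostName">REPLACED_BY_SCRIPT</patch:attribute>
      </site>
    </sites>
  </sitecore>
</configuration>
```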

# $devserverxml holds the path to the template patch file
$xml = [xml](Get-Content $devserverxml -Raw)
$ns = New-Object System.Xml.XmlNamespaceManager($xml.NameTable)
$ns.AddNamespace('patch','http://www.sitecore.net/xmlconfig/')

# Point the dataFolder variable at the restored Data folder
$nodes = $xml.SelectNodes("/configuration/sitecore/sc.variable[@name='dataFolder']/patch:attribute",$ns)
foreach($node in $nodes) {
     $node.InnerText = $dataFolder
}

# Set the hostname on the default 'website' site definition
$nodes = $xml.SelectNodes("/configuration/sitecore/sites/site[@name='website']/patch:attribute[@name='hostName']",$ns)
foreach($node in $nodes) {
     $node.InnerText = $hostname
}
$xml.Save($path)

This loads the file whose path is defined in $devserverxml into an XML object that can be traversed and updated, then saves the result to $path.


Hopefully this article has helped as a starting point for automating some of the tedious tasks we face as developers.  As time goes on I’ll add new posts with more examples of how we’ve tackled some of the more inconvenient automation problems.

Sitecore Backup Scripts

When working with any system one of the biggest challenges is having quick and simple access to production data.  This is even more significant when developing for a CMS, as the content is constantly changing.  Having recent backups available is vital for many reasons, such as having an accurate test environment for new features, being able to reproduce a bug found in production or even just setting up a new local instance for development.

Automation is key in the modern IT world – any repetitive task that takes a significant amount of time or effort is a candidate for automation.  IT Ops teams are invariably ahead of development teams with this as they need to provide backups of production systems for disaster recovery scenarios, so taking their knowledge and applying it to a developer’s problem seems appropriate.

What we have developed is a standardized approach to backups for all sites:

  • Create archives of the website and data folders, excluding unnecessary files
  • Use the SqlPackage.exe application to back up databases to .bacpac files
  • Store these in dedicated containers in Azure blob storage

This results in a discrete container of all the files required to create a new running instance.  This standardization means it is possible to write scripts that are generic enough to fetch the backed-up data and restore it across any number of completely independent sites, thus increasing productivity.

When backing up the Website and Data folders, there is a lot of ‘runtime’ data that isn’t required for a clean restore – logs, diagnostics, App_Data and temp combined can add up to many GBs of transient data that bloats a backup.  The sitecore_analytics_index can also grow to a massive size, and can be excluded if not required.
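A minimal sketch of such an exclusion, assuming a staging-copy approach – the paths, folder layout and exclusion list here are illustrative examples, not our actual scripts:

```powershell
# Example paths - adjust to your layout
$source  = Join-Path ([IO.Path]::GetTempPath()) "demo-site"
$staging = Join-Path ([IO.Path]::GetTempPath()) "demo-site-staging"
$zipPath = Join-Path ([IO.Path]::GetTempPath()) "demo-site-website.zip"
$exclude = '[\\/](logs|diagnostics|App_Data|temp)[\\/]'

# (demo setup) create one transient file and one file worth keeping
New-Item -ItemType Directory -Force -Path (Join-Path $source "App_Data"), (Join-Path $source "bin") | Out-Null
Set-Content -Path (Join-Path (Join-Path $source "App_Data") "huge.log") -Value "transient"
Set-Content -Path (Join-Path (Join-Path $source "bin") "Site.dll") -Value "keep me"

# Copy everything except the excluded folders into a staging area...
Get-ChildItem -Path $source -Recurse -File |
    Where-Object { $_.FullName -notmatch $exclude } |
    ForEach-Object {
        $relative = $_.FullName.Substring($source.Length).TrimStart('\', '/')
        $target   = Join-Path $staging $relative
        New-Item -ItemType Directory -Force -Path (Split-Path $target) | Out-Null
        Copy-Item -Path $_.FullName -Destination $target
    }

# ...then compress the staging area into a single archive
Compress-Archive -Path (Join-Path $staging '*') -DestinationPath $zipPath -Force
```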

SqlPackage.exe is a utility that ships with several different products, including Visual Studio and SQL Server, and is also available stand-alone.  For a cloud-centric company, SqlPackage is an indispensable utility that enables the automation of moving data between on-premises SQL Server and Azure SQL PaaS.  Simply passing a connection string, a filename and some basic parameters is all it takes to export a database, and with access to the ConnectionStrings.config file in your Sitecore solution, everything you need is right there.  In fact, getting an array of connection strings is a fairly trivial snippet of PowerShell:

$connectionstrings = @()
$connectionStringsFilePath = Join-Path -Path $webSiteRoot -ChildPath "Website\App_Config\ConnectionStrings.config"
[xml]$xml = Get-Content $connectionStringsFilePath
$xml.SelectNodes("/connectionStrings/add") |
     Where-Object { -not $_.connectionString.StartsWith("mongo") } |
     ForEach-Object { $connectionstrings += $_.connectionString }

You can then take this array and use it with either SqlPackage.exe or raw T-SQL to create the required backups and transfer them to a centralized repository.
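As a sketch, an export loop might look like this – the SqlPackage path and the sample connection string are assumptions (the array is re-declared with a sample value so the snippet stands alone), and the call is guarded so the logic can run even where SqlPackage is not installed:

```powershell
# Assumed SqlPackage location - locate your own installed copy
$sqlPackageExe = "C:\Program Files\Microsoft SQL Server\140\DAC\bin\SqlPackage.exe"
# Sample input standing in for the array built from ConnectionStrings.config
$connectionstrings = @("User Id=sa;Password=secret;Data Source=srv;Database=mysite_core")

$bacpacFiles = foreach ($connectionString in $connectionstrings) {
    # Derive the backup file name from the Database= token
    if ($connectionString -match "Database=(?<db>[^;]+)") {
        $bacpac = "$($Matches.db).bacpac"
        if (Test-Path $sqlPackageExe) {
            # /Action:Export writes the schema and data to the target .bacpac
            & $sqlPackageExe /Action:Export /SourceConnectionString:$connectionString /TargetFile:$bacpac
        }
        $bacpac
    }
}
```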

Among the other things to consider are:

  • Do you need to transfer the analytics data from Mongo?
  • Do you really need to transfer the reporting databases? Is this just more bloat to transfer and store?
  • How much are your bandwidth costs? If your VMs and Storage are within Azure, you won’t pay for data transfer, but if you straddle multiple cloud providers, you may get billed pretty heavily – remember to weigh the cost/benefit of having regular backups.