Cannot use the SKU Basic with File Change Audit for site

The problem

We’ve recently begun attempting to scale our Azure App Services up and down for our test environments; scaling them up to match production performance levels (SKU: PremiumV2) during the day and then back down to minimal (SKU: Basic) at the end of the working day to save on costs. Just in the last month or two however, we’ve started to get the error “Cannot use the SKU Basic with File Change Audit for site XXX-XXX-XXX-XXX”.

Initially, we thought it had to do with the fact that we had a Diagnostic setting that was tracking AppServiceFileAuditLogs, but even after removing that Diagnostic setting before attempting the scale down, the issue persisted.

After banging our head against the walls with no progress being made, we opened a low-severity ticket with Azure Support to understand what was going on. They suggested we remove the following App Configuration settings:

  • DIAGNOSTICS_AZUREBLOBCONTAINERSASURL
  • DIAGNOSTICS_AZUREBLOBRETENTIONINDAYS
  • DiagnosticServices_EXTENSION_VERSION
  • WEBSITE_HTTPLOGGING_CONTAINER_URL
  • WEBSITE_HTTPLOGGING_RETENTION_DAYS

Again, these did not have any effect.

It was at this time that I was in the portal browsing around for something else and happened to notice the “JSON View” option at the top right of the App Service so I checked it out and grep’d for audit just to see what I’d find.

Bingo: fileChangeAuditEnabled

Not so bingo: fileChangeAuditEnabled: null

But seeing that setting got me to thinking. What if there’s a bug in what JSON View is showing. The error we’re receiving is clearly saying it’s enabled, but the website is showing null; what if there’s some kind of type-mismatch going on behind the portal that is showing null but actually has a setting? What if we could use a different mechanism to test that theory?

Well, it just so happens that Azure PowerShell has a Get-AzResource function that can do just that and this blog post shows us how to do that.

The solution

First, let’s get the resource:

$appServiceRG = "example-resource-group"
$appServiceName = "example-app-service-name"
$config = Get-AzResource -ResourceGroupName $appServiceRG `
    -ResourceType "Microsoft.Web/sites/config" `
    -ResourceName "$($appServiceName)/web" `
    -apiVersion 2016-08-01

We now have an object in $config that we can now check the properties of by doing:

$config.Properties

And there it is:

fileChangeAuditEnabled                 : True

Now all we need to do is configure it to false (and also unset a property called ReservedInstanceCount which is a duplicate of preWarmedInstanceCount but cannot be included when we try and reset the other setting due to what I assume is Azure just keeping it around for legacy reasons):

$config.Properties.fileChangeAuditEnabled = "false"
$config.Properties.PSObject.Properties.Remove('ReservedInstanceCount')

Next, as per the suggestion from Parameswaran in the comments (thank you!), create a new Array (since existing arrays are of fixed size and cannot be modified) while removing AppServiceFileAuditLogs from the list of azureMonitorLogCategories

$newCategories = @()

ForEach ($entry in $config.Properties.azureMonitorLogCategories) {
    If ($entry -ne "AppServiceFileAuditLogs") {
        $newCategories += $entry
    }
}

$config.Properties.azureMonitorLogCategories = $newCategories

And finally, let’s set the setting:

$config | Set-AzResource -Force

Next, for any Deployment Slots you have on this resource, repeat these steps again, but using the following resource retrieval code:

$config = Get-AzResource -ResourceGroupName $appServiceRG `
    -ResourceType "Microsoft.Web/sites/slots" `
    -ResourceName "$($appServiceName)" `
    -apiVersion 2016-08-01

Now, when you try to scale down from a PremiumV2 SKU to a Basic SKU, you will no longer receive the error of “Cannot use the SKU Basic with File Change Audit for site XXX-XXX-XXX-XXX”.

Deploying an Azure App Service from scratch, including DNS and TLS

As many of you have probably gathered, over the past few weeks, I’ve been working on building a process for deploying an Azure App Service from scratch, including DNS and TLS in a single Terraform module.

Today, I write this post with success in my heart, and at the bottom, I provide copies of the necessary files for your own usage.

One of the biggest hurdles I faced was trying to integrate Cloudflare’s CDN services with Azure’s Custom Domain verification. Typically, I’ll rely on the options available in the GUI as the inclusive list of “things I can do” so up until now, if we wanted to stand up a multi-region App Service, we had to do the following:

  1. Build and deploy the App Service, using the azurewebsites.net hostname for HTTPS for each region (R1 and R2)

    e.g. example-app-eastus.azurewebsites.net (R1), example-app-westus.azurewebsites.net (R2)
  2. Create the CNAME record for the service at Cloudflare pointing at R1, turning off proxying (orange cloud off)

    e.g. example-app.domain.com -> example-app-eastus.azurewebsites.net
  3. Add the Custom Domain on R1, using the CNAME verification method
  4. Once the hostname is verified, go back to Cloudflare and update the CNAME record for the service to point to R2

    e.g. example-app.domain.com -> example-app-westus.azurewebsites.net
  5. Add the Custom Domain on R2, using the CNAME verification method
  6. Once the hostname is verified, go back to Cloudflare and update the CNAME record for the service to point to the Traffic Manager, and also turn on proxying (orange cloud on)

While this eventually accomplishes the task, the failure mode it introduces is that if you ever want to add a third (or fourth or fifth…) region, you temporarily have to not only direct all traffic to your brand new single instance momentarily to verify the domain, but you also have to turn off proxying, exposing the fact that you are using Azure (bad OPSEC).

After doing some digging however, I came across a Microsoft document that explains that there is a way to add a TXT record which you can use to verify ownership of the domain without a bunch of messing around with the original record you’re dealing with.

This is great because we can just add new awverify records for each region and Azure will trust we own them, but Terraform introduces a new wrinkle in that it creates the record at Cloudflare so fast that Cloudflare’s infrastructure often doesn’t have time to replicate the new entry across their fleet before you attempt the verification, which means that the lookup will fail and Terraform will die.

To get around this, we added a null_resource that just executes a 30 second sleep to allow time for the record to propagate through Cloudflare’s network before attempting the lookup.

I’ve put together a copy of our Terraform modules for your perusal and usage:

Using this module will allow you to easily deploy all of your micro-services in a Highly Available configuration by utilizing multiple regions.

Using a certificate stored in Key Vault in an Azure App Service

For the last two days, I’ve been trying to deploy some new microservices using a certificate stored in Key Vault in an Azure App Service. By now, you’ve probably figured out that we love them around here. I’ve also been slamming my head against the wall because of some not-well-documented functionality about granting permissions to the Key Vault.

As a quick primer, here’s the basics of what I was trying to do:

resource "azurerm_app_service" "centralus-app-service" {
   name                = "${var.service-name}-centralus-app-service-${var.environment_name}"
   location            = "${azurerm_resource_group.centralus-rg.location}"
   resource_group_name = "${azurerm_resource_group.centralus-rg.name}"
   app_service_plan_id = "${azurerm_app_service_plan.centralus-app-service-plan.id}"

   identity {
     type = "SystemAssigned"
   }
 }

data "azurerm_key_vault" "cert" {
   name                = "${var.key-vault-name}"
   resource_group_name = "${var.key-vault-rg}"
 }
resource "azurerm_key_vault_access_policy" "centralus" {
   key_vault_id = "${data.azurerm_key_vault.cert.id}"
   tenant_id = "${azurerm_app_service.centralus-app-service.identity.0.tenant_id}"
   object_id = "${azurerm_app_service.centralus-app-service.identity.0.principal_id}"
   secret_permissions = [
     "get"
   ]
   certificate_permissions = [
     "get"
   ]
 }
resource "azurerm_app_service_certificate" "centralus" {
   name                = "${local.full_service_name}-cert"
   resource_group_name = "${azurerm_resource_group.centralus-rg.name}"
   location            = "${azurerm_resource_group.centralus-rg.location}"
   key_vault_secret_id = "${var.key-vault-secret-id}"
   depends_on          = [azurerm_key_vault_access_policy.centralus]
 }

and these are the relevant values I was passing into the module:

  key-vault-secret-id       = "https://example-keyvault.vault.azure.net/secrets/cert/0d599f0ec05c3bda8c3b8a68c32a1b47"
  key-vault-rg              = "example-keyvault"
  key-vault-name            = "example-keyvault"

But no matter what I did, I kept bumping up against this error:

Error: Error creating/updating App Service Certificate "example-app-dev-cert" (Resource Group "example-app-centralus-rg-dev"): web.CertificatesClient#CreateOrUpdate: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BadRequest" Message="The service does not have access to '/subscriptions/[SUBSCRIPTIONID]/resourcegroups/example-keyvault/providers/microsoft.keyvault/vaults/example-keyvault' Key Vault. Please make sure that you have granted necessary permissions to the service to perform the request operation." Details=[{"Message":"The service does not have access to '/subscriptions/[SUBSCRIPTIONID]/resourcegroups/example-keyvault/providers/microsoft.keyvault/vaults/example-keyvault' Key Vault. Please make sure that you have granted necessary permissions to the service to perform the request operation."},{"Code":"BadRequest"},{"ErrorEntity":{"Code":"BadRequest","ExtendedCode":"59716","Message":"The service does not have access to '/subscriptions/[SUBSCRIPTIONID]/resourcegroups/example-keyvault/providers/microsoft.keyvault/vaults/example-keyvault' Key Vault. Please make sure that you have granted necessary permissions to the service to perform the request operation.","MessageTemplate":"The service does not have access to '{0}' Key Vault. Please make sure that you have granted necessary permissions to the service to perform the request operation.","Parameters":["/subscriptions/[SUBSCRIPTIONID]/resourcegroups/example-keyvault/providers/microsoft.keyvault/vaults/example-keyvault"]}}]

I checked and re-checked and triple-checked and had colleagues check, but no matter what I did, it kept puking with this permissions issue. I confirmed that the App Service’s identity was being provided and saved, but nothing seemed to work.

Then I found this blog post from 2016 talking about a magic Service Principal (or more specifically, a Resource Principal) that requires access to the Key Vault too. All I did was add the following resource with the magic SP, and everything worked perfectly.

resource "azurerm_key_vault_access_policy" "azure-app-service" {
   key_vault_id = "${data.azurerm_key_vault.cert.id}"
   tenant_id = "${azurerm_app_service.centralus-app-service.identity.0.tenant_id}"

   # This object is the Microsoft Azure Web App Service magic SP 
   # as per https://azure.github.io/AppService/2016/05/24/Deploying-Azure-Web-App-Certificate-through-Key-Vault.html
   object_id = "abfa0a7c-a6b6-4736-8310-5855508787cd" 

   secret_permissions = [
     "get"
   ]

   certificate_permissions = [
     "get"
   ]
 }

It’s frustrating that Microsoft hasn’t documented this piece (at least officially), but hopefully with this knowledge, you’ll be able to automate using a certificate stored in Key Vault in your next Azure App Service.

Posts navigation