As many of you have probably gathered, I’ve spent the past few weeks building a process for deploying an Azure App Service from scratch, including DNS and TLS, in a single Terraform module.
Today, I write this post with success in my heart, and at the bottom, I provide copies of the necessary files for your own usage.
One of the biggest hurdles I faced was integrating Cloudflare’s CDN services with Azure’s Custom Domain verification. I typically treat the options available in the GUI as the exhaustive list of “things I can do,” so up until now, if we wanted to stand up a multi-region App Service, we had to do the following:
- Build and deploy the App Service, using the azurewebsites.net hostname for HTTPS for each region (R1 and R2)
- Create the CNAME record for the service at Cloudflare pointing at R1, turning off proxying (orange cloud off)
example-app.domain.com -> example-app-eastus.azurewebsites.net
- Add the Custom Domain on R1, using the CNAME verification method
- Once the hostname is verified, go back to Cloudflare and update the CNAME record for the service to point to R2
example-app.domain.com -> example-app-westus.azurewebsites.net
- Add the Custom Domain on R2, using the CNAME verification method
- Once the hostname is verified, go back to Cloudflare and update the CNAME record for the service to point to the Traffic Manager, and also turn on proxying (orange cloud on)
While this eventually accomplishes the task, it introduces a failure mode: if you ever want to add a third (or fourth or fifth…) region, you not only have to direct all traffic to your brand new single instance while the domain is verified, but you also have to turn off proxying, exposing the fact that you are using Azure (bad OPSEC).
After doing some digging, however, I came across a Microsoft document explaining that you can add a TXT record to verify ownership of the domain without messing around with the original record you’re dealing with.
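As a rough illustration of the TXT approach, here is a minimal sketch of per-region awverify records in Terraform. It assumes a 3.x/4.x Cloudflare provider (where the resource is `cloudflare_record` and the content argument is `value`), and it assumes the verification record is named `awverify.<subdomain>` with the App Service's default hostname as its value. The variable and local names are hypothetical, and the hostnames are the examples from the list above.

```hcl
# Sketch only: assumes Cloudflare provider 3.x/4.x (cloudflare_record, `value`).
variable "cloudflare_zone_id" {
  type        = string
  description = "Zone ID for domain.com (hypothetical variable name)"
}

locals {
  # Default hostname of the App Service in each region (example values).
  app_hostnames = {
    eastus = "example-app-eastus.azurewebsites.net"
    westus = "example-app-westus.azurewebsites.net"
  }
}

# One awverify TXT record per region. The live example-app record is never
# touched, so traffic and proxying stay exactly as they are.
resource "cloudflare_record" "awverify" {
  for_each = local.app_hostnames

  zone_id = var.cloudflare_zone_id
  name    = "awverify.example-app"
  type    = "TXT"
  value   = each.value
}
```

Adding a region then becomes one more entry in `app_hostnames` rather than a round of repointing the production CNAME.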
The TXT method is great because we can just add new awverify records for each region and Azure will trust that we own them. Terraform introduces a new wrinkle, though: it creates the record at Cloudflare so fast that Cloudflare’s infrastructure often doesn’t have time to replicate the new entry across their fleet before the verification is attempted, so the lookup fails and Terraform dies.
To get around this, we added a null_resource that just executes a 30-second sleep, giving the record time to propagate through Cloudflare’s network before the lookup is attempted.
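The sleep workaround could be sketched roughly as follows. This is not necessarily how our module wires it up: the variable names are hypothetical, and the hostname binding is assumed to use `azurerm_app_service_custom_hostname_binding` from the azurerm provider.

```hcl
# Hypothetical inputs; in a real module these would come from the DNS and
# App Service resources defined elsewhere.
variable "awverify_record_ids" {
  type        = list(string)
  description = "IDs of the awverify TXT records created at Cloudflare"
}
variable "custom_hostname" { type = string }
variable "app_service_name" { type = string }
variable "resource_group_name" { type = string }

# Pause for 30 seconds after the TXT records exist. Referencing the record
# IDs in `triggers` re-runs the sleep whenever the records are replaced.
resource "null_resource" "wait_for_dns" {
  triggers = {
    records = join(",", var.awverify_record_ids)
  }

  provisioner "local-exec" {
    command = "sleep 30"
  }
}

# Azure performs the ownership lookup when the binding is created, so the
# binding must not start until the sleep has finished.
resource "azurerm_app_service_custom_hostname_binding" "this" {
  hostname            = var.custom_hostname
  app_service_name    = var.app_service_name
  resource_group_name = var.resource_group_name

  depends_on = [null_resource.wait_for_dns]
}
```

Note that `sleep 30` assumes Terraform runs from a Unix-like shell. The `time_sleep` resource from the hashicorp/time provider is one alternative that avoids shelling out.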
I’ve put together a copy of our Terraform modules for your perusal and usage:
Using this module will allow you to easily deploy all of your microservices in a highly available configuration across multiple regions.