How Does Azure Routing Work


Here comes yet another “How does it work” post on Azure networking. I have observed many folks who assume that routing in Azure works one way, but are shocked to learn that there are more layers than they anticipated. In this post, I will explain how routing really works in Azure networking.

The Misconception

I will start by revisiting a Microsoft diagram that I previously used for a discussion on the importance of routing in network security.

[Image: the Microsoft architecture diagram]

The challenge with the above architecture is to make traffic flow through the firewall. Most people will answer that User-Defined Routes (UDRs) via Route Tables are required. Yes, that is true. But they fail to understand that two (I would argue three) other sources of routes are also present in this diagram. The lack of that additional knowledge may impact this simple scenario. And I know for certain that if this scenario were the typical mid-large organisation, then the lack of knowledge would become:

  • An operational issue
  • A security issue
  • A troubleshooting issue
  • A connectivity issue

The NIC Is The Router

One of my first posts in this series was “Azure Virtual Networks Do Not Exist”. In that post, I explained that all traffic routes directly from the source NIC to the destination NIC. There is no subnet, no default gateway, and no virtual network. Instead, a virtual network is a mapping that creates mesh connectivity between all NICs in that virtual network. When you peer virtual networks, the mapping expands to mesh all NICs in the peered virtual networks.

Where does routing happen if there is no default gateway or subnet? The answer (just like the answer to “where are NSG rules processed?”) is that the NIC is the router.

Remember that everything is a virtual machine, including “serverless computing”, somewhere in the platform.

If packets travel directly from source to destination, then there is no router appliance between the source and the destination. That means that the source must be its own router.

Some Basic Routing Theory

A route is an instruction: if you want to get to address A then go to place X. X might be the destination, or it might be the first hop to get to the destination.

For example, I might have a remote network of 192.168.0.0/16. I have an Azure App Service that wants to use a site-to-site connection to reach out to a server with an address of 192.168.1.10. A route might say:

  • Prefix: 192.168.0.0/16
  • Next Hop Type: Virtual Network Gateway (VPN or ExpressRoute)

The NIC of the App Service will learn that route (see BGP later). Packets from the App Service will go directly to the NIC(s) of the Virtual Network Gateway and then route over VPN/ExpressRoute to 192.168.1.10.

Maybe I will manipulate that route a little to force egress traffic through a firewall. My firewall will have an internal IP address of 10.0.1.4. I can introduce a route (see User-Defined Routes later) of:

  • Prefix: 192.168.0.0/16
  • Next Hop Type: Virtual Appliance
  • Next Hop IP Address: 10.0.1.4

Now packets to 192.168.1.10 will go to my firewall. It’s important now that the firewall has a route to 192.168.0.0/16 – normally it would by default in a hub & spoke design.
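
To make the idea of a route as an instruction concrete, here is a minimal Python sketch. The `Route` class, its field names, and the next hop type strings are my own illustrative assumptions, not an Azure API; the sketch only models "if the destination is inside this prefix, send it to this next hop".

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Route:
    """One routing instruction: to reach `prefix`, send packets to the next hop."""
    prefix: str                        # destination range in CIDR notation
    next_hop_type: str                 # illustrative label, e.g. "VirtualAppliance"
    next_hop_ip: Optional[str] = None  # only needed for a virtual appliance

    def applies_to(self, destination: str) -> bool:
        """True when the destination address falls inside the route prefix."""
        return ip_address(destination) in ip_network(self.prefix)


# The two example routes from the text (hypothetical object model).
gateway_route = Route("192.168.0.0/16", "VirtualNetworkGateway")
firewall_route = Route("192.168.0.0/16", "VirtualAppliance", "10.0.1.4")

print(firewall_route.applies_to("192.168.1.10"))  # inside 192.168.0.0/16
print(firewall_route.applies_to("172.16.0.5"))    # outside the prefix
```

Both routes cover the same prefix; which one is actually used is decided by the route source priority described later.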

The second piece of knowledge to have is that there must be a route for the response. There is no implied return route. Either a human or the network must implement that return route. And it’s really important that the return route is the same as the egress route; stateful firewalls will block TCP responses when they have not permitted the requests – this is one of those “you’ll learn it the hard way” things when dealing with site-to-site connections and firewalls.

The Laws Of Azure Routing

I will revisit this at the end, but here’s what you need to know when you are designing/troubleshooting routing in Azure:

  1. Route source priority
  2. Longest prefix match

Law 1: Route Source Priority

You might know that User-Defined Routes (UDRs) exist. But there are two (or three) other sources of routes and they each have a priority.

System Routes

The first source of routes, and one that is always present, is System (or Default) routes. System routes are created when you create or configure a virtual network. For example, every subnet in a brand-new virtual network has many system routes out of the box. The major routes we are concerned with are:

  • Route(s) to the address prefix(es) of the virtual network to route directly (VirtualNetwork) to the destination NICs.
  • A route to send all other traffic to the Internet (including Azure).

Yes, I am leaving out a bunch of other system routes that are implemented to protect Microsoft 365 from hacking but I want to keep this simple.

Another important System route is what is created when you peer two virtual networks. A route is created in each of the peered virtual networks to state that the next hop to the new neighbour is via peering. This is a human-friendly message; what it means is that the NICs in the connected peer are now part of the local virtual network’s mesh – packets from local NICs will route directly to NICs in the peered virtual network.

BGP Routes

Border Gateway Protocol (BGP) is a mechanism where one routing appliance shares its knowledge of routes with neighbours. For example, a router in Dublin might say “If you want to get to any NICs in Dublin then come to me”. A router in Paris might hear that message and relay it by saying “I know how to get to Dublin so if you want to get to Dublin, come to me”. A router in Munich might pick up that relay from Paris and advertise locally that it knows how to get to Dublin. A PC in Munich wants to send a packet to a NIC in Dublin. The Munich network says that the route to Dublin is via the router in Munich, so the flow of packets will be:

Munich PC > Munich router > Paris router > Dublin router > Dublin NIC
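
The relaying behaviour can be sketched in a few lines of Python. This is a heavily simplified model, not real BGP: the `NEIGHBOURS` topology and the `propagate` function are hypothetical, and the sketch ignores BGP attributes, policies, and timers. It only shows how an advertisement that is passed from neighbour to neighbour builds up the path.

```python
from typing import Dict, List

# Hypothetical topology: each router and the neighbours it peers with.
NEIGHBOURS: Dict[str, List[str]] = {
    "Dublin": ["Paris"],
    "Paris": ["Dublin", "Munich"],
    "Munich": ["Paris"],
}


def propagate(origin: str) -> Dict[str, List[str]]:
    """Relay an advertisement for `origin`; return the path each router learns."""
    learned: Dict[str, List[str]] = {origin: [origin]}
    queue = [origin]
    while queue:
        router = queue.pop(0)
        for neighbour in NEIGHBOURS[router]:
            if neighbour not in learned:  # skip routers that already have a path
                # The neighbour reaches the origin via the router that told it.
                learned[neighbour] = [neighbour] + learned[router]
                queue.append(neighbour)
    return learned


paths = propagate("Dublin")
print(" > ".join(paths["Munich"]))
```

The path that Munich learns matches the router hops in the flow above.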

Azure implements BGP in two scenarios:

  • Site-to-site networking
  • Azure Route Server

You must configure BGP when using ExpressRoute for remote site connections. You optionally configure BGP when configuring a VPN tunnel. What most people don’t realise is that you will still have BGP routes with a BGP-less VPN tunnel thanks to the Local Network Gateway, which generates BGP routes for the remote site prefixes. In the case of site-to-site networking, BGP routes propagate from the GatewaySubnet to all other subnets in the virtual network and (by default) to all peered virtual networks/subnets.

The other scenario is Azure Route Server (ARS). This also covers Virtual WAN, where the router is Azure Route Server; ARS originated in Virtual WAN. ARS can peer with other appliances, such as a router Network Virtual Appliance (NVA), and share routes with it:

  • Routes of remote connected networks are learned from the NVA and propagated to the Azure hub/spokes. The hub/spokes now know that the route to the remote networks is to use the router as the next hop (not your firewall!).
  • The prefixes of the hub/spokes are shared with the NVA to enable remote networks to know how to get to them.

User-Defined Routes (UDRs)

This is the one kind of route that we can directly manage as Azure architects/administrators/operators. A resource called a Route Table is created. The Route Table is associated with a subnet and applies its settings to all NICs in the subnet. There are two important things we can use the Route Table for:

  • Disable BGP Propagation: We can disable inward BGP route propagation to the associated subnet. This means that we can prevent routes to remote sites from bypassing our firewall by using the Virtual Network Gateway/NVA as the next hop.
  • User-Defined Routes: We can implement routes that force traffic in ways that we want.
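
The effect of disabling BGP propagation can be pictured as a filter on the routes that reach the NICs in the subnet. The following Python sketch is a model of that behaviour only; the `routes_for_subnet` function and the dictionary keys are hypothetical and are not part of any Azure SDK.

```python
from typing import Dict, List

Route = Dict[str, str]


def routes_for_subnet(routes: List[Route], disable_bgp_propagation: bool) -> List[Route]:
    """Return the routes that reach the NICs of a subnet."""
    if disable_bgp_propagation:
        # The Route Table setting keeps BGP-learned routes out of the subnet.
        return [route for route in routes if route["source"] != "BGP"]
    return list(routes)


# Illustrative routes; the prefixes and next hops are assumptions.
candidate_routes = [
    {"source": "System", "prefix": "10.0.0.0/16", "next_hop": "VirtualNetwork"},
    {"source": "BGP", "prefix": "192.168.0.0/16", "next_hop": "VirtualNetworkGateway"},
    {"source": "UDR", "prefix": "0.0.0.0/0", "next_hop": "10.0.1.4"},
]

for route in routes_for_subnet(candidate_routes, disable_bgp_propagation=True):
    print(route["source"], route["prefix"], route["next_hop"])
```

With the BGP route filtered out, traffic to 192.168.0.0/16 no longer has a direct route to the gateway and falls back to the 0.0.0.0/0 UDR via the firewall.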

UDRs have several possible next hops for packets:

  • Virtual Appliance: A router or firewall – you additionally specify the IP address of the virtual appliance NIC to use.
  • Internet: Including the Internet and Azure
  • Virtual Network Gateway: An Azure site-to-site connection in the virtual network or shared with the virtual network via peering.
  • Virtual Network: Send packets to the same virtual network.
  • None: The packets are dropped at the source NIC and are never transmitted – a useful security feature.

Hidden Programmed Routes

You won’t find this one in any official documentation on routing, but it does exist and you’ll learn about it either by accident or by educated observation of behaviour.

Microsoft will sometimes introduce a system route to fix an issue: if you do X, they will program a route to be generated. Unfortunately, this route (probably a type of System route) cannot be observed in any way because no diagnostics tools exist for the subnet in question.

One example of this is Private Endpoint. When you create a subnet, network policies for Private Endpoint are disabled by default. This causes a chain of things to happen:

  • UDRs are ignored by Private Endpoints in the subnet
  • Each Private Endpoint in the subnet will create its own /32 (the IP address of the Private Endpoint is the destination prefix) System Route in the virtual network and directly peered virtual networks. This means that a /32 route for the Private Endpoint is added to the GatewaySubnet of the hub/spoke depending on your design.

That GatewaySubnet System route has broken the spirit of many Azure admins over the years. You can’t see it and, from our perspective, it shouldn’t exist. The result was that traffic from on-premises to Private Endpoints went directly to the Private Endpoint, even if we set up a UDR to force traffic to the spoke virtual network to go via the firewall. This is because of the second law of routing: Longest Prefix Match.
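
Here is a small Python sketch of why the hidden /32 route wins. The addresses are hypothetical, picked only for illustration, and the dictionary is just a stand-in for the routes that the GatewaySubnet would hold.

```python
from ipaddress import ip_address, ip_network

# Hypothetical addresses, chosen only for illustration.
private_endpoint_ip = ip_address("10.1.2.4")

routes = {
    "UDR via firewall": ip_network("10.1.0.0/16"),     # the spoke prefix
    "Hidden system route": ip_network("10.1.2.4/32"),  # the Private Endpoint
}

# Both routes cover the Private Endpoint address...
matching = {name: net for name, net in routes.items() if private_endpoint_ip in net}

# ...but the more specific prefix (/32 beats /16) is the one that is used.
winner = max(matching, key=lambda name: matching[name].prefixlen)
print(winner)
```

The UDR is never deactivated because its prefix is different from the /32; it simply loses on prefix length for that one address.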

Route Deactivation

We have established that there are three* sources of routes. What happens if two or three of them create routes to the same prefix? That can happen; in fact, you will probably make it happen if you want to force traffic through a firewall.

Let’s imagine a scenario where there are 3 routes to 192.168.0.0/16, one from each source: System, BGP, and UDR.

What happens? The fabric handles this automatically and applies a prioritisation rule to deactivate the routes from lesser sources. The priority is as follows:

  1. UDR: Routes that you explicitly create in Azure will deactivate routes from BGP & System to the same prefix. UDR beats BGP & System.
  2. BGP: Routes that are created by admins/networks in other locations will deactivate routes from System to the same prefix. BGP beats System.
  3. System: System routes are generated by Azure and are beaten by BGP and UDR routes to the same prefix.

Let’s consider a simple/common example. We have a virtual network with a subnet. If you want to see this in action, add a VM to the subnet, power it up, open the NIC resource in Azure, and go to Effective Routes (wait 30 seconds). Without doing anything to the subnet/virtual network, a System route will be created for all NICs in the subnet:

  • Prefix: 0.0.0.0/0
  • Next Hop Type: Internet

What that means is that any traffic that doesn’t have a more specific route will be sent to the Internet.

Let’s say that I want to force that traffic through a firewall appliance with an IP address of 10.0.1.4. I can associate a new Route Table with the subnet and add a UDR to that Route Table:

  • Prefix: 0.0.0.0/0
  • Next Hop Type: Virtual Appliance
  • Next Hop IP Address: 10.0.1.4

Two routes to 0.0.0.0/0 are present. Which one will be used? That decision is already made. The System route to 0.0.0.0/0 is automatically deactivated by the fabric as soon as a route from a higher-priority source (BGP or UDR) to the same prefix is added to the subnet. The only active route to 0.0.0.0/0 in that subnet is my UDR via the firewall.
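
The deactivation rule can be sketched in Python as follows. This is a model of the behaviour described above, not the fabric's implementation; the `mark_states` function, the dictionary keys, and the "Active"/"Invalid" labels are assumptions made for the example.

```python
from typing import Dict, List

# Lower number = higher priority: UDR beats BGP, BGP beats System.
SOURCE_PRIORITY = {"UDR": 0, "BGP": 1, "System": 2}

Route = Dict[str, str]


def mark_states(routes: List[Route]) -> List[Route]:
    """Mark each route Active or Invalid by comparing sources per prefix."""
    best_rank: Dict[str, int] = {}
    for route in routes:
        rank = SOURCE_PRIORITY[route["source"]]
        prefix = route["prefix"]
        if prefix not in best_rank or rank < best_rank[prefix]:
            best_rank[prefix] = rank
    marked = []
    for route in routes:
        is_best = SOURCE_PRIORITY[route["source"]] == best_rank[route["prefix"]]
        marked.append({**route, "state": "Active" if is_best else "Invalid"})
    return marked


routes = [
    {"source": "System", "prefix": "0.0.0.0/0", "next_hop": "Internet"},
    {"source": "UDR", "prefix": "0.0.0.0/0", "next_hop": "10.0.1.4"},
]

for route in mark_states(routes):
    print(route["source"], route["prefix"], route["state"])
```

Note that the comparison only happens between routes with exactly the same prefix; routes to different prefixes never deactivate each other.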

Law 2: Longest Prefix Match

There is another scenario where there may be multiple route options. A packet might be destined to an IP address and multiple active routes might be applicable. In this case, Azure applies “Longest Prefix Match” – you can think of it as the best matching route. This one is best explained with an example.

Let’s say a packet is going to 10.10.10.4. However, the source NIC has 3 possible routes that could apply:

  • System: 0.0.0.0/0 via Internet
  • BGP: 10.10.10.0/24 via Virtual Network Gateway
  • UDR: 10.0.0.0/8 via a firewall

All of the routes are active because the prefixes are different. Which one is chosen? Tip: Route priority (UDR/BGP/System) is irrelevant now.

I don’t know the internal mechanics of this, but I suspect that an AND operation is done using the destination address and the route prefix. Remember that each octet in a 32-bit IP address is 8 bits:

Here is the calculation for the System route, which sums to 0 bits:

Octet          1    2    3    4
Route Prefix   0    0    0    0
Destination    10   10   10   4
AND Bits       0    0    0    0

Here is the calculation for the BGP route, which sums to 24 bits:

Octet          1    2    3    4
Route Prefix   10   10   10   0
Destination    10   10   10   4
AND Bits       8    8    8    0

Here is the calculation for the UDR route, which sums to 8 bits:

Octet          1    2    3    4
Route Prefix   10   0    0    0
Destination    10   10   10   4
AND Bits       8    0    0    0

Which route is the best match? The BGP route, because it has the longest prefix match to the destination IP address.
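
The same calculation can be written as a short Python sketch. As with my guess above, this is an assumption about the mechanics rather than a description of the Azure fabric, and `matched_bits` is a hypothetical helper name. It ANDs the destination with each route's mask and, where the result equals the route's network address, scores the route by its prefix length.

```python
from ipaddress import IPv4Address, IPv4Network


def matched_bits(destination: str, prefix: str) -> int:
    """Return the prefix length if the route covers the destination, else -1."""
    network = IPv4Network(prefix)
    mask = int(network.netmask)
    # AND the destination with the mask; a match reproduces the network address.
    if (int(IPv4Address(destination)) & mask) == int(network.network_address):
        return network.prefixlen
    return -1


routes = {
    "System": "0.0.0.0/0",
    "BGP": "10.10.10.0/24",
    "UDR": "10.0.0.0/8",
}

scores = {source: matched_bits("10.10.10.4", prefix) for source, prefix in routes.items()}
best = max(scores, key=scores.get)
print(scores)
print(best)
```

The scores come out as 0, 24, and 8 bits, so the BGP route is selected even though a UDR also covers the destination.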

Review: The Laws of Azure Routing

Now you’ve learned how Azure routes are generated, how they are prioritised, and how they are chosen when a packet is sent. Let’s summarise the laws of Azure routing:

  1. Route Source Priority: When there are routes to the same prefix, BGP beats System, and UDR beats BGP & System.
  2. Longest Prefix Match: When multiple routes can be used to send a packet to a destination, the route with the longest bit match will be selected.
  3. It’s Always DNS: Ask any Windows admin – when routing isn’t the cause of issues, then it’s DNS 🙂
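
To tie the first two laws together, here is a minimal Python sketch that applies them in order. It is my own simplification under stated assumptions: the `select_route` function and the tuple layout are hypothetical, and the routes are the illustrative ones used earlier in this post.

```python
from ipaddress import ip_address, ip_network
from typing import Dict, List, Optional, Tuple

# (source, prefix, next hop); the values are illustrative only.
RouteEntry = Tuple[str, str, str]
PRIORITY = {"UDR": 0, "BGP": 1, "System": 2}


def select_route(routes: List[RouteEntry], destination: str) -> Optional[RouteEntry]:
    """Apply Law 1 (source priority), then Law 2 (longest prefix match)."""
    # Law 1: per prefix, keep only the route from the highest-priority source.
    active: Dict[str, RouteEntry] = {}
    for route in routes:
        source, prefix, _ = route
        current = active.get(prefix)
        if current is None or PRIORITY[source] < PRIORITY[current[0]]:
            active[prefix] = route
    # Law 2: among active routes covering the destination, pick the longest prefix.
    address = ip_address(destination)
    candidates = [r for r in active.values() if address in ip_network(r[1])]
    if not candidates:
        return None
    return max(candidates, key=lambda r: ip_network(r[1]).prefixlen)


routes = [
    ("System", "0.0.0.0/0", "Internet"),
    ("UDR", "0.0.0.0/0", "10.0.1.4"),
    ("BGP", "10.10.10.0/24", "VirtualNetworkGateway"),
    ("UDR", "10.0.0.0/8", "10.0.1.4"),
]

print(select_route(routes, "10.10.10.4"))
print(select_route(routes, "8.8.8.8"))
```

The first lookup lands on the BGP /24 route, and the second falls through to the 0.0.0.0/0 UDR because the System route to the same prefix was deactivated.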
