How does Riverbed Steelhead Auto Discovery work?

Riverbed Steelhead devices have a method they use to find each other on the network. Riverbed has named this Enhanced Auto Discovery. This is intended to reduce time to deployment and simplify the configuration on the devices. The core of this method uses setting Options in the TCP headers within the initial 3 way handshake. There are a few concepts to go over to fully understand the process of Steelhead Auto Discovery.

The Steelhead is a Layer 2 Bridge

Every Riverbed is a layer 2 bridge, for traffic to enter the optimization engine it must be bridged through the Steelhead. In the appliance there are 2 interfaces that make of the bridge, they are named the LAN and WAN interfaces.  The LAN interface connects the network where the client machines, or server machines are located. The WAN interfaces connects to the external network where the router for the VPN, MPLS, P2P, 3G Radio, or whatever medium may be in use resides.

These interfaces together are called the IN PATH interfaces. In a Cisco device that is doing IRB style bridging the IN PATH interface would be similar to a BVI interface. The IN PATH interface is required to have an IP address assigned to it, this is the IP Steelheads use to communicate to each other on. For example Inner Channel is negotiated between IP addresses of IN PATH interfaces, this is one reason that IP reachability between the IN PATH IP addresses is important.

Clients and Servers

There are a couple of designations that help to clarity which devices initiate which connections. The Client Steelhead (CSH) is the Steelhead that receives the first Naked SYN from the client machine initiating the connection. A Server Steelhead (SSH) is a Steelhead that receives the SYN+ from a Client Steelhead and is the Steelhead closest to the destination server.

The Channels

Channels denote areas where specific devices communicate with each other, and  where traffic is optimized or not optimized. There are 2 separate Outer Channels and a single Inner Channel.

Screen Shot 2013-08-18 at 10.38.49 PM

Outer Channel (local) – This channel is between the Steelhead LAN interface and the client machine sourcing of the Naked SYN. This may be a branch workstation for example.

Outer Channel (Remote) – This is the channel between the Steelhead LAN interface and the server machine that the initial SYN was intended to be received by. This may be some sort of application or file server at a data center for example.

Inner Channel – This is the network between the WAN interfaces of the Steelheads. This is a TCP connection between the IN PATH IP address where all optimized traffic between Steelheads passes.  The default for used for this connection is TCP 7800.

 The Process

Now on to the steps used for the Steelhead devices to find each other. There are 7 main steps involved, in the following example you are initiating a TCP connection from a Client workstation to a Server to copy a file.

Screen Shot 2013-08-18 at 10.54.42 PM

 

  1. The client machine sends a Naked SYN packet, this is a SYN with out the PROBE TCP option set. This SYN packet is received on the LAN interface of the CSH and is intercepted by the CSH. At this stage the Steelhead checks licenses to make sure this traffic is entitled, if it is not the traffic is just passed through unchanged. If all is well the CSH will add the TCP header Option number 76. This option header is named AUTO-DISCOVERY PROBE and will include such information as the IN PATH IP address and what role this Steelhead is assuming. A packet with an TCP option is noted by a ‘+’ in documentation, for example SYN+. After the header is set the packet is forwarded out the WAN interface.
  2. This SYN+ is received by the Steelhead that will become the SSH (in this example) on the WAN interface. Again at this stage licenses on this Steelhead are checked to make sure this traffic is entitled, if not the traffic is just passed through untouched. If the SYN does not for the option headers set it is also passed through untouched.
  3. Since this packets is a SYN+ the Steelhead then sends a SYN/ACK+ with FWD Negotiation option back towards Client machine. This SYN/ACK+ is then intercepted by the CSH on the return path.
  4. The Steelhead near the Server machine starts a negotiates 3 way handshake with Server machine. If this Steelhead does not encounter another Steelhead between itself and the Server machine it will assume the role of SSH and complete an 3 way handshake with the Server machine. If a Steelhead is encountered this process repeats down the line towards the server.
  5. The newly declared SSH will then send a SYN/ACK+ with PROBE RESPONSE option in headers towards CSH.
  6. The SYN/ACK+ is intercepted by the CSH and the CSH initiates the Inner channel between the CSH to the SSH.
  7. Once the Inner Channel is formed the CSH completes the 3 way between itself and the Client machine.

Now that the above steps are completed the traffic between the Client machines and Server machines is on its way and the Steelhead is transparently (to the client and server) doing its optimization magic. All of the traffic in this flow, after it has been optimized is transmitted, over the Inner Channel. If a new traffic flow needs to communicate between the Client and Server the process starts again for that new traffic flow and this is the case for every TCP connection that occurs from site to site.

Where Trouble can Occur

There are a few basic items that can easily stop this process from occurring, therefor blocking optimization from occurring.

  • If there is something strips the Option headers out of TCP packets such as a firewall or IDS.
  • The IN PATH interfaces do not have Layer 3 reachability between each other. This will prevent the Inner Channel from forming.
  • The traffic does not follow from the LAN to WAN and WAN to LAN interfaces on both the SSH and CSH.
  • The boxes are improperly licensed or the amount of traffic/tcp session is exceeding the license

Over all this is a pretty simple process and not to different from how other vendors handle things such as Cisco WAAS. Even though they use technologies such as WCCPv2. There are other scenarios that  such as Virtual In Path and Server Side Out of Path that are a bit different, but this is the Riverbed recommended way of doing things.