Redundancy in Ethernets

EPICS Collaboration Meeting

April 29, 1998

PI: E. W. Kamen

Grad Students: Payam Jahromi, Simon Singh

School of Electrical & Computer Engr.

Georgia Institute of Technology

Atlanta, GA 30332-0250

Phone: (404)-894-2994

Fax: 404-894-4641

E-mail: kamen@ee.gatech.edu

Project Objective

Develop a model-based approach for evaluating the reliability and performance of computer networks for information and control applications. Also develop tools and/or rules for the design of networks so that:

1. a desired degree of robustness to network component failures is achieved;
2. excessive communication delays cannot occur between critical nodes.

We are also interested in the application to accelerators, and in particular, APT.

In this talk we focus on the design of the network so that (1) is achieved.

Is Redundancy Needed?

Consider a switched Ethernet:

Suppose that there is a single path between every pair of critical nodes, and that the total number of switches in these paths is equal to Q.

It is assumed that for each switch, the probability R(t) that the switch is operational over the time interval 0 to t is given by

$$R(t) = e^{-\lambda t}$$

where $1/\lambda$ is the mean time to failure (MTTF).

Then the probability that all Q switches are operational over the time interval 0 to t is given by

$$R(t)^Q = e^{-Q\lambda t}$$

and the MTTF for the first switch to fail in the collection of Q switches is $1/(Q\lambda)$.

The large decrease in the MTTF (from $1/\lambda$ to $1/(Q\lambda)$) is due to the large variance that is implicit in the exponential reliability function. A key question is therefore the variance of the time to failure of a switch.
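The series result can be checked with a short derivation. This is a sketch that assumes the Q switches fail independently with a common failure rate; the symbols $R_Q(t)$ and $\mathrm{MTTF}_Q$ are introduced here only for illustration.

```latex
R_Q(t) = \prod_{i=1}^{Q} R(t) = \left(e^{-\lambda t}\right)^{Q} = e^{-Q\lambda t},
\qquad
\mathrm{MTTF}_Q = \int_0^{\infty} R_Q(t)\,dt
               = \int_0^{\infty} e^{-Q\lambda t}\,dt
               = \frac{1}{Q\lambda}.
```

The time to first failure is therefore again exponentially distributed, with rate $Q\lambda$ instead of $\lambda$.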

As an example, the MTTF of the 3Com 2700 Switch is 174,016 hours = 19.86 years.

If there are 50 switches, the MTTF of the first switch to fail in the collection is 174,016/50 = 3480.32 hours = 0.4 year.
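The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch under the same exponential model; the function names and the 8760-hour year are assumptions made for the example, not part of the talk.

```python
import math

HOURS_PER_YEAR = 8760  # assumed 365-day year, consistent with the figures quoted above


def series_mttf(mttf_single_hours, q):
    """MTTF until the first of q independent, identical switches fails."""
    return mttf_single_hours / q


def series_reliability(mttf_single_hours, q, t_hours):
    """Probability that all q switches stay operational from 0 to t_hours."""
    lam = 1.0 / mttf_single_hours  # failure rate of one switch
    return math.exp(-q * lam * t_hours)


mttf_switch = 174_016  # hours, the single-switch MTTF quoted in the example
for q in (1, 10, 50):
    m = series_mttf(mttf_switch, q)
    r = series_reliability(mttf_switch, q, HOURS_PER_YEAR)
    print(f"Q={q:3d}  MTTF={m:10.2f} h = {m / HOURS_PER_YEAR:5.2f} yr  "
          f"P(no failure in 1 yr)={r:.3f}")
```

Besides the MTTF, the sketch prints the probability that no switch fails within one year, which is often the more direct figure when deciding whether redundancy is worth its cost.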

We would like to acquire data on the MTTF of network components in existing accelerator systems.

The MTTF of 0.4 year for the 50-switch network definitely motivates the use of redundancy.

Suppose that the degree of robustness we want is "full redundancy" between all critical nodes; that is, failure of a link or a switch between two critical nodes does not disrupt communication between the nodes.

This implies that there must be a secondary path between nodes.

A major issue in achieving redundancy in Ethernet is the constraint that there can be no multiple active paths (loops) when sending frames from one node to another.

Active loops will result in:

 Frame cloning
 Learning problems

The basic problem is that multiple active paths can result in frames being duplicated as a message circulates, and recirculates, through the multiple paths.

Hence, in Ethernet it is NOT POSSIBLE to have more than one operational path between a pair of nodes at any one time.

Such is not the case in ATM.
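The frame duplication caused by active loops can be illustrated with a toy simulation. This is a simplified sketch, not a model of real switch behavior: it assumes every switch floods a broadcast frame on all ports except the one it arrived on, ignores address learning and timing, and uses hypothetical switch names A to D.

```python
from collections import defaultdict


def flood(links, source, hops):
    """Count the frame copies in flight on inter-switch links after each hop."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Each in-flight copy is (sender, receiver); the source switch floods first.
    in_flight = [(source, n) for n in sorted(neighbors[source])]
    history = [len(in_flight)]
    for _ in range(hops):
        next_round = []
        for sender, receiver in in_flight:
            # Flood on every port except the one the copy arrived on.
            for n in sorted(neighbors[receiver]):
                if n != sender:
                    next_round.append((receiver, n))
        in_flight = next_round
        history.append(len(in_flight))
    return history


tree = [("A", "B"), ("A", "C")]                      # no loop
loop = [("A", "B"), ("B", "C"), ("A", "C")]          # one loop
mesh = loop + [("A", "D"), ("B", "D"), ("C", "D")]   # many loops

for name, links in (("tree", tree), ("loop", loop), ("mesh", mesh)):
    print(name, flood(links, "A", 4))
```

In the tree the copies die out after the first hop. In the single loop they circulate indefinitely, and in the mesh their number doubles with every hop.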

As shown below, A/B switches could be used to switch paths in and out when a fault is detected:

However, this configuration does not yield full redundancy since communications will be disrupted if an A/B switch fails.

Also, the use of A/B switches decreases the reliability of the overall network.

Alternate approach: Use two-port NICs and switches that support the Spanning Tree Algorithm and Protocol (IEEE 802.1d, first issued in May 1990).

Using the Spanning Tree Algorithm, switches "learn" the multiple paths, and then eliminate them by disabling switch ports to form a tree (spanning tree).

Then if a failure is detected, an alternate path is enabled.
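The effect of the spanning tree approach can be sketched as follows. This is a centralized illustration only: the actual IEEE 802.1d protocol is distributed and exchanges configuration messages between switches, whereas this sketch simply assumes the root is the switch with the lowest ID and that all links have equal cost. The switch names S1 to S3 are hypothetical.

```python
from collections import deque


def spanning_tree(switches, links):
    """Return the set of links left active; every other link is blocked.

    Simplification: root = lowest switch ID, equal link costs, so a
    breadth-first search from the root yields a shortest-hop tree.
    """
    neighbors = {s: set() for s in switches}
    for a, b in links:
        if a in neighbors and b in neighbors:  # skip links to failed switches
            neighbors[a].add(b)
            neighbors[b].add(a)
    root = min(switches)
    active, seen, queue = set(), {root}, deque([root])
    while queue:
        s = queue.popleft()
        for n in sorted(neighbors[s]):
            if n not in seen:
                seen.add(n)
                active.add(frozenset((s, n)))
                queue.append(n)
    return active


links = [("S1", "S2"), ("S1", "S3"), ("S2", "S3")]  # three switches in a loop

normal = spanning_tree({"S1", "S2", "S3"}, links)
print("active links, all switches up:", sorted(sorted(l) for l in normal))

# S1 fails: recomputing the tree enables the previously blocked S2-S3 link.
after_failure = spanning_tree({"S2", "S3"}, links)
print("active links, S1 failed:      ", sorted(sorted(l) for l in after_failure))
```

With all three switches up, the S2-S3 link is blocked so that no loop is active. When S1 is removed, the recomputed tree brings that link back into service.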

A two-level three-switch configuration is illustrated below:

This configuration provides full redundancy from node-to-node (a NIC is viewed as being part of the node).

The switches and multi-port NICs are available from vendors.

Rules for Achieving Full Redundancy

Use a hierarchical (tree) topology with

Level 1 = backbone switch

Level 2 = switching hubs

Level 3 = switching hubs

...

End nodes may be connected to switches at any level. Each switch in Levels 2, 3, … is connected to a switch in the level above.

This results in a standard tree configuration with no redundancy.

Rules for achieving full redundancy:

1. Every critical node with a two-port NIC is connected to two separate switches in the same level.
2. All switches in Level 2 are connected together to form a string.
3. Every switch in a level below Level 2 (Level 3, 4, …) is connected to two separate switches in the level above.

In this configuration, the backbone switch is not duplicated. If it fails, the string connection of switches in Level 2 takes over.

This may be acceptable for temporary operation until the backbone switch is replaced.
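Whether a given topology satisfies the three rules can be checked by brute force: remove each switch and each link in turn and test whether the critical nodes can still reach each other. The sketch below assumes a small hypothetical network (backbone B, Level 2 switches S2a to S2c, Level 3 switches S3a and S3b, critical nodes N1 and N2) and assumes that end nodes do not forward frames between their two NIC ports.

```python
from collections import deque
from itertools import combinations


def connected(nodes, links, a, b, end_nodes):
    """True if a can reach b over the surviving nodes and links."""
    adj = {n: set() for n in nodes}
    for u, v in links:
        if u in adj and v in adj:
            adj[u].add(v)
            adj[v].add(u)
    seen, queue = {a}, deque([a])
    while queue:
        for n in adj[queue.popleft()]:
            if n not in seen:
                seen.add(n)
                if n not in end_nodes:  # end nodes do not relay frames
                    queue.append(n)
    return b in seen


def disconnecting_failures(switches, end_nodes, links):
    """List every single switch or link failure that separates two end nodes."""
    everything = set(switches) | set(end_nodes)
    pairs = list(combinations(end_nodes, 2))
    bad = []
    for sw in switches:
        alive = everything - {sw}
        if any(not connected(alive, links, a, b, end_nodes) for a, b in pairs):
            bad.append(("switch", sw))
    for link in links:
        rest = [l for l in links if l != link]
        if any(not connected(everything, rest, a, b, end_nodes) for a, b in pairs):
            bad.append(("link", link))
    return bad


switches = ["B", "S2a", "S2b", "S2c", "S3a", "S3b"]
end_nodes = ["N1", "N2"]
tree_links = [("B", "S2a"), ("B", "S2b"), ("B", "S2c"),
              ("S2a", "S3a"), ("S2c", "S3b"),
              ("S3a", "N1"), ("S2c", "N2")]
redundant_links = tree_links + [
    ("S3b", "N1"), ("S2a", "N2"),      # rule 1: second NIC port, same level
    ("S2a", "S2b"), ("S2b", "S2c"),    # rule 2: Level 2 string
    ("S2b", "S3a"), ("S2b", "S3b"),    # rule 3: second uplink for Level 3
]

print("plain tree:", disconnecting_failures(switches, end_nodes, tree_links))
print("with rules:", disconnecting_failures(switches, end_nodes, redundant_links))
```

For the plain tree the check reports every switch and link on the single path between N1 and N2. With the links added by the three rules it reports none, including for failure of the backbone switch.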

There are many possible redundant configurations.

For example, we could add an alternate backbone switch, with every switch in Level 2 connected to both the primary and alternate backbones.

In this case, the switches in Level 2 do not have to be connected together to form a string.

We are now studying various redundant configurations, with the goal of determining configurations that are "optimal" in the sense of maximizing robustness to link or switch failures.

Redundancy with Repeaters

An IEEE standard for redundancy with repeaters was released in May 1997: 802.12 Supplements to Demand Priority Access Method, Physical Layer and Repeater Specification.

It is based on using dual uplinks on repeaters that are connected to separate repeaters in the next level. End nodes are equipped with a second NIC to which a redundant link is attached. The manner in which reception of duplicate packets is resolved is not part of the standard.

The standard does not appear to have been implemented yet.

Repeater-based redundancy may be used to realize the device network below the input/output controllers.
