# PUTTING SWITCHED FABRIC TO WORK FOR SOFTWARE RADIO

Rodger H. Hosking (Pentek, Inc., Upper Saddle River, NJ, USA, rodger@pentek.com)

# ABSTRACT

The most difficult problem for designers of high-performance software radio systems is simply moving data within the system, because of data throughput limitations. Driving this dilemma are processors with higher clock rates and wider buses, data converter products with higher sampling rates, more complex digital communication standards with increased bandwidths, disk storage devices with faster I/O rates, FPGAs and DSPs offering incredible computational rates, and system connections and network links operating at higher speeds.

Traditional system architectures relying on buses and parallel connections between system boards and mezzanines fall far short of delivering the required peak rates, and suffer even worse if they must be shared and arbitrated. New strategies for solving these problems exploit gigabit serial links and switched fabric standards to create significantly more powerful architectures ideally suited for embedded software radio systems.

## 1. INTRODUCTION

Software radio systems continually benefit from technology developed for consumer electronics, personal computers, IT infrastructure, and telecom systems. These very competitive markets place high value on price, features, and performance delivered to the customer and care little about details of hardware "under the hood."

To fuel these markets, silicon vendors have developed higher density processors, memories, peripheral interfaces, and multi-media interfaces. The most effective way to connect these functions has shifted strongly towards gigabit serial links, so these new devices come fully equipped with native gigabit serial interfaces.

Abundant evidence of this transition can be found in mass market PCs, which now use PCI Express for motherboard traffic and expansion cards, and serial ATA disk drives for mass storage. These serial interfaces require far fewer signal traces than the traditional parallel PCI buses they replace. This reduces the density of printed circuit boards and results in smaller diameter cables with more compact connectors. At the same time, data rates through these new serial links are faster than their parallel predecessors. In order to take advantage of the wealth of high-volume, low-cost devices for mass-market electronics, and to reap the same benefits of easier connectivity, even the most powerful high-end software radio RISC and DSP processors from Freescale and Texas Instruments are now sporting gigabit serial interfaces.

## 2. GIGABIT SERIAL STANDARDS

The descriptive phrase "gigabit serial" covers a truly diverse range of implementations and application spaces. Figure 1 shows most of the popular standards used in embedded systems suitable for software radio, along with how each standard is normally deployed in a system.

| Standard         | Main Application        |
|------------------|-------------------------|
| Gigabit Ethernet | Computer Networking     |
| FibreChannel     | Data Storage            |
| PCI Express      | Peripheral Interconnect |
| Serial RapidIO   | System Interconnect     |
| Aurora           | Streaming Data          |
| Serial ATA       | Data Storage            |
| Infiniband       | System Interconnect     |
| Hypertransport   | Peripheral Interconnect |

Figure 1. Popular Gigabit Serial Standards

All of these standards are endowed with numerous sub-specifications that define the physical layer, the cable medium, data rates, and system topologies. The venerable Ethernet heads the list as the ubiquitous networking link, originally operating at 10 Mbit/sec, but now commonly running at 1 Gbit/sec (1 GigE). Next generation 10 Gbit/sec devices are already appearing with 100 Gbit/sec not far behind.

Within the last decade, high performance hard disk drives for real-time data storage shifted from parallel SCSI interfaces to serial FibreChannel links running at 1 or 2 Gbit/sec, over both copper and optical cable. Now 4 Gbit/sec interfaces and drives are becoming common, delivering peak data rates of up to 400 MB/sec.

As mentioned earlier, PCI Express (PCIe) evolved to replace the parallel PCI bus for motherboard peripherals and expansion slots in personal computers. Current per-lane bit rates of 2.5 Gbit/sec will be extended to 5 and 10 Gbit/sec as the supporting technology evolves. Since PCIe delivers only point-to-point interconnects, a routable derivative protocol called Advanced Switching Interconnect is being developed.

Serial RapidIO emerged as a solution targeted for interconnecting components and boards in real-time embedded systems. Unlike some of the other standards, Serial RapidIO offers low latency and deterministic behavior, essential in applications like software radio.

Aurora is a link-layer protocol developed by Xilinx to support efficient point-to-point serial connectivity for streaming data between FPGAs. In software radio applications, Aurora is ideally suited for raw data streams from A/D converters requiring maximum throughput with low overhead and an extremely lightweight protocol.

## 3. FPGA TECHNOLOGY

Following the advancing trend towards gigabit serial interconnects, Xilinx and Altera have incorporated increasing support features in their recent device families. As shown in Figure 2, the Xilinx Virtex-II Pro was the first device to offer RocketIO gigabit serial transceivers.

|                   | Virtex-II Pro<br>XC2VP50 | Virtex-4<br>XC4VFX100 | Virtex-5<br>XC5VLX220T |
|-------------------|--------------------------|-----------------------|------------------------|
| Logic Cells       | 53,136                   | 94,896                | 221,184                |
| Block RAM (bits)  | 4,176k                   | 6,768k                | 7,632k                 |
| Max I/O User Pins | 852                      | 768                   | 680                    |
| Multipliers       | 232                      | 160                   | 128                    |
| 405 Power PCs     | 2                        | 2                     | -                      |
| Rocket I/O Serial | 16                       | 20                    | 16                     |
| Gbit ENET Ports   | -                        | 4                     | 4                      |
| PCI Express Ports | -                        | -                     | 1                      |

Figure 2. Evolving Gigabit Serial Features of Xilinx FPGAs

The RocketIO transceivers include the low level electrical interface, the SERDES (serializer and de-serializer), and an 8B/10B encoding engine that delivers clock and data over a single differential pair of copper lines. This interface constitutes the underlying physical and transport layers common to most of the popular gigabit serial standards, including Aurora, PCI Express, Serial RapidIO, Infiniband, and Hypertransport.

Protocol engines for specific standards can be configured using FPGA logic so that FPGAs can adapt to different protocols as required. They interface to the SERDES and correctly process protocol-specific packets, header information, control functions, error detection and correction, and payload data format. This strategy makes FPGA-based XMC modules truly "fabric agnostic" and allows one hardware design to be deployed in several different fabric environments.

This flexibility in using one hardware product to cover several different protocols encourages board vendors to develop FPGA-based products for the general market. It also affords system integrators the luxury of not having to commit to any particular standard when selecting boards for their systems.

Since gigabit serial interfaces on FPGAs were so well received, FPGA vendors took the next step by adding further levels of integration to support the most popular gigabit serial protocol: gigabit Ethernet. The Xilinx Virtex-4 incorporates four 1 GigE MACs (media access controllers) connected to RocketIO electrical transceivers. These MACs offload a significant amount of low level protocol processing to save FPGA resources for more worthy tasks.

In its latest Virtex-5 family, Xilinx offers RocketIO GTP transceivers with bit rates up to 3.125 Gbit/sec. Altera offers Stratix-II GX multi-gigabit transceivers with bit rates up to 6.375 Gbit/sec.

The new Xilinx Virtex-5 LXT devices advance the technology even further by including a built-in PCI Express end point engine. This saves FPGA resources for other tasks and offers a standardized internal interface for sending and receiving data.

## 4. EMBEDDED SYSTEM STANDARDS

Standardization of gigabit serial fabric protocols and silicon devices with available interfaces fostered the development of standards suitable for deploying them in real-time embedded systems for software radio.

### 4.1 XMC – Switched Fabric for PMC

The hallmark of any successful standard is that it continues to evolve with technology, and none offers a better example than XMC. Figure 4 shows the current listing and status of the many standards defining XMC under VITA 42.

| Std   | Description                              | Status   |
|-------|------------------------------------------|----------|
| 42.0  | XMC Base Specification                   | Released |
| 42.1  | Parallel RapidIO Protocol Layer Standard | Draft    |
| 42.2  | Serial RapidIO Protocol Layer Standard   | Draft    |
| 42.3  | PCI Express Protocol Layer Standard      | Draft    |
| 42.4  | HyperTransport Protocol Layer Standard   | Draft    |
| 42.5  | Aurora Pin Assignment on VITA 42         | Draft    |
| 42.10 | General Purpose I/O Standard             | Draft    |

Figure 4. VITA 42 XMC Standards

The VITA 42.0 base specification includes general information, reference and inheritance documentation, dimensional specifications, connectors, pin numbering and primary allocation of pairing and grouping of pin functions. This document is still designated as a draft document, but it was released for trial use.

The VITA 42.0 base specification does not dictate signal types, data rates, protocols, voltage levels or grouping for these signals. Instead, it wisely leaves that up to the several sub-specifications that follow, allowing XMCs to evolve as new standards emerge.

XMCs can be single- or double-wide modules that use high-performance connectors with full-duplex gigabit serial differential pairs. A single-width XMC can have one or two connectors, while a double-width XMC can have up to four connectors. Each connector can support data transfer rates up to 2.5 Gbytes/sec using 3.125 Gbit/sec signals.

XMC modules can be installed on a wide range of carrier boards including VME, VPX, cPCI and PCI. The clear benefit here is that by following these definitions, XMC and carrier board designers can achieve a much wider range of inter-operability, the essential goal of industry standards.

### 4.2 VXS – Switched Fabric for VMEbus

In January 2003, Motorola unveiled its two-fold VME Renaissance initiative for extending both function and performance of the venerable and widely supported VMEbus. In addition to 2eSST (two-edge source synchronous transfer) technology which boosts data transfer speeds up to 320 MB/sec across the parallel backplane bus, the VME Renaissance also introduced VXS, a new switched fabric backplane strategy exploiting the emerging gigabit serial link standards.

| Std   | Description                                            | Status   |
|-------|--------------------------------------------------------|----------|
| 41.0  | VXS - VMEbus Switched Serial Base Spec                 | Released |
| 41.1  | 4X InfiniBand™ Protocol Layer Standard                 | Released |
| 41.2  | 4X Serial RapidIO™ Protocol Layer Standard             | Released |
| 41.3  | 1000 Mb/s Ethernet Protocol Layer Standard             | Draft    |
| 41.4  | 4X PCI Express Protocol Layer Standard                 | Draft    |
| 41.5  | Aurora Protocol Layer Standard                         | Draft    |
| 41.6  | 1X Gigabit Ethernet Control Layer Standard             | Draft    |
| 41.7  | Processor Mesh                                         | Draft    |
| 41.10 | Live Insertion System Requirements                     | Draft    |
| 41.11 | Payload Rear Transition Module Standard                | Draft    |

Figure 3. VITA 41 VXS Standards

Defined under the VITA 41 Specification, VXS also includes many sub-specifications as shown in Figure 3. The first three have been released for use, and further definition of the remaining sub-specifications is underway in working groups in the VSO (VITA Standards Organization). As with VITA 42, the base specification 41.0 defines electrical and mechanical structures completely independent of any protocol definitions, which are then defined in the sub-specifications.

VXS is fully backwards compatible with VME64X. It adds a new seven-row MultiGig RT2 connector, located between the P1 and P2 connectors at the backplane interface, that accommodates two VXS links. Each VXS link operates in full-duplex mode for simultaneous transfers in each direction, and gangs four serial lines together to increase speed. Serial bit rates are defined up to a maximum of 10 Gbit/sec, although lower rates are supported for initial systems. With the 4X ganging, 8B/10B encoding, and a nominal bit rate of 3.125 Gbit/sec, the dual 4X links can move data between system cards at 2.5 GBytes/sec in each direction.
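The 2.5 GBytes/sec figure follows from the lane count, the bit rate, and the 8B/10B encoding described in Section 3. The following is a minimal sketch of that arithmetic. It assumes the payload efficiency is exactly 8/10 and ignores packet and protocol overhead; the function name `payload_bytes_per_sec` is hypothetical and used only for illustration.

```python
def payload_bytes_per_sec(bit_rate: float, lanes: int, links: int = 1) -> float:
    """Usable one-direction payload rate for ganged 8B/10B serial lanes.

    Assumption: only 8B/10B line coding is accounted for; packet headers
    and other protocol overhead are ignored.
    """
    encoding_efficiency = 8 / 10          # 8 payload bits per 10 line bits
    payload_bits = bit_rate * lanes * links * encoding_efficiency
    return payload_bits / 8               # bits -> bytes


# Dual 4X VXS links at a nominal 3.125 Gbit/sec per lane
dual_4x = payload_bytes_per_sec(3.125e9, lanes=4, links=2)
# A single 4X link at the same bit rate
single_4x = payload_bytes_per_sec(3.125e9, lanes=4)

print(f"dual 4X:   {dual_4x / 1e9:.2f} GBytes/sec")
print(f"single 4X: {single_4x / 1e9:.2f} GBytes/sec")
```

The same calculation applied to one 4X link gives half the dual-link rate, which is why per-link and per-connector figures quoted for the same bit rate differ by the number of lanes involved.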

VXS defines two types of system cards. VXS payload cards are processor, CPU, memory, and data converter 6U VMEbus cards with one MultiGig RT2 connector for the VXS interface. VXS switch cards also use the 6U VME board form factor but, unlike VXS payload cards, the P1 and P2 connectors are removed. In their place are multiple MultiGig RT2 connectors supporting up to eighteen 4X full-duplex serial links plus a power connector. The VXS switch card implements the crossbar switching function to connect payload cards together.

VXS requires a new backplane incorporating these new MultiGig RT2 connectors. The VXS backplane can take on many different layouts to accommodate specialized system needs, but will normally feature two to twenty payload card slots and one or more switch card slots. The standard board-to-board pitch of 0.8 inches is maintained throughout for compatibility with existing VMEbus card cage mechanical hardware (card guides, frames, etc).

The objective of the backplane is to connect the VXS links of each payload card to links on the switch cards to support the necessary board-to-board connectivity. Some smaller systems may require only a few payload slots and a very simple switch card, while others may need to use a full width backplane and multiple switch cards to handle the required traffic.

Many VXS-based systems designate the VMEbus as a control plane with all high-speed transfers moving across gigabit links. The VITA 41.6 specification seeks to replace this control plane function with a separate gigabit Ethernet port using extra pins defined on the RT2 connector. This concept extends well into some versions of the next generation architecture, VITA 46 or VPX, which replaces the VMEbus connectors with additional gigabit serial connectors for even higher board-to-board connectivity.

## 5. PRODUCT EXAMPLES

### 5.1 XMC Software Radio Transceiver

Figure 5 below shows an XMC product for software radio applications that takes maximum advantage of FPGA resources. The Xilinx Virtex-II Pro FPGA device in the center provides interfaces to the many A/D and D/A converters, timing and memory resources. The VP50 FPGA offers a generous 232 hardware multipliers to maximize performance of algorithms critical to software radio like digital down conversion, analysis, energy detection, and demodulation.



Figure 5. Model 7140 Software Radio Transceiver PMC / XMC Mezzanine Module

Two 14-bit 105 MHz A/D converters and two 16-bit 500 MHz D/A converters provide direct connections to baseband and IF interfaces to communications receivers and transmitters, supporting a symmetrical two-channel transceiver application. Also on board is a four-channel digital down converter to bring specific IF signals down to baseband for processing, storage or forwarding. A digital upconverter channel translates baseband digital output signals up to IF frequencies for analog conversion by the D/A converters. Input and output bandwidths of up to 40 MHz are supported at IF frequencies up to 140 MHz.
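To make the down conversion step concrete, the sketch below mixes a real IF signal with a complex local oscillator and then averages and decimates the result. It is an illustration only, not the module's implementation: the boxcar average stands in for the filtering a production down converter would use, the sample rate and IF are arbitrary round numbers chosen so the example is easy to check, and the function name `down_convert` is hypothetical.

```python
import cmath
import math


def down_convert(samples, f_if, f_s, decimation):
    """Mix real IF samples to complex baseband, then average and decimate."""
    baseband = []
    for start in range(0, len(samples) - decimation + 1, decimation):
        acc = 0j
        for n in range(start, start + decimation):
            # Local oscillator sample: e^{-j 2 pi f_if n / f_s}
            lo = cmath.exp(-2j * math.pi * f_if * n / f_s)
            acc += samples[n] * lo
        # Boxcar average: a crude low-pass filter applied before decimation
        baseband.append(acc / decimation)
    return baseband


# Illustrative values (assumptions): 100 MHz sampling, 25 MHz IF tone
f_s, f_if = 100e6, 25e6
tone = [math.cos(2 * math.pi * f_if * n / f_s) for n in range(64)]

out = down_convert(tone, f_if, f_s, decimation=8)
# A unit-amplitude real tone at the IF lands at DC with magnitude close to 0.5
print(len(out), [round(abs(v), 3) for v in out])
```

With these values the image term at twice the IF completes a whole number of cycles in each averaging window, so it cancels and only the baseband component remains.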

Three banks of 256 MB SDRAM are used for data buffering of transients captured by the A/Ds, as a delay memory, and for storing waveforms for generating analog outputs by the D/A converters. Engines built into the FPGA and controlled through the software drivers and libraries support all of these functions.

A PCI 2.2 compliant PCIbus interface contains a nine-channel DMA controller for moving data to and from the A/Ds, D/As, digital upconverter, digital downconverter, and the memory banks. DMA chaining simplifies applications by automating complex data movement operations, all performed by hardware within the module.

Lastly, this powerful software radio module is equipped with an XMC interface utilizing the RocketIO gigabit serial transceivers on the VP50 FPGA. Each of the two 4X links can send and receive data at 1.25 Gbytes/sec through the XMC interface of the carrier board. Protocol engines for PCI Express, Serial RapidIO, or Aurora can be installed as IP to match the carrier board requirements.

### 5.2 VXS Processor Board for XMC Modules

Figure 6 below shows a simplified block diagram of a VXS processor board that takes full advantage of gigabit serial technology for interconnecting every major board resource. It is directly compatible with the Model 7140 XMC mezzanine module just described.

The Freescale MPC8641D processor is a dual core AltiVec device, a very popular platform for software radio and SCA framework offerings. Two 1 GB SDRAM banks support each processor core with dedicated memory. Dual gigabit Ethernet ports are connected through a physical layer interface to front panel 1000base-T connectors.

The MPC8641D has two additional gigabit serial ports for high-performance I/O. One port is a 4X or 8X PCI Express (PCIe) root complex or endpoint port capable of operating at bit rates up to 2.5 Gbit/sec. This port is connected to a PCIe-to-Dual PCI-X bridge supporting two local PCI-X buses.



Figure 6. Model 4207 VXS Processor with Gigabit Serial Switch, FPGA Co-Processor, 1 Gig Ethernet, Fibre Channel, Optical Serial and XMC I/O

The second gigabit serial port can be configured as a second 8X PCIe port or a 4X Serial RapidIO (SRIO) port. This migration from parallel interfaces to gigabit serial ports as the primary system interface for this leading edge processor gives ample evidence of this trend in embedded systems.

Perhaps the most significant resource on the board is the gigabit serial crossbar switch. Because the switch operates strictly on the physical layer, it is completely transparent to the protocol of the traffic it handles. The switch can join any two ports as source and destination, with multiple transfers flowing through the switch simultaneously. The processor configures the switch, allowing extremely flexible assignment of data flow patterns for a wide range of applications.
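The behavior described above can be pictured with a toy software model. The sketch below is not the board's register interface or driver API; the class `CrossbarSwitch`, its methods, and the port names are all hypothetical. It only illustrates the two properties stated in the text: routes are set up by a controlling processor, and traffic is forwarded without being inspected.

```python
class CrossbarSwitch:
    """Toy model of a protocol-transparent crossbar: it only maps ports."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.routes = {}                  # source port -> destination port

    def connect(self, src, dst):
        """Join a source port to a destination port."""
        if src not in self.ports or dst not in self.ports:
            raise ValueError(f"unknown port: {src!r} or {dst!r}")
        # Refuse to drive one destination from two different sources
        if dst in self.routes.values() and self.routes.get(src) != dst:
            raise ValueError(f"destination {dst!r} is already driven")
        self.routes[src] = dst

    def disconnect(self, src):
        self.routes.pop(src, None)

    def forward(self, src, frame):
        # Physical-layer switching: the frame bytes are never examined,
        # so any protocol can pass through unchanged.
        return self.routes[src], frame


# Hypothetical port names, for illustration only
switch = CrossbarSwitch(["cpu", "fpga", "xmc1", "xmc2", "vxs"])
switch.connect("xmc1", "fpga")            # mezzanine data to the FPGA
switch.connect("xmc2", "vxs")             # a second, simultaneous path

print(switch.forward("xmc1", b"\x01\x02"))
print(switch.forward("xmc2", b"\xaa\xbb"))
```

Because the model never looks inside `frame`, changing the protocol carried on a path requires no change to the switch, only to the endpoints.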

The VXS interface connects to two 4X ports of the crossbar switch providing direct access to and from all resources on the board. Two XMC mezzanine sites are equipped with PCI-X interfaces to support PMC modules and with dual 4X gigabit serial links on the XMC connectors joined to ports on the crossbar switch. This supports extremely high-speed I/O transfers between the mezzanine modules and the processor or VXS backplane.

For high-performance signal processing functions, a Virtex-4 FX60 or FX100 FPGA features a PCI-X interface and 2 Gbytes of SDRAM. Its built-in memory controller and arbitration logic, with write-posting and read-prefetch, allow dual access to the SDRAM from PCI-X and from user applications inside the FPGA. The RocketIO gigabit transceivers implement dual 4X links to the crossbar switch so that the FPGA can send and receive traffic using various protocols installed as IP cores.

A dual FibreChannel controller connects to the upper PCI-X bus for convenient access to processor memory or FPGA memory. FibreChannel data is transferred through dual 1X 4 Gbit/sec links on the crossbar switch.

Dual optical high-speed serial ports with front panel connectors are joined to crossbar switch ports. This allows a switch connection to the FibreChannel controller that supports peak rates of up to 800 MB/sec to fast hard disk or tape drives. The native PCIe and SRIO interfaces of the MPC8641D immediately qualify the board for use in VXS systems using either one of those protocols. XMC modules sporting a specific gigabit serial protocol can be routed directly through the crossbar switch to the VXS backplane connector to make that protocol available to other system boards using the same protocol.

High performance real-time recording systems can take advantage of the FibreChannel interface with PCI-X links to data buffers formed in either FPGA SDRAM or processor SDRAM. In these systems, the FibreChannel DMA controller can read and write data in these buffers to and from the external hard disk. Buffer data can be filled from or delivered to the VXS interface, PMC modules, XMC modules, or through algorithms executing in the FPGA or the processor.

By installing two Model 7140 PMC/XMC modules on the Model 4207 processor board, a complete four-channel software radio transceiver system is created. Data in and out of the four A/Ds and four D/As can be routed through the MPC8641D processor, through the FX60 FPGA, through the FibreChannel interface to hard disk, or through the VXS backplane port. Each scenario is achieved by appropriately configuring the crossbar switch to enable the desired paths. Since the processor controls the crossbar switch, the signal flow architecture can be changed during runtime to track application requirements.

## 6. SUMMARY

The wealth of resources on these two product examples, together with the ability to configure high-speed data links between them, shows how gigabit serial links can be exploited for real-time software radio applications.

Software drivers for software radio applications are evolving to take advantage of these new links through high-level drivers that help manage the data transfers. By eliminating system bottlenecks, gigabit serial links can boost performance and open up new applications previously unattainable with earlier technology.