

### Understanding FPGA Tradeoffs for Software Radio Applications

Rodger Hosking, Pentek, Inc.

#### Software Defined Radio Forum Conference Orlando • November 2003





Trends with time

# **Traditional Programmable Logic**

- Used primarily to replace discrete digital hardware circuitry for:
  - Control logic
  - Glue logic
  - Registers and gates
  - State machines
  - Counters and dividers



- Devices were selected by hardware engineers
- Programmed functions were seldom changed after the design went into production

# FPGAs - New Device Technology

- On-chip processor cores
- Internal clock rates up to 600 MHz
- Reduced power with core voltages approaching 1 volt
- Dedicated on-chip hardware multipliers
- Memory densities of over 10 million bits
- Flexible memory structures
- Logic densities of over 10M gates
- Silicon geometries near 0.1 microns
- High-density BGA and flip-chip packaging
- On-board gigabit serial interfaces
- Over 1200 user I/O pins
- Numerous configurable interface standards



# FPGAs - Enhanced Development Tools

- High Level Design Tools
  - Block Diagram System Generators
  - Schematic Processors
  - High-level language compilers for VHDL & Verilog



- Advanced simulation tools for modeling speed, propagation delays, skew and board layout
- Faster compilers and simulators save time
- Graphically-oriented debugging tools
- IP (Intellectual Property) Cores
  - FPGA vendors offer both free and licensed cores
  - FPGA vendors promote third party core vendors
  - Wide range of IP cores available

# FPGAs: Key Benefits for Software Radio

- Parallel Processing
- Hardware Multipliers for DSP
  - FPGAs can now have over 500 hardware multipliers
- Flexible Memory Structures
  - Dual port RAM, FIFOs, shift registers, look up tables, etc.
- Parallel and Pipelined Data Flow
  - Systolic simultaneous data movement
- Flexible I/O
  - Supports a variety of devices, buses and interface standards
- High Speed
- Available IP cores optimized for special functions







Proceeding of the SDR 03 Technical Conference and Product Exposition. Copyright © 2003 SDR Forum. All Rights Reserved


# Software Radio Modular Products

- Mezzanine Module
  - Input Interface (A/D or digital interface)
  - Digital Receiver ASIC (optional)
  - FPGA
- Processor Board
  - Bi-directional FIFO Buffer
  - DSP or RISC Processor





### **16-Channel Narrowband Receiver & A/D**

| Model | Chip   | A/D      | Max Fs  | Max BW | FPGA     | Form |
|-------|--------|----------|---------|--------|----------|------|
| 7131  | GC4016 | 6645-105 | 105 MHz | 10 MHz | XC2V3000 | PMC  |

- Transformer coupled input for IF sampling
- Device drivers for VxWorks
- Suitable for low cost generic platforms








### 2-Channel Wideband Receiver & A/D



#### **Complete Receiver System for VME**

Quad A/D & Receiver, Dual FPGA, and Quad PowerPC





| Feature              | Virtex-E XCV300E | Virtex-E XCV600E | Virtex-II XC2V1000 | Virtex-II XC2V3000 | Virtex-II Pro XC2VP30 | Virtex-II Pro XC2VP40 | Virtex-II Pro XC2VP50 |
|----------------------|------------------|------------------|--------------------|--------------------|-----------------------|-----------------------|-----------------------|
| Logic Cells          | 6,912            | 15,552           | 11,520             | 32,256             | 30,816                | 43,632                | 53,236                |
| System Gates         | 412k             | 986k             | 1000k              | 3000k              |                       |                       |                       |
| Max Block RAM (bits) | 131k             | 295k             | 720k               | 1,728k             | 2,448k                | 3,456k                | 4,176k                |
| Max User I/O Pins    | 316              | 512              | 432                | 720                | 644                   | 804                   | 852                   |
| 18 x 18 Multipliers  | -                | -                | 40                 | 96                 | 136                   | 192                   | 232                   |
| PowerPC Cores        | -                | -                | -                  | -                  | 2                     | 2                     | 2                     |
| Gbit Transceivers    | -                | -                | -                  | -                  | 8                     | 12                    | 16                    |

### **Three Software Radio FPGA Strategies**

- FPGA Design Kit
  - Allows custom extensions to standard modules
  - Includes VHDL factory configuration source code
  - User block in data path simplifies development
- IP Core Libraries
  - High-Performance Software Radio Algorithms
  - Highly Optimized for Specific FPGAs
  - Exploits Parallelism of FPGA Hardware
- Factory Installed Cores
  - Pre-configured and installed IP functions
  - No customer FPGA development required
  - Performance optimized for specific modules









- Allows FPGA design engineers to easily add functions to the standard factory configuration



- Includes VHDL source code for standard functions:
  - Control and status registers
  - A/D and Digital receiver interfaces
  - Mezzanine interfaces
  - Triggering, clocking, sync and gating functions
  - Data packing and formatting
  - Channel selection
  - A/D / Receiver multiplexing
  - Interrupt generation
  - Data tagging and channel ID
  - Default User Block for inserting custom code

# **Typical FPGA Code Modules**

- Simplified view of VHDL source code modules
- User Block modules are configured as default "straight wire" connections







## **Processor Reconfiguration of FPGA**

- FPGA Loader Utility
  - FPGA configuration loader utility executes on processor
  - Supports easy FPGA reconfiguration during runtime for adaptive processing
  - Supports easy FPGA reconfiguration for field upgrades
  - Eliminates need to disassemble system to modify hardware
  - Extends product longevity





### **IP Core Library Functions**

| Core | Description                             |
|------|-----------------------------------------|
| 401  | 1k-point Quad Radix-4 Complex FFT       |
| 403  | 4k-point Single Radix-4 Complex FFT     |
| 404  | 4k-point Quad Radix-4 Complex FFT       |
| 421  | 140 MHz Wideband Digital Down Converter |
| 422  | 280 MHz Wideband Digital Down Converter |
| 440  | Radar Pulse Compressor                  |

- Suitable for any Xilinx FPGA Platform
- Licensing based on Xilinx SignOnce<sup>™</sup> Project License
  - Customers use a common, pre-approved standardized license
  - Streamlines legal process and simplifies ordering
  - Core may be incorporated into any single project or product
  - No limit on the number of licensed products produced

# **Quad Parallel Pipelined 4k FFT**

- Core 404: 4,096-point parallel pipelined Quad FFT
- Four input & output streams at 25% offset
  - Four input/output points for each input clock
- FFT calculation time:
  - Four FFTs are computed in parallel every 4096 clocks
  - Effective calculation time for each FFT = 4096 clocks / 4 = 1024 clocks
  - 100 MHz Clock Example: 4k FFT Time = 1024 x 10 ns = 10.24 usec





- FFT Calculation Time Depends on FPGA Clock
- FPGA Clock = Data Source Sample Clock



Maximum FPGA clock rates depend on Xilinx speed grade

| Xilinx FPGA<br>Speed Grade | Max<br>Clock | Core 404<br>Quad 4k FFT Time    |
|----------------------------|--------------|---------------------------------|
| -6                         | 140 MHz      | 7.31 usec                       |
| -5                         | 127 MHz      | 8.06 usec                       |
| -4                         | 111 MHz      | 9.23 usec                       |
| (reference)                | 100 MHz      | 10.24 usec                      |

- Comparison to 4k FFT Calculation on Programmable Processors
  - 500 MHz G4 AltiVec PowerPC: 105 usec (VSIPL Library), more than 10X the Core 404 time
  - 300 MHz TMS320C6203 DSP: 212 usec (TI Benchmarks), more than 20X the Core 404 time
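
The timing figures above follow from simple arithmetic, which can be sketched as below. This is only an illustration: the function and variable names are hypothetical, and the speed-up ratios are taken against the 100 MHz reference clock, which is an assumption about how the "10X" and "20X" figures were derived.

```python
FFT_POINTS = 4096        # Core 404 transform length
PARALLEL_STREAMS = 4     # four input/output points per clock


def quad_fft_time_us(clock_mhz):
    """Effective time per 4k FFT in microseconds at the given FPGA clock."""
    effective_clocks = FFT_POINTS / PARALLEL_STREAMS   # 1024 clocks per FFT
    return effective_clocks / clock_mhz                # clocks / MHz = usec


# Maximum clock per Xilinx speed grade, as listed in the table.
speed_grades_mhz = {"-6": 140, "-5": 127, "-4": 111, "(reference)": 100}
for grade, mhz in speed_grades_mhz.items():
    print(f"{grade:>12}  {mhz:3d} MHz  {quad_fft_time_us(mhz):5.2f} usec")

# Published processor times from the slide, compared with the 100 MHz case.
processor_times_us = {
    "500 MHz G4 AltiVec PowerPC": 105.0,
    "300 MHz TMS320C6203 DSP": 212.0,
}
reference_us = quad_fft_time_us(100)
for name, time_us in processor_times_us.items():
    print(f"{name}: {time_us / reference_us:.1f}x the Core 404 time")
```

Because the FPGA clock equals the data source sample clock, a slower A/D clock lengthens the FFT time in direct proportion.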

# Single Stage Digital Receiver

- Operates at input clock rates up to 140 MHz
- Real or complex output, spectrum inversion & offset



- Requires XC2V1500 Xilinx Virtex-II FPGA (or larger)
- Two will fit inside the XC2V3000



## **Dual Stage 2x Digital Receiver**

- Operates at input clock rates up to 280 MHz
- Same performance and features as single stage DDR



- Demultiplexer sends alternate input samples to each stage
- Each stage operates at one half the input clock rate
- Formatter combines two stages into a single output
- Requires XC2V3000 Xilinx Virtex-II FPGA (or larger)
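
One common way to let two half-rate stages process a full-rate stream is polyphase decomposition of a decimating FIR filter. The sketch below illustrates that general idea only; the slides do not describe the internal structure of the core, so the function names and the decimate-by-2 filter are assumptions, and the mixer/NCO portion of the receiver is omitted.

```python
def fir(samples, taps):
    """Direct-form FIR: y[n] = sum_k taps[k] * samples[n - k], zero history."""
    return [
        sum(taps[k] * samples[n - k] for k in range(len(taps)) if n - k >= 0)
        for n in range(len(samples))
    ]


def full_rate_decimate_by_2(samples, taps):
    """Reference path: filter at the full input rate, keep every second output."""
    return fir(samples, taps)[0::2]


def dual_stage_decimate_by_2(samples, taps):
    """Two half-rate stages fed by a demultiplexer and merged by a formatter."""
    # Demultiplexer: alternate input samples go to each stage.
    even_in, odd_in = samples[0::2], samples[1::2]
    # Each stage runs at half the input rate with half of the coefficients.
    stage_a = fir(even_in, taps[0::2])
    stage_b = fir(odd_in, taps[1::2])
    # Formatter: the odd-sample stage contributes with a one-sample delay.
    return [
        stage_a[m] + (stage_b[m - 1] if m >= 1 else 0)
        for m in range(len(stage_a))
    ]


samples = list(range(1, 9))
taps = [1, 2, 3, 4]
print(full_rate_decimate_by_2(samples, taps))
print(dual_stage_decimate_by_2(samples, taps))
```

Both paths produce the same output, but in the second one no multiplier ever has to run faster than half the input clock, which is the property the dual-stage arrangement relies on.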






### **Comparison: ASIC vs. FPGA**

|                             | GC1012B       | IP Core 421       | IP Core 422 |
|-----------------------------|---------------|-------------------|-------------|
| Input Resolution            | 12 bits       | 16 bits           |             |
| Input Format                | Real Only     | Real or Complex   |             |
| Maximum Input Data Rate     | 100 MHz       | 148 MHz           | 296 MHz     |
| Maximum Input Bandwidth     | 50 MHz        | 74 MHz            | 148 MHz     |
| NCO Frequency Resolution    | 28 bits       | 32 bits           |             |
| NCO Phase Offset            | None          | 32 bits           |             |
| NCO Output Resolution       | 12 bits       | 18 bits           |             |
| NCO (SFDR)                  | 75 dB         | 110 dB            |             |
| Mixer Output Resolution     | 13 bits       | 17 bits           |             |
| Number of FIR Filter Sets   | Two           | Four              |             |
| FIR Filter Programmability  | Fixed         | User Programmable |             |
| FIR Coefficient Resolution  | 14 bits       | 18 bits           |             |
| Default 80% Filter Ripple   | ±0.1 dB       | ±0.04 dB          |             |
| Default 80% Image Rejection | 75 dB         | 100 dB            |             |
| Output Resolution           | 10 to 16 bits | 16 or 24 bits     |             |
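
Two rows of the table can be turned into concrete numbers with a short calculation. This sketch assumes that "NCO Frequency Resolution" is the width of the phase accumulator tuning word, so the smallest tuning step is the sample rate divided by 2 to the power of that width, and that the maximum input bandwidth is the Nyquist limit for real input. The function names are hypothetical.

```python
def nco_tuning_step_hz(sample_rate_hz, accumulator_bits):
    """Smallest NCO frequency increment for a given tuning word width."""
    return sample_rate_hz / (1 << accumulator_bits)


def nyquist_bandwidth_hz(sample_rate_hz):
    """Maximum real-input bandwidth: half the input sample rate."""
    return sample_rate_hz / 2


# (device, maximum input rate in Hz, NCO frequency resolution in bits)
devices = [
    ("GC1012B", 100e6, 28),
    ("IP Core 421", 148e6, 32),
]
for name, rate_hz, bits in devices:
    step = nco_tuning_step_hz(rate_hz, bits)
    bandwidth_mhz = nyquist_bandwidth_hz(rate_hz) / 1e6
    print(f"{name}: step = {step:.4f} Hz, bandwidth = {bandwidth_mhz:.0f} MHz")
```

Under these assumptions the 32-bit core tunes in steps of roughly 0.03 Hz, about ten times finer than the 28-bit ASIC at its maximum rate.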



- Operates at data rates up to 150 MHz
- 16-, 20- or 24-bit Resolution
- Block Floating Point arithmetic preserves dynamic range







- Minimum frame spacing (maximum throughput version)
  - 64 point FFT: 1.57 usec
  - 8k point FFT: 198 usec

# Tradeoffs: ASIC vs. FPGA

- When to Use an ASIC Digital Receiver
  - FPGA implementations may cost 5 to 10 times more than ASICs in power and price
  - ASIC designs have a more complete feature set
  - ASIC designs are more thoroughly tested, characterized and documented
  - ASIC filter designs are often optimized with specialized hardware structures that may be difficult to replicate
  - Gain and scaling often consume many hours of optimization



# Tradeoffs: ASIC vs. FPGA

- When to Use an FPGA Digital Receiver
  - Specialized phase and frequency control of the NCO for complex FSK and frequency agile signals
  - Tight delivery schedules
  - Limited volume production
  - Unusual output resampling requirements
  - Custom filtering requirements not met by downloading FIR coefficients into the ASIC



# SW Radio FPGA Design Guidelines

- FPGA benchmarks can be misleading
  - Make sure you allow for data movement both in and out of the FPGA
  - Make sure external devices or interfaces are compatible with the benchmark clock of the FPGA
  - Make sure the calculation accuracy (number of bits of precision) is sufficient
  - Check exception handling: overflows, saturation, and divide by zero
  - Take advantage of bit-true simulation tools
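
The precision and overflow checks above are easy to explore in a small bit-true model before committing to hardware. The following is a minimal sketch, assuming 16-bit two's complement words with 15 fractional bits; the helper names are hypothetical and not taken from any vendor tool.

```python
def saturate(value, bits):
    """Clamp an integer to the signed two's complement range of `bits`."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lowest, min(highest, value))


def wrap(value, bits):
    """Model an adder with no overflow handling: keep only the low `bits`."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def to_fixed(x, frac_bits=15, bits=16):
    """Quantize a real number to a saturated fixed-point integer."""
    return saturate(round(x * (1 << frac_bits)), bits)


a, b = 30000, 10000                  # two in-range 16-bit samples
print(wrap(a + b, 16))               # overflow wraps to a negative value
print(saturate(a + b, 16))           # overflow clamps to full scale
print(to_fixed(0.5), to_fixed(1.0))  # +1.0 is not representable, so it clamps
```

Running real test vectors through a model like this shows whether a benchmark's word length leaves enough headroom, and whether wrapped or saturated overflow is acceptable for the signal in question.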



# SW Radio FPGA Design Guidelines

- Be careful in trying to replace standard ASIC solutions with FPGA designs
  - ASICs usually consume less power than the FPGA counterpart
  - Full characterization of a new FPGA function over all operational modes may consume significant engineering time
  - Minor changes to FPGA designs can often impact performance in unexpected ways
  - Test benches take time to develop but can be useful in validating new changes
  - Software system development tools (like MATLAB) help shorten FPGA VHDL design cycles





- Be careful in trying to replace programmable DSP functions with FPGA designs
  - Memory requirements often grow unexpectedly, especially for upgrades
  - DSPs have native support for large SDRAMs with lower cost and power per bit than FPGA RAM
  - Take advantage of SDRAM controllers now available as IP cores for FPGAs
  - Changes to FPGA code may require a new FPGA pin-out and, therefore, a new PC board!



# **FPGA Benefits for Software Radio**

- Replace ASIC functions with custom features
- Perform algorithms faster and easier than with a DSP
- Handle signal processing tasks not possible with a DSP
- Reduce data rates by pre-processing data at front end
- Take advantage of available IP cores
- Support re-configurable processing & field upgrades
- Avoid product obsolescence and extend product life cycles
- Protect proprietary algorithms and IP inside FPGAs





- For additional information:
  - Pentek: www.pentek.com
  - GateFlow: www.pentek.com/gateflow
  - Xilinx: www.xilinx.com
  - Altera: www.altera.com
  - Mathworks: www.mathworks.com

