UALink: A New Opportunity for FPGA Data Acquisition?

Could UALink, a new high-speed interconnect, find a place in FPGA-based data acquisition systems?
Sometimes components are developed for a specific application but later prove useful in other fields as well. Especially if you cannot design your own chips, it becomes valuable to creatively repurpose existing technologies originally made for different purposes.
For example, TI DSPs, initially designed for wireless communications, found their way into medical ultrasound equipment. GPUs, originally created for gaming, are now widely used for advanced signal processing and imaging.
A current example is UALink, a new high-speed interconnect technology that could potentially play a role in FPGA-based data acquisition systems.
Data Acquisition and Digital Beamforming in Medical Ultrasound
Medical ultrasound systems often use dozens to nearly a hundred high-speed ADC channels, typically 64 to 96 with 12- to 14-bit resolution and sampling rates of around 60 to 80 million samples per second (Msps). This massive data stream is processed in real time by an FPGA that performs digital beamforming. The most common algorithm is delay-and-sum (DAS): each channel is delayed to compensate for its path length to the focal point, and the aligned samples are then summed.
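To make the delay-and-sum idea concrete, here is a minimal sketch for a single focal point. It is receive-only, uses nearest-sample delays, and feeds in ideal impulse echoes. The function names, the 8-element array, the 0.3 mm pitch, and the 1540 m/s speed of sound are all assumptions chosen for illustration. A real FPGA beamformer would typically add apodization, finer delay resolution, and dynamic focusing.

```python
import math

SPEED_OF_SOUND = 1540.0  # m/s; a common soft-tissue assumption, not from the article


def arrival_index(element_x, focus_x, focus_z, fs):
    """Sample index at which an echo leaving the focal point at t = 0 reaches one element."""
    distance = math.hypot(focus_x - element_x, focus_z)
    return round(distance / SPEED_OF_SOUND * fs)


def das_sample(channel_data, element_xs, focus_x, focus_z, fs):
    """Receive-only delay-and-sum for one focal point, using nearest-sample delays."""
    total = 0.0
    for samples, x in zip(channel_data, element_xs):
        index = arrival_index(x, focus_x, focus_z, fs)
        if 0 <= index < len(samples):
            total += samples[index]
    return total


# Hypothetical setup: 8 elements, 0.3 mm pitch, 60 Msps, point scatterer at 20 mm depth.
fs = 60e6
element_xs = [(i - 3.5) * 0.3e-3 for i in range(8)]
channel_data = []
for x in element_xs:
    samples = [0.0] * 1024
    samples[arrival_index(x, 0.0, 20e-3, fs)] = 1.0  # ideal impulse echo
    channel_data.append(samples)

on_focus = das_sample(channel_data, element_xs, 0.0, 20e-3, fs)
off_focus = das_sample(channel_data, element_xs, 5e-3, 20e-3, fs)
print("on focus:", on_focus, "off focus:", off_focus)
```

When the assumed focal point matches the scatterer position, the echoes from all channels line up and add coherently. For a focal point 5 mm to the side, the delays no longer match and the sum drops.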
The beamformer outputs one or several (e.g., 16) beamformed lines in parallel, with each sample having a resolution of about 20 to 22 bits at the same sampling rate. This large data flow is sent via PCI Express (PCIe) to a PC module for further processing.
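The figures above translate into payload rates with simple arithmetic. The sketch below assumes tightly packed samples with no padding to 16- or 32-bit words and no protocol overhead. The helper name `stream_rate_gbytes` is made up for this example.

```python
def stream_rate_gbytes(streams, bits_per_sample, msps):
    """Payload rate in GB/s (1 GB = 1e9 bytes), tightly packed, no overhead."""
    return streams * bits_per_sample * msps * 1e6 / 8 / 1e9


# Raw ADC data entering the FPGA (lower and upper end of the quoted ranges).
adc_low = stream_rate_gbytes(64, 12, 60)
adc_high = stream_rate_gbytes(96, 14, 80)

# Beamformed data leaving the FPGA (a single line versus 16 parallel lines).
bf_single = stream_rate_gbytes(1, 20, 60)
bf_multi = stream_rate_gbytes(16, 22, 80)

print(f"raw ADC input:     {adc_low:.2f} to {adc_high:.2f} GB/s")
print(f"beamformer output: {bf_single:.2f} to {bf_multi:.2f} GB/s")
```

Under these assumptions the raw ADC stream is roughly 5.8 to 13.4 GB/s, while the beamformed output stays between about 0.15 and 3.5 GB/s. This is why beamforming in the FPGA eases the load on the link to the PC.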
Signal Processing: From DSP to GPU
Historically, TI’s C6000-series VLIW DSPs were used for signal processing after the beamformer. Originally targeted at wireless base stations, these DSPs proved excellent for processing medical ultrasound signals such as B-mode images, Pulsed Wave Doppler, and Color Doppler. They also managed scan sequence control and data transport using advanced DMA features.
Later, this processing shifted to GPUs on PC modules, with data transferred via PCIe to GPU memory. This approach offered greater flexibility and eventually became more cost-effective.
Display and Scan Conversion
Because the scan lines from ultrasound transducers do not directly match the pixel grid of displays, a special signal processing step called scan conversion is needed. This converts beamformed data into a format suitable for video display, ensuring correct and smooth image rendering.
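As a rough sketch of scan conversion for a sector scan, the code below maps each display pixel back to polar coordinates (angle and depth) and picks the nearest scan line and sample. The function name, its parameters, and the nearest-neighbour lookup are simplifying assumptions. Production systems usually interpolate between neighbouring lines and samples to get a smoother image.

```python
import math


def scan_convert(lines, angle_min, angle_max, max_depth, width, height):
    """Nearest-neighbour sector scan conversion.

    lines[i][j] is sample j along scan line i; the lines are assumed to be evenly
    spaced between angle_min and angle_max (radians, 0 = straight down).
    """
    n_lines, n_samples = len(lines), len(lines[0])
    half_width = max_depth * math.sin(max(abs(angle_min), abs(angle_max)))
    image = [[0.0] * width for _ in range(height)]
    for row in range(height):
        z = (row + 0.5) / height * max_depth              # depth of pixel centre
        for col in range(width):
            x = ((col + 0.5) / width * 2 - 1) * half_width  # lateral position
            r = math.hypot(x, z)
            theta = math.atan2(x, z)
            if r >= max_depth or theta < angle_min or theta > angle_max:
                continue                                   # pixel lies outside the sector
            line_idx = round((theta - angle_min) / (angle_max - angle_min) * (n_lines - 1))
            sample_idx = min(int(r / max_depth * n_samples), n_samples - 1)
            image[row][col] = lines[line_idx][sample_idx]
    return image


# Hypothetical example: 5 scan lines over a +/-45 degree sector; line i holds the value i.
demo_lines = [[float(i)] * 16 for i in range(5)]
demo_image = scan_convert(demo_lines, -math.pi / 4, math.pi / 4, 1.0, 8, 8)
for image_row in demo_image:
    print(" ".join(f"{value:.0f}" for value in image_row))
```

Pixels outside the sector stay at zero, which gives the familiar fan-shaped ultrasound image on a rectangular display.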
The Need for More Advanced Beamforming
More advanced beamforming techniques exist that can produce even better ultrasound images, but these require significantly more computing power and flexibility. GPUs are better suited for this than FPGAs, as they can more easily support new algorithms.
The ideal scenario is to transport ADC data directly into the memory of a GPU pool. However, this is technically complex and currently expensive.
Bottleneck: Data Transport from ADC to GPU
The biggest challenge is real-time transport of massive ADC data streams into GPU memory. FPGAs mainly use PCIe for this, but PCIe sometimes lacks sufficient speed for the highest data rates. Hence, hybrid solutions are common:
- Real-time digital beamforming in the FPGA, with output sent via PCIe to the GPU.
- Buffering ADC data in the FPGA, then sending it via PCIe to the GPU for advanced beamforming (real-time but only for part of the image).
- Non-real-time processing where large datasets are stored and analyzed later, mainly in research settings.
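A back-of-the-envelope comparison shows why these hybrid options exist. The sketch below compares the raw ADC stream at the upper end of the figures quoted earlier (96 channels, 14 bits, 80 Msps, tightly packed) with the raw PCIe link rate for a few generations and lane counts. The per-lane rates of 8, 16, and 32 GT/s with 128b/130b encoding are the standard values for PCIe Gen3 to Gen5. The helper names are made up for this example, and the link rates are upper bounds, because packet and flow-control overhead reduce the usable throughput further.

```python
PCIE_GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}  # per-lane signalling rate by generation
ENCODING_EFFICIENCY = 128 / 130             # 128b/130b line encoding (Gen3 and later)


def pcie_link_gbytes(gen, lanes):
    """Raw link rate in GB/s, before packet and flow-control overhead."""
    return PCIE_GT_PER_S[gen] * ENCODING_EFFICIENCY / 8 * lanes


def adc_stream_gbytes(channels, bits_per_sample, msps):
    """Tightly packed ADC payload rate in GB/s (1 GB = 1e9 bytes)."""
    return channels * bits_per_sample * msps * 1e6 / 8 / 1e9


raw_adc = adc_stream_gbytes(96, 14, 80)
for gen, lanes in [(3, 8), (3, 16), (4, 16), (5, 16)]:
    link = pcie_link_gbytes(gen, lanes)
    verdict = "exceeds" if link > raw_adc else "is below"
    print(f"Gen{gen} x{lanes}: {link:5.2f} GB/s raw link rate {verdict} "
          f"the {raw_adc:.2f} GB/s ADC stream")
```

Under these assumptions a Gen3 x8 link cannot carry the full raw stream, and a Gen3 x16 link has little headroom once protocol overhead is taken into account. Newer generations leave more margin, provided the FPGA and the host both support them.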
This bottleneck affects not only medical ultrasound but also other fields, such as high-energy physics experiments at CERN, where enormous amounts of data from particle collisions must be processed. Fast data flow is also critical when capturing from many video cameras or driving many monitors with intermediate signal processing.
Heterogeneous Computing: Integrating Multiple Technologies
Bringing together data acquisition hardware (such as ADCs), FPGAs, CPUs, DSPs, and GPUs in one system falls under heterogeneous computing. This field combines specialized processors to achieve optimal performance, and it requires custom system design in which the interfaces and data transport between the components work together seamlessly.
UALink: A Potential Breakthrough?
NVIDIA’s NVLink is a well-known high-speed interconnect for GPUs but is not available for FPGAs. The new open standard UALink, developed by a consortium including AMD and Intel, promises high bandwidth and low latency for AI accelerators and HPC. Although UALink is not yet used in FPGA products, it could become an affordable and efficient way to deliver massive digital data directly to GPUs.
Advances in AI training and HPC drive the availability of such interconnects, opening possibilities for new data acquisition systems where FPGAs and GPUs are connected via UALink, potentially eliminating current data transport bottlenecks.
Conclusion
The combination of high-speed ADCs, FPGAs, and GPUs is essential for modern data acquisition and signal processing, such as in medical ultrasound. The main challenge remains fast transport of raw data to GPUs for advanced processing. Emerging technologies like UALink offer hope for breakthroughs that enable more efficient cooperation between FPGAs and GPUs, leading to more powerful, flexible, and cost-effective data acquisition systems in the near future.
A Few Useful Links:
- UALink Consortium (official website and specifications)
https://ualinkconsortium.org
- Introduction to Heterogeneous Computing (Wikipedia)
https://en.wikipedia.org/wiki/Heterogeneous_computing