Building a Streaming-Ready RaptorQ FEC Core in Chisel: My CSE 228 Project
During my CSE 228 (Agile Hardware Design) course this quarter, I worked on chisel-raptorq, a Chisel-based generator that emits a streaming-compatible, parameterized Forward Error Correction (FEC) codec inspired by the RaptorQ standard. The goal was simple but ambitious: build a hardware IP core capable of protecting real-time video streams over lossy IP networks without depending on retransmissions.
This post summarizes what I built, what I learned, what works today, and where the project can go next.
Code: https://github.com/jyrj/chisel-raptorq
What is RaptorQ — and Why Hardware?
RaptorQ is part of the family of fountain codes, also called rateless erasure codes. From a block of k source symbols, RaptorQ can generate an unlimited number of repair symbols. A receiver needs only k (or slightly more) symbols—any mix of source and repair—to fully recover the original block.
This is ideal for:
- lossy wireless networks
- multicast / broadcast
- high-jitter WAN links
- real-time media applications
Because the sender can keep emitting repair packets until the receiver has enough, fountain codes adapt to changing loss rates better than fixed-rate codes like traditional Reed-Solomon, whose redundancy is fixed when the code is chosen.
Most RaptorQ deployments are software libraries. But for real-time video—where latency, determinism, and throughput matter—a hardware implementation offers several advantages:
- deterministic timing
- consistent per-symbol throughput
- reduced CPU load
- seamless integration with streaming interfaces like AXI-Stream
These motivations formed the backbone of the project.
What chisel-raptorq Does (Today)
Unlike a typical student proof-of-concept, this project delivers a functioning RS + LT codec pipeline with parameterization, verification, and streaming support. RS error correction is the one stage still pending, as noted in the decoder section below. The current implementation includes:
✔ Parameterized Generator Architecture
The codec is driven by a flexible configuration class:
- sourceK
- number of RS parity symbols
- LT repair capacity
- symbol width (GF(2^8) today)
- pipeline structure
- future multi-lane support
The generator emits RTL specialized for any given configuration, verified via ParametersTest.scala.
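As a rough illustration of what such a configuration class can check before any RTL is emitted, here is a minimal plain-Scala sketch. The three field names come from the usage example later in this post; the class name `RaptorFECParametersSketch`, the `symbolBits` field, the derived values, and every `require` check are assumptions made for illustration, not the repository's actual code.

```scala
// Hypothetical sketch: field names follow the post's example,
// the validation rules and derived values are assumptions.
final case class RaptorFECParametersSketch(
    sourceK: Int,            // number of source symbols per block
    numParitySymbolsRS: Int, // RS parity symbols appended to each block
    ltRepairCap: Int,        // upper bound on LT repair symbols per block
    symbolBits: Int = 8      // GF(2^8) symbols
) {
  require(sourceK > 0, "sourceK must be positive")
  require(numParitySymbolsRS > 0, "need at least one RS parity symbol")
  require(ltRepairCap >= 0, "ltRepairCap must be non-negative")
  require(symbolBits == 8, "only GF(2^8) symbols are modeled here")

  // An RS codeword over GF(2^m) can hold at most 2^m - 1 symbols.
  val rsCodewordLen: Int = sourceK + numParitySymbolsRS
  require(rsCodewordLen <= (1 << symbolBits) - 1, "RS codeword too long for the field")

  // With 2t parity symbols, an RS code can correct up to t symbol errors.
  val rsCorrectableErrors: Int = numParitySymbolsRS / 2
}
```

Failing fast in the parameter class means an invalid configuration is rejected at elaboration time rather than showing up as a confusing hardware bug.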
✔ GF(256) Arithmetic Core
A purely combinational GF(256) multiplier powers the RS outer code. Features:
- no lookup tables
- uses the RaptorQ primitive polynomial
- verified against RFC 6330 test vectors
- cycle-accurate Scala model for reference
This is the mathematical backbone of the entire RS stage.
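To make the table-free multiplication concrete, here is a small software model in plain Scala using shift-and-XOR with reduction by x^8 + x^4 + x^3 + x^2 + 1 (0x11D), the field polynomial RFC 6330 specifies. The object name `GF256Model` is hypothetical, and this is a sketch of the technique rather than the project's Chisel module; a combinational hardware version would unroll the same eight steps.

```scala
object GF256Model {
  val FieldPoly = 0x11D // x^8 + x^4 + x^3 + x^2 + 1

  // Multiply two field elements (each in 0..255) without lookup tables.
  def mul(a: Int, b: Int): Int = {
    var x = a
    var y = b
    var acc = 0
    while (y != 0) {
      if ((y & 1) != 0) acc ^= x             // add (XOR) the current shifted copy of a
      x <<= 1                                // multiply the running copy by alpha
      if ((x & 0x100) != 0) x ^= FieldPoly   // reduce back into 8 bits
      y >>= 1
    }
    acc
  }
}
```

For example, `GF256Model.mul(2, 0x80)` wraps around the field polynomial and gives `0x1D`, which is alpha^8 in this field.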
✔ Streaming Reed-Solomon Encoder (RS(255,223))
Fully functional streaming encoder:
- accepts 223 source symbols
- computes 32 parity symbols
- processes the stream in-order
- matches results from a software model
Verified via RSEncoderTest.scala.
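A software model of a streaming systematic RS encoder can be sketched as an LFSR that divides the message by the generator polynomial one symbol at a time. The sketch below is plain Scala and makes two assumptions that the post does not state: the generator roots are alpha^0 through alpha^(nParity-1), and symbols are streamed highest-degree first. The object and method names are hypothetical.

```scala
object RSEncoderModel {
  def mul(a: Int, b: Int): Int = {
    var x = a; var y = b; var acc = 0
    while (y != 0) {
      if ((y & 1) != 0) acc ^= x
      x <<= 1
      if ((x & 0x100) != 0) x ^= 0x11D
      y >>= 1
    }
    acc
  }
  def pow(a: Int, n: Int): Int = (0 until n).foldLeft(1)((acc, _) => mul(acc, a))

  // g(x) = product of (x + alpha^i) for i in 0 until nParity, lowest degree first.
  def generator(nParity: Int): Array[Int] = {
    var g = Array(1)
    for (i <- 0 until nParity) {
      val root = pow(2, i)
      val next = new Array[Int](g.length + 1)
      for (j <- g.indices) {
        next(j + 1) ^= g(j)          // x * g_j
        next(j) ^= mul(g(j), root)   // root * g_j
      }
      g = next
    }
    g
  }

  // LFSR division: after the last symbol, reg holds m(x) * x^nParity mod g(x).
  def parity(msg: Seq[Int], nParity: Int): Array[Int] = {
    val g = generator(nParity)
    val reg = new Array[Int](nParity) // reg(j) = coefficient of x^j
    for (sym <- msg) {
      val fb = sym ^ reg(nParity - 1) // feedback term
      for (j <- (nParity - 1) to 1 by -1) reg(j) = reg(j - 1) ^ mul(fb, g(j))
      reg(0) = mul(fb, g(0))
    }
    reg
  }

  // Systematic codeword: message symbols first, then parity (highest degree first).
  def encode(msg: Seq[Int], nParity: Int): Seq[Int] =
    msg ++ parity(msg, nParity).reverse.toSeq

  // Evaluate the word at each generator root; all zeros means a valid codeword.
  def syndromes(word: Seq[Int], nParity: Int): Seq[Int] =
    (0 until nParity).map { i =>
      val x = pow(2, i)
      word.foldLeft(0)((acc, c) => mul(acc, x) ^ c)
    }
}
```

Encoding 223 symbols with `nParity = 32` yields a 255-symbol codeword whose syndromes are all zero, which is a convenient self-check for comparing hardware output against a model like this.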
✔ Reed-Solomon Decoder (Syndrome → BM → Chien)
The RS decoder implements:
- Syndrome calculation
- Berlekamp–Massey algorithm: finds the error locator polynomial
- Chien search: finds the exact error locations
Current limitation (intentional and documented):
- ❗ Forney error-value computation is not implemented yet. The decoder can detect and locate errors but cannot correct them; if the syndromes are nonzero, it flags an unrecoverable error.
This was the planned point to stop for the quarter.
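The three implemented stages can be modeled compactly in plain Scala. The sketch below is a textbook formulation, not the repository's RTL, and it assumes generator roots alpha^0 through alpha^(2t-1) and a received word stored lowest-degree first, so that index j corresponds to the locator value alpha^j. All names are hypothetical.

```scala
object RSDecoderModel {
  def mul(a: Int, b: Int): Int = {
    var x = a; var y = b; var acc = 0
    while (y != 0) {
      if ((y & 1) != 0) acc ^= x
      x <<= 1
      if ((x & 0x100) != 0) x ^= 0x11D
      y >>= 1
    }
    acc
  }
  def pow(a: Int, n: Int): Int = (0 until n).foldLeft(1)((acc, _) => mul(acc, a))
  def inv(a: Int): Int = pow(a, 254) // a^254 = a^-1 for nonzero a in GF(256)

  // Stage 1: S_i = r(alpha^i), with r(j) the coefficient of x^j.
  def syndromes(r: Array[Int], nParity: Int): Array[Int] =
    Array.tabulate(nParity) { i =>
      val x = pow(2, i)
      r.foldRight(0)((c, acc) => mul(acc, x) ^ c)
    }

  // Stage 2: Berlekamp-Massey, returns the error locator, lowest degree first.
  def berlekampMassey(s: Array[Int]): Array[Int] = {
    val c = new Array[Int](s.length + 1) // current locator estimate
    c(0) = 1
    var b = c.clone()                    // locator before the last length change
    var l = 0                            // current number of assumed errors
    var m = 1                            // steps since the last length change
    var bd = 1                           // discrepancy at the last length change
    for (n <- s.indices) {
      var d = s(n)                       // discrepancy for syndrome n
      for (i <- 1 to l) d ^= mul(c(i), s(n - i))
      if (d == 0) m += 1
      else {
        val t = c.clone()
        val coef = mul(d, inv(bd))
        for (i <- 0 until c.length - m) c(i + m) ^= mul(coef, b(i))
        if (2 * l <= n) { l = n + 1 - l; b = t; bd = d; m = 1 }
        else m += 1
      }
    }
    c.take(l + 1)
  }

  // Stage 3: Chien search, position j is in error when Lambda(alpha^-j) == 0.
  def chienSearch(lambda: Array[Int], n: Int): Seq[Int] =
    (0 until n).filter { j =>
      val xInv = inv(pow(2, j))
      lambda.foldRight(0)((coef, acc) => mul(acc, xInv) ^ coef) == 0
    }
}
```

Because the code is linear, a quick experiment is to start from the all-zero codeword, flip a couple of symbols, and check that `chienSearch` reports exactly those indices.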
✔ Functional Luby Transform (LT) Encoder and Decoder
The LTCodec includes:
- PRNG-based symbol selection (RFC 6330 style)
- LTEncoder that generates repair symbols on the fly
- LTDecoder using iterative belief propagation
- Verified end-to-end recovery for moderately sized blocks
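On an erasure channel, iterative belief propagation reduces to "peeling": find a received symbol with exactly one unknown neighbor, solve for it, and repeat. The plain-Scala sketch below illustrates that idea. It is a simplified stand-in: `neighbors` uses `scala.util.Random` seeded by the symbol ID with a made-up degree range, not the RFC 6330 generator or degree distribution, and all names are hypothetical.

```scala
object LTModel {
  // Hypothetical neighbor selection: both sides derive the same set from the
  // symbol ID, so only the ID has to travel with each repair symbol.
  def neighbors(symbolId: Int, k: Int): Set[Int] = {
    val rng = new scala.util.Random(symbolId)
    val degree = 1 + rng.nextInt(math.min(3, k))
    rng.shuffle((0 until k).toList).take(degree).toSet
  }

  // A repair symbol is the XOR of its neighboring source symbols.
  def encode(source: IndexedSeq[Int], nbrs: Set[Int]): Int =
    nbrs.foldLeft(0)((acc, i) => acc ^ source(i))

  // Peeling decoder: each received item is (neighbor set, XOR value).
  def decode(k: Int, received: Seq[(Set[Int], Int)]): Option[IndexedSeq[Int]] = {
    val known = scala.collection.mutable.Map.empty[Int, Int]
    var pending = received.toList
    var progress = true
    while (progress && known.size < k) {
      progress = false
      pending = pending.flatMap { case (nbrs, value) =>
        val unknown = nbrs.filterNot(known.contains)
        if (unknown.isEmpty) None // nothing new to learn from this symbol
        else if (unknown.size == 1) {
          // XOR out the known neighbors to isolate the single unknown one.
          val solved = (nbrs -- unknown).foldLeft(value)((acc, i) => acc ^ known(i))
          known(unknown.head) = solved
          progress = true
          None
        } else Some((nbrs, value)) // still ambiguous, retry on the next pass
      }
    }
    if (known.size == k) Some((0 until k).map(known)) else None
  }
}
```

Decoding returns `None` when peeling stalls, which is the software analogue of asking the sender for more repair symbols.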
This makes the codec “RaptorQ-inspired”—a true RS + LT fountain pipeline.
✔ Full Test Suite Passes
Using ChiselTest, the project includes:
- module-level verification
- configuration sweep testing
- RS encoder/decoder tests
- LT end-to-end tests
- GF arithmetic unit tests
Running sbt clean test yields a fully green test suite.
What I Learned — Technical & Design Lessons
This project pushed me across several domains simultaneously: finite-field math, FEC theory, Chisel generator workflows, and streaming hardware design.
Deep Understanding of FEC Internals
Implementing a full RS + LT pipeline required:
- revisiting finite-field algebra (GF(2^8))
- understanding RS parity generation
- mastering the BM algorithm and Chien search
- implementing iterative message-passing (for LT decoding)
- integrating all of these into a streaming-friendly pipeline
RaptorQ’s “rateless” nature also taught me about symbol scheduling, PRNG-based graph generation, and ripple decoding.
Hardware Generator Design Using Chisel
I learned how to:
- write reusable, parameterized generators
- structure RTL hierarchically
- harmonize DecoupledIO semantics
- use Scala software models as golden references
- build robust test suites
This experience fundamentally changed how I approach hardware design—Chisel generators feel like writing compilers for hardware.
Trade-Offs and Real-World Constraints
Designing for real streaming applications forced me to think about:
- latency budgets
- flow-control hazards
- buffering strategy
- pipeline depth
- FPGA resource usage
- integration with packet-processing frontends
These are considerations you don’t normally encounter in a purely algorithmic course project.
Incomplete Correction Logic Shows the Real Challenge
Stopping the RS decoder at “error location but not error value” gave me a first-hand appreciation of:
- the complexity of integrating full Forney correction
- the verification load required for full recovery tests
- how much effort goes into industrial-grade codec validation
Even without full correction, the working pipeline is very educational.
Challenges & What Held Me Back
Even with a working RS + LT codec, several challenges remain:
1. Forney Algorithm & Full Correction
The biggest missing piece is calculating the error magnitudes and applying corrections. Chien Search gives you where errors are. Forney tells you how much to fix.
This is the next major implementation step.
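For reference, here is a plain-Scala sketch of what the Forney step computes, under the assumption that the generator roots start at alpha^0. In that case the error value at locator X is X * Omega(X^-1) / Lambda'(X^-1), where Omega(x) = S(x) * Lambda(x) mod x^(2t). To stay self-contained, the sketch builds Lambda directly from known error positions, standing in for the Berlekamp–Massey and Chien outputs. All names are hypothetical and this is not the planned implementation.

```scala
object ForneyModel {
  def mul(a: Int, b: Int): Int = {
    var x = a; var y = b; var acc = 0
    while (y != 0) {
      if ((y & 1) != 0) acc ^= x
      x <<= 1
      if ((x & 0x100) != 0) x ^= 0x11D
      y >>= 1
    }
    acc
  }
  def pow(a: Int, n: Int): Int = (0 until n).foldLeft(1)((acc, _) => mul(acc, a))
  def inv(a: Int): Int = pow(a, 254)

  // Polynomials are stored lowest degree first.
  def polyMul(a: Array[Int], b: Array[Int]): Array[Int] = {
    val out = new Array[Int](a.length + b.length - 1)
    for (i <- a.indices; j <- b.indices) out(i + j) ^= mul(a(i), b(j))
    out
  }
  def polyEval(p: Array[Int], x: Int): Int =
    p.foldRight(0)((c, acc) => mul(acc, x) ^ c)

  // Formal derivative in characteristic 2: only odd-degree terms survive.
  def derivative(p: Array[Int]): Array[Int] =
    Array.tabulate(math.max(p.length - 1, 1)) { i =>
      if (i + 1 < p.length && i % 2 == 0) p(i + 1) else 0
    }

  def syndromes(r: Array[Int], nParity: Int): Array[Int] =
    Array.tabulate(nParity)(i => polyEval(r, pow(2, i)))

  // Lambda(x) = product of (1 + alpha^j x) over the error positions j.
  def locatorFor(positions: Seq[Int]): Array[Int] =
    positions.foldLeft(Array(1))((acc, j) => polyMul(acc, Array(1, pow(2, j))))

  // Forney: one error value per located position.
  def errorValues(s: Array[Int], lambda: Array[Int], positions: Seq[Int]): Seq[Int] = {
    val omega = polyMul(s, lambda).take(s.length) // S(x) * Lambda(x) mod x^(2t)
    val dLambda = derivative(lambda)
    positions.map { j =>
      val x = pow(2, j)
      val xInv = inv(x)
      mul(x, mul(polyEval(omega, xInv), inv(polyEval(dLambda, xInv))))
    }
  }
}
```

XOR-ing each returned value into the received symbol at its position completes the correction, which is the step the hardware decoder currently stops short of.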
2. Transition to AXI-Stream
Current implementation uses DecoupledIO. A production core should support:
- TVALID
- TREADY
- TDATA
- TLAST
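DecoupledIO's valid, ready, and bits signals map directly onto TVALID, TREADY, and TDATA, so most of the work is adding TLAST-based block framing and reworking the I/O layer around it.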
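A tiny behavioral model helps pin down the intended framing before touching RTL. The plain-Scala sketch below simulates one cycle per entry of a ready pattern: a beat transfers only when TVALID and TREADY are both high, and TLAST marks the final symbol of each block. Asserting TLAST once per `blockLen` symbols is an assumption about how blocks would be framed, and the names are hypothetical.

```scala
object AxisModel {
  final case class Beat(tdata: Int, tlast: Boolean)

  // Simulate the source side of an AXI-Stream link, one cycle per readyPattern entry.
  def stream(symbols: Seq[Int], blockLen: Int, readyPattern: Seq[Boolean]): Seq[Beat] = {
    var idx = 0
    val out = Seq.newBuilder[Beat]
    for (tready <- readyPattern) {
      val tvalid = idx < symbols.length // source has data to offer
      if (tvalid && tready) {           // handshake: transfer happens this cycle
        val last = (idx + 1) % blockLen == 0 || idx == symbols.length - 1
        out += Beat(symbols(idx), last)
        idx += 1
      }
      // When tready is low the source holds the same symbol for the next cycle.
    }
    out.result()
  }
}
```

A model like this can double as a golden reference in a testbench that applies randomized back-pressure to the hardware interface.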
3. Deeper Verification Infrastructure
Future work should include:
- randomized symbol-erasure patterns
- full end-to-end RS+LT recovery tests
- multi-configuration regression sweeps
- video-frame-level integration testing
4. High-Throughput and Multi-Lane Architectures
Supporting real-time 1080p/4K video at low latency will eventually need:
- multi-lane pipelines
- BRAM-backed symbol buffers
- parallel LT droplets
- deeper pipelining for GF ops
How to Use chisel-raptorq Today
If you want to run or modify the project:
1. Clone
git clone https://github.com/jyrj/chisel-raptorq.git
cd chisel-raptorq
2. Install JDK + sbt
3. Run full verification
sbt clean test
All RS/LT tests should pass.
4. Modify parameters
Change RaptorFECParameters:
val custom = RaptorFECParameters(
  sourceK = 64,
  numParitySymbolsRS = 16,
  ltRepairCap = 30
)
5. Explore the code
- GF256Mult.scala
- RSEncoder.scala / RSDecoder.scala
- LTEncoder.scala / LTDecoder.scala
- Parameters.scala
6. Extend the system
Possible directions:
- implement Forney correction
- integrate AXI-Stream
- stress-test recovery logic
- target FPGA and collect performance data
- support 10/12-bit symbols
- multi-lane pipelined encoder/decoder
Why This Matters — and What’s Next
Real-time video streaming over unpredictable networks (robotics, drones, remote sensing, surveillance) needs:
- low latency
- deterministic pipelines
- robust erasure recovery
This project provides:
- a working RS+LT codec in hardware
- a parameterized, extensible generator
- a verification-ready infrastructure
- a foundation for future, industrial-quality RaptorQ-class FEC accelerators
The next milestone is clear:
Implement Forney, integrate AXI-Stream, and demonstrate real-world recovery on actual video data.
Final Thoughts
Working on chisel-raptorq taught me more than any single topic could: it brought together algorithmic FEC theory, hardware design, Chisel generators, verification discipline, and system-level thinking.
While there is room to grow, the project already implements a significant portion of a real RaptorQ-style FEC core, and everything implemented so far passes its module-level and LT end-to-end tests.
The repository is open-source, modular, and built to be extended. I’m excited for what comes next—whether from me, future students, or the wider hardware community.
If you’d like to contribute or explore the generator, clone the repo and start experimenting.
