Building a Streaming-Ready RaptorQ FEC Core in Chisel: My CSE228 Project

During my CSE 228 (Agile Hardware Design) course this quarter, I worked on chisel-raptorq, a Chisel-based generator that emits a streaming-compatible, parameterized Forward Error Correction (FEC) codec inspired by the RaptorQ standard. The goal was simple but ambitious: build a hardware IP core capable of protecting real-time video streams over lossy IP networks without depending on retransmissions.

This post summarizes what I built, what I learned, what works today, and where the project can go next.

Code: https://github.com/jyrj/chisel-raptorq

What is RaptorQ — and Why Hardware?

RaptorQ is part of the family of fountain codes, also called rateless erasure codes. From a block of k source symbols, RaptorQ can generate a practically unlimited number of repair symbols. A receiver needs only k (or slightly more) symbols, in any mix of source and repair, to recover the original block with high probability.

This is ideal for:

lossy wireless networks
multicast / broadcast
high-jitter WAN links
real-time media applications

Because the sender can keep emitting repair packets until the receiver has enough, fountain codes adapt to unknown or varying loss rates better than fixed-rate codes such as traditional Reed-Solomon.

Most RaptorQ deployments are software libraries. But for real-time video—where latency, determinism, and throughput matter—a hardware implementation offers several advantages:

deterministic timing
consistent per-symbol throughput
reduced CPU load
seamless integration with streaming interfaces like AXI-Stream

These motivations formed the backbone of the project.

What chisel-raptorq Does (Today)

The project goes beyond a minimal proof of concept: it implements an RS + LT codec pipeline with parameterization, verification, and streaming support, with one documented gap in the RS decoder (error-value correction, described below). The current implementation includes:

✔ Parameterized Generator Architecture

The codec is driven by a flexible configuration class:

sourceK
number of RS parity symbols
LT repair capacity
symbol width (GF(2^8) today)
pipeline structure
future multi-lane support

The generator emits RTL specialized for any given configuration, verified via ParametersTest.scala.
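To illustrate the kind of checks a configuration class like this can carry, here is a minimal plain-Scala sketch. FecParamsSketch and its derived fields are hypothetical names for illustration; the class in the repo is RaptorFECParameters and may validate its fields differently.

```scala
// Hypothetical configuration sketch. Field names mirror the post;
// the validation rules and derived values are assumptions.
final case class FecParamsSketch(
    sourceK: Int,            // source symbols per block
    numParitySymbolsRS: Int, // RS parity symbols appended per block
    ltRepairCap: Int,        // maximum LT repair symbols per block
    symbolBits: Int = 8      // symbol width; GF(2^8) today
) {
  require(symbolBits == 8, "only GF(2^8) symbols are modelled here")
  require(sourceK > 0 && numParitySymbolsRS > 0 && ltRepairCap >= 0,
    "symbol counts must be positive (repair cap may be zero)")

  // An RS code over GF(2^m) has at most 2^m - 1 symbols per codeword.
  val maxCodewordLen: Int = (1 << symbolBits) - 1
  val codewordLen: Int = sourceK + numParitySymbolsRS
  require(codewordLen <= maxCodewordLen,
    s"RS codeword length $codewordLen exceeds $maxCodewordLen")

  // With p parity symbols, RS can correct up to floor(p / 2) symbol errors.
  val maxCorrectableErrors: Int = numParitySymbolsRS / 2
}
```

Failing fast in the constructor means an impossible configuration is rejected at elaboration time, before any RTL is emitted.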

✔ GF(256) Arithmetic Core

A purely combinational GF(256) multiplier powers the RS outer code. Features:

no lookup tables
uses the field polynomial from RFC 6330 (x^8 + x^4 + x^3 + x^2 + 1)
verified against RFC 6330 test vectors
bit-accurate Scala model for reference

This is the mathematical backbone of the entire RS stage.
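A table-free multiplier is easy to model in software. The sketch below is a plain-Scala shift-and-add reference, not the Chisel module from GF256Mult.scala; GF256Model is a hypothetical name, and 0x11D is the bit pattern of x^8 + x^4 + x^3 + x^2 + 1.

```scala
object GF256Model {
  // x^8 + x^4 + x^3 + x^2 + 1, the polynomial RFC 6330 uses for octet arithmetic.
  val Poly: Int = 0x11D

  /** Shift-and-add multiply in GF(2^8): no lookup tables, eight steps. */
  def mul(a: Int, b: Int): Int = {
    require((a & ~0xFF) == 0 && (b & ~0xFF) == 0, "operands must be 8-bit")
    var acc = 0
    var x = a
    for (i <- 0 until 8) {
      if (((b >> i) & 1) == 1) acc ^= x // add a * x^i when bit i of b is set
      x <<= 1                           // multiply the running term by x
      if ((x & 0x100) != 0) x ^= Poly   // reduce modulo the field polynomial
    }
    acc
  }
}
```

In hardware the loop unrolls into a fixed XOR network, which is what makes a purely combinational implementation possible. A model like this can act as the golden reference in a ChiselTest bench.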

✔ Streaming Reed-Solomon Encoder (RS(255,223))

Fully functional streaming encoder:

accepts 223 source symbols
computes 32 parity symbols
processes the stream in-order
matches results from a software model

Verified via RSEncoderTest.scala.
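For readers who want to see what the software model for such an encoder might look like, here is a minimal sketch of systematic RS encoding with an LFSR-style polynomial division. It assumes the generator roots are alpha^0 through alpha^(p-1) with alpha = 2, which may differ from the convention in the repo; RsEncoderModel is a hypothetical name.

```scala
object RsEncoderModel {
  private val Poly = 0x11D // x^8 + x^4 + x^3 + x^2 + 1

  def mul(a: Int, b: Int): Int = {
    var acc = 0
    var x = a
    for (i <- 0 until 8) {
      if (((b >> i) & 1) == 1) acc ^= x
      x <<= 1
      if ((x & 0x100) != 0) x ^= Poly
    }
    acc
  }

  /** alpha^e with alpha = 2. */
  def alphaPow(e: Int): Int = (0 until e).foldLeft(1)((acc, _) => mul(acc, 2))

  /** Monic generator polynomial, lowest degree first (assumed roots alpha^0..alpha^(p-1)). */
  def generator(numParity: Int): Vector[Int] =
    (0 until numParity).foldLeft(Vector(1)) { (g, i) =>
      val r = alphaPow(i)
      // g(x) * (x + r): coefficient j is g(j-1) + r * g(j)
      Vector.tabulate(g.length + 1) { j =>
        val shifted = if (j >= 1) g(j - 1) else 0
        val scaled = if (j < g.length) mul(r, g(j)) else 0
        shifted ^ scaled
      }
    }

  /** Streams msg (highest degree first) through the LFSR and appends the parity. */
  def encode(msg: Seq[Int], numParity: Int): Vector[Int] = {
    val g = generator(numParity)
    val reg = Array.fill(numParity)(0)
    for (m <- msg) {
      val fb = m ^ reg(numParity - 1) // feedback term
      for (j <- numParity - 1 to 1 by -1) reg(j) = reg(j - 1) ^ mul(fb, g(j))
      reg(0) = mul(fb, g(0))
    }
    msg.toVector ++ reg.reverse.toVector
  }

  /** Evaluates a codeword (highest degree first) at x using Horner's rule. */
  def eval(codeword: Seq[Int], x: Int): Int =
    codeword.foldLeft(0)((acc, c) => mul(acc, x) ^ c)
}
```

A valid codeword evaluates to zero at every generator root, which gives a convenient self-check that does not depend on a second implementation.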

✔ Reed-Solomon Decoder (Syndrome → BM → Chien)

The RS decoder implements:

Syndrome calculation
Berlekamp–Massey algorithm
- finds the error locator polynomial
Chien Search
- finds exact error locations

Current limitation (intentional and documented):

❗ Forney error-value computation is not implemented yet. The decoder can detect and locate errors but cannot correct them: if any syndrome is nonzero, it flags the block as unrecoverable.

This was the planned point to stop for the quarter.
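The three implemented stages can be sketched in plain Scala as follows. This is a software illustration under stated assumptions, not the RTL from RSDecoder.scala: index i of the received vector is the coefficient of x^i, syndromes start at alpha^0, and RsLocatorModel is a hypothetical name.

```scala
object RsLocatorModel {
  private val Poly = 0x11D // x^8 + x^4 + x^3 + x^2 + 1

  def mul(a: Int, b: Int): Int = {
    var acc = 0
    var x = a
    for (i <- 0 until 8) {
      if (((b >> i) & 1) == 1) acc ^= x
      x <<= 1
      if ((x & 0x100) != 0) x ^= Poly
    }
    acc
  }
  def pow(a: Int, e: Int): Int = (0 until e).foldLeft(1)((acc, _) => mul(acc, a))
  def inv(a: Int): Int = { require(a != 0, "zero has no inverse"); pow(a, 254) }

  /** Evaluates a polynomial (lowest degree first) at x using Horner's rule. */
  def eval(p: Seq[Int], x: Int): Int = p.foldRight(0)((c, acc) => mul(acc, x) ^ c)

  /** Stage 1: syndromes S_j = r(alpha^j), j = 0 until numParity. */
  def syndromes(received: Seq[Int], numParity: Int): Vector[Int] =
    Vector.tabulate(numParity)(j => eval(received, pow(2, j)))

  /** Stage 2: Berlekamp-Massey, returns the error locator (lowest degree first). */
  def berlekampMassey(s: Seq[Int]): Vector[Int] = {
    var c = Vector(1)  // current locator estimate
    var b = Vector(1)  // locator saved at the last length change
    var l = 0          // current LFSR length
    var m = 1          // steps since the last length change
    var bd = 1         // discrepancy at the last length change
    for (n <- s.indices) {
      val d = (1 to l).foldLeft(s(n)) { (acc, i) =>
        acc ^ mul(if (i < c.length) c(i) else 0, s(n - i))
      }
      if (d == 0) m += 1
      else {
        val scale = mul(d, inv(bd))
        // next(x) = c(x) + scale * x^m * b(x)
        val next = Vector.tabulate(math.max(c.length, b.length + m)) { i =>
          val ci = if (i < c.length) c(i) else 0
          val bi = if (i >= m && i - m < b.length) mul(scale, b(i - m)) else 0
          ci ^ bi
        }
        if (2 * l <= n) { l = n + 1 - l; b = c; bd = d; m = 1 } else m += 1
        c = next
      }
    }
    c.take(l + 1)
  }

  /** Stage 3: Chien search, returns every position p with Lambda(alpha^-p) == 0. */
  def chienSearch(lambda: Seq[Int], n: Int): Vector[Int] =
    (0 until n).filter(p => eval(lambda, inv(pow(2, p))) == 0).toVector
}
```

Because RS is a linear code, the all-zero word is a valid codeword, so injecting a few nonzero symbols into a zero vector is enough to exercise all three stages without an encoder.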

✔ Functional Luby Transform (LT) Encoder and Decoder

The LTCodec includes:

PRNG-based symbol selection (RFC 6330 style)
LTEncoder that generates repair symbols on the fly
LTDecoder using iterative belief propagation
Verified end-to-end recovery for moderately sized blocks

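This is what makes the codec “RaptorQ-inspired” rather than RFC 6330-compliant: it pairs an RS outer code with an LT fountain stage, whereas the standard precodes with LDPC and HDPC symbols.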
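The LT idea fits in a few lines of plain Scala. In the sketch below each symbol is a single Int for brevity, and the neighbour selection uses scala.util.Random as a stand-in; it is not the RFC 6330 tuple generator, and LtModel is a hypothetical name rather than the repo's LTCodec.

```scala
import scala.util.Random

object LtModel {
  /** Hypothetical neighbour selection: a seeded PRNG picks `degree` distinct
    * source indices. Encoder and decoder derive the same set from the seed. */
  def neighbours(seed: Long, k: Int, degree: Int): Set[Int] = {
    require(degree >= 1 && degree <= k, "degree must be in 1..k")
    new Random(seed).shuffle((0 until k).toList).take(degree).toSet
  }

  /** A repair symbol is the XOR of its neighbour source symbols. */
  def encode(source: IndexedSeq[Int], nbrs: Set[Int]): Int =
    nbrs.foldLeft(0)((acc, i) => acc ^ source(i))

  /** Peeling decoder: repeatedly release degree-one packets and subtract
    * recovered symbols from the rest. Returns whatever could be recovered. */
  def decode(k: Int, packets: Seq[(Set[Int], Int)]): Map[Int, Int] = {
    var known = Map.empty[Int, Int]
    var pending = packets.toList
    var progress = true
    while (progress && known.size < k) {
      // Remove already-known neighbours from every pending packet.
      val reduced = pending.map { case (n, v) =>
        val (done, todo) = n.partition(known.contains)
        (todo, done.foldLeft(v)((acc, i) => acc ^ known(i)))
      }
      // A packet with one remaining neighbour reveals that source symbol.
      val released = reduced.collect { case (n, v) if n.size == 1 => n.head -> v }
      progress = released.nonEmpty
      known = known ++ released
      pending = reduced.filter(_._1.size > 1)
    }
    known
  }
}
```

If no degree-one packet is available the loop stalls and returns a partial map, which is the software analogue of the decoder waiting for more repair symbols.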

✔ Full Test Suite Passes

Using ChiselTest, the project includes:

module-level verification
configuration sweep testing
RS encoder/decoder tests
LT end-to-end tests
GF arithmetic unit tests

Running sbt clean test produces a fully green test suite.

What I Learned — Technical & Design Lessons

This project pushed me across several domains simultaneously: finite-field math, FEC theory, Chisel generator workflows, and streaming hardware design.

Deep Understanding of FEC Internals

Implementing a full RS + LT pipeline required:

revisiting finite-field algebra (GF(2^8))
understanding RS parity generation
mastering the BM algorithm and Chien search
implementing iterative message-passing (for LT decoding)
integrating all of these into a streaming-friendly pipeline

RaptorQ’s “rateless” nature also taught me about symbol scheduling, PRNG-based graph generation, and ripple decoding.

Hardware Generator Design Using Chisel

I learned how to:

write reusable, parameterized generators
structure RTL hierarchically
keep DecoupledIO (ready/valid) semantics consistent across module boundaries
use Scala software models as golden references
build robust test suites

This experience fundamentally changed how I approach hardware design—Chisel generators feel like writing compilers for hardware.

Trade-Offs and Real-World Constraints

Designing for real streaming applications forced me to think about:

latency budgets
flow-control hazards
buffering strategy
pipeline depth
FPGA resource usage
integration with packet-processing frontends

These are considerations you don’t normally encounter in a purely algorithmic course project.

Incomplete Correction Logic Shows the Real Challenge

Stopping the RS decoder at “error location but not error value” gave me a first-hand appreciation of:

the complexity of integrating full Forney correction
the verification load required for full recovery tests
how much effort goes into industrial-grade codec validation

Even without full correction, the working pipeline is very educational.

Challenges & What Held Me Back

Even with a working RS + LT codec, several challenges remain:

1. Forney Algorithm & Full Correction

The biggest missing piece is calculating the error magnitudes and applying corrections. Chien Search gives you where errors are. Forney tells you how much to fix.

This is the next major implementation step.
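As a reference for that step, here is a software sketch of the Forney computation. It is not code from the repo: it assumes syndromes S_j = r(alpha^j) starting at j = 0, so the error value at locator X is X * Omega(1/X) / Lambda'(1/X), and the locator helper simply builds Lambda from known positions in place of a Berlekamp-Massey result. ForneyModel and its methods are hypothetical names.

```scala
object ForneyModel {
  private val Poly = 0x11D // x^8 + x^4 + x^3 + x^2 + 1

  def mul(a: Int, b: Int): Int = {
    var acc = 0
    var x = a
    for (i <- 0 until 8) {
      if (((b >> i) & 1) == 1) acc ^= x
      x <<= 1
      if ((x & 0x100) != 0) x ^= Poly
    }
    acc
  }
  def pow(a: Int, e: Int): Int = (0 until e).foldLeft(1)((acc, _) => mul(acc, a))
  def inv(a: Int): Int = { require(a != 0, "zero has no inverse"); pow(a, 254) }

  /** Evaluates a polynomial (lowest degree first) at x using Horner's rule. */
  def eval(p: Seq[Int], x: Int): Int = p.foldRight(0)((c, acc) => mul(acc, x) ^ c)

  /** Lambda(x) = product of (1 + alpha^p * x) over the error positions p. */
  def locator(positions: Seq[Int]): Vector[Int] =
    positions.foldLeft(Vector(1)) { (l, p) =>
      val x = pow(2, p)
      Vector.tabulate(l.length + 1) { i =>
        (if (i < l.length) l(i) else 0) ^ (if (i >= 1) mul(x, l(i - 1)) else 0)
      }
    }

  /** Omega(x) = S(x) * Lambda(x) mod x^(2t), where 2t = s.length. */
  def evaluator(s: Seq[Int], lambda: Seq[Int]): Vector[Int] =
    Vector.tabulate(s.length) { k =>
      (0 to math.min(k, lambda.length - 1))
        .foldLeft(0)((acc, i) => acc ^ mul(lambda(i), s(k - i)))
    }

  /** Formal derivative in characteristic 2: only odd-degree terms survive. */
  def derivative(p: Seq[Int]): Vector[Int] =
    Vector.tabulate(math.max(p.length - 1, 0))(i => if (i % 2 == 0) p(i + 1) else 0)

  /** Error value at `position`, assuming the first syndrome root is alpha^0. */
  def errorValue(s: Seq[Int], lambda: Seq[Int], position: Int): Int = {
    val x = pow(2, position) // locator X = alpha^position
    val xInv = inv(x)
    val numerator = mul(x, eval(evaluator(s, lambda), xInv))
    mul(numerator, inv(eval(derivative(lambda), xInv)))
  }
}
```

Correcting a symbol is then a single XOR of the computed value into the received symbol at that position. If the code used alpha^1 as its first root instead, the leading factor of X in the numerator would drop out.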

2. Transition to AXI-Stream

The current implementation uses DecoupledIO. A production core should expose the standard AXI-Stream signals:

TVALID
TREADY
TDATA
TLAST

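DecoupledIO's valid, ready, and bits already correspond to TVALID, TREADY, and TDATA, so most of the work is adding TLAST block framing and reworking the I/O layer around it.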
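The framing rule itself is small. The behavioural model below is plain Scala rather than Chisel and makes two assumptions of mine: the source asserts TVALID whenever it still has data, and TLAST marks the final symbol of each fixed-length block. AxisFramingModel, Beat, and transfer are hypothetical names.

```scala
object AxisFramingModel {
  /** One accepted AXI-Stream beat. */
  final case class Beat(tdata: Int, tlast: Boolean)

  /** Steps through one TREADY value per cycle. A beat transfers only in
    * cycles where TVALID and TREADY are both high. */
  def transfer(symbols: Seq[Int], blockLen: Int, tready: Seq[Boolean]): Vector[Beat] = {
    require(blockLen > 0, "blockLen must be positive")
    var idx = 0
    val out = Vector.newBuilder[Beat]
    for (ready <- tready) {
      val tvalid = idx < symbols.length // assumed: valid while data remains
      if (tvalid && ready) {
        val tlast = idx % blockLen == blockLen - 1 // final symbol of a block
        out += Beat(symbols(idx), tlast)
        idx += 1
      }
    }
    out.result()
  }
}
```

Cycles where the sink deasserts TREADY simply stall the stream; the data and TLAST pattern seen by the sink is unchanged, which is the property a backpressure test should check.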

3. Deeper Verification Infrastructure

Future work should include:

randomized symbol-erasure patterns
full end-to-end RS+LT recovery tests
multi-configuration regression sweeps
video-frame-level integration testing

4. High-Throughput and Multi-Lane Architectures

Supporting real-time 1080p/4K video at low latency will eventually need:

multi-lane pipelines
BRAM-backed symbol buffers
parallel LT droplets
deeper pipelining for GF ops

How to Use chisel-raptorq Today

If you want to run or modify the project:

1. Clone

git clone https://github.com/jyrj/chisel-raptorq.git
cd chisel-raptorq

2. Install JDK + sbt

3. Run full verification

sbt clean test

All RS/LT tests should pass.

4. Modify parameters

Change RaptorFECParameters:

val custom = RaptorFECParameters(
  sourceK = 64,
  numParitySymbolsRS = 16,
  ltRepairCap = 30
)

5. Explore the code

GF256Mult.scala
RSEncoder.scala / RSDecoder.scala
LTEncoder.scala / LTDecoder.scala
Parameters.scala

6. Extend the system

Possible directions:

implement Forney correction
integrate AXI-Stream
stress-test recovery logic
target FPGA and collect performance data
support 10/12-bit symbols
multi-lane pipelined encoder/decoder

Why This Matters — and What’s Next

Real-time video streaming over unpredictable networks (robotics, drones, remote sensing, surveillance) needs:

low latency
deterministic pipelines
robust erasure recovery

This project provides:

a working RS+LT codec in simulated RTL (RS error correction still pending)
a parameterized, extensible generator
a verification-ready infrastructure
a foundation for future, industrial-quality RaptorQ-class FEC accelerators

The next milestone is clear:

Implement Forney, integrate AXI-Stream, and demonstrate real-world recovery on actual video data.

Final Thoughts

Working on chisel-raptorq taught me more than any single topic could: it brought together algorithmic FEC theory, hardware design, Chisel generators, verification discipline, and system-level thinking.

While there is room to grow, the project already implements a significant portion of a RaptorQ-style FEC core, and every implemented module is covered by the passing test suite. Full end-to-end RS+LT recovery tests remain future work.

The repository is open-source, modular, and built to be extended. I’m excited for what comes next—whether from me, future students, or the wider hardware community.

If you’d like to contribute or explore the generator, clone the repo and start experimenting.
