Winning the ICLAD Hackathon 2025 & My Experience at ICLAD Conference

On June 26–27, 2025, I had the privilege of attending the International Conference on Large-Scale AI for Design (ICLAD) at Paul Brest Hall, Stanford University. Even more exciting, I participated in (and my team won!) the ICLAD Hackathon, a competition focused on NVIDIA’s Comprehensive Verilog Design Problems (CVDP) benchmark and agentic AI debugging for real hardware.

This post is a reflection on the hackathon journey: what I learned about agentic LLM-based hardware debugging, my mentorship experience with Mark from NVIDIA, and my overall impressions of the conference, which was co-located with DAC 2025.

Receiving the prize for the ICLAD Hackathon


🚀 Winning the ICLAD Hackathon: Agentic AI for Hardware Debugging

The ICLAD Hackathon was centered around the CVDP benchmark suite, NVIDIA’s newly proposed dataset for evaluating LLM-driven Verilog reasoning, debugging, and synthesis. The challenge required designing an agentic workflow capable of identifying subtle RTL bugs, generating corrections, and validating fixes across test patterns.

The hackathon wasn’t just about coding; it was about deeply understanding how LLMs reason about RTL, where they fail, how to guide them, and how to build structured multi-step pipelines that minimize hallucination while maximizing correctness.

🎯 What Our Team Built

Our solution combined:

1. Static RTL analysis

We created an initial parsing + structural profiling step to extract:

  • module hierarchy
  • wiring structure
  • state machines
  • arithmetic paths
  • common hazard patterns

This gave the agent context before attempting any correction; a minimal sketch of this profiling pass follows.
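For illustration, here is roughly what such a profiling pass might look like in Python. The regexes and the `profile_rtl` name are hypothetical stand-ins; our actual step combined heuristics like these with deeper structural analysis.

```python
import re

def profile_rtl(source: str) -> dict:
    """Heuristic structural profile of a Verilog source string (illustrative only)."""
    # Module declarations, e.g. "module fifo #(...) (...)"
    modules = re.findall(r"\bmodule\s+(\w+)", source)
    # Rough guess at submodule instantiations: "<type> <name> (" on one line.
    # A real flow would use a proper Verilog parser instead of regexes.
    instances = re.findall(r"^\s*(\w+)\s+(\w+)\s*\(", source, re.MULTILINE)
    # Clocked always blocks hint at state machines and pipeline stages.
    clocks = re.findall(r"always\s*@\s*\(\s*posedge\s+(\w+)", source)
    # Reset-like signals, since missing resets are a common CVDP bug class.
    resets = re.findall(r"\b\w*(?:rst|reset)\w*\b", source, flags=re.IGNORECASE)
    return {
        "modules": modules,
        "instances": instances,
        "clock_domains": sorted(set(clocks)),
        "reset_signals": sorted(set(resets)),
    }
```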

2. Multi-agent debugging pipeline

We implemented a self-refine loop (sketched in code after this list):

  • Judge agent → detects anomalies and localizes the bug
  • Fix agent → proposes Verilog patches
  • Verifier agent → runs simulated or reasoning-based tests
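Reduced to a minimal Python sketch, the control flow of that loop looks roughly like this. The `judge`, `fixer`, and `verifier` arguments stand in for LLM-backed callables; this shows the shape of the pipeline, not our exact implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Patch:
    location: str   # e.g. a module and line range flagged by the judge
    fixed_rtl: str  # full candidate replacement for the buggy source

def self_refine(
    rtl: str,
    judge: Callable[[str], Optional[str]],
    fixer: Callable[[str, str], Patch],
    verifier: Callable[[str], bool],
    max_rounds: int = 5,
) -> str:
    """Judge -> Fix -> Verify loop over a Verilog source string."""
    for _ in range(max_rounds):
        diagnosis = judge(rtl)          # localize a suspected bug, or None
        if diagnosis is None:
            return rtl                  # judge is satisfied; stop early
        patch = fixer(rtl, diagnosis)   # propose a Verilog patch
        if verifier(patch.fixed_rtl):   # simulation or reasoning-based check
            rtl = patch.fixed_rtl       # accept the fix, then re-judge
        # on verification failure, loop again and let the judge re-diagnose
    return rtl
```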

3. CVDP-focused evaluation workflow

The CVDP dataset has tricky cases—MISRs, pipeline stalls, state transitions, mixed blocking/non-blocking semantics, etc.
We tuned prompts and reasoning strategies (see the prompt sketch after this list) specifically for:

  • control logic divergence
  • missing resets
  • incorrect bit slicing
  • handshake mismatches
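As one concrete illustration, the judge-side prompt can enumerate exactly these bug classes as an ordered checklist. The wording below is a hypothetical reconstruction, not our competition prompt.

```python
# Hypothetical prompt skeleton for steering the judge agent toward
# the CVDP bug classes listed above.
JUDGE_PROMPT = """You are reviewing a Verilog module that contains one injected bug.
Check, in this order:
1. Control logic: do FSM transitions cover every state/input combination?
2. Resets: is every registered signal initialized on the active reset edge?
3. Bit slicing: do slice bounds match the declared widths on both sides?
4. Handshakes: are valid/ready (or req/ack) asserted and consumed consistently?
Report exactly one suspected location and its bug class, or NONE.

RTL:
{rtl}
"""
```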

4. Iterative test generation

Agent-generated test vectors were used to validate the corrected design.
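A rough sketch of that validation step, with `test_agent` and `simulate` as hypothetical hooks: the first asks an LLM for directed input/expected-output pairs, the second drives them through a simulator (e.g. via a generated testbench).

```python
def validate_with_generated_tests(fixed_rtl, test_agent, simulate, rounds=3):
    """Generate directed test vectors and run them against the patched RTL."""
    for _ in range(rounds):
        # Each vector is an (inputs, expected_outputs) pair aimed at the
        # corner cases the judge flagged earlier in the pipeline.
        for inputs, expected in test_agent(fixed_rtl):
            if simulate(fixed_rtl, inputs) != expected:
                return False  # reject the fix; failure feeds back into the loop
    return True  # all agent-generated vectors passed
```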

This holistic system allowed us to catch deep structural bugs reliably, not just surface-level syntax issues.

🧠 Mentorship from Mark (NVIDIA)

One of the highlights was being mentored by Mark from NVIDIA, who guided us through:

  • how CVDP was constructed
  • what failure modes LLMs commonly exhibit
  • why dataset diversity matters for Verilog reasoning
  • how agentic LLMs can support debugging rather than replace engineers

His feedback helped us refine our approach:

  • We reduced chain-of-thought “drift”
  • We added more structured correction workflows
  • We improved our signal-level reasoning heuristics

This mentorship was invaluable—I left with a much stronger understanding of how industrial hardware teams are thinking about LLMs in the verification loop.

🏆 Winning the Hackathon

After 12 intense hours of building, testing, refining, and debating prompt design, our team was announced as one of the winning teams of the ICLAD Hackathon.

This win validated a lot of the research work I’ve been doing at UCSC on LLM-aided RTL debugging and invariance detection, and it gave me confidence to push even harder in this domain.


🏛️ My Experience at ICLAD 2025, Stanford University

The two days of the ICLAD conference were packed with groundbreaking insights, industry direction, and discussions featuring some of the most influential names in AI, hardware design, and EDA.

Below are my key highlights.


Day 1 Highlights (June 26)

🔑 Christopher Manning’s Keynote

“Meaning and Intelligence in Language Models”

Chris Manning’s talk explored:

  • the historical evolution of language models
  • why LLMs took decades to mature
  • whether LLMs truly understand meaning
  • agentic workflows with Universal Transformers
  • learning through interaction, not passive corpora

Hearing the founder of the Stanford NLP group talk about meaning, reasoning, and agent intelligence was extremely inspiring, especially for someone working on LLM reasoning for hardware.


🛠️ Oral Session 1: Agentic AI for Hardware Design Automation

This session showcased:

  • TPU-Gen, an automated TPU generator
  • OpenROAD Agent for script generation
  • ASIC-Agent for end-to-end design flows
  • HiVeGen and Spec2RTL-Agent

Takeaway:
The entire industry is moving toward multi-agent systems for hierarchical RTL generation and self-correcting flows.


🔬 Invited Session 1 (NVIDIA + OpenAI)

Talk 1: CVDP Benchmarks – Nathaniel Pinckney, NVIDIA

This talk directly tied into our hackathon problem:

  • future of RTL datasets for LLM evaluation
  • CVDP and VerilogEval
  • challenges in bug localization and reasoning
  • lack of standardized Verilog benchmarks

Nathaniel’s insights helped me understand how my own research on invariance detection fits into the broader ecosystem.

Talk 2: Building Useful AI Agents – Karina Nguyen, OpenAI

Karina presented:

  • RLAIF
  • agentic tool-use in code tasks
  • product thinking in RL agents
  • reliability and verification of autonomous agents

Her discussion on “agents that debug codebases with minimal supervision” resonated deeply with my current research on LLM-based hardware bug hunting.


Oral Session 2: Code Understanding and Synthesis

Highlights included:

  • VeriDebug (contrastive embedding!)
  • ChipXplore (natural language exploration of RTL)
  • AssertionForge (assertion generation from specs)

This session reinforced how structured reasoning is becoming central in hardware LLM research.


Panel: Will Agentic AI Replace HW/SW Engineers?

Panelists from Synopsys, Cadence, Qualcomm, and ChipAgents debated:

  • augmented vs automated workflows
  • safety and correctness
  • how much trust to place in autonomous EDA agents

The consensus:
Engineers won’t be replaced—our workflows will be elevated.


Day 2 Highlights (June 27)

🎤 Keynote 2: Jeff Dean (DeepMind / Google Research)

Jeff Dean explored:

  • an end-to-end AI-automated chip design flow
  • multi-modal reasoning
  • synthesizing RTL to layout with agents
  • challenges in scaling

This was one of the most ambitious visions presented at the event.


🎛️ AI for Analog, VHDL & HLS Sessions

Sessions included:

  • LATENT (analog Trojan insertion)
  • LEDRO (analog optimization with LLMs)
  • VHDL customization with LLMs

Seeing analog design enter the LLM space was exciting—this domain is rarely explored, and the results looked surprisingly promising.


Reasoning & Self-Improvement Track

Talks such as:

  • ROSUM-MCTS
  • Think, Prune, Train
  • SystemVerilog assertion generation using self-refinement

showed how researchers are exploring cognitive-style loops for hardware reasoning.


📊 Datasets & Benchmarking Track

This track was highly relevant to my work. Highlights:

  • DeepCircuitX
  • OpenRTLSet
  • HLS-Eval
  • SLDB

The rise of open hardware datasets will define the next decade of AI-for-EDA research.


Final Thoughts

Winning the ICLAD Hackathon 2025 while attending the conference was a milestone I’ll never forget. It tied together:

  • my research on agentic AI for hardware verification
  • my interest in autonomous debugging
  • my UCSC work in invariance detection and RTL signal reasoning
  • industry mentorship from NVIDIA researchers

ICLAD wasn’t just a conference—it was a glimpse into the future of chip design.

The vision is clear:

LLM agents will not replace engineers, but they will become essential partners—co-designers, debuggers, and reasoning engines that help us build faster, verify deeper, and innovate beyond current limits.

I left Stanford more inspired, more motivated, and more confident in the direction of my research career.