SOFTWARE ENGINEER · BACKEND & DEVELOPER TOOLS
Yash Bhardwaj · Software Engineer
I build backend systems and developer tools, and make hard debugging feel less lonely.
/ about
I'm a software engineer who builds backend systems and developer tools. Right now I'm a Senior SDE at AMD on DCAuto, a firmware CI platform, where I also maintain Sherlog Holmes, an LLM log-triage assistant my team uses daily.
I came up through Qualcomm and Licious before this, mostly writing services in Python and Java backed by Postgres, Mongo, and a lot of Redis. BITS Pilani '21. Outside work, I'm at the gym, on a run, or walking by the lake.
Tell me what you're building.
/ experience
SEP 2025 - PRESENT · DCAUTO PLATFORM
DCAuto is AMD's internal platform for firmware CI and regression testing on MI450 data-center GPUs. It orchestrates deployments and validation workloads across racks of physical test systems, where a single flaky node can stall a whole regression run.
I built the self-check-in and fallback-recovery flows that keep those runs moving. The work is idempotent by design: a system can drop out mid-run, recover, and rejoin without double-counting or corrupting state, and the orchestrator retries instead of failing the batch.
I also built Sherlog Holmes (internally DCAutoAI), a five-stage triage system for the 50,000-line logs that GPU firmware validation produces. It uses embeddings and FAISS vector search to find the failure signal, then an LLM to propose a root cause. It ships as Flask APIs plus async services on Celery, Azure Service Bus, and MongoDB, and posts results to the firmware team in Teams. They use it daily.
AUG 2022 - JUN 2025 · QUALCOMM SOFTWARE CENTER
Software engineer on Qualcomm Software Center, an Electron app (Node.js, Angular, TypeScript) that delivers software and drivers to OEMs worldwide. I shipped the first macOS universal release, running natively on both Intel and Apple Silicon.
I owned release engineering. I automated the Jenkins and AWS CI/CD pipelines and the packaging steps that had been manual, and cut release times by 50%.
I also built the deploy, logging, and monitoring pipelines the team relied on to catch problems before they reached OEMs.
JUN 2021 - JUL 2022 · POS PLATFORM
Software development engineer on the POS platform that runs Licious nationwide. I wrote Java (Spring Boot) and PHP microservices for high-volume order processing across more than 100 delivery centers, and reworked the REST APIs to cut response times by 25%.
I wrote REST-assured regression suites to keep that surface stable, and automated the Docker and Kubernetes deployments so releases stopped depending on hand-run steps.
/ projects
SENIOR SDE @ AMD · 2025–
DCAuto
Firmware CI fabric for MI450 data-center GPUs. Orchestrates deployments and validation across physical test systems, with idempotent self-check-in and fallback-recovery flows.
AMD · LLM SYSTEMS
Sherlog Holmes
Five-stage failure triage for 50,000-line GPU firmware logs. Embeddings and FAISS vector search find the failure signal, then an LLM proposes a root cause. Ships as Flask APIs and Celery workers over Azure Service Bus and MongoDB, with results posted to Teams.
SOFTWARE ENGINEER @ QUALCOMM · 2022–25
Qualcomm Software Center
Electron app (Node.js, Angular, TypeScript) that delivers software and drivers to OEMs worldwide. Shipped the first macOS universal build (Intel and Apple Silicon); automated Jenkins and AWS CI/CD to cut release times by 50%.