2020–2021
Building a robot that learns on its own
From zero to a Deep Q-network (DQN) trained in a custom 3D simulation.
In 2020 I had one fixed idea in my head: build a robot that would learn to move on its own. Not buy one. Build it. And teach it.
R0-BB1 was born on my desk. Cheap servos, a microcontroller, lots of tape. But the interesting part wasn't on the outside. It was inside: the Deep Q-network that would control it.
Simulation first
Training a reinforcement learning agent in the physical world is brutal. Every mistake is broken plastic. So the first step was to build a custom 3D simulation where the robot could fall a thousand times without consequence.
The loop was simple to describe, brutal to tune:
- The agent observes its environment
- Picks an action
- Receives a reward (positive for walking, negative for falling)
- Updates the weights of its network
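To make the four steps above concrete, here is a minimal sketch. It is not the code that trained R0-BB1. It swaps the neural network for a plain Q-table (tabular Q-learning, the simplest relative of deep Q-learning) and the 3D simulation for a hypothetical five-position track. Every name and constant in it (`step`, `choose_action`, `train`, `GAMMA`, `ALPHA`, `EPSILON`) is an illustrative assumption.

```python
import random

N_STATES = 5         # hypothetical 1D track: positions 0..4, goal at position 4
ACTIONS = (-1, +1)   # index 0 = step backward, index 1 = step forward
GAMMA = 0.9          # discount factor (assumed value)
ALPHA = 0.1          # learning rate (assumed value)
EPSILON = 0.2        # exploration rate (assumed value)


def step(state, action_index):
    """Toy stand-in for the simulator: returns (next_state, reward, done)."""
    next_state = state + ACTIONS[action_index]
    if next_state < 0:
        return state, -1.0, True       # "fell": negative reward, episode ends
    if next_state == N_STATES - 1:
        return next_state, 1.0, True   # "walked" to the goal: positive reward
    return next_state, 0.0, False


def choose_action(q_row, rng):
    """Epsilon-greedy: usually take the best-known action, sometimes explore."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    best = max(q_row)
    # Break ties at random so an untrained agent does not always walk backward.
    return rng.choice([i for i, q in enumerate(q_row) if q == best])


def train(episodes=2000, max_steps=50, seed=0):
    rng = random.Random(seed)
    # The Q-table plays the role of the network weights in this sketch.
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 1
        for _ in range(max_steps):
            action = choose_action(q[state], rng)            # observe, pick an action
            next_state, reward, done = step(state, action)   # receive a reward
            # Q-learning target: reward plus discounted best value of the next state.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])  # update
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    q_table = train()
    policy = ["forward" if row[1] > row[0] else "backward" for row in q_table[:-1]]
    print(policy)
```

On this toy track the learned policy should end up preferring "forward" in every non-goal position. A deep Q-network follows the same loop but replaces the table lookup with a forward pass and the table update with a gradient step on the same target.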
Hundreds of thousands of iterations later, R0-BB1 started walking.
There's nothing like watching something you built make decisions on its own.
R0-BB2: the second generation
In 2021 came R0-BB2. Better hardware, finer sensors, and a model trained on a simulation dataset 10x larger. But the important lesson was different: the detail of the simulated environment matters more than the complexity of the model.
What I took away
- Systems that learn are a conversation between you and the machine, not a procedure
- Patience in iterating pays off more than speed in coding
- The difference between a project that works and one that doesn't is the quality of the simulation
This project brought me into applied AI the only honest way: breaking things until they stopped breaking.