2020–2021

Building a robot that learns on its own

From zero to a deep Q-network (DQN) trained in a custom 3D simulation.

In 2020 I had one fixed idea in my head: build a robot that would learn to move on its own. Not buy one. Build it. And teach it.

R0-BB1 was born on my desk. Cheap servos, a microcontroller, lots of tape. But the interesting part wasn't the hardware; it was what ran inside: the deep Q-network that would control it.

Simulation first

Training a reinforcement learning agent in the physical world is brutal. Every mistake is broken plastic. So the first thing was to build a custom 3D simulation where the robot could fall a thousand times without consequence.

The loop was simple to describe, brutal to tune:

The agent observes its environment
Picks an action
Receives a reward (positive for walking, negative for falling)
Updates the weights of its network
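
To make the four steps above concrete, here is a minimal sketch, not the code that ran on R0-BB1. It assumes a made-up one-dimensional "tilt" environment instead of the 3D simulation, and it swaps the neural network for a plain Q-table so it runs with the standard library alone. The names step, greedy_action, train and every hyperparameter are illustrative assumptions.

```python
import random

# Hypothetical toy stand-in for the simulation: the robot has a discrete tilt,
# and each action leans it left, holds, or leans it right.
ACTIONS = (-1, 0, +1)
MAX_TILT = 2  # beyond this tilt the toy robot falls


def step(tilt, action):
    """Apply an action and return (next_tilt, reward, done)."""
    next_tilt = tilt + action
    if abs(next_tilt) > MAX_TILT:
        return next_tilt, -1.0, True  # falling: negative reward, episode ends
    reward = 1.0 if next_tilt == 0 else 0.0  # upright step: positive reward
    return next_tilt, reward, False


def greedy_action(q_table, tilt):
    """Index of the highest-valued action for this tilt (ties go to the first)."""
    values = q_table[tilt]
    return max(range(len(ACTIONS)), key=lambda i: values[i])


def train(episodes=2000, max_steps=30, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; a DQN would replace q_table with a network."""
    rng = random.Random(seed)
    q_table = {t: [0.0] * len(ACTIONS) for t in range(-MAX_TILT, MAX_TILT + 1)}
    for _ in range(episodes):
        tilt = rng.randint(-MAX_TILT, MAX_TILT)  # 1. observe the starting state
        for _ in range(max_steps):
            # 2. pick an action: explore with probability epsilon, else exploit
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy_action(q_table, tilt)
            # 3. receive a reward from the environment
            next_tilt, reward, done = step(tilt, ACTIONS[a])
            # 4. update the estimate toward reward + discounted best next value
            best_next = 0.0 if done else max(q_table[next_tilt])
            q_table[tilt][a] += alpha * (reward + gamma * best_next - q_table[tilt][a])
            if done:
                break
            tilt = next_tilt
    return q_table


q_table = train()
for tilt in sorted(q_table):
    print("tilt", tilt, "-> action", ACTIONS[greedy_action(q_table, tilt)])
```

In a deep Q-network the table lookup becomes a forward pass and step 4 becomes a gradient step on the same target, usually with a replay buffer added; the shape of the loop stays the same.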

Hundreds of thousands of iterations later, R0-BB1 started walking.

There's nothing like watching something you built make decisions on its own.

R0-BB2: the second generation

In 2021 came R0-BB2. Better hardware, finer sensors, and a model trained on a simulation dataset 10x larger. But the important lesson was different: the detail of the simulated environment matters more than the complexity of the model.

What I took away

Systems that learn are a conversation between you and the machine, not a procedure
Patience in iterating pays off more than speed in coding
The difference between a project that works and one that doesn't is the quality of the simulation

This project brought me into applied AI the only honest way: breaking things until they stopped breaking.