A minimal, educational implementation of reverse-mode automatic differentiation (backpropagation) built from scratch in Python. Inspired by the PyTorch library, this project builds a scalar-valued autograd engine and uses it to train a small Multi-Layer Perceptron (MLP).

This project demonstrates how modern deep learning frameworks like PyTorch compute gradients under the hood. Everything is built on top of a single `Value` class that wraps a scalar number and keeps track of its computational history, enabling automatic gradient computation through the chain rule.
The notebook (`engine.ipynb`) walks through:

- Building the `Value` autograd engine
- Visualizing computation graphs with Graphviz
- Verifying gradients against PyTorch
- Implementing `Neuron`, `Layer`, and `MLP` classes
- Training an MLP with gradient descent
```text
micrograd/
├── engine.ipynb   # Main notebook: autograd engine + neural network demo
└── README.md      # Project documentation
```
The `Value` class is the heart of this project. It wraps a scalar and records every mathematical operation applied to it, forming a dynamic computation graph. Calling `.backward()` on the output node propagates gradients all the way back to the inputs via a topological sort and the chain rule.
Supported operations:
| Operation | Method/Operator |
|---|---|
| Addition | `+`, `__add__`, `__radd__` |
| Multiplication | `*`, `__mul__`, `__rmul__` |
| Power | `**`, `__pow__` |
| Division | `/`, `__truediv__` |
| Negation | `-`, `__neg__` |
| Subtraction | `-`, `__sub__` |
| Exponential | `.exp()` |
| Tanh activation | `.tanh()` |
| Backpropagation | `.backward()` |
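For orientation, here is a minimal sketch of what such a `Value` class can look like, simplified relative to the notebook (only `+`, `*`, `tanh`, and `backward` are shown; attribute names such as `data`, `grad`, `_prev`, and `_backward` follow the conventions used in the notebook):

```python
import math

class Value:
    """Wraps a scalar and records the operations that produced it."""

    def __init__(self, data, _children=(), _op='', label=''):
        self.data = data
        self.grad = 0.0                 # d(output)/d(this node), filled in by backward()
        self._prev = set(_children)     # nodes this value was computed from
        self._op = _op                  # operation that produced this node
        self._backward = lambda: None   # local chain-rule step
        self.label = label

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other), '+')
        def _backward():
            self.grad += out.grad                # d(a + b)/da = 1
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other), '*')
        def _backward():
            self.grad += other.data * out.grad   # d(a * b)/da = b
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,), 'tanh')
        def _backward():
            self.grad += (1 - t ** 2) * out.grad  # d tanh(x)/dx = 1 - tanh(x)^2
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply each local rule in reverse order.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for node in reversed(topo):
            node._backward()
```

The notebook's full version also implements `__pow__`, `__truediv__`, `.exp()`, and the reflected operators listed in the table above.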
```python
from engine import Value

x1 = Value(2.0, label='x1')
w1 = Value(-3.0, label='w1')
b = Value(6.881, label='b')

o = (x1 * w1 + b).tanh()
o.backward()

print(x1.grad)  # ∂o/∂x1
print(w1.grad)  # ∂o/∂w1
```

Using Graphviz, the notebook renders the full computation graph of a neuron, showing each node's data value and its computed gradient after backpropagation.
```python
draw_dot(o)  # renders an SVG computation graph
```
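The `draw_dot` helper is defined in the notebook itself; one way such a helper can be written with the `graphviz` Python package looks roughly like the sketch below (a sketch, assuming the `Value` attributes `_prev`, `_op`, `label`, `data`, and `grad` from the class sketch above):

```python
from graphviz import Digraph

def trace(root):
    # Walk backwards from the output, collecting every node and edge in the graph.
    nodes, edges = set(), set()
    def build(v):
        if v not in nodes:
            nodes.add(v)
            for child in v._prev:
                edges.add((child, v))
                build(child)
    build(root)
    return nodes, edges

def draw_dot(root):
    dot = Digraph(format='svg', graph_attr={'rankdir': 'LR'})  # left-to-right layout
    nodes, edges = trace(root)
    for n in nodes:
        uid = str(id(n))
        # One record per Value, showing its label, data, and grad.
        dot.node(uid, label=f"{{ {n.label} | data {n.data:.4f} | grad {n.grad:.4f} }}",
                 shape='record')
        if n._op:
            # A small extra node for the operation that produced this Value.
            dot.node(uid + n._op, label=n._op)
            dot.edge(uid + n._op, uid)
    for a, b in edges:
        dot.edge(str(id(a)), str(id(b)) + b._op)
    return dot
```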
Built entirely on top of `Value`, these classes implement a fully functional MLP:

| Class | Description |
|---|---|
| `Neuron(nin)` | A single neuron with `nin` inputs, random weights & bias, tanh activation |
| `Layer(nin, nout)` | A layer of `nout` neurons |
| `MLP(nin, nouts)` | A multi-layer perceptron: stacks layers defined by `nouts` |
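A sketch of how these classes are typically stacked on top of `Value` (simplified; the notebook's exact initialization and method names may differ slightly):

```python
import random

class Neuron:
    def __init__(self, nin):
        self.w = [Value(random.uniform(-1, 1)) for _ in range(nin)]  # one weight per input
        self.b = Value(random.uniform(-1, 1))

    def __call__(self, x):
        # w · x + b, squashed through tanh
        act = sum((wi * xi for wi, xi in zip(self.w, x)), self.b)
        return act.tanh()

    def parameters(self):
        return self.w + [self.b]

class Layer:
    def __init__(self, nin, nout):
        self.neurons = [Neuron(nin) for _ in range(nout)]

    def __call__(self, x):
        outs = [n(x) for n in self.neurons]
        return outs[0] if len(outs) == 1 else outs

    def parameters(self):
        return [p for n in self.neurons for p in n.parameters()]

class MLP:
    def __init__(self, nin, nouts):
        sizes = [nin] + nouts
        self.layers = [Layer(sizes[i], sizes[i + 1]) for i in range(len(nouts))]

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

    def parameters(self):
        return [p for layer in self.layers for p in layer.parameters()]
```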
```python
model = MLP(3, [4, 4, 1])       # 3 inputs → [4, 4] hidden → 1 output
print(len(model.parameters()))  # 41 trainable parameters
```

Each layer contributes `nout * (nin + 1)` parameters (one weight per input plus a bias per neuron), so the count is 4×4 + 4×5 + 1×5 = 16 + 20 + 5 = 41.

The MLP is trained on a small binary classification dataset using mean squared error loss and manual gradient descent:
```python
xs = [
    [2.0, 3.0, -1.0],
    [3.0, -1.0, 0.5],
    [0.5, 1.0, 1.0],
    [1.0, 1.0, -1.0],
]
ys = [1.0, -1.0, -1.0, 1.0]  # target labels

for k in range(30):
    # Forward pass
    ypred = [model(x) for x in xs]
    loss = sum((yout - ygt)**2 for ygt, yout in zip(ys, ypred))

    # Backward pass
    for p in model.parameters():
        p.grad = 0.0  # zero gradients before accumulation
    loss.backward()

    # Gradient descent update
    for p in model.parameters():
        p.data += -0.05 * p.grad

    print(k, loss.data)
```

After 30 iterations the loss drops from ~0.071 to ~0.015, and predictions approach the target values (±1.0).
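To see how close the trained model gets, the final predictions can be inspected directly (a quick check against the targets above):

```python
ypred = [model(x) for x in xs]
print([round(yp.data, 3) for yp in ypred])  # should be close to [1.0, -1.0, -1.0, 1.0]
```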
The notebook includes a side-by-side check confirming that the custom `Value` engine produces identical gradients to PyTorch's autograd:

```text
x2 grad: 0.5000 (PyTorch: 0.5000 ✓)
w2 grad: 0.0000 (PyTorch: 0.0000 ✓)
x1 grad: -1.5000 (PyTorch: -1.5000 ✓)
w1 grad: 1.0000 (PyTorch: 1.0000 ✓)
```
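For reference, the PyTorch side of such a check can be written along these lines (a sketch; the input values mirror the neuron example above and are assumptions about the exact constants used in the notebook):

```python
import torch

# Leaf tensors with gradients enabled, one per input of the neuron
x1 = torch.tensor(2.0, requires_grad=True)
x2 = torch.tensor(0.0, requires_grad=True)
w1 = torch.tensor(-3.0, requires_grad=True)
w2 = torch.tensor(1.0, requires_grad=True)
b = torch.tensor(6.8813735870195432, requires_grad=True)

o = torch.tanh(x1 * w1 + x2 * w2 + b)  # same forward pass as the Value version
o.backward()

print(x1.grad.item(), w1.grad.item())  # expected: -1.5, 1.0
print(x2.grad.item(), w2.grad.item())  # expected:  0.5, 0.0
```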
| Package | Purpose |
|---|---|
| `numpy` | Numerical utilities |
| `matplotlib` | Plotting |
| `graphviz` | Computation graph visualization |
| `torch` | Gradient verification only |
Install dependencies:
```bash
pip install numpy matplotlib graphviz torch
```

Note: You also need the Graphviz system binary installed and added to your `PATH` for graph rendering to work.
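A quick way to confirm the system binary is available (assuming a standard Graphviz installation) is to check its version from a terminal:

```bash
dot -V
```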
- Clone the repository:

  ```bash
  git clone https://github.com/rabobahago/micrograd.git
  cd micrograd
  ```

- Install dependencies:

  ```bash
  pip install numpy matplotlib graphviz torch
  ```

- Open the notebook:

  ```bash
  jupyter notebook engine.ipynb
  ```

- Run all cells to follow the full implementation from the autograd engine to a trained neural network.
This project is ideal for understanding:
- How backpropagation works at the scalar level
- How PyTorch-style autograd graphs are built dynamically
- How `grad`, `_backward`, and `_prev` tie together in reverse-mode AD
- How simple building blocks (`Value`) compose into full neural networks
- Andrej Karpathy — Neural Networks: Zero to Hero
- Andrej Karpathy — micrograd (GitHub)
- The spelled-out intro to neural networks and backpropagation (YouTube)
This project is for educational purposes. Feel free to use, modify, and share it.