# Project Structure

## Repository layout
```
tinylm/
│
├── tinylm/                  # Source code
│   ├── __init__.py
│   ├── tokenizer/           # BPE tokenizer — pure Python → optimized
│   │   └── __init__.py
│   ├── model/               # Transformer model — pure Python → PyTorch
│   │   └── __init__.py
│   ├── training/            # Training loop, optimizer, scheduler
│   │   └── __init__.py
│   ├── inference/           # Prefill, decode loop, sampling, KV cache
│   │   └── __init__.py
│   └── observability/       # Profiling, metrics, logging
│       └── __init__.py
│
├── tests/                   # One test directory per source module
│   ├── __init__.py
│   ├── tokenizer/
│   ├── model/
│   ├── training/
│   └── inference/
│
├── docs/                    # MkDocs documentation source
│   ├── index.md
│   ├── setup/
│   ├── phase0/              # Pure Python
│   ├── phase01/             # Manual autograd
│   ├── phase02/             # NumPy + PyTorch autograd
│   ├── phase03/             # PyTorch manual ops
│   ├── phase04/             # PyTorch proper
│   ├── phase1/              # Modernization
│   ├── phase2/              # Hardware optimization
│   └── assets/diagrams/     # ASCII and rendered diagrams
│
├── experiments/             # One-off scripts, notebooks, explorations
│   └── (not committed to main)
│
├── data/
│   ├── raw/                 # Original downloaded datasets
│   └── processed/           # Tokenized, binary-format training data
│
├── .github/
│   └── workflows/
│       └── docs.yml         # Auto-deploy docs on push to main
│
├── mkdocs.yml               # Docs site configuration
├── pyproject.toml           # Dependencies managed by uv
├── uv.lock                  # Lockfile — exact versions, reproducible
├── .python-version          # Python version pin for uv
├── .gitignore               # Python + data files ignored
└── README.md
```
## Design decisions
### Why separate `tinylm/` source from `tests/`?
This is the standard Python project layout: source and tests stay cleanly separated, and pytest discovers tests automatically under the `tests/` directory.
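As a concrete sketch of that discovery convention: any file matching `test_*.py` under `tests/` with functions named `test_*` is picked up with no registration step. The helper `count_pairs` below is hypothetical, used only to illustrate the shape of a tokenizer test; it is not the project's actual API.

```python
# tests/tokenizer/test_merges.py — discovered automatically by pytest
# because the filename matches test_*.py and the function matches test_*.

def count_pairs(ids):
    """Illustrative helper: count adjacent-pair frequencies,
    the core counting step of a BPE merge."""
    pairs = {}
    for a, b in zip(ids, ids[1:]):
        pairs[(a, b)] = pairs.get((a, b), 0) + 1
    return pairs

def test_count_pairs():
    assert count_pairs([1, 2, 2, 3]) == {(1, 2): 1, (2, 2): 1, (2, 3): 1}
```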
### Why separate `data/raw` and `data/processed`?
Raw data is the source of truth: it is never modified. Processing (tokenization, binary encoding) is reproducible from raw, so if the processing pipeline changes, you re-run it from `data/raw`. This pattern prevents the nightmare of "which version of the data did I train on?"

Neither `data/raw` nor `data/processed` is committed to git (both are in `.gitignore`). Datasets are downloaded separately.
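The raw-to-processed contract can be sketched as follows: raw files are read-only inputs, and every processed artifact is derived from them by a re-runnable function. The paths match the tree above, but `encode` and `preprocess` are illustrative stand-ins, not the project's actual pipeline.

```python
import array
from pathlib import Path

RAW = Path("data/raw")            # read-only inputs, never modified
PROCESSED = Path("data/processed")  # always regenerable from RAW

def encode(text: str) -> list[int]:
    """Stand-in tokenizer for the sketch: one token id per byte."""
    return list(text.encode("utf-8"))

def preprocess(name: str) -> Path:
    """Re-runnable: derives the binary artifact from the raw file."""
    ids = encode((RAW / f"{name}.txt").read_text())
    out = PROCESSED / f"{name}.bin"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(array.array("H", ids).tobytes())  # uint16 ids
    return out
```

Because `preprocess` takes no state other than the raw file, changing the tokenizer and re-running it regenerates `data/processed` deterministically.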
### Why is `experiments/` not committed?
Experiments are exploratory — one-off scripts to test an idea, Jupyter notebooks to visualize attention weights, quick benchmarks. They're not production code and shouldn't pollute the main branch. Keep them local.
### Why is `observability/` its own module?
From day one, every training run and inference call is instrumented. Profiling, loss curves, GPU utilization, tokens/sec — these aren't bolted on later. They're first-class. The observability module contains the logging setup, metric collectors, and profiler wrappers that every other module imports.
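A minimal sketch of what such a shared module might export: a process-wide metrics registry plus a timing context manager that other modules wrap around training steps or decode calls. The names (`Metrics`, `metrics`, `timer`) are illustrative assumptions, not the module's actual interface.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class Metrics:
    """Process-wide counter registry; every module imports the
    shared `metrics` instance below rather than creating its own."""
    def __init__(self):
        self.counters = defaultdict(float)

    def add(self, name: str, value: float = 1.0):
        self.counters[name] += value

metrics = Metrics()

@contextmanager
def timer(name: str):
    """Accumulate elapsed seconds under `name` (e.g. "train/step"),
    the raw input for tokens/sec and utilization reporting."""
    start = time.perf_counter()
    try:
        yield
    finally:
        metrics.add(name, time.perf_counter() - start)
```

A training loop could then instrument itself with `with timer("train/step"): ...` and `metrics.add("tokens", batch_tokens)`, keeping the measurement concerns in one importable place.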
## Running tests

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=tinylm --cov-report=term-missing

# Run tests for a specific module
pytest tests/tokenizer/
```
## Code style

```bash
# Format
black tinylm/ tests/

# Lint
ruff check tinylm/ tests/

# Both (run before every commit)
black tinylm/ tests/ && ruff check tinylm/ tests/
```
## What's next
Environment is set up, repo is structured, docs are live. Time to build. → Phase 0 Overview