Browser-only character model
MicroMedGPT
A tiny GPT model for generating new medication names.
What It Is
MicroMedGPT is a very small language model trained to invent new medication-like names one character at a time. It explores a simple question: can a very small model learn the spelling rhythms of drug names well enough to produce plausible new ones?
The model has a tiny vocabulary of 27 tokens: the letters a to
z, plus one end-of-name token. That means every name is treated
as a sequence of characters rather than as words or subwords. This
keeps the machinery understandable and makes the model feel closer to a
transparent teaching example than a black box.
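To make the 27-token vocabulary concrete, here is a minimal sketch of a character-level encoder and decoder. The id layout (0 to 25 for the letters, 26 for the end-of-name token) and the function names are assumptions for illustration; the project itself may order its tokens differently.

```python
import string

# Assumption: ids 0-25 are the letters a-z and id 26 is the end-of-name token.
LETTERS = string.ascii_lowercase
END = len(LETTERS)              # 26
VOCAB_SIZE = len(LETTERS) + 1   # 27 tokens in total


def encode(name):
    """Turn a lowercase name into character ids followed by the end token."""
    return [LETTERS.index(ch) for ch in name] + [END]


def decode(ids):
    """Turn ids back into text, stopping at the first end token."""
    chars = []
    for i in ids:
        if i == END:
            break
        chars.append(LETTERS[i])
    return "".join(chars)
```

With this layout, `encode("abz")` yields three letter ids plus the end token, and `decode(encode(name))` returns the original name.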
Karpathy's MicroGPT
The project was inspired by Andrej Karpathy's
microgpt.py,
a compact, dependency-free GPT implementation written in ordinary Python.
Karpathy's version demonstrates the whole idea end to end: a transformer,
an autograd engine, training, and sampling, all in one file.
MicroMedGPT keeps that approach: training happens offline in pure Python, without PyTorch, TensorFlow, NumPy, or a GPU.
The Corpus
The training list is compiled from the US FDA National Drug Code Directory.
It uses the human prescription drug entries and pulls both proprietary and
non-proprietary names. The raw directory contains many catalogue-like product
descriptions, so a cleaning step lowercases the text, removes punctuation and
numbers so that only alphabetic characters remain, and strips common formulation
words such as tablet, capsule, injection, solution, cream, and spray.
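The cleaning steps described above could look roughly like the sketch below. It is not the project's actual cleaner. The stop list contains only the six words named here, the function name `clean_name` is hypothetical, and joining the remaining words into one space-free string is an assumption made because the vocabulary has no space token.

```python
import re

# Assumption: an illustrative stop list; the real cleaner's list may be longer.
FORMULATION_WORDS = {"tablet", "capsule", "injection", "solution", "cream", "spray"}


def clean_name(raw):
    """Lowercase, keep letters only, and drop formulation words."""
    text = raw.lower()
    # Punctuation and digits become spaces so that words stay separated.
    text = re.sub(r"[^a-z\s]", " ", text)
    words = [w for w in text.split() if w not in FORMULATION_WORDS]
    # Assumption: remaining words are joined, since there is no space token.
    return "".join(words)
```

For example, `clean_name("Amoxicillin Capsule, 500")` reduces to `amoxicillin`. Plural forms such as "tablets" would pass through this sketch unchanged unless they were added to the stop list.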
After cleaning, the corpus contains 6,091 drug names. This is enough for the model to notice some of the characteristic endings and internal shapes of medicine names, while still being small enough to train in about a minute on a laptop.
Training And Inference
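The offline training run lasted 1,000 steps and ended with a training loss of 2.262, which corresponds to a perplexity of about 9.61. Perplexity is a rough measure of how uncertain the model is about the next character: a value near 10 means the model is about as unsure as if it were choosing evenly among ten characters, compared with 27 for a blind guess over the whole vocabulary. Lower is better.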
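The link between loss and perplexity is a single exponential, sketched below. This assumes the reported loss is a mean per-character cross-entropy measured in nats (natural logarithm), which is consistent with the figures quoted here but is not stated explicitly.

```python
import math


def perplexity(mean_loss):
    """Perplexity is exp of the mean per-character cross-entropy (in nats)."""
    return math.exp(mean_loss)


# Loss reported for the final training step.
reported = perplexity(2.262)

# Baseline: a model that guesses uniformly over the 27 tokens.
uniform = perplexity(math.log(27))
```

Here `reported` comes out close to 9.60; the small gap to the quoted 9.61 is presumably because the loss was rounded to three decimals before being reported. The uniform baseline evaluates to 27, which is the perplexity of a model that has learned nothing.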
Once training finished, the Python script exported the learned weights as a JSON file. This webpage loads that file and runs the same transformer calculation in JavaScript. No server call is made when you press generate; the sampling happens locally in your browser.
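The sampling loop itself is simple enough to sketch. The version below is written in Python rather than the page's JavaScript and replaces the transformer with a hypothetical stand-in, `toy_logits`, so it illustrates only the control flow: turn logits into probabilities, draw one character, and stop at the end-of-name token. The token ids and the `max_len` cap are assumptions.

```python
import math
import random

LETTERS = "abcdefghijklmnopqrstuvwxyz"
END = 26  # assumed id of the end-of-name token


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_name(next_logits, rng, max_len=16):
    """Draw characters one at a time until the end token or max_len."""
    ids = []
    while len(ids) < max_len:
        probs = softmax(next_logits(ids))
        token = rng.choices(range(len(probs)), weights=probs)[0]
        if token == END:
            break
        ids.append(token)
    return "".join(LETTERS[i] for i in ids)


def toy_logits(ids):
    """Hypothetical stand-in for the transformer, not the real model.

    It treats all letters equally and favours ending after six characters.
    """
    logits = [0.0] * 27
    logits[END] = 3.0 if len(ids) >= 6 else -3.0
    return logits


name = sample_name(toy_logits, random.Random(0))
```

In the real page, the function passed as `next_logits` would be the transformer forward pass over the exported weights; everything else in the loop stays the same.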