Browser-only character model

MicroMedGPT

A tiny GPT model for generating new medication names.

    What It Is

    MicroMedGPT is a very small language model trained to invent new medication-like names one character at a time. It explores a simple question: can a very small model learn the spelling rhythms of drug names well enough to make plausible new ones?

    The model has a tiny vocabulary of 27 tokens: the letters a to z, plus one end-of-name token. That means every name is treated as a sequence of characters rather than as words or subwords. This keeps the machinery understandable and makes the model feel closer to a transparent teaching example than a black box.
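    To make the 27-token vocabulary concrete, here is a minimal sketch of a character-level encoder and decoder. The id layout (0 to 25 for the letters, 26 for the end-of-name token) and the names `encode`, `decode`, and `END` are assumptions chosen for illustration; the project's own ordering may differ.

```python
import string

# Assumed layout: ids 0-25 are 'a'-'z', id 26 is the end-of-name token.
LETTERS = string.ascii_lowercase
END = len(LETTERS)              # 26
VOCAB_SIZE = len(LETTERS) + 1   # 27 tokens in total


def encode(name: str) -> list[int]:
    """Map a lowercase a-z name to token ids and append the end-of-name token."""
    return [LETTERS.index(ch) for ch in name] + [END]


def decode(ids: list[int]) -> str:
    """Turn token ids back into text, stopping at the first end-of-name token."""
    chars = []
    for token in ids:
        if token == END:
            break
        chars.append(LETTERS[token])
    return "".join(chars)


if __name__ == "__main__":
    ids = encode("aspirin")
    print(ids)
    print(decode(ids))
```

    Because every token is a single character, a name of n letters always becomes n + 1 tokens, which keeps sequences short and the embedding table tiny.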

    Karpathy's MicroGPT

    The project was inspired by Andrej Karpathy's microgpt.py, a compact, dependency-free GPT implementation written in ordinary Python. Karpathy's version demonstrates the whole idea end to end: a transformer, an autograd engine, training, and sampling, all in one file.

    MicroMedGPT keeps that approach: training happens offline in pure Python, without PyTorch, TensorFlow, NumPy, or a GPU.

    The Corpus

    The training list is compiled from the US FDA National Drug Code Directory. It uses the human prescription drug entries and pulls both proprietary and non-proprietary names. The raw directory contains many catalogue-like product descriptions, so a cleaner lowercases the text, removes punctuation and numbers, keeps alphabetic characters only, and strips common formulation words such as tablet, capsule, injection, solution, cream, and spray.
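    The cleaning steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual cleaner: the stop list contains only the six words named in the text, the function names are hypothetical, and joining multi-word names into one string and deduplicating the result are guesses about how the real pipeline behaves.

```python
import re

# Assumption: only the six formulation words named in the text.
# The real cleaner's list is probably longer (plurals, units, and so on).
FORMULATION_WORDS = {"tablet", "capsule", "injection", "solution", "cream", "spray"}


def clean_name(raw: str) -> str:
    """Lowercase, drop everything except a-z, and strip formulation words."""
    text = raw.lower()
    # Replace punctuation, digits, and any other non-letter with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    words = [w for w in text.split() if w not in FORMULATION_WORDS]
    # Assumption: the remaining words are joined, since the vocabulary has no space token.
    return "".join(words)


def build_corpus(raw_names: list[str]) -> list[str]:
    """Clean every entry, then drop empties and duplicates (an assumed step)."""
    return sorted({cleaned for cleaned in map(clean_name, raw_names) if cleaned})


if __name__ == "__main__":
    print(build_corpus(["Ibuprofen Tablet, 200", "IBUPROFEN", "Co-Trimoxazole Injection"]))
```

    Note that a unit such as "mg" would survive this sketch, so a production cleaner would need a broader stop list than the one shown here.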

    After cleaning, the corpus contains 6,091 drug names. This is enough for the model to notice some of the characteristic endings and internal shapes of medicine names, while still being small enough to train in about a minute on a laptop.

    Training And Inference

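    The offline training run lasts 1,000 steps and ends with a training loss of 2.262, which corresponds to a perplexity of about 9.61. Perplexity is a rough measure of how uncertain the model is about the next character: a value near 9.6 means the model is, on average, about as unsure as if it were choosing evenly among roughly ten characters, whereas a model that had learned nothing would score 27, one for each token. Lower is better.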
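    The link between loss and perplexity is a single exponential. The short check below assumes the reported loss is an average cross-entropy measured in nats (natural logarithm), which is the usual convention but is not stated on this page. With the rounded loss of 2.262 it gives roughly 9.60, so the 9.61 quoted above presumably comes from the unrounded loss.

```python
import math

# Assumption: the loss is average cross-entropy in nats (natural log).
final_loss = 2.262
perplexity = math.exp(final_loss)

# Baseline: a model that spreads probability evenly over all 27 tokens
# has loss ln(27) and therefore perplexity exactly 27.
uniform_loss = math.log(27)
uniform_perplexity = math.exp(uniform_loss)

if __name__ == "__main__":
    print(f"perplexity from loss {final_loss}: {perplexity:.2f}")
    print(f"uniform baseline: loss {uniform_loss:.3f}, perplexity {uniform_perplexity:.0f}")
```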

    Once training finishes, the Python script exports the learned weights as a JSON file. This webpage loads that file and runs the same transformer calculation in JavaScript. No server call is made when you press generate; the sampling happens locally in your browser.
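    The sampling loop that runs when you press generate can be sketched as below. It is written in Python for consistency with the training side, although the page itself does this in JavaScript. The function `sample_name`, the `max_len` cap of 16, and the token id of the end-of-name token are all assumptions, and `toy_probs` is a stand-in for the transformer's forward pass rather than the real model.

```python
import random
import string
from typing import Callable, Sequence

LETTERS = string.ascii_lowercase
END = 26  # assumed id of the end-of-name token


def sample_name(
    next_probs: Callable[[Sequence[int]], Sequence[float]],
    rng: random.Random,
    max_len: int = 16,  # hypothetical safety cap on name length
) -> str:
    """Draw one character at a time until the end-of-name token is sampled."""
    ids: list[int] = []
    for _ in range(max_len):
        probs = next_probs(ids)  # distribution over the 27 tokens
        token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        if token == END:
            break
        ids.append(token)
    return "".join(LETTERS[i] for i in ids)


def toy_probs(prefix: Sequence[int]) -> list[float]:
    """Stand-in for the model: uniform letters, ending allowed after 3 characters."""
    weights = [1.0] * 26 + [0.0]
    if len(prefix) >= 3:
        weights[END] = 26.0
    return weights


if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_name(toy_probs, rng) for _ in range(3)])
```

    In the real page, `next_probs` would be replaced by the transformer evaluated with the weights loaded from the JSON file; the loop around it stays the same.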