Local AI Demo

Client-side Deterministic AI with IntgrNN: Runtime Learning in 206 Bytes

A demo walkthrough for game developers — and anyone curious about what small-scale machine learning actually looks like.


What is this?

IntgrNN is a neural network library that uses only integers. It is fast, small, and deterministic across platforms, and it needs no floating point hardware. The networks typically fit in L1 cache, run about 2x faster than floating point networks, and use a quarter of the memory.

The size and speed efficiency of integer neural networks let us deploy adaptive AI into game clients, handheld devices, toys and webpages. This AI runs fast, takes little space, uses dramatically less power, and maintains complete user privacy without even being online. It acts the same way, every time, on every device. If the AI does not work as intended, it is fully traceable, debuggable and patchable.

For game developers, this opens a new design space. Agents that learn during play. Behavior that adapts to the player. Systems that are fully deterministic and debuggable, but still surprising. The demo below shows what that looks like.


The Demo

A small creature named enen will solve five puzzles.

enen starts with random neural network weights — no pre-training, no loaded answers. Each trial, it guesses, gets feedback, and retrains on everything it remembers. After a few tries, the mistakes stop repeating.

All five brains that do this fit in 206 bytes combined. That’s smaller than the paragraph you just read.

What Did I Just See?

You watched five networks go from random initialization to correct behavior. Here’s what each puzzle tested and how the network handled it.

Puzzle 1: SIZE

The problem: Two mushrooms with different sizes and colors. Bigger is always safe, but color varies randomly. The network sees four inputs (size A, size B, color A, color B) and has to learn that two matter and two don’t.

The network: 4 inputs → 8 hidden neurons → 1 output. 49 bytes.

What happened: enen initially picks arbitrarily. After a few wrong answers and retraining passes, the weights shift until size dominates and color no longer influences the decision. This takes about 9 trials.

The skill: Ignore irrelevant features.

Puzzle 2: EXCEPTIONS

The problem: Two shapes. Circles are safe, squares are dangerous — except blue squares, which are the safest of all. The network can’t just learn “pick circles.” It has to learn that blue+square overrides the general rule.

The network: 4 inputs → 8 hidden neurons → 1 output. 49 bytes.

What happened: This takes longer than Puzzle 1 because the exception cases are rarer. The network has to see enough blue squares to learn they’re special.

The skill: Learn that some combinations are special.

Puzzle 3: CONTEXT

The problem: A light and two paths. Light ON means left is safe. Light OFF means right is safe. Neither path is always correct; the choice depends on context. This is the XOR problem, which a network without a hidden layer cannot solve because no single linear boundary separates the safe cases from the unsafe ones.

The network: 2 inputs → 4 hidden neurons → 1 output. 17 bytes.

What happened: This puzzle takes the longest. XOR requires the hidden layer to learn intermediate representations (“do light and path match?”) before the output can be correct. Some weight initializations converge faster than others.

The skill: Use context to change the decision.

Puzzle 4: ORDER

The problem: Two buttons, A and B. Press both to open a door, but order matters — A first, then B. The network’s only input is what was pressed last.

The network: 1 input → 4 hidden neurons → 2 outputs (one per button). 18 bytes.

What happened: The network learns two rules: “if nothing pressed, press A” and “if A pressed, press B.” This converges quickly once both cases have been seen.

The skill: Remember and respond to history.

Puzzle 5: EVERYTHING

The problem: A light and two objects of different sizes. Light ON means pick bigger. Light OFF means pick smaller. This combines Puzzle 1 (size comparison) with Puzzle 3 (context-dependent rules).

The network: 3 inputs → 8 hidden → 4 hidden → 1 output. Two hidden layers, 73 bytes.

What happened: Instead of stopping when enen demonstrates learning, this puzzle runs a gauntlet: 10 warmup trials for learning, then 20 scored trials. After warmup, the network typically scores 75-90%.

The skill: Combine learned abilities.


How Does That Work?

Two things make this possible: experience replay for learning, and integer arithmetic for efficiency.

Experience Replay

When enen gets something wrong, it doesn’t just learn from that one mistake. It retrains on every trial it has ever seen.

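// Called once per trial with the newest observation.
void learn(...) {
    // Store the new trial alongside every earlier one.
    history_.push_back(new_sample);

    // Replay the full history several times, not just the latest sample.
    for (int epoch = 0; epoch < EPOCHS_PER_TRIAL; ++epoch) {
        for (const auto& sample : history_) {
            forward(sample);   // compute the network's current answer
            backward(sample);  // nudge the weights toward the correct answer
        }
    }
}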

This technique — experience replay — was introduced in the early 1990s and became widely known through DeepMind’s Atari work in 2015. The insight is simple: small networks trained online can forget earlier lessons when they only see the latest gradient. Replaying the full history prevents that. Every trial reinforces the complete picture.

The cost grows with the history. Trial 10 retrains on 10 samples; trial 20 retrains on 20. Each learn() call is linear in the number of stored trials, so the total work over a session is quadratic. At this scale, quadratic is fine. A learn() call with 10 samples and 50 epochs runs 500 forward/backward passes through a network of about 50 parameters. On a modern CPU, that’s tens of microseconds. On a microcontroller, it’s under a millisecond. Either way, it fits in a 16ms frame budget with room to spare.

Integer Arithmetic

IntgrNN stores weights, activations, and gradients as 8-bit integers. There’s no floating point anywhere — not in the forward pass, the backward pass, or the weight update.

This has three consequences.

Size: Each parameter is one byte. A 4→8→1 network has 49 parameters (40 weights and 9 biases) and fits in 49 bytes. All five demo networks together total 206 bytes, small enough to live in L1 cache on any modern processor.

Speed: Integer math is at least as fast as floating point on most hardware, and dramatically faster on hardware without an FPU. With the network and its history fitting in cache, memory latency effectively disappears. A forward pass through the largest demo network takes single-digit microseconds.

Determinism: Integer operations produce identical results across platforms. The same seed gives the same initial weights. The same inputs give the same outputs. The same learning gives the same behavior. This matters for debugging, replay systems, networked games, and any application where reproducibility is required.


Try It

The demo runs in a terminal. The video above shows the full sequence.

Thanks for reading. If any of this seems useful, clone the demo and experiment. Change the puzzles, try different architectures, break things. The code is small enough to understand in an afternoon.

I think there’s a new kind of interactive experience waiting to be built with this stuff — agents that genuinely adapt, systems that learn during play, behavior that surprises even the designer. If you build something interesting, I’d like to see it.