
Errors are patterns

A wrong answer is sometimes the clearest view into how a system behaves.

Angus Uelsmann, 4 min read
[Image (AI-generated): Repeated AI mistakes reveal patterns in system behavior instead of isolated random errors.]

A language model can give an answer that feels random. That does not mean randomness happened. Sometimes a wrong answer is more useful than a correct one, because it shows the shape of the system behind it.

  • ai
  • systems
  • patterns
  • debugging

Errors are patterns before they are problems.

Core claim

  • A repeated mistake is usually not just noise.
  • Models do not roll dice unless the system gives them dice.
  • When different systems fail in a similar shape, the failure becomes a signal.

The pattern

Ask a language model for a random number between 1 and 1000.

Sometimes it gives you a number that feels random.

Then someone else asks the same thing and gets the same number.

Then more people test it across different assistants and similar answers appear again.

At that point, the interesting part is not the number.

The interesting part is that different systems can learn a similar shape of randomness.

Randomness check

  • Looks: A random number was requested and a number was returned.
  • Unknown: Possibly no tool, code execution, entropy source or actual sampling was used.
  • Risk: The output feels like randomness, but may only be the model producing what randomness looks like in text.

Screenshots

  • Claude returning 742 in a mobile chat view. Caption: One system. Same number. Different surface.
  • Google AI overview showing 742 after the same random-number prompt. Caption: Google showing the same answer shape.
  • Gemini answering 742 for the random-number prompt. Caption: Gemini landing on 742 again.
  • Another assistant interface also returning 742. Caption: The pattern repeats across apps.
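
A rough way to see why four matching answers are hard to explain as luck: the sketch below assumes, purely for illustration, that each of the four 742 answers above was an independent draw from a truly uniform source over 1 to 1000. That assumption is not established by the screenshots, and `chance_all_match` is a hypothetical helper name.

```python
from fractions import Fraction


def chance_all_match(draws: int, range_size: int = 1000) -> Fraction:
    """Probability that `draws` independent uniform picks all land on the same value."""
    if draws < 1:
        raise ValueError("need at least one draw")
    # The first draw can be anything; every later draw has to match it.
    return Fraction(1, range_size) ** (draws - 1)


p = chance_all_match(4)
print(p, float(p))  # 1/1000000000 1e-09
```

Under that assumption, four matching answers would be roughly a one-in-a-billion event. A shared learned preference is a far simpler explanation than coincidence.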

The wrong mental model

The mistake is easy to make.

You ask for a random number, so it feels like the system should choose one from the range.

Like rolling a die.

But without a tool, a model is not necessarily sampling from the number space. It is generating a plausible continuation of the prompt.

You are not asking for randomness. You are asking for what randomness looks like in text.

That distinction matters.

Where it works

For many casual cases, this does not matter.

If you ask for a random dinner idea, a placeholder number, a playful choice or a throwaway example, the output only needs to feel varied enough.

A model can be useful there because the task is not really about entropy. It is about suggestion.

Not every request that says random actually needs randomness.

Sometimes “surprise me” is enough.

Where it breaks

It breaks when the user thinks the system made a real random choice.

Giveaway winners. Security tokens. Sampling logic. A/B assignment. Fair ordering. Anything where the result must not just look arbitrary, but actually be generated from a reliable source of randomness.

That is a different job.

Code execution, CSPRNGs, system entropy, hardware signals, timing, user interaction and proper sampling logic are not the same thing as a model predicting the next plausible token.

A language-shaped answer is not automatically a system-shaped solution.
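
For contrast, here is a minimal sketch of what a system-shaped solution might look like in Python, using the standard library's `secrets` module, which draws from the operating system's CSPRNG. The function names and the example use cases are illustrative assumptions, not part of the original discussion.

```python
import secrets


def random_number(low: int = 1, high: int = 1000) -> int:
    """Pick an integer in [low, high] from the operating system's CSPRNG."""
    if low > high:
        raise ValueError("low must not exceed high")
    # randbelow(n) returns an int in [0, n), so shift it into the range.
    return low + secrets.randbelow(high - low + 1)


def pick_winner(entrants: list[str]) -> str:
    """Choose one giveaway entrant, each with equal probability."""
    if not entrants:
        raise ValueError("no entrants to choose from")
    return secrets.choice(entrants)


def new_token(nbytes: int = 32) -> str:
    """Generate a URL-safe token from `nbytes` random bytes."""
    return secrets.token_urlsafe(nbytes)


print(random_number())
print(pick_winner(["ada", "grace", "linus"]))  # hypothetical entrants
print(new_token())
```

The point is not these particular functions. The point is that the number comes from an actual sampling step that can be inspected, logged and audited.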

The real problem

The real problem is not that a model says the same number twice.

The real problem is when people treat the answer as proof that the requested process happened.

This shows up far beyond random numbers.

  • A model can explain an API without checking whether that API exists.
  • A model can write secure-looking code without understanding the surrounding data boundaries.
  • A model can produce a confident architecture answer while hiding the assumptions that make it fragile.

The output can look like the thing you asked for.

That does not mean the system actually did the thing you meant.
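
The first bullet above is the easiest one to check mechanically. A minimal sketch, assuming the API in question is an attribute of an importable Python module; `api_exists` is a hypothetical helper name. It only confirms that a name resolves, not that it behaves the way the model described.

```python
import importlib


def api_exists(module_name: str, attr_path: str) -> bool:
    """Return True if `module_name` imports and `attr_path` resolves on it."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    # Walk dotted paths such as "path.join" one attribute at a time.
    for part in attr_path.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


print(api_exists("secrets", "randbelow"))       # True
print(api_exists("secrets", "random_between"))  # False: a made-up name
```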

Why errors matter

Errors are useful because they expose system behavior.

A weird answer is not always just a failure to laugh at. Sometimes it is the clearest diagnostic surface you get.

Repeated errors show where the model has learned a shortcut. Shared errors across different systems show where the shortcut may come from something larger: similar data, similar preference training, similar prompts, similar human expectations.

The failure is not random when it keeps the same shape.

That is why small mistakes matter.
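
One way to make "the same shape" concrete is to tally collected answers and look at how concentrated they are. The sketch below is an assumption about how that could be done, with a hypothetical helper name and a hypothetical log. The four 742 entries echo the answers described earlier, and 317 is invented for illustration.

```python
from collections import Counter


def most_repeated(answers: list[int]) -> tuple[int, float]:
    """Return the most common answer and the share of responses that gave it."""
    if not answers:
        raise ValueError("no answers to tally")
    value, count = Counter(answers).most_common(1)[0]
    return value, count / len(answers)


# Hypothetical log of answers to "give me a random number between 1 and 1000".
collected = [742, 742, 742, 742, 317]
value, share = most_repeated(collected)
print(value, share)  # 742 0.8
```

With a uniform source over 1000 values, any single value should account for about 0.1% of answers over many runs. A share far above that is the signal worth investigating.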

Closing

The number is not the point.

The pattern is.

When a system gives the same wrong-looking answer in the same kind of situation, it is showing you something.

Good debugging starts there.

A wrong answer can be a better signal than a lucky correct one.

Related note

This idea came out of a Threads discussion as well: the original post.
