Tests Verify Behavior, Types Verify Structure

Written by Lucian Ghinda

Working with LLMs to generate code has made me think deeply about the relationship between tests and types. Both matter for code correctness, but they provide fundamentally different kinds of confidence.

Tests verify behavior

Tests answer a straightforward question: Does the system behave as expected in these scenarios?

They provide concrete evidence that specific cases work. When you run your test suite and see green, you know those particular scenarios pass. Tests also protect against regressions. When you refactor code or add new features, your existing tests catch breaks in functionality you already built. During development, they give you fast feedback about whether your changes work. I have written separately about how to think about testing as a developer and what purpose tests serve in your development process.

But tests have clear limits. They only check what you thought to test. If you miss an edge case or make wrong assumptions about how the system should behave, your tests will pass while bugs remain. Passing tests do not mean there are no bugs. Tests are only as strong as your understanding of the product, target market, business goals, and user needs.
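To make that limit concrete, here is a minimal sketch in Python. The function `is_leap_year` and its test are hypothetical, invented for illustration: the test covers only the cases its author thought of, so the suite is green while a century year still gets the wrong answer.

```python
def is_leap_year(year: int) -> bool:
    """Hypothetical implementation that forgets the century rule."""
    return year % 4 == 0


def test_typical_years() -> None:
    # The only cases the author thought to check; both pass.
    assert is_leap_year(2024) is True
    assert is_leap_year(2023) is False


test_typical_years()  # runs without error: the suite is green

# An untested input still gives the wrong answer: 1900 was not a leap year
# in the Gregorian calendar, yet this implementation says it was.
print(is_leap_year(1900))  # True
```

The green run says nothing about 1900 because nobody asked about 1900.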

Tests give you confidence through examples. They show that specific inputs produce expected outputs.

Types verify structure

Types answer a different question: Can this category of errors happen?

They provide guarantees before the code runs. A type checker validates your program's structure without executing it. Types eliminate entire categories of invalid states. If your type system says you cannot pass a string where an integer is expected, that error cannot occur at runtime in code the checker has accepted. They also make refactoring safer. When you change a function signature, the type checker finds every call site that needs updating.

But types also have limits. They express structure, not business logic. A function that takes two integers and returns an integer might be typed correctly but still implement the wrong calculation. Types cannot encode real-world correctness. They prevent structural mistakes but allow logical ones.
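A small sketch of that gap, in Python with type hints. The function name `total_price_cents` and its parameters are assumptions made up for this example. The annotations are internally consistent, so a static checker such as mypy should have nothing to complain about, yet the calculation is wrong:

```python
def total_price_cents(unit_price_cents: int, quantity: int) -> int:
    """Hypothetical pricing helper: two integers in, one integer out."""
    # Bug: adds instead of multiplying. The signature is still satisfied,
    # because int + int is an int.
    return unit_price_cents + quantity


result = total_price_cents(250, 4)
print(result)  # 254, while the intended total is 1000
```

Only a check against the expected value, for example asserting that 4 items at 250 cents cost 1000 cents, would reveal the mistake.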

Types give you confidence through constraints. They restrict what operations are possible on your data.

The reality with LLMs

When generating code with LLMs, this distinction becomes crucial.

LLMs are surprisingly good at producing structurally valid code. Modern language models understand syntax, common patterns, and basic type requirements. The real problem is that they often generate code that compiles and runs but does the wrong thing.

They make wrong assumptions about your requirements. They miss edge cases you forgot to mention. They misunderstand the business logic you tried to describe. The code looks right and passes basic structural validation, but it implements the wrong behavior.

This is where tests become essential. A test suite that captures what the code should actually do catches these subtle logic errors that types cannot see. Types help ensure the code is structurally sound, but tests validate that the LLM understood what you needed.
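As an illustration, suppose the requirement is "orders of 100.00 or more ship free". The function `shipping_cents_generated` below is a hand-written, hypothetical stand-in for LLM output, not real model output, and every name and amount here is an assumption. Writing the requirement down as a small table of examples is one way to expose a misread boundary:

```python
from typing import Callable, List, Tuple

FREE_SHIPPING_THRESHOLD_CENTS = 10_000  # assumed: 100.00 in cents
FLAT_SHIPPING_CENTS = 500               # assumed flat fee


def shipping_cents_generated(order_total_cents: int) -> int:
    # Plausible but wrong: ">" excludes an order of exactly 100.00,
    # although the requirement says "or more".
    if order_total_cents > FREE_SHIPPING_THRESHOLD_CENTS:
        return 0
    return FLAT_SHIPPING_CENTS


def shipping_cents_fixed(order_total_cents: int) -> int:
    # Matches the requirement: the threshold itself ships free.
    if order_total_cents >= FREE_SHIPPING_THRESHOLD_CENTS:
        return 0
    return FLAT_SHIPPING_CENTS


# The requirement as examples: (order total, expected shipping), in cents.
REQUIREMENT_CASES: List[Tuple[int, int]] = [
    (9_999, 500),
    (10_000, 0),
    (10_001, 0),
]


def failing_cases(
    shipping_fn: Callable[[int], int],
) -> List[Tuple[int, int, int]]:
    """Return (total, expected, actual) for every case the function gets wrong."""
    return [
        (total, expected, shipping_fn(total))
        for total, expected in REQUIREMENT_CASES
        if shipping_fn(total) != expected
    ]


print(failing_cases(shipping_cents_generated))  # [(10000, 0, 500)]
print(failing_cases(shipping_cents_fixed))      # []
```

Both functions have the same signature, so the types cannot tell them apart. The boundary case in the table can.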

What developers have proven

For decades, developers in dynamically typed languages like Ruby have built massive, reliable systems with comprehensive tests and no static types. These systems handle millions of users and billions of dollars in transactions.

The pattern is consistent: Strong test coverage catches the errors that matter in production. Tests verify the actual behavior users depend on.

When working with LLM-generated code, this pattern holds even more strongly. The generated code might satisfy type constraints easily, but only tests tell you if it solves your actual problem.

Both matter, differently

I am not arguing against types. In statically typed languages, types catch whole classes of errors automatically. They make code easier to refactor and maintain. They provide valuable documentation about structure.

But when evaluating LLM-generated code, tests provide the confidence you need most. They verify that the system does what you intended, not just that it satisfies structural requirements.

The code might be perfectly typed and completely wrong. Tests catch that. The code might lack type annotations but work exactly as needed. Tests verify that too.

Both tests and types improve code quality, but they do it through different mechanisms. Understanding this difference helps you work more effectively with LLMs to generate reliable code.
