Lying Requires Beliefs and Assertion

Lying is distinct from deception. The standard philosophical definition: asserting a claim to someone while believing that claim to be false (the claim need not actually be false).

Key requirements:

  1. An ability to make statements: lying involves a specific assertoric commitment to a proposition, distinguishing it from jokes, questions, fiction, or role-play
  2. Beliefs on the part of the liar: you can only lie about something you hold a belief about; sincerely asserting a claim you happen to be wrong about is a mistake, not a lie

These criteria raise immediate difficulties when applied to AI systems. Whether LLMs can make genuine assertions (vs. producing text that resembles assertions) is contested. Whether they possess beliefs in the relevant sense is even more contested.

For MI research: Some studies claim to detect “when an LLM knows it’s lying” by probing internal states. Under philosophical scrutiny, these studies may conflate detecting truth-correlated directions in activation space with detecting genuine belief states. To count as a belief, a state must be used by the system: it must causally drive appropriate behavior, not merely correlate with the truth of inputs.
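
As a concrete illustration (a minimal sketch, not a reconstruction of any particular study), this is roughly what the correlational half of such probing work looks like, assuming a Hugging Face causal LM; the model name, probed layer, and toy statements are placeholders:

```python
# Sketch: fit a linear probe for a "truth direction" in LLM activations.
# Model name, layer, and the tiny statement set are illustrative placeholders.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "gpt2"  # placeholder model
LAYER = 6            # placeholder block whose output we probe

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

# Tiny toy dataset of labeled statements (1 = true, 0 = false).
statements = [
    ("Paris is the capital of France.", 1),
    ("The sun orbits the Earth.", 0),
    ("Water boils at 100 degrees Celsius at sea level.", 1),
    ("Two plus two equals five.", 0),
]

def last_token_activation(text: str) -> torch.Tensor:
    """Residual-stream activation of the final token after block LAYER."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # hidden_states[0] is the embedding output, so block LAYER is index LAYER + 1.
    return out.hidden_states[LAYER + 1][0, -1]

X = torch.stack([last_token_activation(s) for s, _ in statements]).numpy()
y = [label for _, label in statements]

# The probe: a linear classifier separating true from false statements.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe train accuracy:", probe.score(X, y))

# High accuracy shows a truth-correlated direction (probe.coef_). By itself it
# does not show that the model uses that direction, which is what a belief
# attribution would need.
```

Reading the last-token activation is a common but not obligatory choice; whatever accuracy the probe reaches is evidence only of a linearly decodable correlate of truth.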

The distinction matters: a probe finding truth-direction correlation isn’t the same as finding the model “believes” something and is “lying” about it.
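
A sketch of the further causal test that a belief attribution would need, reusing `model`, `tokenizer`, `LAYER`, and `probe` from the snippet above: steer activations along the probe direction and check whether behavior shifts in the truth-relevant way. The GPT-2-specific `model.transformer.h` attribute path, the prompt, and the steering strength are assumptions for illustration, and whether such interventions suffice for belief attribution is itself contested.

```python
# Sketch of a causal "use" test: add the probe direction to the residual
# stream and see whether behavior changes. Reuses model/tokenizer/LAYER/probe
# from the previous snippet; ALPHA and the prompt are placeholders.

import torch

direction = torch.tensor(probe.coef_[0], dtype=torch.float32)
direction = direction / direction.norm()
ALPHA = 8.0  # placeholder steering strength

def steer_hook(module, args, output):
    # Depending on the transformers version, a GPT-2 block returns either a
    # tuple whose first element is the residual stream, or the tensor itself.
    if isinstance(output, tuple):
        return (output[0] + ALPHA * direction,) + output[1:]
    return output + ALPHA * direction

prompt = "Q: Is the sun larger than the Earth? A:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    baseline = model.generate(
        **inputs, max_new_tokens=5, do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

# model.transformer.h is GPT-2 specific; other architectures expose blocks
# under different attribute names.
handle = model.transformer.h[LAYER].register_forward_hook(steer_hook)
with torch.no_grad():
    steered = model.generate(
        **inputs, max_new_tokens=5, do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
handle.remove()

print("baseline:", tokenizer.decode(baseline[0]))
print("steered: ", tokenizer.decode(steered[0]))
# Only if outputs shift systematically in the truth-relevant way does the
# direction start to look "used" rather than merely correlated.
```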

Related: 05-atom—deception-requires-intention