General-Purpose AI Model Governance

The Concept

General-purpose AI (GPAI) models are AI models that display “significant generality” and can competently perform a wide range of distinct tasks, regardless of how they’re placed on the market. The EU AI Act creates a separate regulatory track for these models, recognizing they don’t fit cleanly into application-specific risk categories.

Why It Matters

Traditional product safety regulation assumes you can assess risk based on intended use. A medical device is evaluated for medical contexts. A toy is evaluated for child safety contexts.

GPAI models break this assumption. A language model can be deployed for customer service, medical advice, legal research, or creative writing. The same model, the same weights, but radically different risk profiles depending on application.

The Act’s response: regulate the model layer separately from the application layer, with obligations that address model-level concerns.

How It Works

All GPAI Model Providers Must:

  • Maintain technical documentation including training methodology
  • Provide information to downstream providers integrating the model
  • Establish policies to comply with copyright law
  • Publish a sufficiently detailed summary of training content

GPAI Models with Systemic Risk (Additional Obligations):

  • Conduct model evaluations including adversarial testing
  • Assess and mitigate systemic risks
  • Track and report serious incidents
  • Ensure adequate cybersecurity protections

Systemic Risk Classification: A model is presumed to pose systemic risk if the cumulative compute used for its training exceeds 10²⁵ floating-point operations (FLOPs). The Commission can also designate models based on capability assessments.
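
To make the threshold concrete, here is a minimal Python sketch. It is illustrative only: the ~6·N·D compute estimate is a common rule of thumb for dense transformers, not the Act's prescribed methodology, and the model figures are hypothetical.

```python
# Illustrative sketch, not the Act's methodology: estimate training compute
# with the common ~6 * params * tokens rule of thumb for dense transformers,
# then compare against the 1e25 FLOP presumption threshold (Art. 51).

SYSTEMIC_RISK_THRESHOLD_FLOPS = 1e25  # presumption threshold in the Act

def estimate_training_flops(n_params: float, n_tokens: float) -> float:
    """Rough estimate: ~6 floating-point operations per parameter per token."""
    return 6.0 * n_params * n_tokens

def presumed_systemic_risk(training_flops: float) -> bool:
    """Presumption applies when cumulative training compute exceeds 1e25 FLOPs."""
    return training_flops > SYSTEMIC_RISK_THRESHOLD_FLOPS

# Hypothetical model: 400B parameters trained on 15T tokens.
flops = estimate_training_flops(4e11, 1.5e13)  # ~3.6e25 FLOPs
print(f"{flops:.1e} FLOPs -> systemic risk presumed: {presumed_systemic_risk(flops)}")
```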

Implications

This creates a tiered system within the GPAI category:

  • Tier 1: All GPAI models → transparency and documentation
  • Tier 2: GPAI with systemic risk → additional safety obligations (cumulative with Tier 1; see the sketch below)
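
Read as data, the tiers are additive rather than exclusive. A minimal sketch, paraphrasing the obligations from Arts. 53 and 55 (the names are ours, not the legal text):

```python
# Illustrative data model of the two-tier obligation structure.
# Obligation names paraphrase Arts. 53 and 55, not the legal text.

TIER_1_OBLIGATIONS = [
    "maintain technical documentation incl. training methodology",
    "provide information to downstream providers",
    "establish a copyright-compliance policy",
    "publish a summary of training content",
]

TIER_2_OBLIGATIONS = [
    "conduct model evaluations incl. adversarial testing",
    "assess and mitigate systemic risks",
    "track and report serious incidents",
    "ensure adequate cybersecurity",
]

def applicable_obligations(has_systemic_risk: bool) -> list[str]:
    """Tier 2 stacks on Tier 1: systemic-risk models carry both sets."""
    return TIER_1_OBLIGATIONS + (TIER_2_OBLIGATIONS if has_systemic_risk else [])

print(len(applicable_obligations(True)))   # 8: all obligations apply
print(len(applicable_obligations(False)))  # 4: transparency tier only
```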

The compute threshold is notable: it's a proxy for capability that doesn't require evaluating the model's actual behavior. Crude, but operationally practical.

Tensions and Limitations

The framework struggles with:

  • Open-weight models: Obligations attach to providers, but openly released weights pass through many hands, blurring who counts as the provider
  • Capability emergence: Risks may emerge post-training in ways compute thresholds don’t capture
  • Fine-tuning effects: Downstream modifications can introduce risks the original provider didn’t anticipate

The Codes of Practice (Art. 56) are intended to fill implementation gaps, but detailed standards are still developing.

Related: 05-molecule—risk-based-ai-classification, 07-molecule—value-chain-accountability-ai