Researcher Degrees of Freedom in LLM Choice

Researcher degrees of freedom are the many decision points in LLM-based research that affect results but often go unreported. Seemingly small methodological choices, such as the model checkpoint or the wording of a prompt, can materially change findings.

Decision Points

Model Selection:

  • Which model family?
  • Which model size?
  • Which checkpoint/version?
  • API vs. local deployment?

Prompting:

  • System prompt content
  • Few-shot examples (which? how many?)
  • Output format specification
  • Temperature and sampling parameters

Evaluation:

  • Which metrics?
  • Human vs. automated evaluation
  • Sample size and selection
  • Statistical tests applied

The Problem

Researchers may, consciously or not, select the combination of choices that produces the desired result. A finding obtained this way may not replicate under other, equally reasonable choices.
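
A minimal simulation can make the selection effect concrete. It assumes, purely for illustration, that every configuration has the same true accuracy and that each evaluation run differs only by sampling noise. The option lists, the true accuracy of 0.70, and the evaluation set size of 200 are all hypothetical values, not figures from any study.

```python
import itertools
import random
import statistics

# Hypothetical option lists; none of these values come from a real study.
MODELS = ["model-a", "model-b", "model-c"]
PROMPTS = ["zero-shot", "3-shot", "5-shot"]
TEMPERATURES = [0.0, 0.7, 1.0]
METRICS = ["exact-match", "f1"]

TRUE_ACCURACY = 0.70  # assumed identical for every configuration
N_ITEMS = 200         # assumed evaluation set size


def simulate_run(rng: random.Random) -> float:
    """Observed accuracy on N_ITEMS items when the true accuracy is fixed."""
    correct = sum(rng.random() < TRUE_ACCURACY for _ in range(N_ITEMS))
    return correct / N_ITEMS


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Every combination of the four decision points above.
configs = list(itertools.product(MODELS, PROMPTS, TEMPERATURES, METRICS))
scores = {config: simulate_run(rng) for config in configs}

best_config = max(scores, key=scores.get)
print(f"configurations tried: {len(configs)}")
print(f"mean observed accuracy: {statistics.mean(scores.values()):.3f}")
print(f"best observed accuracy: {scores[best_config]:.3f} from {best_config}")
```

Four decision points with two or three options each already yield 54 configurations. Because the best score is a maximum over many noisy draws, reporting only that configuration will very likely overstate the shared true accuracy, even though no configuration is actually better than another in this setup.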

Mitigation

  • Pre-registration of methodology
  • Robustness checks across model/prompt variations
  • Full reporting of all parameters
  • Sensitivity analysis for key decisions
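
The robustness check and full-reporting items above could be sketched as follows. This is one possible shape, not a prescribed method: `robustness_report` and `fake_evaluate` are hypothetical names, and the scores returned by `fake_evaluate` are made-up numbers standing in for a real evaluation.

```python
import itertools
import json
import statistics
from typing import Callable


def robustness_report(
    evaluate: Callable[[str, str, float], float],
    models: list[str],
    prompts: list[str],
    temperatures: list[float],
) -> dict:
    """Evaluate every combination and report all scores plus their spread."""
    runs = []
    for model, prompt, temperature in itertools.product(models, prompts, temperatures):
        score = evaluate(model, prompt, temperature)
        runs.append({"model": model, "prompt": prompt,
                     "temperature": temperature, "score": score})
    scores = [run["score"] for run in runs]
    return {
        "runs": runs,  # full reporting: every configuration, not only the best
        "min": min(scores),
        "max": max(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores) if len(scores) > 1 else 0.0,
    }


def fake_evaluate(model: str, prompt: str, temperature: float) -> float:
    """Stand-in scorer with made-up numbers; replace with a real evaluation."""
    base = {"model-a": 0.70, "model-b": 0.74}[model]
    bonus = {"zero-shot": 0.00, "3-shot": 0.03}[prompt]
    return round(base + bonus - 0.05 * temperature, 4)


report = robustness_report(
    fake_evaluate,
    models=["model-a", "model-b"],
    prompts=["zero-shot", "3-shot"],
    temperatures=[0.0, 1.0],
)
print(json.dumps(report, indent=2))
```

Publishing the whole `runs` list alongside the min, max, and spread lets readers see how much a headline number depends on any single choice. Fixing the grid before running it is one way to pair this with pre-registration.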

Related: 05-atom—evaluation-metric-limitations, 05-molecule—prepartadigmatic-field-dynamics