Welcome to DevInterp’s documentation!
DevInterp is a Python library for conducting research on developmental interpretability, a novel AI safety research agenda rooted in Singular Learning Theory (SLT). DevInterp provides tools for detecting, locating, and ultimately controlling the development of structure over training.
Read more about developmental interpretability here!
For questions, join the DevInterp discord!
Warning
This library is under active development. The API may change between releases.
Installation
devinterp is distributed through PyPI. Install with uv:
uv add devinterp
Requirements: Python 3.10 or higher.
Quick Start
Compute the Local Learning Coefficient
from devinterp.slt.llc import llc

result = llc(
    model=model,
    dataset=dataset,  # HuggingFace Dataset with "input_ids"
    observables={"train": dataset},
    lr=0.001,
    n_beta=30,
    num_chains=4,
    num_draws=200,
)

print(result["llc_mean"])       # scalar LLC
print(result["llc_per_chain"])  # (num_chains,) per-chain LLC
print(result["loss_trace"])     # (num_chains, num_steps) per-step loss,
                                # num_steps = num_draws * num_steps_bw_draws + num_burnin_steps
Sample with Observables
from devinterp.slt.sampling import sample
from devinterp.slt.sampling import sample

tree = sample(
    model=model,
    dataset=train_data,
    observables={
        "train": train_data,
        "code": (code_data, 5),  # (dataset, batches_per_draw)
    },
    lr=0.001,
    n_beta=30,
    num_chains=4,
    num_draws=200,
)

# tree is an xr.DataTree backed by Zarr with full per-token loss traces
Concepts
Posterior Sampling with SGLD
The core workflow:
Start at a checkpoint \(\hat{w}^*\)
Take SGLD steps (SGD + noise) using one dataset for gradients
Evaluate losses on multiple datasets (observables) at each draw
Store the full per-token loss chains as Zarr datasets
Compute observables (LLC, susceptibilities, BIF) from these chains
The SGLD noise allows exploring low-loss directions while staying near the original checkpoint. This samples from the local posterior distribution around the checkpoint.
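The update can be sketched on a toy quadratic loss. Everything here is illustrative: the names n_beta (inverse-temperature scaling) and gamma (localization strength), and the exact form of the update, are assumptions for the sketch, not necessarily the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained checkpoint: the minimum of a quadratic loss.
w_star = np.array([1.0, -2.0])

def grad_loss(w):
    """Gradient of the toy loss 0.5 * ||w - w_star||^2."""
    return w - w_star

def sgld_step(w, lr, n_beta, gamma, w_anchor):
    """One SGLD update: tempered gradient + localization pull + Gaussian noise."""
    drift = n_beta * grad_loss(w) + gamma * (w - w_anchor)
    noise = rng.normal(0.0, np.sqrt(lr), size=w.shape)
    return w - 0.5 * lr * drift + noise

# Run one chain: the noise lets the iterate explore low-loss directions,
# while the localization term keeps it near the original checkpoint.
w = w_star.copy()
for _ in range(500):
    w = sgld_step(w, lr=1e-3, n_beta=30.0, gamma=10.0, w_anchor=w_star)
```

The localization term is what makes the posterior local: without it, the chain could drift to an entirely different basin of the loss landscape.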
Local Learning Coefficient (LLC)
The LLC measures model complexity by counting “effective parameters” in a region of weight space. A standard estimator is \(\hat{\lambda}(\hat{w}^*) = n\beta\,\big(\mathbb{E}_{w\mid \hat{w}^*}[L_n(w)] - L_n(\hat{w}^*)\big)\): the rescaled gap between the expected loss under the local posterior and the loss at the checkpoint.
Unlike parameter count or Hessian rank, LLC accounts for singularities – regions where multiple parameter configurations produce identical outputs. This makes it suitable for neural networks.
Why LLC matters:
Detect phase transitions during training (sudden capability changes)
Predict generalization via the Free Energy formula
Compare checkpoints across training
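The free energy formula mentioned above is the standard asymptotic expansion from SLT (Watanabe, 2009), stated here for a region of weight space around \(\hat{w}^*\):

```latex
F_n \approx n L_n(\hat{w}^*) + \lambda \log n
```

where \(\lambda\) is the local learning coefficient. Regions with lower \(\lambda\) pay a smaller complexity penalty, so as \(n\) grows the posterior increasingly prefers low-complexity solutions.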
Susceptibilities
Susceptibilities measure how a model component responds to distribution shifts. For example, how does an attention head’s behavior change when shifting from general text toward code or math?
This is computed by sampling with different weight restrictions (parameter subsets) and measuring the covariance between sampling loss and observable loss.
See Structural Inference: Interpreting Small Language Models with Susceptibilities (Baker et al., 2025) for details.
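The covariance step can be sketched with synthetic loss traces standing in for real SGLD draws. The overall \(n\beta\) scaling and sign convention the library applies are not shown here; treat this as an illustration of the estimator's shape, not its exact definition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical loss traces from one chain: loss on the sampling dataset
# and loss on an observable dataset (e.g. code), one value per draw.
num_draws = 200
sampling_loss = rng.normal(2.0, 0.1, num_draws)
code_loss = 0.5 * sampling_loss + rng.normal(1.0, 0.05, num_draws)

# Susceptibility-style estimate: covariance between the sampling loss
# and the observable loss over the posterior draws.
susceptibility = np.cov(sampling_loss, code_loss)[0, 1]
```

A positive covariance means draws that raise the sampling loss also raise the observable loss, i.e. the restricted weights matter for that observable distribution.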
Bayesian Influence Functions (BIF)
BIF computes pairwise correlations between observable loss traces across sequences from SGLD sampling results. This reveals which sequences influence each other’s loss under posterior sampling, providing a measure of functional similarity.
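A minimal sketch of the correlation step, using synthetic per-sequence loss traces in place of real sampling output (the actual traces are per-token and stored in Zarr; this simplification is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sequence loss traces: (num_sequences, num_draws).
# Sequences 0 and 1 share a latent factor; sequence 2 is independent.
num_draws = 500
latent = rng.normal(0.0, 1.0, num_draws)
traces = np.stack([
    latent + rng.normal(0, 0.3, num_draws),
    latent + rng.normal(0, 0.3, num_draws),
    rng.normal(0, 1.0, num_draws),
])

# BIF-style matrix: pairwise correlation of loss traces across draws.
bif = np.corrcoef(traces)
```

Sequences whose losses rise and fall together under posterior sampling get a high entry, indicating the model treats them as functionally similar.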
Architecture
Each analysis has two entry points:
High-level (llc(), bif(), susceptibilities()): runs sampling and post-processing in one call
Low-level (compute_llc(), compute_bif()): takes a pre-computed xr.DataTree from sample(), useful when you want to run sampling once and compute multiple analyses. compute_susceptibilities() takes a dict[str, xr.DataTree] (one tree per weight restriction), since susceptibilities require a separate sampling run for each restriction.
The sampling pipeline stores full per-token losses to Zarr via sample(), and
post-processing functions operate on the resulting xr.DataTree.
Model Requirements
The current API assumes autoregressive language models with fixed-length tokenized sequences:
Model must accept input_ids and return logits (HuggingFace models, TransformerLens HookedTransformer, or any model returning a tensor or object with .logits)
Dataset must be a HuggingFace Dataset with an "input_ids" column of uniform-length sequences
Loss is next-token cross-entropy
For non-standard models, sample_single_chain() in devinterp.slt.sampler accepts a custom evaluate callable.
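For reference, next-token cross-entropy on a single sequence can be sketched in plain numpy. The library itself works on batched tensors and records per-token losses; this is only an illustration of the loss definition, not its implementation.

```python
import numpy as np

def next_token_ce(logits, input_ids):
    """Next-token cross-entropy: position t predicts the token at t + 1.

    logits: (seq_len, vocab) array; input_ids: (seq_len,) int array.
    Returns per-token losses of shape (seq_len - 1,).
    """
    # Shift so logits at position t score the token at position t + 1.
    shifted_logits = logits[:-1]
    targets = input_ids[1:]
    # Numerically stable log-softmax over the vocabulary axis.
    z = shifted_logits - shifted_logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets]
```

With uniform logits over a vocabulary of size V, every per-token loss equals log(V), a useful sanity check when wiring up a custom model.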
Hyperparameter selection
All sampling is sensitive to hyperparameters. See our Sampling Hyperparameter Guide.
Further Reading
You’re Measuring Model Complexity Wrong - Introduction to LLC and phase transitions (2024)
Towards Spectroscopy: Susceptibility Clusters in Language Models (2026)
The Local Learning Coefficient: A Singularity-Aware Complexity Measure (2023)
Algebraic Geometry and Statistical Learning Theory, Watanabe (2009)
Credits & Citations
This package was created by Timaeus. Most of the sampling, LLC, susceptibility, and BIF implementations were developed internally; this package is a port of that joint work.
If this package was useful in your work, please cite it as:
@misc{devinterp2026,
  title = {DevInterp},
  author = {Snell, William and Wind, Johan Sokrates and Snikkers, Billy
            and Fraser, Sandy and Newgas, Adam and Hoogland, Jesse
            and Wang, George and Gordon, Andrew and Zhou, William
            and van Wingerden, Stan},
  year = {2026},
  version = {2.0},
  howpublished = {\url{https://github.com/timaeus-research/devinterp}},
}
Guides
API Reference
- Configs, Observables, Postprocessing
- Submodules
- devinterp.slt.bif module
- devinterp.slt.config module
- devinterp.slt.covariance module
- devinterp.slt.llc module
- devinterp.slt.lm_loss module
- devinterp.slt.observables module
- devinterp.slt.sampler module
- devinterp.slt.sampling module
- devinterp.slt.susceptibilities module
- devinterp.slt.weight_restrictions module
- devinterp.slt.writing module
- devinterp.slt.zarr_schema module
- Module contents
- Sampling Methods
- Utilities