Probing regulatory logic

Published

April 21, 2026

SAE features are annotated with biological programs, and steering them moves gene expression in sensible directions. But does AIDO.Cell encode the causal direction of gene regulation — does perturbing a transcription factor propagate to its targets more strongly than the reverse?

If yes, we have a real regulatory model inside the scFM. If no, it’s encoding co-expression memory (which genes go together) without a notion of who drives whom. That’s a big distinction for any downstream perturbation-prediction story, so we tested it directly.

Test setup

  • TF: TBX21 (master Th1 regulator).
  • Feature: f3092 — PR ≈ 11 on CD8 T cells, top genes include CXCR3, IFNG, CCL5, TBX21, TNF — a coherent Th1 module.
  • Cell type: CD8 T cells (n = 316).
  • Ground truth: CollecTRI signed regulon — 53 literature-curated TBX21 targets.
  • Control targets (in-feature, in-regulon): CXCR3, IFNG, TNF.
  • Random pool (in-feature, not in-regulon): CCL5, KPNA2, OASL.

For each cell, steering strength alpha, and target gene \(T_i\), we measure two effects:

  • Forward effect \(F_i\) = change in \(\text{logit}(T_i)\) when we steer f3092 at the TBX21 token.
  • Reverse effect \(R_i\) = change in \(\text{logit}(\text{TBX21})\) when we steer f3092 at the \(T_i\) token.

If TBX21 causally drives its targets, we expect \(|F_i| \gg |R_i|\).
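The two definitions above can be sketched in code. This is a minimal illustration, not the pipeline used for the experiment: `logits_fn` is a hypothetical callable standing in for "run AIDO.Cell on one cell with f3092 steered by alpha at a given gene's token and return per-gene logits", and `toy_logits` is an invented, perfectly symmetric stand-in so the example runs on its own.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical interface (an assumption, not AIDO.Cell's API):
# logits_fn(steer_at, alpha) returns one logit per gene for a single cell,
# with the feature steered by `alpha` at the token of gene `steer_at`.
# steer_at=None means "no steering" (baseline).
LogitsFn = Callable[[Optional[str], float], Dict[str, float]]


def directional_effects(
    logits_fn: LogitsFn, tf: str, targets: Sequence[str], alpha: float
) -> Dict[str, Tuple[float, float]]:
    """Return {target: (forward, reverse)} for one cell and one alpha."""
    baseline = logits_fn(None, 0.0)
    steered_at_tf = logits_fn(tf, alpha)
    effects = {}
    for target in targets:
        # Forward: steer at the TF token, read out the target's logit.
        forward = steered_at_tf[target] - baseline[target]
        # Reverse: steer at the target token, read out the TF's logit.
        steered_at_target = logits_fn(target, alpha)
        reverse = steered_at_target[tf] - baseline[tf]
        effects[target] = (forward, reverse)
    return effects


def toy_logits(steer_at: Optional[str], alpha: float) -> Dict[str, float]:
    """Invented symmetric toy model: steering at any gene shifts every
    other gene's logit by 0.1 * alpha. Used only to exercise the code."""
    genes = ["TBX21", "CXCR3", "IFNG", "TNF"]
    if steer_at is None:
        return {g: 0.0 for g in genes}
    return {g: 0.0 if g == steer_at else 0.1 * alpha for g in genes}


effects = directional_effects(
    toy_logits, tf="TBX21", targets=["CXCR3", "IFNG", "TNF"], alpha=5.0
)
```

In the toy model the forward and reverse effects are equal by construction, which is the pattern a pure co-expression representation would produce. A directed TF-to-target representation would instead give a forward effect much larger in magnitude than the reverse one.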

Result: no directional asymmetry

Across all three targets and all tested alphas (±2, ±5, ±15), paired permutation tests on \(|F_i| - |R_i|\) fail to detect a significant difference almost everywhere. The forward and reverse effects are roughly symmetric: perturbing TBX21 moves its targets about the same amount as perturbing a target moves TBX21.
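For readers unfamiliar with the test, here is a minimal sketch of a paired permutation test on \(|F_i| - |R_i|\), implemented as a sign-flip test over per-cell differences. The choices here are assumptions for illustration: the one-sided alternative (forward larger than reverse), the mean as test statistic, and the number of permutations are not taken from the full report, which documents the actual statistics.

```python
import random
from typing import Sequence, Tuple


def paired_sign_flip_test(
    forward: Sequence[float],
    reverse: Sequence[float],
    n_perm: int = 10_000,
    seed: int = 0,
) -> Tuple[float, float]:
    """One-sided paired permutation test for mean(|F| - |R|) > 0.

    Under the null of no asymmetry, swapping the forward and reverse
    labels within a cell is equally likely, which flips the sign of
    that cell's difference.
    """
    if len(forward) != len(reverse) or len(forward) == 0:
        raise ValueError("forward and reverse must be non-empty and paired")
    diffs = [abs(f) - abs(r) for f, r in zip(forward, reverse)]
    n = len(diffs)
    observed = sum(diffs) / n

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        # Randomly flip the sign of each paired difference.
        null_mean = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if null_mean >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

A call such as `paired_sign_flip_test(forward_effects, reverse_effects)` would be run once per target and alpha, with one paired value per cell. A large p-value means the data are compatible with symmetric forward and reverse effects.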

Interpretation

Under this probe, AIDO.Cell’s representation of the TBX21-Th1 module looks like a bidirectional co-expression cluster, not a directed TF-to-target graph. Two caveats before drawing grand conclusions:

  1. This is one TF with three targets in one feature. The negative result applies to this system, not necessarily to all of regulatory biology.
  2. The steering is at the feature level, not a direct perturbation of the TF’s token. A more direct test would intervene on TBX21’s own gene embedding and measure propagation.

That said: if the scFM did encode directionality, we’d expect to see a signal on the Th1 module with CollecTRI ground truth. We don’t.

Why this matters

It suggests that “steering a feature” works as a transcriptional-state operator, not as a causal intervention on a regulatory network. For interpretable perturbation prediction this is good news (state-level steering is tractable and interpretable) and bad news (we shouldn’t claim the scFM has learned causal regulation from expression data alone). The field has long been skeptical that scFMs learn causal regulation; this is a small, direct data point.

Further reading

  • Full experimental design, statistics, and per-target figures: reports/regulatory_logic_experiment.md.