The Hierarchical Model

The MAP (Meta-Analytic Predictive) prior framework treats each historical study as an exchangeable draw from a common distribution. The model is:

yi | θi ~ N(θi, σi² / ni)
θi | μ, τ ~ N(μ, τ²)

Here, yi is the observed mean in study i, θi is the true study-specific mean, μ is the grand mean, and τ is the between-study standard deviation (heterogeneity). The key insight is that τ controls how much information flows from historical studies to the new study: when τ is small, the historical studies are mutually consistent and borrowing is strong; when τ is large, the studies disagree and little information transfers.

This automatic adjustment is the core advantage over ad-hoc methods. The model borrows exactly as much as the data warrant.
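The shrinkage behavior can be sketched numerically. The helper below is illustrative (its name and arguments are assumptions, not part of the tool); it computes the fraction by which a study-specific estimate is pulled toward the grand mean under the normal-normal model:

```javascript
// Shrinkage factor for study i under the normal-normal model:
// the posterior weight placed on the grand mean mu.
// sigma2: within-study variance, n: study size, tau2: between-study variance.
function borrowWeight(sigma2, n, tau2) {
  const v = sigma2 / n;       // variance of the observed study mean y_i
  return v / (v + tau2);      // 1 => full pooling, 0 => no borrowing
}
```

With tau2 = 0 the factor is 1 (complete pooling); as tau2 grows the factor falls toward 0 and each study effectively stands on its own.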

Heterogeneity Estimation

This tool uses the DerSimonian-Laird (DL) estimator for τ², the standard method-of-moments estimator from meta-analysis. Given k studies with inverse-variance weights wi = ni / si², where si² is the sample variance in study i (so wi is the reciprocal of the estimated variance si²/ni of yi):

Q = Σ wi(yi − ŷ)²   τ²DL = max(0, (Q − (k−1)) / C)

where C = Σwi − (Σwi²)/(Σwi) and ŷ is the weighted mean. The I² statistic expresses heterogeneity as a percentage: I² = max(0, (Q − (k−1))/Q) × 100%.

Limitation: DL can underestimate τ when k is small (fewer than about 10 studies). For regulatory submissions, consider full Bayesian estimation, which places a proper prior on τ and yields credible intervals for it.
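A minimal sketch of the DL computation, written in JavaScript to match the tool's implementation language (the function name and argument layout are illustrative assumptions):

```javascript
// DerSimonian-Laird tau^2, Q, and I^2 from study means and their variances.
// ys: observed study means y_i; vs: Var(y_i) = s_i^2 / n_i.
function dlEstimate(ys, vs) {
  const w = vs.map(v => 1 / v);                                // inverse-variance weights
  const sumW = w.reduce((a, b) => a + b, 0);
  const yHat = ys.reduce((a, y, i) => a + w[i] * y, 0) / sumW; // weighted mean
  const Q = ys.reduce((a, y, i) => a + w[i] * (y - yHat) ** 2, 0);
  const k = ys.length;
  const C = sumW - w.reduce((a, x) => a + x * x, 0) / sumW;
  const tau2 = Math.max(0, (Q - (k - 1)) / C);
  const I2 = Q > 0 ? Math.max(0, ((Q - (k - 1)) / Q) * 100) : 0;
  return { tau2, I2, Q };
}
```

Identical study means give Q = 0 and hence τ² = 0 and I² = 0, i.e. no heterogeneity penalty.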

Effective Sample Size (ESS)

The effective sample size quantifies how many concurrent control animals the historical data are equivalent to. Under the normal-normal hierarchical model:

ESS = σ² / (σ² / Nhist + τ²)

This formula captures the discount: as τ increases, the denominator grows and ESS shrinks. When τ = 0, ESS = Nhist (full information). The formula comes from the predictive variance of a new study mean under the hierarchical model (Neuenschwander et al., 2010).
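In code the ESS formula is a one-liner; the sketch below uses assumed argument names:

```javascript
// Effective sample size of nHist historical controls under heterogeneity tau^2.
// sigma2: within-study (sampling) variance; tau2: between-study variance.
function essMap(sigma2, nHist, tau2) {
  return sigma2 / (sigma2 / nHist + tau2);
}
```

For τ = 0 this returns Nhist exactly; for large Nhist it approaches σ²/τ², the ceiling on borrowing no matter how many historical animals exist.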

Robustification

Following Schmidli et al. (2014), the MAP prior is mixed with a vague (uninformative) component:

πrobust = (1 - w) · πMAP + w · πvague

The default weight w = 0.2 means 20% of the prior mass comes from a vague component that is essentially uninformative. This protects against prior-data conflict — the scenario where historical controls differ from the current study more than the heterogeneity model predicts. The practical effect is a further reduction of ESS:

ESSrobust = (1 - w) × ESS
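The robust ESS is then a simple discount (sketch; the default weight of 0.2 follows the text above):

```javascript
// ESS after robustification with vague-component weight w (default 0.2).
function essRobustified(ess, w = 0.2) {
  return (1 - w) * ess;
}
```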

Sample Size Calculation

The ESS is independent of the experimental design. It quantifies the information content of historical controls regardless of whether the new study is a t-test, ANOVA, or Dunnett design. The design only determines the classical ncontrol that the ESS is subtracted from.

Two-group (t-test)

n = ⌈ 2 · ((zα/2 + zβ) · σ / δ)² ⌉
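A sketch of this formula under the normal approximation; the quantiles are passed in (e.g. 1.96 for α = 0.05 two-sided, 0.84 for 80% power) to keep the example free of an inverse-normal dependency:

```javascript
// Per-group n for a two-sample comparison, normal (z) approximation.
// delta: difference to detect; sigma: common SD.
function nTwoGroup(zAlpha2, zBeta, sigma, delta) {
  return Math.ceil(2 * (((zAlpha2 + zBeta) * sigma) / delta) ** 2);
}
```

For α = 0.05, 80% power, and a standardized effect δ/σ = 0.5, this gives the familiar 63 animals per group.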

Multi-group (Dunnett: 1 control vs. k treatments)

For k many-to-one comparisons, the per-comparison α is Bonferroni-corrected to α/k. The optimal control group allocation is ncontrol = ntreatment × √k (Dunnett, 1955).

ntreatment = ⌈ (1 + 1/√k) · ((zα/(2k) + zβ) / d)² ⌉   ncontrol = ⌈ ntreatment × √k ⌉

where d = δ/σ is the standardized effect size. The factor (1 + 1/√k) reflects the unequal allocation; for k = 1 it reduces to the factor 2 in the two-group formula.
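A sketch of both group sizes, assuming the unequal-allocation factor (1 + 1/√k) so that k = 1 reproduces the two-group formula; the Bonferroni-corrected quantile z_{α/(2k)} is passed in rather than computed:

```javascript
// Dunnett-style many-to-one design: k treatments vs. one control,
// Bonferroni-corrected alpha and sqrt(k) control allocation.
// zAlphaK = z_{alpha/(2k)}; d = delta / sigma (standardized effect).
function dunnettSizes(zAlphaK, zBeta, d, k) {
  const nT = Math.ceil((1 + 1 / Math.sqrt(k)) * ((zAlphaK + zBeta) / d) ** 2);
  const nC = Math.ceil(nT * Math.sqrt(k));
  return { nTreatment: nT, nControl: nC };
}
```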

Custom / other designs

For ANOVA, factorial, dose-response, or other designs: compute ncontrol with your preferred tool (e.g. G*Power) and enter it directly in "Custom" mode. The ESS reduction applies to the control group only.

Reduction formula (all designs)

nconcurrent = max(ncontrol,classic - ESSrobust,  nmin)

Treatment group sizes are never reduced; historical data only inform the control condition. The floor nmin (default: 5) ensures that a concurrent control group is always present, so that drift between historical and current conditions and prior-data conflict can still be detected.
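The reduction step can be sketched as follows (the nMin default of 5 follows the text; rounding up to whole animals is an assumption):

```javascript
// Concurrent control group size after subtracting the robust ESS, floored at nMin.
function nConcurrentControl(nControlClassic, essRobust, nMin = 5) {
  return Math.max(Math.ceil(nControlClassic - essRobust), nMin);
}
```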

Why Naive Pooling Fails

"Naive pooling" means treating all historical control animals as if they were concurrent — simply adding them to the current control group. This ignores the between-study variance τ². The consequence:

SEnaive = σ / √Nhist  vs.   SEtrue = √(σ²/Nhist + τ²)

The naive SE is always smaller than the true SE (when τ > 0), leading to falsely narrow confidence intervals and an inflated type I error rate. Pocock (1976) demonstrated this in clinical trials; Sacks et al. (1982) showed that historically controlled studies systematically overestimate treatment effects.

Example: With τ = 5 and 89 historical animals, naive pooling would give SE = 1.59, but the true SE is 5.13 — a 3.2× underestimation. The nominal α = 5% becomes an actual α of ~33%. One in three experiments would show a "significant" result by chance alone.
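The two standard errors are easy to compare directly (a sketch; since σ is not stated in the example above, the test uses round numbers rather than reproducing it):

```javascript
// Naive vs. correct SE for a control mean based on nHist historical animals.
function seNaive(sigma, nHist) {
  return sigma / Math.sqrt(nHist);
}
function seTrue(sigma, nHist, tau) {
  return Math.sqrt(sigma ** 2 / nHist + tau ** 2);
}
```

The naive SE shrinks toward zero as historical animals accumulate; the true SE is bounded below by τ.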

Limitations & Assumptions

The framework assumes historical studies are exchangeable draws from a common normal distribution; systematic changes in protocol or conditions violate this. The normal-normal model treats the within-study variances as known. The DL estimator can underestimate τ when there are few studies, and the ESS formula is an analytical approximation rather than an exact Bayesian quantity. The robustification weight w is a user choice, not estimated from the data.

Software & Validation

This tool implements the analytical ESS approximation in JavaScript for browser-based computation. For regulatory submissions or publications, we recommend validating results against a full Bayesian implementation performed by a statistician.

References

DerSimonian R, Laird N (1986). Meta-analysis in clinical trials. Controlled Clinical Trials 7(3): 177–188.
Dunnett CW (1955). A multiple comparison procedure for comparing several treatments with a control. Journal of the American Statistical Association 50: 1096–1121.
Neuenschwander B, Capkun-Niggli G, Branson M, Spiegelhalter DJ (2010). Summarizing historical information on controls in clinical trials. Clinical Trials 7(1): 5–18.
Pocock SJ (1976). The combination of randomized and historical controls in clinical trials. Journal of Chronic Diseases 29(3): 175–188.
Sacks H, Chalmers TC, Smith H (1982). Randomized versus historical controls for clinical trials. American Journal of Medicine 72(2): 233–240.
Schmidli H, Gsteiger S, Roychoudhury S, O'Hagan A, Spiegelhalter D, Neuenschwander B (2014). Robust meta-analytic-predictive priors in clinical trials with historical control information. Biometrics 70(4): 1023–1032.