name: bayesian-reanalysis
description: Bayesian Monte Carlo re-analysis using literature-derived priors, posterior sampling, HDI, and ROPE analysis to assess probability of meaningful treatment effect.

Bayesian Monte Carlo Re-Analysis

When to Use

When prior evidence exists and you want to answer: "Given what we already know, what's the probability this treatment has a meaningful effect?"

This is fundamentally different from frequentist analysis. Frequentist methods ask "how surprising is this data if the null is true?" Bayesian analysis asks "given the data AND prior evidence, what do we believe about the treatment effect?"

Use this module when:

U1 (Underpowered Trial): A "negative" trial exists but prior evidence suggests the effect might be real. The Bayesian posterior can shift the balance when the frequentist CI is wide.
T2 (Interventional Efficacy): Multiple prior studies exist with varying results. Bayesian synthesis provides a coherent probability statement.
T4 (Repurposing): Evidence from the original indication provides an informative prior for the new indication.

Do NOT use when:

No prior evidence exists (with an uninformative prior, the posterior essentially reproduces the frequentist estimate)
The prior evidence is from fundamentally different populations/mechanisms
The trial is well-powered and clearly positive or negative (Bayesian analysis won't change the conclusion)

Prerequisites

Before running this module, these sandbox files should exist:

./parsed_hypothesis.json — PICO and trial data
./prior_evidence_report.json — must contain extracted_effect_sizes with at least one entry of quality "moderate" or "high"
./power_analysis_results.json — frequentist analysis results (for MCID and CI)

If prior_evidence_report.json has no extracted_effect_sizes or all are quality "low", skip this module and note in output that Bayesian analysis was not performed due to insufficient prior evidence.
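The skip rule above can be checked before running the template. The following is a minimal sketch, not part of the template: the helper names `has_usable_prior` and `check_prerequisites` are hypothetical, and it assumes each entry in `extracted_effect_sizes` is an object with a `quality` field.

```python
import json
from pathlib import Path
from typing import Tuple

USABLE_QUALITIES = {"moderate", "high"}
REQUIRED_FILES = [
    "parsed_hypothesis.json",
    "prior_evidence_report.json",
    "power_analysis_results.json",
]


def has_usable_prior(report: dict) -> bool:
    """True if at least one extracted effect size is rated moderate or high."""
    entries = report.get("extracted_effect_sizes") or []
    return any(entry.get("quality") in USABLE_QUALITIES for entry in entries)


def check_prerequisites(sandbox: Path = Path(".")) -> Tuple[bool, str]:
    """Return (ok, message) describing whether the module should run."""
    missing = [name for name in REQUIRED_FILES if not (sandbox / name).exists()]
    if missing:
        return False, "missing files: " + ", ".join(missing)
    report = json.loads((sandbox / "prior_evidence_report.json").read_text())
    if not has_usable_prior(report):
        return False, "Bayesian analysis not performed: insufficient prior evidence"
    return True, "ok"
```

Calling `check_prerequisites()` from the sandbox directory returns a message that can be copied into the output when the module is skipped.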

Method

This module uses Monte Carlo sampling from the exact analytical posterior (Normal-Normal conjugate). The template:

Reads prior_evidence_report.json and constructs an informative prior via inverse-variance weighted pooling
Computes the exact analytical posterior
Draws 10,000 posterior samples for HDI and ROPE analysis
Runs sensitivity analysis across 3 prior specifications
Generates a 4-panel diagnostic chart
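The conjugate update and sampling steps listed above can be sketched as follows. This is an illustration under stated assumptions rather than the template's code: the function name `normal_normal_posterior` is hypothetical, the prior mean and SD are made-up values, and the likelihood uses the example trial (HR 0.82, SE of about 0.135 on the log scale).

```python
import numpy as np


def normal_normal_posterior(prior_mean, prior_sd, obs, obs_se):
    """Exact posterior for a Normal prior and a Normal likelihood.

    Precisions (1 / variance) add, and the posterior mean is the
    precision-weighted average of the prior mean and the observed effect.
    """
    prior_prec = 1.0 / prior_sd**2
    data_prec = 1.0 / obs_se**2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * obs)
    return float(post_mean), float(np.sqrt(post_var))


# Hypothetical prior (log-HR scale) combined with the example trial result.
post_mean, post_sd = normal_normal_posterior(
    prior_mean=np.log(0.85),  # assumed pooled prior estimate
    prior_sd=0.22,            # assumed prior SD
    obs=np.log(0.82),
    obs_se=0.135,
)

# Monte Carlo draws from the analytical posterior, used later for HDI and ROPE.
rng = np.random.default_rng(seed=0)
samples = rng.normal(post_mean, post_sd, size=10_000)
```

Because the posterior is available in closed form, the sampling step adds no new information; it only makes interval and ROPE summaries convenient to compute.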

Instructions

Step 1: Extract Trial Data

From ./parsed_hypothesis.json, extract:

Endpoint type (binary, continuous, survival)
Observed effect (RR, HR, mean difference)
95% CI bounds (to derive SE)

From ./power_analysis_results.json, extract:

MCID (minimum clinically important difference)
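Deriving the SE from the 95% CI bounds might look like the sketch below. It assumes the reported interval is a Wald-type interval that is symmetric on the analysis scale (log scale for HR and RR, raw scale for mean differences); the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_scale_likelihood(effect, ci_lower, ci_upper):
    """Ratio measures (HR, RR): return (log effect, SE on the log scale)."""
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_95)
    return math.log(effect), se


def raw_scale_likelihood(effect, ci_lower, ci_upper):
    """Mean differences: return (effect, SE on the raw scale)."""
    se = (ci_upper - ci_lower) / (2 * Z_95)
    return effect, se


# Example trial from the CONFIG section: HR 0.82 (95% CI 0.63 to 1.07).
log_hr, se = log_scale_likelihood(0.82, 0.63, 1.07)
print(f"log(HR) = {log_hr:.3f}, SE = {se:.3f}")  # SE is about 0.135
```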

Step 2: Customize and Run the Template

Copy templates/bayesian_mc_reanalysis.py to ./bayesian_mc_reanalysis.py.

Edit the CONFIG section only:

ENDPOINT_TYPE = "survival"  # "binary" | "continuous" | "survival"
OBSERVED_EFFECT = 0.82      # HR, RR, or mean difference (natural scale)
CI_LOWER = 0.63             # Lower bound of the 95% CI
CI_UPPER = 1.07             # Upper bound of the 95% CI
MCID = 0.80                 # Minimum clinically important difference
ROPE_BOUNDS = None          # Auto-compute, or set [lower, upper] on log scale
N_SAMPLES = 10_000          # Number of posterior draws

Then run:

python3 ./bayesian_mc_reanalysis.py

The template automatically:

Reads prior_evidence_report.json and constructs the prior
Pools effect sizes using inverse-variance weighting
Maps study quality/type to prior SD
Generates skeptical and enthusiastic priors for sensitivity
Computes HDI (Highest Density Interval)
Runs ROPE (Region of Practical Equivalence) analysis
Produces 4-panel chart and JSON output

Step 3: Interpret Results

The template writes ./bayesian_reanalysis.json and ./bayesian_reanalysis.png.

Key outputs to report:

P(meaningful effect): Core probability statement
- > 0.80: Strong posterior support
- 0.50–0.80: Moderate support
- 0.20–0.50: Weak support
- < 0.20: Very weak
HDI (Highest Density Interval): The narrowest interval containing 95% of the posterior mass. More informative than an equal-tailed credible interval for skewed posteriors. Report on the natural scale (HR, RR, etc.).
ROPE analysis: How much of the posterior falls within the range of "practically null" effects?
- P(ROPE) > 0.95: Accept practical equivalence; no meaningful effect
- P(ROPE) < 0.05: Reject equivalence; the effect is unlikely to be practically null
- Otherwise: Undecided; data insufficient
Sensitivity: Is the verdict robust across skeptical / evidence-based / enthusiastic priors?
- Robust = same category under all 3 → strong conclusion
- Sensitive = flips between priors → data insufficient to overcome prior uncertainty
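The three sample-based summaries above can be reproduced from posterior draws roughly as follows. This is a sketch, not the template's implementation: the function names are hypothetical, the draws are simulated from made-up posterior parameters, and `prob_meaningful` assumes a ratio measure where lower values mean benefit (as with MCID = 0.80 for an HR).

```python
import numpy as np


def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the sorted samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))            # points the interval must contain
    widths = x[k - 1:] - x[: n - k + 1]   # width of every window of k points
    i = int(np.argmin(widths))
    return float(x[i]), float(x[i + k - 1])


def prob_meaningful(samples, mcid_on_analysis_scale, lower_is_better=True):
    """Share of draws at least as favourable as the MCID (direction is assumed)."""
    s = np.asarray(samples, dtype=float)
    hits = s < mcid_on_analysis_scale if lower_is_better else s > mcid_on_analysis_scale
    return float(np.mean(hits))


def rope_decision(samples, bounds):
    """Apply the 0.95 / 0.05 thresholds to the share of draws inside the ROPE."""
    s = np.asarray(samples, dtype=float)
    lower, upper = bounds
    p_in = float(np.mean((s >= lower) & (s <= upper)))
    if p_in > 0.95:
        return p_in, "accept_equivalence"
    if p_in < 0.05:
        return p_in, "reject_equivalence"
    return p_in, "undecided"


# Hypothetical posterior draws on the log-HR scale.
rng = np.random.default_rng(seed=0)
draws = rng.normal(loc=-0.19, scale=0.115, size=10_000)

lo, hi = hdi(draws)
# Exponentiating the endpoints gives a 95% credible interval on the HR scale;
# it is not necessarily the narrowest interval on that scale.
print("95% interval (HR):", np.exp(lo), np.exp(hi))
print("P(HR < 0.80):", prob_meaningful(draws, np.log(0.80)))
print("ROPE:", rope_decision(draws, (-0.1116, 0.1116)))
```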

Step 4: Write the Verdict

Synthesize into a clear statement:

BAYESIAN SUPPORT: P(meaningful) > 0.80 under evidence-based prior, > 0.50 under skeptical prior. Prior and data agree. ROPE rejected.
BAYESIAN LEAN: P(meaningful) 0.50–0.80 under evidence-based prior. Data shift the posterior but don't overcome skepticism. Consider adaptive design.
BAYESIAN NEUTRAL: P(meaningful) 0.20–0.50. Prior and data are in tension or both weak. ROPE undecided.
BAYESIAN AGAINST: P(meaningful) < 0.20. Even with favorable prior, the data do not support a meaningful effect.
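One way to encode the verdict rules and the robustness check is sketched below. Two points are assumptions, not rules stated above: the boundary values 0.50 and 0.20 are assigned to the higher category, and a result with P(meaningful) > 0.80 under the evidence-based prior but at most 0.50 under the skeptical prior is mapped to BAYESIAN LEAN. The function names and band labels are hypothetical; the verdict strings follow the output schema.

```python
def bayesian_verdict(p_evidence_based, p_skeptical):
    """Map P(meaningful) under two priors to a verdict string."""
    if p_evidence_based > 0.80 and p_skeptical > 0.50:
        return "bayesian_support"
    if p_evidence_based >= 0.50:
        # Assumption: also covers > 0.80 when the skeptical prior stays <= 0.50.
        return "bayesian_lean"
    if p_evidence_based >= 0.20:
        return "bayesian_neutral"
    return "bayesian_against"


def support_band(p):
    """Support category from Step 3 (labels are illustrative)."""
    if p > 0.80:
        return "strong"
    if p >= 0.50:
        return "moderate"
    if p >= 0.20:
        return "weak"
    return "very_weak"


def is_robust(p_skeptical, p_evidence_based, p_enthusiastic):
    """True when all three priors land in the same support category."""
    bands = {support_band(p) for p in (p_skeptical, p_evidence_based, p_enthusiastic)}
    return len(bands) == 1


print(bayesian_verdict(p_evidence_based=0.62, p_skeptical=0.41))  # bayesian_lean
print(is_robust(0.41, 0.62, 0.83))                                # False
```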

Limitations and Caveats

You MUST mention these in any report that includes Bayesian results:

LLM-extracted priors: Effect sizes used to construct the prior were extracted by an LLM from PubMed abstracts, not by a trained systematic reviewer. Prior SDs are inflated by up to ~50% vs. textbook values (see Prior Construction Reference) to partially compensate, but misextraction remains possible.
No heterogeneity modeling: The prior SD is based on study count and concordance, not on formal between-study heterogeneity (I²/tau²). A meta-analysis with high heterogeneity deserves a wider prior than reflected here.
Double-counting risk: If the trial being analyzed was included in a meta-analysis used as prior, the posterior is overconfident. Check that the prior studies are independent of the current trial.
ROPE bounds are heuristic: Default ROPE (± half of |log(MCID)|) is a computational convenience, not a clinically validated equivalence margin.
Conjugate model only: The Normal-Normal model assumes symmetric uncertainty on the log scale. For very rare events or extreme effects, this approximation breaks down.

Bottom line: This Bayesian analysis provides a structured framework for prior-data synthesis, not a definitive probability. The sensitivity analysis is the most important output — if the verdict flips between priors, the data cannot adjudicate.

Prior Construction Reference

The template constructs priors automatically, but understanding the rules is important for interpretation:

| Evidence Source | Prior SD (log scale) | Rationale |
|----------------|---------------------|-----------|
| Meta-analysis, concordant | 0.15 | Strong prior; inflated from 0.10 for LLM extraction |
| 3+ studies, concordant | 0.22 | Strong prior; inflated from 0.15 |
| 2 studies, concordant | 0.28 | Moderate prior; inflated from 0.20 |
| 2 studies, conflicting | 0.35 | Wider; conflicting evidence |
| Single RCT | 0.30 | Moderate; inflated from 0.25 |
| Single observational | 0.40 | Weak; confounding + extraction uncertainty |
| No usable evidence | 0.50 | Uninformative; skip module |

Note: All SDs are intentionally wider than textbook recommendations (up to ~50% inflation, depending on the evidence source) because effect sizes are extracted by an LLM, not a trained reviewer. If the spread of extracted values exceeds the base SD, it is inflated further.

Prior mean: Inverse-variance weighted pooled estimate from moderate/high quality studies, on the analysis scale (log-HR, log-RR, or raw difference).
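Inverse-variance weighting of the prior mean can be illustrated with the sketch below. The function name and the three study values are hypothetical; inputs are assumed to be already on the analysis scale and already filtered to moderate/high quality.

```python
import numpy as np


def inverse_variance_pool(estimates, standard_errors):
    """Fixed-effect pooled estimate: each study is weighted by 1 / SE^2."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(standard_errors, dtype=float)
    weights = 1.0 / se**2
    pooled = np.sum(weights * est) / np.sum(weights)
    pooled_se = np.sqrt(1.0 / np.sum(weights))
    return float(pooled), float(pooled_se)


# Hypothetical prior studies on the log-HR scale.
log_hrs = np.log([0.78, 0.90, 0.85])
ses = [0.12, 0.20, 0.15]

prior_mean, pooled_se = inverse_variance_pool(log_hrs, ses)
print("prior mean (HR scale):", np.exp(prior_mean))
```

Note that only the pooled estimate feeds the prior mean. Per the table above, the prior SD comes from the evidence-source category, not from the pooled SE, which would typically be much narrower.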

ROPE Bounds Reference

ROPE (Region of Practical Equivalence) defines effect sizes too small to matter:

| Measure | Default ROPE | Rationale |
|---------|-------------|-----------|
| HR/RR (log scale) | ± half of \|log(MCID)\| | Effects within this range are clinically negligible |
| Mean difference | ± half of \|MCID\| | Same principle for continuous outcomes |

Set ROPE_BOUNDS explicitly in CONFIG if the default is inappropriate for the clinical context.
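The default bounds in the table can be computed as in this sketch; the function name `default_rope` is hypothetical and the result is only a heuristic starting point.

```python
import math


def default_rope(mcid, log_scale=True):
    """Symmetric ROPE of half the MCID magnitude on the analysis scale."""
    half_width = 0.5 * abs(math.log(mcid)) if log_scale else 0.5 * abs(mcid)
    return (-half_width, half_width)


# MCID of HR = 0.80 from the example CONFIG.
lower, upper = default_rope(0.80)
print("log scale:", (round(lower, 4), round(upper, 4)))
print("HR scale:", (round(math.exp(lower), 3), round(math.exp(upper), 3)))
```

For MCID = 0.80 this gives roughly ±0.112 on the log scale, or an HR between about 0.894 and 1.118.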

Output Schema

The template writes ./bayesian_reanalysis.json:

{
  "module": "bayesian_reanalysis",
  "method": "monte_carlo",
  "n_samples": 10000,
  "endpoint_type": "binary|continuous|survival",
  "prior": {
    "source": "description of evidence used",
    "citations": ["PMID:12345678"],
    "mean": 0.0,
    "sd": 0.0,
    "scale": "log_hr|log_rr|raw_difference",
    "quality": "high|moderate|low|none",
    "n_studies": 0,
    "rationale": "How the prior was constructed"
  },
  "likelihood": {
    "observed": 0.0,
    "se": 0.0,
    "scale": "log(HR)"
  },
  "posterior": {
    "mean": 0.0,
    "sd": 0.0,
    "ci_95": [0.0, 0.0],
    "hdi_95": [0.0, 0.0],
    "natural_scale": {
      "point_estimate": 0.0,
      "hdi_95": [0.0, 0.0]
    }
  },
  "prob_meaningful_effect": 0.0,
  "mcid": 0.0,
  "rope": {
    "bounds": [0.0, 0.0],
    "bounds_natural_scale": [0.0, 0.0],
    "prob_in_rope": 0.0,
    "decision": "reject_equivalence|accept_equivalence|undecided"
  },
  "sensitivity": {
    "skeptical": {
      "prior_mean": 0.0,
      "prior_sd": 0.0,
      "post_mean": 0.0,
      "post_sd": 0.0,
      "hdi_95": [0.0, 0.0],
      "prob_meaningful": 0.0
    },
    "evidence_based": { "..." : "same structure" },
    "enthusiastic": { "..." : "same structure" },
    "robust": true
  },
  "verdict": "bayesian_support|bayesian_lean|bayesian_neutral|bayesian_against",
  "severity": "low|medium|high|critical",
  "title": "One-line summary",
  "analysis": "Detailed paragraph explaining the Bayesian reasoning",
  "charts": ["./bayesian_reanalysis.png"]
}
