SR 11-7 engineering.

Federal Reserve SR 11-7 (and the parallel OCC Bulletin 2011-12) is the supervisory guidance on model risk management (MRM) for US banking organizations. It establishes expectations for model development, implementation, use, validation, and governance — applicable to any model whose output affects business decisions, including ML systems.

Authority: Board of Governors of the Federal Reserve System; OCC Bulletin 2011-12 (parallel guidance)
Effective since: April 2011 (still the supervisory guidance of record)
Posture: Aligned

What it is, what it covers

Engineering posture.

Every ML system in a regulated financial-services workload is a model under SR 11-7. The supervisory guidance predates the modern ML stack, but it applies fully: supervisors expect the same documented development, independent validation, ongoing monitoring, and governance-committee structures whether the model is a logistic regression from 2014 or a transformer-backed credit classifier from 2025.

Prosigns engineers ML systems for SR 11-7 alignment from kickoff. Model development is documented in the form supervisors expect; independent validation is built into the delivery pipeline rather than retrofitted before exam; ongoing monitoring is operational, not aspirational. CITADEL and CORTEX co-pilot every regulated ML engagement.

We do not substitute for your Model Risk Management function or your validation team. We engineer the systems they govern, and we produce the documentation and validation artifacts in the form they expect, when they expect them.

Scope

SR 11-7 applies to US banking organizations and their material models — any model whose output affects business decisions, including credit, fraud, market risk, capital, AML, and increasingly ML/AI systems supporting any of these. Engagements with ML deliverables in regulated financial-services workloads are scoped against the guidance from the first architecture review.

Engineering controls

How we map to SR 11-7.

These are the Prosigns engineering practices that produce SR 11-7-aligned evidence as a side effect of normal delivery. Each control carries a specific reference where applicable.

01
Model development with documented design choices
Every model development cycle is documented per SR 11-7's expected form: business purpose, data sourcing, feature engineering, algorithm selection rationale, training methodology, validation approach, and limitations. The documentation is produced concurrently with the engineering, not assembled before exam.
SR 11-7 §IV (Model development, implementation, and use)
02
Independent validation
Validation is performed by engineers structurally independent from the development team. Validation covers conceptual soundness, ongoing monitoring, and outcomes analysis. Findings carry explicit issue, mitigation, and review-date records; risk-accepted findings are tracked through the model's life cycle.
SR 11-7 §V (Model validation)
03
Ongoing monitoring with operational alerting
Production models monitored for input-distribution drift, output drift, and outcomes-based performance. Alerting configured per the model's risk tier; performance degradation triggers documented review and potential retraining. Monitoring evidence is collected as a side-effect of normal operation.
SR 11-7 §V (Model validation: ongoing monitoring)
04
Model inventory with risk-tiering
Every production model is registered in a central inventory with risk tier, business owner, model owner, validation status, last-review date, and dependencies. The inventory is the audit-time source of truth; supervisors expect to see it on exam.
SR 11-7 §VI (Governance, policies, and controls: model inventory)
05
Governance with documented committee structures
Model risk governance committees with documented charters, membership, escalation paths, and meeting cadence. Material model decisions (initial deployment, significant change, decommissioning) require committee review with retained minutes.
SR 11-7 §VI (Governance, policies, and controls)
06
Change management for model updates
Material model changes (algorithm change, training-data refresh beyond a documented threshold, feature additions) trigger re-validation and committee review. Minor changes (parameter tuning within documented bounds) are change-managed at the engineering level with concurrent documentation update.
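
The material-versus-minor split described above can be sketched as a small routing function. This is an illustration only: the `ModelChange` fields, the `REFRESH_THRESHOLD_PCT` value, and the choice to treat out-of-bounds parameter tuning as material are assumptions, not a documented Prosigns policy.

```python
from dataclasses import dataclass

# Hypothetical threshold: share of training data that may be refreshed
# before the change counts as material. A real value comes from policy.
REFRESH_THRESHOLD_PCT = 20.0


@dataclass(frozen=True)
class ModelChange:
    """Describes a proposed model update; field names are illustrative."""
    algorithm_changed: bool = False
    features_added: int = 0
    training_data_refreshed_pct: float = 0.0
    params_within_documented_bounds: bool = True


def classify_change(change: ModelChange) -> str:
    """Return "material" (re-validation plus committee review) or "minor"."""
    if change.algorithm_changed:
        return "material"
    if change.features_added > 0:
        return "material"
    if change.training_data_refreshed_pct > REFRESH_THRESHOLD_PCT:
        return "material"
    # Assumption: tuning outside the documented bounds is escalated.
    if not change.params_within_documented_bounds:
        return "material"
    return "minor"
```

A parameter tweak inside documented bounds, `classify_change(ModelChange())`, comes back as "minor", while adding a feature or swapping the algorithm routes to "material".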

Honest posture

Prosigns engineers ML systems aligned to SR 11-7 / OCC 2011-12 model risk management. We are not a model validation function; on regulated engagements we coordinate with your MRM team, produce documentation in the form supervisors expect, and engineer the monitoring and governance plumbing that makes ongoing supervision tractable.

Audit pack contents

What ships in the evidence package.

The package is scoped to the engagement's SR 11-7 deliverable. It is available on request under NDA, delivered the same business day for procurement and InfoSec review.

Model documentation per SR 11-7 expected form (purpose, data, methodology, validation, limitations)
Validation reports with conceptual soundness, ongoing monitoring, and outcomes analysis
Model inventory entries with risk tier, owners, validation status, dependencies
Governance committee charters, membership, meeting minutes, escalation paths
Ongoing-monitoring configuration: drift alerting, performance dashboards, retraining triggers
Change-management records for material model changes
Issue / remediation log with explicit owner and review-date records

Where it applies

Industries we deliver SR 11-7-aware engineering for.

Financial Services
US banks, credit unions, broker-dealers with ML systems in regulated workloads.

Services we deliver

Practices that operate under SR 11-7.

AI & Machine Learning
Predictive Analytics
MLOps

Frequently asked

SR 11-7 in practice.

01
Does SR 11-7 apply to LLMs and generative AI?
Yes, when the LLM's output affects a business decision in a regulated workload. The guidance is technology-agnostic: it applies to any model whose output is consumed in regulated decision-making. We engineer LLM-backed systems with the same SR 11-7 documentation, validation, and monitoring expected of traditional models, plus LLM-specific concerns: prompt versioning, eval-harness coverage, hallucination monitoring, and citation tracking.
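
As one possible reading of "prompt versioning", the sketch below derives a stable version identifier from everything that shapes the LLM's behavior, so that any change to the prompt or its settings yields a new version to document and validate. The function name `prompt_version`, the inputs it hashes, and the 12-character truncation are assumptions for illustration, not a described Prosigns mechanism.

```python
import hashlib
import json


def prompt_version(template: str, model_id: str, params: dict) -> str:
    """Return a short, deterministic ID for a prompt configuration.

    The ID changes whenever the template text, the target model, or any
    generation parameter changes.
    """
    # Canonical JSON (sorted keys, fixed separators) keeps the hash stable
    # regardless of the order in which parameters were supplied.
    payload = json.dumps(
        {"template": template, "model_id": model_id, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Hypothetical values, for illustration only.
v1 = prompt_version("Summarize the filing: {text}", "example-llm", {"temperature": 0.0})
v2 = prompt_version("Summarize the filing briefly: {text}", "example-llm", {"temperature": 0.0})
```

Here `v1` and `v2` differ because the template text differs; logging the identifier with each LLM call ties an output back to the exact prompt configuration that produced it.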
02
How do you handle model-validation independence on smaller teams?
Validation is structurally independent of development. On Prosigns engagements, validation is typically performed by a different engineering pod than the one that built the model; on combined engagements with internal teams, we coordinate so that the validation function reports through a different chain than development. Where independence isn't achievable, we document the compensating control (external review, governance-committee oversight) and make the limitation explicit to the supervisor.
03
What does a typical model inventory look like?
Per-model entry: model name, version, business purpose, risk tier, business owner, model owner, validation status (pending / approved / approved-with-conditions), last-validation date, next-validation date, ongoing-monitoring status, dependencies (upstream data, downstream consumers), known limitations. The inventory itself is operational — supervisors pull samples on exam, and the entries should already be the audit-time source of truth.
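
For illustration, a per-model entry with the fields listed above could be represented as a small record. The class name `InventoryEntry`, the field names, the integer risk tier, and the `validation_overdue` helper are hypothetical, not a Prosigns schema.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class ValidationStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    APPROVED_WITH_CONDITIONS = "approved-with-conditions"


@dataclass
class InventoryEntry:
    """One model-inventory record; field names are illustrative."""
    model_name: str
    version: str
    business_purpose: str
    risk_tier: int  # assumption: 1 is the highest-risk tier
    business_owner: str
    model_owner: str
    validation_status: ValidationStatus
    last_validation: Optional[date]
    next_validation: Optional[date]
    monitoring_status: str
    upstream_data: List[str] = field(default_factory=list)
    downstream_consumers: List[str] = field(default_factory=list)
    known_limitations: List[str] = field(default_factory=list)

    def validation_overdue(self, today: date) -> bool:
        """True when the scheduled next-validation date has passed."""
        return self.next_validation is not None and today > self.next_validation
```

Keeping the next-validation date as a typed field lets a scheduled job flag overdue entries rather than leaving them to be discovered at exam time.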
04
How does ongoing monitoring work in practice?
Three monitoring layers run continuously. (1) Input-distribution drift on the feature space against the training-time baseline. (2) Output-distribution drift on the prediction space. (3) Outcomes-based performance on labeled feedback once available — accuracy, calibration, fairness metrics. Alerts fire per the model's risk tier; sustained degradation triggers documented review and potential retraining.
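
To make the first two layers concrete, here is a minimal sketch of one common drift statistic, the population stability index (PSI), computed over equal-width bins of the training-time baseline. The source does not name a drift metric; PSI, the function names, and the per-tier thresholds below are assumptions for illustration only.

```python
import math
from typing import List, Sequence

# Hypothetical per-tier alert thresholds (tier 1 = highest risk, tightest limit).
PSI_ALERT_THRESHOLDS = {1: 0.10, 2: 0.20, 3: 0.25}


def psi(baseline: Sequence[float], current: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index of `current` against `baseline`."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant baseline: avoid division by zero

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range are clamped into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    expected = proportions(baseline)
    actual = proportions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


def should_alert(score: float, risk_tier: int) -> bool:
    """Compare a PSI score with the threshold configured for the risk tier."""
    return score >= PSI_ALERT_THRESHOLDS[risk_tier]
```

The same function can be applied to a single input feature (layer 1) or to the model's prediction scores (layer 2); the third layer needs labeled outcomes and is not covered by this sketch.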
05
What about challenger models?
Challenger models are part of validation discipline. We typically build a simpler challenger (logistic regression, gradient-boosted tree) alongside the production model and compare outcomes-based performance over the validation window. The challenger isn't deployed; it's a benchmark for the production model's incremental complexity. Validation reports surface the comparison.
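
A rough sketch of that comparison follows, using AUC as the outcomes-based metric. The metric choice, the function names, and the toy data are assumptions; a real validation report would likely cover several metrics across the full validation window.

```python
from typing import Dict, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def compare_to_challenger(labels: Sequence[int],
                          production_scores: Sequence[float],
                          challenger_scores: Sequence[float]) -> Dict[str, float]:
    """Summarize how much the production model adds over the simpler challenger."""
    prod = auc(labels, production_scores)
    chal = auc(labels, challenger_scores)
    return {"production_auc": prod, "challenger_auc": chal, "uplift": prod - chal}


# Toy outcomes and scores, invented purely for illustration.
labels = [0, 0, 1, 1]
report = compare_to_challenger(labels, [0.1, 0.2, 0.8, 0.9], [0.1, 0.8, 0.2, 0.9])
# report == {"production_auc": 1.0, "challenger_auc": 0.75, "uplift": 0.25}
```

A small or negative uplift is the signal of interest: it suggests the production model's extra complexity may not be earning its keep.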
06
How do you document training data?
Training-data documentation covers source, lineage, sampling methodology, time window, exclusions, treatment of missing values, and known biases. Sensitive-attribute handling (race, gender, age) is documented per fair-lending considerations where applicable. The documentation is produced concurrently with engineering, lives in version control alongside the model, and is part of the validation evidence.

Related regulators

Often appears alongside.

SOX 404
NYDFS 500
PCI-DSS

Talk to us

SR 11-7-scoped engineering on your roadmap?

CITADEL co-pilots every regulated engagement. A senior engineer and a department lead join the first call. The audit pack is available the same business day.


