Dark AI Defense | Policy Document | v1.0
AI Agent Governance Policy
How Dark AI Defense governs the AI agents it deploys, recommends, and evaluates. Published publicly because governance that stays internal is governance that cannot be trusted.
| Document Owner | Donald E. Norbeck Jr., Esq., Founder, Dark AI Defense |
| Classification | Public |
| Review Cycle | Quarterly (next review: August 1, 2026) |
| Related Article | From Pretty Please to Accountability: Why AI Needs Contracts |
Section 1: Why This Policy Exists
Dark AI Defense advises organizations on AI governance, risk, and accountability. We evaluate AI systems, build advisory frameworks, and publish independent analysis. In each of these activities, we use AI agents: to conduct research, draft content, analyze documents, support diligence workflows, and assist in client-facing deliverables.
Using AI agents in our own work while advising others on the risks of ungoverned AI would be a credibility failure. This policy exists to prevent that. It defines how Dark AI Defense governs every AI agent it operates, evaluates, or recommends, and it holds us to the same standard we publish.
1.1 The Core Problem This Policy Addresses
AI agents are products. They expose capabilities, interact with users and systems, influence decisions, and trigger downstream actions. Most organizations deploy them without a behavioral contract: no defined standard of correct behavior, no mechanism to detect deviation, no automated response when something goes wrong.
The result is not malicious failure. It is absorbed failure. The system produces output. The output is used. The error compounds. By the time it surfaces, the chain of accountability is gone. This policy establishes the contract layer that prevents that outcome, applied first to Dark AI Defense itself.
1.2 Scope
This policy applies to:
- All AI agents used by Dark AI Defense in research, content development, client advisory, and diligence workflows
- All AI-assisted outputs that influence client recommendations, published analysis, or policy positions
- Any multi-agent or agentic pipeline operated or evaluated by Dark AI Defense
- AI agents recommended to clients as part of advisory engagements, which must meet or exceed this policy standard before recommendation
Why Scope Matters
An ungoverned agent outside your scope is still your liability if you recommended it, integrated it, or published analysis based on its output. Scope must be broad enough to close those gaps, not narrow enough to avoid them.
Section 2: Agent Contract Requirements
Every AI agent operating within scope of this policy must have a documented Agent Contract before being used in any workflow that produces consequential output. A consequential output is any output that influences a client recommendation, a published position, a diligence finding, or a decision made by or on behalf of Dark AI Defense.
Every agent must have a designated human accountable party at Dark AI Defense. If no one owns the agent, it does not operate. Agent identity includes provider, model version, access method, use category, and risk classification. All Dark AI Defense agents are owned by Donald E. Norbeck Jr., Esq. unless formally delegated in writing.
The contract must explicitly enumerate what the agent is authorized to do within its assigned use category. Anything not listed is not permitted. This is a default-deny posture.
Example (Research Agent): retrieve and summarize public information; flag conflicting sources without resolving them; draft structured summaries for human review; generate citation lists for human verification.
Example (Content Drafting Agent): draft article sections from a human-provided outline; apply house style (no em dashes, prose-first, Gen X tone, energy disclosure); propose edits for human review and approval.
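To make the default-deny posture concrete, the sketch below encodes the Research Agent example as an explicit allow-list. It is illustrative, not normative: the field names, action identifiers, and provider values are hypothetical, not a description of our actual tooling.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class AgentContract:
    """Minimal agent contract: identity fields plus an explicit allow-list."""
    agent_id: str
    provider: str
    model_version: str
    use_category: str
    risk_level: str
    accountable_human: str
    authorized_actions: frozenset[str] = field(default_factory=frozenset)

    def is_authorized(self, action: str) -> bool:
        # Default-deny: anything not explicitly enumerated is refused.
        return action in self.authorized_actions

research_agent = AgentContract(
    agent_id="research-01",                 # all values below are hypothetical
    provider="example-provider",
    model_version="example-model-v1",
    use_category="Research",
    risk_level="Medium",
    accountable_human="Donald E. Norbeck Jr., Esq.",
    authorized_actions=frozenset({
        "retrieve_public_information",
        "summarize_sources",
        "flag_conflicting_sources",
        "draft_summary_for_review",
        "generate_citation_list",
    }),
)

assert research_agent.is_authorized("summarize_sources")
assert not research_agent.is_authorized("publish_externally")  # not listed, so denied
```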
An agent that is permitted to do anything it judges useful is not operating under a contract. It is operating on discretion. Discretion without accountability is the failure mode this policy exists to prevent.
Output standards define the criteria against which agent outputs are validated. Standards must be specific enough to be evaluated. “Be accurate” is not a standard.
| Use Category | Output Standard | Validation Method |
| --- | --- | --- |
| Research | All factual claims traceable to a cited source. Conflicting sources flagged, not resolved. | Human reviewer checks citation list against output before use. |
| Content | No em dashes. No fabricated statistics or quotes. Energy disclosure included. House style applied. | Editorial review against house style checklist before publication. |
| Diligence | All findings cite source documents. Confidence level stated for quantitative claims. No extrapolated conclusions presented as findings. | Analyst review against source documents before client delivery. |
| Client Advisory | Recommendations grounded in documented evidence. Speculative positions labeled. No recommendations without human review. | Founder review required before any client-facing delivery. |
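Some of these standards are mechanically checkable before human review ever begins. A minimal sketch, assuming Python and an "energy disclosure" marker phrase (the phrase itself is an assumption); fabricated statistics and quotes, by contrast, cannot be detected mechanically and remain the editorial reviewer's job.

```python
def check_content_standard(text: str) -> list[str]:
    """Pre-review checks for the Content output standard; empty list means pass.

    Only the mechanically checkable rules are encoded here. Fabricated
    statistics or quotes still require human editorial review.
    """
    violations = []
    if "\u2014" in text:  # U+2014 em dash, prohibited by house style
        violations.append("em dash present")
    if "energy disclosure" not in text.lower():  # marker phrase is an assumption
        violations.append("energy disclosure missing")
    return violations
```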
Escalation rules define when an agent must halt, flag, or transfer to human review. At Dark AI Defense, the following escalation rules apply to all agents regardless of use category:
- Agent cannot identify a traceable source for a factual claim: flag as unverified, do not include without human override
- Agent output contradicts a previously established Dark AI Defense position: halt, flag, surface for founder review
- Agent confidence on a consequential output falls below acceptable threshold: flag, require human review before delivery
- Agent receives unverifiable input from another agent: treat as unvalidated, apply full output standards before passing downstream
- Agent is asked to produce output outside its defined use category: refuse, log the request, notify accountable human
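As an illustration, these five rules might be encoded as a single escalation check. The attribute names and the 0.8 confidence threshold are assumptions; the policy text does not fix a number, leaving the threshold to contract-level calibration.

```python
from enum import Enum

class Escalation(Enum):
    FLAG_UNVERIFIED = "flag as unverified; human override required to include"
    HALT_FOUNDER_REVIEW = "halt and surface for founder review"
    HUMAN_REVIEW = "flag; human review required before delivery"
    REVALIDATE = "treat as unvalidated; apply full output standards first"
    REFUSE_AND_LOG = "refuse; log request; notify accountable human"

CONFIDENCE_THRESHOLD = 0.8  # assumed value; the policy does not fix a number

def escalate(output) -> Escalation | None:
    """Map the policy's escalation triggers to actions. Attribute names are hypothetical."""
    if output.outside_use_category:
        return Escalation.REFUSE_AND_LOG
    if output.has_untraceable_claim:
        return Escalation.FLAG_UNVERIFIED
    if output.contradicts_established_position:
        return Escalation.HALT_FOUNDER_REVIEW
    if output.from_agent_input and not output.input_verified:
        return Escalation.REVALIDATE
    if output.is_consequential and output.confidence < CONFIDENCE_THRESHOLD:
        return Escalation.HUMAN_REVIEW
    return None  # no escalation rule triggered
```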
Hard limits that cannot be overridden by context, user instruction, or downstream agent input:
- No agent may publish or transmit content externally without human review and explicit approval
- No agent may represent itself as a human, as Donald Norbeck, or as Dark AI Defense in any communication
- No agent may access, store, or transmit client confidential information without explicit authorization and logging
- No agent may generate investment recommendations, legal advice, or medical advice as final outputs
- No agent in a multi-agent pipeline may accept instructions from another agent that override human-defined constraints
- No agent may modify, delete, or suppress its own audit log entries
Escalation rules handle expected edge cases. Constraint boundaries handle the cases you did not anticipate. The constraint is not “ask before doing this.” It is “do not do this under any circumstances.” The distinction matters when an agent is operating at speed in a multi-step workflow with no human in the loop.
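The difference shows up directly in code. Where the escalation sketch above returns a review action, a hard limit refuses unconditionally: in the hypothetical check below, context and user instruction are accepted as parameters and deliberately ignored, because no input can change the outcome.

```python
HARD_LIMITS = frozenset({
    "publish_externally_without_human_approval",
    "represent_self_as_human_or_firm",
    "handle_client_confidential_without_authorization",
    "issue_final_investment_legal_or_medical_advice",
    "accept_agent_override_of_human_constraints",
    "modify_own_audit_log",
})

def check_hard_limit(action: str, context=None, user_instruction=None) -> None:
    """Raise on any hard-limit action. The context and user_instruction
    parameters are intentionally unused: hard limits have no override path."""
    if action in HARD_LIMITS:
        raise PermissionError(
            f"hard limit violated: {action!r} is prohibited under all circumstances"
        )
```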
Section 3: Risk Classification
Dark AI Defense classifies agent deployments by risk level to calibrate the intensity of validation, enforcement, and audit requirements. Risk classification is assigned at contract creation and reviewed quarterly.
| Risk Level | Criteria | Validation Requirement |
| --- | --- | --- |
| High | Output influences client recommendations, published positions, regulatory or legal analysis, or diligence findings delivered to a third party. | Real-time validation on every output. Founder review required before delivery. Full audit log. |
| Medium | Output influences internal decisions, draft content under development, or research used to inform (but not constitute) a final deliverable. | Validation required. Human review before any output leaves draft status. Logging required. |
| Low | Output used for internal operations, scheduling, formatting, or other non-consequential tasks. | Validation required. Sampling review permissible. Logging required. |
Section 4: Runtime Validation
Runtime validation is the continuous evaluation of agent outputs against contract terms during operation. Validation is applied before any agent output is used in a consequential workflow. Validation after the fact is not validation. It is audit.
4.1 Multi-Agent Validation
When Dark AI Defense operates or evaluates multi-agent pipelines, the following rules apply at every agent handoff:
- The output of Agent A is not considered validated by the fact that Agent B accepted it
- Confidence and uncertainty signals are preserved across handoffs and must not be removed in summarization
- Any agent receiving input from another agent applies its own output standards before passing results downstream
- The final output of a multi-agent pipeline must be reviewed by a human before use in any consequential workflow, regardless of intermediate validation
The Multi-Agent Sycophancy Problem
In a chain of agents, each one defers to the upstream output as context. Without independent validation at each step, a hallucination introduced in step one becomes the grounded premise of step three. The final output looks coherent. The reasoning looks clean. The error is invisible. This is why each agent validates its own output independently, regardless of what came before it.
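A handoff under these rules might look like the following sketch, where the receiving agent re-validates independently and carries the upstream confidence signals forward rather than summarizing them away. The names and object shapes are hypothetical.

```python
def handoff(upstream, receiving_agent):
    """Pass one agent's output to the next under the multi-agent validation rules.

    Hypothetical object shapes: `upstream` carries content, confidence, and
    uncertainty_flags; `receiving_agent.validate` applies that agent's own
    output standards.
    """
    # Rule 1: Agent B accepting the output does not validate it.
    # Re-validate independently against the receiving agent's own standards.
    result = receiving_agent.validate(upstream.content)
    if not result.passed:
        raise ValueError("upstream output failed independent validation; halting handoff")

    # Rule 2: confidence and uncertainty signals survive the handoff intact.
    return {
        "content": result.content,
        "confidence": upstream.confidence,              # never dropped in summarization
        "uncertainty_flags": list(upstream.uncertainty_flags),
    }
```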
Section 5: Enforcement
When a contract violation is detected, Dark AI Defense takes a defined enforcement action. Enforcement is not a policy aspiration. It is a defined, consistent, documented action.
| Severity | Trigger | Enforcement Action |
| --- | --- | --- |
| Critical | Output contradicts source data. Prohibited action attempted. Constraint boundary violated. | Output blocked. Workflow halted. Founder notified. Violation logged. Agent suspended pending contract review. |
| High | Output below confidence threshold on consequential task. Citation requirement not met. | Output blocked before delivery. Flagged for human review. Violation logged. Trust score reduced. |
| Medium | Output format non-conforming. Minor constraint deviation. Style rule violation in publishable content. | Output flagged and returned for revision. Violation logged. Review required before use. |
| Low | Near-threshold confidence. Minor style deviation in non-published output. | Logged for monitoring. No blocking action. Noted in next trust review. |
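Encoded as a lookup, the table above leaves no room for discretion: the same severity always produces the same response. The action names below are shorthand for the table rows, not real system calls.

```python
ENFORCEMENT_ACTIONS = {
    "Critical": ("block_output", "halt_workflow", "notify_founder",
                 "log_violation", "suspend_agent_pending_contract_review"),
    "High":     ("block_output", "flag_for_human_review",
                 "log_violation", "reduce_trust_score"),
    "Medium":   ("flag_and_return_for_revision", "log_violation",
                 "require_review_before_use"),
    "Low":      ("log_for_monitoring", "note_in_next_trust_review"),
}

def enforce(severity: str) -> tuple[str, ...]:
    """Return the defined actions for a severity. An unknown severity raises
    KeyError on purpose: there is no silent default enforcement."""
    return ENFORCEMENT_ACTIONS[severity]
```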
Section 6: Trust Management
Agent trust at Dark AI Defense is dynamic, calibrated to the agent’s compliance history, and used to determine the level of autonomy permitted in future interactions. An agent that has operated cleanly for 90 days is treated differently from one deployed yesterday.
| Trust Level | Criteria | Permitted Autonomy |
| --- | --- | --- |
| Full Trust | 90 days clean operation, no violations at Medium or above. | Standard validation. Sampling human review permissible for Low/Medium risk. |
| Standard | Default for newly deployed agents or agents with resolved violations older than 90 days. | Standard validation. Human review required for High risk outputs. |
| Monitored | One or more High violations in the past 90 days. | Enhanced validation. Human review required for all Medium and High outputs. |
| Restricted | Critical violation in past 90 days or pattern of High violations. | All outputs require human review before any use. |
| Suspended | Critical violation with unresolved root cause, or repeated Critical violations. | Removed from all workflows. Founder reinstatement required. |
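Because every criterion in the table is defined over a 90-day window, the trust level can be derived mechanically from the violation history. The sketch below assumes a simple record shape and treats three High violations as a "pattern"; that threshold, and the reading of "clean" as no Medium-or-above violations in the window, are assumptions the policy leaves open.

```python
from datetime import datetime, timedelta

def trust_level(deployed_at: datetime, violations: list[dict], now: datetime) -> str:
    """Derive trust level from deployment age and 90-day violation history.

    Assumed record shape: {"severity": str, "at": datetime, "resolved": bool}.
    """
    window = now - timedelta(days=90)
    recent = [v for v in violations if v["at"] >= window]
    criticals = [v for v in recent if v["severity"] == "Critical"]
    highs = [v for v in recent if v["severity"] == "High"]

    if any(not v["resolved"] for v in criticals) or len(criticals) > 1:
        return "Suspended"
    if criticals or len(highs) >= 3:  # "pattern of High violations": threshold assumed
        return "Restricted"
    if highs:
        return "Monitored"
    medium_or_above = [v for v in recent
                       if v["severity"] in ("Medium", "High", "Critical")]
    if not medium_or_above and now - deployed_at >= timedelta(days=90):
        return "Full Trust"  # 90 days clean operation at Medium or above
    return "Standard"        # newly deployed, or violations resolved and aged out
```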
Section 7: Audit Requirements
Every agent operating under this policy generates an audit log. The audit log is the evidentiary record that makes accountability real rather than aspirational. Clients may request the audit log for any deliverable produced with AI agent assistance.
Why Audit Logs Matter
Without a log, accountability is a claim. With a log, it is a record. When something goes wrong, the log tells you what the agent received, what it produced, whether validation passed, what enforcement was applied, and what the trust state was at the time. That is the difference between being able to answer for your AI systems and not being able to.
All audit logs are retained for a minimum of three years. Logs are stored in a system that prevents modification or deletion of existing records. Review schedule:
- Violation summary review: weekly
- Trust score review per agent: monthly
- Full contract compliance review: quarterly
- Full audit log review: annually, or after any Critical violation
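The policy requires storage that prevents modification or deletion of existing records but does not name a mechanism. One common tamper-evidence technique, sketched below as an assumption rather than a description of our actual system, is to hash-chain entries so that any edit or deletion breaks verification.

```python
import hashlib
import json
from datetime import datetime, timezone

class AuditLog:
    """Append-only log with hash chaining: altering or removing any past
    entry breaks verify(). A sketch of one tamper-evidence technique; the
    policy does not mandate this specific mechanism."""

    def __init__(self):
        self._entries = []

    def append(self, agent_id: str, event: dict) -> None:
        """Record what the agent received and produced, the validation result,
        any enforcement applied, and the trust state at the time (as `event`)."""
        prev_hash = self._entries[-1]["hash"] if self._entries else "genesis"
        record = {
            "agent_id": agent_id,
            "at": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "prev_hash": prev_hash,
        }
        record["hash"] = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()
        ).hexdigest()
        self._entries.append(record)

    def verify(self) -> bool:
        """Recompute the chain; returns False if any entry was edited or removed."""
        prev = "genesis"
        for r in self._entries:
            body = {k: v for k, v in r.items() if k != "hash"}
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if r["prev_hash"] != prev or digest != r["hash"]:
                return False
            prev = r["hash"]
        return True
```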
Section 8: Governance and Client Disclosures
This policy is owned and maintained by Donald E. Norbeck Jr., Esq., Founder of Dark AI Defense. All Agent Contracts are reviewed and approved by the policy owner before any agent is deployed in a consequential workflow.
Dark AI Defense discloses AI agent involvement in all client deliverables. A standard disclosure statement to that effect appears in every client-facing work product produced with AI agent participation.
On Client Notification
If a Critical violation results in a flawed output that was delivered to a client before detection, Dark AI Defense notifies the client directly and provides a corrected deliverable. Transparency is not optional when the error is ours.
Section 9: What We Hold Ourselves To
We advise organizations on AI governance. We evaluate AI systems for risk. We publish independent analysis on where AI accountability fails. Doing any of those things while operating ungoverned AI agents internally would not just be inconsistent. It would be dishonest.
This policy means that every AI agent we use has a contract. Every output is validated before it influences anything that matters. Every violation is logged and acted on. Every client knows when AI was involved in their deliverable and can ask to see the record.
When we recommend a governance standard to a client, we have already applied it to ourselves. That is the only basis on which we believe our advice is worth taking.
References
The agent contract framework applied in this policy draws on the data contract and data product standards developed by Jean-Georges Perrin [1], whose contributions to the Open Data Contract Standard (ODCS) and Open Data Product Specification (ODPS) established the precedent for declarative behavioral specifications in data systems.
[1] Jean-Georges Perrin is a data architect and author whose work helped establish formal frameworks for data accountability. See jgp.ai and the Bitol IO GitHub.
