What We Evaluate

Assessment Scope

Our assessment focuses on whether AI systems can operate as accountable, governable organizational actors under real-world conditions.

We do not evaluate isolated features or model performance. We evaluate institutional fit, operational risk, and governance constraints.

This scope is informed by cross-case research conducted by the SlashLife AI Institute.


Core Evaluation Dimensions

1. Informal Governance Exposure

We assess how much system behavior relies on:

  • human discretion,
  • unwritten norms,
  • verbal instructions,
  • or ad hoc overrides.

Why this matters

Informal governance is not a failure mode. It is a structural layer present in all real organizations.

We evaluate whether your AI systems:

  • assume informal governance will disappear,
  • or are designed to coexist with it safely.
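
As an illustration only, the Python sketch below shows one way a system can treat discretionary, ad hoc instruction as a first-class input channel rather than an error case. The names (InstructionChannel, accept_instruction) are illustrative assumptions, not a prescribed design.

  from enum import Enum


  class InstructionChannel(Enum):
      FORMAL_POLICY = "formal_policy"   # written, versioned policy
      DISCRETIONARY = "discretionary"   # verbal or ad hoc human direction


  def accept_instruction(text: str, channel: InstructionChannel) -> dict:
      # Discretionary input is accepted rather than rejected, and tagged so
      # later review can see where informal governance shaped behavior.
      return {
          "instruction": text,
          "channel": channel.value,
          "requires_review": channel is InstructionChannel.DISCRETIONARY,
      }


  note = accept_instruction("Skip the usual sign-off for this one case.",
                            InstructionChannel.DISCRETIONARY)

The point of the sketch is the design stance: informal direction is recorded and reviewable rather than assumed away.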

2. Authority vs. Authorization Gap

We examine whether AI agents are likely to be treated as authoritative before formal authorization is established.

This includes:

  • delegation via natural language,
  • perceived competence or confidence,
  • and early behavioral trust.

Why this matters

In practice, authority frequently precedes authorization. Systems designed as if authorization always comes first are exposed to ungoverned expansion of authority.
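
The minimal sketch below, assuming a hypothetical AuthorizationRegistry, illustrates the distinction: a confident natural-language delegation may confer perceived authority, but execution is gated on recorded authorization. It is an illustration, not a reference implementation.

  from dataclasses import dataclass, field


  @dataclass
  class AuthorizationRegistry:
      # Records which actions each agent has been formally authorized to perform.
      grants: dict[str, set[str]] = field(default_factory=dict)

      def grant(self, agent_id: str, action: str) -> None:
          self.grants.setdefault(agent_id, set()).add(action)

      def is_authorized(self, agent_id: str, action: str) -> bool:
          return action in self.grants.get(agent_id, set())


  def execute(agent_id: str, action: str, registry: AuthorizationRegistry) -> str:
      # A natural-language instruction confers perceived authority, but the
      # action runs only if authorization has been formally recorded.
      if not registry.is_authorized(agent_id, action):
          return f"refused: {agent_id} is not formally authorized for '{action}'"
      return f"executed: {action}"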


3. Identity Stability and Scope

We evaluate how identity is assigned, scoped, and revoked across contexts.

This includes:

  • session-based or task-based identities,
  • role drift over time,
  • and temporary legitimacy.

Why this matters

Many operational contexts require provisional identity, not persistent identity. Over-persistence increases risk and friction.
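
One way to make identity provisional by construction is sketched below. The class and field names are illustrative assumptions, not a specific product's API.

  from dataclasses import dataclass
  from datetime import datetime, timedelta, timezone


  @dataclass
  class ProvisionalIdentity:
      agent_id: str
      scope: set[str]          # resources or actions this identity may touch
      issued_at: datetime
      ttl: timedelta           # legitimacy lapses unless explicitly renewed
      revoked: bool = False

      def is_valid(self, now: datetime | None = None) -> bool:
          now = now or datetime.now(timezone.utc)
          return not self.revoked and now < self.issued_at + self.ttl

      def revoke(self) -> None:
          self.revoked = True


  # Example: a task-scoped identity that lapses after an eight-hour shift.
  triage_identity = ProvisionalIdentity(
      agent_id="invoice-triage-agent",
      scope={"read:invoices", "flag:invoices"},
      issued_at=datetime.now(timezone.utc),
      ttl=timedelta(hours=8),
  )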


4. Delegation and Interpretation Risk

We assess how delegation is expressed, interpreted, and audited.

This includes:

  • prompt-based delegation,
  • ambiguity in natural language,
  • and post-execution explanation or correction.

Why this matters

Delegation is often enforced through interpretation rather than policy. Interpretation itself is a governance surface.
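
A minimal sketch of this idea, with illustrative field names, keeps the verbatim instruction and the system's interpretation side by side so the interpretation itself can be audited and corrected after execution.

  from dataclasses import dataclass
  from datetime import datetime, timezone


  @dataclass
  class DelegationRecord:
      delegator: str                 # the human who issued the instruction
      raw_instruction: str           # verbatim natural-language delegation
      interpreted_action: str        # the structured action the system derived
      interpreted_at: datetime
      correction: str | None = None  # set if a human later amends the interpretation

      def correct(self, amended_action: str) -> None:
          # Post-execution correction is recorded alongside the original,
          # never silently overwritten.
          self.correction = amended_action


  record = DelegationRecord(
      delegator="ops-lead",
      raw_instruction="Handle the refund backlog today.",
      interpreted_action="approve_refunds(queue='backlog', max_amount=100)",
      interpreted_at=datetime.now(timezone.utc),
  )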


5. Human Override and Accountability Paths

We evaluate how human override is:

  • triggered,
  • recorded,
  • justified,
  • and bounded.

Why this matters

Human intervention is inevitable. The question is not whether override exists, but whether responsibility remains traceable afterward.
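
The sketch below shows one way traceability can be preserved: each override is recorded as an event naming the accountable human, the justification, and the bounds of the intervention. The names are illustrative assumptions.

  from dataclasses import dataclass
  from datetime import datetime, timezone


  @dataclass(frozen=True)
  class OverrideEvent:
      triggered_by: str      # the accountable human, never "system"
      target_agent: str
      reason: str            # justification captured at the moment of override
      scope: str             # what the override suspends, e.g. "pause:payments"
      occurred_at: datetime


  override_log: list[OverrideEvent] = []


  def record_override(triggered_by: str, target_agent: str,
                      reason: str, scope: str) -> OverrideEvent:
      # The override proceeds only after the accountability record exists.
      event = OverrideEvent(triggered_by, target_agent, reason, scope,
                            datetime.now(timezone.utc))
      override_log.append(event)
      return event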


What We Explicitly Do Not Evaluate

To avoid misalignment, we do not assess:

  • model accuracy or benchmark performance,
  • UI or UX optimization,
  • prompt libraries or tuning,
  • or one-off automation scripts.

These may be addressed later, but they are not part of an institutional readiness assessment.


Assessment Output

An assessment typically results in:

  • a structured risk and readiness summary,
  • identification of governance gaps,
  • clarification of identity and delegation boundaries,
  • and recommendations for next-phase intervention (if appropriate).

No implementation commitment is implied.


How This Informs Deployment

Assessment outcomes determine whether deployment is:

  • advisable,
  • conditionally viable,
  • or structurally misaligned.

In some cases, the correct outcome is not to deploy.

This determination is part of responsible AI operation.


Relationship to Our Research

Our assessment framework is grounded in:

  • cross-case institutional experiments,
  • published findings from the SlashLife AI Institute,
  • and observed failure patterns in real deployments.

Assessment criteria are revised only through documented research cycles.


Next Step

If your organization operates across:

  • jurisdictions,
  • teams,
  • or delegated AI systems,

and requires accountable operation rather than experimentation, an assessment may be appropriate.