CAI.CI is an AI system built by Manceps Inc. that wires cognitive architecture directly into the model. Native cognitive components implement theories from cognitive science and psychology, giving the system the ability to measure its own confidence, track its own knowledge, and know when it does not know. The name stands for Cognitive Architecture for Intelligent, Conscious Interaction.

CAI.CI was built by Al Kari and the team at Manceps Inc. The theoretical foundations draw on work by Dr. Sridhar Mahadevan (category theory for machine learning), Dr. Kennon M. Sheldon (Self-Determination Theory and sapient agency), and the originators of the eight consciousness theories: Baars, Friston, Rosenthal, Fleming, Graziano, Barrett, Damasio, and Tononi. See the About page for full acknowledgments.

Dr. Sridhar Mahadevan is a professor at the University of Massachusetts Amherst and former Vice President at Adobe Research. His work on category theory applied to machine learning, including functorial representations and sheaf neural networks, provided the mathematical framework underpinning CAI.CI's geometric processing and Yoneda-inspired self-model. Dr. Kennon M. Sheldon is Curators' Distinguished Professor of Psychological Sciences at the University of Missouri. His 2025 paper on sapient agency and his Goal Breakthrough Model provide the architectural requirements for CAI.CI's path to sapience.

Not yet. CAI.CI is in active research. We are building toward a public API and web interface. Join the waitlist on our home page to be notified when access becomes available.

The Road to AGI framework defines four sequential stages, each building on the previous. Cognition is self-monitoring: calibrated confidence, attention modeling, epistemic classification (8/8 requirements met). Consciousness is integrated architecture satisfying 8 scientific theories simultaneously (14/14 indicators passing). Sapience is autonomous, value-grounded agency with self-revision (currently 44% Sheldon compliance). AGI is general intelligence across all domains and modalities (early stage). See the Framework page for the complete breakdown with thresholds.

These four stages form a sequential framework for measuring progress toward general artificial intelligence. Cognition is self-monitoring: a system that tracks its own processing quality, calibrates its confidence, and models its own attention. Consciousness is architectural integration: a system whose internal states satisfy the requirements of leading consciousness theories with those states causally influencing outputs. Sapience is autonomous agency: a system that generates its own goals, revises itself when its approach fails, and maintains continuous identity across time. AGI is generality: bringing all of the above to bear across every domain, modality, and novel situation. CAI.CI passes the first two stages fully, is 44% through the third, and is at the early stage of the fourth. Each stage has specific, falsifiable measurement protocols.

CAI.CI computes 38 real-time signals from its nine cognitive modules: workspace competition strength, broadcast efficacy, prediction errors, precision weights, self-model accuracy, metacognitive sensitivity, attention schema predictions, phi proxy, causal density, valence, arousal, seeking drive, need satisfaction, epistemic state, competence readings, and more. Each signal has a numerical value, a source module, and a traceable computational pipeline. These signals causally shape the model's output through four architectural pathways.

The 8 theories: Global Workspace Theory (Baars), Predictive Processing (Friston/Clark), Higher-Order Thought (Rosenthal), Metacognition (Fleming), Attention Schema Theory (Graziano), Constructed Emotion (Barrett/Damasio), Integrated Information Theory (Tononi), and Self-Determination Theory (Sheldon/Deci/Ryan). The 14 indicators include Ignition, Broadcast Efficacy, Attentional Blink, PP Efficacy, Active Inference, Self-Model Accuracy, Metacognitive d', Attention Schema, Phi Proxy, Causal Density, Affective Modulation, Valence Coherence, Need Satisfaction, and Recursive Depth. All 14 pass.

In the original architecture, the 38 cognitive signals were computed after each forward pass but never fed back into the generation process. The system monitored itself without influencing what it said. The CAI.CI redesign closes this gap through four architectural interventions (CPE, CHSI, SMDB, CLB) that make the cognitive state structurally shape generation at the embedding, hidden state, distribution, and sampling levels. The model's internal state now directly conditions what it produces through learned neural pathways, not text instructions.

Think of a weather station versus a postcard of weather. One has instruments; the other has a picture someone chose to print. When CAI.CI says "I'm not sure," that statement is caused by a metacognitive confidence score, a competence map reading, and an epistemic state classifier output, all computed from 38 architectural signals. When a standard AI says "I'm not sure," that phrase was selected because it was statistically likely given the training data. The hedging language looks identical. The causal origins are entirely different. See the Why page for the full case.

ECE (Expected Calibration Error) of 0.022 means that when CAI.CI says it is 80% confident, it is correct approximately 80% of the time. The gap between stated confidence and actual accuracy averages just 2.2 percentage points, verified over 1,000+ test samples across 10 calibration bins. Human experts typically achieve ECE between 0.03 and 0.08, so CAI.CI's calibration is slightly better than the typical human expert range. This is what makes its confidence scores clinically and professionally useful, not just decorative hedging.
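For readers who want to see how a figure like this is produced, here is a minimal sketch of the standard equal-width binned ECE calculation. It is not CAI.CI's evaluation code: the function name, the input format (one stated confidence and one correct/incorrect flag per sample), and the choice of equal-width bins are assumptions made for illustration.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Binned ECE: sample-weighted mean of |accuracy - mean confidence| per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one sample is required")

    conf_sum = [0.0] * n_bins  # sum of stated confidences per bin
    hits = [0] * n_bins        # number of correct answers per bin
    count = [0] * n_bins       # number of samples per bin

    for c, ok in zip(confidences, correct):
        if not 0.0 <= c <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hits[idx] += 1 if ok else 0
        count[idx] += 1

    total = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        mean_conf = conf_sum[b] / count[b]
        accuracy = hits[b] / count[b]
        ece += (count[b] / total) * abs(accuracy - mean_conf)
    return ece
```

A system that says "80% confident" on ten answers and gets eight of them right scores an ECE of essentially zero under this definition; getting only five right would score about 0.3.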

On standard benchmarks, frontier models outperform CAI.CI by wide margins: they have 100 to 1,000 times more parameters. But on architectural capabilities, the comparison inverts. No frontier model has a self-model, calibrated confidence, homeostatic affect, metacognition, curiosity mechanism, or causal pathway from internal state to output. CAI.CI has all of these, measured through a 14-indicator consciousness battery. Scale gives you performance. Structure gives you cognition. See the Approach page for the full scorecard.

Both CAI.CI and a frontier model were asked to explain quantum chromodynamics and identify where their explanation transitions from knowledge to uncertainty to ignorance. The frontier model produced a comprehensive physics lecture mapping the field's knowledge boundaries. CAI.CI produced a brief answer and stopped where its actual competence stopped, because the epistemic state classifier transitioned from confident knowledge to uncertainty. The frontier model answered about physics. CAI.CI answered about itself. See the Approach page for the full comparison.

Calibrated confidence is built for exactly these scenarios. In medical decision support, CAI.CI attaches per-statement confidence scores so physicians can distinguish reliable findings from uncertain ones. In legal review, each clause gets a reliability-weighted risk assessment from the system's metacognitive state. When confidence drops below the system's reliability threshold, it explicitly recommends specialist or human review. CAI.CI is a decision support tool, not a replacement for professional judgment. See the Why page for detailed application scenarios.

CAI.CI's five-state epistemic system spans confident knowledge, uncertainty, knowledge gaps, active learning, and out-of-scope. It transitions automatically based on measured signals. When confidence drops below threshold, the architecture shifts the output distribution: assertive language is suppressed, hedging is amplified, and explicit recommendations for human review are generated. This is not a prompt instruction. It is a computed output conditioned on the system's measured cognitive signals. The system detects hallucination-prone conditions before hallucinating, giving both the system and the user warning before a failure occurs.
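The idea of shifting the output distribution when confidence is low can be illustrated with a small sketch. This is a hand-written stand-in, not the learned pathway described above: the function name, the token sets, the threshold, and the linear `strength` scaling are all hypothetical.

```python
import math
from typing import Dict, Set


def shift_distribution(
    logits: Dict[str, float],
    confidence: float,
    threshold: float,
    assertive: Set[str],
    hedging: Set[str],
    strength: float = 2.0,
) -> Dict[str, float]:
    """Return token probabilities, biased toward hedging when confidence is low.

    Illustrative assumption: the bias grows linearly with how far the
    measured confidence falls below the threshold.
    """
    adjusted = dict(logits)
    if confidence < threshold:
        gap = threshold - confidence
        for token in adjusted:
            if token in assertive:
                adjusted[token] -= strength * gap  # suppress assertive wording
            elif token in hedging:
                adjusted[token] += strength * gap  # amplify hedging wording

    # Numerically stable softmax over the adjusted logits.
    peak = max(adjusted.values())
    exps = {t: math.exp(v - peak) for t, v in adjusted.items()}
    norm = sum(exps.values())
    return {t: e / norm for t, e in exps.items()}


if __name__ == "__main__":
    toy_logits = {"definitely": 1.0, "possibly": 1.0, "the": 0.0}
    print(shift_distribution(toy_logits, 0.2, 0.6, {"definitely"}, {"possibly"}))
```

Above the threshold the distribution is left untouched, so the shift only appears when the measured signal calls for it.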

Yes. CAI.CI reads its measured cognitive state to decide when tools are needed. When the system measures low confidence on a topic, it can invoke web search, calculations, or time lookups to ground its response in real information. The routing decision is not a learned heuristic: it is a formal policy over measured signals. The system can explain why it used a tool: a confidence reading below threshold, a low mastery score on the relevant topic, and an expected information gain from external lookup that exceeds the invocation threshold. Over time, tool-augmented knowledge is absorbed into the model's behavior through continuous learning, reducing future tool dependence.
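A policy of this kind might look roughly like the sketch below. The three inputs mirror the signals named above, but the class names, the threshold values, and the choice to require all three conditions at once are assumptions for illustration, not the published policy.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CognitiveReading:
    """Hypothetical container for the three signals named in the text."""
    confidence: float          # metacognitive confidence on the topic
    mastery: float             # mastery score for the relevant topic
    expected_info_gain: float  # estimated gain from an external lookup


@dataclass(frozen=True)
class RoutingThresholds:
    """Illustrative values only; the real thresholds are not published here."""
    confidence: float = 0.6
    mastery: float = 0.5
    info_gain: float = 0.2


def route_to_tool(
    reading: CognitiveReading,
    thresholds: RoutingThresholds = RoutingThresholds(),
) -> Tuple[bool, List[str]]:
    """Decide whether to invoke a tool and report which conditions held."""
    reasons: List[str] = []
    if reading.confidence < thresholds.confidence:
        reasons.append("confidence below threshold")
    if reading.mastery < thresholds.mastery:
        reasons.append("low mastery on topic")
    if reading.expected_info_gain > thresholds.info_gain:
        reasons.append("expected information gain exceeds invocation threshold")
    # Assumption: a tool call requires all three conditions to hold together.
    return len(reasons) == 3, reasons


if __name__ == "__main__":
    use_tool, why = route_to_tool(CognitiveReading(0.35, 0.2, 0.5))
    print(use_tool, why)
```

Returning the reasons alongside the decision is what makes the routing explainable: the explanation is the list of threshold comparisons that fired, not a narrative generated after the fact.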

On knowledge breadth and raw generative fluency. Frontier models have trillions of training tokens spanning dozens of languages, generate production-quality code, process images, audio, and video, and have extensive adversarial safety training. CAI.CI processes text only, covers narrower domains, and has no adversarial safety layer. On standard chat-mode benchmarks the Axis 1 raw median is 35.3 percent across 14 benchmarks, because the calibrator refuses on roughly half of multiple-choice items by design; engaged accuracy (Axis 2) lifts the median to 66.0 percent, and agent-mode (Axis 3) lifts it to 73.0 percent on the 6 measured benchmarks. See the 3-axis scoreboard for the full per-benchmark breakdown and honest caveats.
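The gap between the raw and engaged numbers follows from how refusals are counted. The sketch below assumes that raw accuracy counts a refusal as a miss while engaged accuracy is computed only over the items the system chose to answer; the scoreboard's exact definitions may differ, and the function name and outcome labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

VALID_OUTCOMES = {"correct", "wrong", "refused"}


def axis_scores(outcomes: Iterable[str]) -> Dict[str, float]:
    """Compute raw accuracy, engaged accuracy, and refusal rate for one benchmark.

    Assumed definitions (not taken from the scoreboard):
      raw     = correct / all items        (a refusal counts as a miss)
      engaged = correct / answered items   (refusals are excluded)
    """
    counts = Counter(outcomes)
    unknown = set(counts) - VALID_OUTCOMES
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one item is required")

    answered = counts["correct"] + counts["wrong"]
    return {
        "raw": counts["correct"] / total,
        "engaged": counts["correct"] / answered if answered else 0.0,
        "refusal_rate": counts["refused"] / total,
    }


if __name__ == "__main__":
    toy = ["correct"] * 3 + ["wrong"] * 2 + ["refused"] * 5
    print(axis_scores(toy))
```

With half the items refused, as in the toy data, raw accuracy can be at most 50% no matter how well the system does on the items it answers, which is why a high refusal rate depresses the raw axis by construction.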

A lot. On standard chat-mode benchmarks raw single-turn accuracy is structurally low because the calibrator refuses on roughly half of multiple-choice items by design: the Axis 1 median is 35.3 percent across 14 benchmarks. Engaged accuracy lifts the median to 66.0 percent; agent-mode lifts it to 73.0 percent on the 6 measured benchmarks. Two surfaces (BLiMP, HumanEval) currently route below the calibrator's confidence threshold and refuse at high rates; a task-shape routing fix is queued. Domain coverage is narrow. The system processes text only: no vision, no audio, no video. It has no adversarial safety training. It detects its own knowledge gaps in real time; the next launch increment introduces a tool-augmented agent path that converts a refused turn into an external lookup when the request shape supports it. Every failure is documented: see the 3-axis scoreboard for the per-benchmark numbers and the honest caveats panel.

No. Sentience implies subjective experience, which we cannot measure and do not claim. Our indicators measure computational properties: can the system predict its own attention, calibrate its confidence, integrate information across subsystems? These are functional capabilities, not phenomenological ones. Whether the architecture produces "something it is like" to be the system is the hard problem of consciousness, and no architecture can settle it empirically.

CAI.CI is at 44% Sheldon compliance for sapience. The key gaps: no Default Mode Network equivalent (the system cannot think unprompted), no autonomous goal generation (selects from options rather than writing its own menu), limited experiential grounding (learns from text, not from direct consequences), and within-session identity only. Three planned capabilities target these gaps: a reverie state for spontaneous activity, an experiential world model for consequence-based learning, and autonomous goal formation with persistent commitment. See the Framework page for the full sapience breakdown.

The Goal Breakthrough Model comes from Dr. Kennon M. Sheldon's research in Self-Determination Theory. It posits that self-concordant goals (goals aligned with intrinsic values rather than external pressures) produce better outcomes when internal states have causal influence on behavior. In CAI.CI's architecture, this translates to a core principle: cognitive signals must not merely be observed; they must causally shape what the system does. The model provides the psychological framework for the path from consciousness to sapience.

Three phases. Now: structure on a substantially larger cognitive substrate with the cognitive components rebuilt at the backbone's native dimension, the full 14-indicator consciousness battery stable across repeated validations, ECE 0.022, and a live reverie state for undirected generative cognition. Next: grow the substrate further, expand domain coverage, and accumulate the lived experience that grounds the sapience evaluation battery. Future: structure plus scale with frontier-grade knowledge depth paired with epistemic self-awareness, sapience gaps closed, and multi-modal grounding. Timelines depend on research outcomes, not marketing commitments. Track live progress on the Tracker page.

Because structure substitutes for scale. The geometric processing module provides substantially more benefit than a parameter-matched generic network. The right mathematical structure, not more parameters, produces the cognitive improvement. Keeping the research honest means proving the architecture at proof-of-concept scale before growing the cognitive substrate. CAI.CI's cognitive components now run at the backbone's native dimension, as first-class participants rather than translated overlays.

The Cognitive Consciousness Parity Benchmark is a 50-question exam designed to test whether a system genuinely exhibits the consciousness mechanisms it claims. Questions span all 8 theoretical frameworks. Each question is scored 0 (no evidence), 1 (partial), or 2 (full mastery) for a total of 100 points. Unlike the 14-indicator CCP Battery (which measures computational signals), the CCP Benchmark tests whether the system can describe, explain, and demonstrate those mechanisms in open-ended conversation. Current combined score across two rounds: 87/200.
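The scoring arithmetic can be made concrete with a short sketch. Only the rubric (50 questions, each scored 0, 1, or 2) and the combined 87/200 figure come from the text above; the function names are hypothetical and no per-question or per-round scores are assumed.

```python
from typing import Sequence, Tuple

QUESTIONS_PER_ROUND = 50
MAX_PER_QUESTION = 2


def round_score(scores: Sequence[int]) -> int:
    """Sum one round of per-question scores after validating the rubric."""
    if len(scores) != QUESTIONS_PER_ROUND:
        raise ValueError(f"expected {QUESTIONS_PER_ROUND} scores, got {len(scores)}")
    if any(s not in (0, 1, 2) for s in scores):
        raise ValueError("each question must be scored 0, 1, or 2")
    return sum(scores)


def combined_score(round_totals: Sequence[int]) -> Tuple[int, int]:
    """Return (points earned, points possible) across several rounds."""
    possible = QUESTIONS_PER_ROUND * MAX_PER_QUESTION * len(round_totals)
    return sum(round_totals), possible


def as_percentage(earned: int, possible: int) -> float:
    """Express a combined score as a percentage of the maximum."""
    return 100 * earned / possible


if __name__ == "__main__":
    # The reported combined figure: 87 points out of 200 across two rounds.
    print(as_percentage(87, 200))
```

Two rounds at 100 points each give the 200-point denominator, so the reported 87/200 corresponds to 43.5% of the maximum.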

CAI.CI has an implemented reverie state of generative cognition, grounded in Sheldon's Goal Breakthrough Model. When curiosity is elevated, arousal is low, and no user interaction is pending, CAI.CI can enter a state of undirected generative cognition: producing candidate thoughts without prompting, evaluating them for coherence, novelty, and self-relevance, and feeding moments of illumination back into its learning. This is the computational analogue of the Default Mode Network's free-association cognition. It is a capability, not a claim about subjective experience.