Thinking About Thinking

Your AI agent just recommended a strategy that would bankrupt your company. It was technically brilliant, perfectly reasoned, and completely wrong for your organization. The recommendation ignored your company's risk tolerance, violated regulatory requirements specific to your industry, and contradicted the decision-making framework that has guided your organization for decades. Yet the AI performed flawlessly according to every benchmark we use to measure these systems.

This scenario illustrates why knowledgeable practitioners dismiss AI-generated outputs as "obviously written by AI." Whether they are executives, principal engineers, or domain experts evaluating an organizational task that requires operational knowledge, they detect the artificiality easily: the outputs lack the operational context and decision-making approaches that shape how humans actually work within specific companies.

This represents the fundamental challenge of "context engineering." Much of the knowledge required for effective decision-making exists as unspoken conventions baked into the organizational fabric and embedded in the practices of high-performance individuals. Whether it's strategic planning, customer communications, or operational decisions, AI systems consistently produce generic best practices and textbook approaches rather than solutions that fit the organization's culture, constraints, and established ways of working.

The problem runs deeper than missing context. When high-performing organizations develop effective practices rooted in specific principles and values, those practices often get copied by other organizations without understanding the underlying reasoning. As these practices cross-pollinate across companies, they get stripped of their original context and constraints, then fine-tuned into merely workable processes. By definition, best practices represent a reversion to the mean.

AI models inherit this mediocrity because their training data contains these averaged-out, context-stripped versions of originally high-performance practices rather than the organizational first principles and contextual reasoning that created them. The models are trained on what gets documented. This typically includes approaches from top 20% performers who write books, blogs, and thought leadership pieces about their methods.

These same documented approaches become the cargo-cult practices that other organizations copy without understanding the underlying principles. This explains why AI outputs consistently feel generic and perform like competent practitioners rather than exceptional ones. The models have learned to reproduce decontextualized versions of good practices without the deliberate, situational reasoning that would make those approaches highly specific to current problems and opportunities. What's missing is the slow, effortful thinking that adapts general frameworks to particular organizational constraints and contexts, the capability that distinguishes exceptional performance from merely competent execution. Such deliberate reasoning operates from organizational first principles rather than surface-level practices.

True organizational effectiveness requires reasoning from institutional first principles—the foundational beliefs, values, and tenets that an organization holds as self-evident truths. Amazon's leadership principles and document tenets exemplify this approach. These aren't universal physics principles, but organizational assertions about reality that teams treat as foundational until proven otherwise. High-performing practitioners decompose problems and approaches down to this level, building solutions upward from what the organization fundamentally believes to be true.

However, exceptional organizational growth requires applying broader frameworks that test the validity of existing first principles and potentially expand them. This is where insight originates. While organizational first principles help create coherent solutions within current understanding, novel interpretations of problems and their solutions lie outside of our perceived solution space because our current principles prevent us from seeing them. Discovering them requires introducing new principles derived from broader analytical frameworks that can explain past failures or reveal new possibilities that current tenets cannot address.

AI systems capable of this type of reasoning can operate within organizational comfort zones, guide organizations toward expanded understanding, and bridge between different approaches. When proposing a shift from current practices to new solutions, the system can explain what needs to change in the transition. This includes identifying which principles to retire, what new behaviors to introduce, and what policies need modification.

This bridging capability makes the AI's thinking transparent and auditable. Organizations can examine the reasoning behind proposed changes and evaluate whether the transformation logic is sound. Rather than receiving unexplained recommendations, teams can review the step-by-step thinking that connects current state to proposed future state. This transforms AI from mere automation into a tool for amplifying organizational effectiveness beyond current capabilities.

The Hidden Cost of Black-Box Thinking

The current approach to AI deployment treats these systems as sophisticated magic boxes. Input a problem, receive an optimal solution. This assumption creates predictable failure patterns that we can anticipate through deductive reasoning. Consider what happens when an AI system optimizes resource allocation across projects without understanding the partnership agreements that constrain those decisions. Or imagine an AI agent suggesting clinical trial designs that are scientifically sound but would never pass regulatory review because the system lacks awareness of industry-specific compliance requirements. Picture a planning AI generating roadmaps that ignore the technical debt constraints that shape every real decision in a software organization.

Raw intelligence produces suboptimal results when divorced from structured frameworks that guide decision-making. This pattern should be familiar to anyone who has witnessed a new executive join an organization and immediately begin making drastic changes without taking time to understand the company's context, capabilities, and culture.

The newly hired executive arrives with impressive credentials and proven expertise. They see obvious problems and have clear ideas about solutions. Yet their early decisions often disrupt systems in unexpected ways, alienate key stakeholders, or conflict with regulatory requirements they didn't know existed. The most successful new leaders focus initially on information gathering. They work to understand organizational capabilities, decision-making frameworks, and the cultural context that shapes how change actually happens within the company.

The same principle applies when deploying AI agents. We essentially hire them as instant executives, expecting them to operate with complete context from day one. But unlike human executives who can spend their first 90 days learning organizational frameworks and building relationships, AI agents receive no such onboarding period. They are expected to produce solutions that are not just technically correct, but obviously right to everyone in the organization, or at least obviously understandable and implementable within existing cultural and operational constraints.

When human experts tackle complex problems, they do not simply apply raw cognitive horsepower. They select from a repertoire of proven methodologies, adapt frameworks to specific contexts, and operate within established organizational cultures and constraints.

Consider how a seasoned consultant approaches a strategic planning engagement. They do not start by thinking harder about the problem. Instead, they select appropriate frameworks from their toolkit. This might be a SWOT analysis for competitive positioning, a stage-gate process for product development, or a risk assessment methodology for regulatory compliance. The choice of framework shapes the entire problem-solving process, determining what data to gather, how to structure analysis, and what success criteria matter most.

Yet when we deploy AI agents, we abandon this accumulated wisdom about cognitive frameworks. We expect models to reinvent effective thinking patterns from scratch for each problem, rather than leveraging the extensive knowledge we have about how to structure different types of cognitive work. This approach works reasonably well for tightly scoped problems where the full context can be articulated and presented to the model. But it fails for complex organizational problems where much of the relevant context remains implicit, undocumented, or embedded in cultural practices that have never been explicitly codified.

The potential cost of this misalignment could extend beyond individual failed recommendations. Organizations that cannot rely on AI agent outputs might find themselves trapped between the promise of AI automation and the reality of requiring constant human oversight. The result could be systems that consume resources without delivering the productivity gains that justified their deployment.

The Missing Layer: Cognitive Scaffolding for AI

Effective AI agents need explicit cognitive scaffolding that operates above the level of raw model inference. Just as human expertise involves not only knowledge and reasoning ability but also the wisdom to select appropriate frameworks for different types of problems, AI agents need architectural layers dedicated to framework selection, strategic planning, and execution oversight.

This scaffolding approach mirrors how humans actually solve complex problems. When a management consultant, engineer, or researcher encounters a challenging problem, they engage in meta-cognitive processing. They think about how to think about the problem. This "thinking about thinking" separates exceptional performers from those who simply apply standard approaches. They consider what type of problem they are facing, what methodologies have proven effective for similar challenges, and how to adapt these approaches to their specific context and constraints.

Some current AI models do incorporate thinking modes where they spend initial compute ruminating on problems in dedicated thinking blocks before generating responses. However, these thinking capabilities are typically achieved through two-phase training. First, developers instruct a model to think explicitly. Then they fine-tune on both thinking outputs and final responses together. While this produces models that appear to deliberate, the sophistication of their thinking is fundamentally limited by what was encoded during fine-tuning.

This approach resembles what cognitive scientists call System 1 thinking. System 1 thinking is fast, automatic, and pattern-based, even when it appears deliberative. True framework selection requires System 2 thinking. This type of thinking is slow, effortful, and capable of variable-time processing that can adapt to problem complexity. When thinking is insufficient, the system should be able to recognize this at inference time and continue prompting for additional meta-processing.
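
As a sketch of that variable-time processing, assuming a placeholder `complete` model client and a separate `judge` call (both stand-ins, not any specific API), the system can loop until its own meta-assessment says the analysis is sufficient:

```python
def think_until_sufficient(complete, judge, problem: str, max_rounds: int = 4) -> str:
    """A variable-time System 2 loop: deliberate until a judge model is satisfied."""
    thoughts = complete(f"Think step by step about how to approach:\n{problem}")
    for _ in range(max_rounds - 1):
        verdict = judge(
            "Is this analysis sufficient to proceed? Answer SUFFICIENT or "
            f"INSUFFICIENT.\n{thoughts}"
        )
        if "INSUFFICIENT" not in verdict.upper():
            break  # the meta-assessment says the thinking is enough
        thoughts += "\n" + complete(f"Extend and deepen this analysis:\n{thoughts}")
    return thoughts
```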

Organizations need AI that can handle novel situations and adapt its approach when standard methodologies prove insufficient. Current architectures that rely primarily on fine-tuned thinking patterns cannot achieve this level of adaptive meta-cognition. They leap from problem identification to solution generation using pre-learned thinking patterns, bypassing the dynamic framework selection process that human experts recognize as necessary for reliable outcomes in novel contexts.

Addressing this gap requires implementing multi-layered agent architectures that separate high-level orchestration from framework selection from execution while providing transparent, auditable decision-making processes. Rather than treating AI agents as monolithic systems, we need architectures that explicitly model the different types of cognitive work required for effective problem-solving.

How It Works: Dynamic Framework Selection

Organizations face increasingly complex, novel problems that don't fit predetermined templates. Even when established frameworks exist, they often require adaptation based on organizational size, complexity, and constraints. A strategic planning framework developed for a 200-person company may contain valuable elements but need significant modification for a 100,000-person organization. Organizations need AI systems capable of contextual reasoning that can identify relevant methodologies and adapt them to specific situations rather than cargo-culting one-size-fits-all solutions.

The central component of this new architecture is dynamic framework selection. Traditional approaches assume that the right methodology must be pre-programmed or explicitly specified by users. But this assumption limits organizations to problems that fit predetermined templates and places unrealistic burdens on teams to understand both their problems and the full range of available methodologies.

Instead, effective AI agents should be capable of generating appropriate frameworks dynamically by leveraging the substantial knowledge about decision-making methodologies embedded in their training data. Large language models contain substantial information about project management frameworks, strategic planning processes, risk assessment methodologies, and cognitive approaches across virtually every domain. However, this knowledge remains dormant during normal inference unless explicitly activated through structured prompting.

The framework selection layer operates by transitioning the system into a specialized mode focused on methodology discovery and adaptation. When presented with a problem, the system engages in explicit meta-processing. It asks questions like "Given these problem attributes and this type of solution we are driving toward, what are the best frameworks commonly used in this space? What are the specific steps of that methodology? What are the specific inputs and outputs expected at each stage?"
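
A minimal sketch of what this mode change might look like in practice, with the same placeholder `complete` client standing in for whichever model you use; the structured prompt, not the API, is the point:

```python
FRAMEWORK_SELECTION_PROMPT = """\
You are in framework-selection mode. Do not solve the problem yet.

Problem attributes: {attributes}
Desired solution type: {solution_type}

1. List the frameworks commonly used for problems with these attributes.
2. Pick the best fit and justify the choice.
3. Enumerate the chosen methodology's specific steps.
4. For each step, state the expected inputs and outputs.
"""

def select_framework(complete, attributes: str, solution_type: str) -> str:
    """Run one explicit meta-processing pass and return the chosen methodology."""
    prompt = FRAMEWORK_SELECTION_PROMPT.format(
        attributes=attributes, solution_type=solution_type
    )
    return complete(prompt)
```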

This approach becomes particularly powerful when established methodologies prove insufficient. The system can engage in creative cross-pollination, identifying problems with similar attributes across different domains and adapting their associated methodologies. For instance, when developing a methodology for an in-person entertainment business where no specialized frameworks exist, the system might identify structural similarities with gymnasium operations and adapt fitness industry best practices to the entertainment context.

The key insight is that what matters is not the source of the framework but the explicit mode change that occurs when the system transitions from general problem-solving to structured framework selection. This separation ensures that methodological choices are transparent, auditable, and adaptable while preventing the system from making implicit assumptions about how problems should be approached.

This multi-stage approach solves the cargo cult problem by forcing models to replicate the same battle-tested practices that top performers use. When you simply ask a model "How risky is this project?" it autocompletes with generic best practices. But when you guide the model through a structured process—first developing a risk evaluation framework, then creating problem-specific criteria, then applying those criteria systematically, then calculating scores using tools—you replicate the thoughtful analysis patterns that distinguish exceptional practitioners.
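
A hypothetical rendering of that guided sequence, again with `complete` as a placeholder client. Each stage is a separate call whose output feeds the next, so the model cannot leap straight to an autocompleted answer:

```python
def assess_risk(complete, project: str) -> str:
    # Stage 1: develop a risk evaluation framework for this class of problem.
    framework = complete(
        f"Develop a risk evaluation framework suited to this project:\n{project}"
    )
    # Stage 2: turn the framework into problem-specific criteria.
    criteria = complete(
        f"Using this framework, derive concrete, project-specific risk criteria:\n{framework}"
    )
    # Stage 3: apply each criterion systematically, one score at a time.
    scored = complete(
        "Score the project 1-5 against each criterion, with justification:\n"
        f"{criteria}\n\nProject:\n{project}"
    )
    return scored  # a tool call (e.g., a calculator) could then aggregate the scores
```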

This is still a form of cargo-culting, but it fundamentally changes who you're trying to copy. Rather than replicating widely documented practices from the top 20% of performers, the system targets the specific methodological approaches that produce top 1% results. These exceptional performers may document their outcomes, but they're vastly underrepresented in training data when it comes to documenting the deliberate processes that created those outcomes. The meta-processing scaffolding forces the model to execute the same types of rigorous, multi-stage analysis that top performers actually use to achieve exceptional results, rather than defaulting to the averaged-out practices that dominate most business literature.

Real-World Integration: Organizational Alignment

Framework selection alone is insufficient if the chosen methodologies conflict with organizational cultures and decision-making processes. Every organization develops its own cultural operating system. This includes practices, document formats, and decision-making methodologies that shape how work gets done and how ideas get evaluated.

Take Amazon's well-documented approach to strategic planning and communication. The company has institutionalized specific document formats like the six-pager for detailed proposals and the OP1 and OP2 processes for annual planning. These are not arbitrary bureaucratic requirements; they represent accumulated organizational wisdom about how to structure thinking, communicate complex ideas, and make decisions at scale. An AI agent operating within this environment cannot simply produce generically good output; it must produce output that conforms to these established patterns.

Similarly, organizations in different industries have developed sector-specific frameworks that reflect their unique challenges and regulatory requirements. A pharmaceutical company's approach to risk assessment differs fundamentally from a software company's approach, not because one is superior, but because they operate under different constraints, timelines, and success criteria.

The meta-processing architecture provides a natural mechanism for encoding these organizational preferences into agent behavior. Rather than attempting to modify model weights or engineer complex prompts to achieve organizational alignment, the framework selection layer can be explicitly configured to prioritize methodologies that align with institutional values and practices.

This architecture can implement the type of reasoning that exceptional practitioners use within their institutional contexts, working from the organization's foundational beliefs and values rather than applying generic frameworks. This alignment with organizational conventions directly addresses why AI outputs feel "obviously artificial." When AI systems produce outputs that contradict expected approaches, experienced practitioners immediately recognize them as externally generated rather than organizationally appropriate.

This organizational alignment capability enables companies to maintain consistency across multiple AI deployments while still allowing for customization based on specific use cases. A company could establish baseline framework preferences that apply across all agent deployments, while permitting individual teams to customize specific aspects based on their unique requirements.
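
One plausible shape for such configuration is a company-wide baseline that individual teams override; every field name below is illustrative rather than any standard:

```python
# Company-wide baseline preferences consumed by the framework selection layer.
BASELINE = {
    "document_format": "six-pager",        # narrative proposals, Amazon-style
    "planning_process": "OP1/OP2",
    "preferred_frameworks": ["working backwards", "pre-mortem"],
    "regulatory_constraints": [],
}

def team_config(overrides: dict) -> dict:
    """Merge one team's overrides onto the shared company baseline."""
    return {**BASELINE, **overrides}

# Example: a clinical team tightens the defaults without discarding them.
clinical = team_config({"regulatory_constraints": ["ICH-GCP", "21 CFR Part 11"]})
```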

The architecture also enables organizational growth beyond current understanding. Meta-processing layers can apply broader analytical frameworks to test the validity of existing organizational principles. When current tenets cannot explain past failures or reveal new opportunities, the system can guide organizations toward expanded first principles that unlock previously invisible solution spaces.

Technical Architecture: APIs and Standards

The technical implementation centers on three distinct but integrated layers, each with clearly defined responsibilities and interfaces.

The agent orchestration layer handles intake, goal clarification, and high-level coordination. This layer receives user inputs, clarifies objectives, and manages the overall problem-solving process without getting involved in methodology selection or execution details. It serves as the primary interface between users and the underlying cognitive machinery.

The meta-processing layer is responsible for framework selection and strategic planning. When presented with a problem and set of objectives, this layer draws from both static libraries of established methodologies and dynamic framework generation capabilities to identify the most appropriate approach. This layer maintains transparency about its reasoning process, providing clear explanations for why particular frameworks were chosen and how they will be adapted to the specific context.

The execution layer handles implementation within the selected framework, applying the chosen methodology while maintaining awareness of success criteria, tracking mechanisms, and quality standards. This layer works within the constraints and guidance provided by the meta-processing engine while leveraging the full capabilities of the underlying language model to generate solutions and produce outputs.
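
Sketched as code, the separation might reduce to three narrow interfaces. This skeleton assumes each layer exposes a single entry point; a real implementation would add streaming, tool access, and human checkpoints:

```python
from typing import Protocol

class Orchestrator(Protocol):
    def clarify_goal(self, user_input: str) -> str: ...  # intake and objectives

class MetaProcessor(Protocol):
    def select_framework(self, problem: str, goal: str) -> dict: ...  # auditable choice

class Executor(Protocol):
    def run(self, problem: str, framework: dict) -> str: ...  # work within the framework

def solve(o: Orchestrator, m: MetaProcessor, e: Executor, request: str) -> str:
    goal = o.clarify_goal(request)
    framework = m.select_framework(request, goal)
    return e.run(request, framework)
```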

The distinction between System 1 and System 2 thinking also maps naturally to multi-model architectures where different models handle different cognitive loads. The framework selection and strategic planning layers might leverage large, sophisticated models optimized for complex reasoning and meta-cognitive processing, while tactical execution utilizes smaller, more efficient models that excel at specific implementation tasks. An even simpler model could handle the initial routing decision, determining whether a problem requires System 1 fast processing or System 2 deliberative analysis.
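
A sketch of that initial routing decision, with `classify`, `fast_model`, and `slow_model` standing in for whichever models you deploy:

```python
def route(classify, fast_model, slow_model, request: str) -> str:
    """Send routine requests to a cheap model, novel ones to a deliberative one."""
    verdict = classify(
        "Answer ROUTINE or NOVEL: does this request fit a well-known template, "
        f"or does it need fresh methodological analysis?\n\n{request}"
    )
    model = fast_model if "ROUTINE" in verdict.upper() else slow_model
    return model(request)
```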

The slow, effortful thinking that distinguishes exceptional performance emerges from the hard-coded meta-processing scaffolding that guides the system through distinct stages. Rather than jumping directly to solutions, the architecture enforces a deliberate sequence: plan to develop the plan, plan to evaluate the plan, plan to evaluate execution, check those meta-plans against risk criteria, then distill the actual implementation plan. This staged approach mirrors how top performers naturally decompose complex problems and prevents the system from defaulting to autocompleted best practices.
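
One blunt way to enforce that sequence, assuming a single `complete` client, is a hard-coded stage list the system cannot skip:

```python
STAGES = [
    "Plan how you will develop the plan.",
    "Plan how you will evaluate that plan.",
    "Plan how you will evaluate execution against the plan.",
    "Check the meta-plans above against the stated risk criteria.",
    "Now distill the actual implementation plan.",
]

def deliberate(complete, problem: str, risks: str) -> str:
    """Run the fixed stage sequence; every stage sees all prior output."""
    context = f"Problem: {problem}\nRisk criteria: {risks}"
    for stage in STAGES:
        context += f"\n\nInstruction: {stage}\nResponse: "
        context += complete(context)
    return context
```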

This ensemble approach creates a best-of-all-worlds scenario. Expensive computational resources focus on the most challenging aspects of problem-solving: determining what type of problem you're facing, selecting appropriate methodologies, and developing strategic plans. Less expensive models handle the actual implementation work within established frameworks. In failure modes, or when problems prove more complex than initially assessed, the system can escalate to more sophisticated models for additional analysis. From an implementation perspective, the architecture presents a single API interface while orchestrating multiple specialized models behind the scenes. This represents a fundamental shift from treating AI as a single model to treating it as an ensemble of models with explicit switching logic and meta-cognitive processing capabilities.
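
The escalation path itself can be as simple as a post-hoc check; `ok` here is an assumed validation predicate, not part of any existing framework:

```python
def solve_with_escalation(fast_model, slow_model, request: str, ok) -> str:
    """Try the cheap execution path first; escalate only when validation fails."""
    draft = fast_model(request)
    return draft if ok(draft) else slow_model(request)
```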

These layers communicate through well-defined APIs that allow for integration with existing organizational systems and workflows. This includes not only technical integrations with project management tools and document repositories but also process integrations that allow human stakeholders to provide oversight and course corrections at appropriate points in the agent's operation.

This approach makes context engineering an explicit architectural concern, requiring the system to understand not just what frameworks to apply, but how organizational context shapes their implementation. The development of these architectural patterns will require collaboration between AI researchers, software engineers, and domain experts from various industries. Just as web standards required input from multiple stakeholders with different priorities and constraints, creating effective meta-processing frameworks will require balancing technical capabilities with practical deployment requirements.

What You Should Do Now

The transition to cognitive scaffolding for AI agents represents both an opportunity and a competitive necessity. Organizations that continue to rely on black-box AI deployment will find themselves increasingly disadvantaged relative to those that implement structured meta-processing approaches.

For technology leaders, the immediate priority should be evaluating current AI deployments for signs of misalignment with organizational frameworks and decision-making processes. Look for instances where AI recommendations, while technically sound, conflict with established practices or cultural norms. These misalignments indicate opportunities for implementing meta-processing layers that could significantly improve AI reliability and organizational acceptance.

Technology leaders in the AI space specifically should focus on developing shared protocols and mental models for agent meta-processing. The rapid adoption of Model Context Protocol (MCP) demonstrates how standardized interfaces can accelerate ecosystem development once we establish common language and tools. Similar standardization is needed for meta-processing layers: specifications for framework selection interfaces, protocols for multi-model orchestration, and foundational runtime libraries that enable experimentation while maintaining interoperability. The key questions include whether meta-processing layers should be represented as formal specifications, standardized runtime libraries, communication protocols, or conceptual frameworks. Given the momentum in the MCP space for tool usage, establishing similar standards for cognitive scaffolding could drive rapid adoption across the AI development community while balancing shared interfaces with flexibility for independent innovation and experimentation.
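
To make the open question concrete, here is one speculative shape for a framework-selection interface, loosely in the spirit of how MCP standardizes tool definitions; none of these field names belong to any existing standard:

```python
from typing import TypedDict

class FrameworkRequest(TypedDict):
    problem: str       # the problem statement from the orchestration layer
    goal: str          # the clarified objective
    org_context: dict  # encoded organizational preferences and constraints

class FrameworkResponse(TypedDict):
    framework: str     # the chosen methodology
    steps: list[str]   # ordered stages, each with expected inputs and outputs
    rationale: str     # auditable reasoning behind the choice
```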

Engineering teams should begin experimenting with multi-layered agent architectures that separate framework selection from execution. Start with simple implementations that explicitly prompt for methodology selection before proceeding with problem-solving. Even basic versions of this approach can provide valuable insights into how framework scaffolding improves AI agent performance in organizational contexts.
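
The simplest version of that experiment adds a single extra model call before the real work, again with `complete` as a placeholder; even this two-call pattern makes the methodology choice explicit and auditable:

```python
def solve_with_framework(complete, problem: str) -> str:
    """Two calls instead of one: select a methodology first, then follow it."""
    method = complete(
        "Do not solve anything yet. Name the best methodology for this "
        f"problem and list its steps:\n{problem}"
    )
    return complete(
        f"Solve the problem strictly following this methodology:\n{method}"
        f"\n\nProblem: {problem}"
    )
```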

Business leaders should invest in documenting their organization's decision-making frameworks and cultural operating systems. The meta-processing approach works best when organizational preferences and constraints are explicitly articulated rather than remaining as implicit cultural knowledge. This documentation serves dual purposes: improving AI agent alignment and creating valuable organizational knowledge assets.

Research communities should prioritize developing standards and evaluation methodologies for multi-layered agent architectures. Traditional AI benchmarks focus on model capabilities in isolation, but evaluating agent architectures requires metrics that account for organizational alignment, process transparency, and long-term reliability.

The future belongs to organizations that can effectively integrate AI agents with human organizational cultures while amplifying cognitive capabilities. We stand at an inflection point where the AI industry must mature from building impressive demos to building reliable partners. Moving from cargo cult AI to contextual reasoning offers a path toward AI systems that complement and amplify human expertise. This shift from copying so-called "best practices" to thinking about thinking will define the next generation of AI deployment.