Designing AI to Stretch the Mind

In our innovation programs with students, we often began with a deceptively simple exercise: take two things that do not obviously belong together and force a connection.

At first, the combinations sound absurd and students look unsure. But as they start working together ideas start to make sense. An umbrella and a jump-rope becomes a “Jumbrella” — a water-skiing device where you can sit and relax while being towed by a motorboat.  The point is not to reward randomness for its own sake but to help students escape the first layer of obvious ideas and enter a more interesting space where new meaning has to be constructed.

A less random exercise uses association maps where you try to connect ideas that are 2-hops away from the core object that you are trying to improve. One student, using an association map, started with the idea of a glove. And as he drew the map, he reached “scissors” and a new idea emerged: a glove with a cutting blade attached, making it safer and easier for children who find scissors difficult to hold. In the process of bringing the two concepts together, he recognized a common human problem and found a useful solution. 

This is what creativity looks like. It is not simply “thinking outside the box.” More often, it is a disciplined way of making novel and meaningful connections.

And the same cognitive techniques that help students invent also help them learn.

Creativity As a Learning Engine

We often treat creativity as a detour from learning, but it is actually one of the most effective ways to learn traditional subjects as well. In another of our programs, students created their own numbering systems as part of an imaginary world-building exercise.

On the surface, this looked like imagination: invent a world, design its rules, create its language, build its symbols. But when students had to create a numbering system, they started doing serious mathematical thinking. A worksheet about place value can tell a student how a base system works. But inventing a base-5 or base-11 numbering system forces the student to confront the mechanics of place value at a deep, structural level. 

We usually treat learning and creativity as separate capacities. Learning is associated with acquiring knowledge, mastering facts, and performing correctly. Creativity is associated with novelty, imagination, and original production. But this separation is misleading. At a cognitive level, both require the same fundamental act of viewing the problem from many different perspectives, recognizing gaps in existing mental models and updating them. 

A learner encounters something new and must fit it into what they already know. Sometimes the new idea can be assimilated easily. Sometimes it does not fit, and the learner has to reorganize their understanding. A creative thinker does something similar. She takes existing ideas, experiences, concepts, and constraints and recombines them into a structure that did not exist before. In both cases, the mind is not passively receiving information but actively reconfiguring an internal model of the world.

And classroom evidence points in this direction. For example, in one study students learning statistics were asked to invent ways of comparing data sets before receiving direct instruction on standard measures. Their early solutions were often incomplete, but the act of invention prepared them to understand the formal ideas more deeply. Similarly, in science classrooms, students learn abstract concepts more effectively when they create analogies and examine them. A circuit can be compared to water flow, but the analogy must also be questioned. What is like the battery? What is like resistance? Where does the comparison mislead us? Learning  does not come from the analogy alone but also from the act of mapping, testing, and revising the analogy. A newer study found that goal-directed association can better explain the creativity-learning link. 

Creativity Techniques as Metacognitive Tools

Creative techniques like associative or analogical thinking, are not just ideation tools — they also act as metacognitive tools. Much of our thinking is invisible, even to ourselves. A student might say, “I don’t get it,” without knowing if the roadblock is vocabulary, structure, or a false assumption. However, when that same student maps associations or compares metaphors, their thinking becomes something they can inspect. They can see which connections are obvious, which are missing, which are forced, and which open a new path.

This gives students a toolkit for ambiguity. In school, problems are often presented with clear instructions, known methods, and expected answers. In the real world, the most important problems rarely arrive that way. They are open-ended, poorly structured, and full of incomplete information. The student who has practiced making thinking visible has an advantage. She knows how to begin when the path is unclear: generate associations, map the territory, compare frames, test analogies, revise assumptions.

The AI Challenge

Learning and metacognition are precisely what are at risk with artificial intelligence.

AI can be an extraordinary tool for learning. It can explain concepts, generate examples, translate language, summarize research, and provide feedback at a scale no human could manage alone. But it can also short-circuit the mechanisms through which learning and innovation occur. If students ask AI for the answer before they have formed their own associations, challenged their assumptions and wrestled with their own confusion, they may produce better work but atrophy their thinking skills in the process.

Humans have always used tools to reduce mental effort. We write notes so we do not have to remember everything. We use calculators so we do not have to perform every computation by hand. Offloading is not inherently bad. In fact, civilization depends on it. The question is what and how much we offload.

When we offload storage, we may free the mind for higher-order work. But when we repeatedly offload sense-making, judgment, and creative struggle, we risk weakening the very capacities that make us learn and create.

So, the real question is: How should AI be designed if the goal is not to replace thinking but to stretch it?

An AI tutor could give the answer immediately. Or it could ask the student to first generate three associations, choose the strangest one, and explain how it might connect. A writing assistant could rewrite a paragraph. Or it could offer competing metaphors and ask the student which one best fits the argument and why. 

These are not small design choices. They reflect two very different theories of learning. One treats the learner as a consumer of answers. The other treats the learner as a builder of models.

We often associate technology with a certain kind of dopamine loop: the ping, the scroll, the like, the instant answer. This kind of reward captures attention by hacking into our fears and insecurity. But there is another kind of reward that is underused: the reward of insight.

It’s the “aha” moment when a strange association suddenly makes sense. It is the pride and satisfaction of finding a clever solution to a real problem. That is the “better dopamine” to use. We should design systems that provoke association, analogy, reflection, and metacognition. That can lead to a more effective and beneficial partnership between humans and AI.

The Illusion of Rationale

In 1931, Norman Maier designed an elegantly simple experiment about human reasoning. Subjects entered a large room at the University of Chicago where two cords hung from the ceiling: one near the wall, the other from the center of the room. Their task was to tie the ends of the two cords together. 

The catch was that the cords were too far apart. If a subject held one cord, the other was out of reach. The room contained objects that could help: chairs, poles, clamps, pliers, extension cords, tables. Maier was not interested in whether people could find any solution. In fact, several solutions were available. A person could anchor one cord to a chair and bring the other over. They could lengthen one cord with an extension cord. They could pull one cord closer with a pole.

But Maier was especially interested in a fourth, less obvious solution: tie a weight to the cord in the center of the room, set it swinging like a pendulum, grab the other cord, and catch the swinging cord when it returned. This kind of solution requires a shift from viewing a cord not as a cord, but as a pendulum. These kinds of mental shifts are interesting because they often lead to more creative solutions. 

Once a subject found one solution, Maier simply said, “Now do it a different way.” The experiment continued until the person either discovered the pendulum solution or became stuck. If the subject worked for at least ten minutes and insisted there was no other way, Maier introduced what he called “helps.” The first help was subtle. The experimenter walked across the room, brushed the center cord, and set it slightly in motion. The subject was not told that this was a hint. If that failed, the subject was handed a pair of pliers and told there was another way to solve the problem using them.

The results fell into three groups. The first group discovered the pendulum solution without help. The third group failed to find it even after help was given. But the second group was the most revealing one as they solved the problem only after receiving Maier’s hints.

With this group Maier could compare what had objectively influenced the solution with what people consciously reported afterward. The hint had often worked. In fact, the solution appeared on average only 42 seconds after the effective help was given. Yet many subjects did not identify the swinging cord as the cause of their insight.

Instead, they produced explanations that sounded plausible. One said, “It just dawned on me.” Another thought that a course in physics may have suggested it. A psychology professor reported thinking of “monkeys swinging from trees.” These were not necessarily dishonest answers. They were stories constructed from what was available to consciousness.

Maier’s own conclusion was that when a solution finally appears, “the cue which sets it off is not consciously experienced.”

Maier’s study was about the hidden architecture of human judgment. We often know what we decided and we can usually offer a reason. But we may not know what moved the rope.

The Mind’s Coherent Narrative

Nearly half a century later, psychologists Richard Nisbett and Timothy Wilson gave this phenomenon one of its most memorable descriptions: people often “tell more than they can know.” Their argument was that we have access to some mental content: our feelings, beliefs, images, intentions, and fragments of thought. But we often do not have direct access to the cognitive processes that produce our judgments.

So when someone asks, “Why did you choose that?” the mind does something useful but often incorrect. It generates an explanation that sounds reasonable and may even contain part of the truth.

But it is not necessarily a faithful transcript of the decision process. It is more like a press briefing after a complex geopolitical event. A spokesperson stands at the podium and offers a coherent account. The account may be polished but is a simplified version of the more complex reality. 

The same happens inside our minds. We take one or two visible causes and elevate them into “the reason.” We say the strategy was selected because of market opportunity or that the candidate was hired because of their leadership presence. 

Sometimes those explanations are right. Often, they are incomplete. And occasionally, they are beautifully wrong.

Reasons That Sound Right

One of the most important patterns in this research is that people tend to explain decisions using reasons that are socially available. By “socially available,” I mean reasons that are easy to defend and likely to be accepted by the audience.

This matters because many real influences are hard to confess or hard to detect. A person rarely says, “I trusted him because he reminded me of myself.” Or, a team rarely says, “We preferred the familiar option because uncertainty made us anxious.” Instead, we reach for explanations that sound objective, competent, and culturally approved.

We emphasize one or two dimensions and ignore others that may have played a larger role.

This is not necessarily a moral failure but a cognitive one. In a choice blindness experiment, researchers showed participants two options, such as two faces, and asked them to choose which they preferred. Through a sleight of hand, the researchers sometimes gave participants the option they had rejected and asked them to explain why they had chosen it. Many participants did not notice the switch. Instead, they confidently explained why they preferred the face they had not actually selected.

The mind did not say, “Something is wrong here.” It said, “Let me explain.”

This also helps explain why our reasoning so often has a persuasive quality. Hugo Mercier and Dan Sperber have argued that human reasoning may have evolved less as a truth-finding machine and more as a social tool for argumentation. Reasoning helps us justify ourselves, challenge others, and coordinate within groups. In other words, reasoning evolved to improve communication and coordination. But internally, the process of inference is very different from what the external argument we make. 

The AI Mirror

This brings us to artificial intelligence.

When a large language model gives an answer and then explains its reasoning, we are tempted to treat the explanation as a window into how the answer was produced. This temptation grows stronger when the reasoning is step-by-step, articulate, and professionally formatted. It feels like the model is transparent and showing its work.

But recent research on chain-of-thought reasoning suggests that this confidence can be misplaced. Models can produce explanations that are plausible but unfaithful. In one experiment, researchers introduced hidden biases into prompts, such as making a particular answer option more likely. The model’s final answer changed, but its explanation often failed to mention the true influence. Instead, it rationalized the answer after the fact.

This is the AI version of Maier’s rope.

Why does this happen? AI has been trained on vast amounts of human language, and human language is full of post-hoc justification. The model learns what explanations sound like. It learns which reasons tend to accompany which conclusions. It learns the socially available vocabulary of justification like efficiency, fairness, or customer value. 

In that sense, AI mirrors one of our oldest habits. When the real causal path is inaccessible or difficult to articulate, produce a reason that is coherent, acceptable, and close enough to the surface.

Navigating Reasoning Hallucinations

Both humans and AI can produce narratives that are coherent without being causally faithful. Both can overemphasize one or two salient dimensions while ignoring other (and potentially larger) variables. Both can justify a conclusion using reasons that are more available than accurate. 

This creates a new kind of risk. We may begin outsourcing not only analysis to AI, but also justification. A model recommends a course of action, summarizes the rationale, and the rationale sounds reasonable. The explanation appears to increase transparency but if the explanation is unfaithful, it may instead increase misplaced confidence.

The answer is not to reject human intuition or AI reasoning. The answer is to treat explanations differently.

An explanation should not be seen as the end of inquiry. It should be treated as a hypothesis about causality. When a person says, “I chose this because of X,” we should hear, “X is the reason currently available to me.” When an AI says, “The answer is Y because of these steps,” we should hear, “Here is a plausible reconstruction that may or may not reflect the actual basis of the output.”

For humans, that means comparing stated reasons with behavior, context, incentives, and patterns over time. For AI, it means testing explanations against interventions. Does the answer change when irrelevant details change? Does it remain consistent when the same question is reframed? 

The great irony of our time is that in building intelligent machines, we have inadvertently created a perfect explanatory mirror of ourselves. We built systems that generate language, and language is where human reasoning most often performs its magic trick. We tell the story we can live with, and that story serves a vital purpose: it allows for coordination, communication, and forward momentum in communities. But we also need to accept the limits of that story. Progress depends on knowing when the language that comforts us and allows us to coordinate is not the same as the underlying, unarticulated process that actually moved the rope.

AI, Layoffs and the Innovation Tax

The promise of AI is rapidly becoming a workforce question.

In a memo last year, Amazon CEO Andy Jassy told employees that as generative AI usage spreads through the company, “we will need fewer people doing some of the jobs that are being done today, and more people doing other types of jobs.” He went further: “in the next few years, we expect that this will reduce our total corporate workforce as we get efficiency gains from using AI extensively across the company.” 

Duolingo CEO Luis von Ahn’s “AI-first” memo said that they would “gradually stop using contractors to do work that AI can handle,” and that “headcount will only be given if a team cannot automate more of their work.” 

IBM CEO Arvind Krishna offered perhaps the clearest version of the attrition model. Speaking about back-office functions that could be automated, he said, “I could easily see 30% of that getting replaced by AI and automation over a five-year period.” 

Taken together, these statements reveal a new leadership strategy: automate and reduce headcount as AI expands, or in some cases, in anticipation of higher productivity. It is an understandable response to a technology that can write code, resolve customer tickets, draft marketing copy, summarize meetings, and generate analysis at extraordinary speed. 

But there is a danger in this story. It assumes that innovation is produced by tasks. In reality, innovation is produced by people working together in messier ways that defy easy automation – experimenting, discussing, dissenting, and learning through customer interactions. When companies cut too broadly, they may inadvertently remove the conditions that will allow AI-era innovation to compound.

The tricky question for leaders is whether they can distinguish between work that is truly automatable and work that looks inefficient only because its value is hard to capture in a balance sheet.

Impact of Downsizing on Innovation

The danger is not that every layoff is bad. It is that broad layoffs, especially when framed as an AI-enabled operating upgrade, can confuse labor reduction with organizational learning. A company can become leaner and less capable at the same time.

One of the earliest systematic warnings came from Teresa Amabile and Regina Conti’s 1999 study. They examined a large high-technology firm before, during, and after a major downsizing. Their central finding, apart from a worse morale, was that the conditions that support creativity deteriorated. Creativity, productivity, and perceived work-environment support declined during downsizing, while obstacles to creative work increased. 

That finding should land heavily in today’s technology sector. Innovation is rarely born from an isolated genius typing faster with better tools. It is more often a social process: someone notices an anomaly, someone else connects it to an unmet customer need, a third person remembers a failed experiment from two years earlier, and a fourth turns the conversation into a prototype. Remove enough nodes from that system and the org chart may still look coherent, but the creative network underneath it becomes brittle.

Newer research adds an important nuance. Ramdani and colleagues studied 122 UK firms over 22 years, using downsizing and patent data to examine how workforce reduction affects innovation outputs. Their conclusion is not “layoffs always reduce innovation.” It is more precise: downsizing has a dual effect depending on the firm’s resource position. In firms with resource slack, downsizing can have a positive effect on innovation. In resource-constrained firms, it has a negative effect, and the damage appears more quickly. 

This matters because many tech companies do not experience resource slack uniformly. One division may have layers of coordination, duplicated tooling, and unclear ownership. Another may have exhausted engineers, customer-support escalation queues, and neglected technical debt. From 30,000 feet, both may look like “headcount.” But only one will lead to higher innovation on downsizing.

The key question for leaders is: are you cutting fat, or are you cutting connective tissue? Because downsizing in a resource-constrained organization can significantly hurt innovation down the road. 

Psychological Effects of Layoffs

Why do layoffs damage innovation when there is little slack? The answer lies in psychology as much as economics.

The first mechanism is psychological safety collapse. Innovation requires people to take interpersonal risks: challenge assumptions, admit uncertainty, surface bad news, and suggest ideas that may initially sound naïve. Amy Edmondson and Derrick Bransby’s 2023 review of psychological safety research describes its importance for learning behavior, performance, and work under uncertainty. A meta-analysis of 94 studies also found psychological safety positively associated with both employee innovation behavior and team innovation behavior. After broad layoffs, however, the unwritten rule often becomes: do not look expendable. In that climate, people do not stop having ideas; they stop volunteering them.

The second mechanism is survivor syndrome and identity rupture. Research on downsizing survivors shows that employees who remain often experience reduced commitment and performance, a pattern commonly described as survivor syndrome. Van Dick and colleagues found that downsizing can reduce employees’ identification with the organization, which then harms survivor performance. The innovation consequence is direct. The person who no longer identifies with the company may still complete assigned tasks. But will they fight for an unproven customer insight? Will they spend political capital defending a long-term bet? Will they mentor the junior colleague who may one day become a breakthrough inventor? Often, the answer is no.

The third mechanism is job insecurity narrowing attention. When employees fear future cuts, their time horizon contracts. They focus on visible output, defensible metrics, and work that protects their standing. Niesen and colleagues’ research on job insecurity and innovative work behavior notes the paradox that organizations often expect restructuring to enhance innovation, even as insecurity can undermine the behaviors innovation requires. This is the problem with fear-based efficiency: it may increase activity while decreasing imagination. 

There is also a network mechanism. Corporate knowledge is not stored only in documents or AI retrieval systems. It lives in relationships: who knows which customer exception matters, why a particular architecture decision was made, or which workaround keeps a product alive. Broad layoffs sever these ties indiscriminately. AI cannot easily reconstruct what the organization failed to write down.

A 2024 HBR article based on a study of 146 companies found that engagement, morale, and loyalty can take years, not months, to rebound after layoffs. The authors cite Pixar director Brad Bird’s memorable observation: “If you have low morale, for every $1 you spend, you get about 25 cents of value. If you have high morale, for every $1 you spend, you get about $3 of value.” Whether or not one takes the math literally, the leadership lesson is still valid: the same dollar value produces radically different returns depending on the emotional state of the system. In a frightened organization, talent becomes defensive. In a committed organization, talent becomes generative. That difference will determine whether AI becomes merely a substitute for human contribution or a catalyst for the next wave of innovation.

The Way Forward

So, how can leaders figure out if layoffs are the right move? It starts with three key diagnostic questions.

First, where is work waiting? Slack isn’t always visible in just headcount ratios. Look for queues: unresolved customer issues, delayed experiments, or rising technical debt. If critical work is already backing up, you’re resource-constrained and cuts will only make things worse.

Second, where has learning slowed? Beyond simple experimentation metrics, track behavioral indicators. Are people trying fewer new things? Are you seeing fewer dissenting views? A drop in any of these suggests learning is stalling.

Third, where is human judgment still doing hidden work? AI can automate routine tasks, but leaders must map the judgment layer. This critical layer handles edge cases, ethical tradeoffs, and customer reassurance. Removing people who interpret these ambiguous situations could make the process faster, but the organization will get less intelligent.

Klarna learned their lesson the hard way. They cut about 700 employees citing the productivity of their AI assistant, but later had to reverse course when they had to reassign engineers and marketers back to customer-support roles when AI didn’t hit the mark. 

However, the deeper challenge for leadership is a philosophical one. The industrial-age instinct was simple: reduce labor when a machine increased productivity. That made sense when work was predictable. But in the innovation age, the question is entirely different: when a new tool boosts productivity, where do you redeploy the freed human capacity to gain a learning advantage?

That is the fork in the road. Companies using AI mainly for headcount reduction might win a margin story but lose their innovation future. Conversely, companies that reinvest human attention into discovery, customer understanding, and experimentation will compound their advantage.

AI is Straining the Leadership Model That Built Most Companies

When Jos de Blok looked at Dutch home care, he saw a management model that had become part of the problem.

Home care nursing is not a tidy production process. Every patient brings a different mix of medical needs, family dynamics, living conditions, emotional realities, and sudden changes. Small signals matter and context changes fast. The people closest to the patient often hold crucial tacit knowledge that cannot be captured fully in a procedure manual or escalated up a chain of command in time to matter. 

In Cynefin terms, this is a complex environment: there are too many interdependent variables that make it impossible to create an efficient centralized process.  But the system that de Blok had known as a nurse and later as a leader was built as if home care were merely complicated. It leaned on specialization, managerial oversight, and layers of coordination designed to create control. But for this complex problem, those layers became part of the problem.

Harvard Business School’s account of Buurtzorg notes that de Blok had seen “counterproductive layers of management” undermine care quality and frontline discretion. So he made a radical wager: if the work itself was complex, the answer was not more hierarchy. It was a different theory of leadership. Buurtzorg organized care around small self-managing neighborhood teams, with minimal middle management and a lean support structure. The center stopped trying to out-think the edges and started enabling them. 

That story matters far beyond healthcare. It captures the mistake many organizations now risk making with AI: applying a top-down management model to challenges that are, at least in part, complex.

AI Is Not One Leadership Problem. It Is Two.

Most executive conversations about AI still assume a single challenge: implementation. Buy the tools, train the workforce, hire the experts, and move fast. But AI is creating at least two very different leadership problems.

Some AI problems are complicated. They require expertise, analysis, and disciplined systems. Think data architecture, cybersecurity, privacy, model evaluation, legal compliance, workflow redesign, and technical governance. These are not simple issues, but they are tractable. The right response is rigorous diagnosis, strong standards, and clear accountability. In Cynefin terms, leaders in this domain must sense, analyze, and respond. 

Other AI problems are complex. How will customers behave when AI becomes embedded in products and services? Which use cases will create durable value rather than just attention? How should judgment be divided between humans and machines? What happens to culture when some employees trust AI deeply, others distrust it, and many use it informally out of management’s sight? Those are not problems that yield to a leadership memo. They require leaders to probe, sense, and respond. 

This distinction sounds abstract until you see its consequences. If leaders treat a complicated problem as complex, they can drift into improvisation where rigor is required. But if they treat a complex problem as complicated, they over-centralize, over-standardize, and under-learn. That second mistake may be the defining leadership failure of the AI era.

The Shift From Answer-Giver to Context-Setter

For decades, many leaders rose by being decisive, analytical, and visibly in control. Those traits still matter. But in complex conditions, they are not enough. The leader who insists on having the answer too early can shut down the very learning the organization most needs.

This is where Buurtzorg offers such a powerful lesson. De Blok did not just become a more empathetic leader. He changed his model of what leadership is for. In a complex system, the leader’s job is to create the conditions in which good judgment can emerge throughout the system. That requires adopting a different mindset about authority. 

In the complicated parts of AI, leaders should tighten standards, elevate expertise, and demand rigor. In the complex parts, they should widen participation, encourage small experiments, protect dissent, and reward learning. The critical leadership skill is knowing when to switch.

Why Swarm Intelligence Matters More Than Executive Certainty

Business leaders often talk about “empowering employees,” but complex problems demand something more precise: they demand systems that let intelligence emerge from many places.

Research by Anita Woolley and colleagues found evidence for a general collective intelligence factor in groups. Strikingly, group performance was not tied to the highest individual intelligence in the room. It was more closely associated with social sensitivity and with more equal conversational turn-taking. In practical terms, groups get smarter when more people can meaningfully contribute and when interaction patterns allow insight to surface, not just status to dominate. 

That should provoke an uncomfortable question for senior leaders: what if your organization is full of intelligence that your culture cannot hear?

In complex AI environments, breakthrough insights often begin at the edges. A sales manager notices where customers actually trust the tool. A service employee spots a subtle failure mode. A product designer sees that the real opportunity is not automating the old workflow, but redesigning it entirely. A junior analyst challenges the executive team’s favorite use case and turns out to be right. In a complex environment, these become the raw material of strategy.

The organizations that learn fastest from AI will not be those with the most polished top-down vision. They will be those with the richest lateral sensing mechanisms: more experimentation, more challenge, more idea collisions, and more pathways for weak signals to travel upward and sideways.

Culture Is Your Operating Infrastructure.

That is why culture cannot be treated as a side topic in AI transformation. Culture determines how well an organization learns.

Amy Edmondson’s research on psychological safety showed that teams learn more effectively when people believe the environment is safe for interpersonal risk-taking. In sage cultures, people speak up more and admit mistakes sooner. They raise concerns before problems metastasize. Psychological safety is associated with learning behavior because it lowers the social cost of candor. 

Why does that matter in AI? Because AI adoption is full of ambiguity. Employees are constantly making judgment calls: when to trust the tool, when to override it, when to disclose its use, when to question the workflow, and when to challenge leadership’s assumptions. In a fearful culture, they will hide uncertainty, perform confidence, and quietly work around the system. In a learning culture, they will surface anomalies, share experiments, and improve the system in public.

Many organizations say they want innovation, but their incentives still reward obedience. They say they want initiative, but punish failed experiments. They say they want challenge, but subtly penalize people who question senior leaders. Instead of an innovation culture, it leads to a  compliance culture.

Buurtzorg worked because the shift was structural, not rhetorical. Frontline teams did not merely get permission to speak up. They got real discretion. The system was redesigned around the reality that those closest to the patient were best positioned to respond to complexity. 

What Leadership Looks Like in the AI Era

So what should leaders actually do?

First, diagnose the domain. Ask: is this AI challenge primarily complicated, complex, or a blend of both? That question should come before the org chart, the governance model, or the training plan. 

Second, match the leadership response to the problem. In complicated domains, clarify ownership, concentrate expertise, and build strong review mechanisms. In complex domains, run more small experiments, widen participation, shorten feedback loops, and let the people closest to the work challenge assumptions early.

Third, redesign incentives around learning. You cannot build collective intelligence in a culture where dissent is risky and failure is career-limiting. If leaders want employees to behave like owners, the system must make it safe to notice, question, and improve.

Finally, rethink the role of middle management. In too many organizations, middle layers still function mainly as transmission belts for approval and control. But in a complex environment, the best middle managers help signals travel. They turn the organization into a smarter sensing system rather than a slower permission system.

The Leadership Advantage That Will Matter Most

The AI era will reward many familiar strengths: technical fluency, strategic clarity, disciplined execution. But over time, the most valuable advantage may be more subtle.

It will belong to leaders who can tell when expertise should dominate and when emergence should. Leaders who know when to act like engineers and when to act like gardeners. Leaders who understand that hierarchy is still useful, but not universally wise. Leaders who stop asking, “How do I get the organization to execute my answer?” and start asking, “How do I build an organization capable of discovering better answers than I could alone?”

That is the deeper lesson of Buurtzorg. Jos de Blok did not save a struggling system by becoming a more forceful commander. He succeeded because he recognized that in a complex human system, the smartest move is to increase the system’s capacity to learn. 

AI now puts that same choice in front of every executive team. Some problems will still require experts, precision, and control. But many of the most consequential ones will require humility, experimentation, and trust in intelligence distributed throughout the organization. The companies that thrive will not just deploy better tools. They will build cultures where insight can rise from anywhere, where leadership adapts to the problem at hand, and where the search for the right answer matters more than protecting the illusion that it already lives at the top.

Why Education Needs an “AI-in-the-Loop” Model

Twelve year old Leo sat in his room, staring at his history assignment on the Industrial Revolution. Usually such an assignment would take several hours of reading relevant material, picking his thesis, finding supporting evidence to back his claims and then drafting his essay. But today, he simply fed a prompt into ChatGPT, made some quick and simple revisions, and hit submit. 

On paper he looked like a star student. His essay even got him an ‘A’ from his teacher. But did any real learning take place? 

This is the crisis we face in education. We are currently obsessed with the “Human-in-the-Loop” model, where humans oversee AI outputs. But in a classroom, that model is backwards. If we want to raise a generation of innovators rather than mere prompt engineers, we have to flip the script to an “AI-in-the-Loop” approach that prioritizes the student’s cognitive effort before the algorithm ever enters the chat.

Renzulli’s Three Ring Conception of Giftedness

To understand why current AI integration is risky, we have to look at what actually creates high-level human performance. One of the most respected frameworks in educational psychology is Joseph Renzulli’s Three-Ring Conception of Giftedness.

Renzulli argued that “giftedness” (or what we might call high-level creative productivity) lies at the intersection of three distinct clusters of human traits:

  1. Above-Average Ability: The core competency and foundational knowledge in a specific domain.
  2. Creativity: The ability to generate original ideas, see new patterns, and think divergently.
  3. Task Commitment: The grit, perseverance, and “productive struggle” required to see a difficult project through to the end.

When these three rings overlap, giftedness emerges. However, the current “plug-and-play” integration of AI in schools threatens to thin out every one of these rings, potentially preventing students from achieving high accomplishments.

The Risk to Ability

The most immediate danger of AI is cognitive offloading—the tendency to use external tools to reduce the mental effort required for a task. While offloading is great for mundane chores (like GPS for driving), it is quite harmful for learning.

We already know that easy access to external information changes what people remember and how they allocate mental effort. The classic “Google effect” research showed that when people expect information to be accessible later, they are less likely to remember the information itself and more likely to remember where to find it. That isn’t necessarily bad. Humans have always used tools and transactive memory, but in schooling, ability is built through repeated retrieval, reconstruction, and sense-making.

Research has already begun to show the impact of cognitive offloading. In a large field experiment with high school math students, researchers found that a “ChatGPT-like” tutor improved performance during practice, but when the AI support was removed, those students performed worse than students who learned without the AI tool, an effect consistent with dependence and reduced durable learning.

Without guardrails, AI can act as a “cognitive crutch.” By providing hints and solving intermediate steps, it removes “desirable difficulty” or the kind of struggle that leads to deeper learning and long-term retention. If the AI does the heavy lifting, the students learn less and don’t reach higher levels of competence required for high achievements. 

The Risk To Creativity

Creativity is not just producing something. It’s producing something both novel and appropriate.

AI is remarkably good at the obvious. Large language models generate outputs that reflect patterns in their training data. That makes them useful, fluent, and fast but it also means they can pull learners toward the statistical center of what’s been said before.

A recent experiment on creative writing found that access to AI ideas helped individuals produce stories judged as more creative (especially among less creative writers) but it also made the stories more similar to one another, reducing collective novelty. In other words, AI can raise the floor while lowering the ceiling of diversity.

Separate work has begun to quantify this “echo” effect more directly, showing measurable limits on plot diversity in LLM outputs under the same prompts. And broader research reviewing LLM creativity suggests that while models can appear creative through recombination, there are persistent questions about originality, intent, and the difference between pattern completion and human creative agency. 

When students brainstorm with AI first, they often anchor on the suggestions they see and fall into an “associative rut.” If they took the time to think by themselves before reaching out to AI, they would discover more original and personally meaningful ideas. 

And we’ve seen a similar phenomenon long before AI. Research on brainstorming has shown that nominal brainstorming (individual idea generation before group discussion) produces more ideas and more original ideas than purely interactive brainstorming. In the AI context, the LLM acts like a “dominant personality” in a group meeting. It speaks first, speaks confidently, and sets a baseline. Once a student sees those AI suggestions, their brain finds it incredibly difficult to think outside those parameters.

The Risk to Commitment

Task commitment includes persistence, delayed gratification, self-regulation, and the willingness to stay with ambiguity.

Modern technology has already been impacting this ring. Research shows that higher use of digital devices is tied to concentration difficulties, lower academic performance, and poorer self-regulation. 

Now layer generative AI on top. In a world where a chatbot can produce instant essays and workable code, the emotional “cost” of effort feels high. Why wrestle with the challenge when the answer is right at the fingertips?

Emerging education research is also starting to map how generative AI intersects with self-regulated learning—highlighting both risks (over-reliance, reduced monitoring) and opportunities (scaffolds for planning, reflection, and feedback) depending on design and pedagogy. And survey-based findings have reported associations between ChatGPT use and procrastination or lower performance in some student samples, suggesting that without strong norms and supports, AI can drift from scaffold to shortcut. 

Then there’s an additional twist: changing expectations. Once AI is available, teachers and workplaces may (implicitly or explicitly) expect faster output. But speed is not the same as depth. Many breakthroughs require a long dwell time. If we compress the timeline before students have built the inner muscles of persistence, we don’t get high performers.

The Way Out

In many AI discussions, “human in the loop” sounds reassuring: the human checks the AI’s work. But in education, that framing can be backward. It puts students in the role of evaluator rather than constructor, as if learning were mainly about spotting mistakes in someone else’s thinking.

Decades of learning science tell us that durable learning is constructive and interactive. The ICAP framework, for example, shows that learning activities that are Interactive and Constructive generally outperform merely Active or Passive engagement. Students learn more when they generate, explain, debate, and build meaning, rather than just consume or lightly manipulate information.

In education, we need a “Learning-First” model that prioritizes human cognition before algorithmic assistance. A sample 5-stage framework could look like this:

Phase 1 (Individual): students write an initial thesis, solution path, or set of ideas before using AI. This protects original cognition and forces retrieval, sense-making, and ownership.

Phase 2 (Group): students critique and build together. This is where misconceptions surface and learning becomes social where students learn from each other.

Phase 3 (AI): only then does AI enter as a gap-finder, alternative perspective generator, or a Socratic questioner. It reveals elements that students might have missed and stretches their thinking.

Phase 4 (Group): the group revises their solution based on the feedback from AI, synthesizing aspects that are reasonable and rejecting those that don’t fit well.

Phase 5 (Individual): individuals reconstruct the argument/solution in their own words because self-explanation is a reliable way to accelerate understanding.

This scaffolded approach protects each ring of Renzulli’s model. Ability is built through retrieval and reconstruction. Creativity is protected through first-thought originality and peer divergence. Task commitment is strengthened through social support and reflection.

Conclusion

Education has a choice to make. Without creating the right guardrails on how to use AI, we risk teaching the  models instead of students. 

The danger of a default “human-in-the-loop” stance in classrooms is that it casts students as editors of machine work. But learning isn’t editorial. Students must build mental models, connect ideas, and develop the internal fluency that only comes from doing the cognitive work themselves.

So the guiding question for AI in education should be, “Which phase of learning does this tool strengthen and which phase might it accidentally replace?”

If we stay student-first and learning-first, we’ll use AI the way every great teacher uses support: not to remove the mountain, but to help students become the kind of climbers who can scale it, long after the tool is gone.