AI is Straining the Leadership Model That Built Most Companies

When Jos de Blok looked at Dutch home care, he saw a management model that had become part of the problem.

Home care nursing is not a tidy production process. Every patient brings a different mix of medical needs, family dynamics, living conditions, emotional realities, and sudden changes. Small signals matter and context changes fast. The people closest to the patient often hold crucial tacit knowledge that cannot be captured fully in a procedure manual or escalated up a chain of command in time to matter. 

In Cynefin terms, this is a complex environment: too many interdependent variables are in play for any centralized process to be efficient. But the system that de Blok had known as a nurse and later as a leader was built as if home care were merely complicated. It leaned on specialization, managerial oversight, and layers of coordination designed to create control. For work this complex, those layers got in the way of care instead of improving it.

Harvard Business School’s account of Buurtzorg notes that de Blok had seen “counterproductive layers of management” undermine care quality and frontline discretion. So he made a radical wager: if the work itself was complex, the answer was not more hierarchy. It was a different theory of leadership. Buurtzorg organized care around small self-managing neighborhood teams, with minimal middle management and a lean support structure. The center stopped trying to out-think the edges and started enabling them. 

That story matters far beyond healthcare. It captures the mistake many organizations now risk making with AI: applying a top-down management model to challenges that are, at least in part, complex.

AI Is Not One Leadership Problem. It Is Two.

Most executive conversations about AI still assume a single challenge: implementation. Buy the tools, train the workforce, hire the experts, and move fast. But AI is creating at least two very different leadership problems.

Some AI problems are complicated. They require expertise, analysis, and disciplined systems. Think data architecture, cybersecurity, privacy, model evaluation, legal compliance, workflow redesign, and technical governance. These are not simple issues, but they are tractable. The right response is rigorous diagnosis, strong standards, and clear accountability. In Cynefin terms, leaders in this domain must sense, analyze, and respond. 

Other AI problems are complex. How will customers behave when AI becomes embedded in products and services? Which use cases will create durable value rather than just attention? How should judgment be divided between humans and machines? What happens to culture when some employees trust AI deeply, others distrust it, and many use it informally out of management’s sight? Those are not problems that yield to a leadership memo. They require leaders to probe, sense, and respond. 

This distinction sounds abstract until you see its consequences. If leaders treat a complicated problem as complex, they can drift into improvisation where rigor is required. But if they treat a complex problem as complicated, they over-centralize, over-standardize, and under-learn. That second mistake may be the defining leadership failure of the AI era.

The Shift From Answer-Giver to Context-Setter

For decades, many leaders rose by being decisive, analytical, and visibly in control. Those traits still matter. But in complex conditions, they are not enough. The leader who insists on having the answer too early can shut down the very learning the organization most needs.

This is where Buurtzorg offers such a powerful lesson. De Blok did not just become a more empathetic leader. He changed his model of what leadership is for. In a complex system, the leader’s job is to create the conditions in which good judgment can emerge throughout the system. That requires adopting a different mindset about authority. 

In the complicated parts of AI, leaders should tighten standards, elevate expertise, and demand rigor. In the complex parts, they should widen participation, encourage small experiments, protect dissent, and reward learning. The critical leadership skill is knowing when to switch.

Why Swarm Intelligence Matters More Than Executive Certainty

Business leaders often talk about “empowering employees,” but complex problems demand something more precise: they demand systems that let intelligence emerge from many places.

Research by Anita Woolley and colleagues found evidence for a general collective intelligence factor in groups. Strikingly, group performance was not tied to the highest individual intelligence in the room. It was more closely associated with social sensitivity and with more equal conversational turn-taking. In practical terms, groups get smarter when more people can meaningfully contribute and when interaction patterns allow insight to surface, not just status to dominate. 

That should provoke an uncomfortable question for senior leaders: what if your organization is full of intelligence that your culture cannot hear?

In complex AI environments, breakthrough insights often begin at the edges. A sales manager notices where customers actually trust the tool. A service employee spots a subtle failure mode. A product designer sees that the real opportunity is not automating the old workflow, but redesigning it entirely. A junior analyst challenges the executive team’s favorite use case and turns out to be right. In a complex environment, these become the raw material of strategy.

The organizations that learn fastest from AI will not be those with the most polished top-down vision. They will be those with the richest lateral sensing mechanisms: more experimentation, more challenge, more idea collisions, and more pathways for weak signals to travel upward and sideways.

Culture Is Your Operating Infrastructure

That is why culture cannot be treated as a side topic in AI transformation. Culture determines how well an organization learns.

Amy Edmondson’s research on psychological safety showed that teams learn more effectively when people believe the environment is safe for interpersonal risk-taking. In safe cultures, people speak up more and admit mistakes sooner. They raise concerns before problems metastasize. Psychological safety is associated with learning behavior because it lowers the social cost of candor.

Why does that matter in AI? Because AI adoption is full of ambiguity. Employees are constantly making judgment calls: when to trust the tool, when to override it, when to disclose its use, when to question the workflow, and when to challenge leadership’s assumptions. In a fearful culture, they will hide uncertainty, perform confidence, and quietly work around the system. In a learning culture, they will surface anomalies, share experiments, and improve the system in public.

Many organizations say they want innovation, but their incentives still reward obedience. They say they want initiative, but punish failed experiments. They say they want challenge, but subtly penalize people who question senior leaders. The result is not an innovation culture but a compliance culture.

Buurtzorg worked because the shift was structural, not rhetorical. Frontline teams did not merely get permission to speak up. They got real discretion. The system was redesigned around the reality that those closest to the patient were best positioned to respond to complexity. 

What Leadership Looks Like in the AI Era

So what should leaders actually do?

First, diagnose the domain. Ask: is this AI challenge primarily complicated, complex, or a blend of both? That question should come before the org chart, the governance model, or the training plan. 

Second, match the leadership response to the problem. In complicated domains, clarify ownership, concentrate expertise, and build strong review mechanisms. In complex domains, run more small experiments, widen participation, shorten feedback loops, and let the people closest to the work challenge assumptions early.

Third, redesign incentives around learning. You cannot build collective intelligence in a culture where dissent is risky and failure is career-limiting. If leaders want employees to behave like owners, the system must make it safe to notice, question, and improve.

Finally, rethink the role of middle management. In too many organizations, middle layers still function mainly as transmission belts for approval and control. But in a complex environment, the best middle managers help signals travel. They turn the organization into a smarter sensing system rather than a slower permission system.

The Leadership Advantage That Will Matter Most

The AI era will reward many familiar strengths: technical fluency, strategic clarity, disciplined execution. But over time, the most valuable advantage may be more subtle.

It will belong to leaders who can tell when expertise should dominate and when emergence should. Leaders who know when to act like engineers and when to act like gardeners. Leaders who understand that hierarchy is still useful, but not universally wise. Leaders who stop asking, “How do I get the organization to execute my answer?” and start asking, “How do I build an organization capable of discovering better answers than I could alone?”

That is the deeper lesson of Buurtzorg. Jos de Blok did not save a struggling system by becoming a more forceful commander. He succeeded because he recognized that in a complex human system, the smartest move is to increase the system’s capacity to learn. 

AI now puts that same choice in front of every executive team. Some problems will still require experts, precision, and control. But many of the most consequential ones will require humility, experimentation, and trust in intelligence distributed throughout the organization. The companies that thrive will not just deploy better tools. They will build cultures where insight can rise from anywhere, where leadership adapts to the problem at hand, and where the search for the right answer matters more than protecting the illusion that it already lives at the top.

Why Education Needs an “AI-in-the-Loop” Model

Twelve-year-old Leo sat in his room, staring at his history assignment on the Industrial Revolution. Usually such an assignment would take several hours: reading relevant material, choosing a thesis, finding evidence to support his claims, and then drafting the essay. But today, he simply fed a prompt into ChatGPT, made a few quick revisions, and hit submit.

On paper he looked like a star student. His essay even got him an ‘A’ from his teacher. But did any real learning take place? 

This is the crisis we face in education. We are currently obsessed with the “Human-in-the-Loop” model, where humans oversee AI outputs. But in a classroom, that model is backwards. If we want to raise a generation of innovators rather than mere prompt engineers, we have to flip the script to an “AI-in-the-Loop” approach that prioritizes the student’s cognitive effort before the algorithm ever enters the chat.

Renzulli’s Three-Ring Conception of Giftedness

To understand why current AI integration is risky, we have to look at what actually creates high-level human performance. One of the most respected frameworks in educational psychology is Joseph Renzulli’s Three-Ring Conception of Giftedness.

Renzulli argued that “giftedness” (or what we might call high-level creative productivity) lies at the intersection of three distinct clusters of human traits:

  1. Above-Average Ability: The core competency and foundational knowledge in a specific domain.
  2. Creativity: The ability to generate original ideas, see new patterns, and think divergently.
  3. Task Commitment: The grit, perseverance, and “productive struggle” required to see a difficult project through to the end.

When these three rings overlap, giftedness emerges. However, the current “plug-and-play” integration of AI in schools threatens to thin out every one of these rings, making it harder for students to reach high levels of accomplishment.

The Risk to Ability

The most immediate danger of AI is cognitive offloading—the tendency to use external tools to reduce the mental effort required for a task. While offloading is great for mundane chores (like GPS for driving), it is quite harmful for learning.

We already know that easy access to external information changes what people remember and how they allocate mental effort. The classic “Google effect” research showed that when people expect information to be accessible later, they are less likely to remember the information itself and more likely to remember where to find it. That isn’t necessarily bad. Humans have always used tools and transactive memory, but in schooling, ability is built through repeated retrieval, reconstruction, and sense-making.

Research has already begun to show the impact of cognitive offloading. In a large field experiment with high school math students, researchers found that a “ChatGPT-like” tutor improved performance during practice, but when the AI support was removed, those students performed worse than students who learned without the AI tool, an effect consistent with dependence and reduced durable learning.

Without guardrails, AI can act as a “cognitive crutch.” By providing hints and solving intermediate steps, it removes “desirable difficulty,” the kind of struggle that leads to deeper learning and long-term retention. If the AI does the heavy lifting, students learn less and fall short of the competence that high achievement requires.

The Risk to Creativity

Creativity is not just producing something. It’s producing something both novel and appropriate.

AI is remarkably good at the obvious. Large language models generate outputs that reflect patterns in their training data. That makes them useful, fluent, and fast, but it also means they can pull learners toward the statistical center of what’s been said before.

A recent experiment on creative writing found that access to AI ideas helped individuals produce stories judged as more creative (especially among less creative writers), but it also made the stories more similar to one another, reducing collective novelty. In other words, AI can raise the floor for individuals while narrowing the diversity of the group.

Separate work has begun to quantify this “echo” effect more directly, showing measurable limits on plot diversity in LLM outputs under the same prompts. And broader research reviewing LLM creativity suggests that while models can appear creative through recombination, there are persistent questions about originality, intent, and the difference between pattern completion and human creative agency. 

When students brainstorm with AI first, they often anchor on the suggestions they see and fall into an “associative rut.” If they took the time to think by themselves before reaching out to AI, they would be more likely to discover original and personally meaningful ideas.

And we’ve seen a similar phenomenon long before AI. Research on brainstorming has shown that nominal brainstorming (individual idea generation before group discussion) produces more ideas and more original ideas than purely interactive brainstorming. In the AI context, the LLM acts like a “dominant personality” in a group meeting. It speaks first, speaks confidently, and sets a baseline. Once a student sees those AI suggestions, their brain finds it incredibly difficult to think outside those parameters.

The Risk to Commitment

Task commitment includes persistence, delayed gratification, self-regulation, and the willingness to stay with ambiguity.

Modern technology has already been impacting this ring. Research shows that higher use of digital devices is tied to concentration difficulties, lower academic performance, and poorer self-regulation. 

Now layer generative AI on top. In a world where a chatbot can produce instant essays and workable code, the emotional “cost” of effort feels high. Why wrestle with the challenge when the answer is right at the fingertips?

Emerging education research is also starting to map how generative AI intersects with self-regulated learning—highlighting both risks (over-reliance, reduced monitoring) and opportunities (scaffolds for planning, reflection, and feedback) depending on design and pedagogy. And survey-based findings have reported associations between ChatGPT use and procrastination or lower performance in some student samples, suggesting that without strong norms and supports, AI can drift from scaffold to shortcut. 

Then there’s an additional twist: changing expectations. Once AI is available, teachers and workplaces may (implicitly or explicitly) expect faster output. But speed is not the same as depth. Many breakthroughs require a long dwell time. If we compress the timeline before students have built the inner muscles of persistence, we don’t get high performers.

The Way Out

In many AI discussions, “human in the loop” sounds reassuring: the human checks the AI’s work. But in education, that framing can be backward. It puts students in the role of evaluator rather than constructor, as if learning were mainly about spotting mistakes in someone else’s thinking.

Decades of learning science tell us that durable learning is constructive and interactive. The ICAP framework, for example, shows that learning activities that are Interactive and Constructive generally outperform merely Active or Passive engagement. Students learn more when they generate, explain, debate, and build meaning, rather than just consume or lightly manipulate information.

In education, we need a “Learning-First” model that prioritizes human cognition before algorithmic assistance. A sample five-phase framework could look like this:

Phase 1 (Individual): students write an initial thesis, solution path, or set of ideas before using AI. This protects original cognition and forces retrieval, sense-making, and ownership.

Phase 2 (Group): students critique and build together. This is where misconceptions surface and learning becomes social, as students learn from each other.

Phase 3 (AI): only then does AI enter as a gap-finder, alternative perspective generator, or a Socratic questioner. It reveals elements that students might have missed and stretches their thinking.

Phase 4 (Group): the group revises their solution based on the feedback from AI, synthesizing aspects that are reasonable and rejecting those that don’t fit well.

Phase 5 (Individual): individuals reconstruct the argument/solution in their own words because self-explanation is a reliable way to accelerate understanding.

This scaffolded approach protects each ring of Renzulli’s model. Ability is built through retrieval and reconstruction. Creativity is protected through first-thought originality and peer divergence. Task commitment is strengthened through social support and reflection.

Conclusion

Education has a choice to make. Without the right guardrails on how AI is used, we risk teaching the models instead of the students.

The danger of a default “human-in-the-loop” stance in classrooms is that it casts students as editors of machine work. But learning isn’t editorial. Students must build mental models, connect ideas, and develop the internal fluency that only comes from doing the cognitive work themselves.

So the guiding question for AI in education should be, “Which phase of learning does this tool strengthen and which phase might it accidentally replace?”

If we stay student-first and learning-first, we’ll use AI the way every great teacher uses support: not to remove the mountain, but to help students become the kind of climbers who can scale it, long after the tool is gone.

The “Tiny Team” Organization Is Here and It’s Redrawing the Management Map

In the past year or two, the business world has felt more like a bumpy ride than a smooth “transformation.” Employees are dealing with a lot of uncertainty—roles changing or getting eliminated entirely, teams shuffling, and rules shifting mid-game. But leaders aren’t operating from a crystal-clear blueprint either. Many are making big cuts, not just because AI speeds things up, but because they honestly can’t see what the company should look like long-term. So, they reduce costs and complexity first, then plan to rebuild smarter.

The tricky part is that AI isn’t a neat replacement for people or their jobs. It absolutely makes many tasks faster, but it also creates entirely new work. Think about customer support: many companies use chatbots to handle volume, but now someone has to watch performance, check logs to find problems, fine-tune the prompts and rules, and constantly improve the system. The work doesn’t disappear; it just moves and changes form.

Still, one advantage is unmistakable. For decades, the biggest hidden cost in any company was the Coordination Tax. You know the drill: an engineer builds a feature, then hands it to a product manager, who syncs with marketing, who waits for a business dev lead to find a partner. Every handoff is a friction point and every meeting is a tax on productivity. But now, a single “pod” of 5 to 10 people, with a mix of engineering, design, and growth talent, can execute faster than a 50-person department.

2026 looks like the year that some of these trends become normal: smaller, multidisciplinary, autonomous feature teams, and flatter organizations where the middle-management role evolves from “traffic cop” to “coach + systems designer.”

Trend 1: Smaller, multidisciplinary “feature teams” become the default

As AI lifts individual capability, it becomes feasible to staff product work like a small startup.

Instead of a large, functionally segmented machine, you get teams of 5–10 people who can handle a feature from the initial idea all the way through building, shipping, and improving it. Speed shoots up because there is less coordination and fewer approvals needed. Quality rises because feedback loops tighten.

We can already see executives publicly describing this “tiny team” dynamic.

  • Mark Zuckerberg recently noted that AI lets “a single very talented person” tackle projects that used to need “big teams,” and he’s actively pushing to “flatten teams.”
  • Tobias Lütke, Shopify’s CEO, gave his teams a powerful signal: before asking for more people, you have to prove why you “cannot get what you want done using AI.” This directly asks them to think of AI as a part of their team that handles work autonomously.
  • Duolingo’s CEO, Luis von Ahn, sees AI as a “platform shift” and is adjusting things like hiring and performance. The company is also reducing contractor work where AI can step in, another clear move toward “smaller teams that deliver much more.”

The common thread in these examples isn’t just “AI is useful.” It’s that they are rethinking the basic rules of how the company operates. The organization shifts from shuffling work between departments to empowering small, focused teams to fully own their results.

Ever watched a two-person team crank out something amazing over a weekend and thought, “How did they move so fast?” Now, imagine that kind of efficiency happening across dozens of teams, each one fully supported by AI.

That’s the emerging design pattern.

Trend 2: Flatter orgs and middle management under pressure

Once small, autonomous teams are successful, a second-order effect kicks in: you just don’t need as many layers of management to keep them in sync. This is the point where the discussion about “flattening” an organization becomes very real and very uncomfortable.

Experts studying the workforce have been tracking “delayering” everywhere, not just in tech. Korn Ferry, for instance, talks about companies “thinning out their management midsections,” pointing out that middle managers were a big part of 2024 layoffs, clear evidence that the traditional manager role is under structural pressure.

And major corporations are openly saying their restructures are about cutting bureaucracy. Amazon’s corporate layoffs, for instance, were reported as an effort to reduce organizational layers and operate more efficiently.

So, the pattern we should expect to continue seeing in 2026 isn’t a world with “no managers.” It’s a world with fewer managers whose fundamental job is completely different.

Which brings up an important, yet simple, question: If AI reduces coordination work, what exactly should managers coordinate?

The manager’s role is evolving

In most modern orgs, managers have been doing (at least) three different things:

  1. People development: coaching, feedback, growth, hiring, conflict navigation, culture
  2. Technical/project leadership: running execution, reviewing work, unblocking tasks, prioritizing
  3. Systems and strategy: setting guardrails, aligning across teams, shaping operating systems, long-term planning

AI and autonomous feature teams change the distribution of these responsibilities.

1) People development becomes more important, not less

As teams become more independent and things change faster, people really need grounding: clear direction, a safe space to work, ways to grow, and honest feedback. AI can help write a review but it can’t handle the human side of building trust, shaping identity, and finding meaning in our work.

2) Day-to-day technical leadership moves closer to the team

Here’s where the scope for many middle managers gets a little smaller. When you have a focused feature team, the day-to-day execution leadership often happens right within the group—think a senior engineer, the product and design leads, and a shared AI process. The manager’s job shifts away from being the daily air traffic controller.

3) Systems-level strategy becomes the manager’s differentiator

As pods proliferate, someone must design the system those pods operate within: how often do they operate, what are the quality rules, how do we control for risk, what are the portfolio priorities, and how do these teams talk to each other?

This is exactly the direction highlighted by McKinsey & Company in its writing on “agentic” organizations: we’ll see more M-shaped supervisors (broad generalists orchestrating agents and hybrid work across domains) alongside T-shaped experts (deep specialists safeguarding quality and exceptions). 


As agents take on more execution, managers are freed up from admin tasks. Their focus is shifting toward leading people and orchestrating these blended systems. In other words, the ideal talent profile is changing. It’s less about being “the smartest technical person in the room” and much more about emotional intelligence and the ability to think strategically and connect the dots.

The emerging operating model

So what does this look like in practice? Expect more organizations to formalize patterns like:

Autonomous Feature Pods: Small teams of about five to ten people, each with a crystal-clear mission, like “improve customer sign-ups.” These teams are accountable for everything, from building to shipping to measuring success. They have all the necessary skills embedded right in the pod: product, design, engineering, and data. What makes them so fast? They use AI agents as built-in assistants for everything from research and drafting to testing and analysis.

Thinner Management Layers: This shift also means a leaner leadership structure. You’ll see fewer layers of management. Instead, managers will take on a wider scope, focusing less on directing tasks and more on coaching and ensuring the system is running smoothly. We’ll also see more senior, non-managerial roles, like staff or principal engineers, who provide technical leadership without adding more hierarchy.

Guardrails over Gates: Finally, the way work is governed is changing from “approval-heavy” to “principle-driven.” Instead of waiting for sign-offs on every step, teams operate within clear “guardrails”—security protocols, data policies, quality standards, and ethical rules. This allows teams to move much faster and ship products without constant delays.

The trend for the rest of 2026 is clear: Organizations will continue to shrink in headcount but explode in impact. We are moving away from the “industrial” model of management, where people were cogs in a machine, toward a “biological” model, where small, autonomous cells work together to create a living, breathing, and highly adaptive organism.

Beyond the Automation Trap: Why AI Needs Values

In 1997, after Garry Kasparov lost his historic chess match to IBM’s Deep Blue, he didn’t just walk away or rail against the machine. Instead, he started a new kind of competition called “Advanced Chess.” In these matches, a human player and a computer worked together as a team, known as a “Centaur.”

What happened next was quite unexpected. Amateur players with midrange computers often beat grandmasters and higher-end chess computers. They knew when to listen to the machine and when to override it. They used the computer to explore possibilities, but they used their human judgment to make the final call.

In other words, the most powerful force wasn’t the smartest machine but the best collaboration.

Today, we are at a similar crossroads with Artificial Intelligence. We’ve built the machines, but we haven’t quite figured out how to be Centaurs. And that might be why AI adoption is stalling.

The Diffusion Mystery

If you look at the headlines, AI is taking over the world. But if you look at the data, the picture is more complicated.

Everett Rogers, the legendary sociologist who gave us the “Diffusion of Innovations” theory, taught us that technology doesn’t spread because it’s better. It spreads because it fits into our lives, our norms, and our trust networks. Right now, AI has a fit problem.

According to McKinsey’s 2025 global research, while almost every company is playing with AI, very few have successfully scaled it. The problem might be the kinds of problems we are trying to solve with AI. It’s not that the technology is too complex; it’s that we’re trying to use a “tame” solution for a “wicked” world.

Tame Tasks vs. Wicked Problems

In the 1970s, design theorists Horst Rittel and Melvin Webber identified two types of challenges:

  1. Tame Problems: These have a clear goal and a clear stopping rule. Think of a puzzle or a math equation. Coding is often a tame problem. You write the script, you run the test, and it either works or it doesn’t. This is why AI adoption has worked quite well for developers.
  2. Wicked Problems: These are messy. They have no clear definition and no right answer, only “better” or “worse” ones. Moreover, every time you try to solve a wicked problem, the problem changes. Think of education, healthcare, or leading a team.

When we try to use AI to solve a wicked problem through pure automation, we fail because wicked problems require judgment, and good judgment requires something else.

Turbulent Fields

Systems theorist Eric Trist called the environment we live in today a “turbulent field.” Imagine trying to play a game of soccer, but the grass is moving, the goals are shifting, and the other team keeps changing the rules. That’s turbulence. And turbulence creates wicked problems. 

In a stable world, you can rely on data and optimization. But in a turbulent world, more data often leads to more confusion. Instead of data, you need a North Star that reduces the number of variables you need to optimize for. Trist argued that values are effective North Stars in solving such complex problems. They clarify direction by eliminating options that don’t fit within those values.

This might be one reason why solving problems with AI is so challenging. Without clearly defined values, AI becomes a black box that’s hard to trust.

Designing with Values

If we want AI to actually work for us, we have to stop designing for automation and start designing for human flourishing.

This brings us to one of the most important frameworks in social science, and one I have seen work well in practice: Self-Determination Theory (SDT). SDT names three basic psychological needs (autonomy, competence, and relatedness), which Daniel Pink popularized as autonomy, mastery, and purpose. For people to be at their best, they need all three:

  • Autonomy: The desire to be the author of our work and lives.
  • Mastery (or Competence): The urge to learn new things and get better at skills that matter.
  • Purpose (or Relatedness): The yearning to feel connected to others and to do what we do in the service of something larger than ourselves.

The “Automation Trap” kills all three. If an AI writes your entire report, you lose your autonomy (you’re just a spectator). You lose your mastery (your skills begin to atrophy). And eventually, you lose your sense of purpose.

This is one of the “Ironies of Automation.” As researcher Lisanne Bainbridge pointed out, the more we automate, the more we rely on humans to handle the rare, high-stakes crises. But if humans have been sidelined by the automation, they no longer have the skills to save the day when the machine fails.

Nowhere is this tension clearer than in the classroom. If a student uses AI to generate an essay, the task is finished, but the learning never happened.

Learning requires productive struggle. Elizabeth and Robert Bjork’s research on “desirable difficulties” shows that we learn best when the process feels a little bit hard. When we remove the struggle, we remove the growth.

If we want AI to diffuse in education and, for that matter, in any knowledge-work field, we have to move from “Answer Engines” to “Thought Partners.”

A New Blueprint for the AI Collaborator

So, what does a value-driven, human-centered AI look like? It follows a different set of design principles:

1. Values Over Vibes

Wicked problems are resolved by making choices based on what we value most. An AI collaborator shouldn’t hide these choices. It should surface them. Instead of saying “Here is the best strategy,” it should say “If you value speed, do X; if you value employee well-being, do Y.”

2. Design for Mastery

Success shouldn’t just be measured by task completion. It should be measured by capability gained. Does this AI help the user understand the problem better? Does it challenge their assumptions? A great AI should function like a coach, nudging the user to do their best thinking rather than doing the thinking for them.

3. Human Stewardship

In a turbulent field, the “correct” answer is often a conversation. AI can widen our options and test our scenarios, but humans must steward the meaning. We are the ones who decide which values are important and which trade-offs are worth making.

The Question for 2026

As we stare down 2026, we need to stop asking, “What can AI do?” and start asking, “What values should do the steering?”

For the last two years, we’ve been obsessed with technical possibilities. We’ve treated AI like a new engine and spent all our time seeing how fast it can go. But in a turbulent field, speed without a North Star is just a faster way to get lost. If we continue to design simply because a solution is possible, we will keep falling into the Automation Trap.

The truth is, technological possibility should never precede moral clarity. In the era of wicked problems, the right answer doesn’t exist in the data; it exists in our intentions. If we want to move from “Answer Engines” to true Centaur-style collaboration, we have to identify the values we are designing for before we write a single line of code.

The real lesson of Garry Kasparov’s Centaurs wasn’t that they had better computers. It was that they had a better process rooted in human judgment. In the long run, the real competitive advantage won’t be the machine’s speed. It will be our wisdom.

Designing Human-AI Workflows for Synergy

A sobering meta-analysis reveals a counterintuitive truth: on average, human-AI combinations underperform the better of the human alone or the AI alone. Consider a study on fake hotel review detection: the AI achieved 73% accuracy, the human 55%, yet the combined system managed only 69%.

This raises a crucial question: How do we architect human-AI collaborations that truly elevate performance?

If you’re leading an AI rollout, that question is more than academic. It determines whether your investment produces step-change performance or an expensive stalemate. Simply placing people and AI in the same workflow does not guarantee better results. What matters is what they do together, and how intentionally you design the collaboration.

Synergy Vs. Augmentation

The researchers behind the meta-analysis investigated two desirable outcomes: synergy and augmentation. Synergy represents the ideal state, where the combined human-AI performance surpasses both the human alone and the AI alone, mirroring the “strong synergy” found in purely human groups. Augmentation is a more modest goal: it simply means the human-AI system performs better than the human alone.

A common implicit assumption is that the combined system must be better than either component, but the reality is often complicated by human behavioral pitfalls. 

Humans frequently struggle to find the right balance of trust: they either over-rely on AI and blindly accept its suggestions, or under-rely on it and prematurely dismiss valuable AI input. For example, in the fake hotel review study, the humans were not as good at the task as the AI, so they were poor judges of the AI’s recommendations, which led to a sub-par outcome. So, while AI augmented human accuracy (from 55% to 69%), the combination was less effective than AI alone (73%).

On the other hand, in a study on bird image classification, AI was only 73% accurate, compared to expert human performance of 81%. But the human-AI collaboration reached 90% accuracy, better than either human or AI alone. This is an example of human-AI synergy: expert humans are better able to decide when to trust their own judgment and when to trust the algorithm’s, which improves overall system performance.
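The two definitions can be made concrete with a few lines of Python. This is only an illustrative sketch: the function name `classify_outcome` and the label "neither" are assumptions introduced here, not terms from the research, and the inputs are the accuracy figures quoted above.

```python
def classify_outcome(human: float, ai: float, combined: float) -> str:
    """Label a human-AI result using the definitions in the text.

    synergy:      combined beats both the human alone and the AI alone
    augmentation: combined beats the human alone, but not the AI alone
    neither:      combined does not even beat the human alone
    (function name and the "neither" label are illustrative assumptions)
    """
    if combined > max(human, ai):
        return "synergy"
    if combined > human:
        return "augmentation"
    return "neither"


# Accuracy figures quoted in the text
hotel_reviews = classify_outcome(human=0.55, ai=0.73, combined=0.69)
bird_images = classify_outcome(human=0.81, ai=0.73, combined=0.90)

print(hotel_reviews)  # augmentation
print(bird_images)    # synergy
```

Note that synergy is checked first, so a result counts as augmentation only when it improves on the human without also beating the AI.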

Complementarity

Another view of synergy in human-AI collaboration comes from research on complementarity, a practical way to ensure that what the human brings and what the AI brings are meaningfully different and mutually enhancing.

According to the authors, it is useful to think of the distinct ways humans and AI approach decision-making, which result from two key asymmetries:

  1. Information Asymmetry: Often, AI and humans operate with different inputs. AI relies on a vast collection of digitized data. Humans, however, draw on a broader, richer context that includes non-digitized real-world knowledge. For example, an AI might accurately diagnose from a scan, but a human doctor also factors in the patient’s demeanor, additional symptoms, or prior history. This holistic view gives the human a distinct informational advantage in complex situations.
  2. Capability Asymmetry: Even given the exact same information, the processing methods differ. AI models infer patterns from vast datasets, while humans use more flexible mental models to build an intuitive understanding of the world. This allows humans to learn rapidly, often after only a few trials, and to accumulate lifelong experience. AI, on the other hand, can instantly digest massive amounts of information and detect tiny, subtle variations in data that would be imperceptible to a human. These differences give each side distinct capabilities.

Where teams stumble is when these asymmetries are flattened. If your process gives humans and models the same inputs and asks them to do the same step, one of them is redundant. If, instead, you assign different roles and design a clean way for their contributions to combine, the whole becomes greater than the sum of its parts.

When, and When Not, To Use AI

Rethinking the architecture of modern work to integrate human and artificial intelligence demands a careful, nuanced approach. The success of this collaboration hinges on a thoughtful consideration of two critical factors: the inherent nature of the task and the complementary strengths of the human and AI partners.

Task Type

The type of work at hand fundamentally dictates the potential for synergy. For example, the MIT meta-analysis found that Innovative Tasks are the “sweet spot” for maximum impact. These are characterized by open-ended goals and constraints that evolve in real time, through iteration and exploration. Here, humans have the natural ability to think in non-linear ways, associating unrelated concepts to forge novel and meaningful connections. For such tasks, AI contributes a vast informational landscape from which these connections can be drawn, leading to synergy.

However, for Decision Tasks that primarily require evaluation or judgment, the story is not as straightforward. In some cases, the human-AI collaboration can perform worse than either human or AI alone. It depends on what the task is and how it’s split between AI and humans, as discussed next.

Task Separation

All successful partnerships rest on the ability to leverage distinct and complementary strengths. Human-AI collaboration is no different.

It sounds counterintuitive, but if the AI’s initial performance is overwhelmingly superior, the overall human-AI system may actually perform worse than the AI alone. The thoughtful approach, therefore, is to restrict the AI’s role to precisely those sub-tasks where it has a clear advantage.

Conversely, when the human is the stronger initial decision-maker, the partnership tends toward greater success. As the expert, the human is better positioned to critically assess the AI’s input and selectively integrate it into the process, in a synergistic fashion.
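One way to picture task separation is as a simple routing rule: give a sub-task to the AI only when it beats the human by a clear margin, and leave everything else with the human. The sketch below is purely hypothetical. The function `assign_subtasks`, the `margin` threshold, the sub-task names, and every accuracy number are made-up placeholders for illustration, not results from any study.

```python
def assign_subtasks(accuracy: dict, margin: float = 0.05) -> dict:
    """Assign each sub-task to "ai" or "human".

    accuracy maps a sub-task name to (human_accuracy, ai_accuracy).
    The AI gets a sub-task only when it leads by at least `margin`;
    otherwise the human keeps it. The threshold is an assumption.
    """
    roles = {}
    for subtask, (human_acc, ai_acc) in accuracy.items():
        roles[subtask] = "ai" if ai_acc - human_acc >= margin else "human"
    return roles


# Hypothetical placeholder numbers, for illustration only
example = {
    "scan_for_anomalies": (0.70, 0.90),
    "weigh_patient_context": (0.85, 0.60),
    "draft_summary": (0.80, 0.82),
}

roles = assign_subtasks(example)
for subtask, owner in roles.items():
    print(f"{subtask}: {owner}")
```

In this toy example the near-tie on `draft_summary` stays with the human, which reflects the idea that the AI should earn a role only where its advantage is clear.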

Ultimately, effective human-AI collaboration is not about replacing one with the other; it is a delicate exercise in defining boundaries, recognizing unique excellence, and ensuring that the final output is greater than the sum of its different parts.

Conclusion

Simply introducing AI into a workflow is not a prescription for performance improvement and can, in fact, lead to expensive underperformance. Achieving true synergy, where the human-AI system surpasses both components working alone, requires intentional design built on complementarity.

A crucial lesson is that human expertise matters. Experts are better positioned to critically assess and leverage AI input, transforming a simple augmentation into a synergistic gain. This is particularly relevant in the ‘sweet spot’ of Innovative Tasks, where human creative thinking abilities and an AI’s vast informational landscape can combine to give surprising creative breakthroughs.

Ultimately, the future of work hinges on recognizing and embracing these boundaries, and on accepting that AI may not be suitable for every kind of task. Instead of replacing humans, the goal is to define complementary roles that exploit the inherent asymmetries in information and capability. By expertly differentiating tasks, organizations can move past the common trap of underperformance toward a future of genuine human-AI synergy.