AI, Layoffs and the Innovation Tax

The promise of AI is rapidly becoming a workforce question.

In a memo last year, Amazon CEO Andy Jassy told employees that as generative AI usage spreads through the company, “we will need fewer people doing some of the jobs that are being done today, and more people doing other types of jobs.” He went further: “in the next few years, we expect that this will reduce our total corporate workforce as we get efficiency gains from using AI extensively across the company.” 

Duolingo CEO Luis von Ahn’s “AI-first” memo said that they would “gradually stop using contractors to do work that AI can handle,” and that “headcount will only be given if a team cannot automate more of their work.” 

IBM CEO Arvind Krishna offered perhaps the clearest version of the attrition model. Speaking about back-office functions that could be automated, he said, “I could easily see 30% of that getting replaced by AI and automation over a five-year period.” 

Taken together, these statements reveal a new leadership strategy: automate and reduce headcount as AI expands, or in some cases, in anticipation of higher productivity. It is an understandable response to a technology that can write code, resolve customer tickets, draft marketing copy, summarize meetings, and generate analysis at extraordinary speed. 

But there is a danger in this story. It assumes that innovation is produced by tasks. In reality, innovation is produced by people working together in messier ways that defy easy automation – experimenting, discussing, dissenting, and learning through customer interactions. When companies cut too broadly, they may inadvertently remove the conditions that will allow AI-era innovation to compound.

The tricky question for leaders is whether they can distinguish between work that is truly automatable and work that looks inefficient only because its value is hard to capture on a balance sheet.

Impact of Downsizing on Innovation

The danger is not that every layoff is bad. It is that broad layoffs, especially when framed as an AI-enabled operating upgrade, can confuse labor reduction with organizational learning. A company can become leaner and less capable at the same time.

One of the earliest systematic warnings came from Teresa Amabile and Regina Conti’s 1999 study. They examined a large high-technology firm before, during, and after a major downsizing. Their central finding went beyond lower morale: the conditions that support creativity deteriorated. Creativity, productivity, and perceived work-environment support declined during downsizing, while obstacles to creative work increased.

That finding should land heavily in today’s technology sector. Innovation is rarely born from an isolated genius typing faster with better tools. It is more often a social process: someone notices an anomaly, someone else connects it to an unmet customer need, a third person remembers a failed experiment from two years earlier, and a fourth turns the conversation into a prototype. Remove enough nodes from that system and the org chart may still look coherent, but the creative network underneath it becomes brittle.

Newer research adds an important nuance. Ramdani and colleagues studied 122 UK firms over 22 years, using downsizing and patent data to examine how workforce reduction affects innovation outputs. Their conclusion is not “layoffs always reduce innovation.” It is more precise: downsizing has a dual effect depending on the firm’s resource position. In firms with resource slack, downsizing can have a positive effect on innovation. In resource-constrained firms, it has a negative effect, and the damage appears more quickly. 

This matters because many tech companies do not experience resource slack uniformly. One division may have layers of coordination, duplicated tooling, and unclear ownership. Another may have exhausted engineers, customer-support escalation queues, and neglected technical debt. From 30,000 feet, both may look like “headcount.” But only the first is likely to see innovation improve after downsizing.

The key question for leaders is: are you cutting fat, or are you cutting connective tissue? Downsizing in a resource-constrained organization can significantly hurt innovation down the road.

Psychological Effects of Layoffs

Why do layoffs damage innovation when there is little slack? The answer lies in psychology as much as economics.

The first mechanism is psychological safety collapse. Innovation requires people to take interpersonal risks: challenge assumptions, admit uncertainty, surface bad news, and suggest ideas that may initially sound naïve. Amy Edmondson and Derrick Bransby’s 2023 review of psychological safety research describes its importance for learning behavior, performance, and work under uncertainty. A meta-analysis of 94 studies also found psychological safety positively associated with both employee innovation behavior and team innovation behavior. After broad layoffs, however, the unwritten rule often becomes: do not look expendable. In that climate, people do not stop having ideas; they stop volunteering them.

The second mechanism is survivor syndrome and identity rupture. Research on downsizing survivors shows that employees who remain often experience reduced commitment and performance, a pattern commonly described as survivor syndrome. Van Dick and colleagues found that downsizing can reduce employees’ identification with the organization, which then harms survivor performance. The innovation consequence is direct. The person who no longer identifies with the company may still complete assigned tasks. But will they fight for an unproven customer insight? Will they spend political capital defending a long-term bet? Will they mentor the junior colleague who may one day become a breakthrough inventor? Often, the answer is no.

The third mechanism is job insecurity narrowing attention. When employees fear future cuts, their time horizon contracts. They focus on visible output, defensible metrics, and work that protects their standing. Niesen and colleagues’ research on job insecurity and innovative work behavior notes the paradox that organizations often expect restructuring to enhance innovation, even as insecurity can undermine the behaviors innovation requires. This is the problem with fear-based efficiency: it may increase activity while decreasing imagination. 

There is also a network mechanism. Corporate knowledge is not stored only in documents or AI retrieval systems. It lives in relationships: who knows which customer exception matters, why a particular architecture decision was made, or which workaround keeps a product alive. Broad layoffs sever these ties indiscriminately. AI cannot easily reconstruct what the organization failed to write down.

A 2024 HBR article based on a study of 146 companies found that engagement, morale, and loyalty can take years, not months, to rebound after layoffs. The authors cite Pixar director Brad Bird’s memorable observation: “If you have low morale, for every $1 you spend, you get about 25 cents of value. If you have high morale, for every $1 you spend, you get about $3 of value.” Whether or not one takes the math literally, the leadership lesson holds: the same dollar produces radically different returns depending on the emotional state of the system. In a frightened organization, talent becomes defensive. In a committed organization, talent becomes generative. That difference will determine whether AI becomes merely a substitute for human contribution or a catalyst for the next wave of innovation.

The Way Forward

So, how can leaders figure out if layoffs are the right move? It starts with three key diagnostic questions.

First, where is work waiting? Slack isn’t always visible in just headcount ratios. Look for queues: unresolved customer issues, delayed experiments, or rising technical debt. If critical work is already backing up, you’re resource-constrained and cuts will only make things worse.

Second, where has learning slowed? Beyond simple experimentation metrics, track behavioral indicators. Are people trying fewer new things? Are you seeing fewer dissenting views? A drop in any of these suggests learning is stalling.

Third, where is human judgment still doing hidden work? AI can automate routine tasks, but leaders must map the judgment layer. This critical layer handles edge cases, ethical tradeoffs, and customer reassurance. Removing people who interpret these ambiguous situations could make the process faster, but the organization will get less intelligent.

Klarna learned this lesson the hard way. It cut about 700 employees, citing the productivity of its AI assistant, but later reversed course, reassigning engineers and marketers to customer-support roles when the AI fell short.

However, the deeper challenge for leadership is a philosophical one. The industrial-age instinct was simple: reduce labor when a machine increased productivity. That made sense when work was predictable. But in the innovation age, the question is entirely different: when a new tool boosts productivity, where do you redeploy the freed human capacity to gain a learning advantage?

That is the fork in the road. Companies using AI mainly for headcount reduction might win a margin story but lose their innovation future. Conversely, companies that reinvest human attention into discovery, customer understanding, and experimentation will compound their advantage.

Why Education Needs an “AI-in-the-Loop” Model

Twelve-year-old Leo sat in his room, staring at his history assignment on the Industrial Revolution. Usually such an assignment would take several hours: reading relevant material, picking a thesis, finding evidence to support his claims, and then drafting the essay. But today, he simply fed a prompt into ChatGPT, made a few quick revisions, and hit submit.

On paper he looked like a star student. His essay even got him an ‘A’ from his teacher. But did any real learning take place? 

This is the crisis we face in education. We are currently obsessed with the “Human-in-the-Loop” model, where humans oversee AI outputs. But in a classroom, that model is backwards. If we want to raise a generation of innovators rather than mere prompt engineers, we have to flip the script to an “AI-in-the-Loop” approach that prioritizes the student’s cognitive effort before the algorithm ever enters the chat.

Renzulli’s Three-Ring Conception of Giftedness

To understand why current AI integration is risky, we have to look at what actually creates high-level human performance. One of the most respected frameworks in educational psychology is Joseph Renzulli’s Three-Ring Conception of Giftedness.

Renzulli argued that “giftedness” (or what we might call high-level creative productivity) lies at the intersection of three distinct clusters of human traits:

  1. Above-Average Ability: The core competency and foundational knowledge in a specific domain.
  2. Creativity: The ability to generate original ideas, see new patterns, and think divergently.
  3. Task Commitment: The grit, perseverance, and “productive struggle” required to see a difficult project through to the end.

When these three rings overlap, giftedness emerges. However, the current “plug-and-play” integration of AI in schools threatens to thin out every one of these rings, potentially preventing students from reaching high levels of accomplishment.

The Risk to Ability

The most immediate danger of AI is cognitive offloading—the tendency to use external tools to reduce the mental effort required for a task. While offloading is great for mundane chores (like GPS for driving), it is quite harmful for learning.

We already know that easy access to external information changes what people remember and how they allocate mental effort. The classic “Google effect” research showed that when people expect information to be accessible later, they are less likely to remember the information itself and more likely to remember where to find it. That isn’t necessarily bad. Humans have always used tools and transactive memory, but in schooling, ability is built through repeated retrieval, reconstruction, and sense-making.

Research has already begun to show the impact of cognitive offloading. In a large field experiment with high school math students, researchers found that a “ChatGPT-like” tutor improved performance during practice, but when the AI support was removed, those students performed worse than students who learned without the AI tool, an effect consistent with dependence and reduced durable learning.

Without guardrails, AI can act as a “cognitive crutch.” By providing hints and solving intermediate steps, it removes “desirable difficulty,” the kind of struggle that leads to deeper learning and long-term retention. If the AI does the heavy lifting, students learn less and fail to reach the level of competence that high achievement requires.

The Risk to Creativity

Creativity is not just producing something. It’s producing something both novel and appropriate.

AI is remarkably good at the obvious. Large language models generate outputs that reflect patterns in their training data. That makes them useful, fluent, and fast, but it also means they can pull learners toward the statistical center of what’s been said before.

A recent experiment on creative writing found that access to AI ideas helped individuals produce stories judged as more creative (especially among less creative writers) but it also made the stories more similar to one another, reducing collective novelty. In other words, AI can raise the floor while lowering the ceiling of diversity.

Separate work has begun to quantify this “echo” effect more directly, showing measurable limits on plot diversity in LLM outputs under the same prompts. And broader research reviewing LLM creativity suggests that while models can appear creative through recombination, there are persistent questions about originality, intent, and the difference between pattern completion and human creative agency. 

When students brainstorm with AI first, they often anchor on the suggestions they see and fall into an “associative rut.” If they took the time to think by themselves before reaching out to AI, they would discover more original and personally meaningful ideas. 

And we’ve seen a similar phenomenon long before AI. Research on brainstorming has shown that nominal brainstorming (individual idea generation before group discussion) produces more ideas and more original ideas than purely interactive brainstorming. In the AI context, the LLM acts like a “dominant personality” in a group meeting. It speaks first, speaks confidently, and sets a baseline. Once a student sees those AI suggestions, their brain finds it incredibly difficult to think outside those parameters.

The Risk to Commitment

Task commitment includes persistence, delayed gratification, self-regulation, and the willingness to stay with ambiguity.

Modern technology has already been impacting this ring. Research shows that higher use of digital devices is tied to concentration difficulties, lower academic performance, and poorer self-regulation. 

Now layer generative AI on top. In a world where a chatbot can produce instant essays and workable code, the emotional “cost” of effort feels high. Why wrestle with the challenge when the answer is right at your fingertips?

Emerging education research is also starting to map how generative AI intersects with self-regulated learning—highlighting both risks (over-reliance, reduced monitoring) and opportunities (scaffolds for planning, reflection, and feedback) depending on design and pedagogy. And survey-based findings have reported associations between ChatGPT use and procrastination or lower performance in some student samples, suggesting that without strong norms and supports, AI can drift from scaffold to shortcut. 

Then there’s an additional twist: changing expectations. Once AI is available, teachers and workplaces may (implicitly or explicitly) expect faster output. But speed is not the same as depth. Many breakthroughs require a long dwell time. If we compress the timeline before students have built the inner muscles of persistence, we don’t get high performers.

The Way Out

In many AI discussions, “human in the loop” sounds reassuring: the human checks the AI’s work. But in education, that framing can be backward. It puts students in the role of evaluator rather than constructor, as if learning were mainly about spotting mistakes in someone else’s thinking.

Decades of learning science tell us that durable learning is constructive and interactive. The ICAP framework, for example, shows that learning activities that are Interactive and Constructive generally outperform merely Active or Passive engagement. Students learn more when they generate, explain, debate, and build meaning, rather than just consume or lightly manipulate information.

In education, we need a “Learning-First” model that prioritizes human cognition before algorithmic assistance. A sample five-phase framework could look like this:

Phase 1 (Individual): students write an initial thesis, solution path, or set of ideas before using AI. This protects original cognition and forces retrieval, sense-making, and ownership.

Phase 2 (Group): students critique and build together. This is where misconceptions surface and learning becomes social, as students learn from each other.

Phase 3 (AI): only then does AI enter as a gap-finder, alternative perspective generator, or a Socratic questioner. It reveals elements that students might have missed and stretches their thinking.

Phase 4 (Group): the group revises their solution based on the feedback from AI, synthesizing aspects that are reasonable and rejecting those that don’t fit well.

Phase 5 (Individual): individuals reconstruct the argument/solution in their own words because self-explanation is a reliable way to accelerate understanding.

This scaffolded approach protects each ring of Renzulli’s model. Ability is built through retrieval and reconstruction. Creativity is protected through first-thought originality and peer divergence. Task commitment is strengthened through social support and reflection.

Conclusion

Education has a choice to make. Without the right guardrails on how AI is used, we risk teaching the models instead of the students.

The danger of a default “human-in-the-loop” stance in classrooms is that it casts students as editors of machine work. But learning isn’t editorial. Students must build mental models, connect ideas, and develop the internal fluency that only comes from doing the cognitive work themselves.

So the guiding question for AI in education should be, “Which phase of learning does this tool strengthen and which phase might it accidentally replace?”

If we stay student-first and learning-first, we’ll use AI the way every great teacher uses support: not to remove the mountain, but to help students become the kind of climbers who can scale it, long after the tool is gone.

From Kitchen To Code: Lessons in Radical Innovation from El Bulli

When patrons were seated at El Bulli, during its prime, the first thing they would get was an olive. 

Or, at least, what looked like one. They would pick up the “olive” resting in a spoon and bite gently, only for it to collapse into a warm, intensely flavored liquid that instantly flooded the palate and then disappeared.

That small bite of the now legendary spherical olive was much more than a novelty. It was the successful outcome of countless experiments using a technique called spherification, which turns olive juice into delicate spheres using alginate and calcium salts. The El Bulli team had to tackle real-world challenges to create it: How do you make a membrane thin enough to melt in your mouth, but strong enough to survive being plated? How do you make the process reliable so you can repeat it hundreds of times a night? And how do you make sure every guest has the exact same moment of surprise with that very first bite?

It is a case study in radical innovation.

Everything the tech world struggles with—finding the sweet spot between creativity and discipline, quickly moving from idea to experiment, collaborating across different fields, and building teams focused on growth rather than just resumes—was being figured out in this remote kitchen on Spain’s Costa Brava.

So, here are four key lessons from El Bulli’s kitchen that translate well to today’s product and innovation teams.

1. Creativity is the core system, not a side project

El Bulli did something revolutionary: it closed for half the year, from roughly October to March. This allowed Ferran and Albert Adrià and their core team to focus entirely on creativity. They spent those months inventing new techniques, testing fresh ideas, and developing the next season’s tasting menu, which often featured 30 or more incredible courses.

By the time the restaurant closed its doors for good in 2011, the team had created about 1,800 dishes and pioneered game-changing techniques like spherification, foams, and warm gels that reshaped high-end cooking worldwide.

A few things stand out:

  • Dedicated time: Creativity wasn’t an afterthought, squeezed into weekends or the “10% time” left after service. Half the year was deliberately reserved for exploration.
  • Permission to break rules: Inside that creative window, the brief was to question everything. Dishes could be deconstructed, recomposed, or turned inside out. Tradition was a reference point, not a constraint. 
  • Discipline in service of magic: For all the experimentation, the final measure of success was simple: did it create a magical experience for the guest? El Bulli eliminated the à la carte menu so that every guest received a carefully choreographed tasting sequence, built from scratch each season around these new creations. 

This raises some important questions for tech companies: Is creativity truly built into your structure, or is it just something people are expected to do after the “real work” is done? How do you maintain a high standard that encourages teams to experiment but still ensures a compelling user experience at the end of the day?

2. Start from first principles: “What is a tomato?”

Ferran Adrià is famous not only for his bold dishes, but for the questions behind them. Again and again, he and his team would come back to deceptively simple prompts: What is a tomato? What is soup? What is a salad?

In interviews about his work, Adrià often challenges common assumptions. For example, he points out that the original “natural tomato” in the Andes was actually inedible. What we enjoy as a tomato today is the result of human intervention through breeding, selection, and cultivation. In other words, even the most ordinary ingredient is already a designed product.

This is what first-principles thinking looks like in action. Instead of treating the category of “tomato” as fixed, he breaks it down:

  • Where does it come from?
  • What is its essence—its acidity, sweetness, aroma, and texture?
  • What aspects should stay unchanged, and which parts are negotiable?

This intellectual groundwork is what powered the deconstructionist cuisine that made El Bulli famous: taking a familiar dish, radically changing its form, texture, or temperature, but making sure its underlying essence remains intact.

In the tech world, we often talk about first principles, but in practice we work from mental templates: “It’s a CRM, so it needs to look like a sales funnel; it’s a learning platform, so it has to have modules and quizzes.”

The El Bulli approach would sound more like this for a product team:

  • What is a meeting? Is it really just a slot on a calendar, or is it a ritual for making decisions that could take on a completely different shape?
  • What is a classroom? Is it defined by a physical room and a timetable, or is it actually about a set of relationships and feedback loops that could be designed differently?

For Adrià’s team, these questions weren’t theoretical. They were directly linked to real kitchen experiments. When they clarified the true essence of a dish, it gave them permission to change everything else about it.

Product teams can adopt this powerful discipline, too: clearly define the non-negotiable essence of the user problem or the desired outcome. Once you have that clarity, you’re free to completely rethink the structure, the interface, or even the business model.

3. Keep the idea-to-experiment loop radically short

One of the most revealing aspects of El Bulli’s creative culture is not on the plate, but on paper.

Museums such as The Drawing Center in New York have mounted exhibitions called Ferran Adrià: Notes on Creativity, displaying his sketches, diagrams and visual maps. These sketches offer a glimpse of how he thought and emphasize drawing as a tool for thinking. Drawing helped him externalise ideas quickly, organise knowledge, and communicate concepts to the team.

The creative process started with a rough sketch that captured the initial thought. That sketch led immediately to a simple kitchen prototype, which the team evaluated and iterated on until the idea was perfected.

What you do not see are lengthy slide decks, layers of approval, or months spent debating concepts before anyone picks up a pan. Instead, the kitchen becomes the thinking environment. The sketches exist to move an idea quickly from an internal hunch to a shared experiment, not to impress anyone in a meeting.

When we’re building products, we often do things backward. We spend weeks making perfect presentations about an idea before a single user ever sees it. As a result, we invest a lot of time justifying something that hasn’t been tested in the real world.

What if we took inspiration from the El Bulli approach and asked:

  • Could you sketch out a major idea in less than five minutes?
  • Could you build the very first version of a new concept in a day and get it in front of a real user within a week?
  • Could you reduce the number of sign-offs needed to run a small experiment?

This isn’t about being reckless. At El Bulli, the final menu was obsessively refined. But the path from idea to first test was deliberately short. And autonomy was real: talented people were trusted to try things without seeking permission for every iteration.

When teams make the idea → experiment → learning loop shorter, they tap into creative energy that can turn a crazy idea about a “spherical olive” into a world-famous dish.

4. Treat innovation as a team sport across disciplines

El Bulli’s breakthroughs were not the work of a couple of geniuses alone. Every dish came to life thanks to a whole team of experts: chefs, pastry specialists, food scientists, industrial designers, and even the folks running the front of the house.

Albert Adrià’s own journey illustrates this. He joined El Bulli as a teenager in 1985, spending his first two years rotating through all the stations in the kitchen before focusing on pastry. Over time he became head pastry chef and then director of elBullitaller, the Barcelona-based creative workshop that served as the restaurant’s R&D lab during the closed season. 

In interviews and profiles, he always emphasised that it was the team, not individual brilliance, that made El Bulli exceptional. The creative work depended on people who were:

  • Deeply curious and willing to learn fast.
  • Comfortable collaborating across roles rather than jealously guarding territory.
  • Motivated by the shared goal of creating something extraordinary for the guest, rather than building personal fame. 

The spherical olive itself was a multidisciplinary artefact. It required understanding the chemistry of alginate and calcium, mastery of textures and temperatures, and careful design of the serving ritual so that each guest ate it in a single bite at the right moment. 

For business leaders, there are two intertwined lessons here.

Multidisciplinary structures

Radical ideas often sit at the intersection of fields, yet many organisations still arrange teams in narrow silos.

El Bulli suggests a different model for breakthrough ideas: Instead of keeping teams separated in silos, bring diverse perspectives together. In practice it would mean getting designers, engineers, data analysts, and subject-matter experts collaborating side-by-side in small, cross-functional “innovation pods.” This way, everyone can see and shape the idea from the very start, using shared visual tools like maps and sketches. It’s about co-creating, not just passing a task down a line.

Hiring for growth mindset, not just pedigree

Albert did not arrive at El Bulli with great credentials. He came as a 16-year-old apprentice and grew into one of the most influential creative forces in modern pastry, precisely because he was willing to experiment relentlessly and learn from others. 

Translating that mindset into hiring means asking:

  • Does this person show evidence of rapid learning across domains, or only depth in one?
  • Do they light up when they talk about collaboration, or only when they describe solo achievements?
  • Are they comfortable with ambiguity and experimentation, or do they need everything defined upfront?

In a world where the most interesting problems are inherently multidisciplinary, such as climate tech, the future of learning, and human-AI collaboration, it is often more valuable to hire people who can grow into the unknown than those who perfectly match yesterday’s job description.

Bringing El Bulli’s lessons into your organisation

For all its mystique, El Bulli was, at heart, a working laboratory. It dealt with constraints familiar to any leader: limited time, finite resources, high expectations, and the pressure to keep surprising a demanding audience.

Its response was not to work harder in the same way, but to redesign the system around creativity:

  • Carve out time for exploration.
  • Ask first-principles questions again and again.
  • Move ideas quickly from conception to experiment.
  • Allow multi-disciplinary teams to work closely.
  • Hire people for their capacity to learn and collaborate.

These principles apply just as well to innovative companies. When you put them into practice, you begin to treat creativity not as a garnish but as the main ingredient: tempered by discipline, grounded in first principles, and always aimed at giving the people you serve a truly memorable experience.

Image credit: The Drawing Center

Designing Human-AI Workflows for Synergy

A sobering meta-analysis reveals a counterintuitive truth: on average, human-AI combinations underperform the better of the human or the AI working alone. Consider a study on fake hotel review detection: the AI achieved 73% accuracy, humans 55%, yet the combined system managed only 69%.

This raises a crucial question: How do we architect human-AI collaborations that truly elevate performance?

If you’re leading an AI rollout, that question is more than academic. It determines whether your investment produces step-change performance or an expensive stalemate. Simply placing people and AI in the same workflow does not guarantee better results. What matters is what they do together, and how intentionally you design the collaboration.

Synergy Vs. Augmentation

The researchers in the study above investigated two desirable outcomes: synergy and augmentation. Synergy represents the ideal state, where the combined human-AI performance surpasses both the human alone and the AI alone, mirroring “strong synergy” found in purely human groups. Augmentation is a more modest goal and simply means the human-AI system performs better than the human alone.

A common implicit assumption is that the combined system must be better than either component, but the reality is often complicated by human behavioral pitfalls. 

Humans frequently struggle to find the right balance of trust: they either over-rely on AI and blindly accept its suggestions, or under-rely and prematurely dismiss valuable AI input. For example, in the fake hotel review study, humans were worse at the task than the AI, so they were poor judges of its recommendations, which led to a sub-par outcome. So, while AI augmented human accuracy (55% -> 69%), the combined system was less accurate than AI alone (73%).

On the other hand, in a study on bird image classification, AI was only 73% accurate, compared to expert human performance of 81%. But the human-AI collaboration reached 90% accuracy, better than either human or AI alone. This is an example of human-AI synergy: expert humans are better able to decide when to trust their own judgement versus the algorithm’s, which improves overall system performance.
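The two definitions can be checked mechanically against the accuracy figures quoted above. The following is a minimal sketch, not the researchers' own method; the function name `classify_outcome` is an assumption made for illustration. It reports the strongest label that applies, since any synergistic result is also an augmentation.

```python
def classify_outcome(human: float, ai: float, combined: float) -> str:
    """Label a human-AI result using the definitions in this section."""
    if combined > max(human, ai):
        return "synergy"       # beats both the human alone and the AI alone
    if combined > human:
        return "augmentation"  # beats the human alone, but not the AI alone
    return "neither"           # no better than the human working alone


# Accuracy figures quoted in this article.
print(classify_outcome(human=0.55, ai=0.73, combined=0.69))  # hotel reviews -> augmentation
print(classify_outcome(human=0.81, ai=0.73, combined=0.90))  # bird images -> synergy
```

Running the same check on your own pilot results is a quick way to see which of the two outcomes a rollout is actually delivering.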

Complementarity

Another view of synergy in human-AI collaboration comes from research on complementarity, a practical way to ensure that what the human brings and what the AI brings are meaningfully different and mutually enhancing.

According to the authors, it is useful to think of the distinct ways humans and AI approach decision-making, which result from two key asymmetries:

  1. Information Asymmetry: Often, AI and humans operate with different inputs. AI relies on a vast collection of digitized data. Humans, however, draw on a broader, richer context that includes non-digitized real-world knowledge. For example, an AI might accurately diagnose from a scan, but a human doctor also factors in the patient’s demeanor, additional symptoms, or prior history. This holistic view gives the human a distinct informational advantage in complex situations.
  2. Capability Asymmetry: Even given the exact same information, the processing methods differ. AI models infer patterns from vast datasets while humans use more flexible mental models to build an intuitive understanding of the world. This allows humans to learn rapidly, often after only a few trials, and accumulate lifelong experiences. On the other hand, AI can instantly digest massive information and detect tiny, subtle variations in data that would be imperceptible to a human. These differences give each side distinct capabilities.

Where teams stumble is when these asymmetries are flattened. If your process gives humans and models the same inputs and asks them to do the same step, one of them is redundant. If, instead, you assign different roles and design a clean way for their contributions to combine, the whole becomes greater than the sum of its parts.

When, and When Not, To Use AI

Rethinking the architecture of modern work to integrate human and artificial intelligence demands a careful, nuanced approach. The success of this collaboration hinges on a thoughtful consideration of two critical factors: the inherent nature of the task and the complementary strengths of the human and AI partners.

Task Type

The type of work at hand fundamentally dictates the potential for synergy. For example, the MIT meta-analysis found that Innovative Tasks are the “sweet spot” for maximum impact. These are characterized by open-ended goals and constraints that evolve in real time, through iteration and exploration. Here, humans have the natural ability to think in non-linear ways, associating unrelated concepts to forge novel and meaningful connections. For such tasks, AI contributes a vast informational landscape from which these connections can be drawn, leading to synergy.

However, for Decision Tasks that primarily require evaluation or judgement, the story is not as straightforward. In some cases, the human-AI collaboration can perform worse than either human or AI alone. It depends on what the task is and how it’s split between AI and humans, as discussed next.

Task Separation

Underlying all successful partnerships in general is the ability to leverage distinct and complementary strengths. Human-AI collaboration is no different in that way.

It sounds counterintuitive, but if the AI’s initial performance is overwhelmingly superior, the overall human-AI system may actually perform worse than the AI alone, because the human is poorly placed to judge when to override it. The thoughtful approach, therefore, is to precisely restrict the AI’s role to only those sub-tasks where it has a clear advantage.

Conversely, when the human is the stronger initial decision-maker, the partnership tends toward greater success. As the expert, the human is better positioned to critically assess the AI’s input and selectively integrate it into the process, in a synergistic fashion.
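One way to picture this separation rule is as a simple routing decision per sub-task. The sketch below is illustrative only: the function `assign_owner`, the 10-point `margin`, and the sub-task names and accuracy figures are all made-up assumptions, not values from the studies discussed here.

```python
def assign_owner(human_acc: float, ai_acc: float, margin: float = 0.10) -> str:
    """Hand a sub-task to the AI only when its lead over the human exceeds `margin`."""
    if ai_acc - human_acc > margin:
        return "ai"                   # clear AI advantage: let the AI own this step
    return "human_with_ai_input"      # otherwise the human decides, using AI as input


# Hypothetical sub-tasks with assumed (human, AI) accuracies, for illustration only.
subtasks = {
    "flag duplicate records": (0.70, 0.95),
    "judge borderline escalations": (0.85, 0.75),
}

for name, (human_acc, ai_acc) in subtasks.items():
    print(f"{name}: {assign_owner(human_acc, ai_acc)}")
```

The margin is the design choice worth debating: set it too low and humans end up second-guessing a far stronger model, set it too high and the AI never gets to own the steps where it clearly excels.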

Ultimately, effective human-AI collaboration is not about replacing one with the other; it is a delicate exercise in defining boundaries, recognizing unique excellence, and ensuring that the final output is greater than the sum of its different parts.

Conclusion

Simply introducing AI into a workflow is not a prescription for performance improvement and can, in fact, lead to an expensive underperformance. Achieving true synergy—where the human-AI system surpasses both components working alone—requires intentional design built on complementarity. 

A crucial lesson is that human expertise matters. Experts are better positioned to critically assess and leverage AI input, transforming a simple augmentation into a synergistic gain. This is particularly relevant in the ‘sweet spot’ of Innovative Tasks, where human creative thinking abilities and an AI’s vast informational landscape can combine to give surprising creative breakthroughs.

Ultimately, the future of work hinges on recognizing and embracing these boundaries, and on accepting that AI may not be suitable for every kind of task. Instead of replacing humans, the goal is to define complementary roles that exploit the inherent asymmetries in information and capability. By expertly differentiating tasks, organizations can move past the common trap of underperformance toward a future of genuine human-AI synergy.

Boosting AI’s Intelligence with Metacognitive Primitives

Over the past year or so, AI experts, like Ilya Sutskever in his NeurIPS 2024 talk, have been raising concerns that AI reasoning might be hitting a wall. It seems that simply throwing more data and computing power at the problem is giving us less and less in return, and models are struggling with complex thinking tasks. Maybe it’s time to explore other facets of human reasoning and intelligence, rather than just relying on sheer computational force.

At its core, a key part of human intelligence is our ability to pick out just the right information from our memories to help us solve the problem at hand. For instance, imagine a toddler seeing a puppy in a park. If they’ve never encountered a puppy before, they might feel a bit scared or unsure. But if they’ve seen their friend playing with their puppy, or watched their neighbors’ dogs, they can draw on those experiences and decide to go ahead and pet the new puppy. As we get older, we start doing this for much more intricate situations – we take ideas from one area and apply them to another when the patterns fit. In essence, we have a vast collection of knowledge (made up of information and experiences), and to solve a problem, we first need to identify the useful subset of that knowledge.

Think of current large language models (LLMs) as having absorbed the entire knowledge base of human-created artifacts: text, images, code, and even elements of audio and video through transcripts. Because they’re essentially predictive engines trained to forecast the next word or “token,” their reasoning comes from the statistical structures within the data rather than from deliberate thought. What has been truly remarkable is how much basic reasoning this extensive “knowledge layer” exhibits through statistical prediction alone.

Beyond this statistical stage of reasoning, prompting techniques, like assigning a specific role to the LLM, improve reasoning abilities even more. Intuitively speaking, they work because they help the LLM focus on the more relevant parts of its network or data, which in turn enhances the quality of the information it uses. More advanced strategies, such as Chain-of-Thought or Tree-of-Thoughts prompting, mirror human reasoning by guiding the LLM to use a more structured, multi-step approach to traverse its knowledge bank in more efficient ways. One way to think about these strategies is as higher-level approaches that dictate how to proceed. A fitting name for this level might be the Executive Strategy Layer – this is where the planning, exploration, self-checking, and control policies reside, much like the executive network in human brains.

However, it seems current research might be missing another layer: a middle layer of metacognitive primitives. Think of these as simple, reusable patterns of thought that can be called upon and combined to boost reasoning, no matter the topic. You could imagine it this way: while the executive strategy layer helps an AI break down a task into smaller steps, the metacognitive primitive layer makes sure each of those mini-steps is solved in the smartest way possible. This layer might involve asking the AI to find similarities or differences between two ideas, move between different levels of abstraction, connect distant concepts, or even look for counter-examples. These strategies go beyond just statistical prediction and offer new ways of thinking that act as building blocks for more complex reasoning. It’s quite likely that building this layer of thinking will significantly improve what the Executive Strategy Layer can achieve.

To understand what these core metacognitive ideas might look like, it’s helpful to consider how we teach human intelligence. In schools, we don’t just teach facts; we also help students develop ways of thinking that they can use across many different subjects. For instance, Bloom’s revised taxonomy outlines levels of thinking, from simply remembering and understanding, all the way up to analyzing, evaluating, and creating. Similarly, Sternberg’s theory of successful intelligence combines analytical, creative, and practical abilities. Within each of these categories, there are simpler thought patterns. For example, smaller cognitive actions like “compare and contrast,” “change the level of abstraction,” or “find an analogy” play an important role in analytical and creative thinking.

The exact position of these thought patterns in a taxonomy is less important than making sure learners acquire these modes of thinking and can combine them in adaptable ways.

As an example, one primitive that is central to creative thinking is associative thinking — connecting two distant or unrelated concepts. In a study last year, we showed that by simply asking an LLM to incorporate a random concept, we could measurably increase the originality of its outputs across tasks like product design, storytelling, and marketing. In other words, by turning on a single primitive, we can actually change the kinds of ideas the model explores and make it more creative. We can make a similar argument for compare–contrast as a primitive that works across different subjects: by looking at important aspects and finding “surprising similarities or differences,” we might get better, more reasoned responses. As we standardize these kinds of primitives, we can combine them within higher-order strategies to achieve reasoning that is both more reliable and easier to understand.
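To make the idea of standardized, combinable primitives concrete, here is a minimal sketch in which each primitive is a small function that wraps a task prompt. The function names, the prompt wording, and the sample concepts are assumptions for illustration; they are not the prompts used in the study mentioned above, and no particular LLM API is assumed.

```python
import random


def associate(task: str, concepts: list[str], rng: random.Random) -> str:
    """Associative-thinking primitive: inject one randomly chosen, unrelated concept."""
    concept = rng.choice(concepts)
    return f"{task}\nIncorporate this unrelated concept into your answer: {concept}."


def compare_contrast(task: str, a: str, b: str) -> str:
    """Compare-contrast primitive: ask for similarities and differences before answering."""
    return (
        f"{task}\nFirst list surprising similarities and differences "
        f"between {a} and {b}, then answer."
    )


# Primitives compose: the output of one is a valid input to the next.
rng = random.Random(0)  # seeded so a run is repeatable
prompt = associate("Design a new water bottle.", ["lighthouse", "origami", "jazz"], rng)
prompt = compare_contrast(prompt, "a water bottle", "a thermos")
print(prompt)  # this string would then be sent to whichever model you use
```

Because every primitive takes a prompt and returns a prompt, a higher-order strategy can sequence them freely, which is the property that makes them reusable across topics.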

In summary, giving today’s AI systems a metacognitive-primitives layer—positioned between the knowledge base and the Executive Strategy Layer—might provide a practical way to achieve stronger reasoning. The knowledge layer provides the content; the primitives layer supplies the cognitive moves; and the executive layer plans, sequences, and monitors those moves. This three-part structure mirrors how human expertise develops: it’s not just about knowing more, or only planning better, but about having the right units of thought to analyze, evaluate, and create across various situations. If we give LLMs explicit access to these units, we can expect improvements in their ability to generalize, self-correct, be creative, and be more transparent, moving them from simply predicting text toward truly adaptive intelligence.