The “Tiny Team” Organization Is Here and It’s Redrawing the Management Map

In the past year or two, the business world has felt more like a bumpy ride than a smooth “transformation.” Employees are dealing with a lot of uncertainty—roles changing or getting eliminated entirely, teams shuffling, and rules shifting mid-game. But leaders aren’t operating from a crystal-clear blueprint either. Many are making big cuts, not just because AI speeds things up, but because they honestly can’t see what the company should look like long-term. So, they reduce costs and complexity first, then plan to rebuild smarter.

The tricky part is that AI isn’t a neat replacement for people or their jobs. It absolutely makes many tasks faster, but it also creates entirely new work. Think about customer support: many companies use chatbots to handle volume, but now someone has to watch performance, check logs to find problems, fine-tune the prompts and rules, and constantly improve the system. The work doesn’t disappear; it just moves and changes form.

Still, one advantage is unmistakable. For decades, the biggest hidden cost in any company was the Coordination Tax. You know the drill: an engineer builds a feature, then hands it to a product manager, who syncs with marketing, who waits for a business dev lead to find a partner. Every handoff is a friction point, and every meeting is a tax on productivity. Now a single "pod" of 5 to 10 people, with a mix of engineering, design, and growth talent, can execute faster than a 50-person department.

2026 looks like the year that some of these trends become normal: smaller, multidisciplinary, autonomous feature teams, and flatter organizations where the middle-management role evolves from “traffic cop” to “coach + systems designer.”

Trend 1: Smaller, multidisciplinary “feature teams” become the default

As AI lifts individual capability, it becomes feasible to staff product work like a small startup.

Instead of a large, functionally segmented machine, you get teams of 5–10 people who can handle a feature from the initial idea all the way through building, shipping, and improving it. Speed shoots up because there is less coordination and fewer approvals needed. Quality rises because feedback loops tighten.

We can already see executives publicly describing this “tiny team” dynamic.

  • Mark Zuckerberg recently noted that AI lets “a single very talented person” tackle projects that used to need “big teams,” and he’s actively pushing to “flatten teams.”
  • Tobias Lütke, Shopify’s CEO, gave his teams a powerful signal: before asking for more people, you have to prove why you “cannot get what you want done using AI.” This directly asks them to think of AI as a part of their team that handles work autonomously.
  • Duolingo’s CEO, Luis von Ahn, sees AI as a “platform shift” and is adjusting things like hiring and performance. They’re also reducing contractor work where AI can step in—another clear move toward “smaller teams that deliver much more.”

The common thread in these examples isn’t just “AI is useful.” It’s that they are rethinking the basic rules of how the company operates. The organization shifts from shuffling work between departments to empowering small, focused teams to fully own their results.

Ever watched a two-person team crank out something amazing over a weekend and thought, “How did they move so fast?” Now, imagine that kind of efficiency happening across dozens of teams, each one fully supported by AI.

That’s the emerging design pattern.

Trend 2: Flatter orgs and middle management under pressure

Once small, autonomous teams are successful, a second-order effect kicks in: you just don't need as many layers of management to keep them in sync. This is the point where the discussion about "flattening" an organization becomes very real, and very uncomfortable.

Experts studying the workforce have been tracking "delayering" everywhere, not just in tech. Korn Ferry, for instance, talks about companies "thinning out their management midsections," pointing out that middle managers were a big part of 2024 layoffs, clear evidence that the traditional manager role is under structural pressure.

And major corporations are openly saying their restructures are about cutting bureaucracy. Amazon’s corporate layoffs, for instance, were reported as an effort to reduce organizational layers and operate more efficiently.

So, the pattern we should expect to continue seeing in 2026 isn't a world with "no managers." It's a world with fewer managers, and the ones who remain have a fundamentally different job.

Which brings up an important, yet simple, question: If AI reduces coordination work, what exactly should managers coordinate?

The manager’s role is evolving

In most modern orgs, managers have been doing (at least) three different things:

  1. People development: coaching, feedback, growth, hiring, conflict navigation, culture
  2. Technical/project leadership: running execution, reviewing work, unblocking tasks, prioritizing
  3. Systems and strategy: setting guardrails, aligning across teams, shaping operating systems, long-term planning

AI and autonomous feature teams change the distribution of these responsibilities.

1) People development becomes more important, not less

As teams become more independent and things change faster, people really need grounding: clear direction, a safe space to work, ways to grow, and honest feedback. AI can help write a review, but it can't handle the human side of building trust, shaping identity, and finding meaning in work.

2) Day-to-day technical leadership moves closer to the team

Here’s where the scope for many middle managers gets a little smaller. When you have a focused feature team, the day-to-day execution leadership often happens right within the group—think a senior engineer, the product and design leads, and a shared AI process. The manager’s job shifts away from being the daily air traffic controller.

3) Systems-level strategy becomes the manager’s differentiator

As pods proliferate, someone must design the system those pods operate within: what cadence they run on, what the quality rules are, how risk is controlled, what the portfolio priorities are, and how these teams talk to each other.

This is exactly the direction highlighted by McKinsey & Company in its writing on “agentic” organizations: we’ll see more M-shaped supervisors (broad generalists orchestrating agents and hybrid work across domains) alongside T-shaped experts (deep specialists safeguarding quality and exceptions). 


As agents take on more execution, managers are freed up from admin tasks. Their focus is shifting toward leading people and orchestrating these blended systems. In other words, the ideal talent profile is changing. It’s less about being “the smartest technical person in the room” and much more about emotional intelligence and the ability to think strategically and connect the dots.

The emerging operating model

So what does this look like in practice? Expect more organizations to formalize patterns like:

Autonomous Feature Pods: Small teams of about five to ten people, each with a crystal-clear mission, like “improve customer sign-ups.” These teams are accountable for everything, from building to shipping to measuring success. They have all the necessary skills embedded right in the pod—product, design, engineering, and data. What makes them so fast? They use AI agents as a built-in assistant for everything from research and drafting to testing and analysis.

Thinner Management Layers: This shift also means a leaner leadership structure. You’ll see fewer layers of management. Instead, managers will take on a wider scope, focusing less on directing tasks and more on coaching and ensuring the system is running smoothly. We’ll also see more senior, non-managerial roles, like staff or principal engineers, who provide technical leadership without adding more hierarchy.

Guardrails over Gates: Finally, the way work is governed is changing from “approval-heavy” to “principle-driven.” Instead of waiting for sign-offs on every step, teams operate within clear “guardrails”—security protocols, data policies, quality standards, and ethical rules. This allows teams to move much faster and ship products without constant delays.

The trend for the rest of 2026 is clear: Organizations will continue to shrink in headcount but explode in impact. We are moving away from the “industrial” model of management, where people were cogs in a machine, toward a “biological” model, where small, autonomous cells work together to create a living, breathing, and highly adaptive organism.

Beyond the Automation Trap: Why AI Needs Values

In 1997, after Garry Kasparov lost his historic chess match to IBM's Deep Blue, he didn't just walk away or rail against the machine. Instead, he started a new kind of competition called "Advanced Chess." In these matches, a human player and a computer worked together as a single team known as a "Centaur."

What happened next was quite unexpected. Amateur players with midrange computers often beat both grandmasters and higher-end chess computers. Their edge was process: they knew when to listen to the machine and when to override it. They used the computer to explore possibilities, but they used their human judgment to make the final call.

In other words, the most powerful force wasn’t the smartest machine but the best collaboration.

Today, we are at a similar crossroads with Artificial Intelligence. We’ve built the machines, but we haven’t quite figured out how to be Centaurs. And that might be why AI adoption is stalling.

The Diffusion Mystery

If you look at the headlines, AI is taking over the world. But if you look at the data, the picture is more complicated.

Everett Rogers, the legendary sociologist who gave us the "Diffusion of Innovations" theory, taught us that technology doesn't spread just because it's better. It spreads because it fits into our lives, our norms, and our trust networks. Right now, AI has a fit problem.

According to McKinsey’s 2025 global research, while almost every company is experimenting with AI, very few have successfully scaled it. The problem might be the kinds of problems we are trying to solve with AI. It’s not that the technology is too complex; it’s that we’re trying to use a “tame” solution for a “wicked” world.

Tame Tasks vs. Wicked Problems

In the 1970s, design theorists Horst Rittel and Melvin Webber identified two types of challenges:

  1. Tame Problems: These have a clear goal and a clear stopping rule. Think of a puzzle or a math equation. Coding is often a tame problem. You write the script, you run the test, and it either works or it doesn’t. This is why AI adoption has worked quite well for developers.
  2. Wicked Problems: These are messy. They have no clear definition and no right answer, only “better” or “worse” ones. Moreover, every time you try to solve a wicked problem, the problem changes. Think of education, healthcare, or leading a team.

When we try to use AI to solve a wicked problem through pure automation, we fail because wicked problems require judgment, and good judgment requires something else.

Turbulent Fields

Systems theorist Eric Trist called the environment we live in today a “turbulent field.” Imagine trying to play a game of soccer, but the grass is moving, the goals are shifting, and the other team keeps changing the rules. That’s turbulence. And turbulence creates wicked problems. 

In a stable world, you can rely on data and optimization. But in a turbulent world, more data often leads to more confusion. Instead of more data, you need a North Star that reduces the number of variables you have to optimize for. Trist argued that values are effective North Stars for navigating such complex problems: they clarify direction by eliminating options that don’t fit within them.

This might be one reason why solving problems with AI is so challenging. Without clearly defined values, AI becomes a black box that’s hard to trust.

Designing with Values

If we want AI to actually work for us, we have to stop designing for automation and start designing for human flourishing.

This brings us to one of the most important frameworks in social science, and one I have seen work in practice: Self-Determination Theory (SDT). For people to be at their best, they need three things:

  • Autonomy: The desire to be the author of our work and lives.
  • Mastery (or Competence): The urge to learn new things and get better at skills that matter.
  • Purpose (or Relatedness): The yearning to do what we do in the service of something larger than ourselves.

The “Automation Trap” kills all three. If an AI writes your entire report, you lose your autonomy (you’re just a spectator). You lose your mastery (your skills begin to atrophy). And eventually, you lose your sense of purpose.

This is the “Irony of Automation.” As researcher Lisanne Bainbridge pointed out, the more we automate, the more we rely on humans to handle the rare, high-stakes crises. But if the human has been sidelined by the automation, they no longer have the skills to save the day when the machine fails.

Nowhere is this tension clearer than in the classroom. If a student uses AI to generate an essay, the task is finished, but the learning never happened.

Learning requires productive struggle. Elizabeth and Robert Bjork’s research on “desirable difficulties” shows that we learn best when the process feels a little bit hard. When we remove the struggle, we remove the growth.

If we want AI to diffuse in education, and for that matter, in any knowledge-work field, we have to move from “Answer Engines” to “Thought Partners.”

A New Blueprint for the AI Collaborator

So, what does a value-driven, human-centered AI look like? It follows a different set of design principles:

1. Values Over Vibes

Wicked problems are resolved by making choices based on what we value most. An AI collaborator shouldn’t hide these choices. It should surface them. Instead of saying “Here is the best strategy,” it should say “If you value speed, do X; if you value employee well-being, do Y.”

2. Design for Mastery

Success shouldn’t just be measured by task completion. It should be measured by capability gained. Does this AI help the user understand the problem better? Does it challenge their assumptions? A great AI should function like a coach, nudging the user to do their best thinking rather than doing the thinking for them.

3. Human Stewardship

In a turbulent field, the “correct” answer is often a conversation. AI can widen our options and test our scenarios, but humans must steward the meaning. We are the ones who decide which values are important and which trade-offs are worth making.

The Question for 2026

As we stare down 2026, we need to stop asking, “What can AI do?” and start asking, “What values should do the steering?”

For the last two years, we’ve been obsessed with technical possibilities. We’ve treated AI like a new engine and spent all our time seeing how fast it can go. But in a turbulent field, speed without a North Star is just a faster way to get lost. If we continue to design simply because a solution is possible, we will keep falling into the Automation Trap.

The truth is, technological possibility should never precede moral clarity. In the era of wicked problems, the right answer doesn’t exist in the data; it exists in our intentions. If we want to move from “Answer Engines” to true Centaur-style collaboration, we have to identify the values we are designing for before we write a single line of code.

The real lesson of Garry Kasparov’s Centaurs wasn’t that they had better computers. It was that they had a better process rooted in human judgment. In the long run, the real competitive advantage won’t be the machine’s speed. It will be our wisdom.

From Kitchen To Code: Lessons in Radical Innovation from El Bulli

When patrons were seated at El Bulli during its prime, the first thing they were served was an olive.

Or, at least, what looked like one. They would pick up the “olive” resting in a spoon and bite gently, only for it to collapse into a warm, intensely flavored liquid that instantly flooded the palate and then disappeared.

That small bite of the now legendary spherical olive was much more than a novelty. It was the successful outcome of countless experiments using a technique called spherification, which turns olive juice into delicate spheres using alginate and calcium salts. The El Bulli team had to tackle real-world challenges to create it: How do you make a membrane thin enough to melt in your mouth, but strong enough to survive being plated? How do you make the process reliable so you can repeat it hundreds of times a night? And how do you make sure every guest has the exact same moment of surprise with that very first bite?

It is a case study in radical innovation.

Everything the tech world struggles with—finding the sweet spot between creativity and discipline, quickly moving from idea to experiment, collaborating across different fields, and building teams focused on growth rather than just resumes—was being figured out in this remote kitchen on Spain’s Costa Brava.

So, here are four key lessons from El Bulli’s kitchen that translate well to today’s product and innovation teams.

1. Creativity is the core system, not a side project

El Bulli did something revolutionary: it closed for half the year, from roughly October to March. This allowed Ferran and Albert Adrià and their core team to focus entirely on creativity. They spent those months inventing new techniques, testing fresh ideas, and developing the next season’s tasting menu, which often featured 30 or more incredible courses.

By the time the restaurant closed its doors for good in 2011, the team had created about 1,800 dishes and pioneered game-changing techniques like spherification, foams, and warm gels that reshaped high-end cooking worldwide.

A few things stand out:

  • Dedicated time: Creativity wasn’t an afterthought, squeezed into weekends or the “10% time” left after service. Half the year was deliberately reserved for exploration.
  • Permission to break rules: Inside that creative window, the brief was to question everything. Dishes could be deconstructed, recomposed, or turned inside out. Tradition was a reference point, not a constraint. 
  • Discipline in service of magic: For all the experimentation, the final measure of success was simple: did it create a magical experience for the guest? El Bulli eliminated the à la carte menu so that every guest received a carefully choreographed tasting sequence, built from scratch each season around these new creations. 

This raises some important questions for tech companies: Is creativity truly built into your structure, or is it just something people are expected to do after the “real work” is done? How do you maintain a high standard that encourages teams to experiment but still ensures a compelling user experience at the end of the day?

2. Start from first principles: “What is a tomato?”

Ferran Adrià is famous not only for his bold dishes, but for the questions behind them. Again and again, he and his team would come back to deceptively simple prompts: What is a tomato? What is soup? What is a salad?

In interviews about his work, Adrià often challenges common assumptions. For example, he points out that the original “natural tomato” in the Andes was actually inedible. What we enjoy as a tomato today is the result of human intervention through breeding, selection, and cultivation. In other words, even the most ordinary ingredient is already a designed product.

This is what first-principles thinking looks like in action. Instead of treating the category of “tomato” as fixed, he breaks it down:

  • Where does it come from?
  • What is its essence—its acidity, sweetness, aroma, and texture?
  • Which aspects should stay unchanged, and which are negotiable?

This intellectual groundwork is what powered the deconstructionist cuisine that made El Bulli famous: taking a familiar dish, radically changing its form, texture, or temperature, but making sure its underlying essence remains intact.

In the tech world, we often talk about first principles, but in practice we work from mental templates: “It’s a CRM, so it needs to look like a sales funnel; it’s a learning platform, so it has to have modules and quizzes.”

The El Bulli approach would sound more like this for a product team:

  • What is a meeting? Is it really just a slot on a calendar, or is it a ritual for making decisions that could take on a completely different shape?
  • What is a classroom? Is it defined by a physical room and a timetable, or is it actually about a set of relationships and feedback loops that could be designed differently?

For Adrià’s team, these questions weren’t theoretical. They were directly linked to real kitchen experiments. When they clarified the true essence of a dish, it gave them permission to change everything else about it.

Product teams can adopt this powerful discipline, too: clearly define the non-negotiable essence of the user problem or the desired outcome. Once you have that clarity, you’re free to completely rethink the structure, the interface, or even the business model.

3. Keep the idea-to-experiment loop radically short

One of the most revealing aspects of El Bulli’s creative culture is not on the plate, but on paper.

Museums such as The Drawing Center in New York have mounted exhibitions called Ferran Adrià: Notes on Creativity, displaying his sketches, diagrams, and visual maps. These sketches offer a glimpse of how he thought, and they show drawing as a tool for thinking: it helped him externalise ideas quickly, organise knowledge, and communicate concepts to the team.

The creative process started with a rough sketch that captured the initial thought. That sketch led quickly to a simple kitchen prototype, which the team evaluated and iterated on until the idea was perfected.

What you do not see are lengthy slide decks, layers of approval, or months spent debating concepts before anyone picks up a pan. Instead, the kitchen becomes the thinking environment. The sketches exist to move an idea quickly from an internal hunch to a shared experiment, not to impress anyone in a meeting.

When we’re building products, we often do things backward. We spend weeks making perfect presentations about an idea before a single user ever sees it. As a result, we invest a lot of time justifying something that hasn’t been tested in the real world.

What if we took inspiration from the El Bulli approach to ask:

  • Could you sketch out a major idea in less than five minutes?
  • Could you build the very first version of a new concept in a day and get it in front of a real user within a week?
  • Could you reduce the number of sign-offs needed to run a small experiment?

This isn’t about being reckless. At El Bulli, the final menu was obsessively refined. But the path from idea to first test was deliberately short. And autonomy was real: talented people were trusted to try things without seeking permission for every iteration.

When teams make the idea → experiment → learning loop shorter, they tap into creative energy that can turn a crazy idea about a “spherical olive” into a world-famous dish.

4. Treat innovation as a team sport across disciplines

El Bulli’s breakthroughs were not only the work of a couple of geniuses.  Every dish came to life thanks to a whole team of experts: chefs, pastry specialists, food scientists, industrial designers, and even the folks running the front of the house.

Albert Adrià’s own journey illustrates this. He joined El Bulli as a teenager in 1985, spending his first two years rotating through all the stations in the kitchen before focusing on pastry. Over time he became head pastry chef and then director of elBullitaller, the Barcelona-based creative workshop that served as the restaurant’s R&D lab during the closed season. 

In interviews and profiles, he always emphasised that it was the team, not individual brilliance, that made El Bulli exceptional. The creative work depended on people who were:

  • Deeply curious and willing to learn fast.
  • Comfortable collaborating across roles rather than jealously guarding territory.
  • Motivated by the shared goal of creating something extraordinary for the guest, rather than building personal fame. 

The spherical olive itself was a multidisciplinary artefact. It required understanding the chemistry of alginate and calcium, mastery of textures and temperatures, and careful design of the serving ritual so that each guest ate it in a single bite at the right moment. 

For business leaders, there are two intertwined lessons here.

Multidisciplinary structures

Radical ideas often sit at the intersection of fields, yet many organisations still arrange teams in narrow silos.

El Bulli suggests a different model for breakthrough ideas: Instead of keeping teams separated in silos, bring diverse perspectives together. In practice it would mean getting designers, engineers, data analysts, and subject-matter experts collaborating side-by-side in small, cross-functional “innovation pods.” This way, everyone can see and shape the idea from the very start, using shared visual tools like maps and sketches. It’s about co-creating, not just passing a task down a line.

Hiring for growth mindset, not just pedigree

Albert did not arrive at El Bulli with great credentials. He came as a 16-year-old apprentice and grew into one of the most influential creative forces in modern pastry, precisely because he was willing to experiment relentlessly and learn from others. 

Translating that mindset into hiring means asking:

  • Does this person show evidence of rapid learning across domains, or only depth in one?
  • Do they light up when they talk about collaboration, or only when they describe solo achievements?
  • Are they comfortable with ambiguity and experimentation, or do they need everything defined upfront?

In a world where the most interesting problems are inherently multidisciplinary, like climate tech, the future of learning, and human–AI collaboration, it is often more valuable to hire people who can grow into the unknown than those who perfectly match yesterday’s job description.

Bringing El Bulli’s lessons into your organisation

For all its mystique, El Bulli was, at heart, a working laboratory. It dealt with constraints familiar to any leader: limited time, finite resources, high expectations, and the pressure to keep surprising a demanding audience.

Its response was not to work harder in the same way, but to redesign the system around creativity:

  • Carve out time for exploration.
  • Ask first-principles questions again and again.
  • Move ideas quickly from conception to experiment.
  • Allow multi-disciplinary teams to work closely.
  • Hire people for their capacity to learn and collaborate.

These principles apply just as well to companies that aspire to innovate. When you adopt them, you begin to treat creativity not as a garnish, but as the main ingredient: tempered by discipline, grounded in first principles, and always aimed at giving the people you serve a truly memorable experience.

Image credit: The Drawing Center

Designing Human-AI Workflows for Synergy

A sobering meta-analysis reveals a counterintuitive truth: most human-AI collaborations actually underperform compared to either the human or the AI working alone. Consider a study on fake hotel review detection: the AI achieved 73% accuracy, the human 55%, yet the combined system managed only 69%.

This raises a crucial question: How do we architect human-AI collaborations that truly elevate performance?

If you’re leading an AI rollout, that question is more than academic. It determines whether your investment produces step-change performance or an expensive stalemate. Simply placing people and AI in the same workflow does not guarantee better results. What matters is what they do together, and how intentionally you design the collaboration.

Synergy vs. Augmentation

The researchers in the study above investigated two desirable outcomes: synergy and augmentation. Synergy represents the ideal state, where the combined human-AI performance surpasses both the human alone and the AI alone, mirroring “strong synergy” found in purely human groups. Augmentation is a more modest goal and simply means the human-AI system performs better than the human alone.

A common implicit assumption is that the combined system must be better than either component, but the reality is often complicated by human behavioral pitfalls. 

Humans frequently struggle to find the right balance of trust: they either over-rely on AI and blindly accept its suggestions, or under-rely and prematurely dismiss valuable AI input. For example, in the fake hotel review study, because the humans were not as good at the task as the AI, they made poor judges of the AI’s recommendations, leading to a sub-par outcome. So while AI augmented human accuracy (from 55% to 69%), the combined system was still less accurate than the AI alone (73%).

On the other hand, in a study on bird image classification, the AI was only 73% accurate, compared to expert human performance of 81%. Yet the human-AI collaboration reached 90% accuracy, better than either the human or the AI alone. This is an example of human-AI synergy: expert humans were better able to decide when to trust their own judgement versus the algorithm’s, improving the overall system’s performance.
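
To keep the two outcomes straight, here is a minimal sketch in Python that labels a human-AI pairing from three accuracy figures, using the numbers reported in the two studies above. The function name and labels are illustrative assumptions, not part of either study.

```python
def classify_outcome(human_acc: float, ai_acc: float, combined_acc: float) -> str:
    """Label a human-AI pairing using the synergy/augmentation definitions above."""
    if combined_acc > max(human_acc, ai_acc):
        return "synergy"            # beats both the human alone and the AI alone
    if combined_acc > human_acc:
        return "augmentation only"  # beats the human alone, but not the AI alone
    return "underperformance"       # no gain over the human working alone


# Figures reported in the two studies discussed above
print(classify_outcome(human_acc=0.55, ai_acc=0.73, combined_acc=0.69))  # fake-review detection: "augmentation only"
print(classify_outcome(human_acc=0.81, ai_acc=0.73, combined_acc=0.90))  # bird classification: "synergy"
```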

Complementarity

Another view of synergy in human-AI collaboration comes from research on complementarity, a practical way to ensure that what the human brings and what the AI brings are meaningfully different and mutually enhancing.

According to this research, it is useful to think about the distinct ways humans and AI approach decision-making, which stem from two key asymmetries:

  1. Information Asymmetry: Often, AI and humans operate with different inputs. AI relies on a vast collection of digitized data. Humans, however, draw on a broader, richer context that includes non-digitized real-world knowledge. For example, an AI might accurately diagnose from a scan, but a human doctor also factors in the patient’s demeanor, additional symptoms, or prior history. This holistic view gives the human a distinct informational advantage in complex situations.
  2. Capability Asymmetry: Even given the exact same information, the processing methods differ. AI models infer patterns from vast datasets, while humans use more flexible mental models to build an intuitive understanding of the world. This allows humans to learn rapidly, often after only a few trials, and to accumulate lifelong experience. AI, on the other hand, can instantly digest massive amounts of information and detect tiny, subtle variations in data that would be imperceptible to a human. These differences give each side distinct capabilities.

Where teams stumble is when these asymmetries are flattened. If your process gives humans and models the same inputs and asks them to do the same step, one of them is redundant. If, instead, you assign different roles and design a clean way for their contributions to combine, the whole becomes greater than the sum of its parts.

When, and When Not, To Use AI

Rethinking the architecture of modern work to integrate human and artificial intelligence demands a careful, nuanced approach. The success of this collaboration hinges on a thoughtful consideration of two critical factors: the inherent nature of the task and the complementary strengths of the human and AI partners.

Task Type

The type of work at hand fundamentally dictates the potential for synergy. For example, the MIT meta-analysis found that Innovative Tasks are the “sweet spot” for maximum impact. These are characterized by open-ended goals and constraints that evolve in real time, through iteration and exploration. Here, humans have a natural ability to think in non-linear ways, associating unrelated concepts to forge novel and meaningful connections, while AI supplies the vast informational landscape from which those connections can be drawn. That combination is what produces synergy.

However, for Decision Tasks that primarily require evaluation or judgement, the story is not as straightforward. In some cases, the human-AI collaboration can perform worse than either the human or the AI alone. The outcome depends on what the task is and how it is split between the AI and the human, as discussed next.

Task Separation

Underlying all successful partnerships is the ability to leverage distinct and complementary strengths, and human-AI collaboration is no different.

It sounds counterintuitive, but if the AI’s initial performance is overwhelmingly superior, the combined human-AI system may actually perform worse than the AI alone. The thoughtful approach, therefore, is to restrict the AI’s role to precisely those sub-tasks where it has a clear advantage.

Conversely, when the human is the stronger initial decision-maker, the partnership tends toward greater success. As the expert, the human is better positioned to critically assess the AI’s input and selectively integrate it into the process, in a synergistic fashion.
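
One way to make this task separation concrete is a simple assignment policy: give the final say on each sub-task to whichever party has a clear, measured advantage, and keep the other in an advisory or auditing role. The sketch below is only an illustration of that idea; the sub-task names, baseline accuracies, and margin are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SubTask:
    name: str
    human_acc: float  # measured accuracy of the human on this sub-task
    ai_acc: float     # measured accuracy of the AI on this sub-task


def assign_final_say(task: SubTask, margin: float = 0.05) -> str:
    """Route the final decision to whichever party shows a clear, measured advantage."""
    if task.ai_acc >= task.human_acc + margin:
        return "AI decides; human audits samples"
    if task.human_acc >= task.ai_acc + margin:
        return "human decides; AI advises"
    return "no clear advantage: redesign the split or escalate"


# Hypothetical sub-tasks seeded with the baselines reported earlier
print(assign_final_say(SubTask("fake-review triage", human_acc=0.55, ai_acc=0.73)))
print(assign_final_say(SubTask("rare-bird identification", human_acc=0.81, ai_acc=0.73)))
```

The specific threshold matters less than the discipline: the split is decided by evidence of comparative advantage, not by default.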

Ultimately, effective human-AI collaboration is not about replacing one with the other; it is a delicate exercise in defining boundaries, recognizing unique excellence, and ensuring that the final output is greater than the sum of its different parts.

Conclusion

Simply introducing AI into a workflow is not a prescription for performance improvement and can, in fact, lead to an expensive underperformance. Achieving true synergy—where the human-AI system surpasses both components working alone—requires intentional design built on complementarity. 

A crucial lesson is that human expertise matters. Experts are better positioned to critically assess and leverage AI input, transforming a simple augmentation into a synergistic gain. This is particularly relevant in the ‘sweet spot’ of Innovative Tasks, where human creative thinking abilities and an AI’s vast informational landscape can combine to give surprising creative breakthroughs.

Ultimately, the future of work hinges on recognizing and embracing these boundaries, and on accepting that AI may not be suitable for every kind of task. Instead of replacing humans, the goal is to define complementary roles that exploit the inherent asymmetries in information and capability. By thoughtfully differentiating tasks, organizations can move past the common trap of underperformance toward genuine human-AI synergy.

Boosting AI’s Intelligence with Metacognitive Primitives

Over the past year or so, AI experts, like Ilya Sutskever in his NeurIPS 2024 talk, have been raising concerns that AI reasoning might be hitting a wall. Simply throwing more data and computing power at the problem seems to be giving us less and less in return, and models are struggling with complex thinking tasks. Maybe it’s time to explore other facets of human reasoning and intelligence, rather than relying on sheer computational force alone.

At its core, a key part of human intelligence is our ability to pick out just the right information from our memories to help us solve the problem at hand. For instance, imagine a toddler seeing a puppy in a park. If they’ve never encountered a puppy before, they might feel a bit scared or unsure. But if they’ve seen their friend playing with their puppy, or watched their neighbors’ dogs, they can draw on those experiences and decide to go ahead and pet the new puppy. As we get older, we start doing this for much more intricate situations – we take ideas from one area and apply them to another when the patterns fit. In essence, we have a vast collection of knowledge (made up of information and experiences), and to solve a problem, we first need to identify the useful subset of that knowledge.

Think of current large language models (LLMs) as having absorbed the entire knowledge base of human-created artifacts – text, images, code, and even elements of audio and video through transcripts. Because they’re essentially predictive engines trained to forecast the next word or “token,” they exhibit a basic level of reasoning that comes from the statistical structures within the data, rather than deliberate thought. What has been truly remarkable about LLMs is that this extensive “knowledge layer” is really good at exhibiting basic reasoning skills just by statistical prediction. 

Beyond this statistical stage of reasoning, prompting techniques, like assigning a specific role to the LLM, improve reasoning abilities even more. Intuitively speaking, they work because they help the LLM focus on the more relevant parts of its network or data, which in turn enhances the quality of the information it uses. More advanced strategies, such as Chain-of-Thought or Tree-of-Thoughts prompting, mirror human reasoning by guiding the LLM to use a more structured, multi-step approach to traverse its knowledge bank in more efficient ways. One way to think about these strategies is as higher-level approaches that dictate how to proceed. A fitting name for this level might be the Executive Strategy Layer – this is where the planning, exploration, self-checking, and control policies reside, much like the executive network in human brains.

However, it seems current research might be missing another layer: a middle layer of metacognitive primitives. Think of these as simple, reusable patterns of thought that can be called upon and combined to boost reasoning, no matter the topic. You could imagine it this way: while the executive strategy layer helps an AI break down a task into smaller steps, the metacognitive primitive layer makes sure each of those mini-steps is solved in the smartest way possible. This layer might involve asking the AI to find similarities or differences between two ideas, move between different levels of abstraction, connect distant concepts, or even look for counter-examples. These strategies go beyond just statistical prediction and offer new ways of thinking that act as building blocks for more complex reasoning. It’s quite likely that building this layer of thinking will significantly improve what the Executive Strategy Layer can achieve.

To understand what these core metacognitive ideas might look like, it’s helpful to consider how we teach human intelligence. In schools, we don’t just teach facts; we also help students develop ways of thinking that they can use across many different subjects. For instance, Bloom’s revised taxonomy outlines levels of thinking, from simply remembering and understanding, all the way up to analyzing, evaluating, and creating. Similarly, Sternberg’s theory of successful intelligence combines analytical, creative, and practical abilities. Within each of these categories, there are simpler thought patterns. For example, smaller cognitive actions like “compare and contrast,” “change the level of abstraction,” or “find an analogy” play an important role in analytical and creative thinking.

The exact position of these thought patterns in a taxonomy is less important than making sure learners acquire these modes of thinking and can combine them in adaptable ways.

As an example, one primitive that is central to creative thinking is associative thinking — connecting two distant or unrelated concepts. In a study last year, we showed that by simply asking an LLM to incorporate a random concept, we could measurably increase the originality of its outputs across tasks like product design, storytelling, and marketing. In other words, by turning on a single primitive, we can actually change the kinds of ideas the model explores and make it more creative. We can make a similar argument for compare–contrast as a primitive that works across different subjects: by looking at important aspects and finding “surprising similarities or differences,” we might get better, more reasoned responses. As we standardize these kinds of primitives, we can combine them within higher-order strategies to achieve reasoning that is both more reliable and easier to understand.
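
As a rough illustration of how such primitives might be packaged, here is a minimal Python sketch that expresses a few of them as reusable prompt fragments and composes them around a task, leaving the Executive Strategy Layer to decide which ones to invoke. The primitive names, templates, and concept pool are assumptions made for illustration, not a published implementation.

```python
import random

# A few metacognitive primitives expressed as reusable prompt fragments (illustrative only)
PRIMITIVES = {
    "associate": "Before answering, connect the task to this unrelated concept and mine it for ideas: {concept}.",
    "compare_contrast": "Identify the most surprising similarities and differences between the key options before deciding.",
    "abstraction_shift": "Restate the problem one level of abstraction higher, solve it there, then map the solution back down.",
    "counter_example": "Actively search for a counter-example that would break your current answer, then revise it.",
}


def apply_primitives(task_prompt: str, primitive_names: list[str], concept_pool: list[str]) -> str:
    """Assemble a prompt that layers the selected primitives on top of the base task.

    Choosing which primitives to apply, and in what order, is the job of the
    Executive Strategy Layer; this helper only builds the prompt text.
    """
    instructions = []
    for name in primitive_names:
        template = PRIMITIVES[name]
        if "{concept}" in template:
            template = template.format(concept=random.choice(concept_pool))
        instructions.append(template)
    return task_prompt + "\n\n" + "\n".join(f"- {line}" for line in instructions)


prompt = apply_primitives(
    task_prompt="Propose three product concepts for a reusable water bottle.",
    primitive_names=["associate", "counter_example"],
    concept_pool=["lighthouse", "origami", "coral reef"],  # random-concept injection, as in the originality study
)
print(prompt)  # send the assembled prompt to any LLM client
```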

In summary, giving today’s AI systems a metacognitive-primitives layer—positioned between the knowledge base and the Executive Strategy Layer—might provide a practical way to achieve stronger reasoning. The knowledge layer provides the content; the primitives layer supplies the cognitive moves; and the executive layer plans, sequences, and monitors those moves. This three-part structure mirrors how human expertise develops: it’s not just about knowing more, or only planning better, but about having the right units of thought to analyze, evaluate, and create across various situations. If we give LLMs explicit access to these units, we can expect improvements in their ability to generalize, self-correct, be creative, and be more transparent, moving them from simply predicting text toward truly adaptive intelligence.