Boosting AI’s Intelligence with Metacognitive Primitives

Over the past year or so, AI experts, like Ilya Sutskever in his NeurIPS 2024 talk, have been raising concerns that AI reasoning might be hitting a wall. It seems that simply throwing more data and computing power at the problem is yielding diminishing returns, and models are struggling with complex thinking tasks. Maybe it’s time to explore other facets of human reasoning and intelligence, rather than just relying on sheer computational force.

At its core, a key part of human intelligence is our ability to pick out just the right information from our memories to help us solve the problem at hand. For instance, imagine a toddler seeing a puppy in a park. If they’ve never encountered a puppy before, they might feel a bit scared or unsure. But if they’ve seen their friend playing with their puppy, or watched their neighbors’ dogs, they can draw on those experiences and decide to go ahead and pet the new puppy. As we get older, we start doing this for much more intricate situations – we take ideas from one area and apply them to another when the patterns fit. In essence, we have a vast collection of knowledge (made up of information and experiences), and to solve a problem, we first need to identify the useful subset of that knowledge.

Think of current large language models (LLMs) as having absorbed the entire knowledge base of human-created artifacts – text, images, code, and even elements of audio and video through transcripts. Because they’re essentially predictive engines trained to forecast the next word or “token,” they exhibit a basic level of reasoning that comes from the statistical structures within the data, rather than deliberate thought. What has been truly remarkable about LLMs is that this extensive “knowledge layer” is really good at exhibiting basic reasoning skills just by statistical prediction. 

Beyond this statistical stage of reasoning, prompting techniques, like assigning a specific role to the LLM, improve reasoning abilities even more. Intuitively speaking, they work because they help the LLM focus on the more relevant parts of its network or data, which in turn enhances the quality of the information it uses. More advanced strategies, such as Chain-of-Thought or Tree-of-Thoughts prompting, mirror human reasoning by guiding the LLM to use a more structured, multi-step approach to traverse its knowledge bank in more efficient ways. One way to think about these strategies is as higher-level approaches that dictate how to proceed. A fitting name for this level might be the Executive Strategy Layer – this is where the planning, exploration, self-checking, and control policies reside, much like the executive network in human brains.

However, it seems current research might be missing another layer: a middle layer of metacognitive primitives. Think of these as simple, reusable patterns of thought that can be called upon and combined to boost reasoning, no matter the topic. You could imagine it this way: while the executive strategy layer helps an AI break down a task into smaller steps, the metacognitive primitive layer makes sure each of those mini-steps is solved in the smartest way possible. This layer might involve asking the AI to find similarities or differences between two ideas, move between different levels of abstraction, connect distant concepts, or even look for counter-examples. These strategies go beyond just statistical prediction and offer new ways of thinking that act as building blocks for more complex reasoning. It’s quite likely that building this layer of thinking will significantly improve what the Executive Strategy Layer can achieve.

To understand what these core metacognitive ideas might look like, it’s helpful to consider how we teach human intelligence. In schools, we don’t just teach facts; we also help students develop ways of thinking that they can use across many different subjects. For instance, Bloom’s revised taxonomy outlines levels of thinking, from simply remembering and understanding, all the way up to analyzing, evaluating, and creating. Similarly, Sternberg’s theory of successful intelligence combines analytical, creative, and practical abilities. Within each of these categories, there are simpler thought patterns. For example, smaller cognitive actions like “compare and contrast,” “change the level of abstraction,” or “find an analogy” play an important role in analytical and creative thinking.

The exact position of these thought patterns in a taxonomy is less important than making sure learners acquire these modes of thinking and can combine them in adaptable ways.

As an example, one primitive that is central to creative thinking is associative thinking — connecting two distant or unrelated concepts. In a study last year, we showed that by simply asking an LLM to incorporate a random concept, we could measurably increase the originality of its outputs across tasks like product design, storytelling, and marketing. In other words, by turning on a single primitive, we can actually change the kinds of ideas the model explores and make it more creative. We can make a similar argument for compare–contrast as a primitive that works across different subjects: by looking at important aspects and finding “surprising similarities or differences,” we might get better, more reasoned responses. As we standardize these kinds of primitives, we can combine them within higher-order strategies to achieve reasoning that is both more reliable and easier to understand.
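
One way to picture such primitives is as small, reusable prompt transformations that can be chained. The sketch below is only illustrative: the function names (associate, compare_contrast, compose), the concept pool, and the prompt wording are all assumptions made for this example, not the prompts used in the study, and no model is actually called.

```python
import random
from typing import Callable, Optional, Sequence

# A primitive takes a task description and returns an augmented prompt.
Primitive = Callable[[str], str]

# Hypothetical pool of unrelated concepts for the associative primitive.
CONCEPTS: Sequence[str] = ("lighthouse", "origami", "beehive", "subway map")


def associate(rng: Optional[random.Random] = None) -> Primitive:
    """Associative thinking: ask the model to weave in a random concept."""
    rng = rng or random.Random()

    def apply(task: str) -> str:
        concept = rng.choice(CONCEPTS)
        return f"{task}\nIncorporate this unrelated concept into your answer: {concept}."

    return apply


def compare_contrast(a: str, b: str) -> Primitive:
    """Compare-contrast: ask for surprising similarities and differences."""

    def apply(task: str) -> str:
        return (
            f"{task}\nFirst list surprising similarities and differences "
            f"between {a} and {b}, then use them in your answer."
        )

    return apply


def compose(*primitives: Primitive) -> Primitive:
    """Chain primitives so a higher-level strategy can combine them."""

    def apply(task: str) -> str:
        for primitive in primitives:
            task = primitive(task)
        return task

    return apply


if __name__ == "__main__":
    strategy = compose(
        compare_contrast("a backpack", "a suitcase"),
        associate(random.Random(0)),
    )
    print(strategy("Design a new travel bag for students."))
```

Because every primitive has the same shape (text in, text out), a higher-level strategy can switch them on or off, or reorder them, without knowing anything about the topic at hand.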

In summary, giving today’s AI systems a metacognitive-primitives layer—positioned between the knowledge base and the Executive Strategy Layer—might provide a practical way to achieve stronger reasoning. The knowledge layer provides the content; the primitives layer supplies the cognitive moves; and the executive layer plans, sequences, and monitors those moves. This three-part structure mirrors how human expertise develops: it’s not just about knowing more, or only planning better, but about having the right units of thought to analyze, evaluate, and create across various situations. If we give LLMs explicit access to these units, we can expect improvements in their ability to generalize, self-correct, be creative, and be more transparent, moving them from simply predicting text toward truly adaptive intelligence.
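
The three-part structure can be sketched in a few lines of code. This is a toy illustration under stated assumptions: knowledge_layer is a stub standing in for a real model call, and the primitive names and templates in PRIMITIVES are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


# Knowledge layer: a stand-in for an LLM call (hypothetical stub).
def knowledge_layer(prompt: str) -> str:
    return f"<answer to: {prompt}>"


# Primitives layer: named cognitive moves expressed as prompt templates.
PRIMITIVES: Dict[str, Callable[[str], str]] = {
    "abstract": lambda step: f"Restate at a higher level of abstraction: {step}",
    "counterexample": lambda step: f"Look for a counter-example to: {step}",
    "analogy": lambda step: f"Find an analogy from a distant field for: {step}",
}


# Executive layer: decides which move to apply to each sub-step, in order.
def executive_layer(plan: List[Tuple[str, str]]) -> List[str]:
    trace = []
    for step, move in plan:
        prompt = PRIMITIVES[move](step)
        trace.append(knowledge_layer(prompt))
    return trace


if __name__ == "__main__":
    plan = [
        ("why the bridge design failed", "abstract"),
        ("the claim that heavier is always stronger", "counterexample"),
    ]
    for line in executive_layer(plan):
        print(line)
```

The trace returned by the executive layer records which move was applied at each step, which is one simple way the reasoning could be made easier to inspect.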

Labels and Fables: How Our Brains Learn

One of the most remarkable capabilities of the human brain is its ability to categorize objects, even those that have little visual resemblance to one another. Grouping visually similar objects, like different trees, into a category is the easier case, and it’s a skill that non-human animals also possess. For example, dogs show distinct behaviors in the presence of other dogs compared to their interactions with humans, demonstrating that they can differentiate the two even if they don’t have names for them.

A fascinating study explored whether infants are able to form categories for different-looking objects. Researchers presented ten-month-old infants with a variety of dissimilar objects, ranging from animal-like toys to cylinders adorned with colorful beads and rectangles covered in foam flowers, each accompanied by a made-up name like “wug” or “dak.” Despite the objects’ visual diversity, the infants demonstrated an ability to discern patterns. When presented with objects sharing the same made-up name, regardless of their appearance, infants expected a consistent sound. Conversely, objects with different names were expected to produce different sounds. This remarkable cognitive feat in infants highlights the ability of our brains to use words as a label to categorize objects and concepts beyond visual cues.

Our ability to use words as labels comes in very handy to progressively build more abstract concepts. We know that our brains look for certain patterns (that mimic a story structure) when deciding what information is useful to store in memory. Imagine that the brain is like a database table where each row captures a unique experience (let’s call it a fable). By adding additional labels to each row we make the database more powerful. 

As an example, let’s suppose that you read a story to your toddler every night before bed. This time you are reading “The Little Red Hen.” As you read the story, your child’s cortisol level rises a bit as she imagines the challenges that the Little Red Hen faces when no one helps her, and as the situation resolves she feels a sense of relief. This makes it an ideal learning unit to store in her database for future reference. The story ends with the morals of working hard and helping others, so she is now able to add these labels to this row in her database. As she reads more stories, she starts labeling more rows with words like “honesty” or “courage”, abstract concepts that have no basis in physical reality. Over time, with a sufficient number of examples in her database for each concept, she has an “understanding” of what that particular concept means. A few days later, when you are having a conversation with her at breakfast and the concept of “helping others” comes up, she can proudly rattle off the anecdote from the Little Red Hen.

In other words, attaching labels not only allowed her to build a sense of an abstract concept, it also made it more efficient for her brain to search for relevant examples in the database. The figure above shows a conceptual view, as a database table, of how we store useful information in our brains. The rows correspond to a unit of learning — a fable — that captures how a problem was solved in the past (through direct experience or vicariously). A problem doesn’t even have to be big – a simple gap in existing knowledge can trigger a feeling of discomfort that the brain then tries to plug. The columns in the table capture all the data that might be relevant to the situation including context, internal states and of course, labels. 
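
The table described above can be sketched as a small data structure. This is a toy model, not a claim about how memory is actually implemented: the names Fable and Memory, and the specific columns, are assumptions chosen to mirror the rows, columns, and labels in the text.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Fable:
    """One row: a remembered episode of how a problem was resolved."""
    summary: str
    context: str
    internal_state: str
    labels: Set[str] = field(default_factory=set)


class Memory:
    def __init__(self) -> None:
        self.rows: List[Fable] = []

    def store(self, fable: Fable) -> None:
        self.rows.append(fable)

    def label(self, summary: str, *labels: str) -> None:
        """Attach more labels to an existing row."""
        for row in self.rows:
            if row.summary == summary:
                row.labels.update(labels)

    def recall(self, label: str) -> List[str]:
        """Search by label instead of re-examining every detail of every row."""
        return [row.summary for row in self.rows if label in row.labels]


if __name__ == "__main__":
    memory = Memory()
    memory.store(Fable("The Little Red Hen", "bedtime story", "tension, then relief"))
    memory.label("The Little Red Hen", "working hard", "helping others")
    print(memory.recall("helping others"))  # ['The Little Red Hen']
```

The more rows that share a label such as “helping others”, the more examples a single recall returns, which is a rough analogue of building up a sense of an abstract concept.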

Labels also play a role in emotional regulation. When children are taught more nuanced emotional words, like “annoyed” or “irritated” instead of just “angry”, they have better emotional responses. Research shows that adolescents with low emotional granularity are more prone to mental health issues like depression. One possible reason is that when you have accurately labeled rows you are able to choose actions that are more appropriate for the situation. If you only have a single label “anger” then your brain might choose an action out of proportion for a situation that is merely annoying. 

At a fundamental level, barring any disability, we are very similar to each other: we have the same types of sensors, the same circuitry that allows us to predict incoming information, and the same mechanisms to create entries in the table. What makes us different from each other is simply our unique set of labels and fables.

The Science Behind Storytelling: Why Our Brains Crave Narratives

“Once upon a time…” These four words have captivated audiences for centuries, signaling the start of a story. But what is it about stories that so powerfully captures our attention and leaves a lasting impression? The answer may lie in the way our brains learn and process information.

How Our Brains Learn: A Baby’s Perspective

A baby is constantly facing an influx of sensory information that her still-developing brain isn’t capable of handling. So how does she make sense of all that information? She relies on her adult caretakers to help her understand what is important and what is not. An example can clarify how this learning process works.

  • Say you are going on a walk with your toddler and you see the neighbor’s cat. 
  • You excitedly point to the cat and, in the high-pitched and exaggerated voice that only parents use, say, “Oh look, a kitty cat!”
  • The high-pitched sound stands out from all the other sounds the baby is hearing. At the same time, her body releases chemicals like cortisol (to put her in an alert state) and noradrenaline (to focus attention).
  • You might then tell her how cute the cat looks and the cheery tone of your voice tells her that the cat is a “good” thing and not something to be afraid of. And simultaneously her body releases a bit of dopamine that signals relief. 

Her brain then captures all of the information related to this event — including context like the neighborhood, the name, the image and the emotional state — and stores it as a “searchable rule”. The next time she walks by the neighbor’s house, her brain pulls up this knowledge about the cat, and she gets excited to pet the cat. Suppose, at another time you happen to be on a hike and see a different cat. Now, the knowledge that your toddler has about cats doesn’t match perfectly – it’s a different location and a different type of cat. Depending on other existing bits of information (e.g. knowledge about aggressive animals in the wild), her brain might pick a different rule and suggest a more cautionary approach. 
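
The idea of picking the best-matching “searchable rule” can be illustrated with a toy lookup. Everything here is an assumption made for the example: the feature sets, the two stored rules, and the choice of simple feature overlap as the matching score.

```python
from typing import List, Set, Tuple

# Each stored rule pairs the context it was learned in with a suggested action.
Rule = Tuple[Set[str], str]

RULES: List[Rule] = [
    ({"cat", "neighborhood", "parent is cheerful"}, "approach and pet"),
    ({"wild animal", "trail", "unfamiliar"}, "keep a distance"),
]


def pick_rule(situation: Set[str], rules: List[Rule]) -> str:
    """Return the action of the rule sharing the most features with the situation."""
    best = max(rules, key=lambda rule: len(rule[0] & situation))
    return best[1]


if __name__ == "__main__":
    # Walking past the neighbor's house again.
    print(pick_rule({"cat", "neighborhood"}, RULES))  # approach and pet
    # A different cat, seen on a hike.
    print(pick_rule({"cat", "trail", "unfamiliar", "wild animal"}, RULES))  # keep a distance
```

In the second case the word “cat” still matches the first rule, but the other features of the situation overlap more with the cautionary rule, so that rule wins.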

The Story-Learning Connection

This learning process has striking similarities to how artificial intelligence (AI) is trained. Both require labeled data and multiple examples to generalize information. However, human brains have a unique ability to learn continuously by integrating discrete “units” of information into our existing knowledge base. Given what we now know about how our brains work, it seems likely that this unit of information corresponds to what lies between the cortisol and dopamine waves. The presence of this emotional signature tells the brain to take a snapshot of the moment and store it with additional metadata. This metadata, like the labels that we assign to this information (e.g. “cat”, “neighbor”, etc.), helps in searching this database of knowledge at a later time.

This also helps explain why we find stories so compelling. Stories are packaged perfectly in the form our brain needs to process a learning unit. The phrases “Once upon a time…” and “…and they lived happily ever after”, which map to the rise and fall of cortisol and dopamine, provide the ideal bookends for this learning unit.

Our affinity for the narrative form explains a lot about learning and how we make meaning. Here are three ways stories play a role for us in society:

  • Bedtime Stories: Bedtime stories, a tradition for many generations, are an ideal medium for communicating cultural values. Most folk tales don’t just tell a story but also explicitly call out a moral value, which is essentially a label for an abstract concept, at the end. When children hear different stories for the same moral they are able to build a deeper understanding of the moral concept and the different ways it can manifest. 
  • Pretend Play: When toddlers engage in pretend play they simulate novel scenarios with all the features of a story – setting, conflict, resolution. The simulation allows the child to vividly experience the emotions in the story and thereby learn from it. Engaging in pretend play with children is a great way for parents to recognize what learning their child is taking away from the situation and reframe it for them if needed.
  • Conspiracy Theories: Unfortunately, our learning mechanism can also be hacked in unhealthy ways. The narrative structure also explains why conspiracy theories, even though untrue and easily checked, are so effective. Most conspiracy theories start with an outrageous claim to grab attention, label the story with a moral value and suggest an action to resolve the situation. When delivered by someone you trust, which is how we started learning in the first place, the conspiracy theory is easily accepted and integrated into our knowledge base.

Conclusion: The Enduring Power of Storytelling

Stories are not just a form of entertainment; they are fundamental to how we learn, make sense of the world, and connect with others. We are not certain why stories are so powerful, but one possible explanation is that the narrative structure is recognized by our brain as a unit of learning allowing it to be integrated well into existing knowledge structures. By understanding the science behind storytelling, we can harness its power for education, communication, and personal growth. So, the next time you hear “Once upon a time…,” remember that you’re not just embarking on a journey of imagination, but also engaging in a deeply ingrained learning process that has shaped humanity for millennia.

Can AI Have Ethics?

Imagine finding yourself marooned on a deserted island with no other human beings around. You’re not struggling for survival—there’s plenty of food, water, and shelter. Your basic needs are met, and you are, in a sense, free to live out the rest of your days in comfort. Once you settle down and get comfortable, you start to think about all that you have learned since childhood about living a good, principled life. You think about moral values like “one should not steal” or “one should not lie to others” and then it suddenly dawns on you that these principles no longer make sense. What role do morals and ethics play when there is no one else around? 

This thought experiment reveals a profound truth: our moral values are simply social constructs designed to facilitate cooperation among individuals. Without the presence of others, the very fabric of ethical behavior begins to unravel.

This scenario leads us to a critical question in the debate on artificial intelligence: can AI have ethics?

Ethics as a Solution to Cooperation Problems

Human ethics have evolved primarily to solve the problem of cooperation within groups. When people live together, they need a system to guide their interactions to prevent conflicts and promote mutual benefit. This is where ethics come into play. Psychologists like Joshua Greene and Jonathan Haidt have extensively studied how ethical principles have emerged as solutions to the problems that arise from living in a society.

In his book Moral Tribes, Joshua Greene proposes that morality developed as a solution to the “Tragedy of the Commons,” a dilemma faced by all groups. Consider a tribe where people sustain themselves by gathering nuts, berries, and fish. If one person hoards more food than necessary, their family will thrive, even during harsh winters. However, food is a finite resource. The more one person takes, the less remains for others, potentially leading to the tribe’s collapse as members starve. Even if the hoarder’s family survives, the tribe members are likely to react negatively to such selfish behavior, resulting in serious consequences for the hoarder. This example illustrates the fundamental role of morality in ensuring the survival and well-being of the group.
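
The arithmetic behind the hoarding example can be made concrete with a very small simulation. This is a sketch with made-up numbers: the stock size, the fixed regrowth per season, and each member’s share are assumptions chosen only to show the effect, not figures from the book.

```python
from typing import List


def seasons_until_collapse(
    stock: int, regrowth: int, takes: List[int], max_seasons: int = 100
) -> int:
    """Count the seasons a shared food stock can feed everyone.

    Each season every member takes their share, then the stock regrows by a
    fixed amount. Returns max_seasons if the stock never runs out.
    """
    for season in range(max_seasons):
        demand = sum(takes)
        if demand > stock:
            return season
        stock = stock - demand + regrowth
    return max_seasons


if __name__ == "__main__":
    fair = [4] * 10               # ten members each take what they need
    with_hoarder = [4] * 9 + [14]  # one member takes ten extra units
    print(seasons_until_collapse(100, 40, fair))          # 100 (never collapses)
    print(seasons_until_collapse(100, 40, with_hoarder))  # 6
```

With fair shares, demand exactly matches regrowth and the commons lasts indefinitely. A single hoarder tips the balance, and the shared stock runs out within a handful of seasons for everyone, the hoarder included.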

Our innate ability to recognize and respond to certain behaviors forms the bedrock of morality. Greene defines morality as “a set of psychological adaptations that allow otherwise selfish individuals to reap the benefits of cooperation.” This perspective helps explain why diverse cultures, despite differences in geography and customs, have evolved strikingly similar core moral values. Principles like fairness, loyalty, and respect for authority are universally recognized, underscoring the fundamental role of cooperation in shaping human morality.

The Evolution of Moral Intuitions

Neuroscience has begun to uncover the biological mechanisms underlying our moral intuitions. These mechanisms are the result of evolutionary processes that have equipped us with the ability to navigate complex social environments. For instance, research has shown that humans are wired to find violence repulsive, a trait that discourages unnecessary harm to others. This aversion to violence is not just a social construct but a deeply ingrained biological response that has helped our species survive by fostering cooperation rather than conflict.

Similarly, humans are naturally inclined to appreciate generosity and fairness. Studies have shown that witnessing acts of generosity activates the reward centers in our brains, reinforcing behaviors that promote social bonds. Fairness, too, is something we are biologically attuned to; when we perceive fairness, our brains release chemicals like oxytocin that enhance trust and cooperation. These responses have been crucial in creating societies where individuals can work together for the common good.

The Limits of AI in Understanding Morality

Now, let’s contrast this with artificial intelligence. AI, by its very nature, does not face the same cooperation problems that humans do. It does not live in a society, it does not have evolutionary pressures, and it does not have a biological basis for moral intuition. AI can be programmed to recognize patterns in data that resemble ethical behavior, but it cannot “understand” morality in the way humans do.

To ask whether AI can have ethics is to misunderstand the nature of ethics itself. Ethics, for humans, is deeply rooted in our evolutionary history, our biology, and our need to cooperate. AI, on the other hand, is a tool—an extremely powerful one—but it does not possess a moral compass. It knows about human moral values strictly from a knowledge perspective, but it’s unlikely to ever create these concepts internally by itself simply because AI has no need to cooperate with others. 

The Implications of AI in Moral Decision-Making

The fact that AI cannot possess ethics in the same way humans do has profound implications for its use in solving human problems, especially those that involve moral issues. When we deploy AI in areas like criminal justice, healthcare, or autonomous driving, we are essentially asking a tool to make decisions that could have significant ethical consequences.

This does not imply that AI should be excluded from these domains. However, we must acknowledge AI’s limitations in moral decision-making. While AI can contribute to more consistent and data-driven decisions, it lacks the nuanced understanding inherent in human morality. It can inadvertently perpetuate existing biases present in training datasets, leading to outcomes that are less than ethical. Moreover, an overreliance on AI for ethical decision-making can hinder our own moral development. Morality is not static; it evolves within individuals and societies.  Without individuals actively challenging prevailing norms and beliefs, many of the freedoms we cherish today would not have been realized.

Conclusion

Ultimately, the question of whether AI can have ethics is not just meaningless; it is the wrong question to ask. AI does not have the capacity for moral reasoning because it does not share the evolutionary, biological, and social foundations that underlie human ethics. Instead of asking if AI can be ethical, we should focus on how we can design and use AI in ways that align with human values.

As we continue to integrate AI into various aspects of society, the role of humans in guiding its development becomes more critical. We must ensure that AI is used to complement human judgment rather than replace it, especially in areas where ethical considerations are paramount. By doing so, we can harness the power of AI while maintaining the moral integrity that defines us as human beings.

Why Schools Shouldn’t Teach AI

One of the earliest technologies that gained wide user adoption was the calculator. Pocket-sized versions of the calculator became available in the 1970s, and it didn’t take long before people started wondering whether children should just learn to use the calculator instead of learning mental mathematics.

Thankfully, the educational system continued to teach basic arithmetic to young students for a very good reason. Students don’t just need to understand the concept of addition or multiplication. They need to practice arithmetic over and over again in order to build and strengthen the right neuronal connections for computational fluency. High computational fluency is correlated with better performance in more advanced math, so doing these drills in the early years sets the foundation on which more complex mathematical thinking can be built.

With the coming of AI, especially Large Language Models (LLMs), we are seeing a similar debate take place. The popular line of reasoning — AI is undoubtedly going to be a big part of our lives going forward and students need to learn how to use it — makes logical sense. 

But perhaps that’s precisely the reason why we shouldn’t teach students to use AI.

What Is The Goal Of Education?

There are different philosophical views about the goal of education, but most people would agree that education is about building the right skills so students can thrive in the real world. The point is never to teach students about all possible tasks that they might have to do, but to build enough of the right foundational skills that will allow them to adapt, learn new things and make meaningful contributions. This is why higher order skills, like creative and critical thinking, feature on the top rungs of Bloom’s taxonomy, a hierarchical model of learning goals that guides our educational philosophy. This makes sense because it is nearly impossible to predict the kinds of jobs that will exist a decade or two later. So teaching specific domains is less important than building higher order cognitive skills that will allow students to be successful, no matter what tasks they might have to face. 

Cognitive Complexity

Consider the two questions below: the first one is taken from College Board’s question for AP World History while the second task asks you to create an AI prompt. 

Task 1:

Directions: Question 1 is based on the accompanying documents. The documents have been edited for the purpose of this exercise. 

Evaluate the extent to which communist rule transformed Soviet and/or Chinese societies in the period circa 1930–1990.

(See the full question for accompanying documents)

Task 2: 

Create a prompt for ChatGPT to help you understand the significance of the American Civil War.

Which of these tasks, in your opinion, is harder?

Most people would agree that the first task is way more complex than the second one. 

To complete the first task, students have to quickly read the documents provided, make notes on the historical context, audience, perspective and purpose. As a side note, the documents are varied in nature (e.g. it could be a propaganda poster or a diary entry) and not directly related to the question. Using the provided material as evidence, students have to construct a defensible thesis. They also have to choose an outside piece of evidence that adds an additional dimension, and provide a broader historical context relevant to the prompt. In short, they have to make reasonable inferences from the provided documents and integrate prior knowledge they have about the subject to create a compelling argument. And finally they have to synthesize all of this into a multi-paragraph essay! (Big thanks to my son, who recently took the AP World History test, for helping me understand what goes into answering this type of question).

For the second task, students simply have to ask ChatGPT to explain the significance of the American Civil War! Even if they want to become more sophisticated with their prompting, the strategies (e.g. role-play or providing additional context) are pretty straightforward and easy to learn. 

It’s no wonder the AP exam gives students an entire hour to answer the first question, while the second one can be done in just a few minutes.

A more telling sign of the low cognitive complexity of using LLMs is how quickly people of all ages and walks of life have adopted LLMs to assist them with work tasks. In contrast, not many people have attempted coding, despite the many tools and tutorials that have been available for many years. Teaching STEM skills in schools continues to be a challenge due to the lack of qualified teachers.

Teaching students to tackle complex tasks like the first one has a higher payoff than teaching them simpler tasks. After all, if our education system helps students develop critical thinking skills and the ability to handle tough questions, then easy tasks like the second one will be a breeze for them. Much like the calculator, students don’t need to be taught how to prompt LLMs to get an answer – they need to build skills to answer it themselves. 

What Aspects Of AI Should We Teach Students?

It’s clear that teaching students to use LLMs isn’t very useful because we would almost certainly replace a more complex learning goal with a relatively trivial one. 

A more worthy goal of teaching AI would be to teach students how AI works so they can build AI models as opposed to simply using them. But this brings us back to square one, because that requires a grounding in programming and computer science fundamentals. So focusing on STEM, coding and computational thinking in the K-12 curriculum is still the right approach to prepare students for careers in technical fields.

Beyond the technical aspects of AI, it’s probably more important for students to understand how to learn better and how AI can impact the learning process. While there are a few scenarios where AI can improve the learning process, there are also learning traps that can harm creative and critical thinking in the long run. Or they might be better served by building their ethical reasoning skills to tackle the numerous challenges that are bound to arise as AI usage becomes more widespread in society.

All of those would require them to think harder and deeper than simply learning how to use AI.