On Building a Superintelligent AI Mathematician: A Conversation with Axiom Founder Carina Hong

Axiom is an AI company on a mission to build a self-improving, superintelligent reasoner, starting with an AI mathematician. The company launched out of stealth in 2024 with $64 million in seed funding at a $300 million valuation, led by B Capital. The Axiom team brings together renowned AI and mathematics experts, including CTO Shubho Sengupta (an eight-year veteran of Meta’s AI research lab FAIR), founding mathematician Ken Ono (former UVA Marvin Rosenblum Professor of Mathematics and VP of the American Mathematical Society), François Charton (who used LLMs to solve a 100-year-old math problem), Aram Markosyan (who led safety and fairness research at FAIR), and Hugh Leather (an early pioneer in using deep learning for code generation).

In this interview, I speak to Axiom’s founder Carina Hong. As a Rhodes Scholar at Oxford’s Hertford College, she earned a master’s in neuroscience with distinction for her dissertation and conducted deep learning research at UCL’s Sainsbury Wellcome Centre and Gatsby Unit. As an MIT math and physics undergraduate, she published nine papers in peer-reviewed journals spanning number theory, combinatorics, theoretical computer science, and probability. Carina won the AMS-MAA-SIAM Morgan Prize (the world’s highest honour for undergraduate mathematics research) and the AWM Schafer Prize (awarded annually to one undergraduate woman in mathematics in North America).

Q: What is the case for creating an AI mathematician?

[Carina Hong]: We think of an AI mathematician as a starting point for reasoning more broadly. Our view is that we’ll eventually have very strong reasoning systems, and mathematics is the perfect environment to develop them. Such a system should have a few core components. First, it needs a prover: a model that can prove mathematical results really well. And I want to emphasise that this isn’t just about producing numerical answers, but about generating full proofs with clear intermediate steps.

The second component should be a conjecturer—a model capable of proposing new mathematical questions, which the same system can then attempt to prove. This creates a very interesting self-play loop. A third component is a knowledge base. The system should be able to look up what already exists in the mathematical corpus and determine what new results can be added. When conjecturing something new, it should also be able to run sanity checks, for example by searching for potential counterexamples.
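To make that sanity-check step concrete, here is a minimal sketch in Python of what counterexample search can look like. Everything in it is illustrative rather than Axiom’s actual tooling: the conjecture being tested is Euler’s classic claim that n^2 + n + 41 is always prime, and the function names are mine.

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for a small-range sanity check."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def find_counterexample(conjecture, bound: int):
    """Return the first n < bound where the conjecture fails, or None."""
    for n in range(bound):
        if not conjecture(n):
            return n
    return None

# Euler's polynomial n^2 + n + 41 yields primes for n = 0..39, then fails:
# 40^2 + 40 + 41 = 1681 = 41^2.
print(find_counterexample(lambda n: is_prime(n * n + n + 41), 100))  # 40
```

A cheap check like this lets a conjecturer discard false statements before the prover spends serious compute on them.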

Finally, there’s auto-formalisation: the technology that converts existing natural-language mathematical proofs into Lean, a machine-verifiable programming language for mathematics. I think this is a core technology—at least as important as proving itself, if not more so—because it weaves together all three components I just described. That’s how we envision Axiom, the AI mathematician.
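For flavour, here is a toy instance of the kind of artefact auto-formalisation targets, written in core Lean 4 with no external libraries. The informal sentence, the theorem name, and the exact phrasing are my own illustrative choices, not Axiom’s output.

```lean
-- Informal statement: "the sum of two even numbers is even."
-- One possible machine-verifiable rendering in Lean 4:
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ k, b = 2 * k) :
    ∃ k, a + b = 2 * k :=
  match ha, hb with
  | ⟨m, hm⟩, ⟨n, hn⟩ => ⟨m + n, by rw [hm, hn, Nat.mul_add]⟩
```

Lean accepts the theorem only if every step checks, which is what makes formalised corpora so valuable as training signal.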

Q: How do we transition from symbology to mathematics for AI?

[Carina Hong]: We see math as code and code as math. The real magic, and the key transition, comes from combining AI, programming languages, and mathematics—bringing all three pillars together. If you think about English, our natural language, it sits very high on the abstraction ladder. By contrast, code—whether Python, Lean, or another language—operates at a much lower level of abstraction. It’s often easier to explore ideas at a high level, while the low level is what grounds and constrains them.

What we envision is humans using informal reasoning and intuition as a powerful guide, with formal systems then verifying those ideas. In this way, the formal system grounds high-level intuition. At the same time, discoveries made at the low level—where the system can self-verify and apply techniques like reinforcement learning to coding, execution-based feedback, and related methods—can flow back up to inform and refine high-level intuition. That interplay across layers is, I think, the real magic of combining multiple levels of abstraction.
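As a tiny illustration of that interplay, consider the familiar intuition that the first n odd numbers sum to n^2. The sketch below, in core Lean 4, first spot-checks the intuition numerically at the low level and then pins it down for all n with a machine-checked induction. The definition and names are mine, purely for illustration.

```lean
-- High-level intuition: 1 + 3 + 5 + ... + (2n - 1) = n^2.
def sumOdds : Nat → Nat
  | 0 => 0
  | n + 1 => sumOdds n + (2 * n + 1)

#eval sumOdds 10  -- 100: a quick low-level sanity check

-- Formal grounding: the intuition holds for every n.
theorem sumOdds_eq_sq (n : Nat) : sumOdds n = n * n := by
  induction n with
  | zero => rfl
  | succ n ih =>
    show sumOdds n + (2 * n + 1) = (n + 1) * (n + 1)
    rw [ih, Nat.add_mul, Nat.mul_add, Nat.mul_one, Nat.one_mul]
    omega
```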

Q: What are the applications of an AI mathematician?

[Carina Hong]: I think we can break this down into two parts: math itself, and what lies beyond math. Starting with math, simply being able to solve mathematical problems already unlocks a huge number of use cases, as you mentioned. Math is the foundation of almost all branches of science, and it provides theoretical insight into complex systems that can deliver significant gains if they’re better understood. If you look back at early computer prototypes, many of them were essentially differential equation solvers. I often visit the Computer History Museum here in the Bay Area, and you can really see how people’s curiosity drove them to build mathematical solvers, which in turn led to further advances in computing. That’s the first level. Math alone has enormous applications in areas like financial trading, physics, engineering, and simulation. That’s just the math angle.

Beyond math, I think the impact is even bigger. There are a few layers to this. One is the indirect effect of transfer learning. A model that’s really strong at mathematical reasoning is likely to be strong at coding. And a model that’s excellent at both math and code is often very good at analysing the nuts and bolts of legal reasoning as well—it performs extremely well on legal benchmarks, for example. Anecdotally, many top performers in law school come from STEM backgrounds. Areas like contract law, bankruptcy, antitrust, tax law, and corporate law—especially the more flowchart-like parts—tend to become very straightforward for them.

But that’s still not the final, holy-grail point. The third and deepest reason this matters—why it’s not just commercially meaningful but potentially world-changing—is the ability to bridge different levels of abstraction. If you think about languages at different abstraction levels working together and apply that to contexts like chip verification and chip design, or to software verification and code generation, or to understanding cryptographic protocols, the value becomes enormous. All of these domains involve multiple layers of abstraction, and the ability to move fluidly between those layers is likely to be extremely commercially valuable.

Q: Will AI mathematicians improve the efficiency of LLMs and other AI platforms?

[Carina Hong]: I’d say there are a few things that directly support this point. First, if you pair natural language with a verifiable machine language, you get better sample efficiency—and there’s research showing that. In practice, that means you can often reach strong performance with less compute, a smaller model, and fewer resources. For example, at Axiom we’ve raised $64 million—we’re a small startup—and we recently won the Putnam competition. We scored 90 out of 120, which would have placed us above all ~4,000 human contestants last year and at the level of a Putnam Fellow, meaning top five in the world. Now, if you tried to achieve that purely through informal methods—where hallucination is a persistent risk—getting to the same level of consistent correctness would likely require a lot more research effort and resources. That’s one confirmation.

Second, you could argue that beyond Olympiad-style benchmarks, research mathematics is the next frontier. That’s something Axiom laid out in our initial vision: the math AI world should move beyond Olympiad math. Of course, it still makes sense to validate results on Olympiad benchmarks just to understand where you are—measurement matters, especially on anything that sits on your critical path. But ultimately, we have a discovery team, and conjecturing is a core part of the formal vision. That’s what we should be pursuing.

And if you think about what “research math” entails, it includes things like stochastic differential equations and other tools that are useful for analysing neurodynamics. With stronger mathematical understanding—especially when the math system is coupled with a really strong coding system—there may be parts of theoretical deep learning we can finally make progress on. That’s the kind of direction we’re excited about: what an AI mathematician could enable, what it could do to support an AI researcher theoretically, and what a truly unified system would mean. The vision is clear in our minds—we just need to take concrete steps, one step at a time, to execute it.

Q: How do you differentiate an AI business in today’s investment world?

[Carina Hong]: There’s the vision, and then there’s the execution. When you’re at the very beginning—before a single line of code has been written—evaluating a startup really comes down to execution potential, which is largely about the quality of the team and the people. That’s not just technical track record, but also soft qualities: grit, resilience, and conviction. Everyone on the team strongly believes in the mission. This is the one thing we want to work on in our lives, and multiple people share that same sentiment. We’ve fallen in love with a very stubborn technical challenge. These are things we tend to show rather than tell. Convincing others is always hard—it requires conviction on both sides.

When we were fundraising, we were very fortunate to meet people and funds that already had a clear internal thesis that this was going to be the future. They also had a very detailed, granular view of what the technical vision would look like and how it would ground execution. Among the many teams they evaluated, Axiom’s technical vision resonated most strongly with them. Our team has an exceptional technical track record, and just as importantly, the soft qualities that convinced them this could be a winning bet.

We’ve delivered so far, even though it’s still effectively day zero and there’s much more to do. What we’ve demonstrated is small-team speed: speed in hiring, speed in getting strong results, and speed in refining our intellectual understanding. Execution is always a learning cycle. As we execute, we have to keep learning, updating our priors, and sharpening our understanding—and that, in turn, lets us execute faster, better, and with more focus. I think that’s what smart and perceptive investors can see from day one, and we’re very fortunate to be partnering with them on this journey.

It’s also been a pretty remarkable team-building effort. We have world experts in AI, including lead authors of open-source AlphaGo; world experts in code generation, including some of the first to apply deep learning to code generation back in 2017; experts in compilers behind work like LM Compiler and Kernel LM; and more recent leaders in computational reward modelling who led flagship efforts at FAIR. We also have pioneers in AI for mathematics going back to 2019, showing that transformers can learn symbolic integration better than Mathematica, alongside very strong infrastructure engineers—people who invented synchronous training and contributed to many layers of the modern machine learning stack that researchers use every day. Taken together, it’s really a totality of circumstances, where each individual pillar is itself quite strong.

Q: How do you win the war for AI talent?

[Carina Hong]: I think there are a few key factors. First, there’s a compelling mission—both because of the enormous economic opportunity it represents, if it succeeds, and because of the broader impact it can have on the world. That kind of mission resonates with both technical and non-technical people. In fact, it’s inspiring enough that even people on our cap table feel motivated to work with us. The idea that formalisation and formal proving could be applied across so many layers of the technology stack, from hardware to software, is incredibly compelling.

At the same time, the mission is defined by a stubborn technical challenge. It has to be hard. No one wants to work on something that isn’t difficult, and in a sense, choosing the hardest problems becomes a moat. Of course, that audacity comes with an expectation of real pain and suffering, as Jensen Huang likes to say. The technical obstacles are very real. We face a severe data scarcity problem: the amount of Lean code in the world is orders of magnitude smaller than the volume of Python code. Beyond data scarcity, conjecturing itself remains an open research problem. In a verifiable language, when you’re conjecturing, you’re judging statements you can’t simply run, which means you don’t get the kind of verifiable outputs—like proofs—that models usually rely on. That makes conjecturing a particularly interesting and difficult part of the self-play loop.

Auto-formalisation is also extremely hard. Frontier models perform very poorly at it because they simply haven’t seen enough Lean code. It’s a classic chicken-and-egg problem: data scarcity leads to weak formalisation, but without a reasonable baseline for formalisation, you can’t generate the data needed to train better models. All of this points to open discovery problems—questions that current models and systems can’t yet solve. These are the kinds of problems mathematicians wrestle with for decades, where conjectures can remain open for 30 years and then turn out to be false. In fact, members of our discovery team have previously disproved a 30-year-old conjecture. It has to be that hard.

Then there’s the human element: having some of the smartest minds working together on a shared, difficult problem. That creates a powerful bond. It’s like a mountain-climbing team facing a steep, snow-covered peak, with the best teammates around you—people who pass you the gear, cheer you on, and give you concrete, practical feedback on your technical approaches. Over time, that builds a genuine idea meritocracy, which I think is incredibly powerful as a recruiting force. And it’s not really a narrative we pitch; we don’t talk about it explicitly. People come to the office and feel it. They see what’s on the whiteboards. We’re transparent and open, more like an academic seminar environment, with reading groups and ongoing discussions. It’s really the combination of all these elements working together.

Q: How did you ‘learn’ to lead a fast-scaling entrepreneurial business?

[Carina Hong]: I don’t think I had very strong first-principles guidance from day one, to be completely honest. I come from a technical background, more as an individual-contributor researcher. What’s interesting is that I’ve always tried to understand things deeply and then execute, which is a kind of first-principles approach, but in practice it often turns into more of an engineering-style trial-and-error process. At this stage, I’m really just trying to grow—listening carefully to feedback and learning as much as I can.

When it comes to attracting talent, one belief I hold strongly is that a company has to genuinely want to succeed, and that means continuously hiring people who are smarter than you, better than you, and able to execute faster than you—people who will surpass you. I don’t think I need to be “good enough” relative to the team. What I do need to do is keep learning and keep growing. In that sense, it’s more about the first derivative than the absolute position—the rate of improvement matters more than where you start.

Another thing I’ve found fascinating is the feedback loop itself. You’re not just listening to feedback; you’re constantly exposed to new ideas, proposals, and perspectives, especially when you’re surrounded by so much talent. It becomes a real idea powerhouse. In that environment, learning how to focus, prioritize, and think long-term becomes critical, and those qualities tend to win out over time. It starts to feel like managing a portfolio of trade-offs: risk versus reward, long-term versus short-term, and everything in between.

At this point, I’d say I’m still mostly operating as a pattern-recognition machine—learning through trial and error, iterating, and adjusting—rather than being guided by some perfectly articulated set of first principles. So honestly, I don’t think I have anything especially profound to say on this yet.

Q: Where did you get your passion for mathematics?

[Carina Hong]: I think there are different stages to being a mathematician. In our launch video on X—you might remember this—at the very end we show three film snippets. The first is from the International Math Olympiad, where kids throw their hats into the air on stage. The second is Ramanujan being introduced by Hardy, I think to the Society of Fellows. And the third is the scene with John Nash, later in his career, when other mathematicians come by and place their pens on his desk as a sign of tribute.

I think these capture three stages. In the first stage, you learn to be a good problem solver. You do well in Olympiads, and you get constant dopamine hits. That kind of reward is incredibly motivating for a kid. When I was younger, I was very driven by that quick reinforcement—I solved this, I solved it fast, let me move on to the next one. It feels like playing a game, almost like a video game loop.

In the second stage, you’re humbled as a junior researcher. Ramanujan had extraordinary intuition, but when he arrived in Cambridge, he had to learn how to write rigorous proofs. You become very aware of your limitations. You learn where your strengths are, who you are, and who you could become if you double down on those strengths and collaborate with others, especially on problems that sit just beyond your comfort zone—maybe even a bit too stressful. That stage has always been very interesting to me. In fact, I tend to gravitate toward problems that are harder than my existing toolkit can comfortably handle. That means a lot of pain and suffering, especially early on, but I actually enjoyed that more than the dopamine hits. It’s long cycles of struggling, almost like talking to the problem itself, or even praying: please, just let me solve this. There’s something deeply compelling about that.

The third stage is one I don’t think I’ve reached, so I can’t really speak from experience. But looking at role models who have reached it is incredibly inspiring. Going to professors’ birthday conferences, where people take turns presenting work spanning decades of that professor’s career, is moving. Being mentored by world-class mathematicians like Professor Ken Ono, who has joined Axiom, is inspiring. Now working with people like François Charton, who have spent years at the intersection of AI and mathematics—a relatively new scientific field in itself—is also incredibly inspiring.

At a more fundamental level, what draws me to mathematics are the objects and constructions themselves. Symmetry is beautiful; lack of symmetry is interesting and often surprising, sometimes even exotic. And there is a third aspect, for me perhaps the most beautiful of all: notation. I love symbolic manipulation. I genuinely enjoy writing down notation. And if you change the notation in a number theory problem, I can probably still solve it, but it feels less natural, less beautiful—like when the conductor of an elliptic curve just has to be written a certain way.

About the Author

Vikas Shah MBE DL is an entrepreneur, investor & philanthropist. He is CEO of Swiscot Group alongside being a venture-investor in a number of businesses internationally. He is a Non-Executive Board Member of the UK Government’s Department for Business, Energy & Industrial Strategy and a Non-Executive Director of the Solicitors Regulation Authority. Vikas was awarded an MBE for Services to Business and the Economy in Her Majesty the Queen’s 2018 New Year’s Honours List and in 2021 became a Deputy Lieutenant of the Greater Manchester Lieutenancy. He is an Honorary Professor of Business at The Alliance Business School, University of Manchester and a Visiting Professor at the MIT Sloan Lisbon MBA.