1 Comment

Chirdeep Chhabra

The "agents deliver work, not workflows" framing is compelling, and the verifier examples (Sierra's simulation process, rubric design, dataset management) address the output-quality dimension well.

There is a second evaluation dimension that surfaces when the agent's "work" includes actions on production systems: execution quality. When an agent updates a CRM record, triggers a deployment, or modifies a billing configuration, "was the output correct?" and "was the execution safe?" are two separate questions. Did the agent act within its authorized scope? Did it execute in the right order relative to other agents acting on the same system? Is there a record of what it did, when, and under what policy?
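To make the scope question concrete, here is a minimal sketch of a pre-execution authorization check. All names here (`AgentPolicy`, `allowed_actions`, the action strings) are hypothetical, not from any particular framework:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy: the set of actions an agent is authorized to take."""
    agent_id: str
    allowed_actions: frozenset  # e.g. {"crm.update", "billing.read"}

def authorize(policy: AgentPolicy, action: str) -> bool:
    """Gate every production action on the agent's authorized scope."""
    return action in policy.allowed_actions

policy = AgentPolicy("support-agent-1", frozenset({"crm.update", "billing.read"}))
assert authorize(policy, "crm.update")          # within scope: allowed
assert not authorize(policy, "billing.modify")  # out of scope: blocked before execution
```

The point of the sketch is that this check runs in the infrastructure layer, before the action happens, independent of whether the agent's output was correct.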

Building verifiers for output quality is hard but tractable: compare against golden datasets and human judgment. Building verifiers for execution quality is structurally different because the quality signal lives in the infrastructure layer, not in the agent's output. You need the execution record (what was done, in what order, by which agent, under what authority) to even define what "correct" means. Most teams I see are building the first type of verifier and discovering the second as a separate, harder problem once their agents start acting on shared production systems.
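A minimal sketch of what such an execution record might look like, and how a verifier would consume it. The field names and actions are illustrative assumptions, not a real system's schema:

```python
import time
from dataclasses import dataclass

@dataclass(frozen=True)
class ExecutionRecord:
    """One entry in the execution log: what was done, in what order,
    by which agent, under what authority."""
    seq: int        # global ordering relative to other agents on the same system
    agent_id: str
    action: str
    policy_id: str  # which policy authorized the action
    timestamp: float

log: list = []

def record(seq: int, agent_id: str, action: str, policy_id: str) -> None:
    log.append(ExecutionRecord(seq, agent_id, action, policy_id, time.time()))

record(1, "agent-a", "crm.update:acct-42", "policy-7")
record(2, "agent-b", "billing.adjust:acct-42", "policy-9")

# An execution-quality verifier checks the log, not the agent's output:
assert [r.seq for r in log] == sorted(r.seq for r in log)  # ordering preserved
assert all(r.policy_id for r in log)                       # every action attributable to a policy
```

Without a record like this, "was the execution safe?" has no ground truth to verify against, which is why the signal lives in the infrastructure layer rather than in the agent's output.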
