How I use a metrics tree to align, prioritize, and track progress

You’ve probably heard the word “outcomes” echo in almost every product meeting you’ve been in. But for many, it remains a fuzzy aspiration, a distant star you squint at while prioritizing features based on popular demand.

The good news for you is that I’ve made this problem my bread and butter. Over the past five years, I’ve helped eight startups move from feature- to outcome-driven ways of working.

The hardest parts of this transition include:

  • Aligning on the most impactful goals and growth levers
  • Measuring progress with the right metrics
  • Prioritizing features that drive outsized impact

In this article, I’ll unpack how I use Michael Karpov’s concept of metrics trees to tackle all three of these challenges. I’ll walk you through the real tree I’m currently using at Outrig.run (my ninth startup) and dive into how you measure each of the branch metrics, as well as how to use a tree to prioritize product changes.

North Star metrics, OKRs, and value exchange loops

When it comes to product metrics, there’s an overarching line of thinking that goes something like this:

  • You have a primary business metric you’re trying to achieve that underpins the business model. A simplified shorthand could be “how you make money;” a more nuanced view is “how you capture value from your users”
  • There’s a primary product metric right below the primary business metric. This can be a North Star metric — the metric that best captures the core value your product delivers to customers
  • The primary business and product/North Star metrics are inherently linked. Your success hinges on delivering value to your users, as this enables you to capture value back (revenue, market share, etc.). Your product metric or North Star metric acts as a leading indicator — more users “enjoying” your tool should translate to future revenue
  • You turn the North Star metric/primary product metric into concrete, measurable targets that let you:
    • Monitor whether you’re on track
    • Guide decision-making (in OKRs, these are the key results)

But how does the metrics tree pull it all together?

  • The primary business metric sits on top, with the North Star metric right below it
  • The tree systematically breaks down how that North Star metric is achieved through a series of interconnected, measurable sub-metrics
  • This breakdown informs your OKRs: your objectives can be the lower level metrics in the tree (e.g., focusing on “number of new users MoM” or “user retention rate”), while your key results become the specific, measurable targets for the relevant metrics further down the branches (e.g., “Increase percent of users that reach aha to X percent,” or “Achieve Y new users from Social Media”)
  • The metrics tree provides the “why” and “how” behind your primary business metric, North Star metric, and OKRs. It visually shows how they all interlink and influence each other, and it also helps you identify where to focus first (one way to represent this structure in code is sketched below)
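
To make that structure concrete, here’s a minimal sketch, in Go (fitting for a Go dev tool), of one way you could represent a metrics tree in code. The node names and targets are illustrative placeholders, not a prescription for your tree:

```go
package main

import (
	"fmt"
	"strings"
)

// Metric is one node in the metrics tree. The names and targets used below
// are illustrative placeholders, not the exact metrics your tree should hold.
type Metric struct {
	Name     string
	Target   string   // optional OKR-style key result attached to this node
	Children []Metric // the sub-metrics that drive this one
}

// printTree walks the tree and prints each metric indented under its parent,
// which makes the "why" behind every lower-level metric visible at a glance.
func printTree(m Metric, depth int) {
	line := strings.Repeat("  ", depth) + "- " + m.Name
	if m.Target != "" {
		line += " (target: " + m.Target + ")"
	}
	fmt.Println(line)
	for _, c := range m.Children {
		printTree(c, depth+1)
	}
}

func main() {
	tree := Metric{
		Name: "Primary business metric (e.g., revenue)",
		Children: []Metric{{
			Name:   "North Star metric",
			Target: "10k DAU performing a core action",
			Children: []Metric{
				{Name: "New users MoM", Target: "Y new users from social media"},
				{Name: "Retained users MoM", Target: "X% of users reach aha"},
			},
		}},
	}
	printTree(tree, 0)
}
```

The point isn’t the code itself: every lower-level metric hangs off exactly one parent, so you can always answer “why does this metric matter?” by walking up the tree.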

Outrig’s metrics tree: Struggling to identify our core value

For context, Outrig is a developer tool focused on local Go development. It allows Go developers to “look inside” their project in a visual way, enabling continuous debugging (instead of “stop-the-world” debugging). It comes packed with useful features that solve proven pain points for Go devs, including a goroutine inspector, a log searcher, a visual view of runtime statistics, and variable watches that let developers check how variable states change in real time.

When I set out to build Outrig’s metrics tree, I bumped into three complications:

Pre-launch reliance on qualitative data

We’re currently relying purely on qualitative data. We haven’t publicly launched Outrig yet, but we’ve conducted approximately 50 landing page tests and 1:1 MVP tests, closely observing users’ initial interactions.

A small subset (around 10 percent) of these test users are still actively engaging with a core debugging feature weekly. We know the sample size is limited, but it’s what we have to work with.

Uncertain core feature

We’re not sure yet what the “core feature” or “main unit of value” is. For a company like Airbnb, choosing a North Star metric like “nights booked” is relatively obvious. But Outrig has four big features: the log searcher, the goroutine inspector, the variable watches, and runtime statistics.

Our 50 MVP tests showed us that different users get excited about different things. This means users could be getting value out of using any of our four features. A proxy for “user value” can be how often the UI is actively being used (rather than “accidentally” being left open but idle), or session length.

However, simply spending time using a tool doesn’t necessarily mean a user is enjoying it or finding value.

Passive value delivery

Some features might be delivering value even if a user isn’t triggering events. “Runtime statistics,” for example, is a dashboard view that users could get value out of just by looking at it.

Crafting the tree: Selecting and breaking down the #1 goal

Now that you know our complications, let’s turn our attention to how we went about crafting the metrics tree.

Outrig’s “play” is to achieve a large, happy, word-of-mouth-sparking user base first, raise funds based on that, and consider paths to monetization later. The #1 business metric will likely, at some point in the future, be “revenue,” but for the next six months, our end game is the North Star metric.

Level 1 metrics

Our #1 product metric is our North Star metric: 10k DAU performing a core debugging action by the end of October 2025:

[Image: Level 1 metrics]

  • Why not just “DAU”? DAU measures usage, not necessarily value. A developer might log into Outrig daily but not actually be using its core features to find, fix, or verify code faster. They might just have it open in the background. The true value comes from actively using the tools for their intended purpose
  • Why “DAU performing a core debugging action”? Since Outrig’s core value isn’t tied to a single feature, our North Star metric aggregates key actions across all of them: log searches, goroutine inspections, setting up variable watches and viewing that tab, and viewing the runtime statistics tab (a sketch of this aggregation follows this list)
  • Why 10,000 by the end of October? The team is benchmarking against a dev tool they’ve built before. The goal is aspirational (OKR style), so even reaching 7,000 would be a reason to pop a tiny bottle of champagne
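
To show what this aggregation could look like once we instrument it, here’s a rough sketch. The Event shape and the action names are assumptions for illustration, not Outrig’s actual telemetry:

```go
package main

import (
	"fmt"
	"time"
)

// Event is a simplified, hypothetical analytics event; Outrig's real
// instrumentation will differ.
type Event struct {
	UserID string
	Action string // e.g., "log_search", "goroutine_inspect", ...
	At     time.Time
}

// coreActions is the aggregator: any one of these counts as a "core debugging
// action" for the North Star metric. The names are placeholders.
var coreActions = map[string]bool{
	"log_search":         true,
	"goroutine_inspect":  true,
	"variable_watch_set": true,
	"runtime_stats_view": true,
}

// dauPerformingCoreAction counts distinct users who triggered at least one
// core debugging action on the given day (UTC).
func dauPerformingCoreAction(events []Event, day time.Time) int {
	target := day.UTC().Format("2006-01-02")
	seen := map[string]bool{}
	for _, e := range events {
		if coreActions[e.Action] && e.At.UTC().Format("2006-01-02") == target {
			seen[e.UserID] = true
		}
	}
	return len(seen)
}

func main() {
	now := time.Now()
	events := []Event{
		{UserID: "u1", Action: "log_search", At: now},
		{UserID: "u1", Action: "runtime_stats_view", At: now}, // still one DAU
		{UserID: "u2", Action: "app_open", At: now},           // usage, but no core action
	}
	// Prints 1: u2 opened the app but never used a core feature.
	fmt.Println("DAU performing a core action:", dauPerformingCoreAction(events, now))
}
```

The key design choice is that any one of the four core actions counts; we deliberately don’t privilege a single feature.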

A large, distant goal like 10,000 DAU by the end of October 2025 is too big to work with. Because of that, we broke it down into smaller, more immediate targets so that we can monitor progress weekly or monthly and identify early if we’re falling behind:

[Image: Smaller, more immediate targets]

Level 2 metrics

For Outrig’s KPI tree, we chose our two main growth levers to be our level two metrics:

  • Number of new users MoM
  • Number of retained users MoM

[Image: Level 2 metrics]

Outrig’s current five-week retention rate is about 20 percent (measured only across our ~50 user testers, so certainly not statistically significant!). We’ve set our five-week retention rate goal at 70 percent.
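
Once quantitative data starts flowing, the five-week retention rate itself is a simple cohort calculation. Here’s a sketch with assumed data shapes (a real version would read from our analytics store):

```go
package main

import "fmt"

// User records, per test user, the week they first ran Outrig and the weeks
// in which they performed a core debugging action (simplified to plain ints).
// These data shapes are assumptions for the sketch.
type User struct {
	FirstWeek   int
	ActiveWeeks map[int]bool
}

// weekNRetention returns the share of a cohort (users with the same FirstWeek)
// that was still active n weeks after starting.
func weekNRetention(cohort []User, n int) float64 {
	if len(cohort) == 0 {
		return 0
	}
	retained := 0
	for _, u := range cohort {
		if u.ActiveWeeks[u.FirstWeek+n] {
			retained++
		}
	}
	return float64(retained) / float64(len(cohort))
}

func main() {
	cohort := []User{
		{FirstWeek: 10, ActiveWeeks: map[int]bool{10: true, 11: true, 15: true}},
		{FirstWeek: 10, ActiveWeeks: map[int]bool{10: true}},
		{FirstWeek: 10, ActiveWeeks: map[int]bool{10: true, 12: true, 15: true}},
		{FirstWeek: 10, ActiveWeeks: map[int]bool{10: true, 11: true}},
		{FirstWeek: 10, ActiveWeeks: map[int]bool{10: true, 13: true}},
	}
	// Two of the five users were still active in week 15, five weeks after week 10.
	fmt.Printf("Five-week retention: %.0f%%\n", weekNRetention(cohort, 5)*100)
}
```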

Level 3 metrics

We then break each of our two level two metrics down into level three metrics:

  • New users MoM
  • Retained users MoM

New users MoM

New users should come in from two big “buckets”:

  • Channels like offline events (Gophercon), influencers, Golang forums or communities, X, LinkedIn, etc.
  • Outrig’s viral loop (to be defined…). We have a strong assumption that we can build a feature that offers real value to our users while naturally creating a growth loop (e.g., by letting users share debugging sessions or “pair-debug” with colleagues)

[Image: Level 3 metrics]

Retained users MoM

We broke “retention” down into aha moments (“experiencing value for the first time”) and habit moments (the user incorporates Outrig into their workflow; a repeated “aha”). These engagement metrics are notoriously difficult to define, made even harder by the fact that different users find value in different features.

We rely on our learnings from our 50 qualitative MVP tests to define “aha:”

  • We watched closely while new users tried Outrig for the first time to see which action made it “click” for them
  • We specifically asked them in the exit interview, “Which moment during the test, or which feature, really made the value of Outrig click for you?”

We learned that many new users are nervous about adding the SDK to their code base. Therefore, they quickly spin up a “hello world” (or “hello Outrig”) project to try Outrig.

This is problematic because a “hello world” project doesn’t produce enough log lines or goroutines to give a meaningful Outrig experience. That’s why we built a demo project into Outrig (with lots and lots of goroutines, log lines, pre-installed variables, and interesting runtime statistics).

We know that users only use Outrig for a real debugging workflow (and therefore get value) when they run it on their own project, not on a “hello world” or “hello Outrig” project.

Our current definitions are as follows:

Aha

  • User runs Outrig on their own real project (not a demo, “hello world,” or “hello Outrig” project)
  • User interacts with any core debugging function (the aggregate of core actions defined above)

Habit

  • User runs Outrig on their own real project (not “hello world”/“hello Outrig”)
  • User performs a core debugging action at least once per week

These definitions of aha and habit are a starting point; we’ll sharpen them as we learn more.
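
Turning those definitions into checks is mostly bookkeeping. Here’s a hedged sketch with assumed data shapes rather than Outrig’s real event schema:

```go
package main

import (
	"fmt"
	"time"
)

// UserActivity is a hypothetical roll-up of one user's behavior: whether Outrig
// ran against their own real project (not a demo or "hello world" app), which
// weeks contained at least one core debugging action, and when the first one
// happened. Outrig's real event schema will differ.
type UserActivity struct {
	RanOnOwnProject   bool
	CoreActionWeeks   map[int]bool // weeks with >=1 core debugging action
	FirstCoreActionAt time.Time
}

// reachedAha mirrors the aha definition above: own real project plus at least
// one interaction with a core debugging function.
func reachedAha(a UserActivity) bool {
	return a.RanOnOwnProject && !a.FirstCoreActionAt.IsZero()
}

// hasHabit mirrors the habit definition: own real project plus a core debugging
// action in each of the last n weeks.
func hasHabit(a UserActivity, currentWeek, n int) bool {
	if !a.RanOnOwnProject {
		return false
	}
	for w := currentWeek - n + 1; w <= currentWeek; w++ {
		if !a.CoreActionWeeks[w] {
			return false
		}
	}
	return true
}

func main() {
	u := UserActivity{
		RanOnOwnProject:   true,
		CoreActionWeeks:   map[int]bool{21: true, 22: true, 23: true, 24: true},
		FirstCoreActionAt: time.Now().AddDate(0, 0, -28), // roughly four weeks ago
	}
	fmt.Println("aha:", reachedAha(u), "habit:", hasHabit(u, 24, 4))
}
```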

User satisfaction

We broke habit down into “UI usage frequency” (with every individual feature listed as its own metric) and “session duration.” We know that “session duration” is a flawed proxy for a user getting value. For instance, a user might stare at our runtime statistics tab (a visual representation of data) without gaining any real value.

Including user satisfaction as a level three metric mitigates this problem. We’ll pop up five-star ratings for each of the four core features, plus a more general question (“Does Outrig make debugging Go code significantly easier or faster?”) to gauge user satisfaction levels.
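
As a sketch, the in-app prompt could emit a payload like the one below, which also lets us roll the stars up into a single satisfaction percentage. The field and feature names are placeholders:

```go
package main

import "fmt"

// FeatureRating is a hypothetical payload for the in-app prompt: one 1-5 star
// score per core feature, plus an "overall" entry for the general question.
type FeatureRating struct {
	UserID  string
	Feature string // e.g., "log_search", "goroutine_inspector", "variable_watch", "runtime_stats", "overall"
	Stars   int    // 1-5
	Comment string // optional free text, which feeds the qualitative side
}

// csat returns the share of ratings that are 4 or 5 stars, one common way to
// collapse star ratings into a single satisfaction percentage.
func csat(ratings []FeatureRating) float64 {
	if len(ratings) == 0 {
		return 0
	}
	happy := 0
	for _, r := range ratings {
		if r.Stars >= 4 {
			happy++
		}
	}
	return float64(happy) / float64(len(ratings))
}

func main() {
	ratings := []FeatureRating{
		{UserID: "u1", Feature: "log_search", Stars: 5},
		{UserID: "u2", Feature: "runtime_stats", Stars: 3},
		{UserID: "u3", Feature: "overall", Stars: 4, Comment: "faster than println debugging"},
	}
	fmt.Printf("CSAT: %.0f%%\n", csat(ratings)*100) // 2 of 3 ratings are 4+ stars
}
```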

 

[Image: User satisfaction]

 

Using the tree to prioritize work

Beyond aligning on which outcomes we’re driving, crafting goals, and setting ourselves up to measure, the metrics tree helps us prioritize product changes and features.

Identifying significant loop drop-offs

Our engagement metrics can be translated into a loop, from new user acquisition → aha → habit → referral. This allows us to identify areas with significant drop-offs. If, for instance, we observe a strong influx of new users but a poor “aha” rate, we shift our immediate focus to improving that initial value experience. This helps us direct our efforts to the most impactful lever at any given time.
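
Here’s a small sketch of that drop-off analysis; the stage counts are made up for illustration:

```go
package main

import "fmt"

// Stage is one step of the engagement loop; the user counts below are made up
// for illustration and would normally come from the metrics behind the tree.
type Stage struct {
	Name  string
	Users int
}

// biggestDropOff walks the loop in order and returns the transition with the
// lowest conversion rate: the lever most worth focusing on right now.
func biggestDropOff(loop []Stage) (from, to string, rate float64) {
	rate = 1.0
	for i := 1; i < len(loop); i++ {
		if loop[i-1].Users == 0 {
			continue
		}
		conv := float64(loop[i].Users) / float64(loop[i-1].Users)
		if conv < rate {
			rate, from, to = conv, loop[i-1].Name, loop[i].Name
		}
	}
	return from, to, rate
}

func main() {
	loop := []Stage{
		{"new users", 1000},
		{"aha", 250},
		{"habit", 150},
		{"referral", 90},
	}
	from, to, rate := biggestDropOff(loop)
	// Prints: biggest drop-off: new users -> aha (25% convert)
	fmt.Printf("biggest drop-off: %s -> %s (%.0f%% convert)\n", from, to, rate*100)
}
```

In practice we’d run this per cohort and per acquisition channel, but the principle is the same: fix the leakiest joint in the loop first.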

Prioritizing features aligned with higher-level metrics

As Michael Karpov suggests, features most likely to influence metrics closer to the top of the tree — like our North Star metric or its direct growth levers — are generally prioritized. This isn’t a rigid rule, as a substantial impact on a lower-level metric can sometimes outweigh a minor tweak to a higher-level one.

We often use the level two and level three metrics as a starting point for brainstorming sessions, asking: “What features could significantly boost these key areas?”

 

[Image: Features with higher-level metrics]

 

The importance of qualitative data

Quantitative data only tells us what users are doing; it rarely tells us why. To truly understand user behavior and motivations, we constantly supplement our metric analysis with qualitative feedback.

For Outrig, we maintain an active Discord channel, engage in Go communities, attend developer events, actively solicit in-app feedback, and aim to conduct at least three shadowing sessions and user interviews each week.

Tips for building your own metrics tree

Here are some tips and tricks before you get cracking on your own tree:

Start solo, then combine

Group workshop sessions are notorious for suppressing diverse views. And jamming a metrics tree that you created by yourself down everybody’s throat won’t generate much buy-in either.

Here’s a better way:

  • Provide every member of the leadership team with the relevant theory and context (feel free to send this article around), and ask everyone to craft their own metrics tree solo
  • Then bring the diverging metrics trees together during a separate workshop

AI is an amazing sparring partner

  • Start by thinking solo; only invite AI in after you’ve done the work. If you have AI make the first draft, you’ve biased yourself
  • Create a custom GPT or custom Gem for your project. Upload the documentation the LLM needs for context (anything about your business, plus the theory you want it to primarily tap into), then prompt it to act as an expert product/growth coach for product leaders. Take screenshots of your tree and ask for critical feedback

The hardest part is defining each metric and ensuring that the connections between them are clear and actionable. But don’t fret, it doesn’t have to be perfect right out of the gate.

Your metrics tree is a living document. As you gather more data and user feedback, you’ll refine your definitions and potentially even your North Star metric.

Final thoughts

Building an outcome-driven product organization hinges on clear alignment on the most important goals and growth levers, measuring what matters, and strategic prioritization. A metrics tree provides a robust framework to connect overarching business goals to daily product decisions.

By using this structured approach, continuously refining your metrics, and leveraging both quantitative and qualitative insights, you can move beyond simply shipping features to truly driving impactful, measurable value for your users and your business.

For Outrig, this is just the start. I’ll keep you posted on how we progress and how our metrics tree changes over time.
