2.2: Analyze Data
- Time to Complete: 60-75 minutes
- Prerequisites: Module 2.1 (Writing PRDs), basic understanding of CSV files and product metrics
Start this module in Claude Code: run `/start-2-2` to kick off the interactive experience.
📖 Overview
Module 2.2 teaches the complete PM workflow for data-driven feature development: discovering problems through data analysis, estimating business impact before building, and analyzing experiment results to make ship/kill decisions.
Key takeaway: Never stop at topline metrics – always segment by your target customer, check quality over quantity, and look for leading indicators that predict long-term success.
🎯 The Three-Phase Workflow
Phase | Purpose | Deliverable |
---|---|---|
Discovery | Find problems with data (funnel + surveys) | problem-analysis.md with quantitative and qualitative evidence |
Impact Estimation | Build ROI models to justify engineering investment | impact-estimate.md and roi-scenarios.md with 3 scenarios |
Experiment Analysis | Analyze A/B test results beyond topline metrics | experiment-readout.md with ship/iterate/kill recommendation |
📊 Impact Estimation Framework
The Formula
Impact = Users Affected × Current Action Rate × Expected Lift × Value per Action
Components
Users Affected
- How many users will see this feature?
- Account for gradual rollout (not always 100%)
- Example: 5,000 signups/month × 70% see feature = 3,500 users affected
Current Action Rate
- What % currently take the desired action?
- Get from analytics tool (Mixpanel, Amplitude)
- Example: 45% activation rate (2,025/4,500 complete first task)
Expected Lift
- How much will the feature improve the rate?
- Sources: similar features you’ve shipped, competitor benchmarks, user research, expert judgment
- Example: The funnel shows a 60% drop at task completion, and surveys attribute it to "need examples." Conservatively recover 30% of that drop → ~13pp lift (45% → 58%)
Value per Action
- What’s each incremental action worth?
- For activation: LTV × conversion rate
- For retention: extended LTV
- For viral: invite acceptance × activation × conversion × LTV
- Example: Activated user → 60% convert × $12/mo × 24 months = $172.80 LTV per activation
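To make the arithmetic concrete, here's a minimal Python sketch of the formula using the example numbers above (the function name and structure are illustrative, not from the course materials):

```python
def estimate_impact(users_affected, lift_pp, value_per_action):
    """Incremental monthly value from a feature.

    With the lift expressed in percentage points, this is equivalent to
    Users Affected x Current Action Rate x (relative) Expected Lift
    x Value per Action.
    """
    return users_affected * lift_pp * value_per_action

users_affected = 5000 * 0.70        # 3,500 signups/month see the feature
lift_pp = 0.58 - 0.45               # 13pp lift (45% -> 58%)
value_per_action = 0.60 * 12 * 24   # $172.80 LTV per activation

print(f"${estimate_impact(users_affected, lift_pp, value_per_action):,.0f}/month")
```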
Three-Scenario Approach
Always model uncertainty with pessimistic, realistic, and optimistic scenarios:
Scenario | Adoption | Lift | Use Case |
---|---|---|---|
Pessimistic (20th percentile) | 30% | 45% → 50% | Minimum expected impact |
Realistic (50th percentile) | 70% | 45% → 58% | Most likely case |
Optimistic (80th percentile) | 90% | 45% → 62% + retention boost | Best case scenario |
Present all three to leadership so they understand the range of outcomes and can make informed bets.
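The same calculation can be swept across all three scenarios. A sketch, with the adoption and lift figures taken from the table above (a full ROI model would also layer in costs, rollout ramp, and time horizon):

```python
# Adoption and lift (in percentage points) per scenario, from the table above
scenarios = {
    "pessimistic": {"adoption": 0.30, "lift_pp": 0.05},   # 45% -> 50%
    "realistic":   {"adoption": 0.70, "lift_pp": 0.13},   # 45% -> 58%
    "optimistic":  {"adoption": 0.90, "lift_pp": 0.17},   # 45% -> 62%
}
signups_per_month = 5000
value_per_action = 172.80   # LTV per activation, from the example above

for name, s in scenarios.items():
    monthly = signups_per_month * s["adoption"] * s["lift_pp"] * value_per_action
    print(f"{name:>12}: ${monthly:,.0f}/month incremental value")
```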
🔬 Experiment Analysis Framework
The Hierarchy of Analysis
1. Topline Metrics
```
Calculate overall activation rates for control and treatment
```
- Quick snapshot of overall impact
- Not enough to make decisions
2. Statistical Significance
```
Calculate statistical significance between control and treatment groups
```
- p < 0.05: statistically significant (less than a 5% chance of seeing an effect this large if there were no real difference)
- 95% CI: range of plausible effect sizes
- Wide CI = high uncertainty, even if p < 0.05
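If you want to sanity-check the significance numbers Claude reports, a standard two-proportion z-test is a few lines of Python. A sketch; the counts below are hypothetical placeholders:

```python
import numpy as np
from scipy.stats import norm

# Two-proportion z-test; activation counts are hypothetical placeholders
x_c, n_c = 1800, 4000   # control: activations, users
x_t, n_t = 1920, 4000   # treatment: activations, users

p_c, p_t = x_c / n_c, x_t / n_t
p_pool = (x_c + x_t) / (n_c + n_t)
se_pool = np.sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
z = (p_t - p_c) / se_pool
p_value = 2 * norm.sf(abs(z))        # two-sided p-value

# 95% CI for the difference in rates (unpooled standard error)
se = np.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
lo, hi = (p_t - p_c) - 1.96 * se, (p_t - p_c) + 1.96 * se
print(f"lift = {p_t - p_c:+.1%}, p = {p_value:.3f}, 95% CI [{lo:+.1%}, {hi:+.1%}]")
```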
3. Segment Analysis
```
Segment the experiment results by company_size and calculate activation rates for each segment
```
- Features work differently for different user types
- Topline averages can hide segment wins
- Always segment by target customer before deciding
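A pandas sketch of this segmentation, using the column names from the sample experiment CSV shown later in this module (the file path is assumed; pandas parses the True/False column as booleans):

```python
import pandas as pd

df = pd.read_csv("onboarding-experiment-results.csv")

# Activation rate per cohort within each company_size segment
seg = (df.groupby(["company_size", "cohort"])["completed_first_task"]
         .mean()
         .unstack("cohort"))
seg["lift_pp"] = (seg["treatment"] - seg["control"]) * 100
print(seg.round(3))
```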
4. Quality Metrics
```
Among activated users, calculate week 1 retention for both cohorts
```
- Activation rate = how many users activated
- Retention = whether those activations are good ones
- Check: Week 1 retention, engagement metrics, long-term retention
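A sketch of the retention check in pandas, assuming week 1 retention can be proxied as "completed at least one task in week 1" using the sample schema shown later (the real dataset may carry an explicit retention column):

```python
import pandas as pd

df = pd.read_csv("onboarding-experiment-results.csv")

# Quality check: among activated users, who kept doing tasks in week 1?
activated = df[df["completed_first_task"]]
retained = activated["tasks_completed_week_1"] > 0
print(retained.groupby(activated["cohort"]).mean().round(2))
```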
5. Leading Indicators
```
Compare template usage and invite rates between cohorts
```
- Feature adoption: Do users engage with the new feature?
- Viral metrics: Do users invite teammates?
- Depth of engagement: Do users use advanced features?
- Leading indicators predict future success
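And the invite-rate comparison in the same style (template usage isn't in the sample schema shown later, so this sketch covers only the invite leading indicator):

```python
import pandas as pd

df = pd.read_csv("onboarding-experiment-results.csv")

# Leading indicator: teammate invite rate per cohort
print(df.groupby("cohort")["invited_teammate"].mean().round(3))
```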
💼 Claude Code Data Analysis
What Claude Can Do
Read CSV files directly:
```
Read activation-funnel-q4.csv and calculate drop-off rates at each step
```
Process thousands of rows instantly:
```
Analyze the 8,000 rows in onboarding-experiment-results.csv and segment activation rates by company size
```
Build ROI models:
```
Build an impact estimation model using the framework in impact-estimation-framework.md
```
Run statistical analyses:
```
Calculate statistical significance between control and treatment groups with p-values and confidence intervals
```
Cross-reference data sources:
```
Analyze funnel data from activation-funnel-q4.csv and correlate with user feedback from user-survey-responses.csv
```
Sample CSV Structures
Funnel data (`activation-funnel-q4.csv`):
```csv
step,users_entered,users_completed,completion_rate,median_time_to_complete
Signup,10000,10000,1.0,0
First Task Created,10000,7200,0.72,18
First Task Completed,7200,2880,0.40,45
Invite Sent,2880,1440,0.50,24
```
Experiment data (`onboarding-experiment-results.csv`):
```csv
user_id,cohort,company_size,completed_first_task,invited_teammate,tasks_completed_week_1
control_user_0001,control,5-20,True,False,4
control_user_0002,control,5-20,False,False,0
treatment_user_0001,treatment,5-20,True,True,8
```
Claude reads these and presents formatted tables with insights.
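If you want to spot-check Claude's numbers yourself (see also the troubleshooting section below), a few lines of pandas reproduce the funnel drop-offs; a minimal sketch, assuming the funnel file above sits in the working directory:

```python
import pandas as pd

df = pd.read_csv("activation-funnel-q4.csv")

# Drop-off at each step = 1 - (users completing the step / users entering it)
df["drop_off_pct"] = (1 - df["users_completed"] / df["users_entered"]) * 100
print(df[["step", "drop_off_pct"]].to_string(index=False))
```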
💡 Real-World Examples
Discovery: Stuck Activation Rate
Situation: Activation plateaued at 45% for 6 months.
Analysis workflow:
1. `Read activation-funnel-q4.csv and find the biggest drop-off` → 60% drop at task completion
2. `Analyze user-survey-responses.csv and extract top complaints` → "Need examples/templates"
3. Cross-reference: drop-off correlates with survey feedback
4. Synthesize: create `problem-analysis.md` with quantitative and qualitative evidence
Outcome: Clear problem statement backed by data, ready for stakeholder alignment.
Impact Estimation: Justifying Guided Onboarding
Situation: Proposed $100k feature (4 eng-months). Engineering skeptical, leadership wants ROI.
Analysis workflow:
1. `Analyze taskflow-usage-data-q4.csv to calculate current activation rate` → 45% baseline
2. Estimate lift based on survey data (60% of the drop cites "need examples" → conservatively recover 30% → 13pp lift)
3. `Build complete ROI model with baseline, projections, and business impact`
4. `Generate pessimistic, realistic, and optimistic scenarios`
Outcome: 9.4x ROI over 3 years (realistic), 2.6x even in pessimistic case. Build approved.
Experiment Analysis: Revealing Hidden Wins
Situation: Topline shows 45% → 48% (+2.6pp, p=0.04). Team disappointed.
Analysis workflow:
1. Check topline → modest +2.6pp
2. `Segment results by company_size` → small teams: +11.4pp (huge!), enterprise: -3.5pp (negative)
3. `Among activated users, calculate week 1 retention` → treatment: 78% vs control: 60%
4. `Compare template usage and invite rates` → template usage 3.2x higher, invite rate 35% vs 12%
Outcome: What looked like a failure is a huge win for target segment. Ship to small teams, exclude enterprise.
🎯 Best Practices
Analysis Approach
Do:
- Always validate hypotheses with data before building
- Create three scenarios for every estimate (acknowledge uncertainty)
- Segment by target customer (topline can hide wins)
- Check quality metrics (retention > activation count)
- Look for leading indicators that predict long-term success
- Cross-reference quantitative + qualitative data
Don’t:
- Stop at topline metrics without segmentation
- Use single-point estimates (use ranges and scenarios)
- Assume 100% adoption (account for gradual rollout)
- Ignore negative segments (exclude them from rollout)
- Kill experiments before checking segments and quality
- Over-optimize lift estimates (be conservative)
Lift Estimation Sources
Best to worst:
1. **Your historical data** - past experiments are the best predictors
2. **User research** - survey shows 60% drop due to X → fixing X recovers Y%
3. **Competitor benchmarks** - industry standards for similar features
4. **Expert judgment** - team estimates from eng/design/PM
Pro Tips
**Build a lift estimate library.** Track estimated vs. actual lift for every feature. After 5-10 features, you'll get much better at estimating.

**Front-load disappointing news.** Show the modest topline first, then reveal segment wins. It teaches stakeholders to always dig deeper.

**Automate analysis scripts.** Save prompts for funnel analysis, segmentation, and ROI modeling. Reuse them across features.
📁 Working with CSV Data
Common Data Sources
Platform | Export Type |
---|---|
Mixpanel, Amplitude | Usage events, funnels |
Optimizely, LaunchDarkly | A/B test results |
Qualtrics, SurveyMonkey | Survey responses |
Google Analytics | Traffic/conversion data |
Claude Code can read CSV, TSV, and JSON directly.
Viewing CSV Files
Options:
- Excel or Google Sheets - Best for exploring data visually
- VS Code - Good for viewing raw structure
- Let Claude read it - Claude formats data in clean markdown tables
Recommended: Let Claude read and analyze the CSV, presenting results in formatted tables. View raw CSV only if you need to verify specific data points.
🐛 Troubleshooting
Claude can’t read my CSV file
- Check the file path with `ls` or the file browser
- Use the correct relative path (e.g., `data/experiment-results.csv`)
- Verify the file extension is `.csv`, not `.CSV` (case-sensitive on some systems)
Results don’t match what I see in Excel
- Ask Claude to show the calculation step-by-step
- Verify Claude is using the correct columns
- Ask: `Explain how you calculated activation rate from this CSV`
Statistical significance seems wrong
- Check sample size: you need roughly 400+ users per cohort for reliable tests
- Remember: p < 0.05 means "less than a 5% chance of seeing an effect this large if there were no real difference," not "5% error"
- Wide confidence intervals = high uncertainty, even if p < 0.05
Segment analysis shows conflicting results
- This isn’t a bug - it’s an insight!
- Features often win for target segment and lose for others
- Solution: Ship to winning segment, exclude losing segment
📖 Key Terms
Term | Definition |
---|---|
Activation Rate | Percentage of signups who complete a key action |
Confidence Interval | Range of plausible values for the true effect size (e.g., 95% CI: [0.1%, 5.1%]) |
Funnel Analysis | Tracking users through sequential steps to identify drop-off points |
Leading Indicator | Early metric that predicts future success (e.g., invite rate predicts retention) |
Lift | Improvement in metric (e.g., activation 45% → 58% = +13pp lift) |
LTV (Lifetime Value) | Total revenue from a customer over their entire relationship |
p-value | Probability of observing an effect at least this large if there were no true difference; p < 0.05 is the conventional significance threshold |
ROI | Revenue or value generated divided by cost (e.g., 9.4x ROI) |
Segment | Subset of users grouped by shared characteristic (e.g., company size, role) |
Topline Metric | Overall average metric before segmentation |
🚀 What’s Next?
You now understand how to analyze funnel and survey data, build ROI models with scenario analyses, and analyze A/B tests beyond topline metrics using Claude Code as your data analysis partner.
Module 2.3: Learn about Competitive Research & Strategic Analysis - conduct rapid competitive research with parallel agents and apply strategic frameworks.
Interactive track: type `/start-2-3`
About This Course
Created by Carl Vellotti. If you have any feedback about this module or the course overall, message me! I'm building a newsletter and community for PM builders; check out The Full Stack PM.
Source Repository: github.com/carlvellotti/claude-code-pm-course