
How to Calculate Feature Flag Impact in PostHog

Your feature flag is live, but is it actually working? PostHog gives you the data to measure what changed after you rolled out a flag. We'll show you how to calculate the actual impact on your key metrics.

Capture Flag Variant Events

Before you can measure impact, PostHog needs to know which variant each user saw.

Send events tagged with the flag variant

When a user interacts with a flagged feature, capture an event that includes the variant they received. Use the PostHog SDK to add the flag name and variant as properties on the event.

javascript
posthog.capture('feature_interaction', {
  'feature_flag': 'new_checkout_flow',
  'flag_variant': 'treatment',
  'action': 'clicked_button'
});
Capture an event with the flag name and variant as properties

Ensure your control and treatment groups both log events

Users in the control (original) variant and treatment (new) variant must both send the same event. This is how PostHog compares them. If only one group sends events, you can't measure impact.

javascript
// posthog.isFeatureEnabled is the documented top-level check for a boolean flag
if (posthog.isFeatureEnabled('new_checkout_flow')) {
  showNewCheckout();
  posthog.capture('checkout_flow_viewed', {
    'flag_variant': 'treatment'
  });
} else {
  showOldCheckout();
  posthog.capture('checkout_flow_viewed', {
    'flag_variant': 'control'
  });
}
Both branches should send the same event with different variant properties
Watch out: If you only log events for the treatment group, your control group will have zero events and you won't see the comparison.
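
One way to make it harder to forget a branch is to route both variants through a single helper. The sketch below is illustrative, not an official PostHog pattern: toVariantLabel and captureWithVariant are hypothetical names, and treating a flag that has not loaded yet (undefined) as control is an assumption you may want to change. It only relies on the client exposing getFeatureFlag and capture, which posthog-js does.

```javascript
// Hypothetical helper: turn a flag value into the variant label used in this guide.
// Multivariate flags yield a string key; boolean flags yield true/false.
// Assumption: anything other than a string or `true` (including undefined) counts as control.
function toVariantLabel(flagValue) {
  if (typeof flagValue === 'string') return flagValue;
  return flagValue === true ? 'treatment' : 'control';
}

// Hypothetical helper: capture the same event for every variant, tagged consistently.
// `client` is any object with getFeatureFlag(key) and capture(name, props), such as posthog.
function captureWithVariant(client, flagKey, eventName, extraProps = {}) {
  const properties = {
    ...extraProps,
    feature_flag: flagKey,
    flag_variant: toVariantLabel(client.getFeatureFlag(flagKey)),
  };
  client.capture(eventName, properties);
  return properties;
}

// Stand-in client for demonstration; in the browser you would pass `posthog` instead.
const recorded = [];
const demoClient = {
  getFeatureFlag: () => true,
  capture: (name, props) => recorded.push({ name, props }),
};
captureWithVariant(demoClient, 'new_checkout_flow', 'checkout_flow_viewed');
console.log(recorded[0].props.flag_variant);
```

Because the event name is passed once and the variant label is derived from the flag value, control and treatment users always emit the same event.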

Calculate Impact in Insights

Once both variants are logging events, create an Insight in PostHog to compare their metrics.

Create a Trend or Funnel insight

Go to Insights > New Insight and choose Trend to measure event frequency over time, or Funnel to see how users progress through steps. Select the event you want to measure (e.g., checkout_flow_viewed).

javascript
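// Assumes a personal API key sent as a Bearer token and a project-scoped endpoint;
// check the PostHog API reference for the exact path on your instance.
fetch('https://your-instance.posthog.com/api/projects/YOUR_PROJECT_ID/insights/', {
  method: 'GET',
  headers: { 'Authorization': 'Bearer YOUR_API_TOKEN' }
}).then(res => res.json()).then(data => {
  console.log(data.results);
});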
Retrieve insights via the PostHog API to track them programmatically

Add a property filter for flag variant

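In the Insight, add the filter flag_variant = control to one series, then duplicate the series and change its filter to flag_variant = treatment. PostHog will plot both so you can compare the metrics side by side. Alternatively, break the insight down by flag_variant to get one line per variant.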

javascript
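// Sketch using the project-scoped query endpoint. The path, query schema, and
// response shape are assumptions to verify against the PostHog API reference.
fetch('https://your-instance.posthog.com/api/projects/YOUR_PROJECT_ID/query/', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_TOKEN'
  },
  body: JSON.stringify({
    query: {
      kind: 'TrendsQuery',
      series: [{
        kind: 'EventsNode',
        event: 'checkout_flow_viewed',
        properties: [{ key: 'flag_variant', operator: 'exact', value: 'treatment', type: 'event' }]
      }]
    }
  })
}).then(res => res.json()).then(data => {
  console.log('Treatment events:', data.results[0].count);
});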
Filter trends by variant property to get counts for each group

Calculate the percentage change

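Once you have counts or rates for both variants, calculate the relative change: (treatment - control) / control × 100. If treatment has 1,200 events and control has 1,000, that's a +20% change. Raw counts are only comparable when both groups contain a similar number of users; if they don't, compare rates instead.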
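
The formula is simple enough to wrap in a small function. This is a minimal sketch; percentChange is a hypothetical name, and the inputs are the example numbers from the paragraph above.

```javascript
// Relative change of treatment versus control, expressed as a percentage.
function percentChange(treatment, control) {
  if (control === 0) {
    // With a zero baseline the relative change is undefined.
    throw new RangeError('control must be non-zero');
  }
  return ((treatment - control) / control) * 100;
}

// 1200 treatment events against 1000 control events is a lift of about 20%.
const lift = percentChange(1200, 1000);
console.log(`${lift.toFixed(1)}%`);
```

A negative result means the treatment underperformed the control on that metric.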

Tip: Use Cohorts to create a group of users who saw each variant, then run your insight filtered to each cohort. This can be more reliable than property filters, because cohort membership doesn't depend on every event carrying the flag_variant property.

Check Statistical Significance

Not all improvements are real. PostHog helps you determine if your result is statistically sound.

Use PostHog Experiments for automated significance testing

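If you set up a feature flag as an Experiment in PostHog, it automatically tracks control vs. treatment and runs the significance analysis for you. The statistical method depends on how the experiment is configured, so check the experiment's settings rather than assuming a particular test. Open the Experiment and look for the Statistical Significance indicator.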

javascript
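// Assumes a personal API key sent as a Bearer token and a project-scoped endpoint.
// The `significance` and `confidence` field names are illustrative; check the
// experiments API reference for the actual response shape.
fetch('https://your-instance.posthog.com/api/projects/YOUR_PROJECT_ID/experiments/', {
  method: 'GET',
  headers: { 'Authorization': 'Bearer YOUR_API_TOKEN' }
}).then(res => res.json()).then(data => {
  const exp = data.results[0];
  console.log('Significance:', exp.significance);
  console.log('Confidence:', exp.confidence);
});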
Fetch experiment results including statistical significance from the PostHog API

Look for p-value < 0.05

When the experiment reports a p-value and confidence level (commonly 95%), a p-value below 0.05 means that, if the feature truly made no difference, a gap at least this large would appear less than 5% of the time through random variation alone. That's the conventional threshold for a statistically significant result.
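
If you want to sanity-check a result by hand, a generic two-proportion z-test gives a p-value from conversion counts. This is a textbook frequentist sketch, not a reproduction of what PostHog computes; twoProportionPValue and the sample numbers are hypothetical, and the normal CDF relies on a standard polynomial approximation of the error function.

```javascript
// Approximate error function (Abramowitz and Stegun formula 7.1.26).
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Cumulative distribution function of the standard normal distribution.
function normalCdf(z) {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

// Two-sided p-value for the difference between two conversion rates.
function twoProportionPValue(controlConversions, controlUsers, treatmentConversions, treatmentUsers) {
  const pControl = controlConversions / controlUsers;
  const pTreatment = treatmentConversions / treatmentUsers;
  // Pooled rate under the null hypothesis that both variants convert equally.
  const pooled = (controlConversions + treatmentConversions) / (controlUsers + treatmentUsers);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / controlUsers + 1 / treatmentUsers));
  if (standardError === 0) return 1; // no variation at all, so no evidence of a difference
  const z = (pTreatment - pControl) / standardError;
  return 2 * (1 - normalCdf(Math.abs(z)));
}

// Made-up example: 100 of 1000 control users convert versus 150 of 1000 treatment users.
const pValue = twoProportionPValue(100, 1000, 150, 1000);
console.log(pValue < 0.05 ? 'significant at the 5% level' : 'not significant');
```

The test assumes independent users and reasonably large groups; for small samples or many repeated looks at the data, the p-value it returns will be less trustworthy.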

Common Pitfalls

  • Only logging events for the treatment variant: you need both control and treatment sending the same event to compare them
  • Stopping analysis after one day: feature flag impact needs time to stabilize, so wait for at least one week of user data before drawing conclusions
  • Confusing event volume with conversion rate: a higher event count doesn't mean a better outcome if the treatment group simply has more users, so measure the rate (events per user) instead
  • Ignoring sample size: with small user counts, random variation can masquerade as impact, so as a rule of thumb make sure you have at least 100 events in each variant before trusting the result
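
The last two pitfalls can be guarded against in a few lines. The sketch below is one possible approach with hypothetical names (eventsPerUser, compareVariants) and made-up numbers; the default threshold of 100 events mirrors the rule of thumb in the list above and is not a statistical guarantee.

```javascript
// Hypothetical helper: average number of events per user in one variant.
function eventsPerUser(eventCount, userCount) {
  if (userCount <= 0) throw new RangeError('userCount must be positive');
  return eventCount / userCount;
}

// Compare two variants by rate rather than raw volume, and flag thin data.
// Note: liftPercent is not finite when the control rate is zero.
function compareVariants(control, treatment, minEvents = 100) {
  const controlRate = eventsPerUser(control.events, control.users);
  const treatmentRate = eventsPerUser(treatment.events, treatment.users);
  return {
    controlRate,
    treatmentRate,
    liftPercent: ((treatmentRate - controlRate) / controlRate) * 100,
    enoughData: control.events >= minEvents && treatment.events >= minEvents,
  };
}

// Made-up numbers: treatment has 20% more events, but also 20% more users.
const summary = compareVariants(
  { events: 1000, users: 500 },
  { events: 1200, users: 600 }
);
console.log(summary);
```

Here both variants average the same number of events per user, so the apparent gain in raw volume disappears once the counts are normalized.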

Wrapping Up

Now you can ship a feature flag and actually know whether it worked. Track both variants, compare them in Insights, and check statistical significance before declaring victory. If you want to track this automatically across tools and get insights without manual analysis, Product Analyst can help.
