
What Is A/B Testing in Amplitude

A/B testing lets you run controlled experiments to see how changes affect user behavior. In Amplitude, you assign users to control and variant groups, track their events separately, and use statistical tools to determine which version wins. Without this, you're just guessing whether your changes actually move the needle.

What Is A/B Testing in Amplitude

A/B testing (also called split testing) compares two or more versions of a feature to measure which performs better. Amplitude segments your users into groups—one sees the control (baseline), others see variants. All events are tagged by variant, so you can compare behavior across groups.

Track users with their variant assignment

When a user enters your experiment, assign them to a control or variant group using Amplitude's Identify method. Tag them with the variant as a user property. Every event they fire afterward automatically inherits that property, making it filterable in dashboards.

javascript
// Assign the user to a variant once, then persist the choice so it
// survives page loads (see the warning below about reassignment)
let variant = localStorage.getItem('pricing_redesign_2026');
if (!variant) {
  variant = Math.random() < 0.5 ? 'control' : 'variant_b';
  localStorage.setItem('pricing_redesign_2026', variant);
}
amplitude.identify(new amplitude.Identify().set('experiment_variant', variant));

// Track event—variant is now attached as a user property
amplitude.track('Pricing Page Viewed', {
  experiment_name: 'pricing_redesign_2026',
  experiment_variant: variant
});
Set the variant once per user, then reference it in all experiment events

Define your success metric

Pick what you're measuring: sign-ups, clicks, revenue, time on page, or feature adoption. This becomes your primary metric. The control group is your baseline. You'll compare each variant's metric against the control to find the winner.

javascript
// Track the conversion event consistently; reuse the persisted assignment
// (the Browser SDK has no client-side getter for user properties)
const variant = localStorage.getItem('pricing_redesign_2026') || 'control';
amplitude.track('Subscription Purchased', {
  plan_tier: 'professional',
  revenue: 99,
  experiment_variant: variant,
  experiment_name: 'pricing_redesign_2026'
});
Always include experiment_variant so you can segment results by group
Watch out: Assign variants deterministically—once per user, stored in localStorage or your backend. Randomizing on every page load breaks the experiment. If you reassign users mid-test, results become unreliable.

Setting Up and Running an Experiment

Amplitude's Experiments feature automates variant assignment and statistical calculations. You define the groups, Amplitude randomizes users, and the dashboard shows confidence levels as data flows in.

Create the experiment in the dashboard

Go to Experiments in the left sidebar. Click Create New Experiment. Name it, set your control and variant descriptions, then pick your eligibility criteria (all users, or a specific segment). Amplitude automatically randomizes new users 50/50 between groups.

javascript
// Read this user's assignment via the Amplitude Experiment SDK
import { Experiment } from '@amplitude/experiment-js-client';

const experiment = Experiment.initialize('DEPLOYMENT_KEY');
await experiment.fetch({ user_id: userId });

// Variant values match the names you configured in the dashboard
const variant = experiment.variant('pricing_redesign_2026');
if (variant.value === 'treatment') {
  // show the new experience
}
Client-side shown here; Amplitude also ships server-side SDKs for the same lookup

Wait for statistical significance

Amplitude calculates a p-value and confidence level (95% is the usual target). Don't end the experiment after 24 hours; early results are noisy. Run it for at least 5-7 days, longer if traffic is low. The dashboard shows Confidence and p-value. Ship the variant when confidence hits 95% or higher.
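The confidence number the dashboard shows comes from statistics like these. Here's a minimal sketch of a classic two-proportion z-test, one common way to turn conversion counts into a two-sided p-value; Amplitude's own methodology may differ, so treat this as illustration only:

```javascript
// Normal CDF via the Abramowitz & Stegun erf approximation
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const erf = 1 - (((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
      - 0.284496736) * t + 0.254829592) * t) * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Two-sided p-value for the difference between two conversion rates:
// conversions and sample sizes for control and variant
function twoSidedPValue(convControl, nControl, convVariant, nVariant) {
  const pC = convControl / nControl;
  const pV = convVariant / nVariant;
  const pPool = (convControl + convVariant) / (nControl + nVariant);
  const se = Math.sqrt(pPool * (1 - pPool) * (1 / nControl + 1 / nVariant));
  const z = (pV - pC) / se;
  return 2 * (1 - normalCdf(Math.abs(z)));
}
```

Roughly, p < 0.05 corresponds to the 95% confidence threshold the dashboard reports.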

Analyze results in the dashboard

Once the test runs, open the Experiment Results view. Amplitude shows the metric (e.g., conversion rate) for each group, the relative lift (how much better the variant is), and the statistical significance. If confidence is below 80%, the test is inconclusive—keep running.
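Relative lift is simple arithmetic over the two conversion rates. A quick sketch with hypothetical numbers; in practice you read these off the Results view:

```javascript
// Relative lift: the variant's improvement as a fraction of the control rate
function relativeLift(controlRate, variantRate) {
  return (variantRate - controlRate) / controlRate;
}

// Hypothetical rates: control converts at 4.2%, variant B at 5.1%
const lift = relativeLift(0.042, 0.051);
console.log(`${(lift * 100).toFixed(1)}% relative lift`); // "21.4% relative lift"
```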

Tip: Use Funnel Analysis to compare how each variant progresses through your user journey. If variant B has higher funnel completion but the same overall conversion rate, dig into the later steps: the gain is evaporating somewhere before your primary metric.

Common Mistakes and How to Avoid Them

A/B testing seems straightforward but is easy to botch. Here's what trips most teams up.

Don't peek at results too early

Your instinct will be to check results after a day and declare a winner. That's a false positive trap. Early results are dominated by noise. Set a minimum sample size upfront, stick to it, and only end when confidence reaches 95%. Amplitude shows confidence—trust the math, not your gut.

javascript
// Wrong: checking results after 1000 users
if (conversionRate_variant > conversionRate_control * 1.1) {
  // STOP—this is likely noise
}

// Right: let Amplitude calculate confidence
// Run until Amplitude dashboard shows 95% confidence or p < 0.05
Let Amplitude do the stats. Don't stop early.

Keep variant assignment consistent

If you randomize the variant on every page load, a user might see both control and variant, polluting results. Assign once, store it (localStorage, session, or backend), and reuse it for the entire test duration. Use Amplitude's Identify to lock it in.
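The assign-once pattern this section describes can be wrapped in a small helper. A sketch that accepts any Storage-like backend (localStorage in the browser, a session store or database wrapper on the server); the key prefix and variant names are illustrative:

```javascript
// Get-or-assign: return the stored variant if one exists,
// otherwise randomize once and persist the result.
function getOrAssignVariant(storage, experimentName) {
  const key = `exp_${experimentName}`;
  let variant = storage.getItem(key);
  if (!variant) {
    variant = Math.random() < 0.5 ? 'control' : 'variant_b';
    storage.setItem(key, variant);
  }
  return variant;
}

// In the browser:
// const variant = getOrAssignVariant(localStorage, 'pricing_redesign_2026');
```

Because assignment happens only on the first call, every later page view, session, and event reuses the same variant for the life of the test.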

Don't run conflicting experiments on the same feature

If you run Test A (button color) and Test B (button text) simultaneously on the same button, effects are confounded. You can't tell which change drove the result. Run one experiment, ship the winner, then test the next change.

Common Pitfalls

  • Stopping early because results look good. Early data is noisy. Wait for statistical significance (95% confidence) even if it takes weeks.
  • Randomizing variant assignment on every page load instead of once per user. This breaks the experiment—users see both variants, mixing results.
  • Forgetting to include experiment_variant in event properties. Without it, you can't segment the dashboard by group to compare metrics.
  • Running too many concurrent experiments on the same feature. Each test needs dedicated traffic. Overlapping tests make it impossible to isolate cause and effect.

Wrapping Up

A/B testing in Amplitude turns feature decisions into data-driven calls. Assign variants consistently, let time and traffic build statistical power, and ship what wins. It's the difference between hoping a change works and knowing it does. If you want to track this automatically across tools and run experiments without engineering overhead, Product Analyst can help.

Track these metrics automatically

Product Analyst connects to your stack and surfaces the insights that matter.

Try Product Analyst — Free