A/B testing lets you run controlled experiments to see how changes affect user behavior. In Amplitude, you assign users to control and variant groups, track their events separately, and use statistical tools to determine which version wins. Without this, you're just guessing whether your changes actually move the needle.
What Is A/B Testing in Amplitude
A/B testing (also called split testing) compares two or more versions of a feature to measure which performs better. Amplitude segments your users into groups—one sees the control (baseline), others see variants. All events are tagged by variant, so you can compare behavior across groups.
Track users with their variant assignment
When a user enters your experiment, assign them to a control or variant group using Amplitude's Identify method. Tag them with the variant as a user property. Every event they fire afterward automatically inherits that property, making it filterable in dashboards.
```javascript
// Assign user to experiment variant
const variant = Math.random() > 0.5 ? 'control' : 'variant_b';
amplitude.identify(new amplitude.Identify().set('experiment_variant', variant));

// Track event—variant is now attached
amplitude.track('Pricing Page Viewed', {
  experiment_name: 'pricing_redesign_2026',
  experiment_variant: variant
});
```

Define your success metric
Pick what you're measuring: sign-ups, clicks, revenue, time on page, or feature adoption. This becomes your primary metric. The control group is your baseline. You'll compare each variant's metric against the control to find the winner.
```javascript
// Track the conversion event consistently, reusing the variant assigned
// at experiment entry (the Browser SDK exposes no user-property getter,
// so keep the value in your own code or storage)
amplitude.track('Subscription Purchased', {
  plan_tier: 'professional',
  revenue: 99,
  experiment_name: 'pricing_redesign_2026',
  experiment_variant: variant
});
```

Setting Up and Running an Experiment
Amplitude's Experiments feature automates variant assignment and statistical calculations. You define the groups, Amplitude randomizes users, and the dashboard shows confidence levels as data flows in.
Create the experiment in the dashboard
Go to Experiments in the left sidebar. Click Create New Experiment. Name it, set your control and variant descriptions, then pick your eligibility criteria (all users, or a specific segment). Amplitude automatically randomizes new users 50/50 between groups.
```javascript
// Fetch the stored variant server-side via Amplitude's REST API
// (illustrative endpoint; check Amplitude's User Profile API reference
// for the exact URL and auth scheme before relying on this)
const getExperimentVariant = async (userId) => {
  const response = await fetch(
    `https://api.amplitude.com/2/users/${userId}`,
    {
      headers: { 'Authorization': `Bearer ${process.env.AMPLITUDE_API_KEY}` }
    }
  );
  const data = await response.json();
  return data.user_properties?.experiment_variant || 'control';
};

// On the client, reuse the variant you stored at assignment time;
// the Browser SDK has no getUserProperty-style getter.
```

Wait for statistical significance
Amplitude calculates a p-value and confidence level (usually 95% is the target). Don't end the experiment after 24 hours because early results are noisy. Run it for at least 5-7 days, longer if traffic is low. The dashboard shows Confidence and p-value—ship the variant when confidence hits 95% or higher.
Analyze results in the dashboard
Once the test runs, open the Experiment Results view. Amplitude shows the metric (e.g., conversion rate) for each group, the relative lift (how much better the variant is), and the statistical significance. If confidence is below 80%, the test is inconclusive—keep running.
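Under the hood, the significance call is a comparison of two conversion rates. Amplitude runs this math for you; the sketch below only illustrates the idea with a two-proportion z-test in plain JavaScript (the normal-CDF approximation and the sample numbers are my own assumptions, not Amplitude's internals):

```javascript
// Standard normal CDF via the Abramowitz & Stegun erf approximation.
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const erf =
    1 -
    (((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t -
      0.284496736) * t + 0.254829592) * t) * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Two-sided p-value comparing two conversion rates (pooled z-test).
function twoProportionPValue(convA, nA, convB, nB) {
  const pA = convA / nA;
  const pB = convB / nB;
  const pPool = (convA + convB) / (nA + nB);
  const se = Math.sqrt(pPool * (1 - pPool) * (1 / nA + 1 / nB));
  const z = (pB - pA) / se;
  return 2 * (1 - normalCdf(Math.abs(z)));
}

// Example: 500 of 10,000 control users converted vs 600 of 10,000 variant users.
const pValue = twoProportionPValue(500, 10000, 600, 10000);
console.log(pValue < 0.05); // true: significant at the 95% level
```

The takeaway: a 5% vs 6% conversion difference needs thousands of users per group before the p-value clears the 0.05 bar, which is why small tests stay inconclusive for so long.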
Common Mistakes and How to Avoid Them
A/B testing seems straightforward but is easy to botch. Here's what trips most teams up.
Don't peek at results too early
Your instinct will be to check results after a day and declare a winner. That's a false positive trap. Early results are dominated by noise. Set a minimum sample size upfront, stick to it, and only end when confidence reaches 95%. Amplitude shows confidence—trust the math, not your gut.
```javascript
// Wrong: declaring a winner after 1000 users
if (conversionRateVariant > conversionRateControl * 1.1) {
  // STOP—this is likely noise
}

// Right: let Amplitude calculate confidence.
// Run until the dashboard shows 95% confidence or p < 0.05.
```

Keep variant assignment consistent
If you randomize the variant on every page load, a user might see both control and variant, polluting results. Assign once, store it (localStorage, session, or backend), and reuse it for the entire test duration. Use Amplitude's Identify to lock it in.
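One way to make the assignment sticky is a small helper that checks storage before rolling the dice. This is a sketch under my own naming; the `storage` argument is anything with `getItem`/`setItem`, such as `localStorage` in the browser:

```javascript
// Assign once, persist, reuse for the whole test.
function getStickyVariant(experimentName, storage) {
  const key = `exp_${experimentName}`;
  let variant = storage.getItem(key);
  if (!variant) {
    variant = Math.random() < 0.5 ? 'control' : 'variant_b';
    storage.setItem(key, variant); // locked in: later calls reuse this value
  }
  return variant;
}
```

In the browser: `const variant = getStickyVariant('pricing_redesign_2026', localStorage);`, then set it with Identify as shown earlier. Note that localStorage is per-browser, so a backend-stored assignment keyed by user ID is more robust for logged-in users.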
Don't run conflicting experiments on the same feature
If you run Test A (button color) and Test B (button text) simultaneously on the same button, effects are confounded. You can't tell which change drove the result. Run one experiment, ship the winner, then test the next change.
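If you do need two experiments live at once, a common pattern is to split traffic deterministically so no user ever lands in both. The hash function and bucket split below are illustrative assumptions, not an Amplitude API:

```javascript
// Map a user deterministically into one of `buckets` traffic buckets.
function bucketForUser(userId, salt, buckets = 100) {
  const input = `${salt}:${userId}`;
  let hash = 0;
  for (let i = 0; i < input.length; i++) {
    hash = (hash * 31 + input.charCodeAt(i)) >>> 0; // simple 31-based string hash
  }
  return hash % buckets;
}

// Reserve buckets 0-49 for Test A and 50-99 for Test B, so the
// button-color and button-text tests never share a user.
function eligibleExperiment(userId) {
  const bucket = bucketForUser(userId, 'feature_split');
  return bucket < 50 ? 'test_a_button_color' : 'test_b_button_text';
}
```

Because the hash is deterministic, the same user always falls in the same bucket, so the split survives page reloads without any storage.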
Common Pitfalls
- Stopping early because results look good. Early data is noisy. Wait for statistical significance (95% confidence) even if it takes weeks.
- Randomizing variant assignment on every page load instead of once per user. This breaks the experiment—users see both variants, mixing results.
- Forgetting to include experiment_variant in event properties. Without it, you can't segment the dashboard by group to compare metrics.
- Running too many concurrent experiments on the same feature. Each test needs dedicated traffic. Overlapping tests make it impossible to isolate cause and effect.
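The missing-property pitfall is easy to prevent structurally: wrap the track call so every event in the experiment carries the experiment fields automatically. A sketch, where the `client` parameter stands in for your Amplitude SDK instance:

```javascript
// Returns a track function that always attaches the experiment properties.
function makeExperimentTracker(client, experimentName, variant) {
  return (eventName, props = {}) => {
    client.track(eventName, {
      ...props,
      experiment_name: experimentName,   // always attached
      experiment_variant: variant,       // can't be forgotten per call site
    });
  };
}
```

Usage: `const track = makeExperimentTracker(amplitude, 'pricing_redesign_2026', variant);` then `track('Subscription Purchased', { revenue: 99 });` everywhere in the experiment code path.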
Wrapping Up
A/B testing in Amplitude turns feature decisions into data-driven calls. Assign variants consistently, let time and traffic build statistical power, and ship what wins. It's the difference between hoping a change works and knowing it does. If you want to track this automatically across tools and run experiments without engineering overhead, Product Analyst can help.