Amazon runs thousands of experiments every year across its platform. Microsoft Bing ran a single A/B test that added $100 million to its annual US revenue. Ecommerce A/B testing separates stores that guess from stores that know, and that distinction shows up directly in conversion rates.
Around 77% of businesses now run split tests on their websites, yet most only scratch the surface of what’s possible. This guide covers what testing actually involves, which elements deserve priority, and the tools that make systematic experimentation practical.
What Is A/B Testing and Why Online Stores Do It
Split your traffic between two versions of a page. Track which one drives more conversions. Implement the winner. That’s A/B testing at its simplest.
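In practice the split also needs to be sticky: the same visitor should see the same version on every visit so their behaviour is attributed to one variant. A minimal sketch of how that bucketing typically works, assuming a Python service and a hypothetical visitor ID (this is illustrative, not any specific tool’s API):

```python
import hashlib

def assign_variant(visitor_id: str, experiment: str = "cta_colour") -> str:
    """Deterministically bucket a visitor into variant 'A' or 'B'.

    Hashing the visitor ID together with the experiment name keeps each
    visitor in the same variant on every visit, so conversions are
    attributed cleanly. Names here are illustrative assumptions.
    """
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

print(assign_variant("visitor-42"))  # the same visitor always gets the same answer
```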
Online stores invest in this approach because guessing costs money. Every assumption you make about what customers want represents potential lost sales if you’re wrong. Testing removes the guesswork and replaces opinions with evidence, which is why companies running systematic testing programmes achieve 48% higher conversion rates than those relying on gut instinct alone.
Product page optimisation typically increases conversion rates by 12-28% when properly executed, and these gains compound as each winning test builds on previous improvements. The maths becomes compelling quickly. A store converting at 2% that improves to 2.5% has increased revenue by 25% from the same traffic.
Statistical significance matters. Running a test until you see a difference isn’t the same as running a test until that difference is statistically reliable. Most tools require at least a 95% confidence level before declaring a winner, meaning a difference that large would appear through random variation alone no more than 5% of the time if your change actually had no effect.
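To make that concrete, here is a minimal sketch of the two-proportion z-test most testing tools run for you, using only the Python standard library; the conversion counts are hypothetical:

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)              # pooled rate under "no difference"
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # two-sided p-value
    return p_a, p_b, z, p_value

# Hypothetical counts: 200 conversions from 10,000 control visitors,
# 245 conversions from 10,000 variant visitors
p_a, p_b, z, p = two_proportion_z_test(200, 10_000, 245, 10_000)
print(f"control {p_a:.2%}, variant {p_b:.2%}, z = {z:.2f}, p = {p:.3f}")
```

With these hypothetical figures the p-value lands around 0.03, so the variant clears the 95% threshold; with half the traffic, the same relative lift would not.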
Elements Every Ecommerce Store Should Test
Knowing that testing improves performance differs from knowing where to start. The following elements consistently yield measurable improvements.
Call-to-Action Buttons
Your add-to-cart and buy-now buttons are the most direct path to revenue. For a deeper dive into effective button design, see our guide on CTAs for ecommerce sites.
Andreas Carter Sports tested changing their CTA button from green to blue and saw conversions increase by 50%. GoCardless changed their button text from “Request a demo” to “Watch a demo” and achieved a 139% increase in form completions. Neither change required development time or design overhauls.
CTA elements worth testing:
- Colour: contrast against page background, brand alignment
- Size: larger buttons on mobile, prominence relative to other elements
- Position: above fold versus after product details
- Copy: first-person (“Add to my basket”) versus second-person (“Add to your basket”)
- Shape: rounded corners versus sharp edges, button versus text link
Research indicates that CTAs using anchor text outperform banner-style buttons by 121%. Position matters too. Placing buy buttons above the fold typically improves conversions, but higher-priced items sometimes benefit from CTAs positioned after detailed product information, giving customers time to build confidence.
Product Page Layouts
Image size, description placement, review positioning, and trust signals all compete for the visitor’s eye on a single page. Testing lifestyle photography against studio shots often reveals surprising preferences. Some audiences respond better to products shown in real-world contexts, while others prefer clean, detailed product-only images.
Review placement influences trust and conversion simultaneously. Displaying ratings prominently near the price can increase confidence, but the format matters as much as the position. Testing star ratings against numerical scores, showing or hiding review counts, and surfacing negative reviews rather than burying them all produce measurable differences in purchasing behaviour.
My preference leans toward testing description length first, as this typically produces larger conversion differences than copy style changes. Product pages for complex items often benefit from longer, more detailed descriptions, while impulse purchases convert better with minimal friction between interest and checkout.
Checkout Process
Our ecommerce CRO framework covers how to identify where your funnel leaks revenue. Cart abandonment remains the biggest problem for most stores, and testing checkout elements directly addresses it.
High-impact checkout tests:
| Element | What to Test | Potential Impact |
|---|---|---|
| Progress indicators | Numbered steps vs progress bar vs none | 11-25% completion improvement |
| Payment visibility | Show all methods upfront vs reveal on selection | 8-19% completion improvement |
| Guest checkout | Prominent vs hidden vs forced account creation | Significant first-time buyer impact |
| Shipping thresholds | Different free delivery minimums | 15-23% abandonment reduction |
Single-page versus multi-step checkout itself warrants testing, as different audiences have different patience thresholds. Younger shoppers accustomed to one-click purchasing often prefer condensed single-page checkouts, while older customers may find them overwhelming and appreciate separate steps for shipping, billing, and confirmation.
Homepage and Navigation
Hero images versus videos create different engagement patterns. Adding video to landing pages can improve conversion rates by up to 80% according to research, though this depends heavily on video quality and load time impact.
Testing simplified menus against full mega-menus reveals customer preferences for breadth versus depth in product discovery. The optimal menu layout depends on your catalogue size and how customers typically search for products. Autocomplete features, filtering options, and result display formats all influence whether visitors find relevant products.
Promotional Offers and Pricing
Percentage discounts versus pound-off promotions perform differently depending on price points. Lower-priced items often convert better with percentage discounts (“20% off”) while higher-priced items respond to absolute amounts (“Save £50”).
Free shipping thresholds represent a testing opportunity for average order value. Setting thresholds too low leaves money on the table, while setting them too high discourages purchases entirely. Countdown timers and stock indicators influence purchase timing, and our guide on urgency and scarcity covers the psychology behind these tactics.
A/B Testing Tools For Ecommerce Stores
The market ranges from free basic functionality to enterprise platforms costing tens of thousands annually. Choosing the right tool depends on your traffic volume, technical resources, and testing ambitions.
| Tool | Best For | Starting Price | Key Strength |
|---|---|---|---|
| VWO | Beginners | Free tier available | Visual editor + heatmaps |
| Convert | Growing stores | ~£250/month | Customer support |
| Crazy Egg | Visual learners | £80-400/month | Scroll maps + heatmaps |
| Optimizely | Enterprise | ~£30,000/year | Statistical modelling |
| AB Tasty | Personalisation | Custom pricing | AI recommendations |
| Kameleoon | Mobile apps | Custom pricing | Cross-platform testing |
For stores just beginning their testing journey, VWO’s free tier handles basic experiments well. The visual editor lets non-technical users create test variations by clicking and editing page elements directly, and the platform combines testing with heatmaps and session recordings that provide context for why tests succeed or fail.
Enterprise operations with complex requirements typically gravitate toward Optimizely. The statistical modelling capabilities exceed simpler tools, providing confidence in results from smaller sample sizes. Annual contracts make it impractical for smaller stores but valuable for high-traffic operations where testing improvements translate to significant revenue.
Running Effective Tests
Testing random changes produces random results. Effective programmes begin with understanding where customers struggle. Analytics data reveals high-exit pages and abandoned funnels, session recordings show specific friction points, and customer surveys explain why visitors behave as they do. This research generates hypotheses worth testing rather than guesses about button colours.
Instead of assuming a green button might outperform blue, you identify that customers struggle to find the add-to-cart button on mobile, then test making it larger and more prominent. Hypothesis-driven testing produces more winners than random experimentation because you’re solving real problems rather than changing things for the sake of change.
Sample Size and Duration
Small sample sizes produce unreliable results. A test showing 10 conversions from 100 visitors versus 8 from another 100 hasn’t proven anything meaningful. Statistical calculators help determine required sample sizes before launching tests.
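As a rough illustration of what those calculators do, the sketch below uses the standard two-proportion sample size formula, assuming 95% confidence and 80% power; the baseline and uplift figures are hypothetical:

```python
import math

def visitors_per_variant(baseline, relative_uplift, z_alpha=1.96, z_power=0.84):
    """Approximate visitors needed per variant for a two-proportion test.

    baseline:         current conversion rate (0.02 means 2%)
    relative_uplift:  smallest relative lift worth detecting (0.25 means +25%)
    z_alpha, z_power: defaults assume 95% confidence and 80% power
    """
    p1 = baseline
    p2 = baseline * (1 + relative_uplift)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

# Detecting a 2.0% -> 2.5% lift needs roughly 13,800 visitors in each variant
print(visitors_per_variant(0.02, 0.25))
```

Because the required sample grows roughly with the inverse square of the lift you want to detect, chasing small improvements on low-traffic pages can take months to resolve.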
The two-week minimum: Weekly and seasonal patterns affect user behaviour, meaning tests running only during weekdays or only during promotional periods may not reflect normal performance. Running tests for at least two full business cycles captures these variations. Black Friday traffic behaves differently from January traffic, and weekend shoppers often have different intent than weekday browsers.
One Variable at a Time
Changing multiple elements simultaneously makes it impossible to know which change caused any observed difference. If you modify button colour, position, and copy in a single test, a positive result doesn’t tell you which change mattered. You’ve learned that one of your changes improved conversions, but not which one. Most ecommerce stores benefit more from sequential A/B tests than complex multivariate experiments.
Mobile Deserves Its Own Tests
Desktop test results rarely translate directly to mobile experiences. User behaviour differs significantly between devices, and 89% of successful testing programmes create mobile-specific variations rather than assuming desktop wins apply universally. Screen size constraints, touch interactions, and mobile user mindset all affect which designs convert best. A checkout flow that performs brilliantly on desktop might frustrate mobile users with small tap targets and excessive scrolling.
Document Everything
Recording what you tested, why you tested it, what results you achieved, and what you learned creates institutional knowledge that compounds over time. Documentation prevents repeated testing of ideas that already failed and helps new team members understand why current designs exist. The most valuable outcome of a testing programme isn’t any single winning variation. It’s the accumulated understanding of what your specific customers respond to.
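The exact format matters less than capturing the same fields every time. A minimal sketch of one test log entry, with purely illustrative field names and values:

```python
# Illustrative test log entry; every field name and value here is hypothetical
test_log_entry = {
    "name": "mobile-add-to-cart-prominence",
    "hypothesis": "Mobile visitors miss the add-to-cart button; a full-width "
                  "button above the fold will lift mobile conversions",
    "pages": ["/product/*"],
    "dates": ("2025-03-03", "2025-03-17"),  # two full business cycles
    "variants": {"A": "current button", "B": "full-width sticky button"},
    "result": {"A": 0.021, "B": 0.0245, "p_value": 0.03},
    "decision": "ship B",
    "learning": "Button prominence matters more than colour for mobile visitors",
}
```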
Getting Started
Stores new to systematic testing should begin with high-impact, low-complexity changes. CTA button tests require minimal development effort while addressing elements that directly influence revenue. Starting with your highest-traffic pages maximises the chance of achieving statistical significance quickly, and a single winning test often generates enough confidence and revenue to justify expanding the programme.
For stores lacking time or expertise to run testing programmes internally, our ecommerce CRO service handles strategy, implementation, and analysis. The stores achieving the best conversion rates aren’t necessarily those with the most advanced designs. They’re the ones that systematically test, learn, and improve. Ecommerce A/B testing provides the framework for that continuous improvement, turning assumptions into evidence and guesses into gains.



