How do you run a split test?

I want to measure the business impact of some code changes in production.

For example, I want to see what difference optimizing my LCP makes to my site’s conversion rate.

Ideally, I want to run a split test (or A/B test) with my changes on one side and the previous version of the code on the other.

  • How should I run this kind of test?
  • Do I need special tools?
  • How do I know my changes actually made the difference in conversions? (And that it wasn’t some other factor?)

For A/B testing, you need to have a control group and an experiment group and a mechanism to split traffic between the two groups. On a cold visit, where a user is not part of either group, you’ll have a basic function that is responsible for assigning a user to one group or the other and setting a cookie. This code can run on the client, on the server, or on the edge (my preferred method).
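That cold-visit assignment step can be sketched roughly like this. This is a minimal, simplified model of an edge handler (not the real Cloudflare Worker `Request`/`Response` types), and the cookie name `ab_group`, the 50/50 split, and the 30-day cookie lifetime are all illustrative choices:

```typescript
import { randomUUID } from "node:crypto";

const COOKIE = "ab_group"; // illustrative cookie name

// Deterministic 50/50 bucketing: hash an ID into one of two groups.
function assignVariant(id: string): "control" | "experiment" {
  let hash = 0;
  for (const ch of id) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return hash % 2 === 0 ? "control" : "experiment";
}

// On a cold visit (no cookie), pick a group and return a Set-Cookie
// value so every later request sees the same variant.
function handle(request: { headers: Map<string, string> }): {
  variant: string;
  setCookie?: string;
} {
  const cookie = request.headers.get("cookie") ?? "";
  const match = cookie.match(/ab_group=(control|experiment)/);
  if (match) return { variant: match[1] }; // returning visitor: sticky
  const variant = assignVariant(randomUUID());
  return {
    variant,
    setCookie: `${COOKIE}=${variant}; Path=/; Max-Age=2592000`,
  };
}
```

The same shape works on the client or the server; only where the cookie gets read and written changes.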

Once a user is in a group, you keep serving them the same variant they received the first time around on every subsequent page view. If you use page caching and run the A/B test on the server or the edge, you need to account for the cookie that defines the user’s group when the page is served (for example, by including it in the cache key) so the user always receives the same variant as before. If your A/B test runs on the client, the variant logic is handled after the HTML is received from the server.
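One common way to handle the caching concern is to fold the variant into the cache key, so a cached "experiment" page is never served to a "control" user. A minimal sketch, assuming a cookie named `ab_group` (an illustrative name) and a cache modeled as a plain key/value store:

```typescript
// Build a cache key that includes the user's variant so each group
// gets its own cached copy of the page.
function cacheKey(url: string, cookieHeader: string): string {
  const match = cookieHeader.match(/ab_group=(control|experiment)/);
  const variant = match ? match[1] : "unassigned";
  return `${variant}:${url}`;
}
```

With a real edge cache you would typically achieve the same effect by varying the cache on the cookie or rewriting the cache key before lookup.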

You’ll also need a way to record which group someone is in so you can identify the winner of the test (if there is one) and attribute conversions correctly. There are purpose-built vendors for running tests (VWO, Optimizely, et al.), as well as feature flag platforms that offer this (e.g. Split, LaunchDarkly), but you don’t necessarily need one if you are an occasional tester. You can make do with an analytics tool like Google Analytics, which lets you define custom variables to handle the attribution, plus some application logic to assign users to a group. I have done the GA method in conjunction with a Cloudflare Worker, and it worked out really well.
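The attribution piece can be as simple as reading the group cookie and attaching it to your conversion events. A sketch assuming a cookie named `ab_group` and GA4-style event parameters (the event name and the parameter name are illustrative; the parameter would need to be registered as a custom dimension in GA):

```typescript
// Build a conversion event tagged with the user's test group so the
// analytics tool can split conversions by variant.
function conversionEvent(cookieHeader: string): {
  name: string;
  params: { ab_group: string };
} {
  const match = cookieHeader.match(/ab_group=([^;]+)/);
  return {
    name: "purchase", // illustrative conversion event
    params: { ab_group: match ? match[1] : "unassigned" },
  };
}

// In the browser, you'd then pass this to your analytics snippet,
// e.g. gtag("event", ev.name, ev.params);
```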

To figure out how long to run your test, I use a calculator like this one to work out how big my sample size needs to be: Sample Size Calculator (Evan’s Awesome A/B Tools). Also, you should limit what you are testing (as well as other code and design changes) on each page that is part of the test, so you can be confident the impact didn’t come from some outside factor.
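If you’re curious what a calculator like that is doing under the hood, here is a rough version of the standard two-proportion sample-size formula (normal approximation, 5% significance, 80% power). This is a sketch of the general math, not a reimplementation of that specific tool:

```typescript
// Approximate sample size needed per variant to detect a change from
// `baseline` conversion rate to `baseline + minEffect` (both as
// fractions, e.g. 0.02 for 2%).
function sampleSizePerVariant(baseline: number, minEffect: number): number {
  const zAlpha = 1.96;  // two-sided significance level of 0.05
  const zBeta = 0.8416; // statistical power of 0.80
  const p1 = baseline;
  const p2 = baseline + minEffect;
  const pBar = (p1 + p2) / 2;
  const numerator =
    zAlpha * Math.sqrt(2 * pBar * (1 - pBar)) +
    zBeta * Math.sqrt(p1 * (1 - p1) + p2 * (1 - p2));
  return Math.ceil(numerator ** 2 / minEffect ** 2);
}

// e.g. detecting a lift from a 2% to a 2.5% conversion rate:
// sampleSizePerVariant(0.02, 0.005) → on the order of 14k users per group
```

The takeaway: small absolute lifts on low baseline conversion rates need a lot of traffic, which is why low-traffic sites often can’t reach significance quickly.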
