Choosing an A/B testing method



How should you decide on one of the A/B testing methods described above? Which one is the best fit for your research project?

First, decide how much time users will need to spend on each prototype. Will the video portion of the test give them enough time to get all the way through both prototypes? If not, Method #3 (“Separate tests for each version”) will be the best fit.

This option lets users dedicate the entire duration of their test to a single set of designs, so you can go into greater detail and collect in-depth feedback on every aspect of those designs.

Another benefit of Method #3 is that you can collect psychometric scores (like the SUS or PSSUQ) for both designs. This gives you an additional vector of comparison besides the task-level data for usability, completion, and duration. If you include both design versions in a single test setup (as in Methods #1 and #2), this isn’t possible, because a psychometric questionnaire can only be included once per test.
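To illustrate what comparing psychometric scores across two separate tests might look like, here is a minimal Python sketch that applies the standard SUS scoring rule (odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5). The response data and variable names are hypothetical and are not taken from any TryMyUI export.

```python
from statistics import mean

def sus_score(responses):
    """Score one participant's ten SUS answers (each 1-5) on the 0-100 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten answers, each from 1 to 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: answer minus 1.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: 5 minus answer.
            total += 5 - answer
    return total * 2.5

# Hypothetical answers from two testers per version (one test per version).
version_a = [[4, 2, 4, 2, 4, 2, 4, 2, 4, 2], [5, 1, 5, 1, 5, 1, 5, 1, 5, 1]]
version_b = [[3, 3, 3, 3, 3, 3, 3, 3, 3, 3], [4, 2, 3, 3, 4, 2, 3, 3, 4, 2]]

mean_a = mean(sus_score(r) for r in version_a)
mean_b = mean(sus_score(r) for r in version_b)
print(f"Version A mean SUS: {mean_a}, Version B mean SUS: {mean_b}")
```

With real studies you would want more than two testers per version before reading much into the difference between the two means.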

The downside of using Method #3 is that you won’t get to hear direct contrasts of your designs from users who have seen both. Methods #1 and #2 enable you to collect this kind of feedback, and it can be very insightful. When users have been exposed to multiple design variants, it’s easier for them to notice what they like and don’t like about a specific version, as they can use the two designs as points of cross-reference.
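The considerations so far (session time, psychometric scores for each design, and direct contrasts) can be summarized as a rough decision helper. This is only one possible reading of the trade-offs, written as a Python sketch. The function name, parameters, and returned wording are all hypothetical.

```python
def suggest_method(fits_both_in_one_session, want_scores_per_design, want_direct_contrast):
    """Return a rough suggestion based on the three criteria discussed above."""
    if not fits_both_in_one_session:
        # Users cannot get through both prototypes in one test.
        return "Method #3"
    if want_scores_per_design and want_direct_contrast:
        # The two goals pull in opposite directions, so no single answer.
        return "Trade-off: Method #3 for per-design scores, Method #1 or #2 for direct contrasts"
    if want_scores_per_design:
        # A psychometric questionnaire can only be included once per test.
        return "Method #3"
    if want_direct_contrast:
        # Only users who have seen both versions can contrast them.
        return "Method #1 or #2"
    return "No strong preference from these criteria"

print(suggest_method(False, False, True))
print(suggest_method(True, False, True))
```

The helper deliberately reports a trade-off instead of picking a winner when both per-design scores and direct contrasts matter, since that choice depends on your research goals.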

When using Methods #1 and #2, you can still collect task usability, completion, and duration data. If you’ve repeated the same series of tasks for each version, you can directly compare the stats for analogous screens. For example, if the checkout step was Task 4 for Version A and Task 9 for Version B, you can compare the user testing metrics for Tasks 4 and 9 to see which checkout version performed better.
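The Task 4 versus Task 9 comparison could be sketched in Python roughly as follows. The metric names, the numbers, and the `analogous_tasks` mapping are invented for illustration and do not reflect an actual TryMyUI data format.

```python
# Hypothetical per-task metrics, keyed by task number within a single test.
task_metrics = {
    4: {"completion_rate": 0.80, "avg_duration_s": 95.0},  # checkout, Version A
    9: {"completion_rate": 0.90, "avg_duration_s": 70.0},  # checkout, Version B
}

# Maps each step to its (Version A task, Version B task) pair.
analogous_tasks = {"checkout": (4, 9)}

def compare_step(step, metrics, pairs):
    """Return Version B minus Version A for every metric of the given step."""
    a_task, b_task = pairs[step]
    a, b = metrics[a_task], metrics[b_task]
    # Round to avoid floating-point noise in the differences.
    return {name: round(b[name] - a[name], 4) for name in a}

diff = compare_step("checkout", task_metrics, analogous_tasks)
print(diff)
```

In this made-up data, a positive `completion_rate` difference and a negative `avg_duration_s` difference would both favor Version B for the checkout step.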

As for choosing between Method #1 (“Same test, two links”) and Method #2 (“Same test, one link”), that choice mainly depends on the order in which you’d like to present your design variants, how many variants you have, and how similar the different versions of the flow are to each other.

Read more: Comparative usability testing