Test Progress

Your test report summarizes how your test data is progressing through status banners displayed at the top of the report page.

Understanding the status of your test

Test reports display three key pieces of information that help you judge the validity of your test results before significance is reached:

Progress

Progress is a semantic indicator that measures the reliability with which test results can be certified as real. Any test is only a small sample of data in the overall lifetime of your store, so progress is used to tell you if your test results are a representative sample or if they are an anomaly.

Probability to win

Probability to win is the estimated chance that one test experience outperforms another. The percentage relates to the certainty of your test overall. As data is collected and your test approaches statistical significance, the probability to win for each variant becomes more credible.

Estimated time to significance

Estimated time to significance reflects the approximate remaining time necessary to reach a statistically significant result. This estimate varies based on the volume and velocity of data collected, and is updated dynamically as your test progresses.

Active test statuses

Gathering data

  • If your test is in the "Gathering data" stage, this means that the sample size of your test is not yet large enough to provide an indicator of credibility in your test results, and more data is needed.

  • This is the default status upon launching a test, and you should keep your test running until enough data has been gathered to provide subsequent status updates.

Trends in test data can only be validated after the time requirement (at least 3 days) and sample size requirement (30 orders per test variant) are both met. For more information about these requirements, see Statistical Significance.
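
The two requirements above can be sketched as a simple check. This is an illustrative sketch only, not Shoplift's implementation: the function name, the `orders_by_variant` mapping, and the assumption that the order threshold applies to every experience in the test (including the original) are hypothetical.

```python
from datetime import timedelta

MIN_DURATION = timedelta(days=3)   # time requirement stated above
MIN_ORDERS_PER_VARIANT = 30        # sample size requirement stated above


def is_gathering_data(elapsed: timedelta, orders_by_variant: dict[str, int]) -> bool:
    """Return True while either requirement is still unmet.

    Hypothetical helper: assumes every experience in the test must
    reach the order threshold before trends can be validated.
    """
    enough_time = elapsed >= MIN_DURATION
    enough_orders = all(
        count >= MIN_ORDERS_PER_VARIANT for count in orders_by_variant.values()
    )
    return not (enough_time and enough_orders)
```

For example, a test that has run for five days but has only 29 orders on one variant would still be gathering data, because both requirements must be met.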

Trending (positive or negative)

  • As data is collected and confidence in your results improves, a "Trending" status will be displayed, which indicates that the changes you are testing are trending either positively or negatively in relation to your original.

  • While a "Trending" test provides an indication of the initial degree of success of your test, it is just an indication, and more data is needed to certify that your results are valid.

Nearing significance

  • When your test has gathered enough data to indicate that significance will be achieved shortly if the current trend continues, a "Nearing significance" status will be displayed. This indicates that your test results are very likely to be valid.

  • Shoplift does not advise making decisions on tests that have reached this status, because there may still be continued variability in test results.

Significant

  • When your test has gathered enough data to provide a confidence level of 95% or greater, a "Significant" status will be displayed, and your test results are certified as credible. At this stage, your test results have also achieved 80% statistical power.

  • When a test enters this stage, you can confidently proceed with making decisions based on your test results.

Long timeline expected

  • If your test has a total duration of 14 days or more, and the estimated time to significance remains another 14-60 days out, your test will enter a "Long timeline expected" status.

  • This indicates that while the changes you are testing may reach significance, it will require more data than you may be willing to wait for, depending on what you are testing.

Significance unlikely

  • If your test has a total duration of 14 days or more, and the estimated time to significance remains greater than 60 days out, your test will enter a "Significance unlikely" status. This indicates that either the changes you are testing are not large enough to result in a meaningful change in performance, or that your sample size remains too low to provide an indication of credibility in your results.

  • If a test reaches the "Significance unlikely" stage, Shoplift recommends ending the test and trying a new test idea. If you would like to continue to run the test and collect data, you can, and as soon as the sample size is large enough to provide one of the above statuses, your test will exit this status.
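
The thresholds for the "Long timeline expected" and "Significance unlikely" statuses can be sketched as follows. This is a minimal illustration under stated assumptions, not Shoplift's actual logic: the function name and parameters are hypothetical, and treating the 14-60 day range as inclusive at both ends is an assumption, since the text does not specify the boundaries.

```python
from typing import Optional


def timeline_status(days_running: int, est_days_to_significance: float) -> Optional[str]:
    """Return a long-timeline status, or None if neither applies.

    Hypothetical helper. Assumes the 14-60 day range is inclusive
    and that anything above 60 days counts as "Significance unlikely".
    """
    if days_running < 14:
        # Both statuses require a total duration of 14 days or more.
        return None
    if est_days_to_significance > 60:
        return "Significance unlikely"
    if est_days_to_significance >= 14:
        return "Long timeline expected"
    return None
```

A return value of `None` here simply means one of the other active statuses would apply instead.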

Ended test statuses

When you end a test, your test report will provide a summary status that explains the credibility of your test results, depending on the data collected and the degree of certainty in your results when the test was ended.

Significant

  • If your test achieved significance, a "Significant" status will be displayed that certifies your original or variant experience as the winner at a confidence level of 95%.

  • If your variant is the winning experience, you can confidently proceed with implementing that experience as the default experience going forward. For more information on implementing a test variant, see Implementing a winning test.

Inconclusive (not enough data was gathered)

  • If your test was ended before reaching the "Significant" state, it will be marked as inconclusive because not enough data was gathered to determine a clear winner.

Inconclusive (determining a winner was unlikely)

  • If your test was ended while it was in the "Significance unlikely" state, it will be marked as inconclusive, but the banner will indicate that significance for this test idea was unlikely to be achieved. This distinction from the more generic Inconclusive state helps you tell which of your historical test ideas might have reached significance with more data, and which were unlikely to achieve significance at all.
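
The relationship between a test's status at the moment it is ended and its final summary status can be sketched as a simple mapping. This is an illustrative sketch only: the function name is hypothetical, and the status strings are taken from the headings in this article rather than from any Shoplift API.

```python
def ended_status(status_when_ended: str) -> str:
    """Map the active status at the time a test is ended to its summary status.

    Hypothetical helper; status labels mirror the headings in this article.
    """
    if status_when_ended == "Significant":
        return "Significant"
    if status_when_ended == "Significance unlikely":
        return "Inconclusive (determining a winner was unlikely)"
    # Any other active status means the test ended before a winner was clear.
    return "Inconclusive (not enough data was gathered)"
```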

If you find that your tests often end with no winner, the changes you are testing may not be substantial enough to affect performance in a meaningful way. If this is the case, we suggest testing more dramatic changes.
