Guide: Identify Products Worth Price Testing
Prices get set at launch — based on cost, competition, or instinct — and then they sit. Months pass. Years pass. The market shifts, costs change, competitors reprice, and the original logic behind that $49.99 becomes increasingly hard to remember. Meanwhile, every sale is running at a margin that may have been wrong from day one.
This guide walks you through a three-step framework for figuring out which products to test, how to rank them by opportunity and risk, and how to sequence tests so each result builds on the last.
The Framework
This guide is organized around three steps, each one narrowing the field and adding a different layer of context.
ABC Classification: Which products generate enough revenue to make a test worthwhile?
Margin Headroom Analysis: Which products can absorb a conversion drop and still come out ahead on total profit?
Product Role Assessment: Which direction should you test, and which products need extra care?
Each step feeds into the next. By the end, you have a ranked shortlist with a clear test direction for each product and a documented rationale for each decision.
The steps draw from three established bodies of pricing research: ABC/Pareto analysis (a standard in retail inventory management), Nagle and Müller’s break-even framework from The Strategy and Tactics of Pricing (the academic standard in pricing strategy), and McKinsey’s Key Value Item methodology used in retail pricing. None of it requires a data team; it all runs in a spreadsheet.
ABC Classification: Which Products Matter?
Before thinking about pricing at all, you need to know which products are worth the effort. The question is simple: which products generate the most revenue? In many catalogs, a small number of SKUs generate a large majority of revenue. For example, your distribution might follow the 80/20 rule: roughly 20% of products drive roughly 80% of revenue. Alternatively, if you have a broader base of SKUs, your revenue could be more evenly distributed.
For price testing, this matters because a successful test on an A-tier product delivers far more impact than the same test on a C-tier product, for exactly the same time investment. Step 1 identifies your A-tier products so you’re not spending three weeks learning something that barely affects the business.
In Shopify, pull your Total sales by product report for the last 90 days: Analytics → Reports → Total sales by product. Export to a spreadsheet and sort by total revenue, descending. Then add two columns (or use our template).
How to Run the Analysis
Cumulative revenue: A running total of revenue from the top of the list down. Row 1 equals that product's revenue. Row 2 equals Row 1 plus Row 2's revenue. And so on down the list.
Cumulative revenue %: Each row's cumulative revenue divided by your total revenue across all products.
Once those columns are filled in, your tier cutoffs become visible in the Cumulative revenue % column:
Everything up to roughly 80% is your A-tier.
The next band up to around 95% is your B-tier.
Everything below that is C-tier.
In practice, you're looking for the row where the cumulative percentage crosses each threshold.
The products above that line are the ones you carry forward.
In a catalog of 45 products, this typically produces 8–10 A-tier SKUs.
In a catalog of 200 products, closer to 30–40.
Either way, these are the only products you carry forward. C-tier products aren’t worth testing, because even a perfect outcome barely moves the business.
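If you want to check your spreadsheet logic, the two helper columns and the tier cutoffs can be sketched in a few lines. This is a minimal illustration with made-up products and revenue figures, using one reasonable cutoff convention (a row whose cumulative share passes a threshold falls into the next tier down):

```python
# Hypothetical (product, revenue) rows, as exported from the sales report.
rows = [
    ("Hero Serum", 52000), ("Daily Cleanser", 31000), ("Eye Cream", 18000),
    ("Body Oil", 9000), ("Face Towel", 4000), ("Travel Kit", 1500),
]

rows.sort(key=lambda r: r[1], reverse=True)   # revenue, descending
total = sum(rev for _, rev in rows)

tiers, running = {}, 0.0
for product, revenue in rows:
    running += revenue              # cumulative revenue
    pct = running / total           # cumulative revenue %
    if pct <= 0.80:
        tiers[product] = "A"
    elif pct <= 0.95:
        tiers[product] = "B"
    else:
        tiers[product] = "C"
```

The same two formulas work as spreadsheet columns; the code just makes the cutoff rule explicit.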

A Note on Seasonality
The 90-day window works well when it covers a typical trading period. If it includes a major peak, like a holiday, back-to-school, or summer peak, products that spike seasonally will look more important than they actually are for most of the year. If your window includes a peak, pull a second non-peak 90-day comparison. If products shift tiers significantly between the two windows, use the non-peak data. You want to test against sustainable volume, not a spike that’s already passed.
Don’t Ignore Traffic
While you’re in the data, pull one more report: Sessions by landing page (Analytics → Reports → Sessions by Landing Page, filtered to your product pages).
Look for products with high traffic but lower-than-expected conversion rates. A product with 2,000 monthly sessions converting at 1.2% against a store average of 3% has a problem worth investigating, and price is one of the most common causes. These products may not qualify as A-tier by revenue. They’re still candidates for a price decrease test. Flag them separately — they’ll follow a different hypothesis and a different test strategy than your increase candidates.

Output from Step 1
Your list of A-tier products — typically 5–15 SKUs — plus any flagged high-traffic, low-conversion products. These are the candidates you carry into Step 2.
Margin Headroom: Which Products Are Safest to Test?
Knowing which products matter tells you where to focus. Knowing which ones have margin room tells you how confidently you can test them. Products with higher gross margins can absorb a volume dip on a price increase without losing ground on total profit. This step ranks your A-tier candidates by that cushion.
The Logic: Nagle’s Break-Even Sales Change
This calculation comes from Thomas Nagle and Georg Müller’s The Strategy and Tactics of Pricing, the standard academic reference in pricing strategy. The formula finds the maximum volume decline that can occur before a price increase stops being profitable:
Break-Even Volume Change = −(Price Increase %) ÷ (Gross Margin % + Price Increase %)
where Gross Margin % = (Price − COGS) ÷ Price, calculated using your current price before any change.
A product with a 60% gross margin and a 10% price increase breaks even on profit even if sales volume drops by up to 14.3%. That’s the safe zone — the range within which the increase is still profitable even if some customers walk away.
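The formula is a one-liner in any spreadsheet or script. A minimal sketch, reproducing the worked example from the text (60% margin, 10% increase):

```python
def break_even_volume_change(gross_margin: float, price_change: float) -> float:
    """Maximum volume decline (as a negative fraction) before a price
    increase stops being profitable. Inputs are fractions of the current
    price, e.g. 0.60 margin and a 0.10 increase."""
    return -price_change / (gross_margin + price_change)

# The example from the text: 60% margin, 10% increase -> about -14.3%.
threshold = break_even_volume_change(0.60, 0.10)
```

Anything less negative than the threshold (a smaller volume drop) leaves you ahead on total profit.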
The Counterintuitive Result for Low-Margin Products
Running the same formula on a product with a 20% gross margin and a 10% price increase gives a break-even threshold of 33.3%. That’s a larger allowable drop than the high-margin example, which seems backwards.
The reason comes down to per-unit economics. Raising the price of a low-margin product by 10% — moving contribution from $20 to $30 per unit — is a 50% improvement in profit per sale. Each remaining customer is worth dramatically more, so you can afford to lose more of them before the math turns negative.
The same $10 increase on a high-margin product moving from $60 to $70 per unit is only a 17% improvement. The per-sale gain is smaller, so fewer lost customers tip the balance. A price increase is proportionally more powerful on a low-margin product because it changes the per-unit economics more dramatically.
That said, the percentage threshold isn’t the whole picture. The absolute dollar buffer tells a different story. On a 100-unit baseline, the low-margin product can lose about 33 units × $20 contribution, roughly $667 of room. The high-margin product can lose about 14 units × $60, roughly $857. High-margin products have a smaller percentage threshold but a larger dollar cushion.
One more thing the formula doesn’t tell you: how much volume will actually drop. Low-margin products tend to live in more competitive markets with more price-sensitive buyers. The actual volume loss from a price increase may exceed the break-even threshold even when the formula shows more theoretical room. Treat it as a planning tool, not a prediction.
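The per-unit arithmetic above is easy to verify. This sketch assumes a hypothetical $100 product and a 100-unit baseline; small differences from the rounded figures in the text come from carrying the exact break-even fractions:

```python
price, units = 100.0, 100          # hypothetical: $100 product, 100-unit baseline

def per_unit_gain(margin):
    """Profit-per-sale improvement from a 10% price increase."""
    before = price * margin                  # contribution per unit today
    after = before + price * 0.10            # the whole increase is extra profit
    return after / before - 1

low_gain  = per_unit_gain(0.20)    # $20 -> $30 per unit: a 50% jump
high_gain = per_unit_gain(0.60)    # $60 -> $70 per unit: about 17%

# Dollar cushion at each product's break-even volume drop:
low_cushion  = units * (0.10 / 0.30) * (price * 0.20)   # ~33.3 units x $20, ~$667
high_cushion = units * (0.10 / 0.70) * (price * 0.60)   # ~14.3 units x $60, ~$857
```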
How to Calculate It
For each A-tier product, you need:
Current selling price
COGS
Gross Margin % = (Price − COGS) ÷ Price, using the current price before any change
Run the break-even formula at a hypothetical 10% price increase and rank your A-tier products by the size of the threshold. The larger the allowable volume drop, the more headroom the test has.
| Gross Margin | Break-Even Threshold (10% increase) | Test Confidence |
| --- | --- | --- |
| 70%+ | ~12.5% | High |
| 50–69% | ~13–17% | Medium |
| 30–49% | ~17–25% | Lower (needs a strong hypothesis) |
| Below 30% | 25%+ | Proceed carefully and factor in likely price sensitivity |
For reference across a wider range of scenarios:
| Gross Margin | +5% Increase | +10% Increase | +15% Increase | +20% Increase |
| --- | --- | --- | --- | --- |
| 20% | 20.0% | 33.3% | 42.9% | 50.0% |
| 30% | 14.3% | 25.0% | 33.3% | 40.0% |
| 40% | 11.1% | 20.0% | 27.3% | 33.3% |
| 50% | 9.1% | 16.7% | 23.1% | 28.6% |
| 60% | 7.7% | 14.3% | 20.0% | 25.0% |
| 80% | 5.9% | 11.1% | 15.8% | 20.0% |
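Every cell in the reference table falls out of the same break-even formula, so you can regenerate it, or extend it to margins and increments not shown, in a few lines:

```python
margins   = [0.20, 0.30, 0.40, 0.50, 0.60, 0.80]
increases = [0.05, 0.10, 0.15, 0.20]

# Break-even volume drop (%) for each margin / price-increase combination.
table = {m: {p: round(p / (m + p) * 100, 1) for p in increases} for m in margins}
```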
What This Step Tells You
High-margin products with large safe zones are your strongest increase test candidates. They have the most room to explore, the most to gain from finding upside, and the most buffer if conversion dips temporarily during the test.
Lower-margin products are still worth testing — they may just need a different kind of test. If they showed up in Step 1 as high-traffic/low-conversion, a decrease test makes more sense than an increase. If you’re planning a COGS-driven price increase and need to validate it before committing store-wide, a de-risking test is the right frame.
Output from Step 2
Your A-tier product list ranked by break-even headroom. The products at the top are your highest-confidence increase test candidates.
Product Role Assessment: What Direction, and In What Order?
Revenue and margin data tell you which products to prioritize and how safely you can test them. They don’t tell you everything. A product’s role in your catalog — what it means to the customer, and what it does for your brand — affects both the direction of the test and how carefully you need to run it.
The Logic: Key Value Items and Product Roles
This step draws from McKinsey’s Key Value Item (KVI) methodology and retail category management practice.
The underlying idea is that customers don’t treat all products the same way:
Some products are what they notice, compare, and use to decide whether your brand is fairly priced.
Others are purchased without much thought about price at all.
Mis-pricing a KVI doesn’t just affect that product’s conversion rate. It shapes how customers feel about your whole store. A brand that pushes its hero product past the point customers find reasonable won’t just see that SKU underperform — shoppers will second-guess the rest of the catalog too.
Margin builders don’t carry that risk. Customers are buying without comparing heavily, and price isn’t the central factor in their decision. These are the products where testing is most forgiving and most rewarding.
The Four Product Roles
Traffic Drivers / KVIs: products customers actively search for, compare across stores, and use to form their impression of your brand. These are typically your hero SKUs, your most-advertised products, or your entry-level gateway items. They set your price image. Getting them wrong costs you more than one conversion.
Margin Builders: solid-volume products purchased without heavy comparison shopping. Customers chose your brand and are here to buy — price sensitivity is lower and testing headroom is larger. These are your best first candidates for increase tests.
Basket Builders: products frequently bought alongside others: accessories, complementary items, add-ons. Their job is to complete a cart. Price them too high and they create friction at the worst possible moment.
Destination / Specialty Items: lower-volume, highly differentiated products customers sought out specifically. Because shoppers came looking for these rather than stumbling across them, price sensitivity is often lower. They can be strong increase candidates when margin data supports it.
How to Classify Your Products
For each A-tier product, work through these questions:
Is this product featured in paid ads, on your homepage, or in top navigation? If yes, it's likely a KVI or Traffic Driver.
Is it an entry-level or introductory product that acquires new customers? If yes, it's likely a KVI/Gateway.
Does it appear frequently in multi-product orders? If yes, it's likely a Basket Builder.
Do customers search for it by name, or arrive at your store specifically for it? If yes, it's likely a Destination/Specialty item.
None of the above? It's a Margin Builder.
Where to find this data:
Online Store: Look at your global navigation and your homepage. What’s featured?
Shopify Analytics: Reports → Sessions by Landing Page (which product pages pull the most direct and organic traffic?)
Your ad platforms: Which product pages are in active campaigns?
What This Changes About Your Shortlist
The most important output of Step 3 is sequencing. Your highest-margin products might also be your KVIs. The break-even formula says they have the most headroom. Product role analysis says to test them second.
A high-margin KVI is a handle-with-care candidate. Test your Margin Builders first. Learn the workflow, get comfortable reading results, and build some data before you touch anything high-stakes. Then approach your KVIs with smaller increments and closer monitoring.
One caveat: not every brand has KVIs in the traditional sense. If you're a premium or lifestyle brand, you're probably not competing on price at all. Your customers aren't comparison-shopping you against alternatives the way they would with a commodity product. Your edge is brand, design, and innovation, not being the known-value option. In that case, most of your catalog probably falls into Destination, Specialty, or Margin Builder territory. That's actually good news for testing, since those roles tend to have more headroom and lower risk.
Output from Step 3
Each A-tier product labeled with a role, a test direction, and a wave assignment. Your shortlist is complete.
Pulling It Together: Your Price Testing Playlist
Wave 1: Margin Builders in your A-tier. High volume, strong margin headroom, and no significant role-based risk. This is where you learn the workflow, build confidence, and start generating margin improvements.
Wave 2: KVIs and Traffic Drivers. Bring Wave 1 learnings with you. Use a smaller initial increment, benchmark competitors first, and watch results more closely. The upside is real, but so is the downside if you miscalibrate.
Separate track: Decrease test candidates. Products flagged in Step 1 for high traffic and low conversion are a different kind of opportunity. These are conversion plays with their own hypothesis and timeline.
Matching Products to Test Strategies
| Situation | Recommended Test Strategy |
| --- | --- |
| A-tier Margin Builder, price unchanged 12+ months | Unlock hidden revenue: test a 5–15% increase |
| Any product facing COGS increases or tariff pressure | De-risk price increases: test the new price on a portion of traffic before committing store-wide |
| A KVI or Traffic Driver | Test conservatively: smaller increments, benchmark competitors first, monitor closely |
| High-traffic but below-average conversion | Test a decrease: price may be the primary friction |
If a product fits more than one row, default to the more conservative test direction. De-risking a necessary increase before trying to find the ceiling is always the right order of operations.
Three Sanity Checks Before You Finalize
Before committing your shortlist to a live test, run these three checks. Each takes under 10 minutes.
Check 1: Seasonality. Did your 90-day window include a major peak? If yes, pull a non-peak comparison. Products that drop a tier in the off-peak window should be treated as B-tier candidates — you don’t want a test running at half-speed because peak season ended the week after you launched.
Check 2: Variant complexity. The framework evaluates products as a whole. If any shortlisted product has significant conversion variance across variants — size, color, material — note it, but don’t let it stall you. Product-level analysis is the right starting point. Flag variants for a closer look after your initial tests.
Check 3: Competitor context. For any KVI or Traffic Driver on your list, spend 10 minutes checking where your price sits relative to your top 2–3 competitors. A product already priced above the market is a riskier increase candidate than one with room to move.
Write Your Hypotheses
Before launching any test, write down what you expect to happen and what you’ll do depending on the result. This takes 10 minutes and does two things: it forces you to articulate the reasoning behind the test, and it prevents you from rationalizing the outcome after the fact — especially on a null result.
Hypothesis Format
We believe [product name] can support a [X%] [increase/decrease] because [reason].
We expect conversion to [remain stable / improve / decline no more than X%].
If we see no statistically significant change, we will [hold the new price and test a further increment / hold the current price and investigate other friction points].
Example hypothesis: Price increase
We believe our Daily Cleanser can support a 12% price increase because it’s an A-tier Margin Builder with a 68% gross margin, the price hasn’t changed in two years, and our top competitors are priced higher.
We expect conversion to remain within 5% of baseline.
If we see no significant change, we’ll hold the increase and test a further 8% increment.
Example hypothesis: Price decrease
We believe our Starter Kit is priced above its role as a gateway product. Our new customer conversion rate on this product has declined over the past two quarters.
A 10% decrease may recover conversion and improve downstream lifetime value.
If we see no significant change, we’ll hold the current price and look elsewhere for the friction.
Pre-committing to what a null result means — in writing, before the test runs — keeps each test connected to the next one.
What Good Test Results Look Like
Statistically Significant Outcomes
When a test produces a statistically significant result, you’ve found a boundary. If conversion drops meaningfully after a 15% increase, you know where the ceiling is, or that you’ve crossed it. That’s useful information because you can price right up to the line with confidence.
Inconclusive Outcomes
When a test produces no statistically significant change, you have room to keep going. Customers absorbed the increase without noticing, or without caring. You can implement the new price and design the next test at a higher increment.
Both outcomes are wins, because both tell you something actionable. The only outcome that wastes your time is drawing a conclusion before the data is ready, or stopping after a single test when there’s more to find.
Consider a brand that runs three sequential increases — +8%, +6%, +5% — and sees no conversion impact at any step. By the end, they’ve found a meaningfully better price without ever hitting a cliff. Each null result moved them forward.
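Those three increases compound. A quick check, assuming a hypothetical $50 starting price rounded to the cent at each step:

```python
price = 50.00                        # hypothetical starting price
for step in (0.08, 0.06, 0.05):      # the three sequential increases
    price = round(price * (1 + step), 2)

total_lift = price / 50.00 - 1       # about +20% over the original price
```

Three modest steps, each individually invisible to customers, add up to roughly a 20% higher price.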
When a test shows no statistically significant change, take the increase and test higher. When it shows a clear result, you’ve found your boundary. Either way, you have something to work with.
Common Mistakes
Testing C-tier products based on gut feel. “This feels underpriced” might be right. But if the product sits in your C-tier, it won’t move the business even if you’re correct. Run the ABC classification first and let that narrow the field.
Testing your hero product first. Test your Margin Builders first. Learn the workflow, get a feel for what results look like, and establish your hypothesis discipline before you put your most important SKU on the line.
Using peak-season data to assess volume tier. A product that looks like a top-10 SKU in January based on November–January data may sit in the middle of the catalog for the other nine months. Check your window against your seasonal calendar before assigning tiers.
Looking only at sales, ignoring traffic. A product with 2,000 monthly sessions converting at 1.5% is more interesting than one with 400 sessions converting at 4%. The first has a problem worth diagnosing. Pull sessions by landing page data alongside your product sales data.
Reverting after a null result. No statistically significant change means take the price and test higher. Reverting is the most common and most costly mistake in price testing.
Testing too many products at once. Start with three to five products, or products in the same category or collection. More than that splits your focus and delays the learnings you need for Wave 2.
Ignoring product relationships. A price change on a frequently bundled product affects how the whole cart feels. Check cross-sell relationships before launching and watch multi-product order rates alongside conversion during the test.
Example: A Skincare Brand Works Through the Framework
A DTC skincare brand with 45 SKUs wants to identify their first price testing candidates.
Step 1: ABC Classification
They export 90 days of Shopify Sales by Product data and sort by revenue. Nine products account for roughly 80% of revenue. This is their A-tier.
Before moving on, they check the window. It includes December. They pull a May–July comparison and find that two products, both gift-set SKUs, drop significantly in the off-peak data. Those are removed from the A-tier. Their final A-tier is 7 products.
They also pull sessions by landing page and flag the Starter Kit for high traffic and below-average conversion. This is a potential decrease test candidate, regardless of margin.
Step 2: Margin Headroom Analysis
They cross-reference their A-tier with COGS records, flag two estimated rows, and run Nagle’s break-even formula at a hypothetical 10% increase.
| Product | Gross Margin | Break-Even Threshold (at +10%) | Rank |
| --- | --- | --- | --- |
| Daily Cleanser | 68% | ~12.8% volume drop | 1 |
| Vitamin C Serum | 65% | ~13.3% | 2 |
| Hero Serum | 71% | ~12.3% | 3 |
| Linen Face Towel | 72% | ~12.2% | 4 |
| Eye Cream | 55% | ~15.4% | 5 |
| Body Oil | 48% | ~17.2% | 6 |
| Starter Kit | 41% | ~19.6% | 7 |
All seven clear the break-even bar comfortably. The top four pair strong margins (65–72%) with High confidence ratings from the Step 2 table; their percentage thresholds are smaller, but their dollar cushions are larger. The Starter Kit sits last, and combined with its Step 1 flag, it clearly needs a different test altogether.
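Each threshold in the table follows directly from the Step 2 formula and the listed margin, so the column can be recomputed as a sanity check (product names and margins here come from this illustrative example):

```python
margins = {
    "Daily Cleanser": 0.68, "Vitamin C Serum": 0.65, "Hero Serum": 0.71,
    "Linen Face Towel": 0.72, "Eye Cream": 0.55, "Body Oil": 0.48,
    "Starter Kit": 0.41,
}

# Break-even volume drop (%) at a hypothetical 10% increase, per Step 2.
thresholds = {
    name: round(0.10 / (m + 0.10) * 100, 1) for name, m in margins.items()
}
```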
Step 3: Product Role Assessment
Hero Serum: This is in all paid ads, navigation, and is brand-defining. It’s a KVI/Traffic Driver.
Starter Kit: This is an entry-level product, and is the most common first purchase. It’s KVI/Gateway.
Daily Cleanser, Vitamin C Serum, Linen Face Towel, Eye Cream, Body Oil: These are solid-volume products with no significant role risk. They’re Margin Builders.
The Final Price Testing Playlist
Wave 1 is four Margin Builders with strong headroom and no role-based risk. The goal here is to learn the workflow before touching anything higher-stakes.
The Hero Serum moves to Wave 2 with a smaller increment and closer monitoring.
The Starter Kit gets its own test and its own hypothesis, because it’s a customer acquisition play, not a margin play.
| Product | Role | Test Strategy | Wave |
| --- | --- | --- | --- |
| Daily Cleanser | Margin Builder | Unlock Hidden Revenue (+10–12%) | Wave 1 |
| Vitamin C Serum | Margin Builder | Unlock Hidden Revenue (+10–12%) | Wave 1 |
| Linen Face Towel | Margin Builder | Unlock Hidden Revenue (+10–12%) | Wave 1 |
| Eye Cream | Margin Builder | Unlock Hidden Revenue (+8–10%) | Wave 1 |
| Hero Serum | KVI | Unlock Hidden Revenue (+8%, conservative) | Wave 2 |
| Starter Kit | KVI/Gateway | Test decrease (–10%), conversion play | Separate track |
Next Steps
Work through the steps in order, using our template.
Export 90 days of Shopify Sales by Product data and run the ABC classification
Add your margin data and calculate break-even headroom for your A-tier, using current prices for the contribution margin calculation
Classify each A-tier product by role
Run the three sanity checks: seasonality, variant complexity, competitor context
Write a hypothesis for each of your top three to five candidates, including your planned response to a null result
Launch your first Wave 1 test
Start with one product, then apply the same process to your full Wave 1 queue once you’re comfortable with how a test runs.
Lastly, this isn’t a one-time exercise. Re-run it quarterly, or any time something meaningful changes: new supplier costs, a competitor reprice, completed test results, or new products entering your catalog. Products that were B-tier candidates today can move up quickly as conditions shift.