What are the requirements and limits for CPP A/B Testing?

This article lists what you can test with CPP A/B Testing, what the system needs to run a test, and the key limits that can affect setup.

Learn more about what CPP A/B Testing is and how it works.

Before you start

You can’t create an A/B test for an ad group that you recently created. To create an A/B test, the selected ad group must have at least 28 complete days of performance data.

The system uses the last 28 days’ traffic as a benchmark to estimate:

- test duration
- required traffic volume
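As a rough illustration of the 28-day requirement, the check can be sketched as below. The function name `has_enough_history` and its inputs are hypothetical, and the sketch assumes that "complete" days exclude the current, still-running day and that the first day with data counts in full; the product may count days differently.

```python
from datetime import date

REQUIRED_COMPLETE_DAYS = 28  # requirement stated in this article


def has_enough_history(first_data_day: date, today: date) -> bool:
    """Return True if the ad group has at least 28 complete days of data.

    Assumption: today is still in progress, so it is not counted,
    and first_data_day is treated as a full day of data.
    """
    complete_days = (today - first_data_day).days
    return complete_days >= REQUIRED_COMPLETE_DAYS
```

For example, an ad group whose data starts on January 1 would pass this check on January 29, not on January 28.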

Supported test setups

Setup	What you compare	Key limit
Multiple ad groups + one custom product page	Ad group configurations for one custom product page	Up to 4 ad groups, 1 custom product page
One ad group + multiple custom product pages	Product pages for one ad group	Up to 4 custom product pages
Switching ad groups	Product pages by rotating duplicated ad groups	Up to 4 custom product pages
Switching ads	Product pages by rotating ads inside one ad group	Up to 4 custom product pages
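
If you plan tests in a script before creating them, the limits in the table can be written down as a small lookup. This is only a sketch: the setup keys and the `check_setup` helper are made-up names, not part of any product API, and `None` marks a limit the table does not state.

```python
# Limits transcribed from the table above.
# None means the table does not state a limit for that value.
SETUP_LIMITS = {
    "multiple_ad_groups_one_cpp": {"max_ad_groups": 4, "max_cpps": 1},
    "one_ad_group_multiple_cpps": {"max_ad_groups": None, "max_cpps": 4},
    "switching_ad_groups": {"max_ad_groups": None, "max_cpps": 4},
    "switching_ads": {"max_ad_groups": None, "max_cpps": 4},
}


def check_setup(setup: str, ad_groups: int, cpps: int) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    limits = SETUP_LIMITS[setup]
    problems = []
    max_groups = limits["max_ad_groups"]
    if max_groups is not None and ad_groups > max_groups:
        problems.append(f"too many ad groups: {ad_groups} > {max_groups}")
    if cpps > limits["max_cpps"]:
        problems.append(
            f"too many custom product pages: {cpps} > {limits['max_cpps']}"
        )
    return problems
```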

Precision limits (desired precision)

Desired precision is the margin of error you’re willing to accept in the results.
Supported range: 1%–10%
- 1% = more accurate (narrower margin of error)
- 10% = less accurate (wider margin of error)
The system caps desired precision at 10%; you can't select a wider margin of error.

Confidence and statistical reliability

Tests are designed to deliver results at a confidence level in the 80%–99% range.
The system uses 90% confidence as the default value, but you can choose a different confidence level.
In the test performance monitoring dashboard, you’ll see a Confidence Level badge per variant:
- Green = reached desired confidence (statistically significant)
- Gray = still collecting data
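
To get a feel for how desired precision and confidence level drive the traffic a test needs, here is a minimal sketch using the textbook sample-size formula for a proportion, n = z² · p(1 − p) / e². This article does not say which formula the system uses, so treat this as an illustration only. The function name, the `baseline_rate` parameter, and the reading of precision as an absolute margin of error are all assumptions.

```python
import math
from statistics import NormalDist


def required_sample_size(
    margin_of_error: float,
    confidence: float = 0.90,   # default confidence level from this article
    baseline_rate: float = 0.5,  # assumption: worst case for a proportion
) -> int:
    """Textbook per-variant sample size for estimating a proportion."""
    if not 0.01 <= margin_of_error <= 0.10:
        raise ValueError("desired precision must be between 1% and 10%")
    if not 0.80 <= confidence <= 0.99:
        raise ValueError("confidence level must be between 80% and 99%")
    # Two-sided z-score for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * baseline_rate * (1 - baseline_rate) / margin_of_error ** 2
    return math.ceil(n)
```

Under these assumptions, tightening precision from 10% to 1% multiplies the required sample by roughly 100, which is why narrow margins of error lead to longer tests.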

Switch period requirements (Switch tests)

Only one variant is live at a time. The system selects a switch interval based on traffic and traffic fluctuations.
Minimum traffic thresholds:
- For an hourly switch, the ad group must receive at least 100 taps or 30 installs per 6–8 hour slot, depending on the number of variants in the test.
- For a daily switch, the ad group must receive at least 100 taps or 30 installs per day.
- If thresholds aren’t met, the system defaults to a weekly switch.
The system also runs a fluctuation check using the last 28 days of data:
- The check classifies traffic as no fluctuation, normal fluctuation, or high fluctuation.
- This classification affects which switch option is recommended and how the duration is determined.
- To learn more about how the fluctuation is calculated, check out How are switch periods chosen in CPP A/B Testing?
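
The minimum traffic thresholds above can be sketched as a simple decision function. This is a simplified illustration, not the system's actual logic: the function names are hypothetical, the inputs are assumed to be typical per-slot and per-day counts taken from the last 28 days, the shortest qualifying interval is assumed to win, and the fluctuation check is left out because its calculation is not described here.

```python
MIN_TAPS = 100     # threshold from this article
MIN_INSTALLS = 30  # threshold from this article


def meets_threshold(taps: int, installs: int) -> bool:
    """Either 100 taps or 30 installs is enough."""
    return taps >= MIN_TAPS or installs >= MIN_INSTALLS


def pick_switch_interval(
    slot_taps: int,       # taps per 6-8 hour slot
    slot_installs: int,   # installs per 6-8 hour slot
    daily_taps: int,
    daily_installs: int,
) -> str:
    """Return 'hourly', 'daily', or 'weekly' based on traffic volume only."""
    if meets_threshold(slot_taps, slot_installs):
        return "hourly"
    if meets_threshold(daily_taps, daily_installs):
        return "daily"
    # Neither threshold met: the system defaults to a weekly switch.
    return "weekly"
```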

Test environment limits

Actions that can negatively affect test health include:

- Bid changes, budget changes, and status changes for ad groups, campaigns, and keywords

Tip: Which of these actions affect your test depends on the test method you choose.

Because custom product page performance is directly tied to screenshots:

- Keep custom product page assignments stable
- Don’t change screenshots during the test
- Keep promo text stable

What to expect during a test

Traffic may dip somewhat because the system switches ad group statuses, and Apple Ads needs time to reflect these updates.

In some tests, Apple Ads’ traffic distribution can favor one variant over others disproportionately. If this happens, and one variant receives significantly more traffic than the average of the rest, our system can detect the imbalance and temporarily pause the overperforming variant. The “Stabilize Traffic” setting can be enabled to protect the fairness of your test. If your priority is to keep all variants live at all times, you may want to disable this option when running a Parallel test.
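
The imbalance described above can be illustrated with a small sketch that compares the busiest variant against the average of the others. The article does not state how much more traffic counts as "significantly more", so the `ratio_threshold` of 2.0 is a placeholder, and the function name and input shape are hypothetical.

```python
from typing import Dict, Optional


def find_overperforming_variant(
    taps_by_variant: Dict[str, int],
    ratio_threshold: float = 2.0,  # assumption: not a documented value
) -> Optional[str]:
    """Return the variant that out-receives the rest, or None if balanced."""
    if len(taps_by_variant) < 2:
        return None
    # Only the busiest variant can exceed the average of the others.
    top = max(taps_by_variant, key=taps_by_variant.get)
    top_taps = taps_by_variant[top]
    rest_average = (sum(taps_by_variant.values()) - top_taps) / (
        len(taps_by_variant) - 1
    )
    if top_taps > ratio_threshold * rest_average:
        return top
    return None
```

With 900, 100, and 200 taps across three variants, the first variant is flagged because 900 exceeds twice the 150-tap average of the other two.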

Need more help?

If you have further questions about the process, contact your dedicated Customer Success Manager or reach the support team via live chat.