How to Determine Your A/B Testing Sample Size & Time Frame
Do you remember the first A/B test you ran? I do. (Nerdy, I know.)
I felt simultaneously thrilled and terrified because I knew I had to actually use some of what I learned in college for my job.
There were some aspects of A/B testing I still remembered — for instance, I knew you need a big enough sample size to run the test on, and you need to run the test long enough to get statistically significant results.
But … that’s pretty much it. I wasn’t sure how big was “big enough” for sample sizes and how long was “long enough” for test durations — and Googling it gave me a variety of answers my college statistics courses definitely didn’t prepare me for.
Turns out I wasn’t alone: Those are two of the most common A/B testing questions we get from customers. And the reason the typical answers from a Google search aren’t that helpful is that they talk about A/B testing in an ideal, theoretical, non-marketing world.
So, I figured I’d do the research to help answer this question for you in a practical way. By the end of this post, you should know how to determine the right sample size and time frame for your next A/B test. Let’s dive in.
A/B Testing Sample Size & Time Frame
In theory, to determine a winner between Variation A and Variation B, you need to wait until you have enough results to see if there is a statistically significant difference between the two.
Depending on your company, sample size, and how you execute the A/B test, getting statistically significant results could happen in hours or days or weeks — and you’ve just got to stick it out until you get those results. In theory, you should not restrict the time in which you’re gathering results.
For many A/B tests, waiting is no problem. Testing headline copy on a landing page? It’s cool to wait a month for results. Same goes with blog CTA creative — you’d be going for the long-term lead generation play, anyway.
But certain aspects of marketing demand shorter timelines when it comes to A/B testing. Take email as an example. With email, waiting for an A/B test to conclude can be a problem, for several practical reasons:
1. Each email send has a finite audience.
Unlike a landing page (where you can continue to gather new audience members over time), once you send an email A/B test off, that’s it — you can’t “add” more people to that A/B test. So you’ve got to figure out how to squeeze the most juice out of your emails.
This will usually require you to send an A/B test to the smallest portion of your list needed to get statistically significant results, pick a winner, and then send the winning variation on to the rest of the list.
2. Running an email marketing program means you’re juggling at least a few email sends per week. (In reality, probably way more than that.)
If you spend too much time collecting results, you could miss out on sending your next email — which could have worse effects than if you sent a non-statistically-significant winner email on to one segment of your database.
3. Email sends are often designed to be timely.
Your marketing emails are optimized to deliver at a certain time of day, whether they’re supporting the timing of a new campaign launch or landing in your recipients’ inboxes at a time they’d love to receive them. So if you wait for your email to reach full statistical significance, you might miss out on being timely and relevant — which could defeat the purpose of your email send in the first place.
That’s why email A/B testing programs have a “timing” setting built in: At the end of that time frame, if neither result is statistically significant, one variation (which you choose ahead of time) will be sent to the rest of your list. That way, you can still run A/B tests in email, but you can also work around your email marketing scheduling demands and ensure people are always getting timely content.
So to run A/B tests in email while still optimizing your sends for the best results, you’ve got to take both sample size and timing into account.
Next up — how to actually figure out your sample size and timing using data.
How to Determine Sample Size for an A/B Test
Now, let’s dive into how to actually calculate the sample size and timing you need for your next A/B test.
For our purposes, we’re going to use email as our example to demonstrate how you’ll determine sample size and timing for an A/B test. However, it’s important to note — the steps in this list can be used for any A/B test, not just email.
Let’s dive in.
As mentioned above, each A/B test can only be sent to a finite audience — so you need to figure out how to maximize the results from that A/B test. To do that, you need to figure out the smallest portion of your total list needed to get statistically significant results. Here’s how you calculate it.
1. Assess whether you have enough contacts in your list to A/B test a sample in the first place.
To A/B test a sample of your list, you need to have a decently large list size — at least 1,000 contacts. If you have fewer than that in your list, the proportion of your list that you need to A/B test to get statistically significant results gets larger and larger.
For example, to get statistically significant results from a small list, you might have to test 85% or 95% of your list. And the untested remainder of your list would be so small that you might as well have just sent one version to half of your list and the other version to the other half, then measured the difference.
Your results might not be statistically significant at the end of it all, but at least you’re gathering learnings while you grow your lists to have more than 1,000 contacts. (If you want more tips on growing your email list so you can hit that 1,000 contact threshold, check out this blog post.)
Note for HubSpot customers: 1,000 contacts is also our benchmark for running A/B tests on samples of email sends — if you have fewer than 1,000 contacts in your selected list, the A version of your test will automatically be sent to half of your list and the B will be sent to the other half.
2. Use a sample size calculator.
Next, you’ll want to find a sample size calculator — HubSpot’s A/B Testing Kit offers a good, free sample size calculator.
3. Enter your email’s Confidence Level, Confidence Interval, and Population into the tool.
Yep, that’s a lot of statistics jargon. Here’s what these terms translate to in your email:
Population: Your sample represents a larger group of people. This larger group is called your population.
In email, your population is the typical number of people in your list who get emails delivered to them — not the number of people you sent emails to. To calculate population, I’d look at the past three to five emails you’ve sent to this list, and average the total number of delivered emails. (Use the average when calculating sample size, as the total number of delivered emails will fluctuate.)
Confidence Interval: You might have heard this called “margin of error.” Lots of surveys use it, including political polls. This is the range within which you can expect the full population’s results to fall, based on what you see in your sample.
For example, in your emails, if you have an interval of 5 and 60% of your sample opens your variation, you can expect that between 55% (60 minus 5) and 65% (60 plus 5) of the full population would have opened that email. The bigger the interval you choose, the more certain you can be that the population’s true behavior falls within it. At the same time, larger intervals give you less definitive results. It’s a trade-off you’ll have to make in your emails.
For our purposes, it’s not worth getting too caught up in confidence intervals. When you’re just getting started with A/B tests, I’d recommend choosing a smaller interval (e.g., around 5).
Confidence Level: This tells you how sure you can be that your sample results lie within the above confidence interval. The lower the percentage, the less sure you can be about the results. The higher the percentage, the more people you’ll need in your sample, too.
Note for HubSpot customers: The HubSpot Email A/B tool automatically uses the 85% confidence level to determine a winner. Since that option isn’t available in this tool, I’d suggest choosing 95%.
Email A/B Test Example:
Let’s pretend we’re sending our first A/B test. Our list has 1,000 people in it and has a 95% deliverability rate. We want to be 95% confident our winning email metrics fall within a 5-point interval of our population metrics.
Here’s what we’d put in the tool:
- Population: 950
- Confidence Level: 95%
- Confidence Interval: 5
4. Click “Calculate” to get your sample size.
Ta-da! The calculator will spit out your sample size.
In our example, our sample size is: 274.
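If you’re curious what’s happening under the hood, here’s a minimal sketch of the math most sample size calculators use — Cochran’s formula with a finite population correction. The function and its defaults are our own illustration, not necessarily the exact formula your calculator implements:

```python
import math

# z-scores for common confidence levels
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def sample_size(population: int, confidence: float = 0.95,
                interval: float = 5.0, p: float = 0.5) -> int:
    """Sample size needed for one variation.

    p = 0.5 is the most conservative assumption (it maximizes variance);
    interval is the margin of error in percentage points.
    """
    z = Z_SCORES[confidence]
    e = interval / 100.0
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)              # infinite-population sample size
    return math.ceil(n0 / (1 + (n0 - 1) / population))  # finite population correction

print(sample_size(950, 0.95, 5))  # 274 — matches the example above
```

Note how the finite population correction is what makes small lists impractical: run the same function with a 500-person population and you’d need 218 contacts per variation — over 87% of the list for a two-variation test, which is exactly the problem described in step 1.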
This is the size one of your variations needs to be. So for your email send, if you have one control and one variation, you’ll need to double this number. If you had a control and two variations, you’d triple it. (And so on.)
5. Depending on your email program, you may need to calculate the sample size’s percentage of the whole email.
HubSpot customers, I’m looking at you for this section. When you’re running an email A/B test, you’ll need to select the percentage of your list to send each variation to — not just the raw sample size.
To do that, you need to divide the number in your sample by the total number of contacts in your list. Here’s what that math looks like, using the example numbers above:
274 / 1,000 = 27.4%
This means that each sample (both your control AND your variation) needs to go to roughly 27-28% of your audience — in other words, about 55% of your total list.
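If it helps to see that arithmetic end to end, here’s the same calculation as a few lines of Python (the variable names are our own):

```python
# Converting the sample size from step 4 into the send percentages an
# email tool asks for, using the example numbers above.
sample_per_variation = 274   # from the calculator
list_size = 1_000
num_variations = 2           # one control + one variation

per_variation_pct = sample_per_variation / list_size * 100  # 27.4% per variation
total_test_pct = per_variation_pct * num_variations         # 54.8% of the list in the test
remainder_pct = 100 - total_test_pct                        # 45.2% receives the winning send

print(f"{per_variation_pct:.1f}% per variation, {total_test_pct:.1f}% in the test, "
      f"{remainder_pct:.1f}% gets the winner")
```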
And that’s it! You should be ready to select your sending time.
How to Choose the Right Timeframe for Your A/B Test
Again, for figuring out the right timeframe for your A/B test, we’ll use the example of email sends – but this information should still apply regardless of the type of A/B test you’re conducting.
However, your timeframe will vary depending on your business’ goals, as well. If you’d like to design a new landing page by Q2 2021 and it’s Q4 2020, you’ll likely want to finish your A/B test by January or February so you can use those results to build the winning page.
But, for our purposes, let’s return to the email send example: You have to figure out how long to run your email A/B test before sending a (winning) version on to the rest of your list.
Figuring out the timing aspect is a little less statistically driven, but you should definitely use past data to help you make better decisions. Here’s how you can do that.
If you don’t have timing restrictions on when to send the winning email to the rest of the list, head over to your analytics.
Figure out when your email opens/clicks (or whatever your success metric is) start to drop off. Look at your past email sends to figure this out.
For example, what percentage of total clicks did you get in your first day? If you found that you get 70% of your clicks in the first 24 hours, and then 5% each day after that, it’d make sense to cap your email A/B testing window at 24 hours, because it wouldn’t be worth delaying your results just to gather a little extra data.
In this scenario, you would probably keep your timing window to 24 hours, and at the end of those 24 hours, your email program should let you know whether it can determine a statistically significant winner.
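Here’s a minimal sketch of that drop-off analysis in Python, using made-up daily click counts that mirror the 70%-then-5% example above (the 5% cutoff is an assumption — pick whatever threshold matches your tolerance for waiting):

```python
# Hypothetical clicks by day since send: 70% on day 0, ~5% each day after.
clicks_by_day = [700, 50, 50, 50, 50, 50, 50]

total = sum(clicks_by_day)
for day, clicks in enumerate(clicks_by_day):
    share = clicks / total
    print(f"day {day}: {share:.0%} of all clicks")
    if day > 0 and share <= 0.05:  # marginal gain no longer worth the delay
        print(f"cap the test window at {day * 24} hours")
        break
```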
Then, it’s up to you what to do next. If you have a large enough sample size and found a statistically significant winner at the end of the testing time frame, many email marketing programs will automatically and immediately send the winning variation.
If you have a large enough sample size and there’s no statistically significant winner at the end of the testing time frame, email marketing tools might also allow you to automatically send a variation of your choice.
If you have a smaller sample size or are running a 50/50 A/B test, deciding when to send the next email based on the initial email’s results is entirely up to you.
If you have time restrictions on when to send the winning email to the rest of the list, figure out how late you can send the winner without it being untimely or affecting other email sends.
For example, if you’ve sent an email out at 3 p.m. EST for a flash sale that ends at midnight EST, you wouldn’t want to determine an A/B test winner at 11 p.m. Instead, you’d want to send the email closer to 6 or 7 p.m. — that’ll give the people not involved in the A/B test enough time to act on your email.
And that’s pretty much it, folks. After doing these calculations and examining your data, you should be in a much better state to conduct successful A/B tests — ones that are statistically valid and help you move the needle on your goals.
YouTube Ad Specs, Sizes, and Examples [2024 Update]
Introduction
With billions of users each month, YouTube is the world’s second largest search engine and top website for video content. This makes it a great place for advertising. To succeed, advertisers need to follow the correct YouTube ad specifications. These rules help your ad reach more viewers, increasing the chance of gaining new customers and boosting brand awareness.
Types of YouTube Ads
Video Ads
- Description: These play before, during, or after a YouTube video on computers or mobile devices.
- Types:
- In-stream ads: Can be skippable or non-skippable.
- Bumper ads: Non-skippable, short ads that play before, during, or after a video.
Display Ads
- Description: These appear in different spots on YouTube and usually use text or static images.
- Note: YouTube does not support display image ads directly on its app, but these can be targeted to YouTube.com through Google Display Network (GDN).
Companion Banners
- Description: Appears to the right of the YouTube player on desktop.
- Requirement: Must be purchased alongside In-stream ads, Bumper ads, or In-feed ads.
In-feed Ads
- Description: Resemble videos with images, headlines, and text. They link to a public or unlisted YouTube video.
Outstream Ads
- Description: Mobile-only video ads that play outside of YouTube, on websites and apps within the Google video partner network.
Masthead Ads
- Description: Premium, high-visibility banner ads displayed at the top of the YouTube homepage for both desktop and mobile users.
YouTube Ad Specs by Type
Skippable In-stream Video Ads
- Placement: Before, during, or after a YouTube video.
- Resolution:
- Horizontal: 1920 x 1080px
- Vertical: 1080 x 1920px
- Square: 1080 x 1080px
- Aspect Ratio:
- Horizontal: 16:9
- Vertical: 9:16
- Square: 1:1
- Length:
- Awareness: 15-20 seconds
- Consideration: 2-3 minutes
- Action: 15-20 seconds
Non-skippable In-stream Video Ads
- Description: Must be watched completely before the main video.
- Length: 15 seconds (or 20 seconds in certain markets).
- Resolution:
- Horizontal: 1920 x 1080px
- Vertical: 1080 x 1920px
- Square: 1080 x 1080px
- Aspect Ratio:
- Horizontal: 16:9
- Vertical: 9:16
- Square: 1:1
Bumper Ads
- Length: Maximum 6 seconds.
- File Format: MP4, Quicktime, AVI, ASF, Windows Media, or MPEG.
- Resolution:
- Horizontal: 640 x 360px
- Vertical: 480 x 360px
In-feed Ads
- Description: Show alongside YouTube content, like search results or the Home feed.
- Resolution:
- Horizontal: 1920 x 1080px
- Vertical: 1080 x 1920px
- Square: 1080 x 1080px
- Aspect Ratio:
- Horizontal: 16:9
- Square: 1:1
- Length:
- Awareness: 15-20 seconds
- Consideration: 2-3 minutes
- Headline/Description:
- Headline: Up to 2 lines, 40 characters per line
- Description: Up to 2 lines, 35 characters per line
Display Ads
- Description: Static images or animated media that appear on YouTube next to video suggestions, in search results, or on the homepage.
- Image Size: 300×60 pixels.
- File Type: GIF, JPG, PNG.
- File Size: Max 150KB.
- Max Animation Length: 30 seconds.
Outstream Ads
- Description: Mobile-only video ads that appear on websites and apps within the Google video partner network, not on YouTube itself.
- Logo Specs:
- Square: 1:1 (200 x 200px).
- File Type: JPG, GIF, PNG.
- Max Size: 200KB.
Masthead Ads
- Description: High-visibility ads at the top of the YouTube homepage.
- Resolution: 1920 x 1080 or higher.
- File Type: JPG or PNG (without transparency).
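If you’re checking creative programmatically before upload, one handy approach is to encode the resolutions listed above as data. Here’s a quick sketch (our own illustration, not a Google tool) in Python:

```python
# A few of the resolutions listed in this article, keyed by ad type and orientation.
SPECS = {
    "skippable_in_stream": {
        "horizontal": (1920, 1080),
        "vertical": (1080, 1920),
        "square": (1080, 1080),
    },
    "bumper": {
        "horizontal": (640, 360),
        "vertical": (480, 360),
    },
}

def matches_spec(ad_type: str, width: int, height: int) -> bool:
    """True if (width, height) is one of the listed resolutions for ad_type."""
    return (width, height) in SPECS.get(ad_type, {}).values()

print(matches_spec("skippable_in_stream", 1080, 1920))  # True
print(matches_spec("bumper", 1920, 1080))               # False — not a listed bumper size
```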
Conclusion
YouTube offers a variety of ad formats to reach audiences effectively in 2024. Whether you want to build brand awareness, drive conversions, or target specific demographics, YouTube provides a dynamic platform for your advertising needs. Always follow Google’s advertising policies and the technical ad specs to ensure your ads perform their best. Ready to start using YouTube ads? Contact us today to get started!
Why We Are Always ‘Clicking to Buy’, According to Psychologists
Amazon pillows.
A deeper dive into data, personalization and Copilots
Salesforce launched a collection of new, generative AI-related products at Connections in Chicago this week. They included new Einstein Copilots for marketers and merchants and Einstein Personalization.
To better understand not only the potential impact of the new products but also the evolving Salesforce architecture, we sat down with Bobby Jania, CMO of Marketing Cloud.
Dig deeper: Salesforce piles on the Einstein Copilots
Salesforce’s evolving architecture
It’s hard to deny that Salesforce likes coming up with new names for platforms and products (what happened to Customer 360?) and this can sometimes make the observer wonder if something is brand new, or old but with a brand new name. In particular, what exactly is Einstein 1 and how is it related to Salesforce Data Cloud?
“Data Cloud is built on the Einstein 1 platform,” Jania explained. “The Einstein 1 platform is our entire Salesforce platform, and that includes products like Sales Cloud and Service Cloud — it includes the original idea of Salesforce: not just being in the cloud, but being multi-tenant.”
Data Cloud — not an acquisition, of course — was built natively on that platform. It was the first product built on Hyperforce, Salesforce’s new cloud infrastructure architecture. “Since Data Cloud was on what we now call the Einstein 1 platform from Day One, it has always natively connected to, and been able to read anything in Sales Cloud, Service Cloud [and so on]. On top of that, we can now bring in, not only structured but unstructured data.”
That’s a significant progression from the position, several years ago, when Salesforce had stitched together a platform around various acquisitions (ExactTarget, for example) that didn’t necessarily talk to each other.
“At times, what we would do is have a kind of behind-the-scenes flow where data from one product could be moved into another product,” said Jania, “but in many of those cases the data would then be in both, whereas now the data is in Data Cloud. Tableau will run natively off Data Cloud; Commerce Cloud, Service Cloud, Marketing Cloud — they’re all going to the same operational customer profile.” They’re not copying the data from Data Cloud, Jania confirmed.
Another thing to know: it’s possible for Salesforce customers to import their own datasets into Data Cloud. “We wanted to create a federated data model,” said Jania. “If you’re using Snowflake, for example, we more or less virtually sit on your data lake. The value we add is that we will look at all your data and help you form these operational customer profiles.”
Let’s learn more about Einstein Copilot
“Copilot means that I have an assistant with me in the tool where I need to be working that contextually knows what I am trying to do and helps me at every step of the process,” Jania said.
For marketers, this might begin with a campaign brief developed with Copilot’s assistance, the identification of an audience based on the brief, and then the development of email or other content. “What’s really cool is the idea of Einstein Studio where our customers will create actions [for Copilot] that we hadn’t even thought about.”
Here’s a key insight (back to nomenclature). We reported on Copilot for marketers, Copilot for merchants, Copilot for shoppers. It turns out, however, that there is just one Copilot, Einstein Copilot, and these are use cases. “There’s just one Copilot; we just add these for a little clarity. We’re going to talk about marketing use cases, about shoppers’ use cases. These are actions for the marketing use cases we built out of the box; you can build your own.”
It’s surely going to take a little time for marketers to learn to work easily with Copilot. “There’s always time for adoption,” Jania agreed. “What is directly connected with this is, this is my ninth Connections and this one has the most hands-on training that I’ve seen since 2014 — and a lot of that is getting people using Data Cloud, using these tools rather than just being given a demo.”
What’s new about Einstein Personalization
Salesforce Einstein has been around since 2016 and many of the use cases seem to have involved personalization in various forms. What’s new?
“Einstein Personalization is a real-time decision engine and it’s going to choose next-best-action, next-best-offer. What is new is that it’s a service now that runs natively on top of Data Cloud.” A lot of real-time decision engines need their own set of data that might actually be a subset of data. “Einstein Personalization is going to look holistically at a customer and recommend a next-best-action that could be natively surfaced in Service Cloud, Sales Cloud or Marketing Cloud.”
Finally, trust
One feature of the presentations at Connections was the reassurance that, although public LLMs like ChatGPT could be selected for application to customer data, none of that data would be retained by the LLMs. Is this just a matter of written agreements? No, not just that, said Jania.
“In the Einstein Trust Layer, all of the data, when it connects to an LLM, runs through our gateway. If there was a prompt that had personally identifiable information — a credit card number, an email address — at a minimum, all that is stripped out. The LLMs do not store the output; we store the output for auditing back in Salesforce. Any output that comes back through our gateway is logged in our system; it runs through a toxicity model; and only at the end do we put PII data back into the answer. There are real pieces beyond a handshake that ensure this data is safe.”
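For the curious, here’s a purely illustrative sketch of the gateway flow Jania describes — mask PII, call the LLM, store the output for auditing, screen it, then restore PII at the end. Every name here is hypothetical; this is not Salesforce code:

```python
import re

# Naive email matcher — real PII detection covers many more identifier types.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def mask_pii(prompt: str) -> tuple[str, dict]:
    """Replace email addresses with placeholder tokens; keep the mapping locally."""
    mapping = {}
    def repl(match):
        token = f"<PII_{len(mapping)}>"
        mapping[token] = match.group(0)
        return token
    return EMAIL_RE.sub(repl, prompt), mapping

def gateway(prompt, llm, audit_log, is_toxic):
    masked, mapping = mask_pii(prompt)    # PII stripped before leaving the gateway
    output = llm(masked)                  # the LLM never sees the raw identifiers
    audit_log.append(output)              # output stored on our side for auditing
    if is_toxic(output):                  # toxicity screen before anything is returned
        raise ValueError("output failed toxicity screen")
    for token, value in mapping.items():  # PII restored only at the very end
        output = output.replace(token, value)
    return output
```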