How We Hacked MailChimp to A/B Test Automated Campaigns

Welcome to Edgar Learn, where we share the strategies that helped us find success! This is Part One of our series on Email Marketing, and focuses on a neat little trick we’ve been using with MailChimp.

We absolutely adore MailChimp over here at MeetEdgar. It is without a doubt one of the most important tools for our business.

But when we started really digging into A/B testing our email campaigns, we ran into a little problem.

MailChimp doesn’t allow for A/B testing of automated campaigns. Which makes sense – the cardinal rule of A/B testing is that you only want to change one variable at a time, and the very nature of automated campaigns (sending several emails in a row, automatically) makes that very difficult to do.

For example, if you had a series of three emails, and you wanted to test changes to all three, you’d need to set up a test that looked something like…

Email 1    Email 2    Email 3
Control    Control    Control
Test       Control    Control
Test       Test       Control
Test       Test       Test
Control    Test       Test
Control    Control    Test
Control    Test       Control
Test       Control    Test

Where “Control” is the standard version of the email, and “Test” is the new version we’re trying out.

That’s eight different variations (two versions of each of three emails), which means you’re splitting traffic eight different ways, which means it’ll take a while to get a meaningful sample size. And since each one of these series already takes several days to complete, we’d be looking at a very long time to run each test properly.
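If you want to double-check the count, here is a minimal Python sketch that enumerates every distinct Control/Test sequence for a three-email series. The names EMAILS and VERSIONS are ours, made up for illustration. With two versions of each of three emails, there are 2 x 2 x 2 = 8 distinct sequences.

```python
from itertools import product

EMAILS = 3                      # emails in the automated series
VERSIONS = ("Control", "Test")  # two versions of each email

# Every distinct way to pick a version for each email in the series.
combinations = list(product(VERSIONS, repeat=EMAILS))

for combo in combinations:
    print("  ".join(f"{version:<7}" for version in combo))

print(len(combinations), "distinct variations")
```

Add a fourth email and the count doubles again, which is exactly why a fully isolated test gets slow so quickly.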

So we decided to fudge things a little bit, and to make our tests simply look like this:

Email 1    Email 2    Email 3
Control    Control    Control
Test       Test       Test

Please don’t call the Testing Police on us.

Our justification, other than general simplicity and speed of testing, is that we’re also looking for big changes to our conversion rates. When you’re looking for big changes, it becomes a little less important to isolate exactly what element you’re changing in the test.

It’s likely that down the road, as these early rough tests get us in the general ballpark of where we want to be, we’ll refine our testing process and be more scientific about our numbers. But for now, this is letting us see enough results to make some really big changes to our email campaigns!

But the important part – here’s how we did it!

How We Tricked MailChimp Into Running These Tests

Because MailChimp wasn’t set up to allow us to run tests at all on automated emails, we had to come up with a workaround. So here’s what we did…

Our testing has focused primarily on our invite series. When someone visits our site, they’re invited to sign up for an invitation to use Edgar. You may even have a popup invite appear on this very blog! Anyhoo, our resident WordPress expert built in a little behind-the-scenes magic, so when you sign up for an invite, you’re assigned a number we call the Test Variation.

When we create the segments for the test campaigns in MailChimp, we use this Test Variation number to split you into either a test group or the control group. Then we send you the appropriate drip campaign.

[Image: Hackerman GIF]

Moments before inventing the Test Variation number.

To reference our examples above once more, if we were running a by-the-books scientific study, we’d need a separate group for every variation in that full testing flow. But in our simplified version, we only need two Test Variations to get the results we want.

So we could have just assigned everyone either a one or a two. But that’s not what Hackerman would do! Instead, we assigned everyone a number between one and six. This helps ensure a more even (and random) split between groups, and allows for three-way split testing as well if we’re feeling particularly… testy. Ones, twos, and threes went into our control group and received the same emails we usually sent. Fours, fives, and sixes went into the experimental group and received our all-new flow.
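Here is a rough Python sketch of the numbering idea. Our real version lives inside WordPress, so treat this as an illustration only: the function names are hypothetical, and the three-way grouping is just one way you could carve up the six numbers.

```python
import random
from collections import Counter

def assign_test_variation(rng=random):
    """Return a Test Variation number from 1 to 6, chosen uniformly."""
    return rng.randint(1, 6)

def two_way_group(variation):
    """1, 2, 3 go to the control group; 4, 5, 6 go to the test group."""
    return "control" if variation <= 3 else "test"

def three_way_group(variation):
    """Hypothetical three-way split: 1-2, 3-4, and 5-6."""
    return ("control", "test_a", "test_b")[(variation - 1) // 2]

# Simulate a batch of signups to eyeball how even the split is.
rng = random.Random(42)
signups = [assign_test_variation(rng) for _ in range(6000)]
print(Counter(two_way_group(v) for v in signups))
print(Counter(three_way_group(v) for v in signups))
```

Because six divides evenly by both two and three, the same stored number works for either kind of test without reassigning anyone.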

Here’s what MailChimp’s Segment tool looks like:

[Screenshot: MailChimp testing segment]

We used a Test Variation to split our audience.

Note that you only need to do this for the first email in the automated series! All of the other emails are connected to this email, making it a little easier to split up your flows for testing. Thanks, MailChimp!

How We Measure Success

The trickiest part of our hacked-together testing process is measuring the results.

We’re looking at two main things: Open Rate and Click Rate. Open Rate is the percentage of recipients who open each email, and Click Rate is the percentage who click on a link within each email. These are pretty obviously vital stats, right?

The reason it’s tricky to measure success is that we need concrete numbers to work with, but MailChimp gives these numbers as percentages. So we’ve built a spreadsheet that takes the percentage rates and the total number of emails sent, and spits out actual numbers for clicks and opens. It looks like this:

[Image: email testing spreadsheet]

Don’t worry, we’ll explain the columns in a second!
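The arithmetic behind the spreadsheet is simple enough to sketch in a few lines of Python. The function name and the sample figures below are made up for illustration, not pulled from our real campaigns, and the result is only approximate because the percentages are already rounded.

```python
def rate_to_count(rate_percent, emails_sent):
    """Turn a percentage rate into an approximate whole-number count."""
    return round(emails_sent * rate_percent / 100)

# Hypothetical figures for one email in a campaign.
emails_sent = 1200
open_rate = 42.5   # percent
click_rate = 8.3   # percent

opens = rate_to_count(open_rate, emails_sent)
clicks = rate_to_count(click_rate, emails_sent)
print(f"{opens} opens and {clicks} clicks out of {emails_sent} emails sent")
```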

So why do we need the actual number of opens and clicks? Because we then take those numbers and feed them into VWO’s A/B Split Test Significance Calculator to see if our results are statistically significant.

Here’s what that looks like:

[Image: significance calculator results]


Without getting too mathtastic, this calculator basically figures out if your sample sizes are big enough to show a real change. In the above example, they are not – which is why our “Significant?” column in the spreadsheet also shows a “No.”

We keep running a test and hand-feeding the numbers into the spreadsheet and calculator until it shows significance, or we decide that it’s gone on long enough to show that there’s no real impact from the change. And sometimes, that can be a pretty valuable lesson as well!
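If you’d rather not hand-feed a calculator, the same kind of check can be sketched in Python. To be clear, this is a standard two-proportion z-test that we’re assuming as a stand-in; we are not claiming it is the exact math behind VWO’s calculator. The function name, the sample counts, and the 0.05 cutoff are all illustrative.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that both groups behave the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: opens out of emails sent for each group.
z, p = two_proportion_z_test(conv_a=510, n_a=1200, conv_b=528, n_b=1200)
print(f"z = {z:.2f}, p = {p:.3f}, significant? {'Yes' if p < 0.05 else 'No'}")
```

A small gap between two similar open rates, like the made-up one above, won’t clear the bar at this sample size, which is the same “No” the spreadsheet column would show.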

Our Biggest Lesson So Far

We’ve gleaned a lot of useful information from our hacked-together testing, and it has led us to swap out some subject lines and change around the content of specific emails. But in the overall scheme of things, our most valuable learnings so far have come from a test that showed no significant change in performance – that very same test we’ve been using as an example in the above images.

In this test, the only thing that was different between our two campaigns was the amount of time between each email. In our control, we had three emails spread out over roughly the course of a month. In our test, we condensed that schedule down to 10 days.

And guess what? The results were pretty much the same.

Essentially, we learned that we could reduce the timeframe of our intro email campaign to roughly one-third of its original length with no negative effects. So of course that became our new standard! Not only will this help bring in new customers more quickly, but it’ll make all of our future tests run more quickly as well. And that’s the sort of thing that will pay dividends well down the road.

We’ll be back with more information from these tests, as well as from our general adventures in email marketing, down the road. For now – well, if you’re feeling like having a little fun with MailChimp, give this automated campaign testing trick a shot!