
A/B Testing for Calgary Businesses: How to Start

  • Jun 3
  • 7 min read

Quick Answer: Start A/B testing once your tested page has at least 3,000 monthly sessions and a clear conversion event. Write a single-variable hypothesis, calculate sample size before launching, run for at least 7 days (14 is better), and only declare a winner at 95% statistical confidence. Most Calgary sites that meet those conditions can launch their first valid test inside 2 weeks.


A/B testing is the engine of any serious CRO program. It compares two versions of a page (the original "control" and a modified "variation") by splitting traffic 50/50 and measuring which version produces more conversions. Done right, it gives you evidence-based decisions instead of opinion-based ones. Done wrong, it produces "winners" that don't replicate and erode confidence in the entire CRO practice.
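The 50/50 split is normally handled by your testing tool, but the idea is simple enough to sketch. The example below is an illustration only, assuming each visitor has a stable ID (a cookie value, for example); the function name `assign_variant` and the experiment label are hypothetical and not taken from any specific tool.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str) -> str:
    """Return 'control' or 'variation' for a visitor, stable across visits."""
    # Hash the experiment name together with the visitor ID so the same
    # visitor always gets the same version within one experiment, while
    # different experiments split the audience independently.
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest[:8], 16) % 100  # integer in the range 0..99
    return "control" if bucket < 50 else "variation"


# Example: the assignment does not change between page loads.
first_visit = assign_variant("visitor-123", "cta-above-fold")
second_visit = assign_variant("visitor-123", "cta-above-fold")
```

Hashing instead of flipping a coin on every page load keeps each visitor in one version for the whole test, which is what makes the two groups comparable.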


For Calgary businesses starting an A/B testing program, the three things that separate a useful program from a confusing one are: enough traffic to reach statistical significance, single-variable hypotheses you can interpret, and the discipline to wait for proper sample size before declaring results. Below is the full setup: when you're ready to test, how to write a hypothesis, how to calculate sample size, which tools to use, and the most common mistakes that quietly invalidate Calgary testing programs.


At a Glance

Quick Facts:

  • Minimum traffic for reliable testing: 3,000+ monthly sessions on the tested page

  • Minimum test duration: 7 to 14 days (covers at least one full weekly cycle)

  • Required statistical confidence to declare a winner: 95% or higher

  • Typical sample size for 20% lift on 2% baseline: roughly 20,000 visitors per variation (95% confidence, 80% power)

  • Calgary-friendly A/B testing tools: VWO, Convert, AB Tasty, Optimizely (Convert often best for SMB)

  • Reasonable monthly testing cadence: 2 to 4 concurrent tests for a single Calgary SMB


When Is Your Calgary Business Ready to Start A/B Testing?

You're ready when three conditions are met. First, the tested page receives at least 3,000 monthly sessions; lower than that and most tests take months to reach significance. Second, you have a clearly defined conversion event (form submission, purchase, call) that's tracked properly in your analytics. Third, you have a hypothesis backlog from a CRO audit, heatmap review, or analytics deep dive; testing without hypotheses is just guessing with better tools.


If any of those three is missing, fix it first. For low-traffic sites (under 3,000 sessions on tested pages), focus on best-practice fixes (page speed, form length, mobile CTAs, trust signals) rather than formal A/B testing. Those fixes don't need statistical significance to be obviously valuable.


A practical readiness checklist:

  • Traffic: 3,000+ monthly sessions on the page you want to test

  • Conversion tracking: GA4 events firing reliably for the conversion you'll measure

  • Hypothesis source: audit, heatmap data, session recordings, or user research backing your test idea

  • Testing tool: VWO, Convert, or equivalent installed and verified

  • Decision discipline: willingness to wait for 95% confidence rather than calling early winners
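If you want to keep that checklist next to your test backlog, it can be written down as a small check. This is a minimal sketch: the 3,000-session threshold comes from the list above, while the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass

MIN_MONTHLY_SESSIONS = 3000  # threshold from the checklist above


@dataclass
class PageReadiness:
    monthly_sessions: int
    conversion_tracking_verified: bool
    hypothesis_has_evidence: bool
    testing_tool_installed: bool
    will_wait_for_confidence: bool


def readiness_gaps(page: PageReadiness) -> list[str]:
    """Return the checklist items that still need fixing (empty if ready)."""
    gaps = []
    if page.monthly_sessions < MIN_MONTHLY_SESSIONS:
        gaps.append("traffic")
    if not page.conversion_tracking_verified:
        gaps.append("conversion tracking")
    if not page.hypothesis_has_evidence:
        gaps.append("hypothesis source")
    if not page.testing_tool_installed:
        gaps.append("testing tool")
    if not page.will_wait_for_confidence:
        gaps.append("decision discipline")
    return gaps
```

An empty list means the page is ready to test; anything else is the to-do list to clear first.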



How Do You Write an A/B Test Hypothesis That's Worth Running?

A good hypothesis names the change, the expected outcome, and the reason. Bad hypotheses ("let's try a green button") test arbitrary changes and produce learning you can't generalize. Good hypotheses ("based on heatmap data showing 60% of mobile users miss the primary CTA below the fold, moving the CTA above the fold will lift form submissions by at least 15%") test specific friction points and produce learning that informs the next 5 hypotheses.


The standard format:

  • Because [evidence from data or research]

  • We believe [proposed change]

  • Will result in [specific metric improvement]

  • We will know we're right when [measurable outcome with confidence level]


This forces you to back hypotheses with evidence, predict an effect size, and define success in advance. Tests written this way also build a documented learning library; when the same friction shows up on a different page, you already know what to try.
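One way to keep that learning library consistent is to store each hypothesis as a structured record rather than free text. The sketch below is only an illustration of the four-part format; the `Hypothesis` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    evidence: str            # "Because ..." (data or research)
    change: str              # "We believe ..." (the single change)
    metric: str              # the conversion event being measured
    expected_lift: float     # relative lift, e.g. 0.15 for 15%
    confidence: float = 0.95

    def __post_init__(self):
        # A hypothesis without an effect size cannot be sized or scheduled.
        if self.expected_lift <= 0:
            raise ValueError("State a positive expected lift before testing.")

    def statement(self) -> str:
        return (
            f"Because {self.evidence}, we believe {self.change} "
            f"will result in a {self.expected_lift:.0%} lift in {self.metric}. "
            f"We will know we're right when the lift holds at "
            f"{self.confidence:.0%} confidence."
        )


example = Hypothesis(
    evidence="heatmaps show 60% of mobile users miss the primary CTA",
    change="moving the CTA above the fold",
    metric="form submissions",
    expected_lift=0.15,
)
```

Calling `example.statement()` returns the hypothesis in the standard wording, and the check in `__post_init__` rejects any entry that skips the effect size.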


Avoid these common hypothesis mistakes:

  • Multiple variables in one test ("new headline AND new button AND new image"); you won't know what drove the lift

  • No effect size prediction ("might improve conversion"); you can't calculate sample size without one

  • Aesthetic hypotheses ("the new design looks more modern"); not falsifiable

  • Hypotheses that contradict prior winners without explanation; revisit the prior learning first


How Do You Calculate Sample Size Before You Launch a Test?

Sample size matters because tests called too early produce false winners. The math depends on your baseline conversion rate, the size of the lift you're trying to detect, and your desired statistical confidence (typically 95%) and power (typically 80%).


A useful rule of thumb at 95% confidence (two-sided) and 80% power: detecting a 20% relative lift on a 2% baseline requires roughly 20,000 visitors per variation, or 40,000 total. Detecting a 10% lift on the same baseline requires about 80,000 visitors per variation, because halving the detectable lift roughly quadruples the sample. The smaller the lift you want to detect, the more sample you need; for the same relative lift, the higher your baseline rate, the less sample you need.
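To see where numbers like these come from, here is a minimal sketch of the standard two-proportion sample size formula, assuming a two-sided test with equal traffic to each variation. The function name is hypothetical, and published calculators use slightly different formulas and defaults, so treat the output as an estimate rather than an exact match for any one tool.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variation(
    baseline_rate: float,
    relative_lift: float,
    confidence: float = 0.95,
    power: float = 0.80,
) -> int:
    """Approximate visitors needed per variation (two-sided z-test)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)  # rate if the lift is real
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# 2% baseline, 20% relative lift, 95% confidence, 80% power
needed = sample_size_per_variation(0.02, 0.20)
```

Under these assumptions the 2% baseline, 20% lift case comes out near 21,000 visitors per variation. One-sided tests or lower confidence settings give smaller figures, which is one reason two calculators can disagree on the same inputs.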


Use a free sample size calculator (CXL, VWO, and Evan Miller all publish good ones) before every test. Plug in your baseline conversion rate, the minimum detectable effect you care about, and your confidence level; the calculator returns the sample needed per variation. Then estimate how long that will take based on your weekly traffic; if the answer is over 8 weeks, either pick a higher-impact test, increase traffic to the page, or accept a larger minimum detectable effect.
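The duration check is simple arithmetic, sketched below. The function names are hypothetical; the 8-week ceiling is the one suggested above, and the sketch assumes all traffic to the page is entered into the test and split evenly.

```python
from math import ceil


def weeks_needed(sample_per_variation: int, weekly_sessions: int, variations: int = 2) -> int:
    """Weeks of traffic required to fill every variation."""
    if weekly_sessions <= 0:
        raise ValueError("weekly_sessions must be positive")
    return ceil(sample_per_variation * variations / weekly_sessions)


def is_feasible(sample_per_variation: int, weekly_sessions: int, max_weeks: int = 8) -> bool:
    """True if the test can finish within the planning window."""
    return weeks_needed(sample_per_variation, weekly_sessions) <= max_weeks
```

For example, a test that needs 20,000 visitors per variation fits the 8-week window on a page with 5,000 weekly sessions, but not on a page with 750.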


The honest version: most "tests" run by Calgary businesses without sample-size planning are decided after 200 to 500 visitors per variation, which is only a few percent of what's needed to detect a realistic lift. Those tests produce noise, not signal, and the agency or in-house team building on those "winners" is building on sand.


Which A/B Testing Tools Are Worth Paying For?

For most Calgary SMBs, three tools cover the practical range:

  • Microsoft Clarity (free) for heatmaps and session recordings to inform hypotheses (not a testing tool, but essential for hypothesis generation)

  • Convert.com ($99 to $999/month) for full A/B testing with strong stats and a clean editor; best price-to-feature ratio for SMB

  • VWO ($199 to $1,000+/month) for A/B testing with broader research features (surveys, session recording bundled)


Higher-end options like Optimizely and AB Tasty run $1,000+/month and add enterprise features (server-side testing, personalization at scale, deeper integrations) that most Calgary SMBs don't need yet. Google Optimize was the popular free option but was sunset in 2023; its replacements include the free tier of Convert or the GA4 native experiment integrations.


A practical setup for a Calgary SMB starting CRO: Microsoft Clarity (free) for hypothesis research, Convert ($99 to $299/month) for testing, and GA4 (free) for measurement. Total tooling cost under $300/month for a real testing program.



What Are the Mistakes That Invalidate A/B Test Results?


The five mistakes that most often undermine A/B testing programs:

  • Calling tests early; stopping before the calculated sample size is reached produces false winners that don't replicate

  • Running tests for under a week; short tests miss weekly cyclical patterns (business buyers behave differently Tuesday vs. Saturday)

  • Multi-variant tests with too little traffic; splitting traffic 3 or 4 ways multiplies the required sample size, and most sites don't have it

  • Testing during atypical traffic periods (Stampede week, Black Friday, holiday lulls); the traffic mix isn't representative

  • Not pre-defining success criteria; "the variation has more conversions" without a confidence threshold set in advance is cherry-picking, not evidence
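To make the last point concrete, here is a minimal sketch of a two-proportion z-test checked against a threshold fixed before launch. Testing tools run their own statistics (some use Bayesian or sequential methods), so this is an illustration of the principle rather than a copy of any tool's calculation; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05  # decided before launch: 95% confidence, two-sided


def p_value(control_conv: int, control_n: int, variation_conv: int, variation_n: int) -> float:
    """Two-sided p-value for the difference in conversion rates."""
    p1 = control_conv / control_n
    p2 = variation_conv / variation_n
    pooled = (control_conv + variation_conv) / (control_n + variation_n)
    se = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variation_n))
    if se == 0:
        return 1.0  # no conversions (or all conversions): nothing to compare
    z = (p2 - p1) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def variation_wins(control_conv: int, control_n: int, variation_conv: int, variation_n: int) -> bool:
    """True only if the variation is ahead AND clears the preset threshold."""
    ahead = variation_conv / variation_n > control_conv / control_n
    return ahead and p_value(control_conv, control_n, variation_conv, variation_n) < ALPHA
```

With 400 visitors per variation, 12 conversions against 8 looks like a 50% lift but does not clear the threshold; 480 against 400 conversions on 20,000 visitors each does.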


Tests should also be set up with the same goal event you actually care about. Optimizing for form starts rather than form completions, or for clicks rather than purchases, leads to "winners" that move the proxy metric without moving the business outcome. Always test against the conversion that matters to revenue.


Frequently Asked Questions

How long should an A/B test run before you call a winner?

Until the calculated sample size is reached AND the test has run for at least a full weekly cycle (7 days minimum, 14 days preferred). Calling a winner at 3 days because confidence shows 95% is a common trap: checking results repeatedly and stopping at the first significant reading inflates false positives, and a partial-week sample is not representative of the full week for most B2B Calgary businesses.
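That two-part rule can be written as a single check. This is a minimal sketch with hypothetical names; `required_per_variation` is whatever your sample size calculator returned before launch.

```python
from datetime import date


def can_evaluate(
    start: date,
    today: date,
    visitors_per_variation: list[int],
    required_per_variation: int,
    min_days: int = 7,
) -> bool:
    """True only when BOTH the duration and the sample size rules are met."""
    days_run = (today - start).days
    enough_time = days_run >= min_days
    # Every variation must reach the target, not just the total.
    enough_sample = min(visitors_per_variation) >= required_per_variation
    return enough_time and enough_sample
```

A test that shows 95% confidence on day 3 still returns False here, which is the point: the result is not read until both conditions hold.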

What's the difference between A/B testing and multivariate testing?

A/B testing compares two complete versions of a page. Multivariate testing varies multiple elements simultaneously (3 headlines x 2 buttons x 2 images = 12 combinations) to learn which elements drive lift. Multivariate is only viable on sites with very high traffic (typically 100,000+ monthly sessions on the tested page) because it splits the sample across many more variations.
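The traffic problem is easy to see once the combinations are counted. The sketch below uses placeholder element names and assumes traffic is split evenly across combinations.

```python
from itertools import product

headlines = ["headline A", "headline B", "headline C"]
buttons = ["button A", "button B"]
images = ["image A", "image B"]

# Every combination of one headline, one button, and one image.
combinations = list(product(headlines, buttons, images))


def sessions_per_combination(monthly_sessions: int, combination_count: int) -> int:
    """Monthly sessions each combination receives under an even split."""
    return monthly_sessions // combination_count


per_combo = sessions_per_combination(100_000, len(combinations))
```

Here `len(combinations)` is 12 and `per_combo` is 8,333, so even a page with 100,000 monthly sessions gives each combination only a fraction of the traffic a simple two-version test would get.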

Can I run multiple A/B tests at the same time?

Yes, on independent parts of the site. Two tests on the same page or on overlapping user flows can interact in ways that contaminate both results. A safer pattern: run concurrent tests on unrelated pages or audiences, and serialize tests within the same funnel.

Do I need a developer to run A/B tests?

Not for visual changes within most testing tools' WYSIWYG editor; you can change headlines, colours, button copy, and image positions without code. You will need developer time for tests that require backend logic, new form behaviour, or complex layout changes. Most Calgary SMB testing programs run 70% no-code, 30% dev-supported.

How many tests should a Calgary business run per month?

Two to four concurrent tests is a comfortable cadence for a single-site Calgary SMB. Higher rates work for high-traffic e-commerce or SaaS sites. The constraint usually isn't testing tool capacity; it's hypothesis quality and traffic volume. Better to run 2 well-designed tests than 5 hasty ones.



About LTL Creative: LTL Creative is a Calgary digital marketing agency providing Calgary conversion rate optimization services for ambitious local businesses, specializing in hypothesis-driven A/B testing, statistical rigour, and continuous experimentation programs, delivered through CXL-certified strategy for owners and marketing leaders requiring measurable, trusted results.


Ready to Drive Results Today? LTL Creative helps Calgary businesses build A/B testing programs that produce evidence-backed wins, backed by Google Partner, Meta-certified, and CXL-trained specialists.


Connect with LTL Creative today to discuss your Calgary conversion rate optimization testing roadmap.


Disclaimer: Results vary by business, industry, and market conditions. Statistics, platform data, and pricing referenced reflect current industry benchmarks and are subject to change.


