March 15, 2026 · 12 min read

How We Grew Camda: A Data-Driven Experimentation Story

growth · experimentation · case study · product

Introduction: Growing Camda on a Lean Budget

Camda is a productivity application built for small creative teams. Like most early-stage products, it launched with strong conviction but limited data. The founding team had a hypothesis about who their users were and what those users needed — but hypotheses are not facts, and budgets do not forgive prolonged guessing.

The challenge was familiar: how do you grow a product when you cannot afford to bet on intuition alone? Paid acquisition is expensive. App store visibility is competitive. Organic search takes months to compound. Every experiment that fails to move the needle costs time the team does not have.

The answer was systematic experimentation. Not a single cleverly designed A/B test, but a rigorous, cross-channel programme that treated every surface — from the app store listing to the onboarding flow to a paid search ad — as a testable hypothesis. This post documents what Camda learned and how ExperimentFlow made it possible.

Setting Up the Experimentation Foundation

Before running a single test, the team invested two weeks in instrumentation. This work was unglamorous but indispensable. Without clean event data, experiment results are noise.

Choosing the Right Metrics

The team identified a three-level metric hierarchy:

  • North Star metric: Weekly active teams (a team with at least three members who each completed one core action in the last seven days).
  • Leading indicators: Signup-to-activation rate, day-7 retention, and feature adoption depth.
  • Guardrail metrics: Support ticket volume and subscription churn rate — metrics that should not worsen even when leading indicators improve.

This hierarchy prevented the common trap of optimising a funnel metric in isolation. A landing page change that doubled signups but halved activation would hurt the north star metric even though it appeared to “win” a narrow conversion test.
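
To keep the north star definition unambiguous in code as well as in prose, here is a minimal sketch of the weekly-active-teams calculation. The event shape mirrors the schema described in the next section; the core action names are illustrative, not Camda's actual event taxonomy.

// Minimal sketch of the "weekly active teams" definition. Core action names
// and the event shape are illustrative, not Camda's real taxonomy.
const CORE_ACTIONS = new Set(['project_created', 'version_saved', 'comment_posted']);

function weeklyActiveTeams(events, now = Date.now()) {
  const windowStart = now - 7 * 24 * 60 * 60 * 1000; // last seven days

  // Collect the distinct members per team who completed a core action in the window.
  const membersByTeam = new Map();
  for (const { team_id, user_id, event_name, timestamp } of events) {
    if (timestamp < windowStart || !CORE_ACTIONS.has(event_name)) continue;
    if (!membersByTeam.has(team_id)) membersByTeam.set(team_id, new Set());
    membersByTeam.get(team_id).add(user_id);
  }

  // A team counts as active only if at least three distinct members acted this week.
  return [...membersByTeam.values()].filter((members) => members.size >= 3).length;
}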

Instrumentation Choices

The team instrumented every meaningful user action as a named event with a consistent schema: event_name, user_id, team_id, variant, and a freeform properties object. Variant assignment was handled server-side via ExperimentFlow’s /api/decide endpoint so that every event carried the correct experiment context from the moment of assignment.
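
A sketch of what that schema looks like in an instrumentation helper. The helper and its transport are hypothetical placeholders for the team's own event pipeline, not an ExperimentFlow API.

// Hypothetical instrumentation helper. sendToAnalytics is a placeholder for the
// team's event pipeline, not an ExperimentFlow endpoint; the point is the schema.
function trackEvent(eventName, user, variant, properties = {}) {
  return sendToAnalytics({
    event_name: eventName,   // e.g. 'version_saved'
    user_id: user.id,
    team_id: user.teamId,
    variant,                 // experiment context attached from the moment of assignment
    properties,              // freeform, event-specific detail
  });
}

// Usage: carry the variant returned by /api/decide on every subsequent event.
await trackEvent('onboarding_completed', user, variant, { steps_completed: 3 });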

Instrumentation is not a one-time task. Every new feature required a corresponding instrumentation ticket before it shipped. No data, no experiment.

App Store Optimisation Experiments

The iOS App Store and Google Play Store are high-leverage channels that most teams under-test. Store listings are essentially landing pages with constrained real estate, and small copy or visual changes can produce large swings in conversion from impression to install.

Icon Variants

The Camda icon had been chosen by committee. The team suspected it was not communicating the product’s value at a glance. They ran three icon variants using Apple’s native product page optimisation tool:

  • Control: Abstract geometric mark in brand teal.
  • Variant A: Stylised checkmark suggesting task completion.
  • Variant B: Two overlapping speech bubbles suggesting collaboration.

Variant B won with a 14% lift in tap-through rate from search results. The collaboration framing matched what users were actually searching for. The geometric mark, while visually polished, communicated nothing.

Screenshot Copy

Screenshots are the highest-information surface in a store listing. The team tested two copy strategies across their first three screenshots:

  • Feature-led: “Real-time collaboration. Unlimited projects. Version history.”
  • Outcome-led: “Ship creative work faster. Stay in sync without the meetings. Never lose a version.”

Outcome-led copy produced an 11% higher install rate. Users do not buy features; they buy outcomes. This insight would recur in every subsequent channel.

Keyword Testing

ASO keyword tests are slower to evaluate because they depend on store algorithm re-indexing cycles. The team maintained a rolling 30-day keyword experiment cadence, cycling through clusters of related terms and measuring impression volume and conversion rate for each cluster. The highest-performing cluster centred on “team task management” rather than “creative project management,” suggesting the addressable audience was broader than the team had assumed.

Search (SEO) Experiments

Organic search was Camda’s second major acquisition channel. SEO experiments are structurally different from product experiments: you cannot randomly assign users to see different title tags, and you must account for temporal confounds. The team used a time-based holdout approach, making a single change, waiting 28 days for Google to re-evaluate, then measuring the delta against a set of control pages that had not changed.
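
In code terms, the comparison reduces to a ratio of ratios: how much the treated pages moved relative to how much the unchanged control pages moved over the same 28 days. A minimal sketch, with an illustrative data shape:

// Sketch of the 28-day time-based holdout. Each argument holds total organic clicks
// for the 28 days before and after the change; the data shape is illustrative.
function holdoutDelta(treatedPages, controlPages) {
  const treatedChange = treatedPages.after / treatedPages.before;
  const controlChange = controlPages.after / controlPages.before;
  // Seasonality and algorithm updates are assumed to hit both groups equally,
  // so the control group's movement is divided out.
  return treatedChange / controlChange - 1;
}

// Example: treated pages up 30%, controls up 5% over the same window → ≈ 23.8% lift.
holdoutDelta({ before: 1000, after: 1300 }, { before: 2000, after: 2100 });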

Title Tag Variants

The team tested two title tag formats for their core landing pages:

  • Format A (brand-led): “Camda — Creative Team Collaboration Software”
  • Format B (problem-led): “Stop Losing Work in Slack — Camda for Creative Teams”

Format B produced a 23% higher click-through rate from search result pages. The problem-led framing created pattern interruption in a results page full of generic product descriptions.

Landing Page Copy Experiments

The team ran structured landing page copy tests using ExperimentFlow. Visitors were assigned to variants at page load, and conversion to signup was the primary metric. Key findings:

  • Social proof placed above the fold (rather than below) lifted signup conversion by 9%.
  • A pricing anchor in the hero section (“Free for teams under five”) reduced bounce rate by 17% among visitors from high-intent keywords.
  • Removing the product feature list from the hero and replacing it with a short customer quote increased time-on-page and signup rate simultaneously.

Content Testing

Long-form content was tested at the structural level: the team compared two article formats for the same target keyword. A narrative case study format consistently outperformed a listicle format on both average time-on-page and conversion to trial signup, though listicles generated more inbound links. The team used a blended approach: publish as narrative, add a structured summary list for shareability.

User Feedback Loops

Quantitative experiments answer “what” but rarely answer “why.” The Camda team ran a systematic feedback programme that fed qualitative insights directly into the experiment backlog.

In-App Micro-Surveys

Short, contextual surveys were shown to users at two moments: immediately after completing their first core action (capturing activation sentiment) and at day 28 for users who had not upgraded (capturing barrier-to-purchase insight). Survey responses were tagged and clustered weekly.

The most actionable cluster was unexpected: users repeatedly mentioned that they did not know a key feature existed. This was not a marketing problem — it was a discoverability problem. It generated a hypothesis that became a high-value in-product experiment (described below).

Churn Exit Interviews

Every churned user received a short email survey. Response rates were low (around 12%) but the signal was high-quality. Two themes emerged consistently: onboarding felt too long, and the value proposition was not clear until day four or five. Both became experiment hypotheses.

The best experiment ideas do not come from data alone. They come from combining quantitative signals with the qualitative “why” that only users can provide.

In-Product Experiments

The product surface area was where ExperimentFlow delivered the most leverage. The team ran concurrent experiments across onboarding, activation, and ongoing engagement, using ExperimentFlow’s batch decide API to assign users to multiple experiments simultaneously without incurring multiple round trips.

Onboarding Experiments

The original onboarding flow was eight steps. Based on churn interview feedback, the team hypothesised that a shorter flow would improve activation. They tested three variants:

  • Control: Eight-step linear flow.
  • Variant A: Four-step flow (removed configuration steps, deferred to in-product tooltips).
  • Variant B: Three-step flow plus a single “quick win” task designed to reach the core value moment faster.

Variant B won with a 31% improvement in day-1 activation rate. Critically, it also improved day-7 retention by 18%, disproving the concern that a shorter onboarding would leave users under-equipped. Users who reached the core value moment faster simply stuck around.

Feature Discoverability

The in-app micro-survey feedback pointed to a discoverability problem. The team identified the most-loved but least-discovered feature (collaborative version history) and tested three discovery mechanisms:

  • A persistent sidebar tooltip shown on day 3.
  • A contextual in-product modal triggered by the first relevant user action.
  • A “did you know” card in the weekly digest email.

The contextual in-product modal produced the highest feature adoption rate (42% of exposed users tried the feature within 24 hours, versus 14% for the sidebar tooltip and 8% for the email card). Context is more powerful than prominence.
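
A sketch of how the winning mechanism can be wired up: gate the modal on the assigned variant and fire it only on the first relevant action. The helper names and the triggering action are illustrative, not Camda's actual implementation.

// Sketch of the contextual discovery modal. getVariant wraps the /api/decide call;
// the other helpers and the triggering action are illustrative placeholders.
async function onFirstVersionSaved(user) {
  const variant = await getVariant('feature-discovery-modal', user.id);
  const alreadySeen = await hasSeenModal(user.id, 'version-history-discovery');

  if (variant === 'contextual_modal' && !alreadySeen) {
    showModal('version-history-discovery'); // shown in context, at the relevant moment
    await markModalSeen(user.id, 'version-history-discovery');
  }
}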

Engagement Loop Experiments

The team tested notification strategies for re-engaging users who had been inactive for five days. Push notifications outperformed email by a significant margin for same-day re-engagement, but email produced higher-quality sessions (more actions per session, higher upgrade intent). The team adopted a sequenced approach: push on day five, email on day seven, combining the strengths of each channel.
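
A sketch of the sequenced re-engagement rule, assuming a daily job that knows each user's days of inactivity. The job structure and channel helpers are hypothetical.

// Hypothetical daily job: push on day five for fast same-day re-engagement,
// email on day seven for higher-quality sessions. Channel helpers are placeholders.
function chooseReEngagementChannel(daysInactive) {
  if (daysInactive === 5) return 'push';
  if (daysInactive === 7) return 'email';
  return null; // no touch on other days, to avoid over-messaging
}

async function runReEngagementJob(inactiveUsers) {
  for (const user of inactiveUsers) {
    const channel = chooseReEngagementChannel(user.daysInactive);
    if (channel === 'push') await sendPushNotification(user);
    if (channel === 'email') await sendDigestEmail(user);
  }
}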

Paid Channel Experiments

Paid acquisition experiments were the most expensive to run, so the team prioritised speed of learning over statistical perfection. They used a structured test matrix to isolate variables systematically rather than running free-form creative tests.

Ad Copy Tests

Consistent with the landing page and screenshot learnings, outcome-led copy outperformed feature-led copy in paid search ads. The team also discovered a third frame that outperformed both: a “before and after” frame that described the problem state and the resolution in the same headline. “Losing work in Slack? Camda keeps your team in sync” produced a 19% higher click-through rate than the best-performing outcome-led variant.

Audience Targeting Tests

The team tested four audience definitions on paid social:

  • Interest-based (design tools, project management software).
  • Lookalike audience based on existing activated users.
  • Lookalike audience based on paying users.
  • Job title targeting (creative director, design lead, head of brand).

The lookalike audience based on paying users produced the lowest cost per acquisition by a wide margin, and the best 30-day retention of any paid channel cohort. The job title targeting produced high click-through rates but poor conversion to paid, suggesting the audience was curious but not in-market.

Bid Strategy Tests

The team compared target-CPA bidding against manual CPC for their highest-intent keyword cluster. Target-CPA bidding produced 34% more conversions at the same budget once the algorithm had accumulated sufficient signal (approximately three weeks). Below that signal threshold, manual CPC outperformed it. The lesson: automated bidding requires a minimum volume of conversion data to function correctly. Starting with manual CPC and switching at scale is a repeatable playbook.
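
Expressed as a rule of thumb, the playbook looks something like the sketch below. The conversion-volume threshold is illustrative — a stand-in for the roughly three weeks of signal Camda needed, not a universal constant.

// Illustrative switching rule: stay on manual CPC until the campaign has enough
// recent conversion volume for automated bidding to learn from. The threshold is a
// stand-in for "about three weeks of signal", not a universal constant.
const MIN_CONVERSIONS_FOR_TARGET_CPA = 50;

function chooseBidStrategy(conversionsLast30Days) {
  return conversionsLast30Days >= MIN_CONVERSIONS_FOR_TARGET_CPA
    ? 'target_cpa'
    : 'manual_cpc';
}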

Cross-Channel Learnings

Running experiments across multiple channels simultaneously created an unexpected dividend: insights from one channel unlocked improvements in another. This cross-pollination effect compounded the value of the programme beyond what any single-channel effort could have achieved.

Messaging Consistency

When the team discovered that “before and after” framing outperformed other copy strategies in paid ads, they applied the same frame to their app store description, landing page hero, and onboarding welcome screen. Each application produced a measurable lift. The insight was channel-agnostic because it reflected something true about how users think about the problem.

Audience Understanding

The keyword testing in ASO revealed that users described the product as a “team task management” tool rather than a “creative project management” tool. The team updated their paid search keyword strategy to include task management terms, which had previously been excluded as too competitive. Cost per acquisition from this keyword cluster was 28% lower than the legacy creative-focused clusters, because the competition was not as fierce and the intent match was stronger.

Feature Priority Signals

The contextual modal experiment that surfaced version history also provided a signal about feature priority. The high adoption rate of version history when users were prompted confirmed it was a high-value feature that deserved more prominence. The team added version history to their paid ad copy and landing pages as a named differentiator. It became one of the top-cited reasons for upgrade in subsequent exit interview data.

Results Summary: Compound Growth Through Systematic Experimentation

Over a nine-month period, the Camda team ran 47 experiments across all channels. Not all of them produced positive results — approximately 40% produced no statistically significant difference, and 8% produced negative results that were caught before shipping. The 52% that produced positive results, however, compounded in ways that a single optimisation effort never could.

  • App store install rate: +31% (driven by icon and screenshot experiments).
  • Organic search traffic: +67% over six months (driven by title tag and content format experiments).
  • Signup-to-activation rate: +44% (driven by onboarding and landing page experiments).
  • Day-30 retention: +22% (driven by discoverability and engagement loop experiments).
  • Paid CPA: -38% (driven by ad copy, audience, and bid strategy experiments).
  • Overall paid subscriber growth: +189% year-on-year.

No single experiment drove 189% growth. A systematic programme of 47 experiments, each compounding on the last, did.

The learnings were not additive in a simple sense. They interacted. Better onboarding improved retention, which improved the lookalike audience quality for paid acquisition, which reduced CPA, which freed budget for more experiments. Systematic experimentation is a flywheel, not a checklist.

How ExperimentFlow Powered This

ExperimentFlow provided the infrastructure that made it possible to run concurrent experiments across multiple product surfaces without building and maintaining a custom experimentation platform. The team used three capabilities most heavily.

Variant Assignment via the Decide API

The /api/decide endpoint assigned users to experiment variants server-side, ensuring consistent assignment across sessions and surfaces. The call is simple and fast enough to include in the critical path of page load and API responses:

// Assign a user to a variant for the onboarding experiment
const response = await fetch('https://experimentflow.com/api/decide', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': 'your-api-key'
  },
  body: JSON.stringify({
    experiment_name: 'onboarding-flow-length',
    user_id: user.id
  })
});

const { variant } = await response.json();
// variant is 'control', 'variant_a', or 'variant_b'

if (variant === 'variant_b') {
  return renderShortOnboardingWithQuickWin(user);
} else if (variant === 'variant_a') {
  return renderFourStepOnboarding(user);
} else {
  return renderEightStepOnboarding(user);
}

Batch Decide for Concurrent Experiments

Running multiple concurrent experiments required the batch decide endpoint to avoid cascading API calls at page load. A single call resolved all active experiment assignments for a given user:

const response = await fetch('https://experimentflow.com/api/decide/batch', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': 'your-api-key'
  },
  body: JSON.stringify({
    user_id: user.id,
    experiments: [
      'onboarding-flow-length',
      'feature-discovery-modal',
      'engagement-notification-sequence'
    ]
  })
});

const variants = await response.json();
// { 'onboarding-flow-length': 'variant_b', 'feature-discovery-modal': 'control', ... }

Stats and Auto-Promotion

The ExperimentFlow dashboard provided real-time significance calculations using z-tests with configurable confidence thresholds. The team set an auto-promote threshold of 95% confidence for low-risk experiments (copy changes) and reviewed high-risk experiments (onboarding flow changes) manually before promotion. This split approach kept the cadence fast for low-stakes tests while maintaining appropriate rigour for changes with larger surface area.
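
For reference, the underlying two-proportion z-test is only a few lines. This is the standard textbook formula, shown for illustration rather than ExperimentFlow's internal implementation:

// Standard two-proportion z-test: is the variant's conversion rate significantly
// different from control's? Shown for illustration, not ExperimentFlow internals.
function twoProportionZTest(control, variant) {
  const p1 = control.conversions / control.visitors;
  const p2 = variant.conversions / variant.visitors;
  const pooled = (control.conversions + variant.conversions) /
                 (control.visitors + variant.visitors);
  const standardError = Math.sqrt(
    pooled * (1 - pooled) * (1 / control.visitors + 1 / variant.visitors)
  );
  return (p2 - p1) / standardError; // compare |z| against 1.96 for ~95% confidence
}

// Example: 4.8% vs 5.4% conversion on 10,000 visitors each → z ≈ 1.93,
// just short of the 1.96 needed for 95% confidence.
twoProportionZTest({ conversions: 480, visitors: 10000 },
                   { conversions: 540, visitors: 10000 });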

If you want to build a systematic experimentation programme like the one described here, get started free with ExperimentFlow. The first experiment takes less than ten minutes to set up. The compounding starts from there.

For a deeper look at the statistical mechanics behind auto-promotion, see Thompson Sampling Explained. For guidance on which experiments to prioritise, see When Not to A/B Test.
