Personalization at Scale: Using Contextual Bandits for Dynamic Content
The Limits of A/B Testing
Traditional A/B testing finds the single best variant for your average visitor. But your visitors aren't average—they come from different sources, use different devices, have different intents, and respond to different messages.
What if your "Sign up free" CTA works best for organic search visitors, but "See a demo" converts better for visitors from LinkedIn? A standard A/B test would pick one winner, leaving conversions on the table for every segment where the winner isn't optimal.
This is where contextual bandits come in: they learn which variant works best for each type of visitor and automatically serve the right experience to the right person.
What Are Contextual Bandits?
A contextual bandit extends the multi-armed bandit framework by incorporating context—information about the current visitor—into the decision. Instead of learning a single best variant, it learns a policy: a mapping from context to the best action.
Context can include:
- Traffic source: organic, paid, social, email, direct
- Device type: mobile, desktop, tablet
- Geography: country, region, timezone
- User behavior: new vs returning, pages viewed, time on site
- Time: day of week, time of day, season
- Custom attributes: plan tier, industry, company size
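Under the hood, attributes like these become numeric features the model can score. Here's a minimal one-hot encoding sketch; the categories and feature layout are illustrative assumptions, not Experiment Flow's internal representation:

// Illustrative one-hot encoding of visitor context into a feature vector
// (an assumption for the example, not Experiment Flow's internals)
const DEVICES = ['mobile', 'desktop', 'tablet'];
const SOURCES = ['organic', 'paid', 'social', 'email', 'direct'];

function featurize(context) {
  return [
    ...DEVICES.map(d => (context.device === d ? 1 : 0)),
    ...SOURCES.map(s => (context.source === s ? 1 : 0)),
    context.returning ? 1 : 0
  ];
}

featurize({ device: 'mobile', source: 'organic', returning: true });
// => [1, 0, 0, 1, 0, 0, 0, 0, 1]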
How Contextual Bandits Work
The algorithm maintains a model that predicts the expected reward (conversion probability) for each action given the current context. For each new visitor:
- Observe the visitor's context (device, source, etc.)
- For each possible action (variant), predict the expected reward given this context
- Add exploration noise to balance learning with performance
- Select the action with the highest adjusted prediction
- Observe the outcome and update the model
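Here's a minimal sketch of that loop, using epsilon-greedy exploration over a simple linear reward model per action. This is one way to implement the idea, not Experiment Flow's actual algorithm; `features` is a vector like the one from the encoding sketch above:

// Sketch: epsilon-greedy contextual bandit with one linear model per action
// (illustrative only; constants and model choice are assumptions)
const EPSILON = 0.1;        // share of traffic reserved for exploration
const LEARNING_RATE = 0.05; // step size for online updates

const weights = {}; // actionId -> weight vector (one linear model per action)

function predict(action, features) {
  const w = (weights[action] ??= features.map(() => 0));
  return w.reduce((sum, wi, i) => sum + wi * features[i], 0);
}

function selectAction(actions, features) {
  if (Math.random() < EPSILON) {
    // Explore: occasionally serve a random action so the model keeps learning
    return actions[Math.floor(Math.random() * actions.length)];
  }
  // Exploit: serve the action with the highest predicted reward
  return actions.reduce((best, a) =>
    predict(a, features) > predict(best, features) ? a : best);
}

function update(action, features, reward) {
  // Nudge the served action's weights toward the observed reward (0 or 1)
  const error = reward - predict(action, features);
  weights[action] = weights[action].map(
    (wi, i) => wi + LEARNING_RATE * error * features[i]);
}

In practice you don't implement any of this yourself. A single API call asks Experiment Flow to rank the actions for the current visitor: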
// Rank actions for a visitor using contextual bandits
const response = await fetch('/api/personalize/123/rank', {
  method: 'POST',
  headers: {
    'X-API-Key': apiKey,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    context: {
      device: 'mobile',
      source: 'organic',
      country: 'US',
      returning: true
    },
    actions: ['free-trial', 'demo', 'pricing', 'case-study']
  })
})
// Returns actions ranked by predicted conversion for THIS visitor
// { "ranking": ["demo", "case-study", "free-trial", "pricing"] }
Personalization Use Cases
Dynamic Landing Pages
Serve different hero messages, images, and CTAs based on the visitor's context. A visitor arriving from a technical blog post might see "Explore our API docs" while someone from a business publication sees "Book a strategy call."
Product Recommendations
Rank products differently for each visitor based on their browsing history, demographics, and what similar visitors purchased. This goes beyond collaborative filtering by incorporating real-time context.
Email Personalization
Choose the best subject line, send time, and content variant for each subscriber. Contextual bandits learn which subscribers respond to urgency ("Last chance!") vs curiosity ("You won't believe...") vs utility ("How to...").
Pricing and Offers
Show different discount levels, payment plan options, or feature highlights based on the visitor's predicted price sensitivity and use case.
Contextual Bandits vs Segment-Based Testing
You might think: "I'll just run separate A/B tests for each segment." This approach has serious limitations:
- Combinatorial explosion: 3 devices × 5 sources × 4 regions = 60 segments, each needing its own test
- Thin traffic: Splitting into many segments means each one takes forever to reach significance
- Manual management: Someone has to set up, monitor, and act on 60+ tests
- No feature interaction: What if mobile + organic behaves differently than mobile + paid? Segments can't capture these interactions efficiently.
Contextual bandits handle all of this automatically. They discover which context features matter, capture interactions between features, and optimize allocation across the entire context space simultaneously.
Getting Started with Personalization
Experiment Flow's personalizer feature makes contextual bandits accessible without requiring ML expertise:
- Define your actions: The variants you want to personalize (CTAs, layouts, offers)
- Send context: Pass visitor attributes with each ranking request
- Track rewards: Report conversions so the model can learn (a sketch of this call follows below)
- Watch it improve: The model adapts over time, improving personalization with every interaction
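Sending context is the ranking call shown earlier. Reporting a reward might look like this; the /reward endpoint name and payload shape are assumptions for illustration, so check the Experiment Flow docs for the exact API:

// Report the outcome so the model can learn
// (endpoint and payload shape are illustrative, not a confirmed API)
await fetch('/api/personalize/123/reward', {
  method: 'POST',
  headers: {
    'X-API-Key': apiKey,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    action: 'demo', // the action that was actually served
    reward: 1       // 1 = converted, 0 = did not convert
  })
})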
The system uses embedding-based similarity to generalize across contexts it hasn't seen before. A visitor from Germany on mobile benefits from what the system learned about visitors from France on mobile, because the contexts are similar in the embedding space.
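Cosine similarity over those embeddings illustrates the effect. The vectors below are invented for the example, not real model output:

// Cosine similarity: nearby contexts get similar predictions
function cosine(a, b) {
  const dot = a.reduce((s, ai, i) => s + ai * b[i], 0);
  const norm = v => Math.sqrt(v.reduce((s, vi) => s + vi * vi, 0));
  return dot / (norm(a) * norm(b));
}

// Hypothetical context embeddings (made up for illustration)
const germanyMobile = [0.82, 0.31, 0.45];
const franceMobile  = [0.79, 0.35, 0.41];
const usDesktop     = [0.12, 0.88, 0.20];

cosine(germanyMobile, franceMobile); // ~0.998: learning transfers
cosine(germanyMobile, usDesktop);    // ~0.51: little transfer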
Start with basic A/B testing, graduate to bandits for faster optimization, then level up to contextual bandits for true personalization. Each step builds on the last.
Measuring Personalization Impact
To measure the impact of personalization, run a meta-experiment: randomly assign some visitors to the personalized experience and others to a static control (your current best variant). The difference in conversion rates is the personalization lift—the value that context-aware optimization adds beyond finding a single best variant.
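A sketch of that setup follows; the 10% holdout, hashing scheme, and result counts are all illustrative choices:

// Deterministically assign ~10% of visitors to the static control
function inControlGroup(visitorId, holdout = 0.1) {
  let h = 0;
  for (const ch of visitorId) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h % 100 < holdout * 100; // same visitor always lands in the same group
}

// Later, with made-up results:
const control = { conversions: 120, visitors: 2000 };        // static best variant
const personalized = { conversions: 1242, visitors: 18000 }; // bandit-served

const pControl = control.conversions / control.visitors;                // 0.060
const pPersonalized = personalized.conversions / personalized.visitors; // 0.069
const lift = (pPersonalized - pControl) / pControl;                     // 0.15, a 15% lift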
In our experience, personalization lift typically ranges from 5% to 30% above an already-optimized baseline, with the highest gains in businesses with diverse visitor segments.
Ready to optimize your site?
Start running experiments in minutes with Experiment Flow. Free plan available.