Personalization at Scale: Using Contextual Bandits for Dynamic Content
The Limits of A/B Testing
Traditional A/B testing finds the single best variant for your average visitor. But your visitors aren't average—they come from different sources, use different devices, have different intents, and respond to different messages.
What if your "Sign up free" CTA works best for organic search visitors, but "See a demo" converts better for visitors from LinkedIn? A standard A/B test would pick one winner, leaving conversions on the table for every segment where the winner isn't optimal.
This is where contextual bandits come in: they learn which variant works best for each type of visitor and automatically serve the right experience to the right person.
What Are Contextual Bandits?
A contextual bandit extends the multi-armed bandit framework by incorporating context—information about the current visitor—into the decision. Instead of learning a single best variant, it learns a policy: a mapping from context to the best action.
Context can include:
- Traffic source: organic, paid, social, email, direct
- Device type: mobile, desktop, tablet
- Geography: country, region, timezone
- User behavior: new vs returning, pages viewed, time on site
- Time: day of week, time of day, season
- Custom attributes: plan tier, industry, company size
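Under the hood, attributes like these become numeric features the model can score. Here's a minimal one-hot encoding sketch; the categories and feature layout are illustrative assumptions, not Experiment Flow's internal representation:

// Illustrative one-hot encoding of visitor context into a feature vector
// (an assumption for the example, not Experiment Flow's internals)
const DEVICES = ['mobile', 'desktop', 'tablet'];
const SOURCES = ['organic', 'paid', 'social', 'email', 'direct'];

function featurize(context) {
  return [
    ...DEVICES.map(d => (context.device === d ? 1 : 0)),
    ...SOURCES.map(s => (context.source === s ? 1 : 0)),
    context.returning ? 1 : 0
  ];
}

featurize({ device: 'mobile', source: 'organic', returning: true });
// => [1, 0, 0, 1, 0, 0, 0, 0, 1]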
How Contextual Bandits Work
The algorithm maintains a model that predicts the expected reward (conversion probability) for each action given the current context. For each new visitor:
- Observe the visitor's context (device, source, etc.)
- For each possible action (variant), predict the expected reward given this context
- Add exploration noise to balance learning with performance
- Select the action with the highest adjusted prediction
- Observe the outcome and update the model
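Here's a minimal sketch of that loop, using epsilon-greedy exploration over a simple linear reward model per action. This is one way to implement the idea, not Experiment Flow's actual algorithm; `features` is a vector like the one from the encoding sketch above:

// Sketch: epsilon-greedy contextual bandit with one linear model per action
// (illustrative only; constants and model choice are assumptions)
const EPSILON = 0.1;        // share of traffic reserved for exploration
const LEARNING_RATE = 0.05; // step size for online updates

const weights = {}; // actionId -> weight vector (one linear model per action)

function predict(action, features) {
  const w = (weights[action] ??= features.map(() => 0));
  return w.reduce((sum, wi, i) => sum + wi * features[i], 0);
}

function selectAction(actions, features) {
  if (Math.random() < EPSILON) {
    // Explore: occasionally serve a random action so the model keeps learning
    return actions[Math.floor(Math.random() * actions.length)];
  }
  // Exploit: serve the action with the highest predicted reward
  return actions.reduce((best, a) =>
    predict(a, features) > predict(best, features) ? a : best);
}

function update(action, features, reward) {
  // Nudge the served action's weights toward the observed reward (0 or 1)
  const error = reward - predict(action, features);
  weights[action] = weights[action].map(
    (wi, i) => wi + LEARNING_RATE * error * features[i]);
}

In practice you don't implement any of this yourself. A single API call asks Experiment Flow to rank the actions for the current visitor: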
// Rank actions for a visitor using contextual bandits
const response = await fetch('/api/personalize/123/rank', {
  method: 'POST',
  headers: {
    'X-API-Key': apiKey,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    context: {
      device: 'mobile',
      source: 'organic',
      country: 'US',
      returning: true
    },
    actions: ['free-trial', 'demo', 'pricing', 'case-study']
  })
})
// Returns actions ranked by predicted conversion for THIS visitor
// { "ranking": ["demo", "case-study", "free-trial", "pricing"] }
Personalization Use Cases
Dynamic Landing Pages
Serve different hero messages, images, and CTAs based on the visitor's context. A visitor arriving from a technical blog post might see "Explore our API docs" while someone from a business publication sees "Book a strategy call."
Product Recommendations
Rank products differently for each visitor based on their browsing history, demographics, and what similar visitors purchased. This goes beyond collaborative filtering by incorporating real-time context.
Email Personalization
Choose the best subject line, send time, and content variant for each subscriber. Contextual bandits learn which subscribers respond to urgency ("Last chance!") vs curiosity ("You won't believe...") vs utility ("How to...").
Pricing and Offers
Show different discount levels, payment plan options, or feature highlights based on the visitor's predicted price sensitivity and use case.
Contextual Bandits vs Segment-Based Testing
You might think: "I'll just run separate A/B tests for each segment." This approach has serious limitations:
- Combinatorial explosion: 3 devices × 5 sources × 4 regions = 60 segments, each needing its own test
- Thin traffic: Splitting into many segments means each one takes forever to reach significance
- Manual management: Someone has to set up, monitor, and act on 60+ tests
- No feature interaction: What if mobile + organic behaves differently than mobile + paid? Segments can't capture these interactions efficiently.
Contextual bandits handle all of this automatically. They discover which context features matter, capture interactions between features, and optimize allocation across the entire context space simultaneously.
Getting Started with Personalization
Experiment Flow's personalizer feature makes contextual bandits accessible without requiring ML expertise:
- Define your actions: The variants you want to personalize (CTAs, layouts, offers)
- Send context: Pass visitor attributes with each ranking request
- Track rewards: Report conversions so the model can learn (a sketch of this call follows below)
- Watch it improve: The model adapts over time, improving personalization with every interaction
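Sending context is the ranking call shown earlier. Reporting a reward might look like this; the /reward endpoint name and payload shape are assumptions for illustration, so check the Experiment Flow docs for the exact API:

// Report the outcome so the model can learn
// (endpoint and payload shape are illustrative, not a confirmed API)
await fetch('/api/personalize/123/reward', {
  method: 'POST',
  headers: {
    'X-API-Key': apiKey,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    action: 'demo', // the action that was actually served
    reward: 1       // 1 = converted, 0 = did not convert
  })
})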
The system uses embedding-based similarity to generalize across contexts it hasn't seen before. A visitor from Germany on mobile benefits from what the system learned about visitors from France on mobile, because the contexts are similar in the embedding space.
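Cosine similarity over those embeddings illustrates the effect. The vectors below are invented for the example, not real model output:

// Cosine similarity: nearby contexts get similar predictions
function cosine(a, b) {
  const dot = a.reduce((s, ai, i) => s + ai * b[i], 0);
  const norm = v => Math.sqrt(v.reduce((s, vi) => s + vi * vi, 0));
  return dot / (norm(a) * norm(b));
}

// Hypothetical context embeddings (made up for illustration)
const germanyMobile = [0.82, 0.31, 0.45];
const franceMobile  = [0.79, 0.35, 0.41];
const usDesktop     = [0.12, 0.88, 0.20];

cosine(germanyMobile, franceMobile); // ~0.998: learning transfers
cosine(germanyMobile, usDesktop);    // ~0.51: little transfer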
Start with basic A/B testing, graduate to bandits for faster optimization, then level up to contextual bandits for true personalization. Each step builds on the last.
Measuring Personalization Impact
To measure the impact of personalization, run a meta-experiment: randomly assign some visitors to the personalized experience and others to a static control (your current best variant). The difference in conversion rates is the personalization lift—the value that context-aware optimization adds beyond finding a single best variant.
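A sketch of that setup follows; the 10% holdout, hashing scheme, and result counts are all illustrative choices:

// Deterministically assign ~10% of visitors to the static control
function inControlGroup(visitorId, holdout = 0.1) {
  let h = 0;
  for (const ch of visitorId) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h % 100 < holdout * 100; // same visitor always lands in the same group
}

// Later, with made-up results:
const control = { conversions: 120, visitors: 2000 };        // static best variant
const personalized = { conversions: 1242, visitors: 18000 }; // bandit-served

const pControl = control.conversions / control.visitors;                // 0.060
const pPersonalized = personalized.conversions / personalized.visitors; // 0.069
const lift = (pPersonalized - pControl) / pControl;                     // 0.15, a 15% lift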
In our experience, personalization lift typically ranges from 5% to 30% above an already-optimized baseline, with the highest gains in businesses with diverse visitor segments.
Ready to optimize your site?
Start running experiments in minutes with Experiment Flow. Free plan available.