(Editor’s note: This post is the final in a four-part series that discusses experimentation at GoDaddy. You can read part one here, part two here, and part three here.)
Over the past year, we’ve explored how GoDaddy became a learning organization. In this final article of our 2025 series, we look at how GoDaddy scales experimentation and the mindset, systems, and metrics that turn learning into a company-wide capability.
As we outlined earlier, scaling A/B testing isn’t about running more experiments; it’s about creating the conditions for teams to learn faster while protecting customer trust. This approach has already contributed $1.6 billion in revenue growth.
Journey to scale
GoDaddy’s experimentation foundation took shape over several years. Scaling A/B testing has been a gradual effort shaped by systems, habits, and technology.
2021: Foundations
GoDaddy ran about 1,000 experiments, establishing core workflows while migrating scorecards to Hivemind, our internal experimentation platform. Early capabilities included segmentation, new metrics, and the first sample-size calculator, beginning the democratization of experimentation tools and best practices.
2022: Building the platform
Experiment volume rose to 1,300+ as Hivemind became the primary configuration and reporting engine, replacing SplitIO. Teams gained standardized templates, custom and guardrail metrics, and dashboards tracking experiment health and throughput.
2023: Introducing intelligence
Volume exceeded 1,700 experiments. Hivemind expanded with automated badging, segmentation insights, AI-powered hypothesis suggestions, experiment summaries, and stronger integration across platforms. Peer reviews and showcase rituals matured, embedding experimentation deeper into workflows.
2024: Experimentation everywhere
GoDaddy surpassed 2,500 experiments. New features included the Continuous Deployment Environment (CDE) for measuring full-release impacts, a unified feature flag store, an adjacency model, an AI agent, and Company Experiment Metrics. Experimentation expanded to org, team, and portfolio-level tracking.
2025: Scaling with intelligence
The focus has shifted from scaling volume to scaling insight. Hivemind’s AI surfaces hypotheses, flags experiment overlaps, and connects results across journeys. GoDaddy’s experimentation program has evolved into a self-improving learning engine, shortening the time from idea to measured impact.
Operating model
Experimentation at GoDaddy is driven by people and the way we work together across three interconnected layers, each with its own focus and feedback loops. Together, these create a system that supports experimentation from company vision to daily execution. The following sections describe the experimentation operating model.
Culture (company level): learning as a system
Rituals like the Experimentation Showcase and OKR alignment make insights visible and celebrated. Teams are encouraged to pursue bold ideas without fear of being wrong; the focus is on learning, not winning. Leadership reinforces curiosity and transparency through Company Experiment Health Metrics.
Collaboration (program level): teams in sync
Shared templates, unified metrics like incremental gross cash receipts (iGCR), and Opportunity Solution Trees (OSTs) help teams learn in the same language. Peer reviews and cross-team forums spread best practices and avoid duplication. Programs provide governance and clarity, connecting local experiments to strategic priorities.
Execution (squad level): experiments in action
Squads form around hypotheses, run rapid tests, and iterate quickly. Standardized lifecycles ensure every experiment meets shared quality and data-integrity standards.
At GoDaddy, we encourage every employee to invent, explore, and solve problems to improve our products for customers. Psychological safety is at the heart of this culture: when learning is the goal, not winning, teams feel free to take bold bets.
The following image shows GoDaddy’s internal experimentation platform, Hivemind:

What scaling means for GoDaddy
Our rituals guide responsible, high-velocity experimentation at scale. The following five principles balance speed, quality, and trust:
- Velocity with rigor: Learn faster without compromising statistical or ethical standards.
- Parallelization with control: Run multiple experiments safely through traffic governance.
- Unified metrics and visibility: Standardize reporting for clarity at all levels.
- Quality by design: Build best practices into the platform through automation and peer review.
- Portfolio impact: Connect experiments to company-wide outcomes, not just local wins.
Platform enablers
These technical capabilities make scaling experimentation possible.
| Scale Lever | Description | Business Impact | Customer Impact | Engineering Lift |
|---|---|---|---|---|
| CDE and feature flags | Visibility into release impact and the ability to evaluate changes without full A/B tests | Faster, safer launches; dynamic risk mitigation | Reliable, low-friction releases | SDK, monitoring, alerting, unified UI |
| Adjacency model and AI agent | Surface experiment ideas and streamline setup | Accelerated ideation; reduced manual effort | More relevant, impactful experiments | Auto-scaling, API integration |
| Company experiment metrics | Standard measurement framework | Portfolio-level impact tracking, P&L visibility | Prevents negative surprises | Automated reporting, portfolio management |
| Parallelization and traffic governance | Coordinate experiment traffic | Faster decision-making | Stable customer experience | Allocation services, exclusion lists |
| Unified metrics and guardrails | Consistent measurement standards | Focus on durable value | Prevents regressions | Shared metric layer and alerting |
| Quality scoring and peer reviews | Quality signals and shared review processes | More conclusive tests | Safer, clearer changes | Templates and reviewer workflows |
| Post-rollout causal analysis | Reveal true impact after launch | Confident scale-ups and rollbacks | Fewer long-tail issues | Automated impact detection |
These enablers provide the foundation, but scale emerges from how teams put them into practice through systemization.
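To make the traffic governance row more concrete, here is a minimal Python sketch of deterministic bucketing with a mutual-exclusion group. It is an illustration under stated assumptions, not a description of how Hivemind allocates traffic: the salt, experiment names, and bucket ranges are all hypothetical, and hashing a user ID is simply one common way to get stable assignments.

```python
import hashlib
from typing import Dict, Optional, Tuple

def bucket(user_id: str, salt: str, buckets: int = 1000) -> int:
    """Map a user to a stable bucket in [0, buckets) using a salted hash."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:15], 16) % buckets

# Hypothetical exclusion group: experiments that must never overlap share one
# salt and own disjoint bucket ranges. Buckets 800-999 stay unallocated.
CHECKOUT_GROUP: Dict[str, object] = {
    "salt": "checkout-layer",
    "ranges": {"exp_a": (0, 400), "exp_b": (400, 800)},
}

def assign(user_id: str, group: Dict[str, object]) -> Optional[Tuple[str, str]]:
    """Return (experiment, variant) for the user, or None if not enrolled."""
    b = bucket(user_id, str(group["salt"]))
    for name, (low, high) in group["ranges"].items():
        if low <= b < high:
            # A second hash, salted by experiment name, splits control/treatment.
            variant = "treatment" if bucket(user_id, name, 2) else "control"
            return name, variant
    return None

if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign(uid, CHECKOUT_GROUP))
```

Because each user falls into exactly one bucket of the shared layer, a user can be enrolled in at most one experiment of the group, and the same user always gets the same assignment.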
Systemization
Systemization defines the routines, checkpoints, and shared practices that make experimentation reliable at scale. Platform features provide the tooling; systemization shapes the day-to-day behavior.
Lifecycle standards
Every experiment follows a shared lifecycle, from hypothesis pre-registration and power guidance, through guarded launch and sequential monitoring, to post-rollout causal checks. These steps keep insights rigorous while allowing teams to explore freely.
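Power guidance usually starts with a sample-size estimate, which in turn gives a rough experiment duration. The sketch below uses the textbook normal-approximation formula for comparing two proportions. It is an assumption for illustration rather than a description of Hivemind’s calculator, and the baseline rate, lift, and traffic figures are invented.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(baseline_rate: float, relative_lift: float,
                        alpha: float = 0.1, power: float = 0.8) -> int:
    """Approximate users per arm for a two-sided two-proportion z-test.

    n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

def duration_days(n_per_arm: int, daily_visitors: int, arms: int = 2) -> int:
    """Days needed to fill every arm, assuming all daily traffic is eligible."""
    return math.ceil(n_per_arm * arms / daily_visitors)

# Hypothetical inputs: 5% baseline conversion, 10% relative lift to detect,
# 5,000 eligible visitors per day split across two arms.
n = sample_size_per_arm(baseline_rate=0.05, relative_lift=0.10)
days = duration_days(n, daily_visitors=5000)

if __name__ == "__main__":
    print(f"{n} users per arm, about {days} days")
```

Tightening alpha or asking for a smaller detectable lift increases the required sample, which is why a duration estimate is worth making before launch rather than after.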

Quality scoring
Experiments earn quality badges (Bronze, Silver, Gold, Platinum) based on completeness and design quality. Lightweight peer reviews make quality a collaborative habit.
| Level | Requirements |
|---|---|
| Bronze | Configured in Hivemind and analyzed using a Hivemind scorecard |
| Silver | Bronze + documented hypothesis, decision/guardrail metrics, reasonable limits, alpha ≤ 0.2 |
| Gold | Silver + frequentist design with alpha ≤ 0.1 |
| Platinum | Gold + includes experiment duration calculation |
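The badge ladder above is cumulative, so it can be read as a sequence of checks. The following Python sketch encodes the table under that reading. The record fields are hypothetical names chosen to mirror the requirements; this is not Hivemind’s automated badging logic.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExperimentRecord:
    # Hypothetical fields that mirror the requirements table.
    configured_in_hivemind: bool
    analyzed_with_scorecard: bool
    has_hypothesis: bool = False
    has_decision_and_guardrail_metrics: bool = False
    has_reasonable_limits: bool = False
    alpha: Optional[float] = None
    frequentist_design: bool = False
    has_duration_calculation: bool = False

def quality_badge(exp: ExperimentRecord) -> Optional[str]:
    """Return the highest badge whose requirements are all met, else None."""
    if not (exp.configured_in_hivemind and exp.analyzed_with_scorecard):
        return None
    silver = (
        exp.has_hypothesis
        and exp.has_decision_and_guardrail_metrics
        and exp.has_reasonable_limits
        and exp.alpha is not None
        and exp.alpha <= 0.2
    )
    if not silver:
        return "Bronze"
    # Gold tightens alpha to 0.1 and requires a frequentist design.
    if not (exp.frequentist_design and exp.alpha <= 0.1):
        return "Silver"
    if not exp.has_duration_calculation:
        return "Gold"
    return "Platinum"

if __name__ == "__main__":
    example = ExperimentRecord(True, True, True, True, True,
                               alpha=0.1, frequentist_design=True)
    print(quality_badge(example))
```

Each level requires the one below it, so an experiment with a duration calculation but no documented hypothesis still stops at Bronze.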
Information flow
Shared summaries, dashboards, and automated readouts help insights travel quickly, forming a company-wide feedback loop.

AI-enabled coaching
Hivemind’s AI agent guides experiment authors in real time, improving design and analysis within their workflow.

Celebrating curiosity
Showcases, storytelling, and voting leaderboards make learning visible and reinforce an insight-over-outcome culture.

Tracking metrics
Company Experiment Health Metrics track speed and quality at every level. After GoDaddy invested in quality and platform capabilities, inconclusive results fell and win rates rose from under 5% to over 30% in two years.

Internal tracking
Internal tracking shows whether scaling is working by turning our metrics into a clear view of how experimentation performs across all levels of the organization.
The following table summarizes how we track experimentation health at the company, program, and squad levels.
| Level | What we measure |
|---|---|
| Company level | – Total experiments run – Time-to-decision (median) – Controlled vs. non-controlled ratio and conclusive results – iGCR impact and guardrail compliance – % of positive, negative, and neutral learnings – Insight reuse across business units |
| Program level | – Exploration vs. optimization balance – Coverage across journeys and surfaces – Quality ladder distribution – Peer review participation – Collaboration through shared templates and initiatives |
| Squad level | – % of experiments using Hivemind features – Use of AI-enabled setup and causal analysis tools – Squad participation in showcases – Growth of self-service experimentation |
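A few of the company-level measures in the table reduce to simple aggregations over experiment records. The sketch below shows one possible way to compute them in Python. The record schema and the sample values are invented for illustration and do not reflect GoDaddy’s data or reporting code.

```python
from collections import Counter
from statistics import median
from typing import Dict, List

# Hypothetical records: days from launch to decision, the learning outcome,
# and whether the result was statistically conclusive.
EXPERIMENTS: List[Dict[str, object]] = [
    {"days_to_decision": 7, "outcome": "positive", "conclusive": True},
    {"days_to_decision": 14, "outcome": "neutral", "conclusive": False},
    {"days_to_decision": 21, "outcome": "negative", "conclusive": True},
    {"days_to_decision": 10, "outcome": "positive", "conclusive": True},
    {"days_to_decision": 30, "outcome": "neutral", "conclusive": False},
]

def health_summary(experiments: List[Dict[str, object]]) -> Dict[str, object]:
    """Aggregate total volume, median time-to-decision, and outcome shares."""
    if not experiments:
        raise ValueError("need at least one experiment")
    total = len(experiments)
    outcomes = Counter(e["outcome"] for e in experiments)
    return {
        "total_experiments": total,
        "median_days_to_decision": median(e["days_to_decision"] for e in experiments),
        "conclusive_rate": sum(1 for e in experiments if e["conclusive"]) / total,
        "outcome_share": {
            kind: outcomes[kind] / total
            for kind in ("positive", "negative", "neutral")
        },
    }

if __name__ == "__main__":
    print(health_summary(EXPERIMENTS))
```

The median is used for time-to-decision, as in the table, because a few long-running experiments would otherwise dominate an average.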
Challenges
Scaling A/B testing across a large organization is never frictionless. Key challenges shaped our process and ultimately strengthened it.
Consistency across teams
Shared templates, peer reviews, and AI-guided setup helped standardize early variations in experiment design.
One challenge with A/B testing at GoDaddy is that it’s rarely as simple as flipping a switch. To run a clean test, teams need clear goals, good data, and coordination across a lot of moving parts. It can slow things down, but it’s also pushed us to work more closely together and build better habits around experimentation. The payoff is that when we do get a result, we can trust it and act on it with confidence. - Araz Javadov, Group Product Manager
Making metrics meaningful
Unified measurement turned fragmented visibility into alignment.
One of the biggest sources of friction our PMs faced was the overhead of sharing pre- and post-experiment results. Although hypotheses and data already lived in Hivemind, PMs still had to recreate them in decks, present live, and then post screenshots and links across multiple channels. Based on PM feedback, Hivemind was updated to include a high-level experiment summary with the hypothesis, before-and-after visuals, and results in a single view, along with direct Slack sharing. This reduced duplication and allowed PMs to spend less time packaging results and more time improving the customer experience. - Heather Stone, Director Product Management
Balancing speed and rigor
Lightweight reviews, quality scoring, and platform nudges enable teams to move fast without cutting corners.
At our scale, speed only matters if you can trust what you learn. The systems we built (templates, guardrails, and lightweight reviews) let teams move fast without cutting corners. That’s how experimentation becomes part of the day-to-day, not an all-hands sprint. - Alan Shiflett, GM, Domain Aftermarket and Specialty Brands
Conclusion
This year made something clear: scaling experimentation is less about the number of tests and more about the environment that supports learning. At GoDaddy, that environment strengthened as teams aligned around shared principles, trusted their systems, and treated every result as information that moves the organization forward.
If you’re building your own experimentation practice, start by creating clarity around how experiments are run and how learning is shared. Give teams the safety to explore and the tools to understand the impact of their work. When these pieces come together, experimentation stops being an activity and becomes a way of working.
As we look ahead, GoDaddy is exploring a new framework that brings human and AI-driven experimentation closer together, shortening the time between an idea and understanding its real impact. Stay tuned for more learnings next year!









