When Microsoft’s Bing engineering team decided to speed up its cadence of delivering code into production, it set out on a journey toward agility, with Continuous Delivery (CD) as a key destination.
With the goal of catching and surpassing its leading competitor, the Bing team had to rethink the way it worked.
Craig Miller, technical advisor for Bing, said the effort began four years ago with a system of monthly deployments, 100 engineers, two to three weekly incidents—production issues that occur—and a pretty standard waterfall development system. That had worked fairly well for the early generation of Bing. But to continue to chase Google, something had to change, drastically.
“So we stepped back and decided to go after an order of magnitude improvement in our system,” Miller told eWEEK. “I don’t mean code velocity, but to look at the big picture as well and at what we call ‘idea velocity’ here in Bing.”
According to a synopsis of the Bing team’s move to agility and agile methods on Microsoft’s Engineering Stories site, “When we began to make the leap to Continuous Delivery, we not only changed the way our developers write code—we fundamentally altered the way our business operates.”
The CD method is an agile software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time. It aims at building, testing and releasing software faster, more frequently and with higher quality.
“At its core, CD represents a decision to pursue development processes that result in steady incremental rather than comprehensive change,” said Charles King, principal analyst at Pund-IT. “If CD is implemented successfully, organizations can reduce the cost and time required to make changes, along with associated risks.”
That’s true for two reasons, he said. One is that making numerous, ongoing incremental changes tends to identify and correct potential problems more effectively, resulting in greater reliability and resiliency. The second is that since older features are altered and newer features are introduced more gradually in CD, it’s easier to identify which deliver actual value for customers and which fail to do so.
In a blog post on the Bing transformation, Dr. Jan Pedersen, Microsoft’s chief scientist for Bing and Information Platform R&D, said, “To accelerate feature deployment and innovation, we have invested much effort in overcoming software engineering challenges. The monthly deployment cadence has been gone for some time, taking with it both the old culture and most of the infrastructure. In its place, a highly distributed, parallelized, and agile system has risen, and this system has been a game-changer for both developers working on Bing features and the live site users. Bing Engineering has spearheaded this effort, and has produced a world-class ideation, development, validation, and experimentation system.”
That system helped Miller’s Bing engineering team scale from 100 engineers to 600, and from a cadence of monthly releases to daily releases.
“We evolved from monthly deployments to daily—which was our mantra, to now where we deploy 20 to 24 times a week,” Miller said. “That’s roughly four times a day. We have 600 engineers in the branch, and we have zero to occasionally one incident in a given week.”
Agility is the speed at which ideas can transition from the whiteboard to the keyboard to the live site and to users, Miller said.
To ensure compliance with its standards, the team built a core focus on quality into its process, demanding that engineers think about the effect of their changes before they go out to customers.
“When you’re this close to production …, when your code goes in and literally within six or seven hours, it’s fully rolled out worldwide, it actually connects the engineer to the customer much more closely than if you check in your code and a month later it shows up in production,” Miller said. “It gives that engineer a feeling of real power.”
Indeed, he noted that with Continuous Delivery, there is an inherent 24-hour cycle as part of the model.
“Within 24 hours you can write your code, you can get it deployed and you can get it live and experimented, and get feedback by the next day when you come in,” Miller said. “Every day you get a signal back from the users on the code you just put out, so that you can then alter your behavior. It may mean you have a bug that you missed, or the feature just isn’t working for the users. Or you see where you need to iterate and tweak a feature. That’s a powerful connection to the users.”
Instead of the term “fail fast” that is often used in discussions of agile development to describe the process of quickly learning what does not work, Miller said he prefers to use the term “learn fast.” Code velocity is not the ultimate goal, he said. The real goal is idea velocity and being able to deliver as many ideas through the system safely and with value to the customer at every opportunity.
Microsoft Move to Agility Hastens Bing’s Deployment Cadence
Microsoft uses a beltway analogy for its agility process, with an inner and an outer loop. The inner loop spans ideation through code commit and includes prototyping, crowd-sourced feature engagement and feasibility studies, the Microsoft post said. The outer loop gets committed code out to production.
The numbers are staggering. Microsoft’s Bing team deploys thousands of services more than 20 times per week, with over 600 engineers contributing to the code base. The team pushes more than 4,000 individual changes per week, where each code change submission goes through a test pass containing over 20,000 tests. “In short, agility has been a game-changer for Bing,” the Bing engineering post said.
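To put those figures in proportion, the short Python sketch below works through the arithmetic. It treats the quoted "more than" figures as round numbers and assumes that every change runs the full test pass exactly once and that deployments are spread over a five-day workweek. Those assumptions, and all of the names in the code, are illustrative and do not come from Microsoft.

```python
# Round figures quoted in the article (each was stated as "more than").
CHANGES_PER_WEEK = 4_000
TESTS_PER_CHANGE = 20_000
DEPLOYS_PER_WEEK = 20

# Assumption for illustration only: deployments happen on five workdays.
WORKDAYS_PER_WEEK = 5


def weekly_test_executions(changes: int, tests_per_change: int) -> int:
    """Total test runs per week if every change runs the full pass once."""
    return changes * tests_per_change


def changes_per_deploy(changes: int, deploys: int) -> float:
    """Average number of changes bundled into each deployment."""
    return changes / deploys


def deploys_per_workday(deploys: int, workdays: int) -> float:
    """Average number of deployments on each working day."""
    return deploys / workdays


if __name__ == "__main__":
    print(weekly_test_executions(CHANGES_PER_WEEK, TESTS_PER_CHANGE))
    print(changes_per_deploy(CHANGES_PER_WEEK, DEPLOYS_PER_WEEK))
    print(deploys_per_workday(DEPLOYS_PER_WEEK, WORKDAYS_PER_WEEK))
```

Under those assumptions, the team would run on the order of 80 million individual tests a week, with roughly 200 changes going out in each deployment.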
Because of that complexity, the move to Continuous Delivery was not simple, and not all of the obstacles were technical. The team had to overcome cultural issues as well. For instance, there were conflicting views on how to handle quality and testing, particularly with such a fast-moving process.
“People asked how can you validate the code to be able to push out this often,” Miller said. “How can you run your entire test suite? In our case, it’s 20 minutes. We run 20,000 tests in 20 minutes. Can you actually do that? People said we needed manual tests. We do not allow a single failure in our test. All 20,000 [tests] must run clean. If they don’t run clean, then that check-in is blocked. So you cannot have automation flakiness. We have no manual tests. If you can’t automate it, then you have a problem with your feature.”
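The rule Miller describes, in which a single failing test blocks the check-in, can be sketched in a few lines of Python. This is a minimal illustration rather than Bing's actual tooling: the `CaseResult` type and the `checkin_allowed` and `failed_tests` functions are hypothetical names, and the decision to block a check-in when no tests ran at all is an added assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CaseResult:
    """Outcome of one automated test (hypothetical type for this sketch)."""
    name: str
    passed: bool


def failed_tests(results: Iterable[CaseResult]) -> List[str]:
    """Return the names of every test that did not pass."""
    return [r.name for r in results if not r.passed]


def checkin_allowed(results: Iterable[CaseResult]) -> bool:
    """Allow a check-in only if every test in the pass ran clean.

    Assumption: an empty test pass is treated as a failure, because
    nothing was actually validated.
    """
    results = list(results)
    return bool(results) and not failed_tests(results)


if __name__ == "__main__":
    clean_pass = [CaseResult(f"test_{i}", True) for i in range(20_000)]
    print(checkin_allowed(clean_pass))

    # One failure among 20,000 tests is enough to block the check-in.
    one_failure = clean_pass[:-1] + [CaseResult("test_flaky", False)]
    print(checkin_allowed(one_failure), failed_tests(one_failure))
```

A gate with no tolerance for failure only works if the tests themselves are deterministic, which is why Miller stresses that automation flakiness cannot be allowed.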
Meanwhile, Microsoft provides tooling that lets Bing engineers get feedback on their ideas from external users within a few minutes. Experiments are sent to hundreds of people for feedback. Microsoft has its own crowdsourcing platform, with a panel of several thousand external people, so feedback from the pool usually comes back within two hours. The platform also lets engineers set up visual experiments, rather than written questions, without writing any code, the Bing Engineering post said.
However, “I think observation unlocks more truth than just listening,” Miller told eWEEK. “That is, we look at what the users actually do. We take their feedback as well, but we really want to look at behaviors and see how we can evaluate [them]. In idea velocity, idea generation, testing at scale, rapid deployment and experimentation are the four core elements to deliver end-to-end ideas with value to your customers.”
As far as tooling and support for the move to CD, Miller said the team relied on the underpinnings of Microsoft’s Azure cloud platform. They started by leveraging a cloud-based build system, as well as using Azure and Microsoft’s Test Authoring and Execution Framework (TAEF) to build out a custom, highly parallelized and distributed feature-validation system, the Bing post said.
And to ensure that there are plenty of ideas to fuel the changes that are constantly being added to Bing, Microsoft encourages creativity from all engineers through things like Growth Hacks, Hack Days and the Bingcubator—where engineers can pitch ideas and get them funded.
“Within the organization, it’s kind of like [venture capital] funding,” Miller said of the Bingcubator. “You can say [that] I have this idea I really want to do and put some data together and you come and talk for three minutes, and we decide if we should invest in it. It’s an opportunity for individual engineers and product managers to say we should do this and the work couldn’t be funded within individual teams for various reasons. It’s a way to encourage this culture of creativity.”
Miller also noted that a key factor influencing Microsoft’s move to Continuous Delivery for Bing development was the group’s challenger mentality.
He said the team knows full well, “We’re not the top dog; we have to be scrappy, we have to find ways to really be the best. And because we’re relatively small compared to our largest competitor, we needed to do things faster and more efficiently than they could. We didn’t have the levels of income that they had. And that does alter the core culture that you have and the people you attract. There are people who come here who want to go after the big dog. It’s pervasive in us that that challenger mind-set is a core attribute that we have, and we continue to try to foster it.”
Ironically, however, the move to CD has not taxed the team any more than the old way of working. In fact, the team’s work/life balance is better, Miller said.
“Our work/life balance went from a number I’m not that proud of to where twice as many people said their work/life balance was better when we adopted continuous delivery,” he said. “And that’s counterintuitive, because people think that means you’re on all the time and your code is constantly in production. But in fact it removes this whole element of debating whether you take this code fix or this check-in or not. It removes debates about whether [something] is an important feature or not.”