It is not enough to just stand up a web service that can make predictions.
In a 2017 SAS survey, 83% of respondents said they had made moderate-to-significant investments in big data, but only 33% said they had derived value from those investments. More recent surveys have shown similar results. We have found that the main reason for this gap is a failure to understand the full scope of what is required to operationalize predictions in a way that truly benefits your business. In this post, I’d like to walk you through a sample scenario and show how success requires a process that ensures alignment with the affected people, keeps focus on the business value to be derived, and iteratively builds the technology platform.
The process for operationalizing a model starts before the first data frame is loaded by your data science team. A high-functioning data science team will generate ideas alongside the business with a vision of the business value that will be derived. For instance, it is not enough to simply say “We want to predict customer churn”. A better business hypothesis might be, “We want to save $720K per year by preventing 10% of the customers who are most likely to churn from doing so by sending them targeted promotions”. When prioritizing the backlog of work, the data science team should focus on those ideas that have the highest potential ROI. This task, as my colleague Sean McCall pointed out, should fall to the Data Science Product Owner. You can find an example calculation for customer churn below:
($300 cost of acquisition (measured)
– $100 cost of promotion (assumed))
* 30% success of retaining with promotion (assumed)
* 10% of churning customers targeted (from the hypothesis)
* 10,000 customers churning per month (measured)
= $60K / month
= $720K / year
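A quick way to keep a valuation like this honest is to write it down as code, so every assumption is explicit and easy to revisit when the measured numbers change. The sketch below uses only the hypothetical figures from this example:

```python
# Hypothetical churn-retention valuation. All inputs come from the
# example above; "assumed" values should be revisited once measured.
COST_OF_ACQUISITION = 300      # $ per customer (measured)
COST_OF_PROMOTION = 100        # $ per promotion (assumed)
RETENTION_SUCCESS_RATE = 0.30  # fraction of promotions that retain (assumed)
TARGETED_FRACTION = 0.10       # top 10% of churners targeted (hypothesis)
MONTHLY_CHURN = 10_000         # customers churning per month (measured)

value_per_save = COST_OF_ACQUISITION - COST_OF_PROMOTION
monthly_value = (value_per_save
                 * RETENTION_SUCCESS_RATE
                 * TARGETED_FRACTION
                 * MONTHLY_CHURN)
annual_value = monthly_value * 12

print(f"${monthly_value:,.0f} / month")  # $60,000 / month
print(f"${annual_value:,.0f} / year")    # $720,000 / year
```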
Once the data science team starts to develop models to support a business idea, they should start with the simplest possible model and share the results regularly with the teams most affected, to collaborate with them on improving the model and decide together when the model is good enough. This is where our earlier rough valuation comes to the rescue. Talking to your stakeholders about RMSE, eigenvectors, and ROC curves is likely to make their eyes glaze over. By translating the accuracy of what you’re trying to predict back to the business value statement, you’ll be speaking the same language. “We’re able to predict customer churn with 75% accuracy” sounds much better when followed up with, “which will save us about $45K / month based on current assumptions”, especially when you actually start operationalizing the model in the next step.
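One way to make that translation concrete is a tiny helper like the one below. It assumes, as a deliberate simplification for stakeholder conversations, that realized savings scale linearly with model accuracy; the function name and the $60K full-value figure are taken from this example, not from any standard formula:

```python
def expected_monthly_savings(accuracy: float,
                             full_value_per_month: float = 60_000) -> float:
    """Rough business-value translation: scale the idealized monthly
    value of the churn model by its measured accuracy. Linear scaling
    is a simplifying assumption, not a precise economic model."""
    return accuracy * full_value_per_month

print(expected_monthly_savings(0.75))  # 45000.0 -> "about $45K / month"
```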
When the team behind a model makes the decision to start running it in production, they should also understand the cost to build and run the production version (including the data platform to feed input data, the model prediction endpoint, and the monitoring of the model service). This, along with the cost of the Data Scientists above, provides the cost side of the ROI equation. The details for developing this estimate are beyond the scope of this article, so let’s just assume the cost is $180K. At that cost, the solution should pay off in 3 months and provide a 1-year ROI of $540K. This is a great return, so we start the process of automating data feeds, deploying the model behind a web service, generating predictions, and integrating the prediction results into new and existing applications to make them easy to use. Because we were working with the end users of the predictions the entire time, we know which details they want to see to help build trust in the predictions.
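The cost side of the equation can be folded into the same back-of-the-envelope style. This is a minimal sketch, assuming the full $60K/month savings rate and the $180K build cost from this example:

```python
def payback_and_roi(build_cost: float, monthly_savings: float):
    """Months to recoup the build cost, and net return after one year.
    Inputs are the hypothetical figures from this example."""
    payback_months = build_cost / monthly_savings
    one_year_roi = monthly_savings * 12 - build_cost
    return payback_months, one_year_roi

months, roi = payback_and_roi(build_cost=180_000, monthly_savings=60_000)
print(months, roi)  # 3.0 540000
```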
Now that we’ve gotten to the point where we’re actually operationalizing the model, we still have to convince people to use it. In our previous example, let’s assume one way to offer our targeted promotions is to hand-write notes to customers. First, we’ll need to convince someone to provide the budget to set up a note-writing department (maybe we invest in one of those Turry handwriting robots from Robotica). If we don’t hand it all over to the robots, we will need to keep our human note-writing team motivated by showing that their work is having an impact. We should develop a tracking system to see if the promotions within our notes are actually used and what percentage of the customers who use them are still with the company. This will allow us to validate the business hypothesis and help inform other decisions down the line.
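The core of such a tracking system boils down to two numbers: how many promotions were redeemed, and how many redeemers stayed. Here is an illustrative sketch, assuming hypothetical per-customer records (the `PromotionRecord` structure and field names are inventions for this example, not part of any real system):

```python
from dataclasses import dataclass

@dataclass
class PromotionRecord:
    customer_id: str
    redeemed: bool       # did the customer use the promotion?
    still_active: bool   # is the customer still with us at follow-up?

def promotion_metrics(records):
    """Redemption rate, and retention rate among redeemers -- the number
    needed to validate the 30% retention assumption in the hypothesis."""
    redeemers = [r for r in records if r.redeemed]
    redemption_rate = len(redeemers) / len(records) if records else 0.0
    retained = sum(1 for r in redeemers if r.still_active)
    retention_rate = retained / len(redeemers) if redeemers else 0.0
    return redemption_rate, retention_rate

sample = [
    PromotionRecord("a", True, True),
    PromotionRecord("b", True, False),
    PromotionRecord("c", False, False),
    PromotionRecord("d", True, True),
]
print(promotion_metrics(sample))  # (0.75, 0.6666666666666666)
```

Feeding the measured retention rate back into the original valuation closes the loop between the business hypothesis and what actually happened.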
So now that we’ve built a great model hosting system and we’re measuring 75% of the expected retention (people always seem to overestimate these types of assumptions, but that’s OK since we’re still on track to save $405K), the team has likely moved on to build other high-value models. One key step that is often overlooked is building out the common operations that occur across multiple models into a platform or framework. In order to operationalize models at any sort of scale, the underlying data platform must be flexible enough to enable rapid experimentation, but also scalable enough to process transformations and facilitate predictions over a large volume of data very quickly. The cost of these platform investments can be justified by amortizing it across the ROI of several models.
As this process repeats, the platform becomes more robust, the team becomes better at estimating value and building trust with end users, and we’re able to truly operationalize models at scale. The team starts churning out valuable applications, features, and insights based on their predictive models faster, and the broader business begins to trust the results more quickly.
Then, one day 6 months down the line, the alarms go off that the churn model is not working anymore. The accuracy has declined to 50%, so it’s essentially just guessing. The data science team has to scramble to find out what is wrong and fix it. What has changed in the input data? When was the last time we deployed that model? Did a patch on the underlying machine break something? It would be really nice to get back to the exact state when we trained the model so we can determine what has changed. Eventually, someone finds and corrects the problem (the metric for usage was being set to 0 for all customers due to an errant deployment). Incidents like this cause the team to focus on building proactive debugging tools like anomaly detection on key features and prediction outcomes to detect and isolate problems faster.
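One lightweight form of those proactive checks is to compare each incoming feature’s statistics against a baseline captured at training time and alert on large deviations. The sketch below is illustrative rather than a complete monitoring system, but a check like this would have caught the all-zero usage metric immediately:

```python
import math
import statistics

def feature_drift_alerts(baseline, current, threshold=3.0):
    """Flag features whose current mean deviates from the training-time
    mean by more than `threshold` baseline standard deviations.
    `baseline` maps feature name -> (mean, stdev) captured at training;
    `current` maps feature name -> list of recent observed values."""
    alerts = []
    for name, (mean, stdev) in baseline.items():
        cur_mean = statistics.mean(current[name])
        if stdev:
            z = abs(cur_mean - mean) / stdev
        else:
            z = 0.0 if cur_mean == mean else math.inf
        if z > threshold:
            alerts.append(name)
    return alerts

# Baseline stats captured when the churn model was trained (hypothetical):
baseline = {"usage": (42.0, 5.0), "tenure_months": (18.0, 6.0)}
# The errant deployment zeroed the usage metric for all customers:
current = {"usage": [0.0, 0.0, 0.0, 0.0],
           "tenure_months": [17.0, 20.0, 16.0, 19.0]}
print(feature_drift_alerts(baseline, current))  # ['usage']
```

The same idea applies to the model’s outputs: tracking the distribution of predicted churn scores over time surfaces silent failures long before accuracy metrics catch up.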
At this point, your organization is capable of building and deploying high-value applications of machine learning at scale while minimizing risk. This is what we at Pariveda call “Machine Learning as a Business Capability”. While the story above was fictional, it was based on our experience working with many clients across all stages of their machine learning journey. This is why we have partnered with AWS and Domino Data to provide a Solution Space offering that can enable your team with both a technology platform to support experimentation and ModelOps and the guidance / implementation support on the architecture, processes, and tools to enable your team to leverage Machine Learning as a Business Capability.