Across business use cases and verticals, engineers and leaders are constantly discussing the value AI can bring, and the opportunities often seem endless. AI can predict your interests, the people you know, or your next job.

However, we often overlook the steps that must be taken to run AI-powered systems at scale. Deploying AI can be costly in terms of talent, compute resources, and time, and to fully unleash the wave of innovation that AI promises, developers must be properly empowered and equipped. In fact, many of the key elements needed for successful AI implementation have less to do with algorithm particulars and more to do with the tooling and processes in place around them.

Several of these tools and processes revolve around standardizing the most frequent workflows. This can take the form of something as simple as a spreadsheet listing common features, or as sophisticated as a full AI developer platform. As we’ve scaled our AI efforts at LinkedIn, we’ve gradually built toward the latter, creating our “Productive Machine Learning” (“Pro-ML” for short) program to improve developer productivity and efficiency.

Here are a few key takeaways and tips for organizations of any size that we’ve accumulated through this work.

Clean data in, smart insights out

A prerequisite to deploying AI is a thorough understanding of your data. The performance of an AI model is intrinsically tied to the data it’s trained on, so it’s important to know you have clean data to work with. Then, in choosing which datasets to use for training, it’s helpful to collaborate with your business partners to understand the ultimate business goal. For instance, if you want to “increase engagement” with a news feed, do you measure that by the click-through rate for articles and posts, or by the rate of “likes” or comments on posts? By jointly determining the best data to use to support clear business goals, you’ll design a more effective model.
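To make the “increase engagement” question concrete, here is a minimal Python sketch comparing two feed variants under different definitions of engagement. The `FeedStats` class, the variant names, and all of the counts are hypothetical and invented purely for illustration; they are not LinkedIn data or a LinkedIn API.

```python
from dataclasses import dataclass


@dataclass
class FeedStats:
    """Hypothetical aggregate counts for one feed variant."""
    impressions: int
    clicks: int
    likes: int
    comments: int


def rate(events: int, impressions: int) -> float:
    """Events per impression; 0.0 when there were no impressions."""
    return events / impressions if impressions else 0.0


def engagement_metrics(stats: FeedStats) -> dict:
    """Compute several candidate definitions of 'engagement'."""
    return {
        "click_through_rate": rate(stats.clicks, stats.impressions),
        "like_rate": rate(stats.likes, stats.impressions),
        "comment_rate": rate(stats.comments, stats.impressions),
    }


# Made-up numbers: variant A draws more clicks, variant B more likes and comments.
variant_a = FeedStats(impressions=1000, clicks=120, likes=10, comments=2)
variant_b = FeedStats(impressions=1000, clicks=80, likes=30, comments=9)

metrics_a = engagement_metrics(variant_a)
metrics_b = engagement_metrics(variant_b)

for name in metrics_a:
    winner = "A" if metrics_a[name] > metrics_b[name] else "B"
    print(f"{name}: A={metrics_a[name]:.3f} B={metrics_b[name]:.3f} -> {winner}")
```

In this invented example, variant A wins on click-through rate while variant B wins on likes and comments, so the “better” variant depends entirely on which metric the team and its business partners agreed on up front.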

Another factor to consider when selecting training data is how it is labeled. Does the data have sufficient context to be fed directly into a model, or does it require annotation? In the case of the latter, it’s important to create a “code book” or “run book” that sets standards for how data should be classified. I once worked with a small team of experts seeking to label a data set by hand, and when we evaluated the finished product, we realized that the agreement rate among them was less than . This means that the expert annotators didn’t agree with each other at all, and there is no reason to expect a model trained on such data will perform acceptably. If experts can’t agree on how data should be labeled, it’s unrealistic to expect that annotators with a service like CrowdFlower (now Figure Eight) will be able to do so effectively.
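One way to check annotator agreement before training is sketched below in Python. It computes the raw agreement rate between each pair of annotators, along with Cohen’s kappa, a standard statistic that corrects raw agreement for the agreement expected by chance. The annotator names and labels are hypothetical, and the article does not say which agreement measure the team actually used, so treat this as an illustration under those assumptions rather than a description of that project.

```python
from collections import Counter
from itertools import combinations


def percent_agreement(a, b):
    """Fraction of items on which two annotators chose the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability both pick the same label independently.
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


def pairwise_report(annotations):
    """Map each annotator pair to (percent agreement, Cohen's kappa)."""
    return {
        (first, second): (
            percent_agreement(annotations[first], annotations[second]),
            cohens_kappa(annotations[first], annotations[second]),
        )
        for first, second in combinations(sorted(annotations), 2)
    }


# Hypothetical labels from two annotators over the same six items.
annotations = {
    "annotator_1": ["spam", "spam", "ok", "ok", "spam", "ok"],
    "annotator_2": ["spam", "ok", "ok", "ok", "spam", "spam"],
}

report = pairwise_report(annotations)
for pair, (agreement, kappa) in report.items():
    print(pair, f"agreement={agreement:.2f}", f"kappa={kappa:.2f}")
```

Running a check like this on a small pilot batch, before labeling the full dataset, can show whether the code book needs tightening while the fix is still cheap.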