Making sense of Microsoft’s graph database strategy


It’s taken some time, but Microsoft’s $26 billion purchase of LinkedIn is finally producing interesting results, with LinkedIn data appearing in tools like Outlook. It’s the first sign of Microsoft using the social network’s relationship graph, the complex data set that was the reason for one of Microsoft’s biggest Silicon Valley acquisitions.

Under the hood, a social network like LinkedIn is nothing more than a huge NoSQL graph database, using a schema-less approach to managing semistructured data. Each node in the graph is an individual, with all his or her profile data. Each node is linked to others, tens or hundreds for people with a few connections, thousands for highly connected individuals. Queries traverse those connections, letting you find all the people you know working on AI, or who are based in Ontario, or who used to work at LinkedIn.
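To make the idea of traversing connections concrete, here is a minimal in-memory sketch. It is not how LinkedIn stores or queries its graph; the people, profile fields, and function names are all hypothetical, and a breadth-first search over adjacency sets stands in for a real graph query engine.

```python
from collections import deque

# Hypothetical nodes: each person carries schema-less profile data.
profiles = {
    "me":  {"skills": {"databases"}, "location": "Seattle", "past_employers": set()},
    "ana": {"skills": {"AI"}, "location": "Ontario", "past_employers": {"LinkedIn"}},
    "ben": {"skills": {"sales"}, "location": "Ontario", "past_employers": set()},
    "cho": {"skills": {"AI"}, "location": "London", "past_employers": {"LinkedIn"}},
    "dev": {"skills": {"AI"}, "location": "Ontario", "past_employers": set()},
}

# Hypothetical edges: undirected connections stored as adjacency sets.
connections = {
    "me": {"ana", "ben"},
    "ana": {"me", "cho"},
    "ben": {"me"},
    "cho": {"ana", "dev"},
    "dev": {"cho"},
}


def reachable_within(start, max_hops):
    """Breadth-first traversal: map each person within max_hops of start to their distance."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in connections.get(person, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[person] + 1
                queue.append(neighbor)
    del hops[start]  # the starting person is not their own contact
    return hops


def find_contacts(start, predicate, max_hops=1):
    """Return the sorted names of people near start whose profile satisfies predicate."""
    return sorted(p for p in reachable_within(start, max_hops) if predicate(profiles[p]))


works_on_ai = lambda profile: "AI" in profile["skills"]
in_ontario = lambda profile: profile["location"] == "Ontario"

print(find_contacts("me", works_on_ai))              # direct connections only
print(find_contacts("me", works_on_ai, max_hops=2))  # include friends of friends
print(find_contacts("me", in_ontario))
```

Raising `max_hops` widens the search from direct connections to friends of friends, which is where a graph model starts to pay off over joining relational tables.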

Graph databases everywhere: Microsoft Graph, Common Data Service, Cosmos DB, and Security Graph

Microsoft’s interest in graph-based data is clear. CEO Satya Nadella described the Office 365 APIs, the foundation of what’s now called the Microsoft Graph, as the company’s “most important” bet. It’s certainly a very powerful tool, and opening it up to everyone lets organizations explore how their internal teams evolve and how corporate knowledge is stored in documents and conversations, along with the tools to expose that information and make it usable.

There’s a lot of data in the Microsoft Graph, with tools both for consumer information and for business information. Elements associated with Microsoft accounts, like the new Activity Stream and the Device Graph, are the basis for device-roaming features (similar to Apple’s iCloud account-based Handoff capability in iOS), which Microsoft is encouraging Universal Windows Platform (UWP) developers to build into their code as part of the upcoming Windows Timeline feature.

  • Cosmos DB, which builds on a JSON document database with different API sets, including one for developing and managing your own graph databases at scale.

  • Although not completely public, Microsoft’s Security Graph is used to assess and manage threats, and it is exposed to your apps through tools like Azure Active Directory’s conditional-access feature.

Microsoft’s different approach: Querying multiple graphs

Where things get interesting is in running graph queries across multiple graphs and using them to extract insights that can help drive business decisions. I’ve often talked about the idea of “right-time information”: the right information at the right time delivered to the right people so they can make the right decision for the right business outcome. Being able to query the edges of a graph, rather than only its nodes, lets you understand the relationships between items, a key factor in delivering the type of information support a modern business needs.

By supporting multiple graphs, Microsoft is offering an alternative to traditional database-driven decision-support tools. By mixing internal staff and document data on the Microsoft Graph, external relationships via LinkedIn, core business information in the Dynamics 365 Common Data Service, and custom schemas in the cloud-hosted Cosmos DB, you can make complex cross-graph queries that focus not just on individual nodes in those graphs but also on the links between nodes. That lets you work with much more complex relationships than those exposed in relational databases.
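As a rough illustration of what a cross-graph, edge-focused query looks like, the sketch below merges three small edge lists and asks a question that none of them can answer alone. It does not use any Microsoft API: the three lists merely stand in for the kinds of sources described above, and every name, relation label, and function here is a hypothetical chosen for the example.

```python
from collections import defaultdict

# Each graph is a list of (source, relation, target) edges. All data is made up.
org_graph = [            # stands in for internal staff and document data
    ("alice", "authored", "q3-proposal.docx"),
    ("alice", "reports_to", "bob"),
]
network_graph = [        # stands in for external professional relationships
    ("alice", "connected_to", "carol"),
    ("bob", "connected_to", "dan"),
]
business_graph = [       # stands in for core business records
    ("carol", "works_at", "Contoso"),
    ("dan", "works_at", "Fabrikam"),
    ("q3-proposal.docx", "concerns", "Contoso"),
]


def merge(*graphs):
    """Index the edges of several graphs by (source, relation) so they can be queried together."""
    index = defaultdict(set)
    for graph in graphs:
        for source, relation, target in graph:
            index[(source, relation)].add(target)
    return index


def follow(index, starts, relation):
    """Follow one type of edge from a set of nodes and return the nodes it leads to."""
    found = set()
    for node in starts:
        found |= index.get((node, relation), set())
    return found


def warm_introductions(index, staff, account):
    """Find staff who wrote about an account and also know someone who works there."""
    matches = []
    for person in staff:
        documents = follow(index, {person}, "authored")
        wrote_about = account in follow(index, documents, "concerns")
        contacts = follow(index, {person}, "connected_to")
        insiders = sorted(c for c in contacts if account in follow(index, {c}, "works_at"))
        if wrote_about and insiders:
            matches.append((person, insiders))
    return matches


index = merge(org_graph, network_graph, business_graph)
print(warm_introductions(index, ["alice", "bob"], "Contoso"))
```

The answer depends entirely on chaining edges of different types (authored, concerns, connected_to, works_at) that originate in different graphs, which is the kind of relationship question that is awkward to express against separate relational stores.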

One way this is being exposed is in the new Bing for Business tool, which adds information from a corporate Active Directory and other sources to Bing searches when a user is logged in to an Azure Active Directory account. Results are dynamically generated from Microsoft Graph queries that return details of, for example, where someone is in the organization chart, along with related content from the wider web and from documents they’ve shared internally.
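For a sense of what an organization-chart lookup against the Microsoft Graph can look like, the sketch below prepares two REST requests using the `manager` and `directReports` relationships on the v1.0 `users` resource. Whether Bing for Business issues these particular calls is an assumption; the helper name, the sample user, and the token placeholder are hypothetical, and the requests are only constructed here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def org_chart_requests(user_principal_name, access_token):
    """Build (but do not send) requests for a user's manager and direct reports."""
    user = quote(user_principal_name, safe="@.")  # keep the address readable in the URL
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/json",
    }
    return {
        "manager": Request(f"{GRAPH_ROOT}/users/{user}/manager", headers=headers),
        "direct_reports": Request(f"{GRAPH_ROOT}/users/{user}/directReports", headers=headers),
    }


# Placeholder values: a real call needs a valid Azure Active Directory access token.
requests = org_chart_requests("alice@contoso.com", "<access-token>")
for name, request in requests.items():
    print(name, request.full_url)
```

Sending the requests would require an access token with permission to read directory data, and the JSON responses would then need to be combined with web and document results to produce the kind of blended answer described above.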