Earlier this year, Google introduced Cloud Spanner, an automanaged database service that melds features from both conventional relational systems and NoSQL technologies.
Today, Google announced that Cloud Spanner will be available to the general public later this month. It will compete not only with rival cloud databases, but also with up-and-coming open source projects that address scale and reliability issues by using Google’s own ideas.
The best of both worlds
Google presents Cloud Spanner as a happy medium between two common database needs that often prove incompatible. A database can be highly scalable and distributed (the NoSQL approach), or it can be transactionally consistent (the conventional database approach). Cloud Spanner aims to be both.
As Google has laid out, one key to accomplishing this is a time synchronization mechanism for actions that need to be kept consistent between nodes, such as globally consistent read operations, which people expect from a transactional database.
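To make the time synchronization idea concrete, here is a minimal sketch of one way a node could order commits when its clock is only accurate to within a known bound. It is an illustration, not Google's implementation: the `FakeClock`, `UncertainTime`, and `commit` names are hypothetical, the 4 ms error bound is an arbitrary assumption, and the "wait out the uncertainty" step follows the general approach described publicly for Spanner rather than anything stated in this article.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FakeClock:
    """Deterministic stand-in for a wall clock, in milliseconds (hypothetical)."""
    now_ms: float = 0.0

    def sleep(self, ms: float) -> None:
        self.now_ms += ms


class UncertainTime:
    """Reports time as an interval (earliest, latest) instead of a single point."""

    def __init__(self, clock: FakeClock, epsilon_ms: float) -> None:
        self.clock = clock
        self.epsilon_ms = epsilon_ms  # assumed worst-case clock error

    def now(self) -> Tuple[float, float]:
        t = self.clock.now_ms
        return (t - self.epsilon_ms, t + self.epsilon_ms)


def commit(tt: UncertainTime) -> float:
    """Pick a commit timestamp, then wait until it is certainly in the past."""
    _, latest = tt.now()
    commit_ts = latest
    # Commit wait: block until even the most pessimistic reading of the
    # clock says commit_ts has already passed.
    while tt.now()[0] <= commit_ts:
        tt.clock.sleep(1.0)
    return commit_ts


clock = FakeClock(now_ms=1000.0)
tt = UncertainTime(clock, epsilon_ms=4.0)
first = commit(tt)
second = commit(tt)
```

Because each commit waits out its own uncertainty window, any commit that starts afterward is assigned a strictly later timestamp, which is the property a globally consistent read relies on. The cost is a short pause on every write, proportional to the assumed clock error.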
In a white paper published earlier this year, Google discussed another key element: how Cloud Spanner leverages Google’s own network. Of the three properties most desired from a distributed system (consistency, availability, and tolerance for splits between nodes), Cloud Spanner tries to deliver all three by making slight but often undetectable sacrifices to availability, aided by the fact that the service runs on Google’s own highly redundant network.
A little more scale, a little less SQL
The actual database Google has created from this technology strongly resembles other cloud-hosted transactional databases, but with some potentially irksome differences.
First, Cloud Spanner is advertised as having support for ANSI 2011 SQL queries. This holds for SELECT queries, which support all the familiar SQL syntax, including GROUP BY. But UPDATE commands are not available; according to Quizlet, which used Cloud Spanner in beta, you need to use “RPCs for mutating rows given their primary key” instead. Some of this is made easier through Cloud Spanner’s language and interface support, as it provides libraries for Go, Java/JDBC, Node.js, and Python, as well as support for REST calls.
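The split between SQL reads and key-based writes can be sketched with a toy in-memory table. This is not the Cloud Spanner client API: the `terms` table, its columns, and the helpers `update_row`, `views_by_set`, and `bump_views_where` are all hypothetical, chosen only to show what "mutating rows given their primary key" implies for code that would otherwise issue an UPDATE with a WHERE clause.

```python
from collections import defaultdict
from typing import Any, Dict

Row = Dict[str, Any]

# Hypothetical table, keyed by primary key.
terms: Dict[int, Row] = {
    1: {"set_id": 10, "word": "agua", "views": 3},
    2: {"set_id": 10, "word": "fuego", "views": 5},
    3: {"set_id": 20, "word": "tierra", "views": 7},
}


def update_row(table: Dict[int, Row], key: int, changes: Row) -> None:
    """Mutation-style write: the caller must name the row by primary key."""
    if key not in table:
        raise KeyError(f"no row with primary key {key}")
    table[key] = {**table[key], **changes}


def views_by_set(table: Dict[int, Row]) -> Dict[int, int]:
    """Read side, equivalent to:
    SELECT set_id, SUM(views) FROM terms GROUP BY set_id
    """
    totals: Dict[int, int] = defaultdict(int)
    for row in table.values():
        totals[row["set_id"]] += row["views"]
    return dict(totals)


def bump_views_where(table: Dict[int, Row], set_id: int, delta: int) -> int:
    """Stand-in for UPDATE terms SET views = views + delta WHERE set_id = ...
    Without UPDATE, the keys are read first, then mutated one by one."""
    keys = [k for k, row in table.items() if row["set_id"] == set_id]
    for k in keys:
        update_row(table, k, {"views": table[k]["views"] + delta})
    return len(keys)


changed = bump_views_where(terms, set_id=10, delta=1)
totals = views_by_set(terms)
```

The practical consequence is that any write expressed as a predicate has to become a read of matching keys followed by per-row mutations, which an application would normally wrap in a single transaction.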
Pricing is based on the number of nodes in use, storage needed on those nodes, and outbound bandwidth consumed. Right now the size of a database influences the number of nodes required to deploy it; every 2TB of database storage requires a node to support it.
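As a rough worked example of how those three pricing inputs combine, the sketch below estimates a node count from database size and then totals a monthly bill. It assumes each node supports up to 2TB of storage, which is one reading of the paragraph above. The function names are hypothetical, and the rates in the usage lines are made-up placeholders, not Google's prices.

```python
import math

TB_PER_NODE = 2.0  # assumption: one node per 2TB of database storage


def nodes_for_storage(storage_tb: float) -> int:
    """Minimum node count implied by database size (at least one node)."""
    return max(1, math.ceil(storage_tb / TB_PER_NODE))


def monthly_cost(
    storage_tb: float,
    egress_gb: float,
    node_hour_rate: float,
    storage_gb_month_rate: float,
    egress_gb_rate: float,
    hours: float = 730.0,  # approximate hours in a month
) -> float:
    """Sum of the three charges: nodes, storage, and outbound bandwidth."""
    node_charge = nodes_for_storage(storage_tb) * node_hour_rate * hours
    storage_charge = storage_tb * 1000.0 * storage_gb_month_rate
    egress_charge = egress_gb * egress_gb_rate
    return node_charge + storage_charge + egress_charge


# Placeholder rates for illustration only.
nodes = nodes_for_storage(5.0)
estimate = monthly_cost(
    storage_tb=5.0,
    egress_gb=100.0,
    node_hour_rate=1.0,
    storage_gb_month_rate=0.25,
    egress_gb_rate=0.125,
)
```

Under this reading, storage growth raises the bill twice: once through the per-gigabyte charge and again each time the database crosses a 2TB boundary and needs another node.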
Imitation and flattery
Cloud Spanner’s promises are echoes of features in other database products, although Google is clearly hoping to compete broadly by offering a better amalgamation of features in one place.
Take autoscaling, for instance. Ex-Microsoftie Bob Muglia served up Snowflake as a cloud data-warehouse system that didn’t need to be tweaked or tuned. There, Google can almost certainly compete on pricing, as it has its own infrastructure, whereas Snowflake is implemented on Amazon.
Speaking of Amazon, it has a few products that could be competition. Aurora, for instance, is Amazon’s hosted, MySQL-compatible database, and it is aimed at high-end work. It also has the advantage of being familiar and widely supported; there’s barely a database developer who hasn’t touched MySQL at some point. But again, Google’s hope is that Cloud Spanner will compete by offering better scale across the board, for write operations as well as reads.
Then there’s CockroachDB, which is approaching its first full 1.0 version. This open source database project is an implementation of the ideas in Google’s Spanner paper, in much the same way Google’s paper on MapReduce inspired Hadoop.
Where Google wants to stand out, though, is in the execution. That explains the white paper professing that it isn’t only the time-synchronization functions that make Cloud Spanner special, but also Google’s tight control over the networking between nodes. It might be possible for another cloud to implement that through a CockroachDB-based service, but Google is counting on first-mover advantage, and on all the major back-end resources it can work with, to make an impression.