Disclaimer: Matt Asay works for AWS but the views expressed herein are his and don’t reflect those of his employer.
In two recent blog posts, Snowflake spent 6,064 words arguing a very simple concept: Not all software needs to be open, whether that means open source, open standards, or open APIs. It’s not a particularly objectionable argument and reflects the reality that while virtually all software includes open source code, most software isn’t licensed as open source. Snowflake, in other words, is safely within its rights to keep its software closed.
And yet the company clearly felt the need (twice) to justify its decision, reflecting the strong gravitational pull of open source, open standards, and open APIs, even when its customers don’t appear to be clamoring for them.
Open sourcing data
Nearly a decade ago, Cloudera co-founder Mike Olson said: “No dominant platform-level software infrastructure has emerged in the last 10 years in closed source, proprietary form.” Olson was mostly correct. Splunk had emerged in that time, along with perhaps a few other examples, but, on balance, he was right.
Fast forward to 2021 and Olson’s pronouncement has remained pretty accurate with few exceptions. Snowflake is one of them. The company that bills itself as the data cloud company has managed to build a big business with a proprietary SaaS offering in an industry awash in exceptional open source data infrastructure like Apache Hadoop, Apache Arrow, Apache Spark, and more.
This perhaps reflects a more nuanced reality: Enterprises may intuitively want “open” but they place a bigger premium on “working.” This has been clear for years as companies have introduced managed services to make it easier to consume open source software or, in the case of companies such as Snowflake, provide managed services that aren’t based on open source at all. Getting both “open source” and “operationally easy” in the same service is the holy grail, but if enterprises must choose one, they’re going to pick the solution that is easiest for them. After all, a customer can turn to Apache Spark, Dremio, or any number of tools to build data warehouses or data lakes, yet thousands of customers spent roughly half a billion dollars with Snowflake last year.
on this: “Open source was not about enabling users to understand and enhance the software. It’s about enabling the world to do so. Just because relatively few people are capable of understanding or patching Linux kernel code doesn’t mean its openness has had little impact. It’s a little smug and insulting to suggest that they shouldn’t share because only PhDs would understand it. In fact, science advances through sharing and publication. That’s the whole point of scientific journals and conferences. The art advances through disclosure.”
, “Not all software needs to be or should be open sourced. Open source is an appropriate license/model for a lot of software but not all.”
Whether Snowflake should be open source is ultimately a decision for its customers, and based on revenues, it seems that Snowflake’s customers don’t care. So again, why write the posts?
Selling past the close
Most of Snowflake’s competitors also offer proprietary data cloud/platform services. (Disclosure: I work for AWS, which is a Snowflake partner and competitor, though I am not involved with that part of our business.) It’s highly unlikely, for example, that Oracle salespeople are beating up Snowflake for offering proprietary software. Perhaps the pressure is coming from Databricks or other open source vendors?
Databricks announced its Delta Sharing project, an open protocol for securely exchanging large data sets in real time. This was just one of Databricks’ announcements at an event that sported the tagline, “The future is open.” Nor is Databricks alone in positioning its data cloud as an open alternative to solutions like Snowflake. Journalist Sean Kerner noted, “You should see my inbox… Every other pitch is ‘X is an open alternative to Snowflake.’”
about the Snowflake IPO:
“Developers have never been overly religious about open source. The reason for [Olson’s comment about a] ‘stunning’ trend is simply that open source made it easier for developers to get their jobs done thanks to high-quality, easily accessible, open source data infrastructure. There are, of course, other benefits, such as the communities that often accompany open source projects, coupled with a desire to have more granular control of one’s software stack. But ultimately open source has won because it enables developers to ’get —- done.’ Which is why, for example, you’ll find developers happy to use open source software like Apache Airflow to load data into their proprietary Snowflake data platform. It’s not cognitive dissonance. It’s pragmatism.”
By rationalizing its decisions rather than simply delivering value to customers, Snowflake ends up confusing more than it clarifies. Enterprises clearly appreciate what it’s selling. No need for apologies about not being open enough.
Copyright © 2021 IDG Communications, Inc.