Tutorial: Spark application architecture and clusters

Get the full book: Data Analytics with Spark Using Python (Addison-Wesley Data & Analytics Series). This article is an excerpt from the Pearson Addison-Wesley book “Data Analytics with Spark Using...

Why you should use Gandiva for Apache Arrow

Over the past three years, Apache Arrow has exploded in popularity across a range of different open source communities. In the Python community alone, Arrow is being downloaded more than 500,000...

Review: MXNet deep learning shines with Gluon

When I reviewed MXNet in 2016, I felt that it was a promising deep learning framework with excellent scalability (nearly linear on GPU clusters), good auto-differentiation, and state-of-the-art support for CUDA...

IDG Contributor Network: The future is cloudy, with a chance of success

I would have titled this post, “How to be a rainmaker in the cloud,” except the term rainmaker often refers to the selling process, which is already succeeding, and that...

Real-time data processing with data streaming: new tools for a new era

Today, there are many data sources (such as IoT devices, user interaction events from mobile applications, financial service transactions, and health monitoring systems) that broadcast critical information in real time. Developers working with...

The best open source software for data storage and analytics

is a distributed SQL database built on top of a transactional and consistent key-value store. It is designed to survive disk, machine, rack, and even data center failures with...

What is a data lake? Flexible big data management explained

If you are tuned in to the latest technology concepts around , you’ve likely heard the term “data lake.” The image conjures up a large reservoir of water—and that’s what...

Matei Zaharia, creator of the Apache Spark project, on the big data framework ...

In this episode of True Technologist, host Eric Knorr talks with Matei Zaharia, chief technologist at Databricks and an assistant professor of computer science at Stanford, about the Apache Spark...

Why there are no shortcuts to machine learning

Big data remains a game for the 1 percent. Or the 15 percent, as a new survey suggests. According to the survey, most enterprises (85 percent) still haven’t cracked the code...

IDG Contributor Network: Why we lose out if we leave everything to algorithms

“”—an amazing piece of journalism by Sarah Jeong at the Verge—implicitly answers this question. It’s about the romance genre on Kindle Unlimited, and the royal rumble that’s been happening this...
