Real-time data processing with data streaming: new tools for a new era

Today, there are many data sources (such as IoT devices, user interaction events from mobile applications, financial service transactions, and health monitoring systems) that broadcast critical information in real time. Developers working with...

The best open source software for data storage and analytics

is a distributed SQL database built on top of a transactional and consistent key-value store. It is designed to survive disk, machine, rack, and even data center failures with...

What is a data lake? Flexible big data management explained

If you are tuned in to the latest technology concepts around , you’ve likely heard the term “data lake.” The image conjures up a large reservoir of water—and that’s what...

Matei Zaharia, creator of the Apache Spark project, on the big data framework ...

In this episode of True Technologist, host Eric Knorr talks with Matei Zaharia, chief technologist at Databricks and an assistant professor of computer science at Stanford, about the Apache Spark...

Why there are no shortcuts to machine learning

Big data remains a game for the 1 percent. Or the 15 percent, as new suggests. According to the survey, most enterprises (85 percent) still haven’t cracked the code...

IDG Contributor Network: Why we lose out if we leave everything to algorithms

“”—an amazing piece of journalism by Sarah Jeong at the Verge—implicitly answers this question. It’s about the romance genre on Kindle Unlimited, and the royal rumble that’s been happening this...

How to build stateful streaming applications with Apache Flink

Apache Flink is a framework for implementing stateful stream processing applications and running them at scale on a compute cluster. In a we examined what stateful stream processing is,...

Introducing BigQuery ML for building predictive models with SQL

One key to efficient data analysis of big data is to do the computations where the data lives. In some cases, that means running R, Python, Java, or Scala programs...

IDG Contributor Network: Big data: enabling new approaches to IT infrastructure security

Consider modern enterprise IT infrastructure. Increasingly, it is a complex combination of on-premises computing and storage and off-premises, cloud-based resources. Tying all of this together is a web...

3 big data platforms look beyond Hadoop

A distributed file system, a MapReduce programming framework, and an extended family of tools for processing huge data sets on large clusters of commodity hardware, Hadoop has been synonymous with...
