Working with large volumes of data is a lot like developing software. Both require a good understanding of what end-users need, knowledge of how to implement solutions, and agile practices to iterate and improve the results. Both also depend on technology platforms, coding practices, devops methodologies, and nimble infrastructure that are in place and ready to meet business needs.
Data scientists and data engineers share many technologies and practices with software developers, and yet there are many differences. While attending the 2019 Strata Data Conference in New York, I looked at the methodologies, platforms, and solutions presented there through the dual lenses of a software developer and a data engineer.
Getting data ready for consumption
Application developers working with modest amounts of data often implement the upfront data integration, formatting, and storage through scripts, database stored procedures, and other coding options. It’s a straightforward approach to get past the required plumbing and to have data ready to be supported by a microservice, shared through APIs, or consumed by an end-user application.
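As a rough illustration of that script-based approach, here is a minimal sketch in Python using only the standard library. The sample data, the `customers` table, and the `load_customers` function are hypothetical names made up for this example; a real pipeline would read from actual source systems and write to a production database.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the columns and values are invented for illustration.
RAW_CSV = """id,name,signup_date
1, Alice ,2019-09-23
2,BOB,2019-09-24
"""


def load_customers(raw_csv: str, conn: sqlite3.Connection) -> int:
    """Parse, normalize, and store rows; return the number of rows loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "id INTEGER PRIMARY KEY, name TEXT, signup_date TEXT)"
    )
    # Integration and formatting: cast ids, trim whitespace, normalize name casing.
    rows = [
        (int(r["id"]), r["name"].strip().title(), r["signup_date"].strip())
        for r in csv.DictReader(io.StringIO(raw_csv))
    ]
    # Storage: write all rows in a single transaction.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO customers VALUES (?, ?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database, for the sketch only
loaded = load_customers(RAW_CSV, conn)
names = [row[0] for row in conn.execute("SELECT name FROM customers ORDER BY id")]
```

Once the rows are in the table, a microservice or API layer can query them directly, which is the "plumbing" the paragraph above describes.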