Make database normalization part of cloud migration


Database normalization is a best practice in multicloud architecture. Let’s look at this concept within cloud migration, as well.

Don’t confuse database normalization with data normalization. Data normalization is about reducing redundancy within a single database and defining a more optimized structure for its tables. You DBAs are probably aware of this process; I taught it in college more than 30 years ago.

Database normalization is the process of reducing redundancy among the databases themselves to create a set of databases that are better focused on serving the needs of the business applications, data scientists, and those performing data analytics.
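To make the idea concrete, here is a minimal Python sketch of one possible first step: inventorying which tables show up in more than one database with largely the same columns. Everything in it is an assumption for illustration, including the database names, the table and column sets, the `find_redundant_tables` helper, and the 0.75 similarity threshold. A real inventory would be pulled from each database's catalog rather than hard-coded.

```python
from collections import defaultdict

# Hypothetical inventory: database name -> {table name -> set of column names}.
# Hard-coded here for illustration; in practice it would come from each
# database's catalog or a data-discovery tool.
INVENTORY = {
    "crm_db": {
        "customers": {"id", "name", "email"},
        "orders": {"id", "customer_id", "total"},
    },
    "billing_db": {
        "customers": {"id", "name", "email", "tax_id"},
        "invoices": {"id", "customer_id", "amount"},
    },
    "analytics_db": {
        "customers": {"id", "name", "email"},
    },
}


def column_overlap(a, b):
    """Jaccard similarity of two column sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_redundant_tables(inventory, min_overlap=0.75):
    """Return {table: [databases]} for tables stored in two or more
    databases whose columns largely match the first copy found."""
    locations = defaultdict(list)
    for db, tables in inventory.items():
        for table, columns in tables.items():
            locations[table].append((db, columns))

    redundant = {}
    for table, copies in locations.items():
        if len(copies) < 2:
            continue  # stored in only one database, nothing to consolidate
        _, base_columns = copies[0]
        similar = [db for db, cols in copies
                   if column_overlap(base_columns, cols) >= min_overlap]
        if len(similar) >= 2:
            redundant[table] = sorted(similar)
    return redundant


if __name__ == "__main__":
    for table, dbs in find_redundant_tables(INVENTORY).items():
        print(f"{table}: candidate for a single source of truth, found in {dbs}")
```

Matching on table names and column overlap is a deliberately crude heuristic. It will miss copies that were renamed and can flag tables that only look alike, so treat the output as a list of candidates to review, not a consolidation plan.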

The challenge is that databases and data slated to move to the cloud are overly complex and full of redundancy (few single sources of truth). Also, most people moving that data to the cloud just want to replicate the databases to the public cloud destination—a huge mistake, and here’s why.

I understand that most budgets are limited and that the cost of moving and combining data into cloud-native and noncloud-native databases is much higher than simply pushing bad database architectures to the cloud. However, I also understand that you’re better off fixing the database architecture as you move to the cloud, rather than having to fix it later.

If you don’t, you’ll have to migrate data twice: first lifting and shifting to a public cloud or clouds, then looping back to fix things once you figure out that the database architecture in the cloud is not optimized (because it’s overly complex, or redundant, or too expensive).

What should those charged with cloud migration do? Here is the ideal process:

  1. Get buy-in from stakeholders. This comes first because you will spend at least twice as much on migration plus normalization as you would on a straight lift-and-shift, including the cost of changing applications to work with the new database stack. If leaders are not willing to put up the money, explain the risks and future costs. (Try sending an email. You can pull it out of your sent mail folder as proof you tried to warn against the folly of doing nothing.)
  2. Make sure you have the necessary human resources. You’ll need people who understand the incumbent database stack, cloud-native database options, database design, and data migration processes. Include data security and governance in there, as well as data operations.
  3. Spend enough time on planning. Much of this work is migration planning, including how the databases will change and which new or existing ones to use. The devil is in the details. If you’re missing a piece of middleware or a data compliance system, the work that depends on it will have to be redone.
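The three steps above can be turned into a simple pre-migration check. The Python sketch below is only an illustration: the `MigrationPlan` structure, the `plan_gaps` helper, and the component names are hypothetical, and the skill list is simply the one from step 2.

```python
from dataclasses import dataclass, field

# Skills named in step 2; the exact labels are an assumption for this sketch.
REQUIRED_SKILLS = {
    "incumbent database stack",
    "cloud-native databases",
    "database design",
    "data migration",
    "data security",
    "data governance",
    "data operations",
}


@dataclass
class MigrationPlan:
    """Hypothetical, minimal description of a migration plan."""
    stakeholder_buy_in: bool = False
    skills_on_team: set = field(default_factory=set)
    # Component the target architecture depends on -> is it covered by the plan?
    components: dict = field(default_factory=dict)


def plan_gaps(plan):
    """List the gaps that should be closed before migration starts."""
    gaps = []
    if not plan.stakeholder_buy_in:
        gaps.append("no stakeholder buy-in")
    for skill in sorted(REQUIRED_SKILLS - plan.skills_on_team):
        gaps.append(f"missing skill: {skill}")
    for name, covered in sorted(plan.components.items()):
        if not covered:
            gaps.append(f"unplanned component: {name}")
    return gaps


if __name__ == "__main__":
    plan = MigrationPlan(
        stakeholder_buy_in=True,
        skills_on_team=REQUIRED_SKILLS - {"data governance"},
        components={"middleware": True, "compliance system": False},
    )
    for gap in plan_gaps(plan):
        print(gap)
```

An empty list from `plan_gaps` does not mean the plan is sound. It only means the obvious omissions named in the steps above have been accounted for.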

This is really not as hard as it seems. We’ve moved from platform to platform and simultaneously changed databases in the past. What’s new is that we’re looking more strategically at the data now. It needs to serve as core analytical data, as well as training data for AI systems. Data is really everything to the business now. Treat it as such.

Copyright © 2021 IDG Communications, Inc.