I get this request a lot: We have a single, huge database and want to keep part of the data on-premises and part of the data in the cloud. Is that possible?
Of course. Enough time and money can solve all problems. The real question is not “can we,” but “should we?” Here are the realities:
Most databases provide physical partitioning mechanisms that allow you to separate physical data over networks, including the open Internet, where a partition is hosted in the cloud. Some enterprises use this architecture for hybrid cloud use cases. However, these mechanisms were not typically designed for the cloud or for the slower, less predictable network of the open Internet.
The issues? Even if you get it working, the latency will be noticeable for whichever portion of the data sits on the far side of the network. Let’s say a cloud-based application accesses both the cloud and on-premises partitions. The data that resides on the remote partition (in this example, the on-premises partition) will have noticeable latency issues.
Remember, performance is determined by the slowest component. When one partition returns data with high latency, the overall response time suffers as well. You can prove this using performance modeling or simply try it. The gratification you get for keeping some of your data nearby will cost you in performance. Indeed, in most cases, it’s unworkable.
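The "slowest component" point can be sketched with a toy model. The millisecond figures below are illustrative assumptions, not measurements from any particular database:

```python
# Hypothetical latency model for a partitioned database.
# Assumed figures: a cloud app reaching a cloud partition in the same
# region vs. an on-premises partition over the open Internet or a VPN.
local_ms = 2.0    # cloud app -> cloud partition (illustrative)
remote_ms = 45.0  # cloud app -> on-prem partition (illustrative)

def query_time_ms(partition_latencies):
    """A query that touches multiple partitions must wait for the
    slowest one before it can return a complete result."""
    return max(partition_latencies)

print(query_time_ms([local_ms]))             # cloud-only query
print(query_time_ms([local_ms, remote_ms]))  # cross-partition query
```

Under these assumed numbers, any query that spans both partitions runs at the remote partition's speed, no matter how fast the cloud side is.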
Many of the database players, cloud and not, won’t tell paying customers who want to use this structure that the answer should be no. Obviously, you can toss money at the problem, such as for dedicated network circuits. But the cost of doing that typically removes any value that cloud-based databases may bring. In other words, it’s cheaper to remain on-premises.
Moving to the cloud actually means moving to the cloud. If you try these types of hybrid voodoo, stretching technology beyond what it’s designed to do, you will just end up migrating twice: once to the solution that does not work, and then again to the solution that does. As always, it’s best to do things right the first time.