Many of our clients are eager to modernize their data platforms. This post discusses an architecture option that is gaining traction with some of them as they get their feet wet with modern data platforms built on tools like Snowflake and Fivetran.
We see it all the time: our clients know that their existing data platforms have one or more of the following problems:
- Query response times are too slow.
- One group of users or reporting applications brings the platform to its knees.
- Business rules and data transformations will be too hard or too costly to completely re-write on a new platform.
The good news here is that modern data platforms utilizing technologies like Snowflake and Fivetran can solve these problems. Consider a typical legacy data architecture below:
What if you could reap the performance, elasticity and security benefits of a cloud Data Warehouse like Snowflake with minimal impact on your existing platform? Granted, this solution won’t fit 100% of use cases, but we’re seeing it become increasingly popular as a first step in modernizing data platforms. Now, let’s see how a relatively simple technology switch in the diagram below can make a big impact.
Let’s talk through the problems listed above one by one:
- Query response times are too slow – In the new world, our reports and dashboards point directly to Snowflake. We can leverage its on-demand elasticity where necessary, but simply moving your data to Snowflake will typically result in drastic performance improvements over classic RDBMS platforms and/or legacy OLAP tools. Scaling Snowflake on demand can increase performance by an order of magnitude in many cases.
- One group of users or reporting applications brings the platform to its knees – Here lies the beauty of Snowflake’s Virtual Warehouses. Beyond the platform itself simply being more performant, you now have fine-grained control over which users or applications can use which processing resources, a.k.a. Virtual Warehouses.
- Business rules and data transformations will be too hard or too costly to re-write on a new platform – Often, so much time and money have been spent on the current architecture that a wholesale re-write may never make it past leadership. The good news here is that we can keep the existing architecture in this new world. Continue to leverage your data staging, cleansing and Data Warehouse processing logic. Don’t worry about changing ETL tools or re-writing business rules. Let’s not re-test everything that has already been tested and validated. If it’s working today, don’t fix it. Simply replicate your legacy Data Warehouse to Snowflake using a modern, flexible data pipeline like Fivetran. As changes occur, both to the data itself and to the data structures, products like Fivetran are smart enough to automatically replicate those changes downstream to Snowflake as frequently as you would like.
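To make the Virtual Warehouse point a little more concrete, here is a minimal sketch of how workloads might be isolated from one another. It only builds the SQL text; it does not connect to Snowflake. The warehouse names, role names, sizes and suspend timeout are all hypothetical and would need to match your own account, and the roles are assumed to exist already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehousePlan:
    name: str                 # hypothetical warehouse name
    size: str                 # Snowflake size keyword, e.g. XSMALL or MEDIUM
    role: str                 # hypothetical role that may use this warehouse
    auto_suspend_s: int = 60  # idle seconds before the warehouse suspends


def warehouse_statements(plan):
    """Return the CREATE and GRANT statements for one isolated workload."""
    create = (
        f"CREATE WAREHOUSE IF NOT EXISTS {plan.name} "
        f"WITH WAREHOUSE_SIZE = '{plan.size}' "
        f"AUTO_SUSPEND = {plan.auto_suspend_s} AUTO_RESUME = TRUE;"
    )
    grant = f"GRANT USAGE ON WAREHOUSE {plan.name} TO ROLE {plan.role};"
    return [create, grant]


def build_script(plans):
    """Flatten the statements for every plan into one ordered list."""
    return [stmt for plan in plans for stmt in warehouse_statements(plan)]


# Illustrative plans: dashboards and finance analysts get separate compute,
# so a heavy query from one group does not slow the other down.
PLANS = [
    WarehousePlan(name="DASHBOARD_WH", size="MEDIUM", role="BI_DASHBOARDS"),
    WarehousePlan(name="FINANCE_WH", size="XSMALL", role="FINANCE_ANALYST"),
]

if __name__ == "__main__":
    for statement in build_script(PLANS):
        print(statement)
```

Because each group runs on its own warehouse, resizing or suspending one does not affect the other, which is the isolation described in the second point above.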
In the new architecture, data remains fresh, data pipelines don’t break when new columns or tables are added to your Data Warehouse, and data consumers have the performance and reliability they need from their analytics tools. Now they can spend more time making decisions and less time waiting for reports and dashboards to refresh. Keep in mind that existing reporting platforms pointing at legacy cubes or data marts would need to be re-pointed to Snowflake, which can take some time. We suggest starting with the worst-performing reports and migrating the rest gradually as it makes sense. Some clients, after moving over the poor performers, choose to write only new reports against this new architecture.
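One way to decide which reports to re-point first is to rank them by how much waiting they cause. The sketch below is only an illustration of that idea: the report names and numbers are made up, and the scoring rule (average refresh time multiplied by weekly runs) is an assumption, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportStats:
    name: str                   # hypothetical report name
    avg_refresh_seconds: float  # average time a refresh takes today
    runs_per_week: int          # how often the report is refreshed


def weekly_wait_seconds(report):
    """Total seconds per week spent waiting on this report (assumed score)."""
    return report.avg_refresh_seconds * report.runs_per_week


def migration_order(reports, first_wave=2):
    """Split reports into a first wave of worst performers and the rest."""
    ranked = sorted(reports, key=weekly_wait_seconds, reverse=True)
    return ranked[:first_wave], ranked[first_wave:]


# Made-up sample figures for illustration only.
REPORTS = [
    ReportStats("Daily Sales", 420, 35),
    ReportStats("Inventory Aging", 900, 5),
    ReportStats("Exec Summary", 30, 50),
    ReportStats("Churn Cohorts", 1200, 7),
]

if __name__ == "__main__":
    first, later = migration_order(REPORTS)
    print("Migrate first:", [r.name for r in first])
    print("Migrate later:", [r.name for r in later])
```

A score like this favors reports that are both slow and frequently used; you may prefer to weight by business priority instead.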
There are several approaches you can take to modernize your data platforms. We work with our clients every day to help them solve their biggest data challenges while providing alternate approaches that can save time and money. If you are having challenges with your existing data platform, give us a call. We’d love to help!
Mike Galvin is a Co-Founder at One Six Solutions in Chicago and has over 20 years of experience helping clients across various industries think through their toughest data challenges. Mike can be reached at 312-761-1616 or via email at firstname.lastname@example.org.