The question we get asked most often is exactly what everyone expects: where (and how) do I get started on a modern data journey based on a solid enterprise data strategy?
At the risk of giving the typical consulting answer of “it depends,” blanket solutions rarely make for the best ones. Each business needs something different, and within each business, each department needs something different as well.
Multi-cloud solutions, as opposed to a single stack, offer the best functionality possible for each need. The one blanket statement that does hold in this scenario is that there needs to be a Single Source of Truth (SSOT) and a data strategy with a definitive North Star.
“Data” can be a scary word for businesses of all sizes. Small businesses think they’re too small to need a data strategy, while enterprise-sized businesses have so much data that they feel too far gone. On top of that, data is notoriously difficult to manage without the proper technology.
The fact of the matter, however, is that businesses of all sizes need a solid data strategy as the foundation for a modern data stack. Starting early can give you a leg up as you scale and spare you the herculean task of creating one when the company is already set in its ways and heavily siloed.
Despite the challenges, an effective data strategy and a modern data stack are well worth the effort, especially when you leverage the right technology and partner with the right firm to put in place a long-term strategy that will result in profitable business intelligence.
The History of Enterprise-Wide Data Strategy
Where we started with data strategy and where we are now are two radically different places, and the advancements are immense. In 1996, database technology was limited enough that large volumes of data, measured in gigabytes, had to be stored on disk, and performance suffered as a result.
This all began to change with Ralph Kimball, who popularized the star schema, the simplest style of data mart schema. Its effectiveness at handling simple queries has led to it being the most widely used approach for developing data warehouses and dimensional data marts.
By providing simplified business reporting logic, query performance gains, and fast aggregations, the star schema made large swathes of data far more accessible.
While the Kimball star method was great for organizing data in an efficient, easy-to-consume manner for business users, it still had flaws. Its denormalized state put data integrity in question, and its structure was inflexible. Not only that, but implementations took years, and many programs never reached their full potential. Often, by the time a data warehouse was live, end users had lost interest or the business was no longer using those KPIs to measure performance.
In the 26+ years since, modern CPUs have become much faster, and abundant memory and fast flash storage have become available for storing data. While this alone wouldn’t solve the time-to-deploy problem, it can address the initial requirement of organizing data for efficiency gains.
Additional technologies have been introduced in the database world, many of which take radically different approaches. Some of these approaches, such as columnar stores and compression within a database engine, were shunned in the early days of data storage but have proven immensely efficient in modern times.
How to Develop an Enterprise-Wide Data Strategy
The past quarter century of engineering has brought us to today's array of advancements and competitive advantages. A good data strategy takes a lofty business strategy and turns it into a well-defined action plan that unlocks the value in your data and puts your business in the best possible position in the market.
There are four components to developing an enterprise-wide data strategy.
1. Business Needs and Overarching Themes:
Every journey starts with understanding the measurement goals being sought, such as revenue, sales by widget, or where we are most successful with widgets. However, it would be a mistake to think our needs could be pinned down to an individual piece of data. Data is just the first step in the journey toward determining the ultimate goals of the overall program.
Many critical points in your data journey center on topics such as data governance, speed of data movement, data access and masking, and data trust. While this list isn’t exhaustive, you can quickly see that the topics typically used in determining a data strategy can be broad-reaching and require planning beyond any single element of data.
2. Data Organization:
Star schemas (from good ol’ Ralph Kimball, mentioned above) are great for measuring fiscally controlled data that is locked after a point in time, such as a period or year end. However, they lack the ability to change quickly as business needs shift. This isn’t to say they don’t have a place in the modern data strategy, especially given advancements in speed-to-deploy capabilities. They just need more evaluation to determine whether they are necessary.
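To make the star shape concrete, here is a minimal sketch, assuming a toy widget-sales dataset. The table names (fact_sales, dim_product, dim_region), the columns, and the sample figures are all hypothetical, and SQLite is used only as a convenient local stand-in for a real warehouse. One central fact table holds the measures, and each dimension table joins to it directly.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny, hypothetical star schema: one fact table, two dimensions."""
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, region_name  TEXT);

        -- The fact table stores measures plus a key into each dimension.
        CREATE TABLE fact_sales (
            sale_id     INTEGER PRIMARY KEY,
            product_id  INTEGER REFERENCES dim_product(product_id),
            region_id   INTEGER REFERENCES dim_region(region_id),
            fiscal_year INTEGER,
            revenue     REAL
        );

        INSERT INTO dim_product VALUES (1, 'Widget A'), (2, 'Widget B');
        INSERT INTO dim_region  VALUES (1, 'East'), (2, 'West');
        INSERT INTO fact_sales  VALUES
            (1, 1, 1, 2023, 100.0),
            (2, 1, 2, 2023, 250.0),
            (3, 2, 1, 2023, 75.0),
            (4, 2, 2, 2023, 50.0);
    """)


def revenue_by_product_and_region(conn, fiscal_year):
    """Aggregate the fact table, labelling each row via one join per dimension."""
    return conn.execute(
        """
        SELECT p.product_name, r.region_name, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_region  AS r ON r.region_id  = f.region_id
        WHERE f.fiscal_year = ?
        GROUP BY p.product_name, r.region_name
        ORDER BY p.product_name, r.region_name
        """,
        (fiscal_year,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
report = revenue_by_product_and_region(conn, 2023)
```

Questions like “where are we most successful with widgets” become a single aggregate over the fact table. The trade-off described above shows up when the business wants a new way to slice the data: that usually means a new dimension table and a reload of the fact table.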
Non-star structures, on the other hand, can often take advantage of speed gains within the storage and computational layers to provide faster performance with greater flexibility as model changes are introduced. We generally refer to this approach as a “virtual” semantic layer, which uses various database objects to give users a more business-friendly way of working with data.
Building a successful virtual semantic layer takes a thorough understanding of both source data structures and business needs. The result is a series of dependent views that create a usable, scalable layer that’s easy to maintain and requires limited processing to organize data for use. Often, this virtual layer can be managed by more advanced business users, provided they understand the goals contained within your data strategy.
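A series of dependent views might look something like the sketch below. It is an illustration under stated assumptions rather than a prescribed design: the source table src_orders, its column names, and the three view names are hypothetical, and SQLite again stands in for a cloud data platform. Each view builds on the one before it, and no data is copied or reprocessed along the way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical raw source table with system-style column names.
    CREATE TABLE src_orders (
        ord_id INTEGER, cust_cd TEXT, prod_cd TEXT, amt_cents INTEGER, status TEXT
    );
    INSERT INTO src_orders VALUES
        (1, 'C1', 'WIDGET', 1500, 'SHIPPED'),
        (2, 'C1', 'GADGET', 2500, 'CANCELLED'),
        (3, 'C2', 'WIDGET', 3000, 'SHIPPED');

    -- Layer 1: business-friendly names and units over the raw table.
    CREATE VIEW v_orders AS
        SELECT ord_id AS order_id, cust_cd AS customer, prod_cd AS product,
               amt_cents / 100.0 AS amount, status
        FROM src_orders;

    -- Layer 2: a business rule, defined on top of layer 1.
    CREATE VIEW v_completed_orders AS
        SELECT * FROM v_orders WHERE status = 'SHIPPED';

    -- Layer 3: a reporting measure, defined on top of layer 2.
    CREATE VIEW v_revenue_by_product AS
        SELECT product, SUM(amount) AS revenue
        FROM v_completed_orders
        GROUP BY product;
""")


def revenue_by_product(conn):
    """Query the top-level view; the engine resolves the view chain at read time."""
    return conn.execute(
        "SELECT product, revenue FROM v_revenue_by_product ORDER BY product"
    ).fetchall()


before = revenue_by_product(conn)

# New source rows appear in every dependent view without a reload step.
conn.execute("INSERT INTO src_orders VALUES (4, 'C3', 'GADGET', 1200, 'SHIPPED')")
after = revenue_by_product(conn)
```

Because the layers are only definitions, changing a business rule (for example, what counts as a completed order) means editing one view rather than rebuilding tables, which is where the flexibility over a physically loaded star schema comes from.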
3. Visualization and Consumption Tools:
While tool selection may not be as important as it was in the 90s, it’s still critical to ensure that the reporting/BI tools selected can take advantage of modern data infrastructure. Additional features within these tools typically provide self-service capabilities to mix and use various internal and external datasets, so users can understand patterns in the data and confirm that company growth in the market is achievable.
Tools focused primarily on cloud releases often limit data volume and mashup capabilities in order to stay cost-efficient. With this in mind, a well-considered data model design can avoid many of these cost pitfalls by pushing these workloads to modern data platforms such as Snowflake and Google BigQuery.
4. Advanced Data Capabilities:
While many programs have a deep desire to progress to the world of advanced data capabilities such as real-time Artificial Intelligence (AI) and Machine Learning (ML), these are difficult places to start without having a solid foundation in your overarching data strategy. These tools require a unique understanding and organization of data to become materially effective at diagnosing problems and presenting potential solutions with high degrees of accuracy.
Spaulding Ridge x Your Data Strategy = The Competitive Advantage You Need
Creating a comprehensive data strategy can be complex and time-consuming if you attack it as a siloed mission or without a broad understanding of the available options. Spaulding Ridge has developed a unique approach to ensuring this process is efficient, nimble, and moves at the speed of your business in a modern world.
Want to talk about data strategy? Reach out to Reggie Gentle.