Skip to Main Content

Matillion vs dbt

Why you don’t need to choose between the two data transformation tools.

Matillion and dbt are two outstanding tools for data orchestration, modeling and transformation. Many data teams wonder which is the best data transformation tool for their stack: Matillion vs dbt? But it doesn’t need to be one or the other.

Spaulding Ridge is a big fan (and a premier partner) of both dbt Labs and Matillion, and we’ve used both together on a wide range of client engagements. Notably, Matillion’s dbt integration makes it easier than ever to take advantage of the strengths of each tool. In this article, we’ll take a closer look at:

  • The benefits of using both Matillion and dbt
  • How the Matillion dbt integration works
  • Why we chose Matillion and dbt for our client and how we set it up
  • The positive results they’re seeing

ELT’s Best Kept Secret: Matillion’s dbt Integration

Launched in April 2023, Matillion’s dbt integration enables you to orchestrate your ingestion and transformation jobs so all your jobs are running in one place. By packaging all functionality into a single platform, users can load their source data and immediately begin applying dbt functionality. This greatly increases the efficiency and time to value for dbt users. Basically, it gives you the best of both worlds.

3 Benefits of Orchestrating with Matillion and Modeling with dbt

1: Seamless integration with dbt version control

Git versioning is interwoven into the dbt experience. Within Matillion, users can specify git repositories where dbt code may be contained, as well as the branch directly tied to the version of the code. This allows users to:

  • Develop their code
  • Save it within the repo/branch containing the code
  • Immediately make it available for use within a Matillion pipeline
2: Validating data quality with tests

As an ELT platform, Matillion allows users to extract and load source data and use Snowflake as both the storage and processing layer for transformations. With dbt, users can test the quality of data after transformations to ensure data quality. This gives users confidence that any “dirt” has been scrubbed from the dataset—and if any remains, they will be notified.

3: Utilizing Matillion workflow controls to apply error handling

Any component in a Matillion pipeline (including data quality tests instituted by dbt) then results in either a Success or Failure outcome. By using either Success or Failure connectors to subsequent components, downstream actions can then be automated. For example, think of email sends, Slack notifications, or any post-dbt actions a user may want to take. Anything is available based on how a user designs their Matillion pipeline.

How to Use the dbt Commands Component in Matillion

The feature is actually very simple to use! Here’s the step-by-step breakdown. First, we have to configure our git repo as a file source in Matillion.

We then use the “Sync File Source” component as the first step in our dbt job:

From there, all we have to do is add as many dbt Core Matillion components as we need to perform our transformations. The components accept commands exactly as dbt would, so you can keep all syntax the same as you enter in your dbt command.

 

In the example above, we have a component running the “dbt deps” command to install all dependencies. We also have a “dbt build –select tag” command. As you can see, after we run our build command, we have an “End success” component if the job runs successfully. If the job fails, a “Send Failure Email” component alerts us via email.

Why dbt and Matillion?

Now let’s see a real-world use case to understand why we used the Matillion and dbt integration and how we set it up.

Our client Blount Fine Foods, one of America’s leading prepared foods companies, was struggling with a lack of data freshness, accuracy, and trust. They had a number of different scripts running to sync and refresh data between their server, on-premise operational data storage, and enterprise cloud data warehouse. But the outdated scripts continuously broke and caused issues that were difficult to track down and fix. And because they also had complex logic housed in DAX formulas in PowerBI, logic for updating tables lived in multiple places in the data pipeline—which not only made their reporting slow, but was also challenging to debug, update, and make changes when needed.

To ensure data accuracy, we reconstructed their ingestion process and created safe, water-tight joins that wouldn’t allow fanouts. We used Matillion to ingest the data (the “Extract” and “Load” part of ELT), then used dbt to model (Transform) the data. Matillion is the orchestration tool not just for Matillion pipelines, but also for dbt transformations.

Here’s a step-by-step of how we set up the Matillion and dbt integration:
  1. First, we used Matillion to connect to a replication of their server to make sure we refreshed the data in reliable, accurate scripts. Our Matillion ingestions created clean, ready-to-use, trustworthy tables that we could use as our input to model in dbt.
  2. Next, we took the tables Matillion produced and used dbt to transform them (aka, model the data). We chose to use dbt to transform the data as it is easy to see the logical flow of transformations, allowing Blount’s team to easily debug code and perform their own modeling of the data.
  3. They now house all their logic in one place: dbt. The simplicity of the tool means the client’s data is very easy to read, understand, and debug. It’s easier and faster for the client’s analytics team to understand what’s going on, fix and update things as needed, and onboard new people.

Why Matillion?

  • Highly cost effective over the long run
  • Able to communicate with the client’s various data sources, pull in data, and refresh it on whatever cadence they need
  • Metadata-driven ingestion pipelines loop through multiple tables in each data source, reducing the need for individual load scripts for each table and greatly increasing the speed of data extraction and loading
  • Can help with future reverse ELT workflows

Why dbt?

  • Industry dataOp standards, dbt versions, ci/cd and scaling
  • Team members who were already well-versed in SQL could quickly learn the straightforward modeling
  • Onboarding of new employees was simpler
  • Could place all transformational logic in one place (vs. multiple steps in the ETL process)
  • Made debugging, changing, and updating code faster and easier

Positive Results of Using dbt and Matillion

Running all the jobs in one place (Matillion) simplifies job maintenance, eliminating the need to toggle between multiple platforms to manage tasks. We only have to check one place to ensure that all of Blount’s data is extracted, loaded, and transformed. In addition, Matillion’s alerting functionality sends alerts via email, Slack, or other methods as soon as a job fails, making it easy to stay on top of any pipeline breakdowns.

Due to dbt’s easily understandable user interface and out-of-the-box testing, Blount has quickly learned how to refresh, update, and debug their data. They are able to develop and validate data more quickly and can easily identify bugs or pipeline breakdowns. The combination of Matillion and dbt has removed the shroud of mystery around their old scripts and obscure load failures, and enables the team to update, share, and trust the data in their reporting.

Need help with ELT?

As a certified partner of both Matillion and dbt, Spaulding Ridge can provide end-to-end data engineering services across a variety of industries to make your data analytics-ready. Contact us for a free assessment!