Following our recent product release supporting schema changes in SQLAlchemy, we’ve seen strong demand to support additional frameworks that engineering teams use to manage schemas in production databases.
Today we are announcing two new frameworks that are now officially supported by Foundational: Liquibase and Active Record. This release lets our customers track upstream changes across dozens of database types, understand the downstream impact that schema changes in these databases have on data, and prevent breaking changes and semantic issues before the code is deployed.
One of the common reasons for data incidents across our customers today is upstream changes, which for the data team typically means changes made by the engineering teams in the operational database, for example Postgres or MySQL. One reason upstream changes play such a significant role in data incidents is the way a typical organization is structured, with the engineering teams and the data teams separated. While the data stack (e.g., the data warehouse) is owned by the data team, upstream sources are typically managed by a different team, namely Engineering, with different processes and a different set of incentives. A typical engineering team is focused on launching product features and would normally report to the VP of Engineering. For that team, data is often a byproduct. Since the tech stack is fragmented, with Engineering and Data working in separate repositories, tools and CI/CD environments, software engineers are typically not even aware of who consumes the data downstream or of the potential downstream impact of each change they make.
At Foundational, we believe that the natural solution to these disparities is to give everyone who might be making a change that impacts data an understanding of its cross-tool impact, in this case downstream. In many cases this means understanding what is actually being impacted, who the owner is, and whether the change is safe to deploy. In most cases this is enough information for the developer, whether in Engineering or in Data, to become more informed about their change and take the necessary steps to deploy it safely. In the common case, this informs software engineers when they are making upstream changes that have a negative downstream impact, and empowers them to work with the relevant people on other teams to deploy safely and become more efficient at data development. Notably, this prevents cases where the engineering team changes a column, for example in a Postgres database, to support a new product feature, only to realize two weeks later that the change had broken an executive dashboard in Tableau that is critical for the company’s C-level executives.
Active Record is the ORM that ships with Ruby on Rails, and is therefore used by most engineering teams that build on Rails. It supports all the major databases, including MySQL, Postgres, Oracle and others.
Liquibase is a bit different: it’s an open-source framework for managing database schema changes, and engineering teams use it for migrations and versioned schema management. It also supports all the major databases, and is an extremely popular framework for managing schema changes in operational databases.
These two frameworks are commonly used by software engineering teams to manage the schemas and transactions in an operational database, such as Postgres. From this database, the data is typically replicated into a warehouse or a lakehouse through ETL tools such as Fivetran and Airbyte, or through streaming platforms such as Kafka, Flink and others.
Here’s a simple example of what Active Record code typically looks like:
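As a minimal sketch, assuming a hypothetical `orders` table, a typical Active Record migration could look like the one below. The migration is held in a string so the example runs without Rails installed. The `declared_schema` helper is a hypothetical regex scan that lists the table and column declarations; it is an illustration only, not how Foundational parses code.

```ruby
# A typical Active Record migration, kept as a string so this sketch runs
# without Rails installed. Table and column names are hypothetical.
MIGRATION = <<~'RUBY'
  class CreateOrders < ActiveRecord::Migration[7.1]
    def change
      create_table :orders do |t|
        t.references :customer, null: false, foreign_key: true
        t.decimal :total_amount, precision: 10, scale: 2
        t.string :status, null: false, default: "pending"
        t.timestamps
      end
    end
  end
RUBY

# Illustrative only: pull the table name and the column declarations
# (`t.<type> :<name>`) out of the migration text with regular expressions.
def declared_schema(source)
  table = source[/create_table\s+:(\w+)/, 1]
  columns = source.scan(/^\s*t\.(\w+)\s+:(\w+)/).map do |type, name|
    { name: name, type: type }
  end
  { table: table, columns: columns }
end

schema = declared_schema(MIGRATION)
puts schema[:table]
schema[:columns].each { |col| puts "  #{col[:name]} (#{col[:type]})" }
```

In a real Rails application the migration class would live in its own file under `db/migrate/` and be applied with the usual migration tooling.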
In Foundational, this block translates into an entity within the lineage graph:
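A minimal sketch of how such an entity could sit in a lineage graph, assuming the replication path described earlier (operational database, then warehouse, then dashboard). The node names, the `LINEAGE` table and the `downstream` helper are all hypothetical and do not represent Foundational’s data model.

```ruby
require "set"

# Hypothetical lineage graph: each key is an entity, and each value lists
# the entities that read from it directly.
LINEAGE = {
  "postgres.orders.status"            => ["warehouse.raw_orders.status"],
  "warehouse.raw_orders.status"       => ["warehouse.fct_orders.order_status"],
  "warehouse.fct_orders.order_status" => ["tableau.revenue_dashboard"],
  "tableau.revenue_dashboard"         => []
}.freeze

# Breadth-first walk that returns every entity downstream of `start`.
def downstream(graph, start)
  seen = Set.new
  queue = [start]
  until queue.empty?
    node = queue.shift
    graph.fetch(node, []).each do |child|
      # Set#add? returns nil when the child was already visited.
      queue << child if seen.add?(child)
    end
  end
  seen.to_a
end

puts downstream(LINEAGE, "postgres.orders.status")
```

Walking the graph from the Postgres column reaches the dashboard, which is the kind of downstream impact a pull request author would want to see before merging.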
With these new integrations, which join the previously announced SQLAlchemy ORM support, Foundational is uniquely positioned to help customers validate upstream pull requests and automatically track true end-to-end lineage, spanning from the operational database all the way through to the downstream Tableau dashboard. These integrations also demonstrate the power and efficiency of code-based data lineage: one framework can cover dozens of database types.
Our core capability at Foundational is extracting very accurate lineage from source code, which means that lineage is always up to date and is always available for newly opened pull requests, before they are deployed to production. This capability is what has allowed us to automate data contract enforcement by adding a mechanism that enforces specific rules every time code is changed. We refer to these rules as Policies, and they also define the course of action the system should take: for example, whether it should notify the owner or block the pull request from being merged. Custom rules also allow users to decide what fits best with their organization’s development processes. For example, several clients use our Policies page to require that any change made by Engineering to a Postgres schema be approved, in GitHub, by someone from the data engineering team. This streamlines communication while creating a formal contract between engineering and data teams.
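The Policies idea can be sketched as a rule that matches a change and names an action. This is a minimal illustration, assuming hypothetical `Change` and `Policy` structures and a hypothetical `:block` action name; it does not reflect Foundational’s actual configuration format.

```ruby
# Hypothetical shapes for a proposed change and a policy rule.
Change = Struct.new(:author_team, :source, :approved_by_teams, keyword_init: true)
Policy = Struct.new(:name, :applies_to, :satisfied_by, :action, keyword_init: true)

# Example rule from the text: Engineering changes to a Postgres schema need
# approval from the data engineering team; otherwise the pull request is blocked.
POSTGRES_APPROVAL = Policy.new(
  name: "postgres-schema-needs-data-approval",
  applies_to: ->(change) { change.author_team == "engineering" && change.source == "postgres" },
  satisfied_by: ->(change) { change.approved_by_teams.include?("data-engineering") },
  action: :block
)

# Returns one entry per policy that applies to the change and is not satisfied.
def evaluate(policies, change)
  policies
    .select { |policy| policy.applies_to.call(change) && !policy.satisfied_by.call(change) }
    .map { |policy| { policy: policy.name, action: policy.action } }
end

pending = Change.new(author_team: "engineering", source: "postgres", approved_by_teams: [])
approved = Change.new(author_team: "engineering", source: "postgres",
                      approved_by_teams: ["data-engineering"])

p evaluate([POSTGRES_APPROVAL], pending)
p evaluate([POSTGRES_APPROVAL], approved)
```

Keeping the matching condition, the satisfying condition and the action as separate fields means the same rule can be switched from blocking to notifying without touching its logic.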
If you would like to learn more about how Foundational can help you prevent data incidents, enforce data contracts, and track end-to-end lineage all the way from Postgres to dashboards, please reach out to schedule a demo, or request a free trial.