Unit Testing Transforms

Because the transform layer holds a great deal of the logic in a data pipeline, it is important to maintain this layer and monitor its health closely. One way to monitor the health of a pipeline is to introduce unit testing.

Read the Kleene guide to unit testing transforms here.

Unit tests in the transform layer record their results in a separate unit test log, so both the history and the most recent test results of a transform can be monitored from a table in the data warehouse. Dashboards can then be built on top of this log table and set to refresh after the data transformation layer has run.


A PowerBI dashboard created on top of a unit test log table.
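
To make the idea of a unit test log concrete, here is a minimal sketch of an append-only log table and the kind of "latest result per test" query a dashboard might run against it. It uses Python's built-in sqlite3 module as a stand-in for the data warehouse. The table name unit_test_log, its columns, the transform name orders_clean and the helper functions are all hypothetical and are not Kleene's actual schema or API.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical log table: name and columns are illustrative assumptions.
CREATE_LOG = """
CREATE TABLE IF NOT EXISTS unit_test_log (
    transform_name TEXT NOT NULL,
    test_name      TEXT NOT NULL,
    passed         INTEGER NOT NULL,  -- 1 = pass, 0 = fail
    run_at         TEXT NOT NULL      -- ISO 8601 UTC timestamp
)
"""

def log_result(conn, transform_name, test_name, passed, run_at=None):
    """Append one test outcome; rows are never updated, so history is kept."""
    run_at = run_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO unit_test_log VALUES (?, ?, ?, ?)",
        (transform_name, test_name, int(passed), run_at),
    )

def latest_results(conn):
    """Return the most recent outcome for each (transform, test) pair."""
    return conn.execute(
        """
        SELECT l.transform_name, l.test_name, l.passed
        FROM unit_test_log AS l
        JOIN (
            SELECT transform_name, test_name, MAX(run_at) AS run_at
            FROM unit_test_log
            GROUP BY transform_name, test_name
        ) AS m
          ON l.transform_name = m.transform_name
         AND l.test_name = m.test_name
         AND l.run_at = m.run_at
        ORDER BY l.transform_name, l.test_name
        """
    ).fetchall()

# In-memory database standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(CREATE_LOG)
log_result(conn, "orders_clean", "pk_unique", False, "2024-01-01T06:00:00+00:00")
log_result(conn, "orders_clean", "pk_unique", True, "2024-01-02T06:00:00+00:00")
print(latest_results(conn))
```

Because every run appends a row instead of overwriting the previous one, the same table can serve both a history view and a current status view.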

General unit tests include primary key uniqueness checks, data extraction recency checks and static data checks. It is also recommended to create unit tests specific to the use case, so that potential problems are flagged by the system and can be fixed quickly.
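
The three general checks named above could be sketched as follows. This is an illustration only, again using sqlite3 in place of the warehouse. The function names, the orders table and its columns are hypothetical, and the static data check is shown under one possible reading: that every value in a column belongs to a fixed, expected set.

```python
import sqlite3
from datetime import datetime, timedelta

# Table and column names are interpolated directly into the SQL, so this
# sketch assumes they come from trusted configuration, not user input.

def pk_is_unique(conn, table, key):
    """Pass when no key value appears more than once."""
    dupes = conn.execute(
        f"SELECT COUNT(*) FROM "
        f"(SELECT {key} FROM {table} GROUP BY {key} HAVING COUNT(*) > 1)"
    ).fetchone()[0]
    return dupes == 0

def is_recent(conn, table, ts_column, now, max_age):
    """Pass when the newest extracted row is no older than max_age."""
    newest = conn.execute(f"SELECT MAX({ts_column}) FROM {table}").fetchone()[0]
    return newest is not None and now - datetime.fromisoformat(newest) <= max_age

def static_values_ok(conn, table, column, allowed):
    """Pass when every value in the column is in the expected fixed set."""
    found = {row[0] for row in conn.execute(f"SELECT DISTINCT {column} FROM {table}")}
    return found <= set(allowed)

# Hypothetical sample data containing a duplicated primary key (order_id 2).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, status TEXT, extracted_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "paid", "2024-01-02T05:00:00"),
        (2, "refunded", "2024-01-02T05:30:00"),
        (2, "paid", "2024-01-02T05:30:00"),
    ],
)

now = datetime(2024, 1, 2, 6, 0, 0)  # fixed clock so the example is repeatable
results = {
    "pk_unique": pk_is_unique(conn, "orders", "order_id"),
    "recency": is_recent(conn, "orders", "extracted_at", now, timedelta(hours=24)),
    "static_status": static_values_ok(conn, "orders", "status", {"paid", "refunded", "pending"}),
}
print(results)
```

Each function returns a plain pass or fail value, which keeps the checks easy to record in a log table. A use-case-specific test would follow the same shape with its own query.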
