Create Data Unit Tests

Introduction

Data Unit Tests are blocks of SQL code that run at some point in a Transform (usually at the end), creating records in another table, which indicates if the test passed or failed. Lots of types of test are available, Kleene currently has three types built in - Null in field, Uniqueness and Freshness.

They are useful to reliably monitor the health of data in the pipeline.

The Add unit tests option only appears above a Transform that creates or replaces a table once it has been saved.

Create a Data Unit Test

Click the Add unit tests in the SQL Console menu to open the window, where three tests are available to add to the current transform.

1378

Add Unit Tests

Create the Unit Test log table

Before you create any tests, you'll need to have somewhere to store the results. You can do this the first time the UNIT TEST button is used, by selecting CREATE REQUIRED PROCEDURES, this will generate the stored procedures that are used as unit tests and a table in your warehouse called UNIT_TEST_LOG to store the result. This table can then be queried to check the results of the tests.

606

CREATE REQUIRED PROCEDURES

If you get an error saying the object already exists but you do not have access please ask the procedure creator or ACCOUNTADMINs/superusers of your warehouse to grant usage permissions to your warehouse role:

Snowflake:

GRANT USAGE PRIVILEGES ON ALL PROCEDURES IN SCHEMA PROD.KLEENE TO ROLE <warehouse role name>; 

Redshift:

GRANT EXECUTE PRIVILEGES ON ALL PROCEDURES IN SCHEMA KLEENE TO ROLE <warehouse role name>; 

Null in field

Checks none of the entires in a specified column have a NULL entry.

  1. Select the table to test, then
  2. Select the column to check and
  3. Click ADD UNIT TEST.

Repeat this process for as many columns as needed.

1102

Null in field

Uniqueness

Checks the distinct number of entries in a specified field equals the table row count

  1. Select the table to test, then
  2. Select the column to check and
  3. Click ADD UNIT TEST.

Repeat this process for as many columns as needed.

1108

Uniqueness

Freshness

Checks the maximum entry of a specified DATETIME or TIMESTAMP field is within a specified time interval

  1. Select the table to test, then
  2. Select the column to check (should be DATE/TIME), then
  3. Select the period and time frame eg 3 hours
  4. Click ADD UNIT TEST.

Repeat this process for as many columns or time periods as needed.

1108

Freshness

SQL Code

When you close the window, you'll notice one or more lines of SQL have been added to the end of the transform.

If you need to delete a test, simple remove the relevant line of SQL.