Exception Reporting in The Mastering Process

How to Create an Exception Report for Entity Build-Out

This guide outlines the process of creating an exception report during an entity build-out, focusing on identifying and resolving data discrepancies, including handling duplicates between systems.

Prerequisites

Before starting, ensure you have:

Access to the relevant data sources (e.g., CRM, financial systems, analytics platforms).
A clear understanding of the entities and sub-entities involved in your data model.

1. Modeling Section

This work should be done in your active project documentation as a working document key to establish fast effective data modelling.

An example Exception Reporting output doc


1.1 Identify the modeling section in your project documentation or workspace.

This section is key to the mastering process, which involves connecting sub-entities from various systems to create a unified entity.

1.2 Within the modeling area, focus on:

Entity Comparisons: This is where data discrepancies, such as duplicates, will be checked.
Duplicate Analysis: Tools or code that help you identify and manage duplicate records between systems.

2. Running Counts and Establishing Primary Keys

After identifying the modeling area, the next step is to compare records from two systems and run counts to identify any discrepancies.

2.1 Set up Tables for Comparison

Identify the tables you need to compare. For example:
A contacts table from a CRM system.
A user table from a financial system.
Establish primary key relationships between these tables. For example:
Compare the primary key contacts.id in the CRM system with user_id in the financial system.

2.2 Run Record Counts

Run counts to determine how many records are found in both tables:

Records in System A only: These records exist in one system but not the other.
Records in System B only: These records exist in the second system but not the first.
Common records: These records are present in both systems.
Example:

34 records exist only in the CRM.
18 records exist only in the financial system.
The majority of records exist in both systems, though some discrepancies exist.

3. Identifying and Resolving Data Discrepancies

3.1 Addressing Duplicate Records

Duplicate records often occur between systems. The first task is to:

Check for duplicates: Use any duplicate checking tools or techniques available to detect duplicates.
Review discrepancies: Investigate records that exist only in one system (e.g., 34 records in the CRM but not in the financial system) and understand why.

3.2 Investigating Anomalies

For records that exist only in one system, investigate potential reasons:

Old records: These records may be outdated and safe to discard.
Test records: If the record names include test or other indicators, they may have been used for system setup and can often be ignored.
Always confirm with the customer or project stakeholders before discarding records.

4. Establishing a Master Source of Truth

Once you’ve identified the discrepancies, you’ll need to determine which system will serve as the master source of truth for the entity you are building.

4.1 Define the Master Source

Decide which system will be considered the source of truth for the entity. For example:

If the CRM system is chosen as the master, only the records present in that system will be considered authoritative, and any records in other systems that are not in the CRM may be discarded.

4.2 Populate the Entity

Once the master source is defined:

Merge the data: Bring in all unique records from both systems to create the entity.
Handle missing data: For records present in only one system, note that some fields may remain unpopulated in the final entity.


4.3 Document the Decision Logic

Document the reasoning behind the decision to choose a master source. Ensure stakeholders are informed about the choice and understand its implications, especially regarding any records that may be discarded or ignored.

5. Mastering Logic and Deduplication

The next step involves defining the logic for deduplication and mastering. This ensures that the entity is clean and reliable, with all duplicate records handled appropriately.

5.1 Define Deduplication Rules

Create a clear hierarchy or set of rules to:

Deduplicate records: Identify how duplicate records will be managed. For example, prioritize records from the master system.
Create a system of record: Ensure there is a clear system of record for each entity that consolidates data from multiple systems.


5.2 Store Notes and Document the Process

As you work through the data, store any notes or observations, such as:

Record counts: Document how many records were found in each system and how they were handled.
Anomalies: Record any unusual data, such as missing or inconsistent fields, and how they were resolved.

6. Final Steps and Communication

6.1 Review and Finalize the Entity

Once the deduplication and mastering process is complete:

Ensure all records are correctly merged: The final entity should include all necessary data from the different systems, without duplicates.
Verify the data: Ensure that all the records in the entity are accurate and consistent.

6.2 Communicate with Stakeholders

Share your findings: Present the final entity and explain any decisions made during the deduplication and mastering process, such as how records were handled or discarded.
Monitor for issues: Be prepared to update the entity if further discrepancies arise or as new data is introduced.

Conclusion

Creating an exception report for an entity build-out is a critical step in the data mastering process. By comparing records between systems, addressing duplicates, defining a master source of truth, and applying deduplication rules, you can ensure that your final entity is accurate and reliable.