What are the validations I need to perform before transferring data from source system to target system and if my source system doesn't any have data?



  • What are the validations I need to perform

    1. Before transferring data from source system to target system?
    2. If my source system doesn't have data to transfer to DWH, how do I test in that scenario?


  • Q. Before transferring data from source system to target system?

    Before transferring data from source system to target system, why there is a need to validation ?
 So to answer it, the prime focus of validation here that the data which is going to be loaded into target system must be accurate, complete without data loss. So based on that data in the target system different solution and alternatives can be taken as business decision for growth. 


    -Verify Data Uniformity i.e. there must be exact match of the entities actual values wherever used. For ex. it might be within same schema or across schema (i.e. Consider BookId from Books table and BookId from Readers table must be exact matching within source system or across target system).


    -Find out Data Entity presence / absence / present with different name throughout source system and target system as per design model.

    -Verify Data is logically accurate, finding issues in quality not only from source system but also from target system by execution of test on target system and rechecking with the source system for defects. e.g. verify the accuracy on Non-numerical content (for ex. Emails / Pin codes / Phone) and domain based analysis (for ex. Gender[M/F], Age range[1-18,18-30, above 50], reference files any) on stored data.

    -Verify the data model design is as per business requirement for that firstly we need to look out for metadata defined for the target system and then move on to tables and fields level definitions for their accuracy. To be precise, we need to cross check Data Type, Data Length, Index and Metadata check across environments.

    -Sometimes there are some change taken in source system in the mid-way when the project is in progress and did not get implemented in target systems. Identify such change and take it into consideration in either next phase or stop and re-run based on the priority.

    -Verify one must validate Data Integrity constraints like Foreign Key, Primary Key reference, Unique, Default, etc.

    -Verify Data Completeness which can be done with help of record count (i.e. matching count of records) and column data profiling (i.e. filtering unique values, picking up minimum, maximum, average values, filtering out null values on important columns) by comparing source and target system.

    -Verify transformation logic on data of source system by preparing and executing some test on source data and identifying the mistakes comparing with expected values ahead from actual execution.

    -Verify uniqueness of data by identifying the unique values in column based on data model prepared. Verify by running test if they are unique in the system and then next move on to identify actual duplicates if exists.

    -Verify for Mandatory fields from source system and also must have default values associated with field.

    -Verify that you document tests that you are working with data from specific time period. For e.g. There might be an instance where every bi-weekly there is a discount change logic on different products and you may end up with wrong values. Consider in your planning such instances.

    -Firstly filter out NULL data columns and validate if null is allowed and secondly, if found important we should make sure that they cannot be null in future as it might be important for making business decision.

    -Also it is extremely important to consider Range check on columns specifically for e.g. consider For Life Insurance policy Age must not cross 60 / Purchase quantity must not exceed above 2000 as per business requirement.

    -Document if any business requirements to be met i.e. From date must be greater than to date, Age criteria for Driving license, Left date must be null on Active Employees, etc.

    -Please ensure about the Aggregate functions used are purely documented with examples and verify the calculated values stored in target system must give same aggregate values based on calculations.

    -Verify data truncation and rounding logic concern with business needs.

    -Verify encoding values from the source system should be rightly populated into target system. 
Considering above points before transferring data from source system to target system will really helpful in successful data transfer and optimum data in Target system.

    Q. If my source system doesn't have data to transfer to DWH, how do I test in that scenario?

    In such case, we can do it in 2 ways,

    1. We can contact to actual concern Database Architecture person who have knowledge related to tables and entity relationship between them.

    2. Else you need to understand the entity relationship and database architect of the source system and create / add data by your own.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 1
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2