Considerations When Planning To Test Data Migration

Considerations When Planning To Test Data Migration

It is important to note that testing a data migration should start well in advance of the actual data being migrated. It is quite typical that one of the most important commodities that a business controls, owns, or maintains is its data and therefore any data migration should be considered high risk and should be subjected to significant verification and validation efforts. A business may choose to reduce the risk level against the data migration, but it is always prudent to start off with data migration as a high-priority task to ensure it is successful.

By: Paul Darby

It is important to note that testing a data migration should start well in advance of the actual data being migrated.   It is quite typical that one of the most important commodities that a business controls, owns, or maintains is its data and therefore any data migration should be considered high risk and should be subjected to significant verification and validation efforts.  A business may choose to reduce the risk level against the data migration, but it is always prudent to start off with data migration as a high-priority task to ensure it is successful.

A common term for exercising a data migration task is to perform an ‘Extract, Transform and Load’ (ETL) function which sees extracted data from various sources being fixed and then loaded into the target data warehouse.  This document offers some of the keys consideration points to a test team so that effective test planning and strategizing can be achieved for an ETL project.

Strategy

When strategizing the data migration testing, the following phases of testing need to be considered:

  • Pre-migration testing
  • Post-migration testing
  • Acceptance (Not covered in this document)
  • Production Migration (Not covered in this document)

Pre-Migration testing

Before any migration has taken place, the following testing actions should take place:

  • Ensure that the scope of the data is understood: the extraction process, how many records, tables, relationships, etc. exist.  For this, it is important to get visibility of an entity relationship diagram, data dictionary or similar for the current system that is being migrated from.
  • Ensure that the destination as well as the load process is understood.  Again, for the former at least, this should include an entity relationship diagram, data dictionary, or something similar.
  • Ensure that the data scheme is known including mandatory fields, field names, field types, data types etc. for both the original data source and the destination system.
  • Understand the data cleansing requirements (See section 1.2 for more detail)
  • Understand any interfacing with other systems or secondary suppliers of data into the system

Much of this information may come from a formal design review and detailed requirements or a migration specification; however, if this review does not take place or the specification does not exist, it should still remain part of the test team’s planning process to source this information.

These steps will provide an understanding of the mapping of the data from its original location to the new location and it will highlight where there are inconsistencies.  These inconsistencies may require a design change or a business process change to enable the data to be usable and present correctly to the user.

Once this is understood, the following can take place:

  • Ensure the mapping to user interface is correct
  • Ensure the mapping to business process is correct
  • Test tool configuration and script generation using field details
  • Identify data subset (sampling) details for testing (Note: it is often impractical to test 100% of a data set, therefore it is imperative that the scope of the test data is fully understood along with any risks that are presented by using a subset)
  • Ensure all business cases, user stories, or use cases are understood

Data Cleansing

It is vitally important that data cleansing is performed on a system that houses and maintains data.  This is easiest done by virtue of data validation during input through ‘Field Validation’ and ‘Error Checking’ as this is a cheaper and far more efficient method.  Depending on how stringent previous methods of system development and user input have been, it is very likely that incorrect data will exist within the migrated data set.  The following data cleansing tasks should be planned within a data migration exercise:

  • Understand the error types: these may include blank fields, too-long data lengths, or bad characters, among many others
  • Identify method of checking the data: either interrogation through database using SQL commands or the use of tools such as DQGlobal, Clean & Match or DTM Data Scrubber, though many other tools are also available
  • Understand the data linkages: changing the data in one table may damage the data in another linked table if the same changes are not made across the entire system.

Once this level of understanding has been achieved, the following tasks should be carried out:

  • Interrogation of data in order to find the potential error types previously ascertained
  • Fix the data (Note: data must be fixed consistently across the system)
  • Check record counts

The list above shows things that should take place during the pre-migration and migration phases and may be fulfilled by the development team and the tester working together.  The migrated data should be the result of the work of all parties within the development lifecycle.

See the ‘Post-Migration Testing’ section of this document to understand the data cleansing tasks once the data has been migrated.

Post-Migration Testing

Following the migration, black box testing against the business cases, use cases, or user stories may be carried out and it is possible that the scripts produced for this work will be reusable for Acceptance testing later on.

The tester should fulfill these tests against the subset of data as defined during the pre-migration testing phase.  It is recommended that the tester builds test scripts that provide exact detail of data usage either via referenced data sheets or within the text of the test itself.
In addition to testing the known business flows, the testers should carry out the following testing, including negative testing approaches, which primarily ensure that data cleansing is being carried out at run time within the system:

  • Input bad data: attempt to violate the validation rules including the use of boundary value analysis, equivalence partitioning and error guessing. The validation at the user interface level should be in accordance with the validation at database level.  If no validation has been enforced or the validation between database and user interface do not match then this should be treated as a defect.
  • Bypass mandatory data: attempt to navigate to further stages of a user case within the system prior to all data requirements being fulfilled.
  • Drag and drop:  dragging and dropping data and even images may force bad error handling in the application, which may lead to incomplete records.
  • Checking data locks:  Where data is being written, it should not be possible for multiple user to access the same new record within the database.  Typically this level of validation is managed by the underlying database if it is provided by a third party so the risk may be minimal but this testing should be carried out where possible.
  • Conversions testing: ensure that the data when recalled from the database(s) is displayed accordingly to the user.  This will most likely be fulfilled within the coverage of business cases, use cases and user stories; however, it is worth ensuring that all data is utilized to be certain that it will be called and displayed accordingly.

Outside of the actual data migration tasks there are a number of non-functional test approaches that should be considered when planning the testing.  These are as follows:

  • Security Testing: is access to the data restricted accordingly?
  • Performance Testing: can the data be utilized and accessed in accordance with relevant KPI’s?
  • Usability of data: is the user able to utilize the data effectively? Do they have to input the same data multiple times?

Data migration is a technique which, due to its complexity, necessitates specific, careful testing at all stages of the process. Not doing so can result in improper migration (incomplete or corrupt data) or poor user experience, which will be damaging to both the system’s reputation and its inherent usability as well.