By: Katie Stibbon
A few years ago, a friend of mine was shocked to get a series of letters in the post addressed to her daughter, offering her everything from car insurance to income protection from a well-known insurance company.
This was a bit of a surprise, because her daughter was only eight months old at the time!
It seems that in taking up an investment plan from that company, she had opened the door to a flood of sales and marketing literature for her baby daughter, all age-inappropriate, because the investment product she had taken out was, in fact, a Stakeholder pension. With the launch of the Stakeholder, the pension market had been opened for the first time to the under-16s.
When Data Misfires
Brands use data from all over their business, combined with external sources, to build their picture of their customers and potential customers. This mass of data is vast, unstructured, and ever-growing and changing. It is commonly known as Big Data.
The accuracy of our data, and its relevant use, is important. Getting invitations for inappropriate insurance products is annoying, but inaccurate use of information can have a far bigger impact in our day-to-day lives.
Consider a company which continues to send credit card statements or new cards to your old address after you move, leaving you open to fraud. Or a bank sending your records to your ex-partner after a split, because it doesn’t hold separate mailing addresses across all its systems. Even something as seemingly harmless as misdirected marketing has an impact, though: every time my friend tells this story, that company’s reputation takes another small hit.
Using Good Quality, Reliable Data
In the case of my friend’s insurance company, was the marketing system tested? Was the data used for testing sufficient? Did they rely only on live data for testing, which would not have yet included anyone so young? If the product and data parameters had been better understood, and data sufficient to test the boundaries had been present and used in testing, could this issue have been found and prevented?
The value of good quality test data of sufficient quantity should not be underestimated. Not giving data management the time and attention it deserves can slow or even halt your testing, generate invalid bugs that distract your testers and developers alike, and ultimately result in project delays, cost increases and reduced quality.
As testing becomes more automated – including the use of AI in testing – we become more and more reliant on volumes of good quality, reliable data on demand. As organizations embrace DevOps, the continuous delivery pipeline requires continuous testing.
Effective data management, that is, creating reusable, reliable data for all automated tests, every time, is one of the most critical aspects of continuous testing. In the drive for speed to market and improved quality with reduced overhead, test data can be the linchpin without which all these plans fall apart.
The Challenges of Live Data
High-quality, timely test data is essential, yet test data selection is a discipline that often sees little focus. It is frequently an afterthought, assumed and then forgotten until the tests start failing due to “data issues.”
In the past, testers, developers and project managers often assumed that the business would be able to supply “live data” for testing, but with the increasingly strict rules imposed by the Data Protection Act 1998 and its successors, and more recently the GDPR, the use of live data for testing has become steadily more problematic.
Simple obfuscation can be insufficient to protect an individual’s identity and, if misapplied, may render the data useless for specific tests or damage its referential integrity across databases.
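One way to mask data without breaking integrity is deterministic pseudonymisation: the same input always produces the same masked token, so joins between tables, and even between databases, still line up. The sketch below assumes an invented schema and salt; a real masking tool would also handle formats, uniqueness and key rotation.

```python
import hashlib

# Illustrative per-project secret; in practice this must be protected,
# since anyone holding it can re-derive tokens from guessed inputs.
SALT = b"per-project-secret"

def pseudonymise(value: str) -> str:
    """Replace a sensitive value with a stable, non-reversible token."""
    digest = hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()
    return "CUST-" + digest[:10]

# Two tables referencing the same customer stay joinable after masking.
accounts = [{"customer_id": "jane.doe@example.com", "balance": 120}]
claims   = [{"customer_id": "jane.doe@example.com", "amount": 45}]

masked_accounts = [{**r, "customer_id": pseudonymise(r["customer_id"])} for r in accounts]
masked_claims   = [{**r, "customer_id": pseudonymise(r["customer_id"])} for r in claims]

assert masked_accounts[0]["customer_id"] == masked_claims[0]["customer_id"]
```

Random masking, by contrast, would give the two tables different tokens for the same customer, which is exactly the cross-database damage described above.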
While taking extracts from live is the quickest method, and often yields a good range of the most relevant data, this presents challenges beyond data protection considerations. Even where the “cut” of data is a complete set of live data, this may not be sufficient for testing needs.
Combining Live & Synthetic Data for Better Results
As products and systems change and evolve, new variations of data will become common, or will be required to test specific changes, which do not yet exist in live (like the infant saving for retirement). Testing complex changes may require more of a specific combination of data than is readily available. A robust test data strategy should therefore allow for both: the management, maintenance and protection of live data, and the generation of suitable additional (synthetic) test data as required.
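Synthetic data is particularly useful for boundary cases that a live extract cannot yet contain, such as that eight-month-old pension holder. The sketch below generates customers at assumed age boundaries; the field names, product code and boundary values are illustrative, not a real schema.

```python
import datetime
import random

def make_customer(age_years: float) -> dict:
    """Generate one synthetic customer of the given age (illustrative fields)."""
    today = datetime.date.today()
    dob = today - datetime.timedelta(days=int(age_years * 365.25))
    return {
        "customer_id": f"SYN-{random.randint(100000, 999999)}",
        "date_of_birth": dob.isoformat(),
        "product": "stakeholder_pension",
    }

# Boundary ages a live extract is unlikely to cover yet, e.g. around the
# old under-16 cut-off and a notional upper product limit.
edge_cases = [make_customer(a) for a in (0.5, 15.99, 16, 17.99, 18, 75)]
```

Records like these let you exercise age validation and marketing-eligibility rules before any real infant customer ever appears in production.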
Step 1: Get to Know Your Data Sources
As a first step you need to understand the nature of your data: the complexity, where it resides, its sensitivity and its availability. Depending on the complexity and volume of the additional data required, and the infrastructure in which the data resides, a variety of approaches can be considered.
Step 2: Set Your Test Data Management Strategy
You may consider scheduling additional data extract runs to update, replace or augment the existing data set from live. You may decide to regularly age existing data, applying baselines before adding new data or making changes so that the original state can be restored on demand. You may require synthetic data, created specifically to meet testing requirements, or even a blended approach combining all of these.
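Aging and baselining can be sketched in a few lines: shift every date field forward so records stay valid relative to today, but snapshot a baseline first so the original state can be restored on demand. The record layout here is an illustrative assumption.

```python
import copy
import datetime

def age_records(records, shift: datetime.timedelta):
    """Return a copy of the records with every date field moved by `shift`."""
    return [
        {k: v + shift if isinstance(v, datetime.date) else v for k, v in rec.items()}
        for rec in records
    ]

policies = [{"policy_id": "P1",
             "start": datetime.date(2021, 1, 1),
             "renewal": datetime.date(2022, 1, 1)}]

baseline = copy.deepcopy(policies)                              # snapshot first
policies = age_records(policies, datetime.timedelta(days=365))  # age by a year
# ...run tests against the aged data...
policies = copy.deepcopy(baseline)                              # restore on demand
```

The same snapshot-and-restore discipline is what lets a blended strategy rerun the full automated suite from a known state, run after run.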
Step 3: Consider a Test Data Management Tool
If you are using live data and require data to be synchronised across multiple tables or even multiple databases, you should almost certainly consider a Test Data Management (TDM) tool. An off-the-shelf TDM tool can facilitate the automatic desensitisation, synchronisation, maintenance and even on-demand generation of suitable synthetic test data across all databases, to ensure that all testing efforts are supplied with the right data at the right time.
Step 4: Treat Your Test Data as an Investment
Whether you purchase a tool, build an in-house technical solution or adopt a manual process, your test data should be given the same consideration and focus as any other significant investment or project delivery. It is a genuine investment which can reap real rewards in quality and, in the long term, free your change programme from test data restrictions.
Protect Your Organization’s Reputation: Get Your Test Data Right
The importance of test data management reaches beyond its role in a comprehensive test strategy. Inaccurate or insufficient data is often the root cause of missed defects that can seriously impact your organization’s customers and have a devastating impact on your company’s reputation. Qualitest’s test data management specialists will employ advanced data management techniques to ensure the quality of your test data. Get your test data right; don’t risk your organization’s reputation.