
It may feel like your team is doing a lot of testing, but how do you know you are doing the right testing, the needed testing, and not wasting time on areas that are less important or add little value to your efforts?
Aviram Shotten joins us to have a conversation about optimizing QA efforts and putting those hard-earned QA dollars and cents to best use.
Transcript:
Michael Larsen:
Hello everybody. And welcome to The Testing Show. Glad to have you with us. We are excited today; we have a special guest here to talk about some neat stuff, maybe a different way you can do your testing. So, for the first time on the show, we'd like to welcome Aviram Shotten. Did I pronounce that right?
Aviram Shotten:
Yes. Very well. Thank you for having me here.
Michael Larsen:
Yes! Glad to have you with us. You know, me, I’m Michael Larsen. I’m the guy that twiddles the bits and hopefully makes everything sound good. And we have Matt Heusser, who is our normal MC. So Matt, I am going to turn the time over to you.
Matthew Heusser:
Thanks, Michael. Appreciate it. Avi, great to have you here. Let's get started with QA optimization. If I hear that, what I hear as a business owner is "we're going to help you go faster, cheaper, with less risk." But what does that mean to you?
Aviram Shotten:
For a quality professional, every time someone talks to me about challenges, it all sounds relatively the same, and the mitigations to those challenges are also relatively the same. At Qualitest, we recently conducted a survey, and we found that 97% of respondents, an astonishing number, identified the same 12 areas in quality engineering as their usual challenges. It was quite amazing; it looked like everyone had collaborated and come up with the same challenges. Twelve in total. What we found when we continued the research, the surveys, and the interviews is that there was a gap in the impact of each challenge. Take a common, high-level one: how to implement test automation properly. Some respondents said test automation was their biggest problem, but when we dug a little deeper, we found that was only the perception. In reality, their problem was somewhere else, for example around availability of infrastructure, and infrastructure covers both environment availability and test data. They just can't run. Who cares about test automation if you can't execute test cases at scale? For us, QA optimization is about making sure we identify what your pain points are and which of them we should tackle first in order to open the bottlenecks that stop us from going full throttle. I hope that makes sense.
Matthew Heusser:
Yeah. You've phrased that in terms of the theory of constraints; we could use other terms like low-hanging fruit. I find with most clients that if it really were low-hanging fruit, they'd have done it already. Usually it's "we need to change the way we think about doing testing, or delivery, or team structure," those sorts of change management problems, or "how do we create test environments?" But it's your program, so tell me more about it.
Aviram Shotten:
Again, I think there's a gap between perception and reality. Don't get me wrong, execution and delivery are very difficult. You need to be detail oriented, you need to be very good at what you do, all of that is correct. But when it comes to those bottlenecks, the difference between perception and reality is very wide. That's the first thing I would say. The second is: how do you bring everyone onto the same page in recognizing what needs to be handled in order to solve the problem? For example, take data generation. It's not just testing; you'll need support from others. And if you want infrastructure automation, that infrastructure is going to serve everyone, be it development, be it testing, be it the staging environments, and if you go all in, it can serve production as well. So bringing everyone together to recognize what the problems are, getting to a single source of truth, that's the key thing for QA optimization. It's evidence based; that's what we do at the end of the day. It's very deterministic. Those 12 areas, or let's extend them a little, say 20 areas, are what stop us from delivering testing as fast and as well as we want. Those 20 areas should be prioritized, then roadmapped, and then executed upon. That's the core idea of QA optimization. Now, we are very opinionated. We know exactly how we want test automation to be deployed in a digital product, in a desktop application, or in a command-and-control, real-time system. We are very opinionated, but what we are doing as part of QA optimization is making sure that the solution you are being served is the right one for you, and when we say "you," that means our customer. We build a very compelling story, and then a very compelling business case, to show them that QA optimization, when applied, will give them the right kind of value in terms of testing by making the right kind of prioritized investment. We're making testing work for that organization. So it's a well-defined process, a framework for people to work in, to make sure we're not just doing a standard, by-the-book deployment of solutions; we're doing a contextualized deployment for our customer, and every customer is slightly different from the others.
Matthew Heusser:
You mentioned these 12; I'd like to hear some of them. Do you have an assessment that you do to figure out which of the 12 are relevant and which ones are the critical success factors right now?
Aviram Shotten:
Great question. We have an assessment method we developed, and on top of that method we added a layer of almost science. That science is achieved by using machine learning on the customer's data, test data. Test data can be defects identified in production and defects found in testing phases, test execution records, development system logs, source control logs, continuous integration logs. We take all of this data and we're able to derive simple metrics that answer questions like "are we over-testing?" I would say that from the assessments we did, we found that most organizations are testing between six and eight times more than they need to. That means they're overspending on test execution, whether in licenses, in manpower, or in hours, meaning development people are waiting for feedback. And we're not able to see that unless we optimize the way we're working, and we can do that using machine learning. Let's think about the day after the assessment. Let's say our customer is the head of development and he really wants to improve the way they deliver software, and we come with this number: look, we think you're over-testing by multiples. For that organization to accept there is a need for change, and to be smarter around risk-based testing, we had better have something very conclusive, almost a smoking gun. By applying machine learning, we're changing the traditional consulting model of senior quality engineering individuals coming in and basically suggesting ways to move forward. We come with a paper that says, "This is the evidence. In reality, 81 of your test cases have never failed. What's the use of them, and why do you need to execute them every time?" That is a real number drawn from one of the assessments. So yes, we have very skilled engineers, yes, we have very senior consultants, but when they come with a conclusive report on how an organization should transform into a smarter one, or justify investment in test automation, we do that because we also have the data to support, prove, or disprove our findings. And that makes it very hard to argue with.
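To make that "never failed" signal concrete, here is a minimal sketch of how such an over-testing metric could be computed from exported test execution records. The CSV file name and its `test_id`/`result` columns are hypothetical assumptions for illustration; this is a sketch of the idea, not Qualitest's actual tooling.

```python
# Sketch: what share of test cases have never failed across their history?
# Assumes a hypothetical export "test_execution_records.csv" with one row
# per test run and columns: test_id, result ("pass" or "fail").
import csv
from collections import defaultdict

runs_per_test = defaultdict(int)
fails_per_test = defaultdict(int)

with open("test_execution_records.csv", newline="") as f:
    for row in csv.DictReader(f):
        runs_per_test[row["test_id"]] += 1
        if row["result"].strip().lower() == "fail":
            fails_per_test[row["test_id"]] += 1

never_failed = [t for t in runs_per_test if fails_per_test[t] == 0]
share = len(never_failed) / len(runs_per_test)

print(f"{len(runs_per_test)} test cases analyzed; "
      f"{len(never_failed)} ({share:.0%}) have never failed -- "
      f"candidates for risk-based pruning or less frequent execution.")
```

A never-failing test isn't automatically useless, it may guard a critical path, but a large share of them is exactly the kind of smoking-gun number the report describes.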
Matthew Heusser:
So to do that, we would want something like JIRA that ties the defect to a test case, and we could export that data and run whatever analysis we want on it. Most of the larger enterprises we're talking about usually have something like this. I'm not always seeing a direct correlation of tests to defects that way, but do you find that most of the clients you've worked with have that data available?
Aviram Shotten:
I think there's great variance in terms of quality of data. The beautiful thing about machine learning is that you can only understand in retrospect what logic it created. Machine learning is very powerful at finding correlations, finding patterns, and that's what we're trying to find. It's very rare that we have traceability from development all the way down to a production defect. Usually we identify trends; we identify that certain changes over time created a certain quality impact on the system under test, for example a change made in early January after which we saw that the production environment was less stable in the month that followed. That's where machine learning never fails, because once you have the data standardized and the timeline arranged, machine learning can thrive. But to your point, the quality of data varies. Some organizations are very disciplined; some organizations are all over the place, and when it's all over the place, it just takes a lot of time to standardize the data.
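The timeline pattern described here, changes in one period showing up as instability later, can be sketched as a simple lagged correlation. The weekly numbers below are invented placeholders; in practice they would be aggregated from source control and incident or defect records.

```python
# Sketch: does change volume in one week correlate with production incidents
# a few weeks later? The two weekly series are invented placeholder data.
import numpy as np

changes_per_week   = [12, 30, 8, 25, 40, 10, 15, 35, 5, 28]   # e.g. merged changes
incidents_per_week = [2, 3, 6, 2, 5, 9, 3, 4, 8, 1]           # e.g. production tickets

def lagged_correlation(changes, incidents, lag_weeks):
    """Pearson correlation between change volume and incidents lag_weeks later."""
    x = changes[:len(changes) - lag_weeks] if lag_weeks else changes
    y = incidents[lag_weeks:]
    return float(np.corrcoef(x, y)[0, 1])

for lag in range(4):
    r = lagged_correlation(changes_per_week, incidents_per_week, lag)
    print(f"lag of {lag} week(s): correlation = {r:+.2f}")
```

A correlation on its own isn't proof of causation, of course, but it's the kind of evidence that makes the January-change-to-February-instability story concrete.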
Matthew Heusser:
And I'm looking at the list of 12 here. We've got automation, Agile, performance, metrics, environments, data, suppliers, governance, end-to-end, requirements, tooling, security, usability. The specific hard example with the machine learning behind it, I think, is more about coverage and effectiveness; it kind of falls under metrics, because otherwise you don't have the data in the first place to know. A lot of this assessment is people talking to people, analysis, and looking at artifacts. How long does it take to do the assessment?
Aviram Shotten:
What used to take us six to eight weeks, we're now able to deliver within three to four weeks. We do want to get your data in advance, but the way we conduct it is that we're not letting go of the traditional surveys, the traditional interviews, the traditional evidence collection processes. Whatever we come up with on what you might call the manual side of the process, we make sure there's a way to prove or disprove that conclusion with the data. When we say to a customer, for example, that they need to invest in a better defect management process, it's because we have three or four smoking guns to show them that their existing discipline and processes around defect management have failed them and caused delays or overspending. Once you have that, people's beliefs can change. But if we just come and say, "guys, look, your process is outdated," someone might stand up and say, "I don't think it's outdated, persuade me otherwise," and then it's just a "he said, she said" conversation that will go nowhere. So we're doing this much quicker now. We've been recognized by analysts as the only company in the world applying such an advanced analytics approach to consultancy, and we definitely see the fruits of it.
Matthew Heusser:
That makes a lot of sense. So when you're doing this analysis, do you have a playbook along the lines of "oh, if the problem is test data..."? It's going to depend on context and size and the type of data, of course, but do you have a handful of solutions you typically pull out? The problem is structure, so we inject a center of excellence; the problem is portfolio, so we inject a managed test solution. Do you have hip-pocket solutions like that you typically use?
Aviram Shotten:
Yes, definitely. I think this is exactly what you would expect from a company that all it does is testing. We don't want to reinvent the wheel, not from a risk perspective, not for us and not for our customers. Once we've identified the level of urgency and the level of impact of a certain challenge, we go back to basics. We pull solutions that have been successful in the past and see whether they are fit for purpose here. For us to embark on a new development journey is usually unjustified, because the landscape of technology changes not in days but in months, and someone in Qualitest, or in any organization, must have done something like it already. With advanced knowledge management, we're able to find where we've done something like this in the past, what kind of framework we used, what kind of partner technology we leveraged, and then we make sure the solution is fit for purpose. We try not to reinvent the wheel. It's not cost efficient, and it basically demonstrates bad practice.
Matthew Heusser:
Yeah, that makes a lot of sense. I hear what you're saying. Can you tell me a couple of transformation stories? What did you do first?
Aviram Shotten:
We can definitely talk about one of our customers who, over the last three years, stopped treating their digital presence as a branding exercise and moved to an eCommerce model. All of a sudden they were able to improve their revenue stream: they stopped paying high commissions to eCommerce websites that were reselling their brands and goods, and they invested more and more in technology, and hence more and more in quality. Then they stopped and asked themselves, "How well are we doing? We feel pretty good about ourselves, but what should we do next?" That was the perfect question for us, because for us, before you treat the problem, let's first establish where you are now and whether you have any problems, perceived or actual. And again, we did exactly what I just described to you. We ran this four-week process where we took more than 10,000 ServiceNow tickets they had, we analyzed over a hundred thousand test cases executed in the past two years, and we analyzed more than 5,000 ALM defects and 2,000 JIRA defects. And we saw things, and we were basically able to say to this customer, "You're doing quite well in terms of maturity. Even though you're just three years in, you're actually doing as well as a test practice that's five or six years old, and your coverage is quite good. We do suggest you make investments in the following areas."
One of the areas was removal of duplicated test cases, because we saw very little efficiency in test case execution: fewer than 1% of the tests would ever show a failure. Now, that doesn't necessarily mean the system under test is of high quality; it definitely suggests they are over-testing and that not all the test cases are fit for purpose. We used machine learning to do that analysis, using NLP. Once we were able to identify those vulnerabilities in the test process, we were able to focus their guns in the right places. The transformation that happened on the back of this process was to invest much more in smart test automation, or intelligent automation as it's called now, and they're definitely investing more in nonfunctional testing, performance testing. And their testing is now structured as a shared testing, shared services model, which makes those assets, the likes of test automation and nonfunctional testing, much more easily accessible to project managers, because we established it the right way.
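A minimal sketch of the duplicate-test-case idea: compare test case descriptions with TF-IDF cosine similarity and flag near-duplicates for review. The sample descriptions, the 0.8 threshold, and the use of scikit-learn are illustrative assumptions, not a description of the actual analysis that was run.

```python
# Sketch: flag likely duplicate test cases by the similarity of their text.
# Sample descriptions and the 0.8 threshold are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

test_cases = {
    "TC-101": "Verify a user can log in with a valid username and password",
    "TC-204": "Verify that a user can log in with a valid username and password",
    "TC-305": "Verify checkout fails gracefully when the payment gateway times out",
}

ids = list(test_cases)
vectors = TfidfVectorizer(stop_words="english").fit_transform(test_cases.values())
similarity = cosine_similarity(vectors)

DUPLICATE_THRESHOLD = 0.8  # above this, treat the pair as worth a manual review
for i in range(len(ids)):
    for j in range(i + 1, len(ids)):
        if similarity[i, j] >= DUPLICATE_THRESHOLD:
            print(f"possible duplicates: {ids[i]} <-> {ids[j]} "
                  f"(similarity {similarity[i, j]:.2f})")
```

On a real suite, the same pairwise comparison over thousands of test descriptions surfaces clusters of near-duplicates that can be merged or retired.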
Matthew Heusser:
Yeah, absolutely. I'm looking right now at my case study; I'm not sure if it's available to the public yet. We're talking about test cases. I'm not sure where… they're in the ALM domain, so there's ALM software. We're pulling tickets from ServiceNow, we're pulling defects, probably from something like JIRA, and we're pulling requirements. Do you happen to know how we're measuring requirements? That's just a bunch of words on paper, right?
Aviram Shotten:
First of all, you're right, it's a bunch of words on paper. In reality, when you apply advanced natural language processing algorithms together with metadata, the likes of the writer, the likes of traceability, you start to see patterns. You start to see overlapping requirements. You start to see requirements that, at the end of the day, linguistically resemble defects and test cases. And at some point you're able to say: this requirement is good, it's in an approved state, so it can be used as a reference to judge whether a new requirement is good or not. You can learn a lot once you apply machine learning in the right way and ask the right kind of questions. If you know what a good requirement looks like, you can definitely qualify and classify requirement quality as good, bad, or unsure.
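One way to read "if you know what a good requirement looks like" is as a small supervised text classifier: train on requirements that have already been reviewed, then score new ones, with low-confidence predictions flagged as unsure. Everything in the sketch below, the example requirements, the labels, and the 0.65 confidence cut-off, is an invented illustration of that idea, not the actual model described here.

```python
# Sketch: learn what a "good" requirement looks like from already-reviewed
# examples, then score new ones as good / poor / unsure. Training texts,
# labels and the 0.65 confidence cut-off are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviewed_requirements = [
    "The system shall lock a user account after five failed login attempts within ten minutes.",
    "The API shall respond to search queries within 500 ms for 95% of requests.",
    "The system should be fast and easy to use.",
    "Make reports better.",
]
labels = ["good", "good", "poor", "poor"]   # outcome of earlier manual review

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(reviewed_requirements, labels)

new_requirement = "The system shall export the audit log as CSV within 10 seconds."
probabilities = model.predict_proba([new_requirement])[0]
best = probabilities.max()
verdict = model.classes_[probabilities.argmax()] if best >= 0.65 else "unsure"
print(f"{verdict} (confidence {best:.2f}): {new_requirement}")
```

In practice, the metadata mentioned above (author, traceability links, approval state) would be added as extra features alongside the text.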
Matthew Heusser:
So I know this is conceptually possible; I did a project similar to this in grad school. It can be as simple as zipping up the defects and zipping up the requirements, because what zip does is identify massively redundant information. Or you can put them together, zip them up, and then look at the header: that is all of the redundant information. When you see this symbol in the zip file, you're actually going to replace it with these ten words, or these five words, or these three words, and that's how you make it smaller, because those three words occur fifteen times. So if you have redundancy, zip is a technology that happens to accidentally find it, and you can say the things are correlated. The technology exists, but Michael, have you ever seen anybody else actually do this?
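Matt's zip intuition has a standard formalization, the normalized compression distance: two texts that compress much better together than apart share a lot of content. A rough sketch using Python's zlib follows; the defect and requirement texts are invented, and on short strings like these the compression overhead makes the numbers noisy, so it works best on longer documents.

```python
# Sketch: the "zip it and look at the redundancy" idea, formalized as the
# normalized compression distance (NCD). Texts that compress much better
# together than apart are likely related. Sample texts are invented.
import zlib

def compressed_size(text: str) -> int:
    return len(zlib.compress(text.encode("utf-8"), 9))

def ncd(a: str, b: str) -> float:
    """Normalized compression distance: near 0 means near-duplicates, near 1 unrelated."""
    ca, cb, cab = compressed_size(a), compressed_size(b), compressed_size(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)

defect = "Checkout page times out when the payment gateway is slow to respond"
req_a  = "The checkout page must complete payment even when the gateway responds slowly"
req_b  = "Users can change their notification preferences from the profile screen"

print(f"defect vs related requirement:   NCD = {ncd(defect, req_a):.2f}")
print(f"defect vs unrelated requirement: NCD = {ncd(defect, req_b):.2f}")
```

The appeal is that, like zip itself, this needs no training data or language model; the drawback is that it only says "these are related," not why.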
Michael Larsen:
You know, that's actually where I was going to take this, because these are great comments. They come up when you're looking at doing something new, saying, "Hey, we're going to start setting up a QA environment, and this is how we want to do it." You tend to think more about how to get optimization happening earlier, when you're just starting out. But the biggest problem that I think you face… Aviram, I'm curious, this would be my question here that I would actually ask.
Matthew Heusser:
You got to answer my question first, buddy.
Michael Larsen:
Okay. Have I seen people do it? No, I haven't, and that's the point. This is why I want to bring it up, and then my question would be, why haven't we seen this? I think there's a lot of resistance, because when an organization comes together and starts jelling, effectiveness kind of comes down to "well, we've managed to make it work." And once you've managed to make something work, you're really reluctant to say, "Hey, how can we optimize this?" A fear sets in, you know, "don't change the formula, man" is one of those things that tends to happen. So Matt, for your question, I would say no, I have not seen this. Aviram, my question for you is: why haven't we seen this? Why is it that organizations that are more… mature, I'll use that as the term… might be less inclined to approach something like this?
Aviram Shotten:
I think it's very easy for customers, for organizations, to engage with us on the assessment process. "Oh yes, you have this fantastic process and this fantastic machine learning tool. We'd love you to come and do stuff." Traditionally, we were selling those assessments without the machine learning layer on top, and what happened is that we were very welcome to come and assess. We had access to everyone; everyone complied beautifully. But when it was time for the rubber to meet the road, to start making changes, to start moving and shifting things around, for example, moving the responsibility for testing into a shared services model, that's where people became, "You guys don't understand us. We tried it, it never worked, it will not be the right solution for us." So assuming this will continue to happen, people will engage with us because they appreciate an external point of view, and the profession of testing coming in to provide them with a way forward. And like I've suggested many times in this conversation, which I've found to be extremely interesting, guys, well done, what changes things is the smoking gun. This is the evidence that you need to evolve the way you approach your cyber resilience, your defect management, or your automation delivery. At that moment in time, they lose their ammunition. We have scientific evidence that shows their inefficiency.
Now, it can still be political. Someone might be afraid of losing something, power, a job, anything like that, and we can definitely be stopped. But when we get the right kind of echo and we've done the proper due diligence, it's harder to argue, and at that moment we see less resistance. People talk more about how to make the transition and the change management work rather than arguing with our recommendation, which is an incredible shift, right? Instead of arguing about "are we right or wrong?", the useful discussion is how we make this change management work for us without interrupting existing projects, without creating too much impact or too much noise while we make the change. So for me, it's really powerful. It has really changed my quality of life, because think about the life of a consultant, how hard it is to persuade someone you're right. Good luck. Now it's a little bit easier, but the art of execution and the art of change management still remain very difficult and very important.
Matthew Heusser:
Thank you. If I could just augment that a little bit, Michael: honestly, there are maybe a half dozen consultants, and I think I'm one of them, who have done this sort of work. You export stuff to a lot of spreadsheets, you disappear into a room, you look at the data, you run some analysis on it, you look at medians and means and standard deviations, and you talk to a lot of people. You can get the data for some companies, but it's a lot of work. The idea of tying requirements to defects through contextual analysis, you're just not going to get that; hopefully they've connected it in JIRA for you. And I think the reason people haven't done it is because you'd need to go write a program to do it, and I'd have to write that program almost uniquely for every stack for the customers I work with. Either you charge a customer a million dollars and they're not going to pay you, or you spread that work over a hundred customers and charge each of them much less, and frankly, I think Qualitest is one of the few companies in a position to do that. So there's no open source tool to do this now. It's an enterprise niche, and enterprises tend to pay money, but if this works, we should see open source tools falling out of it in the next five to ten years. There are some semantic analysis tools that get you part way there. I'm really interested in seeing some of these case studies. And you mentioned the shared services model, Aviram. What do you mean by shared services model?
Aviram Shotten:
So you mentioned how hard it is to find consultants who can really take the big picture of testing, lock themselves in a room, and come out with the right kind of recommendations. It's also difficult to find very capable performance engineers. It's very hard to come across people who get cyber security and can really educate testers and developers on how cyber resilience can be built into the development process. Those niche skills, the likes of UX testers, the likes of performance, the likes of automation gurus, the likes of coaches around Agile and in-sprint testing, those are hard-to-find people. Instead of letting the projects go around and try to hire a consultant or bring talent into the project, we centralize those niche skills in one place and make them accessible to all projects whenever they need them. Now, if we need one more, we will hire one more. But that one person will probably be serving three or four projects which, if they didn't have access to shared services, would have to do it by themselves. That is not cost efficient, and it usually results in lower quality, because you just hire someone; you don't have all the time in the world to hire the best person. And definitely, and very painfully, you have knowledge bleed: that subject matter expert finishes the performance testing and then disappears. If he's part of shared services, we can always go back to him and ask, "Hey mate, what did you do six months ago that made you build this kind of flow on the system? Because now we're not seeing something," et cetera. So we don't lose knowledge, we have access to this great talent, and we reduce costs.
Matthew Heusser:
Yeah. My graduate paper was… I think it was called "The Outsourcing Equation." The argument was, if you have a partner that has expertise in an area, they can serve a lot of customers and have economies of scale, so they can provide those resources at a lower cost. And our conclusion was, just don't do that for your core competence. Don't hire someone else to do your core competence.
Aviram Shotten:
That's definitely what we're saying. Quality is usually not a core competency for customers who are an energy company, a defense contractor, or an eCommerce giant. Those guys are good at what they do; we are good at testing. Within this capability there are niche skills, and like you suggested, the economies of scale are so much more efficient when you adopt the view that anything outside of the in-sprint, project-level testing should be part of a shared services model.
Matthew Heusser:
That's an interesting quote: "Anything outside of the in-sprint testing should be part of a shared services model." So does that mean they have individual testers on the team testing the features, doing the functional testing, adding to the test suite, maybe not running it through regression if you don't deploy every two weeks? What does that mean?
Aviram Shotten:
That's exactly what the in-sprint testers are doing. They are earmarked to a certain scrum team, a certain functionality, or a certain channel for the digital team. But besides those people, who are really integral parts of the scrum team, the rest of the gang that provides testing, be it the people who deploy the KPIs and measure the testing, be it the performance expert, be it the person who tries to optimize things by deploying machine learning, be it the cyber dude, all of those are part of the centralized shared services group.
Matthew Heusser:
Is Qualitest interested in doing the in-sprint testing, too?
Aviram Shotten:
Of course. More than 50% of what we do is work within Agile teams. Most of that is automation, or in-sprint automation related. Some of it is deploying the right kind of test pyramid, where the top of the pyramid is user experience and manual testing. This is all done as part of in-sprint testing. Some of our customers are still "Wagile," waterfall-like, Agile-like, et cetera, but most of what we do is in-sprint testing.
Matthew Heusser:
Great.
Michael Larsen:
Hey everybody. Aviram, I want to say thank you for coming out today for this; I really appreciate it. We could talk about this all day, and I've greatly enjoyed this conversation.
Matthew Heusser:
We should do a video, but today, right now, we’re out of time.
Michael Larsen:
So for those listening, thank you very much for joining us on The Testing Show. We look forward to seeing you again in two weeks; glad to have you with us. We hope you have a great two weeks until the next time we get to talk. Until then, be good, stay safe, and be excellent to each other.
Aviram Shotten:
Bye everyone. Thank you.
Matthew Heusser:
Thanks for being on the show, Aviram.
Michael Larsen:
That concludes this episode of The Testing Show. We also want to encourage you, our listeners, to give us a rating and a review on Apple Podcasts. Those ratings and reviews help raise the visibility of the show and let more people find us. We also want to invite you to join The Testing Show Slack channel as a way to communicate about the show, talk to us about what you like and what you'd like to hear, and help us shape future shows. Please email us at [email protected] and we will send you an invite to join the group. The Testing Show is produced and edited by Michael Larsen, moderated by Matt Heusser, with frequent contributions from our many featured guests who bring the topics and expertise to make the show happen. Additionally, if you have questions you'd like to see addressed on The Testing Show, or if you would like to be a guest on the podcast, please email us at [email protected].