
Quality Sense Podcast: Júlio de Lima – Machine Learning to Understand Performance Testing Results

Júlio explains a machine learning algorithm to reduce the scope of load testing and get meaningful analysis from your data faster

In this Quality Sense episode, Fede has a chat with Júlio de Lima, an engineer at Capco, who recently completed his master’s degree in Electrical Engineering and Computing (Artificial Intelligence) and also co-founded GaroaQA, a meetup group with four locations across Brazil and over 2,000 members.

Episode Highlights

  • The complexity of analyzing the huge amounts of data that software performance tests provide
  • Using machine learning to solve data issues by giving meaningful insights about what happened during test execution
  • How he used K-means clustering, a machine learning algorithm, to reduce almost 300,000 records to fewer than 1,000 and still get good insights into load testing results

Episode Transcript:

(Lightly edited for clarity.)

Federico:

Hello, Júlio. It is so nice to have you here on the show. How are you doing today?

Júlio:

I’m very fine, and it’s a pleasure to be here, and thank you for having me here.

Federico:

It’s my pleasure as well, and also to share this space with more Latin American friends. One of the biggest motivations I have for doing this podcast is sharing things between different parts of the world: having great leaders in software testing from here, from the States, or from countries around here, but also having the possibility to share experiences with my brothers from the South.

Júlio:

Awesome.

Federico:

Júlio, one of the main topics we want to address today is related to performance testing. But before that, I would like to know how you ended up working in software testing, and specifically what your background in performance is.

Júlio:

Okay, that’s good. So I started my career playing the role of a developer. So I started developing PHP. I used to be a good programmer, actually. And some day I heard some director that was working for the same company that I was at at that time, and she said that they were going to start to invest more effort on testing. And I said, “Testing, how?”

Yeah, someone is paid to just click buttons. And I said, “That’s a lie.” And then I started to search about it and I found a lot of stuff on testing. I said, “Oh, that seems good.” And at that time I was working with 39 other programmers. So it’s very difficult to grow your career in an environment like this. So I said, I believe that it’s an opportunity for me. I started to study a lot, read a lot of books. I met, how they say, in the books, a lot of people that are very well known even today, like James Bach, Janet Gregory, who you also interviewed, right? It was a good conversation.

And then I started there. I invested a lot of time. So 12 years so far just dedicated to testing. And I saw the profession evolve a lot, from people who were more concerned with the user experience, using the application, to someone who goes very deep on the architecture, the engineering, and even understanding the user expectations, using more skills regarding the engineering itself. Yeah, it’s a huge evolution. And during this time, the first contact that I had with, how they say, testing tools was with load testing. And since the beginning, I started to use those. I remember the first tool that I used was Apache JMeter. The scenario at that time was that we used to have desktop applications and we were going to web. So we needed to exercise the performance of this application.

Federico:

A typical scenario for performance testing because you are migrating the architecture and there are many risks associated with that. 

Júlio:

Yes. So I remember at that time, one of the difficulties that we had was that no one knew exactly what the environment should be to support those applications. And no one knew the common paths that users would take through the application, since we didn’t have that kind of data. And yes, that was the first contact with performance. I’m talking about 2009 to 2010. Yeah, that’s it.

Federico:

Okay, cool. So your story is like the opposite of some companies where they say, okay, first we give people the possibility to work as a tester, then we promote them to developers. So you went in the other direction. That’s really good. I also know some other people who started in programming and then moved to testing, and I think those movements are really, really good. If you’re a tester and you have a background in programming, and not only a background in the sense that you studied programming, but you actually worked as a programmer, I think you understand some typical problems from another perspective and you can contribute better to solving those problems, right?

Júlio:

Yeah, yeah, exactly. And also you have more, how they say, empathy, because it’s all about that, right? Communication and exchanging knowledge to help people overcome that wall that companies usually create between the tester and the developer. And then when we overcome these walls, we can anticipate problems, right?

Federico:

Yeah, totally. Júlio, recently you published a paper and you also presented your results at STAREAST, right?

Júlio:

Yes.

Federico:

Would you like to share with the audience a little bit more about your findings in this research? Maybe a good way to start would be by understanding better the problem you’re trying to tackle with your research. Can you tell us a little bit more about that?

Júlio:

So to help the audience understand better what I’m calling a load test, okay? I have one scenario. This scenario has two steps, okay? It’s a very simple scenario. And I was running load tests where I have hundreds of users, okay, 50 simultaneously, running this scenario for one hour, 60 minutes. So this is what I’m calling load testing, okay?

When running the scenarios, I had a lot of metrics being collected, but all of these metrics are from the user’s perspective. What I mean is how much time the response takes to come back to my browser, for example. And for sure, there are a lot of other metrics: the connection time, the latency, and other measures. But after we ran this test, what I had was a table with more than 300,000 lines to read.

For me as a QA specialist, I can read those lines, but it’s very difficult, right? Reading those hundreds of thousands of lines is very difficult. So I tried another approach. Let’s try to plot this information in a graph. And I have another problem, because the number of lines is so, so, so huge that I can’t read the graph efficiently, right? It’s very dense.

So I started to think, what are my possibilities here? Because my problem is I can’t read the information in an easy way, right? So I tried to divide the results and tried to get more information. It wasn’t a good approach either, since I don’t have the correlation between the variables. So that is the problem statement. It’s difficult, very difficult to read the information, even being a QA specialist.

Federico:

Yeah. I think this problem happens in performance testing with so many things. Now we have a lot of tools giving you tons and tons of lines, right? You have a lot of data, but the problem is how to analyze this data in an efficient way. Because probably what is really important here is to provide an analysis in a timely manner, right? You have to be fast analyzing this data. So how can we solve that? What’s your proposal?

Júlio:

Great, great. So before going to the proposal, let’s understand something, okay? In the machine learning field, and I know that probably there are a lot of people in the audience who are thinking, “Oh, machine learning in testing, that’s not the approach that I am looking for,” but please be patient and listen.

In machine learning, we have a lot of algorithms, right? And there are two well-known approaches: the supervised and the unsupervised one, okay? In supervised learning, you have the classification for everything. You know exactly what the answer to the question should be for each piece of data that you have in your data set. In unsupervised learning, you don’t know what the classification is. So this is the basic principle, okay? Both can read a lot of lines and then make decisions, make guesses about it.

So the unsupervised one is the case where I have 300,000 lines, but I don’t know what the classification is for each line. There I could just filter by the classification that I consider as failure and success, right? But I don’t know the classification; I would need to analyze each line to understand that. So for sure it’s not a problem for supervised learning, right? So why go with the unsupervised one? What do I mean? I will use an algorithm that will find the similarities between lines. So with this maybe you already discovered what I did. So I used one algorithm to find out what the similarities are. And after finding out the similarities, I created groups. In this case, we call these groups clusters of information, similar information. And then I started to understand, to investigate the clusters themselves instead of investigating the 300,000 lines.
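To make the clustering idea concrete, here is a minimal sketch of K-means (the algorithm named in the episode highlights) written with only the Python standard library. It is an illustration under stated assumptions, not the code used in the research: the sample rows, the column choice, the feature scaling step, and the deterministic farthest-point initialization are all hypothetical choices made so the example is small and reproducible.

```python
import math

# Hypothetical samples: (elapsed_ms, latency_ms, connect_ms, bytes_received).
# Six "normal" responses and two that look like failed connections.
samples = [
    [120, 110, 5, 5200],
    [130, 118, 6, 5150],
    [125, 112, 5, 5300],
    [118, 109, 4, 5250],
    [122, 111, 5, 5220],
    [127, 115, 6, 5180],
    [1002, 0, 1002, 0],
    [1005, 0, 1005, 0],
]


def standardize(rows):
    """Scale each column to zero mean and unit variance so no metric dominates."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def initial_centroids(points, k):
    """Deterministic farthest-point start (an assumption, chosen for reproducibility)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iterations=100):
    """Alternate between assigning points and moving centroids until stable."""
    centroids = initial_centroids(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


labels, centroids = kmeans(standardize(samples), k=2)
print(labels)  # one cluster label per input row
```

Because the rows are grouped on all columns at once, the two failed-looking rows end up in their own cluster even though nobody labeled them as failures beforehand.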

When you use an algorithm like this that creates clusters, you have a specific number of clusters. And there are also some techniques that you can use to define what is the optimal number of clusters based on the data sets. Okay?
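The conversation does not say which technique was used to pick the number of clusters. One common option is the elbow method, sketched below as an assumption: run K-means for several values of k, record the within-cluster sum of squared distances (often called inertia), and look for the k after which the improvement flattens. The 2-D points and the compact K-means helper are hypothetical and exist only to keep the example self-contained.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_inertia(points, k, max_iterations=100):
    """Run a small K-means and return the within-cluster sum of squares."""
    # Deterministic farthest-point start (an assumption for reproducibility).
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))


# Hypothetical data with three visually obvious groups.
points = [
    (1, 1), (1, 2), (2, 1), (2, 2),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 1), (20, 2), (21, 1), (21, 2),
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, inertia in inertias.items():
    print(k, round(inertia, 2))
```

On this toy data the inertia drops sharply up to k = 3 and barely improves afterwards, which is the "elbow" that suggests three clusters are enough.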

Federico:

Okay.

Júlio:

So, based on the data set that I have, those 300,000 lines, I discovered that six clusters are enough. Oh sorry, nine clusters are enough, okay? And when I did it, the algorithm classified the data and created those nine clusters. And I saw one cluster that was very small, something like 861 lines. So it caught my attention, right? I thought, whoa, we have a lot of clusters that have thousands of lines. So why is this one so small? So my solution could divide the data into clusters with intelligence, because it is not just looking at the elapsed time, as people generally do, right? It is looking at all the variables. And I’m talking about elapsed time, latency, connection time, idle time, bytes received and bytes sent. So all of these similarities create that group, and that group is small. And when I looked at that group, at that cluster, I saw there were a lot of errors regarding server connection refused. And that was the first finding that I obtained.
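The inspection step described here, spotting an unusually small cluster and checking what its rows have in common, could look roughly like the sketch below. The cluster labels and response messages are hypothetical stand-ins, not data from the study.

```python
from collections import Counter, defaultdict

# Hypothetical (cluster_label, response_message) pairs, one per result line.
rows = [
    (0, "OK"), (0, "OK"), (0, "OK"), (0, "OK"), (0, "OK"),
    (1, "OK"), (1, "OK"), (1, "OK"), (1, "OK"),
    (2, "Connection refused"), (2, "Connection refused"), (2, "OK"),
]

# How many lines fell into each cluster.
sizes = Counter(label for label, _ in rows)

# Which response messages appear inside each cluster.
messages = defaultdict(Counter)
for label, message in rows:
    messages[label][message] += 1

# The smallest cluster is the first candidate for a closer look.
smallest = min(sizes, key=sizes.get)
top_message, count = messages[smallest].most_common(1)[0]

print(f"smallest cluster: {smallest} ({sizes[smallest]} lines)")
print(f"most common message there: {top_message} ({count} lines)")
```

Reading a summary like this for a handful of clusters replaces scanning every line of the raw results.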

Federico:

Okay. At the beginning, you mentioned that we should be patient when it comes to applying machine learning. And now I understand because the first thing people tend to think about using machine learning for testing is replacing the testers with some artificial intelligence or something like this. But you are actually helping the tester to do a better job. And I think this is a completely different approach, and it’s taking the best out of the tools we can build for improving our work, abilities of analyzing tons of data. So that’s cool. And you are, as I understood, you’re analyzing all of the metrics that probably JMeter is giving you, right?

Júlio:

Yes, you’re right.

Federico:

So can you also combine this with some metrics from an APM or some monitoring tool so you can also correlate those possible anomalies with some other behavior in the infrastructure? Is that possible?

Júlio:

It is possible. I haven’t had enough time to produce that kind of result, but this is my future work. We also need to get the metrics from the server in order to get more efficient feedback, efficient insights, because as a QA specialist that has knowledge in performance testing, I know that just the user view doesn’t reflect what is happening under the application, right? We have a server that may not perform as expected, that may be overloaded. There are a lot of other measures that can be used, and then we can combine all of this in the groups again in order to understand better what happened.

Federico:

Another thing that I’m thinking about right now is that when I did my PhD in Spain, what I realized is there is a disconnect between academia and the industry in our field. So we should be doing more things like this, collaborating and learning from each other related to that. Is there a way that anyone here can try to replicate your findings or use the tool that you prepared for this, or something like that?

Júlio:

Yes, there is. So, as you said before, I presented these results at STAREAST this year, in May 2020. So you can also find the video and understand step-by-step how to do this. But it’s reproducible, and yes, you can use this in your company.

Federico:

Oh, perfect. 

Júlio:

Awesome.

Federico:

Julio, one of my last questions is related to habits because I really believe that by improving those little things in our lives that we do every day, we can improve the way we try to achieve our goals. I am curious if you have any habit that you can inspire others to form or to adopt.

Júlio:

For sure, for sure. Yes. One thing I always share with my students, I don’t know if I told you, but now I have more than 5,000 students in my online courses.

Federico:

Whoa.

Júlio:

Yeah. I am very glad. And probably some of these students will listen to this podcast. Something that I always say to them is that we are not just tool operators. 

Testing is more than just tools. Testing is critical thinking, testing is collaboration, testing is looking for risks. Testing is more.

JÚLIO DE LIMA

And we need to always strive to discover more. Being curious, right? Maybe it’s cliché, but what I’m talking about is understanding what the people on your team are doing. That is something that I learned in my career.

Every time that I tried to understand more about the architecture of the APIs, about the code, the way they do unit tests, the way that they think about testing, the way that they understand what my activity is, I could understand more about what I’m doing. So be curious and be communicative in order to grow your skills.

Federico:

Yeah, I’m 100% with you. I really believe this is really important to do a better job and also to improve the results, not only from you, but also from your team. So thanks for that. And one last question is, I don’t know if you like to read, if you have any book recommendation to make.

Júlio:

Yeah. So I would like to refer to two researchers: Zhen Ming Jiang and Ahmed E. Hassan. They are striving to learn more about software testing, specifically in the load testing area. And they are both people whose work I studied to create this experiment that I shared with you today.

Federico:

Júlio, one last question: is there anything you would like to invite the audience to do, such as joining one of the multiple channels where you share content?

Júlio:

Yes. So I am on YouTube, Spotify, Instagram, Telegram, Medium. So it’s pretty difficult to share all of these social networks, so I invite you to access about.me/Juliodelimas.

Federico:

Perfect.

Júlio:

Thank you.

Federico:

Thank you, Júlio. It was really a pleasure to finally get to talk to you and share all this knowledge and experience working in performance testing, but also from academia. This is something that I was really interested in, because it shows what academia is doing regarding those topics that we work on every day in our field. So thanks again for sharing that. And I hope you enjoy the rest of the week.

Júlio:

Excellent. As I said before we actually started this conversation, for me, it’s a pleasure to have a friend like you who built your company from zero, from scratch, and made a huge contribution to Uruguay and the community, and now to the whole world, right? So it’s a pleasure finally to meet you and to talk to you, and thank you for having me here and for inviting me to this podcast that I love to listen to.

Federico:

Thank you very much. See you.

Júlio:

See you.

