What is the optimal time that performance/load tests should take when building an application?
So I'm on a project where we re-build our application in one of our environments on each merge to the repository. I'm an Automation Tester and have been doing functional and REST API tests so far; I have no previous experience in performance/load testing.
I was asked to prepare a performance test that checks response times on each of our endpoints, plus a load test and a throughput test with about 7 scenarios of 5-7 requests each (the same scenarios for both). The difference is that the load test should reach a fixed throughput at peak time, while the throughput test increases the number of concurrent users until the application becomes unstable; both start at 10 concurrent users.
I was told by one of our senior devs that all three tests should complete in no more than 5 minutes. Is this realistic and achievable?
There are too many unknowns in your question to give a definitive answer, and thus to define a proper experiment.
Instability point. You said you want to find the maximum throughput before the application becomes unstable. How do you define instability for your application?
- When the error rate reaches 20%, 40%, 50%? What counts as an error? An HTTP 500 response? A dropped connection?
- Or when the response time at the 99.99th, 99th, or 75th percentile exceeds a certain threshold? What is your threshold?
The answer will depend on the requirements of your specific application. For instance, 3 seconds at the 75th percentile might be satisfactory for a batch process but not for an online banking application (imagine 25% of users waiting longer than 3 seconds to complete a transaction).
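To make such a threshold concrete: percentiles can be computed directly from recorded response times. A minimal sketch in Python (the sample latencies and the 3-second limit are made-up illustration values, not numbers from your question):

```python
import statistics

# Hypothetical response times (seconds) collected during one test run.
latencies = [0.8, 1.1, 0.9, 2.7, 3.4, 1.0, 0.7, 5.2, 1.3, 0.9]

# statistics.quantiles with n=100 yields the 1st..99th percentile cut points.
cuts = statistics.quantiles(latencies, n=100)
p75, p99 = cuts[74], cuts[98]

print(f"p75 = {p75:.2f}s, p99 = {p99:.2f}s")

# A pass/fail check against an assumed 3-second limit at the 75th percentile:
assert p75 <= 3.0, "75th percentile exceeds the 3s threshold"
```

With a real test run you would feed in the full latency log instead of a ten-element list; note that with very few samples the high percentiles are essentially noise.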
Experiment duration. You said you want to find the maximum throughput before the application becomes unstable, but also that all three tests should complete in no more than 5 minutes. What if your application becomes unstable only after 10 minutes of increased load? Or after 1 hour? We really don't know the specifics of your application. Why is there a requirement to complete in 5 minutes? Are you going to run these tests as part of a regression suite?
Throughput at peak time. You said that the "load test should reach a fixed throughput at peak time". What is peak time? What is the expected value of that fixed throughput? Do you intend to replicate the production situation at its peak times? E.g. your application handles stock transactions and the peak happens twice a day: when the trading session opens in the morning and when it's about to close at 4pm, and you want to apply the same number of concurrent users that occurs in production at those times?
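Whatever the target turns out to be, a fixed throughput is usually enforced by pacing: requests are scheduled at a constant rate regardless of how quickly responses come back. A minimal single-threaded sketch of the idea (the 50 req/s target and the stubbed send_request are assumptions, not values from your question):

```python
import time

TARGET_RPS = 50              # assumed peak-time throughput target
INTERVAL = 1.0 / TARGET_RPS  # gap between scheduled requests

def send_request():
    """Stand-in for a real HTTP call; replace with your client of choice."""
    pass

def run_paced(duration_s=1.0):
    """Issue requests at a fixed rate for duration_s seconds; return count sent."""
    sent = 0
    next_fire = time.monotonic()
    deadline = next_fire + duration_s
    while next_fire < deadline:
        send_request()
        sent += 1
        next_fire += INTERVAL  # schedule by absolute time to avoid drift
        time.sleep(max(0.0, next_fire - time.monotonic()))
    return sent

print(run_paced(1.0))  # roughly TARGET_RPS requests in one second
```

Tools like JMeter or Gatling implement this for you (with multiple workers and real HTTP clients); the sketch is only meant to show why a fixed-throughput test needs a defined target rate before you can design it.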
Deployment architecture. Is the deployment architecture of the application in your test environment the same as in production? For instance, do you have the same number of application servers, load balancers, etc.? Usually test environments are reduced in size and complexity compared to production, and thus the application may behave differently. For instance, it may saturate faster than in production and crash in less than 5 minutes.
You see, there are many variables that affect the length of your experiment: expected throughput, expected error rate, expected response time, peak time, error definition, deployment architecture. If I were in your shoes, I would start from a small experiment and design it iteratively to better understand your application and the expectations:
- Prepare the simplest throughput scenario, with just one type of request, and slowly increase the number of concurrent users until the application stops responding at all, i.e., until 100% of responses are HTTP 500s.
- Then look at the throughput graph to see how the application behaved.
- Share your findings with devs and ask them for feedback.
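The ramp-up loop for that first experiment can be very small. A sketch of the shape it might take, with a stubbed send_request standing in for the real endpoint (the step size, the 200-user ceiling, and the stub's "fails above 80 users" behavior are all assumptions for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def send_request(concurrency):
    """Stub: pretend the service returns HTTP 500 above 80 concurrent users.
    In a real test this would perform an actual HTTP call and return its status."""
    return 500 if concurrency > 80 else 200

def find_instability_point(start_users=10, step=10, max_users=200):
    """Ramp up concurrency until every response in a batch is an error."""
    users = start_users
    while users <= max_users:
        with ThreadPoolExecutor(max_workers=users) as pool:
            statuses = list(pool.map(lambda _: send_request(users), range(users)))
        error_rate = statuses.count(500) / len(statuses)
        print(f"{users} users -> {error_rate:.0%} errors")
        if error_rate == 1.0:
            return users
        users += step
    return None  # stayed stable up to max_users

print(find_instability_point())
```

In practice you would use JMeter, Gatling, Locust, or similar rather than hand-rolling this, but the per-step output (users vs. error rate) is exactly the throughput graph you would then take back to your devs for feedback.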