Good strategy for automated regression where initial state can't be guaranteed or predicted



  • The situation

    I need to start building automated regression for a very large web application with thousands of users. It is impossible to start from a known state, or even to be sure that data entered into the application will be saved to a known state. The application is a payroll entry system, where all data is read-only if payroll has not been started, and it has an enormous number of configuration choices.

    The application can't be run on a local database with a local setup: it requires multiple interlocking web services (which use - don't ask me why - the system registry to determine where to submit their results) and is not practical to reset (a database refresh can take all day). The test sandbox environment can be accessed by multiple users, and its configuration can be changed in ways that completely alter the expected behavior of the system. The development sandbox environment is often out of action for days at a time when large projects are being worked on.

    The strategies I see as viable

    The potential strategies I'm considering are:

    Set my initial conditions first - This is - alas - a bit more complex than it looks. I would need to make several database queries or do a number of screenscrapes to determine whether the company's configuration has been changed, and whether I need to start a payroll run, change the configuration back to what it should be, then submit the false run before starting a clean run with the correct configuration. This would also complicate end-of-test checks, because I would then need to find and discard the initial submit file in order to check the validity of the one containing the data my tests actually entered. On the plus side, the values saved to the pending transaction tables prior to submit and the contents of the submit file would match expectations.
    Dynamic checking of results - by this I mean the much more complex coding needed to check results based on whichever configuration flags are active when I test, regardless of whether they're what I expect to see. The advantages I see here are that I wouldn't need to try to force the system to my starting settings, and I'd be building in an oblique check that certain fields display only when certain configuration flags are enabled. On the flip side, this is a great deal more complicated to code properly and, as a result, much more error-prone.

    Hybrid approach - where I set certain key values but let the rest stay at whatever someone else set them to. This would give me some of the disadvantages of each method, but would allow me to limit the extra complexity to areas that tend to be moving targets anyway.

    Some other considerations

    I have (after six months' experience with the system) a reasonable grasp of where the system interacts with web services. These are all built and maintained in-house, but tend not to get much development; they also typically get called from the web application or poll the web application database, so testing them separately isn't on the agenda at this stage. The web services are integral to the system to an extent that it's not possible to test only the web application.

    The application is written in classic ASP. There is no way to separate front end and business logic. It is not unit tested because it is not unit testable. While there is a plan to replace it with a better-engineered system, I will be testing this system for at least another two years, potentially longer.

    Given these "interesting" complexities, what approach should I take to building automated regression for this application?
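    To make the "dynamic checking" option concrete, here is a minimal sketch of the idea: read the active configuration flags at test time and derive the expected screen contents from them, rather than hard-coding expectations against one assumed configuration. Every name here (the flags, the field names, the helper functions) is hypothetical; the real version would be backed by your database queries or screenscrapes.

    ```python
    # Sketch of "dynamic checking of results": derive the expected
    # outcome from whatever configuration flags are active right now,
    # instead of assuming a fixed starting configuration.
    # All names below (read_active_flags, FIELD_RULES, the flag and
    # field names) are hypothetical placeholders.

    def read_active_flags():
        """Stand-in for the DB queries or screenscrapes that discover
        the current configuration. Returns a dict of flag -> bool."""
        return {"show_overtime": True, "multi_state_tax": False}

    # Which on-screen fields should be visible under which flag.
    FIELD_RULES = {
        "overtime_hours": "show_overtime",
        "state_tax_2": "multi_state_tax",
    }

    def expected_fields(flags):
        """Fields the UI should display given the active flags."""
        return {field for field, flag in FIELD_RULES.items() if flags.get(flag)}

    def check_visible_fields(visible, flags):
        """Oblique configuration check: a flag-controlled field must
        appear when its flag is enabled, and only then."""
        expected = expected_fields(flags)
        missing = expected - visible
        unexpected = {f for f in FIELD_RULES if f in visible} - expected
        return missing, unexpected

    flags = read_active_flags()
    missing, unexpected = check_visible_fields({"overtime_hours"}, flags)
    print(missing, unexpected)  # both empty -> screen matches the active flags
    ```

    The cost the question describes is visible even in this toy version: every expectation becomes a function of the flags, so a bug in `FIELD_RULES` looks exactly like a bug in the application.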
    Terminology update

    In response to @user246 requesting clarification of some of the terminology I used:

    submit the false run - by this I mean to submit a payroll that contains nothing but a change of configuration settings, where I don't actually care what gets sent.

    submit file - This one is a bit more complex. All the calculation logic and check-generation logic happens in a back-end server I have no access to. My only interface to the back-end server is a fixed-length-format text file: the submit file. This file is generated for every payroll submission, with a defined naming convention. When there is more than one payroll submission in a short timeframe for a given customer, it can be difficult to find the one that contains the information I'm interested in.

    pending transaction tables - between the start of a payroll run and submitting the run, everything that happens is recorded to one of a number of database tables. They're known as the pending transaction tables because changes to configuration, payroll, and personnel data are not final until everything has been synched between the web application and the back-end server. Until a payroll run is submitted, changes aren't permanent, and depending on the logic involved they can exist only in the pending transaction tables.
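    Since picking the right submit file out of several recent submissions is a recurring pain point, one workable pattern is to record a timestamp immediately before the test submits, then select the newest file for that customer created at or after that mark. This is a sketch under assumptions: the `SUBMIT_<customer>_*.txt` glob pattern is invented, so substitute the real naming convention.

    ```python
    # Sketch: isolate the submit file your own test produced when
    # several submissions for the same customer land in a short window.
    # The pattern "SUBMIT_<customer_id>_*.txt" is invented for
    # illustration; substitute the real naming convention.
    import glob
    import os

    def newest_submit_file(submit_dir, customer_id, not_before):
        """Return the newest submit file for customer_id whose mtime is
        at or after not_before (a time.time() value), or None."""
        pattern = os.path.join(submit_dir, f"SUBMIT_{customer_id}_*.txt")
        candidates = [p for p in glob.glob(pattern)
                      if os.path.getmtime(p) >= not_before]
        return max(candidates, key=os.path.getmtime) if candidates else None

    # Usage in a test: take the timestamp before submitting, then look
    # only at files newer than that mark.
    #
    # mark = time.time()
    # ... drive the application through the payroll submission ...
    # path = newest_submit_file(submit_dir, customer_id, mark)
    ```

    This also helps with the "discard the false run's submit file" problem in strategy one: take the mark after the false run has been submitted, and the earlier file drops out of the candidate set automatically.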



  • While the entire system isn't under your control (which would clearly be the superior option), it might still be possible for some slice of the system to be under your control. You say it's a payroll system. Can you create a "Test Company" that everyone else would agree to leave alone? Can you create "Test Employees" that everyone else would agree to leave alone?

    In the cases where I've had to share systems, that's the approach I have taken. Each tester (manual or automated) had their own subset of the system for their exclusive use. When naming the objects, we prefixed them with the initials of the tester: my "company" would be "JSS Corporation", and my "employees" would have names like "JSS Smith", "JSS Jones", etc. My automation would typically begin by setting up these companies and employees with the attributes I needed (some single, some married, some full-time, some part-time, etc.), then proceed to the other functional steps I needed to test.
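    The per-tester naming scheme above can be made mechanical: derive every object name from the tester's initials, so collisions with other people's data are impossible by construction. A minimal sketch, assuming hypothetical names throughout; the actual creation steps (UI automation or database inserts) are not shown.

    ```python
    # Sketch of the per-tester naming convention: every test object is
    # prefixed with the tester's initials, so each tester owns a
    # disjoint slice of the shared environment. The create steps you
    # would plug in (UI driving, DB inserts) are omitted.

    TESTER_INITIALS = "JSS"  # set per tester / per automation run

    def prefixed(name, initials=TESTER_INITIALS):
        """Build a collision-free object name like 'JSS Smith'."""
        return f"{initials} {name}"

    def test_fixture(initials=TESTER_INITIALS):
        """The fixed population this tester's suite always sets up:
        one company plus employees covering the attribute mix."""
        company = prefixed("Corporation", initials)
        employees = [
            {"name": prefixed("Smith", initials),
             "status": "single", "schedule": "full-time"},
            {"name": prefixed("Jones", initials),
             "status": "married", "schedule": "part-time"},
        ]
        return company, employees

    company, employees = test_fixture()
    print(company)               # JSS Corporation
    print(employees[0]["name"])  # JSS Smith
    ```

    Setting up the fixture at the start of every run, as the answer suggests, also doubles as the "set my initial conditions" step from the question, but scoped to data nobody else touches.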


