
Randomly Generated Values in Tests

The use of randomly generated test data seems like a good thing at first glance. Having worked with several teams that have used this concept, I generally discourage the practice. Consider a simple method that joins together two strings, and a test that feeds it random values.
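As a rough illustration of such a test, here is a minimal Python sketch. The `join` function and the `random_string` helper are hypothetical stand-ins for the method under test and a shared test utility; neither name comes from a real code base.

```python
import random
import string


def join(first: str, second: str) -> str:
    """Hypothetical method under test: joins two strings together."""
    return first + second


def random_string(length: int = 8) -> str:
    """Hypothetical shared helper that builds a random lowercase string."""
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))


def test_join_with_random_values() -> None:
    # Nothing here is a literal, so every value of interest needs a variable.
    first = random_string()
    second = random_string()

    result = join(first, second)

    # The expected value has to be rebuilt from the inputs, which repeats
    # the production logic instead of stating a concrete example.
    assert result == first + second
```

Note that the assertion is effectively a copy of the implementation of `join`.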


Harder to Read

While this is a toy example to demonstrate the problem, in more realistic scenarios the lack of literal values harms the readability of the tests. The lack of literals also adds lines of code, as anything of importance needs to be stored in a variable or field. My biggest concern is when assertions start to become complicated or, even worse, duplicate production code in order to pass. If we wish to treat tests as examples, this is pretty poor.

Edge Cases

Generating a random string seems easy enough. Over time the edge cases start to pile up. You have whitespace, special characters, new lines, numbers and much more to worry about if you wish to do this properly. The code that generates random values is often shared via inheritance or composition. This makes changes tricky and dangerous, as you can inadvertently change more than one test when modifying the common code. If the two inputs need to be different, the generator could occasionally produce the same string for both, leading to flaky tests if you're not careful.
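The collision problem can be sketched as follows. The value space is deliberately tiny (a single digit) so the effect is easy to see, and `random_digit` and `count_collisions` are hypothetical names used only for this illustration.

```python
import random


def random_digit(rng: random.Random) -> str:
    """Hypothetical helper with a deliberately tiny value space (0 to 9)."""
    return str(rng.randint(0, 9))


def count_collisions(runs: int, seed: int = 0) -> int:
    """Count the runs in which two supposedly different inputs were identical."""
    rng = random.Random(seed)  # seeded so the count is repeatable
    collisions = 0
    for _ in range(runs):
        first = random_digit(rng)
        second = random_digit(rng)
        if first == second:
            collisions += 1
    return collisions
```

Calling `count_collisions(1000)` should report that roughly one run in ten collides with this value space. A larger value space makes collisions rarer, which only makes the resulting flaky test harder to track down.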

Pseudo Random

The random aspect of these tests can confuse developers. In a test like the one described above, each variable only ever holds one value per run. In other words, the test can run many times locally and pass, but fail when executed elsewhere because a generated value happens to be invalid for a specific scenario. There may be a subtle bug that is only found after the code is declared complete. This often causes failures in the build, at which point developers declare "it's just a random failure" before re-triggering the build.
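To make the "passes here, fails there" effect concrete, the sketch below simulates many runs of a random-value test against a deliberately buggy implementation. Everything in it is hypothetical: `buggy_join`, the four-character `ALPHABET` and the helper names are assumptions made for the illustration.

```python
import random

ALPHABET = "abc "  # hypothetical value space: three letters and a space


def buggy_join(first: str, second: str) -> str:
    """Hypothetical buggy implementation: strips whitespace it should keep."""
    return (first + second).strip()


def random_string(rng: random.Random, length: int = 3) -> str:
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def run_once(rng: random.Random) -> bool:
    """Simulate one test run; True means the assertion would have passed."""
    first = random_string(rng)
    second = random_string(rng)
    return buggy_join(first, second) == first + second


def count_failures(runs: int, seed: int = 0) -> int:
    """Count how many simulated runs would have failed."""
    rng = random.Random(seed)
    return sum(1 for _ in range(runs) if not run_once(rng))
```

With the same code under test, some runs pass and some fail. A single green run says nothing about the runs that happened to draw a leading or trailing space.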

Date/Times can be Tricky

Date/Times are hard enough as it is. Trying to randomly generate these is not worth the hassle.


My recommendation is to rely on literal values or value objects where possible. These make the test much more readable and let it act as an example or specification. Additionally, their use allows the inline variable refactoring to take place, meaning shorter, more concise tests.
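A literal-value version of a test for the string joining method might look like this minimal sketch, where `join` is again a hypothetical stand-in for the method under test.

```python
def join(first: str, second: str) -> str:
    """Hypothetical method under test: joins two strings together."""
    return first + second


def test_join_appends_second_string_to_first() -> None:
    # Inputs and expected output are inlined, so the test reads as an example.
    assert join("Hello", "World") == "HelloWorld"
```

With the variables inlined, the whole test fits on one line and doubles as documentation of the expected behaviour.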

Test Cases/Parameterized Tests

If you wish to test similar scenarios in one go then test cases can help. This is usually the case when you cannot name a test easily because the functionality is the same as an existing test.
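One way to sketch this with only the Python standard library is `unittest`'s `subTest`, which reports each case separately. The `join` function and the chosen cases are assumptions for the illustration; other test frameworks offer their own test case or parameterized test features.

```python
import unittest


def join(first: str, second: str) -> str:
    """Hypothetical method under test: joins two strings together."""
    return first + second


class JoinTests(unittest.TestCase):
    def test_join(self) -> None:
        # Each tuple is one literal example: (first, second, expected).
        cases = [
            ("Hello", "World", "HelloWorld"),
            ("", "World", "World"),
            ("Hello", "", "Hello"),
            ("Hello ", "World", "Hello World"),
        ]
        for first, second, expected in cases:
            with self.subTest(first=first, second=second):
                self.assertEqual(join(first, second), expected)
```

Every case still uses literal values, so a failure points at a concrete, repeatable example.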


The assumption that randomly generated values catch more bugs and cover more ground is wrong, since each run only exercises a single value per variable. If you really do discover a bug after manual testing or on a live system, just write a new test exposing that bug and fix it.

Property Based Testing

I cannot comment on Property Based Testing fully, but this is certainly an interesting area and does not suffer from the issues above. Worth looking into.
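For orientation only, here is a hand-rolled sketch of the idea using nothing but the standard library. It is not how a real property based testing library works internally: such libraries typically add richer input generation and shrinking of failing inputs, none of which is shown here. The `join` function and the chosen properties are assumptions for the illustration.

```python
import random
import string


def join(first: str, second: str) -> str:
    """Hypothetical method under test: joins two strings together."""
    return first + second


def random_text(rng: random.Random, max_length: int = 10) -> str:
    """Generate a string of printable characters, possibly empty."""
    length = rng.randint(0, max_length)
    return "".join(rng.choice(string.printable) for _ in range(length))


def check_join_properties(runs: int = 100, seed: int = 0) -> None:
    rng = random.Random(seed)  # seeded so any failure can be reproduced
    for _ in range(runs):
        first = random_text(rng)
        second = random_text(rng)
        result = join(first, second)
        # Properties that must hold for every input, stated without
        # rebuilding the expected value the way production code does.
        assert len(result) == len(first) + len(second)
        assert result.startswith(first)
        assert result.endswith(second)
```

The difference from the random-value test is that many inputs are checked per run, and the assertions describe general properties rather than one recomputed expected value.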


Repeating literal values across tests certainly violates DRY. There is clear duplication. If this were production code I would remove it, but for tests my stance for a long time has been to allow this duplication to remain. Readability and expressiveness are much more important. There are valid times when duplication between tests is a bad thing. While this simple example doesn't suffer from this problem, I will expand on how to keep your tests expressive but DRY in a future post.


  1. Good article, it's something that I keep going back and forth on. My biggest issue with it is as you mentioned the duplication of production code within the test itself.

    I'd much rather have multiple test cases covering the scenarios that can occur. OK, so you may miss one and get a bug. Big deal: as you said, add a test and fix it, and the test cases will grow over time to cover any bugs.

    Doing it this way allows for some nice documentation tests especially if you combine it with BDD and Gherkin you can end up with some real nice living documentation.

  2. Thanks.

    Acceptance Tests written in a Gherkin style should by their definition use literal values anyway, given you are meant to be able to show them to a domain expert and have them understand what is going on.

