Tuesday, 28 April 2015

Getters and Setters are Evil - Redux

Back in early 2011 I wrote one of my most viewed and commented posts at the time - Getters and Setters are Evil. Four years later it's time to review this.

The feedback within the team was generally positive. Production code was written in this style to great success. The core benefit was that encapsulation was preserved, as Business Objects were the sole source of domain logic. As an additional side effect, testing was easier.

However, not everyone within the team agreed that the benefits were worth the extra hassle, or believed in the benefits of encapsulation. I always found that the addition of an IRender interface or similar broke the SRP, even if you moved the logic to a separate class. The OCP suffered too: if view requirements changed, you needed to dig out your business object. The biggest failing is that legacy code and frameworks still require public getters/setters to function.

Over time I found myself and others slipping back to the "old ways" of applying getters/setters without thought.

2015

I now simply use two models, where there used to be one. Changes go to the domain model in the form of commands. Queries get returned as view models. The big change here is to simply split commands from queries and embrace the second model; everything else falls into place. This style also works without a rich domain model. The commands can be expressed as Transaction Scripts or similar if desired.

This is not new, I've applied this style in the past, but the big difference is the business object is never mapped or converted into a view model. There is no relationship between the two. They are two distinct paths in the code. This is the difference that CQRS brings - limited coupling.

Benefits

Encapsulation is preserved as before, but the view model can be tailored to requirements. SOLID principles are not broken, and the view model still plays nicely with frameworks because it exposes public getters/setters to facilitate model binding.

Getters and Setters are not evil as I've concluded before. It just turns out there are better ways of embracing the benefits of thinking they are evil.


The term Business Object is also known as Domain Object, the latter being my preferred term now. I stuck with the original terminology to match the original post's code example.

Wednesday, 22 April 2015

CQRS - The Simplest Introduction

CQRS, or Command Query Responsibility Segregation, is easy to understand, but it can become complex due to the various levels to which developers take the principle behind it. Simply - CQRS is two models, where there used to be one. Nothing more at its heart.

Take the Customer aggregate below. This exposes both commands as void methods and queries as methods with return types. Public state is leaked, but needed in order to display or persist the data. Many frameworks or libraries require public accessibility in order to function.
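As a rough sketch of such an aggregate, assuming a hypothetical Customer with an email address and an active flag (the field and method names here are illustrative only):

```python
class Customer:
    """Aggregate that mixes commands and queries on one model."""

    def __init__(self, customer_id: int, name: str, email: str) -> None:
        # Public state: needed by display and persistence code.
        self.id = customer_id
        self.name = name
        self.email = email
        self.active = True

    # Commands: change state and return nothing.
    def change_email(self, new_email: str) -> None:
        if "@" not in new_email:
            raise ValueError("invalid email address")
        self.email = new_email

    def deactivate(self) -> None:
        self.active = False

    # Query: returns data and changes nothing.
    def display_name(self) -> str:
        return f"{self.name} <{self.email}>"


customer = Customer(1, "Ada", "ada@example.com")
customer.change_email("ada@example.org")
customer.active = False  # nothing stops callers bypassing the commands
```

The last line is the leak in action: because the fields are public, any caller can skip the commands and their validation.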

CQRS states we split commands from queries. This means we end up with a pure Customer aggregate root that exposes behaviour only. Likewise we end up with a basic application service that simply returns data.
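A minimal sketch of the split, assuming a hypothetical Customer and an in-memory dict standing in for the data store. The names CustomerSummary, CustomerQueries and save_to are made up for illustration.

```python
from dataclasses import dataclass


class Customer:
    """Command side: behaviour only, no getters or setters."""

    def __init__(self, customer_id: int, email: str) -> None:
        self._id = customer_id
        self._email = email

    def change_email(self, new_email: str) -> None:
        if "@" not in new_email:
            raise ValueError("invalid email address")
        self._email = new_email

    def save_to(self, table: dict) -> None:
        # The aggregate writes itself out, so persistence needs no getters.
        table[self._id] = {"id": self._id, "email": self._email}


@dataclass(frozen=True)
class CustomerSummary:
    """Query side model: plain data, shaped for the view."""
    id: int
    email: str


class CustomerQueries:
    """Basic application service that simply returns data."""

    def __init__(self, table: dict) -> None:
        self._table = table

    def summary(self, customer_id: int) -> CustomerSummary:
        row = self._table[customer_id]
        return CustomerSummary(id=row["id"], email=row["email"])


table = {}
customer = Customer(1, "ada@example.com")
customer.change_email("ada@example.org")
customer.save_to(table)
summary = CustomerQueries(table).summary(1)
```

Note that CustomerSummary is never built from a Customer instance. The query reads the stored row directly, so the two models share nothing but the data store.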

Benefits

Commands
  • Domain model is purely behaviour.
  • No data is exposed, public fields/methods gone (no getters/setters)
  • Only way to modify customers is via the commands - encapsulation is preserved.
  • Fewer relationships simply for querying/persistence (has-a relationships)
  • Testing is easier, check event raised/command issued rather than state
  • Allows task based UIs, rather than CRUD focused interactions.
  • If you use repositories, you only need a GetById method.
Queries
  • Queries can be simplified - in many cases by a huge amount. Just read from the data store, no need to create relationships between models.
  • You can use direct data access, rather than repositories or other abstractions. This has a lot of benefit.
  • It's easy to develop, with fewer layers and moving parts.
  • You can independently replace persistent storage mechanisms per query based on use cases.
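
To illustrate the testing point from the Commands list, here is a small sketch of asserting on a raised event rather than on state. The CustomerEmailChanged event and the publish callable are assumptions made for this example, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerEmailChanged:
    customer_id: int
    new_email: str


class Customer:
    def __init__(self, customer_id, email, publish):
        self._id = customer_id
        self._email = email
        self._publish = publish  # callable that receives raised events

    def change_email(self, new_email):
        self._email = new_email
        self._publish(CustomerEmailChanged(self._id, new_email))


def test_changing_email_raises_event():
    raised = []
    customer = Customer(1, "ada@example.com", publish=raised.append)

    customer.change_email("ada@example.org")

    # No getters required: the test checks what the aggregate announced.
    assert raised == [CustomerEmailChanged(1, "ada@example.org")]
```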

Complexity

CQRS is an easy concept that introduces many benefits. However, implementation of this pattern can vary from simple to complicated. The extent to which CQRS is implemented should be judged on a case by case basis. Many systems can get away without separating read and write stores, yet still enjoy the benefits that this pattern provides.

Monday, 13 April 2015

Cool URIs Don't Change

I switched domains back in June 2013. This was out of my control. A lot of links were lost despite an attempt to backlink in order to keep the traffic from the old links and new links crossing over. The previous domain also broke content without consideration, so there are links around that simply point to nothing.

To compound the issue I switched this blog's platform back in June 2014. This was much overdue, but an issue fully in my control. This, yet again, broke links despite being for the better. My link management has been poor, and given how annoyed I become at other sites breaking links, it's time to make a stand.

A recent example was when I was on holiday with a limited wifi connection of an evening. A couple of users on Twitter wanted to share some of my content, but the link was broken. After some delay and flip flopping I was finally able to share the post they were after. I am extremely happy that Paul thought of a blog post I wrote, so the fact that he was unable to share it was embarrassing.

An old mentor of mine introduced me to Cool URIs early on in my career and highlighted the importance of choosing a good URI scheme. From this post onwards no links to my content, past or future, will break, regardless of hosting or platform choices. I've introduced an automated process to check each post when the blog is backed up, to ensure this never happens again.

The lesson here is simple. If you publish content on a site under your control, it's your duty to ensure you handle breaking changes.


I debated the use of URL or URI for this post initially. For future reference, URIs identify, URLs identify and locate.

Thursday, 9 April 2015

DRY vs DAMP in Tests

In the previous post I mentioned that duplication in tests is not always bad. Sometimes duplication becomes a problem. Tests can become large, or virtually identical excluding a few lines. Changes to these tests can take a while and increase the maintenance overhead. At this point, DRY violations need to be resolved.

Solutions

Test Helpers

A common solution is to extract common functionality into setup methods or other helper utilities. While this will reduce or remove duplication, it can make tests a bit harder to read as the test is now split amongst unrelated components. There is a limit to how useful such extractions can be, as each test may need to do something slightly different.

DAMP - Descriptive and Meaningful Phrases

Descriptive and Meaningful Phrases is the alter ego of DRY. DAMP tests often use the builder pattern to construct the System Under Test. This allows calls to be chained in a fluent API style, similar to the Page Object Pattern. Internally the implementation will still use literals or value objects, but each test can provide just the differences it needs in order to execute. The key point regardless of how DAMP tests are implemented is to favor readability over anything else, while still eliminating duplication where possible.

The example shows a typical arrange aspect of a test written in the DAMP style. The end result of this builder is we will have the ability to now act and assert against the result - a controller instance. If further tests were required we could use the same setup but simply provide different order dates for example. Additionally we could add or remove further chained calls. Behind the scenes the implementation of these builders is straightforward.
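
As a rough sketch of this style, assuming a hypothetical OrdersController and OrdersControllerBuilder (all names, and the recent_orders method, are invented for illustration), the builder and a test using it might look like this:

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Order:
    placed_on: date


@dataclass
class OrdersController:
    customer_name: str
    orders: list = field(default_factory=list)

    def recent_orders(self, since: date) -> list:
        return [o for o in self.orders if o.placed_on >= since]


class OrdersControllerBuilder:
    """Holds sensible defaults; each test overrides only what it cares about."""

    def __init__(self) -> None:
        self._customer_name = "Default Customer"
        self._orders = []

    def for_customer(self, name: str) -> "OrdersControllerBuilder":
        self._customer_name = name
        return self  # returning self is what allows the calls to chain

    def with_order_placed_on(self, placed_on: date) -> "OrdersControllerBuilder":
        self._orders.append(Order(placed_on))
        return self

    def build(self) -> OrdersController:
        return OrdersController(self._customer_name, list(self._orders))


def test_only_recent_orders_are_listed():
    # Arrange
    controller = (
        OrdersControllerBuilder()
        .for_customer("Ada")
        .with_order_placed_on(date(2015, 1, 10))
        .with_order_placed_on(date(2015, 4, 1))
        .build()
    )

    # Act
    recent = controller.recent_orders(since=date(2015, 3, 1))

    # Assert
    assert recent == [Order(date(2015, 4, 1))]
```

A second test could reuse the same chain with different order dates, or drop the for_customer call and rely on the default.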

I tend to introduce this pattern after the third time of seeing duplication between tests. There is a bit of an overhead otherwise, as the builder itself requires implementation and careful construction. Once you go past three tests the overhead pays for itself by allowing you to rapidly add new tests and make large, structural changes.

Beware the builders becoming too big or complex. If this starts to happen you may wish to refactor as there may be missing abstractions in your design. DAMP tests have numerous advantages, but they should be applied where required rather than for every scenario. Tests for objects that are lower in the dependency graph tend to fit into the more traditional testing patterns, while higher up your stack DAMP tests can prove useful.

Wednesday, 1 April 2015

Randomly Generated Values in Tests

The use of randomly generated test data seems like a good thing at first glance. Having worked with several teams that have used this concept I generally discourage the practice. Consider a simple method that joins together two strings. A test using random values may look like this.
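
A minimal sketch of such a test, assuming a hypothetical join function that separates the two strings with a space and a made-up random_string helper:

```python
import random
import string


def join(first: str, second: str) -> str:
    # Assumed behaviour: the two values are separated by a single space.
    return first + " " + second


def random_string(length: int = 8) -> str:
    return "".join(random.choice(string.ascii_letters) for _ in range(length))


def test_join_with_random_values():
    first = random_string()
    second = random_string()

    result = join(first, second)

    # The expected value has to be rebuilt, mirroring the production code.
    assert result == first + " " + second
```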

Problems

Harder to Read

While this is a toy example to demonstrate the problem, in more realistic scenarios the lack of literal values harms the readability of the tests. It is worth noting the lack of literals causes more lines of code as anything that has importance needs to be stored in a variable or field. My biggest concern is when assertions start to become complicated or even worse, duplicate production code in order to pass. If we wish to treat tests as examples, this is pretty poor.

Edge Cases

Generating a random string seems easy enough. Over time the edge cases in question start to ramp up. You have whitespace, special characters, new lines, numbers and much more to worry about if you wish to do this properly. The code to actually generate random values is often shared via inheritance or composition. This makes changes tricky and dangerous, as you can inadvertently change more than one test when modifying this common code. If the two inputs need to be different, the generator could occasionally produce the same string for both, leading to flaky tests if you're not careful.

Pseudo Random

The random aspect of these tests can confuse developers. In the example above, there is only ever one value for each variable. In other words this test can run many times locally and pass, but fail when executed elsewhere. There may be a subtle bug that is only found after the code is declared complete. This issue often causes failures in the build, at which developers declare "it's just a random failure" before re-triggering the build because a value may be invalid for a specific scenario.

Date/Times can be Tricky

Date/Times are hard enough as it is. Trying to randomly generate these is not worth the hassle.

Solution

My recommendation is to rely on literal values or value objects where possible. These make the test much more readable, and it acts like an example or specification. Additionally, their use allows the inline variable refactoring to take place, meaning shorter, more concise tests.
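
For comparison, a sketch of the same kind of test using literals, again assuming a hypothetical join function that separates its inputs with a space:

```python
def join(first: str, second: str) -> str:
    # Assumed behaviour: the two values are separated by a single space.
    return first + " " + second


def test_join_separates_values_with_a_space():
    assert join("Hello", "World") == "Hello World"
```

With the variables inlined, the test reads as a one line example of what the function does.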

Test Cases/Parameterized Tests

If you wish to test similar scenarios in one go then test cases can help. This is usually the case when you cannot name a test easily because the functionality is the same as an existing test.
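
One way to sketch this with only the standard library is unittest's subTest, assuming the same hypothetical join function. Other test frameworks offer their own test case or parameterized test features.

```python
import unittest


def join(first: str, second: str) -> str:
    # Assumed behaviour: the two values are separated by a single space.
    return first + " " + second


class JoinTests(unittest.TestCase):
    def test_join(self):
        cases = [
            ("Hello", "World", "Hello World"),
            ("", "World", " World"),
            ("Hello", "", "Hello "),
        ]
        for first, second, expected in cases:
            # Each case is reported separately if it fails.
            with self.subTest(first=first, second=second):
                self.assertEqual(join(first, second), expected)
```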

Bugs

The assumption that randomly generated tests catch bugs and cover more ground is wrong. If you really do discover a bug after manual testing or on a live system just write a new test exposing that bug and fix it. Thinking you cover more scenarios by using random values is false.

Property Based Testing

I cannot comment on Property Based Testing fully, but this is certainly an interesting area and does not suffer from the issues above. Worth looking into.

DRY?

This solution certainly violates DRY. There is clear duplication. If this was production code I would remove it; however, for tests my stance for a long time has been to allow this duplication to remain. Readability and expressiveness are much more important. There are valid times when duplication between tests is a bad thing. While this simple example doesn't suffer from this problem, I will expand on how to keep your tests expressive but DRY in a future post.