Tuesday, 2 February 2016

Sproc vs ORM vs Inline vs Polyglot

With relational databases, the common data access patterns tend to fall into three core options.

  • Direct access via inline SQL
  • Stored procedures using the standard library
  • ORM frameworks or libraries

Each of these has its own pros and cons, often leading to heated debate and discussion.

Inline

  • Leaky abstractions.
  • Vulnerable to SQL injection when input is not parameterised (see the sketch after this list).
  • Quick and dirty solution.
  • Non testable by default.
  • Useful for integration testing where dynamic input is required and safe.
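A minimal sketch of the injection risk, assuming ADO.NET with a hypothetical Orders table and a customerName variable supplied by the user. The concatenated query is exploitable; the parameterised version sends the input as data rather than SQL.

    using System.Data.SqlClient;

    // Dangerous: user input is concatenated straight into the SQL text.
    var unsafeSql = "SELECT * FROM Orders WHERE CustomerName = '" + customerName + "'";

    // Safer: the value travels as a parameter and is never parsed as SQL.
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(
        "SELECT * FROM Orders WHERE CustomerName = @customerName", connection))
    {
        command.Parameters.AddWithValue("@customerName", customerName);
        connection.Open();

        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                // Map each row as required.
            }
        }
    }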

Stored Procedures (standard library)

  • Can be clunky and low level to use in places (see the sketch after this list).
  • Non testable by default.
  • Allows the use of DB specific features internally.
  • Easy to tune and optimise as long as the interface is stable.
  • Database developers can optimise how queries execute, independently of the application code.
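For reference, a minimal sketch of calling a stored procedure through ADO.NET (the "standard library" in question here), assuming a hypothetical GetOrdersByCustomer procedure. The manual plumbing is where the clunkiness comes from.

    using System.Data;
    using System.Data.SqlClient;

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("GetOrdersByCustomer", connection))
    {
        // Invoke the procedure by name rather than sending SQL text.
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.AddWithValue("@customerId", customerId);

        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                // Manual column mapping - the low level, clunky part.
                var orderId = reader.GetInt32(reader.GetOrdinal("OrderId"));
            }
        }
    }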

ORMs

  • Testable by default.
  • Complex, large and difficult to use correctly.
  • Leaky abstractions.
  • Optimisation is harder, especially for DB engineers.
  • Mini or lightweight alternatives exist, with fewer of the downsides (see the sketch after this list).
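Dapper is one example of such a lightweight alternative. A minimal sketch, assuming a hypothetical Order type and Orders table - the SQL stays visible and tunable while the mapping boilerplate disappears.

    using Dapper;
    using System.Data.SqlClient;

    public class Order
    {
        public int OrderId { get; set; }
        public string CustomerName { get; set; }
    }

    // Usage, inside some data access method:
    using (var connection = new SqlConnection(connectionString))
    {
        // Dapper maps the result set straight onto the Order type.
        var orders = connection.Query<Order>(
            "SELECT OrderId, CustomerName FROM Orders WHERE CustomerName = @name",
            new { name = customerName });
    }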

Polyglot Persistence

The actual decision of which data access method to use can be a non-issue, provided a good abstraction is in place. Whether you use inline SQL, stored procedures or a full blown ORM is beside the point. Instead of abstracting the implementation detail, focus on the role the object or function has to play. A benefit of this approach is the ability to mix and match data access patterns, as sketched below. Polyglot persistence is gaining traction where alternative data storage solutions are more appropriate.
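A minimal sketch of what focusing on the role might look like - the interface name and types are illustrative only. Any implementation can sit behind it (stored procedures, an ORM, or an entirely different store), and an in-memory fake keeps the consumers testable.

    using System.Collections.Generic;

    public class Order
    {
        public int OrderId { get; set; }
        public int CustomerId { get; set; }
    }

    // Named after the role it plays, not the technology behind it.
    public interface IOrderHistory
    {
        IReadOnlyList<Order> OrdersFor(int customerId);
        void Archive(Order order);
    }

    // One implementation might call a stored procedure, another an ORM,
    // another a document store. This fake keeps business logic testable.
    public class InMemoryOrderHistory : IOrderHistory
    {
        private readonly List<Order> orders = new List<Order>();

        public IReadOnlyList<Order> OrdersFor(int customerId) =>
            orders.FindAll(o => o.CustomerId == customerId);

        public void Archive(Order order) => orders.Add(order);
    }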

N+1

One common flaw that all of these data access patterns can share is the N+1 problem (expanded in the next post).

Tuesday, 26 January 2016

Getting Things Done - For Software Developers

I have been using the incredibly simple techniques within Getting Things Done (GTD) to good effect over the last twelve months.

The System

At a high level the system consists of buckets, grouping and a task store. The actual implementation of GTD systems is down to personal preference. Many find their system changes and evolves over time.

Buckets

Have one or more buckets which act as simple dumping grounds for anything you need to do. My phone, pen and paper, and post-it notes are the three core buckets I use.

Buckets are where you store anything that takes more than a couple of minutes to do. If something takes less time, just do it there and then. Regularly empty the buckets and assign their contents to groupings of related items. Example groupings include tasks around the house, work projects, blog items, or items to buy.

Grouping

Each grouping can then be allocated a priority, essentially becoming a mini kanban board.

Grouping is preferred to having one big todo list because different scenarios allow items to be tackled when the time is right. If you have thirty minutes to spare on the computer, anything that can be done via the PC can be worked on. Likewise, if the weather is good, which tasks can be done outside?

Trello

I use Trello for storing tasks. Trello has the added benefit of being able to assign due dates, notes and comments. The boards also make priorities visible: the more tasks in a column, the more there is to do and potentially the more attention a certain grouping should be given.

Each day one or more filtered emails land in my inbox. These are tasks that need doing within the next twenty-four hours - simple reminders or ticklers to complete a task by a set date.

Day to Day

GTD has been a great assistance not just in software development, but in day to day life in general. There is more to GTD, but the core system is very simple yet highly effective.

One of the biggest benefits of GTD is the ability to clear your mind. As everything is recorded or waiting in a bucket, nothing gets forgotten. Instead you can focus on exactly what you need to be doing at the time.

The use of GTD is partly responsible for the growth of this blog from 2014 to the present.

Tuesday, 19 January 2016

A Lotta Architecture - A Reply to "A Little Architecture"

A recent post about architecture from Uncle Bob got me thinking and talking about a typical day in the life of a developer. It's well worth a read. In fact, at the time of writing this reply the post has 347 retweets and 288 likes - I am one of those statistics.

The advice is practical and I agree with it, but it is not the full story. While deferring architectural decisions as late as possible is a good thing, such details actually tend to be the most important, costly and difficult parts of an application.

In the example, the BusinessRuleGateway allows the business logic to be coded in pure isolation, using a stub or fake. This is fantastic and provides numerous benefits. Sadly the actual implementation of the gateway requires knowledge of MySql. This may be obvious, but the decision of which database to use cannot be deferred or ignored forever.

Once chosen, you will need intricate knowledge of how it works and how it is implemented. When things go wrong and you are staring at a one hundred line stack trace, you had better hope you understand how the DB is configured.

Additionally the gateway interface demonstrates another common problem: leaky abstractions. This particular interface, while coded without an implementation in mind, is tightly coupled to a relational database. If we opted for a file system or document database, the use of transactions would no longer make sense.
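To paraphrase the shape of the problem (this is not the original code, just a sketch in C# with illustrative names), an interface along these lines bakes relational assumptions into an otherwise implementation-agnostic contract:

    public class Thing
    {
        public string Id { get; set; }
    }

    public interface IBusinessRuleGateway
    {
        Thing GetThing(string id);
        void SaveThing(Thing thing);

        // Transactions in the contract quietly assume a relational store.
        // A file system or document database implementation has nothing
        // sensible to do here.
        void StartTransaction();
        void EndTransaction();
    }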

In my experience, such implementation details end up taking the majority of your time and effort - see the 80/20 rule. From small systems to large, this tends to be a common theme.

  • One project was tightly coupled to the web framework. Making a code change required detailed knowledge of the inner workings of the page request/response lifecycle.
  • Another required deep knowledge, awareness and fear of the legacy database schema. Code changes were easy. Plugging in a legacy database took horrific amounts of effort.
  • A current project is working with an asynchronous, distributed system. In order to be productive a solid understanding of the mechanics of message queues and distributed computing is required.

In some of these cases, the advice offered around abstracting implementation details was actually followed. Rarely is the problem ever pure business logic. In a typical week I would bet a large sum of money that the majority of developers find themselves fighting with integration or third party dependencies rather than faulty domain logic.

Deferring decisions is a sign of good architecture, but the act of deferral or hiding behind interfaces only gets you so far. The sad state of affairs is that any implementation detail left unchecked can swallow applications in complexity.

Wednesday, 13 January 2016

Validation is not a Cross Cutting Concern

Attributes in C# are similar to decorators in Python and annotations in Java; other languages may have comparable constructs. This post will use the term attribute throughout, but the ideas apply to each.

While attributes prove useful for cross cutting concerns such as authorization or logging, they can be misused. Attributes should act as metadata, providing no direct behaviour. Failing to do so will make DI, testability and composition very difficult.

These flaws are especially apparent with validation. All input requires validation, but the manner in which it is performed depends on the entry point into the code. Context matters.

Consider order information that requires a billing address and, by definition, its child properties to be populated. An attribute works a treat in this simple case, as sketched below.
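The original code samples are not reproduced here, but a minimal sketch using the DataAnnotations attributes shows the simple case - a required billing address with required child properties:

    using System.ComponentModel.DataAnnotations;

    public class Order
    {
        [Required]
        public Address BillingAddress { get; set; }
    }

    public class Address
    {
        [Required]
        public string Line1 { get; set; }

        [Required]
        public string Postcode { get; set; }
    }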

A problem arises if you want the billing address validation to activate only when the billing address and delivery address differ.

Complexity quickly starts to take over. In a more fully featured example, attributes can start to overwhelm the class. It becomes worse still if the validation must be performed by a third party library or service; finding a hook to integrate becomes troublesome.

Solution

Avoid attributes for validation in all but the simplest scenarios. Even simple scenarios lead to some churn if you later decide to switch. My personal preference now is to avoid attributes altogether, opting instead for a validation service.

The obvious downside to this approach is the appearance of more code. While this is true, composed object graphs benefit from the ability to reuse validators. Additionally, in the case of attributes some degree of testing is required; these tests usually amount to asserting the presence of attributes on properties, which is far from ideal. Validation services do not suffer this problem. Internally the implementation can be switched, altered or refactored without fear of breaking any tests.

Example

The RootValidator is a composite of zero or more actual validators. Each validator can be specific to a particular task. The only requirement is that the interface takes the parent object, ensuring context is not lost when making decisions. The actual interface in this case could be made to use generic types if required. The ValidationResults are a simple value type representing an aggregation of validation failures. This could be extended or modified for further enhancements. A sketch of what this might look like follows.
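The original samples are not shown here, but a sketch of the shape described above could look like this (the Order and Address types are illustrative only):

    using System.Collections.Generic;
    using System.Linq;

    public class Address
    {
        public string Line1 { get; set; }
        public string Postcode { get; set; }
    }

    public class Order
    {
        public Address BillingAddress { get; set; }
        public Address DeliveryAddress { get; set; }
    }

    public interface IValidator
    {
        // Each validator receives the parent object, so context is never lost.
        IEnumerable<string> Validate(Order order);
    }

    public class ValidationResults
    {
        public ValidationResults(IReadOnlyList<string> failures)
        {
            Failures = failures;
        }

        public IReadOnlyList<string> Failures { get; }
        public bool IsValid => Failures.Count == 0;
    }

    public class RootValidator
    {
        private readonly IEnumerable<IValidator> validators;

        public RootValidator(params IValidator[] validators)
        {
            this.validators = validators;
        }

        public ValidationResults Validate(Order order) =>
            new ValidationResults(validators.SelectMany(v => v.Validate(order)).ToList());
    }

    // A focused validator: the rule only applies when the addresses differ
    // (a simple reference comparison keeps the sketch short).
    public class BillingAddressValidator : IValidator
    {
        public IEnumerable<string> Validate(Order order)
        {
            if (order.BillingAddress == null || order.BillingAddress == order.DeliveryAddress)
                yield break;

            if (string.IsNullOrWhiteSpace(order.BillingAddress.Postcode))
                yield return "A differing billing address requires a postcode.";
        }
    }

Wiring up is then a matter of composition, e.g. new RootValidator(new BillingAddressValidator()).Validate(order).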

Benefits

  • Composition makes it possible to provide multiple validators that all do one thing well.
  • Testing is much easier as you can test each validator in isolation.
  • Null validators provide easier higher level testing, as a no-op validator removes the need to build up complex object graphs for other test cases.
  • Developers can follow, debug and understand simple conditional logic more so than framework specific metadata.
  • Open to extension and additions such as third party code.
  • Services never lose context, which allows runtime decisions to be made easily.

Tuesday, 5 January 2016

Application Validation and Domain Validation

There are two types of validation in an application - application validation and domain validation. This applies whether or not you practice DDD. One of my past mistakes has been confusing or conflating these two responsibilities.

Application Validation

Application validation is anything technical or anything domain experts would likely scratch their heads at. Examples include:

  • is the input null?
  • is the input whitespace or empty?
  • is the input within ranges for the datatypes used?
  • is the length of the input suitable for the DB?

Application validation should occur in your application service, along with other technical aspects such as transactions or configuration. This is because different applications have different technical requirements. For example an HTML frontend may differ from a web service, so application validation would need to vary also.

This form of validation is best performed using validation services, as sketched below. Attributes/decorators/annotations can also be used, though the following post will explain why this is usually a bad idea.
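A sketch of what this can look like in an application service - the service and its checks are hypothetical, shown as inline guard clauses for brevity; a validation service as described above would slot into the same place.

    using System;

    public class RegisterUserService
    {
        private const int MaxUsernameLength = 256; // matches the hypothetical DB column size

        public void Register(string username)
        {
            // Application validation: technical checks a domain expert would shrug at.
            if (string.IsNullOrWhiteSpace(username))
                throw new ArgumentException("Username is required.", nameof(username));

            if (username.Length > MaxUsernameLength)
                throw new ArgumentException("Username is too long to store.", nameof(username));

            // Only now hand over to the domain model / domain logic.
        }
    }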

Domain Validation

Domain validation covers concepts the business or domain experts would understand. Examples include:

  • "employees can only take a holiday if they have not used their allowance"
  • "estimated delivery dates should not fall on holidays"
  • "users can only edit their own posts"

Once inside your domain, validation should live as part of your domain model or domain logic. If value types are utilised, you can safely omit additional application validation, as each object ensures its own consistency. A sketch follows.
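As a brief illustration (the type is hypothetical), a value type that can never hold an invalid value means the domain never needs to repeat these checks:

    using System;

    public sealed class EmailAddress
    {
        public string Value { get; }

        public EmailAddress(string value)
        {
            // The constructor guarantees consistency; no further checks needed downstream.
            if (string.IsNullOrWhiteSpace(value) || !value.Contains("@"))
                throw new ArgumentException("A valid email address is required.", nameof(value));

            Value = value;
        }

        public override string ToString()
        {
            return Value;
        }
    }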