Tuesday, 24 May 2016

Your Job Isn't to Write Code

Solving problems is the role of software developers first and foremost. The most interesting aspect is that in many cases it is possible to perform this role without writing a single line of code.

Low Tech

I once worked with a digital dashboard which monitored applications. One of the yet to be implemented features was a key to highlight what each chart related to. During this period many employees would ask which graph related to which feature. The solution was a few weeks away, so as a temporary fix I stuck a Post-it note to the screen. This was by no means the solution, but it was good enough for the time being. The questions went away and eventually the dash was updated to include a digital version. Total lines of code? Zero.

Problem Solving without a Computer

A common experience that many developers encounter is solving a problem while not actually at the computer, programming. In fact this technique of simply taking a break, such as going for a walk, can yield some impressive results. One of my fondest memories of this trick was using shampoo in the shower to walk through a buggy A* implementation using the bathroom tiles. After returning to the task some time later, the stupid mistake stood out. Lines of code to figure out the fix? Zero.

Deferral

Just the other week I began furiously updating an existing application to change how a core feature worked. The solution was not going to be quick, but it seemed like a good idea. About halfway in I reverted the changes. After further thought it turned out there was a much better solution, one that would not introduce risk to the current project's goals. Total lines of code? Minus one hundred, give or take.

Goals over Code

This lack of code is not a bad thing. In all three examples the goal was complete. You can solve problems with a single line or with thousands; it does not matter. If you switch your thinking to focus on completing goals or hitting targets, you are still rewarded with a feeling of accomplishment. The slack time you gain can simply be redirected to other areas or personal improvement.

Lessons

Many wise developers have said this before. The role of a software developer is to solve problems, not write code. This is not new. Unfortunately a younger, naive version of myself ignored this advice.

  • Focus on solving business/customer problems, not writing code.
  • Sometimes you'll write one line of code, others thousands.
  • Not all solutions require code to complete.
  • Focus on hitting goals, not the feeling of productivity writing code can give.

Friday, 20 May 2016

Foreign Key Constraints and Microservices

Database constraints when used in relational databases are great. They ensure data integrity at the lowest level. No one would argue against using them in practice. Essentially constraints can be thought of as assertions against your database. Rules such as required fields, default values and foreign key constraints double check your use of the database. This ensures your application is interacting in a sane manner. Databases often outlive applications, therefore constraints also ensure integrity long after the application has been replaced or modified.
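To make the "assertion" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The posts and comments tables are hypothetical and exist only for illustration. The database itself rejects a comment that points at a post which does not exist.

```python
import sqlite3


def insert_comment_for_missing_post() -> str:
    """Try to insert a comment for a post that does not exist."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when asked to, per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE comments ("
        " id INTEGER PRIMARY KEY,"
        " post_id INTEGER NOT NULL REFERENCES posts(id),"
        " message TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO posts (id, title) VALUES (1, 'Hello')")
    # A comment for an existing post is accepted.
    conn.execute("INSERT INTO comments (post_id, message) VALUES (1, 'Nice post')")
    try:
        # Post 99 does not exist, so the constraint acts as a failed assertion.
        conn.execute("INSERT INTO comments (post_id, message) VALUES (99, 'Orphan')")
    except sqlite3.IntegrityError as error:
        return f"rejected: {error}"
    finally:
        conn.close()
    return "accepted"


print(insert_comment_for_missing_post())
```

The application code never checks whether the post exists. The constraint does that work, and would keep doing it for any future application using the same database.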

Distributed Systems

Distributed systems change how foreign key constraints should be considered. As each service in a distributed system owns its data, data that is mastered by a single service should ensure integrity via foreign key constraints. However outside of this boundary the use of foreign keys should be avoided. This sounds disturbing at first, especially given the traditional approach of a single system backed by a single database.

Example

Consider a blog post service that provides a selection of posts. The service would be responsible for everything related to blog posts, but nothing more. The comments for the site are a separate service, yet there is clearly a link between posts and comments. For example, in order to display both posts and comments a link is needed.

- tblPosts (blog database)
    + Id
    + Title
    + Date
    + Body

Each post would store data related to the blog post itself.

- tblComments (comment database)
    + Id
    + PostId
    + AuthorId
    + Message
    + Date

The comment service would include a reference to each post that the comment is linked to. In this case neither PostId nor AuthorId would use foreign key integrity as other services master this data.

If this were a single database both PostId and AuthorId could enforce integrity, however as each service is independent this is not possible. With physically separate databases this lack of link is quite obvious. Working around this in application code would introduce subtle bugs and temporal coupling. Such solutions are best avoided.

Check Formats

When using the comment service, this approach leaves you with very little work to do other than simple format checks. The format of a PostId and AuthorId should be known, so the comment service can validate at this level. The core benefit is both the blog post service and comment service are highly decoupled. The comments could be changed to another service altogether, even a 3rd party provider, yet other services would remain unaware.
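As a rough sketch of such a format check, assume purely for illustration that PostId and AuthorId are canonical UUID strings. The post does not say what format the identifiers take, and the NewComment type and function names below are hypothetical.

```python
import uuid
from dataclasses import dataclass


def is_valid_id(value: str) -> bool:
    """Return True when value is a UUID in canonical hyphenated form."""
    try:
        # uuid.UUID accepts several spellings, so compare against the canonical one.
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


@dataclass(frozen=True)
class NewComment:
    post_id: str
    author_id: str
    message: str


def validate_comment(comment: NewComment) -> list[str]:
    """Check formats only. Whether the post or author exists is not checked here."""
    errors = []
    if not is_valid_id(comment.post_id):
        errors.append("PostId is not a valid identifier")
    if not is_valid_id(comment.author_id):
        errors.append("AuthorId is not a valid identifier")
    if not comment.message.strip():
        errors.append("Message must not be empty")
    return errors
```

Note that nothing here calls the blog post service. The comment service only needs to know what an identifier looks like, which is what keeps the two services decoupled.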

Valid Format, Invalid Data

Format checks will only provide so much value. There is nothing stopping a well formed request that refers to a blog post that does not exist. In cases such as this there are a few options. One is to provide a compensating action: periodically delete any comments that do not have corresponding blog posts. An alternative would be to rely upon events. Only accept comments for posts that the blog post service has announced as added. Likewise, when that service publishes the fact a post has been removed, any associated comments could be deleted.
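Both options can be sketched in a few lines. This is an in-memory illustration only. The CommentService class, the event handler names and the purge_orphans function are hypothetical, and a real service would receive the events from a message broker and persist its data.

```python
class CommentService:
    """Event driven option: react to events published by the blog post service."""

    def __init__(self) -> None:
        self.known_post_ids: set[str] = set()
        self.comments: list[tuple[str, str]] = []  # (post_id, message)

    def on_post_added(self, post_id: str) -> None:
        self.known_post_ids.add(post_id)

    def on_post_removed(self, post_id: str) -> None:
        # Forget the post and delete any comments associated with it.
        self.known_post_ids.discard(post_id)
        self.comments = [c for c in self.comments if c[0] != post_id]

    def add_comment(self, post_id: str, message: str) -> bool:
        # Only accept comments for posts that have been announced.
        if post_id not in self.known_post_ids:
            return False
        self.comments.append((post_id, message))
        return True


def purge_orphans(
    comments: list[tuple[str, str]], existing_post_ids: set[str]
) -> list[tuple[str, str]]:
    """Compensating action: keep only comments whose post still exists."""
    return [c for c in comments if c[0] in existing_post_ids]
```

The compensating action would run on a schedule against a list of post identifiers obtained from the blog post service, while the event driven version keeps the comment data tidy as changes happen.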

Many Services, Single Database

Confusion and resistance around the use of foreign keys is often found when transitioning from a single database used by a single application to a single database operated upon by multiple systems. Teams adopting microservices find themselves in this dilemma usually when a large, legacy database is involved. In these scenarios existing constraints may need to be removed or modified. Another technique is to have the independent services add dummy data in order to pass database constraints. While this is far from ideal, this pragmatic solution can work well while databases are being separated.

Lessons

  • Use foreign key constraints when using a single database via a single application.
  • Modify, replace or drop constraints when multiple services are writing to a single database.
  • Independent services should own their own data. Only enforce integrity within service boundaries.
  • Outside of service boundaries, use format checks to prevent errors.
  • Rely on compensating actions or events for data management.

Tuesday, 10 May 2016

Past Mistakes - Out of Process Commands

Some of the best lessons you can learn are from failure. I figured a series on mistakes I've made in the past would highlight where I went wrong and more importantly what to remember going forward. These real life examples vary from my early days of programming all the way up until present day.


I once wrote a feature that sent email to users on their behalf. On localhost this was fine. Fast, stable and good enough to get the job done.

Despite early successes, under load in a live environment, things were different. Sometimes the process would outright fail, requiring the user to retry. Other times it would be slow to process. This meant the user's browser would hang while the email was being sent.

It was hard to replicate these problems. The actual code itself was pretty simple, and there was nothing to optimize it seemed.

Mistakes

The core mistake was performing an operation out of process from within the life cycle of an HTTP request.

When sending the email was slow, the HTTP response was slow as the thread was blocked. This was blindingly obvious after the fact.

Frustratingly, actually demonstrating or testing this problem was hard. Locally the server was nearby so latency was lower. This started to introduce other red herrings, such as whether the server was misconfigured.

What to do Instead

After the user has requested an email, record this fact and simply display a success message. Do this as quickly and simply as possible. While the message states an email has been sent this is not strictly true.

Instead the act of requesting the email is recorded. Ideally via a message queue or other durable storage solution. A separate service then monitors this queue and periodically sends out emails.
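A minimal sketch of this split follows. It uses Python's in-memory queue.Queue purely as a stand-in. A real system would need a durable queue or a database table so requests survive a restart, and the function names here are hypothetical.

```python
import queue
from typing import Callable


def handle_email_request(outbox: queue.Queue, recipient: str, subject: str) -> str:
    """Runs inside the HTTP request: record the request and return straight away."""
    outbox.put({"to": recipient, "subject": subject})
    return "Your email has been sent."


def drain_outbox(outbox: queue.Queue, send: Callable[[dict], None]) -> int:
    """Runs in a separate worker: send whatever has been recorded so far."""
    sent = 0
    while True:
        try:
            message = outbox.get_nowait()
        except queue.Empty:
            return sent
        try:
            send(message)  # the slow, failure prone out of process call
        except Exception:
            # Keep the request so a later run can retry it.
            outbox.put(message)
            return sent
        sent += 1
```

The request handler never touches the mail server, so its response time no longer depends on it. The worker can be slow, can fail and can retry without the user ever noticing.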

Users will not care if an email lands a few seconds or minutes after the fact. Additionally if anything goes wrong during this process no data is lost. The user will get their email eventually. Most e-commerce sites work in this exact manner.

This approach works great when commands from users cannot and should not fail. Examples such as processing payments or key user interactions would be excellent candidates.

Unfortunately not all out of process requests can be avoided. HTTP queries to retrieve data are one example. This cannot be faked. In these cases minimize the number and rely on other techniques, such as HTTP's excellent caching policies, to reduce the effect on the system.

Lessons

  • Never perform out of process commands that must not fail from within the life cycle of an HTTP request.
  • Fear all out of process calls - they are costly, prone to failure and can cause chaos with your system's performance. Reduce and replace where possible.
  • When commands that should not fail are required, use a message queue to record the command prior to processing it.
  • Rely on HTTP caching policies to reduce the effect of queries that cannot be avoided.

Saturday, 7 May 2016

You Rarely Need Custom Exceptions

Implementing custom exceptions usually gives a hint as to why you rarely need custom implementations. They are often nothing more than subclasses where the only difference is the type name and the message they contain.

In this C# example there is a lot of code for nothing. When checking logs or handling bugs you will read the message and the stack trace. The first line containing a bespoke name rarely matters. Within the code throwing the exception very little context is gained from the type of exception - instead most of the details will be present within the error message.

Each custom exception you introduce adds overhead from source lines of code (SLOC) to compilation and execution.
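The same pattern can be sketched in Python. The class names are borrowed from the examples later in this post and are otherwise hypothetical. Python needs less ceremony than C# for this, but the shape is the same: each class adds a type name and nothing else.

```python
class NoBlogPostsFoundError(Exception):
    """A new type whose only contribution is its name."""

    def __init__(self, message: str = "No blog posts were found") -> None:
        super().__init__(message)


class BlogPostConfigurationError(Exception):
    """Another placeholder: identical behaviour, different name."""

    def __init__(self, message: str = "Blog post configuration is invalid") -> None:
        super().__init__(message)


def load_posts(posts: list) -> list:
    if not posts:
        # The useful information is in the message, not the type.
        raise NoBlogPostsFoundError()
    return posts
```

Neither class carries extra data or behaviour, so a built-in exception with the same message would tell whoever reads the log exactly as much.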

Alternative

Simply do not create custom exceptions except in the rarest of occasions. Instead rely on the standard library of the language you are using.

Take Python as an example [Video]. ~200,000 lines of code yet only ~165 exceptions. This works out at about one exception for ~1200 lines of code.

If battle hardened and widely used standard libraries need only a fraction of the amount of custom exceptions, what makes your tiny CRUD app so special that it needs a namespace dedicated to handfuls of bespoke implementations?

Example

Rather than throwing NoBlogPostsFoundException use an HttpException with a useful message. Instead of BlogPostConfigurationException use ConfigurationErrorsException. Trying to add a comment to a post that is not published? Use an InvalidOperationException.

The downside to this suggestion is knowledge. You need to know what exception to use and more importantly where to find it. Consulting documentation or simple digging around will often yield what you need. As a rule try and default to reusing an exception over creating a new one.

The benefit of this approach is less code, and the removal of placeholder classes where the only thing that differs is the message. To ensure nothing is lost in communicating intent, care must be taken to ensure the message is useful, concise and clear.
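Here is a sketch of the same idea using Python's built-in exceptions. The mapping is my own rough analogue of the .NET types named above, not an official equivalence, and the functions and dictionary keys are hypothetical.

```python
def latest_post(posts: list[dict]) -> dict:
    if not posts:
        # Built-in LookupError instead of a bespoke "no posts found" type.
        raise LookupError("No blog posts found: the posts list is empty")
    return max(posts, key=lambda post: post["date"])


def read_page_size(settings: dict) -> int:
    raw = settings.get("page_size")
    if raw is None or not str(raw).isdigit() or int(raw) == 0:
        # Built-in ValueError instead of a bespoke configuration exception.
        raise ValueError(f"page_size must be a positive whole number, got {raw!r}")
    return int(raw)


def add_comment(post: dict, message: str) -> None:
    if not post.get("published"):
        # Built-in RuntimeError for an operation that is invalid in the current state.
        raise RuntimeError(f"Cannot comment on unpublished post {post['id']!r}")
    post.setdefault("comments", []).append(message)
```

No new classes are needed, and each message states what went wrong and which value caused it.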

Custom Exceptions

There are two exceptions (see what I did there) to this rule.

  1. When you explicitly need to handle a certain scenario and you cannot allow other unhandled exceptions to trigger that code path. In this case a custom exception may be valid. As usual question whether an exception is necessary at all, it may be possible to control this with an explicit code path.
  2. When the exception has some form of behaviour. This tends to be common with frameworks, where an exception of type X changes the flow but also carries out some action such as building up an error response.

In these cases this behaviour belonging with the exception makes sense. Generally most code bases treat exceptions equally. In other words any exception triggers a failure path, meaning the type of the exception does not matter in most cases.
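A small sketch covering both cases, assuming a hypothetical web framework. HttpError, to_response and handle are illustrative names and do not come from any real library.

```python
from typing import Callable


class HttpError(Exception):
    """Custom exception that earns its keep by carrying behaviour."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(message)
        self.status = status

    def to_response(self) -> dict:
        # The exception knows how to build its own error response.
        return {"status": self.status, "body": {"error": str(self)}}


def handle(request_handler: Callable[[], object]) -> dict:
    try:
        return {"status": 200, "body": request_handler()}
    except HttpError as error:
        # Only this type triggers this code path. Anything else propagates.
        return error.to_response()


def missing_post() -> object:
    raise HttpError(404, "Blog post not found")


print(handle(missing_post))
```

An unrelated failure such as a ValueError is not caught by handle, which is the point of the first case: the custom type keeps other unhandled exceptions off this code path.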

Lessons

  • Reuse exceptions from the standard library, chances are there is one fit for the job already.
  • Only introduce custom exceptions if the scenario is exceptional and needs to be handled uniquely.
  • Put effort into ensuring the message of an exception is useful - messages and the stack trace are the most important elements.

Tuesday, 26 April 2016

X% of Configuration is Never Used

Code configuration is essential for the likes of URLs, credentials or other per deployable settings. Sadly configuration tends to go wrong in one of two ways: there is simply too much of it, or the system has so many configuration points that the actual code becomes far too complex for its own good.

Too Much Config

I once worked on a system with in excess of six hundred different configuration points. In reality only a handful of these would ever actually need changing. Most configuration is added to enable anyone to make the change. Ironically if these configuration points do need changing, developers need to do it. The business or non technical individuals will never change settings. In this scenario you would need to actually test every combination of those six hundred settings. 1 on, 599 off, 2 on, 598 off and so on - this is not ideal nor realistic.
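To put a number on "not realistic", assume for the sake of argument that each of the six hundred settings is a simple on/off switch. The real system will have had richer values, which only makes the total larger. The function name is hypothetical.

```python
def configuration_combinations(setting_count: int, values_per_setting: int = 2) -> int:
    """Number of distinct configurations when every setting varies independently."""
    return values_per_setting ** setting_count


total = configuration_combinations(600)
# Print how many decimal digits the total has rather than the number itself.
print(len(str(total)))
```

Under that assumption the total is 2 to the power of 600, a number with 181 digits. No test suite is going to cover that.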

Configurable Systems are Complex

One of the earliest project mistakes I can remember involved creating a system that could be configured by anyone. A simple task became a several day exploration in failure. Each quarter a minor change to a static ASP page was required. This involved a date and some minor alterations to some financial wording for legal requirements. Instead of simply making the change I started building a custom CMS. A form overlaid the content, allowing anyone to make the change and generate the page. It worked a treat technically, except it never saw the light of day. The business would not use it. Numerous parties needed to approve the change before it could be put live: security, legal, branding and several more. Also using the form still required some implicit knowledge of HTML. At the end of this we threw the prototype away and I made the change in a matter of minutes. My mistake here was building a solution that was not required.

Implementation

When it comes to implementing configuration a common mistake is to rely upon the method of obtaining the value, rather than the value itself. Additionally, some form of abstraction such as IConfiguration is often introduced by mistake.

The solution is to instead provide the configuration value, not the means of obtaining it. This can be done either via a constructor or directly to the method. This allows the configuration to be provided in different manners such as from a DB or file, with no code changes apart from the composition root. Such solutions are easily testable and open to modification.
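A minimal sketch of the difference follows. EmailSender, build_email_sender and the SMTP_HOST and SMTP_PORT keys are hypothetical names chosen for illustration.

```python
import os
from typing import Mapping


class EmailSender:
    """Receives plain values and has no idea where they came from."""

    def __init__(self, smtp_host: str, smtp_port: int) -> None:
        self.smtp_host = smtp_host
        self.smtp_port = smtp_port

    def endpoint(self) -> str:
        return f"{self.smtp_host}:{self.smtp_port}"


def build_email_sender(settings: Mapping[str, str]) -> EmailSender:
    """Composition root: the only place that knows how settings are obtained."""
    return EmailSender(settings["SMTP_HOST"], int(settings.get("SMTP_PORT", "25")))


# In tests, pass the values directly. No configuration abstraction to fake.
sender = EmailSender("localhost", 2525)
print(sender.endpoint())
```

In production the composition root might call build_email_sender(os.environ), and swapping that for a file or database lookup would change only that one call.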

Lessons

  • Only add configuration for values that will certainly change between deployable units such as credentials or URLs.
  • Leave everything else where it belongs, either in the source file next to a class, in a method or whatever is easiest. If it needs to change, just make the change when the time comes. Chances are it will never come.
  • If a configuration value is changed, run your automated tests (or a subset) against the deployable unit.
  • A configuration change should be treated as a code change.
  • The business will never change your configuration - that's a technical task.
  • Provide configuration values, not the means of obtaining them.
  • Rely upon convention over configuration as much as possible.