Saturday, 25 June 2016

Ten Lessons from Rewriting Software

  1. It Will Take A Lot Longer Than Estimated

    • It's naive to think otherwise, but if a system has been in production for, say, five years, expecting to reproduce it in five weeks is not realistic. You may be able to get 80% of the core functionality done, but the remaining 20% that was added, iterated on and stabilized over those five years is what will destroy any form of schedule.
    • If your estimate exceeds three months, you need to reassess what you are doing by breaking down the work or changing the plan. The bigger the estimate, the bigger the risk.
  2. Deploy Incrementally Via CI

    • If you aren't deploying to a live environment as soon as possible, any future releases are destined to be failures, troublesome or just plain difficult.
    • Soft releases and feature toggles should be used to aid constant releases.
  3. Morale Will Drop The Longer It Goes On

    • Probably the biggest and most surprising realization is the drop in personal and team morale.
    • If you miss a "deadline" or keep failing to ship, then morale will tank.
    • While software is never complete, a rewrite has a definitive target. If this target continues to move, team morale will move too.
  4. Users Will Probably Hate It Anyway

    • Your users will complain about change, predominantly in the UI.
    • Big sweeping changes often receive the most hate. A website I frequent had a major change both in visuals and the underlying technology used. While there was warning, you were left on your own to figure out where features were. This caused a great deal of frustration and negative feedback.
    • Small, incremental changes allow your users to keep pace.
    • Alternatively, a tutorial or hint system can help reduce user pain.
  5. Do What The Legacy System Does

    • As many of the original developers will likely have moved on, no one is really sure what the legacy system does.
    • Even with the source code available, it is likely going to be hard to figure out the intent. After all, that's one of the reasons for the rewrite.
    • If you are not careful you will end up simply reimplementing the same legacy in a new language or framework. Always weigh up preserving existing behaviour versus introducing technical debt.
  6. Be Cheap And Quick - Use Stubs

    • When implementing the new system, don't build a thing. At least at first.
    • Use stubs to build the simplest, dumbest thing you can to get feedback.
    • Without fully integrating the system end to end, you'll end up throwing away a great deal of code.
  7. Feedback, Feedback, Feedback

    • Early and fast feedback is essential.
    • With a working end-to-end system, gather as much feedback as you can from any stakeholders.
    • Chances are that as you begin, you'll naturally incur some additions, removals or modifications.
    • Waiting months or longer for feedback is a guaranteed path to failure.
  8. Thin Vertical Slices Over Fat Technology Splits

    • Avoid the temptation to have a UI team, a backend team and a data team and so on.
    • Splitting at technology boundaries leads to systems that do not integrate well, or worse fail to handle the required use cases.
    • Your first iteration should consist of all parts of the technology stack, in the thinnest manner possible. Combine this with early feedback and the fast development speed of stubs.
  9. Strangle Existing Legacy Code

    • When rewriting in increments or by logical sections, the technique of strangulation is useful.
    • Instead of releasing the new code as a standalone piece, integrate the new code into the existing legacy code base.
    • This may be tricky at first. However, over time the legacy system will become nothing but an empty shell that integrates with the new system.
    • The beauty of this approach is early feedback, and a guarantee that the new system behaves as intended.
    • The final step would be to replace the legacy shell with the new modern interface or frontend.
  10. Refactor Where Possible

    • Deciding to refactor or rewrite is never easy. Refactoring should be the default approach in many cases.
    • Old languages or unsupported frameworks are good reasons to adopt a rewrite, but this varies case by case.
    • If business agility is suffering, a rewrite can be beneficial when using some of the techniques above.
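
Lessons 2, 6 and 9 can be combined in a small sketch. This is only an illustration under assumptions: the shipping functions, the StranglerFacade class and the "shipping" feature name are all hypothetical, not taken from any real system. The facade sits in front of the legacy code, the new code starts life as a stub, and a feature toggle decides which one handles a call.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[float], float]


def legacy_shipping_cost(weight_kg: float) -> float:
    """Stand-in for behaviour that lives in the legacy system (hypothetical)."""
    return 5.0 + 1.5 * weight_kg


def new_shipping_cost(weight_kg: float) -> float:
    """Stub for the rewrite: a hard-coded flat rate, just enough to get feedback."""
    return 7.5


class StranglerFacade:
    """Single entry point that routes each feature to legacy or new code."""

    def __init__(self) -> None:
        self._routes: Dict[str, Tuple[Handler, Handler]] = {}
        self._toggles: Dict[str, bool] = {}

    def register(self, feature: str, legacy: Handler, new: Handler) -> None:
        self._routes[feature] = (legacy, new)
        self._toggles[feature] = False  # soft release: new code ships switched off

    def set_toggle(self, feature: str, enabled: bool) -> None:
        if feature not in self._routes:
            raise KeyError(f"unknown feature: {feature}")
        self._toggles[feature] = enabled

    def call(self, feature: str, weight_kg: float) -> float:
        legacy, new = self._routes[feature]
        handler = new if self._toggles[feature] else legacy
        return handler(weight_kg)


facade = StranglerFacade()
facade.register("shipping", legacy_shipping_cost, new_shipping_cost)
print(facade.call("shipping", 2.0))  # handled by the legacy code
facade.set_toggle("shipping", True)
print(facade.call("shipping", 2.0))  # handled by the new stub
```

Because the toggle defaults to off, the new code can be deployed continuously and switched on, or back off, without another release.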

Tuesday, 14 June 2016

DDD - Bounded Contexts

A single domain can grow large when applying Domain Driven Design. It can become very hard to contain a single model when using ubiquitous language to model the domain. Classic examples prevalent in many domains would be Customer or User models. A bounded context allows you to break down a large domain into smaller, independent contexts.

In different contexts a customer may be something completely different, depending on who you ask and how you use the model. For example, take three bounded contexts within a typical domain that allows customer administration, customer notifications and general reporting.

Example

Notification Context

A customer is their account id, social media accounts, email and any marketing preferences. Anything that would be required to uniquely identify a customer, and send a notification.

    + Id
    + Email
    + Marketing Preferences
    + Social

Reporting Context

When reporting, customers are nothing more than statistics. A unique customer Id is enough for aggregation and statistics collection.

    + Id

Account Context

Allowing the customer to administer their account would require anything personally related to the customer to be modelled.

    + Id
    + First Name
    + Last Name
    + Address
    + Email

Despite the common elements such as Id and email, the other elements are specific to the context in which the customer is used. One of the biggest mistakes I've made by ignoring a bounded context is to see a common model and try to apply this everywhere. This leads to less code, but increases coupling. A single small change in one context can cause a rippling effect. In fact the best solution is to have a customer model per context.

The result of this approach is you will end up with at least three models using the example above. While structural duplication increases, coupling decreases. Each context can change and evolve at its own pace. This is a good thing. No business logic here is being duplicated, only the model. As each context operates in its own speciality, there should never be a case where this is problematic.
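
As a rough sketch, the three contexts above could each carry their own customer model. The class and field names below are assumptions derived from the lists in the example, and Python dataclasses are just one way to express them. Note the deliberate absence of a shared base class.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class NotificationCustomer:
    """Customer as seen by the notification context."""
    id: str
    email: str
    marketing_preferences: List[str] = field(default_factory=list)
    social_accounts: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class ReportingCustomer:
    """Customer as seen by the reporting context: only an identifier."""
    id: str


@dataclass
class AccountCustomer:
    """Customer as seen by the account context."""
    id: str
    first_name: str
    last_name: str
    address: str
    email: str


def to_reporting(customer: AccountCustomer) -> ReportingCustomer:
    """Translate at the boundary instead of sharing one model."""
    return ReportingCustomer(id=customer.id)


account = AccountCustomer("42", "Ada", "Lovelace", "1 Example Street", "ada@example.com")
print(to_reporting(account))
```

Adding a field to AccountCustomer, such as a phone number, touches neither the reporting nor the notification model.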

Lessons

  • Structural duplication across bounded contexts is not a bad thing.
  • Resist the urge to use a base class for common attributes. This is especially true if you use an ORM or anything else that will couple you further when these models are used.
  • Ending up with multiple models, one per bounded context, is likely going to happen. Embrace it.

Tuesday, 7 June 2016

Given When Then Scenarios vs Test Fixtures

There are two common ways of writing automated tests which apply from unit to acceptance tests. These are typically known as test fixtures and Given-When-Then scenarios.

Test Fixture

  • Traditional method of writing tests.
  • The common JUnit/NUnit approach. Other languages have very similar concepts.
  • Single test fixture with multiple tests.
  • Test fixture is usually named after the subject under test.
  • Can grow large with many test cases.
  • Works well with data driven tests.
  • Suited to solitary tests such as integration tests where GWT syntax would be verbose or hard to include.

Example
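
A minimal sketch of the test fixture style, using Python's unittest in place of JUnit/NUnit. The Basket class is a hypothetical subject under test invented for this illustration. One fixture, named after the subject, holds several tests, including a data driven one.

```python
import unittest


class Basket:
    """Hypothetical subject under test."""

    def __init__(self):
        self._prices = []

    def add(self, price):
        if price < 0:
            raise ValueError("price must not be negative")
        self._prices.append(price)

    def total(self):
        return sum(self._prices)


class BasketTests(unittest.TestCase):
    """Single fixture named after the subject under test."""

    def setUp(self):
        self.basket = Basket()

    def test_new_basket_total_is_zero(self):
        self.assertEqual(self.basket.total(), 0)

    def test_total_sums_added_prices(self):
        # Data driven: each case is a list of prices and the expected total.
        cases = [([5], 5), ([5, 10], 15), ([1, 2, 3], 6)]
        for prices, expected in cases:
            with self.subTest(prices=prices):
                basket = Basket()
                for price in prices:
                    basket.add(price)
                self.assertEqual(basket.total(), expected)

    def test_negative_price_is_rejected(self):
        with self.assertRaises(ValueError):
            self.basket.add(-1)
```

Saved to a file, this can be run with python -m unittest.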

Given-When-Then

  • Behaviour driven approach (BDD style).
  • Made popular by tools such as RSpec.
  • Single test fixture per behaviour.
  • Test fixtures named after the functionality being tested.
  • Often nested within other test fixtures.
  • Smaller test fixtures but more verbose due to fixture per functionality.
  • Easy to see why a test failed due to naming convention - assertion message is optional.
  • Suited to sociable tests where the focus is on behaviour.
  • Given forms the pre-condition of the test.
  • When performs the action.
  • Then includes one or more related assertions.
  • GWT scenarios can be difficult to name in some cases. More thought and discussion is often required around good naming conventions.
  • Can act as useful documentation on how the code is meant to function.

Example
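
A minimal sketch of the Given-When-Then style, again using Python's unittest rather than a dedicated BDD tool such as RSpec. The Basket class is a hypothetical subject invented for this illustration. Each fixture is named after one behaviour, setUp holds the Given and When steps, and each test is a Then.

```python
import unittest


class Basket:
    """Hypothetical subject under test."""

    def __init__(self):
        self._prices = []

    def add(self, price):
        self._prices.append(price)

    def count(self):
        return len(self._prices)

    def total(self):
        return sum(self._prices)


class WhenAnItemIsAddedToAnEmptyBasket(unittest.TestCase):
    def setUp(self):
        self.basket = Basket()  # Given an empty basket
        self.basket.add(5)      # When an item is added

    def test_then_the_basket_contains_one_item(self):
        self.assertEqual(self.basket.count(), 1)

    def test_then_the_total_is_the_item_price(self):
        self.assertEqual(self.basket.total(), 5)


class WhenASecondItemIsAddedToTheBasket(unittest.TestCase):
    def setUp(self):
        self.basket = Basket()
        self.basket.add(5)   # Given a basket with one item
        self.basket.add(10)  # When a second item is added

    def test_then_the_basket_contains_two_items(self):
        self.assertEqual(self.basket.count(), 2)

    def test_then_the_total_is_the_sum_of_both_prices(self):
        self.assertEqual(self.basket.total(), 15)
```

A failing test reads as a sentence, for example "when a second item is added to the basket, then the total is the sum of both prices", which is why the assertion message becomes optional. Nesting of fixtures is omitted to keep the sketch short.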

Lessons

  • No single way of writing automated tests is better.
  • Favour single test fixtures for integration tests.
  • The core of your tests can use GWT style.
  • However, mix and match where appropriate.
  • Your choice of tooling and language may influence your approach.

Tuesday, 24 May 2016

Your Job Isn't to Write Code

Solving problems is the role of software developers first and foremost. The most interesting aspect is that in many cases it is possible to perform this role without writing a single line of code.

Low Tech

I once worked with a digital dashboard which monitored applications. One of the yet to be implemented features was a key to highlight what each chart related to. During this period many employees would ask which graph related to which feature. The solution was a few weeks away, so as a temporary fix I stuck a Post-it note to the screen. This was by no means the solution, but it was good enough for the time being. The questions went away and eventually the dashboard was updated to include a digital version. Total lines of code? Zero.

Problem Solving without a Computer

A common experience for many developers is solving a problem while not actually at the computer, programming. In fact, simply taking a break, such as going for a walk, can yield some impressive results. One of my fondest memories of this trick was using shampoo in the shower to walk through a buggy A* implementation using the bathroom tiles. After returning to the task some time later, the stupid mistake stood out. Lines of code to figure out the fix? Zero.

Deferral

Just the other week I began furiously updating an existing application to change how a core feature worked. The solution was not going to be quick, but it seemed like a good idea. About halfway in I reverted the changes. After further thought it turns out there was a much better solution. One that would not introduce risk to the current project's goals. Total lines of code? Minus one hundred, give or take.

Goals over Code

This lack of code is not a bad thing. In all three examples the goal was complete. You can solve problems with a single line, or thousands, it actually does not matter. If you switch your thinking to focus on completing goals or hitting targets, you are still rewarded with a feeling of accomplishment. The slack time you gain can simply be redirected to other areas or personal improvement.

Lessons

Many wise developers have said this before. The role of a software developer is to solve problems, not write code. This is not new. Unfortunately, a younger, naive version of myself ignored this advice.

  • Focus on solving business/customer problems, not writing code.
  • Sometimes you'll write one line of code, others thousands.
  • Not all solutions require code to complete.
  • Focus on hitting goals, not the feeling of productivity writing code can give.

Friday, 20 May 2016

Foreign Key Constraints and Microservices

Database constraints, when used in relational databases, are great. They ensure data integrity at the lowest level. No one would argue against using them in practice. Essentially, constraints can be thought of as assertions against your database. Rules such as required fields, default values and foreign key constraints double check your use of the database. This ensures your application is interacting in a sane manner. Databases often outlive applications, therefore constraints also ensure integrity long after the application has been replaced or modified.
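
To make the "assertions against your database" idea concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The posts and comments tables, their columns and the try_execute helper are assumptions made up for the illustration, and they model the traditional case of a single application backed by a single database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript(
    """
    CREATE TABLE posts (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL                             -- required value
    );
    CREATE TABLE comments (
        id      INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id),  -- foreign key
        message TEXT NOT NULL,
        status  TEXT NOT NULL DEFAULT 'pending'         -- default value
    );
    """
)


def try_execute(sql: str, params: tuple = ()) -> bool:
    """Run one statement; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


insert_comment = "INSERT INTO comments (post_id, message) VALUES (?, ?)"
print(try_execute("INSERT INTO posts (id, title) VALUES (?, ?)", (1, "Hello")))
print(try_execute(insert_comment, (1, "Nice post")))  # accepted
print(try_execute(insert_comment, (99, "Orphan")))    # rejected: no such post
print(try_execute(insert_comment, (1, None)))         # rejected: message is required
```

The rejected inserts never reach the table, regardless of which application issued them.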

Distributed Systems

Distributed systems change how foreign key constraints should be considered. As services in a distributed system own their data, each piece of data that is mastered by a single service should ensure integrity via foreign key constraints. However, outside of this boundary the use of foreign keys should be avoided. This sounds disturbing at first, especially given the traditional approach of a single system backed by a single database.

Example

Consider a blog post service that provides a selection of posts. The service would be responsible for everything related to blog posts, but nothing more. The comments for the site are a separate service, yet there is clearly a link between posts and comments. For example, in order to display both posts and comments a link is needed.

- tblPosts (blog database)
    + Id
    + Title
    + Date
    + Body

Each post would store data related to the blog post itself.

- tblComments (comment database)
    + Id
    + PostId
    + AuthorId
    + Message
    + Date

The comment service would include a reference to the post that each comment is linked to. In this case neither PostId nor AuthorId would use a foreign key constraint, as other services master this data.

If this were a single database, both PostId and AuthorId could enforce integrity. However, as each service is independent, this is not possible. With physically separate databases the lack of a link is quite obvious. Working around this in application code would introduce subtle bugs and temporal coupling. Such solutions are best avoided.

Check Formats

When using the comment service, this approach leaves you with very little work to do other than simple format checks. The format of a PostId and AuthorId should be known, so the comment service can validate at this level. The core benefit is both the blog post service and comment service are highly decoupled. The comments could be changed to another service altogether, even a 3rd party provider, yet other services would remain unaware.
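
A format check might look something like the sketch below. It assumes, purely for illustration, that PostId and AuthorId are canonical hyphenated UUID strings; the post does not say what format the identifiers use, and the function names are hypothetical.

```python
import uuid
from typing import List


def is_valid_id(value: object) -> bool:
    """True when value is a canonical, hyphenated UUID string (assumed format)."""
    if not isinstance(value, str):
        return False
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


def validate_comment(post_id: object, author_id: object, message: object) -> List[str]:
    """Return a list of format errors; an empty list means the comment is well formed."""
    errors = []
    if not is_valid_id(post_id):
        errors.append("PostId is not a valid identifier")
    if not is_valid_id(author_id):
        errors.append("AuthorId is not a valid identifier")
    if not isinstance(message, str) or not message.strip():
        errors.append("Message must not be empty")
    return errors


post_id = "123e4567-e89b-12d3-a456-426614174000"
author_id = "123e4567-e89b-12d3-a456-426614174001"
print(validate_comment(post_id, author_id, "Nice post"))
print(validate_comment("42", author_id, ""))
```

The check needs no call to the blog post service, so the comment service stays unaware of where posts actually live.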

Valid Format, Invalid Data

Format checks will only provide so much value. There is nothing stopping a correctly formatted request that refers to a blog post that does not exist. In cases such as this there are a few options. One is to provide a compensating action: periodically delete any comments that do not have corresponding blog posts. An alternative would be to rely upon events. Only accept comments for posts that the blog post service has announced as added. Likewise, when the service publishes the fact that a post has been removed, any associated comments could be deleted.
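
Both options can be sketched in a few lines. Everything here is hypothetical: the CommentService class, the event handler names and the in-memory storage stand in for whatever messaging and persistence the real services would use.

```python
from typing import Dict, Iterable, List, Set

Comment = Dict[str, str]


def purge_orphans(comments: List[Comment], existing_post_ids: Iterable[str]) -> List[Comment]:
    """Compensating action: keep only comments whose post still exists."""
    existing = set(existing_post_ids)
    return [comment for comment in comments if comment["post_id"] in existing]


class CommentService:
    """Event driven alternative: track posts announced by the blog post service."""

    def __init__(self) -> None:
        self.known_post_ids: Set[str] = set()
        self.comments: List[Comment] = []

    def on_post_added(self, post_id: str) -> None:
        self.known_post_ids.add(post_id)

    def on_post_removed(self, post_id: str) -> None:
        self.known_post_ids.discard(post_id)
        self.comments = [c for c in self.comments if c["post_id"] != post_id]

    def add_comment(self, post_id: str, message: str) -> bool:
        if post_id not in self.known_post_ids:
            return False  # no "post added" event seen for this post
        self.comments.append({"post_id": post_id, "message": message})
        return True


service = CommentService()
service.on_post_added("post-1")
print(service.add_comment("post-1", "Nice post"))  # accepted
print(service.add_comment("post-2", "Orphan"))     # rejected, unknown post
service.on_post_removed("post-1")
print(service.comments)                            # comments for post-1 are gone
```

The compensating action tolerates orphans for a while and cleans up later, whereas the event driven version prevents them at the cost of keeping a local list of known posts.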

Many Services, Single Database

Confusion and resistance around the use of foreign keys are often found when transitioning from a single database to a single database operated upon by multiple systems. Teams adopting microservices find themselves in this dilemma usually when a large, legacy database is involved. In these scenarios existing constraints may need to be removed or modified. Another technique is to have the independent services add dummy data in order to pass database constraints. While this is far from ideal, this pragmatic solution can work well while databases are being separated.

Lessons

  • Use foreign key constraints when using a single database via a single application.
  • Modify, replace or drop constraints when multiple services are writing to a single database.
  • Independent services should own their own data. Only enforce integrity within service boundaries.
  • Outside of service boundaries, use format checks to prevent errors.
  • Rely on compensating actions or events for data management.