Sunday, November 06, 2016

Feature Branching - Taking the ‘Continuous’ Out Of Continuous Integration

XKCD 1597

Feature branch workflows appear to be the flavour of the month for software development right now. At its simplest, each feature is developed in isolation on its own branch, then merged back to the trunk. More complicated variants exist, most notably Git Flow.

Git is undoubtedly a powerful versioning tool. It makes branching easy and reliable, and it is truly distributed. These capabilities open up a whole range of possibilities and workflows to support different ways of working. But, as usual, the sharper the tool, the easier it is to cut yourself if you don't understand what you are doing.

My problem is this. Feature branching is a very effective way of working when you have a large, distributed team. Code is not integrated with the rest of the codebase without a lot of checking. Pull requests ensure changes - often done by developers who are in other countries, and often part of a loose-knit open source development team - are reviewed by a core team before being integrated, keeping the main product clean and working. Which is great.


Except it completely undermines the core principle of continuous integration! 


Let’s be clear here. If you are not combining every single code change into a single trunk branch at least once every day, you are definitely not continuously integrating.

Constant merging of everything as you go is the key enabler for continuous integration. It is the ‘secret sauce’ that provides the main benefit of the technique. The clue is in the name. Every[1] commit triggers a build and test sequence that checks whether the code has regressed. And the only truly effective way to do that is to keep everyone developing on a single main branch. If something really, really needs a separate branch (the exception, not the rule), then its lifecycle needs to be kept short; ideally, less than a day. The longer code stays away from the main branch, unintegrated, the more likely it is that mistakes will go unnoticed.

Having to branch for each feature inevitably means that the code is only integrated with the rest of the codebase when the branch is finally merged back into the main development branch. Often this is days or even weeks later, and includes many separate commits. This is perilously close to the waterfall practice of late integration, and so needs to be avoided unless other factors dictate otherwise - for example, the team is a distributed and loose-knit group of volunteers. If you are working in a small, collocated team, there is absolutely no reason to adopt these regressive patterns.



[1] Well, almost every...

Wednesday, June 01, 2016

The Importance of Being Able to Build Locally

A little while back I wrote an article describing how to do continuous integration. But I left out one important step that happens before any code even enters the CI pipeline. A step so important, so fundamental, so obvious that you don’t realise it is there until it is missing.

You must be able to build and test your software locally, on your developer machine. And by “test” I don’t just mean unit test level, but acceptance tests as well.

I said it was obvious. But once in a while I do stumble across a project where this rule has been missed, and it leads to a world of unnecessary pain and discomfort for the team. So what happens when you cannot easily build and test anywhere apart from the pipeline?

Without being able to run tests locally, the development team is effectively coding blind. They cannot know whether the code they are writing is correct. Depending on where the dysfunction is - compile, unit test or acceptance test stage - the code checked in may or may not break the pipeline build. The Dirty Harry Checkin (“Feeling lucky, punk?”). So the pipeline is likely to break. A lot. Good pipeline discipline means broken builds are fixed quickly, or removed, so that other team members can check in code. But here lies the rub - since there is no local feedback, any fix is unlikely to be identified quickly - how can it be when every single change to fix it has to run through the CI pipeline first? The inevitable result - slow development.

Let’s look a little closer at what is going on. Whenever I see this anti-pattern, it is usually the acceptance tests that cannot be run [1] - they are generally difficult to set up, and/or too slow to run quickly, and/or too slow to deploy new code to test. Let’s apply this to a typical ATDD development cycle; we should all know what this looks like:


Standard stuff - Write a failing acceptance test, TDD until it passes, repeat.
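
To make that loop concrete, here is a minimal, self-contained sketch of the first step in plain JUnit 4. ArticleCatalogue is a hypothetical stand-in for the system under test, inlined so the example compiles; in a real project the acceptance test would drive the actual application, deployed and runnable on a developer machine:

import static org.junit.Assert.assertEquals;

import java.util.ArrayList;
import java.util.List;

import org.junit.Test;

public class PublishArticleAcceptanceTest {

    // Hypothetical system under test, inlined here so the sketch compiles.
    // In practice this would be the real, locally deployed application.
    static class ArticleCatalogue {
        private final List<String> published = new ArrayList<>();

        void publish(String title) {
            published.add(title);
        }

        String latestPublishedTitle() {
            return published.get(published.size() - 1);
        }
    }

    @Test
    public void publishedArticleIsVisibleToReaders() {
        ArticleCatalogue catalogue = new ArticleCatalogue();

        catalogue.publish("French Cheeses");

        // In ATDD this assertion is written before the feature exists and fails
        // first; the TDD cycle then drives out production code until it passes.
        assertEquals("French Cheeses", catalogue.latestPublishedTitle());
    }
}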

Now, let’s drop this development pattern into a system where the developers cannot run their acceptance tests locally and deployment is slow. This happens:
The only way to check whether the acceptance criteria are complete - i.e. the feature is Done - is to push the changes into the CI pipeline. Which takes time. In this time no-one else can deploy anything (or, at least, shouldn’t if we don’t want complete chaos). Getting feedback on the state of the build becomes glacially slow. Which means fixing problems becomes equally delayed. So if, say, the feedback cycle takes 30 minutes, and you have mistakes in the acceptance criteria (remember, you cannot test locally, so don’t really know whether the software works as expected), every single mistake could cost another 30 minutes, plus development time!

So how does being able to build locally fix this? Simple - if you can build and test locally, you know that changes being introduced into the CI pipeline most likely work.  Even if things are a bit slow. Also, instead of having a single channel to check the state of the build, every single development machine becomes a potential test system, so no waiting around for the pipeline to clear - just run it locally, and grab a coffee if needed, or even switch to your pair’s workstation if things are really that slow (clue: if things are that slow, spend the down time fixing it!).

I shall add one unexpected side effect of being able to build locally - it can be used as a poor man's continuous integration pipeline. So if you have trouble commissioning servers to support a real pipeline (sadly there are still companies where this is the case - you know who you are!), with sufficient team discipline it is possible to use local builds as the main verification and validation mechanism. Simply make it a team norm to integrate and run all tests locally before checking in. It does introduce other risks, but it gives a team a fighting chance.

[1] If unit level tests have this problem, there is an even bigger problem. Trust me on this.



Tuesday, May 31, 2016

Twitter

I don’t know if anyone’s noticed, but I’m not very active on Twitter any more.

No, it’s not the annoying “While you were away” pseudo-window that is keeping me away.

Nor is it the Facebook-esque ‘heart’ button (WTF?).

Nor is it the endless supply of advertising bots and idiots that seem to frequent the service (I have most of them filtered out).

It’s not even the regular misaddressed tweets meant for the Thirsty Bear Brewing Co in San Francisco (fine fellows, and purveyors of fine beery comestibles that they are!).

Nope. None of the above. It’s the distraction.

On 5th November 2008, at 0525 (allegedly - I suspect there’s some sort of timezone shenanigans going on there…those who know me realise I am not a morning person) I wrote my first Tweet: “Trying out this Twitter thing….”

Since that date I’ve got involved in all kinds of interesting discussions, mostly around software development, and often with the Great and the Good of the agile community - for which I am extremely grateful since I consider myself to be more "average chaotic neutral", with some habits that (mostly!) keep me from the Dark Side 😃.  Has it been useful? Yes. Has it been fun? Hell, yeah! Did it take far too much of my time? Oh yes!

The immediacy of Twitter meant that I was being dragged into endless conversations. Discussions that had more in common with IRC realtime chat than simply idle Post-It notes on a wall somewhere. They needed immediate attention. Which meant that I was wasting a huge amount of time. I could give up at any time…but…just …one…more…reply… 



So I took a bit of a hiatus, and found out that the world didn’t end. The sky didn’t fall in. I could still find things out. I could still have discussions. But I had time. I found that Twitter was actually getting in the way of what I want to do - work with good people to deliver cool stuff.


I haven’t given it up completely, but I won’t be using Twitter anywhere near as much as I used to. My last serious tweet was in October 2015, and I have not really missed the interaction. I will be keeping an eye on it to see if I can improve the ROI - suggestions gratefully received, but best use the comments on this post rather than Twitter.

Thursday, December 31, 2015

Getting rid of those annoying ._ files

It all started so innocuously. All I was asked to do was put some photos onto a USB stick that could be played on a TV in my local pub. Easy. JPEGs duly loaded onto the drive, plugged in...and found to be incompatible. Weird.

So I checked the JPEG file format against the specification. I changed sizes, I changed encoding type.  Still not playing. Really annoying, and not a little embarrassing!

Then I noticed - the TV was trying to display files with names starting with '._'. Where the hell did they come from? And the penny dropped. OSX creates all kinds of special files to work with non-HFS file systems - like the USB drive's FAT32 format.

These files are part of the AppleDouble format, and are normally hidden in OSX (see this excellent article on how to show/hide hidden files). So to get rid of them, you can either use a Windows or Linux machine, or use the dot_clean command in OSX.

Hopefully this will save someone some time tracking down this problem.




Thursday, June 18, 2015

Don’t overload your Scrum Master!

Aka "Absent Scrum Master Anti-Pattern"

There is a question that keeps cropping up when talking to clients:
“How many teams can a Scrum Master lead?”
My answer? Just the one.

Let’s take a closer look at why people persist in asking this question, and try to understand better why it is generally a Bad Idea™. 

When a team is running well, everyone is happy, and product is being delivered successfully iteration by iteration, feature by feature. There is comparatively little for the Scrum Master to do aside from day-to-day housekeeping. Everyone knows how the process works, everyone plays their part. The system chugs along and delivers. The SM appears to have a huge amount of slack, which most organisations believe is a Bad Thing - slack means they need to squeeze out more performance (read “profit”), in this case by giving the SM something else to do. I.e. another team. And maybe another. And another. Or other random responsibilities. Ignoring the bad consequences of such task-switching, the organisation can soon get the SM’s diary looking something like this:



(I should add that this is a real calendar. I worked with the SM, so I also know that none of the time bookings were related to the main team he was trying to serve)

What you end up with is a system that is at maximum capacity. Which is fine - as long as nothing changes, and nothing goes wrong. A common analogy is having all cars on the motorway driving flat out, bumper-to-bumper. What could possibly go wrong?

A multi-team Scrum Master will not have his full focus on any of the teams. So small problems go unnoticed, or worse still, noticed but uncorrected due to time pressure elsewhere. Perhaps lacklustre delivery, a few bugs appear that are ignored, internal friction in the team. All easy to nip in the bud if caught early. But these kinds of problem tend to grow and multiply if left unchecked. So suddenly a small issue leads to a major pile-up. And all because the Scrum Master was trying to interact with more than one team. As a side effect, while the SM is trying to recover the situation in one team, he will be neglecting the others. And suddenly he is playing a dangerous game of Whack-a-Mole trying to keep all the teams running smoothly. Needless to say, this behavioural pattern often fails.



Finally, coming back to the slack experienced by a team-monogamous Scrum Master during the fair-weather delivery periods - what to do? It is a fair question. My advice is simple. By all means use some of the time on other tasks. But these are entirely secondary. Non-time-dependent "optional extras", if you will. The moment the team needs more time from their SM, all other tasks are dropped and the team’s needs become the number one priority. No question, no penalty, no punishment (e.g. because secondary task xyz wasn’t delivered).

Thursday, March 26, 2015

It's time to make immutability the default

Right. I have to get this off my chest. A follow-on from my habitual coding observation in my previous article.

How many people habitually write Java code like this? (Clue: I see it a lot)

import java.util.List;

public class Article {
    private String title;
    private String author;
    private List<String> tags;

    public void setTags(List<String> tags) {
        this.tags = tags;
    }

    public void setAuthor(String author) {
        this.author = author;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getTitle() {
        return title;
    }

    public String getAuthor() {
        return author;
    }

    public List<String> getTags() {
        return tags;
    }
}

Then you can create an object and use it with something like:
    Article article = new Article();
    article.setAuthor("Cam M. Bert");
    article.setTitle("French Cheeses");
    article.setTags(Arrays.asList("cheese", "food"));

    // etc etc

Great, assuming that you want to allow changes to the object later. But if you have no explicit requirement to change anything, then you have left the door wide open to abuse. Your setters are public. You have written too much code.

OK, let's try something else and lock the door. Wouldn't it be nice if we created the object in the correct state from the off, all ready to use, rather than having to construct then initialise? We end up with something like:
import java.util.List;

public class Article {
    private String title;
    private String author;
    private List<String> tags;

    public Article(String title, String author, List<String> tags) {
        this.title = title;
        this.author = author;
        this.tags = tags;
    }

    public String getTitle() {
        return title;
    }

    public String getAuthor() {
        return author;
    }

    public List<String> getTags() {
        return tags;
    }
}
A bit better. Remember, you don't have any explicit requirement to change values, so you do not need any mutator methods. The tags List object is still open to abuse, but hey-ho....
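
To make that abuse concrete, here is a quick sketch (my own illustration, reusing the constructor-based Article above) of what a caller can still do:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MutableTagsDemo {
    public static void main(String[] args) {
        List<String> tags = new ArrayList<>(Arrays.asList("cheese", "food"));
        Article article = new Article("French Cheeses", "Cam M. Bert", tags);

        // Nothing stops a caller mutating the list behind the Article's back...
        article.getTags().add("completely unrelated tag");

        // ...or the original creator changing it from the outside.
        tags.clear();

        System.out.println(article.getTags()); // [] - probably not what was intended
    }
}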

But I can't help thinking this is a very short hop from:

import java.util.Collections;
import java.util.List;

public class Article {
    public final String title;
    public final String author;
    public final List<String> tags;

    public Article(String title, String author, List<String> tags) {
        this.title = title;
        this.author = author;
        this.tags = Collections.unmodifiableList(tags);
    }
}
Less code to read, easy access to immutable fields.
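
And a quick usage sketch of this final version (again my own illustration; the direct field access and the unmodifiable wrapper are the two things to note):

import java.util.Arrays;

public class ImmutableArticleDemo {
    public static void main(String[] args) {
        Article article = new Article("French Cheeses", "Cam M. Bert",
                Arrays.asList("cheese", "food"));

        System.out.println(article.title);  // direct access to a final field
        System.out.println(article.tags);   // [cheese, food]

        // The wrapped list rejects changes at runtime:
        // article.tags.add("wine");        // throws UnsupportedOperationException
    }
}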

My point is the following. If you have no definite requirement to change the values on a POJO, then make it immutable from the start! Only introduce mutator methods as they are required.

One final thing - yes, you can produce a similar result by using frameworks like Lombok to hide the 'getXyz()' methods, but why introduce the further complexity of a framework when it can be done so elegantly natively?
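
For comparison, this is roughly what a Lombok version might look like - a sketch assuming Lombok's @Value annotation, which generates private final fields, getters, an all-args constructor, equals/hashCode and toString (note it does not wrap the list in Collections.unmodifiableList for you):

import java.util.List;

import lombok.Value;

// Sketch only - needs the Lombok annotation processor on the build path.
@Value
public class Article {
    String title;
    String author;
    List<String> tags;
}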

There. I feel better now :-)


Note - some of the code is appearing with </string> annotations - no idea why. Seems to be a quirk of the Google Blogger editor
Note #2 - the </string> annotations appear to be a Chrome issue. Oh the irony! :)