Agile continuous delivery in the cloud – Part 2

In this post, I’ll talk about keeping the lines of communication open, testing & DevOps.

Software is abstract until it is operational

In line with our ethos that all development teams should get code to a production-ready state as quickly as possible, operations staff, testers, project managers, scrum masters and software developers are all part of the same agile process and work as one team. DevOps came about from breaking down silos, so that people work together, improving collaboration whilst building trust and relationships.

Everyone involved in software development works together on all aspects of delivery, enabling collaboration across functional boundaries. There are lots of moving targets in DevOps & continuous delivery. We have some automation in place (e.g. automated tests, continuous integration) and we also rely on tools and engineering practices, e.g. Jira, Bitbucket, Confluence, TeamCity and Git – the rest we manage by hand.

The way we deliver to the TfL website comes together through Continuous Integration & Delivery, DevOps and Agile

We’ve optimised for the end-to-end cycle time of the whole process to reduce re-work and other unnecessary overheads. Developers commit changes frequently, so bugs and problems are discovered closer to the time they were introduced and are easier to debug. This gives fast feedback every time there is a change in code, configuration or infrastructure, and allows Developers to be more productive by building features rather than wasting time debugging builds that were created months ago.

Responsibility for code quality is shared amongst the entire team. However, to ensure that the codebase remains efficient, consistent and lean, we employ peer-to-peer and Lead Developer code reviews during development to catch bugs and issues early. In addition, each project team has an embedded Technical Architect to advise on User Story review, Sprint planning and any new or complex items before development starts.

Lead Developers assure standards and best practice within their teams, and use weekly catch-up meetings to bridge any gaps in understanding or differences of opinion. We encourage Development teams to reach out to their peers for advice and guidance and to share TfL Online best practice, thus ensuring cross-pollination of standards.

Sprints are aligned to our continuous weekly deployment cycle and developers maintain this pace and heartbeat. Everyone understands that when a release is “signed off” to be promoted to an environment, this is final. This ensures that the team’s work is defined as “done” only when the change has been put into production and is working satisfactorily – not when it was completed ages ago on a remote Developer’s workstation.

We always do a final QA via a project release note before promotion to the global release pipeline. An important aspect of this final QA is assurance that the build time remains as lean as possible and that code additions are efficient and performance-optimised, so that they do not adversely increase server load or CPU usage.

It’s good to talk

When continuously innovating and delivering new software, team communication and planning are crucial to maintaining momentum and quality and to staying aligned with the business drivers. It’s important to get everyone together regularly to secure buy-in, understanding and collaboration.

We do this through our weekly release planning meetings every Tuesday. We aim to be as lean as possible by minimising the need for follow-up actions and further meetings, so we have all of the people required to make decisions in the room.

We have clear guidelines as to our philosophy on agile, continuous delivery to the TfL website

At this meeting, all DevOps Leads get together with the business and project teams to optimise and adjust continuous delivery scope, plans and schedules for each release package. The complexity of continuous delivery is directly proportional to the number of changes at each level in the stack, e.g. infrastructure, database, web application tier, etc. There is always a ripple effect to consider as well, because we can’t change one piece of code without affecting other code. So a critical part of this meeting is to carefully manage scope and be sensible about the volume of change we are deploying each week.

Lines of communication are also kept open and transparent via Jira tickets, HipChat and ongoing check-point meetings. This is essential because of the “domino effect” in our current set-up: there is only one route to market, so one release could quickly block the global release pipeline and potentially throw the weekly release schedule out of sync if it is not carefully managed, co-ordinated and quality assured.

Continuous attention to technical excellence

Testing is not something we do after development is complete. Testing is something we do throughout the full life-cycle, and we encourage everyone to be responsible for quality. We are always looking to improve our process in order to build quality into the product in the first place.

Testers have cross-functional roles, communicating and working alongside Business Analysts and project teams in order to advise on the creation of good quality requirements, User Stories, code reviews, component testing strategies and test techniques.

Our culture of intelligent agile testing ensures the most efficient results during the sprint

Intelligent agile testing means analysing which moving parts we really need to test. So we have several intermediate levels, testing in isolation first and then together, with a good separation of unit, functional, integration and regression testing.
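The idea of testing in isolation first and then together can be sketched as a staged runner. This is an illustrative sketch only, not a description of the actual TfL tooling: the `run_levels` function and the sample checks are hypothetical, and the ordering simply follows the levels named above.

```python
from typing import Callable, Dict, List, Tuple

# Order follows the levels named in the post: isolated tests first, combined tests last.
LEVELS = ["unit", "functional", "integration", "regression"]

Check = Callable[[], bool]


def run_levels(suites: Dict[str, List[Check]]) -> Tuple[List[str], str]:
    """Run each level in order and stop at the first level with a failing check.

    Returns (levels that passed, name of the failing level or "" if none failed).
    """
    passed: List[str] = []
    for level in LEVELS:
        checks = suites.get(level, [])
        if not all(check() for check in checks):
            return passed, level
        passed.append(level)
    return passed, ""


# Hypothetical checks standing in for real test suites.
example_suites = {
    "unit": [lambda: 2 + 2 == 4],
    "functional": [lambda: "tfl".upper() == "TFL"],
    "integration": [lambda: False],  # simulate a failure once parts are combined
    "regression": [lambda: True],
}

passed_levels, failed_level = run_levels(example_suites)
print(passed_levels, failed_level)
```

Stopping at the first failing level keeps feedback cheap: a broken unit test is reported before any of the slower combined suites are started.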

We conduct full regression, compatibility, accessibility and security penetration tests (application level) in project environments when development is complete. Then, in the early stages of the global release pipeline (Continuous Integration & Dev environments), we use light smoke tests and cross-project developer verification to check for merge conflicts and any other issues.

We run tests in parallel to speed up feedback time. If these tests fail, we don’t stop Developers checking in code, but we do make sure a Lead Developer & Tester pair make it their highest priority to work together and fix the problem immediately.
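As a rough sketch of this policy, the snippet below runs several suites concurrently and then reports a status in which check-ins stay open while any failure is flagged as the top priority. The suite names, `run_in_parallel` and `triage` are all assumptions made for illustration; real suites would call out to a test runner rather than return a hard-coded result.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_in_parallel(suites: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every suite concurrently and return the sorted names of those that failed."""
    with ThreadPoolExecutor(max_workers=len(suites) or 1) as pool:
        futures = {name: pool.submit(suite) for name, suite in suites.items()}
        results = {name: future.result() for name, future in futures.items()}
    return sorted(name for name, ok in results.items() if not ok)


def triage(failed: List[str]) -> Dict[str, object]:
    """Check-ins are never blocked; a failure only raises a top-priority fix."""
    return {
        "checkins_blocked": False,
        "top_priority_fix": bool(failed),
        "failed_suites": failed,
    }


# Hypothetical suites with hard-coded outcomes.
suites = {
    "smoke": lambda: True,
    "merge_verification": lambda: False,
    "accessibility": lambda: True,
}

status = triage(run_in_parallel(suites))
print(status)
```

Threads suit this case because test suites typically spend their time waiting on external processes or environments, so total feedback time approaches that of the slowest suite rather than the sum of all of them.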

When end-to-end testing has been completed right up to the blue environment (pre-prod), including security penetration tests (application & infrastructure), the release package is released to production and goes live via a blue/green switch.

If the deployment goes wrong, no worries: we instantly revert to the last known good state in pre-prod through a blue/green roll-back and then focus on a hotfix.
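The switch and roll-back described above can be illustrated with a small model. This is a minimal sketch under stated assumptions: it treats blue and green as two interchangeable slots with a pointer to whichever one is live, the `BlueGreen` class is hypothetical, and the release names are made up. A real switch would happen at the load balancer or DNS level, not in application code.

```python
from typing import Dict, Optional


class BlueGreen:
    """Two interchangeable slots; only one of them serves live traffic."""

    def __init__(self, live_release: str) -> None:
        # Assumption: green starts as production and blue is the pre-prod slot.
        self.slots: Dict[str, Optional[str]] = {"blue": None, "green": live_release}
        self.live_slot = "green"

    @property
    def idle_slot(self) -> str:
        return "blue" if self.live_slot == "green" else "green"

    @property
    def live_release(self) -> Optional[str]:
        return self.slots[self.live_slot]

    def deploy_to_idle(self, release: str) -> None:
        """Install a release in the slot that is not serving traffic."""
        self.slots[self.idle_slot] = release

    def switch(self) -> None:
        """Point live traffic at the idle slot; the old slot keeps its release."""
        target = self.idle_slot
        if self.slots[target] is None:
            raise RuntimeError("idle slot has no release to go live with")
        self.live_slot = target

    def roll_back(self) -> None:
        """Flip back to the previous slot, which still holds the last known good release."""
        self.switch()


router = BlueGreen(live_release="release-41")  # hypothetical release names
router.deploy_to_idle("release-42")
router.switch()
after_switch = router.live_release
router.roll_back()
after_rollback = router.live_release
print(after_switch, after_rollback)
```

Because the previous release is left untouched in the other slot, rolling back is the same cheap pointer flip as going live, which is what makes the revert effectively instant.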

In the next part of this blog series, I’ll discuss our route to market – the global release pipeline.


  1. In my mind, continuous delivery isn’t a tool-set, business process or framework; CD is a new way of doing business and getting software out to customers fast. Do you have the ability to deploy on demand, at any time? Are you ITIL, and how does that fit? Thanks

    1. Yes, we can hot-fix very quickly by parachuting urgent code directly into the Test environment (we call it “Amber”), running QA, and then promoting to pre-prod and finally to production via a blue/green flip. We aspire to ITIL, but we also use Agile, Scrum & PRINCE2 methodologies. It is fair to say that we take a hybrid approach and use the best of each framework to ensure our speed to market is as fast as possible: currently, weekly releases of code to the website. Our ongoing challenge is ITIL-aligned version control in a cloud architecture with continuous integration & delivery. A lot of standard changes are automated and tracked in Git, Bitbucket, TeamCity or Jira. The core of ITIL is continual service improvement (CSI) rather than shipping products into production and forgetting about them. Our mission is also to constantly improve the website over time, so we are adapting ITIL to fit our agile culture by using a “light-touch” ITIL: increasing awareness within the development teams of the relevant bits of ITIL that we must have and that actually make logical sense, rather than it being a heavy, clunky, bureaucratic process that no-one adheres to. In many respects we are trail-blazing, climbing a steep learning curve as we go, keeping what works and discarding anything that is a waste of time. The key is to be flexible and maintain a “can-do” attitude. Hope this answers your question and thanks for the interest.

  2. Thanks to TfL for sharing their day-to-day practices on their projects. It is really interesting and inspiring. PRINCE2 is not used much in France, but I am interested in this project management process and how it can be integrated with Scrum. Could you share the size of the technical team, so we know how many people are required to implement this kind of software development process?

    1. Project teams do not have to share the same heartbeat as our weekly release cycle, but if they don’t, they risk missing an entry point. Scrum Masters can align their sprints to the pipeline entry point, but the release cycle and the progression of things through it must be consistent and as metronomic as possible. In terms of numbers, we run a very lean team comprising a Build Manager, DBA, Test Manager and Dev Manager, and of course the agile project teams building new features, fixing bugs, applying patches, etc. For further reading, this blog might be of interest;
