Applied Industrial DevOps - FINAL
DevOps
2019
25 NW 23rd Pl
Suite 6314
Portland, OR 97210
For further information about IT Revolution, these and other publications, special
discounts for bulk book purchases, or for information on booking authors for an
event, please visit our website at ITRevolution.com.
In May of this year, the fifth annual DevOps Enterprise Forum was held in Portland,
Oregon. As always, industry leaders and experts came together to discuss the issues
at the forefront of the DevOps Enterprise community and to put together guidance to
help us overcome and move through those obstacles.
This year, the group took a deeper dive into issues we had just begun to unpack
in previous years, providing step-by-step guidance on how to implement a move
from project to product and how to make DevOps work in large-scale, cyber-physical
systems, and even a more detailed look at conducting Dojos in any organization.
We also approached cultural and process changes like breaking through old change-
management processes and debunking the myth of the full-stack engineer. And of
course, we dived into the continuing question around security in automated pipelines.
As always, this year’s topics strive to address the issues, concerns, and obstacles
that are the most relevant to modern IT organizations across all industries. After all,
every organization is a digital organization.
This year’s Forum papers (along with our archive of papers from years past) are an
essential asset for any organization’s library, fostering the continual learning that is
essential to the success of a DevOps transformation and winning in the marketplace.
A special thanks goes to Jeff Gallimore, our co-host and partner and co-founder at
Excella, for helping create a structure for the two days and the weeks that followed to
help everyone stay focused and productive. Additional thanks goes to this year’s Forum
sponsor, XebiaLabs. And most importantly a huge thank you to this year’s Forum par-
ticipants, who contribute their valuable time and expertise and always go above and
beyond to put together these resources for the entire community to share and learn
from.
Please read, share, and learn, and you will help guide yourself and your organiza-
tion to success!
—Gene Kim
June 2019
Portland, Oregon
Business Objective
The Approach
• sensor firmware
• control-system software
• user-interface software
The Alset teams are going to articulate requirements through the definition of two
core epics in the product backlog:
The epics and associated hypotheses are defined in Figure 1. The teams will
implement epics by applying principles of Industrial DevOps throughout execution,
deployment, and operations. We further define the epics that will be implemented
in Table 1.
Epic 1
For: Alset Consumer Car Operators
Who: Want increased safety and system trust
Our Solution: Increase actionable closing distance
Business Outcome: Increase safety ratings
                  Increase consumer trust
                  Increase revenue
Leading Indicators: Net Promoter Score
                    Sales
                    Increase usability by 25%
NFRs: Increase actionability 50%

Epic 2
For: Alset Consumer Car Operators
Who: Want increased safety and system trust
Our Solution: Increase visibility with camera
Business Outcome: Increase safety ratings
                  Increase consumer trust
                  Increase revenue
Leading Indicators: Net Promoter Score
                    Sales
                    Increase usability by 25%
NFRs: Increase actionability XX%
                               Epic 1                                 Epic 2
Systems Impacted               Obstacle detection, vehicle control,   Chassis, vehicle control, camera
                               sensor management
Downstream Features Impacted   Cruise control, lane detection,        Lane detection, parking assist
                               parking assist
The following is the list of Industrial DevOps principles generated from our first
paper. We will walk through each of these principles to discuss in greater detail
how this applies to an autonomous vehicle, which includes hardware and software
components.
In the case of our example, both Epic 1 and 2 are delivering value through the same
streamlet: collision avoidance.
The first collision-avoidance team is focused on making software changes to the exist-
ing fleet. Teams will refactor the architecture to increase modularity, incrementally
update sensor types, improve sensor management refresh rate, and more. Given the
nature of the changes, we selected a more software-focused collision-avoidance team
to deliver Epic 1’s capabilities.
The next collision-avoidance team is focused on the new camera and the correspond-
ing required vehicle updates. Physical models of the system are shown in Figures 3
and 4. The team needs to make updates to both the forward and backup cameras,
which impact the hardware, firmware, and software in the collision-avoidance stream-
[Figure: planning horizons, from longest to shortest: Product Vision; Year Lookout of
High-Level Functions; Quarterly Roadmap of Features; Iteration Plans; Daily Team Plans]
Through each of the planning horizons, teams focus on the principles of DevOps
and how to create flow, gain feedback from stakeholders and users, and take advantage
of the learning from each iteration across a massive value stream. But, what does it
mean to deliver frequently with products that can take years to build? It means break-
ing down the products into smaller components, engaging stakeholders to determine
priorities, and delivering those broken-down parts over multiple horizons of time.
Our approach supports companies like Alset Transport who need longer planning
horizons to communicate with stakeholders and to account for hardware lead times
while enabling their teams to develop components through smaller batch sizes with
frequent integration for rapid learning.
All products begin with a vision and a high-level plan to deliver value. The vision and
high-level plan for Alset Transport includes improving the existing fleet of vehicles
already in production and providing enhancements for the next fleet of vehicles in the
idea and early concept phases.
Annual Plan
Large, complex cyber-physical systems can take many years to complete. Teams
decompose the high-level plans into annual plans in order to simplify problems and
provide more focus on what they are going to build first.
Alset Transport creates their annual plan by taking a holistic systems view and
evaluating its various components and supporting suppliers. They have selected two
epics that will focus on refactoring the system architecture for modularity, increas-
ing sensor types, and improving the recognition and reaction behaviors associated
with braking by incorporating a new camera. The hypothesis associated with the epics
states that these changes will improve customer satisfaction, safety ratings for their
vehicles, and the platform for supporting autonomous, commercial fleet vehicles.
Annual plans can still be quite complex and difficult to manage. Teams can break
down annual plans into quarterly, or slightly smaller, program increment plans. The
plans are visualized through a road map identifying the features to be delivered over
the next few quarters. The road map is kept at a high level: detailed enough to give
stakeholders an idea of the path forward, while still allowing for easy change and
reprioritization.
Alset Transport further breaks down their annual plan into a quarterly plan that
includes features such as enhancing the Lidar sensor color profile and improving the
sensor management refresh rate.
At the iteration or sprint level, the teams break down work and time into detailed
plans and user stories. The detailed plan for each story or work item for a given
iteration is further defined into step-by-step plans by the team completing the
work. Their detailed plans include designs, tests using prototypes, digital twins,
simulators, emulators, and acceptance criteria that they will demonstrate when the
iteration ends. The team plans the work so that, whether the backlog items are
software or hardware, it can demonstrate some completed functionality of an
integrated system to the product owner and other stakeholders at the end of the
iteration.
Alset Transport has decomposed some of their team features into user stories that
include splitting Lidar by component value to obtain color saturation and exploring
camera interoperability.
Daily Plan
The team collaborates daily to understand what work they completed yesterday and
what they are going to do today in order to complete the iteration goals.
Alset Transport teams hold daily stand-ups to provide situational awareness to the
capabilities being worked on and to identify any support needed. This plan provides a
short feedback loop to improve the flow of delivery.
Multiple horizons of planning impact our example in several ways. The product
managers and owners refined their initial features from the quarterly road map. As
plans were defined, Alset Transport’s teams identified multiple dependencies and
the associated risks. It’s important that all team members, including those from
both hardware and software, physically participate in planning events. Teams define
iteration-level items that can be demonstrated via some specific acceptance criteria.
As their plans are created, they identify the earliest integration points possible using
the existing hardware with an intent to integrate several times per iteration.
An example of how Alset Transport’s collision-avoidance team may break down work
across the quarter can be seen in Table 3.
Add new camera, associated technology, and hardware updates
• Interoperability camera (spike)
• New sensor
• Lidar enhancement
• Procure camera’s ongoing traffic surveillance
• Obstacle-detection system enhancement
• Enhanced communications with braking-system sensors
• Seat belt–sensor coordination
• Recording playback
• Camera prototype on vehicle
• Camera instantiation
• Full regulatory compliance
Each of the time horizons supports the ability to base decisions on objective evidence.
The evidence supplied will vary based on time, number of teams, and technology avail-
able. Teams will focus on what can be built in each two-week iteration and how it can
be demonstrated in a way that allows stakeholders to interpret what they’re seeing
User Story    Iteration    Prototype camera mount options on a model    View camera mount on model
[Figure: logical architecture. Components shown: Lens, Signal Processor, Sensor
Management, Obstacle Detection, Vehicle Control, Vehicle Communications, Vehicle
Operations, Consumer Portal, and fleet management user]
As the team looks at the impact of Epic 1, they review the architecture of the
vehicle-control software for constraints, dependencies, and risks. When the team works
to improve obstacle detection, they focus on how the event is handled. This means
understanding the architecture and, later, ensuring that the path from vehicle control
to the reaction of the braking system and communications is tested. The team also looks
at how changes to the software impact the obstacle-detection system and how to deliver
the enhanced functionality without having to do a complete recertification of many
subsystems.
Because this is a software-only update, they do not perceive any limitations in
implementing the improved obstacle-detection feature in the system, as it is nicely
contained within the vehicle-control component. Although the algorithm updates are
complex, no extraordinary constraints are encountered, and the risk to deployment is
minimal.
As the team moves toward Epic 2, the scope of the change extends far beyond vehicle
control. The new camera has not only a higher resolution but also a faster focus time.
This leads to changes in the sensor-management software and the interface with the
vehicle-control module. The amount of data increases vastly, which affects the
communication bus of the vehicle-control module and raises resource usage: both memory
and CPU usage increase significantly. It becomes obvious that obstacle detection needs
to be as near real time (NRT) as possible.
Another constraint that is identified is the impact on testing. In the future, it can
be expected that the variance in cameras will increase. The processing characteristics
for each camera will vary.
As the team works through reviewing the architecture, a constraint is clearly iden-
tified, and the team decides to introduce a new module between sensor management
As discussed earlier, each epic is broken down into multiple levels: from epic to
feature to user story to task. The breakdown of work allows Alset Transport teams to
reduce their batch size and iterate through capability development, which enables flow
through the system while offering fast feedback and continuous learning opportunities.
During each iteration, teams implement user stories, which are small pieces of
functionality or system enablers that can be built, tested, and proven to show results.
As the team seeks to learn more about the system they are building, they perform
regular user engagement.
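The breakdown described above can be sketched as a small tree. This is only an illustration under stated assumptions: the WorkItem class is hypothetical, the mapping of titles to levels is a guess based on the Alset example, and the task title is invented for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorkItem:
    """One node in the epic -> feature -> user story -> task breakdown."""
    level: str
    title: str
    children: List["WorkItem"] = field(default_factory=list)

    def add(self, level: str, title: str) -> "WorkItem":
        """Attach a smaller piece of work and return it."""
        child = WorkItem(level, title)
        self.children.append(child)
        return child

    def leaves(self) -> List["WorkItem"]:
        """Return the smallest pieces of work under this item (the batch)."""
        if not self.children:
            return [self]
        found = []
        for child in self.children:
            found.extend(child.leaves())
        return found


# Titles loosely follow the Alset example; the level assigned to each is an assumption.
epic = WorkItem("epic", "Add new camera, associated technology, and hardware updates")
feature = epic.add("feature", "Camera prototype on vehicle")
story = feature.add("user story", "Prototype camera mount options on a model")
story.add("task", "Review mount options with the product owner")  # hypothetical task
feature.add("user story", "Explore camera interoperability (spike)")

batch = [item.title for item in epic.leaves()]
```

The leaves of the tree are the items a team would actually pull into an iteration, so a story that is still too large shows up as a leaf that needs further splitting.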
The Alset Transport teams were building and testing the stories in two-week itera-
tion cycles. Over time they recognized that their user stories were too large and taking
During the quarterly planning event, the hardware- and software-engineering teams
analyzed the first epic focused on the software-only system updates for the existing
fleet vehicles. They selected a sensor system with an available test environment, such
as a simulated environment, and small, code-based sensors, such as a forward and
backup camera.
The hardware team focusing on Epic 2 participated in the planning of their sprints and
identified the constraints, interface requirements, and module behavior of each unit.
Cadence and synchronization are critical for the planning and development of the
solution. Cadence provides predictable time boxes and rhythmic patterns for plan-
ning. Synchronization offers the team an opportunity to align their efforts and get
regular feedback on how the integrated system is working.
For Alset Transport, the hardware and software teams agreed to adopt the same
quarterly program increment and iteration cadences. Originally, the teams discussed
different iteration lengths, whether they should be two-, three-, or four-week cycles,
and what would work best for them. During the discussion, the teams balanced their
concerns regarding being able to break down work at a granular-enough level to fit
within a two-week iteration with the concern of having iterations that were too long,
resulting in delayed feedback and reprioritization mid-iteration.
After negotiating, the teams agreed to go with a two-week iteration length and to
adjust it in the future if they find it isn’t working for them. The hardware team still
has concerns but is willing to give it a try based on their learning and recognition that
a two-week iteration may not mean completed capability, but rather a point for them
to receive feedback on their work products.
As part of the synchronization points, the teams agreed that at the end of each iter-
ation they would demonstrate their progress and receive feedback. Both the software
and hardware teams collaborated as Agile units in order to create their quarterly and
iteration plans and to align their demonstrations.
Their quarterly program increment planning enabled them to identify and plan the
specific software enhancements needed to implement the first epic and areas where
they needed to collaborate with the hardware-focused team supporting Epic 2.
The hardware team focused on solution-specific camera updates while engaging with
the software team to address risks and dependencies. At the same time, the hardware
team identified and planned out the new camera features, models, designs, and
long-lead items.
Both teams established a cadence of regular quarterly planning and iterations at
a two-week period. They will provide regular synchronization by conducting demon-
strations at the end of each two-week iteration at the system level.
As illustrated in Figure 7, cadence and synchronization together provide teams
with the tools they need to help manage the complexity and variability of large-scale
solution development.
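As a rough sketch of the shared cadence, the iteration boundaries and end-of-iteration demonstration dates can be generated from a start date. The two-week length comes from the text; the number of iterations per program increment, the Monday start date, and the choice of the second Friday as demo day are assumptions made for illustration.

```python
from datetime import date, timedelta

ITERATION_WEEKS = 2           # agreed by the hardware and software teams
ITERATIONS_PER_INCREMENT = 6  # assumption: the paper does not state this number


def iteration_schedule(start: date, count: int = ITERATIONS_PER_INCREMENT):
    """Yield (number, first_day, demo_day) for each fixed-length iteration.

    Assumes `start` is a Monday, so the demo lands on the second Friday.
    """
    length = timedelta(weeks=ITERATION_WEEKS)
    for n in range(count):
        first_day = start + n * length
        demo_day = first_day + timedelta(days=11)  # Monday + 11 days = second Friday
        yield n + 1, first_day, demo_day


schedule = list(iteration_schedule(date(2019, 7, 1)))  # hypothetical start date
for number, first_day, demo_day in schedule:
    print(f"Iteration {number}: starts {first_day}, system demo {demo_day}")
```

Because both teams read from the same schedule, the demo dates double as the synchronization points, and changing the iteration length later is a one-line change.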
Cadence
• Makes routine that which can be routine
• Lowers the transaction cost of events
• Makes waiting times predictable
• Facilitates planning
• Makes small batches feasible

Synchronization
• Causes multiple events to happen at the same time
• Prevents alignment errors from accumulating
• Facilitates cross-functional tradeoffs
• Provides objective evidence
• Allows synchronization of design cycles

Example: harmonic multiple system integration
The principle “employ ‘continu-ish’ integration” describes how the goal of truly
continuous integration faces economic and practical challenges when dealing with
large-scale, cyber-physical systems. It isn’t practical to integrate full end-to-end
solutions as frequently as we can with pure software systems, and the value streamlets
discussed in section one may evolve at different rates. Instead, we use the economic
and physical constraints to create a plan that integrates as much as possible as
frequently as possible, with the larger goal being overall risk reduction for the
program. Figure 8 illustrates this approach.
Each streamlet will mature its part of the solution independently by evolving the
software and hardware. Streamlets evolve hardware using development kits, breadboards,
brassboards, systems on chip (SoCs), FPGAs, hardware revs, and other prototypical
solutions to frequently integrate their localized changes with the system’s other
streamlets. Balancing the costs and effort to create new hardware with the value
of fast feedback and reduced risk helps determine the optimal frequency for revisions.
After looking at ways to lower manufacturing time and costs, many teams find that
frequency to be within weeks or a few months.
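One way to picture integrating "as much as possible as frequently as possible" is to give each part of the solution its own integration interval and see which parts meet on a given day. The intervals below are invented for illustration and are not from the paper; only the idea of differing rates is.

```python
from functools import reduce
from math import gcd

# Hypothetical integration intervals, in working days. Not from the paper.
INTEGRATION_INTERVAL_DAYS = {
    "control-system software": 1,  # integrates daily on the digital twin
    "sensor firmware": 5,          # weekly, on a development kit
    "camera hardware rev": 30,     # a new hardware revision about every six weeks
}


def integrating_on(day: int) -> list:
    """Names of the parts that integrate on a given working day."""
    return [
        name
        for name, every in INTEGRATION_INTERVAL_DAYS.items()
        if day % every == 0
    ]


def full_integration_period(intervals) -> int:
    """Least common multiple: how often every part integrates on the same day."""
    return reduce(lambda a, b: a * b // gcd(a, b), intervals)


period = full_integration_period(INTEGRATION_INTERVAL_DAYS.values())
print(f"Full end-to-end integration every {period} working days")
print("Day 10:", integrating_on(10))
```

In this sketch the software still integrates daily, while the full end-to-end integration happens only when the slowest, most expensive part has a new revision available.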
The first epic is the most straightforward. Applying cadence and synchronization
means that work occurs in short, predefined timeboxes: typically, two-week sprints
aggregated into periodic program increments. The logical architecture illustrates the
various software and firmware elements that need to be updated.
In support of incremental development, the program has established a digital twin
and test environment that makes developer-level testing cheaper and faster. Most new
developments and code-level integrations can happen routinely, daily, or even hourly
on that environment. Routine DevOps practices of source code control, automated
builds, and automated build verification tests apply well in this case.
In terms of deployment, however, the situation becomes more interesting, as the new
algorithms eventually have to be field-tested in a real vehicle. For software
changes, teams could apply continuous delivery, the DevOps deployment strategy:
The addition of the new camera presents a more significant challenge to implement-
ing DevOps and assuring continuous flow. Camera specs have been established and
provided by the supplier. The question becomes, how can the development teams
Be Test-Driven
Test-driven simply means beginning with the end in mind. With test-driven devel-
opment (TDD), teams write the tests for a change before they implement it, in both
software and hardware. Over time, TDD creates a large set of automated and manual
regression tests that help ensure quality software and hardware.
When it comes to hardware changes, the vehicle’s modular architecture facilitates TDD.
Test doubles allow development on parts of the system to proceed independently. In
Figure 9, a camera test double simulates the updated message format and frequency to
allow sensor management to evolve. As the camera design develops, any changes to the
interface or its behavior are also made to the test double. Sensor management is done
entirely in software, so common TDD practices can apply.
Within the camera subsystem, a lens test double simulates optical signals that the
processor can convert into formatted messages. In the electrical CAD environment, the
designer starts by creating a small new signal simulation for the lens, or perhaps by
modifying an existing one. The designer sees it fail, then modifies the design to
process this new signal. The designer can create or modify another signal from the new
lens and update the design. This small test-and-change process repeats until the
signal-processor design is completed.
[Figure 9: Test Doubles Facilitate Test-Driven Development for Hardware. Components
shown: Camera (Lens, Signal Processor) and Sensor Management]
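The camera test double idea can be sketched in a few lines. This is a minimal illustration, not the Alset design: the message fields, the 50 Hz frequency, and the SensorManagement behavior are all hypothetical, since the text does not specify the camera interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraMessage:
    """Hypothetical message format; the real camera interface is not given."""
    sequence: int
    timestamp_ms: int
    obstacle_distance_m: float


class CameraTestDouble:
    """Stands in for the new camera: emits messages at a fixed frequency."""

    def __init__(self, frequency_hz: int, distances_m):
        self.period_ms = 1000 // frequency_hz
        self.distances_m = list(distances_m)

    def messages(self):
        for seq, distance in enumerate(self.distances_m):
            yield CameraMessage(seq, seq * self.period_ms, distance)


class SensorManagement:
    """Software under test: tracks the closest obstacle reported so far."""

    def __init__(self):
        self.closest_m = None
        self.last_timestamp_ms = None

    def on_message(self, msg: CameraMessage) -> None:
        if self.closest_m is None or msg.obstacle_distance_m < self.closest_m:
            self.closest_m = msg.obstacle_distance_m
        self.last_timestamp_ms = msg.timestamp_ms


def test_sensor_management_tracks_closest_obstacle():
    # In TDD this test is written first and fails until on_message is implemented.
    camera = CameraTestDouble(frequency_hz=50, distances_m=[12.0, 7.5, 9.0])
    sensor_management = SensorManagement()
    for msg in camera.messages():
        sensor_management.on_message(msg)
    assert sensor_management.closest_m == 7.5
    assert sensor_management.last_timestamp_ms == 40  # third message at 50 Hz


test_sensor_management_tracks_closest_obstacle()
```

When the camera design changes its message format or frequency, only CameraTestDouble and CameraMessage need to change, which is what lets sensor management evolve before the hardware exists.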
Conclusion
We would like to thank all of our attendees and our friends at XebiaLabs
for helping to make this year’s Forum a huge success.