Myklebust Aso V2presented ESREL 2017


A survey of the software and safety case development practice in the railway signalling sector


T. Myklebust, G. K. Hanssen & N. Lyngby Esrel 2017 Slovenia
SINTEF Digital, Trondheim, Norway

ABSTRACT: Development of safety-critical software systems is changing as development and innovation shift from hardware to software. This trend is a result of more powerful and standardized hardware, the inherent flexibility of software and a need to deliver systems to the market more rapidly. We have performed a survey within the European railway systems domain to build a better understanding of the status and the main challenges in development projects. Based on data from ten organizations we have found that the main challenges are related to management of unclear and changing requirements. We also see that development of software is based on the V-model, which has a strong emphasis on up-front planning and document-intensive work. This creates a tension and a need to rethink how safety-critical software systems, including railway systems, should be developed and certified. To strengthen the survey and to gather more information related to safety cases, we have reviewed more than 35 safety cases issued as part of Nordic railway projects. We have checked whether the safety cases use normal prose or, for example, goal structuring notation in their safety case presentations. As part of this review we also checked how often the safety case author is replaced in projects that last more than three years and have delivered more than three safety cases.

1 INTRODUCTION

Railway systems are increasingly based on software and relatively less on hardware, which in turn is becoming more powerful, standardized and flexible. This creates new opportunities and challenges for system providers as well as certification bodies. Software is by nature different from hardware as it is far more flexible and changeable. This is reflected in requirements management and development, where software can be changed and tested more frequently than hardware design and implementation. Based on frequent testing and interaction with the system owner and its users, requirements can potentially be adjusted more frequently, creating both challenges and opportunities. This change has led to the introduction of agile software development methods in development and certification of safety-critical systems [2, 4]. These methods promise reduced lead time, reduced development costs and more flexibility in requirements management and development. However, there are also some new challenges that need attention. In particular, certification and proof of compliance with safety standards become a challenge.

Agile software development is an approach to organizing the software development process where up-front planning is kept to a minimum and where work is done in short iterations, each ideally producing a functional and testable part of the system under development. The goal is to optimize feedback and flexibility in the development process and to continuously develop an understanding of both the problem and the solution during the course of the project. There are several agile methodologies and techniques available, of which the Scrum process framework [7] seems to be the most frequently used, often in combination with techniques from extreme programming (XP) [1]. Agile software development emerged around 2001 as a reaction to heavyweight and plan-driven processes and emphasizes direct communication, self-managed teams, frequent delivery of working and testable software, minimal formal documentation, change tolerance and flexibility in requirements management. Together, these create a strong contrast to established practices in industries developing safety-critical solutions, such as railway signaling systems. The V-model [12] is well rooted in the established industry and there is a strong tradition of investing heavily in up-front planning, architecture and design.

There are several questions that have to be answered when considering software development of safety-critical solutions - in our case, railway systems. One of the first is to better understand the status of software development of railway systems, in order to indicate present software development challenges and opportunities in this domain as a basis for evaluating whether agile methodologies could improve software development and, potentially, how agile methodologies should be fitted. We have approached the main providers of such systems in Europe and performed a simple questionnaire-based study.

There has been much discussion regarding the different ways to write a safety case. Holloway [10] has presented different notations such as normal prose, goal structuring notation (GSN) and structured prose. Safety cases are often updated several times as the products or systems are improved stepwise over several years. Who the safety case author is during these years may affect the presentation, the continuity of the project and the communication with the assessor.

The rest of this paper provides some background on agile software development, our survey approach, survey results, and a discussion of the results of the survey and of the review of the safety cases. We conclude with some preliminary ideas on how agile methodologies may alleviate some of the identified problems.

2 BACKGROUND

Agile development methods are, according to Fitzgerald [2], used in more and more areas of software development, while, according to Jonsson [3], their use in regulated and safety-critical domains such as the railway sector is still limited. However, increasing usage and complexity of software in safety-critical systems increase the cost and time to produce safety software, especially in domains where the combination of regulations and software implementation of safety-related functions is relatively new, such as the railway sector [3]. This is supported by Myklebust et al. [16], who state that the cost of software development is among the major contributors to the total development cost for railway control and signaling systems.

EN 50128:2011 is the standard for "Software for railway control and protection systems". Just as most other safety standards, EN 50128 uses the V-model for software development [9]. However, the development organization is given a large degree of freedom. To quote the introduction: “This European Standard does not mandate the use of a particular software development lifecycle. However, illustrative lifecycle and documentation sets are given....”

Mappings between requirements and agile practices have been reported for both the avionics and railway sectors. Jonsson [3] and Myklebust et al. [6] mapped requirements in EN 50128 to agile methods and found that agile methods can be adapted to satisfy the EN 50128 objectives.

EN 50129 presents the different chapters that should be included in a safety case. There are no requirements on which notation should be used. It is not known to the authors whether normal prose, GSN or similar notations are used by the railway industry. Being the safety case (SC) author is an important role, as this document is used as evidence by the manufacturer itself, the customer, the assessor and the safety authority. Consistency and stability when issuing updates of a product or system are therefore of great importance. How often the safety case author changes during updates of safety cases has so far not been studied.

3 SURVEY METHOD

3.1 Sample

A set of 12 respondents was invited to participate in the survey, based on our knowledge of, and network within, the domain. Each respondent represents a unique system provider within the railway domain. In that sense we treat the data as representing organizations, not individuals.

3.2 Pilot and survey

We first composed a pilot survey where 8 experienced respondents from separate organizations were invited to respond on behalf of their organization. One of these declared that he or she was not eligible to answer on behalf of the organization, leaving 7 respondents. One of these opened the survey but never added data (except for one question), leaving 6. One of these started the survey but completed it only partially. Based on the pilot results we made a few changes and submitted the survey to 4 new respondents. One of these completed it only partially. In total we have complete or partial data from 10 organizations, which are reported further on. Partial data means that some questions have not been answered; the number of respondents is reported for each question in the results section. We deliberately designed the questionnaire to keep most answers optional to avoid too many drop-outs.

3.3 Review method when evaluating safety cases

We have reviewed 35 safety cases issued as part of Nordic (Norway, Sweden, Denmark and Finland) railway projects and checked whether they use normal prose or, for example, goal structuring notation in their safety case presentations. In this review we also checked how often the safety case author is replaced in projects that last more than three years and have delivered more than three safety cases. Four of the projects reviewed have lasted more than three years.
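The respondent counts given in Section 3.2 can be checked with a short tally. The sketch below only restates the numbers from the text; the variable names are ours, and it assumes that the pilot respondent who answered a single question is excluded while partially completed questionnaires are counted, as the text indicates.

```python
# Respondent tally for the pilot and the follow-up round (counts from Section 3.2).
pilot_invited = 8
pilot_not_eligible = 1   # declared not eligible to answer for the organization
pilot_no_data = 1        # opened the survey but answered only one question

# Complete or partial answers remaining from the pilot round.
pilot_with_data = pilot_invited - pilot_not_eligible - pilot_no_data

second_round_invited = 4
# One of the four answered only partially, but partial data is still counted.
second_round_with_data = second_round_invited

total_invited = pilot_invited + second_round_invited
total_with_data = pilot_with_data + second_round_with_data

print("invited:", total_invited)
print("organizations with complete or partial data:", total_with_data)
```

The totals agree with the 12 invited respondents in Section 3.1 and the 10 organizations reported in Section 3.2.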
4 SURVEY AND REVIEW RESULTS

4.1 Respondents' profile

The respondents have various roles in their organizations, but more than half are RAMS managers. None described their role as being a software expert. All but one of the respondents were qualified to answer the survey on behalf of their organization. 50% of the organizations operate mainly as subsystem providers, 25% as constituent suppliers and 25% as infrastructure managers. Nearly all organizations integrate various forms of existing software in their solutions. Most common are reuse of own software (60%) and COTS software (70%). Average project duration during development is 2.9 years. Average time between deliveries is 6.5 months, and average staffing size is 5.3 full-time persons. The average time between deliveries can be explained by updates of existing systems: it normally takes 1-4 years to develop a system, while updates (new deliverables) that include only a few changes are issued roughly every 6 months. The educational level among the respondents is in general very high, with 39% MSc and 15% PhD. Average experience level in the organizations is 9.2 years, with an average staff turnover of 12.8%.

4.2 Software development practice

Strong involvement of customers in development projects is a main feature of agile development and hence of great interest to our study. We found that customers are extensively involved in several ways. Progress meetings are the most common approach (21%). When asking which life-cycle models are in use, we see that the V-model is most prominent (40%). Other models in use are incremental/iterative (20%), Waterfall (10%) and Prototyping (5%). The respondents also answered that agile (10%) and Scrum (15%) are in use. However, based on other parts of the survey we tend to believe that this is more experimental use.

When asked which documents are normally produced, the nine respondents confirmed that mandatory and most recommended documents are being used. All answered that a project plan, a verification/test plan and a safety plan are used.

We also wanted to learn more about integration and testing. We found that continuous integration is the most commonly used approach (61%), followed by component-by-component integration (31%) and big-bang integration (8%). We recorded the following overview of testing approaches:

Figure 1 Testing approaches (N/A, Blackbox, Whitebox, Automated (regression testing))

Requirements management is highly relevant, and we asked which tools are in use. Ordinary office tools are extensively used (37%); DOORS (42%) and Jira (11%) are also reported.

4.3 Certification

We were particularly interested in learning which external assessors are used for certification:

Figure 2 External assessors being used

When asked when and how often the organization collaborates with the assessor, eight out of the nine respondents answered that they collaborate with the external assessor throughout projects from start to end, which is encouraging. One answered that the organization collaborates with the external assessor mainly close to the end of projects.
On average, 23% of the total project costs are related to certification, including all reviews and testing (standard deviation 19.6; the maximum value reported was 55% and the minimum 5%). Challenges related to certification are of special interest, and we asked the respondents to mention the top-3 challenges regarding certification (entered as free text in the questionnaire):

Highest rated:
• Synchronisation
• GSM-R
• ISA/NoBo personal interpretation of standards
• Winning the initial application contracts for a new product/system
• Different approaches from assessors

Second highest:
• Immature standards
• Too strong focus on formalities instead of actual system behavior
• The gathering of suitable application evidence
• Assessor not enough technical

Table 1: Top-2 challenges regarding certification

4.4 Main problems and challenges

We asked the respondents to rate the most severe problems related to software development, learning that challenges related to requirements are definitely the most prominent. The following overview is weighted and sorted according to score:

1. Late discovery of problems/defects
2. Ambiguous requirements
3. Project cost overruns
4. Insufficient requirements
5. Frequent changes in requirements
6. Addition of new requirements
7. Project schedule overruns
8. OS dependencies
9. Complexity due to large application size
10. Test case/procedure generation
11. Low robustness or stability of integrated applications
12. Poor interoperability among tools

4.5 Review of safety cases

As we see from chapter 4.2 above, the different safety cases (GPSC, GASC, SASC) are of great importance. All too often, manufacturers have left the important task of creating a safety case to the last part of the development project. The reason for this has often been that “we need to have the test results before we write the safety case”. This has turned out to be a costly solution. It is much more efficient to build the safety case by inserting information when it becomes available during project development. This is an agile approach that may also increase safety awareness and understanding among the different stakeholders [17].

A careful review of all the safety cases has shown that all of them were written using normal prose. The safety cases have been evaluated and accepted by an ISA (Independent Safety Assessor). The assessors involved were six different ISAs that are also appointed as Notified Bodies (NoBo), two ISAs that are not appointed as Notified Bodies, and one internal ISA. All the different assessors have accepted normal prose as a method to present a safety case.

Four projects have lasted more than three years. One safety case was updated 12 times in a period of eight years. On average, the safety case author was replaced every three years.

5 DISCUSSION

What can we learn from this simple mapping of status, and how can this knowledge be relevant when discussing agile software development of railway systems? We have identified four main challenges that this industry needs to address:

5.1 Requirements management is challenging

From the ranking of problems we see that requirements are perceived as ambiguous, insufficient or hard to understand. This means that the development project needs to improve the clarity and understanding of requirements during the development project. This represents a major challenge in cases where the development process assumes that requirements are clearly defined up front, as in the V-model. Lack of opportunities to elaborate and clarify requirements based on growing knowledge of the system under development may create severe problems. In contrast, the respondents also reported that requirements are frequently changed and that new requirements are added. This is inevitable in larger software development projects, and we interpret these findings as indicating weak support for managing requirements volatility. Clarity of requirements is important to developers as well as to the safety assessor, the Notified Body and the safety authorities. One promising approach to clarifying the different requirements related to roles is to use safety stories [13], as they may be used to involve the safety stakeholder and describe situations where avoidance (negative situations) is of importance.
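The overview in Section 4.4 is described as weighted and sorted according to score, but the weighting scheme is not stated. As a rough illustration only, the sketch below assumes a simple positional scheme in which each respondent's first, second and third choice receive 3, 2 and 1 points. The function name, the weights and the example answers are hypothetical and are not taken from the survey data.

```python
from collections import defaultdict


def rank_problems(responses, weights=(3, 2, 1)):
    """Return (problem, score) pairs sorted by total weighted score, highest first.

    responses: one list per respondent, most severe problem first.
    weights:   points for first, second and third place (hypothetical scheme).
    """
    scores = defaultdict(int)
    for ranking in responses:
        # zip stops at the shorter sequence, so extra choices are ignored.
        for weight, problem in zip(weights, ranking):
            scores[problem] += weight
    # Sort by descending score; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Invented answers from three respondents, for illustration only.
example = [
    ["Late discovery of problems/defects", "Ambiguous requirements", "Project cost overruns"],
    ["Ambiguous requirements", "Late discovery of problems/defects", "Insufficient requirements"],
    ["Late discovery of problems/defects", "Project cost overruns", "Ambiguous requirements"],
]

for problem, score in rank_problems(example):
    print(score, problem)
```

Other schemes, such as averaging Likert-style severity ratings per problem, would fit the same description; the choice affects the ordering only where scores are close.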
5.2 Office tools extensively used for requirement analysis and management

We see that ordinary office tools are extensively used, something that probably makes it even harder to manage changing requirements properly. In cases where the number of requirements is large (which naturally is the case in larger projects) and requirements change frequently, it is absolutely necessary to manage traceability. This becomes very hard when tools like word processors and spreadsheets are the main tools, simply because they offer weak support for tracking and keeping a record of changes. DOORS is reported to be used by quite a few organizations, but based on our experience, we think there is a clear need for a middle course: tools that are simple but that maintain traceability. Also, within the railway domain the manufacturers have to comply with regulations, directives, TSIs (Technical Specifications for Interoperability) and standards, while other domains only have to comply with directives and standards.

5.3 Communication with assessor is hard

The communication with the safety assessor and the NoBo (which can be the same company and in some cases the same person) is essential to a well-managed development project. Based on this finding we think there is a strong need to find better ways of providing the assessor/NoBo with the information needed to assess and eventually certify a system. We believe proper use of tools to gather and combine information and data, together with an Agile Safety Case approach [17], is a viable way to ease communication.

5.4 Extensive testing

Testing is fundamental in development of railway systems. We see that all types of testing have been performed, from hardware-in-the-loop to system, module and unit tests. All of these are performed in various ways: black box, white box and automated. Based on this, we believe that there may be great improvement opportunities in a fundamentally test-first development process where tests are performed automatically and frequently.

5.5 Review of safety cases

All the safety cases reviewed have been written using the normal prose method. One reason can be that the requirements in EN 50129 favor a normal prose approach and that normal prose in principle requires no initial learning for an engineer. GSN requires a study of the GSN method and can be seen as a further obstacle when starting the safety case process, especially if the SC author has no experience with the GSN method. This topic should be studied further.

That the safety case author is replaced, on average, every third year is not a surprising result, as most of the projects studied do not last significantly more than three years.

6 AGILE DEVELOPMENT AS A SOLUTION?

The discussion of the findings from the survey supports the idea of applying agile software development principles in development of safety-critical railway software.

We believe recent experience with the SafeScrum framework [8, 16], which is aligned with IEC 61508, could be a very relevant basis for addressing some of the identified challenges in development and certification of railway systems. SafeScrum is an adaptation of the Scrum framework for management of agile software development projects. SafeScrum (defined by the authors) is based on a research-industry collaboration involving leading Norwegian providers of safety-critical systems such as dynamic positioning and offshore fire and gas detection systems. The main components of the Scrum process are kept:

- Work is done in short iterations called sprints with a fixed length, e.g. 4 weeks.
- Each iteration delivers an increment, which is a part of the system that can be tested or used as a basis for providing feedback and planning further development.
- Work is done by a self-managed development team with complementary competencies and skills to solve the tasks in the project. There is no project leader but instead a scrum master, who is responsible for keeping the process running and for dealing with problems that might arise during development. Each working day starts with a short meeting where each member explains their results from the previous workday, any problems they have encountered, and their plan for the day.
- The team interacts with a product owner, who is responsible for defining requirements and providing feedback to the development team.
- There is no detailed requirements specification up front. High-level requirements are kept in a product backlog, which is a list of features the system needs to have. Prior to each sprint, in a sprint planning meeting, the product owner prioritizes which features shall be developed in the sprint. The team details the tasks, estimates the time for development, and adds them to the sprint backlog, which is the list of features that are going to be implemented in the upcoming sprint.
- The sprint ends with a sprint review meeting where the team demonstrates the resulting increment and the product owner provides feedback to the team.

Fig. 3 SafeScrum

SafeScrum extends the basic Scrum model in order to make the process applicable for development and certification of safety-critical software:

- The product backlog is separated into two parts, one with functional requirements and one with safety requirements. This is done as a means to link functional requirements to the safety requirements they relate to. This is useful for tracing how changes in functional requirements affect safety requirements.
- All development and change of requirements and code are traced. This is a fundamental part of the information that has to be provided to the assessor in order to prove conformance to the standard. Based on experience so far, we see that clever use of tools is very important to automate production and maintenance of documentation and traceability.
- We propose an additional role for internal quality assurance within the SafeScrum process (this is also a preferred role to include in "waterfall" projects). This is a person in the team who will continuously verify that all quality assurance tasks have been performed [15].
- Each sprint ends with a sprint review meeting, which also encompasses a RAMS evaluation to ensure that the last increment does not violate the safety functionality of the system.
- When a sprint is planned, and in cases where existing code is being changed, a change impact analysis [5] is done to evaluate whether the change will affect the safety of the system.

A process like SafeScrum may help a development organization in addressing some of the problems that we have identified through the survey. Short iterations allow for frequent evaluation of the design and of the safety function of the system. They also form a basis for frequent communication within the team, with the product owner (e.g. the customer), and potentially with the assessor. The key principle is to uncover and correct problems as early as possible in development. This can be communicated by incremental development of an Agile Safety Case [17].

7 CONCLUSIONS

Our survey uncovers some challenges in the railway system domain which need to be addressed:
1. challenging requirements management,
2. insufficient tools for traceability,
3. problematic communication with the assessor, and
4. extensive testing.

The most fundamental challenge is related to change and unclear requirements. We believe that the V-model needs to be redefined and that principles from agile software development may be a feasible approach, as they enforce frequent feedback and opportunities for re-planning based on continuously updated information. We propose to use the SafeScrum process framework as a basis.

When writing safety cases, normal prose has been used in all the 35 reviewed safety cases, and the author of the safety case has changed, on average, every third year.

Acknowledgements: This work was partially funded by the Norwegian research council under grant #228431 (the SUSS project) and SEP Software (SINTEF).

8 REFERENCES

1. Dybå, T. and Dingsøyr, T., Empirical Studies of Agile Software Development: A Systematic Review. Information and Software Technology 2008. 50(9-10): p. 833-859.
2. Fitzgerald, B., Stol, K.-J., O'Sullivan, R., and O'Brien, D. Scaling agile methods to regulated environments: an industry case study. in Proceedings of the 2013 International Conference on Software Engineering. 2013: IEEE Press.
3. Jonsson, H., Larsson, S., and Punnekkat, S. Agile practices in regulated railway software development. in Software Reliability Engineering Workshops (ISSREW), 2012 IEEE 23rd International Symposium on. 2012: IEEE.
4. Myklebust T., Stålhane T. and Hanssen G. K. Important considerations when applying other models than the Waterfall/V-model when developing software according to IEC 61508 or EN 50128. ISSC 2015 San Diego.
4. Myklebust, T., Stålhane, T., Hanssen, G., Wien, T., and Haugset, B. Scrum, documentation and the IEC 61508-3: 2010 software standard. in Int. conf. on Probabilistic Safety Assessment and Management (PSAM). 2014. Hawaii: PSAM12.
5. Myklebust, T., Stålhane, T., Hanssen, G.K., and Haugset, B. Change Impact Analysis as required by safety standards, what to do? In proceedings of Probabilistic Safety Assessment & Management conference (PSAM12). 2014. Honolulu, USA.
6. Myklebust, T., Stålhane, T., and Lyngby, N., Application of an agile development process for EN50128/Railway conformant software, in Safety and Reliability of Complex Engineered Systems. 2015, CRC Press. p. 4037-4044.
7. Schwaber, K., Beedle, M., Agile Software Development with Scrum. 2001, New Jersey: Prentice Hall.
8. Stålhane, T., Myklebust, T., and Hanssen, G.K. The application of Scrum IEC 61508 certifiable software. In proceedings of ESREL. 2012. Helsinki, Finland.
9. Stålhane, T., Myklebust, T., Hanssen, G.K., Safety standards and Scrum – A synopsis of three standards, in SafeScrum.no, G.K. Hanssen, Editor. 2013.
10. C. M. Holloway. Safety case notations: Alternatives for the non-graphically inclined? 2008. Can be downloaded at https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20080042416.pdf
11. Myklebust, T., Stålhane, T., and Lyngby, N., Application of an agile development process for EN50128/Railway conformant software, in Safety and Reliability of Complex Engineered Systems. 2015, CRC Press. p. 4037-4044.
12. Rook P. Controlling software projects 1986
13. Myklebust T. and Stålhane T. Safety stories – A New Concept in Agile Development. SafeComp 2016-09, Trondheim.
14. Myklebust, T., Stålhane, T., Hanssen, G., Wien, T., and Haugset, B. Scrum, documentation and the IEC 61508-3: 2010 software standard. in Int. conf. on Probabilistic Safety Assessment and Management (PSAM). 2014. Hawaii: PSAM12.
15. Hanssen G. K., Haugset B., Stålhane T., Myklebust T. and Kulbrandstad I. Quality Assurance in Scrum Applied to Safety Critical Software. XP 2016 Edinburgh
16. Myklebust T., Stålhane T. and Haugset B. Software development cost related to different SILs in an agile development environment. ISSC 2015 San Diego.
17. Stålhane T. and Myklebust T. The Agile Safety Case. SafeComp 2016-09, Trondheim.
