Early Identification of SE-Related Program Risks
Barry Boehm, Dan Ingold, University of Southern California
Kathleen Dangle, Fraunhofer-Maryland
Rich Turner, Stevens Institute of Technology
Paul Componation, University of Alabama-Huntsville
Abstract
This paper summarizes the results of a DoD Systems Engineering Research Center (SERC) project to
synthesize analyses of DoD SE effectiveness risk sources into a lean framework and toolset for early
identification of SE-related program risks. It includes concepts of operation which enable project
sponsors and performers to agree on the nature and use of more effective evidence-based reviews.
These enable early detection of missing SE capabilities or personnel competencies with respect to a
framework of Goals, Critical Success Factors (CSFs), and Questions determined from leading DoD
early-SE CSF analyses. The SE Effectiveness Measurement (EM) tools enable risk-based prioritization
of corrective actions: shortfalls in evidence for each question represent early uncertainties which, when
combined with the relative system impact of a negative answer to that question, translate into the degree
of risk that needs to be managed to avoid system overruns and incomplete deliveries.
Introduction: Motivation and Context
DoD programs need effective systems engineering (SE) to succeed.
DoD program managers need early warning of any risks to achieving effective SE.
This SERC project has synthesized analyses of DoD SE effectiveness risk sources into a lean framework
and toolset for early identification of SE-related program risks.
Three important points need to be made about these risks.
• They are generally not indicators of "bad SE." Although SE can be done badly, more often the risks
are consequences of inadequate program funding (SE is the first victim of an underbudgeted
program), of misguided contract provisions (when a program manager is faced with the choice
between allocating limited SE resources toward producing contract-incentivized functional
specifications vs. addressing key performance parameter risks, the path of least resistance is to obey
the contract), or of management temptations to show early progress on the easy parts while
deferring the hard parts till later.
• Analyses have shown that unaddressed risk generally leads to serious budget and schedule overruns.
• Risks are not necessarily bad. If an early capability is needed, and the risky solution has been
shown to be superior to the alternatives, accepting and focusing on mitigating the risk is generally
better than waiting for a better alternative to show up.
Unlike traditional schedule-based and event-based reviews, the SERC SE EM technology enables
sponsors and performers to agree on the nature and use of more effective evidence-based reviews.
These enable early detection of missing SE capabilities or personnel competencies with respect to a
framework of Goals, Critical Success Factors (CSFs), and Questions determined by the EM task from
the leading DoD early-SE CSF analyses. The EM tools enable risk-based prioritization of corrective
actions: shortfalls in evidence for each question represent early uncertainties which, when combined with
the relative system impact of a negative answer to that question, translate into the degree of risk that
needs to be managed to avoid system overruns and incomplete deliveries.
The EM tools’ definition of “SE effectiveness” is taken from the INCOSE definition of SE as “an
interdisciplinary approach and means to enable the realization of successful systems.” Based on this
definition, the SERC project proceeded to identify and organize a framework of SE effectiveness
measures (EMs) that could be used to assess the evidence that an MDAP’s SE approach, current results,
and personnel competencies were sufficiently strong to enable program success. Another component of
the research was to formulate operational concepts that would enable MDAP sponsors and performers to
use the EMs as the basis of collaborative formulation, scoping, planning, and monitoring of the
program’s SE activities, and to use the monitoring results to steer the program toward the achievement
of feasible SE solutions.
Technical Approach
The EM research project reviewed over two dozen sources of candidate SE EMs and converged on the
strongest sources from which to identify candidates. We developed a coverage matrix to determine the
envelope of candidate EMs and the strength of consensus on each one, then fed the results back to the
source originators to validate the coverage matrix. This review yielded further
insights and added candidate EMs to be incorporated into an SE Performance Risk Framework. The
resulting framework is organized into a hierarchy with 4 Goals, 18 Critical Success Factors, and 74
Questions that appeared to cover the central core of common determinants of SE effectiveness.
Concurrently, the research project was extended to also assess SE personnel competency as a
determinant of program success. We analyzed an additional six personnel competency risk frameworks
and sets of questions. Their Goals and Critical Success Factors were very similar to those used in the SE
Performance Risk Framework, although the Questions were different. The resulting SE Competency
Risk Framework added one further Goal of Professional and Interpersonal Skills with five Critical
Success Factors, resulting in a framework of 5 Goals, 23 Critical Success Factors, and 81 Questions.
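For readers who want a concrete picture of this structure, the sketch below is a minimal illustration (in Python, with illustrative names of our own; it is not part of the SERC tools) of the Goal → Critical Success Factor → Question hierarchy shared by both frameworks. The single question shown is paraphrased from the paper's own examples; the full 74- and 81-question sets are elided.

```python
# Minimal sketch (not from the SERC tools) of the Goal -> CSF -> Question hierarchy
# used by both the SE Performance and SE Competency risk frameworks.
from dataclasses import dataclass, field

@dataclass
class CSF:                      # Critical Success Factor
    name: str
    questions: list[str] = field(default_factory=list)

@dataclass
class Goal:
    name: str
    csfs: list[CSF] = field(default_factory=list)

# One illustrative slice of the SE Performance framework (4 Goals, 18 CSFs, 74 Questions);
# the Competency framework adds a fifth Goal, Professional and Interpersonal Skills.
framework = [
    Goal("Concurrent definition of system requirements & solutions", [
        CSF("Understanding of stakeholder needs",
            ["Have the KPPs been identified in clear, comprehensive, concise terms?"]),
    ]),
    # ... remaining Goals, CSFs, and Questions elided ...
]
```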
Our initial research focused on identifying methods that might be suitable for assessing the effectiveness
of systems engineering on major defense acquisition programs (MDAPs). A literature review identified
eight candidate measurement methods: the NRC Pre-Milestone A & Early-Phase SysE top-20 checklist
[20]; the Air Force Probability of Program Success (PoPS) Framework [1]; the INCOSE/LMCO/MIT
Leading Indicators [24]; the Stevens Leading Indicators (new; using SADB root causes) [34]; the USC
Anchor Point Feasibility Evidence criteria [31]; the UAH teaming theories criteria [14]; the NDIA/SEI
capability/challenge criteria [15]; and the SISAIG Early Warning Indicators [9] incorporated into the
USC Macro Risk Tool [33].
Pages 5-8 of the NRC report [20] suggest a “Pre-Milestone A/B Checklist” for judging the successful
completion of early-phase systems engineering. Using this checklist as a concise starting point, we
identified similar key elements in each of the other candidate measurement methods, resulting in a
coverage matrix with a list of 45 characteristics of effective systems engineering. Figure 1 shows the
first page of the coverage matrix. We then had the originators of the measurement methods indicate
where they felt the coverage matrix was inaccurate or incomplete. This assessment also identified
another six EM characteristics not previously noted.
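To illustrate how such a coverage matrix can be represented and used to gauge consensus, the following is a sketch with hypothetical entries and an illustrative scoring rule of our own, not the validated matrix data from the study.

```python
# Sketch of the coverage matrix: candidate EM characteristic -> {source: coverage},
# where "x" = covered and "(x)" = partially covered (see the legend of Figure 1).
# Entries here are hypothetical placeholders, not the actual matrix contents.
coverage = {
    "At least 2 alternatives have been evaluated": {
        "NRC": "x", "Probability of Success": "x", "SE Leading Indicators": "(x)",
    },
    "KPPs identified in clear, comprehensive, concise terms at Milestone A": {
        "NRC": "x", "Macro Risk Model/Tool": "x", "SSEE (CMU/SEI)": "(x)",
    },
}

def consensus_strength(characteristic: str) -> float:
    """Illustrative consensus score: full coverage counts 1, partial coverage 0.5."""
    marks = coverage[characteristic].values()
    return sum(1.0 if m == "x" else 0.5 for m in marks)

for c in coverage:
    print(consensus_strength(c), c)
```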
Figure 1. EM Coverage Matrix

[Figure 1 shows the first page of the SERC EM Task Coverage Matrix V1.0. Rows list candidate EM characteristics drawn from the source checklists, e.g.: “At least 2 alternatives have been evaluated”; “Can an initial capability be achieved within the time that the key program leaders are expected to remain engaged in their current jobs (normally less than 5 years or so after Milestone B)? If this is not possible for a complex major development program, can critical subsystems, or at least a key subset of them, be demonstrated within that time frame?”; “Will risky new technology mature before B? Is there a risk mitigation plan?”; “Have external interface complexities been identified and minimized? Is there a plan to mitigate their risks?”; “At Milestone A, have the KPPs been identified in clear, comprehensive, concise terms that are understandable to the users of the system?”; “At Milestone B, are the major system-level requirements (including all KPPs) defined sufficiently to provide a stable basis for the development through IOC?”; “Has a CONOPS been developed showing that the system can be operated to handle the expected throughput and meet response time requirements?”. Columns list the candidate EM sources: NRC, Probability of Success, SE Leading Indicators, LIPSF (Stevens), Anchoring SW Process (USC), PSSES (U. of Alabama), SSEE (CMU/SEI), and Macro Risk Model/Tool. Legend: x = covered by EM; (x) = partially covered (unless stated otherwise).]
Previous research by the USC team into a macro-risk model for large-scale projects had resulted in a
taxonomy of high-level goals and supporting critical success factors (CSFs) based on [28]. This was
identified as a potential framework for organizing the 51 EM characteristics identified above. Analysis
of the characteristics showed that they could be similarly organized into a series of four high-level goals,
each containing 4-5 CSFs, as seen in Figure 2. Our survey of the existing literature suggests that these
CSFs are among the factors that are most critical to successful SE, and that the degree to which the SE
function in a program satisfies these CSFs is a measure of SE effectiveness.
Figure 2. Goals and CSFs for SE Performance

High-level Goal: Concurrent definition of system requirements & solutions
Critical Success Factors: Understanding of stakeholder needs; Concurrent exploration of solutions; System scoping & requirements definition; Prioritization/allocation of requirements

High-level Goal: System life-cycle organization, planning & staffing
Critical Success Factors: Establishment of stakeholder RAAs; Establishment of IPT RAAs; Establishment of resources to meet objectives; Establishment of selection/contracting/incentives; Assurance of necessary personnel competencies

High-level Goal: Technology maturing & architecting
Critical Success Factors: COTS/NDI evaluation, selection, validation; Life-cycle architecture definition & validation; Use of prototypes, models, etc. to validate maturity; Validated budgets & schedules

High-level Goal: Evidence-based progress monitoring & commitment reviews
Critical Success Factors: Monitoring of system definition; Monitoring of feasibility evidence development; Monitoring/assessment/re-planning for changes; Identification and mitigation for feasibility risks; Reviews to ensure stakeholder commitment
Related to the effectiveness measures of SE performance is the need to measure the effectiveness of the
staff assigned to the SE function. Besides the eight SE Performance Risk Framework sources described above, six additional sources were
reviewed for contributions to Personnel Competency evidence questions: the Office of the Director of
National Intelligence (ODNI), Subdirectory Data Collection Tool: Systems Engineering [22]; the
INCOSE Systems Engineering Handbook, August 2007 [17]; the ASN (RD&A), Guidebook for
Acquisition of Naval Software Intensive Systems, September 2008 [3]; the CMU/SEI, Models for
Evaluating and Improving Architecture Competence report [5]; the NASA Office of the Chief Engineer,
NASA Systems Engineering Behavior Study, October 2008 [35]; and the National Research Council,
Human-System Integration in the System Development Process report, 2007 [23].
These were analyzed for candidate knowledge, skills, and abilities (KSA) attributes proposed for
systems engineers. Organizing these work activities and KSAs revealed that the first four goals and
their CSFs were in common with the EM taxonomy. This is illustrated in Figure 3, which shows the
compatibility of the four goals in the EM taxonomy with the first four goals in the National Defense
Industrial Association’s SE Personnel Competency framework and those in the CMU/SEI Models for
Evaluating and Improving Architecture Competence report.
Figure 3. Comparison of EM Competency Framework with NDIA and SEI Counterparts

SERC EM Framework | NDIA Personnel Competency FW | SEI Architect Competency FW
Concurrent Definition of System Requirements & Solutions | Systems Thinking | Stakeholder Interaction
System Life Cycle Organization, Planning, Staffing | Life Cycle View | Other phases
Technology Maturing and Architecting | SE Technical | Architecting
Evidence-Based Progress Monitoring & Commitment Reviews | SE Technical Management | Management
Professional/Interpersonal (added) | Professional/Interpersonal | Leadership, Communication, Interpersonal
As one might expect, the two competency frameworks also had a fifth goal emphasizing professional
and interpersonal competencies. Drawing on these and the other Personnel Competency sources cited
above, an additional goal and its related CSFs were added for the EM Competency framework, as
presented in Figure 4.
Figure 4. Additional goals and CSFs for SE competency

High-level Goal: Professional and interpersonal skills
Critical Success Factors: Ability to plan, staff, organize, team-build, control, and direct systems engineering teams; Ability to work with others to negotiate, plan, execute, and coordinate complementary tasks for achieving program objectives; Ability to perform timely, coherent, and concise verbal and written communication; Ability to deliver on promises and behave ethically; Ability to cope with uncertainty and unexpected developments, and to seek help and fill relevant knowledge gaps
Question-Level Impact/Evidence Ratings and Project SE Risk Assessment
Using these relatively high-level criteria, however, it is difficult to evaluate whether the SE on a
particular program adequately satisfies the CSFs. In its approach to evaluating macro-risk in a program,
[31] suggests that a goal-question-metric (GQM) approach [4] provides a method to accomplish this
evaluation. Following this example, we developed questions to explore each goal and CSF, and devised
metrics to determine the relevance of each question and the quality of each answer.
The researchers began question development for the SE performance framework with the checklist from
[20]. Further questions were adapted from the remaining EM characteristics, rewritten as necessary to
express them in the form of a question. Each question is phrased such that, answered affirmatively, it
indicates positive support of the corresponding CSF. Thus, the strength of support for each answer is
related to the relative risk probability associated with the CSF that question explores.
Rather than rely simply on the opinion of the evaluator as to the relative certainty of positive SE
performance, a stronger and more quantifiable evidence-based approach was selected. The strength of
the response is related to the amount of evidence available to support an affirmative answer—the
stronger the evidence, the lower the risk probability. Feedback from industry, government, and academic
participants in workshops conducted in March and May 2009 suggested that a simple risk probability
scale with four discrete values be employed for this purpose.
Evidence takes whatever form is appropriate for the particular question. For example, a simulation
model might provide evidence that a particular performance goal can be met. Further, the strongest
evidence is that which independent expert evaluators have validated.
Recognizing that each characteristic might be more or less applicable to a particular program being
evaluated, the questions are also weighted according to the risk impact that failure to address the
question might be expected to have on the program. Again based on workshop feedback, a four-value
scale for impact was chosen.
The product of the magnitude of a potential loss (the risk impact) and the likelihood of that loss (the risk
probability) is the risk exposure. Although risk exposure is generally calculated from quantitative
real-number estimates of the magnitude and probability of a loss, the assessments of risk impact and risk
probability described above use an ordinal scale. However, as shown in the tool below, we have
associated quantitative ranges of loss magnitude and loss probability with the rating levels, providing a
quantitative basis for mapping the four-value risk probability and risk impact scales to a discrete
five-value risk exposure scale.
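For instance (a worked example of our own, using the representative average values that the SEPRT scales associate with each rating level, as described later in this paper), the per-question risk exposure is computed as:

```latex
% Risk exposure for one question, using the ratings' representative average values
\mathrm{RE} \;=\; P(\mathrm{Risk}) \times \mathrm{Size}(\mathrm{Risk})
% e.g., Weak evidence (average risk probability 0.3) on a Significant-impact question
% (average 30% cost-schedule-capability shortfall):
\mathrm{RE} \;=\; 0.3 \times 30\% \;=\; 9\% \ \text{expected shortfall}
```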
Prototype SE Effectiveness Risk Tools
As a means to test the utility of these characteristics for assessing systems engineering effectiveness,
using the GQM approach outlined above, the researchers created prototype tools that might be used to
perform periodic evaluations of a project, similar to a tool used in conjunction with the macro-risk
model described above. The following section describes this prototype implementation in further detail.
SE Performance Risk Tool
The Systems Engineering Performance Risk Tool (SEPRT) is an Excel spreadsheet-based prototype
focused on enabling projects to determine their relative risk exposure due to shortfalls in their SE
performance relative to their prioritized project needs. It complements other SE performance
effectiveness assessment capabilities such as the INCOSE Leading Indicators, in that it supports periodic
assessment of evidence of key SE function performance, as compared to supporting continuous
assessment of key project SE quantities such as requirements volatility, change and problem closure
times, risk handling, and staffing trends.
The operational concept of the SEPRT tool is to enable project management (generally the Project
Manager or his/her designate) to prioritize the relative impact on the particular project of shortfalls in
performing the SE task represented in each question. Correspondingly, the tool enables the project
systems engineering function (generally the Chief Engineer or Chief Systems Engineer or their
designate) to evaluate the evidence that the project has adequately performed that task. This
combination of impact and risk assessment enables the tool to estimate the relative project risk exposure
for each question, and to display them in a color-coded Red-Yellow-Green form.
These ideas were reviewed in workshops with industry, government, and academic participants
conducted in March and May 2009, with respect to usability factors in a real project environment. A
consensus emerged that the scale of risk impact and risk probability estimates should be kept simple and
easy to understand. Thus a red, yellow, green, and grey scale was suggested to code the risk impact; and
a corresponding red, yellow, green, and blue scale to code the risk probability. These scales are
discussed in more depth below. An example of the rating scales, questions, and calculated risk exposure
in the prototype tool is presented in Figure 5 below.
Figure 5. The SEPRT Tool Seeks Performance Evidence
Risk impact ratings vary from critical impact (40-100%; average 70% cost-schedule-capability
shortfall) in performing the SE task in question (red), through significant impact (20-40%; average 30%
shortfall: yellow) and moderate impact (2-20%; average 11% shortfall: green), to little-no impact (0-2%;
average 1% shortfall: gray). These relative impact ratings enable projects to tailor the evaluation to the
project’s specific situation. Thus, for example, it is easy to “drop” a question by clicking on its “No
Impact” button, but also easy to restore it by clicking on a higher impact button. The rating scale for the
impact level is based on the user’s chosen combination of effects on the project’s likely cost overrun,
schedule overrun, and shortfall of delivered versus promised capability (considering that there are
various tradeoffs among these quantities).
Using Question 1.1(a) from Figure 5 as an example, if the project were a back-room application for base
operations with no mission-critical key performance parameters (KPPs), its impact rating would be
Little-No impact (Gray). However, if the project were a C4ISR system with several mission-critical
KPPs, its rating would be Critical impact (Red).
The Evidence/Risk rating is the project’s degree of evidence that each SE effectiveness question is
satisfactorily addressed, scored (generally by the project Chief Engineer or Chief Systems Engineer or
their designate) on a risk probability scale: the less evidence, the higher the probability of shortfalls. As
with the Impact scale, the Evidence scale has associated quantitative ratings: Little or No Evidence: P = 0.4-1.0, average 0.7; Weak Evidence: P = 0.2-0.4, average 0.3; Partial Evidence: P = 0.02-0.2, average 0.11; Strong Evidence: P = 0-0.02, average 0.01.
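To make the combination of the two scales concrete, the sketch below (a minimal illustration in Python, not part of the Excel-based SEPRT tool; the names and structure are our own) maps each ordinal rating to its representative average value and computes the per-question risk exposure on a 0-100 scale, reproducing the averages shown in Figure 6.

```python
# Minimal sketch of the SEPRT question-level risk exposure calculation.
# The average values come from the Impact and Evidence scales described above;
# the function and variable names are illustrative, not from the actual tool.

IMPACT_AVG = {            # average % cost-schedule-capability shortfall
    "Critical": 70.0,     # red:    40-100%
    "Significant": 30.0,  # yellow: 20-40%
    "Moderate": 11.0,     # green:  2-20%
    "Little-No": 1.0,     # gray:   0-2%
}

EVIDENCE_AVG_PROB = {       # average risk probability implied by evidence strength
    "Little or No": 0.7,    # P = 0.4-1.0
    "Weak": 0.3,            # P = 0.2-0.4
    "Partial": 0.11,        # P = 0.02-0.2
    "Strong": 0.01,         # P = 0-0.02
}

def risk_exposure(impact: str, evidence: str) -> float:
    """Risk Exposure = P(Risk) * Size(Risk), on a 0-100 scale."""
    return EVIDENCE_AVG_PROB[evidence] * IMPACT_AVG[impact]

# Examples, matching cells of Figure 6:
print(round(risk_exposure("Critical", "Little or No"), 2))  # 49.0 (highest cell)
print(round(risk_exposure("Moderate", "Weak"), 2))          # 3.3
```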
Again using Question 1.1(a) from Figure 5 as an example, for a C4ISR system with several
mission-critical KPPs, a lack of evidence (from analysis of current-system shortfalls and/or the use
of operational scenarios and prototypes) that its “KPPs had been identified at Milestone A in clear,
comprehensive, concise terms that are understandable to the users of the system” would result in a High
risk probability, while strong and externally validated evidence would result in a Very Low risk
probability.
Using the average probability and impact values presented above, the average-valued Risk Exposure =
P(Risk) * Size(Risk), expressed relative to 100%, is presented for each rating combination in Figure 6.
The SEPRT tool provides a customizable mapping of each impact/probability pair to a color-coded risk
exposure, based on this table. For each question, the risk exposure level is determined by the combination
of risk impact and risk probability, and a corresponding risk exposure color-coding is selected, ranging
from red for the highest risk exposure to green for the lowest. Figure 6 shows the default color-coding
used in the SEPRT tool; an additional Excel sheet in the tool enables users to specify different color codings.
Figure 6. Average risk exposure calculation and default color code

Impact // Probability | Very Low | Low | Medium | High
Critical | 0.7 | 7.7 | 21 | 49
Significant | 0.3 | 3.3 | 9 | 21
Moderate | 0.11 | 1.21 | 3.3 | 7.7
Little-No Impact | 0.01 | 0.11 | 0.3 | 0.7
As seen in Figure 5, the risk exposure resulting from scoring the impact and risk of each question is
presented in the leftmost column. Based on suggestions from workshop participants, the current version
of the tool assigns the highest risk exposure level achieved by any of the questions in a CSF as the risk
exposure for the overall CSF. This maximum risk exposure is presented in the rightmost column for the
CSF. This rating method has the advantages of being simple and conservative, but might raise questions
if, for example, CSF 1.1 were given a red risk exposure level for one red and four greens, and a yellow
risk exposure level for five yellows. Experience from piloting the tool has suggested refinements to this
approach, discussed later in this paper.
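A minimal sketch of this conservative rollup rule, continuing the illustrative Python above (again, our own illustration, not the tool's actual implementation), is:

```python
# Conservative CSF rollup: the CSF inherits the worst (highest) risk exposure
# among its questions, as suggested by workshop participants.
def csf_risk_exposure(question_exposures: list[float]) -> float:
    return max(question_exposures)

# Illustrative values only: one "red" question among four "greens" dominates the CSF,
# while five "yellows" roll up to only a yellow-level exposure -- the concern noted above.
print(csf_risk_exposure([49, 0.3, 0.3, 0.3, 0.3]))  # 49
print(csf_risk_exposure([9, 9, 9, 9, 9]))           # 9
```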
SE Competency Risk Tool
The initial section of the Systems Engineering Competency Risk Tool (SECRT) is shown in Figure 7. It
functions in the same way as the SEPRT tool described above, but its questions address key
considerations of personnel competency for each CSF. The space limitations of this paper preclude
showing all of the SEPRT and SECRT questions corresponding to the goals and CSFs. They are
provided in the downloadable tools and SERC EM project Final Technical Report [36] at the SERC web
site at TBD.
SEPRT and SECRT Concepts of Operation
The SEPRT and SECRT framework and tools provide a way for projects to identify the major sources of
program risk due to SE shortfalls. This section summarizes concepts of operation for applying the tools
at major milestones, and at other points where SE demonstration shortfalls, or other SE EMs such as the
INCOSE Leading Indicators, have identified likely problem situations that require further understanding of
the problem sources and their relative degrees of risk. More detail and examples are provided in the SE
EM technical report [36].
The first step in the concept of operations involves collaborative planning by a project’s sponsoring
decision authority (at a developer level, a program level, or a program oversight level) and its
performing organization(s) to reach agreements on the relative priorities of its needed performance
aspects and personnel competencies, as measured by the relative program impact of their SEPRT and
SECRT question content. The planning necessarily includes consideration of the consistency of these
priorities with the project’s SE budget, schedule, and contract provisions. This stage frequently
identifies inconsistencies between sponsor priorities (e.g., early key performance parameter (KPP)
tradeoff analysis) and contract provisions (e.g., progress payments and award fees initially focused on
functional specifications and not KPP satisfaction), and enables their timely resolution.
The next step involves evaluation by independent experts of the evidence provided by the performers of
their ability to achieve the desired levels of SE performance within the budgets, schedules, and staffing
defined in their SE plans. This should include the rationale for the choices of
evidence preparation such as prototyping, modeling, simulation, benchmarking, exercising, testbed
preparation, scenario generation, instrumentation and data analysis; and their associated resource
consumption. Subsequently, progress with respect to the plans needs to be monitored; it is best to
consider the planned evidence as a first class deliverable, and to include its progress measurement in the
project’s earned value monitoring system.
Finally, the completed evidence needs to be provided by the performers at each major milestone review,
along with the performers’ rating of the strength of the evidence and the associated risk level indicated
by the SEPRT and SECRT tools. These ratings are then evaluated by independent experts, and adjusted
where the evidence is stronger or weaker than the indicated ratings. The revised ratings are discussed
and iterated by the sponsors and performers, and used to determine revised SEPRT and SECRT risk
levels. These will enable the sponsors and performers to determine the necessary risk mitigation plans,
budgets, and schedules for ensuring project success. Again, more detailed scenarios and flowcharts are
provided in the SE EM technical report [36].
Summary of Framework and Tool Evaluations
We solicited pilot evaluations of the EM performance and competency frameworks, using the prototype
SEPRT and SECRT tools, from industry, government agencies, and academic participants. Because the
task re-scoping permitted only a single round of piloting, these initial evaluations were conducted
against historical projects and case studies. The tools were successfully piloted against five DoD
projects, one NASA project, and one commercial project. They were also analyzed by two industrially experienced colleagues against detailed case studies of a number of DoD and commercial projects. The
application domains piloted included space, medical systems, logistics, and systems-of-systems. Results
of the pilot evaluations were reported through a web-based survey tool and detailed follow-up
interviews, while the case study evaluations were reported through detailed comments from the
reviewers.
Evaluations were generally positive, and the frameworks were found to be useful across all project
phases except Production, and against all systems types except “legacy development.” The consensus of
reviewers was that the frameworks would be most useful in the System Development & Demonstration
(SDD) phase, and generally more useful in early phases than later. It was noted, however, that in
systems developed using evolutionary strategies, such “early” phases recur throughout the development
cycle, extending the usefulness of the frameworks. The evaluations were reported to take 2-5 hours to
complete for persons familiar with the projects, with materials that were readily at hand. Also, in
reviewing case study material, some evaluators reported that the EM framework was not specific to any
particular problem domain (a choice we made to make domain tailoring user-performable via Excel).
Several evaluators reported that the frameworks generated too many high-risk findings, which might
make the results too overwhelming to act upon. In response to this significant concern, the impact
scales were adjusted to make the adjectives better correspond to the quantitative impacts
(Critical-Significant-Moderate-Little or No vs. High-Medium-Low-No impact), and a longer risk exposure
scale was developed to allow more nuanced results.
In addition, the University of Maryland (UMD) Fraunhofer Center (FC) performed preliminary
evaluations against the Systemic Analysis Database (SADB), compiled by OUSD (AT&L), and a
mapping between the SEPRT questions and the Defense Acquisition Program Support (DAPS)
methodology underlying the SADB results. This evaluation approach allowed analysis of the
effectiveness of the frameworks with respect to historical success and failures of the subject projects,
and another cross check of the SEPRT coverage. Overall, the coverage mapping indicated that the two
were largely consistent, with more domain coverage in the DAPS methodology. A similar mapping was
performed between the SECRT and the Defense Acquisition University’s SPRDE-SE/PSE Competency
Model, with similar results.
Further, a business case analysis for the investment in SE effectiveness evidence was performed, based
on data from 161 software-intensive systems projects used to calibrate the COCOMO II cost estimation
model and its Architecture and Risk Resolution scale factor. It concluded that the greater the project’s
size, criticality, and stability are, the greater is the need for validated architecture feasibility evidence
(i.e., evidence-based specifications and plans). However, for very small, low-criticality projects with
high volatility, the evidence generation efforts would make little difference and would need to be
continuously redone, producing a negative return on investment. In such cases, agile methods such as
rapid prototyping, Scrum and eXtreme Programming will be more effective. Overall, evidence-based
specifications and plans will not guarantee a successful project, but in general will eliminate many of the
software delivery overruns and shortfalls experienced on current software projects. Again, more details
are provided in the SE EM technical report [36].
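As a hedged illustration of the scale effect behind this conclusion, the sketch below uses the published COCOMO II.2000 constants and its Architecture and Risk Resolution (RESL) scale factor; it is our own back-of-the-envelope example, not a calculation taken from the paper's 161-project analysis.

```python
# COCOMO II.2000 nominal effort model: PM = A * (KSLOC ** E) * product(effort multipliers),
# with E = B + 0.01 * sum(scale factors). Here we vary only the RESL (Architecture and
# Risk Resolution) scale factor; the other scale factors are omitted because they cancel
# in the effort ratio. Constants are from the published COCOMO II.2000 calibration.
A, B = 2.94, 0.91
RESL = {"Very Low": 7.07, "Extra High": 0.00}  # RESL scale-factor values

def relative_effort_penalty(ksloc: float) -> float:
    """Ratio of effort with Very Low RESL to effort with Extra High RESL."""
    e_low = B + 0.01 * RESL["Very Low"]
    e_high = B + 0.01 * RESL["Extra High"]
    return ksloc ** e_low / ksloc ** e_high

for size in (10, 100, 10_000):  # project size in KSLOC
    print(size, round(relative_effort_penalty(size), 2))
# ~1.18 at 10 KSLOC, ~1.38 at 100 KSLOC, ~1.92 at 10,000 KSLOC: the larger the project,
# the larger the payoff from validated architecture and risk-resolution (feasibility) evidence.
```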
Conclusions
DoD programs need effective systems engineering (SE) to succeed.
DoD program managers need early warning of any risks to achieving effective SE.
This SERC project has synthesized the best analyses of DoD SE effectiveness risk sources into a lean
framework and toolset for early identification of SE-related program risks.
Three important points need to be made about these risks.
• They are generally not indicators of "bad SE." Although SE can be done badly, more often the risks
are consequences of inadequate program funding (SE is the first victim of an underbudgeted
program), of misguided contract provisions (when a program manager is faced with the choice
between allocating limited SE resources toward producing contract-incentivized functional
specifications vs. addressing key performance parameter risks, the path of least resistance is to obey
the contract), or of management temptations to show early progress on the easy parts while
deferring the hard parts till later.
• Analyses have shown that unaddressed risk generally leads to serious budget and schedule overruns.
• Risks are not necessarily bad. If an early capability is needed, and the risky solution has been
shown to be superior to the alternatives, accepting and focusing on mitigating the risk is generally
better than waiting for a better alternative to show up.
The results of the SEPRT and SECRT pilot assessments, the DAPS and SADB comparative analysis,
and the quantitative business case analysis for the use of the SE EM framework, tools, and operational
concepts are sufficiently positive to conclude that implementation of the approach is worth pursuing.
Presentations at recent workshops have generated considerable interest in refining, using, and extending
the capabilities and in co-funding the follow-on research. To date, however, the framework and prototype
tools have been shown to be efficacious only for pilot projects performed by familiar experts in a
relatively short time. It remains to demonstrate how well the framework and tools will perform on
in-process MDAPs with multiple missions, performers, and independent expert assessors.
Some implications of defining feasibility evidence as a “first class” project deliverable are that it needs
to be planned (with resources), and made part of the project’s earned value management system. Any
shortfalls in evidence are sources of uncertainty and risk, and should be covered by risk management
plans. The main contributions of the SERC SE EM project have been to provide experience-based
approaches and operational concepts for the use of evidence criteria, evidence-generation procedures,
and SE effectiveness measures for monitoring evidence generation, which support the ability to perform
evidence-based SE on DoD MDAPs. And finally, evidence-based specifications and plans such as those
provided by the SERC SE EM capabilities and the Feasibility Evidence Description can and should be
added to traditional milestone reviews.
As a bottom line, the SERC SE capabilities have strong potential for transforming the largely
unmeasured DoD SE activity content on current MDAPs and other projects into an evidence-based
measurement and management approach, both for improving the outcomes of current projects and for
developing a knowledge base that can serve as a basis for continuing DoD SE effectiveness
improvement.
References
[1] Air Force. Probability of Program Success Operations Guide. https://acc.dau.mil/GetAttachment.aspx?id=19273&pname=file&aid=976&lang=en-US
[2] Al Said, M. 2003. Detecting Model Clashes During Software Systems Development. Doctoral Thesis. Department of Computer Science, University of Southern California.
[3] ASN (RD&A). Guidebook for Acquisition of Naval Software Intensive Systems. September 2008. http://acquisition.navy.mil/organizations/dasns/rda_cheng.
[4] Basili, V., Caldiera, G., Rombach, D. 1994. The Experience Factory. In Encyclopedia of Software Engineering. John Wiley & Sons, 469-476.
[5] Bass, L., Clements, P., Kazman, R., Klein, M. Models for Evaluating and Improving Architecture Competence. CMU/SEI-2008-TR-006 (April 2008).
[6] Beck, K. 1999. Extreme Programming Explained. Addison-Wesley.
[7] Boehm, B. 1996. Anchoring the Software Process. IEEE Software (Jul. 1996), 73-82.
[8] Boehm, B., et al. 2000. Software Cost Estimation with COCOMO II. Prentice Hall.
[9] Boehm, B., Ingold, D., Madachy, R. 2008. The Macro Risk Model: An Early Warning Tool for Software-Intensive Systems Projects. http://csse.usc.edu/csse/event/2009/UARC/material/Macro%20Risk%20Model.pdf
[10] Boehm, B. and Lane, J. 2009. Incremental Commitment Model Guide, version 0.5.
[11] Boehm, B. and Lane, J. 2007. Using the ICM to Integrate System Acquisition, Systems Engineering, and Software Engineering. CrossTalk (Oct. 2007), 4-9.
[12] Boehm, B., Port, D. and Al Said, M. 2000. Avoiding the Software Model-Clash Spiderweb. IEEE Computer (Nov. 2000), 120-122.
[13] Boehm, B., Valerdi, R., and Honour, E. 2008. The ROI of Systems Engineering: Some Quantitative Results for Software-Intensive Systems. Systems Engineering, Fall 2008, pp. 221-234.
[14] Componation, P., Youngblood, A., Utley, D., Farrington. Assessing the Relationships Between Project Success and System Engineering Processes. http://csse.usc.edu/csse/event/2009/UARC/material/Project%20Success%20and%20SE%20Processes%20-%20UAHuntsville%20-%20Componation.pdf
[15] Elm, J., Goldenson, D., Emam, K., Donatelli, N., Neisa, A., NDIA SE Effectiveness Committee. A Survey of Systems Engineering Effectiveness – Initial Results. CMU/SEI-2008-SR-034 (Dec. 2008). http://www.sei.cmu.edu/reports/08sr034.pdf
[16] Glass, R. 1998. Software Runaways. Prentice Hall.
[17] INCOSE. 2007. INCOSE Systems Engineering Handbook v. 3.1. INCOSE-TP-2003-002-03.1.
[18] Kruchten, P. 1999. The Rational Unified Process. Addison Wesley.
[19] Maranzano, J., et al. 2005. Architecture Reviews: Practice and Experience. IEEE Software (Mar./Apr. 2005).
[20] National Research Council (U.S.). 2008. Pre-Milestone A and Early-Phase Systems Engineering: A Retrospective Review and Benefits for Future Air Force Systems Acquisition. Washington, D.C.: National Academies Press.
[21] Office of the Deputy Under Secretary of Defense for Acquisition and Technology, Systems and Software Engineering. Defense Acquisition Program Support (DAPS) Methodology, Version 2.0 (Change 3), March 20, 2009.
[22] Office of the Director of National Intelligence (ODNI). Subdirectory Data Collection Tool: Systems Engineering.
[23] Pew, R. and Mavor, A. 2007. Human-System Integration in the System Development Process: A New Look. National Academy Press.
[24] Roedler, G., Rhodes, D. Systems Engineering Leading Indicators Guide. http://csse.usc.edu/csse/event/2009/UARC/material/SELeadingIndicators2007-0618.pdf
[25] Royce, W. 1998. Software Project Management. Addison Wesley.
[26] Royce, W., Bittner, K., and Perrow, M. 2009. The Economics of Iterative Software Development. Addison Wesley.
[27] Schwaber, K. and Beedle, M. 2002. Agile Software Development with Scrum. Prentice Hall.
[28] SISAIG. 2004. System-Software Milestone/Review Decision Support Framework. US/UK/AUS Software Intensive System Acquisition Improvement Group.
[29] Standish Group. 2009. CHAOS Summary 2009. http://standishgroup.com.
[30] United States Government Accountability Office. Defense Acquisitions: Assessments of Selected Weapon Programs. http://www.gao.gov/new.items/d09326sp.pdf
[31] USC. Anchor Point Feasibility Evidence. http://csse.usc.edu/csse/event/2009/UARC/material/MBASE_Guidelines_v2.4.2%20FRD.pdf
[32] USC. 2007. Evaluation and Calibration of Macro Risk Model (August 2007).
[33] USC. Macro Risk Tool. http://csse.usc.edu/csse/event/2009/UARC/material/Macro%20Risk%20Model%20v%2012.1.xls
[34] Weitekamp, M., Verma, D. Leading Indicators of Program Success and Failure. Stevens Institute. http://csse.usc.edu/csse/event/2009/UARC/material/Stevens%20Leading%20Indicators%20Project.pdf
[35] Williams, C., Derro, M. NASA Systems Engineering Behavior Study. NASA Office of the Chief Engineer (Oct. 2008).
[36] TBD: put in full list of authors (including yourself), final tech report title, 30 Sept 2009 date, and TBD reference to a SERC web site location.