SecDevOps
Dinis Cruz
This book is for sale at http://leanpub.com/secdevops
Introduction
  Book under construction
  Change log
  Contributions
  Disclaimers
  License
  Printed version
This Book has a Dual Focus
Why GitHub and JIRA?

1. Sec-DevOps
  1.1 Concepts
    1.1.1 SecDevOps or DevSecOps
    1.1.2 Don't blame the developers
    1.1.3 Good resources on DevSecOps and SecDevOps
    1.1.4 History of Sec-DevOps
    1.1.5 Making the Sec part invisible
    1.1.6 Rugged Software
    1.1.7 Using Artificial Intelligence for proactive defense
    1.1.8 When Failed Tests are Good
    1.1.9 Why SecDevOps?
    1.1.10 Draft notes - DevOps
  1.2 Dev-Ops
    1.2.1 Disposable IT infrastructure

2. Risk Workflow
  2.1 Concepts
    2.1.1 Abusing the concept of RISK
    2.1.2 Accepting risk
    2.1.3 Can't do Security Analysis when doing Code Review
    2.1.4 Creating Small Tests
    2.1.5 Creating Abuse Cases
    2.1.6 Deliver PenTest reports using JIRA
    2.1.7 Email is not an Official Communication Medium
    2.1.8 Good Managers are not the Solution
    2.1.9 Hyperlink Everything you do
    2.1.10 Linking source code to Risks
    2.1.11 Mitigating Risk
    2.1.12 Passive aggressive strategy
    2.1.13 Reducing complexity
    2.1.14 Start with passing tests
    2.1.15 Ten minute hack vs one day work
    2.1.16 The Pollution Analogy
    2.1.17 Triage issues before developers see them
    2.1.18 Using AppSec to measure quality
    2.1.19 Employ Graduates to Manage JIRA

3. Appendix
  3.1 Appendix A: Creating Workflow in JIRA
    3.1.1 Creating-a-Jira-project
    3.1.2 Step-by-step instructions
  3.2 Appendix B: GitHub book workflow
    3.2.1 Book creation workflow
    3.2.2 GitHub Leanpub hook
    3.2.3 GitHub online editing
    3.2.4 GitHub repos used
    3.2.5 Tool leanpub-book-site
    3.2.6 Atom, Markdown, Graphviz, DOT
  3.3 Appendix C: Security Tests Examples
  3.4 Appendix D: Case Studies
    3.4.1 File Upload
    3.4.2 HTML editing
  3.5 Appendix E: GitHub Issue workflow
    3.5.1 GitHub Labels
    3.5.2 Reporting issues
  3.6 Draft Notes
    3.6.1 Draft notes - AppSec Tools
    3.6.2 Draft notes - Code Quality
Introduction
There are multiple sections of this book that are still in 'draft' mode (as you can see from the rest of this introduction, which still needs a serious rewrite).
Here are some ideas that I feel will be good in this intro section.
(previous) Introduction
This book will give you a solution for the following common
problems inside AppSec and Development teams:
Why AppSec
Why Developers
Why JIRA
InfoSec tickets don't tend to work that well with JIRA tickets. InfoSec usually has its own tools and dashboards to manage (for example Firewalls, Anti-Virus detections, Active Directory issues).
Could these same techniques be used to manage other types of Risk? Yes, but since I don't have that much experience with it, I will leave it to the reader.
Change log
• v0.55
– First contribution via PR
– added license details (CC BY 4.0)
• v0.54
– Major section on Security Champions
• v0.52
– Major content changes including import of multiple
chapters that were in the Quality Book⁵
– bumping to version 0.50 due to the current volume of
content
• v0.11
– Added automation using leanpub-book-site⁶ tool
– Lots of content added, first pictures in chapters
• v0.10 (Oct 2016)
– Created Git repo and hooked Leanpub
– Added first set of content files
⁵https://github.com/DinisCruz/Book_Software_Quality
⁶https://github.com/o2platform/leanpub-book-site
Contributions
Disclaimers
License
Printed version
If you are reading this on a digital device, here are some reasons why you might want to consider buying the printed book⁷:
⁷the printed version of this book will be created after the first v1.0 release, and will be
released at lulu.com and Amazon.
⁸http://blog.diniscruz.com/2013/09/physical-books-are-best-technology-for.html
This book has a dual focus. First, it looks at how application security
(AppSec) fits within the security, development and operations
(SecDevOps) world. Secondly, the book demonstrates how a risk
workflow is the key to making the SecDevOps world work.
The first part of the book considers the actions and the activities that
AppSec introduces into SecDevOps. These actions include testing,
insurance, and using the techniques that are part of AppSec to
improve SecDevOps.
The second part of the book details a powerful technique that
gives meaning to the work of SecDevOps in an organization.
The technique is a risk workflow, based on JIRA or GitHub, that
captures and tracks every action and idea, and their implications,
that are raised by SecDevOps. These ideas, actions, and implications
must be considered, addressed, and accepted by the organization.
This book presents ideas, concepts, and suggestions to make the
risk workflow work in the real world. The subject matter is entirely
based on real-world experiences, real-world experiments, and real-
world projects across small and large organizations.
1.1 Concepts
• https://www.reddit.com/r/secdevops/
• https://twitter.com/hashtag/secdevops
• SecDevOps: Embracing the Speed of DevOps and Continuous Delivery in a Secure Environment (https://securityintelligence.com/secdevops-embracing-the-speed-of-devops-and-continuous-delivery-in-a-secure-environment/)
DevSecOps
• https://www.ruggedsoftware.org/
• Rugged Manifesto
• add explanation of what it is (and its history)
• why it didn't really work (at least according to the original expectations)
– lack of community adoption
– 'Security Driven' vs 'Developer Driven'
• The Docker case study
– why Docker was so successful (as an idea and adoption)
• lessons learned
When you are doing security analysis, you are dealing with a vast amount of data, displayed on a multi-dimensional graph. What you have is a graph of the relationships, of what is happening. You are looking for the connections, for the paths within the graph, that show what is really going on and what is possible.
Artificial intelligence technology can assist the human who will put
context on those connections. I think we are a long way from being
able to do this kind of analysis automatically, but if we can make
the human’s job of reviewing the results easier, or even possible,
that is a major step forward.
but the information they seek should already be known and avail-
able to them.
AppSec will often create tools to attack an application to visualize
what is going on. I have had many experiences of spending time
creating technology to understand the attack surface. Once that
task is complete, I find a huge number of vulnerabilities, simply
because a significant part of the application hadn’t been tested. The
system owners, and the various players didn’t fully understand the
application or how it behaved.
So, I think SecDevOps represents an interesting point in history,
where we are trying to merge all the activities that have been going
on in security with the activities that have been going on in DevOps
so we can build an environment where we can create and deploy
secure applications.
This relates closely to my ideas about using AppSec to measure
quality. Ultimately, the quality of the application is paramount
when we are trying to eliminate the unintended consequences of
software.
DevSecOps initially sounds better because development goes first.
But I agree with the view of DevSecOps as being more about
applying development to security operations (SecOps).
This all ties together with the risk workflows that make things more
connected and accountable.
2. Run 'a' tool after the build (it doesn't really matter which one; what matters is that it uses the materials created in step 1)
• use for example FindBugs with FindSecBugs ²
3. Fine-tune the scan targets (i.e. exactly what needs to be scanned)
4. Filter results, customize rules to reduce noise
5. Gather reports and scans and put them in a git repo
6. Create a consolidated dashboard (for management and business owners)
7. Add more tools
8. Loop from step 5

After this is mature, add a step to deploy the app to a live environment (or simulator)
²FindSecBugs (https://github.com/find-sec-bugs/find-sec-bugs) has better security rules
than FindBugs and is under current active development/maintenance
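To make the loop above concrete, here is a minimal sketch of steps 2 to 5 as a single script. It assumes a scanner CLI that emits JSON findings; the command name, severity field and report folder are illustrative placeholders, not part of any specific toolchain.

```python
import json
import subprocess
from pathlib import Path

REPORTS_DIR = Path("security-reports")   # hypothetical folder tracked in the git repo
MIN_SEVERITY = 7                          # assumed noise threshold (tool-specific)

def run_scanner(target: str) -> list[dict]:
    """Step 2: run 'a' tool after the build; assumes a CLI that emits JSON findings."""
    raw = subprocess.run(
        ["my-scanner", "--format", "json", target],   # placeholder command
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(raw)

def filter_findings(findings: list[dict]) -> list[dict]:
    """Step 4: filter results / customize rules to reduce noise."""
    return [f for f in findings if f.get("severity", 0) >= MIN_SEVERITY]

def store_report(findings: list[dict], name: str) -> None:
    """Step 5: put the report in the git repo (the commit itself is done by the CI job)."""
    REPORTS_DIR.mkdir(exist_ok=True)
    (REPORTS_DIR / f"{name}.json").write_text(json.dumps(findings, indent=2))

if __name__ == "__main__":
    findings = run_scanner("build/libs/app.jar")   # step 3: exactly what needs to be scanned
    store_report(filter_findings(findings), "app")
```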
1.2 Dev-Ops
is changed (on install and over time), by using git diffs and branches.
• versioned
• reviewed
• tested
• released
• rolled back
• and finally retired/deleted
• provide examples
2. Risk Workflow
2.1 Concepts
As you read this book you will notice liberal references to the con-
cept of RISK, especially when I discuss anything that has security
or quality implications.
The reason is I find that RISK is a sufficiently broad concept that
can encompass issues of security or quality in a way that makes
sense.
I know that there are many, more formal definitions of RISK and all
its sub-categories that could be used, but it is most important that
in the real world we keep things simple, and avoid a proliferation
of unnecessary terms.
Fundamentally, my definition of RISK is based on the concept of
‘behaviors’ and ‘side-effects of code’ (whether intentional or not).
The key is to map reality and what is possible.
RISKs can also be used in multi-dimensional analysis, where mul-
tiple paths can be analyzed, each with a specific set of RISKs
(depending on the chosen implementation).
For example, every threat (or RISK) in a threat model needs to have
a corresponding RISK ticket, so that it can be tracked, managed and
reported.
Making it expensive to do dangerous actions
A key concept is that we must make it harder for development teams
to add features that have security, privacy, and quality implications.
On the other hand, we can’t really say ‘No’ to business owners,
since they are responsible for the success of any current project.
Business owners have very legitimate reasons to ask for those
features. Business wishes are driven by (end)user wishes, (possibly
defined by the market research department and documented in a
Not being attacked is both a blessing and a curse. It’s a blessing for
a company that has not gone through the pain of an attack, but a
curse because it is easy to gain a false sense of security.
The real challenge for companies that have NOT been attacked
properly, or publicly, and don’t have an institutional memory of
those incidents, is to be able to quantify these risks in real business,
financial, and reputational terms.
This is why exploits are so important for companies that don’t
have good internal references (or case studies) on why secure
applications matter.
As one staff member or manager accepts risk on behalf of his boss,
and this action is replicated throughout a company, all the way to
the CTO, risks are accumulated. If risks are accepted throughout
the management levels of a company, then responsibility should be
One lesson I have learned is that the mindset and the focus that you
have when you do security reviews are very different than when
you work on normal feature and code analysis.
This is very important because as you accelerate in the DevOps
world, it means that you start to ship code much faster, which in
turn means that code hits production much faster. Logically, this
means that vulnerabilities also hit production much faster.
In the past, almost through inertia, you prevented major vul-
nerabilities from propagating into production and being exposed
to production data. Now, as you accelerate, vulnerabilities, and
¹https://www.youtube.com/watch?v=9IG3zqvUqJY
and you ask those questions across the multiple layers, and across
multiple components. The better the test environment, and the
better the technology you have to support you, the easier this task
becomes. Of course, it becomes harder, if not impossible, when you
don’t have a good test environment, or good technology, because
you don’t have enough visibility.
Ideally, static analysis tools should significantly help the execution of a security analysis task. The problem is, they still don't expose a lot of their internal models, and they don't view themselves as tools to help with this kind of analysis. This is crazy when you think about the assets they hold.
The smaller the issue, the smaller the commit, the smaller the tests
will be. As a result, the whole process will be smoother and easier
to review.
It also creates a great two-way communication system between
Development and AppSec, since it provides a great place for the
team to collaborate.
When a new developer joins the team, they should start with fixing
bugs and creating tests for them. This can be a great way for the
new team member to understand what is going on, and how the
application works.
Fixing these easy tests is also a good preparation for tackling more
complex development tasks.
At the same time the team has constructed the following user stories that should be implemented:

• “as a registered user I want to be able to login with my credentials so I can access my information”

Combined with the outcome of the threat model, the following evil user stories can be constructed:

• “as an unregistered user I should not be able to login without credentials, and access customer information”
• “as an unregistered user I should not be able to identify existing accounts and use them to attempt to login”
• “as a user I should not be able to have unlimited attempts to log in and hijack someone’s account”
• “as an authenticated user I should not be able to access other users’ information”
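One way to make these evil user stories actionable is to turn each one into an automated test. Here is a minimal sketch for the 'unlimited login attempts' story; the endpoint, payload format and lockout behaviour are assumptions for illustration.

```python
import requests

BASE_URL = "https://app.example.com"   # hypothetical application under test

def test_unlimited_login_attempts_are_blocked():
    """Evil user story: 'as a user I should not be able to have unlimited
    attempts to log in and hijack someone's account'."""
    session = requests.Session()
    responses = []
    for _ in range(20):   # simulate a small brute-force run
        responses.append(session.post(
            f"{BASE_URL}/login",
            json={"username": "victim@example.com", "password": "wrong-password"},
            timeout=5,
        ))
    # after repeated failures we expect throttling or a lockout,
    # not just another '401, try again'
    assert any(r.status_code == 429 or "locked" in r.text.lower() for r in responses)
```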
• ‘defect dojo’
• ‘bag of holding’
• ThreadFix
• Other…
Threat Modeling tools could also work well as a bridge between PenTest results and JIRA.
• risks
• to-dos
• non-functional requirements
• re-factoring needs
• post-mortem analysis
That statement implies that if you had good managers, you wouldn’t
have the problem, because good managers would solve the problem.
That is the wrong approach to the statement. Rather, if you had
good managers, you wouldn’t have the problem, because good
managers would ask the right questions before the problem even
developed.
These workflows aren’t designed for when you have a good man-
ager, a manager who pushes testing, who demands good releases,
who demands releases every day, or who demands changes to be
made immediately.
These workflows are designed for bad managers (I use the term
reluctantly). Bad managers are not knowledgeable, or they are
exclusively focused on the short-term benefits of business decisions,
without taking into account the medium-term consequences of the
same decisions. This goes back to the idea of pollution, where the
manager says “Just ship it now, and we will deal with the pollution
later”. With start-ups, sometimes managers will even say, “Push it
out or we won’t have a company”.
The risk workflow, and the whole idea of making people accountable, exists exactly because of these kinds of situations, where poor decisions by management can cause huge problems.
We want to empower developers, the technical guys who are
writing code and have a very good grasp of reality and potential
side effects. They are the ones who should make technical decisions,
because they are the ones who spend their time coding, and they
understand what is going on.
This means that what you create is scalable, and it can be shared
and found easily. It forms part of a workflow that is just about
impossible if you don’t hyperlink your material.
An email is a black box, a dump of data which is a wasted
opportunity because once an email is sent, it is difficult to find the
information again. Yes, it is still possible to create a mess when you
start to link things, connect things, and generate all sorts of data,
but you are playing a better game. You are on a path that makes it
much easier in the medium term for somebody to come in, click on
the link, and make sure it is improved. It is a much better model.
Let others help you.
If you share something with a link, in the future somebody can
proactively find it, connect to it and do something about it. That
is how you scale. Once you send enough links out to people, they
learn where to look for information.
Every time I write something that is more than a couple of para-
graphs long I try to include a link so that my future self, or
somebody else in the future, will be able to find it and propagate
that information without my active involvement.
Putting information in a public hyperlinked position encourages
a culture of sharing. Making information available to a wider
audience, either to the internet or internally in a company, sends
the message that it is okay to share.
Sharing through hyperlinking is a powerful concept because when
you send information to somebody directly it is very unlikely that
you will note on each file whether it is okay to share.
But if you put data on a public, or on an internal easy-to-access,
system, then you send a message to other players that it is okay
to share this information more widely. Sending that link to other
people has a huge impact on how that idea, or that concept, will
propagate across the company and across your environment.
The thing to remember is that you are playing a long game. Your priority is not to get an immediate response; it is to change the pattern and stage the flows.
• Why is it bad?
With these triage tools, AppSec specialists can identify false positives, accepted risks, or theoretical vulnerabilities that are not relevant in the context of the system. This ensures that developers only have to deal with the things that need fixing.
Create JIRA workflow to triage raw issues
Creating a security JIRA workflow in which the raw issues act as a triage queue, and are pushed to team boards after review, would be another good example of the power of JIRA for workflows.
• Open, In Progress
• Awaiting Risk Acceptance, Risk Accepted
• Risk Approved, Risk not Approved, Risk Expired
• Allocated for Fix, Fixing, Test Fix
• Fixed
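As a sketch of how such a workflow could be driven programmatically, the following uses the jira Python package to open a ticket in a hypothetical RISK project and move it through one of the transitions above. The server, credentials, project key, issue type and transition name are assumptions that would need to match your own JIRA configuration.

```python
from jira import JIRA   # pip install jira

# server and credentials are placeholders
jira = JIRA(server="https://your-instance.atlassian.net",
            basic_auth=("bot-user", "api-token"))

# create a raw issue in the (hypothetical) RISK project, pending AppSec triage
issue = jira.create_issue(
    project="RISK",
    issuetype={"name": "Risk"},
    summary="SQL injection in /search endpoint (from scanner XYZ)",
    description="Raw finding imported from the scanner, pending AppSec triage.",
)

# after triage, push it towards the team board using the workflow's transition names
jira.transition_issue(issue, "Allocated for Fix")
```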
A big blind spot in development is the idea that 100% code coverage
is ‘too much’.
100% or 99% code coverage isn’t your summit, your destination.
100% is base camp, the beginning of a journey that will allow you
to do a variety of other tests and analysis.
The logic is that you use code coverage as an analysis tool, and to
understand what a particular application, method, or code path is
doing.
Code coverage allows you to answer code-related questions in
much greater detail.
Let's look at MVC Controllers' code coverage as an example. We must be certain that there is 100% code coverage on all exposed controllers. Usually, there are traditional 'unit tests' for those controllers, which give the impression that we have good coverage of their behavior. However, that level of coverage might not be good enough. You need the browser automation, or network HTTP-based, QA tests to hit 100% of those controllers multiple times, from many different angles and with all sorts of payloads.
You need to know what happens in scenarios when only a couple of
controllers are invoked in a particular sequence. You need to know
how deep into the code we get, and what parts of the application
are ‘touched’ by those requests or workflows.
This means that you don’t need 100% code coverage. Instead, you
need 1000%, or 2000%, or even 5000% coverage. You need a huge
amount of code coverage because ultimately, each method should
be hit more than once, with each code-path invoked with multiple
values/payloads.
In fact, the code coverage should ideally match the use cases, and
every workflow that exists.
This brings us to another interesting question. If you take all client-accepted use cases, invoke and simulate them with tests, meaning that you have matched the expected 'user contract' (everything the user expects to happen when interacting with an application), and the coverage of the application is still not at 100%, what else is there?
Maybe the executed tests only covered the happy paths?
Now let’s add the abuse and security use cases. What is the code
coverage now?
If, instead of 100%, you are now at 70% coverage of a web application
or a backend API, what is the rest of the code doing?
What code cannot currently be invoked using external requests?
In most cases, you will find dead code or even worse, high-risk
vulnerabilities that exist there, silently waiting to be invoked.
This is the key question to answer with tests: “Is there any part of
the code that will not be triggered by external actions?”. We need
to understand where those scenarios are, especially if we want to
fix them. How can we fix something if we don’t even understand
where it is, or we are unable to replicate the abuse cases?
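One practical way to ask that question is to run only the externally-driven suites under a coverage tool and then report what was never reached. A minimal sketch using coverage.py; the suite path and package name are placeholders.

```python
import subprocess

# 1. run the externally-driven suites (browser automation, HTTP/QA tests, abuse cases)
#    under coverage; 'tests/external' and 'my_app' are placeholders
subprocess.run(["coverage", "run", "--source=my_app", "-m", "pytest", "tests/external"],
               check=True)

# 2. report, per file, the lines that no external request ever reached
subprocess.run(["coverage", "report", "--show-missing"], check=True)
```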
enough that they are captured. Moving something into the backlog
in this way, and identifying it as a future task, is a business decision.
However, you cannot use the backlog for non-functional require-
ments, especially the ones that have security implications. You have
to have a separate project to hold and track those tickets, such as a
Jira or a GitHub project.
Security issues or refactoring needs still exist, regardless of whether the product owner wants to pursue them, whether they are a priority, or whether customers are asking for them.
Security and quality issues should either be in a fixed state, or in a
risk acceptance state.
The difference is that quality and security tickets represent reality, whereas the backlog represents the ideas that could be developed in the future. That is why they have very different properties, and why you shouldn't have quality and security tickets in the backlog Pit of Despair ⁶.
Many problems developer teams deal with arise from the inverted
power structure of their working environment. The idea persists
that the person managing the developers is the one who is ulti-
mately in charge, responsible, and accountable.
That idea is wrong, because sometimes the person best-equipped to
make the key technological decisions, and the difficult decisions, is
the developer, who works hands-on, writing and reading the code
to make sure that everything is correct.
You must ensure that you have a very high degree of code coverage right from the beginning, because if you don't, as you add code it is easier to let code coverage slip, and then you realise that large chunks of your code base are not tested (any code that is changed without being tested is just a random change).
Reasons why code coverage slips:
digraph {
    size ="3.4"
    node [shape=underline]

    "Risk" -> "Test (reproduces issue)" [label = "write test"]
    "Test (reproduces issue)" -> "Risk Accepted" [label = "accepts risk"]
    "Test (reproduces issue)" -> "Fixing" [label = "allocated for fix"]
    "Test (reproduces issue)" -> "Regression Tests" [label = "fixed"]
}
When you look at the development world from a security angle, you
learn very quickly that you need to look deeper than a developer
normally does. You need to understand how things occur, how the
black magic works, and how things happen ‘under the hood’. This
makes you a better developer.
Studying in detail allows you to learn in an accelerated way. In a
way, your brain does not learn well when it observes behavior, but
not cause. If you are only dealing with behavior, you don’t learn
why something is happening, or the root causes of certain choices
that were made in the app or the framework.
Security requires and encourages you to look deeper, to find ways
to learn the technology, to understand how the technology works,
and how to test it. I have a very strong testing background because I
spend so much time trying to replicate things, to make things work,
to connect A to B, and to manipulate data between A and B.
What interests me is that when I return to a development environ-
ment, I realize that I have a bigger bag of tricks at my disposal. I
always find that I have greater breadth, and depth, of understanding
of technology than a lot of developers.
I might not be the best developer at some algorithms, but I often
have a better toolkit and a more creative way of solving problems
than more intelligent and more creative developers. The nature of
their work means that their frames of reference are narrower.
Referencing is important in programming because often, once you
know something is possible, you can easily carry out the task. But
when you don’t know if something is possible, you must consider
how you investigate the possibilities, how long your research will
take, and how long it might be before you know if something will
succeed or not. Whereas, if you know something is possible, you
know that it is X hours away and you can see its evolution, or if it
is a lost cause.
• explain test flow from replicating the bug all the way to
regression tests
digraph {
    size ="3.4"
    node [shape=box]
    "Bug" -> "Test Reproduces Bug"
        -> "Root cause analysis"
        -> "Fix"
        -> "Test is now Regression test"
        -> "Release"
}
A good example of why we need tests across the board, not just normal unit tests, but integration tests, and tests that are spread as wide as possible, is the story of an authentication module that was developed as a re-factoring into a separate micro-service.
When the module was developed, it had a high degree of code coverage, in fact it had 100% unit test coverage. The problems arose when it went live, and several issues occurred. One of the original issues occurred because the new system was designed to improve the way the database or the passwords were stored. This meant that once it was fully deployed, some of the existing dependent services stopped working.
For example, one of the web services used by customer service stopped working. Suddenly, they couldn't reset passwords, and the APIs were no longer available. Because end-to-end integration tests weren't carried out when the new system went live, some of the customer service functions failed, which had real business impact.
But the worst-case example occurred when a proxy was used in one of the systems, and the proxy cached the answer from the authentication service. On that website, anybody could log in with any password, because the proxy was caching the 'yes, you are logged in' response.
This illustrates the kind of resilience you want these authentication systems to have. You want a situation where, when you connect to a web service, you don't just get, for example, a 200 which means okay; you should get the equivalent of a cross-site-request-forgery token. Ideally you would get a transaction token in the response received, i.e. "authentication is ok for this XYZ transaction, for this user, and here is a user object with the user details".
(todo: add example of 2fA exploit using Url Injection)
When you make a call to a web service and ask for a decision, and
the only response you get is yes or no, this is quite dangerous. You
should get a lot more information.
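A minimal sketch of the difference, using a response shape invented here for illustration: the first payload is the bare yes/no answer that a cache or proxy can happily replay for everybody, while the second ties the decision to a specific transaction and user, so a replayed copy is useless for a different request.

```python
# dangerous: a bare decision that any cache or proxy can replay for everybody
bare_response = {"authenticated": True}

# better: the decision is scoped to one transaction and one user
scoped_response = {
    "authenticated": True,
    "transaction_id": "XYZ-20161019-0042",   # hypothetical per-request token
    "user": {"id": "u-123", "name": "Jane Doe", "roles": ["customer"]},
    "expires_at": "2016-10-19T10:32:00Z",
}

def accept_auth(response, expected_transaction_id):
    """Caller-side check: only accept a decision issued for this exact transaction."""
    return (response.get("authenticated") is True
            and response.get("transaction_id") == expected_transaction_id)
```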
The fundamental problem here is the lack of proper end-to-end testing. It is a QA and development problem. It is the fact that in this environment you cannot spin up the multiple components. If you want to test the end-to-end user login journey, you should be able to spin up every single system that uses that functionality at any given time (which is what DevOps should also be providing).
If that had been done from the moment the authentication module
was available, these problems would have been identified months
in advance, and several incidents, security problems, and security
vulnerabilities would have been prevented from occurring.
Developers should use the JIRA workflow to get better briefs and
project plans from management. Threat Models are also a key part
of this strategy.
Developers seldom find the time to fulfil the non-functional re-
quirements of their briefs. The JIRA workflow gives developers this
time.
The JIRA workflow can help developers to solve many problems
they have in their own development workflow (and SDL).
Most teams don’t have confidence in their own code, in the code
that they use, in the third parties, or the soup of dependencies that
they have on the application. This is a problem, because the less
confidence you have in your code, the less likely you are to want to
make changes to that code. The more you hesitate to touch it, the
slower your changes, your re-factoring, and your securing of the
code will be.
To address this issue, we need to find ways to measure the confidence of code, in a kind of Code Confidence Index (CCI).
If we could identify the factors that allow us to measure code confidence, we would be able to see which teams are confident in their own or other code, and which teams aren't.
Ultimately, the logic should be that the teams with high levels of Code Confidence are the teams who will be making better software. Their re-factoring is better, and they ship faster.
• Are they able to refactor the code and keep it clean/lean and
focused on the target domain models?
http://www.forbes.com/sites/thomasbrewster/2016/07/13/fiat-chrysler-small-bug-bounty-for-hackers/#58cc01f4606f
Always take advantage of cases when you are not under attack, and
you have some time to address these issues.
Are your logs and dashboards good enough to tell you what is going
on? You should know when new and existing vulnerabilities are
discovered or exploited. However, this is a difficult problem that
requires serious resources and technology.
It is crucial that you can at least detect known risks without diffi-
culty. Being able to detect known risks is one reason to create a suite
of tests that can run against live servers. Not only will those tests
confirm the status of those issues across the multiple environments,
they will provide the NOC (Network Operations Centre) with case
studies of what they should be detecting.
most likely they are brute-forcing the application. You can easily
identify this by monitoring for 404 and 403 errors per IP address
over time.
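A minimal sketch of that kind of monitoring, assuming a common combined-format access log; the log path, regex and threshold are illustrative and would need adjusting to your own environment.

```python
import re
from collections import Counter

# matches the client IP and HTTP status in a common/combined access-log line (assumed format)
LOG_LINE = re.compile(r'^(?P<ip>\S+) .* "\w+ [^"]*" (?P<status>\d{3}) ')

def suspicious_ips(log_path, threshold=100):
    """Count 403/404 responses per source IP; many of them usually means forced browsing or brute force."""
    errors_per_ip = Counter()
    with open(log_path) as log:
        for line in log:
            m = LOG_LINE.match(line)
            if m and m.group("status") in ("403", "404"):
                errors_per_ip[m.group("ip")] += 1
    return [(ip, n) for ip, n in errors_per_ip.most_common() if n >= threshold]

if __name__ == "__main__":
    for ip, count in suspicious_ips("/var/log/nginx/access.log"):
        print(f"{ip} generated {count} 403/404 responses")
```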
Teams should talk to each other using tests. Teams usually have
real problems communicating efficiently with each other. You
might find an email, a half-baked bug-tracking issue opened, a
few paragraphs, a screen shot, and if you are lucky, a recorded
screencast movie.
This is a horrible way to communicate. The information isn’t
reusable, the context isn’t immediately clear, you can’t scale, you
can’t expand on the topic, and you can’t run it automatically to
know if the problem is still there or not. This is a highly inefficient
way to communicate.
The best way for teams to communicate is by using Tests.
Both within and across teams; top-down and bottom-up, from
managers and testers to teams.
Tests should become the lingua franca of companies. Their main
means of communication.
This has tremendous advantages since, in order for it to work, it requires really good test APIs and very good execution environments.
One must have an easy-to-develop, easy-to-install, and easy-to-
access development environment in place, something that very
rarely occurs.
By making tests the principal way teams communicate, you create
an environment that not only rewards good APIs, it rewards good
solutions for testing the application. Another advantage is that you
can measure how many tests have been written across teams and
you can review the quality of the tests.
One of the interesting situations that occurs when we play the risk
acceptance game at large organisations, is how we are able to find
out exactly who is making business and technical decisions.
Initially, when a ‘Risk Accepted’ request is made, what tends to
happen is that the risk is pushed up the management chain, where
each player pushes to their boss to accept risk. After all, what is at
stake is who will take responsibility for that risk, and in some cases,
who might be fired for it.
Eventually there is a moment when a senior director (or even the
CTO) resists accepting the risk and pushes it down. What he is
saying at that moment in time, is that the decision to accept that
particular risk, has to be made by someone else, who has been
delegated (officially or implicitly) that responsibility.
In some cases this can now go all the way down to the actual developer/team doing the coding. Paradoxically, usually the developer didn't realise until that moment that he is the one who is actually deciding how and when to fix it (or not).
Developers often hold a huge amount of power, but they just don’t
know it.
It can also guard against implementing multiple tools with the same purpose.
In this way the total ecosystem can be as lean and secure as possible.
• add explanation
For bugs and tasks, the smaller the bug the better.
Having many small bugs and issues can be an advantage for the
following reasons:
• easier to code
• easier to delegate (between developers)
• easier to outsource
• easier to test
• easier to roll back
• easier to merge into upstream or legacy branches
• easier to deploy
2.6.1 Confluence
• how it works
• security implications
• power examples of queries
• show examples
• how to use them
• common workflow
• using them to map OWASP Top 10 issues
• makes big difference when they are easy to read and under-
stand
• I hide the labels since I find that they are harder to position
and don’t provide that much more information
• add example of two diagrams. Same content: one messy and one clean
– note how the clean one is much easier to read
Naming of states
Other workflows
Add screenshots for: - bug bounty - development - other …
A simple workflow
Sometimes called the ‘Memo from God’, the most famous one is
Bill Gates’ ‘Trustworthy computing’²¹ memo from January 2002
(responsible for making Microsoft turn the corner on AppSec)
cars, etc.
We will create an internal reward system and ensure
there are some good professional perks for finding and
reporting issues.
The word ‘responsibly’ means that if you find a way
to blow up one of our websites, or access confidential
data to which you shouldn’t have access, we expect
you to create a ‘non-destructive’ PoC and use test
accounts, not real customer data. Of course, we know
that accidents happen, and we will use common sense.
Eventually, we will create a public ‘bug bounty’ pro-
gram (after we’ve done a couple of internal rounds), so
if you feel that your app will struggle with a public call
for ‘..please hack XYZ…’, then you better start finding
those issues.
…in conclusion
Times are changing and XYZ is changing. When faced
with a scenario where security will be affected by a
feature, we need to choose security, or clearly under-
stand the trade-offs.
Security is now a board-level issue and if you feel
that any area of our coding/technological world is
not receiving the focus it requires, then your duty is
to escalate it and fight for it. After all, XYZ is your
company too.
These changes represent a great opportunity to make
our technology stack and code even better. I hope that
you are as excited as me to take XYZ to the next level.
• add info about how this is best done using Threat Models and
asking the STRIDE questions
2.7.8 Learning-resources
vulnerabilities so you can learn how they work and you can get
clues if you get stuck. The first thing is to do the exercises and hack
into these applications.
It is very important that security champions get the time, the space,
the mandate, and the information required to do their jobs.
²⁵https://www.owasp.org/index.php/OWASP_Broken_Web_Applications_Project
The good news is, now that you have security champions (at least
one per team), their work will allow you to see the differences
between the teams and the parts of the company who can make
it work, and those who struggle to make it happen.
To participate successfully in the security of their projects, security
champions must execute the following tasks:
• review code
• fix code
• write tests
• know what is going on
• maintain the JIRA tickets
• create Threat Models
• be involved in the security practices of the teams
• budget to sponsor the best one in the last month (or week, if there is a lot of activity)
– Participation in conferences (ticket + travel expenses)
– Books
In the Risk ticket, explain that you tried to persuade the team
to accept the risks of not doing security, and that they are now
responsible for their security, because you cannot help them.
In such cases, the problem lies not with the Security Champion,
but with the company and the organisation, maybe even sometimes
with the team itself. This is why it is important to have success
stories you can point to and say, “Hey! It worked with that team,
and it worked with that team. If it doesn’t work with this team, then
I am not the problem”.
• think of the money that it costs the company to have all those resources in there. Make it count and don't waste participants' time
– the developers are very busy and if the meetings are not relevant or not interesting they will just not turn up
• the central AppSec team (to help kickstart the meetings and to keep the energy level up) must always be looking for topics to present at the next SC meeting. For example:
– Threat Models
– AppSec Questions
– AppSec ideas
– events from the point of view of an attacker
* attacks
* AppSec news
* basically any AppSec-related topic that has not been presented recently
2.7.23 To Add
add references to
2.7.27 BugBounties
Also look at bug bounty programs, which provide a nice list of companies that give you permission to 'hack/attack' them and find security issues.
• Get a mug with lots of white space on the front and back
Microsoft
The Microsoft Agile SDL describes them as Team Champions
In Simplified Implementation of the Microsoft SDL²⁸
and
Mozilla
Mozilla has good pages at Security³⁰ and Security/Champions³¹.
• Contributor
• Security Contributor (Bug Bounty Reporters/Patch submit-
ters)
• Security Mentors
• Security Group
BSIMM
In BSIMM security champions are named 'satellites' and described in section 2.3:

> The satellite begins as a collection of people scattered across the organization who show an above-average level of security interest or skill. Identifying this group is a step towards creating a social network that speeds the adoption of security into software development.
…others?
… add more
For example, I have seen threat models where one will say, “Oh,
we get data from that over there. We trust their system, and they
are supposed to have DOS protections, and they rate limit their
requests”.
However, after doing a threat model of that system, we find that it does not have any DOS protections and, even worse, it doesn't do any data validation/sanitisation. This means that the upstream service (which is 'trusted') is just a glorified proxy, meaning that for all practical purposes, the 'internal' APIs and endpoints are directly connected to the upstream service's callers (which is usually the internet, or other 'glorified proxy' services).
This example illustrates how, when you start chaining threat mod-
els, you can identify data that shouldn’t be trusted, or data that is
controlled by the attacker. Usually the reverse also applies: when you go downstream and check their threat models, you will find that they also trust your data and actions far too much.
Of course, the opposite of this scenario could also be true. One of
the threat models might say, ”…we have a huge number of issues at
this layer”. However, when you look at the layers above, you find
they are doing a good job at validating, authorising and queuing the
requests; they are all working to protect the more vulnerable layer,
so the risk is low overall.
When you chain a number of threat models, you track them,
document them, and you greatly increase your understanding of
the threats. You can use this new knowledge in the future to ensure
that you don’t expose that particular threat into new systems or
new features.
• for each case study, add list of Risks that need to be added
• Examples of Case Studies to add:
– Source code, Keys and passwords stored in Developer’s
laptop
³⁵https://www.microsoft.com/security/data
These questions are important because they are the ones that really
allow us to plan and understand the best way to structure Threat
Models.
are much easier to visualise in a state where you can actually see
the new connections and the impact of the change/feature request
(by the business owner).
Reviewing Threat Models is much easier when only looking at what
changed since the last version.
Existing efforts
Note: OWASP currently has an active Slack channel³⁶ and an
inactive project³⁷ on Threat Modeling
1. Take the threat models per feature, per layer, and confirm that there are no blind spots or variations from the expectation
2. Check the code path to improve the understanding of the
code path and what is happening in the threat model
3. Confirm that there are no extra behaviors
Next, you want to create a full flow of that feature. Look at the entry
point and the assets, and look at what is being used in that feature.
Now, you can map the inputs of the feature; you can map the data
paths given by the data schema, and then you follow the data.
You can see, for example, how the data goes into the application, where it ends up, and who calls whom. This means you have a much tighter brief, and a much better view of the situation.
digraph G {
    size= "3.0"
    node [shape=box]
    Idea -> "Project brief"
        -> "Scheduling"
        -> "Development"
        -> "QA"
        -> "Release"
}
digraph G {
    size= "4.5"
    node [shape=box]
    Idea -> "Project brief"
        -> "Scheduling"
        -> "Development"
        -> "QA"
        -> "Release"

    "Project brief" -> "Threat Model"

    "Threat Model" -> "Option A" -> "Risks"
    "Threat Model" -> "Option B" -> "Risks"
    "Threat Model" -> "Option C" -> "Risks"
    "Risks" -> "To be accepted"
        -> "Scheduling"
    "Risks" -> "To check implementation"
        -> "QA"

    "To check implementation" -> "Pen-test"
        -> "Release"
}
3. Appendix
• Appendix A: Creating Workflow in JIRA
• Appendix B: GitHub book workflow
This section shows how to create the JIRA workflows without using
any JIRA plugins
Key concepts of this workflow
3.1.1 Creating-a-Jira-project
For these examples we will use the hosted JIRA Cloud version, called (in Oct 2016) JIRA Software.
Note that the same workflow can be created on the on-premise versions of JIRA (including the older versions).
If you don't have a JIRA server you can use, then you can create one using the Jira evaluation page¹ and choosing the JIRA Software option. I would also add the Documentation (aka Confluence) module, since it is a very powerful wiki.
¹https://www.atlassian.com/software/jira/try
3) login
7) Your new JIRA Project dashboard should open and look some-
thing like this
9) call it Risk
10) Go to Issue type schemes and click on ‘Add Issue Type Scheme’
11) Call it Risk Scheme and add the Risk Issue type to it (click Save to continue)
• what are all the actions that occur, from making a code
change to having a preview done
• explain two modes (Github online editing and offline editing
using Atom editor)
• how it works
• what it does
• https://github.com/DinisCruz/Book_Jira_Risk_Workflow
– hold content and raw files
– better searching since the manuscript files are not there
– used to create the stand-alone version of the book
• https://github.com/DinisCruz/Book_Jira_Risk_Workflow_Build
– holds files in Leanpub friendly format
– hooks into leanpub via GitHub service
* every commit to this repo will trigger a build
Editing and diagram creation was done in the Atom editor² with the markdown-preview-enhanced³ plugin.
Text was written in markdown⁴.
Diagrams were created using the DOT Language⁵, rendered by Graphviz⁶ and Viz.js⁷.
This is what the IDE looks like:
²https://atom.io/
³https://atom.io/packages/markdown-preview-enhanced
⁴https://leanpub.com/help/manual
⁵http://www.graphviz.org/doc/info/lang.html
⁶http://www.graphviz.org/
⁷https://github.com/mdaines/viz.js
References:
⁸http://www.graphviz.org/doc/info/shapes.html
⁹http://www.graphviz.org/doc/info/attrs.html
¹⁰http://mdaines.github.io/viz.js/
These are a mixed bag of notes made in real notebooks, which need to be normalized, converted into paragraphs, and placed in the right location.
There might be some repeated content which has already been covered in the book.
after this is mature, add a step to deploy the app to a live environ-
ment (or simulator)
These files are a first pass at a particular topic, done as audio files,
recorded on my mobile, and transcribed verbatim.
Take, for example, a vulnerability where a user ID and account ID received from the outside world were passed straight into the back end, and the user was able to access other users' data simply by changing those values.
The root cause of those problems tends to be the fact that the controller is actually able to access and make those calls, so you need to open a risk for that. Then you also need to open another risk for the fact that the account ID is created on the front end from user data.
So there is already an implication of that dangerous operation. And the third risk is the actual step of using it, the one that creates the vulnerability: passing the value directly to the back end, which causes the violation.
Now, when you fix this, a lot of times the fix is done at the controller level, so you add a method there to say, "hey, this user has access to this, this user has access to this account".
The problem is that this is the wrong fix; it is a hack, so you need to create a risk for it. The real fix should be done at the back-end servers: you should pass, for example, the user token to the back end, and then use that to make the decision about whether the user can access that information or not.
So that is a good example of a situation where a fix was made, but it was actually a hack, and you also need to review other cases where the same pattern occurred. So you need to create a new ticket to be accepted, saying "hey, although we solved the symptom, we didn't fix the root cause of the problem", and then add the relevant code references.
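To make the pattern in this transcript concrete, here is a minimal self-contained sketch of the two fixes being contrasted: the controller-level 'hack' that still trusts an attacker-controlled account ID, and the back-end fix that derives the account from the authenticated token. The data structures and function names are invented for illustration.

```python
from dataclasses import dataclass

# toy in-memory data, standing in for the real back end
ACCOUNTS = {"acct-1": {"owner": "user-1", "balance": 100},
            "acct-2": {"owner": "user-2", "balance": 250}}
SESSIONS = {"token-abc": "user-1"}   # user token -> authenticated user id

@dataclass
class Request:
    params: dict
    user_token: str

# the 'hack' fix: the controller trusts an account_id supplied by the front end
# and merely bolts an authorisation check on top of it (easy to forget in the next controller)
def get_account_hack(request: Request):
    account_id = request.params["account_id"]        # attacker-controlled value
    user_id = SESSIONS[request.user_token]
    if ACCOUNTS[account_id]["owner"] != user_id:     # per-controller check
        raise PermissionError("not your account")
    return ACCOUNTS[account_id]

# the real fix: the back end derives the account from the authenticated token,
# so there is no attacker-controlled account_id to pass through in the first place
def get_account_fixed(request: Request):
    user_id = SESSIONS[request.user_token]           # back end makes the decision
    return next(a for a in ACCOUNTS.values() if a["owner"] == user_id)

print(get_account_fixed(Request(params={}, user_token="token-abc")))
```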
How to deal with teams that say they are already secure and don’t
have time for security.
So, every now and then you will find a team that has a lot of power inside the organisation, and is able to push back even at the very senior levels of the company, saying we don't need this security stuff, we don't need these threat models, we don't need all these security tasks and all these activities that you guys are asking us to do.
Assuming that the security work being asked for is reasonable, and is being driven by a team that is actually trying to do the right thing, trying to push good practices, and trying to add value, then the way to deal with these teams is basically to call their bluff.
The way to do that is to play them at their own game and say: well, if you don't do these activities, it means that you had better be producing really secure code; you had better have no security vulnerabilities, no security exploits; you had better fully understand your attack surface; and you had better have no side effects and very clean deployments, all that jazz, because those are exactly the things that go wrong with bad security practices.
So, basically, what we then need to do is document those claims, make them accept all of those as risks, and then wait for Murphy's Law to come along and prove the point.
The other thing that is very important is that you also need to challenge the assumptions. For example, if they have pen tests, make sure they understand that pen tests aren't worth much unless they are full white-box.
The solution is to make them click 'accept' on those risks. What is important to understand is that this isn't really about the moment they accept the risk; it is the moment when they will actually read it.
So this is a very interesting long-term game that you play, where it is all about changing the culture, all about finding ways to create a positive workflow.
These days, more and more, the clients are getting way more savvy. And if they can assess the multiple products that they are consuming, then they will put pressure on, and in a way they will vote with their decisions, which is really what you want to see happening.
And a great way this will occur is as the insurance industry gets involved, which will push much better validation of the issues and bring a huge amount of rationality, data points, and data analysis into what is at risk, how you measure it, how you name it, and how you define it. We need to make sure that those mappings and that information are all public.
And this is something that can easily not happen, but I feel that
because the industry is still immature and is so young, we can
actually point it in the right direction from now on.
Reducing risk to a number: a very interesting idea, given to me by a CISO friend, is to reduce all AppSec and InfoSec activities into one number, which gets presented to the board and can then be used to measure the current risk profile.
In a way the number is a collation of multiple numbers, which are then easy to understand and easy to map. And this is actually very tightly connected to the idea of maturity models, where you measure the maturity of the multiple teams, systems, or products, and then understand better who is doing what.
What I really like about maturity models is that they allow the 'temperature' to be defined in the actual model, which makes it much more objective and gives a much more pragmatic way of looking at the problem and the issues. That is basically a really nice way of controlling the temperature and applying pressure in the right places.
It also tells you where to act, because when you look at the multiple patterns and the multiple activities in the maturity model, you see which activities are or aren't working, are or aren't being done, and that is a great way of analyzing the organization.
In fact, even the individual items of the maturity model sometimes need maturity models of their own, because when you say you have a security champion, the whole security champion world has, in a way, its own maturity model, where in the beginning it is binary: do you have one or not?
Then you start to look at how effective they are, how knowledgeable they are, and how able they actually are to perform their duties. The same applies to code reviews, to patching, to management of dependencies, to threat analysis, and to all sorts of other activities.
All of those are things whose maturity you should measure, eventually rolling up into a higher-level model, which eventually leads to a number.
So it is quite a nice workflow to run across the enterprise, and it also scales really, really well.
As the AppSec world grows up, it is key to make very clear to the business that a typical black-box pen-test, i.e. a test of a website from the outside world with no inner knowledge, is at worst a waste of money and at best just a check of what happens when a script kiddie or a non-specific attacker has a go at your application. That is not how you will find the blind spots.
In fact, the job of the security assessment is to find the blind spots. It is to take everything you know about the application, the threat models, the assumptions, in fact all those ticket items that your security champions have raised as 'this is a problem but we don't have enough evidence', and then find the evidence.
So, in a way, the job of a security assessment is to improve your evidence: improve your evidence of the problem, or improve your evidence of how secure you are.
for the damage then that is okay, then you can argue that you can
actually leave them on because it is actually not a big deal.
An interesting thread that I have been having recently with security champions is that they really need to provide evidence for the tickets they open.
So it is quite nice when you have this workflow with the security champions opening up issues and being proactive about it. In a way, the next step is to start to provide very strong evidence for those issues, and if they can't, that is an issue in itself.
A good exercise, for example for hackathons or get-togethers, is for the security champions to come to the table and say, "hey, I think there is a problem, I think there is an issue, can we prove it?"
Because that is the key of the game, you have to build a proof,
you have to be able to provide evidence for the findings you are
actually opening up. And you need to basically be able to allow the
person who will actually accept the risk or make a decision to really
understand what is going on.
This is why exploits are so critical. You really need to have evidence, and sometimes you need an exploit to really show "this is how it could happen, this is what the problem is". Sometimes you do it in production, sometimes you do it in QA; it depends on the impact.
The key of all this is evidence and proof.
Once you get a high degree of code coverage, a really powerful technique you can start to use is to run specific suites of tests against specific slices of the application, and see what gets covered and what doesn't get covered.
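A minimal sketch of that technique using coverage.py: run two (hypothetical) suites separately, each into its own data file, and then diff the lines they covered. The suite paths are placeholders.

```python
import os
import subprocess
import coverage

def run_suite(suite_path, data_file):
    """Run one (hypothetical) pytest suite and store its coverage in its own data file."""
    env = dict(os.environ, COVERAGE_FILE=data_file)
    subprocess.run(["coverage", "run", "-m", "pytest", suite_path], env=env, check=True)

def covered_lines(data_file):
    """Return the set of (file, line) pairs that a given suite actually executed."""
    data = coverage.CoverageData(basename=data_file)
    data.read()
    return {(f, line) for f in data.measured_files() for line in (data.lines(f) or [])}

run_suite("tests/api", ".cov_api")
run_suite("tests/auth", ".cov_auth")
api_cov, auth_cov = covered_lines(".cov_api"), covered_lines(".cov_auth")

print("code touched only by the auth slice:", sorted(auth_cov - api_cov)[:20])
print("code touched only by the api slice:", sorted(api_cov - auth_cov)[:20])
```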
This is very important, especially for things like APIs, where you are able to understand what is actually reachable from the outside world, and especially for an API that is maintained