
Maturity of source code management – another approach to code quality


Bogusz JELINSKI

Abstract. Source code quality is often viewed only through the prism of static analysis and good programming
practices. But there are many other code characteristics which, if treated carelessly, can entail the loss of the entire code
or an increased number of incidents in the production environment. The purpose was to depict the neglected traits of
source code management (SCM) and to propose a tool which addresses some process vulnerabilities – the encapsulation,
in which the contamination of deployed code is prevented by limiting the input points and the dependency on humans
is minimized. In addition to the tool, a metric has been proposed to measure the extent to which the process is made
hermetic (SCM maturity), preventing code contamination. Finally, statistical data is provided for the code integration
pattern – one of the neglected traits of SCM.

Keywords. Source code, quality, process maturity, software process improvement, SPI, SCM.

1. Introduction
Source code is a vital company asset – it is the core of the digital business. It determines the
behavior of computer systems, yet it is often not visible to the business staff and is not managed
explicitly. When “good code quality” is required, “compliance with good programming
practices” is usually meant. It is overlooked that even perfect code can be outdated, or that licenses
unconsciously used by developers can obligate the code's owner to share it with the whole Internet.
The code can even be completely useless (any further change or maintenance blocked) if there
is no knowledge or tooling to compile it. Missing know-how can effectively limit the
competition between potential suppliers and strengthen the negotiating position of the current
one. One should also know that it is technically feasible to deliver a fully functional IT solution
without having delivered the source code of its compiled part, with all the negative consequences for
future development and maintenance.
In less scary scenarios, the way you use source code affects the efficiency of IT departments
and the user experience. If you construct 90% of the code during integration or user acceptance tests
(see Section 7), either the tests are costly or the production environment is exposed to a serious
threat. All three project dimensions – cost, time, and scope/quality – are adversely affected.
This article is based on the author's experience in an implementation of the source code
management (SCM) process in a large telecommunication company using hundreds of computer
systems. The goal of the effort was to depict both the threats arising from mismanagement of source
code and the ways to avoid the negative consequences of the materialization of those risks. The following
key questions were formulated at the beginning of the research:
1. What configuration items should be managed to protect the source code?
2. What are the aspects of code quality?
3. What are the risks of mismanagement?
4. How should the SCM process be designed in order to address these risks?
5. Can a performance indicator be proposed for the process?
6. Are there any other vital process metrics? For example – how much of the code is
constructed during tests?

As a result of the research a tool has been proposed which addresses some process
vulnerabilities – the encapsulation, in which the contamination of deployed code is prevented by
limiting the code delivery points and the dependency on humans is minimized by enclosing their
knowledge into the process (e.g. compilation and deployment scripts and tools). The improving
performance of the process can be measured using a commonly understood scale – maturity –
which may also be part of a customized software maintainability measure, as ISO 25010 [1]
sees it but does not operationalize it. The term “SCM maturity” was used in a similar context e.g.
by Forrester Consulting [2], and there have been many other attempts to create a software maturity
model [3], not to mention ISO/IEC 33004 [4].
The maturity scale is proposed in Section 6, and two snapshots of its progress measured during a
vast IT transformation are provided. It is worth mentioning that the scale was used not only for
measurement itself but also as an improvement guide – a checklist of good practices, a target to
be achieved by managers. The goal of the SCM transformation was to provide complete and up-to-date
source code for a vendor consolidation programme, so the idea behind it was to concentrate
on asset management, not on agility or efficiency.

2. Elements of source code


Let us make a brief summary of what “source code” actually is. It is made up of files containing
one or more instructions executed in a specified order by a computer. These instructions
determine the functionalities of a system, regardless of the programming language – C/C++, Java,
PL/SQL, PHP or HTML. The source code also includes SQL scripts, e.g. scripts performing a
rollback to the last error-free software version.
When transferred to the purchaser, the source code should also be accompanied by the following
(a simple completeness check is sketched after the list):
• a description of the compilation sequence and all tools necessary to complete the compilation,
• resources not considered source code but required to compile and run the system, e.g.
properties and configuration files, Ant/Maven scripts, and libraries,
• shell scripts used to compile, configure, run, manage, and monitor the system,
• unit tests – numerous albeit small fragments of code whose goal is to verify the correct
operation of the code itself.
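As an illustration only, a minimal sketch of such a completeness check, run before a delivery is
accepted into the acquirer's repository, might look as follows. The required item names (BUILD.md,
build.xml, conf, scripts/deploy.sh, src/test) are purely hypothetical and would have to be adapted to
the actual delivery structure.

```python
from pathlib import Path

# Hypothetical file and directory names – adapt to the actual delivery structure.
REQUIRED_ITEMS = [
    "BUILD.md",           # description of the compilation sequence
    "build.xml",          # Ant build script (or pom.xml for Maven)
    "conf",               # properties and configuration files
    "scripts/deploy.sh",  # shell scripts used to compile/configure/run the system
    "src/test",           # unit tests
]

def missing_items(delivery_root: str) -> list[str]:
    """Return the required accompanying elements absent from the delivered package."""
    base = Path(delivery_root)
    return [item for item in REQUIRED_ITEMS if not (base / item).exists()]

if __name__ == "__main__":
    missing = missing_items("delivery/release-1.0")
    if missing:
        print("Incomplete delivery, missing:", ", ".join(missing))
    else:
        print("All required accompanying elements are present.")
```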

3. How does the quality of code manifest itself?


The table below presents some operational characteristics linked to the quality of code. This is
more of a subjective business approach than an attempt to provide a complete categorization of
software characteristics as proposed in ISO 25010 [1].

Trait – description – consequence if neglected:
• up to date: coherent with the production environment. If neglected: some functionalities are missing.
• complete: containing all the elements indicated in Section 2. If neglected: a blocker for development and maintenance.
• compilable: the software process (dynamic) perspective – the code should compile if complete. If neglected: a blocker like the above; if we do not know how to compile the code or do not have the tools to do it, we are unable to conduct any change in a computer system, and its source code has then only a sentimental value for us.
• fully merged: no change/functionality forgotten for a release (ISO functional suitability/completeness). If neglected: some functionalities are missing when the release goes live.
• well tested: had all expected functionalities been delivered before tests started? What does the code integration pattern [5] look like? If neglected: more issues are likely to appear in production (see Section 7).
• coding standards obeyed: ISO 9126-3 [6] (replaced by ISO 25010) or other good programming practices are conformed with. If neglected: security or performance issues, code hard to maintain.
• open-source licenses obeyed: one has to be sure that the software complies with open-source (BSD, ASL, …) and free software (GPL, LGPL) licenses, as these licenses are unconsciously used by developers. If neglected: the use of the software can be restricted; in the best case you will be forced to reveal your source code.
• low IT debt: a metaphor related to the coding standards that has gained momentum lately – an attempt to estimate how much it would cost to remove from the source code all violations of good practices. The methodology for calculating this measure is still in its infancy. If neglected: the interest on the debt is the cost of bug-fixing.
• eco friendly (green index): yet another metaphor – how greedy for computing power a piece of software is. It might combine static analysis with some run-time characteristics – memory and processor usage. If neglected: more energy is consumed and more hardware has to be purchased.
• code reused: avoid duplication of functionalities within one system (fewer copy-pastes) and across the organization – see ISO/IEC 25010 reusability. It can help reduce the cost of system development. If neglected: software is harder to maintain.
• code covered: what proportion of the code is being tested by tests such as unit tests? If neglected: higher risk of bugs.
Table 1 – Traits of good quality code.

Three traits from the table will be discussed in more detail later on.
Any discourse about code quality should (but very often fails to) start with its completeness (or
integrity, if ISO terminology is to be followed) and with it being up to date. In order to protect these
valuable characteristics, it is proposed to meet a few requirements regardless of the extent to which
the IT services have been outsourced:
• take care of the legal aspects, including the right to independently modify the source code where it
is reasonable,
• store the code in your own repository (revision/version control system, in other words) or
at least in one which is maintained by a trusted third party (escrow),
• oversee your manufacturing process to be sure that your source code repository is
consistent at least with your production environment; you can do so by encapsulation of
the code management process (recommended, see below) or by a periodic comparison of
the repository with the runtime environments,
• automate as many steps of the process as feasible; get rid of the human factor,
• review (inspect) your code manually or automatically and measure other process
characteristics (the code integration pattern – see below).
All source code should be conserved, not only that of the biggest or critical systems. It
happens that the loss of source code of a seemingly irrelevant module, integrated
(sometimes in an undocumented way) with crucial systems, blocks an important change in those
crucial systems. That was exactly the case that gave the author of this article the stimulus to take
up the matter and to spend a few years on it.

4. Encapsulation of the SCM process


Based on the author's intuition rather than on extended research, a technique has
been proposed to ensure the completeness and timeliness of the source code – the self-deployment
of production environments by the acquirer from its own repository, or the carrying
out of this task by a trusted third party, thus preventing any unauthorized, undocumented or simply
unknown changes from going live. If adapted also to test environments and accompanied by some
supervision of the repositories' content, it helps avoid the deployment of untested changes.
Therefore, to avoid “contamination” of changes, the following preventive encapsulation is
proposed:
1. The software supplier should be formally obliged to provide the complete source
code (see the description in Section 2) with unequivocal documentation of the compilation
sequence and a deployment procedure, most preferably a script, e.g. a Jenkins job.
2. After the construction phase the supplier should submit the code to the acquirer's
repository, by hand or by automated replication of the supplier's own repository. The
decision on the project stages at which the code is to be delivered, and on whether the test
environments should also be subject to the regime of this process, depends on the particular
organization and situation.
3. The supplier provides the address (location) of the source code (for Subversion, in the form
URL@REVISION). The location of each version should be stored in a Configuration
Management Data Base (CMDB). This detail is important as there can be hundreds of
branches in the repository, and the withdrawal of changes due to unforeseen, negative
occurrences requires knowledge of the location of the previous version. A good practice is
to appoint a person (a product owner) responsible for keeping the CMDB up to date, even if
the release history is supported by tools.
4. The code submitted by the supplier, whose location is stored, is checked out from the
repository under the supervision of a person who is loyal to the acquirer (if the process is
not automated).
5. The code gets compiled as described in the compilation procedure, under the same
supervision; executables are produced.
6. The interpreted code and the executables are deployed to the runtime environments under
supervision or in an automated way. Let us repeat – the only executable allowed to be
deployed is the one produced in step 5 (the interpreted code checked out in step 4).
7. In further phases, stabilization and maintenance fixes are delivered (as well as new change
requests and new functionalities) – the process iteratively returns to step 2.
Encapsulation does not mean creating a black box here, hiding implementation details. The idea
is to restrict human interference, which would mean adding undocumented knowledge and
complicating the repeatability of the process. A minimal sketch of steps 4–6 is given below.
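The sketch assumes Subversion, a single supplier-provided build script (build.sh) and a deployment
script (deploy.sh); all URLs, file names and revision numbers are invented for the purpose of
illustration.

```python
import subprocess

def run(cmd: list[str]) -> None:
    """Execute a command and fail loudly, so nothing undocumented slips through."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

def encapsulated_release(url_at_revision: str, workdir: str = "build-area") -> None:
    """Check out, build and deploy exactly the code version recorded in the CMDB."""
    url, revision = url_at_revision.split("@")                 # step 3: location taken from the CMDB
    run(["svn", "checkout", "-r", revision, url, workdir])     # step 4: supervised checkout
    run(["sh", f"{workdir}/build.sh"])                         # step 5: build as documented by the supplier
    run(["sh", f"{workdir}/deploy.sh", "production"])          # step 6: deploy only these artifacts

if __name__ == "__main__":
    # URL@REVISION as stored in the CMDB (hypothetical value)
    encapsulated_release("https://svn.example.com/app/branches/release-42@17311")
```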
If you pursue this process at least for production environments, then you gain confidence of
having up-to-date and complete source code. With one crucial exception – this process does not
guarantee the delivery of source code for libraries other than open-source software and
commercial third-party components. We mean libraries with ordered functionalities for which no source
code is provided, deliberately or mistakenly. Human intervention is required, periodically or
every time: the verification of changes in the list of libraries required for compilation.
Unfortunately, the lack of code for a library is sometimes discovered many years
after an implemented change, when the functionality delivered by the library needs to be changed,
often when its author is no longer available or the agreement frees him from liability. Such a library, or a
part of it, then has to be implemented from scratch. As stated before, this is not always due to malicious
intent of the software provider; the provider, as well as its sub-contractors, may simply have competency gaps.
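One simple way to support that periodic verification – assuming, hypothetically, that each release is
delivered with a plain-text list of the libraries required for compilation – is to flag the entries that are
new since the previous release; the file names below are invented.

```python
def read_libraries(path: str) -> set[str]:
    """Read a plain-text list of libraries, ignoring blank lines and comments."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip() and not line.startswith("#")}

# Hypothetical file names – one library list per delivered release.
previous = read_libraries("release-41/libraries.txt")
current = read_libraries("release-42/libraries.txt")

for library in sorted(current - previous):
    print("New library since the last release – verify that its source code or license terms were delivered:", library)
```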

5. An alternative for encapsulation


While studying various SCM process implementations the author noticed two alternatives to the
above-described approach to source code management (as a method of achieving completeness of
the code):
- performing, at certain intervals, a comparison of the code archive with the content of the production
environments (sketched below),
- deploying the production environment from the repository periodically, or at the end of an
outsourcing contract, with extensive regression tests.
These methods make it possible to dispense with keeping excessive human resources on the acquirer's
side.
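For interpreted code, the first alternative can be sketched as a recursive comparison of a fresh
checkout with the deployed tree; the paths below are hypothetical, and for compiled components the
comparison would have to be made on rebuilt artifacts, with the reservations listed below.

```python
import filecmp

def report(cmp: filecmp.dircmp, prefix: str = "") -> None:
    """Print files present only on one side or differing in content."""
    for name in cmp.left_only:
        print(f"only in repository checkout: {prefix}{name}")
    for name in cmp.right_only:
        print(f"only in production:          {prefix}{name}")
    for name in cmp.diff_files:
        print(f"content differs:             {prefix}{name}")
    for name, sub in cmp.subdirs.items():
        report(sub, prefix=f"{prefix}{name}/")

# Hypothetical locations of the checked-out code and the deployed code.
report(filecmp.dircmp("checkout/release-42", "/srv/app/production"))
```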
However:
• such ad-hoc actions must be conducted or supervised by a competent and loyal third
party, usually a highly paid consultant,
• due to the nature of the technology it is not always possible to determine the conformity of
the object code (executables), even if they are formed from the same source code,
• the restoration of the production runtime environment from the repository is generally
not accepted by the business side because of the risk of downtime, and it requires costly
regression testing of the whole system,
• an efficient, repeatable process, subjected to self-optimization, in which people act in a
quasi-automatic, learned way, is usually much less expensive (and often even unnoticed)
than single audits, which require escalations and interfere with the course of other business
processes in the company,
• the techniques and tools postulated in Section 4 do not imply the creation of any new process
or process instance; they only shift already functioning activities towards the acquirer in terms of
supervision, roles and tools. The resulting costs are already incurred
and included in the price – the acquirer pays for them (e.g. for the repository storage for
developers), as the supplier is not a charity organization. So the postulated change is
mainly about the loyalty of supervisors and the ownership and location of tools and procedures.

6. The SCM process maturity


It is suggested that you can't control what you can't measure [7]. For the needs of a
transformation of source code management within a vast collection of systems (197 systems,
some of them counting over one million lines of code) the author developed an assessment model [8] with a
process maturity measure (see Fig. 1) which describes the degree of encapsulation. In this way
it was possible to report the progress of the expected organizational changes transparently (e.g. with
suggestive colors) and numerically, and to report aggregated values per department or functional
area (a small sketch of such aggregation follows the level definitions). The idea was to assign a
number to qualitative attributes describing the achievement of an important milestone, as follows:
• level 0 (black): lack of source code, unknown location of the code,
• level 1 (red): source code in a drawer or on a server, not in a repository under version control,
• level 2 (orange): code under version control, its location known and stored in the CMDB, but it
cannot be compiled – no knowledge, no tools or no hardware. A case where the
knowledge and tools exist but are not used for deployment is also classified as level 2.
Here we have no certainty that the code is coherent with the production environment (e.g.
environments are deployed from the supplier's repository). That is probably the most
common case in businesses with outsourced IT – a semblance of having complete
source code,
• level 3 (yellow): the production environment is deployed from the code owner's repository
(the process is supervised), but the compilers are owned or managed by the supplier or the
compilation is an arduous process – no single build script. In other words, we have much
more trust in the content of the repository, but some knowledge of how to use it might be
missing. This situation might be very risky too – old versions of tools might be hard to
get, and the threats arising from a migration to new compilers might be hard to accept for the
business owner. For interpreted code this level could be the first to be labeled “code
under control”, as levels 4 and 5 are compilation-oriented,
• level 4 (green): the compilation/build of a system might still be conducted “by hand”, but it is
carried out in the business owner's environments (not on the supplier's premises) and a single
build script exists – the build knowledge is stored in the form of a computer
algorithm/sequence, not in someone's head or an ambiguous text description.
The executables produced by this script are deployed to the production environment. In other
words, we have the code, the compilers and most of the knowledge, and we use them, but there is
little automation,
• level 5 (light blue): automated build – the compilation is carried out e.g. by a continuous
integration tool, and there is a front-end with a “build” button and a build history. It is an
important step because it gets rid of the human factor and ensures the repeatability of
the process,
• level 6 (navy blue): the production environment is deployed automatically with no human
intervention; the executable is distributed by a script/robot. That is the last place for a
human to monopolize the knowledge and a vital step towards increased operational
efficiency.
Fig. 1. Levels of maturity of source code management
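A minimal sketch of the aggregated reporting mentioned above could look as follows; the
system-to-level assignments and department names are made up.

```python
from collections import Counter

# Hypothetical assessment results: system -> (department, maturity level 0-6).
assessment = {
    "CRM":      ("Sales",   5),
    "Portal":   ("Sales",   6),
    "Billing":  ("Finance", 3),
    "Legacy-1": ("Finance", 1),
}

# Group the assessed levels by department.
by_department: dict[str, list[int]] = {}
for system, (department, level) in assessment.items():
    by_department.setdefault(department, []).append(level)

# Report the average maturity and the distribution of levels per department.
for department, levels in sorted(by_department.items()):
    average = sum(levels) / len(levels)
    distribution = dict(sorted(Counter(levels).items()))
    print(f"{department}: average maturity {average:.1f}, level distribution {distribution}")
```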

When the transformation started, more than half of the code was not in a repository. After
fifteen months nearly all compiled systems had their own compilation automation; many
compilation scripts were written from scratch, as the suppliers did not want to provide them or there
was no contractual relationship with any supplier at that time. About one third of the systems (71 of
the total 197) stayed at level 3 as they had no compiled components.

Fig. 2. The progress of code maturity within an SCM transformation carried out by the author
– the share of maturity levels within 197 systems.

7. The code integration pattern – have the changes been well tested?
There is a trait of code that requires special attention and which characterizes both the trust in the
code and the quality of the whole SCM process. The following questions may be raised during a
software project:
- were all the changes planned for the next release ready to be merged to the integration branch
before the tests started?
- in other words, did the changes have the opportunity to be tested?
- were all the functionalities expected by the business owner developed? At least developed before the
code affected a test case?
This might resemble the “fully merged” trait (see Table 1), but it is not about forgetting to conduct
a merge; it is about intentional behavior – a decision to start integration or acceptance tests
without having closed the development. Unfortunately, the practice of constructing source code
during tests is very common and may be caused by poor requirements management or an inadequate
volume of human resources (developers, testers). This is a pathological phenomenon – a disease
which should be fought, as it ruins quality. When there are fixed release deadlines, it implies
running untested functionalities. The source code repository lets you easily, quickly and cheaply
measure the percentage of source code delivered in each project phase. If you use Subversion,
then 'svn diff' does the trick (see the sketch below). If a reasonable threshold is exceeded, you can analyze the causes and
risks. This is feasible provided that the test environments are also made hermetic (encapsulated;
see Section 4), not only the production ones. This metric is like a clinical thermometer – it
aggregates all the pathologies which pollute today's software processes.
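A minimal sketch of such a measurement, assuming Subversion and that the revisions opening
development, opening the test phase and closing the release are known (e.g. recorded in the CMDB);
the branch URL and revision numbers below are invented.

```python
import subprocess

def changed_lines(url: str, rev_from: int, rev_to: int) -> int:
    """Count lines added or removed between two revisions, as reported by 'svn diff'."""
    diff = subprocess.run(
        ["svn", "diff", "-r", f"{rev_from}:{rev_to}", url],
        check=True, capture_output=True, text=True,
    ).stdout
    return sum(
        1 for line in diff.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )

# Hypothetical branch and phase boundaries taken from the CMDB.
BRANCH = "https://svn.example.com/app/branches/release-42"
DEV_START, TEST_START, RELEASE = 17000, 17200, 17311

total = changed_lines(BRANCH, DEV_START, RELEASE)
during_tests = changed_lines(BRANCH, TEST_START, RELEASE)
print(f"share of code constructed during tests: {100 * during_tests / total:.1f}%")
```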
The concept of paying attention to the extent of code being constructed during particular project
phases is not new. The code integration pattern was mentioned by Stephen Kan [5] as a “simple
and useful project management and quality management tool” (see also [9]).
The author measured the percentage of code constructed during integration and user
acceptance tests in eighty-one projects, which modified five systems. The values (see Table 2)
are underestimated by at least ten percentage points, as in most cases there was a delay in the code
delivery to the repository or the indicated code revisions were late. On average 43% of the code was
constructed during tests.

System     Number of projects     Average [%]
System1    5                      56.3
System2    12                     40.5
System3    14                     33.9
System4    10                     50.7
System5    40                     43.8
Table 2. Average percentage of code constructed during tests

Moreover, the number of projects with a value exceeding 80% was nearly the same as the number of
those below 20%.

Fig. 3. Frequency of particular code integration patterns (the percentage of code constructed
during tests)

This data is presented to visualize the scale of the problem within modern companies which
aspire to become “digital”.
8. Some other interesting topics
Many other interesting aspects related to the management of source code, which are a source of
potential risks and opportunities, have been omitted here, for example:
• the organization of work in the repository, including issues as meticulous as the permitted
directions of promoting bug fixes between project branches, the regime of code merges, and the naming
of branches,
• the measurement of size changes in a software system (backfired function points),
• the monitoring of code delivery and its impact on work efficiency (the Hawthorne effect [10]),
• which programming language to choose to reduce lock-in to suppliers or to meet expected
performance goals.

Summary
There are many code characteristics which, if treated carelessly, can entail the loss of the entire code
or an increased number of incidents in the production environment. Most of the risks described
above are due to innately imperfect human nature. Software suppliers and employees will always
want to gain a better negotiating position by monopolizing know-how wherever possible. By
automating tasks we can make the software process repeatable and self-documenting, and the
negative consequences of losing an employee are mitigated. Instead of a vague Word document,
XML files are used (e.g. Jenkins job definitions). Such automation is often already used by software
suppliers, mainly to oversee the work of subcontractors and their employees – no additional effort
to be claimed and reimbursed!
Therefore, code management is not just a matter of engineering – it helps maintain
competition in the supply of software and exerts pressure on suppliers' bids, and thus it assists in
building competitive advantage.

References
1. ISO/IEC CD 25010, Software Engineering: Software Product Quality Requirements and
Evaluation (SQuaRE) Quality Model and Guide (2008)
2. Continuous Delivery: A Maturity Assessment Model. Forrester Consulting (2013),
https://info.thoughtworks.com/Continuous-Delivery-Maturity-Model.html [16 March 2016]
3. Schweigert T., Vohwinkel D., Korsaa M., Nevalainen R., Biro M.: Agile maturity model:
analysing agile maturity characteristics from the SPICE perspective. Journal of Software:
Evolution and Process. DOI: 10.1002/smr
4. ISO/IEC 33004 2nd Edition, Information Technology — Process Assessment — Requirements
for Process Reference, Process Assessment and Maturity Models (2015)
5. Kan, S. H.: Metrics and Models in Software Quality Engineering, p. 242, Addison-Wesley,
Boston (2004)
6. ISO/IEC TR 9126-3, Software Engineering – Product Quality – Part 3: Internal Metrics (2003)
7. DeMarco, T.: Controlling Software Projects: Management, Measurement and Estimates, p. 3,
Prentice Hall, Upper Saddle River (1982)
8. Wagner, S.: Software Product Quality Control, p. 36, Springer-Verlag, Berlin (2013)
9. Laird, L.M., Brennan, M.C.: Software Measurement and Estimation: A Practical Approach,
p. 186, IEEE Computer Society Press, Los Alamitos (2006)
10. Landsberger, H. A.: Hawthorne Revisited: Management and the Worker, Its Critics, and
Developments in Human Relations in Industry. Cornell University, Ithaca (1958)
