
Cognitive skills of experienced software developer: Delphi study

2004


Kolin Kolistelut - Koli Calling 2004, Paper R/05

S. Surakka and L. Malmi
Helsinki University of Technology, Laboratory of Information Processing Science, P. O. Box 5400, FIN-02105 HUT, Finland

Abstract

In this paper a qualitative study of the cognitive skills of experienced software developers is presented. The data for the study was gathered using the Delphi method. The respondents were 11 software developers who had worked for at least five years after their graduation. The respondents were found using recommendations, since the goal was to find especially good software developers. Thus, they are not a statistically representative sample of all software developers but more like a focus group. Two questionnaire rounds were conducted. In the first round, the respondents mentioned 32 different skills altogether. In the second round, 10 of the respondents answered and evaluated the importance of these 32 skills. The results are divided into two categories: composition and comprehension. For each skill, the evaluated degree of difficulty of the skill is presented (e.g., does the skill efficiently differentiate experts from novices).

1 Introduction

What are cognitive skills? According to ERIC Thesaurus (2004), the term ‘thinking skills’ should be used instead of the term ‘cognitive skills.’ The description of the term ‘thinking skills’ is the following:

Interrelated, generally “higher-order” cognitive skills that enable human beings to comprehend experiences and information, apply knowledge, express complex concepts, make decisions, criticize and revise unsuitable constructs, and solve problems – used frequently for a cognitive approach to learning that views explicit “thinking skills” at the teachable level.

In this study the goal has been to identify cognitive skills that are important for expert software developers’ work.

Our research originates from the need to better understand what kinds of topics and skills should be included in the Master’s-level education of software systems specialists at the Helsinki University of Technology. Typical sources for such curriculum development work include various model curricula such as Computing Curricula 2001 (Engel and Roberts, 2001). However, they mostly concentrate on listing topics to be covered in the curriculum. The skills to be achieved during the education are covered more vaguely. Since programming is a high-level cognitive skill, we wanted to find out in more detail what kinds of cognitive skills should be trained in the education.

We decided to search for high-level software development experts and ask them which topics in computer science they consider important for their work. Moreover, we were interested in identifying the tacit knowledge needed in software development. Since such information is difficult to capture with simple questionnaires, we decided to apply the Delphi method (Wilhelm, 2001), in which people in the same focus group are queried two or more times. After each round a summary of the results is presented to them, followed by more closely defined questions on the topic of interest.

Delphi is a qualitative research method, where the quality rather than the number of respondents is the more important factor. Statistical reliability of the results is therefore not the general goal, and thus the number of respondents need not be very large. In this study we selected, based on some general quality criteria, 11 respondents from a group of 59 recommended experts. Two questionnaire rounds were performed, and the second round concentrated especially on the tacit knowledge of software development. In this paper we concentrate on the results of the second questionnaire round.

The structure of the paper is the following. First, we consider some related work in Section 2.
In Section 3 we describe the research method in some detail. The results are presented and analyzed in Section 4. A discussion including some implications to education and an evaluation of this research concludes the paper.

2 Related work

We did not find any research papers where the Delphi method has been used in the field of psychology of programming. This is understandable because it is not common to use even questionnaires as a research instrument in this field.¹ Because of the lack of similar research, some more general references are presented next. At the end of this section it is explained how these issues relate to our research.

Greeno and Simon (1988) wrote ‘Computer programming may be characterized “as a whole” as a design task.’ Brooks (1983) wrote about design task domains:

. . . , two fundamental activities in design task domains are composition and comprehension. Composition is the development of a design and comprehension results in an understanding of a design. The essence of the composition task in programming is to map a description of what the program is to accomplish, in the language of real-world problem domains, into a detailed list of instructions to the computer designating exactly how to accomplish those goals in the programming language domain. Comprehension of a program may be viewed as the reverse series of transformations from how to what.

Stanislaw et al. (1994) divided expertise in computer programming into two components: time-based expertise and multiskilling expertise. They wrote (p. 351): ‘Time-based expertise corresponds to the conventional notion of expertise, and is a function solely of the time spent on programming. Multiskilling expertise, by contrast, accrues through exposure to a variety of programming languages and tasks, and is related to the cognitive development of higher-level programming schemata.’

Detienne (2002, p. 35) wrote that one of the characteristics that distinguishes ‘super experts’ or ‘exceptional designers’ from other experts is: ‘a broader rather than longer experience: the number of projects in which they have been involved, the number and variety of the programming languages they know.’ In addition, Detienne (2002, p. 35) wrote that experts carry out some aspects of the programming task completely automatically. She referred to Wiedenbeck (1985, p. 383), who found that experts were faster and made fewer mistakes than novices when both groups had to make a series of timed true/false decisions about short, textbook-type program segments. One might assume that, for example, the following skills are gradually automated as programming experience increases: (a) using the basic commands of an editor (such as Emacs) and of the programming system frequently used, and (b) knowing the details of the syntax and code conventions of a certain programming language such as C.

The previous issues relate to this study as follows: (a) We have used the two activities, composition and comprehension, to interpret and divide our results. (b) The division into time-based expertise vs. multiskilling expertise was used so that we required that at least half of the respondents should be characterized as multi-skilled experts. (c) The concept of skill automation was used in the questions about cognitive skills: the first question concerned higher-level skills and the second question concerned skills that might be partially or totally automated.

¹ We found only seven articles where a questionnaire has been used, for example (Capretz, 2003). However, none of these articles is really related to our study apart from the use of questionnaires.

3 Method

An overview of the Delphi method can be found, for example, in (Wilhelm, 2001).
The method was originally used to forecast the future; the name originates from ‘the oracle of Delphi’, where Delphi refers to an ancient Greek sanctuary on the mainland, on the slopes of Mount Parnassus. However, in this study, estimating the future was only a small part. Some basic properties of the method are the following. First, there are several questionnaire rounds. Second, the results from the previous round are used as material for the next round. Thus the respondents may change or tune their previous answers.

One of the main reasons for using Delphi was that it allows group communication without gathering all respondents in the same place at the same time, which in this case would have been very difficult to achieve. Moreover, in this way the respondents had more time to consider their answers and make their views more explicit.

Originally, consensus building has been an important part of the Delphi method. In this research, however, the second questionnaire round was not used for building consensus on the whole issue but was targeted at refining the results of one interesting part of the first questionnaire, that is, cognitive skills. The first questionnaire had three open questions about the cognitive skills required by a software specialist. Based on the answers, in total 36 different skills were identified. In the second round the respondents defined the level of these skills, that is, how much learning and experience is needed before such a skill is mastered. The questionnaires are presented in more detail in Section 3.2.

The decision to limit the second questionnaire to only one area of interest was based on several reasons: (a) The results from the other areas of the first questionnaire were satisfactory enough. Thus, the need to conduct a second questionnaire round for the sake of the other areas was low. (b) The respondents thought that the questions about cognitive skills were the most difficult to answer. We interpreted this as a hint to explore this area further. (c) Regardless of the answering difficulties, some respondents regarded cognitive skills as an interesting or promising area for this kind of study. This was our own opinion as well. (d) At the beginning of the study we promised the respondents that participating would take 1-3 hours, and we wished not to break this promise.

After cognitive skills were chosen as the topic for the second questionnaire round, the goal was set to evaluate how demanding or difficult the different cognitive skills mentioned during the first round are.

3.1 Finding respondents

The goal was to find 10-20 especially good software developers. The respondents were found using recommendations. Thus, they are not a statistically representative sample of all software developers but more like a focus group. Probabilistic sampling was not used because it was difficult to identify the target group using properties such as age, education, and title. For example, the title and working years are not enough to separate especially good software developers from poor or intermediate developers. Our decision thus fits well with the guidelines presented by Kitchenham and Pfleeger (2002, p. 19): ‘Nevertheless, there are three reasons for using non-probability samples: 1. The target population is hard to identify. For example, if we want to survey software hackers, they may be difficult to find. . . ’

The minimum criteria were a degree, five years of working experience after graduation, at least half of the working time spent on programming during these five years, and at least 100,000 lines of self-implemented code. In addition, at least half of the respondents should have versatile software development experience. Here, versatile means different kinds of projects, for example various programming languages and application domains.
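
As an illustration only, the minimum criteria above can be written as a simple predicate. The sketch below is not part of the original study: the class name, the field names, and the example candidate are hypothetical, and it assumes that each criterion can be reduced to a single number or flag.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record of one recommended developer."""
    has_degree: bool
    years_since_graduation: float
    programming_share: float   # fraction of working time spent programming (0.0-1.0)
    lines_of_code: int         # self-implemented lines of code
    versatile: bool            # varied languages and application domains


def meets_minimum_criteria(c: Candidate) -> bool:
    """Check the four individual minimum criteria stated in Section 3.1."""
    return (
        c.has_degree
        and c.years_since_graduation >= 5
        and c.programming_share >= 0.5
        and c.lines_of_code >= 100_000
    )


def group_is_versatile_enough(group: list[Candidate]) -> bool:
    """Group-level criterion: at least half of the respondents are versatile."""
    versatile_count = sum(1 for c in group if c.versatile)
    return 2 * versatile_count >= len(group)


# Made-up candidate, for illustration only.
example = Candidate(True, 8, 0.6, 150_000, True)
print(meets_minimum_criteria(example))
```

Note that the versatility requirement applies to the group as a whole, not to each individual, which is why it is kept as a separate function.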

Two extra criteria were that (a) a maximum of three respondents could be included from the same organization and (b) only one respondent could work full-time at the Helsinki University of Technology, where the authors themselves work. The degree could be from programs other than computer science and engineering. For example, some older respondents had their degree in electrical engineering. The title of the respondent did not need to be programmer, software developer or software engineer, since the important issue was only that their work included enough programming.

Altogether, 59 persons were recommended. Of these, 40 were not asked, for several different reasons (e.g., the person had graduated less than five years ago). Thus, 19 persons were asked to participate, starting from those who had the most recommendations. Of these 19 persons, 11 promised to participate. The criteria of at least 100,000 lines of self-implemented code and enough programming experience during the last five years were checked when the person was asked to take part. Some candidates declined because of these two conditions. The criterion that at least half of the respondents should have versatile software development experience was checked with the first questionnaire. No respondents were excluded because of this criterion.

3.2 Questionnaire rounds

Two questionnaire rounds were conducted. The first questionnaire was answered between November 2003 and January 2004, the second questionnaire between January and February 2004. During the first round, most respondents answered so that they were able to ask questions of the researcher (one of us), who was present while they answered. The researcher was not present during answering in the second round. The mean answering time was one hour and six minutes for the first round and 54 minutes for the second round. The original questionnaires are available in Finnish only at (Surakka, 2004).
However, their main properties are presented in the following two subsections.

3.2.1 First questionnaire

The first questionnaire had 14 open questions and 14 multiple-choice questions. The topics were (a) background information about the respondent, (b) the importance of various subjects and skills for software development, such as discrete mathematics and concurrent programming, (c) cognitive skills, (d) problem solving techniques, and (e) software quality. For brevity, only results about the background information and cognitive skills are presented in this article.

The questions about background information were title, proportion of time used for programming, number of employees under the respondent, lines of code implemented by the respondent, number of different groups involved, number of different projects, personal skills in various subjects (42 subitems such as discrete mathematics and object-oriented programming), skills in various programming languages and knowledge of various operating systems.

Instead of cognitive skills, the term ‘tacit knowledge’ was used because we assumed that it would be easier for the respondents to understand. An explanation of the concept, including an initial division into cognitive skills and technical skills, was given before the questions. The three questions were:

• For a top-level software developer, what are important mental models, beliefs and understanding that belong to the cognitive element of tacit knowledge?
• For a top-level software developer, what topics or skills belong to the technical element of tacit knowledge? These can also be called skills that are located in the fingertips.
• Do you believe that some area of tacit knowledge will be more important in the future?

3.2.2 Second questionnaire

The second questionnaire was based on the respondents’ answers and comments to the first questionnaire. These were analyzed to identify and separate the different skills mentioned in the comments.
Comments clearly denoting the same skill were joined. Typing skill was included in the list, based on the researcher’s observations, even though the respondents did not mention it. Finally we had a list of 36 comments, each identifying at least one skill, for the next round. In the second questionnaire, the respondents had to evaluate the level of these comments according to the following categories:

1. Very low-level skill that even novices can learn quickly (during a 1-4 credits basic course)
2. Somewhat low-level skill that requires working experience of 3-6 months to be learned, for example
3. Somewhat high-level skill that starts to differentiate good programmers from less good programmers
4. Very high-level skill that usually takes several years to learn and that typically only top-level programmers have.

The second questionnaire also had questions about problem solving techniques, use of editor, and typing skills. For brevity, these results are not reported in this article.

4 Results

First, some background information about the respondents is presented. Second, the results about the respondents’ opinions on cognitive skills are presented.

4.1 Background information of respondents

All respondents were male and the mean of the respondents’ ages was 37.1 years. Their degrees were as follows: one college degree in computer science and engineering (9%), five master’s degrees in computer science and engineering (45%), three master’s degrees in other engineering disciplines (27%), one doctorate in applied mathematics (9%) and one doctorate in computer science and engineering (9%). The respondents’ positions were distributed into the following groups: senior software engineers and developers 45%, researchers 27%, and managers or directors 27%.
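
The percentages of the degree distribution follow directly from the counts given above and the sample size of 11. The following minimal sketch recomputes them; the dictionary keys and the function name are our own labels, and rounding to whole percentages is an assumption about how the figures were produced.

```python
# Degree counts as stated in Section 4.1 (11 respondents in total).
degree_counts = {
    "college, computer science and engineering": 1,
    "master, computer science and engineering": 5,
    "master, other engineering disciplines": 3,
    "doctor, applied mathematics": 1,
    "doctor, computer science and engineering": 1,
}


def percentage_shares(counts: dict[str, int]) -> dict[str, int]:
    """Return each count as a whole-number percentage of the total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


shares = percentage_shares(degree_counts)
for label, share in shares.items():
    print(f"{label}: {share}%")
```

Because each share is rounded separately, the rounded percentages need not add up to exactly 100.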

Each respondent was asked to give himself a grade in 42 subjects or skills related to various fields of computer science, other sciences (mathematics, physics), and the software development phases of the waterfall model. Table 1 shows the ten subjects or skills that the respondents evaluated they knew best on average.

There are two issues that are worth noticing. First, script programming skills are ranked very high. This obviously correlates with the heavy use of the Unix/Linux environment in their work. We did not ask more questions about scripting in the second round. However, our interpretation of this phenomenon is that for this target group scripting is a regular method for solving simple computational problems, for example, filtering and manipulating data files, or building auxiliary tools for them. This is strongly related to the important cognitive skills of recognizing the need for building new tools and choosing a suitable tool for each purpose.

The second observation is that functional programming is ranked much higher than the general use of functional programming languages in software production would indicate. We believe that this is related to multi-skilling. A plausible explanation is that many of the respondents have used functional programming during their career and/or in hobby programming. Based on answers to the open question about working experience, at least four (36%) respondents had actually used Lisp in some work project.²

² Nine (82%) respondents have graduated from the Helsinki University of Technology, where Scheme was the language of the first compulsory programming course in the degree program of computer science and engineering (CSE) during 1989-2003. However, this is not a suitable explanation because all these nine respondents were admitted before 1989 or were from degree programs other than CSE. That is, the course in question was not compulsory for them.

Table 1: Respondents’ top strengths according to the question ‘Give yourself a grade in the following subjects or skills’ (scale: 1 poor . . . 4 excellent).

Rank  Subject or skill                        Mean
1     Implementation                          3.8
1     Procedural programming                  3.8
3     Data structures and algorithms          3.5
3     Script programming                      3.5
5     Design                                  3.4
5     Object-oriented programming             3.4
7     Operating systems                       3.1
7     Testing                                 3.1
7     Version and configuration management    3.1
10    Functional programming                  3.0

4.2 Respondents’ opinions about cognitive skills

In the second questionnaire, the statements of skills were divided according to the division used in the first questionnaire. However, for this article we reclassified the results into two categories: composition and comprehension. We also combined some comments. Two comments are not presented in the tables because they are not related only to software development. These two comments and their means were ‘Being systematic’ 2.1 and ‘Ability to type using ten fingers’ 2.1. Thus, the tables contain fewer comments than the second questionnaire did.

First, the results related to composition are presented in Table 2, and a few observations about them are commented on below. The comments are ordered according to the means. The numbers in the leftmost column are used for referring to the items.

Even though statistical analysis was not our main purpose, we were curious to see whether the observed differences are significant or not. We used the Mann-Whitney test (Conover, 1999, pp. 271-275) for the analysis because this nonparametric test is suitable for small samples. Note that the test compares the ranks, not the means. However, for brevity we present the test results in the same column as the means. The ranks of single items were compared to the ranks of all items. A star (*) indicates that the difference is statistically significant (p<0.01). If the star is missing, the difference is not statistically significant.
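
To make the comparison described above concrete, the following sketch computes a Mann-Whitney U statistic for the ratings of one item against a pool of ratings of all items. It is a hedged illustration, not the computation used in the study: it assumes average ranks for ties and a two-sided p-value from the tie-corrected normal approximation without continuity correction, it leaves open whether the pool includes the item itself, and the ratings shown are made up for the example.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of equal values.
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u1, p_two_sided


# Hypothetical ratings on the 1-4 scale; NOT data from the study.
item_ratings = [4, 4, 3, 4, 4, 4, 3, 4, 4, 4]
pool_ratings = [3, 2, 4, 3, 3, 2, 1, 3, 4, 2, 3, 3, 2, 4, 3, 1, 2, 3, 3, 2]
u, p = mann_whitney_u(item_ratings, pool_ratings)
print(f"U = {u:.1f}, p = {p:.4f}, significant at 0.01: {p < 0.01}")
```

With ratings restricted to four levels, ties are unavoidable, which is why the tie correction in the variance matters here.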

Table 2: Comments classified into category ‘Composition’: Means to question ‘What do you think is the level of this skill?’ Scale was: 1 very low-level skill . . . 4 very high-level skill.

Number  Mean  Comment
1       3.6*  A good programmer has always a model. The code itself comes from spine and brains operate only the model.
2a      3.5*  Automating one’s own work using scripts, keyboard macros etc.
2b      3.5*  Mastery of a certain programming language or a certain environment
4       3.4   Writing code so well that it is not even necessary to comment
5       3.3   Design of interfaces
6       3.1   Choosing as optimal data structures and algorithms as possible
7a      3.0   Ability to find right abstractions
7b      3.0   Mastery of the structures and idioms that are characteristic for each language or environment
9       2.9   Ability to write code clearly and shortly
10a     2.8   Choice of the programming language
10b     2.8   Implementing programs as independent from the operating environment as possible
12      2.7   Isolating the implementation behind well-defined (and documented) interfaces
13      2.6   Changing lower level cognitive models/design patterns to code. For example, table field in C/C++ object and its memory management get/set/constr/destr.
14      2.4   Identifying concepts
15a     2.3   Ability to find existing Open Source solutions from Net and being familiar with libraries
15b     2.3   Procedural or object-oriented way of thinking about programming
17      1.9*  Documenting code

A star (*) indicates that the difference is statistically significant (p<0.01).

First, the high mean of item ‘2a Automating one’s own work using scripts, keyboard macros etc.’ obviously does not indicate the time needed to learn such skills. Instead, it indicates the time needed to learn to use them efficiently as one’s personal tools, when necessary. Our assumption is that this skill is analogous to bottom-up software design, where the programmer recognizes the need for general-purpose procedures and data structures. Thus, it has a role in differentiating excellent developers from others.

Second, the items ‘Design of interfaces’ and ‘Isolating the implementation behind well-defined (and documented) interfaces’ are kept separate. The first one is more associated with designing interfaces and the latter one with using them. It is obviously easier to learn to use ready-made interfaces properly than to actually design interfaces that support good software architecture.

Third, comments 2b and 7b are similar, but we think that 2b is broader than 7b. Comment 2b also includes low-level knowledge, for example knowing a language’s keywords by heart.

Fourth, we think that the low-ranked items 15a and 17 are not really cognitive skills, but other kinds of skills or knowledge. However, we have not omitted these items from the table because they are related to composition.

In Table 3 we present the results related to the category ‘comprehension’. As a general note, it is interesting that the respondents have often used words like ‘see’ and ‘notice’ to describe these skills. We think that item ‘13 Understanding the functioning of programming languages and computer (e.g., parameter passing, order of execution, and concurrency)’ is explicit rather than tacit knowledge.

Table 3: Comments classified into category ‘Comprehension’: Means to question ‘What do you think is the level of this skill?’ Scale was: 1 very low-level skill . . . 4 very high-level skill.

Number  Mean  Comment
1       3.9*  Ability to see all possible alternatives from the source code (this comment was related to debugging)
2       3.6   Ability to notice isomorphisms with some known problem
3       3.5   Ability to evaluate how the system will operate even before its implementation has been started
4a      3.4   Ability to see esthetic values in solutions
4c      3.4   Ability to see the big picture. What is the core of the problem and how it is connected to the environment around it?
6a      3.2   Ability to distinguish essential matters
6b      3.2   Interpreting the program as whole
8a      3.1   Ability to change fluently abstraction level (e.g., single line of code vs. procedure or big picture vs. details), perspective (e.g., is the control flow or the data flow of the program examined), concepts (e.g., are the concepts of program or the concepts of application domain considered) and view (e.g., users’ needs vs. maintenance vs. development speed)
8b      3.1   Ability to debug
10      3.0   Ability to see symmetries
11      2.9   Exploring the architecture of the existing systems
12      2.7   Ability to see a big problem as several partial problems
13      1.8*  Understanding the functioning of programming languages and computer (e.g., parameter passing, order of execution, and concurrency)

A star (*) indicates that the difference is statistically significant (p<0.01).

5 Discussion

In this section conclusions are drawn, implications to education are presented, and the research is evaluated.

5.1 Conclusions and implications to education

The skills listed can be divided into two main categories: skills associated with composition and skills associated with comprehension. The composition category obviously includes skills that are related to the mastery of the programming languages and environments used. Other important skills are associated with, for example, having an inherent model of the goal in one’s mind, designing interfaces and abstractions, and mastering and developing one’s own working process.

The comprehension category includes skills such as understanding the program as a whole, the ability to notice isomorphisms with other known problems, and the ability to fluently change one’s view of the code in various respects. On a general level, the results confirm that different comprehension-related tasks are an important part of a software developer’s cognitive skills.
Approximately 40% of the items mentioned by the respondents can be classified as comprehension-related tasks. Obviously, this is not at all a surprising result because, according to the definition presented at the very beginning of this article, cognitive skills enable human beings to comprehend information.

It is obvious that many of the skills listed above cannot be taught directly in courses. They are closely related to the long experience gathered when programming solutions to different problems. The challenge for education is to design project assignments where students will face problems in which the mentioned skills are useful, and to present guidelines for adopting such skills. On a more general level, we assume that the deployment of the results of this research might increase the proportion of time spent on the concept exploration, requirements analysis, and design phases but decrease the proportion of time spent on the implementation phase. For brevity, we mention only two course examples of such development.

The first example would be an advanced course that emphasizes comprehension. A possible course title could be ‘Refactoring.’ During a refactoring course, a student should repair and/or partly rewrite a program (maybe 2000-3000 lines) that contains different kinds of mistakes and bad planning choices. During the task, a student has to read and thus comprehend a program written by others. Moreover, he/she should argue about the findings made and how the code should be improved.

Second, from the composition viewpoint, a possible course title could be ‘Software design workshop.’ This course would emphasize the analysis and decision-making skills related to design. The course would contain an open or semi-open design problem that can be solved using several different strategies and tools. The student group should compare various options, argue their pros and cons, and finally evaluate the result.

5.2 Evaluation of the research

This study would have been very different if the original main goal had been to gather information about the cognitive skills of software developers. Questionnaires are seldom used in psychology of programming, where the experimental research setting is dominant. One source of criticism is that questionnaires measure opinions, not observable behavior. However, in this research the purpose was specifically to measure the opinions of experts.

During the first questionnaire round, most respondents commented that the questions about tacit knowledge were the most difficult to answer. A possible interpretation could be that the research method used was not suitable or that the questions were poorly designed. However, we interpreted that the answering difficulties were mainly due to the topic itself; that is, the topic is genuinely difficult. It is possible that the respondents do not remember or cannot describe skills that were automated several years ago. For example, adults often have difficulties describing how a bicycle is ridden or a car is driven. We tried to minimize this problem by dividing the questions into two parts and adding an explanatory text before the questions.

6 Acknowledgements

We thank emeritus professor Veijo Meisalo from the University of Helsinki for suggesting the use of the Delphi method and PhD Sari Kujala from the Helsinki University of Technology for commenting on the manuscript of this article.

References

Brooks, R., 1983. Towards a theory of the comprehension of computer programs. International Journal of Man-Machine Studies 18, 543–554.
Capretz, L., 2003. Personality types in software engineering. International Journal of Human-Computer Studies 58 (2), 207–214.
Conover, W., 1999. Practical nonparametric statistics. 3rd ed. John Wiley and Sons, New York.
Detienne, F., 2002. Software design – Cognitive aspects. Springer, London.
Engel, G., Roberts, E., 2001. Computing Curricula 2001. Computer Science. Final report, December 15, 2001. Association for Computing Machinery and IEEE Computer Society.
ERIC Thesaurus, 2004. ERIC Thesaurus. Retrieved on April 27, 2004, from the Educator’s Reference Desk web site: http://www.ericfacility.net/extra/pub/thessearch.cfm .
Greeno, J., Simon, H., 1988. Problem solving and reasoning. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey and R. D. Luce (Eds.): Stevens’ Handbook of Experimental Psychology, vol. 2, 589–672.
Kitchenham, B., Pfleeger, S., 2002. Principles of survey research. Part 5: Population and samples. Software Engineering Notes 27 (5), 17–20.
Stanislaw, H., et al., 1994. A note on the quantification of computer programming skill. International Journal of Human-Computer Studies 41 (3), 351–362.
Surakka, S., 2004. Supplementary material for article ‘Cognitive skills of experienced software developer: Delphi study’. http://www.cs.hut.fi/u/ssurakka/papers/Delphi2/index.html .
Wiedenbeck, S., 1985. Novice/expert differences in programming skills. International Journal of Man-Machine Studies 23 (4), 383–390.
Wilhelm, W., 2001. Alchemy of the Oracle: The Delphi technique. The Delta Pi Epsilon Journal 43 (1), 6–26.