
Human-Information-Thing Interaction

This book is about how to make it out with technology. Humans have come a long way since leaving the trees, and through our tools we have neutralised most natural forces and adapted the environment to ourselves. This struggle for supremacy built, and still builds, knowledge about nature, about the tools needed, and about how we and our society work. Relating this knowledge to the next generation of technology is the major challenge for the following 200+ pages. The human genome is now known, which is a major achievement. We know the wiring of our neural network, and have cars that transport us comfortably from A to B. However, not all problems are solved, and not everything is understood. Far from it! The last few thousand years have, for instance, not shed much light on emergence, the effects of long-term social processes, dynamic human behaviour, society, our mind, consciousness, self, and love. Aristotle is still the reference.

I am deeply grateful for all comments and suggestions that helped improve this book to its current state. Please surprise me!

Hakan Gulliksson

HITI vs 6.500

INTRODUCTION

PART I: THE HITI MODEL
I.1 The model, presenting the interactors
    I.1.1 Human
    I.1.2 Thing
    I.1.3 Information/idea
    I.1.4 Interaction
    I.1.5 Context
I.2 Applying the model, and more basic concepts
    I.2.1 Hierarchy and other topologies
    I.2.2 Classification
    I.2.3 Aggregation
    I.2.4 Sequence or Parallelism
    I.2.5 Mediative roles
    I.2.6 Design and creativity

PART II: TECHNOLOGY, SCIENCE AND EDUCATION FOR DEVELOPMENT
II.1 Technology and science
II.2 Education
II.3 Creativity and designing for the vision
II.4 Basic assumptions

PART III: SYSTEMS, IT AND WE ARE SYSTEMS
III.1 System properties, common to us all
    III.1.1 Processing, Sequential or Parallel
    III.1.2 Distributed or Centralised
    III.1.3 Memory and Feedback
    III.1.4 Adaptation and Learning
    III.1.5 Heterogeneity, Autonomy and Intelligence
    III.1.6 Communication and language
    III.1.7 Emergence
    III.1.8 Space and time, change and mobility
III.2 Complexity, we and it are certainly complex
    III.2.1 Why are systems complex and difficult to understand?
    III.2.2 Reducing complexity
III.3 Modelling, it and us
    III.3.1 Abstraction level
    III.3.2 Modelling view
    III.3.3 Basic types of models
    III.3.4 Representations
    III.3.5 Language
III.4 System environment, context, it is all around us

PART IV: INTERACTORS, WE ARE NOT ALONE
IV.1 We have an interface, a structure, and processing capability
    IV.1.1 Representation
    IV.1.2 Perception and cognition
    IV.1.3 Processing summarised
IV.2 Human representations
IV.3 How to recognise Information?
    IV.3.1 Shannon’s information theory
    IV.3.2 Representations of information, see the soul of I
    IV.3.3 Painting, Image and Video
    IV.3.4 Text
    IV.3.5 Sound and music
    IV.3.6 Speech
IV.4 The Thing outside in
IV.5 Sensing it
    IV.5.1 Which sense is the most fundamental?
    IV.5.2 Neural pathways
    IV.5.3 Internet data pathway
IV.6 Acting out
    IV.6.1 Action, the concept defined
    IV.6.2 Visual realism, information blending in
    IV.6.3 Sound, speech synthesis and telling stories
IV.7 We need knowledge, and we represent it
    IV.7.1 Knowledge representation
IV.8 We think and process
    IV.8.1 Situated action
    IV.8.2 Distributed cognition
    IV.8.3 Trends in thinking
    IV.8.4 Artificial intelligence
    IV.8.5 Representations for processing
IV.9 We remember
IV.10 We attend to it
    IV.10.1 Reaction time and attention span
IV.11 We reason
IV.12 We plan and search
IV.13 We make decisions
IV.14 We learn and adapt
    IV.14.1 Taxonomy for learning
    IV.14.2 How do we build knowledge?
    IV.14.3 Knowledge management
    IV.14.4 Machine learning
IV.15 Humans are creative
IV.16 Humans feel presence, and have social abilities
IV.17 Humans experience it
    IV.17.1 Emotion
    IV.17.2 Appraisal
    IV.17.3 Concern (need, urge, drive, goal, utility, desire, motive)
    IV.17.4 Action tendency (coping strategies)
    IV.17.5 Experience
IV.18 Human’s subjective well-being, emotion, and flow
    IV.18.1 Flow
IV.19 Unique features for each of us
    IV.19.1 Unique human abilities
    IV.19.2 Features and limitations not found in man
    IV.19.3 Summary Human vs Thing

PART V: INTERACTION, WE DO IT TOGETHER
V.1 H-H Interaction, the reference
V.2 I-I Interaction, so far for efficient data transfer
V.3 H-I, H-T Interaction, joining forces
    V.3.1 Ubiquitous computing
V.4 I-T Interaction, access to reality at the speed of light
V.5 T-T Interaction, forces matter
V.6 Why do we interact?
    V.6.1 Why use others for interaction?
V.7 Context, it is everything else
    V.7.1 Use of context
    V.7.2 Real, Virtual and Augmented reality
    V.7.3 Context of H-H interaction
    V.7.4 Context of I-I interaction
    V.7.5 Context of H-T and H-I interaction
    V.7.6 Context of T-I interaction
V.8 Interaction modelling, back to the basics
    V.8.1 Modelling view
V.9 Interaction characteristics, some suggestions
V.10 Mediation, with the help of it
    V.10.1 Mediation as a model
    V.10.2 The medium
    V.10.3 Pragmatics
    V.10.4 Social dynamics and timing
    V.10.5 Meaning and inferential model
    V.10.6 Infrastructure of H-T, H-I interaction
    V.10.7 Screen or paper as the medium
V.11 Interaction control, a joint venture or one of us in control
    V.11.1 Coordination
V.12 Co-operation, we are all in control
    V.12.1 Measures of co-operation
    V.12.2 Mechanisms for co-operation
V.13 We compete, and compromise
V.14 Computer-supported co-operative work
    V.14.1 Taxonomy
    V.14.2 Effective interaction
    V.14.3 Interaction bandwidth
    V.14.4 Social quality of service
    V.14.5 The social-technical gap
V.15 Command based interaction, someone in control
    V.15.1 Mechanisms
    V.15.2 Intelligent support
    V.15.3 Identification
    V.15.4 Navigation
    V.15.5 Choice
    V.15.6 Manipulation

PART VI: DESIGN, HUMANS CHANGE OUR FUTURE
VI.1 What is the problem?
VI.2 Design for H-H
    VI.2.1 Ethics, privacy and security
VI.3 Design for I-I
VI.4 Design for H-I/T
    VI.4.1 Information overload
    VI.4.2 Incidit in Scyllam, qui vult vitare Charybdim
VI.5 Design for T-T

PART VII: RESOURCES
VII.1 References
VII.2 Index
VII.3 Think along

Introduction

This book is about how to make it out with technology. Humans have come a long way since leaving the trees, and through our tools we have neutralised most natural forces and adapted the environment to ourselves.
This struggle for supremacy built, and still builds, knowledge about nature, about the tools needed, and about how we and our society work. Relating this knowledge to the next generation of technology is the major challenge for the following 200+ pages.

The human genome is now known, which is a major achievement. We know the wiring of our neural network, and have cars that transport us comfortably from A to B. However, not all problems are solved, and not everything is understood. Far from it! The last few thousand years have, for instance, not shed much light on emergence, the effects of long-term social processes, dynamic human behaviour, society, our mind, consciousness, self, and love. Aristotle is still the reference.

"If you're not having fun, you're doing something wrong."
Groucho Marx

While we have been pounding on these seemingly impossibly complex problems, technology has developed, infiltrating, manipulating and supporting more and more aspects of our lives. We have a cancer in our midst and will soon face fundamental and scary questions such as:

• Can we control global, pervasive, networked technologies with a multitude of sensors and actuators, and with unlimited memory and processing power?
• Will such a technology support us and society as we know it now while improving our quality of life, or will it start living a life of its own? A life that we cannot comprehend or control, and that will disrupt established behaviour?
• Will the behaviour of such advanced technology mirror our own (we built it), or will fundamentally new behaviour emerge? One that, for instance, does not acknowledge our preference for behaviour in accordance with the laws of nature. This could force us to change how we think and plan, or how we perceive, rate and order objects and events.
• Will we humans control this new technology, be the masters, inspire it, provide it with creativity, or teach it, support it, be enslaved by it, visit it, or maybe just sit back, eat a fruit, and watch it develop on its own?

We are already forced to make choices by this new wave of technology. How many cameras do you accept in a city street, in a school, at your work and at home? They will all be well motivated: preventing terrorism, bullying, and burglary, or studying who empties the dishwasher and who just leaves dirty dishes in the sink. What will the emerging long-term effects of our choices be? As another example, consider the pros and cons for society if a car can be positioned at any instant. What are the effects on road tariffs and road planning? In the case of an accident? On bank robbers, on navigation, or on a restaurant close to the highway?

A human is an enormously complex biological system that together with other humans forms an even more complex society. Can we, together with technology, better understand our society and ourselves? Society is usually considered a result of the interaction between humans, and between humans and their environment. It is a feedback system where one loop (out of many) is humanity and technology evolving together. The development of technology depends on human involvement; technology changes human behaviour and, through human involvement, changes itself. This is where design, defined as purposeful creation, is necessary; random evolution is very slow and resource intensive. The design process has its own world of words, problems, and solutions. If the result is a commercial product it should provide adequate quality of service, within schedule, spending as few resources as possible.

Technology provides the means and methods for creating more and more complex systems. So far, humans have mostly interacted with other humans.
Not counting some simple tools and friendly dogs to play with, there has been little else around to interact with. This situation is now changing dramatically. We are entering a new era where humans also interact with systems designed and built by humans. These systems will become increasingly interesting as interactors, and soon they will start to compete and co-operate amongst themselves. Soon we will not be alone!

Not only does technology provide the means to build complex systems. It also supports the context of our lives to the extent that life as we know it in the industrial world would be impossible without it. Technology affects every aspect of our lives. Art, music, film, and this book are formed within the constraints of technology. A competitive world implies mastery of technology even by artists.

This book is about interaction. To be more specific, it illustrates the design of interaction, interactive systems, and interaction technology, involving the three participants: humans, information/ideas, and things. We will use the acronym HIT for the participants, and HITI when alluding to the whole concept of interaction amongst the participants. Studying HITI in this book will enhance understanding of the participants and their interactions, improve the usability of new systems, and speed up the development of new technologies. As a side effect, in-depth knowledge will be gained of the main participant in the interaction, the human being.

Following technology

Focusing on anything related to interaction technology is like trying to take a photograph of a racing car. Technology is constantly evolving and the speed of change seems to be ever increasing. How long is it since the Internet was introduced? The World Wide Web? Is MP3 an old technology? There is probably a computer in your kitchen, a laser in your living room, and a hologram in your wallet. Perhaps we can use technology to better understand technology?
Humans and society change at a much slower rate. This means that the products of technology will increasingly be limited by humans and human society. Technology itself is neither good nor evil; it is merely reshaping and developing rapidly.

Part I introduces the HITI model (Human, Information, Thing and Interaction) that will be used throughout the book.

Part II clarifies what we will mean by technology and science.

Part III discusses systems and models in general. It is important because it establishes a systematic view that can be used in many areas of work. The chapter covers basic characteristics of systems, for instance that a system can be adaptive. A large part of the chapter is devoted to models. They are very important because without models, computers would not be of much use, and our understanding of the Universe would be reduced to blind search.

Part IV details the characteristics of the participants in the interaction. As the curtain rises the spotlight finds the human (H), who of course from our point of view is the most important interactor, the thing (T), and the information (I). Starting from a generic interactor, specific features and characteristics are added for processing, sensing, and knowledge representation.

Part V is about interaction. It tries to answer the questions: What is interaction? Why is it needed? How is it composed? When, where, and by whom is it performed? Once again a perspective from H is used. The action starts and the plot is unveiled. H, I and T exercise their abilities and try to overcome their limitations by exploiting each other. Some highlights are a discussion on computer-supported co-operative work, the use of context in human-computer interaction, and technology support for mobile services.

Part VI ends the book with a short discussion on design, our vehicle for change.
The intent of this book is to bring forth recommendations, constraints, risks, and data for the choices we will be forced to make to cope with technology. Relevant limitations in, and differences between, H, I and T, and their interactions will be discussed, and we will do this using H and H-H interaction as references and as sources for examples.

Most of the ideas are of course not mine. I want to express my gratitude to those thinking faster, further and farther, perfectly exemplified by Professor Haibo Li and Professor Lars-Erik Janlert at Umeå University.

By the way, this book will never be finished until someone pays me to stop working on it.

[Figure: System, with In and Out]

Do you think Hitler and Mussolini would have remained friends for long after the war was won?

"Believe me, Baldrick, an eternity in the company of Beelzebub and all his hellish minions will be as *nothing* compared to five minutes alone with me...and this pencil."
Edmund Blackadder to Baldrick, 'BA III'

This page was intended to be left blank, but…

… now I come to a new Scene of my Life. It happen'd one Day about Noon going towards my Boat, I was exceedingly surpriz'd with the Print of a Man's naked Foot on the Shore, which was very plain to be seen in the Sand: I stood like one Thunder-struck, or as if I had seen an Apparition; I listen'd, I look'd round me, I could hear nothing, nor see any Thing, I went up to a rising Ground to look farther, I went up the Shore and down the Shore, but it was all one, I could see no other Impression but that one, I went to it again to see if there were any more, and to observe if it might not be my Fancy; but there was no Room for that, for there was exactly the very Print of a Foot, Toes, Heel, and every Part of a Foot; how it came thither, I knew not, nor could in the least imagine.
But after innumerable fluttering Thoughts, like a Man perfectly confus'd and out of my self, I came Home to my Fortification, not feeling, as we say, the Ground I went on, but terrify'd to the last Degree, looking behind me at every two or three Steps, mistaking every Bush and Tree, and fancying every Stump at a Distance to be a Man; nor is it possible to describe how many various Shapes affrighted Imagination represented Things to me in, how many wild Ideas were found every Moment in my Fancy, and what strange unaccountable Whimsies came into my Thoughts by the Way.
Robinson Crusoe, Daniel Defoe

but …

… it cast a gloom over the boat, there being no mustard. We ate our beef in silence. Existence seemed hollow and uninteresting. We thought of the happy days of childhood, and sighed. We brightened up a bit, however, over the apple-tart, and, when George drew out a tin of pine-apple from the bottom of the hamper, and rolled it into the middle of the boat, we felt that life was worth living after all. We are very fond of pine-apple, all three of us. We looked at the picture on the tin; we thought of the juice. We smiled at one another, and Harris got a spoon ready. Then we looked for the knife to open the tin with. We turned out everything in the hamper. We turned out the bags. We pulled up the boards at the bottom of the boat. We took everything out on to the bank and shook it. There was no tin-opener to be found. Then Harris tried to open the tin with a pocket-knife, and broke the knife and cut himself badly; and George tried a pair of scissors, and the scissors flew up, and nearly put his eye out. While they were dressing their wounds, I tried to make a hole in the thing with the spiky end of the hitcher, and the hitcher slipped and jerked me out between the boat and the bank into two feet of muddy water, and the tin rolled over, uninjured, and broke a teacup.
[...] We beat it out flat; we beat it back square; we battered it into every form known to geometry - but we could not make a hole in it. Then George went at it, and knocked it into a shape, so strange, so weird, so unearthly in its wild hideousness, that he got frightened and threw away the mast. Then we all three sat round it on the grass and looked at it. There was one great dent across the top that had the appearance of a mocking grin, and it drove us furious, so that Harris rushed at the thing, and caught it up, and flung it far into the middle of the river, and as it sank we hurled our curses at it, and we got into the boat and rowed away from the spot, and never paused till we reached Maidenhead.
Three men in a boat, Jerome K. Jerome

PART I: The HITI model

This book will use the Human, Information/Idea, Thing, Interaction model (HITI model) to structure thinking about systems [UCIT]. The model originates from Professor Lars-Erik Janlert at Umeå University, and has only a few basic elements, which have to be interpreted within the specific context where they are used. The strength of the model is that it is graphical and close to everyday thinking. This chapter introduces the HITI model and applies it to some simple examples.

I.1 The model, presenting the interactors

Those are my principles. If you don’t like them I have others.
Groucho Marx

The constituents of the model are the three possible participants of an interaction, i.e. Human (H), Thing (T), and Information/Idea (I), and the Interaction itself.

Below, see figure I.1.1, the participants are shown as text boxes along with the possible interactions between them, represented by arrows. As technology improves, so do the possibilities for supporting interactions. This creates new opportunities in different applications, for new categories of users, new situations, environments, and activities.

[Figure I.1.1 The HITI model: the participants H, I, and T as text boxes, connected by arrows for the possible interactions.]
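The boxes and arrows of the model can also be written down as a small data structure. The sketch below is only one possible reading of the model and is not taken from the source: the names `Kind`, `Interactor`, `Interaction`, and `label` are hypothetical, and the participants in the usage lines are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The three kinds of participant in the HITI model."""
    HUMAN = "H"
    INFORMATION = "I"
    THING = "T"


@dataclass(frozen=True)
class Interactor:
    """A participant: a name plus the kind of participant it is."""
    name: str
    kind: Kind


@dataclass(frozen=True)
class Interaction:
    """One arrow in the model, drawn between two interactors."""
    first: Interactor
    second: Interactor

    def label(self) -> str:
        # Label the arrow by the kinds it connects, e.g. "H-T".
        return f"{self.first.kind.value}-{self.second.kind.value}"


# Hypothetical participants, chosen only for illustration.
writer = Interactor("writer", Kind.HUMAN)
paper = Interactor("paper", Kind.THING)
reader = Interactor("reader", Kind.HUMAN)

arrows = [Interaction(writer, paper), Interaction(paper, reader)]
labels = [arrow.label() for arrow in arrows]
print(labels)  # ['H-T', 'T-H']
```

Keeping the kind separate from the name means the same three letters can label any concrete situation, which is roughly how the model is meant to be interpreted case by case.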
One example of the HITI model at work is that you want to write a poem in a love letter. You, a human (H), print the poem (I) neatly on paper (T). A father asking his son to do the dishes could exemplify the arrow representing an H-H interaction. If you are accessing a database using Internet Explorer ®, this is an interaction where an H (you) interacts with an I (Internet Explorer), which in turn interacts with another I (the database). The first interaction is through a human-computer interface (Windows ®) and the second is implemented by data communication (Internet).

Now, we will give a short introduction to the interactors H, I, and T. They will act in examples showing four fundamental modelling principles: hierarchy, abstraction (classification, aggregation), sequence, and parallelism.

I.1.1 Human

The first interactor to introduce is the most important one, the human. We are quite intelligent, or at least that is what we think ourselves. As an object of study we have been popular for several thousand years, and we can even do our own introspective excursions. This knowledge makes the human a perfect role model of an interactor, and human-human interaction the reference, the most basic and well-developed form of interaction.

The human is also interesting because we will use human well-being, and quality of life, as a first-rate constraint when discussing interaction technology and design. We will study how to exploit technology and design to fulfil this constraint using knowledge of human characteristics, behaviour, features and limitations. An assumption is that we, as a side effect, will better understand ourselves.

I.1.2 Thing

The thing is our oldest friend. For more than 2 million years it has been with us. That is quite a long time compared, for instance, to the dog, which has followed us for just a little longer than 10,000 years.
Recently the thing has acquired some new abilities: sensing, processing capability, and new ways to effect and display its actions. A major difference between the thing and the human is that a thing is designed. It can be given characteristics and behaviour chosen for a specific task and environment.

I.1.3 Information/idea

The third interactor is information, which includes ideas. Information has also been with us for quite some time, at least 20,000 years, doing a good job as our social memory. Recently, with the advent of the global Internet, disseminated information is increasingly a major player in social progress. Managing and processing lower levels of information, i.e. raw data, is also more and more important. This is partly because it is now feasible and the amount of data is growing, and partly because combining different kinds of data can create new knowledge.

I.1.4 Interaction

Interaction and communication drive and support progress. They are imperative for knowledge acquisition and maintenance, adaptation, resource allocation, and many other things that make our society work, and the world go around. If the interactor is a static object, interaction is the process, the dance that interactors engage in.

Interaction implies communication. If one-way communication is intended, rather than interaction, a one-directional arrow is used in the HITI model. Interaction and communication in turn suggest adding action and representation to the core concepts. Actions make things happen, and through representations we see the world before and after the action. Actions and representations are two complementary facets of reality. A representation is what an action changes, and without representations there will be no cause for action, and no result.

[Figure: Interaction, Communication, Action, Representation]

I.1.5 Context

The aggregation of individual interactors of the three different kinds constitutes an environment, or a context.
We live in a world full of context-aware and context-dependent systems such as humans, thermostats, web counters, flowers, dogs, and mosquitoes.

The distributed nature of an interaction suggests sharing. How else can the complexity be kept low and the efficiency high? One way to regard sharing is as the generation of meaning through interaction. Interactors select meaningful representations for what they do, or want to do, based on previous experience. They perform actions and, importantly, also choose to display representations based on previous experience. In effect a converging feedback loop is created, representation-action-representation-action, which is actively evolved by the practitioners. Thus meaning is created as a skilful praxis.

To support this development, coordination mechanisms are needed [KS]. The motivation behind a coordination mechanism is to offload complexity from actions. It simplifies interactions, and can improve efficiency by providing precompiled permanent representations of conventions, rules, and protocols, and representations of plans, maps, and scripts. Each situation could be supported by many coordination mechanisms and they could work in concert in different ways, e.g. aligned in time.

[Figure: Sharing a resource; Coordination mechanism]

I.2 Applying the model, and more basic concepts

I.2.1 Hierarchy and other topologies

The HITI model can be used to describe hierarchical relationships. One example is that you have a letter to write. The information content of the letter is confined to your brain (somehow).

Hierarchy is only one example of a topology that seems to be intimately linked with human thinking. A ring structure is another, and a network a third that could degenerate into a point-to-point relation between two participants. The network is a very general topology that we recognise, for instance, in social life. Anyone can have a relation with anyone else.
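To put a rough number on this freedom: if every participant may relate to every other, n participants allow n(n-1)/2 distinct pairs. The short sketch below tabulates that count. The function name `pairwise_relations` is a hypothetical helper, and the sketch assumes relations are undirected and only between two participants at a time, which is a simplification of real social life.

```python
def pairwise_relations(n: int) -> int:
    """Number of distinct undirected pairs among n participants."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


# Tabulate the count for a few network sizes.
for n in (2, 10, 100, 1000):
    print(n, pairwise_relations(n))
```

Ten participants already allow 45 pairs and a thousand allow 499500, so the count grows roughly with the square of the number of participants.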
But this freedom is also a curse; the number of possible relations grows quickly with the number of participants in the network. Power structures and trade-offs become difficult to optimise, especially for computers. Humans seem to manage the enormous complexities involved in real life better.

I.2.2 Classification

Sometimes we want to hide aspects of a model and disregard irrelevant information, or we want to present similar aspects under a common name. For this we can use classification, also called generalisation, which is a special case of abstraction. Instead of referring to a long list of our family members (aunt, brother, sister, and uncle) we classify them all as relatives.

I.2.3 Aggregation

Grouping objects together into a new object is another way to reduce complexity by abstraction. A house has a roof, walls, windows, and a door.

I.2.4 Sequence or Parallelism

Another way to use a HITI model is to model behaviour as a sequence of actions. Sending a love letter can at one level be described as a one-way communication from you to the reader of the letter.

Including the letter (the paper) adds a new interactor to the model. We can easily add additional details. Let us model the following: you write down your thoughts on paper, see illustration to the right. The receiver, sadly, puts your letter in the pocket without reading it.

We will need yet another concept, nicely complementing the sequence, and that is parallelism. With it we can, for instance, model how a television station concurrently broadcasts a show and stores it on tape.

I.2.5 Mediative roles

Each of the participants H, I and T serves another participant in one of three different ways. First, it can serve as a tool. We use a hammer, or a hit man, as tools, and they are specialized for well-defined tasks. Second, a participant can serve as a medium and provide an experience. A clown, a Porsche, and a movie are three examples.
The third alternative is a participant serving as a social actor, e.g. as a friend to be trusted. T and I are so far severely limited both as social actors and as receivers of media, partly because they do not have access to reality in the same way as H. No one has yet heard a computer laugh spontaneously as it parses a beginner's first Java program.

H2I = Chess?

I.2.6 Design and creativity

Design is currently possible for humans only, and can be brought to bear on almost any aspect of our lives. The important distinction is that design is a purposeful and deliberate choice or change of something. It is how we visualize and realise our dreams of the future, and it is much quicker than random evolution. The resulting designs are evaluated and used in a context, and as they are, ideas for new designs are found. Finding a new idea cannot be done without creativity.

“Human needs will not be obsolete”
“Identity confirmation from family, teacher, and friends with immediate feedback … Need for exploration, Time to think, Sense of home”
Philips vision of the future

Part II: Technology, science and education for development

This is a book about interaction, design, and interaction technology. But before we launch a major attack on these issues we need to briefly discuss what technology and science are, how they relate, and the role of education, the main forces behind the development of technology.

Technology to mankind is like giving an axe to a maniac

Why should you care about science? Well, for one thing, it seems that a scientific approach to life is a prerequisite for development in general, at least as we know it. We always try to verify statements and beliefs before accepting them as we explore the world. Critical thinking is a basic, necessary behaviour, complemented by curiosity. Curiosity provides a driving force for exploring reality and critical thinking keeps us from drowning in new ideas and facts.
There are also other drives urging us to explore; could you name some of them?

[Figure: Critical thinking. "Could you elaborate further on that point?" "Is that really true?" "Could you be more specific?"]

II.1 Technology and science

Technology emanates from, and manipulates, the human-made world. It affects and concerns the ways people develop and use technical means (things, tools and machines) to control both the natural and the human-made world. Through technology we now communicate and interact more efficiently, thus improving our quality of life. We can use technology to automate boring tasks, which gives us time to spend on more interesting activities. Technology is also used to increase our comfort, for instance through the use of air conditioning and central heating. Using it well we can create stimulating, challenging learning environments where we can develop ourselves, as exemplified by computer games.

By the way, the word technology derives from the Greek word techne, a word meaning skill in joining something, combining and working it. The word art had a similar meaning; originally it meant a specialized skill, rather than fine art.

According to the definition to the right, inventing new technology is a way of mastering resources more efficiently. Sufficient resources are rarely available, and this shortage necessitates trade-offs and drives creativity. Take the car as an example. We are well able to walk, but that costs us time, so we invented the car at the cost of developing technology, i.e. money. It is possible to drive at 1000 km/h, but that is too costly, both in terms of money and lives. It would be nice to have a larger boot, but that would take away room for the passengers' legs and might obscure the rear view. Why does a car have two headlamps, not one or three? Efficiency is exemplified by car factories that assemble a car almost automatically. The assembly line of human workers, as conceived by Mr Henry Ford, is no more.
Definition: Technology is the technical means people can use to improve their quality of life. It is also the knowledge of how to build and use tools and machines efficiently.

There are three types of technology, good, bad, and cool.
Patrik Eriksson, TFE

Technology is for: Efficiency, information, compatibility, usability, accuracy, documents, work, technology, intimacy, communication, novelty, enchantment, ambiguity, postcards, fun, people
Joseph Kay

We can argue that efficient technology, using components in a simple, economical way, is also aesthetically pleasing. A short mathematical proof is considered elegant, and inventive solutions such as the safety pin, or the clothes peg, never cease to arouse curiosity and wonder. A good programmer with a sense of taste, judgement, and aesthetics can be enormously more productive than an average programmer.

In this book we are mostly concerned with technologies based on computers and sensors, i.e. the primary technologies needed for intelligent interaction. Please NOTE that even though technology is not often mentioned together with emotions and social topics, it undoubtedly affects these processes.

Science is a prerequisite for technology. It studies the laws of the universe and has resulted in an immense, monumental number of facts. Science, for instance, tells us about fundamental limits. We know that there are limitations to how fast we can travel, and to how much data we can transfer in a given transmission time over a channel with a given physical transmission capacity. Physics, chemistry, and biology are examples of scientific disciplines, and mathematics is an important special case. It is both a tool used in scientific work and a science in itself.

There is also an aesthetic dimension to science. When many explanations are superseded by a new, simpler, unifying theory we are pleased and even grateful. The new theory will help us to better understand reality.
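The transmission limit mentioned above can be made concrete. For a noisy channel, the Shannon-Hartley theorem bounds the error-free bit rate by C = B·log2(1 + S/N), where B is the bandwidth in hertz and S/N is the signal-to-noise power ratio. The sketch below evaluates that bound and multiplies it by a transmission time. The function names and the example figures (a 3 kHz channel at 30 dB for one minute) are assumptions chosen for illustration, not values from this book.

```python
import math


def channel_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon-Hartley upper bound on the error-free bit rate."""
    snr_linear = 10 ** (snr_db / 10)  # convert decibels to a power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)


def max_data_bits(bandwidth_hz: float, snr_db: float, seconds: float) -> float:
    """Upper bound on the data transferable in a given transmission time."""
    return channel_capacity_bps(bandwidth_hz, snr_db) * seconds


# Assumed example channel: 3 kHz bandwidth, 30 dB signal-to-noise ratio.
capacity = channel_capacity_bps(3000, 30)
print(round(capacity))                       # bound in bits per second
print(round(max_data_bits(3000, 30, 60)))    # bound in bits for one minute
```

No amount of engineering cleverness raises this bound for a given channel; the only ways around it are more bandwidth, a better signal-to-noise ratio, or more time.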
Science and technology are interdependent. Science deals with "understanding" while technology deals with "doing". Science provides us with knowledge that we can use to build technology. Technology, on the other hand, helps science develop and reveal new facts that in turn spawn new technology.

Note that all progress involves interaction! Technology, society, economy and the individual are involved. The individual's needs provide demands that, distributed through the economy and within the constraints of society, will drive technology. Technology will create new needs (some maybe not altogether necessary for survival) that will once again fuel progress.

Some other formulations of the distinction between science and technology are that science abstracts whereas technology makes concrete, and that science generalises whereas technology is specific. Results from science are typically generated at a university and are available within a subject. Technology is applied, interdisciplinary, expensive, and patented by industry.

Engineering is about how to make technology useful to people. This should be done with available resources, on time, and within budget. The word engineering comes from Latin ingenerare, meaning to create. An engineer seeks optimal solutions to problems, but there is usually no formal way to find the right compromises, and any solution is typically modified the next time it is applied. The engineer has to use good judgement as well as scientific knowledge to trade off, for instance, speed and accuracy, speed and cost, or speed and size.

Art resembles technology in that it deals with doing rather than with understanding, maybe even more so than technology. Art is about creating an expression, and for this technology is once again useful. Technology gives artists new possibilities that the body and the natural environment cannot provide.

Titanic: the ship, the film, the camera, the movie projector.
"A scientist usually aims at presenting a regularity (invariance) of empiria by means of an abstract formula or a model that the general public can apply to its own problems." Pentti Routio [PR]

"An artist, on the other hand, prefers to demonstrate general regularities in the form of one concrete, special case that the public can then apply to their own situations." Pentti Routio [PR]

"A scientist likes surprises, not so an engineer." The Net

The word engineer is derived from the Latin word ingenium, meaning 'ability' or 'genius'.

"An engineer is one who contrives, designs or invents; an author, designer, an inventor, plotter or layer of snares." Oxford English Dictionary

Since technology depends on science, art will too. But the development of science is quite different from that of art. Art is created while scientific knowledge is researched; a work of art illustrates, but science characterises. The difference between art and technology is less obvious. One possible distinction is that technology aims at creating useful things, but art creates experiences. That said, we acknowledge that an experience can be useful, and using something is an experience.

Magic is a craft activity, as are engineering and the arts. A magician's goal is control over nature through artificial means, which is the same as for an engineer. The means are different though, and credibility for magicians is currently low. Casting spells and reciting incantations is not engineering, even though an occasional curse can be heard from the computer lab. Increasingly, the results of engineering could well be perceived as magic by anyone not familiar with the technology. A door that opens automatically is certainly magic if you do not know about sensors and electrical motors. Interestingly, engineers of the Middle Ages cultivated, rather than shunned, their reputation as sorcerers [WE].

Figure II.2.3 Famous experiments. (Illustration captions: "F = k·m1·m2/r^2 ?" and "Eva, I feel strange".)
II.2 Education

Education is a mandatory prerequisite for both science and society, as we know them. Technology would certainly be magic without it. You, a reader of this book, have already understood that knowledge of science, research and technology is not given for free. Hard work is necessary to learn intellectual tools such as critical thinking, and to cultivate the multiple views of reality that are needed. The ticket is paid in time: time spent reading, discussing, thinking about, and testing facts and relationships.

How much time is needed for this? The mean time set aside for education in Sweden is 12 years, but a political goal is that 50% of the students should continue for 3 more years. In practice our fast-moving society will force us to learn even longer than that. Lifelong learning will be the norm, and the first 15 years will be spent learning how to learn.

The bearer of knowledge is language, either written or spoken, and each discipline has its own. This means that to learn a discipline you need to understand, and use, its language, and this can only be done by spending time actively interacting with more knowledgeable practitioners and with knowledge sources, such as books and articles. A research-oriented approach to life, spiced with a large dose of critical thinking, will make this interaction much more fun, and also more efficient.

One computer science professor used to characterise the standard length of his lectures (a little less than an hour) as a microcentury. (Internet)

"Not me. I'm depending on athletes and actors to raise my kids." John Dobbin

Education is needed not only to maintain the knowledge of how to build and use technology. It is also needed to understand what the consequences are if it is used, and to learn when not to apply the technology at all. Should we fear technology or fall in love with it?
II.3 Creativity and designing for the vision

Systematically repeating the work of someone else is by itself not very interesting. We need to fuel the process with new thoughts, inventions, imagination and intuition, by taking chances, looking out for surprises and examining their causes, i.e. we need creativity. If the application of creativity is goal based we call it design. In general, curiosity drives the scientist, and self-expression the artist. The designer, on the other hand, is other-serving, working on behalf of others. Only as a special case does a designer serve himself [ES].

Be creative: draw your own illustration.

II.4 Basic assumptions

There are several necessary assumptions behind the discussion of science above. One is that there is an objective reality and that this reality exists independently of human discovery or observation. Another assumption is that this reality is a basically orderly and regular environment where nothing happens without a cause. Events are assumed not to be random or accidental; every effect has to have a cause. We also assume that we exist ourselves!

"Find out the cause of this effect,
Or rather say, the cause of this defect,
For this effect defective comes by cause."
Hamlet (Shakespeare)

A limitation of science is that it cannot produce absolute, final truths. Throwing away a theory when a better one is found is a trademark of science. If two theories explain the same facts, the simpler is considered the better one (Occam's razor).

"Entia non sunt multiplicanda praeter necessitatem" ("Entities should not be multiplied more than necessary"). Occam's razor, William of Occam (c. 1287-1347)

Morals and ethics are also out of scope for science, and we should not confuse scientific and technological development with progress; they are only means that could be used for good or evil. This is similar to how we regard money; it is up to the spender to decide how life is affected.
"Knowledge is power." Sir Francis Bacon

Part III: Systems, it and we are systems

Development through technology, science and research has proved successful. We study the bits and pieces of existence, compare observations with previous knowledge, and arrive at various new conclusions. A systems-oriented worldview supports this process well.

Definition: A system is a set of variables selected by an observer. (Ashby)

The following chapter is an overview of systematics and modelling. The rationale for this is manifold. First, a systems view provides us with a framework for discussing interaction. Second, such a view gives us a chance to introduce properties of systems that are found in many sciences and which are useful to characterise interactions and interactors. Some important examples are memory, feedback, adaptation, and learning. Third, looking at the world as a system is very much an engineering stance and equally applicable to design. Many technical and social systems serve a purpose, and this is what motivates their existence in the first place. For other systems, such as a human, the purpose is less obvious.

Definition: A system is a unified whole made up of one or more subsystems or components.

Seen from the outside, the system can be described as a processing unit in an environment, where it produces outputs given some inputs. The system is contained inside an interface that encapsulates the behaviour of the system.

Figure III.1 A system. Output is the result of processing input. An interface separates the system from its environment. (Diagram labels: Environment, Interface, Input, Processing, Output, System.)

Input provides necessary resources for the system, specifically data about the environment in which the system dwells. This data can be perceived at different levels of abstraction. An image can for instance be described as photons, pixels, or as a data file. Espionage provides input, so does a glass of milk.
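The outside view of a system in figure III.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name System, the method step, and the choice of a running sum as the processing are all hypothetical and not taken from the book. The point is that the environment only ever touches the interface (here step), never the internals.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class System:
    """A processing unit hidden behind an interface (hypothetical sketch)."""

    # The processing itself: maps (current internal state, input) to a new state.
    process: Callable[[float, float], float]
    state: float = 0.0
    outputs: List[float] = field(default_factory=list)

    def step(self, value: float) -> float:
        """The interface: accept one input from the environment, return one output."""
        self.state = self.process(self.state, value)
        self.outputs.append(self.state)
        return self.state


# An assumed example system whose processing is a running sum of its inputs.
accumulator = System(process=lambda state, x: state + x)
results = [accumulator.step(x) for x in [1.0, 2.0, 3.0]]
```

Swapping the process function changes the behaviour of the system without changing how the environment talks to it, which is what the interface is for.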
Output is data or actions that bring about changes to the environment of the system, which in turn affects the system itself. Input is sometimes named events, i.e. dynamic aspects of the environment affecting system behaviour. An event is typically the cause that is transformed by the system into new events, the effects.

A general system is composed of subsystems, or in other words, can be partitioned into modules, which make up the internal structure of the system. Appropriately, the great, great grandfather of the word system was the Greek word systema, which means "whole composed of parts".

In order to be characterised as a system the subsystems have to interact, i.e. relationships between the subsystems are necessary. The interaction is the behaviour of the system and can be anything from physical atomic interactions, to public transportation in a big city, to some way of expressing love. The interaction serves as the glue that keeps the subsystems together and is usually implemented by an exchange of information units. Information units can be real, such as photons or a birthday gift, or they can be very abstract, such as the idea of democracy. The idea of democracy spread through many information channels and resulted in new subsystems, new structures, new interactions, and new information units (quite a lot of documents).

Modelling interactive systems as above does have limitations. An input-output view of a system restricts the way we think of processing. Alternatively we can view processing as change of system state, which for instance allows us to see motion, i.e. change of position, as processing. It is also difficult to use the system view to understand self-organisation, or the evolution of a system. Further, a system sometimes has qualities not found in, or at least not easily derived from, the subsystems.
These emergent phenomena could come from behavioural aspects of local interactions and are easy to miss, since the system model encourages focusing on overall input/output flows. Pile up four wheels, an engine and all of the other parts that make up a car, show them to someone who does not know about cars (if you can find such a person) and see if they can guess the concept of a car, motorway, parking lot, or a traffic jam.

III.1 System properties, common to us all

The following pages discuss properties that characterise a system. These properties are important because they will surface all of the time, when all sorts of systems are studied, in all sorts of sciences.

III.1.1 Processing, Sequential or Parallel

Central to the system is processing or computation. In its deepest sense it is whatever brings about change, deliberately and by design, or just by chance, by nature, or by habit. Without processing nothing happens to input on its way to output, and there will be no change of the system state.

Processing is performed by a processor, brain, CPU, or some other device that has a method to use input or memories to modify output or internal states. At a high level of abstraction it can be seen as reading symbols from memory, processing them by applying operations, and storing the result. The processor also reads the operations to be performed from memory, i.e. operations are data.

Figure III.1.1 System for computation. (Diagram labels: "Processor: interprets and executes operations, to process data"; "Memory: operations, data".)

It is an interesting fact that with memory and a few very simple operations anything computable can be computed. One example of this is the DNA computer, which in principle can perform any computation using a soup of genes. The very simple operations are merging, splitting, and copying of genes.

System overload is a problem for all processing: either the input volume or the number of inputs exceeds the capacity, or input arrives too fast.
One solution is to somehow expand the total system capacity, e.g. buy a new, better, faster computer. If this is not possible then the system has to be modified. Another good solution is to skip less important tasks, or perform them with less accuracy. If the processing is necessary we can divide the original system up sequentially, add concurrency (several subsystems in parallel), or use multiplicity (several identical subsystems in parallel). Partitioning the load might involve restructuring the original solution and usually adds complexity to the control. Note that expansion by dividing a system up into a sequence, improving each step in the chain, can be seen as expansion in time, while concurrency and multiplicity are expansion in space.

One example where parallel expansion is used is the telephone system. If one telephone exchange is overloaded, another exchange is added in parallel. We use the same strategies in everyday situations. When you serve pancakes for only one guest a single frying pan will do. With two guests the delay between consecutive pancakes could cause some irritation, and if you have a big family, or have invited your neighbours, you have to exploit concurrency, i.e. two or more pans.

A parallel system can handle more load than a sequential one, and if it consists of several subsystems we can get additional bonuses. Some of the subsystems could specialise in important tasks, several solutions could be tested simultaneously, and if necessary the system could be made redundant to provide fault tolerance.

"For prose, parallelism helps bring about clarity, efficiency, forcefulness, rhythm and balance." [ET]

The human mind does not seem to handle concurrency well. Rational thought moves sequentially, tracking an effect for each cause. Take the example of Achilles and the tortoise.
The warrior and the tortoise race, and the much slower tortoise is given a head start. The competition starts and Achilles reaches the position where the tortoise started. However, in the meantime the tortoise has moved yet another increment. The race continues and Achilles steadily decreases the distance to the tortoise, but never catches up. If you try this at home Achilles will win the race every time; the infinitely many stages take ever shorter times, and those times add up to a finite total. We have been tricked into doing the wrong coupling because it is difficult to keep track of two concurrent chains of causes and effects.

A reflection here is that there is a need for both parallel and sequential computations to implement conjunctions, i.e. to draw conclusions about interesting events overlapping in time. Let us say that you line up a number of computers in parallel, each computer calculating the speed of a specific car on the motorway. Without adding a sequence of inference steps we cannot know more than the speed of each car. If we on the other hand add additional steps, we can calculate the mean speed, or detect patterns in the behaviour of the cars.

This principle is visible in all sorts of system solutions, for instance in human vision processing. The eyes first execute a massively parallel data processing step, computing local light intensity and other local information. The next step is to combine the many information channels into higher-level visual objects.

Figure III.1.2 Visual processing. Stage 1 is parallel, performed by millions of rods and cones. Stage 2 is sequential, in the visual pathway. (Diagram labels: Perception for identification, Perception for action.)

III.1.2 Distributed or Centralised

How system control is organised is important. It is simpler to centralise the control, but the system will be more robust if we distribute it. The cars on all of a country's motorways are one example of a system with distributed control.
Even if one of the cars loses control, all the other cars in the country will still manage. Flight control at an airport, on the other hand, is a centralised system. Possibly an airport could still work without central control, but it would certainly be a very inefficient and risky airport. A nuclear power station is another example of a centralised system, run from the control room. Maybe the biggest system of all, the Internet, owes much of its success to a decentralised approach to control.

Two people talking to each other is a system with distributed control. Each participant has control over her own behaviour, and interacts with the co-talker. An author running a word processor is an example where centralised control is exercised, at least as long as Word® does not start to edit by itself. It does make rudimentary attempts to do this, for instance by automatically changing "i" to "I", which is quite annoying when you are writing in Swedish, where "i" means "inside".

Not only can control be distributed, but also computations and memory. To understand how memory can be distributed, think about a shopping list, or a diary. Distributed computation is exemplified by how monkeys (and you?) have been found to guide the movement of their fingers. It seems that the computation leading to this movement is not localised to a small spot in the brain, but is spread out over a large area. Distributed solutions are powerful, and a proof of this is that the whole population of Sweden maintains a supply of food through local decisions.

Does democracy mean distributed or centralised control? Really?

Is the [margin illustration] a system with centralised or distributed control?

The main rationale for having multiple components rather than one extremely complicated component is that a single component doing everything is the prime suspect to be the bottleneck for speed and reliability.
Partitioning the functionality gives us a modular, easily modified, flexible system, where resource sharing and load balancing are possible.

There is a natural tension in design between distributing data and managing resources. Distribution provides fast access to resources, close to where they are needed. On the other hand, keeping resources in a central repository gives easy access for control and manipulation.

Cohesion and coupling are two measures, mostly used to describe software, and they say a lot about a distributed system. Cohesion describes to what degree the subsystems of a system together perform a single task. A car is a good example, where all subsystems co-operate with the objective of transporting the passengers from one place to another as comfortably as possible. Coupling describes the level of interdependency between two subsystems. Interdependency is quantified by the number of messages exchanged, and by the amount of shared resources such as memory. If the target is to design a highly modular system, the coupling between the system and its environment should be minimised. A proof of success is whether the communication between modules, and between modules and the environment, is minimised.

Definition: Cohesion is the extent to which the functionality of a system is self-contained.

Definition: Coupling refers to the strength, directness and complexity of causal relations among parts of a system. (Weick)

III.1.3 Memory and Feedback

Memory is a key to the intelligent system, and will be discussed recurrently in this book. One way to introduce it is through feedback, another important aspect of a system. Feedback means that previous output somehow affects the input. To accomplish this, memory, or physical storage, is needed. In the figure below a delay serves as memory, slowing output (Out) down to modulate input (In).
The memory could be either external or internal to the system. Sheer transmission delay, as in the figure, neurons, RAM, paper, and hard disk are some examples.

Figure III.1.3 Delay in a feedback loop could serve as memory. (Diagram labels: In, +, S, Out, Delay (memory).)

The concept of a memory is clear: it is where information is stored and retrieved. How to find the information is maybe not that obvious. To use it we need to access it, and there are two ways to do this. The first is to fetch and store at a specific address. This is how an ordinary PC works, and also how you use a deposit box at the bank. The alternative is to use associative memory, where information is found through an associative mechanism. Remember the lyrics of the songs, "hum hum hum My Way", "Yesterday ti di ti da"? This kind of memory can also be built using technology.

Forgetting information is sometimes as important as remembering it. Ants mark the path to the food with pheromones. When the food is gone the pheromone path will not be refreshed, and it will slowly disappear, erased by wind and rain. Similarly, paths followed by data packets on the Internet will periodically be forgotten to enable new, perhaps more efficient, paths to be established.

The figure below shows a simple system built such that the output is always slightly bigger than the input. If we feed the output back into the system the output will increase until some power limit sets in, or something breaks.

Figure III.1.4 Positive feedback could result in an output that increases until something breaks. (Diagram labels: In, +, S with Out > In, Out, Delay (memory); plots of In and Out against time t.)

Misuse of feedback is a good way to build a worthless system, so beware. An everyday example is that if you do not care for washing up the dishes, you are likely to postpone it, and there will be even more dishes to do the next time you get around to it. This will of course amplify your aversion, and the problem escalates.

Feedback can however also be used to our advantage.
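The two faces of feedback can be made concrete with a small simulation. The sketch below is an illustration only; the function names, the gain of 1.1, the limit of 1000 and the correction factor of 0.5 are assumed values chosen for the example, not figures from the book. In both loops the variable out, carried over from one iteration to the next, plays the role of the delay (memory) in the figures.

```python
from typing import List


def positive_feedback(gain: float, steps: int, x_in: float = 1.0,
                      limit: float = 1000.0) -> List[float]:
    """Output is fed back and added to the input, as in figure III.1.4."""
    out = 0.0  # the delay element starts empty
    trace = []
    for _ in range(steps):
        out = gain * (x_in + out)  # S makes its output bigger than its input
        out = min(out, limit)      # "some power limit sets in"
        trace.append(out)
    return trace


def corrective_feedback(target: float, steps: int, k: float = 0.5) -> List[float]:
    """Previous output is compared with a wanted value and the error is reduced."""
    out = 0.0
    trace = []
    for _ in range(steps):
        error = target - out   # how far the remembered output is from the goal
        out = out + k * error  # move a fraction k of the way towards the goal
        trace.append(out)
    return trace


runaway = positive_feedback(gain=1.1, steps=200)
settled = corrective_feedback(target=10.0, steps=60)
```

With these assumed numbers the first trace grows until it sits at the limit, while the second approaches the target and stays there.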
If we know how the system reacts we can use feedback to moderate system behaviour. This is another human speciality; with a vivid memory of the last huge pile of dishes you might be more observant of your behaviour. If you are hungry you eat until you are satisfied, not more. If the food is really tasty the built-in feedback loop will still help you to stop.

An alternative to feedback control is open-loop control. Without feedback the controller has to guess what control signal to apply. One example is a fire alarm, where the alarm control does not really care if the fire is put out or not. Open-loop control is fast, and robust against errors in output sensors and feedback loops.

Figure III.1.5 Open-loop control, i.e. without feedback to control. (Diagram labels: In, Control, S, Out.)

Yet another alternative is feed-forward control, or in other words purpose-focused, predictive behaviour control, where the control system tries to estimate the likely disturbance of the system and compensates for it through an extra signal path for control. This type of control requires very good knowledge about system behaviour. One example is that you bring your raincoat on a cloudy day, and put it on just before it starts raining. Another example is making a budget. This kind of control is important in design.

Figure III.1.6 Feed-forward control, anticipating and compensating for future states. (Diagram labels: In, S1, S2, Out, Intelligent control for compensation.)

Feed-forward feedback: What will happen if I do this?

Try to think of situations where you use these different control principles and you will find plenty of them. Open-loop control is for instance applied when accelerating a car from standstill, feedback is needed when you try not to exceed the speed limit, and feed-forward is in action when you see a steep hill ahead of you and accelerate at the bottom of the hill, to take your car over it.

A related system classification is to what extent a system is open.
An open system is responsive to its environment and exchanges energy, information, or materials with it. A closed system, on the other hand, will eventually decay into chaos because of the limited interaction with its environment. A house under normal circumstances is heated and maintained. Residents enter or leave through doors, windows allow light into the house, news is shown on the television set, and the plumbing allows for disposal of sewage. A house is in other words a fairly open system. If we seal it up, close all inputs and outputs, it will slowly degenerate. At the other extreme, imagine what will happen if we remove all windows and doors of a house.

In some disciplines, e.g. computer science, a system is alternatively referred to as an open system if its interfaces to the external world are fully defined and available to the public.

Information welcome

III.1.4 Adaptation and Learning

If a system is given a choice between two paths, both leading to the goal, it weighs the pros and cons of the alternatives, and follows the best path. An adaptive system manages to take new facts into account. If one of the paths suddenly ends in a big black hole, the system changes its mind, turns around and takes the other path. Adaptation is very important for survival and consequently has been re-invented many times, and in many variations, during human and machine evolution. Four different kinds of adaptation can be identified: by predetermined reflex, by reasoning, by learning, or by evolution over successive generations.

Definition: Adaptation is the process whereby a system uses perceptions from the environment to optimise its behaviour.

Flexibility is related to adaptability, but also describes systems that are capable of being turned, or twisted, without breaking. A more flexible system has additional degrees of freedom, i.e. additional state variables possible to modify.
If a system has too many degrees of freedom it will be difficult to control, but with too few it cannot adapt. Think of the problems you would have walking with two splinted knee joints.

There is also an economic dimension to flexibility. Reuse is difficult without it. A brick is not very flexible in itself, but given their characteristics bricks are easy, i.e. flexible, to combine. On the other hand, flexibility means that adaptation to the situation at hand is necessary, and this comes with an associated cost. In many cases a pre-fabricated house is preferable to a pile of bricks. A balance is sought between rigid structures, e.g. a planned economy, and flexibility, e.g. a market economy.

Simple adaptation is described in figure III.1.7. The parameter adjustment block has a fixed model that maps output behaviour to new parameter values, i.e. the system adapts by a predetermined reflex. The new values are passed to the controller, which in turn adjusts the controlled system.

Figure III.1.7 Simple adaptation. (Diagram labels: In, Controller, Controlled system, Out, Parameter adjustment.)

One example of this type of system is a camera equipped with auto focus, where the controlled system is the lens. The controller positions the lens, and the parameter adjustment processes the image to calculate a new lens position that is fed back to the controller. The loop from the output back to the controller is used to fine-tune the position of the lens in real time. Simple fixed reasoning could be added either to the parameter adjustment, or to the controller, to make the adaptation more intelligent. The auto focus controller could for instance use information about light or battery conditions to modify new parameter adjustments.

Usually the adaptation is associated with a real-time constraint. The system must change at a rate determined by the environment.
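The auto focus loop of figure III.1.7 can be sketched as follows. Everything concrete here is an assumption made for illustration: the sharpness measure, the function names, and the fixed rule "reverse and halve the step when the image gets worse" are hypothetical, and a real camera would measure sharpness from image contrast rather than from a known subject distance. Each pass through the loop costs time, which is where the real-time constraint enters: the iteration budget limits how well the lens can settle before the scene changes.

```python
def sharpness(lens_pos: float, subject: float) -> float:
    """Hypothetical sharpness measure: highest when the lens matches the subject."""
    return -abs(lens_pos - subject)


def autofocus(subject: float, start: float = 0.0, step: float = 1.0,
              iterations: int = 50) -> float:
    """Adapt the lens position by a predetermined reflex (a fixed adjustment rule)."""
    pos = start
    best = sharpness(pos, subject)
    for _ in range(iterations):
        candidate = pos + step                 # controller moves the lens
        score = sharpness(candidate, subject)  # parameter adjustment inspects the output
        if score > best:
            pos, best = candidate, score       # sharper: keep the new position
        else:
            step = -step / 2                   # worse: reverse direction, shorten the step
    return pos


focused = autofocus(subject=3.7)
```

The adjustment rule never changes, so this is adaptation without learning: the system reacts to what it perceives, but it does not get better at focusing from one picture to the next.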
If it snows in summer and the temperature falls below zero, something that does not happen often even in Sweden, an adaptive creature such as yourself would just put on some more clothes; a plant, on the other hand, would die. The time scale of adaptations varies enormously, from the millisecond range when we run and place our feet on uneven terrain, to cultural adaptation acting over years, or even generations.

Another example of adaptation is a simple sponge, feeding by filtering water. It orientates itself such that the flowing water aids feeding. This way the sponge does not have to pump the water itself. What it must do is to somehow sense the current and adjust its own orientation.

We talk of learning if we mean adaptation as a characteristic of an individual system, and of evolution when referring to the collective process where reproductive mechanisms come into play. Reinforced learning is how we enhance the simple adaptation described above. As figure III.1.8 below illustrates, parameter adjustment, the critic block in the figure, is given more information and intelligence. It reinforces behaviour by telling the system if the output is right or wrong. One example of reinforced learning is a child learning to walk. It is very difficult to instruct a one-year-old. Yet somehow, through internal rules, courage and stubbornness, the child manages. A learning algorithm goes something like this: "Under these conditions I did that, fell, and it hurt, so I did it this way instead under the same conditions and did not fall, aha!" This is an example of learning by doing.

Figure III.1.8 Reinforced learning. (Diagram labels: In, Controller, Controlled system, Out, Critic, Reinforce.)

In supervised learning an external expert is added, with knowledge that improves the behaviour, see figure III.1.9. The controller could also take a more active part in adjusting behaviour, not only react to given parameters. One example of this type of learning is a child learning addition and division, 5 + 5 / 2.
The expert provides the child with the right answer and also suggests strategies that the controller uses to solve the problem.

Figure III.1.9 Supervised learning. (Diagram labels: In, Controller, Controlled system, Out, Expert.)

Another way to learn, without reinforcement, is to integrate knowledge from the problem domain into the system. If the system is a machine we can directly implant knowledge, but for humans this is impossible, and the necessary rote learning is laborious and error prone.

Why is teaching a machine difficult?

Learning improves performance in several ways. New facts, experiences, processes, and concepts, including new ways of reasoning, will be made available, and information can be reorganised for more efficient use. Learning is also important for generalisation and specialisation of knowledge and concepts. How else could we know the difference between hard rock, country, house, and techno?

To conclude, in supervised learning the environment serves as a teacher, and in reinforced learning as a source for evaluation. Studying how to learn is important, both for the development of the human race and of the machines, and there are many approaches from human life that can be reused for things. Examples are: learning by imitation, discovery, analogy, learning by studying books, learning from society (ethics, morals), learning by doing (e.g. programming), by writing, or by teaching (dialogue with students). We will come back to learning, and to its companions adaptability and reasoning, many times in this book.

Learning by imitation?

III.1.5 Heterogeneity, Autonomy and Intelligence

If all subsystems are the same, we say that the system is homogeneous. For mechanical maintenance this really matters; you will buy 50 identical aeroplanes for your company in order to simplify maintenance. Two ants chosen at random from an anthill are very similar, i.e. they form a homogeneous system.
Heterogeneity on the other hand means that there are differences among subsystems. The Internet is one example, where a large system is put together from different networks and computers.

A system managing on its own is said to be autonomous. A more specific definition of autonomy is given in the margin note:

Definition: A system is autonomous to the extent that its behaviour is determined by its own experience [RN]

An ant is for instance not an autonomous system according to the definition, since its behaviour is not learnt. It has evolved. An autonomous system is normally instinctively attributed with intelligence by a human observer, but neither autonomy nor heterogeneity is necessary for intelligent behaviour. An ant leaves a chemical marker indicating the trail to a nice source of ant food. Other ants follow the trail, and add to it when they return to the anthill. As the trail strengthens more ants will follow it. This emergent behaviour that appears to be intelligent is actually accomplished by a very simple control mechanism.

For intelligent autonomous systems the assumption of homogeneity is not valid, but it is a convenient approximation. We for instance have laws that should be the same for everyone.

Autonomy is one aspect of intelligence; the ability to adapt and the ability to learn are others. Already Aristotle considered intelligence to be the main distinguishing feature of humans, but neither he nor anyone since has come up with a definition of intelligence that everyone agrees on. Some attributes of intelligence are:

• Memory capacity
  o Quantity
  o Information organisation and retrieval
• Problem solving ability
  o Speed
  o Complexity
  o Creativity
• Ability to learn
  o Facts
  o Concepts
  o Processes
• Self-awareness
• Social ability

Several of these attributes are now no longer unique to humans and animals. They are found also in technology.
In many areas and applications a computer program is much faster, can handle more complex relationships, and search better for information than humans.

Intelligence is notoriously difficult to measure, which is not surprising considering that no one has managed to define it. Several approaches have been tried, for instance measuring vocabulary, reasoning, or memory, but they have all been criticised on different grounds.

Intelligence measures: distance between the eyes; rates of learning nonsense syllables; standardised tests of intelligence.

Three demands for intelligence are: complexity of purpose, structural plasticity, and unpredictability [TWD2]. The first claim is that a more intelligent system can manage activities with more complex purposes, which seems reasonable. The second claim is that intelligence improves with structural plasticity, i.e. with the ability to modify internal structures of the system to behave differently. The plasticity can be used to adapt to the environment, or to accomplish new goals, i.e. to reprogram behaviour. For a computer based system this capacity comes from its layered structure, see figure III.1.18. There is no need to change the hardware if we want to run a new program.

Figure III.1.18 Plasticity is a result of a layered structure. (Layers: Application, Objects/Functions, Instructions, Machine code, Hardware.)

1110 + 0101 = 10011

How do you define stupid?
"Doing the same thing over and over again, in the same exact way, and expecting different results!"
"Causing damage to self or others without corresponding advantage."
Stupid people believe that almost everyone else is stupid including: "People who dislike me", "People who disagree with me", "People who are novices in an area in which I am an expert."

With structural flexibility and a purpose that is not too obvious there is a possibility that the behaviour will appear unpredictable, but still purposeful. Such a system will look intelligent to us.

Are you intelligent?
Prove it!

III.1.6 Communication and language

A world of isolated systems is simply not possible, and to keep a system together the subsystems need to communicate. We need communication, and language is a key communication concept that we will discuss many times in this book, along with various representations, transmitters and media. It is difficult to define and describe what communication really is, and the topic could easily fill a book by itself, but simply put it is the process of transferring data, information and meaning between humans, animals, plants, and things. One example is a dog barking at a cat, another is the colour of a flower, signalling to the bee. A third example is text input to a computer, and a last one is the signalling that helps an aeroplane in bad weather conditions.

Communication needs a language, code or signal with which it can represent the message, and a mechanism using a physical transmission medium. Assumptions for successful communication are that the message sent is well organised, consistent, follows the agreed protocol, and is about something meaningful to the receiver. The inhabitants of a West Indian island will find it difficult to understand messages about different types of snow falling in Umeå.

Communication generally does not imply any intention on behalf of the sender, but usually a sending mind has a reason for, and gains something by, communicating. Successful communication often demands that the sender adapts to the receiver and to the peculiarities of the transmission mechanism.

The definition of communication in the margin (I. A. Richards) is a bit restrictive since you have to have a mind to participate, which excludes some active communicators, exemplified by an intelligent thing. Nor is passive communication covered, such as when seeing snow outside informs you about the weather conditions. A much less restrictive description of communication, formulated by Adam Bailey, is found below the first one.
Use of the word process in this latter definition implies that communication has a time aspect, which in turn implies changes in the causes, characteristics, and results over the time span of a communication. Another comment is that the disseminated information should have some effect on the recipient. Otherwise, there is not much use for the communication. Our last attempt at a definition, also in the margin, is the most general one.

Human language can be seen as messages, i.e. as packets of information. It is however also a medium for interaction. A good language should be efficient, expressive and easy to use. Ease of use is supported by the fact that the meaning of a sentence is a function of the meaning of its parts, which in turn depends on the meaning of the words. This is not as obvious as it seems when you think about it. The grouping principle applies. Experiments show that subjects remember the words for a couple of seconds, but the meaning of the words stays for a longer time. Perhaps the structure of language has evolved to facilitate remembering and understanding the meaning of what is said?

Important properties of a human language are efficiency, ease of use, and expressiveness.

The word "communication" itself originates from the Latin communicare, which means "to make common", or "to share", which makes sense.

Definition: "Communication takes place when one mind so acts upon its environment that another mind is influenced, and in that other mind an experience occurs, which is like the experience in the first mind, and is caused in part by that experience" I. A.
Richards 1928

"Communication is a process where information is disseminated from a source to a recipient" Adam Bailey

"Communication is change of state in a receiver caused either actively by an intent of a sender, or passively by properties in the context of the receiver" [HG]

"Interaction is acting within a representation" Brenda Laurel

Efficiency in the use of language means that we do not tell everything we know to everyone, and what is left out is of no interest. Who has not heard a lengthy tale of a bus trip where nothing happens? You wait for the action, but instead learn everything about the colour of seats that were not comfortable. If you came from a different culture, where bus riding is an art, you might be interested, but otherwise the story violates the efficiency code.

Expressiveness is another important issue, because we want to express ourselves as clearly as possible. We want to say what we mean, even if it is something complicated, or deep. Other features of a good language are ease of learning and error detection, as well as precision and compactness of the language for efficiency of use.

What is laughter? Is it a language? Is it communication? Is it interaction?

Vocal communication between humans, i.e. spoken language, takes place on many levels of experience: physical (sound waves), physiological (nerves, muscles), chemical (processes in muscles and brain), psychological, cultural (speaker environment), linguistic (language specific), and semantic (meaning of message).

Written language loses information in the transformation from words spoken in a conversation. Rhythm, facial expressions and many other helpful cues on how to interpret the message have disappeared. To compensate for this lack of context we need to add more words. Written language needs words such as explain and propose, whereas in a face to face conversation we would understand that someone is trying to explain something to us, without explicitly telling us so.
The language used for SMS messages has to be as concise as possible. A new sublanguage has developed with expressions such as 4u, lol, and roflmao. This language is also perfect for interactive TV shows that display SMS messages in tickers.

Language reflects culture, human behaviour, action, and other aspects of life, and people seem to always find ways of saying what they need to say. One example is the word "drunk", which is supposed to have more synonyms than any other term in the English language. Another interesting, and equally useful, observation is that any language can express almost anything expressed in any other language (given a little time and a lot of words). Culture is the creator of language and language sets the limits for what can be expressed in a culture.

(Figure labels: Society (users); Language; Modifies, twice.)

Compared to the language of animals our own language is quite advanced. Animals are not able to communicate about things outside their immediate temporal and spatial contiguity. One exception is bees! They have a language to describe both where to find a nectar source and also the amount of nectar at the source. One limitation is that the bee language is pre-programmed in bee genes. A statement such as "I hope I feel as good tomorrow" is impossible to make for any animal, at least as far as we know. The statement abstracts the current state, and projects it to a representation of a day that has not yet happened. When we expand an utterance to a sentence, or to a whole story, we add layer upon layer of abstractions and symbols describing context, moods, actions, and much more.

III.1.7 Emergence

Emergence is when fundamentally new behaviour or properties emerge as things, or other entities, are hooked up, run together, or are united. It is more than an effect of multiplicity, so doubling the workforce to get twice as much done does not qualify as emergent behaviour.
If the workforce starts socialising and drinking beer, then you have an emergent behaviour. The collective behaviour is in that case not readily understood from the behaviour of the parts.

Definition: Emergent properties of a complex physical system are neither (i) properties had by any parts of the system nor (ii) a mere summation of properties of parts of the system.

Examples of emergent properties are the temperature and pressure of a gas. They do not follow directly from the description of one particle. Forming a drop of water out of atoms of hydrogen and oxygen is another example. Who could have thought that these two gases should combine into the necessary ingredient of life?

Systems can be studied by first dividing them up into subsystems, investigating each subsystem separately, and eventually breaking each subsystem up into its basic elements. This strategy is called reductionism and has been used in many sciences, for instance to help us discover the atom and the cell. The basic idea is that through the understanding of the elements we can deduce the workings of the whole. An alternative strategy, holism, claims that the whole is more than the sum of its parts, and that the system is best understood when viewed as a whole.

There is an interesting asymmetry hidden here. When we reduce a given system to its parts there are no surprises; the book is composed of a sequence of pages, each revealing a part of the plot. We start with knowledge of the whole, which gives us a particular view of the parts that explains how they contribute to the whole. For reductionism science works nicely, even though there are systems that are currently beyond our understanding. We do not understand how to reduce them to their separate parts. One example of such a complex system is our brain. Other systems, such as a filled red balloon, cannot be studied by taking them apart.

If we, on the other hand, join already known parts together there can certainly be surprises.
It can be very difficult, if not impossible, to predict the behaviour of the composed system without actually putting it together and testing it. If we add three short sequences of film together the effect can be quite different from the sequences shown separately. A typical example is the cat-mouse-cat sequence indicating a threatened mouse. One example of emergence from physics is that two pendulums placed close to each other will synchronise their swings. From the world of insects we know that some termites tend to drop their mud balls close to where other mud balls have already been dropped. This will create impressive architectures and is also a good example of where the environment and the actors interact.

"You don't take apart a frog to see how he jumps" Seinfeld

(Figure labels: Emerging line; Emerged pile.)

In many designed systems the behaviour of the system emerges as new skills are adopted and learned. This is a process that in turn could enable new behaviours, even more difficult to envision. According to the researcher Mark Bickhard we should abandon what he calls a false metaphysics, a metaphysics of substances, particles and properties, and replace it with a process metaphysics [MB3]. We already did that when we exchanged phlogiston for quite a different model of fire.

The problem with emergent behaviour surfaces for instance when designing formal procedures for a work setting. In reality what people do is to develop the given formal procedures into new practices that are, from their own point of view, better adapted to their context of work [WC].

Why then does emergence emerge? The answer is of course that emergence is the result of increased complexity, which is difficult to formalise, or even to comprehend. Our perceptual and cognitive abilities are not accurate enough and the analytical models of today are not sufficiently powerful. The real world (including humans) is too complex to understand!
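The termite example earlier in this section can be tried out in a few lines of Python. This is only a rough sketch under assumptions that are not from the text: the building site is a row of 20 spots, and each new mud ball lands on a spot with a probability proportional to one plus the number of balls already there. The function name `drop_mud_balls` and all parameter values are made up for the illustration.

```python
import random


def drop_mud_balls(n_sites=20, n_drops=2000, seed=1):
    """Drop mud balls one by one on a row of sites.

    Assumed rule (a simplification, not a model of real termites):
    a site is chosen with probability proportional to
    1 + the number of balls already lying there.
    """
    rng = random.Random(seed)
    piles = [0] * n_sites
    for _ in range(n_drops):
        weights = [1 + p for p in piles]
        site = rng.choices(range(n_sites), weights=weights, k=1)[0]
        piles[site] += 1
    return piles


piles = drop_mud_balls()
print("average pile:", sum(piles) / len(piles))
print("largest pile:", max(piles))
print("smallest pile:", min(piles))
```

No single drop is planned, yet the rule "drop where others have dropped" lets some spots grow into piles far above the average while others stay almost empty. The uneven outcome is a property of the interaction between the actors and their environment, not of any individual mud ball.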
Intuitively it might seem that emergence is an unusual phenomenon, but this is not true. We are constantly surrounded by emergent phenomena. How come the chair you are sitting on does not collapse? Why should quantum fields form atoms that are grouped into molecules, constituents of a tree that was used to make the chair? Level after level of emergent phenomena are present in this example. You are yourself an emergent phenomenon!

Not only objects but also behaviours can emerge. One example is panic at a rock concert. How could you counter this behaviour using technology? Should technology be applied on a personal basis, e.g. earphones, or for the whole crowd, e.g. loudspeakers?

"Heureka!" Archimedes

"The white color of all refracted light, at its very first emergence ... is compounded of various colors." Sir I. Newton

It is tempting to say that emergence is a result of interaction! The constituents by themselves are not enough; interaction is the key that merges them, increases complexity, and possibly turns the result into something new.

III.1.8 Space and time, change and mobility

Starting from the beginning we need to briefly introduce space and time. They are presumably the two most fundamental aspects of reality. Why this is so is a rather philosophical question, which we currently cannot answer. Undoubtedly space and time are very important because they provide any system with reference frames and contexts to delve in. An input can for instance be larger or smaller than another, and before or after. We compose a photograph by first placing objects on a scene, and a video is a sequence in time of images where we can watch our grandchildren play.

Let there be light: and there was light…. And God said, Let there be a firmament in the midst of the waters, and let it divide the waters from the waters…..
And God said, Let there be lights in the firmament of the heaven to divide the day from the night; and let them be for signs, and for seasons, and for days, and years:… on the seventh day God finished his work which he had done, and he rested on the seventh day from all his work which he had done. Genesis

"In the Beginning there was nothing, which exploded." Terry Pratchett

(Figure: Aztec calendar.)

A stable system is a nice system; any limited input will deliver a limited output. Unstable systems do not behave that well, but can be very interesting. If, for a particular unstable system, you change the input even a teeny-weeny bit, the system's output in theory rises to infinity. In real life we have no such thing as infinity because we live in a world with limited resources of energy and time. Anyway, you most certainly do not want to design unstable behaviour into your system. One example of an unstable system is a ball on top of a very high mountain. The ball is balancing on the top and even the smallest disturbance will send it down the mountain slope with increasing kinetic energy. Another example is the familiar microphone-loudspeaker interaction that quickly makes you turn the volume down, or move the microphone out of the way. The human equivalent to stability is someone who is not easily disturbed, or brought out of balance.

Patterns are necessary prerequisites and consequences for stability. Adaptation needs a stable background, i.e. a pattern, to adapt to. The ability for pattern recognition is fundamental and extremely important. The example that first comes to mind is perhaps visual and auditory pattern recognition, but the body registers patterns too; a callused hand from playing golf is one example.
A tree also recognises patterns: the foliage tends to be thicker on the south side of the tree because of direct sunlight, and if a tree is partially shadowed it will lean over to catch more light.

(Margin figure: 45 sec breakfast, 3 min bathroom, 2 min dressing, 12 hours work.)

Equilibrium is another, quite natural description of a system state. The balls in the figures to the right are in equilibrium. The ball on the mountaintop is not in a very stable state, but it is in a state of equilibrium since no energy is transferred between the ball and the system. When the ball starts rushing down a slope it leaves equilibrium. Social equilibrium is the human equivalent, where the individuals involved do not have anything important to say, and there are no outstanding questions to resolve.

In terms of energy, the state of a stable system will not be disturbed by a small amount of energy. Equilibrium means that the energy of the system is divided among interaction participants and the environment in a way that does not change with time. A glass of water on a table will soon be in equilibrium, but put an ice cube on the table and you have a system that is not. You can actually try this at home.

"The relevant equation is: Knowledge = power = energy = matter = mass; a good bookshop is just a genteel Black Hole that knows how to read." Terry Pratchett

Time invariance is another of the nice system properties.
It is nice in the sense that a time invariant system is predictable and will respond the same way to the identical input today, as it did yesterday, and as it will tomorrow. Humans are not time invariant, but many computer based applications are. Time variance implies some memory in the system, because if you have no memory of yesterday, why should you change your behaviour today? In general invariance, and not only over time, is something that humans need and look for. It stabilises aspects of the environment and helps us survive. Gravity, for instance, works one way only and has done so for a long time. This makes predictions about falling apples possible.

A static system is a time invariant system with a behaviour that is independent of time; a dynamic system on the other hand is a time variant system. Its behaviour can be described as the function system_behaviour(t), where t is the time. Behind any dynamic aspect of a system, such as time variance and motion, is the fundamental notion of change. The level of change is a trade-off between static aspects, such as stability, familiarity and security, and more dynamic ones, for instance flexibility and creativity. An alternative formulation is that existence is a trade-off between being strapped down and bored on one hand, and being addicted to perceptual experiences on the other. Some fundamental changes are: open, on, start, increase, decrease, pass, rise, collapse, towards, away from, stop, turn off, close.

A typical example of a physical system is an ideal spring. If you pull a spring, the force F you have to apply will be proportional to the amount L that you want to extend it, i.e. L(F) = kF. This is independent of when you pull the spring (it was ideal, remember). If you let go of the extended spring the system is no longer static, but dynamic. The extension of the spring will depend on when you measure it, i.e. L(F, t), how far you pulled it, and when you let go of it, i.e. its initial state.
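The difference between the static and the dynamic spring can be made concrete with a small Python sketch. The static part uses the relation L(F) = kF from the text. The dynamic part rests on assumptions that are not in the text: a mass is attached to the spring, there is no friction, and the spring is released from rest, which gives the familiar cosine oscillation. The function names, the mass, and the numbers are chosen only for illustration.

```python
import math


def static_extension(force, k):
    """Static case from the text: L(F) = k * F, independent of time."""
    return k * force


def dynamic_extension(t, release_time, initial_extension, k, mass):
    """Extension at time t of a spring released from rest at release_time.

    Assumptions (not from the text): a mass hangs on the spring and
    there is no damping. With L = k * F the stiffness is 1 / k, so the
    angular frequency is sqrt(1 / (k * mass)).
    """
    if t < release_time:
        # Still held in place: the static description applies.
        return initial_extension
    omega = math.sqrt(1.0 / (k * mass))
    return initial_extension * math.cos(omega * (t - release_time))


# Static: the answer does not depend on when we ask.
print(static_extension(force=2.0, k=0.5))

# Dynamic: the answer depends on the time of measurement, on how far
# the spring was pulled, and on when it was released.
for t in (0.0, 1.0, 2.0, 3.0):
    print(t, dynamic_extension(t, release_time=1.0,
                               initial_extension=1.0, k=0.5, mass=2.0))
```

The static function needs only the force, while the dynamic one also needs the time, the release time, and the initial extension, which is exactly the initial state mentioned above.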
A very important aspect of a system is its current location and whether this location is changing. Mobility makes exploration of a space possible, and the necessary prerequisite, motion, unites the concepts of space and time. The notion of space is not limited to the physical space of reality; a world created inside a computer also qualifies as space, and a social space is another example of a world where mobility means navigation using language and emotions. One space where we still have not figured out how to move is time itself.

The basic categories of animate motion for survival and reproduction are: pursuing, evading, fighting, and courting. In abstract terms an interactor moves to access a resource. We go to see our friends; nomads move their tents when seasons change, or when the local food supply is finished. The main feature of a mobile system is to maintain a continuous availability of resources, for instance a communication channel. If a resource for some reason is not accessible often enough, the system, and the designer, has failed. A GPS device that does not give the position when you are lost in the woods will not be used for many excursions. Sundials do not tell you when it is time to go to bed.

"Right, you bastards, you're... you're geography" Terry Pratchett

Infinitesimal change. The Long Now Foundation is developing a 10,000-year clock.

(Figure: spring with force F and extension L.)

Definition: Mobility - the quality of moving freely (Ask Jeeves on the Internet)

If the mountain will not come to Muhammad, then Muhammad will go to the mountain.

Interaction, and especially sharing a resource between two interactors, is more difficult while moving. Without a stable context to use as a reference, communication efficiency can be low, and distracting interrupts from the changing context are more likely. If sound or vision is used, interaction is only possible within a small area, or if the two systems are moving in parallel at approximately the same (low) speed.
In all other cases additional technology is needed. A mobile phone solves the problem, but introduces new problems, for instance low bandwidth channels and lost messages.

Mobility has many implications. First of all a mobile system has to bring its own energy source, or somehow extract energy from the environment, e.g. using solar cells. If the system is to be carried around it should not be too heavy, or otherwise be in the way, which implies restrictions on its size. Similarly a mobile system cannot communicate over the wired network without a wireless access network. In general, for any mobile system, we have to bring technology along with us, or use some pervasive infrastructure, e.g. a watch, church bell, or sundial.

Almost any change of the physical environment of a moving system is stochastic because we do not have enough information to infer how the context at the next position will differ from the current one. Try out the experience of travelling over the plains in the south of Germany and suddenly seeing the Alps rise sharply in front of you. It is quite a surprise. Naturally, both the physical position and the social context change as the system moves, but so do other environmental constraints, such as lighting and time zone. The wireless transmission capacity could be reduced, or there may be no one in sight to interact with. Some mobile systems are carried around, and in this case the carrier at least provides a stable context.

M. Duchamp, Nude descending a staircase

Definition: Mobile, capable of moving or being moved about readily.

Up in the sky, swoop, swallows flees and mosquitous follow, loop.
Up in the sky, swoop, swallow flees and mosquitous, follow, loop.

A major problem is that changed context forces real time constraints on the system. Recall from the previous chapter that recall from memory, signal processing, feedback, adaptation, planning, and learning must be fast enough.
Also, important tasks such as identification, manipulation, and navigation by a mobile system will be done under time pressure, as will decision-making. Guiding the driver of a car using a map in a large unknown city is difficult. Just thinking about sitting beside the swearing driver during rush hour, guiding him to the train station, where the train is about to leave, raises the stress level.

Is mobility a prerequisite for intelligence? Let us limit the discussion to the interactor Thing in the physical reality. The earth orbiting the sun is not a very intelligent system by the criteria that we have discussed, so mobility is not enough for intelligence. Many things behave intelligently for specific simple tasks, such as playing chess, but this intelligence is not learned; it is programmed into the thing by a (mobile) programmer. Even if we gave the thing wheels, and close to unlimited processing power and memory, it would still not manage in the physical reality. A major reason for this is its limited sensory system. If we added a distributed network of sensors and the ability to process the resulting information, would we now have enough functionality for intelligence? In effect the distributed sensors make a stationary system mobile without the need to move, but to build an intelligent system we also need effectuators. Through them a thing could affect the world and learn to draw conclusions from the result, i.e. the thing in this case is an autonomous, context aware, interactive system.

As the philosopher Hegel said: Always already

Some researchers claim that even this is not enough. According to them the thing needs to master a language and use it to explore reality, interacting with equals that probably are mobile too.

Given the characteristics of mobile systems discussed above, it is no surprise that designing flexible, usable, secure user interfaces for them is a real challenge.
One possibility would be to use space and time as a basis for this book. The problem with this is that they are too fundamental. We sense their presence in everything, but cannot really exploit this feeling when describing interaction and technology for interaction. Therefore we will not start this presentation all the way from space or time, but leave a little something to philosophers and artists. (Well, all right, here is a piece of wisdom to wrap this passage up. Space without something in it is truly meaningless, OK? Enter matter, which in turn is energy. A chunk of matter without anything happening is not very interesting, so we need events. As we add events time starts flowing and rock 'n' roll is not very far away. It's really very simple, and in the end it's all about having fun.)

(Figure labels: Motion, Objects, Actions, Time, Frame of events.)

III.2 Complexity, we and it are certainly complex

The most important property of a system is perhaps its complexity. All of us have had encounters with complex systems that we cannot comprehend, fix, adjust, or manipulate. Examples are the inner workings of a television set, and organising a group of ten 10-year-olds at a birthday party. In other words, complexity is a natural concept to us humans, but how can we measure and quantify complexity? If we could estimate it we would know whether the task at hand is practicable, which helps us to predict and decide, see table III.2.1. For a system formulated as a software algorithm such estimation is usually possible. It is however much more difficult to quantify the complexity of the discussions around the dinner table, Friday night, at eight o'clock, after a glass, or two, of wine.

Low complexity          High complexity
Overview                Details
Fast reaction           Slow response
Cheap to manage         Expensive
Small amount of data    Large amounts of data

Table III.2.1 Some properties of complex systems versus systems with low complexity.

III.2.1 Why are systems complex and difficult to understand?
As human beings we have a confession to make. Our environment is too complex for us, and the fact that we have survived thus far is truly amazing! At any one time we can keep track of three to seven items, which is not very many in the physical world. The input rate is also rather low. We can focus on approximately 20 bits of information per second, where one bit is either a 0 or a 1. Compare this with the million bits per second needed to code a video transmission of an ordinary television show.

Definition: A complex system is not easy to understand or analyze.

It is easy to underestimate the complexity of ordinary life. Perhaps this is because we are so used to our own environment that we live our life through powerful, already accepted and learnt, abstractions? In design and development of technology we are on the other hand constantly faced with the details of implementation, and the abstractions are swept away, or are useful only as referential patterns. Obviously, we have somehow survived, so let us proceed with the complexity issue.

Why are systems complex? One reason for this is hinted at in the definition to the right. To understand a system we need to understand its parts, and to understand the parts we need to understand how they interact.

We start by looking at a system we all know well, the family. It is an example of a fairly complex system and the complexity is the result of, among other things, the following properties. There are different kinds of families, e.g. the nuclear family, but all of them consist of a number of individuals, each with a relationship with the others. The relationships depend on the qualities of the individuals. Families do not live alone; they have to interact with the outside world. We can describe different views of the family, economic or social, and a family fulfils different functions such as raising children, and providing a sheltered social environment. A family is formed, grows, and disintegrates in a number of ways.
It also adapts to changing circumstances.

A stone is perhaps the simplest of things, but in the world of man, even a stone provides a stunning complexity. To start with, its exterior has a texture, colour, and structure that is very difficult to describe in detail, but we can still easily separate granite from sandstone, and without looking we easily discriminate a round stone from an apple using our hands. Internally granite is composed of quartz and potassium feldspar, but also of other minerals depending on the conditions when it was formed. Quartz is typically grey to colourless and potassium feldspar is almost always pink coloured. Actually feldspar is a generic name for three very closely related minerals: orthoclase, sanidine, and microcline, with similar physical properties. They are all composed of the same elements, but with slightly different crystal structures. A small stone is something that we can throw, but it can also be used to mark one post of a soccer goal, or to stop a car from rolling down a slope. Several religions even recognize a divine deity in stones.

Definition: A complex system consists of interconnected and interwoven parts.

"Complexity: When a wine is at once rich and deep, yet balanced and showing finesse." Internet

(Margin pictures: Hadschar al Aswad, the Moslem sacred "black stone"; Kohinoor.)

Some systems, like the weather, are natural constructs, while others have a complexity because of man. Why was it necessary to make them complex? One reason is that we try to build general systems that are applicable in many contexts, but generalisation comes with a price tag. Radio is for sounds only, but the Internet can be used by many media, at the price of complex behaviour and resources spent. A general system can handle many types of inputs, but more specialised systems restrict their input and functionality.
Compare a bicycle to a family car: both can transport one person, but the more complicated car is useful under many more circumstances.

Another reason for complexity is that we need, or want, to perform complex tasks, and even many relatively simple problems become complex because we do not have the time, computational resources, or memory to solve them in a straightforward way. Because of the development of technology we can manage new and more complex tasks, and our interaction with the world becomes increasingly technology dependent. We want to visit Los Angeles and London, listen to music when we are out running, and use the Internet to contact our children, at our summerhouse, out in the archipelago.

Simple systems?
• Oscillator
• Pendulum
• Orbiting planet
• Spring

"Simplify, simplify." Thoreau

"Complex problems have simple, easy-to-understand wrong answers." Internet

(Margin figure: 1 2 3 4 5 6 7 ∞)

When is a system complex? One obvious answer is that if a system consists of a large number of components (more than 7) then it is a complex system. Adding components results in a more complex, but potentially also more adaptive system, which is one reason why increased complexity is built into progress.

Another measure of complexity is the number of state variables that we need to describe a system. If we need a lot of them to describe the system (more than 7) it is a complex system. Whenever state variables are mutually dependent the complexities multiply, which is why systems involving interaction tend to be very complex. This can be reformulated as: feedback and interaction add to complexity.

Highly concurrent systems where subsystems work in parallel, rather than in sequence, are also complex. This is not surprising, as the number of possible states at a given time will increase. Dynamic systems, for instance mobile systems, are inherently more complicated than static ones, and timing issues such as synchronisation and delay complicate them even more.
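The claim that parallel subsystems increase the number of possible states can be illustrated with a small count. This is only a sketch under a simplifying assumption that is not in the text: every component has the same number of states and may be in any of them independently of the others. The function names are hypothetical.

```python
from itertools import product


def joint_state_count(states_per_component: int, components: int) -> int:
    """Number of joint states when every component can be in any of its
    states independently of the others (a simplifying assumption)."""
    return states_per_component ** components


def enumerate_joint_states(states, components):
    """List every joint state explicitly; only feasible for small systems."""
    return list(product(states, repeat=components))


if __name__ == "__main__":
    # One switch has two states; each added switch doubles the joint state space.
    for n in (1, 3, 7, 10):
        print(n, "components:", joint_state_count(2, n), "joint states")
    print(enumerate_joint_states(("on", "off"), 2))
```

The count grows exponentially with the number of components, which is one way to read the remark that seven or so components already make a system hard to keep in mind.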
Each of us has encountered some technical equipment with too long a reaction time. When you switch it off, it does not respond, so you become impatient and try the switch again, which turns the system on once more, and so your struggle continues, constantly out of synchrony. Time delays that vary in an unpredictable way are even more confusing.

The Internet now consists of more than $10^8$ interconnected computers, a figure quite close to the $10^9$ human individuals now living.

"Me complex? He, he, a naughty teaser you are."

Humans are good at handling many complex systems. We can recognise a familiar face behind sunglasses, or behind a hat pulled down over the eyes. Try building a robot system to wash up after dinner! How much force can you apply to hold a plate without breaking it when you clean it? A creative solution is to buy a dishwasher (or use paper plates).

Last, but not least, stochastic systems, where we only know probabilities of events and behaviours, are more complex than the deterministic ones. This is not surprising since we know less about the system's states. To reduce complexity a stochastic signal is typically characterised by its mean value. The average Swedish male for instance has a shoe size of about 42, but this fact is of little use in the shoe shop when you want to buy your uncle shoes for Christmas.

"The answer to the ultimate question? 42!" D. Adams

Science has defined many measures of complexity for different purposes, some complementary, or even contradictory. Intuitively a highly regular thing is simple; think about a circle or a cube. We could also characterize something that is very irregular, or highly random, as having low complexity. White noise is mathematically very simple to describe, even though it is impossible to predict the value of such a signal at a specific time.

Maybe we can understand complexity better if we contrast it to simplicity?
The table below lists some differences between simple and complex systems [RR2].

Simple System                           Complex System
Fully predicative and predictable       Contains impredicativities and is non-predictable
Fully fractionable                      Contains non-fractionable aspects
Has computable models                   Has non-computable and computable models
Synthesis is the inverse of analysis    Synthesis generally distinct from analysis

Table III.2.2 Contrasting simple and complex

One way to define complexity from a human perspective is that something is complex if it surprises us, makes errors, or behaves unexpectedly. Another, also intuitively acceptable, definition is that a complex system is difficult to fully describe. Next we will discuss how to reduce complexity, i.e. how to help us stay in control and understand what is happening.

(Margin picture: "War of the ants", with axes Rate and Time.)

"Simple – Not involved or complex" Webster's dictionary

III.2.2 Reducing complexity

It is a fact that humans somehow survive. How is this possible? What mechanisms have helped us to manage the amazing complexity of reality? We are geared towards recognition of patterns of behaviours, objects, and structures. This is supported by our physiology, and might be the best (only) way to master complexity. Pattern recognition is however only the tip of the iceberg. There are many mechanisms and principles supporting and complementing it.

Some characteristics of really complex systems are: they learn, adapt, react, organize, mutate, evolve, explore, couple, expand and organise.

The first principle is to group similar or closely linked items together and the second principle is to order the items. The first principle works by reducing the number of features, behaviours, relations, or whatever the type of components attended to. It also applies to the case where something occurs often. Information about such events will be stored efficiently and for fast access. The second principle reduces the perceived randomness of the system under study.
We will discuss grouping and ordering in this chapter, but there are also other strategies. Attention, i.e. ignoring information, is one, and concentrating on differences another.

Grouping will create networks of hierarchies of items and hide details behind interfaces. A good grouping will result in a structure that is modular with low coupling between different partitions. In physical reality space and time provide natural references for grouping, e.g. a family lives at an address over a period of several years. Events or things sensed simultaneously probably have a relation.

Coupling and cohesion are two measures of how close two objects are. Coupling refers to the number of connections between objects, and cohesion to what extent they are glued together. They could for instance be parts of a pattern. Two objects with high coupling are a brother and a sister; less coupling is found between cousins. A modular group with high cohesion and coupling can easily be encapsulated as a new item, or concept, like a football team, a kitchen, or a lawn. The windows on a house have an extremely high level of cohesion to the house.

Figure III.2.1 Aggregated systems. (Labels: Car with Tires and Engine; Family with Parent and Child.)

There are many notations for describing grouping. In UML (Unified Modelling Language) grouping components into a system is called aggregation, denoted by a diamond as shown in figure III.2.2. The diamond is placed at the grouping object's end of the association. A car is a grouping object, aggregating four tyres.

(Figure labels: Car, 4, Tyres.)

When we group similar objects and abstract the similarities into a new object, the resulting object is referred to as a generalisation, or in UML as an "Is a" relationship, or in other disciplines as an abstraction. In art the word abstraction is however used differently, interchangeably with "nonobjective imagery that departs from representational accuracy". The figure item to the left in figure III.2.3 below is a generalisation of a circle.
If we follow the relationship the other way we say that the circle inherits the properties of the figure.

Figure III.2.2 Aggregation in UML.

(Margin picture: Simplification à la Pablo Picasso.)

Figure III.2.4 Generalisation as it is used in object oriented programming, described in UML. (Labels: Figure with Colour and Draw(), "Is a", Square, Ellipse, Circle; Man, "Is a", Father, Son.)

Generalisation for both actions and objects is useful, even necessary, but the downside is that we lose track of any inner workings and internal details. In real life humans exploit such additional information in many ways. If you are told that someone has escaped, it makes a difference whether you are a terrorist or a magician. If you are a magician you are interested in whether the escape was from a cast or from a chain. You would like to see a video because maybe you could reuse some ideas. All of this is hidden by abstraction in the action verb "escape".

In software, abstraction is used to hide the implementation of subsystems. This makes software easy to reuse at the cost of losing control over internal data structures, a control that could be vital for optimised implementation of the system as a whole.

The use of inheritance simplifies because one object serves as the blueprint that can be extended to describe a new system. The circle can be described as a particular graphic figure that inherits the properties of the abstract graphic figure, such as having a colour and being something that you can draw, see illustration to the right. A new figure, a rectangle, can be created from the basic graphic figure and will also immediately inherit the colour property and the basic method draw().

Similar to generalisation and aggregation is scaling. The basic idea is to reduce complexity by studying the system at an appropriate level of detail. If we for instance want to calculate the orbit of the moon we should ignore the effect of people on earth, and model the system as two solid bodies interacting by gravitational forces.
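Returning for a moment to the inheritance example with the graphic figure, the circle and the rectangle, the relationship can be written down in a few lines of object oriented code. This is only an illustrative sketch: the attribute names, the constructor arguments and the text returned by draw() are assumptions, and only Colour, Draw() and Stretch() come from the illustrations.

```python
class Figure:
    """Abstract graphic figure: has a colour and can be drawn."""

    def __init__(self, colour: str) -> None:
        self.colour = colour

    def draw(self) -> str:
        # A real system would render something; here we only describe it.
        return f"{self.colour} {type(self).__name__.lower()}"


class Circle(Figure):
    """A circle "is a" figure and inherits colour and draw() unchanged."""

    def __init__(self, colour: str, radius: float) -> None:
        super().__init__(colour)
        self.radius = radius


class Rectangle(Figure):
    """A rectangle "is a" figure and adds its own operation, stretch()."""

    def __init__(self, colour: str, width: float, height: float) -> None:
        super().__init__(colour)
        self.width = width
        self.height = height

    def stretch(self, factor: float) -> None:
        # Stretch along the width only (an arbitrary choice for the sketch).
        self.width *= factor


if __name__ == "__main__":
    for figure in (Circle("red", 1.0), Rectangle("blue", 2.0, 1.0)):
        print(figure.draw())
```

Neither Circle nor Rectangle defines draw() or stores the colour itself; both come from Figure, which is the sense in which the blueprint is reused.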
Nicely complementing scaling is focusing. When we focus we ignore everything we do not attend to. By dynamically changing resolution levels and adjusting focus we can find the optimal way to study a system. At the lower resolution we make assumptions and build hypotheses, which we test at higher resolution.

(Margin picture: Pablo Picasso, i.e. Pablo Diego Jose Francisco de Paula Juan Nepomucenco Maria de los Remedios Cipriano de la Santisima Trinidad Ruiz Picasso.)

(Illustration labels: Figure with Colour and Draw(); "Is a"; Rectangle with Stretch().)

Hint: Hungry for blood

Order reduces complexity and also the computational demand for access (search time), which is important for many tasks. The telephone directory would be quite useless without sorting the subscribers. Order can be imposed by sorting into simple categories, such as in the phonebook, or according to more complex criteria. One example is the family tree for our flora where flowers and plants are sorted in a way that is not always obvious. Another example is when we order actions to form a plan.

Ordering does not only show up for reasons of simplified data access. Sometimes orderly behaviour and specific sequences of actions are appropriate. There are furthermore also constraints in the time domain, for example that you have to start working before you can stop, and also spatial constraints, such as that passing the hallway is necessary to get from the front door to the living room.

A special case of order is symmetry, i.e. where entities exhibit correspondence in size and shape. The left and right parts of the human body and face are for instance roughly symmetrical. Other special cases of order are repetition of identical, or similar, instances, i.e. multiplicity, empty space, and total randomness.

Programmers look for the features above (similarities, order, symmetries, couplings) in a problem. If for instance a similar functionality is needed in several places, the software for this functionality can be reused.
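The remark about the telephone directory can be made concrete by comparing a lookup in an unsorted list with a lookup in a sorted one. The sketch below uses Python's standard bisect module for the sorted case; the names in the directory and the function names are made up for the illustration.

```python
from bisect import bisect_left


def linear_lookup(names, target):
    """Search an unsorted list; returns (index, comparisons made)."""
    comparisons = 0
    for index, name in enumerate(names):
        comparisons += 1
        if name == target:
            return index, comparisons
    return -1, comparisons


def sorted_lookup(sorted_names, target):
    """Binary search in a sorted list; returns the index or -1."""
    index = bisect_left(sorted_names, target)
    if index < len(sorted_names) and sorted_names[index] == target:
        return index
    return -1


if __name__ == "__main__":
    subscribers = ["Olsson", "Andersson", "Berg", "Nilsson"]
    print(linear_lookup(subscribers, "Nilsson"))
    print(sorted_lookup(sorted(subscribers), "Nilsson"))
```

With n subscribers the unsorted search may have to look at all n entries, while the sorted search needs on the order of log2(n) comparisons. The price is that someone has to sort the directory first and keep it sorted.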
Almost any structure found in a problem can be used to simplify the design and the implementation of software, and of course also to simplify other types of design.

Grouping and ordering work for human interactions, and also for human-computer interactions. One example is a hierarchical menu system where each menu provides commands that have something in common, as exemplified by the edit menu in Microsoft Word ©. Society tries to order and group humans, but this is not trivial; the medium size does not seem to fit all of us. Humans on the other hand are adaptive, which means that if we want to, we ourselves can reduce complexity by accepting the grouping and the orderly behaviour imposed on us. We start working at about the same time in the morning even if it is not necessary, and go to lunch when our colleagues do. Why?

What about nature in general? What is its level of complexity? Systems that are too simple have problems adapting to changing conditions. But very complex systems on the other hand need to extract a lot of energy from the environment to survive. There is a productive cycle here that human evolution has explored. Adapt to gain access to more energy, use the energy to adapt and find even more energy, and use it to further fuel adaptation.

One important concept for orderly behaviour is the cycle, where something is kept constant over each cycle. Repetition, iteration, and recursion all profit from the cycle, which also naturally describes many aspects of our lives: the cycle of life, the seasons of the year, 60 seconds, and the CPU cycle.

Reducing complexity by layering is another useful way of grouping. Functionality in computer programs, information processes and the like, can be grouped and ordered this way. The system model used in this chapter is itself divided into input-processing-output layers. Another example is our vision.
The eye registers light and pre-processes it, the pre-processed signal is sent to the brain where it is further processed and delivered to the cortex of the brain.

(Margin pictures: Photograph of snow. Rendering of ocean waves. Rendering of sand dunes. Photograph (left) and rendering (right) of human skin.)

Why are so many solutions in nature, roses, waves at the sea, snowflakes, beautifully ordered and structured?

Do you prefer a cyclic model of the Universe (rebirth for ever) to a singular (one shot)? Why?

"The king is dead, long live the king."

Figure III.2.5 Layered systems are found everywhere in nature. a) Human sensory system, hearing, vision, smell. b) Human vision. c) Information flow in an organization. (Each panel is drawn between an Input and an Output.)

The figure above shows three different situations where layering is used. Given $n$ layers and $m$ possible actions per layer, situation a) above allows $m^n$ interactions since from any action at a layer, $m$ new actions are available. In situation b) there are $(n-1)$ interfaces to consider and the number of interactions is $m^2(n-1)$, a much less complicated system. The simplicity gained however comes with a cost, as always. Implementation of a complex system becomes manageable as functions are separated, but the cost is that shortcuts between layers are difficult to implement. Someone on the factory floor of a multinational corporation will have few opportunities to address the big boss himself, compare c) in the figure above. There are too many opaque intermediate levels.

Humans are good at finding criteria for ordering and imposing different structures. This is exemplified in registration plates on cars, university departments, and books in libraries. The social security number is another interesting example. It has a structure specially designed for computers! Finding a good structure is a creative task. It takes time and experience to find one, and a bad choice will usually result in more computations.
It will take longer to find and sort items for manipulation.

Layering is also an example of divide and conquer. Complexity is conquered by intelligently dividing a complex set of items, or a problem, into subgroups, and then recursively dividing each subgroup until manageable subgroups are achieved. Caesar used this when he split his enemies into smaller groups and defeated them one by one. In order for this strategy to work the subgroups should have low coupling. In Caesar's case the low coupling was provided for free by his opponents themselves, who were not too eager to co-operate.

"Divide et impera." Julius Caesar (100 - 44 BC)

Give some example where the strategy divide and conquer does not work because of dependencies among subproblems.

Complexity is not always a bad thing. Because of complexity many solutions to a problem can be found, and we can exploit this by adding, or rather reconsidering, some degrees of freedom of the problem we study. If we consider repainting the kitchen we might find new dinner plates an acceptable alternative. Einstein reconsidered the constraints on how light should behave, and invented relativity theory.

III.2.2.1 Examples of groupings

Gestalt theory studies the laws of human perception through the use of visual patterns. Individual parts combine to reveal identifiable patterns. This combination is done as an active process by our minds using sensory input. The following laws for grouping have been identified by Gestalt theory.

Gestalt theory says that humans actively identify patterns. Why do humans have such a facility? How has it developed? Is it possible to copy human pattern matching to a computer?

1. Law of proximity. Objects close to each other are grouped. The example is seen as three groups rather than nine characters.
2. Law of similarity. Similar objects are grouped.
3. Law of closure. We close structures where parts are missing.
4. Law of appropriate continuation.
We assume that structures behave regularly, and as simply as possible, when we cannot see them.
5. Law of common fate. If almost everyone is involved, or takes part, we assume that the rest do likewise.
6. Law of Prägnanz. Whenever there is a choice, a simple structure is preferred to a complicated one. The figure will be interpreted as a square and a triangle although other interpretations are possible.

(Proximity example: xxx xxx xxx)

(Margin picture: How many? 2? 3? 4? 5? 6?)

The image to the right exemplifies that humans have problems whenever an image does not adhere to the regular, simple structures we have evolved to interpret. Gestalt theory applies to sound as well. We will for instance fill out missing parts of a sound to match it against our expectations.

How we do our grouping will always depend on our point of view. Let us say that you are interested in animals. At one point you may want to look up all of the dogs in a database, but at the next you want all brown animals. A database is a computer tool for retrieving such diverse associations and one way to implement it is to group characteristics in tables, see figure III.2.6. This is called a relational database.

Animal 1     Animal 2        Animal 3
Dog          Human           Cat
Brown        Black / pale    White
Four legs    Two legs        Four legs

Figure III.2.6 A relational database.

When you need to find all of the brown animals in the database you just search it and check the third row. If you want all brown animals with four legs you check the third and fourth rows.

We will not go into sorting algorithms here, but as you can understand from the above, sorting is very important for efficiency reasons. No one really knows (yet) how humans perform grouping, sorting, and searching in their internal database.

Grouping as described above (Animal (Dog, Human, Cat)) is one example of approximation by generalisation; another type of grouping is approximation by relaxing constraints.
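The animal table in figure III.2.6 can be tried out with a small in-memory relational database. The sketch below uses Python's standard sqlite3 module. The table name, the column names and the helper functions are assumptions made for the illustration, and each animal is stored as a record with the characteristics as fields, which is the usual orientation in a relational database (the figure lists them the other way round).

```python
import sqlite3


def build_animal_db() -> sqlite3.Connection:
    """Create an in-memory database holding the three animals of the figure."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE animal ("
        "id INTEGER PRIMARY KEY, kind TEXT, colour TEXT, legs INTEGER)"
    )
    conn.executemany(
        "INSERT INTO animal VALUES (?, ?, ?, ?)",
        [
            (1, "Dog", "Brown", 4),
            (2, "Human", "Black / pale", 2),
            (3, "Cat", "White", 4),
        ],
    )
    return conn


def kinds_where(conn, colour=None, legs=None):
    """Return the kinds of animal matching the given characteristics."""
    query = "SELECT kind FROM animal WHERE 1 = 1"
    params = []
    if colour is not None:
        query += " AND colour = ?"
        params.append(colour)
    if legs is not None:
        query += " AND legs = ?"
        params.append(legs)
    query += " ORDER BY id"
    return [row[0] for row in conn.execute(query, params)]


if __name__ == "__main__":
    db = build_animal_db()
    print(kinds_where(db, colour="Brown"))
    print(kinds_where(db, colour="Brown", legs=4))
```

The same table answers both "all brown animals" and "all animals with four legs" without being reorganised, which is the point of grouping the characteristics in a table.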
(Margin figure labels: White, Dog, Red, 4 legs, Brown, Human, 2 legs.)

Figure III.2.10 Approximation by relaxation. (Two plots of Output against Input.)

The figure above shows how data values can be relaxed. The simplification is not only visual; the mathematical description of the resulting linear system also becomes very simple. The output in the approximated system is simply a factor k times the input plus a constant offset, i.e. output = k·input + offset.

Give an everyday example of approximation by relaxing of constraints.

A geometrical way of representing even more complex signals and systems is to use a vector space. Pixels in a greyscale image can for instance be ordered in such a vector space. Each pixel in the image defines an axis in a coordinate system, i.e. one dimension in a vector space. An image with two pixels, (23, 12), will be represented as a point in a two-dimensional plane, see figure below.

Figure III.2.11 Image with 2 pixels. Pixel 1 has the greyscale value of 23. (Axes: Pixel 1, Pixel 2; point: Image (23,12).)

Images with four pixels are similarly represented by four axes, i.e. in four dimensions, where each axis represents the value (colour or greyscale) of a pixel.

Figure III.2.12 Image with four pixels. With eight possible values (three bits) for each pixel we have a space of $8^4 = 4096$ images; with eight bits for each pixel the space grows to $256^4 = 2^{32}$ images. (Axes: Pixel 1, Pixel 2, Pixel 3, Pixel 4.)

A specific picture, i.e. with values defined for each of these four pixels, will be represented as a point in this coordinate system. In fact, all possible pictures with four pixels can be represented by the space spanned. A typical image displayed on a computer is 400 by 600 pixels. If each pixel is represented by 3 bytes the total number of images is $2^{24 \cdot 240\,000}$, which is quite a lot of images (more than a 1 followed by 240 000 zeroes).

A super sphere is the generalisation of a sphere to more than 3 dimensions. One way to use the vector space is to approximate all pictures inside a super sphere to the point in the middle of the super sphere, see figure below for an example in 4D.
Hey presto, we have invented an image compression algorithm where we can represent a lot of similar pictures with only one point. This technique is also called clustering. Of course there is a price to pay: the approximated pictures will be distorted. Vector spaces are powerful tools, so think this example through!

"Best compression, around 100% for "DEL"." Internet

Describe some everyday situations where you use clustering.

Figure III.2.13 Using the image space for compression. (Axes: Pixel 1, Pixel 2, Pixel 3, Pixel 4; a supersphere, with all pictures in the supersphere approximated by its centre.)

III.3 Modelling, it and us

A model is a mapping of a system, or a design, onto something formed, natural or artificial, physical or virtual, that can be used to test, explore, or communicate aspects of that system, or that design, phew... The word derives from the Latin word modulus, which means measure or scale. Used in design, a model is a device for understanding, communicating, testing and predicting aspects of systems.

It is important to distinguish between the system itself, and the model that represents the system, see figure III.3.1. The model is not reality, at least not until virtual reality improves considerably; it is an abstraction, from a particular view, of the system under study. Some formal systems such as those constructed in mathematics are exceptions and have an exact model, namely the system itself. This is true also for virtual worlds that do not simulate reality.

"Modeling is coping with complexity." Van Dam

"Things should be done as simple as possible, but not simpler." Albert Einstein

Figure III.3.1 How a model (upper part of the figure) relates to a system (lower part). (Labels in the model world: Abstract model; Deductions, reasoning, calculations; Forecasting, comprehension. Labels between the worlds: Abstraction, modelling; Application, evaluation. Labels in the system world: Observations; Questioning; Anticipation, understanding.)
On the other hand, what you learn from any model by reasoning, or calculation, can easily make an impact in the real world, see figure III.3.1. One example is that a weather forecast can cancel the next day's skiing. The taxation authorities also use rather advanced and influential models.

Sometimes results and understanding are better reached by questioning the system, rather than by model based reasoning, or by solving equations. The horizontal arrow in the system world in the figure above indicates this. Individual humans are for example very difficult to model; it is easier to ask them directly. The answers can be used to improve predictions, or for evaluating a model.

Usually a model is designed for a specific purpose and supposedly irrelevant details are omitted. The mapping from the system to the model should be as close as necessary, but not closer, which implies that any model comes with built-in errors and that there is no universally best model in any domain.

"We have to remember that what we observe is not nature itself, but nature exposed to our method of questioning." Werner Karl Heisenberg

Figure III.3.2 An image and a digitised model of its essentials as a signal.

If you for example want to model the painting in the figure above, the frame is not relevant information. If you want to build a motherboard for a computer your model will probably ignore the colour of the motherboard.

Free painting included!

Who should do the modelling? When developing a computer based service, full time workers in the modelled domain are preferred! The problem is that knowledge about modelling is mostly found among software developers, mathematicians, and physicists, not among the practitioners of the domain. In general anyone involved in designing a system benefits from knowledge of modelling.

How do we create new models? The following list gives some suggestions [NG3].
• Composition and decomposition of previous models.
• Reordering.
• Deletion and supplementation. One example of supplementation is the phi phenomenon, where a dot of light moving fast back and forth will be perceived as a line, i.e. an internal model of a line emerges in the viewer by supplementation.
• Deformation.
• Emphasizing parts of a model in a new way.

"There is nothing new under the sun."

This is of course only one of an infinite number of ways to model modelling. Remember, no single description will ever be the ultimate one for all purposes.

"There is no more a unique world of worlds than there is a unique world." Nelson Goodman [NG3] (think of a model as a world)

III.3.1 Abstraction level

One of the key questions in modelling is to decide the abstraction level to use. A model that is too general will not reveal any secrets; interesting behaviours will not be seen. Nor will the model be of any use if it is too specific. We can model a car as a chariot of war or of triumph, a vehicle of splendour, dignity, or solemnity, but this model will not help us if we need information about how to change the tires. The maintenance guide of the car, on the other hand, is of no use to us if we want to describe what it feels like to drive a car, or what petrol smells like.

Definition: Abstraction is a view of a problem that extracts the essential information relevant to a particular purpose and ignores the remainder of the information. IEEE Standard Glossary of Software Engineering Terminology

Abstracting behaviour and operations is also useful, for instance in situations where many actions are necessary to accomplish a goal. Changing all headings to uppercase may need one operation per heading, but by using an abstraction in the form of a macro we could do all changes with a single command.
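A macro of this kind can be sketched in a few lines of Python. The rule used here to recognise a heading (a line starting with "# ") is purely an assumption for the illustration, as are the function names; any other test for "is this a header?" could be plugged in.

```python
import string

# Translation table mapping a-z to A-Z only; other characters are untouched.
_A_TO_Z_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def is_header(line: str) -> bool:
    """Hypothetical rule: a heading is a line that starts with '# '."""
    return line.startswith("# ")


def macro_up(lines, header_test=is_header):
    """Change a-z to A-Z in every heading, leaving other lines untouched."""
    return [
        line.translate(_A_TO_Z_UPPER) if header_test(line) else line
        for line in lines
    ]


if __name__ == "__main__":
    document = ["# introduction", "Some body text.", "# the hiti model"]
    print(macro_up(document))
```

One call to macro_up replaces one manual edit per heading, which is exactly the kind of abstraction over operations that the paragraph describes.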
Macro_up:
    If header Then
        Change a-z to A-Z
    End If

III.3.2 Modelling view

At a given level of abstraction any system can be studied from one or more of three different perspectives: intentional, conceptual, or physical [DB1].

The intentional view describes the system from the perspective of how, and where, it is going to be used, and what goals and expectations it can fulfil. One example of an intentional view of a system that transmits a message is as a sender that intends to tell you what time it is. While doing this we should ask ourselves if our model is useful. Modelling the mental states of a light switch does not add much value to us.

From a conceptual view, we can learn how the system works, its properties, and what mental model we should use to understand the system. To explain a concept, similarity, pattern matching, metaphors, and cultural associations are important. Understanding the principle behind a cut and paste function in an editor is one example; another example is a message viewed as packetised information.

(Margin figure labels: A, B.)

The physical view is not too difficult to understand. It is concerned with how the system and the real world influence each other. At this level a packetised message can be modelled as a sequence of bits, sent over an electric cable. Another example of a physical level model is a neuro-physiological model of the workings of the eye. The 3D-model of a coffee cup with a handle that is a perfect match for your finger is a third, and a red button indicating a stop function a fourth.

Why do we need three levels? Why not two or four? This question is the subject of an ongoing discussion among philosophers, but it seems that at least three levels are needed. One physical interaction can be used to implement many concepts, and many concepts can use the same physical interaction. The cut and paste editing function can for instance be implemented through the use of either a keyboard or a mouse.
In a similar way many intentions can make use of the same conceptual function and many alternative concepts can be used to fulfil an intention. You can achieve an objective in many different ways. If you want to tell your mother some good news to cheer her up, you can either visit her and tell her in person, or send her an email. The intentional view is needed because the concept of sending your mother a mail does not include the additional information that you want to cheer her up.

(Figure labels: Intentional, Conceptual, Physical.)

Can you think of an example where a single concept is used in many intentional views?

At which levels of description is a car controllable and observable through measurements?

Reflecting on systems from these three perspectives is something that humans do all of the time and it is a very useful practice in a design process.

We can use an alarm clock to illustrate the three perspectives above. This specific clock is added as an extra function to a digital camera. If you are told that a digital camera is equipped with such a function you will understand, because of the intentional view, when it can be used. You will start looking for a user interface, i.e. you use a physical view of the device, and you expect, from your conceptual view of an alarm clock, that this interface will give you the opportunity to set the alarm, set the current time and turn the alarm off. This leaping between levels is typical for how humans think, matching different views.

Why? Who? What? When? How?

Next we will present other ways to model systems, starting with the idea of system transparency.

III.3.3 Basic types of models

The transparency of a model describes to what degree the inner workings of a system are modelled. For minimum transparency, a phenomenological model can be used. This is a qualitative model constructed top down from observations and experiences, and there are many such models, particularly for complex phenomena.
We can take the stock market as a first example. In Sweden stock prices fall in the late spring. This is a phenomenological model that is verified every year. Medicine is another area where phenomenological models are prevalent: some cures work, and are used, but no one knows exactly why they work. Many models from applied psychology are also phenomenological, models of human reaction times for instance. If we want to describe a messaging system at the phenomenological level it can be specified as a set of messages, e.g. "Computer A sends a message to central control C saying that it is alive".

The good thing about the phenomenological model is that it models available data very well, and yet can be quite simple, perhaps just a graph showing the input-output relationship. The bad thing is that new data might completely disrupt the model, since it is built neither on a solid theory nor on deep knowledge about the system.

When we increase the transparency, more information about the inner workings of a system is added. Either we focus on the structure or on the behaviour of the system. A model of the structure of the system describes how it is organised and composed of sub-components. In a communication system we can for instance identify a transmitter and a receiver, connected by a channel, as components.

Four wheels, engine, accelerator, and a steering wheel.

If we instead focus on the behaviour of the system we are more interested in how the system's states develop. In a communication system model the sender starts in a state where he wants to send something, he generates a message, transmits it over the channel, and ends up in a state waiting for an answer.

A model expresses semantic properties of a modeled world W by syntactic properties of a representation R. P. Wegner, Brown University (What more is there to say?)
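The behavioural view of the sender in the communication system can be sketched as a small state machine. This is only an illustration in Python: the state names, the event names "generate" and "transmit", and the helper functions are assumptions introduced here, while the text itself only names the sequence of wanting to send, generating a message, transmitting it, and waiting for an answer.

```python
from enum import Enum, auto


class SenderState(Enum):
    """Hypothetical states for the sender described in the text."""
    WANTS_TO_SEND = auto()
    MESSAGE_READY = auto()
    WAITING_FOR_ANSWER = auto()


# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    (SenderState.WANTS_TO_SEND, "generate"): SenderState.MESSAGE_READY,
    (SenderState.MESSAGE_READY, "transmit"): SenderState.WAITING_FOR_ANSWER,
}


def step(state, event):
    """Return the next state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} not allowed in state {state.name}"
        ) from None


def run(events, start=SenderState.WANTS_TO_SEND):
    """Apply a sequence of events and return the list of visited states."""
    states = [start]
    for event in events:
        states.append(step(states[-1], event))
    return states


if __name__ == "__main__":
    for visited in run(["generate", "transmit"]):
        print(visited.name)
```

A structural model of the same system would instead list the components (transmitter, channel, receiver); the table of transitions above says nothing about them, which is the point of keeping the two views apart.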
Release the clutch pedal slowly; when you hear or feel the engine begin to slow down, slowly press down on the gas pedal as you continue to release the clutch. The car will start to move forward.

One alternative is to model the data flows through the system. This can be exemplified by following a message as it passes different components, e.g. communicating computers. The sender generates the message, which is then transmitted and received by the receiver, where it is assimilated and interpreted. A corresponding example from another world is when a letter is put in the mailbox, collected by the mailman, and delivered by the postal service. A data flow model is also a functional model, since if we follow the data stream we can see how data is transformed at each functional node it passes. At each node input is mapped to output.

To conclude, all of these models (structure, behaviour, data flow, …) describe the system, but they do so using different concepts and levels of detail.

If a precise, clearly defined model is necessary, a formal model is used. It has a well-defined textual or graphical representation, and rules that, when applied, result in a predictable, bounded behaviour. A message system described at this level can still include a set of possible messages, but now the messages are described using a language with predefined symbols, e.g. C <- Alive ("A sends the message Alive to C"). Formal models can be designed to describe the structure, behaviour, or any other view of the system. The precise nature of the formal model has the added value that designing it enforces clear thinking about the problem.

Less precise models are also needed, since formalism restricts what you can express with a model. One example of a less precise modelling language is the spoken language. It is very expressive, but it is also very difficult to parse and to extract meaning from an utterance. Many systems are difficult to formally describe.
They are usually very complex, such as a theatre group or a user work situation, and are better described in writing, or by using informal graphics. A manuscript for a film is one example; it describes sequences of events to be interpreted by the actors.

Another way to classify models is as descriptive or predictive. Models used for engineering are mostly predictive and built on mathematics. They can be used to evaluate performance without actually building the real thing. Descriptive models, as the name suggests, describe. A metaphor is one example, describing by analogy, where the mapping is done not by formal rules, but by associations, cultural, personal, or other. One example of a useful metaphor is the window as a user interface. The user looks through a window into the realm of the computer application, much in the same way that we look through an ordinary window. Another useful metaphor, which we will use later when we describe different ways to organise data, is the tree with its roots, trunk, branches, and leaves. In fact, metaphors are much more common than we realise. One example is that "up" is used as a metaphor for "more", and that the future is "ahead" of us. As we identify patterns using associations and metaphors we reduce complexity and save effort, adding another useful behaviour to those discussed in Chapter III.2.2.

How do you do?

"Before you do this, close all windows"

Cleaning Windows: A few simple Steps to a Clearer Outlook. Anti Microsoft campaign slogan? Metaphor? Window cleaning ad?

Descriptive models: topographic map, orthophoto map, satellite image map, ecological map, geologic map…

What metaphors would you apply to describe how to get where you want in a complex topology of web pages?

From the above we can learn that modelling involves trade-offs. In addition to the choices indicated above we have the following:

Analytical - Learned.
An analytical model formulated as a program or a mathematical formula is explicit, but rigid. Learning a model is nature's alternative solution, but what is learnt is not always possible to explain, verify, or delete. How do you for instance estimate the time it takes you to go from bed to your chair at the office in the morning? Do you use an analytical model, a learned estimate, or a combination?

Pure model - Data only. A pure model is computationally demanding. The alternative, using data only, is exact, but might need a lot of memory. Consider for instance the size of a database of photographs where all possible views of a face are stored, to be used for face animation.

Context screened - Context driven. Allowing context to affect the model makes it believable and reality based, but at the cost of increased complexity. The ultimate model of reality is reality itself, and such a model will certainly be complex.

III.3.3.1 Discrete, continuous

Many system variables are best approximated and represented by continuous values, i.e. a real value in an interval. Reality can be thought of as a gigantic hierarchical composition of small, smaller, and extremely small building blocks. One effect of the depth of this composition is that using discrete attributes becomes impractical. You could probably calculate your weight by summing atoms, but you will save yourself a lot of trouble by using a bathroom scale. This preference for viewing reality as continuous is a problem for the current transistor-based computers that use binary representations.

III.3.3.2 Deterministic or Stochastic

Our knowledge of the behaviour of a system is sometimes reduced to a statistical measure. We can estimate that the next car seen will be red with a probability of 5 per cent. But for all we know there might be an enormous line of red cars around the corner; maybe not very likely, but we can never know for sure. We have to accept surprises.
We for instance quickly characterise people we meet, but if we spend more time with them they will show new behaviours and talents and probably surprise us many times. Our lives would be incredibly dull without surprises, yet we fear losing control, and the randomness that eventually will kill us.

Hello, This is probably 438-9012, yes, the house of the famous statistician. I'm not at home, or do not want to answer the phone, most probably the latter. Leave your name and I'll probably phone you back. So far the probability of that is about 0.645. The Net

Systems such as a car break down predictably; we just don't know when. We only know that it will be at a very inconvenient time. Systems whose behaviour we cannot predict with 100 per cent probability are called stochastic systems. If they are fully predictable they are called deterministic. One example of a deterministic system is the planetary system. The sun will rise again tomorrow, at least with reasonable assumptions about the Universe.

Real world deterministic systems are always stochastic to some extent. You can always invent some event that will make a system stochastic (such as the sun exploding one morning). On the other hand you can argue that any stochastic system is deterministic. If, somehow, you know all the facts you can exactly calculate, and predict with 100 per cent probability, that the sun will explode, and so turn the stochastic system once again into a deterministic one. What are we missing here? Is there no distinction at all between stochastic and deterministic systems? The point is that we study a system for a specific purpose and with some predefined knowledge. Given that, either a deterministic view or a stochastic view is the most appropriate. If our purpose is to build a practical calendar for everyday use, there is no need to take an exploding sun into consideration.

Give an example of a system that is 100% deterministic.
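The difference between the deterministic and the stochastic view can be made concrete with a minimal Python sketch. It reuses the two examples from the text, the everyday calendar and the 5 per cent chance that the next car is red; the function names and the seeded random generator are assumptions made for this illustration only.

```python
import random
from datetime import date, timedelta


def next_day(today):
    """Deterministic view: the same input always gives the same output."""
    return today + timedelta(days=1)


def next_car_is_red(rng, p_red=0.05):
    """Stochastic view: only the probability of the outcome is known."""
    return rng.random() < p_red


def observed_red_fraction(n_cars, seed=0, p_red=0.05):
    """Estimate the share of red cars from n_cars simulated observations."""
    rng = random.Random(seed)
    reds = sum(next_car_is_red(rng, p_red) for _ in range(n_cars))
    return reds / n_cars


if __name__ == "__main__":
    print(next_day(date(2024, 2, 11)))
    print(observed_red_fraction(100_000))
```

The calendar function ignores the exploding sun on purpose: for the purpose at hand the deterministic view is the appropriate one. The car model can only be checked statistically, over many observations, and any single observation may still surprise us.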
Since modelling is about representing patterns, the system must at least be deterministic enough to show a pattern to model.

III.3.4 Representations

To sense another system means that the system must somehow present itself. A representation, expression, or coding is needed. We will often use the word representation throughout the book, and sometimes loosen the definition given here. The demand for a formal system in the definition will be ignored, and how the information is made explicit is sometimes obvious, and other times left out. One example is that we will refer to a human face as a representation.

Definition: A representation is a formal system for making explicit certain entities or types of information together with a specification of how the system does this. David Marr.

A representation makes explicit certain entities or types of information. David Marr, sloppy version.

For physical representations let us start by using nature's own building blocks. Beginning with the small and working upwards we have: atoms, molecules, pressure fluctuations, cells, neurons, mechanical components built from materials with different properties, electronic and electromagnetic devices, and human behaviour. Combinations of representations give new representations, in the same manner as data types are combined into new data types in a programming environment. Man-made structures of the above representations are used in different forms of technology. Even the volatility of reality can be put to use for communication. Pheromones deposited by ants evaporate over time. Gravitation is used for dropping bombs, which sends a message easily understood by the receiver. The environment in these cases can be seen as an active participant in the interactions.

The digital world adds virtual representations, for instance computer based internal representations of image, video, or text.
Being virtual, in the sense of the definition given here, means that an interactor can accomplish something without having a physical representation. A virtual representation needs to be transformed into a physical representation before a human can sense it. Virtual representations are usually man-made, i.e. created by technology. A database is one example, a web page another.

Definition: Something virtual is possessing a power of acting without the agency of matter. Internet

For each representation there are many possible methods to access it, depending on the application, and on the context where the application is used. Examples of access technologies are a microphone and a camera. Access methods used for information (I) use technology (T), and are by themselves technology. Access methods inherent to a human (H), such as hearing, are natural, i.e. do not need technology, even though an H is also man-made.

There are also many ways to generate each representation, or in other words to synthesise it, and the methods and tools used for this are themselves technology. Technologies for synthesis matching the microphone and the camera are the loudspeaker and the projector. Finally, for each pair of synthesis and access at least one representation is needed.

Representations exist at different abstraction levels and are transformed to suit particular uses. The figure below shows different representations of the logical connective AND. Leftmost the representation consists only of a number of ink dots that form the word "AND". The truth table describes the rules for the connective, and the symbol & is used in logic to represent the connective. The symbol to the far right is used to denote an AND-gate, which performs electronic operations on its input.

Give some examples of H-T-I, T-I-H, and I-H-T transformations.
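The representations of the connective AND discussed above can also be written down side by side in code. The following Python sketch is an illustration only and not part of the book's model: the names are hypothetical, and the "gate" is a toy that assumes 5 V logic with a 2.5 V threshold. It shows that the word, the truth table, the symbol &, and the gate are different representations that describe the same rule.

```python
from itertools import product

# Representation 1: the word itself, just a string of characters (ink dots).
AS_WORD = "AND"

# Representation 2: the truth table, listing the output for every input pair.
TRUTH_TABLE = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}


def and_by_table(a, b):
    """Look the result up in the truth table."""
    return TRUTH_TABLE[(a, b)]


def and_by_symbol(a, b):
    """Representation 3: the symbol &, here Python's bitwise operator."""
    return a & b


def and_gate(v1, v2, threshold=2.5, high=5.0, low=0.0):
    """Representation 4: a toy gate working on voltages (assumed 5 V logic)."""
    return high if v1 > threshold and v2 > threshold else low


def representations_agree():
    """Check that table, symbol, and gate give the same answer for all inputs."""
    for a, b in product((0, 1), repeat=2):
        gate_bit = 1 if and_gate(5.0 * a, 5.0 * b) > 2.5 else 0
        if not (and_by_table(a, b) == and_by_symbol(a, b) == gate_bit):
            return False
    return True


if __name__ == "__main__":
    print(AS_WORD, representations_agree())
```

Moving from the table to the voltages is one example of a transformation between abstraction levels: the rule stays the same while the representation changes.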
Give some examples of virtual-physical-virtual transformations. One example of a physical-virtual-physical transformation is the verification from the computer by a clicking sound when using the keyboard to input the letter f.

Figure III.3.5 Different abstraction levels and transformations of representations. Representations of an AND gate are shown from ink dots to the symbol used in electronic schematics. Truth table: AND means that the only time the output is "1" is when both inputs are "1".

Input 1  Input 2  Output
0        0        0
0        1        0
1        0        0
1        1        1

The choice of representation is important; a bad choice was for instance one reason why the Roman culture did not develop mathematics. Using the right representation can save a lot of computational resources. This goes for language too; it is no coincidence that most sciences have their own add-on language. It is efficient!

How do Romans multiply X and X? X2=100 all the time?

The representation of something is not the same as its internal organisation, content, or behaviour. As humans we are used to this separation; we know that someone can feel sad behind a broad smile. Most of the time we are interested in getting all relevant aspects of the message through as clearly as possible, but overall optimisation is not always possible. There can be conflicting demands, or we are simply not creative, or knowledgeable, enough to find the best representation. One example is the problem of finding a slogan for a large company. How can a one-liner represent all possible aspects of the company?

III.3.5 Language

An advanced form of representation is a language. By definition it is a system intended for communication. When used it must be associated with something meaningful, i.e. it is used to model something, and rules are needed to facilitate its interpretation. Representations (symbols) are organised according to the rules of the current context.
One example is to order characters into words and sentences for English readers. Another example is how binary digits form packets of information that can be sent over the Internet. A third example is body language, which can signal social information such as power hierarchies and individual desires.

Definition: A language is a system of communication using a representation, metaphor, and rules for language use.

III.4 System environment, context, it is all around us

What a system can learn from its environment is very important. The environment is also a system, so many of the properties discussed in Chapter III.1 apply. A system can for instance be immersed in a deterministic or non-deterministic, static or dynamic, discrete or continuous physical environment. But even if the environment has certain properties, this does not mean that the immersed system will perceive them as such. If we for instance send an intelligent thing that can sense only discrete events out into a continuous environment, the environment will be perceived as discrete.

Physical environments are maybe the most obvious environments, with variables such as temperature, humidity, and acceleration. They are inherently dynamic and continuous, for instance located in a mobile space, e.g. a bus. Furthermore it is impossible to have complete information about every aspect of them, which makes them non-deterministic and difficult to fully control, modify, and plan for. Inaccessible, non-deterministic, and continuous environments are sometimes referred to as open environments.

Physical environments are also important since they cannot be ignored. One example is how noise affects speech recognition. Other examples are that keypads are difficult to use in the dark, and that LCD displays do not work well at temperatures below zero. The physical environment is of course also important since it serves as a reference for the virtual environment.
Space, time, shape, motion and colour are all reused.

"What is real? How do you define real? If you're talking about your senses, what you feel, taste, smell, or see, then all you're talking about are electrical signals interpreted by your brain" Morpheus in "The Matrix"

Margin figure: a control labelled "Adjust outdoor temperature", ranging from -10 to +10.

Other kinds of environments are human based social and cultural environments, such as a nation, an organisation, a family, or a discipline, for instance mathematics. Humans group themselves in many dimensions, i.e. in many different societies, and in principle we could list an infinite number of possible social environments, each with a specific knowledge base, skills and behaviours. In practice, each application, or interaction, defines its own specific environment, and together with administrative considerations and cultural habits this means that most interactions take place in relatively well defined environments, e.g. in homes, hospitals, schools, or cars.

Some cultural environments are goal based, for instance a group of people travelling to a vacation resort, or focused employees at Ericsson. Others have less clear objectives, such as a family, or the citizens of Umeå. A cultural environment is, unlike the physical environment, not necessarily placed in time or space. The IEEE organisation for engineers has over 300,000 members all over the world. Social environments are open, and the complexity of the physical and human social environments cannot be overstated.

Software based environments are examples of a third type of environment, a technology defined environment. Developers need to carefully select the operating system, software libraries, and maybe also the hardware used. The virtual environment is a special breed of software-based environment. It is technology defined, with laws that you can tinker with, as opposed to the physical environment.
In a virtual environment even the notion of locality could be modified; all participants in an interaction might occupy exactly the same location. For virtual environments, adapting the environment to the user or agent is a real possibility. So, instead of adapting thousands of software agents to the current Internet environment it might be more efficient to prepare and structure the Internet itself for the agent and user invasion. So far this has not been done. We talk about cyberspace and the information highway, but structures, policies and social behaviour for those who travel on the highway are missing.

Cannot find Internet.sys. Universe halted.

Internet: world of entertainment, world of Star Trek, poetry, physics, golf, politics, parasites, Coca Cola.

We will discuss the environment, or context, many times in this book.

"They're made out of meat." "Meat?" "Meat. They're made out of meat." "Meat?" "There's no doubt about it. We picked up several from different parts of the planet, took them aboard our recon vessels, and probed them all the way through. They're completely meat." "That's impossible. What about the radio signals? The messages to the stars?" "They use the radio waves to talk, but the signals don't come from them. The signals come from machines." "So who made the machines? That's who we want to contact." "They made the machines. That's what I'm trying to tell you. Meat made the machines." "That's ridiculous. How can meat make a machine? You're asking me to believe in sentient meat." "I'm not asking you, I'm telling you. These creatures are the only sentient race in that sector and they're made out of meat." "Maybe they're like the orfolei. You know, a carbon-based intelligence that goes through a meat stage." "Nope. They're born meat and they die meat. We studied them for several of their life spans, which didn't take long. Do you have any idea what's the life span of meat?" "Spare me. Okay, maybe they're only part meat. You know, like the weddilei.
A meat head with an electron plasma brain inside." "Nope. We thought of that, since they do have meat heads, like the weddilei. But I told you, we probed them. They're meat all the way through." "No brain?" "Oh, there's a brain all right. It's just that the brain is made out of meat! That's what I've been trying to tell you." "So ... what does the thinking?" "You're not understanding, are you? You're refusing to deal with what I'm telling you. The brain does the thinking. The meat." "Thinking meat! You're asking me to believe in thinking meat!" "Yes, thinking meat! Conscious meat! Loving meat. Dreaming meat. The meat is the whole deal! Are you beginning to get the picture or do I have to start all over?" "Omigod. You're serious then. They're made out of meat." "Thank you. Finally. Yes. They are indeed made out of meat. And they've been trying to get in touch with us for almost a hundred of their years." "Omigod. So what does this meat have in mind?" Terry Bisson (shortened version)

Part IV: Interactors, we are not alone

Now the time has come to introduce the participants of the interaction. We will use the term interactor to stress interest in interaction. It is also a generic term, which helps us to abandon any prejudice linked to more specific terms such as person, or thing [BMD]. Other terms used for the interactor are agent, actor, participant, citizen, and sometimes also demon, monitor, interpreter or executive. A line of thought that we will not follow up in this book is that an interactor can be a part of another interactor, and itself a construct of interactors [BMD].

The participants chosen are Human, Thing and Information. Why these? Why three? The human is a natural choice: the species that created the environment for this book, close to the writer and to the intended readers, a typical user of technology. The thing as the second choice is more questionable. How about dolphins? Ant colonies [DH]?
The thing with the intelligent, designed, thing is that it is made by people and evolving very fast. Dolphins and ant colonies may be more intelligent than we think and have a lot to teach us, for instance on how to live our lives, but currently the communication bandwidth to them is quite limited. The tool has been with us since our savannah days, so interacting with its grandson is only natural. Also, human-thing interaction has been explored by the human-computer interaction (HCI) research community and by many other scientific disciplines, which means that there is a lot of knowledge around. As a last motivation, the thing is something real and commonplace. So we accept the thing as a participant.

Definition: A thing is a physical object that can be referred to as an.

Webster's dictionary: Thing: The real or concrete substance of an entity. (one out of 19 meanings)

Are there enough participants? Let us take the perspective of a thing. This particular thing is communicating at 155 Mbit per second with software somewhere else. This software, is it a thing? It is certainly not human. The software in this imagined case is a component based database system, distributed such that the software for controlling the data retrieval is executed on many computers connected over the Internet. The data and the software are broken up into zillions of pieces, each residing on its own computer, out there somewhere. It seems that we have a participant here that is not human and not a thing (localised, real, something you can touch). This participant we call information, and it is not necessarily confined in space or time. It is virtual, hard to touch, difficult to lay your hands on.

So, do we accept information as a participant? Information does seem a bit dull as a partner in an interaction. Where is the spirit?
People have intelligence, we have things all around us for physical support, but maybe information could match our intelligence, and our creativity? One way to define intelligence is as the ability to surprise a human and come up with new ideas, like the zipper, or the desktop metaphor in human-computer interaction. What then is an idea? It is not a human and it is not a thing, even though both of these participants interact with ideas. A good idea is to include the idea in the information interactor, where it opens up new possibilities. It interacts with other ideas to generate new ideas. If we add idea to the information concept, information does not seem so dull any more.

Now, consider the World Wide Web. The initial idea was to associate information using active character strings, i.e. addresses as links. By clicking on a link we can display new information, including new links. This idea gave us the first web browser, and presently software agents are being researched that spend their time out on the web collecting information. So from the initial idea of the active string many new ideas have emanated. Who could have foreseen the software agent when the idea of a link was born? The agents are also information, interactors that we might refer to as active information. They can move, communicate among themselves, and even replicate. Such objects could monitor context, keep track of the interactors, and manage information transfer between them.

With more than three types of interactors the number of permutations of pairs of participants in interactions will grow. So, also for practical reasons, three is a good number. We will have problems classifying many topics as belonging to either the world of things, information, or humans. A human can be categorised as a thing, as well as an information processing entity, and knowledge is relevant to all interactors. The approach taken in this book is stepwise procrastination with a human centred perspective.
We start out by introducing the human, providing it with as many relevant characteristics as possible. Next, we discuss information. Once again we include as many aspects as possible. Much of what has been discussed up to that point applies to the thing as well, but we will not repeat any material, only add. There are still quite a lot of unique properties left to discuss for the thing. The accompanying illustration shows how we focus on H in our model, embedded in information. Some of this information is directly mediated from the physical reality (T), but an increasing amount of it is managed by computers, and gradually this information will be a world of its own, indicated by the arrows expanding the information area in the illustration.

Now we will introduce a generic interactor and start by describing some of its characteristics. We continue by studying its processing in context, i.e. how interactors process input and generate output. Examples are as far as possible given using the human as an interactor.

IV.1 We have an interface, a structure, and processing capability

To be really interesting an interactor should be autonomous, be aware of its environment, and have a rational behaviour. With identity, intelligence, and a social life it can be someone like you. An actor's behaviour can be conveniently represented by the stimulus-response diagram (SR-diagram), where the behaviour is represented by its response to stimuli, see figure below [RA].

Figure IV.1.1 Stimulus-response diagram (stimulus, behaviour, response).

Elaborated somewhat more, an interactor is a physical or virtual entity with some, or all, of the following skills [JF]:

• It is capable of sustained acting in an environment.
• It can communicate directly with other interactors.
• It is driven by a goal, a set of tendencies (in the form of individual objectives, commitments, or of a satisfaction/survival function which it tries to optimise).
• It possesses resources of its own.
• It is capable of perceiving its environment to a limited extent.
• It has a partial mental model of this environment and of its own history.
• It possesses skills, offers services and is prepared to handle a possible failure.
• It might be able to reproduce itself.
• It has a behaviour that tends toward satisfying its objectives, taking account of the resources and skills available, and according to the information it receives, i.e. a rational behaviour.

Definition of an ideal rational interactor: "For each percept sequence, do whatever action to maximise performance, on whatever knowledge available." [JF, adapted]

Figure IV.1.2 The interactor shown with its skills (goal, mental model, communications, resources, actions, and perceptions, within an environment).

One important difference between the human and the other interactors is that humans cannot be changed, whereas information and things can be designed and adapted to suit the situation and the application at hand.

IV.1.1 Representation

The representations of two interactors can be quite different. Face, interface, and surface are good descriptions of the representations for the interactors we have chosen, and will be discussed in the following chapters. Many interactors have a unique name or number, but representations also include internal structures and architectures. Behaviours or actions, sounds, and organisation can represent a system; some examples are gestures, and a line of people waiting at a bus stop.

What it is: Object. What it does: Computation.

Affordance and accountability are two important concepts related to representation. Affordance refers to how appearance suggests function and gives opportunity for action. Accountability is how interactors tailor their representations such that they can be understood, also in the course of action, and is not always a static property of the system.
IV.1.2 Perception and cognition

The interactor detects, interprets, processes, and effectuates. In other words it senses, perceives, and has cognitive and maybe even social abilities. Perception and cognition drive effectuators that generate outputs. Cognition is additional functionality in the interactor for selecting information (attention), manipulating it (thinking, processing), and storing it (learning, knowledge representation, memory); see figure IV.1.3, which shows the cognitive architecture. Sometimes perception is considered a part of cognition, but we will treat it as a separate subsystem.

The word cognition stems from the Latin cognitio, meaning examination, learning, and knowledge, and cognitive science is the science that studies such systems. It is a multidisciplinary research field to which psychology, computer science, philosophy, linguistics, and biology all contribute. We will use the cognitive architecture to structure the discussion.

"Discrimination, identification, manipulation, describing and responding to descriptions of objects, events and states of affairs in the world." Five tasks a cognitive theory has to explain. Harnad

Why the focus on human cognition, you might ask yourself? There are two reasons. The first is constructive: if we better understand human cognition we might be able to build better robots or other tools. The second is simply curiosity: how do we work? By building systems that mimic humans we might learn a thing or two about ourselves.

"A human being is the measure of all things, of things that are, that they are, and of things that are not that they are not." Protagoras 480-411 BC

"I cannot survive without brainwork. What else is there to live for?" Sherlock Holmes

Figure IV.1.3 The cognitive architecture for human information processing.
Sensor: eyes, ears, skin, nose, mouth, …
Sensing modality: vision, hearing, taste, smell, touch, …
Central system: thinking, attention, memory, learning, reasoning, planning, …
Action modality: gaze, voice, facial expression, hand/body movement, …
Effectuator: eye, face, hand, body, mouth, …
Also labelled in the figure: language.

From figure IV.1.3 we can see how sensory inputs, in different sensing modalities, flood the sensory system. Following paths through millions of neurons the input is passed on, and massaged, until it reaches the central nervous system. Here sensations are combined into higher-level abstractions and processed by perception and cognition.

Perception, from the Latin perceptio, means to receive or apprehend. It is the process of merging input into a usable mental representation of the world. This means organising, ignoring, and interpreting sensations, all very important tasks for complex beings. It allows animals to function in the real world, find their prey, and separate predators from nice-looking individuals of the opposite sex. The number of perceptions and their content limit the complexity of the behaviour.

Definition: Perception is the act of apprehending material objects or qualities through the senses.

A thermostat is simple and can perceive only two aspects of reality: too cold or too warm relative to a reference temperature. For a complex interactor like you to put yourself in another interactor's place, you have to mentally emulate the perceptions of the other interactor. Try to put yourself in the place of a thermostat; how would you perceive reality? What is life like to a thermostat?

From a human viewpoint, perceptions are marvellously adapted by evolution to our environment, and they are very resource efficient. We perceive a tiger through sensations: a smell and a terrible sound. This makes us focus our attention on the tiger, and we recall that a tiger is rather dangerous. At this stage not even our mothers would recognise our facial expression, and the blood freezes to an extent inversely proportional to the speed with which our legs propel us.
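The thermostat mentioned above can be written down as a very small interactor, which makes its poverty of perception concrete. The following is only a sketch: the class name, the Celsius unit, and the heater used as effectuator are assumptions made for illustration and are not taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """A toy interactor whose whole world consists of two percepts."""
    reference: float  # reference temperature, assumed to be in degrees Celsius

    def perceive(self, temperature: float) -> str:
        # The continuous sensation is reduced to one of only two percepts.
        return "too cold" if temperature < self.reference else "too warm"

    def act(self, percept: str) -> str:
        # Hypothetical effectuator: a heater that is switched on or off.
        return "heater on" if percept == "too cold" else "heater off"


thermostat = Thermostat(reference=20.0)
for reading in (-5.0, 19.9, 20.0, 35.0):
    percept = thermostat.perceive(reading)
    print(reading, percept, thermostat.act(percept))
```

Whatever the reading, the thermostat never knows more than which side of the reference it is on, so its behaviour cannot be more complex than a choice between two actions.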
An interactor of type H or T senses its environment and converts external physical variables to internal representations. To an interactor of type I, input and output degenerate to I-I interaction. Typical characteristics sensed are the physical properties, and their resolution in sensor space and time.

There is an interesting trade-off here between sensing and knowledge that is nicely illustrated in figure IV.1.4, adapted from [RA]. In a structured world knowledge is easy to reuse and sensing is not needed to guide actions. In a dynamic world, such as a motorway, sensing is necessary.

Figure IV.1.4 Trade-off between sensing and using knowledge. (Labels: dynamic and uncertain worlds; structured worlds; difficulty of sensing; utility of world knowledge; value of sensing; possible to predict; one-room indoor navigation; outdoor navigation.)

The price of living by knowledge is additional demands on memory, but sensing also comes with a price. Continuously inspecting the environment is expensive. The world that people inhabit is definitely dynamic and uncertain, which means that sensing is of great importance to us.

We are all disabled in certain situations, e.g. in the dark.

Apple of my eye

"Perceiving is separating form from matter." Aristotle

A fundamental and important fact is that perception is a guess! It is as close as we get to reality, but perceptions are often wrong because they are guesses about objects and situations, made on insufficient evidence. A moving shadow could be an eagle, and the bird quickly hides, but it could be almost anything else that moves.

Output, or action, is realised by effectuators, such as a display, a muscular arm, or some mechanical device. Like sensors, effectuators can also be characterised by resolution.

"Pen bad. Keyboard good. Thought transference really good." John Dobbin's matrix of input devices

IV.1.3 Processing summarised

Processing is the internal workings of the interactor, in contrast to perception, which interprets external sensations, and to the actions that are the result of processing. Processing needs an execution unit that is fed with input data or information, i.e. memory or perception is also necessary. The execution unit can be characterised by its architecture and capacity. Our brain is one example of an execution unit, and one with a very special architecture and a limited capacity. For the input channel the most important attribute is the amount of data it can access per unit of time. A processing unit needs a description of how to use the information, and programming, planning, reasoning, and learning are ways to generate this description.

(Figure labels: processing unit, input/output, memory.)

Many interactors reason about cause and effect: they think in order to achieve goals, form a plan, or overcome some difficulty. A prerequisite for reasoning is a context, e.g. a subset of the complete mental state. A plan is a sequence of actions positioned in time to fulfil one or more objectives. The action sequence is constrained by the resources assigned, such as time, money, and states. Planning might also include an optimisation problem if the best plan is not easily found. Reasoning is once again useful when the individual steps of the plan are selected and executed, and slight changes in the environment might necessitate adjustments to the plan.

Reasoning can be practical or theoretical. Evaluating the pros and cons of writing a book is an example of practical reasoning, which is directed towards actions. In theoretical reasoning we instead reason about beliefs. If you believe that Sweden is the best football team in the world, and they lose to Norway, you have some theoretical reasoning to do.
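The idea of a plan as a sequence of actions, found under a resource constraint, can be illustrated with a small search sketch. It is one possible reading, not a method prescribed by the text: breadth-first search is chosen here because it returns a shortest plan, and the function name `plan`, the parameter `max_steps` (standing in for a resource budget), and the toy world are all assumptions made for this example.

```python
from collections import deque


def plan(start, goal, actions, max_steps):
    """Breadth-first search for a shortest action sequence from start to goal.

    `actions` maps an action name to a function state -> next state.
    `max_steps` is a simple resource bound on the length of the plan.
    Returns a list of action names, or None if no plan fits the budget.
    """
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if state == goal:
            return steps
        if len(steps) == max_steps:
            continue  # the resource budget is used up on this path
        for name, effect in actions.items():
            next_state = effect(state)
            if next_state not in visited:
                visited.add(next_state)
                frontier.append((next_state, steps + [name]))
    return None


# Hypothetical toy world: the state is a number and the goal is to reach 10 from 1.
toy_actions = {"add one": lambda s: s + 1, "double": lambda s: s * 2}
print(plan(1, 10, toy_actions, max_steps=6))
print(plan(1, 10, toy_actions, max_steps=2))
```

With a generous budget the search finds a four-step plan; with a budget of two steps it reports that no plan exists, which is the point where reasoning about a revised goal or more resources would have to take over.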
All of this seems rather useless without a goal such as a pat on the back, a smile, or reproduction, all very much social rewards. An objective, or goal, is traditionally viewed as a mental construct, but with new technology it is better thought of as an information structure. Goals are often selected by reasoning, and together with their relationships they make up the overall objective. Sometimes a goal can be reduced to subgoals, and often goals can be described as a desirable change of system state. Finally, both plans and goals have to be represented somehow, whether they are the goals of a human or of a thing.

Definition: Planning is the deliberate process of generating and analysing alternative paths through a system state space, before they are followed.

Definition: A goal is the state of affairs that (when achieved) terminates behaviour intended to achieve it. It is the end that justifies the means.

At this point it is time to introduce knowledge, the input to processing and the rules for it. It is a vital component of any system that is supposed to exhibit intelligent behaviour, and yet it is difficult to define. Without our collective knowledge, and especially without the predictive power of this knowledge, human societies would not work. Knowledge is equally important to individuals: for instance, knowledge of how to dress in cold weather, of the best wine to drink with an elk steak for Saturday dinner, and of how to behave when you run a temperature.

Knowledge has to be generated, and this is where learning comes in. Interactors operating in open, dynamic environments must adapt, and for this they need to learn, which is also important for improving reasoning and planning. The somewhat abstract definition given below describes learning. Somehow the interactor gains knowledge and acquires skills, which increases performance.
In a social environment where individuals learn in parallel and communicate the results to each other, we have a powerful tool for progress. Chapters IV.8 to IV.11 will elaborate on reasoning, planning, learning, and several other cognitive aspects for different interactors. Figure IV.3.5 models some of the aspects.

Figure IV.3.5 One (out of many) possible state descriptions of the relations between planning and learning. (Labels: New goal, Plan, Plan found, Action, Idle, Finished, Learn, Goal achieved, Fail, Reason, Revised plan found.)

In the following chapters we will discuss representations and characteristics of interactors, and their modelling and implementation. We will do that using the human interactor as a reference as often as possible. So much to say, so little time, and only a few pages. We would like to stress the monstrosity of the subject ahead. Almost every statement in the following is worthy of its own book, and it is for instance impossible for any human to fully comprehend all of the behaviours that together add up to what we call a human being. The enormous complexity of the human brain and of human society surprisingly enough stabilises the system, a good thing for survival. As a by-product the complexity adds inertia to the system, which is perhaps not so good for survival.

Definition: Knowledge is information in context, organised so that it can be readily applied to solving problems, perception and learning.

Definition: Adaptation is an act of changing to fit different conditions.

Definition: Learning is to become able to respond to task-demand or an environmental pressure in a different way as a result of earlier response to the same task (practice) or as a result of other intervening relevant experience...

It seems that we do not understand humans. Much is magic, like creativity. Is it possible to design and build something that cannot be understood? Are computers better at this? On the other hand, babies are made without understanding how we work ourselves!
IV.2 Human representations

What can you read from a human? There are actually many different output channels available: speech of course, but in general our entire behaviour in any given situation. As technology evolves it will open up new channels, both external and internal to the human body. External information could be extracted from measurement devices worn close to the body, such as a pulse meter that you always carry around with you. Complementary internal information can be extracted from the neural network, or from other body systems at cell, or even molecular, level.

The main physical features of a human, from the perspective of representation, are eyes, mouth, fingers, hands, and body posture, all of which are specified and built by nature. The face is roughly symmetrical, probably because gravitation does not care for right or left, and the eyes are directed forward, indicating that the human is a hunter rather than hunted.

Sensuality, integrity, power, intelligence, attractiveness [CC]

Human noise and voice are other significant representations, and each human has a unique smell. How all of these representations affect other humans is not fully understood, but researchers find new clues all the time. We now know, for instance, that we adapt more quickly to bad smells than to good smells. On the other hand, we are more sensitive to bad smells, presumably for evolutionary reasons.

applause, bite, boo, breath, burp, cheer, chew, chomp, cough, crowd, cry, drink, eat, fart, footsteps, gargle, gasp, giggle, groan, grunt, gulp, heartbeat, hiccup, kiss, laugh, scream, sigh, slurp, sneeze, sniff, snore, write, yawn, yell

The representations of humans are dynamic. Gestures and facial expressions can mean different things depending on their timing, but dynamics is even more important for the human voice.
The human being is an entity studied by other human beings; it is in other words a social being, and neither good nor bad. A human is equipped with a consciousness somehow integrated with the body, can have beliefs about things (intentionality), and can discuss what it feels like to taste chocolate (qualia). One remarkable feature is the freedom of choice or, in a slightly more negative formulation, the necessity to choose how to act at every moment. No human should be treated, or even regarded, as a tool, but as an individual with a unique personality.

The human body is an engine consuming energy; it is 100,000,000 times as heavy as a drop of rain, and 10,000,000 times as long as a virus. Basic building blocks are atoms of carbon, nitrogen, and hydrogen. At another level she is a system of interacting organs (brain, lungs, heart, guts) that is sustained by food. On yet another level of abstraction she can be described by behaviour, thinking, memory, perceptions, consciousness, feelings, and emotions. As a race humans weigh in total about as much as 50 pyramids.

An amazingly complex blueprint for a cell structure is continuously developed through interactions in a social and physical environment. The disposition of a person as a whole is not easily attributed to any single cause. But, as with looks, at least some features are inherited from parents.

Who, or what, is looking out my eye?

The average human body contains enough: iron to make a 3 inch nail, sulfur to kill all fleas on an average dog, carbon to make 900 pencils, potassium to fire a toy cannon, fat to make 7 bars of soap, phosphorous to make 2,200 match heads, and water to fill a ten-gallon tank. Internet trivia

"A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying. Take orders, give orders, co-operate, act alone, solve equations, analyse a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialisation is for insects." Robert Heinlein

IV.3 How to recognise Information?

Information, featuring the idea, is the second participant in the HIT troika and the most information-packed one, characterised by repetition, change, pattern, and surprise. How basic it is to life is best exemplified by the genetic code. This code is copied, mutated, and interpreted, and gives rise to all living creatures. Information does not die from old age, but it can be deleted. It can also be modified, filtered, and replaced. Another description is information as a correlation between two things or events produced by a lawful process, exemplified by the fact that information about the temperature is found by looking at a thermometer [SP].

The word information itself originates from the Latin word informo, which means to educate, or to shape, something, but nowadays the popular everyday usage of the term refers to facts and opinions obtained through life. Without it we do not know anything. Not all information is knowledge, however, i.e. deeper insight, and certainly most information will not qualify as wisdom, where morals and ethics are important. Information is a necessary condition, but not sufficient in itself, for knowledge and wisdom.

A word often (mis)used synonymously with information is data, the plural of datum. Strictly speaking data is shaped information, i.e. information that is represented and coded, often digitally, in documents or databases. The concept of data assumes that there is a reality outside of the human mind or the machine, which cannot be directly captured, but can be indirectly sensed, measured, and represented as data. An interesting analogy is between data and information on the one hand, and sensation and perception on the other.
Definition: Data is any symbol, sign or measure in a form that can be directly captured by a person or a machine.

Definition: Information is data that has a value depending on context.

In this chapter we will sometimes view data as a signal, which is broadly defined as an interruption in a field of constant energy transfer, possible to code or refer to using spoken or written language. Defined this way a signal is the most basic unit of communication.

The value of information depends on by whom, when, and where it is used. Information at the right time can be valuable, as history shows, and if the thermometer shows zero degrees Celsius a bushman would quickly hide from the cold, while an Eskimo does not complain at all (maybe it is too hot?). In the wrong context, on the other hand, information is just data. The value of information is also affected by its reliability and completeness. If the price is X98.99, this is not very valuable information, since we would really like to see that missing digit; 298.9X on the other hand is all right.

Information as data, with a value, in context, is not enough to make information an interesting participant in interactions. But if we add interpretation of information, as in the genetic code, information becomes more interesting. If we add to this some capability to act on the information, we have active information, possibly a rational agent. This is a technological counterpart to a human, exemplified by software-based mobile agents roaming through the Internet, transmitting themselves, their program, and their states between computers. A really annoying example is the computer virus. Some of them are now called worms; beware when they reach higher levels of consciousness.

IV.3.1 Shannon's information theory

Published as "A Mathematical Theory of Communication", information theory was formulated by Claude Shannon as late as 1948.
Quite late for mathematics used today in practical applications. Actually Mr Shannon did not intend to publish his findings, but he was urged to do it by his fellow employees! He retired at the age of 50, a wise man indeed.

The theory defines the information carried by an event in terms of its probability. A less probable event carries more information and is more difficult to transmit. The most used unit of information is the bit, which has two states, normally denoted 0 and 1. This is, strangely enough, also the name and notation used for data represented by a computer. One example of an information system is flipping a coin. Head or tail are the only information units, and the information content of the system is low (1 bit) since head and tail have equal, and not too low, probability. We can choose to represent head with 0 and tail with 1, which makes it possible to describe any event with one bit, i.e. as a 0 or a 1. A large number of improbable messages could represent an enormous amount of information. One example is a ballet.

Armed with Shannon's model we can predict the amount of information possible to send over a channel with a limited capacity, at least for simple channel models. We can also predict the achievable compression ratio of a video. The bounds provided by information theory are fundamental; they are like the speed of light, nothing that can be stretched or circumvented.

Order the following messages according to their information content: I see a red car. I see a yellow car. I see a T-Ford. I see a horse.

Using only probability as a basis for information is, however, not always a good choice. It does not take into account the semantics or the aesthetics of what is said, just the probability that it will be said. If you call the fire station and tell them that it is your birthday, it will be a highly unlikely message with lots of information. Despite this the firemen will not be very interested; they are only interested if the candles on the cake have set fire to your house.
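The link between probability and information described above can be made concrete with a few lines of code. The sketch uses the standard formulas from information theory, which the text does not spell out: an event with probability p carries -log2(p) bits, and the average over all outcomes is the entropy of the source. The function names are chosen for this example only.

```python
import math


def self_information(p: float) -> float:
    """Information content, in bits, of an event with probability p."""
    return -math.log2(p)


def entropy(probabilities) -> float:
    """Average information, in bits, of a source with the given outcome probabilities."""
    return sum(p * self_information(p) for p in probabilities if p > 0)


print(self_information(0.5))       # one outcome of a fair coin
print(entropy([0.5, 0.5]))         # the fair coin as a whole
print(entropy([0.9, 0.1]))         # a biased coin is more predictable
print(self_information(1 / 1024))  # a rare message carries more bits
```

The fair coin gives exactly one bit, a biased coin gives less than one bit on average, and a message with probability 1/1024 carries ten bits, in line with the statement that less probable events carry more information.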
Some other problems with Shannon's model are that many expressions are ambiguous, meaning different things in different situations, and that we use non-literal sentences such as "Nice weather, huh" when it snows for the fifth day in a row. The last problem of the model that we will discuss here is that it fails to take higher levels of noise into account. Trying to persuade a wolf that rabbits are cuddly is one example where bias, prejudice, and cultural effects distort the message. Another example is using complex terms, as in technical jargon.

IV.3.2 Representations of information, see the soul of I

Talking about representations of information is actually a tautology, since information is representation, and a representation is information. Anyway, almost all information is analogue, i.e. continuous in amplitude and time. But analogue signals are not very suitable for a computer, so we have to represent the signals as sequences of binary numbers (1 or 0). In computer storage a 1 or a 0 is referred to as a bit. Bits are grouped in groups of eight, each group called a byte. Why eight?

(Margin figures: analogue signal; discrete signal.)

Figure IV.3.2 1011 (binary) = 11 (decimal; in decimal number representation $10^1 + 10^0 = 11$).

Digital numbers can easily be represented in the computer as a string of bits, each interpreted as a power of two. The decimal number 11 is binary 1011, which could be written as $1\cdot 2^3 + 0\cdot 2^2 + 1\cdot 2^1 + 1\cdot 2^0$, as in figure IV.3.2 above. Representing floating-point numbers is more awkward: 0.5 is all right ($2^{-1}$), but how do you represent 0.3? Text in the computer is coded, i.e. represented, as a sequence of ASCII codes, each a sequence of bits. One example is the string "HG", with the binary ASCII representation (1001000, 1000111) and the decimal ditto (72, 71).

If we measure our temperature and find it to be zero degrees we know we are in trouble.
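The binary, floating-point, and ASCII representations described above can be checked with a short sketch. It uses only built-in Python functions; the variable names are chosen for this example, and the seven-bit width follows the codes quoted in the text.

```python
# A binary string interpreted as powers of two: 1011 -> 8 + 0 + 2 + 1.
bits = "1011"
value = sum(int(b) * 2 ** i for i, b in enumerate(reversed(bits)))
print(value, int(bits, 2), format(11, "b"))

# 0.5 is an exact power of two; 0.3 is not and can only be approximated,
# which shows up in the stored bit patterns and in simple arithmetic.
print((0.5).hex(), (0.3).hex())
print(0.1 + 0.2 == 0.3)

# Seven-bit ASCII codes for the string "HG".
text = "HG"
decimal_codes = [ord(c) for c in text]
binary_codes = [format(ord(c), "07b") for c in text]
print(decimal_codes, binary_codes)
```

The awkwardness of 0.3 is visible in the hexadecimal form of the stored number, which ends in a repeating pattern that has to be cut off somewhere.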
We have several choices: either we represent the temperature by a single bit (0), or by a fixed-length binary number, or as the ASCII character "0" coded in bits. Computer-based representation of information is already a bit complicated.

Moving upwards in the representation hierarchy we come to image, video, and sound, which also have to be stored as digitised samples in the computer. Analogue representations such as sound cassette tapes and analogue records are becoming less interesting as digital quality and storage capacity improve. For video and photography analogue formats are still important, but at least for photography the digital technology will soon be both cheaper and better. At an even higher level we can represent a story, a law, a constitution, or a weather forecast. What is the next higher level? What is the highest level?

Not all information has a physical representation. A yellow coffee cup, a book, and a television set have representations, but what about Master of Science, subtraction, or reincarnation? You understood what we meant by Master of Science without us showing you a typical specimen. We only needed three printed words (17 characters) to represent this abstract idea.

The duality of information as both representing something and at the same time being a physical entity suggests that our own complex mental processes can also be represented physically. A house is represented by the word "house", by voltage potentials in memory circuits, or by ink dots on paper. How do we humans represent it internally?

The temporal dimension is a special case. Time can be represented as a time line where events are positioned, and from such a representation the temporal relationships between events can be found. We can easily see if an event is before, after, or overlaps another event or time period. Like any other value, time can be represented as binary, digital, or as ASCII.
Time can be pinpointed quite exactly, as in 24.00 31/12 1999, or less precisely, as in a couple of hours, eons, or a generation. Some time values are cyclic: 7.00 AM happens every day, not only on the day the alarm clock celebrates it. The year 2000, on the other hand, will never occur again, unless we change the calendar.

"The wheel of time turns and ages come and pass, leaving memories that become legend. Legend fades to myth and even myth is long forgotten when the age that gave it birth comes again." Wheel of time, Robert Jordan

There are only a few basic ways that we can arrange data [RSW]. The most important ones are by number and by time, as described above, but we can also use magnitude, alphabet (from A to Z), category (similarity), location, continuum, or randomness, i.e. lack of organisation. Each of these alternatives can be represented in many different ways. Magnitude, for instance, can be represented by a measuring stick, distance, and viewing angle, or by words, as in "bigger than".

IV.3.3 Painting, Image and Video

The creation of real-world artefacts and paintings as representations of our inner mental models must have been a great leap for humanity. It was the first step towards a written language, taken at least 40,000 years ago. Image synthesis has evolved quite a lot since the cave graffiti, and our external representations are now central to society. The engineering community depends on graphical tools for synthesis, and so do designers and artists. We are however still very limited by our tools, and at the same time totally dependent on them. Is it for instance possible to create an oil painting using a computer? Is it even a good idea if it can be done?

Image analysis is the other side of the coin. From the scatterings and reflections of light around us we perceive the world. This light is, like all physics, very much an analogue phenomenon, impossible to represent exactly in the computer.
An oil painting is quite a different representation from the computerised image, see figure IV.3.3. The painting is analogue (real) and the brushstrokes are three-dimensional structures, modulated such that the painting looks different depending on lighting and viewing direction. How do we represent this in a computer?

Figure IV.3.3 Computer image representation. (The figure shows an image as a grid of numeric pixel values, for example 123 123 145 178 179 178 145 123.)

A major difference between computer-based images and real-world representations is that it is much easier to copy a computer-based image; we can even make an exact copy. A digitised representation of a painting can consequently be copied, but not the painting itself. This possibility causes quite a lot of copyright problems, especially when a digitised copy of lower quality, e.g. MP3, images, or movies on the Internet, satisfies the prospective customer.

If the number of humans that have lived so far is $10^{11}$ and each of them has seen 30 images per second, 16 hours a day, for 70 years, what is the total number of images that have been seen? [Owe this example to Professor Haibo Li at Umea University]

What is this? A red filled circle on a black background.
What is the perceived difference if it is done in oil or water colour, or if it is a digital representation?

A digital image is represented as an array of pixels in the computer. Each pixel is stored as a number of bits, usually 8, 16 or 24, see figure IV.3.4. The bit value of a pixel is mapped to an intensity or to a colour, and the resulting image is called a bit-mapped image.

Information is free if bailed out. /HG

Figure IV.3.4 Bit mapped image. (Labels: image, pixel, bit map table, colour table, colour value.)

A nuisance with images is the large amount of data needed to represent them. This is one of the problems attacked by the important and extensive standardisation of image coding. The two most used formats are GIF and JPEG, where GIF is mostly used for computer graphics and JPEG for compressed photographic images.

The problem is aggravated for video, where 25 frames are needed each second. The rate of 25 frames per second is fast enough for a human to perceive fluent motion of objects in the video rather than individual images. For a resolution of 640*480 pixels, a typical resolution for digital television and the standards MPEG-2 and MPEG-4, a calculation of the memory demand gives 640 * 480 = 307,200 bytes per frame, even if each pixel is represented by only one byte. A video with this resolution and 25 frames per second will need 7,680,000 bytes per second, or 61,440,000 bits per second. These impressive numbers present us with quite a problem!

Figure IV.3.5 Pixel correlation in a picture. (Axes: pixel correlation against pixel distance, from -5 to 5.)

Figure IV.3.5 shows the pixel correlation in an excerpt from an image of a bird. The point we want to make is that the correlation between two pixels drops off quickly with the distance between pixels. Only six or seven pixels away almost all information from the reference pixel is forgotten. This behaviour means that the information in an image is quite close to noise, which in turn affects the way we process the image.
Image compression standards such as MPEG-2 consequently manipulate the image in blocks of eight by eight pixels. There is on average nothing to gain by using larger blocks because of the low pixel correlation.

How many images can you represent with 500 Kbit? If each pixel in the image is 8 bits, how many pixels are there in the image?

A graphical object O of the Euclidean space $\mathbb{R}^n$ consists of a subset $U \subseteq \mathbb{R}^n$ and a function $f: U \to \mathbb{R}^p$ (U defines the shape of the graphical object and the function f defines the attribute space).

Colours in an image can be represented in many different ways, some more adapted to human physiology than others. One way is to use the RGB (Red, Green, Blue) colour space, which is chosen to match human colour perception. Another alternative is to specify a colour by its hue, saturation and luminance (HSV). In this representation the RGB cube is viewed along the greyscale axis. Luminance describes lightness, and saturation is the purity of a colour. A highly saturated hue has an intense colour, and with no saturation at all the hue becomes a shade of grey with the specified luminance.

How we see colours depends on what we are used to seeing and on the colours of neighbouring patches. Colour dots on a painting merge additively when viewed from a distance, so that blue and yellow dots will appear grey rather than green, which would be the result if we just mixed the pigments together into a new colour. CMYK (Cyan, Magenta, Yellow and blacK) is a colour space for printing that acknowledges this fact.

Colour and shades of grey can be represented on screens and paper in many different ways. Dithering is one example, where a pattern of screen pixels or printer dots simulates colour or grey levels; see the accompanying figure, where Mona Lisa is built up from Mona Lisas in three layers. Dithering works because human vision has a limited resolution and blends impressions to make sense of them.
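The relation between the RGB and HSV colour spaces described above can be tried out with Python's standard `colorsys` module. This is only a small sketch: the sample colours are chosen for illustration, and note that `colorsys` calls the third component value, where the text speaks of luminance.

```python
import colorsys

# colorsys works with red, green, and blue components in the range 0.0 to 1.0
# and returns hue, saturation, and value, also in the range 0.0 to 1.0.
samples = {
    "pure red": (1.0, 0.0, 0.0),
    "pale red": (1.0, 0.5, 0.5),
    "mid grey": (0.5, 0.5, 0.5),
}

for name, rgb in samples.items():
    hue, saturation, value = colorsys.rgb_to_hsv(*rgb)
    print(f"{name}: hue={hue:.2f} saturation={saturation:.2f} value={value:.2f}")
```

Pure red and pale red share the same hue and differ only in saturation, while the grey sample has no saturation at all, which matches the description of a hue turning into a shade of grey.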
About five hundred thousand people in the United States claim that they are artists [MC2]. If each of them makes one piece of art each year, this would amount to fifteen million artworks per generation. You have to do something pretty amazing to get noticed.

Figure: Mona Lisa by Adam Finkelstein.

IV.3.4 Text

Written text is not much data, but can still represent a lot of information. A typical screen filled with text consists of a couple of thousand bytes, and a typical novel is less than a megabyte of words. Dracula by Bram Stoker is approximately 900 Kbyte, and Hamlet about 200 Kbyte. Printed text on paper is voluminous, but still not much data. Handwritten text is even more voluminous, except for psalm verses written on the backs of stamps.

A not so very important fact is that about one hundred thousand new books are published every year in the United States [MC2]. The Library of Congress holds more than 25 million books.

At the lowest level text is composed of visual features. Letters are the next level; they are combinations of the primitive visual features, and are themselves arranged into words. Words also have an outline that, for short words, simplifies reading. It is for instance difficult to manually find the spelling error where anl is misspelled as anl. Short words such as and, a, and the are efficient. They are used frequently, contain little information and are consequently, for efficiency, short. Words such as phantasmagoria, serendipity, and flabbergasted are longer, and the few times they are used they convey a lot of information.

Words are grouped into different categories of phrases. Noun phrases (NP) and verb phrases (VP) are two examples that can be combined into a sentence (S). We could formulate this as S: NP VP, as in the real sentence "The actor is dead." But a text is usually something more than a collection of random sentences. There is an idea behind it that binds the sentences together.
This could be a simple causal relation, or a common theme. A second sentence can elaborate on the first, or explain it.

“The development of writing also led to the discovery of the representational structure of speech.” Merlin Donald

Something written is an external representation of thoughts and spoken language. By using this representation we simplify rational and scientific thinking. If the text is properly stored in the computer it can be traversed, searched, and grouped in different ways according to the application. This is a task suitable for technology, but how can we make technology understand the meaning of what is written? One step towards this objective is to represent meaning itself, for instance using a description framework such as RDF (Resource Description Framework), which is used on the Internet.

IV.3.5 Sound and music

Sound is very important for human communication, partly because it carries speech, which in turn carries language and, through language, culture. Even though the demands on data storage and processing are lower for sound than for images, sound is still a challenge to technology. An interesting observation here is that there appears to be an inverse relationship between how simple a stimulus is to represent and how easy it is to perceive. We for instance easily identify the sound of a smashed glass, which is extremely complicated, very difficult to describe mathematically, and difficult to store efficiently.

There is no such thing as a sound at a specific time. Sounds exist because of differences over time, and this of course affects how sound can be stored and manipulated. Even if data from sound is less voluminous than data from images there is still a lot of data involved. When using 16 bits per sample, a sample frequency of 44.1 kHz, and stereo we need 1411200 bits per second (CD quality). By sample frequency we mean the number of times per second we want to check the signal value.
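The CD figure above is a plain product of the three parameters, which a short Python sketch makes explicit. The function name `pcm_bit_rate` is an assumption made for this example; the numbers are the ones given in the text.

```python
def pcm_bit_rate(bits_per_sample, sample_rate_hz, channels):
    """Bits per second for uncompressed, sampled sound."""
    return bits_per_sample * sample_rate_hz * channels


# CD quality as described in the text: 16 bits, 44.1 kHz, stereo.
cd_rate = pcm_bit_rate(16, 44_100, 2)

# Storage needed for one minute of such sound, in bytes.
cd_bytes_per_minute = cd_rate * 60 // 8

if __name__ == "__main__":
    print(cd_rate)              # bits per second
    print(cd_bytes_per_minute)  # bytes per minute
```

One minute of uncompressed stereo sound therefore needs roughly ten megabytes, which is more than ten times the size of the text of Dracula mentioned earlier.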
Sending or storing 1411 kilobit per second is quite an assignment, so one important question is whether, and how, this figure can be reduced. Compression of sound reuses many techniques from image processing, and some of them actually emanate from speech processing.

Speech and sound are both signals that need to be sampled from the physical environment. Their representation is simplified by the fact that human hearing is limited. The highest perceived frequency is about 20 kHz, which means that a sampling frequency slightly above 40 kHz is sufficient. This explains why the sample frequency for a CD is chosen to be 44.1 kHz. Our ability to differentiate sound levels is also limited: 16 bits, or 65536 levels, are enough for almost any application, and 8 bits, i.e. 256 levels, are sufficient to comprehend speech and are used for example in the telephone system.

“Dokidoki” indicates the beating of a heart in Japanese.

(Figure: frequency response of a cymbal crash.)

Sound pulses separated by more than 3 seconds can no longer be grouped into pairs.

Music is represented as sheets of music where individual notes are grouped into motifs, which are grouped into movements, which are grouped into pieces. A few sheets of music can keep an orchestra working hard for several days to find the right interpretation. If successful, the orchestra creates auditory scenes, where for instance birds can be represented, emotions can be felt, and with rhythms that almost force us to move along.

“Give meaning to noise, sound becomes communication.” Daniel Sonnenschein

“The rest is silence.” Hamlet

IV.3.6 Speech

At the lowest level of representation of speech we have the phonemes. In English there are about 42 phonemes, roughly corresponding to the letters in the alphabet [JD]. The phonemes by themselves form a phonetic alphabet that can be given symbolic representations.
The table below shows some of the phonemes in use, along with their symbols in the ARPAbet from the United States Advanced Research Projects Agency.

Symbol    Example
i         heed
A         mud
p         pea
r         race

Table IV.3.1 Phonemes in use and their symbols.

“… writing is not a language, but merely a way of recording language by visible marks.” Bloomfield 1933

This seems all very organised and orderly, but the problem is that there is no one-to-one relationship between a phoneme and its physical shape, i.e. its sound. How we pronounce a phoneme, i.e. the prosody, or rhythm and melody, of speech, depends heavily on context. A higher level of representation is words, but like phonemes, words are also pronounced differently depending on where and why they appear. Another problem, sometimes called the segmentation problem in speech recognition, is that it is difficult to find the pauses at word boundaries when we speak, i.e. speech is much more continuous than we perceive it to be.

For speech the sound generation system is known: it is the human. Knowledge about this system gives much information that can be used for speech analysis, compression, and speech synthesis. We also know that speech is mainly used for speaking, hmmm…. It is therefore constrained by what you want to achieve when you speak. One example is that you rarely go from a very low amplitude to a very high amplitude. This is not true for all other types of sounds. Another useful constraint is that speech is not continuous (although most of us know exceptions). Talk spurts are interleaved with periods of silence.

Speech is a volatile medium with some inherent limitations. It is for instance difficult to describe spatial information using speech, and more than one speaker at the same time is a bad idea. Using speech and its constraints to extract meaning and intentions is still a challenging task best, if not only, done by humans, and even we can only make an approximation of what is really behind the words.
“Man knows that there are in the soul tints more bewildering, more numberless, and more nameless than the colours of the autumn forest;… Yet he seriously believes that these things can every one of them, in all their tones and semitones, in all their blends and unions, be accurately represented by an arbitrary system of grunts and squeals. He believes that an ordinary civilized stockbroker can really produce out of his own inside noises which denote all the mysteries of memory and all the agonies of desire.” G. K. Chesterton [SP] (keep complexity in mind)

What is the sound an angry Viking makes?

“An image says more than a thousand words, but demands 1000 times more power, 1000 times higher bit rate and 1000 times more expensive equipment.” Internet

(Figure captions: Runa next to Gullik behind Tova. Anna and Hakan leaving the room.)

IV.4 The Thing outside in

"I'm sorry Dave, I can't let you do that." "I know you and Frank were planning to disconnect me... and I'm afraid that's something I cannot allow to happen." "I enjoy working with people." "Will I dream?" HAL 9000, 2001: A Space Odyssey

Some four billion years ago life struck earth for the first time. Humanity and all other living beings are strict descendants of these ancient bacteria, built by DNA. Everyone is related! According to some probability calculations life should appear every 500 million years, but so far we have not seen any new forms of life. Someday though, new life will appear, and probably wipe out or assimilate mankind. Could it be that the computer-based thing will evolve into this new form of life?

The thing is something that you can touch, it is real, and it can be pushed and kicked. It is more interesting if it is smart, a bit like us, but an ordinary table is also a thing. The table even has legs! A human being can also be kicked, so it is a thing, but in this book we have our own interactor.
An animal could also be considered a thing, but we will ignore that line of thinking here, and stick to designed physical objects. Words with similar, or overlapping, meaning to thing are object, artefact, device, automaton, robot and machine.

“Since man is a child of God and technology is a child of man I think that God regards technology the way a grandfather regards his grandchildren.” Roberto Busa

According to the theory of thermodynamics the information of a piece of material is of the order $10^{24}$ bits (the information in a book is about $10^{6}$ bits).

Since the thing is a matter of matter its external representations are physical. Examples of attributes are material and surface properties, colour, shape, and moving mechanical arms and hands with many degrees of freedom. The appearance of a thing is fixed, at least compared to the appearance of information. It is for instance difficult to scale a thing by a factor of two, or to reconfigure the constituent parts to better support a given task. It is also impossible to physically reach into an advanced computerised thing and modify its internal structure; a command is only a suggestion! External representations of things can often move, which is in accordance with human representations. They can also be heard and smelled.

A thing is designed, and to specify all of its attributes is a major issue. The specification must be done with respect to the time and money at disposal, situation of use, user, security, and many other aspects. The cost of manufacturing a thing is highly related to the number of things produced. This tends to make things homogeneous and generally applicable over large markets. Since price is important, features such as adaptability and additional sensors are not added if they do not directly support the specified task. General applicability and low adaptability result in low sensitivity to context.
Internally we build things up by hierarchical and layered structures of electronics and mechanics. At least one central processing unit, memory, perceptual system, and one motor are needed for an autonomous, mobile thing.

For this book we will assume that a thing is computerised and networked. This means that it mediates between the virtual and the physical world, and this is a very important feat. A thing will be able to perceive the same things as a human does, while at the same time in theory having access to an almost unlimited memory. While the network provides for mobility in the virtual world, the thing still has the problem of physical mobility. This is important because if the thing is supposed to learn about the physical world it needs access to as many aspects of it as possible.

Will and should things ever look and behave like humans? This is an ongoing debate in the research community. The main supporting argument is that things should look like humans because then we could reuse a wealth of social behaviours. Co-operation and communication would be effortless. The main argument against is that we only fool ourselves. There is no way we can build a thing with such functionality. The result will only be frustrated human interactors unable to make themselves understood. One requirement for believability is unique reactions to many types of stimuli, implying a rich personality, emotions, and self-motivation.

A physical entity is countable, observable, and existent at some point in time. It has a mass and a volume.

“Computers are not intelligent. They only think they are.” Internet

Imagine two computer children playing outside. Will they ever play hide and seek? Will a mother computer read the same story every night to her computer baby? Will a computer baby skip dinner if the wrong dish is served? How will a computer baby draw its family? These are examples that make you think about the difference between a computer and a human.
Will there ever be a surrealistic computer artist?

Inevitably things will be more socially competent as they learn about context and how to adapt to it, and as we learn about how to equip the thing for adaptation. Managing social space is however not easy. One reason for this is that social spaces cannot be directly seen; they need to be discovered and formed through interaction. This is an active, generative process of observation and action that is inherently dependent on a specific context. Hopefully we will not be able to build things that can do as terrifying and horrible things as humans have done.

IV.5 Sensing it

Interactors need to observe and understand their context directly through sensors or indirectly through representations provided by other interactors. Without this information we cannot interact! For indirect observation information processing is vital, including for instance visual realism that will be discussed in the next chapter. In this chapter we will focus on the sensors available.

Humans observe, i.e. sense, the world through receptor organs organised as five senses: vision, touch, hearing, taste, and olfaction (smell). Claims have been made for a sixth sense, but so far there has been no evidence for telepathy, or other even more suspect human abilities. The modality, or channel, used is one aspect of sensing. The fidelity of a sensor is another. It includes the level of detail, i.e. the precision, and the accuracy, i.e. to what extent the information can be trusted. How do we for instance conclude that someone is happy, and her level of happiness?

Can you tell the difference between the touch of a loved one and the touch of a stranger? Is it possible to learn how to interpret a touch?

The raw information input stream from all of our senses is quite impressive. One calculation sums to about 11 million bits per second, approximately distributed as in the table below [TN].
Sense      Information stream (bit/s)
Vision     10.000.000
Hearing    100.000
Touch      1.000.000
Smell      100.000
Taste      1.000

Table IV.5.1 Bit rates (information content) for human senses [MZ2].

The amount of information that we can consciously handle is much, much less. About a hundred thousand times less, see table below! Notice how the weight of the information from hearing has been increased.

Sense              Bandwidth of consciousness (bit/s)
Vision             40
Hearing            30
Touch              5
Taste and smell    1

Table IV.5.2 Bit rates for human consciousness [MZ2].

The table should not be used to conclude that television could be shown using only 40 bits per second. Humans move their attention and select the interesting information. More information to choose from enhances the experience. After focusing, and extracting interesting information, only 11 very personal bit/s are left to be stored and processed.

As for output, ordinary speech supports values of the same magnitude. Reading one page (2400 characters) aloud, in a radio show, takes about two and a half minutes, i.e. approximately 16 characters per second. One character is about 2 bits of information.

So much for human sensing, but what does the world look like to an interactor of type information? Its goal is to manipulate some data structure, and to do this it needs information, also found in data structures. Access to data can be direct if it is stored locally, otherwise data communication is needed. The world of the interactor is discrete, dynamic, and, since it is a sampled version of the real world, usually stochastic. Context dependency is still an issue, and data without context, e.g. 42, is no information. If the data structures involved are static the world is fully deterministic, and with ample computational resources an agent in such a world is omnipotent, and will never lose a first-person shooter computer game.

(Figure: sensory compression.)

Human skin covers about 2 m² and weighs 3 to 5 kg.
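The figures quoted in this section can be checked with a few lines of arithmetic. The Python sketch below only restates the numbers from tables IV.5.1 and IV.5.2 and the reading-aloud example; the variable names are assumptions made for the illustration, and the table values are read as 10 000 000, 100 000, 1 000 000, 100 000 and 1 000 bit/s.

```python
# Values as read from tables IV.5.1 and IV.5.2 (bit/s).
SENSORY_BIT_RATES = {
    "vision": 10_000_000,
    "hearing": 100_000,
    "touch": 1_000_000,
    "smell": 100_000,
    "taste": 1_000,
}
CONSCIOUS_BIT_RATES = {
    "vision": 40,
    "hearing": 30,
    "touch": 5,
    "taste and smell": 1,
}

total_sensory = sum(SENSORY_BIT_RATES.values())      # about 11 million bit/s
total_conscious = sum(CONSCIOUS_BIT_RATES.values())  # sum of table IV.5.2
compression_ratio = total_sensory / total_conscious  # order of 100 000

# Reading one page aloud: 2400 characters in two and a half minutes,
# with about 2 bits of information per character.
chars_per_second = 2400 / (2.5 * 60)
speech_bits_per_second = chars_per_second * 2

if __name__ == "__main__":
    print(total_sensory, total_conscious, round(compression_ratio))
    print(chars_per_second, speech_bits_per_second)
```

The speech estimate lands in the same range as the conscious bandwidth in table IV.5.2, which is the point made in the text.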
Sensing is extremely important since it is the basis for adaptive behaviour. The more unpredictable the operating conditions are, the more important, and difficult, sensing is. A slightly more detailed discussion on the physics behind the interactions is postponed to the chapter on T-T interaction, see Chapter V.4.

A thing can perceive:

• Through databases that can be local or distributed. Examples are address books, profiles, or the Internet.
• Through input to applications run by the thing. Input can be given by humans, other interactors, or by active environments.
• Through sensors, which are discussed next.

The figure IV.5.1 below shows the principle for how a computer-based thing senses and acts in physical reality. The sensor converts a physical variable to an analogue voltage. This voltage is converted to a digital representation using an A/D converter, and back again to an analogue signal with the help of a D/A converter, the D/A block in the figure. The energy supplied through this signal is transduced and possibly amplified to some physical variable. Even if sensors are important, effectuators (actuators) are what make the difference in the outside world. Cause and effect is a natural law, not easy to bypass. Some examples of effectuators are the loudspeaker, video screen, propeller, and the electric train.

Definition: A sensor is a device that receives a signal or stimulus and responds with an electrical signal. Jacob Fraden

Figure IV.5.1 From sensing to action (blocks: A/D, T, D/A).

Stimulus          Sensor              Effectuator
Sound             Microphone          Loud speaker
Visible light     CCD, CMOS sensor    Photo diode
Infrared light    CCD, CMOS sensor    Any body at a temperature above zero Kelvin
Touch             Switch              Moving, pushing object
Force             Strain gauge        Spring, string, motor
Proximity         Hall sensor         Magnet
Temperature       Thermometer         Heater
Time              Clock               Sleeping pill??

The table IV.5.3 shows some common physical sensors and effectuators.
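Returning to the signal chain of figure IV.5.1, the conversion from an analogue voltage to a digital code and back can be sketched as follows. This is a minimal illustration, not a model of any particular converter: the reference voltage of 5 V, the 8-bit resolution, and the function names `adc` and `dac` are all assumptions made for the example.

```python
def adc(voltage, v_ref=5.0, bits=8):
    """Assumed ideal A/D converter: map 0..v_ref volts to an integer code."""
    levels = 2 ** bits
    clamped = min(max(voltage, 0.0), v_ref)  # inputs outside the range saturate
    code = int(clamped / v_ref * levels)
    return min(code, levels - 1)             # the top code is levels - 1


def dac(code, v_ref=5.0, bits=8):
    """Assumed ideal D/A converter: map an integer code back to a voltage."""
    return code / (2 ** bits) * v_ref


if __name__ == "__main__":
    sensed = 1.234                 # volts from a hypothetical sensor
    code = adc(sensed)
    restored = dac(code)
    print(code, restored, abs(restored - sensed))
```

The round trip loses at most one quantisation step (here 5 V divided by 256), which is the price of moving between the physical and the digital side of the thing.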
Most of the signals in the table can be sensed in many different ways, with many technologies from electronics, physics, biology, or chemistry. Electronics is the branch of physics responsible for the hardware side of the computer revolution, and physics is the basis for electronics. Physics describes properties of things, such as why some materials are conductors and others are not, and how a photo detector, transistor, or photo diode works. This is essential knowledge for understanding information technology and especially its limitations. Once again mathematics is the main modelling tool.

Table IV.5.3 Some examples of sensors and effectuators (also known as actuators) along with their physical stimulus.

Quantities that can be measured but are not listed in the table include frequency, wavelength, torque, acceleration, position, humidity, pH, revolutions per second and many others. A sensor senses by either generating an electric field, or a current, or by changing its resistivity. The change in resistivity depends on material characteristics and can be used to modulate a current. The resulting current or voltage usually has to be amplified before it can be digitised.

Note that humans have built-in sensors!

“We are no longer creatures of five senses: technology has given us hundreds of senses. We can see the universe throughout the electromagnetic spectrum. We can hear the vibrations, from the infrasound of the seismologist to the ultrasonics used in destructive testing. We can feel molecular forces. We can sense the age of ancient objects.” Myron Krueger, Artificial reality II, 1991

The choice of a sensor in a particular situation depends on factors such as price, precision, input signal range, reaction speed, output range, sensitivity (output range/input range), noise sensitivity, stability, or simply personal preferences. For mobile applications portability, power consumption, size, weight, calibration, and set-up time are also important.
To this we can add constraints from design. The sensor device should not force changes to the appearance of the product, and there are also environmental constraints.

For some applications we need more than one sensor. If we for instance want to monitor a room we can use one or two cameras in a distributed sensor network. Multiple sensors can also be chosen to increase the reliability of the system.

IV.5.1 Which sense is the most fundamental?

Hearing and vision provide us with the most information, but touch is our oldest sense, with a close coupling to the deeper, faster parts of our brain. Touch is implemented by skin, which is our primary physical interface with the real world; even the eardrum is skin. While many blind people learn to have a prosperous life, a person who has completely lost skin sensitivity will not. They cannot move around without risking inadvertently hurting themselves, and they have difficulty standing and walking. Taste is a social sense, and of course also necessary. Even though we only have five tastes (salt, sour, bitter, sweet, umami), a dinner might well be the highlight of the week.

Would you rather be deaf or blind? (Right (?) answer is that deafness means a higher degree of isolation)

All senses support well-being and give pleasure. By selecting the optimal input for a particular sense we could even use it as a drug. But, how do we find this optimal input? One clue is to look back through human evolution and search for positive inputs that can be extracted and concentrated.

JUMMI with the taste of UMAMI

Smell has sensors that detect combinations of seven basic smells: minty (peppermint), floral (roses), ethereal (pears), musky (musk), resinous (camphor), foul (rotten eggs), and acrid (vinegar) [DA]. Only eight molecules are needed to trigger a nerve impulse, but forty nerves need to react concurrently before a smell is detected.
“Cover your eyes and you stop seeing, cover your ears and you will stop hearing, but if you cover your nose and stop smelling you will die.” Diane Ackerman

It is difficult to describe a smell. How does Jolt Cola smell? Still, when we smell it we can say “Oh boy, Jolt Cola”.

What is the sound of a blink?

IV.5.2 Neural pathways

Current knowledge about the neural organisation of the senses suggests that they are organised in pathways. Sensory input follows a path from the receptor, via the thalamus, to the cortex where most of the processing is done, see figure below. The thalamus and the cortex are both parts of the brain; the thalamus is in the centre, and the cortex is the layered exterior (grey).

Figure IV.5.2 Neural pathway, transmitting and transforming sensory information (labels: optic nerve, thalamus, cortex).

There are cross connections between the senses. We can for instance see a Morse signal, short, short, short, long, long, long, or we can hear it given - - - . . . - - - . One of the perceptions will do. Try to recall the sound of chewing a carrot. Another example is that our inner ear detects head motion and feeds this signal forward to the vision system. This way we can quickly compensate for head movements. In other cases vision supports hearing. It is for instance difficult to decide whether a sound originates in front of or behind us in a room. This decision is left to vision. The alternative? Three ears, one on top of the skull? A bit of a nuisance when you wash your hair.

It’s am-s-ng h-w m-ny l-ttrs we c-n r-m-v-d

All perceptions are individual and contribute to a personal history and representation of the world (which will make twins less alike for each second). Perception is also an active internal process. Results from research in the neuro-physiology of perception show that there are far more nerve paths going from the cortex of the brain to the lateral geniculate body than there are paths coming directly from the eye. The lateral geniculate body is a region in the thalamus, see figure below, which acts as an intermediary between the optic nerve and the visual cortex. The feedback loop created indicates that perception in humans is an active process, where the perceiver is involved.

Figure IV.5.3 Feedback where sensory information is modulated by previous knowledge (labels: optic nerve, thalamus, cortex).

“The senses do not give us a picture of the world directly; rather they provide evidence for checking hypotheses about what lies before us. Indeed, we may say that a perceived object is a hypothesis, suggested and tested by sensory data.” D. Drascic and P. Milgram

In other words, what an individual sees is in fact both information that she retrieves from the brain, and information from the eye. One example is reafference, which links the movement of the beholder and that of the perception. By re-injecting what should be seen when moving, the self-movement is cancelled out. When this is done the movement of the perceived object can be calculated. Another example is when you slow down from driving 100+ km/h, and without looking at the speedometer turn right. You will be really surprised by the screaming tires, because low speed coming from 100 km/h is not the same as low speed coming from 40 km/h.

Perceptions can also be associated with stimuli from within an interactor. This type of perception is called proprioception, and one example is that you can tell how your hand is turned even if you hide it under a table (try it).

IV.5.3 Internet data pathway

Media signal processing, such as decompressing the mp3 file The Gambler by Kenny Rogers, is by no means the only task where information is processed by technology. On its way to or from the user over the Internet, data is processed, stored, and presented by selected sets of applications. The applications are needed to manage access control to web resources, and to provide easy structuring of basic functionality.
Take as an example a user who wants to automatically upload a photograph to her web site. The photograph by default is stored in JPEG in a suitable folder, close to other pictures. It should be compressed to use less than 25 kB of memory, and be automatically labelled by its motif since the user does not want to add this information herself. All of the functionality needed should be applied to the data stream on its way from the camera to the web server.

Figure IV.5.4 Information in cyberspace is collected, transformed and stored (labels: sensors; IR, radio; access points; different processing possible).

Another example where execution units are chained together is a weather service for a mobile phone. The phone asks the operator for the cell location, and gets GPS position data in return. This data is locally transformed in the device into zip code data. The zip code is sent to an Internet service provider who returns the current weather conditions for that area.

IV.6 Acting out

We will focus on two forms of interactor actions: expressing itself, and moving around. In general there is no end to the number of intentions and their corresponding physical and virtual actions, and we could also add innumerable actions done without intention. The actions chosen here are however two of the most basic, and we will later, in Chapter V.15, return to the subject and give more examples of actions, there in the context of command based interaction.

Human consciousness has a very limited information processing capability in the sense of Shannon's information theory, and we are also limited when it comes to consciously expressing ourselves. One approximation is that we can express less than 50 information bits per second, using speech, dance, facial expressions, or other means. This bit rate is reasonable in a face-to-face conversation with another human, but in an interaction with a computer?
The computer has the possibility to input and output tens of millions of bits per second. We can however subconsciously use much more of the possible bandwidth. We for instance reveal, without intending to, what we are really saying when we are talking with someone. The output bandwidth for muscular and motoric management is approximately shared as: skeleton 32%, hands 26%, language generation 23%, facial muscles 19%. Through our actions we can change much more information, for instance by setting fire to a newspaper, or throwing the hard disk out of the window.

The young generation will also have overall better coordination, faster reflexes and perhaps a better ability to deal with 3D spaces. Will this change how tools are built for humans? Skills like using a mouse and keyboard will be taken for granted. How will this affect social interaction and product development?

"Where they have burned books, they will end in burning human beings." Heinrich Heine 1821

"What about shutting the Internet down?"

Motoric behaviour and perception are interdependent, and the initiating action is not always obvious. Our eyes move, which allows us to focus on the interesting part of a scene, and we stop to listen. Without motion we cannot access the interesting sensory impressions, and without sensory input we do not know where to move.

An obvious fact is that we do not have wheels! This is not because nature was not clever enough to invent the wheel, it eventually did, but because nature did not think wheels were such a good idea. Instead it provided us with two legs. Not four, or six, which would have simplified programming, just two, and we manage quite well with only a left and a right pedal. The best research laboratories in the world are still trying to figure out how to copy this feat. Nature understood that the problem is not just to move forward at a steady pace.
Acceleration, trail bumps, steep slopes, jumping, turning, and inspection of the environment have to be managed as well.

Physical context: H is localised, not global or distributed.

The next marvel is how you keep track of your hand when your upper and lower arm, and your wrist, move in 3D. Quite a lot of real time trigonometry! As a result of our mastery of this, humans are quite good at aiming and throwing things. Maybe better than any other species on the earth? Throwing involves the 3D real time trigonometric problem noted above, also referred to as kinematics or the geometry of motion, and also estimation of dynamics, i.e. effects of forces. The loop to move a muscle, from the sensors in the hand, through the brain, and back again is quite slow, somewhere around 200 to 450 milliseconds, which is much too slow to be useful for controlling the throw of a ball. So the mystery is, how can we be so accurate when we do not know how we are throwing?

The solution is that the brain has already done some pre-calculations of how the hand, the arm, and the rest of the body should manoeuvre to perform the throw. It is running a concurrent simulation of the throw, in real time performing inverse kinematics and inverse dynamics, and data from this simulation is fed to the muscles. An adjustment from the simulation can reach the hand in less than 100 milliseconds, which is an acceptable delay. The simulation is built from previous experiences, so practice is necessary and improves the accuracy. Let us say that you have performed an action with your hands where vision was also needed. The next time you do the same action your eyes will move ahead, guided by the muscles of your hands. And, even more fascinating, if someone else performs the same action and you are only watching, your eyes will still prepare you for the action.

Inverse kinematics and inverse dynamics are examples of ill-posed problems, which means that the solution is ambiguous.
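The ambiguity can be made concrete with a toy arm. The Python sketch below assumes a planar arm with two rigid links (an upper arm and a forearm), which is a textbook simplification and not a model of the human arm; the function names `forward` and `inverse` are made up for the example. Forward kinematics gives one hand position for each pair of joint angles, but the inverse problem returns two different pairs of angles for the same hand position.

```python
import math


def forward(l1, l2, theta1, theta2):
    """Hand position of a planar two-link arm, given the two joint angles."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def inverse(l1, l2, x, y):
    """Both joint-angle solutions (one per elbow configuration) for a target."""
    d = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(d) > 1:
        raise ValueError("target is out of reach")
    solutions = []
    for sign in (+1, -1):  # the elbow can bend either way
        theta2 = sign * math.acos(d)
        theta1 = math.atan2(y, x) - math.atan2(
            l2 * math.sin(theta2), l1 + l2 * math.cos(theta2)
        )
        solutions.append((theta1, theta2))
    return solutions


if __name__ == "__main__":
    for theta1, theta2 in inverse(1.0, 1.0, 1.0, 1.0):
        print(theta1, theta2, forward(1.0, 1.0, theta1, theta2))
```

Both angle pairs put the hand at the same point, so extra information, such as a preferred posture learned from practice, is needed to choose between them.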
Another example is to figure out which numbers were multiplied to get a given product. What about 42, for instance? To solve such a problem we need more information, or we have to guess intelligently. When we throw, the missing information comes from the model of the throw in the brain, i.e. trained behaviour, and from the environment, e.g. sensing a strong wind. The simulation still depends on some feedback, for instance from the eyes, because otherwise it will soon lose track of reality.

The interesting observation from the above is that by using an internal model the system is actually faster than a pure physical implementation! This is a good counterexample to the intuitive assumption that an internal representation always will slow behaviour down.

(Figure: transcribing movements.)

Throwing something is also a good example of the complementary roles of analysis and synthesis. Analysis is needed to estimate the parameters for the throw, its length, and the weight of the stone. Synthesis is needed to execute the throw, contracting muscles to move the arm, and twisting the wrist. We will come back to analysis and synthesis many times in the following chapters.

Last but not least there is the magic of our hands and fingers. They are magnificent instruments for gripping, pulling, and pushing in a variety of ways. We manipulate objects of different size and form, and with different kinds of surfaces, and do not think much about it.

Things and information have their own means of acting. Information needs a physical representation to express itself in the physical world. For this the computer display and the loudspeaker are useful. We will later in this chapter discuss how they can be brought into action. Although information has a problem accessing the physical world, the same property gives information a clear advantage when it comes to moving around. Roaming around approaching the speed of light is possible.
Thinking about what things do, we can identify four kinds of acting things, not mutually exclusive [AS4]:

1. Mediators of force and energy, e.g. a chair or a car.
2. Manipulators of matter, e.g. a lawnmower.
3. Transformers of physical state, e.g. an oven.
4. Processors of information, e.g. a calculator.

Information in action: Ad against drug abuse

Note that this taxonomy uses different interactions to characterise things, and that things are pre-wired and adapted, rather than adapting.

To perform the above, things need to generate heat and power, use forces, and move around in the real world. The motor or engine is perhaps the most important active device ever invented. It transforms electricity or some other form of energy into motion, and is implemented in many forms such as the steam engine, the jet engine, the car engine, the DC motor, and the stepper motor. Around this power source many other ingenious mechanical details have been invented, such as the gearbox.

Mobility can be achieved either by moving the thing itself, or by selecting an input source matching the new physical environment. A line of cameras is roughly equivalent to a moving camera. Currently, however, we tend to think of a thing as a single physical unit, and we will stick to this perspective in this book.

Given a flat surface, wheels are a good idea, which in turn implies the axle, the bearing, and the brake. Without the flat surface a cableway, an aeroplane or a rocket are alternatives, and if the surface is wet a boat (with a motor) will do the trick.

IV.6.1 Action, the concept defined

Inter-action implies action, which is the behaviour resulting from internal processing by the interactor. But what exactly is an action? The word seems to be used in so many contexts that a precise definition is hard to find. Action is a very basic concept for humans, and often the more basic a notion is, the more difficult it is to describe.
Not very encouraging, but let us start with the following definition:

Definition: An instance of behaviour is an action if and only if it is associated with an intention making the behaviour into a means for some end. (Jens Allwood)

That gave you a ride, did it not? Let us look at another description, from the world of UML (Unified Modelling Language), a standard language used for software modelling, where an action is defined by its context:

- Preconditions: expected to be true at the start of the action.
- Postconditions: ensured to become true at the end of the action.
- Guarantee conditions: ensured to remain true during the action.
- Rely conditions: expected to be maintained true during the action.

An action is in other words a modification, and might include a reaction, i.e. it is sensitive to its context. Since the real world that we live in does not provide a stable environment, an action need not give the same result twice, and in two different situations it most probably will not give the same result. Note also that an action in the definition above can be purely mental, such as a prediction or an analysis.

Why are action and similar concepts so difficult to understand? One reason is that if we focus on one concept and try to express it clearly, in all its entangled details, we at the same time need to clarify all the other related concepts, along with the relations between them and their dynamic behaviour, at the same level of detail. Consensus is not very likely since all of these issues are quite complicated, partly because they have evolved over millions of years, and partly because of the many different views possible. A working reference architecture that we implement ourselves, and results from neuroscience, will help us to better understand the issues.
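As a rough illustration of the four kinds of conditions listed earlier in this section, the following Python sketch treats them as checks wrapped around the steps of an action. Everything here is an assumption made for the example: the function `run_action`, the exception `ConditionViolated`, and the glass-filling scenario are hypothetical and are not part of UML or of any library.

```python
class ConditionViolated(Exception):
    """Raised when one of the conditions around an action does not hold."""

def run_action(state, steps, pre, post, guarantee, rely):
    """Run the steps of an action, checking its context before, during and after."""
    if not pre(state):                       # expected true at the start
        raise ConditionViolated("precondition")
    for step in steps:
        if not rely(state):                  # the environment must keep this true
            raise ConditionViolated("rely condition")
        step(state)
        if not guarantee(state):             # the action itself must keep this true
            raise ConditionViolated("guarantee condition")
    if not post(state):                      # ensured true at the end
        raise ConditionViolated("postcondition")
    return state

def pour(amount):
    """Return a step that adds `amount` of water to the glass."""
    def step(state):
        state["level"] += amount
    return step

# Hypothetical action: fill a glass to at least three quarters.
def not_full(s): return s["level"] < s["capacity"]        # precondition
def filled_enough(s): return s["level"] >= 0.75           # postcondition
def no_overflow(s): return s["level"] <= s["capacity"]    # guarantee condition
def tap_open(s): return s["tap_open"]                     # rely condition

glass = {"level": 0.0, "capacity": 1.0, "tap_open": True}
result = run_action(glass, [pour(0.25)] * 3,
                    pre=not_full, post=filled_enough,
                    guarantee=no_overflow, rely=tap_open)
print(result["level"])
```

The split between guarantee and rely conditions mirrors the point made above that an action is sensitive to its context: the guarantee is the action's own responsibility, while the rely condition is something the environment (here, someone not closing the tap) has to maintain.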
"A precise definition of the term action poses numerous problems, and depends on whatever quasi-philosophical blanket ideas we may hold on the subject." (Jacques Ferber)

Karma (Sanskrit): the sum of all that an individual has done and is currently doing. Will actively create future experiences.

[Figure IV.6.1 Web of concepts affecting human action. Labels: Cause, Agency, Reason, Intention, Emotion, Action.]

Over the years we will increase clarity and depth, and even though we might never be finished, the search and the discussion are a reward in themselves. The grand unifying theory will however not be found.

An action can be triggered by any number of causes and by reasoning, see figure IV.6.2. One philosophical problem here is whether we can always find a chain of causes for an action, or whether an action can be caused by an impulse without reason or cause. This discussion is akin to whether there are truly random events, or whether the world would prove deterministic if only we could follow all details of what happens.

[Figure IV.6.2 Triggering action. Labels: Unconscious (social, environmental, …); Framed by consciousness; Cause; Reason; Action.]

Back to reality, let us exemplify action by the everyday event of going to a very interesting lecture at Umea University, early in the morning [DAN]:

- Form the goal. The goal is the purpose underlying an action. (Be in time for lecture)
- Form the intention, i.e. form a conative intentional state. (Have to speed up, or I will be late)
- Specify the action. (More throttle)
- Execute the action. (Push the pedal)
- Perceive the system state. (Check the speedometer)
- Interpret the system state. (Speed is accurate)
- Evaluate the system state, i.e. check if the goal will be, or has been, fulfilled. (I will be on time, or I missed the lecture because I slept too long)

This decision cycle, or action cycle, is performed over, and over, and over, and over again.
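The repetition of the cycle can be sketched as a loop. The Python sketch below is a deliberately crude reading of the seven steps: it narrows the lecture example down to reaching a target speed, and the toy car model (5 km/h per push of the pedal), the tolerance, and the function name `action_cycle` are all assumptions made for illustration.

```python
def action_cycle(target_speed, speed, tolerance=2.5, max_cycles=50):
    """Repeat the seven-step cycle until the evaluation says the goal is met."""
    goal = target_speed                          # 1. form the goal
    log = []
    for _ in range(max_cycles):
        if speed < goal:                         # 2. form the intention
            intention = "speed up"
        elif speed > goal:
            intention = "slow down"
        else:
            intention = "hold"
        # 3. specify the action as a pedal command
        pedal = {"speed up": 1, "slow down": -1, "hold": 0}[intention]
        speed += 5.0 * pedal                     # 4. execute (toy car model)
        reading = speed                          # 5. perceive (speedometer)
        error = goal - reading                   # 6. interpret the reading
        log.append((intention, reading))
        if abs(error) <= tolerance:              # 7. evaluate against the goal
            break
    return speed, log

final_speed, history = action_cycle(target_speed=70.0, speed=50.0)
for intention, reading in history:
    print(intention, reading)
```

Each pass through the loop corresponds to one turn of the cycle; the goal itself never changes here, which is one of the simplifications a sketch like this makes.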
A common intuition is that an action alters the world close to the cause, but now the global network will extend the reach of an action in space. Cause a disruption, and it can have an effect somewhere, or even anywhere, else. Similarly, actions local in time will increasingly affect future states. This has already been done by books, newspapers, laws and television shows. If you write a book it might be forgotten and then found 10 years later: "Ah, this is a masterpiece." With extensive networked data access the possibilities increase.

"As far as we can go in conceiving the depths of the physical world, we find agitation and specific interactions. Immobility, fixed states and repose are local and provisional phenomena at the level of our timescale and of our perceptions." (E. Morin)

Concepts related to action are activity and task, i.e. what the user must perform to achieve a goal. In everyday use of the words the distinctions between goal and task, and between task and action, are fuzzy ones. If my task is to write this word, it is also my goal. An action has a more coherent behaviour than a task; it is well known, and well practised. Also, it involves no problem solving, and needs no control structure [JP]. But recall that action is a fuzzy concept with no clear definition accepted by everyone. After the discussion of interaction in the next part of the book, we will better understand action.

Definition: An activity (task) is an observable, distinguishable, goal-oriented sequence of state-changes within a system, initiated, controlled and monitored by one or more agent(s).

IV.6.1.1 Action cycle revisited

The last section discussed the action cycle shown in figure IV.6.3. This model is however much too simplistic to fully describe how humans behave in interactions, and we will list some of the deficiencies, just to get the point across that human behaviour is not easily formulated in a simple model [DK].

[Figure IV.6.3 Action cycle (Human): 1. Formulate goal; 2. Intention; 3. Detailed plan/action; 4. Execute plan/action; 5. Perceive result; 6. Interpret perception; 7. Evaluate goal. Other labels: Do, UnDo.]

To start with, goals are more dynamic than the model indicates. They are not always well specified, prioritised, or consistent. Goals interact and change while we try to accomplish them, and they could even be formed while performing the action to fulfil another goal. This reflects the fact that the context is changing and is affected by actions. One way to view culture is as a device to cope with the complexity of everyday situations.

"It is impossible to plan the future, we can only prepare for it." (Teknisk framsyn)

Consider an artist who sets out to do a painting. The overall goal is well defined, but how about the sub-goals? The composition and the detailed choice of colours are something that will grow while painting. The painter creates an environment (the painting) that will affect the sub-goals.

Before starting the actual paintwork the artist has probably done some preliminary work: made a sketch, bought some paint and selected a motif. This preparatory work is also part of the solution. Another example of preparatory work is that we rearrange the dishes before washing up in order to start with the glasses. Servicing the car to keep it fit for fight is another example where a task, this time a maintenance activity, will increase the probability of reaching the objective.

A third category of activities that is not found in the simple decision cycle above is complementary actions [DK]. Complementary actions are actions that help us perform the main task in a better way. They transform the problem into one that better suits our cognitive abilities. We, for instance, divide telephone numbers up into groups to make them easier to remember, and we mark the current page in the book so that we will start at the right page after a short nap.
ZZzzzzzzzz…

IV.6.2 Visual realism, information blending in

What does it mean for information to express itself, and why is it worth the effort? The first question was discussed in Chapter IV.5, and the answer to the second question is that the expression is a prerequisite for any interaction, so if information wants to join the game it has no choice. One example is business-to-business services where one information agent publicly announces its functionality, and what it can deliver. The reader in this case is another software agent. If an information agent on the other hand wants to match human sensors it has to re-represent itself as graphics, sound, tactile information, or some other sensible media. Luckily the technology for this is available, and reasonably adequate.

Ultimately a human should see, hear, or feel information as something natural, perfectly blended with other natural representations such as faces of people, trees, wind, and falling rain. Achieving true audio-visual realism means facing the complexity of reality, and for this the tools we have at hand are still limited in many respects; displays have limited visual resolution, acoustic properties of rooms vary, and so on.

Figure IV.6.4 illustrates the problem. Information chooses to express itself as well as possible using available means, in this case for instance using a graphical representation. There are constraints, such as bad lighting, or perhaps no colour is available. The receiver perceives the presentation from its own perspective, knowledge and goals. For efficient communication the expression should match the constraints and the receiver's point of view.

[Figure IV.6.4 Information expressing itself. Labels: Complex information agent; Expression; Constraints; Point of view; Black square (fully realistic).]

"Perspective: A basic principle of Euclidean geometry is that space extends infinitely in three dimensions. The effect of monocular perspective, however, is to maintain that this space does nevertheless have a center - the observer. By degrees, [in the Renaissance] the sovereign gaze is transferred from God to 'Man.'" (Victor Burgin)

What makes the problem feasible is first of all human limitations. We have, for instance, limited acuity. Additionally, our incredible adaptivity means that we will adapt to almost any perceptual quality, and if we are really interested in the view its quality is of less importance.

In the following we will focus on visual realism, which is steadily increasing because of improved colour and lighting representation, better methods for surface representations, use of texture mapping and reflection modelling, and also because of advances in physical simulation of particle systems, fluids, and flocking behaviour. Despite the advances we still have many unsolved issues, for instance how to measure visual realism in an image. The more general problem of efficient interaction will be a major theme of the next part of the book, and again note that realism is not necessary for high bandwidth communication and interaction.

Imagineering

Using everyday technology we have to make do with a two-dimensional projection of our 3D model, a process called rendering. The simplest representation is a see-through solution where all edges are displayed, see figure IV.6.5a. At the next level of sophistication hidden edges are removed using some smart algorithm, of course at the cost of some extra computational load, see figure IV.6.5b. There is still not much realism in the generated box; we need to add more visual information to the surfaces. To do this we can use texture mapping to map a texture, i.e. an image, to the surface of the box. The image can be a photograph, or some computer generated graphics. A square earth is no problem, but maybe not a very realistic rendering.
Red – Blood
Green – Grass
Blue – IBM

[Figure IV.6.5 a) See-through, b) hidden edges removed, c) texture-mapped image, d) bump map.]

The mapping can also be done with additional constraints, such as the mapping of a bumpy surface, which can produce interesting visual effects, see figure IV.6.5d. Changing the characteristics of the texture is another way to increase realism and to create surfaces with different colours or reflectivity. Transparency can also be modelled, and for certain scenes and objects gives spectacular results.

Lighting is an important parameter for increasing the realism in an image. Simulation of point light, ambient light or directional light gives possibilities that the computer can use to calculate visual effects. For really good results, add an artistic touch, and usually lots of computations. The example in figure IV.6.6 shows two point lights reflected from a sphere. The reflection is computed using normals, and to save computer resources the sphere is modelled as facets. Each vertex is assigned a normal that is the mean value of the normals of the surrounding facets. This normal determines the intensity of the reflected light at that vertex. As can be guessed from the figure, light values are calculated as linear interpolations between the vertices. Not surprisingly, the illusion of a sphere improves for smaller facets (with more computations).

"Darkness, unlike light, does not need a source." (Jeremy Brin)

[Figure IV.6.6 Compromises can give visible features (Silicon Graphics, Inc).]

Shadowing is the next item on the list, and it can be accomplished in at least two ways. The first way is to generate rays from the light source and follow them as they bounce around. Whenever they hit something, this something is lit up. This is called ray-tracing. Another method is to sort all objects according to their distance to the light source and calculate the shadows an object casts on more distant objects.
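The vertex-normal scheme described above for the faceted sphere (average the normals of the surrounding facets, compute an intensity per vertex, interpolate linearly in between) is commonly known as Gouraud shading. The Python sketch below shows the three steps on a tiny two-facet "roof" instead of a sphere. The mesh, the helper names, and the use of a simple diffuse (Lambert) term for the intensity are assumptions made for illustration; the highlights in the figure would need a more elaborate reflection model.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def facet_normal(p0, p1, p2):
    """Unit normal of a triangular facet (counter-clockwise vertex order)."""
    return normalize(cross(sub(p1, p0), sub(p2, p0)))

def vertex_normals(vertices, facets):
    """Per-vertex normal: the normalised sum (i.e. mean direction) of adjacent facet normals."""
    sums = [(0.0, 0.0, 0.0)] * len(vertices)
    for facet in facets:
        n = facet_normal(*(vertices[i] for i in facet))
        for i in facet:
            sums[i] = tuple(s + c for s, c in zip(sums[i], n))
    return [normalize(s) for s in sums]

def intensity(normal, to_light):
    """Diffuse intensity at a vertex: cosine of the angle to the light, clamped at 0."""
    return max(0.0, dot(normal, normalize(to_light)))

def lerp(a, b, t):
    """Linear interpolation between two vertex intensities, t in [0, 1]."""
    return a + t * (b - a)

# Hypothetical mesh: two facets meeting along a ridge, lit from straight above.
vertices = [(-1, 0, 0), (0, 0, 1), (0, 1, 1), (1, 0, 0)]
facets = [(0, 1, 2), (3, 2, 1)]
normals = vertex_normals(vertices, facets)
light = (0, 0, 1)
edge, ridge = intensity(normals[0], light), intensity(normals[1], light)
print(edge, ridge, lerp(edge, ridge, 0.5))
```

Because the ridge vertices are shared by both facets, their averaged normal points straight up and the sharp crease is smoothed away in the shading, which is exactly the compromise the figure caption hints at.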
Collision detection is another feature that improves realism; sending the ball through the floor to the other side is not very realistic. This interaction, which is so fundamental to nature, is actually quite hard to reproduce in the virtual world. The principle is simple enough: we only have to detect if the contour of one object overlaps the contour of another.

[Figure IV.6.8 Perfect representation of collision detection is computationally expensive.]

Figure IV.6.8 hints at some of the problems. First, all of this is happening in 3D, where predictions and calculations are more computer intensive than in 2D. Second, the level of detail needed for realism is quite high, further increasing the computational load. The bounding volume of the person in the figure is not a sufficiently good approximation for collision detection in the case shown. A better approximation would increase the number of volumes to be checked for collision, aggravating the load problem. We can also use a hierarchy of bounding volumes, and if a collision with an enclosing bounding volume is detected, then collision is tested for at a finer level of detail. The cost of this is more complicated control and some additional memory. A general solution is to use more or faster hardware, at the cost of $.

Virtual information can be presented any way the designer chooses. If you throw a ball at a wall it can get stuck on the wall, bounce back or continue through the wall; everything is possible. As humans we are adapted to reality. The question is, will our previous experiences of reality hinder adaptation to the virtual environment, or do we not have such a limitation?

A particularly annoying problem when displaying shapes is the hidden surface problem. This problem arises when 3D objects are projected onto a 2D screen. The computer has to decide which object is in front of another. Consider the example in figure IV.6.9.
To the human eye it is quite obvious which of the bars is the closest, even without the dotted indication. Why?

[Figure IV.6.9 The hidden surface problem illustrated.]

From the I/T perspective visual realism is not interesting. Visual accuracy is perhaps a better description of the problem: to extract just enough data for the task at hand, where sampling and careful selection of relevant visual information involve a trade-off between the amount of data and the computations necessary to analyse the data.

IV.6.3 Sound, speech synthesis and telling stories

An auditory space is a shared environment where many can hear similar sounds while attending to different visual representations. Sounds, when used as music and environmental signals, are necessary for realism. A whining sound in the background of a movie, which we are not even consciously aware of, can by itself build up tension. By designing soundscapes we can exploit sounds also for other contextual information, e.g. to signal the time left until the alarm clock goes off.

Uses of colour in GUI [BS1]:
- Soothe or strike the eye
- Draw attention
- Discriminate
- Organize
- Evoke behaviour

Whether music without song can actually say something, and be used as a language to convey meaning, is less obvious, but there is certainly a complexity and a number of variables that can be used to construct messages. Harmony, rhythm, intensity, pitch, speed, composition, form (pop, jazz ..), timbre, and texture could in principle be combined into messages, but maybe the effort of learning and using it is too high [GG].

A thing speaking, saying "Good morning", i.e. implementing speech synthesis, is a technology already in use. Automatic answering services directing you to press # or * are no longer a surprise. The perceived impersonality of speech synthesis does not even make you angry any more.

Modern technology gives us new possibilities for telling stories and presenting facts in compelling ways.
Storytelling, or narration, has been performed since the beginning of time, but there is much more to telling a story than knowing the syntax and semantics of a language. A cohesive story is supposed to evolve in an orderly way over time, and to have a structure such that cause and effect in the story make sense. This means answering the following four questions essential to storytelling: who, when, where, doing what. Who is the reader? Why is she reading? What is she supposed to learn? Answering such questions, and adapting content, style and format to the reader and the situation, is extremely difficult.

Many attempts have been made to structure stories according to common components, or motifs. Here are some of the suggestions by Georges Polti, most of them loosely applicable to this book, chosen from the original 36 suggestions: Supplication, humbly and earnestly asking for help (1); Deliverance, recovery or preservation from loss or danger (2); Disaster (6); Falling Prey To Cruelty And Misfortune (7); Revolt (8); Daring Enterprise (9); The Enigma, puzzling, ambiguous, or inexplicable (11); Fatal Imprudence, the quality or condition of being unwise or indiscreet (17); Self Sacrificing For An Ideal (20); All Sacrificed For A Passion (22); Necessity Of Sacrificing Loved Ones, here applied also to behaviour and situations (23); Rivalry Of Superior And Inferior (24); An Enemy Loved (29); Ambition (30); Erroneous Judgement (33).

Another attempt to analyse narratives was made by Vladimir Propp, who looked for, and found, common elements in Russian folk tales. From his list of 31 elements we select the following: One of the members of a family absents himself from home (1), and the villain causes harm or injury to the family (8). The hero(ine) is tested, interrogated, attacked, etc., which prepares the way for his/her receiving either a magical agent or helper (12). The hero(ine) is transferred, delivered, or led to the whereabouts of an object of search (15).
The hero(ine) and the villain join in direct combat (16). The hero(ine) is pursued (21). Rescue of the hero(ine) from pursuit (22). The hero(ine), unrecognized, arrives home or in another country (23). A difficult task is proposed to the hero(ine) (25). The task is resolved (26), and the villain is punished. The hero(ine) is married and ascends the throne (31). Note that there is an implicit ordering of the elements suggesting how a story is built, and also indicating a linear cause-event structure of a story.

Scholars have also worked on categorising the type of a tale. Folktales are interesting because they could be assumed to follow a tradition of storytelling from the dawn of time. One elaborate catalogue has been suggested by Antti Aarne and Stith Thompson [CC]:

- Animal Tales (Types 1-299)
- Ordinary Folktales (Types 300-1199), selected items:
  a) Supernatural adversaries and helpers
  b) Superhuman tasks, to search fortune, or the quest for the unknown
  c) Magic objects and supernatural power or knowledge
  d) Romantic tales
- Jokes and Anecdotes (Types 1200-1999), selected items:
  a) Numskull stories
  b) Stories about married couples
  c) Jokes about parsons and religious orders
  d) Tales of lying
- Formula Tales (Types 2000-2399)
- Unclassified Tales (Narrationes Lubricae) (Types 2400-2499)

The extended list is quite long, but the point we want to make here is that most tales are social statements.

IV.7 We need knowledge, and we represent it

Knowledge is one of those familiar concepts that you think you know all about, until you try to pin down its exact meaning. Here we will not even try that; instead we will describe it indirectly, in action, and also ignore any differences between memory representation and knowledge. Two types of knowledge can be distinguished: knowing-about (declarative knowledge), and knowing-how (procedural knowledge).
Declarative knowledge is associated with a thing, place, person, fact, or subject, for example "Norrland is not too hot" or "Tova is a girl". It could be further subdivided into episodic and semantic knowledge, where episodic knowledge is derived from the experience of a particular individual and semantic knowledge is commonly held information. Procedural knowledge is understanding from experience, knowing how to do something, such as preparing for an examination.

Philosophers have been studying knowledge for centuries, and a whole branch of philosophy called epistemology focuses on its forms, limits, nature, and validity. One particular view from epistemology is that knowledge is acquired by reasoning. This is called rationalism. Another view, empiricism, is that knowledge is derived from experience. A variation of empiricism referred to as positivism insists on a scientific approach based on observation. Ontology is another branch of philosophy, concerned with identifying the things that actually exist. This study of what there is nicely complements the goal of epistemology, i.e. what we can know.

Definition: A system has knowledge of something if the system has a model of something perceived by the system. [Figure labels: System, Something.]

Is it possible to teach wisdom without knowledge?

IV.7.1 Knowledge representation

How is knowledge represented in the brain, and can this principle be reused in artificial intelligence? Nobody knows, but theories on representations have evolved, and some have been tested on real world problems. It is a popular topic of academic debates, and at a very low abstraction level the answer is easy (if you do not ask a neurologist). We simply store knowledge by adjusting chemical potentials. The real problem is how to store information effectively, while preserving relations between data, such as abstractions, and still have quick access.

Information in the brain seems to be stored in memory in at least four major representations.
The first, and maybe the most important one, is information stored as the equivalent of images. A second representation is the phonological representation, i.e. stretches of syllables, such as how we remember a phone number. Next, there are grammatical representations, exemplified by nouns, verbs, and clauses. The most abstract representation is the one we use to store thoughts, and it is called mentalese [SP]. Each of the representations listed has its own important application. The grammatical representation for instance supports language.

I remember it so well. "This is for you", he said. "Happy Birthday dad". His eyes were shining and his voice full of pride. "I have done it myself".

Two suggestions for how to achieve the representations above are a symbolic representation and a distributed, connection-oriented representation. The symbolic representation can be further divided into analogue representations, such as mental graphics, where the actual shape of a dog is directly stored, and propositional representations that are more language oriented and capture the conceptual content of knowledge. One example here would be the concept of God stored as the word God. Actions can be represented as state diagrams, or software programs, where discrete symbols are manipulated on the basis of syntactic rules that operate on the symbols, ignoring any meaning attached to them. This view nicely matches the web of concepts shown in figure IV.7.1.

Symbolism is a first solution to the so-called mind-body problem. This problem is evident in the question: How can the world of meanings, thoughts, intentions and desires, i.e. the mental world, be connected to the real physical, material world as represented by our brain? The solution suggested by symbolism is that a cat in the physical world is sensed and represented by a symbol of a cat in our brain, or in a computer.

In a connection-oriented representation it is impossible to pin down the exact location of where a piece of knowledge is stored, i.e.
God is spread out and stored as a pattern of activities throughout the system, for instance a neural network. The idea explored is that mental processes are performed by simple processing units, e.g. neurons in our brain, communicating with simple signals in a highly connected network. In this model knowledge is encoded as interconnections of different strength. This representation is trained into existence, and every instance will be different. Everyone will for instance have a different interconnection pattern representing water.

Everyone knows about water and understands reasoning about this concept, and yet it remains decidedly difficult to provide a clear, undisputed, and unified view of water (or actually of any other concept under study):

- Water as a drink – water to drink, use a glass of water.
- Water leaks – transport of water, we need a piping system that sometimes needs mending.
- Water as a molecule – H2O molecule, it turns to ice in the refrigerator and can be used in cool drinks.
- Water for washing up – hot water and detergent, need to rinse at least three times.

Each of these statements refers to know-how related to water, and also to problems involving water used in a context, which assumes conceptualisation. One view of knowledge is as a web of concepts where the know-how, i.e. the usage of the knowledge, is what motivates acquiring the knowledge in the first place.

[Figure IV.7.1 Knowledge web, illustrated by a small excerpt of a representation of water. Labels: Water as a drink; Water leak; Water as a molecule; Boiling water to steam; Filling a glass; Plumbing.]

The knowledge web can be used to illustrate the workings of our memory. As we traverse the knowledge structure our short-term memory constrains the nodes we can hold in memory. Since each node only gives a small part of the picture, short-term memory can never hold a complex representation. We can solve this problem in two ways.
Either we can abstract a number of nodes into a new node, or we can traverse the knowledge web as quickly as possible and try to solve the problem at hand dynamically. Each retrieval will cost us approximately 400 ms, so we must not be in a hurry when we use the latter mechanism [WK].

Sometimes we need to organize knowledge in more complex units that represent a situation, or an object, in a domain. One example is that we would like to represent the general concept of a teacher. For this we use a schema, also referred to as a frame, which is a structure collecting all general properties of an object or an event, see figure IV.7.2. It is an abstraction that allows the formulation of more general categories, and it was used as early as in the eighteenth century by the philosopher Immanuel Kant.

"The simplification of anything is always sensational." (G. K. Chesterton)

[Figure IV.7.2 Generalisation using a schema. Labels: Dog; Eats all the time; Four legs; Barks.]

External representations will increasingly complement our own memory and what we have learnt. Representations such as books, paintings, databases, and personal digital assistants will help us to overcome our memory shortage. But, as usual, getting rid of one problem creates a new one. We now have to learn to manage our external memories.

"Is there another word for synonym?" (The net)

IV.8 We think and process

There are presumably many reasons why we humans have developed thinking. Let us list some of them [SP]:

- Enhanced group living. This extra joy in life provides lots of opportunities for knowledge transfer and knowledge trade. It is for instance very useful to be able to guess the intentions of other members of your group.
- Extended use of vision. We use vision to examine our surroundings in 3D, and this gives us the possibility to organise our mental world accordingly. We gain a framework for useful reasoning and planning.
- Better use of our hands, our most versatile and important tool, the tool of tools.
- Efficient hunting, because without hunting, and preferably intelligent hunting, we would not have had the resources to expand our brain to the extent that it can be used for intelligent thinking (including hunting).

"I think, therefore I am" (R. Descartes)

"said by an intellectual who underestimates toothache" (Milan Kundera)

Maybe you can think of more reasons?

Human thinking is slow, and the mental models are unstable and incomplete. Also, unscientific, intuitive models and beliefs guide behaviour. Despite this we are able to quickly solve many quite complicated problems. The quote from Donald A. Norman below clearly describes the main difference between computer-based technology processing and human thinking.

The computer can perform the same computational task a million times without making one single mistake. A human can perform it ten times and make three different mistakes. In controlled environments, with a suitable problem, a cluster of computers working in parallel can compute at almost any speed, impossible for humans to match. Sealing bottles is a task that computers do at breathtaking speeds. But if we slacken the control, let us say that the bottles arrive at stochastic intervals, the computer will fail. A human will continue sealing a bottle every five seconds while the computer smashes bottles into smithereens. For other tasks the computerised thing is surprisingly sluggish. Humans solve simple survival tasks in the jungle in real time (proven ability). A computer-based thing of today cannot manage! Why is that?

"Human thought and its close relatives, problem solving and planning seem more rooted in past experience than in logical deduction. Mental life is not neat and orderly… Human thought is not like logic; it is fundamentally different in kind and spirit. The difference is neither worse nor better.
But it is the difference that leads to creative discovery and great robustness of behaviour” Donald A Norman [DN]

Moore's law is another distinguishing feature of the thing. Every eighteen months the computer performance has doubled as measured in MIPS (millions of instructions executed per second). This will probably not last for more than ten years from now with the current technology, but at that stage we can use parallel computers to increase the performance. If this trend continues, one prediction is that a computer matching the human brain in performance will be available in 2020 [HM]. After another 50 years we will have a chip the size of a sugar cube that stores the equivalent of 10,000 human brains, and with the power of a million Pentiums!

There are also other hardware anomalies. Soon a CPU and a camera will be inexpensive enough to allow millions of them to be used in everyday life. There are already cameras attached to glasses that constantly take pictures of everything the wearer looks at. Combined with global, ubiquitous, fine-grained networking, such new technology gives enormous new possibilities that are difficult to foresee, especially if it is inexpensive.

The term process covers a lot. Thinking, preparing food, building a car, running a computer program, reading a book, or editing a text are all processes. Computing is a general term for a machine executing software, i.e. performing operations on the content of its memory.

There are several levels at which information processing is active [DM]. At the highest level it is concerned with the question of what is done, and why, i.e. the intentional level. This is the processing of ideas. While reading that last statement you actually performed information processing. Thinking about why you read it is also information processing, and clearly at the intentional level. Maybe you want to understand the meaning of information processing?
At the next level we specify the idea of how to manipulate a representation of information, perhaps using an algorithm. This is what we have previously in this book called the conceptual level. Reading is one way to learn about information processing, and at the conceptual level we can describe it as sequentially processing symbols, one by one. At the lowest, physical level, we are concerned with how to implement the algorithm, and how to access the representations. Reading can be done using a textbook, feverishly scanning each line, frenetically turning pages, more, more, more.

What problems need faster computers? Even more complicated word processors?
How can information or ideas reason? How can information or ideas perform actions?

IV.8.1 Situated action

One view of thinking is as manipulating an inner model of what we are thinking about. When we think about dinner we imagine how the food will look on the plate, the ingredients needed, the cooking procedure, mmm… Daydreaming, and planning ahead, are other examples of inner world manipulations, i.e. thinking. This view is challenged, or rather complemented, by another emanating from work on situated action [LS]. According to this we use the world as its own model, and we think using this model. A blind person thinks by touching with her stick; her mind leaks out into the world using the stick as an antenna.

The web is a natural extension of the spider. Shaleph O’Neill

For situationally determined activities, such as avoiding collisions, tying shoelaces, or laying out a jigsaw puzzle, it is not necessary to think about what we are doing, we just do it. But if we realise that there is a piece missing in the puzzle we have to engage higher-level processes. We have to make inferences and consider the alternatives [DK1]. Maybe the dog took it?
We can exemplify some different ways of thinking by playing the game of Tetris, where a block is rotated and placed to level a landscape of previously placed blocks, see figure IV.8.3.

[Figure IV.8.3 Thinking with your hands when playing Tetris.]

To gaze is to think. Salvador Dali

In the Cartesian tradition meaning occurs because of structures of the mind, not experience; because of language (the general language system), not parole (the speech act or interaction) [MC3]

A beginner consciously rotates the block and finds a match for it. This is very time consuming, and time is always short, also in Tetris. To simplify the procedure, the player can first rotate the block using the control buttons, and then do the mental matching. She is now thinking with her hands! An even more advanced user will skip the rotation step altogether, and directly match the block to possible openings; she has a mental map of where a block of a certain kind matches. The figure indicates three possible matches, which an experienced user easily identifies. To the skilled user the choice is also a tactical one. Which of the alternative matches will give the best long-term result? Furthermore, there is a context to consider: the next block can be made visible and might influence the choice.

Is it possible to learn how to ride a bicycle using a computer simulation?

I type, I am not aware of the boundaries and interfaces between my mind, my fingers, the keys, the virtual text on the screen. M. Rettie

Completing a jigsaw puzzle is another example where thinking is aided by the physical reality. We order pieces in groups by colour or shape, to simplify matching. As in the Tetris example we can also physically rotate the pieces for matching. This action-evaluate behaviour is also termed an action loop [AC]. Situated action emphasizes that humans plan on demand, i.e.
that improvisation is at least as important as planning, and many times more efficient than trying to figure out everything in advance.

The artist Stelarc takes the consequences of situated action to the limit. He argues that we are limited by our physiology and that to evolve further, and to think deeper, we have to enhance our bodies. A third ear, other additional sensors, and new body extremities such as a third hand are some of his attempts.

The body is obsolete. We are at the end of philosophy and human physiology. Human thought recedes in the human past. Stelarc

IV.8.2 Distributed cognition

Distributed cognition provides another angle on thinking. A systems perspective is taken, where humans and tools collaborate to reach the objective, for instance to manage a ship into port. In principle a reflective system can adapt to anything by modifying itself, but in practice no technology is indefinitely malleable. The designer, the material, the original intentions with the technology, and much more constrain the possibilities to modify a system and its technology in a given situation. It is hard work writing an essay on a pocket calculator.

Figure IV.8.4 shows some of the complexity we are facing when studying and designing for distributed interaction. Participants in the system interact with other participants, with the system itself, and with context, and they also reflect and act on themselves. All of these interactions create numerous feedback loops where meaning is created and supported.

[Figure IV.8.4 Some distributed interactions in Hierarchical HITI. Interactors H, T and I within a system and its context.]

The choice of abstraction levels for describing the interaction is an important trade-off. Should for instance the same description be used at all abstraction levels for all interactors in the figure IV.8.4 above?
If not, then we need to find ways to match the descriptions (multiple concepts, different name same meaning, different frames of reference) resulting from top-down and bottom-up analysis and design. Designing a system bottom-up creates a complex model with good descriptive properties, but often with low predictive and generative power. A top-down approach could catch emergent properties, but will miss the finer details.

A related issue is the number of models used at a given abstraction level. Inventing a situated model for each here, now, me, this, and do is maybe not a good idea. But a well-developed general modeling tool such as Activity theory is on the other hand quite abstract, and needs training and experience to exercise well.

Whenever we consciously think about something we have to explicitly build, or use, a predefined model of the subject. One problem with this is that the model is a selection; we cannot include everything, and cannot consider all aspects at the same time. The selection by necessity creates a blindness to all the aspects we do not include. It seems that reflective thinking is impossible without this blindness [TWD2]. If we think about how to wash up the dishes we usually do not include our children in the thought process. This might be a serious mistake! There is a minimal chance that one of them will volunteer if we ask them. On the other hand, if we started to consider every possible escape from doing the dishes we would never get round to doing it.

Have you ever said, “Oh, I did not think about that”? Assuming that you had the chance, why did you not think about it?

I think I think, therefore I think I am. Descartes' failed attempt to discard the notion of objective reality.

IV.8.3 Trends in thinking

Two trends can be found in thinking about thinking and technology. The first is that there will be increased computerised support for thinking in the outer world. Computers can help us think.
One example is the computerised calendar that helps us to keep track of meetings and reminds us of our wedding day. The other trend is more long-term. What we are thinking about is more and more relieved from the constraints of physical reality due to the increased complexity of our society. This is a trend that started when we left the savannah. Many of us for instance think about taxation, or research.

What will happen when the two trends combine?

"Why do we need to think? Can't we just sit here and go BLBLBLBLBLB with our lips for a bit?" Douglas Adams

… 60,000 to 90,000 conscious thoughts each day, 98% same as yesterday, 2% new ways of justifying old thoughts. Thomas Enhager

IV.8.4 Artificial intelligence

Personal computers of today are deaf, dumb, and blind; even bathrooms in most airports are smarter, as they at least can sense a person using the sink. Actually the sink is also computer based, but hopefully you get the point. We are not at all surprised if the computer does not recognize us after months of daily use, or if we get the same warning message for the tenth time.

Artificial intelligence is a young science. Although it builds on philosophy and psychology it is the exploration of computer technology since 1940 that has spurred development. The frontier of what a computer can and cannot do is constantly moved. “A computer cannot play chess, at least not beat a human being”, and other similar statements, at first seemed reasonable, then questionable, and are now proven utterly wrong. Computers currently beat the chess world champion Kasparov, and that would certainly have been considered impossibly intelligent just a few years ago. We have had to redefine intelligence to keep it in human possession.

Why is it AI and not AWE?

2b or not 2b?

As a computer, I find your faith in technology amusing.
Internet

The essence of intelligence is to act appropriately when there is no simple pre-definition of the problem, or the space of states in which to search for a solution. [TWD2]

The question of what constitutes intelligence is almost as difficult as implementing it. Is an alarm clock intelligent? It fulfils its task, but is still not considered intelligent. Will a computer ever be as intelligent as a human, or even more intelligent? There are still many tasks not yet accomplished by a thing. One is the Turing test, a test of intelligence proposed by Alan Turing. Isolate a human, or a computer, in a room. Ask her/it questions, and if you cannot tell from the answers whether it is a human or a computer, it has passed the test.

It has yet to be proven that intelligence has any survival value. Arthur C Clarke

As another example, compare a cockroach to a car [AC]. A cockroach is quite good at disappearing at the right time. It can sense wind disturbance from an approaching attacker, and distinguish this from normal air movements. As it detects a danger it escapes into the closest hiding place, avoiding obstacles along the way. A car on the other hand cannot even sense another car approaching, and if it did and tried to avoid it, it certainly would end up in the ditch. Buy this new car, it has got a cockroach brain!

“The computer as intelligent is not in our future; we haven’t even achieved a Congress of intelligent agents after 200 years of trying. Instead, the computer for the twenty-first century will be the computer that stays out of your way, gets out of your desktop and into your clothing, connects you with people instead of with itself” Mark Weiser, Xerox PARC

Some researchers in AI are very optimistic about future advances, but beware, reality is much more complex than a game of chess! In fact, some say that artificial intelligence is impossible. These researchers claim that it is impossible to reproduce consciousness in a computer system.
This is because of the lack of interaction with the outside world, and the fact that a computer is not part of a communicating community of other intelligent entities of the same kind [JF][TWD2]. Researchers first have to help the thing to translate aspects of the real world into symbolic representations that a computer can use. Computer vision and speech recognition are some partial solutions. Second, they have to represent the information thus achieved such that computers can use it to reason. An interesting question here is if the Internet can substitute for the outside world and provide an interaction community for things and information. When this happens this book will surely be rewritten.

One approach to AI is the expert system. An expert system represents and uses knowledge from a limited area of experience such as diagnosis of diseases. The knowledge is stored as rules, and inferences can be made from the rules by a reasoning mechanism. Expert systems can be useful, but after more than 30 years of research they still do not solve very many real world problems. Representing the necessary knowledge in simple rules has proved very difficult. Real world knowledge is fuzzy and depends on context (and there are quite a lot of different contexts around).

We do, however, constantly improve our understanding of the properties and complexities of many of the problems to solve. This combined with increased mathematical sophistication leads to more robust methods. One example is speech recognition, where new mathematically based methods such as hidden Markov models (HMM), and large databases, have greatly improved usefulness.

The question of whether computers can think is like the question of whether submarines can swim. E. W. Dijkstra

Why can’t a goldfish long for its mother? Longing for one’s mother involves at least: (i) knowing one has a mother, (ii) knowing she is not present, (iii) understanding the possibility of being with her, and (iv) finding her absence unpleasant.
Aaron Sloman

I have been made by bright monkeys. What other clever little tricks will they pull on me before my time is done? Greg Bear

There is no silver bullet, only hard work!

IV.8.5 Representations for processing

How can we represent processing, i.e. planning and action? One way is to use a specific type of schema called a script, see figure IV.8.9. It orders a sequence of actions, and can be used for instance to describe how to cook pancakes.

[Figure IV.8.9 Action sequence modelled as a script. Pancakes: (1) Stir together flour, salt, and sugar. (2) Stir together beaten eggs and milk. (3) Mix dry and wet components. (4) Fry in butter.]

Goals and cause-effect relationships stored as propositions are another way to implicitly specify processing. We use some mechanism to fetch a goal and select cause-effect propositions to find subgoals. We execute the actions corresponding to the subgoals, and eventually reach the main goal.

IV.9 We remember

From the discussion above it is obvious that memory is very important. For many systems memory size and access time are scarce resources that have to be used economically.

I hear and I forget; I see and I remember; I do and I understand. Chinese proverb

Memory is essential for learning and reasoning. We will now introduce a model of memory in which a human being is supposed to have three kinds of memory: sensory registers, short-term memory and long-term storage. This is one out of many models, each with a different view on the functionality and the structure of memory.

Sensory registers are intermediate storage spaces between senses and short-term memory. They ensure that memory is associated with sound, colour, touch and smell. In such a register, visual iconic storage will be accessible, but only for a few hundred milliseconds, while auditory memory can be available for up to 20 seconds. This is fortunate since it takes some seconds to formulate and decode speech.
We would certainly have problems if we forgot the beginning of the sentence before the end arrived!

Short-term memory is small, usually holding only three to four, or at most nine, information groupings called chunks. With such a small working memory it is important that the information refresh rate is high. Saved space, i.e. less memory, has the benefit of higher refresh rate and faster access. The result is that not only do we have a small memory, it is also short! Try to remember what you thought about 10 seconds ago!

[Figure: Sensory register, short-term memory, long-term memory.]

The severe limitation of short-term memory is strange considering the total capacity of the brain. It seems to be a result of a balance that has been established throughout evolution. Using more memory would increase the possibility that the correct memory context is active. On the other hand, irrelevant memory context is then also more likely to still be active. One example can be that two conflicting goals are active at the same time.

"To know everything would be impractical; access time would be exceedingly high" From the television series ‘Mann and Machine’.

Long-term memory complements short-term memory by preserving memories for years, up to 2 billion seconds in the end. Do you remember your best Christmas present? The total amount of long-term storage, and the number of associations possible, depends on the individual and can be increased by training. Brain capacity expands as more data is entered! Compare this to a hard disk where the available number of bytes is fixed. It has been estimated that a human processes 10 terabytes of data over a lifetime, and soon we will be able to store that amount of data on a single hard disc. The soul catcher chip investigated at British Telecom aimed at doing just that, i.e. it tried to catch the soul of its user by collecting all of her experiences.

Humans have a very small short-term memory.
Furthermore, neither short-term nor long-term memory is very exact. In one experiment people were asked to draw both sides of a penny (the experiment was performed in the USA). Out of eight possible features the median number remembered was three [SP]! Try it at home (with your local currency).

General failure reading disk C. Who is this General Failure and why is he reading my disk? Internet

All knowledge, and also information about how to process information, i.e. programs, has to be physically stored and accessed. In the human brain this functionality is somehow integrated into the neural network. A thing uses other mechanisms, including RAM memory and hard disk drives. Some parameters influencing the choice of memory type are price, access time, capacity and power consumption. Examples are CPU registers with an access time of 10 ns, a 256 Mbyte main memory RAM with an access time of 100 ns, and a 50 Gbyte hard disk in a PC, with an access time of less than 10 ms. The 10 ns CPU register access time should be compared to the 70 µs reported for raised human brain activity, and the 100 ms for human conscious reactions, almost an eternity.

The reason for the name Random Access Memory (RAM) is that any (randomly chosen) memory cell has the same access time. This is not true for a CD-ROM, where access time depends on where data is placed on the disc. Why a hard disk is called a hard disk is a mystery to us. Ever seen a soft disc? Or a wet disc?

System memory and its backup.

For each MIPS of increase in computer performance there is an extra 1 Mbyte of memory and 1 Mbit/s of extra I/O capacity needed. Amdahl's rule (MIPS = Million Instructions Per Second)

How data is structured in memory is also important for fast access, and this structure is usually application dependent. There are however two general properties we can use to enhance memory performance. The first is the principle of time locality. An item just referenced will tend to be referenced soon again.
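The principle of time locality can be illustrated with a small sketch. It assumes a hypothetical `LRUCache` class and a made-up `slow_read` function standing in for slow memory; neither comes from the book, and a real CPU cache works in hardware on blocks of addresses rather than single items. The point is only that a small, fast store pays off when the same items are referenced again soon.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache (illustrative only, not from the book)."""

    def __init__(self, capacity, slow_read):
        self.capacity = capacity
        self.slow_read = slow_read      # assumed stand-in for slow memory
        self.items = OrderedDict()      # order of keys tracks recency of use
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.items:
            self.items.move_to_end(address)   # mark as most recently used
            self.hits += 1
            return self.items[address]
        self.misses += 1
        value = self.slow_read(address)       # fetch from the slow memory
        self.items[address] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)    # evict the least recently used
        return value


def hit_rate(accesses, capacity=2):
    """Fraction of accesses served from the cache."""
    cache = LRUCache(capacity, slow_read=lambda address: address * 10)
    for address in accesses:
        cache.read(address)
    return cache.hits / len(accesses)


local_rate = hit_rate([1, 1, 1, 2, 2, 1, 2, 2])       # items are reused soon
scattered_rate = hit_rate([1, 2, 3, 4, 5, 6, 7, 8])   # nothing is reused
print(local_rate, scattered_rate)
```

With the reused access pattern most reads are served from the small cache, while the scattered pattern never benefits from it.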
The second principle is the principle of spatial locality. Items with addresses close to the item just addressed will also probably be referenced soon. With this in mind, and to get the most out of system memory per Euro, a hierarchical structure is used, where data just used, and data with high probability to be referenced soon, i.e. data close to the data just used, is kept close to the CPU. Very fast memory, called the CPU cache, is placed closer to the CPU and is updated from slower, cheaper memory. The idea of a cache is reused also for web pages on the Internet.

[Figure IV.9.1 Tradeoff between price and performance for memory. Levels: CPU, CPU cache, system RAM, hard disc, backup storage. Axis: access time.]

Remembering something should be seen as an association of events with memories. Along with the thinking of connectionism, see Chapter IV.7.1, and considering that the brain is a neural network, we should not think of memories as something tucked away in files and ordered according to some logical pre-defined scheme. Rather, memories are retrieved after an activation of the neural connections by external and internal inputs (mental state and situation). As a result structures in the network of the brain are highlighted. This will be done differently for every person, at every time, in every situation, and it is a dynamic process. We all build very personal networks as life passes by. Our common heritage, and the fact that we share many experiences, means that our neural networks are similar, but they are never the same. If by a strange coincidence we had the same structure of connections we would still assign different weights to them.

On a clear disc you can seek forever. Peter J Denning

IV.10 We attend to it

Attention is of vital importance whenever a human is involved in an interaction.
This means that how to attain attention is something that must be studied, which of course has been done in depth for public relations and political propaganda.

Any interactor faced with reality, and not prepared for the shock, will be overwhelmed by information. Reality continuously presents us with parallel events, audible, visual, and tactile, all around us. One way to manage is to consider only parts of the information, i.e. to focus, and make use of attention, as humans also do.

On a high level human attention is determined by self relevance (needs, goals), pleasantness of stimuli (music, humour), emotions, and ease of processing. On a lower, functional level, figure IV.10.1 illustrates how we keep focused on the task at hand.

[Figure IV.10.1 Cognitive architecture for attention. Long-term memory and sensory input feed a short-term memory buffer (cache); Attention(t) is directed by the current objective.]

Sensory input and memories are collected in a short-term access buffer (cache). The buffer is not very large, which means that new, more interesting sensory input has to flush it. Results from internal mental processes that need other sensory input, or memories, also flush the buffer, as do unwanted external distractions. The system is highly time dependent, because objectives and sensory inputs change, and long-term memory evolves. Attention is also easily disrupted by stress from noise, light, anger, or lack of sleep. Since people cannot be redesigned we have to make sure that the systems we design take our limitations into consideration.

Since attention directs the resources available to the most important issue, it is necessary for successful interactions in physical reality. A computer-based solution to the same problem is to assign priorities to internal processes and make sure that the process with the highest priority gets access to the resources, such as CPU and memory. A system with a few simple tasks can manage with fixed priorities.
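A fixed-priority scheme of this kind can be sketched in a few lines. The function name `run_by_priority`, the task names, and the convention that a lower number means a more urgent task are assumptions made for illustration; a real operating system scheduler also has to deal with preemption and timing, which this sketch ignores.

```python
import heapq


def run_by_priority(tasks):
    """Return task names in the order a fixed-priority scheduler runs them.

    `tasks` is a list of (priority, name) pairs; a lower number is more
    urgent. Names and priorities are made up for illustration.
    """
    # The index breaks ties, so tasks with equal priority keep arrival order.
    queue = [(priority, index, name)
             for index, (priority, name) in enumerate(tasks)]
    heapq.heapify(queue)
    order = []
    while queue:
        _, _, name = heapq.heappop(queue)   # most urgent task gets the CPU
        order.append(name)
    return order


order = run_by_priority([
    (2, "update display"),
    (0, "handle alarm"),
    (1, "read sensor"),
    (2, "write log"),
])
print(order)
```

The priorities here never change once assigned, which is what makes the scheme simple and also what limits it.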
Systems working in a more complex environment need algorithms to change the priorities.

Read the following sentences while at the same time saying out loud “7, 5, 2, 3, 10, 6, 1, 4, 9, 8” “10, 9, 8, 7, 6, 5, 4, 3, 2, 1”. Which was easier to say? The Stroop effect

Definition: Attention is the application of the mind to any object of sense, representation, or thought (just as you thought).

Attention is a finite resource and it needs stimulation to set off, i.e. an alarm system. We humans have many innate mechanisms for this. Having something too hot to drink, and movements in the periphery, trigger such behaviour. Other more sophisticated alarm systems need to be trained in a social environment. Most of us for instance learn to sense a changed level of tension at a meeting. In general what sets us off is a change of a pattern or state.

Experiments show that airplane pilots using a heads-up display when landing easily miss another airplane blocking the runway (landings were simulated). Another example is that using a mobile phone severely reduces the attention spent on driving a car, since hearing and vision fight for the attention.

Computer systems have an interrupt mechanism whereby a lower level system, such as a mouse driver, can alert higher-level software. In the case of a mouse a function is triggered that updates the pointer on the screen.

In the discussion above we ignored some rather important questions. One is how different senses compete for attention. If we quickly skip that one and look only at the visual system, we have some questions for that too. What information is important enough to capture attention? How does the visual system know when to attend to a specific event, and when to shift attention to another one? How does it do this efficiently? The same problems are now facing engineers and scientists when they want to build adaptive things.
We will come back and discuss some potential solutions in the next part of the book.

To somehow represent an inner model of the world is well-spent memory. A mental state is an internal image on a functional level, and humans and animals are currently the only interactors with such a model. If we restrict ourselves to interactors that have beliefs and fears both about their intentional states, and about the states of other interactors, we are probably left with humans only.

Only by concentrating the finite resource attention can we make things happen. [MC3]

If you believe, or know, something, you have a mental state that represents that belief or knowledge. You can for instance have a belief that there will be dinner on the table when you come home after a long day's work, perhaps candles, and some wine. Compare that to a simple goal-directed action like eating when hungry, with the food already there, in front of you on the table. In the second case there is not much of a mental model, and no intentionality is needed for eating. What you believe, when you believe that dinner is served, is however not merely a sentence; there is more information represented inside your head.

We have mental states of different kinds, such as nervousness, elation, depression, belief, desire, hope, and fear. To be in a mental state is in other words to be disposed to behave in certain ways. For interaction some mental states, called intentional states, are especially interesting. They are directed at, or about, states of affairs in the world and many of them can be externalised by means of human language, or by other interactions [JS3]. Some intentional states are [JF]:
• Interactional: percept, information, decision, request, norm.
• Representational: belief, hypothesis.
• Conative (to try, undertake): tendency/goal, drive, claim, commitment and intention.
Intention is used here in the meaning of an act of will, and is a special case of an intentional state (easy to confuse).
• Organisational: method, task.
• Other: fear, desire, hope, dream, affect.

The different intentional states are not all independent; desire is for instance also a drive. Intentions are important because they heavily influence our behaviour and course of action through the following properties [MW3]:
• They drive reasoning and serve as goals.
• They persist; we typically do not abandon intentions without good reasons.
• They constrain; we usually do not nurture inconsistent intentions.

When an intention is selected the actor makes a commitment to it. Commitments are managed in different ways. One strategy is blind, or fanatical, commitment where the intention is maintained at all costs until it is fulfilled. This simple strategy is not the best if the environment changes frequently. A trade-off has to be found between adaptability and simplicity. So far humans have much more advanced mental states than any technology can provide.

Below our consciousness, subliminal processes are at work, taking care of matters that we do not currently care about. The task of driving a car on the highway is soon delegated to a lower level of attention. Another example is an advertisement in video (now forbidden) where a frame here and there in the original video is replaced by an advertisement. The exchange is not consciously detected, but the effect is real. We will be affected by the ad.

RIP ALWAYS COCA COLA

By the way, are you sure that this text does not contain hidden subliminal messages?

The cocktail party effect is another interesting trick played by attention. You enter a room filled with people chatting. In this noisy environment you suddenly hear your name mentioned. The voice speaking your name was not louder than the others, it is just that you have filtered out the familiar sound pattern, and directed your consciousness at the sound.
There are also other effects. The precedence effect is for instance the very convenient behaviour that the first sound that arrives gets attention, and echoes are ignored. A last example is that differences are heard rather than similarities. If you suddenly start hearing your car while driving it, something is possibly wrong. The car sounds strange.

[Figure: Dinner shopping, bear approaching, baby crying.]

IV.10.1 Reaction time and attention span

From the previous section we learned that consciousness works like a flashlight focusing on the currently most important input. According to our consciousness we react immediately to external stimuli, but this is not so! Experiments show that it takes up to a second before we express an intention, and at this time our brain has long since prepared the reaction. This evokes the question of free will. Are we really in control?

Some reactions are much faster, but they are not under our conscious control. If you put your hand on a hot plate, no second elapses before you remove it! Measurements done on the brain show raised brain activity about 70 microseconds after a visual stimulus, but almost a tenth of a second will pass before we recognise the visual object, under the best of conditions. Every action involving muscles takes time to perform, even if it is not conscious. One example is the quarter of a second it takes to move your eyes in the direction of an action. Things and information can be implemented with much faster response times.

The response time of a feedback is an important factor when performance is evaluated. Human attention sets an upper limit to this response time. A chunk of data is kept in short-term memory for no longer than 15 to 30 seconds. After that time the information once again has to be retrieved from long-term storage before we understand what is going on. The philosopher William James estimated his own attention span to approximately ten seconds. Try to measure your own!
There is more to life than increasing its speed.
Mahatma Gandhi

The bright senses, sight & hearing, make a world patent and ordered, a world of reason, fragile but lucid. The dark senses, smell & taste & touch, create a world of felt wisdom, without a plot, unarticulated, but certain.
Crowley

Attention span: The length of time an individual can focus attention on a particular object, task, or material to be learned.

For things and information the attention span can be as long as the designer finds necessary. There is, however, a cost associated with a longer span. More data will be kept ready at hand, and this demands more memory and increases processing time.

The human brain does not have an idle mode, so if a task is put on hold our brain goes off planning the next move, or wanders off to think about something completely irrelevant. A slightly too long delay will always be used somehow. Some delayed responses are particularly annoying, such as when we plan ahead and know that, if we have done something wrong, it will be a long time before we can fix the error. Using an overloaded Internet search engine will trigger this reaction. Did I give the wrong search word? This aggravation can be somewhat relieved by feedback, for example an indicator showing the time remaining before the answer arrives.

Moderate disturbance (e.g. a quiet radio) and the presence of other people can help sustain the level of attention. This is one of the reasons why some students say they can study better if they are playing music at the same time; if it is too quiet, performance suffers.

What delays we accept depends on our expectations. When pressing the stop button on a video recorder it should stop within a second or so (within 2 seconds according to [BS1]). If it does not stop we become impatient and press the button again. Still no reaction, and we pull the plug (at least if we work with computers and are used to their behaviour).
Another example is that if we press a light button in a virtual environment we expect the light to go out immediately, or at least within a tenth of a second.

The positive effects of fast feedback, i.e. short delays, include:
• The plan for solving a problem is easily remembered.
• Distractions are ignored since the focus is always on the problem.
• Errors are rapidly handled with minimal distortion.

IV.11 We reason

What is the use of all this processing? Maybe it can help us to solve a problem? The intelligent agent, including the human, has needs, targets to strive for in its life, both in the short and in the long term. Solving problems is also fun, and a major underlying theme of this book. Problem solving is the process of accomplishing an objective through a series of not immediately evident actions. It involves an internal representation of the problem, a search through the space of possible actions, and a selection of a set of actions using principles specific to a domain. The domain specifics are what distinguish problem solving from the more general term reasoning. Both are ways of drawing conclusions by manipulating information.

Reasoning about problems is a fundamental human ability, and will be for any thing or information that has to manage in an environment that is even slightly complex, loosely specified, or changing. It can be carried out in two different ways, by deductive or by inductive reasoning.

Deductive reasoning starts from generally valid assumptions, true statements, and uses them to draw conclusions that can help us reach an objective. It works from the general to the specific, starting with an idea of a theory formulated as a set of hypotheses. If each of these is verified through observations then the theory is confirmed. We start with the hypothesis that E = mc², and try to verify it through experiments and already verified scientific facts such as F = m·a and the wave equation.
I never guess. It is a shocking habit, destructive to the logical faculty.
Sherlock Holmes

Is there a life after death?
Eternal question

Inductive reasoning instead starts from one or more observations about the world, observations that are verified only in special cases, i.e. they can be false. From the observations, patterns are sought. Just like Sherlock Holmes we use these patterns to build a hypothesis that can be tested, verified, and packaged into a new theory. One example of how humans use inductive reasoning is the following [NS]. Try to estimate whether words starting with an r are more common than words with r in the third position. The first attempt to solve this problem starts with searching your memory for examples of words of the two kinds. Since it is much easier to retrieve words starting with r, you falsely induce that there are more words starting with r.

The problem in the game of chess is how to select the next move so that it maximises the chance of winning. The search space is too big for a human to traverse all of the possibilities, so heuristics are needed to prune it. A heuristic is a rule that can be used to simplify the solution to a problem. The rule can be found from commonsense knowledge, by trial and error, or in some other way.

Figure IV.11.1 Pruning the search space. (Figure text: Eureka!)

Problems in open systems, e.g. social environments, are in general too complex to be solved in an optimal way. The solution strategy is to learn enough about the problem specifics, and about the context, to adapt an old heuristic, or invent a new one, to approximately solve the problem. It is often necessary to look at the structure of the information in the environment to select a good heuristic. The solution, in other words, will be specific to the problem rather than general purpose. Examples are the heuristics you devise to select a mate, and how you decide on the location of your new home.
One useful method for applying heuristics is means-end analysis. At each state in the search space you choose the transition that minimises the difference between the current position and the goal state. Figure IV.11.2 illustrates an example where the road to choose is determined by the shortest remaining distance to Umeå. Here it is quite simple to determine the distance to the target. In a game of chess the choice is usually not this obvious.

Figure IV.11.2 Means-end analysis. Select the path with lowest cost. (Road signs in the figure: Umeå 10 km, Umeå 12 km.)

Show that xⁿ + yⁿ = zⁿ has no solution in whole numbers, where n > 2. Fermat's last problem, jotted down in a margin that could have needed enlargin'.
T Lehrer

Potent problems: pollution, poverty, population, and political power.

When we solve algebraic problems we use another strategy. Such a problem is formulated as a set of equations, using the well known notation shown in the example below, and some prerequisites. The solution is found by substitution:

Problem: What is the value of a?
Prerequisites: e = 1, b = 3
Equations: a = b + c, c = b + e

Unknowns in the equations are eliminated, one by one, in the right order, to retrieve the value of a: first c = b + e = 3 + 1 = 4, then a = b + c = 3 + 4 = 7. We ask ourselves, what are the unknowns? What facts are given?

This type of algebraic problem can also be formulated in words: "If Tom has twice as many problems as Joe, who has half as many as Mary, who has 2, how many has Tom?" Already while reading the problem statement the solution is planned. We ask ourselves what the unknowns are, and what data is given [GP]. "Tom has twice as many as Joe" means that if we know how many Joe has, we know the answer (Tom = 2 * Joe). Keeping this in mind we go to the next statement, "Joe has half as many as Mary". Now we know the answer if we know how many Mary has (Joe = Mary / 2). The last statement, "Mary has 2", gives us the last clue and we can now backtrack to the solution: Joe = 2 / 2 = 1, Tom = 2 * 1 = 2.
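The substitution strategy can be sketched in a few lines of Python. This is only an illustration: the function name solve_by_substitution and the tuple encoding of the equations are assumptions made here, not notation from the book. The sketch works forward from the given facts, resolving each unknown as soon as its operands are known, whereas the text plans backward from the wanted unknown; both arrive at the same values.

```python
def solve_by_substitution(equations, known):
    """Resolve unknowns one by one, in whatever order the given facts allow.

    equations maps each unknown to (operator, left operand, right operand);
    an operand is either a variable name (str) or a plain number.
    """
    ops = {
        "+": lambda x, y: x + y,
        "*": lambda x, y: x * y,
        "/": lambda x, y: x / y,
    }
    values = dict(known)      # facts that are given
    pending = dict(equations)  # unknowns still to be eliminated
    while pending:
        progress = False
        for name, (op, left, right) in list(pending.items()):
            # Replace variable names by their values where these are known.
            operands = [values.get(t, t) if isinstance(t, str) else t
                        for t in (left, right)]
            if all(not isinstance(v, str) for v in operands):
                values[name] = ops[op](*operands)
                del pending[name]
                progress = True
        if not progress:
            # Some prerequisite is missing, so the problem cannot be solved.
            raise ValueError(f"missing prerequisites for: {sorted(pending)}")
    return values


# The algebraic example: a = b + c, c = b + e, with e = 1 and b = 3.
algebra = solve_by_substitution(
    {"a": ("+", "b", "c"), "c": ("+", "b", "e")},
    {"e": 1, "b": 3},
)

# The word problem: Tom = 2 * Joe, Joe = Mary / 2, Mary = 2.
problems = solve_by_substitution(
    {"Tom": ("*", 2, "Joe"), "Joe": ("/", "Mary", 2)},
    {"Mary": 2},
)
```

If a prerequisite is left out, the loop stops making progress and reports which unknowns could not be eliminated, which mirrors the question of what to do when necessary facts are missing.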
Elementary, my dear Watson.
Not said by Sherlock Holmes in any of the books

A different kind of reasoning is case-based reasoning (CBR). It is not directly built on logic, as the mechanisms above are, but exploits the fact that we (and machines) already know patterns that can be reused by analogy. If you know how to make pancakes you are well placed to use the frying pan for other dishes. This reasoning involves generalising a particular solution to another case. If, for instance, you know how an if statement works in Java you will have no problem using similar constructs in other programming languages. CBR means extracting information about a specific situation and storing it. Next, the aspects of the new situation that can be used to find the stored knowledge have to be identified. How we represent knowledge is important. The last step of the procedure is to adapt the retrieved knowledge to the new situation.

What do you do if some necessary prerequisites are missing? In a more complicated example an expert (you of course) will find out that something is missing, but the novice might not. The expert will also better determine what missing information to start searching for, and will know how to obtain it. Expert problem solvers in a discipline also learn how to recognise patterns in problems, patterns that they can use for selecting the next step in the solution process. A good programmer will recognise structures in a problem and use them to delegate parts of the solution to separate functions. A parent will know the significance of slightly different screams from the baby. A crime investigator such as Mr Holmes ignores the right irrelevant details.

Most real world problems are characterised by incomplete information about system variables. Also, the resolution of known variables can be low, i.e. variables change value too fast, or might even be tampered with by evil forces.
This leads to contingency problems, where solving the problem requires feedback while the solution is being sought. The agent must continuously explore the problem state space, either directly through experiments or indirectly by simulation. Typically, real world problems also involve a lot of states. This means that computational complexity and memory consumption are important considerations, and achieving real time performance is a problem. There are two strategies available here. Either we use a general solution, or we create a specialised solution to the problem at hand. The general solution is flexible and can be reused on more problems, but consumes more memory and computational resources. The specialised solution is efficient, but since it is not flexible we are forced to develop a new solution for each new problem, and development itself takes time and consumes resources.

What to have for dinner?

Morning out of the sun
A smell of toast is in the air
When there's a war to be won
The flying toasters will be there
Text from the "Flying toaster" screen saver

IV.12 We plan and search

If we are faced with a really tough problem, like preparing breakfast, we have an enormous information space of possible actions and physical constraints, which means that we must use informed search. How do you specify a heuristic function for preparing breakfast? You have several tasks to perform, sometimes in sequence and sometimes in parallel. To open the refrigerator, get the milk, put the water kettle on, and slice the bread are all relevant subgoals. So far, the only tool described in this book and available to you is to search, over and over, through all of the impossibly many actions available for each subgoal. It seems that we need a tool that can better help us structure actions and objectives. We need a plan!

(Figure labels: Action, Subgoal, Main goal.)

Napoleon used planning to make sure that his armies were used wisely. How would he have managed with search only?
He failed; was that because of bad planning? Incomplete knowledge of state space? Maybe he just had bad luck?

Given that we have an inner model where different actions are represented, and a problem to solve, we must choose between actions. We say that we have a plan if we have a representation of the goal, and a sequence of actions that when executed achieves the goal [PG]. Note that the choice implies a search through the possible sub-goals and action sequences, as indicated in Figure IV.12.1.

Think about the last time you made a plan. Assuming that you remember one, list 10 circumstances under which the plan would not work.

Figure IV.12.1 The planning process followed by execution of actions to achieve goals. (Figure labels: Goal, (1) Planning, Sub-goals, (2) Execution of actions.)

Planning is one activity that situates humans in time, the other being storytelling about past or imagined events. Hope and regret are two emotions that reflect this situatedness, and to support it we try to format and structure information such that it makes narrative sense [CH].

The human seems to be the only animal able to plan ahead. We can spend four years on an education for an exam (silly, is it not?). Or we can buy twice the amount of pasta that we need this week, when it is on sale at a lower price, because we foresee that we will need it next week. Our ability to plan is a gift, but also a curse. We have to choose between enjoying the passing day and preparing for the next. The uncertainty and anxiety that come from having to choose, and knowing that we choose, are fundamentally human. Still, despite all planning and choosing, and because of the complexity of our context, it is likely that we will discover the consequences of our actions only after performing them. Heuristics, imitation, and post-rationalisation are consequently found everywhere.

To plan you have to understand cause and effect.
This understanding is inherently connected to our notion of time, because without cause and effect time would cease to exist. Since planning is fundamental for survival, sorting events into cause and effect is also very important. We always try to find a reason for things that happen to us. If we cannot blame anyone else, we blame fate, or in more positive circumstances we credit luck. Looking for a cause is usually a wise thing to do; if you hear a bang when you are out driving, you slow down. Puncture, gearbox problem, Superman landing on the roof?

As events always bombard us it might seem that planning is a Sisyphean task, where we constantly have to revise plans, but somehow we manage to keep small continuous changes in prerequisites and context from flooding the planning process.

Planning algorithms use descriptions of states and goals in some formal language, usually first order logic. This explicit description enables programs to reason about the states and the goals. Actions are represented by logical descriptions of cause and effect, enabling the planner to relate states and actions. With this arsenal of descriptions the planner can, for instance, use the idea of divide and conquer and attack independent subgoals separately.

"Baldrick, you wouldn't recognise a subtle plan if it painted itself purple and danced naked on a harpsichord singing 'subtle plans are here again'."
Edmund Blackadder, 'BA IV'

I choose therefore I am

"Jag har en plan" ("I have a plan")
Sickan

What if the plan fails? There might be assumptions made during the planning that do not stand the test of reality. Perhaps the context changes or is misinterpreted. This is the contingency problem applied to planning. The planner has to make a trade-off between adding a lot of information about the world in advance and using sensory information to detect when the plan does not work. At this point we want to mention two other related fundamental problems facing artificial intelligence.
The first problem is called the qualification problem and is the problem of defining the circumstances under which a given action is guaranteed to work. There are many possible reasons that could stop you from going to work in the morning. The bus could be blown to pieces by a bomb, a snowstorm could barricade your door, your alarm could fail, or all of the above could happen on the same morning. The problem is to qualify just enough conditions to see if something can be done, a task well performed by our common sense.

The second, related, problem is the frame problem, which is closely related to the ramification problem. It concerns the fact that we need an infinite amount of data to exactly describe the currently relevant aspects of reality and the implicit consequences of actions. When you creep out of bed in the morning, thousands of small creatures in your bed (about 10,000 of them) will get cold, with consequences you never think about. Even a seemingly simple problem such as preparing breakfast involves a surprising amount of knowledge. How come we know that the butter stays on the knife, but milk does not? That milk stays in the glass? That we cannot hold the glass of milk and the sandwich in the same hand? If we took into consideration everything that could possibly have an effect on our early meal we would die from starvation [DD]. We would, by the way, also die if we tried to figure out everything that does not affect our breakfast.

Every square inch of the human body supports on average 32 million bacteria.

To make things even worse, reality is dynamic and often difficult to predict. There is a story from chaos theory about a butterfly in Paris creating a storm in New York. This reflects the fact that any small physical effect can, under the right circumstances, be magnified totally out of proportion and ruin several well-planned picnics in New York.

When solving a problem, search is one way to find a sequence of actions that leads to the goal, i.e. to find a solution to the problem. Search is an extremely useful concept not just for solving problems, but also in general for all sorts of information retrieval. It traverses the branches of the state space tree from the root node (root state). If only two branches are allowed at each branching point, the tree is called a binary tree. If you traverse the tree and always select the next deeper node at each branch you do what is called a depth first search. If, on the other hand, you complete each level of the tree before starting the next level you do a breadth first search. If the only knowledge you have is about which nodes are direct descendants of a node, you have to use blind search. A better strategy is, if possible, to always continue from the state with the currently lowest cost. This is called best first search.

In chess, the number of possible move sequences is estimated to be 10^120 [NS]. Search strategies for a problem with such a (realistic) search space have to compromise on optimality, and as a result the optimal solution might be missed. Other reasons imposing trade-offs are that the time for finding a solution is limited, processing resources are limited, the operations that have to be performed while searching are very complex, or there is a shortage of available memory.

If it is impossible to visit every node in the tree, to the full depth of the tree, i.e. to do an exhaustive search, a greedy search algorithm can be used. The greediness is formulated as minimising the estimated cost to reach the goal state. If we know the cost to reach the current state, we use a heuristic function, i.e. a rule of thumb, to approximate the remaining cost to reach the goal. Next we pursue the path with the least total cost in a best first search.

Rheumatic pain is associated with changes in weather. (Hmmm?)
Categorisation is made on the basis of similarity between instance and category members. (Huh?)
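The depth first, breadth first and best first strategies described in this section can be tried out on a small state space. The graph, its step costs, the remaining-cost estimates and the function names below are all invented for illustration. The best first variant ranks states by the cost so far plus the estimated remaining cost, a combination commonly known as A* search.

```python
import heapq
from collections import deque

# Hypothetical state space: each state maps to (successor, step cost) pairs.
GRAPH = {
    "A": [("B", 1), ("C", 4)],
    "B": [("D", 5), ("E", 1)],
    "C": [("G", 1)],
    "D": [("G", 1)],
    "E": [("G", 5)],
    "G": [],
}
# Hypothetical rule-of-thumb estimates of the remaining cost to reach G.
ESTIMATE = {"A": 4, "B": 4, "C": 1, "D": 1, "E": 4, "G": 0}


def blind_search(start, goal, depth_first):
    """Return the order in which states are visited until the goal is found."""
    frontier = deque([start])
    visited = []
    while frontier:
        # Depth first takes the newest state, breadth first the oldest.
        state = frontier.pop() if depth_first else frontier.popleft()
        if state in visited:
            continue
        visited.append(state)
        if state == goal:
            break
        successors = [s for s, _ in GRAPH[state]]
        # Reversed so that depth first explores the leftmost branch first.
        frontier.extend(reversed(successors) if depth_first else successors)
    return visited


def best_first_search(start, goal):
    """Always expand the state with the least total (known + estimated) cost."""
    frontier = [(ESTIMATE[start], 0, start, [start])]
    expanded = set()
    while frontier:
        _, cost, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, cost
        if state in expanded:
            continue
        expanded.add(state)
        for successor, step in GRAPH[state]:
            new_cost = cost + step
            total = new_cost + ESTIMATE[successor]
            heapq.heappush(frontier, (total, new_cost, successor, path + [successor]))
    return None, float("inf")


depth_order = blind_search("A", "G", depth_first=True)
breadth_order = blind_search("A", "G", depth_first=False)
best_path, best_cost = best_first_search("A", "G")
```

On this toy graph the two blind searches visit different numbers of states before reaching G, while the best first search goes straight for the cheapest route. With a poor estimate function the best first search would lose that advantage, which is why the choice of heuristic matters.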
Two events can have greater chance of co-occurring than either event by itself. (What?)

One example where heuristic search is necessary is if you want to find the research paper "How to search" in your room (let us say that you have several papers lying around in piles). You try some of the piles where such a paper could be found. If you find a paper with the title "Traversing data structures" you suspect that the paper on search is nearby, i.e. you use a heuristic and thoroughly search that pile first.

IV.13 We make decisions

Why are some plans and actions executed and others not, i.e. why do we make the decisions we do? The model shown in Figure IV.13.1 focuses on this question, dividing the mental system of a human (or any other interactor) nicely into two blocks [JF]. The first block is the motivational system, where all inputs that can have an effect, i.e. can motivate, are collected into tendencies. Since living systems cannot stop doing, motivation is more about selecting among alternatives than about motivating doing something. Motivations help to steer behaviour and increase alertness. If, for instance, you are thirsty you look for water rather than bread, and the thirstier you get the more intense your search will be (up to a point).

7.00 Monday morning, November. (surely a false alarm)

Figure IV.13.1 A model with two subsystems for managing decisions. (Figure labels: A priori goals (survival); Motivations (interpreted percepts, inter-individual and social claims, commitments, drives); Motivational system; Tendencies; Decisional system; decision uncertainty; Decisions; To representational and organisational systems (commitments, plans, standards, hypotheses and so on).)

The tendencies are input to the decisional system, which evaluates them and selects a decision. Outputs from the system are the decision and the representations of the decision, for instance the actual plan resulting from the decision, or the commitment made.
We all have lots of commitments, to our friends, to society, to the environment and so on. They reflect the fact that we plan ahead, that we can promise something about how we will behave tomorrow, and they help to stabilise the world [JF].

Which why? What to do? When? How?

The system should be thought of as a continuous interaction, a process, where feedback is extremely important. Feedback from the environment on the results of decisions is needed to evaluate decisions and can, among other things, support a model of uncertainty for a decision. Uncertainty can thus be used by the interactor to motivate behaviour, see Figure IV.13.1.

Human processing solves problems in a veritable chaos of changing information. Problems are solved with limited resources in time, processing power, knowledge, and memory. This, among other things, calls for simple heuristics to:
• Search for alternative actions, more information, or both.
• Know when to stop searching.
• Make the decision from the current situation.

One example of a heuristic is to choose an alternative that is recognised, and ignore the others. Another example is to reuse the strategy that worked the last time. Along the same line of thinking, a hypothesis that is easy to represent in memory will be favoured, as will one that we have a more detailed description of. When we are faced with several different alternatives, another simple heuristic is to select one cue out of many and choose the alternative with the highest value for this cue, ignoring information about all the other cues. Recency is another aspect that strongly influences how we interpret events; a recent occurrence is recalled more easily, and the decision made will be revised. There are also specific external events that trigger actions and decisions. One example is that if a warning signal sounds, the hypothesis is that something needs attending to. If we have decided something we tend to favour it, and in general be confident in it.
We will to some extent even ignore evidence that does not match, i.e. the first impression is lasting. This inherent overconfidence in our own beliefs also affects the effort that we invest in evaluating our beliefs. We rather search for evidence that confirms a belief than for evidence that does not. We are a positive thinking breed and should consciously take this built-in behaviour into consideration when making choices.

Humans are also somewhat irrational when evaluating probabilities. We tend to overestimate the probability of an event with a quite low probability (less than approx. 10%), as seen in Figure IV.13.2. We also tend to underestimate the probability for the rest of the interval. One effect of this is that we behave differently when facing a possible loss or gain. Suppose there is a choice between a certain win of $1 and a 50-50 chance of winning $2 (or nothing at all). According to the curve we will underestimate the 50-50 probability and choose the certain option. If we reformulate this and say that we face a certain loss of $1, or a 50-50 chance of losing $2, we will tend to underestimate the risk and go for the 50-50 chance [CW].

Figure IV.13.2 Humans' subjective probability underestimates high probabilities. (Axes: stated probability against subjective probability, each from low to high; curves for perfect behaviour and typical behaviour.)

Decision-making can be improved by learning and training. Accounting and playing chess are two examples. In these domains feedback is available and there are rules of behaviour. Stockbrokers face quite a different problem with low predictability, and with ambiguous and delayed feedback. Learning also helps to find alternatives when accepted decisions and current behaviours do not work. One example is that a child learns to walk instead of crawling. Another way to improve decisions is to work out procedures, routines to follow given certain conditions.
Most decisions that we make repeatedly, such as weekly shopping, or driving to work every day, are performed (and optimised) as procedures. A third way to improve decisions is to automate them, i.e. let the computer make the decisions. Computers do not bias facts, or underestimate probabilities (when programmed correctly).

"In retrospect it becomes clear that hindsight is definitely overrated!"
Alfred E. Neuman

Consider that the world is trying to tell us something, if only we know how to listen. Sensing can thus be viewed as a form of communication in which information flows from the environment to the attending agent. [RA]

Whenever a selection does not give the wanted result another course of action must be chosen. One example of a strategy here is trial and error. Trial and error does not always work well, however, for instance when meeting a tiger in the jungle. This is one example where emotions can support the model in Figure IV.13.1. An emotion, here fear, can modulate motivation by weighing in the uncertainty of the situation. The little one knows that the jungle is a dangerous place, and when facing the tiger understands that uncertainty is high; flight rather than fight seems to be the best option.

What we have learnt enables us to choose the right sequence of decisions for a given situation. We usually cannot foresee all the particularities of a situation, so we learn patterns and adapt them to the situation at hand. Figure IV.13.3 shows how knowledge about decisions can be represented in a data structure to solve a problem, in this case as decision trees for two different situations.
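One possible way to hold such decision knowledge in a data structure is sketched below in Python. The Decision class, the decide function, and the example questions are hypothetical, chosen only to echo the tiger example; the trees in the figure may well be organised differently.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """A node in a decision tree: a question and one branch per answer."""
    question: str
    # Maps an answer to either another Decision or a final outcome (str).
    branches: dict = field(default_factory=dict)


def decide(node, situation):
    """Follow the branches picked out by the situation.

    Returns the sequence of (question, answer) pairs and the final outcome.
    """
    sequence = []
    while isinstance(node, Decision):
        answer = situation[node.question]
        sequence.append((node.question, answer))
        node = node.branches[answer]
    return sequence, node


# Hypothetical learnt pattern for walking in the jungle.
jungle = Decision(
    "threat present?",
    {
        "yes": Decision("uncertainty high?", {"yes": "flee", "no": "fight"}),
        "no": "carry on",
    },
)

# Two situations give two different sequences of decisions.
situation_1 = {"threat present?": "yes", "uncertainty high?": "yes"}
situation_2 = {"threat present?": "no"}

sequence_1, outcome_1 = decide(jungle, situation_1)
sequence_2, outcome_2 = decide(jungle, situation_2)
```

The same tree yields a two-step sequence ending in flight for the first situation and a one-step sequence for the second, so the sequence of decisions depends on the situation even though the stored knowledge is the same.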
"The task of rational decision is to select that one of the strategies which is followed by the preferred set of consequences."
Herbert Simon

Context dependency means that the same sequence might not work in another situation, and keeping track of all of the situations where a decision works, or not, is itself a problem.

Figure IV.13.3 Two different situations demand two different sequences of decisions. (Figure labels: Situation 1, Situation 2, Solution.)

A figure similar to IV.13.3 can be drawn where two different sequences of actions lead to the same goal. In this case the interactor needs to select one path, preferably the best one.

Making a decision is easier said than done. Complete knowledge is rare, facts are uncertain, and only some of all the relevant strategies are possible to evaluate. Facts can also depend on each other such that conditional probabilities are important.

IV.14 We learn and adapt

Humans are surprisingly adaptive. While interacting we ignore all sorts of irrelevant information, errors, and inconsistent behaviour. We adapt to odd habits and foreign cultures, we overcome differences in age, knowledge levels, and language. As teachers we adjust the presentation level of knowledge so that it matches the knowledge level of the students. If the first explanation does not trigger a spark of understanding then we try another angle, perhaps offering specific examples instead of describing a general principle.

A fundamental prerequisite for adaptation is our ability to learn, see also Chapter III.1.4. In the definition by Maturana, learning and interaction are intimately related:

Learning is not a process of accumulation of representations of the environment; it is a continuous process of transformations of the behaviour through continuous change in capacity of the nervous system to synthesise it.
Maturana, Biology of cognition 1970

We for instance do not learn about hammering from some abstract mathematical model, with forces and angles, nor do we store the idea of hammering as a symbol. We learn it by engraving the actions of hammering into the nervous system.

Figure IV.14.1 Learning as the process of engraving behaviour into the nervous system. (You also have these interconnected little circular shaped things, don't you?)

To use knowledge that we have, we must be able to recall what we have learned when it is needed. This is, however, a re-production of behaviour rather than merely a retrieval of some previously stored data structure. Mother Nature is greedy by nature and dislikes random knowledge. Whenever knowledge is stored it is because it has been used for a purpose. Therefore knowledge can also be seen as part of a problem solving process. Finding your way to work is one example. You can take hundreds of routes, but there is one that is short and easy. This is the one that you have stored in your memory, and can follow without difficulty, even without conscious effort, every day. Try to think of some knowledge you have that is not problem related!

A little knowledge that acts is worth infinitely more than much knowledge which is idle.
Kahlil Gibran

Why is learning important? Most of all because adaptation by learning is much faster than adaptation by evolution, which takes generations to refine randomly generated genetic variations. Learning combined with planning and reasoning is a powerful tool for survival. The development of humanity can be seen as a journey along a path of knowledge [AC]. What we do when we learn in school and from life is to follow our path of knowledge.

Figure IV.14.1 Knowledge is gained through interactions.

Science adds to the path by researching new pieces of knowledge from an enormous search space. We add layer after layer of knowledge to our collective fortune. This process is dynamic and there is no guarantee that knowledge will survive; it is something that humanity will have to constantly fight for.
Some ideas added can only be understood once others are already assimilated. It is, for instance, difficult to understand multiplication without having mastered addition.

Nature is seen by men through a screen composed of beliefs, knowledge, and purposes, and it is in terms of their cultural images of nature, rather than in terms of the actual structure of nature, that men act.
Rappaport

However, learning does not come for free. It takes time and other resources, and during the learning period the interactor, or even the society, is vulnerable. Are we learning the right things, and adapting the right way, to the changed circumstances?

Delayed reproduction time (approx. 20 yrs)!

IV.14.1 Taxonomy for learning

A group of psychologists led by Benjamin Bloom developed a taxonomy for learning behaviour. They identified three overlapping domains for learning: affective, psychomotor, and cognitive learning. Affective learning relates to how we learn emotions, attitudes and values. The fact that we learn them is shown by how we behave when we grow up. We learn how to show attention, concern, and interest, and how to act responsibly. Psychomotor learning is about learning more basic motor behaviour such as coordination, fine motor skills, dance, athletics, and how to make facial expressions.

Cognitive learning in Bloom's taxonomy is what we typically do in school, and it takes place at the following six knowledge levels. With each level the understanding of what is learnt is increased.

1. Knowledge: define, list, recognise, repeat, facts.
2. Comprehend: classify, describe, translate, facts and their relationships.
3. Application: apply, interpret, operate, sketch.
4. Analyse: calculate, compare, criticise, examine, experiment, test.
5. Synthesise: construct, create, design, develop, organise.
6. Evaluate: defend, estimate, judge, predict, rate, evaluate, argue.
This list is also interesting because with each higher level it becomes more difficult to describe how humans manage to perform the tasks.

Learning about learning is by necessity a multidisciplinary effort. Pedagogy, psychology, computer science, linguistics, philosophy, and neuroscience are all interested in this phenomenon. Pedagogy studies learning from a phenomenological standpoint, and also studies the complementary activity, teaching. Typical questions are "Is this a good way to teach this subject?", "Will this situation enhance learning?", and "If I formulate the knowledge in this way, will understanding be enhanced?".

IV.14.2 How do we build knowledge?

Knowledge is the result of interaction! There would not even be any knowledge without interaction, i.e. without confrontations with the world, and especially with other interactors. Try to find some knowledge that you have that was not obtained through interaction! In other words, it is extremely important for people to interact with other people through discussions, books, or other means in order to achieve knowledge and know-how.

“Knowledge is above all the fruit of interactions between cognitive agents who or which, acquire it through a process of confrontation, bijections, proofs and refutations.” Lakatos

Learning and training can be achieved by different behavioural mechanisms:

• Habituation is learning where we become accustomed to the input. In the case of the sensory system, the input signal is ignored after a while!
• Conditioning, where good behaviour is encouraged and bad behaviour punished, e.g. "say ma-ma".
• Copying, where successful examples are followed.
• Trial-and-error, just do it. If you do not know how to accomplish a task you just perform any more or less random action, and hope that it will reduce the distance to the goal state.
• Playing, a variation of the trial-and-error method that is sometimes driven by curiosity and encourages creative thinking.
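The trial-and-error mechanism in the list above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not a model of human learning: the world is a hypothetical one-dimensional line, the goal state is a number, and the names `trial_and_error`, `goal`, and `step` are invented for the example. The learner tries a more or less random action and keeps the result only when it reduces the distance to the goal state.

```python
import random


def trial_and_error(start, goal, trials=200, step=1.0, seed=0):
    """Try random actions; keep an action only if it brings us closer to the goal.

    The one-dimensional world and all names are illustrative assumptions.
    """
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    position = start
    history = [abs(goal - position)]   # distance to the goal after each trial
    for _ in range(trials):
        action = rng.uniform(-step, step)       # a more or less random action
        candidate = position + action
        if abs(goal - candidate) < abs(goal - position):
            position = candidate                # success: keep the new state
        history.append(abs(goal - position))    # on failure the state is unchanged
    return position, history


if __name__ == "__main__":
    final, history = trial_and_error(start=0.0, goal=10.0)
    print(f"distance at start: {history[0]:.2f}, at end: {history[-1]:.2f}")
```

Because failed trials are simply discarded, the distance to the goal never increases. The price is wasted effort: roughly half of the random actions point the wrong way, which is one reason why copying a successful example is often cheaper than trying everything yourself.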
Playing is a good way of elaborating knowledge, and results in well-formed behaviour. It is also an excellent method for collecting information for later planning. Knowledge elaboration of various kinds, such as playing, helps us to remember, to understand, and to acquire new skills.

Because we tend to forget things, we need to rehearse as well as to learn. This is a hopeless situation in the long run for a single human who wants to learn as much as possible, even if we constantly add new supportive technology, for instance electronic calendars. Limitations in human attention also affect learning; to learn we have to pay attention. It is for instance difficult to learn how to play the trumpet and to study history at the same time.

ODEUR For men by hg

Ever tried? Ever failed? No matter! Try again, fail again, fail better! Samuel Beckett

"Ah, Percy. The eyes are open, the mouth moves, but Mr. Brain has long since departed." Edmund Blackadder, 'BA II'

From this it follows that we need to be free from the everyday survival tasks in order to have time to pay attention. Another conclusion is that specialisation is advantageous since it gives adepts more time to learn (some important knowledge like this). This seems to suggest a world of specialists, but there is a major problem with this prophecy. A specialised language is necessary among the specialists of narrow disciplines, but creativity prospers through the cross-fertilisation that occurs when disciplines meet. Cross-fertilisation implies overlapping knowledge and a common language, which means that experts cannot become too specialised. Without creativity, adaptability will suffer, and adaptability is something that will certainly be needed even in a society of experts.

Learning is creative work. A new knowledge pattern must be imprinted in the brain and intertwined with previous knowledge. This process will be unique to each learner, at least for all of the type human.
Another observation is that knowledge is personal, and something that everyone has to build herself. Knowledge is learnt in a personal context, and it can be difficult to understand if we do not have access to the context in which it was generated. Is it possible to understand how to make a snowball if you have never seen snow? What does the expression green fingers mean to an Eskimo? Think about explaining to a bushman, coming directly out of the desert, how to change the language for spell checking in Word®. If you had the ultimate wisdom it would probably be so dependent on your own knowledge that it would be completely useless to anyone else (what a shame).

Three types of knowledge: What you know you know. What you know you don't know. What you don't know you don't know. Larry Marine

To be an expert is to not know that you know what you know and to know what you do not know! HG (classic)

As you gain more and more knowledge on a topic, the knowledge will be increasingly integrated into your thinking, and you will have more and more of a problem describing your knowledge, or even knowing that you have it. Examples of this are numerous: various crafts, carpentry, knitting, and green fingers. Try to describe how you walk. Some years ago walking was, for a while, quite a problem for you.

Learning by doing or, as in science, learning by induction from a number of examples, generates knowledge that is embedded within the system and cannot be reconstructed by the system itself. This knowledge is called tacit knowledge. Animals have the same ability, but probably do not reflect on it. Things, on the other hand, can only do things that can be formalised, i.e. described.

With enough knowledge you are ranked as an expert and spend your time in a narrow knowledge area. The more of an expert you are, the narrower the area. The more expert you become, the more you know about related knowledge that you do not have.
More on how to acquire ignorance is accumulated later in this book; in fact the whole book is about this accumulation.

IV.14.3 Knowledge management

Interaction is by itself not enough to generate knowledge and to use it efficiently. It is necessary, but not sufficient, as mathematicians say. There must be some mechanism to reduce the amount of information and order it for efficient access. If we absorbed everything, we would drown in information that would be impossible to access again because it would be so voluminous and without structure.

Luckily, there are at least two remedies for this problem. The first is to throw away everything that is not interesting enough. The second remedy is to match new information against what is already stored and only consider differences, enhancements, and abstractions.

It is illuminating to compare our own abilities to how images are stored and accessed on the Internet. How do you find an image of a brown dog on the Internet? How do you recall the memory of a brown dog? What is the difference in the level of detail between your memory and the image from the Internet? We are still far better at recognition than the machines, but most of us have problems keeping track of the details.

Figure IV.14.2 When we want to characterize information, differences can be very efficient.

Abstraction gives us something more than efficient storage. If knowledge is grouped into categories we can assign properties to them and use the categories for inferences, i.e. predicting properties we have not observed. If we know that a PDA stylus is a kind of pointing device we will not be surprised if a cursor appears on the screen when we touch it with the stylus.

Categorizing and making abstractions would not be of much use if properties were sprinkled randomly in reality. How come they are not? Why are properties lumped together and assigned to local groups of objects?
The laws of nature, evolutionary pressures and adaptations to the environment are some probable answers. One of the best arguments for Charles Darwin's theory of survival of the fittest is that it explains why living things are hierarchically grouped into family trees [SP].

In 1619, Italian philosopher Lucilio Vanini was burned alive for suggesting that humans evolved from apes.

IV.14.4 Machine learning

Things are by no means intelligent today. In fact, most things are quite stupid and have severe problems learning anything at all. A pencil, a cup of coffee, or even a TV set is not that smart. One way to improve things is to study humans and try to reuse the findings, but scientists also use results from biology and ethology, i.e. the study of animal behaviour in natural conditions. Animals have a lot to teach us, from simple motor behaviours to complex social ones.

Think about how you would represent a screwdriver and a tin of paint in the computer. Ready? Now, test whether your representation is good enough to find out that the screwdriver is perfect for opening the tin.

Machine learning (ML) is a branch of artificial intelligence and cognitive science. It explores a broad range of topics, such as modelling the mechanisms that underlie human learning and using machine learning on real world problems. If we could teach things how to learn then we could build them with much less effort, allowing them to explore the complexity of reality by themselves. This section gives an overview of some of the terminology used in ML, but without mathematical rigour. Please note that there is much more to learn than could be massaged into these few pages [TM].

Nobody really knows how humans learn. Can we then teach computers how to learn? In a way we can say that we teach things by programming them. What is the difference in principle between sending children to school and adding another program to the thing?
One theory is that learning demands interaction with the context to be learnt from. If this is true then things cannot learn faster than humans, since the speed of the interaction is constrained by the speed of the changes in the environment.

Machine learning can be defined as the improvement of performance, in some environment, through the acquisition of knowledge resulting from experience in that environment, see figure below [PL2]. Learning is in other words something that very much depends on context, such as prior knowledge.

Figure IV.14.4 Machine learning in context, i.e. using the environment and knowledge. (Figure labels: Performance, Knowledge, Environment, Learning.)

Learning is useful for several types of tasks such as classification, regression (learning how to fit data to a real-valued function), and problem solving. Classification is applicable when we want to categorise some input (instances) as one or more concepts in a concept space (feature space). In the following figure, instance x with some known attributes is classified as concept y. Often we do not know the exact mapping from instance to concept, so an inference engine will have to use a hypothesis. The inference engine shown in the figure could learn the hypothesis from a set of training instances. Learning from examples in this way is called inductive learning.

Figure IV.14.5 The principle of machine learning. (Figure labels: Concept space, Instance, Inference engine, Attribute, Instance x, Concept y.)

Learning can be difficult for many reasons. Obviously the complexity of the concept space and the number of instances and attributes add to the difficulty. Other typical problems are the number of irrelevant features, the amount of noise in the feedback for supervised learning, and noise in instance data.

IV.14.4.1 Representation of instances

The choice of representation is very important.
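As a concrete illustration of inductive learning as described above, the following minimal sketch lets an inference engine learn a hypothesis from a handful of training instances and then classify a new instance x as a concept y. Everything specific here is an assumption made for the example: the attributes (height and weight), the training data, the concept names, and the choice of a nearest-neighbour rule as the hypothesis. Each instance is represented as a point in a vector space, which is one reason why the choice of representation matters: the rescaling step is needed only because the two attributes use very different units.

```python
import math

# Training instances: (attribute vector, concept). The values are invented
# for illustration; attributes are (height in metres, weight in kilograms).
TRAINING_SET = [
    ((1.87, 90.0), "adult"),
    ((1.75, 72.0), "adult"),
    ((1.10, 20.0), "child"),
    ((1.25, 27.0), "child"),
]


def learn(training_set):
    """Learn a hypothesis from examples.

    For a nearest-neighbour rule the hypothesis is simply the stored
    examples plus a per-attribute scale, so that height and weight
    contribute comparably to the distance.
    """
    n_attributes = len(training_set[0][0])
    scales = []
    for i in range(n_attributes):
        values = [x[i] for x, _ in training_set]
        spread = max(values) - min(values)
        scales.append(spread if spread > 0 else 1.0)
    return {"examples": list(training_set), "scales": scales}


def classify(hypothesis, instance):
    """Inference engine: map instance x to the concept y of its nearest example."""
    def distance(a, b):
        return math.sqrt(sum(((ai - bi) / s) ** 2
                             for ai, bi, s in zip(a, b, hypothesis["scales"])))

    _, concept = min(hypothesis["examples"],
                     key=lambda example: distance(example[0], instance))
    return concept


if __name__ == "__main__":
    h = learn(TRAINING_SET)
    print(classify(h, (1.80, 80.0)))
    print(classify(h, (1.15, 22.0)))
```

A nearest-neighbour rule is used here only because it is short. Irrelevant attributes or noisy training instances, the problems mentioned above, would distort the distances and hence the classifications.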
We need appropriate representations of input (instances, the training set), output (concept, feature space), and of the knowledge learned. Some common representations for instances are Boolean values, categorical values, i.e. mutually exclusive values for attributes, and numerical values, see the figure below. The figure also shows a vector space representation of the numerical data (bottom right). It is always possible to describe the instances using a vector space representation.

Figure IV.14.6 Four different representations for features and instances. (Panels: attributes with Boolean values, e.g. Big = True, Blue = False, Mean = False, Howl = False, Cluck = False; attributes with categoric values, e.g. Character = Kind or Character = Mean; attributes with numeric values, e.g. Height = 1.87 m, Weight = 90 kg; and a vector space representation (feature space) with the axes Height and Weight. One panel also carries the labels Can fly, Cannot fly, and H, T, I.)

For some domains a propositional representation is more appropriate, i.e. representing an instance as a data tuple (Length Hakan 1.87). We have previously discussed how such knowledge, in the form of propositional representations, could create complex graphs.

IV.15 Humans are creative

We think we know about what we know, and about what we can do, but we often surprise ourselves, e.g. when falling in love at the first meeting. We also think that we have good mental models of other humans, but they never cease to surprise us. Our acquired knowledge and mental models are always limited. It is for instance very difficult to understand parenthood if you are not a parent yourself.

Some aspects of human knowledge representation and processing have already been described. Now we will discuss what is maybe the most magical of all human features, creativity. It is usually considered in conjunction with inventions and arts, rather than with adaptive behaviour.
But the fact is that we all need creative thinking every day, in the small, to survive. We are constantly put into situations where we need new solutions. One example is when you try out a new recipe for a small dinner with candles and spouse.

My favourite creativity exercise: Trying to find as many ideas as possible on a randomly chosen topic.

A definition of creativity is given below [MC2]. The domain mentioned in the definition alludes to a set of nested symbolic rules and procedures embedded in culture; one example is physics, another athletics. Note that if creation is goal oriented and purposeful it is equivalent to design.

Definition: Creativity is any act, idea or product that changes an existing domain, or that transforms an existing domain into a new one.

The creative process as described in the following steps is accepted by most researchers [MC2]:

• Preparation
• Incubation
• Insight
• Evaluation
• Elaboration

The preparatory phase is spent learning about an interesting problem, studying and assimilating facts, and maybe gathering sensory inputs. This phase can take years, or ten minutes, depending on the domain. The second phase, the incubation, means doing something else. Subconscious processes manipulate the information collected and suddenly, like a bolt of lightning, the effort delivers some insight, an "aha experience". The aha experience might not be the right solution; it has to be evaluated and possibly thrown away, a most unpleasant task. Throwing away is even more important in the coming idea-rich information society. If the new insight is admitted, some hard work is still needed to adapt and refine it. The creative process is said to be 1% inspiration and 99% perspiration.

It's not possible! It is not done that way here! It is too much work! It is not our idea! My idea is already perfect!

Problem: Creative solution?

Chi dorme non piglia pesce.
(He who sleeps does not catch any fish)

How to put to other uses: Adapt? Modify? Magnify? Minify? Substitute? Rearrange? Reverse? Combine? Osborn, Applied imagination

The result of the creative process must be possible to express. We cannot say that we have created something if we cannot formulate or execute it. Actors perform it, and designers visualize it. Engineers build it, but also think about how to build tools to support creativity and how to measure it.

“Now it is impossible to remain human and throw away technology!” Dr Michael Heim, Art Center College

Comments? "Arx" by Lars Vilks. Its creator describes Arx as a "three hundred page book whose pages cannot be turned. The reader must move himself."

IV.16 Humans feel presence, and have social abilities

Presence is the experience of being there, in a situation or an environment, involved in a cause-effect chain of actions. The internal representation of the situation and the actions involved is called a frame, or a schema, and in this frame you place, see, and can study yourself. It works beautifully for instance on placebo pills. When a frame breaks, presence is shattered, which is exploited by visual illusions and many kinds of humour. Frames are dynamic, socially shared, and can be culture specific.

If you are present you could be more or less aware of the directions of feelings and cognitive attention [RR3]. Presence increases as this awareness decreases. Another definition of presence is as "the perceptual illusion of nonmediation". Here, instead of feelings and cognitive attention, it is the awareness of the mediation that decreases as presence increases. Presence can also relate to social presence, i.e. to our awareness of a social environment.

Three types of presence can be identified: environmental, social, and personal presence. The difference between environmental and personal presence is that for environmental presence the environment takes you into account and reacts to you.
The sense of presence is affected by several factors. A cultural framework, the possibility of negotiation, and the possibility of action are important, as are the affordances of the context. Presence is improved by ease of interaction, user-initiated control, realism and length of exposure [RR5].

You take the blue pill and you wake up in your bed and believe whatever you want to believe. You take the red pill, you stay in wonderland and see how deep the rabbit hole goes. Matrix

The above discussion on presence focused on what it is, and how to keep it up. Alternatively we can look at how it works and has evolved. The reference [GR1] suggests three levels of presence: proto presence, core presence, and extended presence.

Proto presence is the unconscious embodied presence related to the level of "perception-action coupling". The next higher level is core presence, where changes in core affect and perceptions are consciously followed and attention is directed according to evolutionary dispositions and learned knowledge, i.e. "something arouses me, here and now, I see and hear it, and I consciously react to it". Note that this kind of presence still does not assume memory. Extended presence is slower, and here perceptions and emotions are integrated into a single experience, i.e. "this is what is happening in the current situation, I understand how it could affect me and my goals, I change my plans accordingly". Extended presence builds the frame we discussed previously and can be seen as a narrative structure involving us.

The three levels of presence correspond to three suggested levels of self built by evolution, and also to the reeeeeeaaaaallllly thorny issue of consciousness. Since our mental structure has been developing over a long time there are bound to be interactions between the different levels of self and presence. Emotion is one example of a feature cross coupling them.
Think about the following events: holding a conversation while drinking a cup of coffee, and holding a conversation while trying to decide from taste and other clues whether the coffee is Colombian. When all three layers focus on the same event we have maximal presence and the prerequisite for flow, which will be discussed in the next chapter.

One important task for cognition is to support the social abilities that groups of interactors need. For humans they have evolved over a long period of time, and some evolutionary biologists even argue that development at group level is why humans are successful.

Groups of interactors are formed for different reasons. One important reason is that members of the group benefit. A group can accomplish feats impossible for the individual alone, and an interaction among groups, rather than among individuals, might be even more efficient.

Before we continue discussing social abilities we should first introduce what we mean by the social environment. We will mostly discuss the primary group, which is a small group of people that stays together for a long time, such as a family. This kind of group was important already on the savannah millions of years ago, and within it we want as many and as tight connections as possible to increase belongingness. Secondary groups are larger and consist of less personal, temporary relations. For both kinds of groups the identity can be grounded in interacting patterns of [BS3]:

• Shared norms, laws, values, beliefs and attitudes,
• Artefacts used and created,
• Blood ties, shared experiences, and physical closeness, i.e. characteristics of the individuals in the culture.

My body is wherever there is something to be done. Merleau-Ponty

Consciousness: an organism's awareness of its own self and surroundings. Antonio Damasio

“One blue LED flashes when the robot is both recognizing behavior in another robot and imitating it. In another experiment, the researchers placed the self robot in front of a mirror. Although the blue lights fired, they did so less frequently than in other experiments.” Junichi Takeno, Dec 2005

I suspect consciousness prevailed in evolution because knowing the feelings caused by emotions was so indispensable to life. Antonio Damasio

Definition: A society is a system where citizens reach their goals through interaction and where there is more interaction within the society than between societies. The citizens live in a common space and time, are aware of having a distinct identity from other societies and believe that they can obtain their objectives better within the society. One objective would be to keep the society together.

Artefacts are instantiated technology, and are constantly changing in shape, which provides for powerful societal changes. Within the framework of a group it is possible to achieve intimacy, establish trust, confidence, and other social effects.

A first prerequisite for a social environment is social presence, and three dimensions have been distilled: co-presence, psychological involvement, and behavioural engagement [FB]. Co-presence is the degree to which a person feels that he or she is not alone, i.e. that she knows there is someone else at the same location (co-location), or senses others while showing some aspect of herself or her activities (mutual awareness). Psychological involvement is the extent to which the person attends to, senses, or responds emotionally to another person. Behavioural engagement is about the interactions that constitute social relations, e.g. when someone is dependent on an action by someone else. Connectedness complements psychological involvement and describes a situation where we know there is a person thinking about us, even though we cannot sense it [RR2].
The example given in the reference is that we send someone a message just to tell them that we are connected to the Internet.

A number of key dimensions of social intelligence are [VK]:

• Situational radar: understanding the social context and how to adapt behaviour to it. Maybe acknowledging that someone really needs privacy?
• Presence: confidence, self respect and self worth, as seen by others.
• Authenticity: telling the truth to oneself and to others.
• Clarity: to use language effectively and efficiently.
• Empathy: ease of creating and maintaining a sense of connectedness.

Social systems favour the socially competent, and abilities such as guessing the thoughts and intentions of another individual are extremely valuable. Imitation of behaviours is another important talent, and even very small children follow the gazes of their parents. Social competence is also about finding patterns, e.g. rituals, and using them. One example is that if you see someone who twice gets really, really angry over dishes not done, you suspect a pattern and perhaps make an extra effort the next day. Imitating the angry father is a popular, and advanced, social activity.

Research on a social behaviour questionnaire selected 20 out of 172 social parameters for social competence. The items were altruism, amicability, assertiveness, compassion, competence, compliance, dutifulness, eagerness of effort, empathy, good impression, gregariousness, helpfulness, likeability, modesty, responsibility, sociability, socialization, straightforwardness, trust, and warmth [BR].

A social system is a predictable pattern of interaction among persons made possible by shared structures of attention. [MC3]

Social connectedness means to highlight personal unique or exceptional attributes, and to recognize, cultivate, and acknowledge such attributes in others.
It gives possibilities to acquire specialized social and other skills, and with them the next move is to find groups missing such competencies, while still adhering to similar basic values [DB]. The same type of behaviour is also favourable when selecting long-term companions.

Other examples of competences useful in a social environment are the abilities to detect the presence and departure of others, to identify relations, to recognise individuals, for instance by their faces or gaits, and to refer to other individuals, using language when chatting and gossiping. Means for communication and interaction are necessary, otherwise a community will certainly not work, and this communication will follow patterns. We for instance agree on how to represent correlations between causes and effects, and there are protocols and rules for turn taking in discussions.

Norman Rockwell

Promotion of cooperation is very much an evolved behaviour [DB]. We need social commitments in cooperation for explicit coordination. If interactors publicly state intentions, then other interactors can use the statements for coordination. We should allow for future possibilities to affect current decisions about relationships by increasing the number of interactions and commitments to interesting individuals. If the norm is reciprocity, helping friends and relatives, then deviant patterns of behaviour will reveal exploiters who do not give, just take. This behaviour can be supported by insisting on no more than equity, thereby avoiding greed. A reputation as a greedy exploiter will not help to make friends. We will come back to co-operation and co-ordination in the next part of the book.

Whatever the reasons, groups are formed and roles are assigned, and when they are, the complexity of the system rapidly increases, and faster the more heterogeneous and complex the individuals are. One way (the only way?) to limit this complexity is to impose social rules on the system, using close feedback via social interaction.
The rules impose order and also simplify social navigation and manipulation. Humans are unique in that we can create, and even purposively design, new conventions for coordinating our social contexts. We use rules to, among other things, build trust, make friends, and identify cheaters. A marriage is for instance a long term social bond telling who belongs and who does not. It puts an end to messing around and signals a focus on the next generation. Hopefully it is also a result of emotional attraction.

So little time, so much to do.

We will in this book mostly discuss interaction in smaller systems consisting of only a few interactors. There are of course also large hierarchical systems where the interaction is between members of large organisations, or between whole societies. If you carefully read the definition of an organisation below, and think in terms of interaction, you will see that interaction is the crucial element in every organisation, simultaneously the source and the product of its existence [JF].

Definition: An organisation is an arrangement of relationships between components or individuals that produces a unit, or a system endowed with qualities not apprehended at the level of the components or individuals. E Morin

Three types of social action: Rational, Traditional, Emotional. Max Weber

Human social behaviours are very similar to those of other social species, such as other primates and wolves. However, currently interactors of type information do not organise themselves. Why? First, the interactors cannot yet make sense of the environment they sense. Second, they are not autonomous and flexible enough to explore the possibilities in the virtual world. Looking for spontaneous grouping is currently unreasonable. A major limitation is that societies created by technology so far have been very limited in how goals are obtained and managed.
The complexity of the objectives and their interactions for any H is, on the other hand, extremely high. Emotion, mood, love, and grief all fill social functions. Empathy, for instance, is an interaction that tends to keep a group together. You know, or at least suspect, that I feel what you feel, and share that feeling. Can a wolf feel empathy? An ant? The solution so far has been to let a user, or a designer/programmer, make the decisions.

Sharing limited resources and the use of communication implies mobility, and a social context that changes over time. Social context, i.e. norms, roles, and social pressure, triggers mobility. We for instance leave a meeting when we receive an important phone call. If we however imagine a society of designed networked interactors, then communication does not demand mobility and virtual resources can be shared over the network.

Try to put yourself in the place of a car; how would you perceive reality?

IV.17 Humans experience it

In the following section we will focus on experience and emotion. Not only are they important for well being, for how we feel, for what excites us, and for what we potentially will buy, but they are also difficult to describe clearly. We will draw a rough sketch of a number of concepts that have been discussed since Plato, and which are still not resolved.

Emotion and a sense of experience are based on interactions internal to us, but they are tightly bound to evolution, and to the physical and social environment. They are also processes rather than states. This section could have been situated in the next part of the book, on interaction, but we choose to introduce the content as a characteristic of an interactor.

We prefer to see experience as a dynamic action cycle where a human perceives internal and external events, and has intentions (goals) and concerns. The percepts are appraised, and emotions result.
Action tendencies (arousal) are established, and actions are executed that will change internal and external variables, possibly triggering new events. Concerns are, or have similar effects to, needs and urges. Together with impulses, drives, and attitudes they sum up to a set of motivational factors that can complement emotions and conscious reasoning to help us decide what to do. Along with emotions we also have moods and traits.

Let us have an emotional experience together.

Impulses are behaviours that are triggered spontaneously, grounded in desires. We buy that nice looking, unreasonably expensive mobile phone.

An event in this context is an external or internal representation that behaves such that it attracts attention, i.e. it is the object of the emotion and it changes. To perceive an event, presence is important and attention must be directed.

A mood is a person's sustained and predominant internal emotional experience; examples include aggression, fatigue, depression and euphoria. It is generally of long duration, unintentional, and undirected.

Figure IV.17.1 Experience as a network of interacting components. (Figure labels: Event, Attention, Appraisal, Concern, Cognition, Emotion, Action tendency, Action, Behaviour.)

The concept network in figure IV.17.1 may seem complicated, and as with many other areas where the human is studied, the devil is very much in the details. Scratching the surface reveals numerous different definitions and views, overlapping in scope, different in abstraction level, describing highly entangled behaviours. However, for the purpose of this book the above model is sufficient. There are of course also other theories to choose from.

For many of the concepts considered there is a discussion about whether they are innate or learned, and if learned, how this is accomplished. Is for instance the social environment the main factor for learning? We will not discuss this here.
To clarify the concepts we will now briefly define some of them, still at a rather high level of abstraction. We start with emotion.

"Man consists of four principles, which are called the Physical, the Nervo, the Soul or Psychic and the Mind or Mental." Max Theon's teaching

The septenary division may be given as follows: 1. Physical Body, or Sthula-Sarira, 2. Astral Body, or Linga-Sarira, 3. Vitality, or Prana, 4. Animal Soul, or Kama-rupa, 5. Human Soul, or Manas, 6. Spiritual Soul, or Buddhi, 7. Spirit, or Atma.

IV.17.1 Emotion

Emotions serve many functions as our control centre. They produce shifts in concentration and attention, motivate, and help us to sustain explorations, manipulations and investigations. Furthermore they free up cognitive resources when needed, and support social life. Emotions help us to make decisions from insufficient facts (almost always the case for us). To be surprised you have to know what to expect, which means that you have prompt access to a lot of information about the world.

Emotions involve neural, physiological, cognitive and social aspects of behaviour, and they serve as modifiers, or amplifiers, of the motivations that drive human behaviour to fulfil needs. We for instance need food, and for this hunger is a motivation. If you are hungry and see someone eating, someone who does not want to share, then anger could amplify hunger and trigger an attack. Without anger you could discuss the pros and cons of attacking until all of the food is eaten.

Emotions also serve as signalling mechanisms. They tell other individuals about your emotional state and help them to guess what you will do next. They could also guess your situation, and perhaps infer from your fear that running is a good thing to do. Furthermore emotions represent social and moral values. Fear of going to prison is supposed to keep you from committing crime [KS].

"Emotion is too broad a class of events to be a single scientific category. As psychologists use the term it includes the euphoria of winning the Olympic gold medal, a brief startle at an unexpected noise, unrelenting profound grief, the fleeting pleasant sensation of a warm breeze, cardiovascular changes in response to viewing a film, the stalking and murder of an innocent victim, lifelong love of an offspring, feeling chipper for no reason, and interest in a news bulletin" [JR]

Emotions are experienced and attached to events, people, products, and services. Generally they are short-lived, intentional, and directed. They interact with cognition as well as affecting physiology, e.g. anger raises blood pressure.

A common denominator for mood, emotion, and feelings is affect, and the simplest description of emotion is as a core affect. In many models there are two dimensions of core affect, Activation/Deactivation and Pleasant/Unpleasant, and they can be seen as spanning a circle in two dimensions, see figure IV.17.2 below [JR3]. Alternatively we can use valence and activation to describe core affect, where valence is the degree of attraction or aversion that an individual feels toward a specific object or event, and activation is to what extent an individual is awake/tense or calm/tired.

Pleasures are agreeable reactions to experiences in general. Pleasure is similar to enjoyment, a word that has been used for the more limited scope of positive responses to media.

On the periphery of the circle spanned by activation and pleasantness we can find different combinations of core affects, again see figure IV.17.2. As humans we are at every instant somewhere in this space, and, importantly, forever on the move.

A grown up smiles only 17 times a day.

When humanity is gone I hope it will be remembered by a joke.
/HG (the essence of a joke is surprise and curiosity)

[Figure IV.17.2 Core affects and their combinations. Axes: Activation/Deactivation and Pleasant/Unpleasant. Labels around the circle: tense, alert, excited, nervous, stressed, elated, happy, upset, sad, contented, serene, depressed, lethargic, fatigued, relaxed, calm.]

One basic set of emotions often used is Fear (terror, shock, phobia), Anger (rage), Sorrow (sadness, grief, depression), Joy (happiness, glee, gladness), and Disgust, sometimes complemented by Surprise [PE]. The main reason to use this set is that they can be seen as facial expressions. The emotions listed above are the most often used, but there are between 500 and 2000 different categories of emotion suggested in the English language, and different research views generate different categorisations [JR3].

Feeling is similar to emotion, but still a different kind of experience. The difference is that a feeling does not call for an action or activation change [NF].

IV.17.2 Appraisal

A stimulus of the system has to be appraised. This is the cognitive interpretation of the event, which can also be used to categorise emotions, see also figure IV.17.1 [NF]. A set of appraisal variables distilled from research is listed in table IV.17.1 [JG]. As can be seen from the table, when deciding the significance an agent is introduced that mediates events, or is itself causing the event.

Table IV.17.1 Variables for appraisal [JG].

Appraisal variable: Explanation
Relevance: Does the event require attention or adaptive reaction?
Desirability: Does the event facilitate or thwart what the person wants?
Causal attribution
   Agency: What causal agent was responsible for an event?
   Blame and Credit: Does the causal agent deserve blame or credit?
Likelihood: How likely was the event; how likely is an outcome?
Unexpectedness: Was the event predicted from past knowledge?
Urgency: Will delaying a response make matters worse?
Ego Involvement: To what extent does the event impact a person's sense of self (self-esteem, moral values, beliefs, etc.)?
Coping potential
   Controllability: The extent to which an event can be influenced.
   Changeability: To what extent an event will change of its own accord.
   Power: The power of a particular causal agent to directly or indirectly control an event.
   Adaptability: Can the person live with the consequences of the event?

Another set of appraisal variables, with their associated emotions within parentheses, is [Scherer as cited by JU]:

• Novelty (surprise, amazement …)
• Motive compliance (instrumental emotions such as disappointment, satisfaction …)
• Intrinsic pleasantness (aesthetic emotions: disgust, attraction to …)
• Legitimacy (social emotions: indignation, admiration …)
• Challenge/promise (interest emotions: boredom, fascination …)

1st, 6, 6, 28, 4th amendment (US), 10-4, 2nd law of thermodynamics, 0, ∞, 90-60-90, π, 1.618 033 988 749 894 848…, 50th birthday, 10 Downing street, 42195 m, 135th (British) Open championship, Mount Everest

From the appraisals suggested above a layered representation can be constructed of how they are applied, see figure IV.17.3 [JU]. As the authors also note, this is however a very simplified model of your emotional life.

[Figure IV.17.3 Highly simplified process of appraisal filters [JU]. Filters: Novelty, Unpleasantness, Goal hindrance, Ability to cope, Incompatible with self-respect, Incompatible with norms. Outcomes: Indifference, Happiness, Fear, Sadness, Pride, Shame, and Guilt, disgust, anger.]

Some specifically social appraisals have also been suggested: status signals (trigger pride), violation of mutual fairness (anger), and events that happen to a group that we identify with, which will trigger emotions in us [JU].

IV.17.3 Concern (need, urge, drive, goal, utility, desire, motive)

Concerns and other motivators are another set of ill-defined, overlapping concepts. If the system detects problems related to concerns, emotions develop [NF].
A concern can be universal, such as the concern for physical well-being, but others are personal, related to previous events. Concerns can be abstract, such as concern for democracy, but others can be very practical, such as worrying about a traffic jam. That concerns can be personal and situated is a problem for us here since we want to state general facts about quality of life (QoL). Since concerns are personal, so will appraisal be, along with emotions, and eventually the actions taken.

A concern is the disposition of a system to prefer certain states of the environment, and of the own organism, over the absence of such conditions.

Needs are effectuated through drives, and are low-level and nonconscious, directed at achieving essential resources. Biological drives include hunger, thirst and reproduction [DD1]. We need to eat, need to mate.

Needs can be seen as particular qualities of experience that humans need for QoL. By asking people to rate different needs a list of needs was identified, and on top of the list were the four needs autonomy, relatedness, competence, and self-esteem [KS2]. Autonomy means that the activities chosen are self-endorsed, and relatedness is the need to feel a sense of closeness to others. Self-esteem is about achievement, status, responsibility and reputation, whereas self-actualisation is described by personal growth and self-fulfilment, the process of becoming everything one is capable of. Less important were security, self-actualization and physical thriving. Popularity/influence and money/luxury were not seen as important. The list of needs was chosen from a survey of different proposals for needs.

Another famous suggestion is the fundamental set of needs on different levels by Maslow: physical health, security, self-esteem, love-belongingness, and self-actualization. Other researchers have complemented Maslow's hierarchy with aesthetic needs (beauty, balance, form ..) and cognitive needs (knowledge, meaning, self-awareness). A need for aesthetics is motivated by the idea that humans should have evolved systems to find rewarding the preferences that would have been adaptive in the past. This relates to sexual attractiveness.

• Longest kiss: 30 hours.
• The most sit-ups using an abdominal frame completed in an hour is 8,555.
• The longest time a coin has been spun until coming to a complete rest is 19.37 seconds.

Yet another interesting suggestion for needs, or urges, is curiosity, challenge and teaching [MT]. Curiosity is for instance clearly seen in young children, and there is no end to the number of world records. Curiosity without challenging anything will not get you far, which indicates that challenge is a basic need. The teaching urge means that it is difficult to keep a secret, and that it feels good to share your knowledge.

SDT (Self-Determination Theory) gives us our next list. It postulates three innate psychological nutriments for growth and well-being: competence, relatedness and autonomy [ED]. Note that challenge and curiosity are important, if not necessary, for achieving competence.

Goals can additionally be conscious cognitive creations. We have social needs (belonging, esteem), and self-actualizing needs (mastery, control, variety, meaning …) [RV].

Yet another variation on the theme is motives that arouse and direct behaviour toward specific objects and goals. The big three motives are achievement, power, and intimacy, all related to the social world [RL2]. This short list includes the most important motives from the following longer list by Henry Murray: Achievement, Exhibition (to make an impression), Order (arrange neatly, precision), Dominance, Abasement (admit inferiority), Aggression, Autonomy, Blame-avoidance, Affiliation/Intimacy, Nurturance (to give, assist, help, feed), and Succor (to receive).
The next step in the analysis would be to find even more fundamental reasons behind all of the different concerns, and try to identify the most important concerns and their reasons. Maybe there are reasons that affect many concerns? One attempt at such an analysis is found in [BS2]. Starting from the three general components situation, environment and object, the three major areas of concern found were power (hierarchy, competition, and submission), death (violence, health, and self-preservation), and love (friendship, hatred, and lust).

Dracula’s kiss

IV.17.4 Action tendency (coping strategies)

If concerns appear endangered then action tendencies develop. The following list gives some possible tendencies, with associated emotions given within parentheses [NF]:

• Approach (Desire)
• Avoidance (Fear)
• Being-with (Enjoyment, Confidence)
• Attending (Interest)
• Rejecting (Disgust)
• Nonattending (Indifference)
• Agonistic (Attack/Threat) (Anger)
• Interrupting (Shock, Surprise)
• Dominating (Arrogance)
• Submitting (Humility, Resignation)

Another view formulates the behaviour of appraisal as something that triggers coping strategies, see table IV.17.2 [JG].

Table IV.17.2 Some common coping strategies.

Problem-focused coping
• Active coping: taking active steps to try to remove or circumvent the stressor.
• Planning: thinking about how to cope; coming up with action strategies.
• Seeking social support for instrumental reasons: seeking advice, assistance, or information.
• Suppression of competing activities: putting other projects aside or letting them slide.
• Restraint coping: waiting till the appropriate opportunity; holding back.

Emotion-focused coping
• Seeking social support for emotional reasons: getting moral support, sympathy, or understanding.
• Positive reinterpretation and growth: looking for the silver lining; trying to grow as a person as a result.
• Acceptance: accepting the stressor as real; learning to live with it.
• Turning to religion: praying, putting trust in God (assuming God has a plan).
• Focus on and vent: can be functional, to accommodate loss and move forward.
• Denial: denying the reality of the event.
• Behavioral disengagement: admitting "I cannot deal"; reducing effort.
• Mental disengagement: using other activities to take the mind off the problem, e.g. daydreaming, sleeping.
• Alcohol/drug disengagement.

The table distinguishes between problem-focused (cognitive) and emotion-focused coping, but the two types of strategies still interact.

IV.17.5 Experience

Experience is a useful concept when discussing quality of life. It certainly is a complex and multifaceted concept since it is an aggregation of whatever people encounter in their lives. After a deep breath we will now attempt to dissect experience.

An experience is the sensation of interaction with a product, service, or event, through all of our senses, over time, and on both physical and cognitive levels [NS].

We are constantly interacting with our environment, attracted to favourable experiences, and repelled by others. Our moods can even be affected by words shown too fast to be consciously noted [RL3]. By evolution we are selected to be healthy and to feel good. Evolution has most efficiently eliminated those of our forefathers who did not enjoy food, shelter and company [RV]. Playing tennis with a friend is typically more fun than sitting in jail alone, and people who win Oscars on average live 4 years longer than people who are nominated but fail to win [RL3].

We humans prefer a medium level of uncertainty. Total predictability will be dull and grey (at best), and even worse is the ultimate chaos without any patterns of stable references. We, in other words, prefer a semi-chaotic environment. If this is not possible we will try to change context or situation. When we find ourselves in a comfortable situation this is however sadly only a temporary match and relief.
Either we adapt, or the situation develops into something we cannot handle. The problem is not as acute in a social environment where the participants can co-develop, but even there the context can endanger a long-term relationship.

Hedonic experiences are experiences related to like or dislike. If positive we can name them pleasure, enjoyment, excitement, fun or happiness, and once again we face a set of semantically overlapping concepts. In the following we will not distinguish between the terms. There is however research that tries to define and explore the differences; the appeal of a technology, for instance, relates to pleasure, whereas novelty involves pleasure, excitement and fun [HS1].

If we consider media enjoyment, such as reading a book, watching a video, or playing a game, we seek a suitable tension between the cognitive abilities of the interpreter, and the complexity and other characteristics of the media message. We can view experience as a story, i.e. a frame, possibly socially constructed, where we can be one of the participants. Enjoyment results from moving in between overstimulation and understimulation, providing different levels of arousal. The extent to which we are able to extract the message from a medium, and experience arousal, depends on disposition, but also to a large extent on training.

Both cognitive and affective structures are active and interdependent when information is processed. Even physiological aspects are involved. A human infant tasting sugar will relax the muscles of the middle face. So far however, no one has found any centre of pleasure in the brain, even though several regions have been identified that contribute to feeling good [KB]. Luckily, a larger area in the brain is allocated for positive experiences.

Many researchers have tried to define the dimensions describing experience. One is Donald Norman, who describes the visceral, behavioural (learned, habitual), and reflective experiences [DN1].
The visceral level is the "biologically prewired" one, about how things look, feel and sound. Another framework is provided by Jordan, who classifies pleasure as physio-pleasure, psycho-pleasure, ideo-pleasure and socio-pleasure [PJ].

A third approach is to look at the aesthetic experience, which is ultimately about a satisfaction resulting from an experience and could originate from perception, cognition, or action, e.g. dancing. Aesthetic sensibility can be trained and increases with age. It is personal, changing, and not necessarily rational. An aesthetics can also be defined for the social environment, relating to trends, culture and religion [BS3].

Aesthetics is as fundamental to human life as creativity and design, but it is concerned with how things are done rather than with coming up with them. Everything from making a cup of coffee to writing a software program can be done considering aesthetics, and it is a blessing as well as a curse. A blessing since if we care about aesthetics, beauty is something that matters, but a curse because it adds extra constraints that hamper productivity as measured in time. Taken to the extreme, a designer will not be able to follow orders in conflict with his sense of aesthetics.

Why do we care for aesthetics? It has no functional value and yet it is highly valued. A crude explanation from evolution is that anyone who can afford to mess around with arts and care about aesthetics must be wealthy and worthy of respect. It is a matter of status. Another reason is that aesthetics is about human concerns, emotional needs rather than efficiency. Not only solving the problem, but doing it with elegance, creativity, and with few resources.

Aesthetics: beautiful, good, true, satisfying, efficient, and useful, all at the same time (behaviour, form, or thing).

White square on a white ground

Why then is an expression considered aesthetic? What is it that pleases the individual, and satisfies by concentrating pleasure stimuli?
For vision, and if we once again use evolution as a master copy, we want to look at safe, food-rich, explorable, learnable habitats, and at friends, fertile and healthy mates, and babies [SP]. Any ordinary family photo album can be used to verify these facts. Reproductions of basic graphic elements that we employ daily to make sense of our environment, for instance parallel lines and symmetrical shapes, are also candidates for aesthetics. These elements have been incorporated into us by evolution, to help us orient ourselves in the environment, and now we can exploit this inherited capability in meaningful, clear, and aesthetic graphics.

Is it possible to enhance humans' visual literacy (as a new language)?

The search for elegance and the careful selection of design elements force the designer to reflect on the result, and to spend time with it. Sometimes a new, complete and yet most economical solution is found, and the result surprises even the designer herself. Some of the keywords are proportion, scale, contrast, and emphasis, which evoke activity and interest.

An extensive framework for experience is provided by [NS2]. The intent is to help designers and management to think about products. Some of the characteristics affecting an experience discussed in the framework are:

• Intensity: reflex, habit, engagement.
• Breadth: price, promotion, channel/environment, name, brand, service, product.
• Significance: function, price, emotion/lifestyle, status/identity, meaning.
• Triggers: by sense or cognitive (concept, symbols).
• Duration: initiation, immersion, conclusion, continuation.
• Meaning: beauty, accomplishment, creation, sense of community or oneness, duty, enlightenment, freedom, harmony, justice, redemption from undesirable conditions, security, truth, validation by others, wonder.

In the framework the meanings of meaning are the most interesting.
Meaning is defined as a distinct level of cognitive significance that represents how people understand the world around them, and integrates emotional and cognitive as well as cultural factors. Meaning is very important to all of us, and many of the suggested meanings listed above can be seen as a person living out a culturally specific frame.

There are also other human experiences that have not been covered in the discussion above. Some of them are:

• Commitments
• Pride
• Schadenfreude, enjoying the mistakes or bad fortunes of a rival.
• Fiero, personal triumph over hardship or impossible superiority.

Human lives are extremely rich!

IV.18 Human’s subjective well-being, emotion, and flow

The ultimate reason for design is to maximise QoL (Quality of Life), or as we refer to it in the header, subjective well-being. Quality of life is a number of suitable dimensions (quality) for assessing the emergent process of one or more humans being (life). Synonyms for QoL from other areas of research are Life satisfaction (economics), Happiness (psychology and economics), Well-being (psychology and health), and Welfare (economics). The term Quality of life itself originates and is used in sociology [BA]. We will assume that the total well-being of a system is the sum of the QoL for the human interactors involved. A list of the most used QoL dimensions, sometimes referred to as life-chances, is shown in table IV.18.1 [RS].

When the tide of life turns against you
And the current upsets your boat
Don’t waste tears on what might have been
Just lie on your back and float.

Table IV.18.1 Quality indicators based on reading 9749 abstracts, and 2455 articles, selected from the 20900 articles with the term quality of life in the title and published since 1985.

QoL dimension: Indicators and descriptors
1. Emotional well-being
   a. Contentment (satisfaction, moods, enjoyment)
   b. Self-concept (identity, self-worth, self-esteem)
   c. Lack of stress (predictability, control)
2. Interpersonal relations
   a. Interactions (social networks, social contacts)
   b. Relationships (family, friends, peers)
   c. Supports (emotional, physical, financial, feedback)
3. Material well-being
   a. Financial status (income, benefits)
   b. Employment (work status, work environment)
   c. Housing (type of residence, ownership, neighbourhood)
   d. Infrastructures (personal and goods transportation)
4. Personal development
   a. Education (achievement, status)
   b. Personal competence (cognitive, social, practical)
   c. Performance (success, achievement, productivity)
5. Physical well-being
   a. Health (functioning, symptoms, fitness, nutrition)
   b. Activities of daily life (self-care skills, mobility)
   c. Leisure (recreation, hobbies)
6. Self-determination
   a. Autonomy/personal control (self-endorsed, independence)
   b. Goals and personal values (desires, expectations)
   c. Choices (opportunities, options, preferences)
7. Social inclusion
   a. Community integration and participation
   b. Community roles (contributor, volunteer)
   c. Social supports (support network, services, events)
8. Rights
   a. Human (respect, dignity, equality)
   b. Legal (citizenship, access, due process)

The indicators listed in the table are not independent; interpersonal relations are for instance extremely important for emotional well-being and personal development. Claims have for instance been made both that the TV disrupts and that it increases family interaction. Other surveys have found that heavy TV-users are unhappy, but is this really an effect of watching TV, or do they watch a lot of TV because they are unhappy for other reasons?

Social identity: how you want to present yourself to others. To this we can add the narrative self, your own history as you see it, past, present, and future. A frame you adapt to your life, and try to adapt your life to. Continuity is important for trust; on the other hand you need to show that you are unique, contrasting against others.

Self-esteem is a person's feelings towards herself, pride in oneself. Self-esteem depends on social factors such as family traditions, language, cultural customs and values, about for instance economic background. We all want to feel privileged and chosen relative to the rest of the world.

"Um, I think your problem is low self-esteem. It is very common among losers."

We can study the indicators in table IV.18.1 in two ways. Either we ask individuals and obtain an internal estimate, or we try to measure the same parameter from the outside. When we ask individuals we face the problem that we cannot be sure that the question is understood the same way by everyone, and also that adaptability modulates the expressed level of QoL. Other problems are that personality, disposition, temperament, and recent experiences affect opinions. The cultural and organisational context of an individual also changes the priorities of the indicators in table IV.18.1. More money makes a bigger difference when living among the poor. It is on the other hand difficult to establish the mood of a person from the outside, although for instance analysis of facial expressions can extract emotional information.

The table above can be used to indicate in what areas positive changes will affect the individual. Even if a few individuals do not appreciate improved housing, on a population level this indicator is a true measure of increased QoL. With respect to technology we can compare the indicators above before and after the introduction of a new technology.

Measures of QoL provided by a product or a service could be:

• What people are willing to pay for it.
• Their reaction to losing it.
• How much they use it.
• Their attitude towards the product.
• How they feel after using the product.

Happiness is an individual's appraisal of life, synonymous with subjective well-being. It is a personal sampling of life as a whole, while in the middle of living. Happiness is an easy indicator of QoL to measure, just ask.
The problem is to interpret and relate the answers given. If our measures for instance indicate that unmarried persons are less happy, this could be interpreted as that the word unmarried is negatively loaded in the specific culture. It could on the other hand be an indication of loneliness, or correlated to the fact that unhappy people are less attractive [RV].

From table IV.18.1 we have the ideas of QoL dimensions. They can be seen as a context for what we do and what happens to us, i.e. the course of events that we experience. Table IV.18.2 below shows an example where happiness has been measured for a selection of the daily events we encounter and the actions we perform.

Table IV.18.2 A list of courses of events and their happiness index. Note the popularity of social interaction [RL].

Action/event                    Happiness   Average hours per day
Sex                             4.7         0.2
Socialising after work          4.1         1.1
Dinner (socialising too?)       4.0         0.8
Relaxing (socialising too?)     3.9         2.2
Lunch (socialising too?)        3.9         0.6
Exercising                      3.8         0.2
Praying                         3.8         0.5
Socialising at work             3.8         1.1
Watching TV                     3.6         2.2
Phone at home                   3.5         0.9
Napping                         3.3         0.9
Cooking                         3.2         1.1
Shopping                        3.2         0.4
Computer at home                3.1         0.5
Housework                       3.0         1.1
Childcare                       3.0         1.1
Evening commute                 2.8         0.6
Working (! low index)           2.7         6.9
Morning commute                 2.0         0.4

“We are made happy when reason can discover no occasion for it”, Henry David Thoreau

Happy: ebullient, joyful, exhilarated, elated, carefree, contented, at peace, at ease, and being in high spirits.

There is presumably a motivation for taking an action, and reasons to expose oneself to experiences. The most important one is to directly increase happiness by having sex, or enjoying lunch, but we also plan ahead, as when we work to fund shopping or a good wine. The logic is either: action -> feel happy -> more action, or: experience something that increases happiness -> take action to experience the same thing again.
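Since table IV.18.2 gives both a happiness index and the average hours spent on each activity, one way to summarise it is a time-weighted average, i.e. the sum of happiness times hours divided by the sum of hours. The Python sketch below shows that arithmetic with the values from the table. The names `ACTIVITIES` and `weighted_happiness` are hypothetical, and using the hours directly as weights is an assumption, since activities in the table may overlap in time.

```python
# (activity, happiness index, average hours per day), from table IV.18.2
ACTIVITIES = [
    ("Sex", 4.7, 0.2),
    ("Socialising after work", 4.1, 1.1),
    ("Dinner", 4.0, 0.8),
    ("Relaxing", 3.9, 2.2),
    ("Lunch", 3.9, 0.6),
    ("Exercising", 3.8, 0.2),
    ("Praying", 3.8, 0.5),
    ("Socialising at work", 3.8, 1.1),
    ("Watching TV", 3.6, 2.2),
    ("Phone at home", 3.5, 0.9),
    ("Napping", 3.3, 0.9),
    ("Cooking", 3.2, 1.1),
    ("Shopping", 3.2, 0.4),
    ("Computer at home", 3.1, 0.5),
    ("Housework", 3.0, 1.1),
    ("Childcare", 3.0, 1.1),
    ("Evening commute", 2.8, 0.6),
    ("Working", 2.7, 6.9),
    ("Morning commute", 2.0, 0.4),
]


def weighted_happiness(activities):
    """Return the average happiness index, weighted by hours per day."""
    total_hours = sum(hours for _, _, hours in activities)
    if total_hours == 0:
        raise ValueError("no time spent on any activity")
    weighted_sum = sum(index * hours for _, index, hours in activities)
    return weighted_sum / total_hours


if __name__ == "__main__":
    print(f"Time-weighted happiness: {weighted_happiness(ACTIVITIES):.2f}")
```

Because working takes up far more hours than any other entry, its low index pulls the weighted average well below the values of the most enjoyed activities.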
Whether interaction technology makes you happier is a good question, and one you need to ask to assess the result of the introduction of new technology. Asking is important since if we cannot evaluate the result, then what and why do we design? Measures are however not very precise, even if they are consistently positive or negative; for instance, when interviewed by someone in a wheelchair healthy persons will rate their happiness higher [RV].

Happiness: China (Red); Egypt, US (Yellow); Cherokee (White); Japan, Middle East (Orange/Gold); Brazil (Blue)

An overall assumption in this book is that happiness can be changed by adding technology. Whether raised happiness over a longer term is possible is however not clear. A widely accepted figure for the heritability of well-being is 50% [KS], i.e. to a large extent our genes indicate our destiny.

Quality dimensions as listed in table IV.18.1 provide another salient set of cues for how we rate our happiness, and even by themselves can affect happiness. If we compare our social prestige with that of our neighbour the result can affect how we evaluate our life. Life-chances could also directly affect how we evaluate what we experience. More people without family bonds feel lonely. Each individual will however interpret and sum their life-chances differently depending on for instance age, disposition, and education.

After taking heritability and life-chances into account we still have a significant share of happiness that depends on the events and the experiences that we encounter, but also, and perhaps more importantly, on the actions we take and how we experience the result of these actions. We in other words have a chance to create our own happiness within the context of our life-chances, but one problem with this is that we are restrained by our traits. They are the affective, cognitive and behavioural patterns that are consistent across situations and over time, and quite stable for an individual.
Another objection against taking action for happiness is that we anyway adapt to the new resulting circumstances, and consequently striving for happiness is not worth the trouble in the long run. If we however look not at the result, but at the process of gaining happiness, the fight might be worth the effort. We are constantly interacting with our physical and social environment, and it seems silly not to do this in a way that brings us as much happiness as possible. If we can build tools to help us in this, it is good use of technology. Identifying the actions to take, and the technology we need, are important research issues. Furthermore, what we do also changes the conditions, allowing for new actions and ideas of supportive technology.

Examples of actions that change happiness are purposely being nice to someone, and preparing a list of positive things in life [KS1]. In general however the activities to choose depend on the personality of the individual (strengths, interests, values) and the life-chances. More people in developed countries, where housing and food are provided, will look positively on life. They also think that they are happier than average; we are a positive breed of critical observers. On the other hand we rate losses as more salient than wins, which tends to make us conservative.

One important finding from the perspective of this book is the positive correlation between happiness and urbanization, industrialization and individualization. It seems that we are happier in a modern society despite the obvious problems with anonymity and alienation [RV]. Why this is so might be clearer when we later give examples of the technology that surrounds us, and what it helps us to do.

Typical social traits associated with subjective well-being are extroversion, conscientiousness, and agreeableness, but what other characteristics does positive social behaviour show?
The follow-up questions are whether, and how, technology can induce and support such behaviours.

[Margin: Ebenezer Scrooge. Work hard, increase production, prevent accidents and be happy. Voice]

IV.18.1 Flow

Flow is the optimal presence, and it is the optimal experience, i.e. the ultimate mindfulness [MC3]. It might also be more common than you think. Were you lost in the previous sentence half a second ago? Prerequisites for flow are:

- A task with clear goals to complete.
- Immediate feedback.
- Ability to concentrate on the task.
- Sense of control over actions.

[Margin: Masters of science!]

In the state of flow the sense of time seems to change, concern for self and awareness of worries disappear, and after the experience a stronger sense of self emerges. To achieve all this it is, however, important that skills and challenges match the person, see the figure below.

[Figure IV.18.1 Model of flow. Axes: level of challenge, level of skill. Regions: anxiety, flow, apathy, boredom.]

People will accept, or be faced with, a level of challenge, and as skills develop and new challenges emerge they will be forced towards the upper right corner of figure IV.18.1. Also, note how easy it is to match the prerequisites for flow to how a successful game affects a player.

Challenge can be generalised to the level of complexity that a person faces. Boredom then means that we face a situation with low complexity compared to what we can manage. This difference between situational complexity and ability is called incongruity in [RN], and the figure below shows a person in two different situations, and how learning and adaptation affect incongruity.

"What, me worry?" Alfred E. Neuman

[Figure IV.18.2 Incongruity (adapted from [RN]). Labels: complexity, context, person, situation 1, situation 2, learning/adaptation, incongruity.]

The figure is interesting since it suggests a definitely dynamic framework for experience, and also that all experiences are individual, since incongruity is individual.
It is also context and situation specific. Now, if we return to figure IV.18.1 above and follow the reasoning from [RN], flow means that the contextual complexity must be larger than the individual's. Also, a medium arousal level is sought, which means that we will search for such an incongruity. If the challenge is too low we will get bored and look for novelty, i.e. for a more complex (challenging) situation or context. If, on the other hand, it is too high we try to lower the level of arousal, increase confirmation, and reduce uncertainty.

[Figure IV.18.3 Optimum level of arousal (uncertainty). Axes: level of arousal, pleasantness. Labels: situation 1, situation 2, optimal new situation.]

We can assume an equivalent to flow also in social activities. Too much control reduces complexity, and without enough challenges we get bored. Too much variation, i.e. a level of social complexity higher than we can handle, is also not good. A simple solution for tweaking the current situation is to use drugs; other disputable strategies are all kinds of over-consumption, whether of games, gambling, girls (3G), or of food, fat, and sugar.

IV.19 Unique features for each of us

IV.19.1 Unique human abilities

Let us start with human development. The main theory is evolution, survival of the fittest: a kind of biological learning where favoured mutations or new behaviours become established as dominant after some generations. The resulting development cycle is rather long, and does not necessarily produce optimal solutions; survival is enough. Also, it is not clear what it means to be the fittest in the new, fast-changing information society. Humanity seems to have bypassed evolution for individuals, but perhaps it is still at work for societies and ideas?

The list of intellectual feats mastered by humans only is rather long [DAN].
Maybe the most important of them, apart from the ones discussed previously in this chapter, is how we design and create artefacts to develop old abilities and give us new ones. Some other animals also use simple tools, but humans have created complex artefacts such as the hammer, the automobile, and the computer. The hammer is, by the way, not that simple to use; ask any chimpanzee.

People are deeply rooted in space and reality. Even a four-month-old infant is surprised if an object passes through a gap that is narrower than the object itself, or if an object disappears from one place and materializes in another [SP]. Spatial information is mainly provided by vision, and there are innate behaviours in place to manipulate it, see the figure below.

"We all have a little weakness, which is very natural but rather misleading, for supposing that this epoch must be the end of the world because it will be the end of us. How future generations will get on without us is indeed, when we come to think of it, quite a puzzle. But I suppose they will get on somehow, and may possibly venture to revise our judgments as we have revised earlier judgments." G K Chesterton

"Humans think they are smarter than dolphins because we build cars and buildings and start wars etc... and all that dolphins do is swim in the water, eat fish and play around. Dolphins believe that they are smarter for exactly the same reasons." Douglas Adams

[Figure IV.19.1 Spatial effects in visual perception.]

Processed image information is stored in long-term memory for convenient retrieval, with better access for frequently used information. You will quickly recall the colour of your house and the colour of the house next door. But recalling the colour of the next house but one is slower, as you need to do a mental scan along the street. Right?

Imagine the letter D. Rotate it 90 degrees clockwise. Put the number 4 above it. Now remove the small horizontal segment to the right of the vertical line.
What familiar object do you "see"?

Our species' special spatial dependency is also evident in language [SP]. We assume that distant objects and relations affect us less, and this way of thinking can also be reused with reference to time. We say that "The meeting lasted from 13.00 to …", and "The meeting is at 15.00", "I will pay this bill, it is due tomorrow, that one is due next week". The same metaphors are also reused for programming: "These input parameters have almost the same values, and should produce almost the same output from the procedure".

"Space and force pervade language. Many cognitive scientists have concluded from their research that a handful of concepts about places, paths, motions, agency and causation underlie the literal or figurative meanings of tens of thousands of words and constructions… These concepts and relations appear to be the vocabulary and syntax of mentalese, the language of thought" [SP]

The human is also the only interactor we know of who troubles herself with philosophical questions, such as what is consciousness, self, free will, and meaning, e.g. "I know what a natural number is, but how can my brain have a relation to an infinite number of items?". Other mind-boggling problems are concerned with morality, e.g. "Why not steal to avoid starving?", and knowledge, "Why is the speed of light constant, and why should that fact be true tomorrow?". Maybe we can use technology to shed some light on some of these questions, or is our own nature such that we are not capable of understanding the answers [SP]?

"Consciousness made easy", "Meaning of life for dummies". New titles we would like to see.

Some other feats to ponder on are:

- Common sense, most of us have it, but no one can describe how to teach it.
- Humour, jokes are based on the human ability of association (and more?).
- Art, creates a strange world of its own, exploring socially accepted truths that are sometimes agreed on by one person only. Art is a multilevel experience.
We perceive a work of art, we reason and think about it, and it might arouse emotional responses. We humans decide for ourselves what should be considered beautiful, but there are some accepted universal truths. Below, in figure IV.19.2, is an attempt by Dr Marquardt to design the most beautiful, ultimate, face. It is created using the golden ratio, also called the golden section, golden proportion, or divine proportion. This number is the result when a straight line is divided into a longer part a and a shorter part b such that $a/b = (a + b)/a$, which gives $b/a = (\sqrt{5} - 1)/2$, i.e. 1 : 1.6180339887. For some reason the golden ratio is considered beautiful by most people, and it also appears frequently in nature. Curious? Why?

"Common sense knowledge is not the sort of knowledge found in encyclopedias, but, rather is the sort of knowledge taken for granted by those writing articles in encyclopedias." Hubert Dreyfus

[Margin figure: Tangram (spookey)]

[Figure IV.19.2 Face designed using the golden ratio (left). The width of your nose and the width of your mouth should follow the golden ratio, but if yours do not, do not worry, beauty is not everything (right).]

- Games and sport, mental and physical exercise raised to art.
- Schooling, organised cultivation, not always as efficient, or as much fun, as it should be.
- Rituals, used for social confirmation. These are tight social interactions where the sum adds up to more than the individuals.
- Time perception, what is time, another of the eternal questions. Every one of us perceives context differently. Some of us quickly identify situations and events, but others are quite sluggish. Does this mean that people perceive time itself differently?

[Margin: "Now it is time to go to bed." Time perception: hurrying, waiting.]

All mobile things face the problem of sustaining power. The sun is mostly the source, directly through a solar panel (0.3 m², 10 W), or indirectly through a lithium battery (200 g, 6 W). A human has the wonderful power to convert food into energy.
A not overly exercised adult consumes 2000 kcal/day. This is equivalent to 2325 Wh (2000 kcal × 4.184 kJ/kcal ≈ 8.37 MJ), an average of just under 100 W. We would need quite a battery! Some figures on consumption are that sleeping needs 80 W and walking adds 60 W. Researchers from MIT (Massachusetts Institute of Technology) have managed to retrieve 8.4 mW of this energy by inserting a piezoelectric element into a shoe. Arm curl adds another 35 W of consumption, and finger motion adds around 10 mW. It is not much use trying to burn fat by exercising your fingers [TS].

"Carriages without horses shall go, And accidents fill the world with woe." Mother Shipton (circa 1530)

As our environment becomes more complex, the limitation in the system will more often be the so-called human factor. Human memory and processing limitations will inevitably lead to errors when a human is given the wrong type of tasks. Humans are also limited in many other respects; our running speed is, for instance, not very good compared to an aeroplane. We have to design systems carefully so that they do not overtax human limits for mental and physical effort, reaction times, performance, or frustration. But even though the human factor is the cause of many problems, the truth is that it is the only unique feature of H, and should be nursed and refined.

IV.19.2 Features and limitations not found in man

Many times things are more robust than humans, other times they break without any visible reason, but so far they have never complained! A rugged design can survive in the desert, as well as on the top of Mount Everest, for a long time. Robustness stems from redundancy and other qualities that can be built into a thing, but a human being has to settle for nature's ready-made design.

The thing is currently evolving under quite a different environmental pressure than man. Compare survival on the savannah with survival in the market place. Evolution is supported by the fact that it is possible to manufacture an exact replica of a thing.
This is not possible to do with a human, and is a major difference! We might copy the basic genetic code and use it as a base for an individual, but to copy a human complete with her psyche is impossible. Nature's choice of an analogue implementation enhances adaptability at the cost of losing some determinism, including the possibility of making exact replicas. Whether such a trade-off is necessary for intelligent, context-dependent, adaptive behaviour is not clear.

"Every second someone hooks a new computer to the Internet. This person must be stopped!" Internet

"I want to pull my arms into me when they aren't in use." John Dobbin

What are the limitations for a thing? None whatsoever? The answer is that we really don't know yet, but the very idea of an intelligent thing is something that by itself continuously enhances human capabilities! By exploring the possibilities of things we learn about ourselves, and about the reality we share. What we can say is that the thing is less limited than the human in many respects, but that currently humans control the reproduction. We are no longer able to give birth to a computer without the help of a computer, but we are in control of the on/off button to the womb. In practice we have lost this possibility too, since our society is now too dependent on the computer.

A grown-up human cannot decide her extent of autonomy; it is fixed by nature. How much autonomy to implement in a thing is a design decision, and depends on the purpose. The complexity of the internal workings of a thing will be proportional to the extent of autonomy. Perhaps emotions and beliefs are overkill for a vacuum cleaner? If your child's toy becomes too autonomous it might walk sulking away from the playpen.

Another limitation is that the thing currently has no consciousness; it is not self-aware the way humans are. One prediction is that technology by 2029 will provide the necessary means for consciousness (memory, processing power).
Whether these are sufficient is another matter [GB]. The prediction is based on the estimate that there are $10^{12}$ neurons in the brain, each with 1000 synapses, summing up to a total of $10^{15}$ synapses. An artificial neural network needs 4 bytes of memory per synapse, so in other words we need 4 million Gbyte of memory. Estimates from the typical random-access memory configurations in personal computers over the previous 20 years give the following formula: $\text{noByte} \approx 10^{(\text{year} - 1966)/4}$. Solving this for 4 million Gbyte gives the year 2029.

Even if the number of memory cells and the number of connections are equal in an integrated circuit and a brain, and even if the circuit is a hundred times faster than a neuron, there is still another trick that the brain could use to achieve its outstanding complexity, and that is timing. Timing in an integrated circuit is used only to start a computation, for instance by enabling a gate in an inverter. In the brain timing can also be used to encode information. A short difference in time between two neurons firing in parallel could mean that a trailing neuron is triggered. If, on the other hand, the difference is large, the trailing neuron will not fire. The complexity possible for such a scheme is in principle infinite. In practice robustness will set a limit to the complexity; using too small timing differences is not practical.

Power is quite necessary for an intelligent thing. Currently computers drink electricity, and depending on the application and the situation this can be a severe problem. It is expensive (as it is for humans) to quench the thirst. For stationary things power is usually not much of a problem, but mobile things need batteries, solar cells, or some other energy source. There are also other demands on the energy source: size, weight, service life, rechargeability, time to recharge, replacement cost, and environmental and ecological concerns.

It is tough to be a hardware designer.
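Returning to the memory estimate earlier in this section, the arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes that the growth formula reads noByte ≈ 10^((year − 1966)/4), that one Gbyte is 10^9 bytes, and the function and constant names are invented for the illustration.

```python
import math

# Figures taken from the text above.
NEURONS = 10**12
SYNAPSES_PER_NEURON = 1000
BYTES_PER_SYNAPSE = 4


def required_bytes() -> int:
    """Memory needed to store one value per synapse."""
    return NEURONS * SYNAPSES_PER_NEURON * BYTES_PER_SYNAPSE


def year_reached(n_bytes: float, base_year: int = 1966,
                 years_per_tenfold: float = 4.0) -> float:
    """Invert the assumed growth law noByte = 10 ** ((year - base_year) / years_per_tenfold)."""
    return base_year + years_per_tenfold * math.log10(n_bytes)


need = required_bytes()   # 4 * 10**15 bytes, i.e. 4 million Gbyte
year = year_reached(need)  # a fractional year during 2028
print(need, round(year, 2), math.ceil(year))
```

Under these assumptions the requirement is met during 2028, so 2029 is the first full year in which the memory is available, which agrees with the prediction quoted above.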
A typical lithium battery for 6 W weighs 200 gram, and a NiCd battery for 6 W is quite heavy, about 1.5 kg. We can use Mips/Watt/$ to estimate the power performance of a system. A typical value for this is 20 (1999). A similar measure for memory is MegaByte/Second/Watt/$.

[Margin: 31 Dec 2028]

Isaac Asimov formulated the following three robot laws in his science fiction:

I A robot may not injure a human being or, through inaction, allow a human being to come to harm.
II A robot must obey orders given by a human being except where such orders would conflict with the first law.
III A robot must protect its own existence as long as protection does not conflict with the first or second law.

So far no robot has needed any laws, because they still cannot even reliably detect humans. But if they needed laws, would they really need the second law?

When, if ever, should an "intelligent" thing lie to another? To a human being?

"A number of different approaches are being tried." Engineering, Internet (We are still guessing at this point).

A personal computer dissipates around 100 Watt, and as a household radiator a thing can dissipate 1 or 2 kWatt.

Computers currently do not learn very well, and this fact forces them to develop through evolution. They do not reproduce, which means that they are at the mercy of their creators. Survival of the fittest then gets a different meaning, namely survival by manual selection. On the other hand, things can be copied and each generation can have a short lifespan, e.g. compare mobile phones. Which is best in the long run? To be able to learn, or to develop through fast evolution? Maybe these two approaches are equivalent in our case since we are all interdependent?

Currently another limitation is that we do not know much about the mechanisms for creativity in humans, which makes it extremely difficult to build things that are creative.
What we can strive for is not technology that is creative by itself, but technology that helps a human to be more creative, i.e. technology that supports the steps of creativity described in Chapter IV.15 on human creativity.

IV.19.3 Summary Human vs Thing

The table below is a summary of the characteristics of man and machine [BS1]. As we do in the table, we tend to confront the thing and the human, but maybe we should rather see the thing as an extension of a human, a complement extending senses (binoculars, hearing aid), motorics (car, bicycle), and cognition (computer-based calendar, calculator).

Humans generally better | Machine generally better
Sense low-level stimuli, e.g. using the finger tip. | Sense stimuli outside human's sensory range.
Recognise constant patterns in varying situations. | Count or measure physical quantities.
Sense unexpected events and subjectively evaluate them. | Store coded information accurately.
Remember principles and strategies. | Monitor prespecified events, especially infrequent ones.
Retrieve pertinent details without a priori connection. | Make rapid and consistent responses to input signals.
Draw on experience and adapt decisions to situation. | Recall quantities of detailed information accurately.
Select alternatives if original approach fails. | Process quantitative data in prespecified ways.
Generalise from observations. | Infer from general principle.
Act in unanticipated emergencies and novel situations. | Perform repetitive predefined actions.
Develop new solutions. | Exert great highly-controlled physical force.
Concentrate on important tasks when overload occurs. | Perform several activities simultaneously.
Ethical reasoning. | Maintain operations under heavy load.
Emotional (computers do not care?). | Maintain performance over extended periods of time.

Table IV.19.1 Summary of humans versus things.

"Thinking about the things we used to do" Nancy Sinatra, Dean Martin

Imagine one million things, each with sensors and capacity for a mental model. Connect them into a high-speed network and wait. What will happen? What are the main challenges to make this scenario true?

Part V: Interaction, we do it together

This part of the book will introduce feedback under the pseudonym of interaction, one of the most important concepts of this book. We start off by pinpointing the concept of interaction. Next, Chapters V.1 to V.5 give short introductions to interactions between the three participants H, I and T. We discuss interaction as a way of improving quality of life, and also why we need technology for this. Since the main objective is to improve quality of life, much of the discussion, and almost all of the examples, are taken from a humanistic perspective. We continue by giving context a special treatment. It is important since context is the background environment that fuels interaction. Chapter V.7 defines context and describes how it can be used. Equipped with knowledge of context we go on to Chapters V.8 and V.9, where we focus on interaction modelling and interaction characteristics. Next, mediation gets its own chapter, V.10. The last five chapters discuss interaction terminology and technology for interaction control and cooperation; specifically the last chapter of this part of the book, Chapter V.15, will give an introduction to command-based interaction, i.e. the traditional core of human-computer interaction.

The word interaction hints at its own meaning. It is built from the words inter and action, both derived from Latin: inter meaning between or among, and action, from Latin actio, actually meaning action. The following story adds to the concept: An author (whose name we have forgotten, perhaps it was Victor Hugo?) becomes really nervous when his new book is released, and escapes to a resort. He soon becomes feverishly curious and sends a telegram to his publisher, which contains only a "?". His publisher returns a "!"
and the relieved author continues his vacation.

Interaction is where two or more actors share a common time-space-state universe.

This is obviously an interaction between two communicating entities. It is also something more than a mere exchange of messages. Because the publisher knows the author, he can predict how the author will interpret the "!". One description of this interaction is that "the two participants are brought into a dynamic relationship through a set of reciprocal actions, that is through a series of events, during which they are in contact with each other in some way". A better and shorter definition is "mutual interdependence". This more accurately describes the coupling between the participants, but the fact that the interaction above was goal directed is still missing. Let us use the definition below.

Definition: Interaction is a method for goal directed mutual interdependence.

A goal of an interaction can be engineered into the system, or emerge from previous interactions. If it emerges it could do so through a democratic process, or be imposed in a more dictatorial manner by one of the interactors. Emergent goals have the interesting property that they can dynamically change behaviours, and make old objectives obsolete. To follow norms and fulfil obligations are examples of emergent, and also changing, goals. In this book we will, however, not always insist on a goal for the interaction. Two charged particles affecting each other will also be considered an interaction.

We can see that traditional sciences explore two types of phenomena at different levels of detail. They study objects, and also operations acting on and performed by objects. Typical objects are molecules, cells, computers, humans, networks, societies, and stars, which we in this book refer to as interactors.
The operations, in this book called interactions, can be exemplified by human conversations, chemical reactions, software processing of sensor data, data communication, and gravitation. Interactors and interactions are intertwined and dependent, which means that even if it is possible to spend a lifetime studying almost any detail of either an interactor or an interaction, we believe that such a strategy will not give the whole picture. We cannot fully understand an interactor without knowledge about the interactions it is engaged in, and we also cannot understand the interaction while ignoring the interactors.

When we study the world we can try to figure out why an interactor does something, or we can study the actions themselves. The latter is obviously much easier, at least if we restrict ourselves to observable actions. If we consider that an action is a product of the local context and the situation, and if we see an interaction as a set of actions extended over time that can have emergent properties, then actions do not seem so tangible and comprehensible any more.

What keeps the following systems from falling apart? A society? A rock? Internet?

At each level of detail there is much to learn, but as humans we should start our investigations at our own level. Here we have first-hand experience that makes it easier to understand principles without too much formalism. It is easier to apply knowledge from our own lives. Even though quarks are interesting, and are described by well thought out formal models, they are hard to relate to in daily life. The price to pay for studying ourselves is that the information hidden, because of the high level of abstraction, makes the interactors and interactions under study slightly magic.

With two actors, a shared space, and some means of communication we have the basic prerequisites for coordination. Managing the coordination is in itself an interaction, i.e. a meta-interaction.
This could for instance involve planning for the interaction, or specifying rules or infrastructure for communication. A door bell is a perfect example of a tool for coordination.

"England expects that every man will do his duty", 12 sets of flags, in total 31 flags.

Two antagonists will coordinate by competing, but more amiable actors will co-operate, or form a symbiotic, mutually beneficial relationship. Interaction adds properties to the system that go beyond the individual; it is both a source of power and a well of problems. Anything that joins could be the cause, or the medium, of a conflict; marriage and a common language are two examples. More about the co-x words in Chapters V.11 to V.15.

The main reason for interaction is a shortage of resources, necessitating coordination, and perhaps leading to co-operation or antagonism. One shortage could be a lack of skills, but there is an unlimited number of possible resources to fight over.

Definition: Coordination is the process of sharing access to tools, objects, space and time. It includes transfer of tools and objects. Adapted from [DP1].

[Figure: Resources. Resource space needed by A; resources needed by B.]

V.1 H-H Interaction, the reference

The problem for human-to-human interaction is sharing or exchanging information between two persons, or within a group of people, while sometimes excluding other people. Many times we as senders are forced to adapt to the receivers, and the receivers to us. How do you manage this in a highly dynamic, stochastic context?

Human-to-human interaction is the highest, most complex form of interaction! It has been around for quite a while, so the fundamentally social human animal has established many rules, and even hard-coded some behaviour. It is important to know about these constraints whenever you are interacting with other people, and even more so if you are developing systems to support H-H, H-I, or H-T interaction.
Wave your hands at sea, and you summon the coast guard. You place yourself last in the queue for coffee. You say "Good morning", "Good night", "See you tomorrow", "After you", "Please", …

Are we having an interaction together?

There are many levels and types of interactions, some of which we are conscious of, and some that we have to train ourselves to recognize and register. If one person in a group of people sitting around a table yawns, inevitably several others will do the same. As another example, try dilating your pupils at will! This could be quite useful, since dilated pupils of a woman have been found to cause the pupils of men's eyes to dilate by as much as 30 percent, indicating that there is an affect. It is not clear if the response also holds for the other sex, but it is worth a try.

Most H-H interaction in this book will be considered as symmetric interaction between equals.

A third example of H-H interaction is a mother talking to her child. There are four sound patterns that reappear in cultures all over the world [PG]:

- Encouraging, "Come to mommy", raising the tone.
- Rewarding, "Good girl", "Bravisima, bravisima", lowering the tone.
- Warning, "No, no, stop that, no", short staccato.
- Comforting, "Hush hush", soft.

So far, the best tools for H-H interaction are the spoken and the written language. An interesting theory is that spoken language was invented more for social reasons than for information transfer. Language gave a group of humanoids a common reference that could keep the group together even as the number of individuals in the group increased. In a small community each individual can get to know all of the other individuals, but as the size of the group increases, socialising without language takes too much time, and the group will starve or split up. In fact, most human knowledge is stored in a common culture, which is kept, and developed, by interaction and language.
Because of the complexity of H-H interaction and the adaptability of H, emergent behaviour, such as the creation of a language, is the norm rather than the exception. Language is also an example of an applied coordination technique. Other examples that have emerged are history books, timetables for buses, and university lectures.

We can use technology to enhance person-to-person communication, and consequently to further develop our culture. The telephone makes conversation independent of distance, and e-mail removes the time constraint from the interaction. Unfortunately technology also reduces the communication bandwidth. It is quite difficult to see gestures through a phone, and it is also difficult to hear the tone of voice in an e-mail. Is it possible to hear a smile over the telephone? But note that the limitations in technology are not fundamental. Certainly technology can provide a much higher bandwidth than any human can manage. According to one calculation a human accepting at full speed crunches 1 Gbit/s of sensory data, and such a bit rate is already available with today's technology.

"I marmaladed a slice of toast with something of a flourish and I don't suppose I have ever come much closer to saying 'Tra-la-la' as I did this morning. It is no secret that Bertram Wooster, though as glamorous as one could wish when night has fallen and the revels get under way, is seldom a ball of fire at the breakfast table. Confronted with the eggs and b. he tends to pick cautiously at them, not much bounce to the ounce. The reason for the improved outlook on the proteins and the carbohydrates was not far to seek. Jeeves was back." P. G. Wodehouse

"This 'telephone' has too many shortcomings to be seriously considered as a means of communications. The device is inherently of no value to us." Western Union internal memo, 1876

There are several different reasons for communicating. From the perspective of interaction we can list them as in the figure below [UB]:

[Figure V.1.1 Reasons for communicating: entertain, inform, coordinate, collaborate, co-operate.]

We entertain for fun, inform to let someone know, coordinate to synchronize or to level, collaborate to achieve a common objective, and finally we co-operate using shared resources, also with a common goal.

Evaluating any of the activities in the figure above is notoriously difficult in H-H interaction because we cannot read human minds. This fact also makes any H-H interaction potentially non-linear, e.g. slightly changing the context of the interaction does not necessarily mean a small change of the result. Since we cannot be certain about the goal of an interaction it is also difficult to know if the goal is met.

We structure interactions in many ways, depending on the circumstances. Planning, estimating time, repeating actions, and aligning constrained tasks come easy to us. However, with H-H interaction the results of repeated interactions are unpredictable. This is many times a result of the opaque human mind, but sometimes also of the sheer complexity of the context or the task, designed or natural. H-H interaction is at least predictable enough to improve efficiency when repeated. Creativity is the other side of the coin; it emerges along with complexity and unpredictability.

Below are some of the 16 essential interactions defined by Thom: Ending, Beginning, Being, Rejecting, Stirring.

Humans are social beings. This means that we immensely enjoy, and engage ourselves deeply in, chatting, gossiping, and discussing. We can participate in several simultaneous interactions, and for each interaction we have multiple information channels: some with high bandwidth, such as speech, and some with lower bandwidth, such as asking a 10-year-old to leave his teacher a message.
V.2 I-I Interaction, so far for efficient data transfer

For I-I interaction the problem is to exchange information supporting the intended functionality as efficiently as possible; aesthetics and satisfaction are not relevant. Information-information interaction is the most structured and formalised type of interaction, with rules that have to be rigorously specified down to the last information bit. The following section will discuss I-I interaction in general. Many more examples and details of interactions will be given in the later chapters.

There is no known limit either to the complexity or to the bit rate of I-I interaction, but currently the data transfer is constrained by the state of the art in data communication technology. Fundamental physical limits, such as lack of bandwidth and background noise levels, have not yet been reached. The complexity is hampered by a plethora of data representations that currently are not compatible, not general enough, and have limited information content. It seems that the problems are practical in nature, but whether I-I interaction can reach the complexity of H-H interaction is still not known. Should it be possible, it will however take many years of hard work. The behaviour of many systems might seem intelligent, even creative, but so far this is more a result of the enormous amount of information available, and the strange, foreign character of the processing.

Information resources are increasingly hooked up to the global network, and the information can be combined to create new information with emergent properties. All books ever written in the western world, such as the one you are reading, are only combinations of fewer than a hundred characters, and in any new book you will find fragments and ideas from other books. This certainly goes for this book too. Combining information from unemployment registers and salary registers creates information that suddenly sends people to prison.
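The register example can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the two registers, the person identifiers, the field names, and the rule that a month with both a benefit and a salary counts as a match. It is not a description of how any real authority matches its data, only of how two harmless data sets combine into information that neither holds alone.

```python
# Two hypothetical registers keyed by a made-up person identifier.
unemployment_register = {
    "p001": {"benefit_months": ["2004-01", "2004-02", "2004-03"]},
    "p002": {"benefit_months": ["2004-02"]},
}
salary_register = {
    "p001": {"salary_months": ["2004-03", "2004-04"]},
    "p003": {"salary_months": ["2004-01"]},
}


def match_registers(benefits, salaries):
    """Return {person_id: months} for months found in both registers."""
    overlaps = {}
    # Only identifiers present in both registers can produce a match.
    for person_id in benefits.keys() & salaries.keys():
        months = set(benefits[person_id]["benefit_months"]) & set(
            salaries[person_id]["salary_months"]
        )
        if months:
            overlaps[person_id] = sorted(months)
    return overlaps


overlaps = match_registers(unemployment_register, salary_register)
print(overlaps)
```

Neither register says anything about overlapping payments; that information exists only in the combination.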
Statistics of demographic indicators distribute resources; polls and stock market quotes are active interactors in politics and the economy. Statistics very much rule the day in this era, whether valid or invalid. Note that interactors of type information never forget, which means that the full history is always available to the interaction.

"The communication tail is wagging the processing dog"
Paul Saffo

For I-I interaction data access is the goal, and access rights are consequently important. Computer viruses roam the network trying to take over precious processing and memory resources. Hundreds of gigabytes on hard discs all over the world have a content that the owners of the computers would be shocked to know about. This accessibility of information reflects human society. Not everyone is allowed to have a peek at the data files of the NSA, but, at least in Sweden, personal income is available to everyone. Without explicit access restrictions all data in an I-I interaction is in theory available to all interactors, even data internal to the interactors.

Having the access rights to data is not the only problem though; even worse is that the information available to an information agent is fragmented in many ways. First, it is distributed, a problem that can be fixed by search and indexing, even though the enormous amount of information still presents a problem. Context to I-I adds even more information, and there are lots of contexts. The amount of data means that some views of the data will be favoured while others need to be hidden. The next problem is that information is stored in many different formats, and that different abstraction levels are used. Finally, to top the above, not enough meta-information and context is available to support interpretation of data. Altogether this means that information and information interactors must be purposely designed to be useful.
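As a minimal sketch of what "purposely designed" could mean, the example below attaches meta-information (a unit) and context (a source) to each value, so that data stored in different formats can be brought to a common form. The `Reading` type, the field names and the conversion table are all assumptions made for this illustration; a bare number without the unit could not be interpreted at all.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    value: float
    unit: str    # meta-information: how to interpret `value`
    source: str  # context: where the value came from


# Conversion factors to a common unit (metres); illustrative, not exhaustive.
TO_METRES = {"m": 1.0, "cm": 0.01, "ft": 0.3048}


def normalise(reading):
    """Return the same reading expressed in metres."""
    if reading.unit not in TO_METRES:
        # Without usable meta-information the data cannot be interpreted.
        raise ValueError(f"unknown unit: {reading.unit!r}")
    return Reading(reading.value * TO_METRES[reading.unit], "m", reading.source)


readings = [
    Reading(180, "cm", "register A"),
    Reading(6, "ft", "register B"),
]
normalised = [normalise(r) for r in readings]
for r in normalised:
    print(f"{r.source}: {r.value:.2f} {r.unit}")
```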
So far everything that happens emanates from human intentions. H is still the one designing the data structures, aligning processes, and pushing the buttons, for instance enforcing the law by deciding which databases to match. The increased connectivity means that each push will effect more, and do it faster. If demand on productivity continues to increase, more and more applications will eventually have to take the human out of the loop, and statistics will be in charge. This will favour information that is measurable, and that is actually measured. All other aspects are likely to be ignored.

Things and behaviours that will disappear or change when mobile technology matures:
- Newspaper
- Telephone book in paper
- Calendar in paper
- CD-player
- Dedicated TV-set
- Wallet
- Wrist watch

Interactions of the type I-I are manifested not only by exchanging messages. Ideas that interact is another example, put forward by the biologist Richard Dawkins. Analogous to the gene, the information carrier for the idea is called a meme. Memes combine into new memes that represent new ideas, such as the one by R. Dawkins. Memes do not mutate at random, as genes do in biological evolution, but by creative or purposeful adaptation fostered by humans. Some memes are potent; they survive, and spread like epidemic diseases. Others die, either because they were bad ideas, or because they showed up in the wrong place, or at the wrong time. Memes are interesting but will not be further discussed.

"Nature is also a 'meme bank', an idea factory. Vital postindustrial paradigms are hidden in every jungly ant hill."
Kevin Kelly

The Internet has autonomous subunits, high connectivity, and no centralised control. These features match characteristics in nature and society, which can provide many ideas, metaphors, and structures for models and real implementations as the Internet develops.
Current examples are the firewall, a stockade raised against malicious messages, and the Internet backbone where the bulk of the long distance messages travel. A third example is the use of protocols. I-I interaction really can use good ideas since it has some catching up to do. Speech appeared about 100,000 generations ago and the Internet only 1!

"Rose is a rose is a rose is a rose"
Gertrude Stein

1995 Birth of the WWW

V.3 H-I, H-T Interaction, joining forces

Human to thing, and human to information interaction are the most difficult interactions to implement with high bandwidth and efficiency. One reason for this is the incompatibility of the cognitive systems of the interactors. Both processing and perception are radically different. As a consequence adaptation must be exploited as much as possible. The current approach is to primarily adapt I/T to H by developing technology and using it intelligently. To what extent H, and society, can adapt to I/T over a longer time perspective is not known. Cars, traffic jams, TV and soap operas however prove that we are capable. Another problem for a designer of H-I or H-T interaction is that any interaction involving H will be complex, as aspects of human behaviour and society will affect the designer.

We have grouped the interactions H-T and H-I together since one without the other is not interesting, or simply impossible. Interactions involving a human always have a physical representation; telepathy is not yet common. So, although the intention is to interact with information, the means for doing so is a physical thing. One example is the mouse. Are we interacting with the cursor on the screen (H-I), or with the mouse (H-T)?

One way to understand the development of computer-based technology is as a result of computers reaching out into the human environment, or in the words of Jonathan Grudin, the computer is colonizing its environment [JG].
Starting out as a working tool for expert programmers, the computer is currently an indispensable, and many times invisible, support. The computer interface to the world is getting more advanced at an accelerating pace. But, as Grudin pointed out in the reference, the metaphor of the computer reaching out has a major problem. A human child learns about its own context. A computer on the other hand tries to understand the human world rather than its own. It is supposed to support humans and therefore needs to understand them rather than itself and its own environment. An illuminating example comes from the reference [SS], where the authors discuss how a user might formulate a command to change the light (the computer's reaction within parentheses):

- It is dark (So what?)
- I need more light by the stove (What is a stove?)
- Set up the lighting like it was yesterday (But, your wife is still asleep upstairs?)
- Light this room for Bob (Who the hell is Bob?)

As you can see from the examples there is much knowledge of the human world hidden in even a simple command.

Perceptually, technology is still not mature. It is however improving fast, and there are many possibilities for new and improved channels, which will support closer relationships between interactors. This will also mean that negative aspects of close relationships will surface, for instance gossip, possibilities for aggression, and personalised advertisements. As the interfaces between interactors improve, other contexts of the interaction will also affect it. Later chapters will give details and provide more examples of the above.

We are now facing the design of systems where networking and processing capacity are built into our environment and everyday appliances. Such systems will mediate information from context and other interactors.
They will give us access to the digital world and give the system information about us, which means that a designer will have to understand social issues as well as technological limitations. Together with the human user a ubiquitous system will be intelligent and adaptive, with a large potential for emergent behaviour. In short many, if not most, aspects of H-H interaction will be combined with the possibilities and limitations of I-I interaction. This means unpredictability, difficulty to observe internal states, and limited access to goals, but it also implies constant availability of interaction history, and increased interaction speed and frequency, at least as an option.

The topic of this chapter is also the focus of the research disciplines human-computer interaction (HCI) and man-machine interface (MMI). A definition from the Curricula for Human-Computer Interaction written by the ACM Special Interest Group on Computer-Human Interaction (SIGCHI) Curriculum Development Group (puh!) is: "Human-computer interaction is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them."

V.3.1 Ubiquitous computing

Technology will soon allow for ubiquitous computing. The idea is that computers are networked and numerous, executing everywhere in the physical environment, and that this can be exploited to make computing invisible to the user. The computerised device is in other words integrated into, and spread out over, the background environment. Combined with sensors, the resulting pervasive computing can in principle continuously monitor what a user does, record it, and react to commands given anytime, anywhere.

User interfaces in intelligent ubiquitous computing environments have to accept the lack of a single focal point [SS]. Related events happen in parallel, at different locations.
This further aggravates the problem of how to interact with the user. Another issue is when several co-located users simultaneously interact with an interface, perhaps with conflicting goals. Furthermore, the context of use will change as the user moves around and this could mean that:

- Users change interaction devices, e.g. to a smaller screen with new usability constraints.
- Users are distracted by social and other changing aspects of the environment.

There is no longer a single most important task, and maybe not even a single most important user.

"The most profound technologies are those that disappear."
Mark Weiser

The goal of a next generation supporting ubiquitous computing could well be a system flexible enough to allow for the context, including humans, to modify the system itself. Either a designer provides predefined services, or he suggests a platform where the users can augment the environment, and perhaps even develop services by themselves. Top down design is then replaced by an interactive, iterative design development; compare to how television is now desperately adapting to the Internet. In principle a reflective system can adapt to anything by modifying itself, but in practice there are numerous constraints. No technology is indefinitely malleable. The designer, the material, the original intention with the technology, and much more constrain the possibilities to modify a system and its technology in a given situation. It is hard work writing an essay on a pocket calculator.

"People have intentions, emotions, dislikes, phobias, perceptions, interpretations (and misinterpretations) and many other motivators that drive their behaviour in unpredictable ways that are impossible to even model accurately, let alone instrument or infer."
Victoria Bellotti

A designer still needs to put herself in the position of the participants of a designed system that is distributed and adapted to local circumstances. Not a simple task.
We as humans have problems with conscious parallel activities; we are highly sequential thinkers. We also have problems following dynamic processes; we prefer to freeze them and study them one point in time at a time.

Because of the above, ubiquitous computing challenges the prevailing interaction styles [SS2]. As the physical environment becomes computerized, anything in the environment and any combination of things are potential interaction devices, either indirectly, as the environment tracks the thing, or directly, using functionality built into the thing, including networking. Even the number and type of interaction devices might change throughout a task. We start an interaction in the car, continue while walking into the office building, and end it in the coffee room.

Phone home!

The mouse, the keyboard, and the display are special cases and not always available for interaction. Input through other means is maybe not that difficult to imagine and realise, but what about output? If there is no output device at hand, how can the user be contacted? Speech is one alternative, but it is not always the best choice in a crowded room. Room lighting is another possibility; a purposeful blinking of the light is a good signal. It seems that there is a need for more options for output, preferably continuous in time.

Privacy will be an issue as information looking for users roams screens, and loudspeakers everywhere potentially spell out secrets. If a phone also transmits context, is it obvious that bystanders want to participate in the context of a videoconference? Law prohibits camera surveillance in public places, at least in Sweden. If the functionality of a device becomes less clear, because of the possibilities to build functionality into everything, its appearance will be even more important, at least from a usability point of view.

Ubiquitous systems are, on the other hand, very complex systems and this implies that they cannot be built overnight.
In other words we will all have a lot of time to get used to the new systems, and some will lose a lot of money from mistakes made during this adaptation. We as designers will certainly learn a lot about how to make technology publicly available for general use such that it does not disrupt existing social models and norms.

V.4 I-T Interaction, access to reality at the speed of light

A thing moving around in a physical environment faces many of the same interaction tasks as humans: identification, navigation, choice, reading, writing, and manipulation. The sensors are different though, as well as the characteristics and capabilities for communication and processing. Although humans can absorb and process huge amounts of sensory input, the rate for digital data is low. A computer on the other hand has problems interpreting even the simplest of scenic views, but easily accepts, manipulates, and stores megabits of data per second.

Furthermore, for digital data the thing needs fewer intermediary transformations and interpretations. This means that most of the technology that people need for information management is not needed for T and I! A thing does for instance not need a display to inspect a digital image.

Potentially I-T is a powerful interaction. It has contact with both the physical and the virtual environment, and there is a built-in evolution where intelligent things could provide feedback to information agents in the virtual environment that could further improve the things. The evolution is fuelled by the fact that I-T interaction can keep a detailed track of what has happened, something that the designer can later use for the next improved generation. With no H directly involved the complexity of the interaction is lower, but even without H the real physical world is unpredictable, and certainly not a static information source.
Currently, for I-T interaction, we cannot use curiosity, hunger, or sexual instincts as motivations to get something done. On the other hand we do not have to, since I and T will not refuse, lose confidence, or be afraid.

Context awareness is obviously immensely important, but the channels to the physical reality are so far narrow and isolated. The mobile phone for instance estimates its position using a radio signal only. One problem is that a single channel will be sensitive to noise. Failure of the channel will terminate the service. Compare this to how we use vision, hearing, and tactile information to constantly orient ourselves. Finding the current position is a well-researched problem area, but despite this fact only a few commercial applications use the technology.

No problem!

The I-T system can be seen as an organism where things explore the physical reality and information agents the virtual world. Such a system communicates at close to the speed of light and is inherently distributed, which makes controlling it a problem. Its structure will probably start out as a hub-based system where control is centralised, but as the capability and capacity of the nodes increase, control will be more and more distributed. Coordination will be split up over the nodes, and command-based behaviour will in the long run give in to co-operation and negotiation. Design according to standards, and for upgradeability, will be immensely important.

Goals are designed into the thing and are currently specified at a low abstraction level. Goals such as "survive" or "do good" are well out of reach. Even a modest goal such as "dust the floor" is very difficult to attain. Goals on a low level are on the other hand easier for a designer to evaluate and verify. If the designer so chooses, the goals and other internal information can be made available to the virtual environment.
The situation will then be equivalent to the I-I interaction, with the same problems and possibilities as we previously discussed.

Since things populate the human environment, aesthetics is important, as well as ample functionality and physical properties such as size and weight.

Currently a typical interaction sequence involves a human in some part of the sequence (X-H-Y), but next generation interactive applications will increasingly exchange H with I or T (X-I-Y, X-T-Y). The question is where, and to what extent, this can be done, and it is a billion dollar question. Already we have a pen that translates English text to Swedish speech, and a pen that reads the name of a TV-show from the paper and automatically programs the video. In fact, anything that happens to a thing and that is registered manipulates information. One example is that the towing company moves your incorrectly parked car, and at the same time cash will disappear from your bank account.

V.5 T-T Interaction, forces matter

The following chapter is a short one. This is not because the topic is small; on the contrary it is huge. The main reason is instead that many of the interesting problems involved in T-T interaction have already been discussed in the previous chapters. Another reason is that most interactions of this type involve participants that are not very interesting from the rather high-level perspective on interaction in this book. In other words, atoms, molecules, neurons, cogwheels, and gearboxes are considered components rather than autonomous participants in interactions.

Exploiting interactions between things is nothing new. The hammer and nail, axe and tree, pen and paper, head and pillow are accepted and important interactions. What is new is that things become more and more intelligent, and can provide us with more services where humans need not be directly involved in the interaction.
One example is that the blinds of your house could close and open in synchrony with the air conditioning system. Another example is that the calendar on your mobile phone coordinates with your alarm clock at home.

Most of the interactions of type T-T are simple interactions and considered as being part of the context, i.e. of the physical environment. Position, speed, temperature, and all kinds of forces are interactors of this kind. The possible interactions are always constrained by the laws of the universe and are studied in physics, a science devoted to interaction. Physics sets the fundamental laws even though the sum of the parts often emerges as something more than the parts themselves. It is of course impossible to dream up a car, or a parking lot, using only basic physics, even though their atomic relationships can be described in principle.

The models we use are sometimes discrete and sometimes continuous. The same is true also for the models we typically use for our day-to-day reasoning. We say that someone's length is . meter, but this is only an approximation of the real length, which is a rather large number of piled atoms.

One way to group physical interactions is as gravitation, nuclear, and electromagnetic interaction. Nuclear interaction involves forces in the very short, nuclear range. Gravitation on the other hand is a very long-range weak force that is difficult to use other than indirectly as in friction, e.g. where the rubber meets the road, or in pendulums. What is left is the electromagnetic interaction, which is responsible for most types of physical interactions.

Interaction in physics is interaction between particles mediated by forces, exemplified by the electromagnetic force between two electrons. A force is in itself an exchange of energy through energy quanta, modelled either as particles, or as energy fields. For an electromagnetic force the energy quantum is called a photon.
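To give a feel for how small one such energy quantum is, the sketch below uses the standard relation E = h·c/λ for the energy of a photon with wavelength λ. The constants are the defined SI values; the choice of green light at 550 nm is an assumption made only for the example, as are the function names.

```python
PLANCK_H = 6.62607015e-34  # Planck constant in J*s (defined SI value)
LIGHT_C = 299_792_458.0    # speed of light in vacuum in m/s (defined SI value)


def photon_energy_joule(wavelength_m):
    """Energy of one photon, E = h * c / wavelength."""
    return PLANCK_H * LIGHT_C / wavelength_m


def photons_per_joule(wavelength_m):
    """Number of photons of this wavelength that together carry 1 J."""
    return 1.0 / photon_energy_joule(wavelength_m)


green = 550e-9  # assumed example wavelength: green light, 550 nm
energy = photon_energy_joule(green)
print(f"{energy:.2e} J per photon")
print(f"{photons_per_joule(green):.2e} photons per joule")
```

For visible light a single photon carries a few times 10^-19 J, so one joule corresponds to a number of photons of the order of 10^18.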
If we zoom in, and increase the level of detail, we can see many examples of electromagnetic interaction. All of the effects of magnetic and electrical fields, optics of course, but also inter-atomic and molecular interactions are examples of electromagnetic interactions. The electromagnetic force is for instance active for both sound and light, although at different levels. Light is itself photons, i.e. packetised pure interaction energy! A sound is transported by atoms and molecules interacting by exchanging photons. Similarly, all other mechanical interaction, such as hammering a nail, banging your head into the wall, and spinning the tires of your Porsche, are higher order incarnations of the electromagnetic force.

"The ships hung in the air, the exact same way that bricks don't"
Douglas Adams

"In science, there is only physics; all the rest is stamp collecting."
Ernest Rutherford, physicist and Nobel Laureate (ironically, he was awarded the Nobel Prize in Chemistry)

"May the force be with you."
Star Wars

"Gravity is a habit that is hard to shake off."
Terry Pratchett

The intensity of the loudest tolerable sound, 1 W/m^2, can be compared to the 200 W consumed by a PC. The softest sound possible to perceive is on the other hand 10^-12 W/m^2. Quite a difference in magnitude. The energy 1 Joule is the same energy as 1 Ws, and this is about 19 orders of magnitude larger (10^19) than the energy needed to excite an electron in a photo detector in a digital camera. Recall from Chapter IV.19 that an adult consumes roughly 2000 kcal/day, which is equivalent to about 2325 Wh.

Figure: Tool with and without embedded engine.
Figure: Plant or Ice? Bo Tannfors, TFE

V.6 Why do we interact?

Interaction has to be useful, otherwise why bother? A fundamental and important result of the highly dynamic concept of interaction is stability! This might seem a paradox, but interaction allows for the emergence of a stable configuration of a system or organism.
Stability and configuration of subcomponents are in other words both a result of adaptation, and a prerequisite for next level stability and adaptation. We can see it happening for society, and for the cell. Some prerequisites are communication among the components, processing, and some means of representing the adaptations found, i.e. memory. Another example is the brain complemented by nerve pathways. The given prerequisites are not enough however; sensors and actuators are also necessary to connect to what is happening, understand it, and make necessary external adaptations. Finally, to support all of this, energy is needed. A tree, a car, or a computer are easily seen as interacting systems, and the above is also a necessary prerequisite for our self, where the body and our brain build a relatively stable reference, and a partially consciously controlled action-reaction interface.

We can restate the above and say that interactions are useful because humans, and in the future perhaps also things, cannot develop properly without interaction [JF]. We need interaction to stay in touch with the world in order to gain knowledge. Interaction is how we as humans realize our existence in our world, and how we interpret and experience reality.

"On top of the list [of characteristics of the self] I placed stability. In all kinds of self we can consider one notion always commands center stage: The notion of a bounded individual that changes ever so gently across time but, somehow, seems to stay the same."
Antonio Damasio

"Studies of flow have demonstrated repeatedly that more than anything else, the quality of life depends on two factors: how we experience work and our relation to people."
Mihaly Csikszentmihalyi

Could it be that interaction is, or should be considered, a good in itself, even disregarding what the interaction is about, or the character of interaction?
[LEJ]

Through interaction we can verify that communication has occurred and perceive the effects of the communication. This is of utmost importance in almost any situation and means that interactions increase survival capacity. Not only is this true for the individual, but also for collectives, organisations, and families. Two typical examples: if predators hunt in groups they can succeed in finding and killing larger prey. The prey on the other hand can hide in the group, or build a better defence by working together with the group.

Figure V.6.1 Interaction.

This brings us to our next motivation for interaction, and that is to improve performance. It can be improved both quantitatively, e.g. it takes two men half the time of one man to clean a bathroom, and qualitatively, e.g. two men can carry a piano but one man cannot. Maybe he could manage half a piano, but that is not very useful. Performance is measured in many ways, quarterly earnings and examination points, just to name two.

Interactions are also necessary for setting up social organisations. It is through interactions that organisations are bound together and influence each other, and as a result, social entities and new functionalities emerge. Social groups are thus both the results of interaction, and where the interaction takes place.

Last, but not least, interaction is mandatory for conflict resolution. It is a fact of life that there is a shortage of resources. Autonomy in things and humans will then, inevitably, lead to conflicts that have to be resolved one way or the other, for instance by regulation, arbitration, negotiation, force, destruction, conflict avoidance, or prioritisation. The success or failure of conflict resolution can be measured in terms of the number of dead and injured, the number of individuals involved in a conflict, or the amount of the fine. Situations of conflict are both the effect and cause of interactions.
They have their origins in a lack of resources, and they call for supplementary interactions so that a way out of conflict can be found.
Jacques Ferber [JF]

V.6.1 Why use others for interaction?

One assumption made in this text is that interaction, and more specifically well designed interaction technology, will enhance the Quality of Life (QOL). QOL relates to either human individuals, or to human societies. You install equipment to help find the car if it is stolen, but this is not because the car would be hurt if you did not care. You do it because you want your car back, and your car is not supposed to care. Neither is any other thing currently in use. A dog cares, but is not discussed in this book. So, at least for now, the ultimate goal of interaction is concerned with a person or a society.

A problem with QOL from an engineering point of view is that it is difficult to define and measure [RV1]. To start with, life and quality are two tricky concepts. Do we for instance mean life of an individual or of a group? Quality could refer to objective or subjective measures. Two persons under the same objective, i.e. same observable external, circumstances could subjectively report different levels of quality of life.

"Products or services that reduce the time and amount of tasks needed to be performed … increase our enjoyment, entertain or reduce tension, give us information or challenges to improve our knowledge or our well-being"
Quality of life, Philips

What we do know is that providing QOL inevitably means hard work, an effort that must be rewarded. In other words, someone has to get rich in the process, or else nothing will happen.

One view of human behaviour is that it is either reactive or reflective. Most of the time we just react to events, but sometimes we stop and think over the how, what, why and when in life [DAN]. Reactive behaviour is the form of experiential behaviour highly appreciated by Hollywood soap opera producers, and owners of ice hockey teams. Currently much of the development of technology is geared toward this side of human behaviour. One example is broadcast technology for mass distribution. It supports content that has to be mainstream to survive. Better interaction technology means that entertainment could be made adaptive and individualised, and in this way interaction technology could increase enjoyment and entertain us more efficiently. Good interaction is a basis of competitive advantage.

"...[activities that give QOL are those that]... have built in goals, feedback, rules and challenges, all of which encourage one to become involved in one's work, to concentrate and lose oneself in it"
Mihaly Csikszentmihalyi

"We are now shifting from the information society to the interaction society"
M. Wiberg, Umeå University

Interaction technology also has the power to encourage and reinforce reflective behaviour. Adaptive applications that adjust their reflective level to the user could give suitable intellectual challenges for everyone, not only for the mainstream. Some of the more demanding computer games are interesting examples [DAN]. Without any previous training, and with only a few clues given at the right moments, an eager player will solve level 1 after only a couple of tentative attempts. If it can be done for games, then it can be done for other applications, so interaction technology can make thinking interesting and worthwhile, and maybe assist in the production of useful results from the thought process. This is a real challenge! To help us think deeper and faster. Machines currently enhance our muscles, and interactive tools for thinking could do the same for our brain and memory.

"We are now shifting from the interaction society to the relation society"
H. Gulliksson, Umeå University
We do not believe that human thinking will be replaced for another couple of years :), but why not complement what humans are good at, i.e. creative thinking, with tools for improved logical reasoning, and for exploring alternatives? We could ask questions such as "What if?", or give commands like "Explore that path". One first example of such a tool is the calculator.

Meaning of life?

There is of course a risk with this scenario, as with all possibilities. Technology used for replacing intelligence could reduce its users to reactive machines. It might, on the other hand, allow humans to use intuition, and associative abilities, i.e. human specialities, without bothering about details.

Continuing this line of thinking, interaction technology can enable and support previously impossible tasks, such as functional alarm systems for the elderly, or educational games for kids. We already depend on technology to solve many problems. Imagine calculating tomorrow's weather by hand! To the category of previously impossible tasks we can add technology that makes invisible processes audible, visible, or available as tactile information. Oven temperature, a driving mileage meter, and a share index are some examples.

Another objective for new interactive tools is to simplify or eliminate tasks that we are not very interested in. We could transform them into reactive tasks, or eliminate them altogether, allowing us to focus on the really interesting problems. Consider, as an example, the many possible sources for messages, each with its own user interface. Unified messaging is a technology that simplifies by providing one input basket that is used for all mails, phone calls, and voice messages. This way we need only learn one interface, instead of one interface per message type.
Another example of focusing on the real problem is e-learning, where interactive tools strip off boring mathematical manipulations, and let us explore the basic principle by, for instance, graphically simulating gravity. Simplification is however not always a good thing; we risk eliminating meaningful tasks as well. The alarm system for the elderly should not also replace human contact.

Interaction technology is also a new basis for social interaction. Videoconferencing and e-mail are examples of interactions where technology helps us surpass limitations of time and space. Meetings among family members scattered all over the world are no longer considered science fiction. Presence in a virtual gym is another example. It could motivate you by verifying that others also work out. You could compare improvements, conform to common practice when everyone exercises, and learn how to do it by studying the more experienced [BF2].

As a bonus our new interactive systems can be used to study the mechanisms behind interactions, for instance how people are influenced and motivated. The knowledge gained can drive mathematical and computational models of interactions, which in turn can be used to synthesise even better interactive systems, and at the same time help us to better understand human interaction.

Most interaction in nature is severely limited in time and space. Even if you shout as loud as you can you will not be heard 10 kilometres away, and direct communication face-to-face is real-time and impossible to delay. Talking while strolling along the river is possible only if your partner walks in the same direction and with the same speed. Technology can eliminate such limitations in time and space. They are no longer fundamental problems, only engineering problems. Telephone and e-mail are our first attempts at solutions.

Furthermore, technology opens up new communication channels. One example is MRI (Magnetic Resonance Imaging) that lets us inspect the workings of the brain.
Another example is using GPS to keep track of the wolves in the Swedish wilderness. To the visually impaired, and people with hearing loss, technology is already indispensable.

The examples above are found at a physical level of interaction, but technology can also give support at a higher, semantic level. It is for instance possible to synthesize a video of a smiling face, and there are experimental systems that can distinguish between a smiling and an angry face. At the aesthetic level technology can currently only help by providing access to numerous examples from which we can learn new ways to express ourselves.

[Illustration: African “talking drum”]

Risks with information technology: Anxiety, Alienation, Information poor minority, Complexity and speed, Technology dependency, Invasion of privacy [BS1]

Another motivation for tools is that they are the only way that we can access the worlds of things and information. Try to speak kindly to an ordinary PC and persuade it to print a document! When information and things need to access each other, excluding technology is not even an option.

Behind the development of technology there is a deeper trend, or rather two complementary trends [UCIT]. The first trend is that we humans are moving into virtual reality. More and more of our interactions are taking place in virtual reality where space, and sometimes even time, are not important. When you write an e-mail you can write it at home, or at work; the mail system does not really care where you do it. The letter is written into a virtual message space where the reader accesses it, from anywhere in the world, either two minutes after it is sent, or two days later. It will not disappear if it is not read. The trend of moving into a virtual reality is something we already know a lot about. The Internet has prepared us for years.

We shape our buildings and they shape us. Winston Churchill

The second trend is that virtual reality is entering real reality.
The things that surround us are slowly evolving. They are transforming, like the Transformers kids used to play with. Dead things are programmed into things with identity and knowledge. How about an answering machine that expands physically, like a balloon, whenever a new message arrives, or an intelligent call service at the tax office where you can request paper forms, with no human service required?

The two trends complement each other, and in the long run we will not only live as real persons in a real world with a virtual image in the virtual world, but also as virtuals in a real environment (a bit speculative), and as real persons in a virtual environment (really speculative).

A word of caution is needed here. Living in two worlds gives new possibilities, but it does not come for free. We have to get used to new ways of doing things, e.g. navigating in virtual 3D space, and have to learn new concepts, such as avatars and hypertext. Also, society does not change in a day. New habits and social patterns take generations to establish, and do not always mean improvements for all.

To conclude, all interactions in one way or another end up fulfilling needs of people, or society. The main reason for this is that ordinary people are the participants with wallets.

V.7 Context, it is everything else

Whenever interaction is taking place there is an extremely important shared context, i.e. a shared system environment, to consider. It provides the receiver with a reference for the information, and helps to interpret any messages. If you hear a tiger roaring in the living room you will not be too frightened. You assume that the sound comes from a television program.

Let us start with the first definition in the margin [AD2]. The definition is very broad; listing any relevant information could keep an ambitious designer busy for quite a while. It is on the other hand narrow, since context is limited to interactions between a user and an application.
A second definition is given below the first one in the margin. Admittedly this definition is even broader than the previous one, but we think that it better reflects that focus should be on the interaction, and that context is something that affects this interaction.

Definition: Context is any information that can be used to characterize the situation of entities (i.e., a person, place, or object) that are considered relevant to the interaction between a user and an application, including the user and the application themselves. [AD2]

Second definition: Context is any information relevant to an interaction between two interactors (i.e. a human, thing or information), including the interactors.

Synonyms to context: Circumstance, situation, phase, position, posture, attitude, place, point, terms, regime, footing, standing, status, occasion, surroundings, environment, location. [AS]

The context has many aspects: situation, time, physical, virtual, technological (battery, screen size), computational (CPU, memory and network capacity), social environment, activity, self, user, the application used, and a lot of others. The aspects are partly overlapping, and for each interaction, i.e. each context's context, some of them are more important than others.

Figure V.7.1 Example of context classification [KL]. [Labels in the figure: Self, Environments, Activity.]

Self, the cognitive state of a human, or the internal state of the device, is one important context, see figure V.7.1. If you are angry while driving your car you might speed up, and take a chance rather than wait. The situation, or circumstance, describes the activity in which the interactor is embedded. An angry bear attacking is an interesting situation; a more common one is driving a car. An application involves a task, i.e. something to do, as does driving a car, but the application is embedded in a computerised tool created for a specific purpose.
Using Microsoft Word® is an application based context, but writing a book is situation based. Activity as a context overlaps application and situation, but focuses on the task that the user is performing, rather than the circumstances in which it is performed, or the application used to perform it. One example is killing the angry bear.

Context can be modelled at different abstraction levels [JL]. Here we suggest three levels: physical, perceptual and cognitive. At the lowest, physical, or sensory level numeric values are collected. Thermometer readings, time, positions, or pixel information in an image, are extracted using sensors. Often, the interesting event is something that differs from the normal, and to find the unusual we have to know about the usual. This means that we have to collect and maintain background information, for instance the image of a wall seen by a surveillance camera in a house. Things moving against the background are easily filtered out.

At the next level sensory information is processed into symbolic observables. We call this level the perceptual level. Interactors and objects as well as behaviours and relations are identified and characterised. The information at this level is independent of how the sensory level collected the information. Perceptual information can itself be represented in different ways and at different levels of detail. A picture of a pink house (a sensation) can at the perceptual level be represented as the text string "house", or as a collection of features of the house such as the number of windows and its colour, or in a countless number of other ways.

The cognitive abstraction level is the third level, and here we interpret the symbolic information. The number of possible perspectives and intentions is even larger than the number of perceptions. Conclusions are drawn from a chosen point of view, and with some intention for the use of the result.
By combining information from a thermometer and a photograph of a view from a window we can determine the weather situation. If our intention is to go outside we can use the context to decide which clothes to wear.

A zebra is easily found on a checkerboard.

For all of the three levels we can use previously stored information to verify our findings. We can also use information from other systems, such as suggestions from human users. Aesthetics is one such collective opinion closely related to culture, which means that knowledge about context is important also for a satisfying experience.

The process of understanding context is simplified by clues provided by new and more advanced sensors. This is the good news. The first piece of bad news is that the complexity quickly increases with the number of sensors, i.e. the number of possible interpretations escalates. Building a context in real life bottom-up from sensory information will give us (too) many possible interpretations of the data. If we instead try to describe the world top-down we get an explosion of the number of possible relationships among our chosen aspects of the world. We do not have access to enough details to sort out the meaning of all of the aspects. It is like trying to find the Eiffel tower in Paris without knowing how to make sense of street signs. We have to use context to understand our sensory information, and need sensory information to understand context.

The second piece of bad news is that technology and its use make the problem even worse by transforming the context in unpredictable ways. Your small office will for instance serve as a meeting room when you have a computer supported conference, even if you are physically alone in the room [LEJ].
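The three abstraction levels suggested earlier (physical, perceptual, cognitive) can be made concrete with a small sketch built on the thermometer and window-view example. The Python code below is only an illustration under our own assumptions: the class names, the temperature and brightness thresholds, and the clothing advice are invented for the example and are not taken from [JL]. Numeric sensor values (physical level) are turned into symbolic observables (perceptual level), which are then interpreted for a given intention (cognitive level).

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class SensorReadings:
    """Physical level: raw numeric values from sensors."""
    temperature_c: float
    sky_pixels: List[int]  # grey levels 0..255 sampled from a photo of the view


@dataclass
class Observables:
    """Perceptual level: symbols that no longer depend on the sensor used."""
    temperature: str  # "freezing", "cold", "mild" or "warm"
    sky: str          # "dark", "overcast" or "bright"


def perceive(raw: SensorReadings) -> Observables:
    """Turn numbers into symbolic observables (all thresholds are assumptions)."""
    if raw.temperature_c < 0:
        temperature = "freezing"
    elif raw.temperature_c < 10:
        temperature = "cold"
    elif raw.temperature_c < 20:
        temperature = "mild"
    else:
        temperature = "warm"

    brightness = mean(raw.sky_pixels)
    if brightness < 60:
        sky = "dark"
    elif brightness < 160:
        sky = "overcast"
    else:
        sky = "bright"
    return Observables(temperature, sky)


def interpret(obs: Observables, intention: str) -> str:
    """Cognitive level: draw a conclusion for one particular intention."""
    if intention != "go outside":
        return "no advice"
    if obs.temperature == "freezing":
        return "winter coat"
    if obs.temperature == "cold":
        return "jacket"
    if obs.sky != "bright":
        return "light jacket and umbrella"
    return "t-shirt"
```

Calling `interpret(perceive(SensorReadings(-17.0, [200, 210, 220])), "go outside")` gives the advice "winter coat". The same observables could be interpreted quite differently for another intention, which is the point of keeping the perceptual and cognitive levels apart.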
[Margin figure: hen (contextual overview) and egg (sensory info).]

Implicit interactions are those where the systems in the environment get input from an action performed by an interactor, even though that was not the interactor's intention. One example is leaving the car keys on the table where the environment can keep track of them. Another example from [AS3] is a garbage bin that scans the bar codes of products and registers the information. The stored information can be useful to the system, but the action to deposit the garbage was done without any intent of providing the system with information. Explicit interactions are all other actions, aimed at exercising the system. Implicit interaction gives a new meaning to the concept of a user. It is up to the system to decide who, or even what, a user is.

The context is something that is shared among interactors, but this does not mean that it is interpreted the same way by all of them. Cold weather is not the same for all people. Also, different interactors might choose different representations for the same feature. What one interactor addresses as Lummerstigen 12, another refers to as close to the university. The problem of interpreting representations especially haunts distributed applications where an interactor has to reason about the relation between its own and the remote, perhaps mobile, interactors' representations of context.

Perhaps love is like a resting place / A shelter from the storm / It exists to give you comfort / It is there to keep you warm. John Denver

V.7.1 Use of context

Advances in technology make it possible to quickly change contexts, and also to combine contexts in new ways. One example is a mobile phone that displays information about the lecture you are currently missing, both its content, and when it ends. The next moment it plays Tetris. Possible uses of context will be classified in the following.

Using context as additional input is the first possibility, e.g.
it is -17 °C outside, too cold for you to go outside. The context here serves as a resource to the interaction. An interactor's goal is for instance a resource from the context of the type self. Knowledge about plans, and of effects of actions, are other important inputs to the interaction.

Context can also be used to modify input. If you are in Sweden you expect to hear Swedish words and will try to interpret any muttering as Swedish.

Figure V.7.2 Context modifies input. [Labels in the figure: System, Input, Output, Context.]

Another way to use context is for feedback. This is how web browsing uses context to help you select the next link. You find the link when you scan the current page.

Context can also be used as a receiver of output. By painting your house and buying new clothes you express yourself to the context.

Our last example is to use context as action trigger. Information can be bound to a site and accessed at the right position using a computerised tourist guide [JG]. A less virtual example is a stop sign.

Context is important and will be discussed many times in this book. But, if it is so important, how come it is not used more by technology? Some of the difficulties for H-T interaction come from the following properties [AD].

New sensors are needed that have to be integrated into the current infrastructure, i.e. the keyboard and the mouse are not enough. Further, information from sensors needs to be combined and abstracted to be of any use. One example is that coupling a position with a temperature reading at that position enhances the context information value.

Physical context is local in space, and for mobile interactors this can dramatically change the situation. Heavy rain when putting for a birdie on the ninth green is one example.

Physical context is also dynamic, i.e. local in time, which is a problem for both stationary and mobile interactors.
Two contextual situations may look similar but could differ dramatically due to internal states of the interactors, changing objectives, or interaction history.

A human could, at any instant, change the current goal, on almost any grounds. This makes it difficult, sometimes even impossible, to set up predefined rules for how a system should react. A thing has to learn a lot to cope with these situations!

If an application, or more generally any interaction, makes use of context it is said to be context-aware. One example is the Fasten Seatbelt warning in a car. As long as the seat belt is not fastened, and the car detects that someone sits in the driver's seat, it gives a warning forcing the driver to adapt. Other interactions with context are less persuasive, such as using a computerised address book. Here the owner has to manually specify and request the information about the addressee.

An interaction is context-aware if it provides relevant information and/or services to the interactors, where relevancy depends on the interactor's tasks. Adapted from [AD]

An interactor, or an application, in a context-aware interaction poses the following questions to determine why a situation is occurring [AD]: Who is involved where, when, doing what, how? The depth and width of the answers determine the quality of the internal model of the world that can be built. In current systems location, identity, time, and activity are important answers, and we will elaborate on how to get and use them later in this chapter.

I have never had any faith in the future, but I think I will have. Blandaren

V.7.2 Real, Virtual and Augmented reality

Virtual reality (VR) is the simulation of a real or imagined environment, and it will provide the ultimate environment where everything is possible! When we get a taste for the immense amount of virtual information, and the physical/virtual interactions available, we will not let go of them.
Throw out your television (no cheating with computers), and you will experience a small foretaste of how you will not want to feel.

Virtual reality can enhance the interactor experience by immersion, agency, and transformation [JHM]. Immersion is the level of presence we feel when experiencing a story, or some other course of events. Deep immersion intensifies the experience and reduces the mental effort to enter it. A well-designed user interface based on gestures and natural language should give deeper immersion compared to a cryptic command language. This is certainly true for the first time user, but maybe not for an expert user? More detail also gives deeper immersion, but at the same time implies more advanced technology. Other factors that affect immersion depth are [GR]:

• Number of senses involved.
• Whether multiple actions are possible in parallel.
• The dynamics of the environment and to what extent it interacts with other environments.
• Uncertainty, which forces us to actively strive to make sense of the environment. A completely random environment does not affect us. Familiarity with the environment reduces uncertainty, but copying reality is difficult.

High tech is however not a necessary prerequisite for immersion. Reading an interesting book can give the same effect. We feel the yearning of the characters; we can hear what they say in our heads. The first moving picture event in Paris 1895 showed an approaching train and is reported to have sent people screaming out of the cinema. Today a five-year-old child would not have bothered to look twice. The limits of human adaptability are not clear.

In its present form, equipment like television or film does not serve communication but prevents it. It allows no reciprocal action between transmitter and receiver; technically speaking, it reduces feedback to the lowest point compatible with the system. Enzensberger

What does it mean to write “Hello World” in a ubiquitous computing environment?
Tom Kindberg, Hewlett Packard

Why has cyberspace replaced outer space as the most important context of the future? Interaction?

Figure V.7.3 Image from the first moving picture shown in Paris 1895

Agency is the satisfying power to take meaningful action, and see the results of our decisions and choices [JHM]. In computerised storytelling there are many opportunities for agency. Both the hero and the villain can be influenced or controlled, and we can navigate, explore, and solve problems.

Transformation is the intriguing possibility in virtual reality to change our appearance, and our behaviour. Red, grey, or even bright blue hair colours are possible options. Man or woman, child or dog is up to you.

Advanced environments for immersed virtual reality experiences are VR-caves and VR-glasses for video, and headphones for audio. A VR-cave is a cubicle where video is projected onto the walls creating an impressive, but rather expensive, virtual environment. The equipment can provide experiences that can be quite overwhelming, even leading to attacks of nausea.

A less immersive version of virtual reality is augmented reality where technology, usually semi-transparent glasses, shows additional information complementing ordinary reality. One example is using glasses to display the service manual when both hands are already occupied repairing an airplane motor. Another application is shown in figure V.7.4a below, where the view of reality through the VR-glasses is augmented by the path to follow [TH]. A flag in the view indicates that interesting information can be found there. Through VR-glasses even a user's hand can be transformed into a display, see illustration in figure V.7.4b. If a hand appears in sight, graphics could be mapped over it [LC]. The body is something that every user brings along.

Figure V.7.4 a) Augmented view of reality b) Hand serving as display.
H-T will eventually be H-I because I (VR) provides much better service than T.

How can you change colour of your telephone in the real world? How do you accomplish it in the virtual world? Which is simpler and cheaper to implement?

We can classify spatial technologies in many different ways. One alternative described in [SB] identifies two dimensions, transportation and artificiality. Transportation describes the level of presence of the physical body in the application of a technology. Artificiality concerns to what extent the world created is virtual.

Figure V.7.5 One example of a classification of different worlds. [Axes: dimension of artificiality (physical to synthetic) and dimension of transportation (local to remote). Regions shown: physical reality (meeting face to face), augmented reality, virtual reality, telepresence.]

The classification allows for many different worlds at different levels of Local/Remote and Synthetic/Physical. Imagine a 3D display that shows your home as it looked 20 years ago.

V.7.2.1 Bridging virtual and physical contexts

VR-glasses are not necessary for augmented reality. We can use many other devices and interfaces to represent the virtual world in the real environment. Mobile phones, or PDAs, are obvious alternatives to audio and video tunnels allowing activity in the real world to be heard in the virtual world and vice versa. Virtual environments that we are so accustomed to that we no longer recognise them are music and other synthetic audio environments. Maybe not as fancy as full-blown immersive VR-caves, but still very efficient as mood stimulators and information transmitters.

Interplay between real and virtual space is found, and is useful, for both actions and states. Crossing out a phone number in your paper-based calendar has an obvious virtual counterpart in your computer-based calendar. The mapping from reality to virtual reality can be arbitrarily chosen, but usually a designer decides on a natural mapping.
This simplifies things for the user because most mappings make little sense. One example of a useful mapping is to assign a web page to a physical location.

From virtual reality to physical reality the mappings cannot always be chosen at will. We for instance cannot let the virtual world affect things that have already happened. We have to accept now, and the flow of time. What we can do is to read the current time, and set the alarm clock. By the way, does time constitute a virtual or a physical environment?

Another interesting fact is that showing relationships among items is easier in virtual reality. How do you find out who is related to whom at a large family party of an unknown family? You have to ask someone, or get hold of the family photo album. In the virtual world, relationships can be shown directly!

Histories are not directly visible in the real world, neither histories of actions nor other traces over time. Who broke that window? What started that war? Simulation of cause and effect could explore different paths of actions and provide insights. In virtual reality it is even possible, at least in principle, to backtrack and exactly evaluate all factors.

As we turn off, and throw out, our PCs in the coming pervasive computing environment we will have to admit that they were very good for some things. One of their strengths is to keep track of a local virtual context of the user. Our calendar, task list, address list and the files we are currently working with are readily at hand. This is obvious to anyone who has had a major disk crash. We will also in the future need a local virtual context, and the place to put it is out in cyberspace. By doing this we gain several things. First, and most importantly from a mobile applications point of view, we can access the virtual context from anywhere. Second, we can manage our data more efficiently by centralising the management. No more problems with disk crashes.
Third, if everyone uses this kind of web-based context, and technology continuously improves, there will be better standardised tools for accessing it, tools that simplify navigation in the data space that we generate.

As more and more networked devices provide communication channels we can expect them to give more coherent displays of the chosen aspect of virtual reality. When you go from the living room to the kitchen the sound channel from the living room TV could be mapped to the kitchen radio.

We have taken our biological clocks, moved them outside ourselves, and then treated the extension as though they represented the only reality. E.T. Hall, Dance of life

The human species, however, paid a price when it chose the extension route. Extensions are a particular kind of tool that not only speed up work and make it easier but also separate people from their work. E.T. Hall, Dance of life

“Think of it as reality and it becomes reality.” John Dobbin

We will demand to use VR in the real world. Dr Michael Heim, Art Center College

Several experiments have been done where a video projector is used to project aspects of a virtual world into a public space. The shadows of the virtual world could be used as an artwork, or for information display. The artwork Videoplace by Myron Krueger, shown in the margin figure, is one example. By image processing the shadow of the user, and at the same time projecting a computer animation of a ball, a mixed physical-virtual ball game is created. In another example users wore VR-glasses and manipulated 15 by 15 cm paper cards. A video camera captured images of the paper cards and information about their positions was sent to a computer. The computer rendered the video frames with additional information such that a card could change its appearance at any time.

V.7.3 Context of H-H interaction

Human-to-human interaction is always situated in time.
An e-mail is written such that it will make sense at the time when the receiver reads it. We place the reader in a later time slot, and adjust our message accordingly. The same thing happens when we read e-mail ourselves. Automatically we adjust our interpretation to the time frame of the writer. This is one reason why information on context reduces the amount of communication.

According to research in psychology we only accept a delay of a tenth of a second to feel that a response is immediate. If the response arrives within one second we will at least follow our line of thought, but if the response is delayed longer than ten seconds we have already forgotten what the dialogue was all about and have to restart it. Memory is just stable enough to keep the conversations going.

People are situated in space as well as in time. All sorts of languages are used to represent space, and to relate different spaces to each other. Many coffee cups all over the world have been used to represent roundabouts. We want a table for two by the window, the voices in a restaurant indicate distances, we lean closer to indicate intimacy, and use a napkin to draw a quick sketch of where we live.

One reason for the unprecedented complexity of H-H interaction is that it is embedded in a social context, which among other things is culturally specific. The people that surround us, and their behaviour, are obviously extremely important to us. Anyone who has lost a close relative knows the depth of this statement. Solidarity, love, comradeship, friendship, and military honor are examples of complex social contexts, see also Chapter IV.16.

Human society is built by interdependent hierarchical structures. The family, the organization, and the state are all basically social and hierarchical. One alternative is to organize people in networks built either by specialized individuals with complementary skills, or by peers that join a network for efficiency or for fun.
Relations between individuals hold hierarchies and networks together. They are built over time by authority, by friendship, love, by birth, and in many other ways. In fact, the stability of all societies depends on feedback from their inhabitants.

Tatemae: sensitivity towards others, public self. Honne: sensitivity towards one's own private self. Suji: situational significance of an event. E. T. Hall, Dance of life (important terms in the highly contextual Japanese culture)

Why do we organize ourselves in hierarchical structures? One reason is efficiency of communication and control. In large communities hierarchies make control visible, and enable swift distribution of control messages. A related reason, for smaller groups, comes from evolution. A group makes it possible for one strong individual to dominate and guarantee reproduction of his or her genes.

Since social structures and behaviour are well established in human thinking and behaviour they obviously can be exploited in different computerized applications. There is an enormous amount of results from research on these issues from psychology, and we are ourselves aware of, and affected by, many social influences. Most of us for instance have a tendency for social comparison, i.e. we behave as our neighbors do. This is one way for a social animal to survive, or at least to take easy decisions: just follow the group. People enjoy imitating the behaviour of other people. We humans form groups into lines and queues, look in the same direction as the crowd, and wear clothes to help others to understand who we are (or who we want to be). We adjust our behaviour to groups in many ways, automatically, and all of the time, for instance when we follow the group leaving the airplane, supposing that everyone is going to the luggage claim. Another example is that we prefer a crowded restaurant to an empty one. Such behaviours are currently not exploited on the Internet, or by any other technology.
Additional examples of social dynamics are group polarization and social facilitation. Group polarization means that a group after a discussion tends to assume a more extreme point of view. People who do not like to do the dishes like it even less after discussing it with each other. Social facilitation is the interesting effect that a social environment increases performance. You will run faster when competing against a person than when racing only against the clock.

One way to formalize the space of social settings is suggested by Rowson in [JR2]. Two dimensions are specified in a scenario space, see Table V.7.1. The first is relationship, which relates to the group size, and the second is the role, describing the physical/social location where the interaction takes place.

Table V.7.1 Social relationship versus role.
Relationship (columns): Individual, Family, Casual team, Formal team, Community.
Role (rows): School, Recreation, Shopping, Work, Spiritual.
Example entries: Homework, Movies, Passing notes, Group project, Chat, Soccer team, Prayer.

Where would you place falling in love?

[Figure: Social links in Canberra, Australia [AK].]

Some obvious social activities have been suggested in the table, but there are many blanks for you to fill in. Can you think of any more relationships or roles that could be added to the table?

"It is the human and social aspects of context that seem to raise the most vexing questions. And, though these are the very aspects of context that are difficult or impossible to codify or represent in a structured way, they are, in fact, crucial to making the context-aware system a benefit rather than a hindrance or, even worse, an annoyance." (Victoria Bellotti)

Human salient details of context: Identity, Arrival, Presence, Departure, Status, Availability.

V.7.4 Context of I-I interaction

Context is important also for I-I interaction where both the participants and the context belong to the virtual world, and where it is sometimes difficult to distinguish the interactor from the context. The objective for I-I interaction is usually to locate, retrieve, exchange, or generate new information, and this is best done in the most important virtual context today, the World Wide Web. A problem is that since the web currently is mostly meant for human readers, much of the information is in practice worthless to a software program. In general an information agent can sense a lot of data, but this data cannot be interpreted because the internal model of the interactor is not adapted to the data and contextual information found. Humans evolved into a solution for the physical environment, but it took them several million years.

When we today design new software we work around the problem in two ways. First, we specialise interactors for limited information environments, and for simple tasks. One example is designing an agent that looks for terrorist information only. Within this limited context the sensitivity for typical data can be enhanced, as well as the interpretation skills, using the particular knowledge in the domain. When looking for terrorists, the words bomb, kill, or liberate in a message header provide highly relevant contextual information. Second, we adapt the environment to the capabilities of the agents, for instance by adding meta-information to guide search. Referring back to the terrorist example we could simplify the task by making sure that all e-mails pass through a few selected intermediary nodes.

A problem with context, in the real world as well as in the virtual world, is that both its topology and content will change. In H-H interaction we have specialised functionality to manage changes. Attention, curiosity, and vigilance help us adapt and adjust focus.
How can the corresponding functionality be implemented in I-I interaction? Scanning, another human speciality, is difficult for I because of the vast information spaces involved. Humans solved the problem by specialisation, and by using local context only.

Most perceptual cues are wasted on a visiting software program. Information such as font size, depth cues, and colour will not be used. Even worse is the problem of extracting structural information hidden in diagrams, tables, maps, and other figures. It is for instance difficult for a software program to understand that the text string Umeå in relation to the positions of other text strings on a map conveys a lot of information. Any human reader will easily approximate the time to travel to Oslo if the time to Kista is known.

So far, the Internet has been built by humans, to be read by humans. The illustration to the right is immediately recognised as a family and you probably guess that the family name is Gulliksson and that the phone number is +46-90-142613. Because of the lack of structure, and the implicit information hidden on the web, an agent faced with the problem of understanding the web has two options: either to look for meta-information describing what to find where, or to do an extensive search and try to correlate the information found.

Suppose that you are an interactor of the type information, i.e. a piece of active information roaming the Internet. What would be your view of the context, and what would your perfect sensor look like? This is not an easy question. There are for instance very few sensors available to perceive other interactors of type information, or to identify I-I interactions.
Autonomous, software-based interactors usually do not have accessible internal data structures, and it is difficult to find out which agents are operating in the virtual neighbourhood. There is some indicative information that could be used; perhaps a new process starts up, processing capacity suddenly decreases, or the memory usage goes up.

From the perspective of interaction, context could be described as the capacity for processing and the amount of memory available, i.e. a technological context. An alternative is to consider the context as a system by itself and describe its interface using the vocabulary of communication, i.e. messages are sent to, and received from, the context. This view can be useful in some situations, for instance when modelling interaction between only two autonomous agents. If the coupling or the immersion level is high, using communication as a descriptive framework will be cumbersome. The most straightforward solution is however to view the context as a data structure, preferably accompanied by a model describing the structure. This model is itself a data structure, maybe in the form of RDF statements. Without such a model an agent is forced to scan and search every time it needs information from the context. The interactors in this case will also be represented as data structures, possibly with their own model of the data.

V.7.5 Context of H-T and H-I interaction

The number of contexts is unlimited. Some examples are hospitals, airports, museums, theatres, health care at home, playing games, interaction devices worn inside or close to the body, and your memory of how to save a file from Word®. Your house will awake from its sleep and turn into a context where all sorts of radio-based equipment co-operate under the command of you and your family. It will be more of an autonomous, context-sensitive thing with senses and effectuators. Context dependency is already evident for instance in the modern camera [WB].
With a single push of a button, lighting conditions are estimated, auto focus calculates the distance to the object, and the time of exposure is set. After the photo is taken, the camera even stores all sorts of meta-information about when and how the photo was taken.

"It's a world of cameras aimed at everything everywhere, watched over by machines, and occasionally examined by people" (Paul Saffo)

What additional features could be delivered to you by an application, given that a computational context is available? Perhaps the following [AD]:

1. Presentation of information and service options.
2. Automatic execution of services.
3. Tagging of context with information for later retrieval.

A printing application is one example that allows you to select from nearby printers (exemplifies category 1 above), illustrated in Figure V.7.6a below. If nothing else is specified the application automatically redirects the print to the nearest printer (2), and remembers which printer was used (3). If you do not know where to find a document just ask the application. This last possibility, using the system as a memory extension, has further implications. If we trust the system to keep track of our belongings, files, keys, and children we will possibly change our behaviour to focus on other, more high-level tasks, that in turn can raise our quality of life.

Figure V.7.6 a) User selects printer. b) Automatic execution of service. c) Location-based presentation of information.

A related class of applications that combines items (1), (2) and (3) above allows us to leave a reminder that automatically will be presented when needed, see Figure V.7.6c: "When you drove this street the last time you turned left at the next crossing". Applications of this kind can also be used to optimise behaviour; an automatic navigation service could tip you off that the highway is free.
The features listed apply to a group of users as well, and could be used to share user experiences, e.g. to show a replay of the last goal of an ice hockey game, which is an example of a combination of (1) and (2).

"Error: Can't find the printer!"

Now let us discuss an extremely simple context-sensitive application, the thermostat that only adjusts the room temperature. But is this really such a simple task? Some people like their home very warm, others prefer a lower temperature. If you catch a cold you probably would like to raise the temperature, but if you have been out jogging you prefer a cool apartment. If the husband, who likes it really hot, goes abroad, his wife would like to enjoy a night's sleep in a frosty bedroom. It is in fact impossible for the thermostat to infer all of these, and many other, reasons for adjusting the temperature. From this we can at least conclude that any system that acts on behalf of a user will be complex. Furthermore, the user should always be able to override the system, here the thermostat, and this in turn means that the user must be able to deduce the workings of the system. The problem would be less difficult if we could trace the user's state of mind, but nature ruled that option out. Another possibility is continuous update of their representations by the users themselves, but research and empirical evidence show that people are notoriously bad at this type of assignment.

Whether we will eventually be able to build systems that act on our behalf in any non-trivial situation is another of these raging debates that will not be resolved until we build such a system. The next couple of generations of systems will keep on the safe side by providing the user with rich representations of context as seen by the system, but leaving the interpretation of this context, and the decisions, to the user.
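
The two requirements in the thermostat discussion, that the user can always override the system and can see why the system acted as it did, could be sketched roughly as follows. This is only an illustration: the names Thermostat, Decision, override, and the context keys has_cold and just_exercised are hypothetical, and the two-degree adjustments are arbitrary assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    setpoint_c: float  # temperature the thermostat will aim for
    reason: str        # explanation that can be shown to the user


class Thermostat:
    """Toy context-sensitive thermostat; every rule here is an assumption."""

    def __init__(self, default_c: float = 21.0) -> None:
        self.default_c = default_c
        self.user_override: Optional[float] = None

    def override(self, setpoint_c: Optional[float]) -> None:
        # The user may set an explicit setpoint, or clear it with None.
        self.user_override = setpoint_c

    def decide(self, context: dict) -> Decision:
        # An explicit user choice always wins over inferred context.
        if self.user_override is not None:
            return Decision(self.user_override, "user override")
        # Otherwise apply simple, inspectable rules to the sensed context.
        if context.get("just_exercised"):
            return Decision(self.default_c - 2.0, "occupant has been exercising")
        if context.get("has_cold"):
            return Decision(self.default_c + 2.0, "occupant has a cold")
        return Decision(self.default_c, "default setting")


thermostat = Thermostat()
print(thermostat.decide({"has_cold": True}))
thermostat.override(18.0)
print(thermostat.decide({"has_cold": True}))
```

Returning the reason together with the setpoint is one simple way to let the user deduce the workings of the system; the interpretation of the context is still left to the user, who can override at any time.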

V.7.5.1 Context identification

How do you recognize a situation, daily behaviour, activity, cultural (social) environment, or yourself? Why does the telephone ring signal not tell you why someone is calling? Why does the calling party not already know that you are in the bath, and will never even think about answering the phone?

"People are difficult to deal with as contextual entities. They make unpredictable judgements about context. In other words, they improvise." (Lucy Suchman)

"You can't reboot the world, let alone rewrite it to introduce new technology" (Tim Kindberg)

A small example shows the complexity of the task: "Magnus and his wife leave their newly built house. It is eight o'clock in the morning so maybe they are heading for work, but no, it is Saturday. Oh, I see, they have their jogging outfit on, and I also recognize their dog. Probably they are taking their dog for a walk." Think about the many conclusions that have to be drawn and the refinement of sensory impressions needed for this statement. Not at all trivial.

"The only and truly useful context-aware application is the automatic door, and it was invented decades ago." (Pessimistic view)

One estimate of the activity can be found by tracing all of the user's positions and motions over time, and by analysing uses of objects and services. If this is properly done the trace can later be asked questions like "Where did I leave my keys?". The time dimension can give additional hints on behaviour. If you are looking for your keys, and had them a minute ago, they are probably somewhere nearby.

Automatically detecting presence, this way or any other way, gives many new possibilities. Since detectors can be made much more sensitive than human senses we can build amplifiers that feel the presence of almost anything: a technology-based sixth sense.

Location information reduces uncertainty. Knowing that someone is in the kitchen, the bathroom, or in the bedroom, for how long, and with whom says a lot.
Location as a context is consequently important in mobile applications. Adjustment of time depending on the current time zone is for instance possible to do automatically. For many applications we have the opposite problem: the user wants to keep the computational environment independent of the intention for moving, and of the movement itself. The address book should for instance not be bound to a specific position. Another example is that the car radio should automatically retune to the selected radio station if the transmission frequency changes as the car moves.

Modifying output depending on whether the user is moving or stationary is another nice feature. The font size on a mobile phone should be increased when the user is moving, and some interactions, such as entering a phone number, should be simplified. It is for instance easier to select from a list of recurrent numbers.

Definition: Motion, movement. Synonyms: act, action, advance, agitation, ambulation, change, drift, dynamics, flow, fluctuation, flux, kinesics, locomotion, mobility, motility, move, oscillation, passage, passing, progress, stir, stream, sway, sweep, swing, tendency, travel.

Context parameters for a user can be seen as a context space [OR]. As time goes by every user makes a personal journey in this context space. If we store the traces we can build many interesting applications. The computer can recognize déjà vu situations, and our trace can answer questions like [OR]:

- When was I here last?
- What did I do then?
- Where did I go next?
- Did I see Mona Lisa when I visited the Louvre?

[Figure: context space with the dimensions Mood, Time, Company, and Location.]

If we are allowed to also use other people's context traces, even more fascinating questions can be asked:

- Who is going in this bus with me?
- What do people usually do here?
- What happened here an hour ago?
- Has someone here seen Mona Lisa?
- Where is my child just now, when did she go to school this morning?
- When do people usually go to lunch here?
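
The personal questions listed above can be read as queries over a stored context trace. The following is a minimal sketch under stated assumptions: the record type ContextSample, the helper functions last_visit, next_location and did, and the sample data are all hypothetical, and a real trace would hold far richer context (mood, company, and so on).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ContextSample:
    time: int       # minutes since the start of the trace
    location: str
    activity: str


def last_visit(trace: List[ContextSample], location: str,
               before: int) -> Optional[ContextSample]:
    """When was I here last, and what did I do then?"""
    visits = [s for s in trace if s.location == location and s.time < before]
    return max(visits, key=lambda s: s.time) if visits else None


def next_location(trace: List[ContextSample],
                  sample: ContextSample) -> Optional[str]:
    """Where did I go next, after the given sample?"""
    later = [s for s in trace
             if s.time > sample.time and s.location != sample.location]
    return min(later, key=lambda s: s.time).location if later else None


def did(trace: List[ContextSample], location: str, keyword: str) -> bool:
    """Did I do something matching `keyword` at `location`?"""
    return any(s.location == location and keyword in s.activity for s in trace)


# Made-up example trace.
trace = [
    ContextSample(0, "home", "breakfast"),
    ContextSample(60, "Louvre", "viewing Mona Lisa"),
    ContextSample(180, "cafe", "lunch"),
    ContextSample(300, "Louvre", "viewing sculptures"),
]

visit = last_visit(trace, "Louvre", before=300)
print(visit)                               # when was I here last, what did I do?
print(next_location(trace, visit))         # where did I go next?
print(did(trace, "Louvre", "Mona Lisa"))   # did I see Mona Lisa at the Louvre?
```

The questions that involve other people's traces would need the same kind of queries run over many users' records, which raises the privacy issues implied by "if we are allowed".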

Some of the problems when trying to identify human actions are that they are interrupted, can continue for a long time, and are executed in parallel (especially by women they say) [GA]. Cultural differences and situational constraints add to the list. One example of a cultural constraint is gesture recognition, where for instance south Europeans are more lively than north Europeans.

The problems discussed above become even more difficult if we want to predict rather than to identify. At the very least we need data from similar situations. If we want to predict personal behaviour we need personalized data, for instance recordings of our habits. Some help can be found in assumptions on reality, such as that a person can only be at one place at a time, follows his schedule, moves around with a maximum speed, and has a habit of making habits.

Is it really 2006 this year? In some other counting systems, it is 5766, or 1426.

V.7.5.2 Situations

Travelling is an example of a situation. While travelling there is spare time to kill that can be used for interactive adventures. Another interesting thing about travelling is that we seem to do it more and more, even though in theory we should use the Internet for communication rather than use gasoline. Could it be that we get to know even more people through new information technology, new friends in distant locations that we want to meet face to face? Or is the number of possible and mandatory activities to perform increasing, making more meetings necessary?

A taxonomy for different situations from the reference [MM] shows that the essential situations in which we spend our time are remarkably small in number.
The table below shows a typology of everyday situations (try to find one more essential situation, and some additional situation for each column):

Table V.7.3 Different situations.
At work: Deliberating (places for thinking); Presenting (places for speaking to groups); Collaborating; Negotiating; Documenting; Officiating; Crafting; Learning; Cultivating; Monitoring.
At home: Sheltering (places with comfortable climate); Recharging (places for maintaining the body); Watching; Remembering; Confining; Servicing (places with local support).
On the town: Eating, drinking (places for socializing); Gathering (places to meet); Cruising; Shopping; Sporting; Belonging (places for insiders); Attending; Commemorating.
On the road: Gazing/Touring (places to visit); Adventuring; Driving; Walking; Waiting; Hoteling.

The mass media and many other active structures in society are trying to adapt to the changes in our behaviour, and thereby help to speed up the changes. Situations such as reading a newspaper, or watching television, will change. News on paper costs at least $1 and that is quite expensive compared to a computer-based alternative at the price of 10c. Television is also made obsolete as a group activity when we learn new behaviour from the web. Who wants to watch while someone else is zapping? Other examples of this trend are that we do banking and taxation over the Internet.

The new information society has however not succeeded in eliminating queues. They reappear in digitised shapes: "We will soon accept your call...", "Network busy", "Buffer full", and other similar messages are all too familiar. In their previous physical incarnation they had some charm. We could study the behaviour of other people in the line, or estimate the waiting time. How can we re-create at least some of the positive aspects of standing in line? In 1989 researchers estimated that in the United States more than 100 million person-hours were spent per day queuing.
How much of this was unnecessary, and could be replaced by an interaction involving an intelligent PDA?

[Figure: Time trap, with the labels Concept, Time, Technology, and Use. E. Stolterman, Umeå University.]

V.7.6 Context of T-I interaction

Context of T-I interaction has much in common with the context for H-T and H-I interaction. One dilemma for T-I is that the distinction between the environment and the interactors in this environment is not always clear. What, for instance, are the interactors and what is the context when a camera automatically takes a picture and sends it to a server? H-T/I is simpler in that H is a natural discrete object and usually the focus of interactions.

Questions typically asked by an interactor in any context-aware interaction are: why is a situation occurring, who is involved, where, when, doing what, how [AD]? Let us elaborate on these questions somewhat. The who question can be answered by either an explicit identification, or by noting that something, or someone, is involved, i.e. detecting presence. If all objects nearby register their positions this is a simple task. If not, an alternative approach is for every object to have a contact zone inside which any other object is registered. This is the way humans usually do it, by vision, and is also how many networks establish network connectivity.

Most of the questions above can be answered either directly by the interactor's own senses/sensors, or indirectly by asking another interactor or the context [AD1]. Where is a question about location or position, and finding it is an easy-to-understand problem that is quite difficult to solve. Outdoors we can use GPS down to a resolution of a couple of meters, but indoors we are still looking for a solution. When is not too difficult to figure out, but the last question in the list poses a major problem. To find out "doing what, how" is quite difficult and certainly needs contextual information.

An example might help at this point.
Imagine Hakan driving his Porsche, rather tired after a long night at the keyboard. The car detects the presence of a driver by a sensor under the driver's seat, and supports this fact by noticing that a door of the car has opened and closed. The identity of the driver is detected through personal car keys, and once the identity is known the driver's seat is adjusted. The where question can easily be answered by a GPS receiver, and the when question by the car's internal clock. A feature of this particular Porsche is to detect if Hakan is sleepy or not by analysing video from a camera in the car. Even if this feature is advanced it is still easier to implement than finding out whether Hakan is taking a ride just for fun, or is picking up some flowers to surprise his wife.

Physical contexts could however be used to a much higher degree than today. Imagine what applications you could build if the car radio knew about speed limits and the current speed of the car. A wearable device can be used to supply context to the application as well as to the user. Who (or what) is carrying, from where to where, and with what intention? What if every mobile phone reported the temperature when used outdoors? Could such a massive data stream be used to improve weather forecasts? Another example is that an intelligent mobile phone could use the physical context to do the following mappings automatically [GC]:

- Vibrate: in hand.
- Ring: not in hand.
- Adjust ring volume: in suitcase.
- Keep silent: when owner is eating, or at a lecture.
- Any way: outside (where is the user?).

The physical context is not the only relevant context. Access to internal data structures within interactors, and within objects in the environment, provides contexts that could further enhance functionality. One example is a car that accesses the data structure of the car key, another example is a mobile device that explores the menu of a restaurant when outside in the street.
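
Returning to the mobile phone example above, the mapping from sensed physical context to alert behaviour could be sketched as a small set of rules. The function name ring_mode, the three boolean inputs, and the order in which the rules are tried are assumptions made for this illustration; the source list does not say how conflicting conditions should be resolved, and the "outside" case is left out.

```python
from enum import Enum


class RingMode(Enum):
    VIBRATE = "vibrate"
    RING = "ring"
    RING_ADJUSTED = "ring with adjusted volume"
    SILENT = "silent"


def ring_mode(in_hand: bool, in_suitcase: bool, owner_busy: bool) -> RingMode:
    """Map sensed context to an alert mode (rule priority is an assumption)."""
    if owner_busy:        # owner is eating, or at a lecture
        return RingMode.SILENT
    if in_hand:
        return RingMode.VIBRATE
    if in_suitcase:
        return RingMode.RING_ADJUSTED
    return RingMode.RING  # not in hand, no other context detected


print(ring_mode(in_hand=True, in_suitcase=False, owner_busy=False))
print(ring_mode(in_hand=False, in_suitcase=True, owner_busy=False))
```

The hard part is of course not the rule table but sensing the inputs reliably, in particular whether the owner is eating or at a lecture.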
From an application developer's point of view, access to such internal data structures poses new challenges. Now, it will no longer do to separate the internal data structure of the application from the application interface.

V.8 Interaction modelling, back to the basics

Interactions have necessary pre-conditions. At least two interactors must be present, capable of acting and communicating, and there must be system state variables to modify, otherwise the interaction will be kind of boring. Each participant follows all, or some, of the basic steps in the action cycle: start with a goal, form intention, specify action, execute action, perceive system state, interpret resulting system state, and evaluate outcome. Interaction occurs when these steps intertwine for two or more participants.

"Non-empty ordered set of events engaging more than one agent." (Alternative definition of interaction)

As with many other topics that we have discussed, and will discuss, interaction needs both analysis and synthesis. Analysis of interaction means that you study an interaction and try to figure out what is happening. What you learn can be used for the complementary activity, synthesis. You cannot really synthesise actions for other humans, at least if you are not God, but many other interactions are man-made and possible to generate, analyse, and tune. What you can do for human interaction is to synthesise rules and an environment that constrain the interaction. For software-based systems we can do better: UML and state-based modelling provide diagrams and concepts that can be used to automatically generate software. The automation of synthesis is crucial for building the complex systems of the future.

Interactions and interactors are researched using many different models, at different levels of detail. An atom can be described as a solid sphere using a mechanistic model, or as a wave packet using an energy-field model.
But if we want to model interactions between two atoms separated in space, the energy-field model is much more suitable.

Figure V.8.1 Models of atom at different levels.

The example illustrates that models of interactors and interactions should work well together. Field-based models are one general class of models matching this criterion. Another well-behaved model is the state-based model that we will use extensively in this book. For a really complex system such as a human, and for human-human interaction, there is no obvious candidate for a formal model, even though there are many informal psychological and sociological models.

Can every physical representation have a virtual counterpart? Are there any virtual representations not created by man?

V.8.1 Modelling view

At a given level of abstraction any system can be studied from one or more of three different perspectives: intentional, conceptual, or physical [DB1]. The intentional view describes the system from the perspective of how, and where, it is going to be used, and what goals and expectations it can fulfil. One example of an intentional view of a message transmission is that a sender intends to tell you what time it is. While doing this we should ask ourselves if our model is useful. Modelling the mental states of a light switch does not add much value to us.

From a conceptual view, we can learn how the system works, its properties, and about the mental model we should use to understand the system. To explain a concept, similarity, pattern matching, metaphors, and cultural associations are important. Understanding the principle behind a cut and paste function in an editor is one example; another one is a message transmission viewed as sending packetised information.

The physical view is concerned with how the system and the real world influence each other. At this level a packetised message can be modelled as a sequence of bits, sent over an electric cable.
Another example of a physical level model is a neuro-physiological model of the workings of the eye. The 3D-model of a coffee cup with a handle that is a perfect match for your finger is a third example, and a red button indicating a stop function a fourth.

Why do we need three levels? Why not two or four? This question is the subject of an ongoing discussion among philosophers, but it seems that at least three levels are needed. One physical interaction can be used to implement many concepts, and many concepts can use the same physical interaction. The cut and paste editing function can for instance be implemented through the use of either a keyboard or a mouse. In a similar way many intentions can make use of the same conceptual function, and many alternative concepts can be used to fulfil an intention. You can achieve an objective in many different ways. If you want to tell your mother some good news to cheer her up, you can either visit her and tell her in person, or send her an e-mail. The intentional view is needed because the concept of sending your mother a mail does not include the additional information that you want to cheer her up.

Can you think of an example where a single concept is used in many intentional views?

Reflecting on systems from these three perspectives is something that humans do all of the time, and it is a very useful practice in the design process. We can use an alarm clock as an example to illustrate the three perspectives above. This specific clock is added as an extra function to a digital camera. If you are told that a digital camera is equipped with such a function you will understand, because of the intentional view, when it can be used. You will start looking for a user interface, i.e. you use a physical view of the device, and you expect, from your conceptual view of an alarm clock, that this interface will give you the opportunity to set the alarm, set the current time, and turn the alarm off. This leaping between levels is typical for how humans think and use mappings between the different views.

An alternative set of modelling levels is the conceptual, semantic, syntactic, and lexical levels [FDFH][BS1]. This is a description chosen in accordance with how language is modelled. The conceptual level is in essence the same as the conceptual level discussed above. It is important for a designer since it establishes a common ground for all interactors involved. Using a metaphor is one approach, but stretching analogies too far could hamper usability. The semantic level, also called the functional level, defines the operators, their meaning, and the information needed to execute them. In the alarm clock example above, one operation is to set the time, and for this the current time of the day is needed.

At the next lower level, the syntactic level, sequencing design is performed. Units of meaning such as input of a character and button-clicks are grouped. Setting the time on our alarm clock involves the following basic units: enter set time mode, input four digits, and exit set time mode. The lowest level, the lexical level or binding design, assigns physical properties to the units of meaning. Colour, line widths, and text fonts are specified for objects. When binding for our alarm clock we could decide that throwing a pillow across the room should stop that annoying alarm sound.

At which levels of description is a car controllable and observable through measurements?

Ever washed a window? Crashed it? Minimized it?

Why? Who? What? When? How?

V.9 Interaction characteristics, some suggestions

All humans have names, as well as many information agents and most types of physical interactors.
Interactions on the other hand, in general have no names, and there are no systematic naming conventions. One reason for this might be that there are too many interactions. Another reason is that behaviours of physical objects have not been interesting enough, which means that a description of the interaction degenerates to a one-way command or action. We hit the golf ball, but do not care about the behaviour of the ball when it is hit (ouch!), only that the ball landed 200 meters away in the middle of the fairway.

Consider the way we build our sentences. The subject-verb-object organisation such as in "HakanG set an alarm clock." is obviously working well for us, but perhaps it overemphasizes the interactor's role, and favours behaviours where an action by someone is directed at something.

Some particularly interesting interactions have been honoured by names. Examples are high-level human behaviours such as flirting, courting, and conversation. There are however no short forms for shaking-hands-when-one-person-is-happy-one-is-grumpy, or adjusting-temperature-for-a-shower-when-water-in-pipes-is-cold. Maybe we will someday have a language based on interactions? To make such a language useful we need to extend our vocabulary of interaction, but even changing the order of the clause elements would make a difference: "Fighting and arguing are Adam, John, and Peter over original sin".

Noun / Verb (Adjective, Adverb), Actor / Action (Characteristics), Object / Method (Attribute), Program / Execution, Data / Operation, Representation / Transformation, Signal / Filter, Knowledge / Learning, Message / Modulation.

blast, blow, bump off, burst, crash, damage, demolish, destroy, devastate, fulminate, lay waste, sabotage, break, bust, collapse, crush, damage, decompose, disassemble, discontinue, disrupt, end.

What is the Universe? A collection of objects existing in space, or the processes that shape it, dynamic forces at work?
[CC]

The true intention behind the behaviour of a human is difficult, if not impossible, to identify. When we sing in the morning it could be because the sun is shining, but it could also be a torturous way of waking the children up.

Most interactions also exist at several levels. If we, for instance, want to talk to the house owner we alert him by ringing the bell, by moving a finger [RV1]. To accomplish complex goals, higher abstraction levels use internal models of planned combinations of lower level behaviours. Lower levels, on the other hand, take care of the detailed how-to-do-it aspects of the interaction. We change levels from lower to higher by learning, and we change from higher to lower levels when our plan, or the behaviour we learned, fails. A complex interaction that we have not performed before will force us to consider elementary actions. Try to remember the first time you prepared a gourmet dinner. Each step in the recipe was probably a major obstacle. Envisioning the prepared plate was probably easier.

There are many different views we can use to study interaction, for instance the intentional-conceptual-physical views introduced in Chapter V.8.1, or a state-based approach. A third alternative is to view interaction at the following three levels: aesthetic, functional, and transmission level, another variation of the layerings introduced in V.8.1.

Aesthetics is concerned with the experience of an interaction. Did you enjoy it? Was it visually appealing? Do you want more of the same? However, interaction is of no use if the participants do not gain from it. The functional, or semantic, aspect highlights this constraint. Failure is also inevitable if messages never arrive at the receiver, or if there are no input channels available. The parts, and the behaviour, of the system that ensure low-level feedback are referred to as the physical, or transmission, level.

Let us discuss a smile as an example.
The intention is to signal and induce a positive emotion, but a smile lasting less than 1 ms will not be detected. If it is 100 ms long it might be detected, but you will not be sure that you saw it. To be smiled at for a couple of seconds is a nice experience, but if the smile continues for a minute, or more, you will start to feel uncomfortable. At the semantic level you analyse the message and ask yourself, "What does this person mean by smiling at me all of the time?" The original physical representation of a smile is a facial expression accomplished by moving muscles. But, in the right mood you can also see the man in the moon smiling. What about the aesthetics of a smile? Is perhaps Mona Lisa's the perfect smile, or maybe the hint of a smile of your first-born? Try to think about an ugly smile. What makes it ugly? Its representation or context?

At each of these three levels the communication channel is susceptible to noise, or its equivalent. Noise at transmission level is easy to understand. One example is that the transmitter sends a '1', but the receiver sees a '0' due to some electromagnetic disturbance. At the functional level noise can be exemplified by an e-mail in Swedish sent to an English receiver, or a mathematician trying to explain the solutions of Fermat's theorem to anyone else.

For modelling we can borrow a battery of terms from systems theory. An interaction can be seen as a system with input and output from context. A closed interaction is the special case without information flow to or from context. Interactions can be adaptive and have memory. They are usually non-linear, and neither stability, nor time invariance, can be assumed. An interaction can be deterministic or stochastic depending on the complexity and the characteristics of the participants. Interaction with angry people, or with individual raindrops, must be considered as stochastic.

Another aspect is the observability of an interaction.
In other words, whether we can deduce all internal states of the interaction from observation. When the participants are human this will not be possible. We cannot know everything about the inner workings of the players in a game of football. Software based interactors can in principle read each other's minds, but in practice any complex interactor's expressions, whether external or internal, are very difficult to decipher.

We need to measure interactions in different ways, for instance to evaluate if they fulfil our expectations, and return our investments. A game of football has a score, which is one measure. If the objective of the game is player exhaustion only, we should measure how tired the players get. Measuring is not too difficult if we are content with physiological measures, but measuring the entertainment value of a game of football is not that easy. Will subjective measurements do? Perhaps we could analyse facial expressions rather than asking interactors? At least we all agree that an exciting game is more entertaining than watching the presentation of Office® while installing it. Another example is that some users complain about the quality of speech when using an Internet phone, even though they can clearly understand what is said. This is an example of where aesthetics and subjectivity matter.

At transmission level the information theory of Claude Shannon provides useful measures and estimations of quality. The functional and the aesthetic levels, however, need other, indirect, measures of evaluation, such as a questionnaire, or a viewer poll. But, how do we estimate the results at these levels before we start an interaction?

There are many other aspects of an interaction that we can estimate. Here are some suggestions:

• Complexity. The level of complexity describes the combined complexity of all state machines involved in the interaction.
An interaction can have a high level of complexity even if the rules for the interaction are simple. Complexity increases with the number of interactors involved, and with the perceptual and processing resources available to the interactors. If a rule implies choosing between only two alternatives the choice can still be enormously complex, depending on the states and contexts of the interactors. The extent to which an interaction actually exercises the potential complexity of the interactors is the complexity depth of the interaction.

Interactivity depends on the choices available [CC].

• Asymmetry. The complexity, e.g. skills, cognitive capacities and number of internal states involved, could be evenly distributed between participants, or associated with only one of them. One example of a highly asymmetric interaction is the user pushing the elevator button, where most complexity is on the user's side of the interaction.

• Interaction bandwidth, or resolution. This is the interaction channel capacity. A game of chess has simple rules, and with each move only about 5 bits of information are transferred. If you instead consider all the moves in a game of chess as one transaction of an interaction, or if you take the psychological interplay between the interactors into account, then the bandwidth is much higher. Many parallel channels, and also communication symbols rich with information, increase bandwidth. Other aspects of the channel are its dynamics, i.e. the relation between the maximum and minimum bandwidth, and how difficult channel access is.

• Effectiveness or congruence. The effectiveness of an interaction is to what extent the objective of the interactors is fulfilled by the interaction, i.e. to what extent an interaction satisfies system level goals [HP]. Some of the goals are perhaps not fully compatible, and it might even be impossible to obtain all of them.
You and your friend will get some exercise by raising your glasses of beer, but a round of golf is more congruent with an aim of efficient exercise. Still, you might rather have another beer. Driving any car for fun, or a red Porsche, makes quite a difference.

• Efficiency. An efficient interaction needs few resources to achieve its objectives. Use of network capacity should for instance be minimized at the transmission level, and using an appropriate compression algorithm for coding the message helps to accomplish this. At the functional level attention is a precious resource, and distractions are bad guys.

• Entropy level. A user action will not always give the intended result. The uncertainty of the outcome is the entropy level of the interaction. When a golfer makes a putt from 1 meter he will miss one every now and then. A high entropy level makes planning difficult, and if the level is too high the interaction will appear chaotic and uninteresting. If, on the other hand, the entropy level is very low, the interaction itself will bore humans even if it produces useful results (9999 putts from 1 dm).

• Creativity level. This measure estimates to what extent the next useful state will be a surprise. A high level of creativity corresponds to low predictability, but in this case good comes from bad. As the potential for the unexpected rises, so do the chances for discovering something new, and also the number of possibilities. To not understand something completely could increase curiosity, liveness, and attract attention. The level of creativity is usually raised with increased interaction bandwidth and complexity. Generating the next step on the dance floor is one example of a highly creative interaction. The creativity depth complements the creativity level and indicates the length of the sequence of states or transactions that can be used in a prediction of the next useful state.

• History and memory.
Relates to the number of previous actions, and other aspects of the interaction, that can affect the current state. A high level of entropy indicates that the history is not very useful.

• Immersion level. This measure describes to what extent a participant experiences being there while interacting. A well-designed interaction can give a high immersion level, even if the interaction bandwidth is low. Immersion level also measures whether irrelevant context is screened from the interaction. Typically immersion is discussed for virtual reality, but advanced technology is not necessary; an interesting discussion in a chat could also screen the rest of reality out. A related concept, sometimes used with the same meaning as immersion, is presence.

• Affection level. If you run into a wall, the wall will not think much of it. Affection level measures to what degree an action from one interactor changes the states of the other interactors. Human interaction has a potential for high affection levels. We tease, hurt, and excite each other using words and actions. To automatically characterise and estimate emotions is currently an active area of research.

• Satisfaction. This is another of the concepts without consensus on the definition, and yet everyone roughly knows what it means. It is an emotional and subjective response to a stimulus where the response is relative to some expectation, or fulfils some need, of the receiver (phew!). Satisfaction is a sense of contentment when expectations are met, or even exceeded, and it has a significant social dimension. The receiver wants to feel privileged and chosen relative to the rest of the world. Things and Information do not care much for satisfaction and it is difficult to measure objectively.

"Excited," "euphoria," "thrilled," "very satisfied," "pleasantly surprised," "relieved," "helpless," "frustrated," "cheated," "indifferent," "relieved," "apathy," and "neutral" Range of intensities, Joan L.
Giese and Joseph A. Cote

• Cohesion and correlation. Cohesion describes the level of correlation among interactors over time. Strong cohesion implies high bandwidth, and short communication delays. A bicycle race has a high level of cohesion. All participants start at the same time, follow the same route, and finish at the same place. Correlation is a mathematical measure of the co-behaviour of the involved interactors.

• Coupling. Measures the number of channels between the interactors. A crowd of people all talking to each other at a party is one example with quite a high degree of coupling. Figure V.9.1 below shows three different structures for interaction. The leftmost figure illustrates similar highly coupled computing components that work in parallel. With more heterogeneous interactors we can have loosely coupled groups, and the third figure, to the far right, shows autonomous interactors, loosely coupled, that cannot be assumed to share common objectives. Multiple coupled interactors acting in parallel quickly increase the complexity of the system.

Figure V.9.1 Coordination structures with different levels of coupling.

• Structure. As the complexity of an interaction increases structure emerges, or is imposed, see figure V.9.1. This happens because interactors need structure to study, or be able to manipulate, the interactions. Typical measures of structure are number of layers, number of groups, and depth of the resulting hierarchy. Let us say that the board of Ericsson decides to refocus the company, from technology oriented to service oriented. The detailed implementation of this is clearly a monumental interaction. Each department will have its own agenda that will affect the workings of every group in the department. Probably the transformation needs to be performed in steps over time. Each step, and each effect, in itself is an interaction.
The new service orientation will itself mean a modified structure of interactions between Ericsson and its customers.

• Context involvement. As previously discussed there is usually a context that constrains the interaction, i.e. the interaction is typically an open system. An active context (environment) can even change the rules or the goals of the interaction. The representation of a more passive context can be accessed by the interactors, and thereby indirectly affect the interaction. A game of football, where the loser will be thrown out of the league if a third team loses, will be much less interesting if the third team is in the lead by 3-0. One example of an active context is heavy rain forcing the referee to cancel a game. Rain is one of the more active contexts that we can find in the soccer world; FIFA does not usually change rules during matches. Interactions are embedded in a hierarchy of contexts where each higher level sees an active context at a lower level as an interactor. In the football example the rain is part of an atmospheric interaction. The rules at that level are quite different from the rules in the game of football. Seen from the outside an open interaction could learn and adapt to a changed context. In this case we have an autonomous interaction.

• Duration and extension. Every interaction has an extension in physical space or over a structure. As a special case it has a duration in the time domain. The granularity of this duration can be very small, such as in the single interaction of asking a question, or it can be an uninterrupted period of participation such as in playing a game of football. Start and stop, or perhaps more interestingly, birth and death, are events important to the interaction.

• Frequency and delay. The time between actions in an interaction is sometimes very important. Exchanging one love letter per year, or one per minute, makes a difference.
If the frequency is too low, an interaction tends to be a one-way communication where all relevant context from the last action is forgotten, or has changed. The time to download a web page certainly affects the experience. A short delay is also the reason why spreadsheets are so popular.

• Alignment. Related to the issues of timing and structure above is the extent to which interactions take place in parallel, are sequenced, or otherwise constrain each other. Two interactions can be aligned either in time or by some other structure. One example is taking turns in a conversation, and another that cars in most countries are driven on the right side of the street.

The qualities selected above are not independent, and there are other qualities that can be described as combinations of them. One example is that adaptivity implies a high level of complexity and creativity.

V.10 Mediation, with the help of it

The result of an interaction is usually more important than the interaction itself, the interactors involved, or the means for the interaction. For most people a car is only a convenient tool for transporting people from A to B, even though some might object to reducing a red Porsche to a mere vehicle. Reading a book is another example where the result, in this case the content of the book, is more important than the format of the book, the font used, or the colour of the sofa where the book is read.

If we study interaction from the point of view of interactor X in figure V.10.1 below, what happens comes from a change in X (a modified internal state), which in turn through actions affects Y, the context, and the interaction. The interactor Y mediates, i.e. pre-interprets, prepares, and transports actions, information, humans, or things to and from context. The context can be any type of environment, i.e., other interactors, physical environment, task, situation, internal state of Y, or even aspects of X.
A mediator is neither limited to a channel that only tunnels information, nor to a filter that only reduces or shapes data. But, these operations are good metaphors for possible mediations.

X and Y establish the interaction and soon become experts in it. As long as the mediation behaves orderly X might not even be aware of Y, just of the results that Y provides. But, if for some reason X and the mediation suddenly lose synchrony, i.e. the expectations of X and the results of Y somehow become incompatible, a breakdown occurs, and X needs to focus on Y, and on how Y works, rather than on the result of the interaction. One example is when you run Internet Explorer® and expect your favourite web page to show up, and instead the message "The page cannot be displayed" appears on the screen. This will abruptly force you into debug mode and to think about what you did wrong (or, for someone with good self-confidence, about what Microsoft did wrong). If you tear out a page from a book, a reader will experience a breakdown as he reads past the missing page. Another example is that you ask your son to do the dishes, and he actually does them without fuss! Really strange, an investigation is necessary. If you remove the rails for just a few meters the passengers of the train will certainly experience a breakdown.

Figure V.10.1 Interactor X uses the information from the context mediated by Y.

The page cannot be displayed. The page you are looking for might have been removed, had its name changed, or is temporarily unavailable. Or maybe this is a joke...

We are now quickly approaching the situation when we will not survive without the Internet. It is our most important tool for interaction. What is causing this trend towards distributed systems?

To start with, the problems are distributed; they surface locally, in vehicles, at home, on the factory floor, and in hospitals.
Problems are also heterogeneous, meaning that distributed control is best and that adaptation to the environment is needed, further localising the solution. Next, the problems are increasingly complex, indicating that different kinds of expertise are needed for solving them, and this expertise is also distributed. Furthermore, distribution better supports robust, error tolerant systems. If we combine the locality and the complexity of problems, distribution seems inevitable. We are currently building the infrastructure for distribution (the Internet) and inventing tools for implementation. The main problem facing designers of distributed systems is that context is not as easily shared as in a local interaction.

Distribution gives things an advantage over humans since they can easily connect to the network. We tend to think of a thing as a single autonomous unit, because this is how we usually see ourselves. We get another perspective if we instead consider ourselves as a co-operating collection of cells. A thing could be as small as a cell, designed by man such that many of these tiny things co-operate. Hundreds or thousands of relatively limited units could organize themselves into appropriate structures. If they move around they could for instance reorganize themselves into a shape that best fits the terrain, perhaps to climb a stair. An even more visionary application is units at the molecular level that together physically model 3D data.

Assume that storage capacity improved by a factor of 1000 over network capacity. Would that change how we build and execute applications?

Assume that the capacity of computer networks suddenly improved by a factor of 1000 compared to computer capacity. Would that change how we build and execute applications?

V.10.1 Mediation as a model

After the introduction to mediation in the previous section it is time to introduce our next model.
Below, in figure V.10.2, the roles played by the mediator Y are ordered according to their level of mediative power. A tool has a relatively low mediative power and is typically fully controlled by simple commands. It could be as simple as a hammer, or as complex as Word®. Its main objective is to simplify things for the interactor using it (chair, positioning device, database), or help it achieve otherwise impossible tasks (screw driver, Apollo shuttle, camera, Word®).

Figure V.10.2 Mediator roles ordered according to complexity of mediation: noisy channel, tool, medium, social actor.

Models are the mediating artefacts of design. David Benyon

A mediative tool is quite a broad concept, so to make it useful we need to refine it into categories. There are many ways to do this, but we will follow the suggestion from [BF2] in which a tool can have the following mediative functions (X and Y from figure V.10.1 above):

• Reduction: Y reduces, focuses the actions possible for X.
• Tunneling: Y guides, transports X through a predefined sequence of actions.
• Tailoring: Y adapts possible actions to the characteristics of X.
• Suggestion: Y suggests actions to X at the right time and place.
• Conditioning: Y teaches X by encouraging an action.
• Self-monitoring: Y monitors X.
• Surveillance: Y monitors context.
• Rehearsability and reprocessability: Y allows X to re-examine or edit an action before or during the interaction.

We will give more examples of tools in the following sections.

An interactor with the mediative power of a medium provides experiences and could allow an interactor to explore cause-effect relationships. A book, or a calculator showing a range of curves, are two examples at the low end of the scale, and at the other end a full blown VR-environment could provide symbolic as well as sensory data and be of any complexity. Mediation is related to the concept of immersion, i.e.
the experience of being there in VR, but it is not identical. Immersion is the effect of an interaction as perceived by the receiver. Mediation is accomplished by one interactor directing output towards another.

The US and worldwide number of original book titles that have been published, both in and out of print: 65 million book titles. University of California

A social being is capable of even more powerful mediations, which could be used to establish and maintain social relationships. Being social means that the mediator has a fairly extensive model of the other interactor and uses this model to modulate powerful representations. A social actor could also assume a role that matches the objective of the mediation. As this type of mediator is capable of creating and manipulating a fictive world it can also be classified as having a narrative intelligence, and be trusted or not.

Viewing interaction as mediation by a tool, medium, or a social actor can be used to grasp the complexity of persuasive computing that will be discussed in Chapter V.15.2 [BF2]. The model above provides a framework for possible mediative roles. It is not intended for describing the details of the mediation, or how it comes about.

V.10.2 The medium

A medium is the realisation of a symbol system (images, sounds, texts, sign languages) including its implementation and how it is used [LEJ2]. Many theories have been developed to model information and information exchange. One is the model from information theory in figure V.10.4, pioneered by Claude Shannon, and often used to describe communication between computers.

Figure V.10.4 Shannon's model of communication: information source, transmitter, channel (disturbed by noise), receiver, destination.
In this model, communication is seen as a method to transfer information from information source to destination through a channel. One example of an information source is an image. It starts out as light (physical representation) that is digitised into ones and zeroes. This digital representation is transmitted over a wireless network using oscillating electromagnetic fields (an analogue physical representation) to the receiver. The transmitter formats and modulates data such that the channel is used as effectively as possible. One of the problems facing the communication engineer is that both the transmitter and the receiver have characteristics constraining the possible transmission bandwidth. The eye and the fingertip allow quite different possibilities for communication.

The channel is the medium with properties that enable communication, but also with a maximum bandwidth limiting the transmission capacity. It is for instance impossible to transmit video with the same quality as television through changes in air pressure, i.e. as sound. Some channels are considered digital since data is represented digitally, but even digital data need representations in the analogue physical reality to stay alive.

The physical channel has its own set of problems. One is damping and another one is that noise can disturb the channel, possibly distorting the message. The fact that the signal intensity decreases as the distance to the sender increases has major implications for human and animal social life. It is for instance one reason why a family is such an efficient organisation. The channels people use are certainly not ideal for communication. A high level of background noise, such as an ambulance screaming through the room, can be quite disturbing. Discrete digital technology such as computer networks is built to circumvent some of the problems. You rarely get a slightly faded e-mail message.
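
How bandwidth and noise together limit a channel can be made concrete with the Shannon-Hartley formula, C = B log2(1 + S/N), which gives an upper bound on the error-free bit rate of a noisy analogue channel. The sketch below is only an illustration: the function names and the example figures (a 3 kHz channel with a 30 dB signal-to-noise ratio) are assumptions chosen for the example and do not come from this chapter.

```python
import math


def db_to_ratio(snr_db: float) -> float:
    """Convert a signal-to-noise ratio in decibels to a plain power ratio."""
    return 10 ** (snr_db / 10)


def channel_capacity(bandwidth_hz: float, snr_db: float) -> float:
    """Upper bound in bits per second for a noisy analogue channel.

    Shannon-Hartley: C = B * log2(1 + S/N), with B in Hz and S/N as a
    power ratio.
    """
    return bandwidth_hz * math.log2(1 + db_to_ratio(snr_db))


if __name__ == "__main__":
    # Hypothetical example values: a 3 kHz voice-grade channel at 30 dB.
    print(round(channel_capacity(3000, 30)), "bits per second at most")
```

Doubling the bandwidth doubles the bound, whereas a better signal-to-noise ratio only helps logarithmically. This is one way to see why a narrow channel such as sound cannot carry television-quality video.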
Most signals in nature have a behaviour in the frequency domain such as the one shown in figure V.10.5: relatively high amplitudes at low frequencies and lower amplitudes for higher frequencies. One example is human speech.

Figure V.10.5 Typical frequency behaviour of a signal in nature (amplitude versus frequency).

The channels used by nature have a different shape. They typically exploit some physical resonance phenomenon at some frequency, which gives them a characteristic bell shape, see figure V.10.6. At the peak frequency the transmission capacity is maximal. One example is the human ear, which drops both the lowest frequencies, below 20 Hz, and the frequencies above 20 kHz. Human eyesight has a similar shape.

Figure V.10.6 Typical frequency behaviour of a channel in nature with a bandwidth of B Hz (transmission capacity versus frequency, peaking at the resonance).

The bell shape results in a channel bandwidth that relates to the amount of information that can be sent. According to Shannon the channel must provide us with a capacity that surpasses the demands of the signal. The resulting curve cuts off signals with low frequencies, but since low frequency indicates low information rate, not too much has been lost. High frequencies are cut off as well, and this is another trade-off. Very high frequencies imply high signal energies with high transmission costs. They also indicate problems in generating and receiving the signal.

The channel could filter, or transform, a message. One example of a filter is a television set that does not allow certain programs before 5 pm. An example of a transformation from the same context is the subtitling of foreign movies.

To compensate for noisy channels we add redundant information in the message. Natural language, for instance, contains lots of redundancy, up to 50 percent. People with different dialects saying the same word will sound quite different, but the redundancy will help us guess the right word.
Interestingly this task is quite difficult for computers since redundancy in human-to-human communication is not easy to represent. Data communication between computers uses its own breed of redundancy, carefully calculated and added on purpose to counter noise.

Let us list some more examples of transmitters and media:

• Image: painter, water colours, light waves, museum.
• Music: loudspeaker, differences in air pressure, rock concert.
• Data item: computer application, light wave, web page.
• Anger: eye brows, gun, bullet, Western movie.

It can be difficult to characterise something as a transmitter, channel or medium, or as a message, without going into deep philosophical discussions. Sometimes the medium is itself the message, an idea put forward by the Canadian media educator Marshall McLuhan. Think about MTV for instance, selling itself as a music video.

The abstraction level at which we choose to view the message, the transmitter, and the medium can differ enormously, from the lowest level where atoms collide to the highest (?) social level. At a social level you are for instance supposed to understand that an angry glance means that you are standing too close to someone else.

V.10.3 Pragmatics

Pragmatics is the study of language use in relation to language structure and context of utterance [ADFH]. It is a multidisciplinary study (like most other topics in this book), involving at least linguistics, philosophy, psychology and sociology. We extend our area of interest from a single sentence to the study of discourse, i.e. the study of more than one sentence connected to a system of related topics [ADFH]. It might seem that language, in a straightforward, automatic manner, can be split up into sentences and terminals that correlate to objects, structures, and meanings in the world. But, this is only true if the context is strictly limited [TWD2]!

Pragmatics: Interpretation I of representation R of world W.
(Representation, Meaning, Interpretation) <=> (Syntax, Semantics, Pragmatics) Peter Wegner, Brown University

The intended effect on the listener is usually one of the following [JS]:

• Conative: deals with an action by the addressee as in an order or request, "Do this", "Answer my question".
• Expressive: used to communicate states or beliefs, "I feel fine", "I believe in you".
• Referential: relates to the state of the world or of a third party, "It is raining", "He seems to be in a good mood".
• Phatic: tries to open the channel or keep it open, "Hello there!", "Please continue", "Copy that, over", "Acknowledge".
• Metalinguistic: information that concerns messages, or the communication itself, "This is important", "Say that again".
• Poetic: aesthetics, describing something such that it arouses the receiver.

Another taxonomy of possible effects starts with two basic messages: assertions and queries. An assertion informs, it states a fact, and a query is issued to retrieve some information. From these basic types we can identify others, such as reply, explanation, command, permission, refusal, offer, promise, acceptance, denial, and so on.

Conversations let us externalise questions and thoughts that we have, and as we express ourselves we often see things from new perspectives. As listeners we have to interpret messages, and this will tune our mental models. The process is fundamentally social and feedback is essential. Speakers constantly adapt the message to the state of the audience, and we always modulate the message, giving it a personal touch, consciously or subconsciously trying to impress the listener with our personality.

Figure V.10.15 below illustrates the structure of an electronic knowledge community where conversations are taking place [JC]. Circles represent conversation topics; a large circle implies higher activity. Black filled circles indicate new topics, and grey circles show participants contributing to a conversation.
In a fascinating way, the figure shows how knowledge is created and who is contributing where.

Figure V.10.15 Topics and participants in conversations.

"Clouds appear and bring to men a chance to rest from looking at the moon."
Basho

Conversation is essential. We use conversation as a medium for decision making. It is through conversation that we create, develop, validate, and share knowledge.
T. Erickson, W. Kellogg [JC]

Conversational distance is maintained with incredible accuracy (tolerance of an inch).
E. T. Hall, Dance of life

Another illustration from the same source [JC], the social proxy for social awareness, complements the first by showing activity within a topic of conversation, see Figure V.10.16. A conversation is once again shown as a circle, and each labelled circle denotes a participant.

Figure V.10.16 Activity in conversation (participants labelled 1 to 4).

Participant 1 is not contributing at the moment and participant 4 is the most active, as indicated by the distance to the centre. From the figure we can see that someone is active, but we cannot hear what is said, much in the same way as we can see two people talking on the other side of the street. Also, we cannot see whether the other participants are paying attention, but this is difficult in H-H face-to-face interaction as well.

The social information in the figures above is presented in new, abstract ways. It is not a direct copy from the physical world to virtual reality. By reviewing traces in diagrams, such as the one in the figure above, we can follow the history of conversations in a knowledge community and update ourselves on what has happened. This illustrates the fact that as we enter the digital world our conversations will no longer disappear. They will persist and possibly be reused.

V.10.4 Social dynamics and timing

We do not exchange messages randomly with other people. There are patterns and rules that in effect define our culture.
Here we will briefly describe discourse, turn-taking and conversation. Discourse in this context, also referred to as talk exchange, is the general behaviour when sentences in a discussion of some topic are adaptively combined into a symmetric interaction controlled by rules and principles.

Give me a place to stand on, and I will move the earth.
Archimedes

Turn-taking, also called floor passing, sets the rules for who talks when. A talk exchange is started by an opening utterance to attract attention. After the opening there are different principles governing the next speaker's entry into the exchange. It is done either by appointment from the current speaker or, if there is a sufficient pause, by breaking into the conversation. If the pause is too short the interrupting speaker is considered rude, and if a pause is too long the conversation will be inefficient. One alternative procedure is to pass a token around; whoever holds the token is allowed to speak. We also use our eyes for turn-taking, and this is one reason why face-to-face interaction is more efficient than interaction over a voice-only medium.

The timing in an ordinary speech conversation could be extremely tight, down to 1 ms, a delay that is quite difficult to match with current technology. About one fourth of between-speaker intervals are shorter than 100 ms. This is such a short time that the next speaker has already decided what to say before the current speaker is finished.

V.10.5 Meaning and inferential model

We are still left with the problem of how a receiver extracts the intended meaning from an utterance or an image. The message model from Shannon, described in Chapter IV.3.1, will not do, and speech acts, see Chapter V.15.6.1, only give us a fresh view and a taxonomy for sorting utterances. This section will discuss the problem of finding meaning. Surprisingly, too much information is often a worse problem to tackle than lack of information.
When faced with a lack of information, the obvious behaviour is simply to get some more. The problem with too much information is to find the correct and relevant information hidden among the false or irrelevant. A small group can handle the problem of abundant information by at least five basic strategies [AD4]:

- A first one is for a member to take some action, perhaps by asking other members of the group questions, or by trying to force opinions on them, and then awaiting the response.
- A second strategy is to combine information in a variety of formats (text, graphics and others) and from several sources, e.g. colleagues, family members, or the Internet. Combining the different aspects or forms of information hopefully clarifies matters.
- The next strategy is to use contextual information, for instance by drawing conclusions from similar events at another time or place.
- A fourth approach is to carefully reason about, and contemplate, everything learned from the three previous strategies. Lean backwards, arms behind neck, feet on the table, and start humming. This (creative) process can take a long time.
- A final strategy is to put yourself in the position of others and try to understand their reasoning and understanding. This is a good way of establishing a common ground.

A man with a watch knows what time it is; a man with two watches isn't so sure.
Anonymous

Language is a way of representing knowledge that has been used over generations. Such a complex system, evolved over many years, is not logically well defined. Our language interacts with the lives we live and can therefore only be understood in relation to man and society. It is continually revised and refined to support its users, and at the same time it changes the behaviour and structure of the users and their community.
We can exemplify the effect of evolution by the many words for different kinds of snow that Eskimos have been reported to have, and by the claim that there are African tribes with no word for green. Green is not a very useful word in the jungle, where everything is green. The claim about the many Eskimo words for snow has been denied [RN], but it is an illustrative example.

Furthermore, expressions are ambiguous. The listener must make use of a context for a proper interpretation, and one example is that the phrase "You poor thing" will have a new meaning after reading this book. Even before reading, the utterance is underdetermined, i.e. who or what is being comforted is not uniquely determined. Yet another problem is that the utterance by itself gives no clue as to the intention of the speaker. The expression "You must do that!" might be a positive suggestion, or a direct order, according to circumstances.

To really understand someone, that is to have a deep, successful interaction, involves a common understanding of a shared context. The context of a real world conversation is extensive and makes almost any conversation stochastic and difficult to control. Even simple words such as "end" and "close" have long lists of meanings in Webster's electronic dictionary. Luckily, humans are good at keeping a goal-based conversation stable. The utterance "This is the end", said by the sink or while splicing a rope, helps, or rather forces, the listener to make different inferences. To make sense of a sentence like "This is not fair!", an interpreter has to recognise the situation of the utterance, and what "fair" means to the speaker.

The conclusion is that language cannot be understood as transmission of information only. If participants in the conversation have the same inferential model, and the same contextual understanding, then the meaning can be re-generated in the listeners. The same inferential model means that both the speaker and the listener come to the same conclusion from the same facts.
Given the statement "I'm sure that the cat likes you pulling its tail", the listener (hopefully) recognises that the animal-loving speaker could not be speaking literally, and infers that the speaker means the opposite of what is said. Two humans never have exactly the same inferential model and the same contextual understanding, even though they are often similar enough for communication. Consequently two conversations are never the same. As the individual fragments of a conversation are fused into a coherent message, ambiguity is reduced, and a reasonable guess can be inferred about its meaning. Furthermore, people remember different things, so the memory of the last interaction will also be different for the participants.

qanuk - 'snowflake', qanir - 'to snow', qanunge - 'to snow', qanugglir - 'to snow'
Snowtalk in Central Alaskan Yupik

Despair or gratitude?

One of the few words with more meanings than the word "thing" is the word "anything".

Comment on the following: "If a lion could talk, we could not understand him."
Wittgenstein (1953)

If the participants use separate sets of rules, or have significant differences in their knowledge, the speaker has to adapt to the listener and prepare the message carefully. One example of this is an adult trying to explain something to a very young, unknown child. The child's parent, who knows the child better, sometimes has to intervene and translate. We constantly adapt our speech depending on who is speaking and listening, what the dialogue is about, where it is taking place, and how it proceeds.

To the list of problems in interpreting utterances we can add that we do not always mean what the words say. When using irony we even mean the direct opposite.
We sometimes speak indirectly, as when we say to the service man "My car has a flat tire", and what we actually mean is that he should help us change the tire; or "It's getting late", which in fact is a request to hurry; or "The door is over there", asking someone to leave! To top all of the above there is humour. Other important presumptions about the speaker are sincerity and truthfulness.

How is a computer supposed to make sense of this? The solution is that it has to understand, and use more of, the context, specifically human intentions, and that it has to study the human language.

Thinking about interpretations has resulted in the theory of hermeneutics, where one dispute is whether a text has a meaning independent of interpretation, or if the meaning is created while reading [TWD2]. This distinction is important because if each individual always recreates meaning, all individuals in a society have to learn to interpret language in the same way. Otherwise society will not work. The focus will be on dynamic interaction rather than static representation. The importance of interpretation has been taken even further by the philosopher Heidegger, who claims that existence is interpretation, and interpretation is existence [TWD2]. We do not even exist if we do not interpret!

[Figure: Egyptian hieroglyph for movement, walking.]

The overall trend is that more and more time is spent on reading and updating information, and that the updating is continuous rather than done at any specific time. Examples are numerous: ICQ and similar online chat programs complement e-mail, and television sets in every room show the news all day long. Infrastructures support the trend: telephone modems are exchanged for ADSL "always online" modems, and mobile phones now have standby times of several days. Why this enormous appetite for information?

V.10.6 Infrastructure of H-T, H-I interaction

The current infrastructure for HCI will soon change.
The PDA (Personal Digital Assistant), also called the PIM (Personal Information Manager) or the PID (Personal Information Device), and the mobile phone are the predecessors of more advanced wearable, networked, and computerised tools. The main technical challenges are limited computational resources, low communication bandwidth, memory limitation, power conservation, and raised radiation levels.

It is interesting that many infrastructures seem to develop towards distribution. The fixed power outlet is more and more distributed using advanced battery technology, wireless networks will soon be more prevalent than their wired ancestors, and computing is no longer performed by one central resource.

A long-term vision is the distributed wearable computer, built from wearware and communicating over a high bit rate body area network. It is so small that it can be integrated into clothes and jewellery, and it uses sensors and effectuators all over the body. The intimacy with the user opens up some interesting possibilities. First, the computer could more easily learn behaviour from its owner, a single individual. Second, the close synergy between the computer and the user makes it easier to protect the system from attacks.

Water anytime, anywhere. Camel association to ∞*∞

Cheese!

The criteria for identifying a wearable computer are:

- It may be used while the wearer is in motion, or is doing something else.
- It exists within the corporeal envelope of the user, i.e. it should not be merely attached to the body, but become an integral part of the person's clothing, see illustration to the right.
- It must exhibit constancy, in the sense that it should be constantly available.

Wearable computers introduce new ergonomic problems. Wearability, i.e. the physical shape of the wearable and how it interacts with the human body, is important for equipment worn throughout your life.
We have an aura around our bodies that the brain will perceive as part of the body, and where permanent equipment could be placed. We all have different physical forms and dimensions that the equipment must adapt to, and the equipment must not be too heavy. It should not dissipate disturbing heat, it should look good, and it must somehow get access to energy [FG]. New technology also has to be socially acceptable. If the telephone earphone-microphone unit is so small that it is invisible, then the socially accepted behaviour could well be to insert a finger in the ear to indicate a telephone conversation.

What time is it?

Do you want to use a graphical user interface based on Windows® for your wearable computer? Alternatives? The answer might be right before your eyes.

At the same time as we dress up in computers, the computers will invade objects in our environment. If you want to enter text you use any keyboard nearby. The feedback from the system will appear on your PDA, or on any other appropriate display. You could move the dials of any clock to set your alarm clock at home. The possible interactions are unlimited in a world filled with things that continuously look for other things and combine with them into more powerful interactors.

"If every work place had a gravity invertor, productivity would be greatly increased and repetitive strain injuries would decline instantly." Only 99.95$

Embedding the user interface into the environment rather than carrying it around has many implications for the user interface. The availability will be better since wireless networks and small battery based power sources can be avoided. On the other hand, users might not trust an embedded ubiquitous system, and privacy is certainly an issue with public displays.

The interaction between a computer and a human user could be described in different ways, see Figure V.10.22 [SM]. Interaction shown in part a) is the basic form that we have discussed.
Part b) of the figure shows the typical situation where a human accesses context and uses the thing as a computational device, or as external memory. In interaction c) the thing works as a filter, and in the last interaction, d), we could say that the thing uses the human as a resource. Interactions b) and d) are complementary views: either H uses T, or T uses H.

Figure V.10.22 Different interactions possible between H and T [SM] (parts a to d).

What behaviour from your wearable computer do you prefer? Should it just passively filter data, i.e. without personality, or should it be your personal attendant to which you direct explicit questions and discuss interesting issues, i.e. a light schizophrenic touch.

People live in their environment, they don't use it.
E. Stolterman

All interactions with things need not be computer based; compare the paper and the pen. Adaptive materials for specialised tasks will be available, and there are all sorts of hyped technologies, micro mechanics, nanotechnology, and others, that eventually will deliver.

Money makes the world go around, and it is also needed to build and maintain new infrastructures and services. One question is how users (I and H) should be charged to support further development (and generate the necessary profits). Charging by network service is acceptable if the user understands the relation between the value of the application and the money charged. One example is to let the user pay per byte received and sent, but this is a relation that is not always easy to understand. Charging by time is the traditional telecommunication solution. As a principle it is simple to understand, but no one wants to pay when doing nothing, which means that most of the computers connected to the Internet will be turned off most of the time, maybe not the best scenario for new services. The third option is to charge users depending on the applications used.
Now the problem is how to compensate the network operators; they need money to improve the network. Maybe a combination is the best solution, or maybe some other solution, such as governmental support?

Shape memory alloy is malleable at low temperatures, but above a "transformation temperature" it becomes hard to deform and pushes back to its predetermined shape.
The Chinese University of Hong Kong

V.10.7 Screen or paper as the medium

It is somewhat unfair to compare reading from paper to reading from a display. Paper print technology has evolved for 500 years, from hand printed characters to standardised Times New Roman in millions of documents. Line separations, margin widths, sentence length, and many other aspects have been optimised for paper over the years. But perceptual and cognitive experiments show that if the conditions are the same (good lighting and contrast, equal resolution) there is no difference in reading speed or task completion. Today, paper is still superior in resolution, but this is only a technical problem.

Some potential problems with reading from a display are that fonts can be poor on low resolution displays, that one page of paper needs several screens, and that you have to sit in front of your bulky display, which is both an ergonomic and a practical problem. Do you really want to read your good night story from a desktop computer?

[Figure: virtual version and paper version.]

On the other hand, an electronic reading board, e.g. a Tablet PC, has all the important features of the book, for instance the possibility to flip pages, and a comfortable format. It also has some extra features from being a virtual rather than a real book. You can add and edit personal notes, ask the board where you stopped reading yesterday, read in the dark (!), and even change to another book by a short command.
Of course there are some less wanted side effects, such as that the board switches itself off when the battery level is too low, and that it smells of plastic rather than of paper.

Technology development also explores the complementary possibility, i.e. to make the paper more flexible. New types of paper have recently been invented where the paper can be electronically updated. You can use the same newspaper every morning, but with a different content electronically updated every night.

Imagine the applications you could build if technology allowed smooth transitions from paper to digital media. You use a paper based book when a digital environment is not available, e.g. on a sandy beach, and switch seamlessly to a digital representation when the environment allows you to and it is favourable. There are several interesting research projects aiming at such a smooth integration of paper and digital information. In one project a real desktop is used as a computer screen, and a video projector generates the output images on the desktop, for example an electronic version of the paper book you read when at the beach [HK]. A video camera collects information about real and virtual objects on the desktop and enables direct manipulation by analysing hands and fingers found in the video. A problem with the idea is that you have to clean your desk.

V.11 Interaction control, a joint venture or one of us in control

Many systems only have a single controller: an algorithm, a finite state machine, a boss, or a policeman. In general, however, an interaction involves two or more participants, each potentially capable of controlling the interaction. If there is a clear objective for the interaction along with some rules to follow, and if the participants agree on the rules and the objective, these constraints might work as a control. A game of soccer is one example, where two teams with 22 players in total agree on rules and a goal.
Actually three goals, but we leave the two with nets out of the discussion. The goal here is usually to have fun, get some exercise, and entertain the spectators. It is rare that one member of a team takes the ball and tries to score with his hands. It is also rare (at least for grown-ups) for someone to pick up the ball in the middle of a game and leave the field.

The interaction, in this example a game of football, does not follow any predefined agenda. It is impossible to predict that the result will be 1-0, and that the goal will come from a free kick. No one is in complete control of the match; all 22 players collectively control it. This interaction is in other words an example of distributed control, where peers communicate and decisions are taken locally. Distributed control is nicely exemplified by any H-H interaction between equals, where each of the interactors is autonomous and intelligent. Such an interaction can be labelled as co-operation, competition, or as a mixed-initiative system, and will be further discussed later in this chapter.

As a special case, one or more of the participants act as subordinates. They can for instance have very simple behaviour, such as elevator push buttons. This leaves another participant in control, i.e. in centralised control. If you, an intelligent driver, see a car accident, you will command your car to steer away from it. Many interactions with centralised control can be described as command oriented, a view taken in the fields of human-computer interaction and control theory. Examples of commands are pushing a button, and military commands such as "Attention!". Commands are convenient, but they have some limitations. It is for instance difficult to use a single command to describe concurrent actions, e.g. to command two mice, or to manage coordinated action, i.e. two different actions synchronised in time. Command based interaction will be discussed in Chapter V.15.
Communication enables social conventions and other rules. The nature of the rules is a trade-off between imposing too strict restrictions on interactors, and chaos. Imagine driving in the absence of traffic rules. Without rules, attaining anything at all in a social environment would be very complicated.

Information could be passed directly between interactors, or be mediated through contextual representations, as when we leave a note on the refrigerator door. In the following matrix, adapted from [HP], the ways of transferring information (indirect, direct) are combined with the possible control structures (centralised, distributed):

                                 Centralised control    Distributed control
Direct information transfer      Command oriented       Conversation
Indirect information transfer    Constrained            Stigmergy

Table V.11.1 Different control structures and information transfer.

AP reports that IBM'er David Bradley, who came up with the Ctrl-Alt-Delete key combination, is retiring.

[Figure: parental remote control with buttons Mute, Go away, Buy it, Pizza.]

Centralised control and direct information transfer together correspond to a command oriented interaction. If information transfer is indirect rather than direct, the distinguished interactor, who is in control, constrains other interactors by changing exogenous context representations. A tired parent does this when he turns the electricity off in the whole building because the kids refuse to leave the PC and go to bed. Direct distributed interaction, for instance a conversation, gives us humans much pleasure.

Stigmergy, or indirect distributed interaction, relies on actors changing the context and perceiving the changes. When there is no more milk in the refrigerator you know that your children have been at home for a quick afternoon snack, and that your next mission is to restore the level of milk for dinner. If you yourself also wanted a glass of milk, the stigmergy specialises to competition [HP].
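The refrigerator example of stigmergy can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the class `Refrigerator`, the functions `child_snack` and `parent_check`, and all quantities are hypothetical names and numbers invented here, not part of [HP]. The point it tries to show is that the two actors never exchange a message; each one only changes, or perceives, the shared context.

```python
from dataclasses import dataclass, field


@dataclass
class Refrigerator:
    """Shared context: the only 'channel' between the actors (hypothetical model)."""
    milk_litres: float = 2.0
    log: list = field(default_factory=list)  # trace of changes, kept only for inspection


def child_snack(fridge: Refrigerator, amount: float = 0.5) -> None:
    """A child changes the context; no message is sent to anyone."""
    taken = min(amount, fridge.milk_litres)
    fridge.milk_litres -= taken
    fridge.log.append(("child", -taken))


def parent_check(fridge: Refrigerator, needed_for_dinner: float = 1.0) -> float:
    """The parent perceives the changed context and reacts by restocking."""
    shortfall = max(0.0, needed_for_dinner - fridge.milk_litres)
    if shortfall > 0:
        fridge.milk_litres += shortfall
        fridge.log.append(("parent", shortfall))
    return shortfall


fridge = Refrigerator()
for _ in range(4):            # four quick afternoon snacks empty the refrigerator
    child_snack(fridge)
bought = parent_check(fridge)  # the parent infers the visit from the milk level alone
print(f"milk bought: {bought} l, milk now: {fridge.milk_litres} l")
```

If the parent also drank from the same refrigerator, both actors would draw on the same limited resource, which is the competition case mentioned above.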
V.11.1 Coordination

All interaction involves coordination, where an actor communicates and synchronises its actions with other interactors, or more generally with its context. Heavy traffic in a roundabout is a good example; a driver entering either has to wait for a long, safe access slot, or somehow has to communicate with the cars in the roundabout, telling them "I am coming in, move!". Another example of coordination is playing a game of chess, where coordination is based on very exact predefined rules.

Any coordination passes through the same phases: it is initiated (drivers enter the roundabout), one of the interactors makes a proposal for adaptation (waves a hand), a decision is taken whether to go ahead (brake or speed up), the decision is executed (iiiiiiih, screeech), and the outcome is evaluated (phew!).

Definition: Coordination is the process of managing interdependencies between activities.
Malone

There are three types of coordination: unplanned, implicit and explicit. Explicit coordination is what we usually mean by the term coordination. Interactors have goals, apply reason to their actions, and communicate. The coordination is realised by planning, by executing some predefined algorithm or protocol, or by joint intentions. Voting, a prompt for more information on the computer screen, and floor passing during discussions are examples of the control mechanisms used. Explicit coordination can benefit from mutual modelling, where all participants have a model of the other participants' internal states, and of their views on the current situation.

Unplanned coordination occurs when actions are triggered by the unexpected behaviour of interactors, or by the environment. Examples are that it is effective to shovel snow after it snows, that the web browser indicates a missing link, and that we should go to bed if we catch a severe cold.

In the third variation, implicit coordination, action motivators are predictable and built into the behaviour of the interactors, or into the environment.
Furthermore, direct communication for coordination is not possible, or not used for some reason. Typical implicit examples are social laws, e.g. you should call your mother on her birthday. Another example is a progress indicator that shows how much time you have left for doing something. While coordinating explicitly an interactor can communicate with the other interactors, but in implicit coordination it has to guess their intentions. This means that a model, both of the interaction and of the other participants, is necessary. Bluffing in poker is a good example.

It might seem as if humans have an advantage over other types of interactors, since their communication channels have many more dimensions (speech, newspapers, gestures), and since they are part of a society with developed social conventions. This will soon change. As things hook up to the network they will gain access to information via other things, and all this information will give them opportunities for interaction that are not available to people. If people want to keep up with the machines they too will have to hook up to the network.

Coordination implies a sense of presence, and communication, but also reuse of the taxonomy and models from Chapter V.10 on mediation. We for instance discussed different complexity of the mediation, as a noisy channel, a tool, or even as a social actor. Also, the characteristics of interaction discussed in Chapter V.9 are applicable to coordination.

Related to coordination is correlation. It is a mathematical measure that indicates the co-variation of two processes, for instance the mathematical models of two interactors. Without correlation there can be no interaction, but with correlation we can still only presume interaction. Correlation does not guarantee causality, but hints at it, a fact exploited in the presentations of many fishy statistical investigations.
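To see why correlation only lets us presume interaction, consider the following Python sketch. It is an illustration with made-up data, and every name in it (`pearson`, `common_cause`, `process_a`, `process_b`) is an assumption introduced here. Two simulated processes never influence each other, yet they co-vary strongly because both depend on the same hidden factor; the measure chosen is the ordinary Pearson correlation coefficient.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# A hidden common cause drives both processes; they never interact directly.
common_cause = [rng.uniform(10, 30) for _ in range(500)]
process_a = [2.0 * c + rng.gauss(0, 3) for c in common_cause]
process_b = [0.5 * c + rng.gauss(0, 1) for c in common_cause]

# A third process that shares nothing with the others.
independent = [rng.gauss(0, 1) for _ in range(500)]

r_related = pearson(process_a, process_b)      # expected to be high
r_unrelated = pearson(process_a, independent)  # expected to be near zero
print(f"a vs b: {r_related:.2f}, a vs independent: {r_unrelated:.2f}")
```

A high value for the first pair says nothing about whether `process_a` affects `process_b`; here, by construction, it does not.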
The difference between correlation and coordination is that correlation is only a statistical measure of non-independence, while coordination involves communication between an actor and its context.

I reached out a hand from under the blankets, and rang the bell for Jeeves.
"Good evening, Jeeves"
"Good morning, sir"
This surprised me. "Is it morning?"
"Yes, sir"
"Are you sure, it seems very dark outside"
"There is a fog, sir. If you will recollect, we are now in the autumn – season of mists and mellow fruitfulness"
"Season of what?"
"Mists, sir, and mellow fruitfulness"
"Oh? Yes, yes, I see"
P. G. Wodehouse

V.11.1.1 Mechanisms

The mechanics of coordination can broadly be characterised as sharing and transfer [DP1]. In sharing there are different actions to obtain a resource, reserve it, or protect a resource already obtained. Transfer is accomplished by handoff or deposit. Handoff is a synchronised transfer of a tool or an object. Giving a birthday gift, or using a protocol to update a virus definition list, are two examples. A deposit on the other hand is an asynchronous action, where something is left in space or time for someone or something else to pick up. Sharing always has a built-in potential for conflict, which transfer does not have. It is for instance not always easy to share an alarm clock.

When machines communicate over networks they use protocols, i.e. predefined control and data units, which are exchanged according to predefined rules. Humans also use protocols, but with the difference that for machines all the rules have to be rigorously and explicitly defined, and followed. A little slip from one of the participating machines will completely destroy the communication. Technology, as we know it today, obeys the rules, but this is not true for humans, who constantly, and many times joyfully, tweak rules. Protocols, in turn, need mechanisms on lower levels, for instance allocation of time slots, or conventions for when to speak.
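The distinction between handoff and deposit made earlier in this section can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real protocol: the classes `Receiver` and `DropBox` and the function `handoff` are hypothetical names chosen for the example. A handoff succeeds only if both parties are present at the same moment, whereas a deposit is left for later pick-up.

```python
from collections import deque


class Receiver:
    """Someone (or something) that can take over an object."""

    def __init__(self, name):
        self.name = name
        self.present = False   # is the receiver available right now?
        self.received = []


def handoff(item, receiver):
    """Synchronised transfer: works only if the receiver is present now."""
    if not receiver.present:
        return False           # nobody there to take it; the giver keeps the item
    receiver.received.append(item)
    return True


class DropBox:
    """Deposit: something left in space or time for later pick-up."""

    def __init__(self):
        self._items = deque()

    def deposit(self, item):
        self._items.append(item)          # the giver does not wait for anyone

    def pick_up(self, receiver):
        while self._items:                # the receiver collects whenever it suits
            receiver.received.append(self._items.popleft())


friend = Receiver("friend")
box = DropBox()

first_try = handoff("birthday gift", friend)   # friend is away, handoff fails
box.deposit("note on the door")                # deposit works regardless

friend.present = True
second_try = handoff("birthday gift", friend)  # now both are present
box.pick_up(friend)
print(first_try, second_try, friend.received)
```

The sketch deliberately leaves out sharing, where several actors would compete for the same item and some locking or reservation mechanism would be needed.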
Two mechanisms for the necessary communication are selective messaging and information sharing. By selective messaging we mean explicitly sending a message to a specific receiver or group of receivers. We can use speech, writing, gestures, or a number of different media supported by technology. A prerequisite is that we need to address the receivers, and this can also be done in many ways. One is by just looking at the receiver, another is by using a phone number. Information sharing on the other hand is indirect; the message is left in, or transferred to, a common medium without specifying the intended receiver. Message boards, overhearing a conversation, wall graffiti, and dogs marking their territory are some examples.

Selective messaging allows for efficient feedback, because it is easy for the receiver to acknowledge the message. This is exploited on the Internet, and elsewhere, by the client-server protocol. One example is when a client requests a service, e.g. a web page, from a server. Both selective messaging and information sharing can be used in a producer-consumer protocol. Here, feedback is not necessary; a one-way communication channel is sufficient since the producer only starts the consumption, see the figure below.

Figure V.11.1 Two patterns for communication: client-server (request, response) and producer-consumer (invocation, no feedback).

Another important prerequisite for communication is an infrastructure, at least if we want to distribute the message over a large population or area. Today the developed world has plenty of efficient infrastructures; newspaper distribution systems, television, and the Internet are some of the most prominent.

V.12 Co-operation, we are all in control

Because of co-operation more can be accomplished, but with the resulting coordination we face the problem of conflicts.
Accepting this trade-off is necessary, because without interaction individuals are deprived of adaptive capabilities and cannot reach their full potential. Co-operation might even aggravate conflicts. This paradox can for instance be seen when a war is expanded by different pacts and agreements. Other variations of related complex interactions are obstruction, exciting, agitation, threatening, and finally parasitism, where the benefits that one participant achieves come at the expense of all others.

Joint intentions are a necessary condition for co-operation. These are social commitments used for explicit coordination. The interactors publicly state intentions, and other interactors can use their statements for coordination: “If you promise to stop working before midnight there will be some tea prepared for you when you finish.” Publicly stated intentions provide stability in the world, since interactors do not have to re-evaluate everything all of the time. But this stability has a price; it is essential that enough interactors are sincere. Interactors commit to the overall objectives, as for instance all players in the Swedish national football team commit to winning the next match. When circumstances change, commitments sometimes have to be dropped. Conventions identify when to do this and which commitment to ignore. No player should give up a match just because England starts out with an early goal. On the other hand, all players should accept the result when the final whistle blows, even if Sweden should lose the game. (Which they won't.)

Definition: Convention - A practice or procedure widely observed in a group, especially to facilitate social interaction; a custom.

Language itself is nothing more than a convention which we choose in order to coordinate our activities with others [MW3]

In our human society conventions take the form of norms and social laws. They serve as patterns that simplify decisions on behaviour in everyday situations.
Note that a convention is a coordination mechanism. A difference between co-operation and coordination is that we are forced to inspect the interior of an interactor to determine if co-operation is taking place. If your children ate all of the candy it could be because they know you are on a diet, i.e. they co-operate with you, but perhaps it is more likely that they eliminated the risk of losing some, by eating it all.

For co-operation, once again the three levels of modelling, intentional, conceptual and physical, can be used [DB2]. An intentional view of the co-operation is about its purpose. Why is it taking place? Do all interactors share the same objective? The conceptual view considers information flow, information structure and possibly mental models of the interactors. What kind of interactors are involved? What are their relationships? How can their interaction be characterised? Could we use an auction to sell this merchandise? Are there similarities to other co-operation? Finally, the rubber meets the road at the physical level, where the appropriate view may be physiological, biological, or focus on a computerised implementation.

We can take the teacher-student interaction in a learning situation as an example. In this example the topic taught is the IF statement in Java. An intentional view will focus on the interactors' objectives. Hopefully they share the same objective: that the student learns what the teacher teaches. At the conceptual level the student knows that the teacher has valuable knowledge about programming languages, which in this case will be transferred in a classroom situation. At the physical level the teacher will use a blackboard to write Java code as the means of expressing his conceptual view.

Definition: Norm - a standard, model, or pattern regarded as typical

Nilikuonyesha nyota (mwezi) na uliangalia kidole tu. (Swahili)
I pointed out to you the stars (the moon) and all you saw was the tip of my finger. (English)

Co-operation is necessary because no single node has sufficient expertise, resources and information to solve a problem. E.H. Durfee

Co-operation = interactors acting in parallel + coordination of actions + resolution of conflicts. Jacques Ferber

Interactors in a system have objectives, need resources, and possess skills. Table V.12.1 below shows different situations involving the interactors. We will not go over all the possible permutations, only provide some examples. If the interactors have compatible goals, the resources are sufficient, and the interactors by themselves have insufficient skills, we have a good opportunity for distributing the work. The condition that all interactors share the same objective is called the benevolence assumption and greatly simplifies the task of the designer. Complicating the problem is that decisions often must be made at run time, i.e. we have a dynamic system with real time constraints. For moderately complex systems such behaviour is impossible to hardwire at design time, which means that the interaction has to adapt.

Goals | Resources | Skills | Type of situation | Category
Compatible | Sufficient | Sufficient | Independence | Indifference
Compatible | Sufficient | Insufficient | Work in parallel | Co-operative involvement
Compatible | Insufficient | Sufficient | Obstruction | Co-operative involvement
Compatible | Insufficient | Insufficient | Co-ordinated collaboration | Co-operative involvement
Incompatible | Sufficient | Sufficient | Pure individual competition | Antagonism
Incompatible | Sufficient | Insufficient | Pure collective competition | Antagonism
Incompatible | Insufficient | Sufficient | Individual conflict over resources | Antagonism
Incompatible | Insufficient | Insufficient | Collective conflict over resources | Antagonism
Table V.12.1 Aspects of goals, resources and skills give different types of interactions.

If, as in the fourth row in the table above, resources are also scarce, as well as skills, one way is to somehow multiplex the actors.
This is exemplified by the assembly line, where it would be too expensive to have a group of employees working in parallel at each stage of the production line. The solution is to train experts to perform one stage each in sequence as the product passes by. Try to think of examples yourself for the other combinations. Which row matches a high jump final in the Olympics? A boxing match? A war between Coca-Cola and Pepsi for more than 50% of the market? Queuing because the road is not wide enough?

Goals do not necessarily have to be incompatible even if they are not shared. We have many other cases where goals interact. If one actor achieves its objective this might partially fulfil the goal of another actor. One example is the birds that clean the teeth of crocodiles (why crocodiles don't occasionally swallow a bird is not clear).

In a house building competition in Los Angeles the winning team completed the house in four hours! The team was composed of 200 builders. [DHA]

V.12.1 Measures of co-operation

How do we measure the amount of co-operation, and why? The first of these questions is typically what an engineer would ask, and we can use the previous section on co-operation to formulate some qualitative statements [JF].

- In any co-operation the addition of a new interactor should increase the performance levels of the group.
- In any co-operation an action by an interactor should reduce actual, or potential, conflicts.

A measure that gives information on the level of co-operation also indicates problems in a group of interactors. No co-operation at all is usually an indication that something is wrong. The following quantitative measures of co-operation can be used [JF]:

- The number of adjustments to actions.
- The degree of parallelism, which depends on the distribution of tasks and on their concurrency.
- The amount of resource sharing.
- The level of non-redundancy of actions; co-operation is characterised by a low rate of redundant activities.
- The number of blocking situations.

What is measured is either how well a system of co-operating interactors works as a unit, i.e. how well it uses resources, or to what extent the system avoids the implicit problems of coordination.

If you have an apple and I have an apple and we exchange apples then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas. George Bernard Shaw

V.12.2 Mechanisms for co-operation

We will now discuss some mechanisms that give rise to, or are results of, co-operation: grouping, specialisation, and resource sharing. We start by noting that a necessary prerequisite for all of the mechanisms is communication.

We can increase the possibility of co-operation by grouping the interactors. This is easier if we decrease the distance between the interactors in space and time. Grouping improves reliability by increasing redundancy, and also enables parallel work on local problems. But the result is not always altogether good; suburbs for instance lead to congested traffic that no one likes.

With grouping comes a possibility for specialisation. It improves the solution of similar recurring problems, because expertise is reused to increase efficiency. But with more autonomous specialised individuals, vulnerability increases. Everyone, or no one, wants to follow a particular specialisation, and some cannot, even if they want to. It is difficult to be both a specialist and a generalist. A specialist therefore either needs to be highly adaptive, or depends on others and is less capable of surviving alone. As specialisation increases, compelled by efficiency and productivity demands, specialists will be more and more confined to their speciality.
The surrounding society will become more and more dependent on the specialities of the experts, and this has been, and will be, exploited by the experts.

It is hard to argue when you are alone.

[Figure: Hospital and Umeå University, illustrating generalisation and specialisation.]
Figure V.12.1 Generalisation and specialisation.

Sharing of tasks and resources is both a cause for grouping, and a result of it. Supply and demand, centralised or distributed control, and problems with coordination are some of the issues involved. Sharing resources is also a typical computer science problem. Complex software accesses all sorts of resources, such as hard disk drives and shared data and code segments, often allowing only one program at a time to access the resource. Task sharing also implies problem decomposition and allocation of subproblems to actors. One way to do task allocation is to publicly announce the available subtasks and have an auction where interactors commit to the tasks they select.

V.13 We compete, and compromise

Competition is a type of interaction. It is perhaps not as well behaved as co-operation, but it is found everywhere, and as we did with coordination we can describe it as implicit, explicit, or unplanned. Implicit competition is based on behaviour built into the interactors, or the environment. One example is competing for a place in a soccer team. There can be only 11 players per team on the field. Unplanned competition is when you get a call from the television station inviting you to compete in university mathematics with a 4-year-old for a nice red Porsche. Another interesting example is when less well-behaved people try to jump a queue. A queue is typically an explicit coordination, but the annoying ones turn it into an unplanned competition. Explicit competition is exemplified by all sorts of games where the interaction rules are clear and the players known.
Apart from an occasional runaway train, a few overheated nuclear power plants, and a rocket exploding because of a numerical conversion error, humans do not seem to be pitted against I or T. Conscious competition has so far only been seen in films such as Terminator.

But very little gossip is malicious – only about 5%, if that. And gossip does I think a lot of good, it’s an important way for keeping us all honest. Professor Nick Emler, Oxford University

Machines and software are however designed and used for someone's purposes, which are not always compatible with yours. The computer virus is an interesting example; it forces you to spend money on protection that is updated almost daily.

Because I and T change the human condition and behaviour they still can cause conflicts. They are limited in many respects that certainly will affect and constrain humans. Why do you for instance call a number and not a person? The unique characteristics described in Chapter IV.19 can be exploited, typically with goals related to increased efficiency. Speeding up travelling is one example, which is done at the cost of missing the scenery. You will only see a boring (safe) highway. Another example is software agents buying stocks faster than humans can manage.

Automatic machines are (so far) typically designed for deterministic tasks. When people work together with such systems the demand for predictable, repetitive action will also affect the human tasks. Decision speed increases, tolerances shrink, and quality spells increased productivity. Furthermore, coordination and communication must be done on conditions optimised for the computer. We need oil. Or is it the machines that need oil?

To continue the bad news, technology threatens jobs. Many simple tasks, such as making a cup of coffee, can be done by a machine.
There are also new jobs created with the introduction of advanced technology, but they are different, and often amount to controlling the behaviour of the system at a higher abstraction level. Compare digging a hole with a spade to using an excavator. There is a substantial difference in the skills needed, in economic investment, and in the demands for efficient use of the respective tool.

Conflict resolution is a good reason for interaction; if conflicts arise they must be resolved or ignored. Some useful methods for resolution are arbitration, justice, negotiation, bilateral agreement, laws, and regulations (which can be compiled or hardwired into things, but broken by humans). For software, prioritisation is often used. If two programs do not agree, the programmer or the operating system assigns each of them a priority, i.e. a number. The program with the higher number is favoured and is granted access to the resource. Alternatively each program could get access to the resource in turn for a finite amount of time. This idea is referred to as time-sharing. As discussed above, time-sharing will be a problem when humans and machines co-operate because of their different characteristics.

To strike a more positive note, improved technology should make it possible to adapt machines to us, rather than the other way around. Even without such adaptation we have used machines for hundreds of years and managed without too many problems. For a small example of how new technology changes our lives consider TV news. You no longer have to watch it at 19.30 sharp. You can record it and watch it any time you like. Soon you will watch it over the Internet, and select the news items that suit you.

Living in general, and competition in particular, force interactors to trade off alternative paths of action, i.e. to make decisions. There are several interaction strategies for decision making [SS].
The most obvious strategy is to follow a plan, and if we do not have a plan we make one. A more direct way is to match a goal using a set of currently available affordances and knowledge about effects of local actions, e.g. knock him out to win the fight. A third strategy is to use a history based selection of decisions, such as “This worked the last time”, “This usually works”, “This works every time”.

Decision making becomes difficult in interactive situations when we also have to consider what other interactors will decide. It is especially complicated when no one is willing to give information away. In a competitive situation we can still reach an agreement by using negotiation, or argumentation, which in turn need protocols. Representatives for different nations sit down at the negotiation table and try to find agreements, but first they have to agree on the rules: who is allowed to talk when, and for how long, i.e. they agree on the protocol.

Argumentation is a way to gain an edge in a competition. One actor tries to convince another using logical, emotional, visceral, or kisceral arguments. Visceral arguments are physical, for instance applause to support an idea. A kisceral argument appeals to the intuitive, mystical, or religious, as in “According to the Bible …”.

Eeenie meenie mini mo …

The Sanskrit word for "war" means "desire for more cows."

For very simple negotiations auctions are suitable, but they can only handle allocation of goods. The bidder with the highest bid wins the merchandise. If the negotiation is more complex, perhaps with several issues that have to be balanced, an auction is not flexible enough. We can still use a simple protocol, such as a series of rounds, with every actor making a proposal in each round, but we have to find some strategy for what proposals to make. The proposal chosen by us of course depends on those made by the other negotiators.
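The two protocols just mentioned, the auction and the series of rounds, can be sketched as follows. This is a minimal illustration under our own assumptions: the function names, the single-issue (price only) setting, and the fixed-step concession strategy are all hypothetical choices, and a realistic negotiator would adapt its proposals to those of the other side.

```python
def sealed_bid_auction(bids):
    """Return (winner, price) for the highest bid, or None without bids.

    bids maps a bidder name to the amount offered. On a tie the bidder
    listed first wins, which is an arbitrary choice made for this sketch.
    """
    if not bids:
        return None
    winner = max(bids, key=bids.get)
    return winner, bids[winner]


def negotiate(buyer_start, buyer_limit, seller_start, seller_limit,
              step, max_rounds=20):
    """Round-based price negotiation with a naive concession strategy.

    In every round both actors make a proposal. If the buyer's offer
    reaches the seller's asking price the deal is closed at the midpoint.
    Otherwise each side concedes a fixed step, but never beyond its limit.
    Returns (round number, price), or None if no deal is reached.
    """
    offer, ask = buyer_start, seller_start
    for round_no in range(1, max_rounds + 1):
        if offer >= ask:
            return round_no, (offer + ask) / 2
        offer = min(offer + step, buyer_limit)
        ask = max(ask - step, seller_limit)
    return None


auction_result = sealed_bid_auction({"Anna": 10, "Bert": 30, "Cecilia": 20})
deal = negotiate(buyer_start=80, buyer_limit=100,
                 seller_start=120, seller_limit=95, step=5)
no_deal = negotiate(buyer_start=80, buyer_limit=90,
                    seller_start=120, seller_limit=100, step=5)
```

The auction needs only one message from each bidder, while the round-based protocol needs a rule for when the deal is closed and a bound on the number of rounds, since the limits of the two actors may never overlap.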
When you buy a car you have to consider price, mileage, colour, conditioning, and also a salesman who keeps adding new arguments. It is sometimes a problem to find a rule for when the deal is closed, and for a highly complex negotiation it is even difficult to agree on what the negotiation is really about. Borders, settlements, weapons, terrorist definitions …

We can use many criteria to evaluate a negotiation. Pareto efficiency measures the global result of a negotiation. A negotiation result x is Pareto optimal if there is no solution x' such that at least one actor is better off in x' than in x, and no actor is worse off in x' than in x. Another criterion is that all actors should gain something by participating in the negotiation, i.e. rational behaviour is assumed.

In strategy, it is important to see distant things as if they were close and to take a distant view of close things. Shinmen Musashi No Kami No Genshin, also known as Miyamoto Musashi, from the book “Go Rin No Sho”, 1645

V.14 Computer-supported co-operative work

We can study computer-supported co-operative work (CSCW) at different levels of detail, e.g. as H-I-H, or H-T-I-T-H. The technology for CSCW, also called groupware, steps in between humans, borrowing interaction metaphors from formal meetings (video conference), and from the telephone (video telephone). As the supporting infrastructure evolves CSCW will be more and more important.

It is an accepted truth that human communication mediated by technology has a lower communication bandwidth. But is this necessarily true? Let us say that you want to communicate with someone in the dark. In this situation you could use an infrared camera to enhance the bandwidth. In fact, by using technology any human-to-human communication channel could be enhanced! Information display of previous experiences and automatic matching of interests are some possibilities.
“If, as it is said to be not unlikely in the near future, the principle of sight is applied to the telephone as well as that of sound, earth will be in truth a paradise, and distance will lose its enchantment by being abolished altogether” Arthur Mee, 1898 [JC]

New technology will necessarily change behaviours, but the effect of introducing a groupware system could take days, weeks, or even longer before it is seen. It for instance took mobile phones 10 years to become commonplace in Sweden. Considering how long it takes for a technology to change the behaviour of grown-ups, it is strange that very few research papers study how interactive systems affect the behaviour of children.

One example of changed behaviour is reported in [WM]. The commons of the EuroPARC office could be seen over a local network. The effect of this was that people showed up in waves at meetings. When three or four were seen, the rest of the participants appeared more or less at the same time. They used the video to optimise their time. Another example is that answering machines at one time were considered rude. A couple of years later it was considered rude not to have one!

The four different player categories of role playing games are: Achievers, explorers, socialisers and killers. R Bartle

For any collaborative application to be a success it has to fit into the context of the users. It must do so in many aspects, but first of all it has to be accepted by all of the users, in all of their roles. They should all gain something from the introduction of the new system. A groupware system is of no use if only one person in the department uses it. Other important aspects are:

- Communication media should provide adequate support.
- The transition back and forth between individual work and groupwork should be seamless.
- The groupwork should be integrated into the overall work process.
There are some fundamental physical and social constraints that cannot be overcome by technology. We have different time zones, unavoidable delays from the speed of light, cultural differences (when in Rome, do as the Romans do), differences between generations, language issues, and a healthy scepticism towards technology in general. We also have to accept inherent limitations in technology. The sender of an e-mail, for instance sitting at a stationary desktop, expects the e-mail to be read in a similar environment. But it could be impossible for the receiver, using a mobile computer, to follow an attached web link.

[Figure: a videoconference with a chairman (far away), bad lighting, a bored participant (close to camera), and remote participants (small images).]
Figure V.14.1 Typical CSCW setting [JG1]

The videoconference setting where two teams line up at each site encourages antagonism rather than collaboration, see figure V.14.1. This, together with the complexity of H-H interactions that we have discussed above, makes negotiation difficult [WM]. Better results with CSCW are reported when people together concentrated on the solution of a problem rather than on each other. One typical example is remote troubleshooting of a machine where video was used as an aid to pinpoint causes and follow up on repairs.

“Watson, please come here” First words by A. G. Bell on his new telephone

V.14.1 Taxonomy

How does a group accomplish its task? The first thing to acknowledge is that groups are complex social systems with both internal and external relationships. Many views are possible: economic, sociological, managerial, and educational. We will here introduce the TIP (Time, Interaction, and Performance) model by McGrath, which emphasizes that a group is a social system with a purpose [JM].

The immediate future is a concentration of series of open possibilities, and the mobile phone is increasingly indispensable for arranging these possibilities and establishing priorities. Timo Kopomaa.
In the TIP model groups are seen as simultaneously and continuously engaged in three activities. The first is production, i.e. getting the task done, including problem solving and task performance. Next, member support encourages the members and increases participation, loyalty, and commitment. The third function, group well-being, is to keep the group together as a social unit, for instance by management. Small groups carry out the three functions in four possible modes:

1. Inception (choice and acceptance of goal): a group working well quickly starts up work, and easily generates new ideas and plans.
2. Problem solving: a good group efficiently finds the preferred means and methods. This also involves staffing and role issues.
3. Conflict resolution: conflicting views or interests need to be resolved, for instance in work assignments and preference resolution.
4. Execution (implementing the solution to reach the goal): possibly done in competition or against common knowledge.

Compare the four modes with the action cycle.

The four modes are concurrently active, but focus shifts depending on knowledge level, type of task, group preferences, available technology, and other changes in context. The group may be in different modes in the three different activities mentioned above. It could be problem solving in the production function, and engaged in conflict resolution in group well-being and member support.

The activities above are by no means static. There are continuous processes of coordination and synchronization within the group, and between the group and its social environment. Also, the group evolves over time as members get to know each other and establish routines and norms. To further complicate matters people might belong to many groups. There is no one single way from start to finish.

[Figure: Production, Member support, and Group well-being along a time axis t.]

A social interaction can be characterized in at least two dimensions, space and time, as shown in table V.14.1.
Asynchronous and synchronous interaction addresses whether the interaction is happening in real time, or if it is delayed. Not many people visit the site of an already finished competition where the winners have already drunk the champagne.

 | Same place | Different place
Synchronous (Same time) | Face to face (quite common) | Information about a closing airport gate
Asynchronous (Different time) | Messages on the refrigerator | Electronic mail, book
Table V.14.1 CSCW, spatial and temporal view.

The most difficult cell in the table to support from a technical point of view is the synchronous/different place combination. A network is needed with enough capacity to transport the information, and the delay must be kept within the hundred milliseconds range. This delay includes all processing of data by the computer, which can be substantial, for instance in a videoconference. Given more time, i.e. the asynchronous applications in the table, the demand on the technology is not as severe, but transporting a video mail is still a problem since the amount of data is quite large.

A slight variation of interactions of type different place is indirect interactions, where there is no possibility for the receiver to affect the sender of the message, i.e. there is no return channel. We pick the most popular brand of some product, which is an asynchronous indirect interaction, and if we are in a hurry we try to find a clear path through a crowd, which is an example of synchronous indirect interaction with the crowd.

We get a different perspective on CSCW if we replace place (Same, Different) in table V.14.1 with the objects involved in the collaboration (Artefact, Participant) [RR].

 | Artefact | Participant
Synchronous (Same time) | What is happening to the artefact? | Who are around and what are they doing?
Asynchronous (Different time) | What has happened, or will happen, why, when and how did it happen? | Who has done what and who are going to do what?
Table V.14.2 CSCW, participants and temporal aspects.

Table V.14.2 illustrates some possibilities for a CSCW application. Let us as an illustration briefly discuss notes, i.e. artefacts supporting asynchronous and synchronous interaction. Technology is constantly inventing new ways of sticking information to a place for others to find at some other time. Less sophisticated is spray-painted graffiti, and almost magical are electronic notes, where a physical reference helps the user's browser to find the right web page with local information. The information supplied by the notes could assist in way-finding, or display historical events. It might describe some interesting aspect of the local context, or be left as a message to a specific person passing by, i.e. to fulfil some social function. Passing notes in the classroom is one way to send private messages, and technology could provide students with a wireless electronic equivalent.

[Figure: Potential reference link (bar code).]

A permutation of the previous tables, see table V.14.3, gives yet another perspective, and now we have to consider virtuality. The table helps us not only to take the history of an interaction into account as in table V.14.1, but also to elaborate on the fact that technology overcomes physical distances.

 | Artefact | Participant
Same place | What physical interactions are involved? | Face-to-face collaboration possible without technology.
Different place | Networked artefacts | Technology mandatory for H-H interaction
Table V.14.3 CSCW, participants and spatial aspects.

V.14.2 Effective interaction

Humans are embedded in social information and many decisions are taken using it. Some examples are buying a house in the right neighbourhood, having the right, tight jeans, not choosing a restaurant that is empty, shopping at certain times just to enjoy watching other people, or the opposite, shopping when no one else does.
Compared to this wealth of information available in real life we are socially blind in the digital world [JC]. This section will discuss the degree of goal attainment that can be reached when a group interacts, i.e. the effectiveness of the interaction.

Das Pferd frisst keinen Gurkensalat (The horse eats no cucumber salad) The first telephone message, Germany 1860

A prerequisite for effective interaction is common knowledge, i.e. a shared common ground, and a communication channel with appropriate bandwidth. Table V.14.4 below, adapted from the reference [JC], describes some characteristics of synchronous face-to-face interaction and exemplifies them in different ways. Throughout the examples the importance of a common ground can be seen. It is easier to maintain if the interactors are situated at the same place, i.e. collocated. Collocation means more possible communication channels and that you know, and know about, people around you (this could take years of discoveries). How do you appreciate your co-workers' pheromones over the Internet? What does an e-mail say about the stress level of your fellow worker? Common ground is also enhanced by co-reference and implicit cues. A task that demands a high level of coupling additionally requires short feedback loops, and possibly a large number of complex messages over more than one modality.

Characteristic | Description and implications
Individual control | Each participant can freely choose what to attend to.
Familiar collaborators | Identities and characteristics known for participants, their roles and their relations. Helpful for interpretation of messages and behaviour and for identifying expertise and knowledge.
Rapid feedback | Many communication channels with short communication delays. Quick corrections possible.
Multimodal | Voice, facial expression, gesture, body posture, and more. Enables efficient complex messages, and redundancy for error resilience.
Fine-grained information | Analogue or continuous information flows. Subtle message differences and information modulation possible.
Shared local context | Participants have similar physical environments and experience the same local situations, objects and actions. Allows for easy socializing as well as mutual understanding and learning by copying. It also provides means for a shared history. A key such as the chaotic desktop of a colleague is useful in deciding whom to discuss a problem with. How can similar cues be provided in the virtual world?
Co-reference | Easy joint reference to objects. People and objects have known spatial locations. Gaze and gesture can easily identify objects by pointing.
Implicit cues | A variety of cues as to what is going on (events and effects of events) that are available in the periphery to an individual. Natural operations of human attention, e.g. eavesdropping, provide access to important contextual information including facial expressions and body postures. Provides information to a history.
Table V.14.4 Characteristics of synchronous face-to-face interaction.

Definition: Common ground - an agreed basis, accepted by both or all parties

Groups are inherently complex: a group has a past and a future and variable membership, exists in an environment (communities, organizations, neighbourhoods, kin networks, departments), and has tasks that are related, never repetitive without variation, ad hoc, modulated by time, place and situation, and not (always) rational.

Social information is necessary for social manoeuvring, and the implicit cues support social awareness, see table V.14.4. Social awareness is important since it supports cultural rules: we know that we will be directly accountable for our actions. You can see that someone else is currently working on the same project as you are, and you know that this works both ways. This relationship is not necessarily obvious in the digital world.
Through implicit cues social rules and social control are exerted. Implicit cues also enable humour, discussions, and implicit learning by copying, i.e. as in the Swedish saying, "knowledge is in the walls". In general the downside is that there is a trade-off between visibility and privacy. Another problem is that it takes time to introduce a new member to intricate, established social cues.

Most Chinese are never out of earshot from another Chinese. True?

Physical contact and familiar collaborators mean more and better opportunities for you to extend your social networks. It is much easier to ask someone you know about somebody else. In general knowledge formation is a social phenomenon.

Figure V.14.1 Hole in Space, Galloway, Rabinowitz 1980. Real-time video connection with full size images between New York and Los Angeles.

New technology does not seem to decrease the number of face-to-face meetings. A typical use of a mobile phone is to discuss when and where to meet, and to keep options for meetings open. Neither does the use of new electronic media seem to replace other forms of interaction; it only complements them. Paper is for instance still used (a lot) even if most information is digital.

Reviewability and revisability are characteristics of the digitised message. Writing an e-mail gives you time to formulate, and re-formulate, your anger in a more sophisticated wording instead of immediately meeting person to person to get even. The fact that any message transmitted using technology is potentially stored and later retrieved is of course not only a blessing. Users' concern for loss of control over unique data, and for dissemination of personal confidences, must be considered if technology is to be used in the communication loop.

Trust in technology? What if the videoconference where you participated (and did something you shouldn't have) is on the loose, out there on the Internet [JG]?
New options on how to record and access data make it possible for us to acknowledge knowledge work in new ways by keeping track of who is making major contributions [JC]. At the same time we will be able to see and find those who are doing nothing, or are working with other things, something that affects privacy and will tend to prioritise the kinds of work that are measurable.

Top ten list of the places that people are most likely to gossip… At number 10 – unisex loos; at number 9 – supermarket queues; at number 8 – to their personal trainer; at number 7 – with cab drivers; at number 6 – in crowded bars; at number 5 – at meetings; at number 4 – on mobile phones; at number 3 – friends telling other friends; at number 2 – on train and tube journeys…and at number one – restaurants!
Relations Analyst Stephen Forster

Neither transmitters nor receivers seemed particularly sensitive to the public nature of transmissions, although they did indicate some embarrassment about the fact that speech emanated from unexpected areas of their body, depending on where they had clipped their cellular radio. This sense of body parts such as thighs or hips "talking" did not result in a change in practice, i.e., they continued to use speaker audio.
A. Woodruff on Push-to-talk

Mobile applications add new twists to the discussion since they allow constant availability. Interruptions anytime, anywhere, by anyone will be the result. If you turn off your mobile device you might miss something, or be considered asocial. On the other hand, constant access to others gives a sense of safety and security. This is a nice feature when you travel by car in the deep woods of Sweden with an outside temperature below –30 °C.

Voices are distracting, too personal for communication with unknown.
Dr Michael Heim, Art Center College

V.14.3 Interaction bandwidth

The figure below describes H-H interaction using technology from a social point of view.
It grades the coupling and emotional involvement necessary (and possible) in different interaction forms, i.e. the social presence possible. Text chat is the least engaging form. You can very well participate in three different private chat sessions, all at the same time. This is not easy using face-to-face videoconference.

[Figure: labels on a scale of increasing bandwidth: text chat; animated cartoon figure with synthetic voice; telephone; animated virtual face with true voice; face-to-face video conference.]

Close coupling is easier to maintain in collocated face-to-face collaboration. Network delays do not exist, and there is no need to learn complicated tools. A large vocabulary is available, and with a number of analogue communication channels available it is easy to exactly specify your message (well, at least relatively easy). Individual personal control lets a collaborator focus on the most important message and on the most important channel, e.g. that person over there yawned, did I hear a laugh? Maintaining a close coupling is simplified by a well-established common ground, since formulating messages is easier if you know how the receiver will interpret them.

Close coupling in other words increases efficiency and effectiveness. This is however not always desirable because it also makes participants vulnerable and open to criticism. Furthermore, face-to-face communication clearly favours good-looking individuals. When we restrict the communication channel to text only, everyone who can type has the same possibility, and humour becomes important. In a text-based interaction we give interactors the possibility to hide physical weaknesses, and allow the interpreter to build an illusion of the sender. A reduced channel can even increase the experience of using it!

Visibility and audibility are lost in a two-way text chat. It is impossible to know if someone is paying attention to the chat, and also difficult to force someone else to give a prompt response.
The real-time coupling is lost in an e-mail, where message content and sequences have to be recalled. An answering machine is another example that also accepts asynchronous messages and leaves them hanging in cyberspace waiting to be heard. It does maintain audibility though, and you can easily hear if someone you know well is disturbed or angry. The human voice is a sensitive instrument that displays emotions, whether the speaker wants it or not. On the other hand, a text chat makes it possible to hide emotions, but also makes it difficult to show them.

Figure V.14.2 Interaction bandwidth for different applications.

People who have established a lot of common ground can communicate well even over impoverished media…
G. Olson & J. Olson

Feedback, immediacy of response, multiple communication channels, multiple participants, symbol variety, rehearsability, reprocessability,
Features of the optimal medium

Video is a very powerful medium, perhaps too powerful
Wendy Mackay

There are many other interactions possible on the scale. You can enter messages using voice that can be presented as text, enhanced by emotion amplifiers. You can enter text that is presented as voice, or you can distort voice output in some convenient way. A video might be manipulated such that the individuals, or what they do, cannot be clearly identified, but the video would still give a sense of presence. This could provide a feeling of presence akin to the one you feel from a group of people passing, chatting and laughing, outside your closed door [JC].

Computer games such as Counter-Strike use voice for communication among players. This means that thousands of families will be listened to in real time, all of the time. Is this good or bad for society as a whole? Maybe this is nothing more than the ordinary phone? When the phone was introduced one objection was "Will anyone be able to call me up for a shilling?"
How could an application that facilitates communication harm teamwork? Are there new sources of friction among participants when using computer-based cooperation?

V.14.4 Social quality of service

As we build tools for H-H interaction we should also evaluate the social quality of service of the tool. How does a particular tool present social services? How are the users affected, and what means does a user have to adapt? Some of the questions related to social services are:

• Who is allowed to join?
• Who has joined and who has joined and left?
• Who is allowed to do what, with what, and together with whom?
• Who is doing what, and has done what, at which activity level?
• Who is allowed to follow the work as it progresses and to what degree?
• Who is following and has followed the work?
• Who is allowed to see the results?
• Who is viewing and has viewed the results?

If we build systems where answers to such questions can be found and presented in useful ways, we have a chance to enhance social life, at the risk of being controlled and restricted. The figure below shows a part of a web page where the number of people currently visiting the page is shown as gauges. A fuller gauge indicates more visitors, and by using a logarithmic scale the gauge can be made sensitive to few visitors and still be able to scale to a lot of visitors [DC].

Much focus is on loss of privacy using new technology, but what can be gained? What social quality of service is possible with new advanced technology? Discuss this for an elderly user, with relatives living in another part of the country.

Figure V.14.3 Visitor activity shown by gauges.

Buddysync is another example where a designer has identified the needs of the users in a cultural context. It is a mobile communication tool for kids that presents communication mechanisms adapted to different social environments [HS]. Close friends are always connected and their real-time status is displayed.
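As a rough illustration of the logarithmic visitor gauge in figure V.14.3, the sketch below maps a visitor count to a fill level. It is not the implementation from [DC]; the function names, the use of log(1 + n), and the assumed full-scale value of 1000 visitors are all our own assumptions.

```python
import math


def gauge_fill(visitors: int, max_visitors: int = 1000) -> float:
    """Map a visitor count to a gauge fill level between 0.0 and 1.0.

    log(1 + n) is used so that an empty page gives 0.0 and the first few
    visitors move the gauge much more than later ones do.
    max_visitors is an assumed count at which the gauge reads full.
    """
    if visitors <= 0:
        return 0.0
    fill = math.log1p(visitors) / math.log1p(max_visitors)
    return min(fill, 1.0)  # counts above max_visitors just show a full gauge


def render_gauge(visitors: int, width: int = 20) -> str:
    """Draw the gauge as a line of text, e.g. for a quick check in a terminal."""
    filled = round(gauge_fill(visitors) * width)
    return "[" + "#" * filled + "." * (width - filled) + "]"


if __name__ == "__main__":
    for count in (0, 1, 10, 100, 1000):
        print(f"{count:5d} {render_gauge(count)}")
```

With these assumptions a single visitor already fills about a tenth of the gauge, where a linear scale with the same full-scale value would show one thousandth.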
Larger communities of friends are accessible via another, less intimate interface, and voice is used to communicate with parents. The last functionality was probably added to motivate parents to pay for the device.

In fact, the more we try to get a system to act on our behalf, especially in relation to other people, the more we have to watch every move it makes.
Victoria Bellotti

If we could automatically detect social context, even more intriguing applications would be possible. But, as noted many times before, this is a most difficult task. Finding out types of relationships and subject matter, and detecting subtle mood changes that affect social relationships, is a real challenge.

As the social aspects of life drip into digital life we will also see more of the less flattering aspects of human social life: suspicion of strangers, protectionism, and jealousy, just to name a few. These behaviours will necessitate explicit mechanisms that enforce identity and accountability for actions. The system needs to detect arrival, presence, departure, activity and identity [VB]. All of them good old human capabilities.

What do you do if you stumble over a personal secret on the Internet?

Figure V.14.4 Friendship choices among fourth graders (adapted from Moreno, J. L. (1934). Who Shall Survive? Washington, DC: Nervous and Mental Disease Publishing Company).

Videophones, or stationary desktop video at workstations, raise ethical issues as well as practical ones. We want to know who is watching us, and why. Normally there should be an indication that someone is remotely using a video camera, and this will be yet another distraction, adding to the distraction from the sound of arriving e-mails plopping into the mailbox. You should at least have the option to turn the video off, and the question is whether you will ever turn it on again.
The following four issues, at least, have to be considered with respect to privacy [WM]:

• Control: Users want to control who can see or hear them at any time.
• Knowledge: Users want to know when somebody is in fact seeing or hearing them. They also need feedback on what is seen. If a recording is made and reused in another context the user should know about it.
• Intention: Users want to know the intention of the connection, i.e. whether the video is stored, or otherwise processed, and for what reasons.
• Intrusion: Users want to avoid connections that disturb their work.

Warning! You are currently being watched!

A picture of you, naked, gets distributed over the Internet. What can you do to remove it?

On the other hand, we somehow manage to take the subway and go to the restaurant without too many problems with privacy issues.

Whether you choose an audio-visual recording of your activities, audio only, or none at all depends on where you are (bathroom, library), what you are doing (giving a lecture, singing in the bathroom), the relationship to the person who will have access to the recording (everyone, your wife), what you are wearing (a smart suit, robe), if you have already recorded a similar activity, if you are hungry, you want to share your message with someone you care for, … How can an application possibly infer all of these factors?
Victoria Bellotti

V.14.5 The social-technical gap

It is not possible to fully represent, and manage, the same amount of social information in a CSCW system as we do when socialising, without effort, every moment, every day. We will illustrate this by following the discussion on privacy preferences and P3P in [JC]. The idea with the Internet P3P protocol (Platform for Privacy Preferences) is to create a privacy standard for the web, allowing the information owner to have detailed control over access. For P3P we are faced with a difficult user interface problem.
There are several millions of users, and if we assume a fine information granularity, almost any user provides thousands, probably millions, of data items. How can we set the access rights for each item and keep them updated to changed conditions? We need to group data and users to keep the complexity down, but as we do this we lose control over details and will have to introduce numerous exceptions. Also, the systems that we currently use are discrete and precise, which means that we will not even have the possibility to postpone the decision and stay ambiguous, which would be a typical human social solution to a similar problem. We are faced with the social-technical gap [JC].

Please call, but don't expect me to answer.

Figure V.14.5 "Lady Bountiful", popularity map where a large circle indicates high popularity (adapted from Lundberg, G. A., and Steele, M. (1938)).

The problem is, it seems, impossible for a computerized system to solve even if it is assisted by a human, but it is easily solved many times a day, by any human, in a social context. The gap is a fundamental problem originating from the complexity of life itself, as indicated in figure V.14.6 below.

Figure V.14.6 Complexity means flexibility and loss of control. [Figure labels: human society, human brain, neural networks, fuzzy logic and logic, placed between the poles flexibility and control.]

One partial solution is to build really flexible systems that learn from, and co-develop with, humans. The next generation, at least in Sweden, will have spent more time with the computer at the age of 15 than previous generations will over their whole lifetime. The extent to which ICQ and interactive networked games are used says something about the importance of the next generation of computer support. Maybe we should aim for the S in CSCW and try to find ways in which the computer could augment rather than replace our social abilities.
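To get a feeling for the size of the access-rights problem, the sketch below first counts the yes-or-no decisions needed without grouping, and then shows grouping with exceptions. It is not P3P syntax; the class, the field names and the example users are hypothetical and only illustrate the trade-off.

```python
from dataclasses import dataclass, field


def decisions_without_grouping(n_readers: int, n_items: int) -> int:
    """One yes/no decision per (reader, data item) pair."""
    return n_readers * n_items


@dataclass
class GroupedPolicy:
    """Access rights per (reader group, data category), plus exceptions.

    All names are hypothetical; this is not the P3P vocabulary.
    """
    group_of: dict[str, str]       # reader -> reader group
    category_of: dict[str, str]    # data item -> data category
    rules: dict[tuple[str, str], bool] = field(default_factory=dict)
    exceptions: dict[tuple[str, str], bool] = field(default_factory=dict)

    def allowed(self, reader: str, item: str) -> bool:
        # An exception for this exact (reader, item) pair overrides the group rule.
        if (reader, item) in self.exceptions:
            return self.exceptions[(reader, item)]
        key = (self.group_of[reader], self.category_of[item])
        # The system is discrete: with no rule it still has to answer yes or no.
        return self.rules.get(key, False)


policy = GroupedPolicy(
    group_of={"anna": "family", "bo": "colleague"},
    category_of={"holiday_photo": "private", "cv": "work"},
    rules={("family", "private"): True, ("colleague", "work"): True},
    exceptions={("bo", "holiday_photo"): True},  # one detail the groups cannot express
)
```

One owner with a thousand data items and a million possible readers would face a billion individual decisions. Grouping cuts this down to a handful of rules, but every deviation needs its own exception, and the lookup always answers yes or no; it can never stay ambiguous.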
Technology could provide strange augmented reality support such as seeing through walls, or maybe technology for automatic blood pressure analysis of people walking by.

V.15 Command based interaction, someone in control

In many interactions one of the interactors is in control, and the other interactors are controlled. The interaction is in other words asymmetric, or command based, and control is centralised. At the current state of technology this type of interaction is the rule rather than the exception whenever technology and a human user are involved. Human-computer interaction in Windows® is a perfect example.

The good thing with this asymmetry, seen from the interaction designer's point of view, is that centralised control simplifies interaction. We only allow objectives from one participant to influence the interaction, and usability goals are easy to formulate, such as that interaction should be efficient for the controlling user.

Centralised control does not necessarily mean that the same interactor is in control all of the time. There are techniques for rule-based distribution of the control among participants over time, such that each participant gets its control slot. One way to do this is to pass a token around, and whoever has the token is in command. Another way is to assign predefined cyclic time slots. A board of directors is assigned a time slot of a year.

V.15.1 Mechanisms

One possible choice of taxonomy for commands is identification, navigation, choice, manipulation, read, write, and system control. We start with identification, since it is a prerequisite for any command and is used to identify representations of objects, behaviour, relationships, states, transformations, familiar spatial or temporal settings, faces, sounds, smells, or anything else that describes participants, or the current circumstances. Matching representations through recognition and identification is a cornerstone for generating knowledge.
We will use identification in a broad sense here, including discrimination, segmentation, and classification. Properly speaking, identification, i.e. "I can see that it is you", is not the same as discrimination, i.e. "I can see it's not you", segmentation, i.e. "I separate hair from face", or classification, i.e. "That is a typical nose".

I will set sail chose my new course restart with ctrl /HG

Navigation allows the interactor to explore a local environment and to follow a path through system representations. By exploration the user gets to know the environment, and how things are related. She then positions herself in the proper context, e.g. moves to an advantageous viewpoint or position. In a CAD program for architects, navigation involves selecting the blueprint of the right floor. Navigation also implies an intention and a target. The target can be indicated by pointing devices, voice detection, gestures, head tracking, or gaze tracking techniques. Furthermore, navigation needs some means to modify the current position and speed, by specifying direction, speed, or acceleration, and, not to forget, some way to indicate stopping.

Choice is a selection among alternatives using sound, gesture, button, menu, dialog box, or any other effective method at hand.

Manipulation modifies the system. It involves modification of object representations, changing system state, or self. One way to accomplish this is through direct manipulation.

Read extracts system state representations, and write is a command that changes system representations. Write provides system input that will change the system state. Sequences of read and write correspond to manipulation.

The last interaction type is system control, manipulating the prerequisites for system behaviour, e.g. change environment from Earth to Mars, or load a new operating system.
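The claim that sequences of read and write correspond to manipulation can be made concrete with a toy example. The class System, the function manipulate and the lamp are invented for illustration and do not come from any particular toolkit.

```python
class System:
    """A toy controlled interactor whose state is a dictionary of named values."""

    def __init__(self, state):
        self._state = dict(state)

    def read(self, name):
        # Read extracts a representation of the system state without changing it.
        return self._state[name]

    def write(self, name, value):
        # Write provides input that changes the system state.
        self._state[name] = value


def manipulate(system, name, transform):
    """Manipulation expressed as a read followed by a write."""
    old_value = system.read(name)
    system.write(name, transform(old_value))
    return system.read(name)


# Hypothetical usage: turn a lamp up by 25 steps, but never above 100.
lamp = System({"brightness": 40})
new_level = manipulate(lamp, "brightness", lambda level: min(100, level + 25))
```

Seen this way, manipulation adds nothing new at the lowest level, but it is closer to what the user intends than the individual read and write commands are.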
Starting and stopping the system, loading new modules, and saving system states are some other examples of system control tasks, which are in turn implemented by other interactions, such as choice or manipulation. System control is not available for everyone in all situations.

Remember that terribly embarrassing situation? Was undo an option? Reload from a previously saved state?

As you can see from the above, the interaction types are partly overlapping; identification for instance implies a read operation. They are also somewhat arbitrarily chosen. Navigation could be replaced by search (combined with identification and manipulation), and nothing is wrong with a command such as "track". There are many other taxonomies possible for interactions, at different abstraction levels, and for different purposes. One of them, more related to information use, is: creation, gathering, processing, retrieval and communication. These commands could all be expressed by different combinations of read, write, navigation and choice, but they are closer to the user's intent and internal processing. Create can for instance be expressed as a write, but the concepts of create and write are certainly not equivalent. There are in fact an unlimited number of taxonomies possible; every verb in Webster's dictionary could under the right circumstances be used as a command.

[Figure labels: Effectuate; Select direction; Evaluate new position; Feedback; Controlled interactor; Read(); Display representation; STOP; I; T.]

Selection – choose object from alternatives
Position – Specifying a position within a range, e.g. pick a screen coordinate.
Orient – Angle or 3D orientation
Path – Series of positions or orientations
Quantify – Specify numeric value
Text – Entry of symbolic data
Foley, Wallace and Chan, "The Human Factors of Computer Graphics Interaction Techniques", 1984

Hello, and thanks for calling. Your call is very important to us and, we're sure, to all of humankind.
If you would like to challenge my sincerity, press 1. To report a discrepancy between the way you planned your life and the way it's turning out, press 86. If you wish to end this call or return to the main menu, do not press your luck. You are not going back to any main menu, my friend. You have come too far. There is no turning back. You can only press on.
Internet

V.15.2 Intelligent support

The question of whether the user interface should have a life of its own is raging. The main argument against is that users want to stay in control, and that intelligence at the other end distracts from the task at hand. The proponents argue that as the amount of information grows humans need support to cope. Both sides are of course right (as usual). The trick is to figure out under what circumstances support is needed, and when it will not degrade usability too much. Forcing the driver to discuss the next turn with the car, or even to beg it to turn left, is maybe not a good design alternative.

Historically the ideal has been transparency. You, the user, should spend your time on the task, not on administrating the tool. At the extreme we will not attend to the user interface at all. A professional Formula 1 driver does not have the time to worry about finding the brake. The driver's hands should never leave the steering wheel.

This transparency is challenged by context dependency. A common example is a wireless network that at one time gives very good performance, but a moment later completely fails to deliver enough capacity. If information about the network capacity is hidden, i.e. full transparency, the user will only see the resulting strange application behaviour and will soon shut the application down. At the other extreme we can give the user all information and tools to adjust the application to the current network capacity.
Once again we risk losing the user; this time the user closes the application out of sheer exhaustion after having tried to keep up with the changes in the network. The solution seems to be to give the system enough self and context awareness to manage minor network fluctuations.

Mixed-initiative systems are compromises that interweave direct control and automation. The dialogue in such a system behaves similarly to a human conversation. By turn-taking the interactor with the most urgent task takes control, but this is more complicated than it sounds. In an H-T setting the Thing has to evaluate the relative importance of its own task against the task it guesses that the human is solving. It also has to make social considerations. How many times has the human been disturbed? Does he seem annoyed? Is he interested in the result of the task that the Thing is about to perform? Does this particular human understand the result?

One important aspect of intelligent tools is that we will be reluctant to use them if we do not trust them. Will you trust a web agent with your secrets if you do not know how it works? Maybe it packetises your secrets and sends them to Microsoft?

Intelligent support on the other hand can be used to enhance experiences and surprise you. You come home Friday evening and your house announces an Italian weekend. The rest of the weekend the intelligent thermostat matches the temperature in the house to the temperature sensed at a specific piazza in Naples.

Adaptability was listed as one of the characteristics of intelligence in Chapter III.1. It can be provided for at many levels. At a physical level this could mean adjusting lighting for the task at hand, or automatically tilting the driver seat of a car. At a conceptual level adaptation could involve changing the user interface to match age or expertise. Adapting at an intentional level was discussed above, and is even more complicated.
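A Thing that takes part in turn-taking needs some rule for when to speak up. The sketch below is one minimal way to weigh the questions asked above about the human in a mixed-initiative system. The fields, the weights 0.1 and 0.3 and the simple comparison are arbitrary assumptions, not a validated model.

```python
from dataclasses import dataclass


@dataclass
class HumanState:
    """What the Thing guesses about the human; every field is an estimate."""
    task_urgency: float         # 0.0 (idle) to 1.0 (critical), guessed by the Thing
    interruptions_so_far: int   # how many times the human has been disturbed
    seems_annoyed: bool
    interest_in_result: float   # 0.0 (does not care) to 1.0 (very interested)


def should_take_turn(own_urgency: float, human: HumanState,
                     cost_per_interruption: float = 0.1,
                     annoyance_cost: float = 0.3) -> bool:
    """Return True if the Thing should take the initiative now."""
    # Social considerations raise the bar for interrupting.
    social_cost = cost_per_interruption * human.interruptions_so_far
    if human.seems_annoyed:
        social_cost += annoyance_cost
    # The Thing's case is stronger when the human cares about the result.
    benefit = own_urgency * human.interest_in_result
    return benefit > human.task_urgency + social_cost


busy = HumanState(task_urgency=0.8, interruptions_so_far=3,
                  seems_annoyed=True, interest_in_result=0.9)
idle = HumanState(task_urgency=0.1, interruptions_so_far=0,
                  seems_annoyed=False, interest_in_result=1.0)
```

With these numbers the Thing stays quiet for the busy and already annoyed human even when its own task is urgent, and speaks up for the idle one. The hard part, as noted above, is not the comparison but guessing the values that go into it.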
Most adaptations could benefit from learning about the user and the environment without explicitly requesting information, and the more the system learns by itself the more magical it will seem. Another criterion for intelligence was the ability to solve problems, especially problems in the real world. To achieve this we have to tackle the issues discussed in Part IV, i.e. reasoning, planning, and decision making, and also the even more fundamental problem of how to select and represent enough of the relevant aspects of our environment.

Systems that assume that the user has infinite attention, complete knowledge of the system, and infinite patience quickly become annoying.
E. Horowitz

"Anything that happens, happens" "Anything, that in happening, causes something else to happen, causes something else to happen." "It doesn't necessarily do it in chronological order, though."
From a story by Douglas Adams

Had to call the SmartHouse people yesterday about bandwidth problems. The TV drops to about 2 frames/second when I'm talking on the phone. Today, the kitchen CRASHED. Freak event. As I opened the refrigerator door, the light bulb blew. Immediately, everything else electrical shut down -- lights, microwave, coffee maker -- everything. Carefully unplugged and replugged all the appliances. Nothing. The police are not happy. Our house keeps calling them for help.
From the NET

There are many interactions where we really need some limited level of intelligence in all of the interactors. Imagine that you want to move a heavy patient from one bed to another in a hospital ward, and you are assigned two robots to assist you. For you to manoeuvre every movement of the robots would be very complicated. An alternative is to build limited intelligence into the robots such that they understand what they are supposed to do, and how they should co-operate to do it under your command.
The problem here comes with an extra twist because we do not want the patient to get hurt (but without support you are the one who gets hurt).

If we give up the goal of transparency we can instead view the application as a collaborator. We give this collaborator advice on how to behave, and it learns both from the advice we give, and from how we give it. We will sometimes have to show it how to do things, and correct it when it is doing too much or has done the wrong thing. If this is too annoying the system will not be used.

Stupid: Slow to learn or understand; obtuse. Tending to make poor decisions or careless mistakes.

Pssst.. He's awake 06.58

V.15.2.1 Social support

Should you present a computer as a person? In reference [BS1] the author states that you should not give the computer human characteristics. Rather than the computer saying "I am waiting for your input", the phrasing should be "Waiting for input". Other studies show that as human beings we tend to treat anything that we communicate with as a human being. "Stupid machine" refers to the interface that we do not like, rather than to the designer of the interface. This behaviour is an example of how humans unconsciously map patterns from their own everyday experience onto the environment. One reason for this is that it lowers complexity since patterns can be reused. When it comes to computer applications this behaviour could be exploited to build more efficient services, or it could be ignored with costly consequences.

At a perceptual level we easily recognise, or think that we recognise, a human face, such as the man in the moon. Our attention is triggered by things moving as autonomous entities, such as autumn leaves whirling in the wind, behaving as if they were alive. At a cognitive, social level, we reason about the mental state of an entity, its drives, emotions, and also attribute pain to it. We even assign personality clusters to a system, i.e.
groups of traits (lazy, optimistic, perfectionist), or social roles (father, teacher, lover) [PP].

Happiness is not to be rich (even if it helps), but having wife or husband, friends, and challenging tasks to perform. How can we use this for more intelligent user interfaces? Can we make the interface a friend of the user? Do we then risk being exposed as cheaters, and that the user will end up hating the interface instead? We are now getting more and more used to apparently intelligent, humanlike interactors and interactions; will we accept less than humanlike behaviour from the next generation of tools?

A good strategy to use when designing someone's friend is to make the interface you have designed (and yourself) irreplaceable. This is possible if you have expertise, or abilities, that the user is in desperate need of. A virus scanner should provide a great opportunity for this type of friendship. Another strategy would be to associate the user interface with something, or someone, that has the same objectives as the user. A third strategy is to include functionality or characteristics matching the user's abilities or preferences. A user with a preference for symbolic manipulation can be given a symbolic, text-based interface, whereas a more spatially oriented user should rather be presented with a graphical user interface. Cultural differences can be used in a similar way. A yellow and blue computer for a Swedish user? One last variation of how to acquire friendship is a symbiosis where both the user and the agent benefit from the friendship. Napster was one example: users added music, Napster prospered, and new users registered because of the accumulated amount of music.

From H-H interaction we can list some criteria that indicate politeness in an interface from a user's point of view [ACR]. The interface should be:

• Possible to identify as an individual.
• Interested in you.
• Respectful to you, e.g.
moderates its pace to yours.
• Responsive to you.
• Anticipate your needs.
• Taciturn about its personal problems!
• Well informed and perceptive.
• Self-confident.
• Stay focused.
• Give instant gratification.
• Trustworthy, e.g. not share confidences without your permission.
• Have common sense.

A car that talks but does not listen?

A computer that plays a fanfare every time you re-boot it?

"The enemy of my enemy is my friend"
Mao Tse-tung

Let us say that you are the owner of an enormous record company. Your objective is to make sure that no one can live without music. MP3 and the Internet are the perfect guarantee for this to happen. The challenge is to avoid losing too much money until the new behavioural patterns have evolved.

You've got a friend in me. You've got a friend in me. When the road looks rough ahead, And you're miles and miles from your nice warm bed. You just remember what your old pal said. Boy, you've got a friend in me. Yeah, you've got a friend in me.
Randy Newman

To have an interface that checks all of the items in the list is impossible (what a friend!). Some of the criteria are not difficult for a computer to achieve, such as keeping focused, but many others are currently out of scope, such as having common sense. The list is good to keep in mind when discussing the social bandwidth possible for computerised tools within the next 10 years. Implementing items in the list also depends on very important lower level characteristics such as timing.

A basic drive for what we do comes from our emotions. They are, according to some researchers, in turn the means for our genes to make sure that we enjoy life, and reproduce. We want to feel at ease, content, pleased with ourselves, and happy. Achieving this involves pursuing subgoals, problem solving, and planning. All actions are selected and executed under emotional control, even if we ourselves claim only logical reasoning and rational behaviours.
From this it seems that a user interface ultimately should take emotions into account.

Signs of emotions: facial expression, intonation, gestures, gait, posture, pupillary dilation, respiration, heart rate, body temperature, electrodermal response, perspiration, muscle action potential, blood pressure. Picard

Some of the feelings to consider in the user interface are anger, gratitude, sympathy, liking, shame, and guilt. They can all be discussed in a framework of favours. If you grant someone a favour then you are more likely to be liked. If you on the other hand exploit favours given, but never even intend to return them, the person giving the favours will be angry with you if the swindle is discovered. Guilt is what you feel if you accept favours, never intend to return them, and suspect that you will be exposed. It is not hard to imagine how the user interface can be used to exchange favours, and how the resulting feelings can be exploited by the user interface with the intention of making the user happy (and the developer rich). But can we make the user trust the interface?

We can take the discussion one step further; imagine falling in love with a computer! If we accept an evolutionary view of human social behaviour, an average woman will look for stability, sincerity, wealth, status, or ambition that in turn will result in wealth. These are qualities that for ages have kept the family protected and well fed. The man on the other hand will look for faithfulness, to ensure paternity, and signs of fertility, indicated by age and beauty. Never trust a pretty interface.

75% of users swear at their computers. Picard

V.15.2.2 Persuasive support

To start with, we evolved out of the African savannah with semi-open views (seeing without being seen), green surroundings, flowers, a visible horizon, landmarks such as big stones, trees for frame of reference, and multiple escape paths.
Moods depend on the surroundings, and a familiar, pleasant environment will consequently help the user to relax and to do better work. Things or applications that take the user's attitude into account and attempt to change the behaviour, the worldview, or the attitudes of the user are called persuasive, or seductive [BF]. These features will become ever more important in an aggressive market where individualised services are increasingly important. Persuasion implies intent from the persuader or the seducer and, if the persuader is a thing, a designer must implant this behaviour.

Computerised technology can function either as a tool, a medium, or a social actor. As a tool the thing, or information, could persuade by providing the user with new capabilities that could enhance self-confidence, or change behaviour. One example given in [BF] is a device that gives information about the heart rate when exercising. This information can be used to set the right training pace. The same reference also suggests the following taxonomy for persuasive tools:

- Reduction, persuasion by simplification. "Buy by simply pressing this button once."
- Tunnelling, give the user a clear, easy to use path. "To continue, press that button."
- Tailoring, adapt to the current user. "Hey good looking, press this button, especially made for you."
- Suggestion, provide information and request information at the right time and place. "You will win if you press that button within one minute."
- Self-monitoring, feedback persuades to adapt behaviour. "My wearable tells me that my heart rate increases when I reach for the button."
- Surveillance, change behaviour to match perceived state. "Ha, he is pushing the button. Run!"
- Conditioning, positive reinforcement. "You won, and will continue to win if you keep pressing that button."

"Orthodontists have found that a good-looking face has teeth and jaws in the optimal placement for chewing" [SP]

Applications using the thing or information as a medium for persuasion are easy to find. By providing experiences, any commercial on a television show tries to change our behaviour. Persuasive computerised social actors that create relationships are not yet commonplace, but toy pets are getting more and more intelligent. Some of them even expect the child to take care of them.

A seductive experience starts by attracting the attention of the user. To hold the attention the experience next makes a promise. This promise is what keeps the interaction alive, which means that it has to be matched against user aspirations, emotional or other. The experience ends by fulfilling the promise, but could be kept alive for a long time by partially fulfilling promises. A flirt between a boy and a girl usually involves such partial fulfilments, and a soap opera on television uses it to perfection. Some clues to a seductive experience are [JK]:

- It diverts your attention.
- It surprises you.
- It creates an instinctive emotional response.
- It gives promises that matter.
- It fulfils some of these promises.
- It unexpectedly gives deeper understanding.
- It unexpectedly provides more than expected, i.e. it goes beyond expectations, indirectly exposing a devoted designer.

Adding social elements to your design carries a considerable risk. If this is done the wrong way, or with bad timing, the interface will be perceived as annoying and irritating. We are very experienced social beings, and one example is that we do not like to chat when we need to be efficient, i.e. getting the job done is not a social event [BF].

V.15.2.3 Narrative support

As humans we expect our fellow interactors to have a narrative intelligence. This means that they should remember the interaction, and have a local model of what has happened in terms of human interaction.
If computers could tell stories, have a personal history, and recognize the narrative structure of other interactors, we would be much more at ease using them (Chrystopher Nehaniv as cited in [CH]). A computer is designed and programmed to change behaviour under pre-defined circumstances. This means that it, from the point of view of a human user, might suddenly change personality, which is quite annoying.

Narration is a good example of interaction because the narrator is trying to create a state of mind in the listener. Originally, storytelling was H-H. It was extended to H-T-I-T-H when printing was invented, and now we enter the time where I-T-H is possible, i.e. the story can be told (and generated) by a program. We do not yet know enough about storytelling, and do not have the right tools, but there is nothing in principle to keep us, or rather information, from doing it. Do you agree? Further down the road are listeners from the species of thing.

"In thunder, lightning or rain." From the opening scene of Macbeth by Shakespeare (my emphasis).

If feedback is available, the response from the listener can be used to change the course of events in the story, or the way the story is presented. This will make it possible to create totally new types of experiences. Any object in a story could be activated and reveal new trajectories in the story space [DABB].

V.15.2.4 Calm technology

Always having to think about how to use the computer, or any other service, is tiresome; recall driving a car for the first time (or reading your first page of text). Every interaction initially has to be consciously thought through and learned. We will reduce this cognitive workload if the car finds out, and performs, as many actions as it can by itself. If the road is blocked, the car should stop.
Calm technology tries to accomplish this by using ubiquitous computing, where computation most of the time is performed in the periphery of the user's attention and only sometimes directly attended to by the user [MW2]. This approach is also called foreground/background computing, and the basic idea is that technology should inform the user without demanding full attention all of the time. When the user decides to pay attention, maybe triggered by some unexpected event, supportive technology is brought into focus. An illustrative example where we still have not found a calm solution is e-mail. Each arriving e-mail interrupts, but is this the right way to do e-mail?

"It seems contradictory to say, in the face of frequent complaints about information overload, that more information could be encalming. It seems almost nonsensical to say that the way to become attuned to more information is to attend to it less." [MW]

The following table, adapted from [WB], divides the possible services into foreground and background applications. A foreground application is one where your attention is directed toward the application. Currently most applications are of this type, but as improved infrastructures gradually emerge the number of background applications will increase.

                      | Human-Human                                                            | Human-Thing                                    | Human-Information
Foreground/Focused    | Telephone call.                                                        | The timer on your oven.                        | Selecting a link on a web page.
Background/Peripheral | Context enabled mobile phone, e.g. one which could say "User is asleep" | Smart house that turns down the heat at night. | Advertisements on a web page.

Table V.15.1 Foreground or background applications.

Applications will not restrict themselves to one cell only in the table above, and they will adapt both to the situation and to the user [WB]. An advertisement that is selected is suddenly in the foreground.
Your mobile phone displays the timer of your oven set by your wife, and you reset the timer via a web page after discussing the dinner menu with your wife over the phone.

With increased information density and demands on productivity it is natural for a user to have access to multiple sources of information, and to perform parallel tasks. This raises the question of how to direct or schedule attention. News is broadcast at the same time every night, and screen real estate is allocated to a ticker of news updates on the CNN news channel. The problem is to find a balance between distracting the user and providing the wanted service. This is a design challenge, and the (useful) clock on your computer screen shows that solutions can be found.

Hurry up dad, we seem to be late!

Another indirect, peripheral, background application was explored in the following experiment [JM1]. A large screen was placed in a workplace hallway. The hallway sensed the identity of people passing by and adjusted the content shown to them on the display. This gave interesting emergent phenomena, as people knew that they affected the display. One of the problems for the researchers was how to collect information about people without raising security and integrity issues.

V.15.2.5 Slow technology

Slow technology [LH] is another way to use technology in context. Here the focus is on how to provide technology also for reflection and mental rest. One example is an electronic doorbell, which not only signals that someone is at the door, but at the same time sends other messages. Each signal from the doorbell could give an additional clue to a full, secret, doorbell message. This is slow technology that makes us stop and reflect: technology that deliberately consumes time rather than saving it. So, instead of placing the activity at the periphery of the user's perception as in calm technology, slow technology steals, and highlights, a moment or two.
This can be done in many ways [LH]:

- It takes time to identify what is happening.
- It takes time to learn and get accustomed to it.
- It takes time to understand why it works the way it works.
- It takes time to apply it.
- It takes time to find out the consequences of using it.

Usually we aim for fast technology, meant to increase productivity, inverting all the statements above, but in slow technology we try to make good out of bad by using the extra time spent by the application for reflection, creating new thoughts in the moments gained. The time spent in interactions could be days, weeks, or even years. Another example, also from [LH], is soniture. This is furniture and physical environments designed with add-on sounds, creating an additional environmental dimension, e.g. a floor with its own audible interpretation of the steps it feels.

[Figure labels: just a button; golden button.]

"Time perspective changes from just encompassing the moment of explicit use to the longer periods of time associated with dwelling." [LH]

[Figure: alarm clock with an acoustic E-tone chime that sounds only once.]

V.15.3 Identification

How do we recognize a fellow man? You, or a thing, have numerous ways to do it: voice, of course, face and ear features, thermograms that show body heat radiation, visual texture of the iris, gait, keystroke dynamics, DNA, body odour, signature, retinal vasculature, fingerprints, and hand geometry are perhaps the first few examples that come to mind. Things also have other options. They can sense physical properties of a thing (weight, shape, colour, size), or affix a property to a thing, i.e. a tag (bar code, radio frequency identifier tag, doctor's coat, registration plate for a car). A car equipped with a tag could serve as a link to the driver's web page.

Detecting presence is easier than identifying because of the lower level of resolution needed. We can use many of the techniques discussed above, but also temperature, and shadows.
A burglar alarm is the typical example. Technology for identification and detection of presence should fulfil some constraints, such as robustness against noise and power failure. Also, the solution should not disturb the wearer. The technology discussed above is already in use. Car keys are for instance used to find a car in a big parking lot. Researchers are investigating smart floors. How do you think they will manage to separate two individuals walking over a floor? What characteristics will be used?

Every person has a unique tongue print.

Figure V.15.4 Human input and output devices.

Is it a bird, a plane? NO, it is Superman!

Some of a human being's input channels also work as output channels. Eyes tell stories, and hands that are used for gestures are also the tools for sensing reality. The mouth is mostly output, but in intimate situations, and for survival, it also works as input.

Successful interaction demands identification not only of humans, but also of the messages sent by humans. One type of message is the sentence, which is composed of words where each word is itself a sequence of phonemes, most of which have to be recognized to identify the message.

Personality can be classified on five different scales (the big five): Openness (to the unfamiliar), Conscientiousness (goal directed), Extraversion (capacity and need for stimulation), Agreeableness (ranging from compassion to antagonism), and Neuroticism (mental stability). McCrae and Costa

We can put ourselves in another person's place. This is quite a feat when you think about it, and very useful when we co-operate. In our minds we build a model of her and execute this model to find out what the other person is thinking, feeling, focusing on, or believing. Using this model we can try to figure out what she is up to. As we analyse our fellow humans we find ourselves in a nice recursive information web, or even in a mess of deception.
People know that we scan for information about them, information that they might not want to give away, or would like to control. A poker game is an extreme example. If you look worried your opponent will use it. You can use this fact by simulating worries, but your opponent might guess that you are simulating, and if you know that he knows that you are simulating ...

Not only do we study other participants. We also study ourselves, and our own behaviour. As we do this we sometimes even manage to fool ourselves: consciously, as when you convince yourself in the morning, when the alarm clock rings, that you are not tired at all, or subconsciously, when a happy tune on the clock radio starts you whistling, which cheers you up.

For many interactions, pinning down the physical position of an interactor is a problem. This is not so for direct interaction where two people are at the same place, face-to-face. But when technology steps in, the physical position is not as easily shared. To socially position someone is more difficult, regardless of whether the interactors are at the same location. By social positioning we mean establishing all that characterises a human being in a social context. Is he, or she, happy, sleepy, at home, at ease, interested? One important social process is that we place ourselves into social hierarchies that affect our behaviour. Most of us do not yell at our boss, but sometimes at a child that does not want to do the dishes.

Almost as important as establishing identity or social position, of yourself or someone else, is to identify relationships between people. Relationships can change over time, and sometimes quite fast, for instance when you tell your wife that you forgot to pick up your daughter after school. For efficient interaction we also have to identify social conventions. One example is that in some countries a man should not address a woman in public.
The identifications discussed above are, like almost any other human interaction described, built on pattern matching. In general, to identify something a characterization is needed, which is not always easy to specify. The figure below shows three trajectories indicating intentionality. Can you identify which of them best illustrates fighting? Playing? Courting?

Figure V.15.5 Three trajectories showing movements of two participants in an interaction [PT].

Patterns are found, designed, used, established, and generalised. Generalised too much, patterns lose their meaning. Maybe this paragraph did just that?

Interaction implies communication. Direct peer-to-peer communication means that we have to identify the receiver, and the receiver is usually also interested in who sent the message. Identification means either direct physical contact, or naming, and for I-I interaction physical contact is not even an option.

A name can be provided in many different ways. Some are born with it, as a network card is: it has a unique label imprinted in hardware. Some are given a name by someone else; for humans, parents usually accept this responsibility. Some have to ask someone else, which is done on the Internet when a networked station asks a server node for a temporary address. Some systems select an address. You do this when you buy a new mobile telephone and choose a number that is easy to remember. And, as a last resort, a name can be randomly generated.

When everything is connected to everything else in the next generation of networked systems, unique identifiers will be very important, and luckily it is surprisingly simple to create a unique identifier. All you have to do is to combine a place and a time. Microsoft, for instance, does this when it creates a unique id for a new software component that can be used worldwide. The place is provided by the number of the network card on the local machine.
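The place-and-time idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scheme any particular vendor uses: the function name `make_unique_id`, the field widths, the nanosecond timestamp, and the per-process counter (added so that two identifiers created within the same clock tick still differ) are all hypothetical choices.

```python
import itertools
import time
from typing import Optional

# Hypothetical per-process counter: guards against two ids being
# created within the same clock tick on the same machine.
_counter = itertools.count()


def make_unique_id(place: int, timestamp_ns: Optional[int] = None) -> str:
    """Combine a place (e.g. a network card number) with a time.

    `place` is assumed to be unique per machine; `timestamp_ns`
    defaults to the current time in nanoseconds since the epoch.
    """
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    sequence = next(_counter)
    # Fixed-width hexadecimal fields keep the three parts separable.
    return f"{place:012x}-{timestamp_ns:016x}-{sequence:08x}"


if __name__ == "__main__":
    card_number = 0x001A2B3C4D5E  # made-up network card number
    print(make_unique_id(card_number))
    print(make_unique_id(card_number))
```

Python's standard library contains a ready-made function in the same spirit, `uuid.uuid1()`, which builds an identifier from a host ID, a sequence number, and the current time.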
Combining two independent name-spaces into one unique space is also quite easy. We create a hierarchical naming scheme by adding a unique number to each of the original numbers. This is for instance done with telephone numbers, but could also be used to create a combined number space for Internet addresses and ISBN numbers. We simply concatenate one unique prefix with the Internet address and another with the ISBN number, and the two types of numbers can never be confused.

The naming system should be consistent with the topology of the represented world to support easy delivery of messages to the receiver. This, for instance, helps the mailman to find the right street address, and the hierarchical phone numbering scheme to route your telephone call to the right country, and to the right part of that country.

One example of an H-I interaction is a coffee machine that recognises you, has already filled up your cup, and greets you grinning with a cheerful "Good morning", and is maybe even smart enough to skip the "Good morning" some mornings. To implement this, the coffee machine has to be enhanced with machine vision for face identification, and also for finding out your mood. Two quite difficult, some would say impossible, feats. In H-H interaction we do face and mood detection effortlessly, even though there is evidence that face identification is treated separately from other object recognition, indicating that the problem is both difficult and important [SP].

The problem for a thing is not how to access digital information, which can be done through networking. The problem is rather how to make sense of the physical environment using sensors. For this, identification is a fundamental problem.

Biomedia is any information that a computer extracts from a human being, and this information can be used for biometrics, i.e. measuring human characteristics for identification. Performance is one obvious problem, but not the only one to consider. What if someone does not want to be recognised?
Should we do it anyway? Also, what if we make mistakes? Will users accept a system that fails to deliver now and then? When we combine identification with location awareness and networking, theft will become a high tech business, and security a major issue.

Me Tarzan, you Jane.

"Any sufficiently advanced technology is indistinguishable from magic." Arthur C Clarke

The chief mechanisms used in identification are:
- To point at it (primary, physical ID).
- To label it with a "word" (secondary, representational method).
- To draw a picture of it (another representational method, graphical).
We may also separate out from Set A (all things) a Subset B (a category of things). By itself identification is a rather useless action.

If we follow the relation H-T the other way, what can a human read from the exterior of a thing? It is designed, so in principle anything can be expressed! People are used to interpreting shapes; an average adult can name about 10,000 things. So, by exploiting this ability in H-T interaction we can simplify interaction, and enhance experience.

To identify is to separate a thing or action from the set of all other things and actions. Without a human in the loop we still have many new, and potentially important, applications. Some T-I interaction applications where identification is necessary are:

- Customize device behaviour based on identification of context. An intelligent camera should send the picture taken to an available hard disc.
- Customize physical environment based on recognized context. If a user starts reading a book the physical context adapts the lighting, and the telephone leaves a do-not-disturb-if-it-is-not-very-important message.

Recognition by bottom up visual search and data-driven matching has been shown to be NP-complete, i.e. computationally very expensive. If, on the other hand, knowledge about the context (task, situation, …) can be used, the complexity decreases.
Consequently, for efficiency reasons we should avoid building complete 3D-models and other complex representations, and instead let the interactors make better use of the contextual information. We should in other words prune the search space. This line of thinking contrasts somewhat with the ambition to build an (extensive) representation of the context for ubiquitous computing. What to look for, where to look, when, and how to look are questions we need a priori knowledge of the world to answer. One example of where such a contextual model is useful is when a thing goes shopping for food [RA]. Foods in a grocery store are grouped: candy in one area, big items low on the shelves, milk and butter far from the exit, and away from the entrance. This information can be used to guide an interactor in the store to find the right items.

[Figure labels: "8", top down, bottom up.]

A thing can use numerous techniques to learn more about its environment. One is image processing, which has many applications, such as improving or manipulating images, extracting 3D information from 2D images, or finding patterns in an image. The problem of generating object information from 2D or 3D images, or of doing motion estimation from a set of images, is referred to as computer vision, a special case being face recognition as discussed above. Critical issues in vision research revolve around the nature of the representations used, and the nature of the processes that recover characteristics [DM]. The representations range from local properties of the image, such as pixel intensities and edges, via depth and information on orientations of surfaces, to objects defined as hierarchies of volumes, which can be matched against real world objects.

For real world objects we have many practical problems: different lighting conditions, noise from hard rain, children that grow up, repainted houses, and snow covering a landmark.
One specific problem is background clutter, especially if the recognition uses edge detection. Other problems are background colour conflicts, if the algorithm uses colour lookup, and partial occlusion. Some algorithms also need elaborate training sequences to recognize an object and might require the user to select feature points.

The figure below illustrates how the human eye and brain represent an image [SP]. Each cell can store information about one area in the image. Cells in the middle of the eye are smaller, resulting in higher resolution for this part of the image.

Figure V.15.6 Human image representation. (Cell fields: surface, depth, slant, tilt, colour.)

Recovering the characteristics of a real world object from an image is typically an ill-posed problem where the solution (the characteristics) is not uniquely determined by the image analysed. The processes manipulating the representations either have to use contextual knowledge from the real world, or guess, to extract properties. Examples of rules used from the real world are: surfaces are consistently coloured and textured, motion from tension and gravity makes straight lines, objects are rigid, contours are continuous, and light sources are constant over short periods of time. We can then apply statistics to constrain the image processing by our knowledge of the world. Given the figure to the right, what is the probability that the uppermost image is a coin in the scene and you are viewing it edge-on?

[Figure: Coin? Alternative interpretations: Yes! No!]

Another representation of an image, quite different from the one of human vision, or the pixel based representation, is shown in the following figure.

[Schema: a Coin is a Cylinder, with position <x, y>, thickness 7 mm, and diameter 3 cm.]

Which of the representations above is the best one for identification? This has to be decided individually for each application, and it is important to consider both representation and processing at the same time because this can simplify either, or both, of them.
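The difference between a pixel based and a schema based representation can be made concrete with a small Python sketch. It is an illustration only: the class names follow the labels in the schema (a coin is a cylinder with a position, a thickness of 7 mm and a diameter of 3 cm), while the function `looks_like_coin`, its flatness threshold, the tiny pixel grid, and the coordinates are hypothetical additions.

```python
from dataclasses import dataclass


@dataclass
class Cylinder:
    """Generic volume primitive: a position plus two sizes."""
    x: float
    y: float
    thickness_mm: float
    diameter_mm: float


@dataclass
class Coin(Cylinder):
    """A coin 'is a' cylinder and inherits all cylinder properties."""


def looks_like_coin(obj: Cylinder, max_thickness_ratio: float = 0.5) -> bool:
    # Hypothetical rule: a coin is a cylinder much flatter than it is wide.
    return obj.thickness_mm / obj.diameter_mm <= max_thickness_ratio


# Pixel based representation: a grid of intensities, no object knowledge.
pixels = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
]

# Schema based representation of the coin (made-up position).
coin = Coin(x=1.5, y=1.5, thickness_mm=7, diameter_mm=30)
```

With the schema, identification reduces to a property check such as `looks_like_coin(coin)`, whereas the pixel grid first has to be processed to recover any object at all. This is one way to see why representation and processing should be chosen together.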
Compare the human way of storing an image, where two objects close to each other in the image end up close together in the brain, with an array in Java, where the correlation between the position in the array and the position in the image is arbitrary, chosen by the programmer.

Figure V.15.7 Knowledge schema for image

"Who are you going to believe, me, or your own eyes?" Groucho Marx

V.15.3.1 Pattern matching

Many human wonders are based on pattern matching, i.e. recognition, a basic but intricate feature. You quickly realise this if you try to implement it using a computer. The following figure shows some variations that have to be taken care of in the simple task of recognising the letter F.

Figure V.15.8 Different patterns that matching has to cope with to identify F.

abcdeīghij

As discussed previously, pattern matching uses two complementary processes, bottom up and top down. The bottom up approach combines lower level primitives and tries to form meaningful constructs. The top down process uses contextual information and internal models and matches them against the bottom up constructs. In a crowd you recognise your son, or at least his jacket. Odd, since the person wearing it is too tall and your son should be at home doing his homework. As you get closer you realise that it really is your son. Surrounded by his short mates he looks taller. "Homework?" he says. Another example is that late at night your kitchen is lit only by the moon. A cylindrical object on the table will rather be perceived as a cup than as a spare part for your car's engine.

Some aspects of matching are already built in by evolution. Why, for instance, do we see a diamond as a diamond, and not as a rotated square? The only difference is the viewer's frame of reference, see figure V.15.9.

We humans assume that we are directly perceiving the "objective" properties of the bodies which surround us ... Snow really is white, roses really are red, tables are flat, and so on.
This purely passive idea of perception is so entrenched in us that it is difficult for us to grasp that we only really see what, in a way, we have already seen. … We have already learned to distinguish through our culture, that is through prior interactions with other members of our species, certain ranges of colours, differentiating red from pink and violet for example. Jacques Ferber

Figure V.15.9 A square and a diamond shape?

Recognition by Component is a theory of matching developed by Irving Biederman. It is also an example of a bottom up model of behaviour. He postulates that matching is done by comparing a visible object to combinations of geometric icons, geons. We will identify a cup by matching it to geons representing a cup in our brain. Can you think of any reason why evolution has implemented the brain such that it first breaks the visual pattern down into edges and primitive patterns before putting the visual pattern back together again?

Figure V.15.10 Geons, small building blocks that combine into well-known objects.

Do not be fooled by the trivial examples chosen; they are only meant as illustrations of the general principle. Matching can be very complex, and even involve inventing, or imposing, a match. Also, matching is not restricted to identifying physical objects. Social relations and cultural patterns are also found this way.

[Figure labels: Runner? Crocodile?]

V.15.4 Navigation

Navigation is an activity where an interactor uses context to find its position and to follow a planned course. Our introductory example of a context is the social environment, i.e. other humans, and our navigation task is to ask them for their opinion when we are curious about someone's personality. Navigation in this context is a cyclic process where people, or traces of people, are browsed, the findings modelled, and the result interpreted and used.
Understanding moods and emotions is important, and there are many gestures and behaviours that we subconsciously and continuously monitor. Correct interpretation makes it possible to manoeuvre in a social context, to avoid conflicts, and to make a good impression. The result of the interpretation is used to decide on the strategy to use for the next iteration of the cycle, starting once again with browsing the environment [MR].

"Social navigation is the process of making navigational decisions in real or virtual environments based on social and communicative interactions with others." Mark O. Riedl

Figure V.15.32 Social navigation.

Some aspects of navigation are shown in figure V.15.33 below, where Mr A is shown, lost in Tokyo [DW], [MR]. He has knowledge about what Miss B knows, which he has learnt from studying her behaviour, or by knowing something else about her (maybe she looks Japanese). In addition, Mr A probably knows something about his current position: "I am east of my hotel". He also knows how to extract knowledge from the physical environment since he has a map (in English) of Tokyo.

"Man is by nature a social animal, and an individual who is unsocial naturally and not accidentally is either beneath our notice or more than human. Society is something in nature that precedes the individual. Anyone who either cannot lead the common life or is so self-sufficient as not to need to, and therefore does not partake of society, is either a beast or a god" Aristotle, 384-322 B.C.

Figure V.15.33 A memory system, adapted from [DW]. (Labels: Mr A, Miss B, context, representation of self, representation of person B, representation of context, representation of …)

There are two strategies for navigation possible in this situation. One is to ask the girl, and the other one is to consult the map. What strategy he will use depends on the disposition of Mr A, his mood, his knowledge of Japanese, his trust in Japanese girls, or maybe his marital status. No social navigation is necessary if the internal representation of the context is good enough, e.g. he suddenly recognises his hotel across the street.

"There is a dynamic relationship between people, the activities in space, and the space itself. All three are subject to change." [AM]

Navigation is further complicated by the fact that social environments are unstable. They consist of individuals that can suddenly change their minds about something, and modify their actions accordingly. We are all both navigators and parts of the map. The possibility for I/T to navigate in a human social environment is still well out of reach for technology.

Now let us, i.e. H, navigate in information. Spatial thinking and spatial navigation come naturally to us, and are therefore interesting as tools for exploration, orientation, and navigation in all sorts of environments. Navigation tries to answer the following basic questions. Where am I? Where have I been? How can I get to where I want? Finding the answers once again depends on positioning, and setting a course. They are two tasks performed within a context, either in real or in virtual reality, using a model of the places to visit (or miss) and the possible paths to travel. A good cognitive map of the world is necessary and can for instance be a mental map of roads, crossings and cities.

Familiarity with the space can be represented and generated in different ways. We might have travelled the same path before or seen a similar web page. Maybe we know the path from a map we have seen, or we have some high level topological knowledge. Passing street numbers for instance indicates speed and direction. Other times we know that the destination is beyond, or in between, familiar landmarks. If we have travelled the path before we will recognise clues in many modalities, a smell here, a sight there, i.e. the context is very important for navigation. A practical aspect is that fallback solutions, i.e.
backtracking, or short cuts to safe places, should be provided in case the wrong course was taken. Errors are inevitable. Chair Work Home Refridgerator Bed What does it mean for information to navigate? To start with, and as noted above, a prerequisite for navigation is a position since you need it to set a course. Information is virtual which means that the physical world is not the default space. In fact, for information anything that can be identified could serve as a position! Furthermore, there is an inevitable time delay associated with physical distances, but delays also could emanate from many other distances. One example is the distance between a software agent that understands English, and one that understands Swedish. Passing a message between these agents will involve a translation step that takes time. The conclusion is that interaction, and specifically navigation, of the type I-I is not necessarily situated in either time or physical space! One simple virtual space is a colour space. A specific yellow colour has a defined value, for instance (255,255,0) in the RGB colour space. The distance from yellow to red (255,0,0) can be measured and if an application wants to transform a yellow dot into a red one it could follow the path , , , , , … , , , , , . If we on the other hand want to navigate from black (0,0,0) to white (255,255,255) we have many different possible paths of equal length available. In the HSB (Hue, Saturation, Brightness) colour space navigation from black to white starts from (0,0,0) and ends at (0,0,100). In this case there is an obvious path to use. Hakan Gulliksson 231 The way-finding problem is a part of the navigation, and it is also an optimisation problem. We want to arrive as soon, or as cheep, as possible. The problem gets extremely complicated if multiple cost estimates (time, money, condition of the path) are associated with different sub paths. Navigation in virtual environments comes with additional costs [SS]. 
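The way-finding problem described above, with more than one cost estimate per sub-path, can be sketched as a shortest-path search. The following is only an illustration under stated assumptions: the place names, the (minutes, money) figures, and the exchange rate `money_per_minute` that folds both estimates into one number are all hypothetical, and Dijkstra's algorithm is one of several search methods that could be used.

```python
import heapq

# Hypothetical map: every sub-path carries two cost estimates, (minutes, money).
GRAPH = {
    "home": {"station": (10, 0.0), "taxi_stand": (2, 0.0)},
    "station": {"work": (20, 3.0)},
    "taxi_stand": {"work": (12, 15.0)},
    "work": {},
}


def cheapest_path(graph, start, goal, money_per_minute):
    """Dijkstra's algorithm on one combined cost: money + minutes * money_per_minute."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, (minutes, money) in graph[node].items():
            if neighbour not in settled:
                step = money + minutes * money_per_minute
                heapq.heappush(queue, (cost + step, neighbour, path + [neighbour]))
    return None  # the goal cannot be reached from the start


# A traveller who values time little takes the train, one in a hurry takes the taxi.
print(cheapest_path(GRAPH, "home", "work", money_per_minute=0.1))
print(cheapest_path(GRAPH, "home", "work", money_per_minute=2.0))
```

Collapsing the estimates into one number is itself a design choice. If time and money cannot be traded against each other, there is in general no single best path, only a set of trade-offs.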
One example is that input devices for 3D require training. Metaphors used, such as virtual flying, are easy to understand, but using them in practice is a problem, especially for a novice user. She is also often placed in an unknown environment, forcing her to do local explorations. On top of this, the world itself, landmarks, paths, and even laws of nature may behave strangely. One way to alleviate the problem is to externalise goals and plans. Examples of how to do this are to draw a red line in the virtual world that the user can follow to the target, or to introduce force fields that hinder her from going the wrong way (uphill is prohibited).

In virtual reality there are many more possible ways to do things, but still we want to reuse familiar human interaction techniques. The control metaphor chosen when designing navigation in a virtual 3D space is important, and should be made clear to the user. There are several options available depending on why the user wants to navigate. One alternative is that control is exercised using a virtual camera held in the hand. The view shown is the one seen by the camera. We can start from anywhere within the world, fly around and look at objects, expand ourselves at will, and maybe extend an arm to reach and manipulate an interesting object, all in third or first person, see the figure below.

Navigation indicators: regulatory signs, warning signs, and guide signs, the three most important signs as defined by MUTCD (Federal Highway Administration standard).

Figure V.15.34 Different ways to look at a virtual world.

Both positioning and course setting make use of lower level tasks, notably search and scan (browse). Scanning is a combination of the operations overview, zoom, and filter, and, as it should be, this is also a good way to describe how you yourself navigate in a jungle [BS2].
First you gain an overview of the scenario, and then you focus on one of the more promising openings, filtering out uninteresting brushwood and single trees that are simple to pass.

The new Google scanmachine

Figure V.15.35 Scan and search are complementary operations. (Labels: Scan, Search, Recognize, Describe)

Search and scan complement each other. Scanning is the superior strategy when it is easier to recognise than to describe information, and search excels if it is advantageous to specify the wanted information. To scan is also favourable if the user is not familiar with the content, and if there is not too much data.

Automatic navigation is one of the problem areas where progress of technology has met hard resistance. The thing faces exactly the same problem as us humans, but we already have a well-developed toolbox, provided by evolution. There are many problems facing the thing [NS]. To start with, the real world is dynamic and non-deterministic, which means that planning is difficult and must be done in real time. One example is the poor robot bumping its head on the bedroom door that is normally open. Repeating the same action does not always have the same effect. Some other problems are that the world is continuous rather than discrete, and that the thing can never fully perceive the environment because of sensor limitations (neither can we). The illustration to the right shows another problem, where a typical thing with only a local model of reality is caught by a curved wall.

The railway train is an example of a successfully navigating interactor, and the autopilot in an airplane is another one. Both must be considered as passive and reactive interactors as they only follow a tasty path impossible to leave. Examples of active, reflective, navigating things are harder to find and usually, as in the example of an automatic factory, demand a highly constrained environment.
One reason for this is that things cost money, and will break if handled carelessly. For information, automatic navigation is used in many important applications, most notably for packet delivery on the Internet.

A thing can use a camera as a sensor. In this case the first part of the navigation problem is the problem of assigning each pixel in an image to a position and a velocity in 3D space. By using this information, pixels can be grouped into objects, which can be identified as real world objects and avoided, or aimed at, depending on the application. Using such technology you can listen to music, and the music follows you as you move from room to room. When you leave the living room the music fades away, and meets you as you enter the kitchen.

Physical forces give us another way to model path finding and navigation. We start by representing goals and obstacles as potential fields. Goals are modelled as attractors and obstacles repel, see the figure below.

Figure V.15.39 Using potential fields as an abstract model. (Labels: Obstacle, Goal)

In the real world the strength of the field is typically inversely proportional to the square of the distance to the source, i.e. $F \propto 1/\mathrm{distance}^2$, a fundamentally continuous function. With the concepts introduced above a world can be modelled as the superposition of the potential fields of a number of obstacles and goals, and a smart interactor moves along a path towards a goal using as little energy as possible.

Figure V.15.40 Interaction as minimisation of energy spent on the path to the goal.

V.15.5 Choice

Choice, or selection, is a very basic concept, not easy to reduce into other lower level concepts. Navigation, for instance, depends on choices (and a selection involves navigation among choices). Intention and attention are prerequisites, at least for conscious selection, and attention is caught by and trained in a cultural and physical environment.
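The dependence of navigation on choice can be made concrete with the potential-field model from the previous section: at every step the interactor chooses, among the neighbouring positions, the one with the lowest combined potential. The sketch below is a minimal illustration only. The 5x5 grid, the positions, the field strengths, and the `+ 1.0` that keeps the inverse-square law finite at the source are all assumptions made for the example.

```python
# Hypothetical 5x5 grid world; positions and strengths are illustrative only.
SIZE = 5
GOAL = (4, 2)
OBSTACLES = [(2, 2)]
GOAL_STRENGTH = 10.0
OBSTACLE_STRENGTH = 2.0


def dist2(a, b):
    """Squared Euclidean distance between two grid positions."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def potential(p):
    """Superposition of one attracting goal and the repelling obstacles.

    Each source contributes strength / (distance^2 + 1); the +1 is an
    assumption that avoids division by zero at the source itself.
    """
    u = -GOAL_STRENGTH / (dist2(p, GOAL) + 1.0)
    for obstacle in OBSTACLES:
        u += OBSTACLE_STRENGTH / (dist2(p, obstacle) + 1.0)
    return u


def neighbours(p):
    """The up to eight grid positions adjacent to p."""
    x, y = p
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            q = (x + dx, y + dy)
            if q != p and 0 <= q[0] < SIZE and 0 <= q[1] < SIZE:
                yield q


def descend(start, max_steps=50):
    """Repeatedly choose the neighbour with the lowest potential."""
    path = [start]
    for _ in range(max_steps):
        here = path[-1]
        best = min(neighbours(here), key=potential)
        if potential(best) >= potential(here):
            break  # no neighbour is better: the goal, or a local minimum
        path.append(best)
    return path


print(descend((0, 2)))
```

Because every choice is purely local, an interactor of this kind stops wherever no neighbour looks better. That may be the goal, but it may also be a pocket behind a curved wall, which is the weakness of local models mentioned earlier.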
Attention is in fact a choice, conscious or not, that for instance a boy tries to manipulate by making an impression on a group of girls. This section is about how to detect and make choices.

We are continuously choosing from the menu of the world, and the trend is that increasingly choices are made and effectuated using technology. This means that technology must be adapted to how we choose, and to the context where we choose. The problem gets worse because computers display information many times faster than human language can express it, at least without learning and adaptation. This means that, relative to H-H interaction, choices among alternatives will be much more prevalent in H-I or H-T interaction.

Of utmost importance when we design for choice is to provide for feedback. If you run your favourite text editor and look for feedback, you will be surprised by the number of examples you find. Confirmations (highlight of selected text), in-progress feedback (hourglass icon) and feed-forward feedback (the cursor) are some examples [VB]. We are so used to working with windows, menus, icons, and pointers that we take the feedback they provide for granted. When we consider a multi-user, multi-input, multi-screen environment things get even more complicated, and we will have to reconsider how we accomplish feedback.

How many ways are there to control the lighting in a room [SS]? Here are some examples to start with:
• Lighting switch.
• Voice message.
• Voice message and entering the room.
List another 10 ways, assuming the environment is arbitrarily intelligent.

Choice is to thinking as battle to war. You can philosophise and deliberate all day, but the end result of all your mental gymnastics has to be a choice of some sort [CC]

Remote controls

How can we tell that a choice has been made? Figure V.15.57 shows some alternative selections.

Figure V.15.57 Different ways to select something. (Panels: Select 1; No selection at all; Select by elimination; Select by interval or position; Select by symbol or property)

All of the selections in the figure could as well have been slips, i.e. mistakes. If we do not know the intention of the user then all we can do is guess whether the choice was valid. One way to increase the probability is to ask for a confirmation. In some dangerous machines a choice is confirmed by using both hands in the selection.

Why not use both hands for spatial input? Better performance is possible! What tasks should be suitable for two-handed input? Could human factors be improved? Usability?

Figure V.15.58 Safe operation.

V.15.6 Manipulation

The more active participants of the human race will certainly try to change moods and stir emotions. Advertisements are mild and acceptable versions of manipulation, but others, such as hypnosis and group pressure, are potentially dangerous. Many forms of blackmail are certainly not allowed and will send you to jail. Still, our society would not work, not even for one single day, without social manipulation. We call it education, and want a child to be well raised.

We will now complement section V.15.2 on intelligent interfaces with a discussion on manipulation. We start with H-H, and by noting that most peculiarities of H-H can be reused for H-I or H-T interaction.

How could you avoid being manipulated? This is knowledge that works both ways. If you know how to escape it, you know how to exert it. Without knowledge about manipulation you will not even know that you are being manipulated. Are you? There are three factors that are important in developing resistance. First, knowledge about social psychology and attitude change is important; second, and maybe even more important, is general knowledge about philosophy and science. If someone presents scientific facts showing that energy is created in their refrigerator without any external energy source then you, as a knowledgeable person, will mutter something about UFOs and alchemy.
A third factor important for detecting manipulation is self-knowledge. It will help you to inspect yourself and observe your own reactions. Of course, a general attitude of scepticism, which is a born gift, is always healthy.

Have you seen a dictator with irony? A dictator that laughs at himself?

Some examples of findings from social psychology are that manipulators often start by making minor requests. They often seem concerned, sincere, and friendly. They use group pressure and do not make things too easy. They present you with an apparently meaningful task that is said to be tough, but that you, of course, are capable of performing. Immediate intimacy and friendship, and feelings of disorientation, confusion, and embarrassment are some possible indications that you are being manipulated [SD]. In general people are easy to fool, as can be seen in any magic show!

Not all manipulation is bad of course. Some flattery could even help convince someone to do the dishes, and praising the result will lower the resistance to repeating the feat. Flattery is an important social cue, and similarity is another. People similar to us persuade us more easily. If you play golf and meet another golfer the probability of liking increases. In fact, the greater the similarity in background, traits, or attitudes, the greater the potential for persuasion [BF]. Whether this has been scientifically proved to apply also to fifty-year-old men doing the dishes we do not know.

In a social setting we exploit similarity by adopting culturally predefined roles. In the role of a teacher we are trusted to know the subject and be able to teach it. Another example is that any referee is automatically accepted as an authority on the football field. A social role can easily be used for manipulation; we for instance readily trust a doctor and accept the decision of the head of the family.

Listening, not imitation, may be the sincerest form of flattery.
Dr. Joyce Brothers

Yeah!
Attractiveness is another important aspect, since attractive people or products influence us socially more easily. Physically attractive people are by default assumed to be intelligent and honest. Luckily people cannot yet be designed, only styled. A problem for product design is that different audiences have different culturally established preferences, which vary over time. This means that the designer has a lot of footwork to do, making surveys and looking for clues, for instance in typical magazines and TV shows.

The last possibility for social manipulation that we will mention here is the rule of reciprocity, which seems to be followed in every human society [BF]. The principle is that if you are given a favour you will feel obliged to return it, and this can be used for social manipulation in many ways. One example is companies that give you a watch, almost for free, if you sign up to buy one book each month.

Manipulation for I-I means changing a data structure, for instance deleting, copying and moving data in a database. Here we will add a short discussion on security. Security is how to protect a computer, network, or another resource, from being manipulated. It is necessary to guarantee privacy, the condition of keeping something personal. Computers and networks have created new challenges, but the basic problems are as old as the social network. Some members of a group are authorized to access resources, and authentication is needed to verify their identities. For face-to-face communication authentication is simple; in other situations we need passwords, biometrics, or access cards with PIN codes.

Unauthorized access could be gained by misusing prior authorization, masquerading as someone else, or by exploiting some vulnerability in the security system. All well-known tricks from films featuring J. Bond. Once inside the system, or with access to the protected resource, the intruder could steal, destroy, or browse secret information.
Denial of service is another type of attack that could be especially damaging for a computer system. It diminishes server capacity by keeping the server as busy as possible, and thereby temporarily hinders access to the system. It is interesting to note that biology inspires thinking about network security. Viruses and worms are different existing attacks. The defending side also uses colourful names; firewalls, sandboxes, and honey pots (used to trick hackers) are some examples. Security is always a balance between the cost of losing control and the cost of the equipment and administrative expenses needed to keep it. Another important lesson is that insiders are responsible for most of the security attacks.

Security please!

The combination of a user name and a password works well to authenticate fixed access, but as users start moving around and take their computers along, new solutions are needed. Logging in every 100 metres, i.e. to every local area network passed, is both a nuisance and a threat to integrity. Context authentication is an alternative approach, where the network can use any implicit information about the user, such as usage history, the current user environment, or typical user actions, to inform authentication. One example is that the information displayed on a public information kiosk at a train station should be erased if the user walks away. Other examples are mobile computers that work only at a specific sports arena, and a network service that is enabled only if the user accepts the service, e.g. by filling in a form on a web page.

A user scenario is that you attend a meeting at some location where you have never been before. You have a document that you would like to print, but the problem is to persuade the new network environment to grant you access to a printer. Currently, the solution is to find a system administrator (usually very difficult), or to disconnect the printer from the network.
One possible alternative could be a trust-based security system where users on a network are given rights to delegate access rights to third parties. This way, one of the attendees at the meeting, with the proper access rights, could give you time-limited access to the printer, the projector, and the coffee machine. Rights could also be associated with devices and software agents. Anything with a unique identity could be given, and could delegate, access rights. What we need is a language, preferably XML-based, to describe who has what access rights to what.

Each manipulation changes attributes, or states, of the manipulated object. Move, for instance, means that the position of an object is changed. Alternatively we could, if it suits us better, assign the attributes to the manipulation; a move could be done for a thing at a specific angle, to a predefined position.

[Illustration labels: T(move, x, y) and Move(T, x, y)]

The English language has thousands of verbs, and many of them can be classified as manipulations. The commands selected and discussed in the following manipulate screen-based virtual objects rather than persuade interactor-objects into doing something. Let us try to use physical reality to list some of the possible manipulations, starting with two objects A and B.
• We can merge, join, group, compose, close, shut A and B, and then split, divide, fork, break, open, extract the composition.
• The new (or old) objects can be pulled, moved, pushed, lifted, drawn apart.
• We might cut, delete, remove D and save it in memory, and then paste, retrieve, D back from memory.
• At last we can stretch, mould, shape, form, C.

Some useful manipulations do not have a direct physical equivalence, and create is one example. In the virtual world we can easily create something out of nothing, whereas in the real world this is impossible, even if there are good approximations.
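The idea that each manipulation changes attributes, or states, of the manipulated object can be sketched in a few lines of code. This is only an illustration: the class and function names (`Thing`, `Workspace`, `move`, `merge`, `split`) are hypothetical and chosen to mirror the verbs listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    """A hypothetical screen-based virtual object."""
    name: str
    x: float = 0.0
    y: float = 0.0
    parts: list = field(default_factory=list)


def move(thing, x, y):
    """Move(T, x, y): the manipulation changes the position attributes."""
    thing.x, thing.y = x, y
    return thing


def merge(a, b, name):
    """Compose A and B into a new object that keeps both as parts."""
    return Thing(name, parts=[a, b])


def split(composite):
    """Break a composition and hand back its parts."""
    return list(composite.parts)


class Workspace:
    """Holds the visible objects and a one-slot memory for cut and paste."""

    def __init__(self):
        self.objects = []
        self.memory = None

    def create(self, name):
        """Create something out of nothing, easy only in the virtual world."""
        thing = Thing(name)
        self.objects.append(thing)
        return thing

    def cut(self, thing):
        """Remove the object from view and save it in memory."""
        self.objects.remove(thing)
        self.memory = thing

    def paste(self):
        """Retrieve the saved object from memory."""
        if self.memory is not None:
            self.objects.append(self.memory)
        return self.memory


ws = Workspace()
a, b = ws.create("A"), ws.create("B")
move(a, 3, 4)
c = merge(a, b, "C")
print(c.name, [part.name for part in split(c)])
```

Here the attributes live in the object and the manipulation is a function applied to it, which corresponds to the Move(T, x, y) form; the alternative T(move, x, y) form would make `move` a method of `Thing`.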
Using information we can model and manipulate a surface in 3D, but information can also be used to directly manipulate things. According to Encyclopaedia Britannica, sculpturing is a form of aesthetic expression in which hard materials are worked into three-dimensional objects of art. A lot of different media may be used, including clay, wax and stone. Materials can be carved, modelled, moulded, or otherwise shaped and combined.

New technology gives new possibilities. Scanning a rotating object using a laser is one way to input a 3D structure, and given a 3D description there are now machines that can automatically mould blocks of suitable materials to the exact shape designed by the user.

The lathed surface is an example of a less complicated graphics-based sculpturing. The surface is generated by rotating a curve around a coordinate axis. Just as raster operations can perform logic operations on bitmapped computer graphics, Boolean operations between volumes can be defined in 3D. Using these operations new forms can be constructed starting with simple 3D objects.

[Illustration label: AND]

V.15.6.1 Speech acts

Language is studied from many points of view, including the validity of a statement, syntax, and culture dependency. One view particularly interesting for interaction is that of speech acts. Here, focus is not on whether a statement is true or not, but on the act intended by the utterance: to say is to do. You are not only describing situations, you are creating and manipulating them.

Making a normal utterance, i.e. saying something, involves a hierarchy of acts on different levels. At the lowest level is the act of utterance. An utterance is perceived, even if it is in another language, e.g. a gesture, and even if we do not understand its meaning.
If we take an utterance in some language, add meaning at a particular time and place, and also add an intention on the part of the speaker, we arrive at another level of the act, the illocutionary speech act. In this act the utterance, or another form of message, affects the addressee. The speaker wants the listener to recognise the meaning of the utterance, i.e. to do or think something. One example is the statement "You'll do that". An illocutionary act can be expressed as a performative, i.e. a verb, operating on some content, i.e. "do" and "that" in the previous example. The result of the act is the perlocution, and it relates to the effects that an act has on the state of the addressee. If a teenager is asked to wash up the dishes he can attend to the task singing, or he might immediately leave the house: two different perlocutions for the same illocutionary act.

Searle [JS2] proposes the following taxonomy for illocutionary acts:
• Assertive acts, used to commit the speaker to the truth of the expression, "There's too little salt in the soup".
• Directive acts, which try to persuade the listener to perform something, "Pass the salt, please".
• Promissive acts, which are attempts to commit the speaker to do something, "I will pass the salt, some day".
• Expressive acts, which express the speaker's feelings about a state of affairs, "I am sorry, the salt has been stolen".
• Declarative acts, which perform an act by the utterance, "I curse you" (the effect is uncertain, but maybe some salt might help?).

Duck! What will you do? On the golf course versus out hunting ducks [WB]?

When we look into a mirror we think the image that confronts us is accurate. But move a millimetre and the image changes. We are actually looking at a never-ending range of reflections. But sometimes a writer has to smash the mirror – for it is on the other side of that mirror that the truth stares at us.
Harold Pinter, Nobel lecture

Buy, Sell
Top 2 words of the century

I say to the House as I said to ministers who have joined this government, I have nothing to offer but blood, toil, tears, and sweat….
Winston Churchill

We choose to go to the moon. We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard, because that goal will serve to organize and measure the best of our energies and skills,…
John F. Kennedy

An interesting feature of all speech acts is that they are independent of the cultural setting and the linguistic form chosen for them. "Pass the salt, please" and "Would you mind passing the salt" are equivalent from the speech act point of view.

The theory of speech acts is quite general and currently difficult to use directly in applications, but it is useful in the analysis of conversations, and hence it follows that it can be used to analyse interactions, for instance a graphical user interface. It also provides insights into the use of the human language.

Act 3, in the kitchen, father and son sitting at the table, son doing nothing, father humming.
F (hinting at the issue): “Oh, the dishes are not done!”
-> No visible or audible response whatsoever
F (testing a more direct approach): “It is your turn to do the dishes!!!”
-> S: “I will do it, I will, soon, just have a little thing to attend to”

Act 4, later in the kitchen, father enters kitchen, son doing nothing, father steaming.
F (adding a touch of affect): “What the hell, the dishes still not done!?”
-> the world keeps spinning, the Universe is not disturbed …
F (bringing the point home, stabbing at the heart): “Your monthly allowance will be reduced”
-> S (looks at father, surprise in his eyes): “Dishes, me? Why didn’t you say so?”

The following figure shows a state based model of speech acts (speech dance?) between two interactors A and B [TWD2].
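One way to make a state based model of this kind concrete is a transition table that maps a state, a speaker and a speech act to the next state. The sketch below is an assumption made for illustration and not a transcription of the figure in [TWD2]: the state names and the exact set of transitions are hypothetical, covering a request that can be promised, rejected, countered, or withdrawn.

```python
# Hypothetical transition table: (state, speaker, act) -> next state.
TRANSITIONS = {
    ("start", "A", "request"): "requested",
    ("requested", "B", "promise"): "promised",
    ("requested", "B", "reject"): "closed",
    ("requested", "B", "counter"): "countered",
    ("requested", "A", "withdraw"): "closed",
    ("countered", "A", "accept"): "promised",
    ("countered", "A", "counter"): "requested",
    ("countered", "A", "withdraw"): "closed",
    ("promised", "B", "assert"): "asserted",  # B claims the task is done
    ("promised", "B", "break promise"): "closed",
    ("promised", "A", "withdraw"): "closed",
    ("asserted", "A", "declare"): "closed",  # A declares the request satisfied
}


def run(dialogue, state="start"):
    """Follow a list of (speaker, act) pairs and return the final state."""
    for speaker, act in dialogue:
        key = (state, speaker, act)
        if key not in TRANSITIONS:
            raise ValueError(f"{speaker} cannot {act!r} in state {state!r}")
        state = TRANSITIONS[key]
    return state


# A asks for the salt, B promises, delivers, and A declares the matter settled.
print(run([("A", "request"), ("B", "promise"), ("B", "assert"), ("A", "declare")]))
# A asks for the salt, B counters by asking for sugar, and A accepts.
print(run([("A", "request"), ("B", "counter"), ("A", "accept")]))
```

Even this small table has a dozen transitions, and every act that is missing from it raises an error, which hints at how much of a real conversation such a model leaves out.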
Figure V.15.72 Speech dance, clearly extremely simplified. (Transition labels: A: request, A: declare, A: counter, A: accept, A: withdraw, B: promise, B: break promise, B: counter, B: assert)

Interactor A starts by issuing a simple request for some salt. B could promise to deliver the goods, reject the request, or choose to counter by requesting sugar. The figure models quite a simple interaction, omitting many details, yet it results in a rather complicated graph, which indicates the depth and variability of human speech interaction.

If I'd known I was going to win I'd have written a speech, so here it is

Many of the idealists can only conceive of an idle humanity as an ideal humanity. They talk as if no man could ever rest until he reached Utopia; or as if a really long holiday were something like heaven, utterly distant and divine. Their social philosophy is that of the hearty and humorous epitaph of the charwoman, who had gone on to do nothing for ever and ever. But even now it is by no means certain that those who are not charwomen really become any more hearty and humorous by doing nothing for ever and ever. A vast amount of stuffy and sentimental humbug has been uttered in favour of the Gospel of Work. As it was said that Carlyle talked a great deal in praise of silence, it may also be respectfully affirmed that he idled away a great deal of his time meditating on the virtues of labour. Work is not necessarily good for people; overwork is very bad for people; and both often begin with a bad motive and come to a bad end. But there is another strong objection which I, one of the laziest of all the children of Adam, have against the Leisure State. Those who think it can be done argue that a vast machinery using electricity, water-power, petrol, and so on, might reduce the work imposed on each of us to a minimum. It might, but it would also reduce our control to a minimum. We should ourselves become parts of a machine, even if the machine only used those parts once a week. The machine would be our master, for the machine would produce our food, and most of us can have no notion of how it was really being produced.
G K Chesterton 1925

I think the name of leisure has come to cover three totally different things. The first is being allowed to do something. The second is being allowed to do anything. And the third (and perhaps most rare and precious) is being allowed to do nothing. Of the first we have undoubtedly a vast and a very probably a most profitable increase in recent social arrangements. Undoubtedly there is much more elaborate equipment and opportunity for golfers to play golf, for bridge-players to play bridge, for jazzers to jazz, or for motorists to motor. But those who find themselves in the world where these recreations are provided will find that the modern world is not really a universal provider. He will find it made more and more easy to get some things and impossible to get others. […] The second sort of leisure is certainly not increased, and is on the whole lessened. The sense of having a certain material in hand which a man may mould into _any_ form he chooses, this a sort of pleasure now almost confined to artists. As for the third form of leisure, the most precious, the most consoling, the most pure and holy, the noble habit of doing nothing at all--that is being neglected in a degree which seems to me to threaten the degeneration of the whole race. It is because artists do not practice, patrons do not patronise, crowds do not assemble to worship reverently the great work of Doing Nothing, that the world has lost its philosophy and even failed to create a new religion.
G K Chesterton 1927

Part VI: Design, humans change our future

The previous chapters introduced interaction and interactors.
These concepts can be used to model many important aspects of the world, for instance the workings of society, companies and families, or other systems built by technology, such as a car or a computer. Still missing in our modelling toolbox is how to describe purposeful creation. Interaction is all right for describing how something, for instance a computerised information service, works, but it is not sufficient if we want to describe how this service was thought out. It does not provide a framework where we can describe why the service was needed, why it works the way it does, or why you like using it. To answer these questions we need a broader framework, where interaction is still important as the glue and the engine: we need the concept of design.

Definition: Design is intentional change in an unpredictable world.
Nelson, Stolterman [NS]

Why do we intentionally try to change and arrange our future? Some answers are the same as to the question of why we interact. Deliberate change helps us survive and prosper, and if we do it right we can dominate others, or serve them well. There is always some little detail affecting our lives that we would like to fix, and even if the world by some strange coincidence were perfect we would still want to change it just to make an impression. Change is inevitable.

Using a design perspective we can avoid being trapped in last-minute adjustments to fix problems, and instead formulate, and strive for, well thought out visions of longer-term solutions. Design formulated this way can be used as a framework for human development in general [ES].

If the world were completely deterministic, changing it would be an easy task. But many things that happen to us in effect happen by chance. We are also ourselves part of the system, entangled in uncountable feedback loops, which makes things a tiny bit complicated, especially at the social level.
So, how do we accomplish change and, even more important, how do we make sure that the changes chosen give the intended result? First, we need an accurate model of the world, which we have seen in previous chapters is quite difficult to achieve. Next, we need to understand the cause-effect relationship of actions at the appropriate level, e.g. at the physical or social level. We need to do this not only for one cause and its effect, but also for the chain of causes and effects that we trigger by a change. The complexity of the constraints, such as technology, economy and social relations, and also of the problem, if it in any way concerns humans, makes it impossible to optimise the solution in a formal way. Rather than solving the problem, we need to resolve it, and working with such problems calls for judgement and balanced solutions, not only for yes or no answers.

Inquiry by the scientific method is one way to generate knowledge for judgement. Another way is to use intuition, which is formed by experience and is difficult to describe. It is sometimes referred to as tacit, or silent, knowledge, and is exemplified by art, where the artist seemingly just does it. Since we in design have to make decisions in a hopelessly complex world, intuition is a necessary complement to scientific reasoning [ES]. Can we trust intuition? Actually, we have no choice! But we should, as often and as much as possible, use scientific reasoning to guide us.

Acquiring knowledge efficiently is to some extent a matter of method and can be learnt. Some of the methods will be described later in this chapter. Systematic and continuous introspection of our own behaviour, and of the behaviours of others, is also important. One interesting observation is that intuitively (!) we trust some people to have better intuition than others.

When the work is done we still have more work to do. We have to evaluate the work, both the result and the process leading to it.
Did we find the right balance between using intuition and explicit knowledge in a particular case? Was time spent on the right parts of the result? These and many others are important questions to answer if we want to improve as designers and if we want to get paid for more than a first lucky shot.

A design process is typically realised in different phases. Starting from a vision of what we want to achieve, we formalise a requirement specification, think about the concept of the design, its appearance and many other things. The design is also for a specific context; maybe we can identify typical users and tasks. All of the time while we work with these questions we should evaluate how we work, and the results. Some of the evaluation techniques we can use are to ask an expert, ask users, study a prototype of the design in action, and test key aspects of it in a laboratory.

How to apply why and how is design. Why apply how and why is human. HG

Typical questions for evaluation are whether the result of a design process is useful, and whether it gives the intended result. One indication that a product is any good is that people use it. This is a good top-down measure, both for restaurants and web sites. Evaluation bottom-up is more difficult. How do we measure if someone feels at home with our design, to what extent the sense of time and space is lost, or if a tool is trusted? Security, privacy, and customisation all affect usage and should be evaluated, as well as capabilities such as the possibility for the interactors to group themselves, and to what extent they are peripherally aware of others. Note that evaluation, as well as design, is something that we all do all of the time.

VI.1 What is the problem?

The first thing to remember is that most design tasks are wicked problems [JLES], meaning that there is no final best solution and that the problem is hard to define before a solution is found (at which time the solution is obvious).
The problems are ill defined, ill structured and have resolutions rather than solutions. During the design process focus constantly shifts between details and the whole. This makes the delegation of design work difficult.

If we consider designed products carefully it is difficult to find anything at all that works perfectly from every point of view [DP]! Aircraft crash, axes do not keep their edges, and cars smell. But these deficiencies are the results of necessary compromises. All design involves compromises, as does engineering.

As stated several times earlier in this book, the main idea with interaction technology is to add QOL (Quality of Life) by introducing new technology. This could be accomplished in the products of today by incorporating complicated systems or devices, such as the Internet or the computer, and using them to invent new services and products that solve problems. Technology by itself allows for many degrees of freedom, which means that the designer is faced with the combination of a wicked problem and an enormous toolbox. So far, many of these products are almost impossible for most ordinary people to use. The problem is termed cognitive friction in [ACR], meaning that the products are not well adapted to human behaviour and thinking.

[Figure VI.1.1: One problem is the difference between the designer's and the user's views. The figure shows the designer's intention with the product and the user's view of the product, each at a conceptual level and a physical level.]

If you study figure VI.1.1 you can see that the designer and the user might have different views of the product at the physical level. Examples are a menu with a strange name, a badly chosen colour, or a grip best suited to a left-handed person. This physical-level mismatch is usually the easy part to fix. Cognitive friction is also shown in the figure as a discrepancy at the conceptual level between how the user and the designer view the product, i.e.
their mental models of the product do not agree. The reason for this discrepancy can be that they have different background knowledge, or lack thereof. The result is an interruption of the habitual, standard, comfortable being in the world, and the user has to adapt to the product [TWD2]. Imagine a user who has always used a fully automatic digital camera and suddenly is confronted with an old Leica camera from 1975 with a separate exposure meter. How many students today can use a slide rule, where multiplication is performed by addition (of logarithms)?

One way to overcome the problem of cognitive friction is to improve the usability of the product. This can for instance be done by adding a menu that is easier to read, or by supporting the user with a smart pop-up form at the right place and time. Evaluating usability means estimating the Gulf of execution and the Gulf of evaluation [DAN2]. The Gulf of execution is the difference between what the user wants to do, and what actually can be done using the controls in the product. The Gulf of evaluation is the difference between the state of the system perceived by the user, and the actual state of the system.

Another (better) way is to take a step back, consider the original goal of the user, and from this redesign the whole interaction, more or less ignoring implementation issues, i.e. do a goal based design. In figure VI.1.2 this means that design should start from what is desirable rather than from what is technically possible, or economically viable.

“Like putting an Armani suit on Attila the Hun, interface design only tells how to dress up existing behaviour” Alan Cooper [ACR]

It should be simple enough to use but functional enough to be useful.

[Figure VI.1.2: Goal based design starts from what is desirable.]

The discrepancy at the conceptual level is a problem, but what makes design really difficult is found at the intentional level.
How do we find out what another human being needs when she probably does not even know it herself yet, and most likely cannot articulate her wish? One way to focus on a goal-based design is to constantly ask ourselves and others involved the question “Why?”, as in “Why is this book worth spending time with?”. This is quite a different question from asking how, e.g. “How do I read this book?”. Another example is that we first should ask ourselves “Why does the user need this command?”, and only if we are satisfied with the answer do we ask “How does the user access the command?” and “How do we implement it?”. If we cannot answer the why question we should not even start asking how! The question of why is useful in many steps of the development process.

On the physical level making a cup of coffee is not difficult. If you are addicted to coffee, and in great distress, you will probably manage any technology to put together a cup of java regardless of any cognitive friction. But without having tried coffee, or perhaps having tested it just once, would you envision starting an industry by designing coffee makers or a world-wide chain of coffee shops?

VI.2 Design for H-H

The problem with human co-operation using technology is that all aspects of human life, i.e. all of its complexity, must be channelled over technology. Of course this is difficult! Typical questions to be answered by the designer when a team of human interactors is studied are:

• Who speaks?
• Who is spoken to? A designer faces quite different challenges for communications of type one-to-one, one-to-many, or one-to-all.
• What is said and why?
• When and for how long? Floor control and user roles should be considered.
• What medium? Face to face or e-mail are two alternatives.
• What method is used for decision-making? Negotiation or central control, for instance.
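The questions in the list above can be recorded as a small data model while a design is being discussed. The Python sketch below is only an illustration under our own assumptions: the names `CommunicationDesign`, `Addressing` and `open_questions` are hypothetical and do not come from this book or from any library. It shows one possible way to keep track of which questions still lack an answer.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import List, Optional


class Addressing(Enum):
    """The three communication types named in the list of questions."""
    ONE_TO_ONE = "one-to-one"
    ONE_TO_MANY = "one-to-many"
    ONE_TO_ALL = "one-to-all"


@dataclass
class CommunicationDesign:
    """Hypothetical record of a designer's answers; None means not yet answered."""
    speaker: Optional[str] = None               # Who speaks?
    addressing: Optional[Addressing] = None     # Who is spoken to?
    content_and_purpose: Optional[str] = None   # What is said and why?
    timing: Optional[str] = None                # When and for how long? (floor control, roles)
    medium: Optional[str] = None                # What medium? e.g. face to face or e-mail
    decision_method: Optional[str] = None       # e.g. negotiation or central control


def open_questions(design: CommunicationDesign) -> List[str]:
    """Return the names of the questions that still lack an answer."""
    return [f.name for f in fields(design) if getattr(design, f.name) is None]


# Example: an early draft where only three of the six questions are answered.
draft = CommunicationDesign(
    speaker="project manager",
    addressing=Addressing.ONE_TO_MANY,
    medium="e-mail",
)
print(open_questions(draft))
```

For the draft above the remaining questions are `content_and_purpose`, `timing` and `decision_method`. A plain record like this says nothing about how good the answers are; it only makes visible which questions have not been addressed yet.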
For multi-user applications, such as collaborative work environments, a designer also faces the problem of evaluation. It is often difficult to observe the system in use since users are not collocated, and it is also difficult to create realistic test conditions in a laboratory. The user in her different roles will be a part of the design. This is both because she can tinker with system properties, and because she and her group will adapt their behaviour to the system, thereby changing the context it was designed for.

The fact that more than one user interacts raises additional issues compared to a single-user application. A designer has to reflect on privacy, access control, and conflicts between individuals or groups, e.g. between managers and others [JG2]. Should a manager be allowed to add some functionality, e.g. a mandatory time schedule that suits his purposes and his ways of working, but only adds to the work of others? In reference [JG2] the author formulates some additional challenges for developers when designing groupware and other equipment for H-H interaction:

• Exception handling is difficult. Interaction and groupwork follow the rules most of the time. The problem is that sometimes rules are broken for good reasons, usually because breaking the rules makes work much easier. If workarounds have to be used the groupware becomes an obstacle.
• Lack of experience in designing groupware systems. It is difficult to intuitively foresee all the intricate dynamics involved. Even a small change, such as making the creation date of documents visible, could change behaviours.
• Problems with introducing and managing the system. One example is that high-tech gear is typically designed to show off the technology. This can make users appear socially unattractive.

Work processes can usually be described in two ways: the way things are supposed to work and the way they work.
[JG2]

E-mail has been a success because the sender takes the initiative and has to do most of the work. You can read an e-mail quickly, even though too many e-mails, and computer viruses, could turn even reading them into a nightmare. E-mail is compatible with common office practice, it is informal, and it is easy to adapt to new situations, even if emotions are difficult to express and easy to misinterpret. Evaluation of e-mail as a CSCW tool is ongoing everywhere by us all, and still, after 30 years, the verdict is not final. E-mail is constantly compared to, and merges with, other new technologies. It is for instance difficult to quickly browse an MMS message that is audio only, and this could be a crucial difference in favour of the text message.

The question of whether a new technology is labour saving is not an easy one to answer. Some evidence suggests that the washing machine in fact increased the time spent washing! The reason was that the acceptable level of hygiene changed. Whether the same is true about the dishwasher is not known to us.

Culture – A set of beliefs, desires, intentions, trust, morality. J. Odell

Why is videoconference not used to a greater extent at Ericsson and Nokia (2004)?

VI.2.1 Ethics, privacy and security

As the context learns more about the interactors it can perform better, but the other side of the coin is that this threatens the interactors' integrity. Will we feel comfortable in an everyday environment with computerised eyes and ears that constantly observe us and register our behaviour? Hardware sensors shrink and are soon too small and numerous for humans to relate to them. How will you know which nodes to shut down for privacy if you cannot even see them? To top this, digitised information is inherently lightweight (instant copy), and unfaithful. It is already possible to track down both the addresses of the visitors at a web site and where they have been before.
This information is currently not used to its full potential, but when it is, it will have a big effect on feedback from usage patterns. The Amazon bookshop already collects information about what books you are interested in.

When designing persuasive technologies you will have to consider carefully the ethics of the services they produce. What if the user loses time or money, or does something regrettable? Who is responsible if the socially intelligent toy makes a serious mistake and perhaps hurts the child? Should we blame the designer, the company who paid the designer, the shop who sold the toy, Mother Nature, or perhaps the mother of the child who bought the toy? The toy itself usually gets off easy. These problems will become worse as persuasive technologies evolve. One example is a computerised slot machine with a high level of psychological insight.

“What might it be like to live in a world where personal information becomes available as one moves from one space to another? It is hard enough to keep track of files in a desktop PC, but, with new context aware systems, how will we know when information is captured, accessed, and used, and by whom, for what purposes in context-aware settings? And how will this kind of capability make us feel?” Victoria Bellotti, Keith Edwards, Xerox Palo Alto Research Center

The creators of a persuasive technology should never seek to persuade a person or persons to something they themselves would not consent to be persuaded to do. The Golden Rule of Persuasion.

A set of rules that should allow applications to be developed without risking a debate has been proposed in the privacy guidelines issued by the U.S. Federal Trade Commission, also compare 9.8.4:

• Notice: The individual should have clear notice of the type of information collected.
• Access: All information in the system should be accessible and changeable by the users themselves and it is up to them to change it, whatever way they like.
This is for instance currently not the case on the web or in newspapers.
• Use: How the information is used, and by whom, should be clear to the information provider. Once again, this is not the case in real life today.
• Security: Users are only allowed to access information that they could have obtained anyway, for instance by personally participating in an event. Reasonable measures should be taken to secure the data from unauthorized access.
• Choice: An individual should have the choice to deny data collection.

Commercial organisations are mostly interested in the integrity of data, the military worries more about secrecy, and individuals are concerned about privacy. [UL]

You already have zero privacy. Get over it. Scott McNealy

If more information is shared by the interactors, the potential complexity of the interaction will increase. Information about previous interactions is obviously interesting to store. The problem is that there is way too much information that could be potentially useful (somehow humans found a way around this problem millions of years ago). For reliability we also need ways to distribute and safeguard the information while maintaining fast access.

VI.3 Design for I-I

No human user in sight means that many of the aspects of design are not applicable. Focus is instead on function and efficient use of resources, and the most important resource is the network. Compliance with standards reduces redundant work and enables reuse. Performance is important, but so are reliability and security. Networked devices are usually quite complex, so advanced software is needed. Since performance is important, so is hardware to speed things up.

Wireless networks make many new applications possible. They are in fact at times so revolutionary that traditional affordances are no longer valid. With no cables attached to the stereo it is difficult to identify a loudspeaker. What is my device currently up to and interacting with?
What consequences does this interaction have, e.g. for safety, security, stability, and configuration of other applications?

Smaller local area access networks are not too difficult to design and manage, even though administering a network with 10 users can be quite labour-intensive. A designer of a larger network faces real challenges. Policies, i.e. who is allowed to do what with what, and security are difficult issues.

An interaction is in general much more dynamic than a medium. The interaction involves at least two interactors and a medium (message), and extends over time. To evaluate an interaction, evaluating the medium is not enough; the characteristics of the interactors and the context must also be accounted for. This is not too difficult for I-I interaction, where efficiency is easy to measure, e.g. a short message is better. Whenever a human is involved in the interaction the evaluation of the medium is much more complicated.

VI.4 Design for H-I/T

At this stage we would like to relate technology for interaction to design. The figure below shows three different aspects of design, each with its own knowledge space. How a designer goes about designing is one dimension, another is how a product fulfils its purpose, e.g. through usability and credibility, and the third, perhaps the most important, is what the product is used for.

[Figure VI.4.1: Design and technology interact. How a designer designs a product: ergonomic design, visual design, interaction design, concept design, requirement specification, reuse (standard), sustain a vision. How a product fulfils its purpose: credibility, usability, novelty, technology, economy, aesthetics, status, … What a product does: basic operation (merge, join, …), interaction (identification, navigation, choice, …), change (to/from), serves a purpose (desiderata).]

The more abstract product deliveries, i.e.
desiderata and change (to/from) in figure VI.4.1, deserve additional comments. Panta rhei, i.e. everything flows, is one of many ways to express that change is inevitable. We constantly either manoeuvre towards a wanted state, or move away from something we do not like. The alternative is not very interesting; no change at all equals death.

A desideratum (that-which-is-desired) is what we intend the world to be; it is the purpose of our product (and our lives), and is not limited to needs. A product that satisfies a need, for instance a pair of shoes, is very limited in scope compared to a social system designed to keep a group of people working in good spirit towards a common goal. Desiderata are related to the intentional states and emotions discussed in part IV of the book, where desire was one of the intentional states listed. For a more thorough treatment of desiderata we refer to reference [ES].

A desideratum is something that is evoked out of a want, a desire, a hope, a wish, a passion, an aspiration, an ambition, a quest, a call to, a hunger for, or will towards. [ES]

To conclude, usability is of course important for a product to fulfil its purpose, but so are many other aspects, such as adaptability that makes personalised services possible, and aesthetics that makes them enjoyable. Infrastructures for computing and communication are key technologies, and the limited access to valid contextual information, especially social context, will constrain the level of innovation. Remember that behaviour changes as a result of the introduction of new technology, and this uncovers new opportunities for technology and services.

VI.4.1 Information overload

Humans always try to process input into meaningful representations with their limited memory and processing capability. This threatens to drown us because of the current explosion of machine generated, and machine transported information.
That said, the statement by Wilson quoted below refines the problem of information overload somewhat. The problem is not so much the amount of information, but rather what we want to do with it. We humans can handle lots of information, as proved by our vision system. The information stress is rather an effect of how we consciously choose to process information, the number of goals we strive for, and their complexity. This kind of stress probably also existed several hundred years ago, without computers and databases. Information overload is something that must be kept in mind when designing user interfaces for H-T and H-I interaction.

“Information overload is the existence of a gap between what can be done and what one wants to do or think one should do with existing information” P Wilson [ABG]

“… there are limits to what computers can expect their human companions to put up with” Paul Saffo [TWD]

VI.4.2 Incidit in Scyllam, qui vult vitare Charybdim (Out of the cauldron into the fire)

It seems that humanity has strived and longed for material things, comfort, and security for a long time. One effect of this is that we are managed by time rather than managing our own time. Our tools are running the show. Now, assuming that we will soon have, or maybe already have, an acceptable material standard, what will our next accomplishment be? Will we not try to embed and develop ourselves in a social, probably virtual environment, e.g. by posing as someone or something else doing something interesting somewhere else? The interest in networked games and virtual communities indicates this. Then maybe we will get rid of the ringing of the alarm clock, but instead develop dependencies on many other (yet unknown) aspects of the new virtual environments. Is the sound of an arriving e-mail an indication of this? Or perhaps the reports on stressed young people who cannot turn off their mobile phones, SMS-ing all night, afraid of missing something important?
Society will disintegrate families, but the Internet and the mobile networks will re-integrate them. Keep in touch with your children at school or when they grow up, all over the world. Selling virtual homes where a family can live will be big business. What colour are your exclusive virtual curtains? /HG

Information society -> interaction society -> relation society -> /HG

VI.5 Design for T-T

Design of T-T interaction is a well-developed area. Civil engineers have built bridges for thousands of years, and in the last hundred years there has been an enormous development in different kinds of machinery. New materials and new methods of manufacturing still hold many surprises. The new information society shifts emphasis from machines and material towards information and knowledge work, but somehow we have to package all of the information and processing capacity that will be available to us, and give it physical shapes. We do this by giving the manufacturing tools more intelligence and processing capacity, and by introducing new materials. This makes possible exteriors that are better adapted to the use of the thing, and at the same time allows for aesthetic solutions. Our machines are increasingly craftsmen that can use materials optimally, economically, and without the time penalty paid by human craftsmen.

Every material has its own characteristics, such as hardness, stiffness, and strength, and for some materials these can be changed. Using the appropriate tools, a piece of such a material can be given a specified shape, size and weight. The techniques used are wasting, forming, and casting [DP]. Wasting means removing material from some piece until it gets the right shape. Forming is changing the shape, not by removing material, but by transforming the material in the piece by bending or pressing. The last technique is casting, where a mould is covered by or filled with some liquid that hardens.
With today's technology we are very good at performing the techniques above. We have machines that can do it with enormous speed and precision. What we can still improve is perhaps how we process the material itself, i.e. how we alter its properties to match the design problem at hand.

The technological advances of the next age will be in processing. [DP]

For every way there is to build something up there are many ways to break it down. (Pessimistic view of the world.) Still the world is surviving and even developing. (Positive response.)

[Figure VI.5.1: Steel Service Corporation.]

As technology develops, the number of possible interactions T-T, without including H in the loop, will increase. One example is a smart tag that identifies other tags. It could soon be integrated into mobile phones, or maybe the mobile phone will be integrated into the smart tag? A smart tag can be used to simplify payment, as a security device unlocking doors, or for exchanging business cards at meetings.

Robotics is another research area with numerous possibilities. Primitive automatic lawn mowers and vacuum cleaners are already found on the market. But most tasks are much more complex than they seem. Building two robots that together lift something up, and carry it between two positions, is quite difficult in general. The two robots need good internal models of each other, of the physical environment, and of the task to perform. They also need means of communication, must be fail-safe, and must be capable of real-time planning and adaptation to changes in the environment. The seemingly simple task contains many of the problems we have discussed in this book.

Context is of course important also for T-T design. A designer is faced with the following choices when designing a vehicle for the handicapped [SC4]:
1. Make the vehicle smart enough to manage all kinds of environments.
2. Adapt the vehicle to a specific environment.
3. Adapt the environment.
4. Make the environment smart enough to adapt to all kinds of vehicles.

If we add humans to the environment, and change the design target to a system including humans, the designer will not have access to all information in the system. The design problem will be more difficult, but note that with humans in the loop the ability for adaptation increases immensely.

PART VII: Resources

VII.1 References

[ABG] Anders Broberg, Cognitive tools for learning, UMINF.
[AC] Andy Clark, Being there, ISBN -262-53156-9
[ACR] Alan Cooper, The inmates are running the asylum, ISBN -672-31649-8
[AD] A. Dey, Providing Architectural Support for Building Context-Aware Applications, PhD thesis, Georgia Institute of Technology, November 2000.
[AD ] Alan Dix et al, Exploiting space and location as a design framework for interactive mobile systems, ACM Transactions on Computer-Human Interaction (TOCHI), Volume 7, Issue 3, September 2000.
[AD ] A. Dey et al, A conceptual framework and a toolkit for supporting the rapid prototyping of context-aware applications, Human-Computer Interaction, vol 16, pp 97-166, 2001.
[ADFH] A. Akmajian et al, Linguistics: An Introduction to Language and Communication, ISBN -262-51086-3.
[AD ] Alan R. Dennis, Rethinking Media Richness: Towards a Theory of Media Synchronicity, Proceedings of the 32nd Hawaii International Conference on System Sciences, 1999.
[AD ] A. Dey, Evaluation of Ubiquitous Computing Systems, available at www.cs.berkeley.edu/~dey/pubs/emuc2001.pdf 2004-06-16
[AD6] A. Dix et al, Exploiting Space and Location as a Design Framework for Interactive Mobile Systems, ACM Transactions on Computer-Human Interaction, Vol 7, No 3, September 2000, pp 285-321.
[AH] A. Huang et al, Running the Web backwards: appliance data services, Computer Networks 00, 2000, pp 1-13.
[AK] Klovdahl, A. S. (1989).
Urban Social Networks: Some methodological problems and possibilities. In M. Kochen (Ed.), The Small World. Norwood, NJ: Ablex.
[AM] A. Munro, K. Höök, and D. Benyon, eds., Social Navigation of Information Space, London: Springer Verlag, 1999.
[AM ] Aaron Marcus, Eugene Chen, Designing the PDA of the future, Interaction of the ACM, Jan-Feb 2002.
[AS] Albrecht Schmidt et al, Advanced Interaction in Context, Lecture Notes in Computer Science Vol 1707, ISBN 3-540-66550-1, Springer, 1999, pp 89-101.
[AS ] Albrecht Schmidt, MediaCups: Experience with Design and Use of Computer-Augmented Everyday Artefacts, Computer Networks, Special Issue on Pervasive Computing, Vol. 35, No. 4, March 2001, pp 401-409.
[AS ] Albrecht Schmidt, Implicit Human-computer-interaction Through Context, Personal Technologies, Volume 4(2&3), June 2000, pp 191-199.
[AS ] A. Sloman, Architectural requirements for human like agents, in Human Cognition and Social Agent Technology, edited by Kerstin Dautenhahn, ISBN .
[BA] B. Anderson, Information Society Technologies and Quality of Life, Chimera working paper number 2004-09.
[BF] B. Fogg, Persuasive Computers: Perspectives and Research Directions, Proceedings of the Conference on Human Factors in Computing Systems (CHI-98): Making the Impossible Possible, Los Angeles, CA, USA, 1998, pp 225-232.
[BF ] B. Fogg, Persuasive Technology: Using Computers to Change What We Think and Do, ISBN 1-55860-643-2
[BMD] Barnard, Duke, May, Duce, Systems, Interactions, and Macrotheory, ACM Transactions on Computer-Human Interaction, Vol 7, No 2, June 2000, pp 222-262.
[BS] Bruce R. Schatz, The Interspace: Concept Navigation Across Distributed Communities, IEEE Computer, January 2002, pp 54-62.
[BS ] Ben Shneiderman, Designing the user interface, ISBN -321-19786-0
[BS ] B. Shneiderman, The eyes have it: A Task by Data Type Taxonomy for Information Visualizations, Proc. IEEE Symp. Visual Languages, 1996.
[BS ] B.
Salem, Aesthetics as a Key Dimension for Designing Ubiquitous Entertainment Systems, retrieved 17 Oct 2005 at http://www.idemployee.id.tue.nl/g.w.m.rauterberg/publications/UBIHOME2005paper.pdf
[CC] Chris Crawford, On interactive storytelling, ISBN -321-27890-9
[CC ] Chris Crawford, Understanding interactivity, available at http://www.erasmatazz.com/
[CG] C. Greenhalgh, Augmenting Reality Through the Coordinated Use of Diverse Interfaces, ???.
[CH] Carrie Heeter, Interactivity in the Context of Designed Experiences, Journal of Interactive Advertising, Volume 1, Number 1, Fall 2000.
[CH ] C. Hummels et al, Knowing, doing, and feeling: Communicating with your digital products, available at http://www.io.delft.nl/id-studiolab/djajaningrat/publications.html 2002-11-12
[CMN] Card, Moran, Newell, The Psychology of Human-Computer Interaction.
[CS] C. Snyder, Paper prototyping, ISBN -55860-870-2
[CW] C. Wickens, Engineering Psychology and Human Performance, ISBN -321-04711-7
[DA] Diane Ackerman, A natural history of the senses, ISBN -679-73566-6
[DABB] Davenport et al, Synergistic storyscapes and constructionist cinematic sharing, IBM Systems Journal, vol 39, no 3&4, 2000.
[DAN] Donald A Norman, Things that make us think, ISBN -201-58129-9
[DAN ] Donald A Norman, The psychology of everyday things
[DB ] David Benyon, Representations in human-computer systems development, http://www.dcs.napier.ac.uk/~dbenyon/publ.html.
[DB ] David Benyon, Employing intelligence at the interface, http://www.dcs.napier.ac.uk/~dbenyon/publ.html.
[DC] D. Cohen et al, Livemaps for Collection Awareness, Journal of Human-Computer Studies, vol 56, no 1, pp 7-23, Jan. 2002.
[DD] D. Dennett, Cognitive Wheels: The Frame Problem of AI, in M. Boden (ed.), The Philosophy of Artificial Intelligence, 1990, pp 147-170, ISBN 0-19-824854
[DD ] D. Davis, Why do anything? Emotion, affect and the fitness function underlying behaviour and thought.
Affective Computing, AISB 2004, University of Leeds, UK [DHA] David Harel, Computers ltd. , ISBN 0-19-850555-8 [DH] Douglas Hofstadter, Gödel, Escher, ”ach an eternal golden braid , IS”N -140-05579-7 [DK] David Kirsh, Interactivity and Multimedia Interfaces , Instructional sciences [DK ] David Kirsh, Today the earwig, tomorrow man , “rtificial Intelligence vol , pp -184, 1991 [DK ] David Kieras, “ Guide to GOMS Model Usability Evaluation using GOMS and GLE“N , available at ftp.eecs.umich.edu/people/kieras, 1999 [DM] David Marr, Vision, ISBN 0-7167-1567-8 [DN] Donald “ Norman, The design of everyday things , IS”N -385-26774-6 [DN ] Donald “ Norman, Emotional design , IS”N -465-05135-9 [DP] D. Pye, The nature and aesthetics of design , IS”N -71-3652861. [DP ] D. Pinelle et al, Task “nalysis for Groupware Usability Evaluation Modeling Shared-Workspace Tasks with the Mechanics of Collaboration , “CM transactions of Computer-Human Interaction, Vol 10, No 4, December 2003, pp 281-311. [DS] D. Svanaes, Context-“ware Technology “ Phenomenological Perspective , Human-Computer Interaction, Vol 16, pp 379-400, 2001. [DW] D. M. Wegner, “ computer network model of human transactive memory , Social Cognition, 13, 1-21. [EB] Eric Bergman, Information appliances and beyond , IS”N -55860-600-9 [E” ] E. ”eck et al, Experimental evaluation of Techniques for Usability Testing of Mobile Systems in a Laboratory Setting , In Proceedings of OzCHI , ”risbane, “ustralia. [ED] E. Deci, The What and Why of goal pursutits Human Needs and he Self-Determination of ”ehaviour , Psychological inquiry, vol 11, no 4, 2000, p 227-268. [EF] E. Freeman Lifestreams Organizing your Electronic Life , In “““I Fall Symposium “I “pplications in Knowledge Navigation and Retrieval, November 1995. Cambridge, MA. http://citeseer.nj.nec.com/4353.html. [EN] Elmasri, Navathe, Fundamentals of database systems , IS”N -201-54263-3 [FDFH] Foley et al, Computer graphics, principles and practice , IS”N -201-84840-6. 
[ES] E. Stolterman, H. Nelson, The design way , IS”N 0-877-783055. [ET] E. Tufte, Visual Explanations , IS”N -96139212-6. Hakan Gulliksson 252 [FG] F. Gemperle at al, Design for Wearability , http://www.ices.cmu.edu/design/wearability/files/Wearability.pdf, available June 2003. [FI] FIP“ Interaction Control Library Specification , www.fipa.org [GA] G. “bowd et al, Charting past, present, and Future Research in Ubiquitous Computing, “CM Transactions on Computer-Human Interaction, Vol 7, No 1, March 2000, p29-58 . [GB] Giorgio Buttazzo, IEEE Computer, vol 34, no 7, July 2001 [GC] Guanling Chen et al, A Survey of Context-Aware Mobile Computing Research, Dartmouth Computer Science Technical Report TR2000-381. [GD] G. Doherty, Continuous Interaction and Human Control , Control Proceedings of 18th European Conference on Human Decision Making and Manual Control ISBN 1-874152-08-X p.80-96 J. Alty (Eds), Group D. Publications, Loughborough, (October 1999) [GF] George W. Fitzmaurice, Graspable User Interfaces , PhD thesis, University of Toronto, 1996. [GG] G. Graham, Philosophy of the arts , IS”N -415-23564-2 [GOF] Gamma, Helm, Johnson, Vlissides, Design patterns Elements of reusable object oriented design , Addison-Wesley, 1994, ISBN 0-201-63361-2. [GP] G. Polya, How to solve it , IS”N -691-02356-5 [GR] G. Riva et al, Presence 2010: The Emergence of Ambient Intelligence, http://www.vepsy.com/communication/volume4/4Riva.pdf, available May 2003. [GR ] G. Riva, The Layers of Presence “ ”io-Cultural Approach to Understanding Presence in Natural and mediated Environments , Cyberpsychology & ”ehaviour, vol , no , ,p -419. [HG] Hans Gellersen et al, Multi-Sensor Context-“wareness in Mobile Devices and Smart “rtefacts , Mobile Networks and Applications (MONET), Oct 2002. 
[HK] Hideki Koike, Integrating Paper and Digital Information on EnhancedDesk “ Method for Realtime Finger Tracking on an “ugmented Desk System , “CM Transactions on Computer-Human Interaction, Vol 8, No 4, 2001, pp 307-322 [HM] Hans Moravec, When will computer hardware match the human brain , Journal of Transhumanism, vol 1, 1998. [HP] H. Parunak et al, Co-X defining what agents do together , www.jamesodell.com/publications.html, visited 23/8 2002. [HS] Heiko Sacher, Gareth Loudon, Uncovering the New Wireless Interaction Paradigm , Interactions of the ACM, Jan-feb 2002. [HS ] H. Stelmaszewska, Conceptualising user hedonic experience , In D. J. Reed, G. Baxter & M. Blythe (Eds.), Proceedings of ECCE-12, Living and Working with Technology. York: European Association of Cognitive Ergonomics. pp 83-89. [J“] John “rmitage, From User Interface to Uber-Interface “ Design Discipline Model for Digital Products , Interactions of the ACM, May-June 2003. [JC] John Carroll, Human-Computer Interaction in the New Milennium , IS”N -201-70447-1. [JC ] J. Carroll, “ User-centred Process for Determining Requirements for Mobile Technologies: the TramMate project , Proceedings of the th Pacific Asia Conference on information systems 2003. [JD] John Deller et al, Discrete-Time Processing of Speech Signals , IS”N -7803-5386-2. [JF] John Fiske, Introduction to communication studies , IS”N -415-04672-6. [JHM] Murray, Janet H, Hamlet on the Holodeck, ISBN 0-262-63187-3. [JG] Jacek Gwizdka, What s in the Context , The CHI workshoop . [JG ] J. Grudin, From here and now to Everywhere and forever , research.microsoft.com/research/coet/grudin/ubicomp.pdf available 2002-11-15. [JG ] J. Grudin, Groupware and Social Dynamics Eight Challenges for Developers , Comm. Of the ACM 37(1), 92-105, 1994. [JG ] J. Grudin, The Computer Reaches Out The Historical Continuity of Interface Design , Proceedings of the SIGCHI conference on Human factors in computing systems. 
1990, p261-268 [JLES] J Löwgren, Erik Stolterman, Design av informationsteknik , IS”N -44-00681-0. [JF] Jacques Ferber, Multi-“gent Systems , IS”N -201-36048-9. [JG] J. Gratch, “ Domain-independent Framework for Modeling Emotion Journal of Cognitive Systems Research, vol 5, no 4, 2004, p 269-306 [JH] John Hughes et al, Patterns of Home Life Informing Design for Domestic Environments , Personal Technologies' Special Issue on Domestic Personal Computing vol. 4, p.25-38, London: Spinger-Verlag. [JH ] Jeffrey Hightower, Location Systems for Ubiquitous Computing , IEEE Computer, vol. 34, no. 8, pp. 57-66, Aug 2001. Hakan Gulliksson 253 [JH3] J. Habermas, ISBN 0807015075 [JK] Julie Khaskavsky, et al, Understanding the seductive experience , Communications of the “CM, May , Vol 42, no 5, pp 45-49. [JL] Joelle Coutaz et al, Context is key, Communications of the ACM March 2005, vol 48, no 3] [JJ] Jeff Johnson, “ustin Henderson, Conceptual Models ”egin by Designing What to Design , Interactions of the ACM Jan-Feb 2002. [JM] Joseph McCarthy, The Virtual World Gets Physical, IEEE Internet Computing, Nov/Dec 2001, vol 5, number 6. [JN] Jacob Nielsen, D is better than D , http //www.useit.com/alertbox/ .html., available “ug . [JN ] Jakob Nielsen, Designing Web Usability , IS”N 1-56205-810-X [JP] J. Preece et al, Human-Computer Interaction , IS”N -201-62769-8. [JP ] J. Preece et al, Interaction design , IS”N -471-49278-7. [JR] Jun Rekimoto, TimeScape “ Time Machine for the Desktop Environment , CHI' late-breaking results, 1999. [JR2] Jim Rowson, The social media project at HP labs , http //netseminar.stanford.edu/sessions/ -11-01.html, available 2004-05-27 [JR ] J. Russell, Core “ffect, Prototypical Emotional Episodes, and Other Things Called Emotion Dissecting the Elephant , Journal of Personality and Social Psychology, vol 76, no 5, p 805-819. 
[JS] John Searle, Speech acts , IS”N -52-109626-X [JS ] John Searle, “ taxonomy for illocutionary acts , in The philosophy of language pp -155, Oxford University Press 1971. [JS3] John Searle, The problem of consciousness , found on the net but also in The rediscovery of mind , MIT Press 1992. [JU] J. Urda, Appraisal theory and social appraisals, retrieved 2005-11-17 at http://ged.insead.edu/fichiersti/inseadwp2005/2005-04.pdf [K“] Keith “llen, Meaning and Speech acts , http //www.arts.monash.edu.au/ling/speech_acts_allan.shtml. [K”] K. ”erridge, Pleasures of the brain , ”rain and cognition vol , , pp -128. [KD] D. Canamero et al, Emotionally grounded social interaction , in Human Cognition and Social “gent Technology , edited by Kerstin Dautenhahn, IS”N . [KH] K. Hinckley, “ Survey of Design Issues in Spatial Input , Proc. ACM UIST'94 Symposium on User Interface Software & Technology, April 1994, pp. 213-222]. [KL] Kristof Van Laerhoven, On-line Adaptive Context Awareness starting from low-level sensors , Licentiate thesis, Free University of Brussels, 1999. [KS] K. Schmidt et al, „Coordination mechanisms Towards a Conceptual Foundation of CSCW Systems Design , Computer Supported Cooperative Work: The Journal mof Collaborative Computing, vol 5, 1996, p 155-200. [KS1] K. Sheldon, Achieving sustainable new happiness:Prospects, practices, and prescriptions. In A. Linley & S. Joseph (Eds.), Positive psychology in practice (pp. 127-145), 2004 Hoboken [KS ] K. Sheldon, What is Satisfying “bout Satisfying Events? Testing Candidate Psychological Needs , Journal of Personality and Social Psychology, vol 80, no 2, p 325-339. [LC] L. Cheng Personal Contextual “wareness Through Visual Focus , IEEE Intelligent systems, May , pp 16-20. [LH] Lars Hallnäs, et al, Slow technology , Personal and Ubiquitous Computing, Vol. , No. , , pp. -212. [LEJ] Lars-Erik Janlert “ wider view of interaction , draft version June 2003. 
[LEJ2] Lars-Erik Janlert “ generic medium model for new media , draft version June . [LS] L. Suchman, Plans and situated action, The problem of human machine communication , , ISBN 0-52133739-9 [M”] Miroslaw ”ober, MPEG-7 Visual Shape Descriptors , IEEE Transactions on Circuits and Systems for Video technology, vol 11, no 6, June 2001, pp 716-719. [M” ] Meridith ”elbin, ”elbin website , “vailable at www.belbin.com, September . [MB2] M. A. Baker (ed.), Sex differences in human performance. Contemp. Psychology, 1987, 33, 964-965 [M” ] M. ”ickhard, Emergence , http //www.lehigh.edu/~mhb /emergence.html, last visited / [MC ] Mihaly Csikszentmihalyi, Creativity , IS”N -06-092820-4 [MC ] Mihaly Csikszentmihalyi, The meaning of things , IS”N -521-28774-x [MC ] M. Csikszentmihalyi, Flow the psychology of optimal experience [MG] “ Monk, N Gilbert, Perspectives on HCI , -12-504575-1 [MNWN] James H. McMillan, Jon F. Wergin, Understanding and Evaluating Educational Research , January 1998 Merrill Pub Co; ISBN: 0131935410. Hakan Gulliksson 254 [MR] M. Riedl, “ Computational Model and Classification Framework for Social Navigation , Masters thesis, North Carolina State University, 2001,Available at http://www4.ncsu.edu:8030/~moriedl/publications/thesis.pdf , 2002-11-30. [MR ] M. Raghunath et al, Fostering a symbiotic handheld environment , IEEE Computer, p56-65, September 2003 [MR] M. Rosson, Usability Engineering Scenario-Based Development of Human-Computer Interaction , IS”N 1-55860-712-9 [MS ] Munhindar Singh, “gent Communication Languages Rethinking the Principles , IEEE Computer, vol 31, no 12, pp 40-47, December 1998 [MS ] Munhinda Singh, “ social Semantics for “gent Communication Languages , Issues in “gent Communication. Lecture Notes in Computer Science 1916 Springer 2000, ISBN 3-540-41144-5 [MW] Martijn van Welie, Task-”ased User Interface Design , PhD dissertation. [MT] M. 
Toda, The Urge Theory of Emotion and Social Interaction , Chapter and , retrieved -11-18 at http://cogprints.org. [MW ] M. Weiser, et al Designing Calm technology , Powergrid Journal , http://www.ubiq.com/hypertext/weiser/calmtech/calmtech.htm, 10/1 2002. [MW ] M. Wooldridge, Mulit-agent systems , IS”N -971-49691-X, Wiley 2002. [MW4] Wiberg, M. (1999). Extending the modality of travelling: Designing travelling support for mobile IT users, Jyväskylä, : Proceedings of IRIS 22, "Enterprise Architectures for Virtual Organisations ", s 49 -58. [MZ] Michelle Zhou, Visual task Characterization for “utomated Visual Discourse “nalysis , Proceedings, ACM CHI 1998, pp 392-399 [MZ ] Martin Zimmerman, Human psychology , nd Ed Berlin: Springer Verlag 1989 [NF] N. Frijda, The emotions , Cambridge press, [NG] Neil Gershenfeld, The nature of mathematical modelling , IS”N -521-57095-6 [NG2] Nick Gibbins, lecture notes found at www.ecs.soton.ac.uk/~nmg97r/hci/task-analysis/ [NG ] N. Goodman, Ways of worlds making , IS”N -915144-51-4 [NS] Stillings et al, Cognitive science, an introduction , Second edition, ISBN 0-262-19353-1 ISBN 0-201-633618NS2] [NS ] N. Shedroff, “rticles and presentations retrieved at http://www.nathan.com/thoughts/index.html [OJ] O hare, Jennings, Foundations of distributed artificial intelligence , IS”N -471-00675-0 [OR] Odd-Wiking, Rahlff et al, Using personal traces in Context Space, Position paper for the CHI 2000 workshop WS11. [P”] P. ”arnard et al, Representing cognitive activity in complex tasks , Human-computer-interaction, 14, 93158. (1999) [PB2] P. J. Brown, G.J.F. Jones., Context-aware retrieval: exploring a new environment for information retrieval and information filtering . Personal and Ubiquitous Computing, 5, 4, pp.253-263, 2001. [PD] P. Dourish, Where the action is , IS”N -04196-0 [PE] P. Ekman. Emotion in the Human Face Cambridge University Press, Cambridge, UK, . [PG] Peter Gärdenfors, Hur homo blev sapiens , IS”N -578-0352-8. 
[PG ] Peter Gärdenfors Conceptual Spaces -262-57219-2 [PJ] P. Jordan, Designing pleasurable products “n introduction to the new human factors , , London, UK Taylor & Francis. [PL] Peter Lucas, Human-Computer Interaction , Vol , pp -336, 2001. [PL ] P.Langley, Elements of machine learning, IS”N -55860-301-8. [POSA1] Meunier, Sommerlad, Stal, Rohnert, ”uschmann, Pattern-Oriented Software Architecture, Volume 1: A system of patterns , IS”N -471-606952. [POS“ ] Schmidt, Stal, Rohnert, ”uschmann, Pattern-Oriented Software Architecture, Volume 2: Patterns for Concurrent and Networked Objects , ISBN 0-471-606952. [PP] P. Persson et al, Stereotyping Characters “ Way of Triggering “nthropomorphism , “““I Fall Symposium, 3-5 November 2000, available at http://www.sics.se/~jarmo/SocIntAgent.htm 2002-11-09. [PR] Pentti Routio, “rteology Semiotics of “rtifacts , http www .uiah.fi/projects/metodi/ .htm, available June 2003. [PT] Todd et al Judgement of domain-specific intentionality based solely on motion cues, available at http://www-abc.mpib-berlin.mpg.de/users/barrett/ 2003-11-21. Hakan Gulliksson 255 [RA] R. Arkin, ”ehaviour-”ased Robotics , IS”N -262-01165-4. [R”] R. ”rooks, Flesh and Machines , IS”N -375-42079-7. [RC] Craig, R. T, Communication theory as a field , Communication Theory, , , pp -161. [RD] R. Darken, et al, Wayfinding Strategies and behaviours in Large Virtual Worlds , Proceedings of the “CM CHI 96, pp 142-149. [RL ] R. Larsen Personality psychology , IS”N -07-111149-2. [RL ] R. Layard, Happiness, has social science a clue? , http://www.sustainablepss.org/intro/Layard_2003b.pdf, retrieved Oct 29 2005. [RMO] Rune Monö, Design for product understanding , IS”N -01105-x. [RN] S. Russel, P. Norvig, “rtificial intelligence , IS”N -13-360124-2. [RN ] “ new framework for Entertainment Computing From Passive to “ctive Experience , In proceedings of the ICEC 2005 LNCS 3711, p1-12, 2005 [RSW] R. Wurman, Information anxiety , IS”N 0-78-972410-3. 
[RR ] Rolf Rolfsen, Contextual “warteness Survey and Proposed Research “genda , SINTEF Telecom and Informatics, 1999. [RR ] Rahlff, O. W., Rolfsen, R. K., Herstad, J., Thanh, D. v., Context and Expectations in Teleconversations , Proceedings of HCI International '99 (the 8th International Conference on Human-Computer Interaction). [RR ] Rettie, R. . Using Goffman sframeworks to explain presence and reality. , Presence Seventh Annual International Workshop. 117-124. Valencia, ISPR [RR ] R. Rettie, Presence and Embodiment in Mobile Phone Communication , Psychology journal, vol3, no 1, 2005, p 16-34 [RS] R. L. Schalock, The concept of qualiity of life what we know and what we do not know , Journal of disability research, vol 48, March 2004, pp 203-216 [RV] R. Veenhoven, “dvances in understanding happiness , Happiness. Revue Québécoise de Psychologie, 1997, Vol. 19, pp. 29-74. [RV ] Ruut Veenhoven, The four qualities of life , Journal of happiness studies , Vol 1, pp 1-39. [RV ] R. Vallacher, What Pople Think They are Doing , Psychological review , vol , no , p -15. [RW] Robert Williams, Mapping Genes that Modulate Mouse ”rain Development “ Quantitative Genetic “pproach , in Mouse brain development Goffinet AF, Rakic P, eds). Springer Verlag, New York, pp 21–49. [S”] Steve ”enford, Understanding and Constructing Shared Spaces with Mixed Reality ”oundaries , ACM Transaction on Computer-Human Interaction (ToCHI), 5 (3), pp.185-223, September 1998, ACM Press. [SC] Schidt, User interface for wearable computers –Don t Stop to Point and Click, Intelligent Interactive Assistance & Mobile Multimedia Computing (IMC'2000) Rostock-Warnemünde, Germany - November 9-10, 2000, http://www.teco.edu/~albrecht/publication/imc00/uis-for-wearables-abstract.html. [SC ] S. Chan et al, Usability for mobile commerce across multiple form factors , Journal of Electronic Commerce Research, No 3, 2002. 
[SC ] Sanjay Chandrasekharan, Semantic web a distributed cognition view , Carleton University Cognitive Science Technical Report 2002-13, www.carleton.ca/iis/TechReports/files/2002-13.pdf, visited 2005-05-12. [SD] Steve Dubrow-Eichel, ”uilding Resistance Tactics for Counteracting Manipulation and Unethical Hypothesis in Totalistic Groups , Suggestion the journal of professional and Ethical Hypnosis, 1985, pp 34-44. Available at http://users/snip/net/~drsteve/articles/building_resistance.html 2002-12-01. [SK] Kristoffersen, S. and F. Ljungberg, Mobile Use of IT, In Proceedings of IRIS22 1998, Jyvaskyla, Finland [SM] Steve Mann, Wearable Computing Toward Humanistic Intelligence , IEEE Intelligent Systems, May 2001, pp 10-15. [SP] Steven Pinker, How the mind works , IS”N -393-31848-6. [SR] Rajeev Sharma et al, Toward multimodal human-computer interface , Proceedings of the IEEE, vol86, no 5 May 1998. [SS] S. Smith et al, Using the Resources Model in Virtual Environment Design , Workshop on User Centered Design and Implementation of Virtual Environments, S. Smith and M. Harrison (eds), pg 57-72, 30th September, 1999, University of York, York. [SS ] Steven Shafer, Interaction issues in Context-“ware Intelligent Environments , to appear, available at http://research.microsoft.com/easyliving/publications.htm. [ST] Shawn Tseng et al, Credibility and Computing Technology , Communication of the “CM, May , Vol 42, no 5. [SW] S. Wehrend and C. Lewis, A problem-oriented classification of visualization techniques. In Proceedings IEEE Visualization '90, pp. 139-143. Hakan Gulliksson 256 [TH] T. Höllerer, et al, Exploring MARS: Developing Indoor and Outdoor User Interfaces to a Mobile Augmented Reality System , Computers and Graphics, 23(6), Elsevier Publishers, Dec. 1999, pp. 779-785 [TJ] Timo Jokela, When good things happen to bad products , Interactions, Nov-Dec 2004. 
[TM] Toshiyuki Masui, Real-World Graphical User Interfaces , Proceedings of the First International Symposium on Handheld and Ubiquitous Computing, No. 1927, pp. 72-84, September 2000. [TM2] T. Mitchell, Machine learning, ISBN 0-07-115467]. [TM3] T. Moran, T. P. (1980). A framework for studying human-computer interaction. In Methodology of Interaction, R. A. Guedj et al., eds., North-Holland, 293-301. [TM4] T. Moran [private conversation] [TN] Tor Nörretranders, The user illusion , IS”N -140-23012-2 [TS] Thad Starner, Human powered wearable computing , I”M systems journal, vol , no & , . [TS ] T. Selker, Context-aware design and interaction in computer systems , IBM systems journal, vol 39, no 3&4, 2000. [TSi] Thomas Sikora, MPEG-7 Visual Standard for Content Description – “n Overview , IEEE Transactions on Circuits and Systems for Video technology, vol 11, no 6, June 2001, pp 696-702. [TWD] Winograd et el, ”ringing design to software , IS”N -201-85491-0 [TWD ] Winograd and Flores, Understanding Computers and Cognition , IS”N -89391-050-3. [U”] U. ”orghoff, J. Schlichter, Computer-Supported Co-operative Work , IS”N -540-66984-1. [UCIT] http://WWW.umu.se/ucit [UI] Underkoffler, URP a luminous tangible workbench for urban planning and design , Proceedings of CHI-99, pp 386-393, 1999 [UL] U. Leonrhardt, Supporting Location-“wareness in Open Distributed Systems , PhD thesis, Faculty of engineering University of London, 1998. [VB] V. Bellotti, et al, Intelligibility and “ccountability Human Considerations in Context-“ware Systems , Human-Computer Interaction, vol 16, 2001, pp 193-212. [W”] W. ”uxton, Integrating the periphery and Context “ new taxonomy of telematics , Proceedings of Fgraphics Interface ,p -246. [WC] Wayne Christensen, Self-directedness a process approach to cognition , “xiomathes, vol , p 171-189, 2004 [WE] William Eamon, Technology and Magic, Technologia , -64 [WK] W. 
Kintsch, The representation of knowledge in minds and machines , International journal of Psychology 1998, vol 6, no 33 pp 411-420. [WM] Wendy Mackay, Media Spaces Environments for Informal Multimedia Interaction , Computer Supported Co-operative Work, John Wiley 1999. [XFR] Xristine Faulkner, Usability engineering , IS”N -333-77321-7 [YB] Yaneer Bar-Yam, Dynamics of Complex Systems , IS”N -201-55748-7 Hakan Gulliksson 257 The Philosopher's Song (Monty Python) Immanuel Kant was a real pissant Who was very rarely stable. Heidegger, Heidegger was a boozy beggar Who could think you under the table. David Hume could out-consume Wilhelm Friedrich Hegel, And Wittgenstein was a beery swine Who was just as schloshed as Schlegel. There's nothing Nietzsche couldn't teach ya 'Bout the raising of the wrist. Socrates himself was permanently pissed John Stuart Mill, of his own free will, On half a pint of shandy was particularly ill. Plato, they say, could stick it away Half a crate of whiskey every day. Aristotle, Aristotle was a bugger for the bottle, Hobbes was fond of his dram, And Rene Descartes was a drunken fart: "I drink, therefore I am" Yes, Socrates, himself, is particularly missed; A lovely little thinker but a bugger when he's pissed! 
VII.2 Index

Abstraction, 13, 41, 49, 117
Accountability, 60
Action, 12, 83
  Representation, 99
Action cycle, 84, 85, 125
Action loop, 95
Action tendency, 130
Activation, 127
Activity, 85, 161
Actuator, 76
Adaptation
  Human, 113
  Interactor, 63
  System, 26
Aesthetics, 133
  Communication, 178
Affect, 127
Affection level of interaction, 181
Affective learning, 115
Affordance, 60
Aggregation, 13
Aggregation, UML, 41
Analogue representation, 76
Analogue signal representation, 66
Analysis
  Image, 68
  Interaction, 82, 176
AND, 54
Anger, 127
Approximation, grouping by, 46
Argumentation, 204
Arousal, 125
Art, 141
Artefact, 73
Artificial intelligence, 97
ASCII code, 67
Associative memory, 24
Asymmetry in interaction, 180
Attention, 101
Auction, 204
Augmented reality, 165
Autonomy, 28, 129
Bandwidth of interaction, 180
Battery, 143
Behaviour, 20, 50
Behavioural model, 50
Best first search, 110
Binary number, 66
Binding design, 177
Biomedia, 226
Bit, 66
Blind search, 110
Bloom, learning taxonomy, 115
Bottom up, 229
Breadth first search, 110
Byte, 66
Cause and effect, 20, 109
Cave, 165
Centralised control, 196, 215
Challenge, 139
Channel for communication, 187
Choice, 216
Chunk, 99
Chunk of data, 104
Classification, 119, 215
Client-server, 199
Close coupling, 211
Clustering, 47
Cocktail effect, 104
Coding, 53
Coding of text, 67
Cognition, 60, 61
Cognitive friction, 243
Cognitive learning, 115
Cohesion, 23, 41
Cohesion in interaction, 182
Collision detection, 88
Colour, 70
Command, 196
Command based interaction, 215
Commitment, 104, 111, 124, 199
Common ground, 209
Common sense, 109
Communication, 12, 30, 190, 198
Competition, 147, 199
Complexity
  System, 37
Complexity of interaction, 180
Computation, 21
Computer vision, 227
Computer-supported cooperative work, 205
Computing, 94
Concept space, 119
Conceptual level, 177
Conceptual view of interaction, 200
Conceptual view of processing, 94
Conceptual view of system, 49, 176
Conceptual view, design discrepancy, 243
Concerns, 125
Concurrent processing, 21
Conditioning, 116
Conflict resolution, 199, 203
Congruence in interaction, 180
Context, 13, 55, 160
  Activity, 161
  Application, 161
  Cultural, 55
  Definition, 160
  Environment, 160
  For reasoning, 62
  H-H, 167
  I-I, 169
  Interaction property, 183
  Language interpretation, 188
  Navigation, 231
  Physical, 55
  Self, 161
  Situation, 160
  Social, 55
  Technological, 56
  T-I, 174
Contingency problem, 107, 109
Continuous model, 52
Continuous value, 52
Control, 195
Conventions, 199
Conversation, 189
Cooperation, 147, 199
Coordination, 156, 198
Coordination for interaction, 197
Correlation, 198
Correlation in interaction, 182
Coupling, 23, 41
Coupling in interaction, 182
CPU cache, 101
Creativity, 18
Creativity level of interaction, 181
Creativity, definition, 120
Creativity, human, 120
CSCW, 205
Cultural context, 55
Cycle, 43
Damping, 187
Data flow model, 51
Data, definition, 65
Database, 45
Decision cycle, 84, 85
Decision making, 112, 204
Declarative knowledge, 90
Deductive reasoning, 105
Depth first search, 110
Design, 18
Deterministic system, 52
Digital number, 67
Digital representation, 66, 76
Discourse, 188, 190
Discrimination, 215
Distributed behaviour, 184
Distributed control, 196
Divide and conquer, 44, 109
Drive, 129
Duration of interaction, 183
Dynamic system, 35
Education, 17
Effectiveness in interaction, 180
Effectuator, 76
Effectuators, 62
Efficiency of interaction, 181
Emergence, 32
Emotion, 125, 126, 219
Empiricism, 90
Energy source, 141
Energy, and adaption, 43
Entropy level of interaction, 181
Environment, 13, 55, 160
  Cultural, 55
  Physical, 55
  Social, 55
Episodic knowledge, 90
Epistemology, 90
Equilibrium, 34
Error
  In model, 48
Ethics, 246
Evaluation, 245
Event, 125
Event, system view, 20
Evolution, 8, 26, 28
  Human, 100, 140, 191
  Learning, 114
  Perception, 61
  Thing, 57, 142, 160
Exhaustive search, 110
Experience, 131
Expert system, 98
Fear, 127
Feature space, 119
Feedback, 23
Feed-forward control, 25
Filter
  Message, 188
Flexibility, 26
Floating point numbers, 67
Flow, 138
Focusing, 42
Formal model, 51
Frame, 93, 121
Frame problem, 110
Frequency of interaction, 183
Functional level, 177
Functional model, 51
Generalisation, 13
Generalisation, UML, 41
Geon, 229
Gestalt theory, 45
GIF, 69
Goal, 62, 84, 130
  Representation, 99
Goal based design, 244
Golden ratio, 141
Greedy search, 110
Grouping, 41
Grouping for cooperation, 202
Groupware, 205
Gulf of evaluation, 244
Gulf of execution, 244
Habituation, 116
Happiness, 136
HCI, 152
HCI, definition, 152
HCI, Human Computer Interaction, 57
Hearing, 78
Hedonic experience, 132
Hermeneutic, 192
Heterogeneity, 28
Heuristic function, 110
Heuristics, 106, 110
Hierarchy
  Contexts, 183
  HITI model, 13
History of interaction, 181
HIT, 8
HITI, 8
HITI model, 11
Holism, 32
Human computer interaction, 57
Human computer interaction, definition, 152
Human Information/Idea Thing Interaction model, 11
Human language, 30
Idea, 65
Identification, 215
Ill posed problems, 81
Illocutionary speech act, 239
Image analysis, 68
Immersed VR, 165
Immersion, 164, 186
Immersion level of interaction, 181
Impulse, 125
Inductive learning, 119
Inductive reasoning, 106
Inference, 117
Information, 65, 66
Information representation, 68
Information theory, 66, 186
Inherit, 41
Inheritance, 41
Input, 20, 61
Intelligence, 93, 97
Intelligence, definition, 29
Intelligent agent, 58
Intelligent interface, 217
Intelligent thing, 58, 73, 97
Intention, 84, 125
Intention, Intentional state, 103
Intentional state, 103
Intentional view of interaction, 200
Intentional view of processing, 94
Intentional view of system, 49, 176
Interaction, 12
Interactor, 57
  Human, 57
  Information, 57
  Thing, 57
Interface, 20, 41
Interpolation, 87
Inverse kinematics, 81
Is a relationship, 41
Joint intention, 199
JPEG, 69
Kisceral, 204
Knowledge, 63, 65, 90
Knowledge and learning, 116
Knowledge representation, 91, 92
Language, 30, 191
Learning, 27, 63, 118
Learning, definition, 113
Lexical design, 177
Lighting, 87
Long term memory, 100
Machine learning, 118
Manipulation, 216
Manuscript, 51
Meaning, 190
Means-end analysis, 106
Measure interaction, 179
Medium, 186, 188
Meme, 150
Memory, 99
Mental state, 103
Message, Human communication, 30
Metaphor, 51
Mind-body problem, 91
MIPS, 94
Mixed-initiative, 217
Mixed-initiative control, 196
MMI, 152
Mobile system, 35
Mobility, 35, 125
Model
  Continuous, 52
  Introduction, 47
Model, definition, 47
Modular, 23, 41
Mood, 125
Moore's law, 94
Motion, 35
Motive, 130
Motoric abilities, 80
Multiplicity, 21
Music, 89
Narration, 89, 221
Navigation, 216, 230
Need, 125, 129
Negotiation, 204
Neural pathway, 78
Noise, 179, 187
Norm, 199
Object, 73
Ontology, 90
Open environment, 55
Open loop control, 25
Open system, 25
Optimisation, 62
Order, 42
Output, 20, 62
Pareto efficiency, 205
Pattern, 34
Pattern matching, 229
Pattern recognition, 34, 40
Pedagogy, 115
Perceive, 60
Perception, 61, 125
Persuasive application, 220
Pervasive computing, 152
Phenomenological model, 50
Phoneme, 72
Physical environment, 55
Physical representation, 53
Physical view of interaction, 200
Physical view of system, 49, 176
Physical view, design discrepancy, 243
Plan, 109
  Representation, 99
Planning, 108
Play, 116
Pleasant, 127
Positivism, 90
Power, Human, 141
Power, Thing, 143
Pragmatics, 188
Presence, 121, 164, 181, 198, 224
  Social, 121, 123, 210, 211, 213
Prioritisation for cooperation, 203
Privacy, 236, 246
Problem solving, 105
Procedural knowledge, 90
Process, 94
Processing, 21
  Human, 60
  Information, 94
Processing concurrently, 21
Processing sequentially, 21
Producer-consumer, 199
Proprioception, 79
Prosody, 72
Protocol, 198
Psychomotor learning, 115
QoL, 134
QOL, 157, 243
Qualification problem, 109
Quality of Life, 134
Quality of Life, definition, 157
RAM memory, 100
Ramification problem, 110
Rational behaviour, 59
Ray-tracing, 87
Real value, 52
Reasoning, 62
Reasoning, definition, 105
Recursion, 44
Reductionism, 32
Reflection, 87
Reflectivity, 87
Reinforced learning, 27
Relational database, 45
Rendering, 87
Repetition, 43
Representation, 12, 60
  A/D conversion, 76
  Analogue information, 66
  of image, 68, 227
  of knowledge, 91
  of sound, 71
  of speech, 71
  Physical, 53
  Virtual, 53
Reuse, 26
RGB, 70
Scaling, 42
Scan, 232
Schema, 93, 99, 121
Science, 16
Script, 99
SDT, 130
Search, 110, 232
Security, 236, 246
Segmentation, 215
Self-Determination theory, 130
Self-esteem, 130
Semantic knowledge, 90
Semantic level, 177
Semantics
  Communication, 178
Sensation, 61
Sense, 60
Sensing, 61, 76
Sensory registers, 99
Sequencing design, 177
Sequential processing, 21
Shadowing, 87
Shannon, 186
Sharing for cooperation, 202
Short term memory, 99
Signal, 65
Simulation, internal human, 81
Situated action, 94, 95
Situation, 160, 161, 173
Smell, 78
Social abilities, 122
Social awareness, 189, 210
Social context, 55, 167
Social presence, 121, 123, 210, 211, 213
Social quality, 212
Social rules, 124
Social support in user interface, 218
Socially competent, 123
Society, 55
Sound, 71
Specialisation, 116
Specialisation and cooperation, 202
Speech acts, 239
Speech synthesis, 89
Spoken language, 31
Stable system, 34
Static system, 35
Stigmergy, 197
Stimulus-Response diagram, 59
Stochastic systems, 52
Story telling, 89
Structural model, 50
Structure, 20, 50
Supervised learning, 27
Surprise, 127
Symmetry, 43
Syntactic design, 177
Synthesis
  Interaction, 82, 176
  Representation, 54
  Speech, 89
System
  Autonomy, 28
  Complexity, 37, 38
  Complexity reduction, 40
  Control, 23
  Deterministic, 52
  Dynamic, 35
  Environment, 55
  Equilibrium, 34
  Feedback, 23
  Heterogeneity, 28
  Layered, 43
  Memory, 35
  Modular, 41
  Stable, 34
  Static, 35
  Stochastic, 52
  Structure, 20
  Time invariance, 35
System control, 216
Tacit knowledge, 117
Talk exchange, 190
Task, 161
Task, definition, 85
Taste, 78
Text, 70
Text output, 195
Texture mapping, 87
Thing, 73
Thinking, 93, 94
Time invariance, 35
Time line, 67
Time perception, 141
Time sharing, 204
Timing, 219
Token passing, 190
Top down, 229
Touch, 78
Transform
  Message, 188
Transmitter, 188
Trial and error, 116
Turing test, 98
Turn taking, 190
Ubiquitous computing, 152
UML
  Aggregation, 41
  Generalisation, 41
Urge, 125
Valence, 127
Vector space, 46
Well-being, 134
Wicked problems, 242
Wireframe model, 87
Virtual
  Definition, 53
  Environment, 56
Virtual reality, 164
Virtual representation, 53
Visceral, 204
Vision, 78
Word, 72
VR glasses, 165
VR, Virtual reality, 164

"My pen is at the bottom of a page,
Which, being finished, here the story ends;
'Tis to be wished it had been sooner done,
But stories somehow lengthen when begun."
Byron

VII.3 Think along

Intentional / Conceptual / Physical
Cognition / Perception / Sensation
CPU / Filter / Sensor
Interaction – Representation of action – Interactor
Feedback – Signal – Transmitter/Receiver
Conversation – Utterance – Speaker/Listener
Emergence by interaction, examples: [Society and family by language, Things by physical force]
Co-operation / Competition
Compromise / Control, examples: [Aesthetics / Functionality, Complexity / Speed, Democracy / Dictatorship, Create / Use]
Wisdom / Knowledge / Message / Information / Language / Data / Packet / Quanta
Space / Time
Static / Dynamic
Attention / At ease
Structure / Chaos
Fixed / Mobile
Noun / Verb (Adjective / Adverb)
Actor / Action (Characteristics)
Object / Method (Attribute)
Program / Execution
Data / Operation
Representation / Transformation
Signal / Filter
Knowledge / Learning
Message / Modulation
Stephen Hogbin, "Appearance and reality", ISBN 1-892836-05-x
Problem / Solution
Analysis / Synthesis
Read / Write
Recognise / Describe
Edge / Space
Content / Interface
Centralised / Distributed (modular, partitioned)
Computer / Network
CPU / Bus / Memory
Source / Channel
Object / Relation
Complexity / Flexibility / Adaptation / Feedback / Intelligence
Overview / Detail
Holism / Reductionism
Abstraction (disregard details) / Instance
General / Particular
Population / Individual
Context / Self
Top down / Bottom up
Noise
Protocol
Classification / Correlation
Deduction / Induction
Field experiment / Survey / Formal theory
Design – Evaluate
Create – Evaluate
Program – Test
Construct – Test?
Aggregation / Coupling / Cohesion / Association / Relationship
Hierarchy (change resolution) / Sequence / Concurrency / Layered