I am deeply grateful for all the comments and suggestions that
have helped improve this book to its current state. Please
surprise me!
Okay kids, you can have the computer now.
Hello my dear, missed me?
HITI vs 6.500
Hakan Gulliksson
INTRODUCTION
PART I: THE HITI MODEL
I.1 The model, presenting the interactors
I.1.1 Human
I.1.2 Thing
I.1.3 Information/idea
I.1.4 Interaction
I.1.5 Context
I.2 Applying the model, and more basic concepts
I.2.1 Hierarchy and other topologies
I.2.2 Classification
I.2.3 Aggregation
I.2.4 Sequence or Parallelism
I.2.5 Mediative roles
I.2.6 Design and creativity
PART II: TECHNOLOGY, SCIENCE AND EDUCATION FOR DEVELOPMENT
II.1 Technology and science
II.2 Education
II.3 Creativity and designing for the vision
II.4 Basic assumptions
PART III: SYSTEMS, IT AND WE ARE SYSTEMS
III.1 System properties, common to us all
III.1.1 Processing, Sequential or Parallel
III.1.2 Distributed or Centralised
III.1.3 Memory and Feedback
III.1.4 Adaptation and Learning
III.1.5 Heterogeneity, Autonomy and Intelligence
III.1.6 Communication and language
III.1.7 Emergence
III.1.8 Space and time, change and mobility
III.2 Complexity, we and it are certainly complex
III.2.1 Why are systems complex and difficult to understand?
III.2.2 Reducing complexity
III.3 Modelling, it and us
III.3.1 Abstraction level
III.3.2 Modelling view
III.3.3 Basic types of models
III.3.4 Representations
III.3.5 Language
III.4 System environment, context, it is all around us
PART IV: INTERACTORS, WE ARE NOT ALONE
IV.1 We have an interface, a structure, and processing capability
IV.1.1 Representation
IV.1.2 Perception and cognition
IV.1.3 Processing summarised
IV.2 Human representations
IV.3 How to recognise Information?
IV.3.1 Shannon’s information theory
IV.3.2 Representations of information, see the soul of I
IV.3.3 Painting, Image and Video
IV.3.4 Text
IV.3.5 Sound and music
IV.3.6 Speech
IV.4 The Thing outside in
IV.5 Sensing it
IV.5.1 Which sense is the most fundamental?
IV.5.2 Neural pathways
IV.5.3 Internet data pathway
IV.6 Acting out
IV.6.1 Action, the concept defined
IV.6.2 Visual realism, information blending in
IV.6.3 Sound, speech synthesis and telling stories
IV.7 We need knowledge, and we represent it
IV.7.1 Knowledge representation
IV.8 We think and process
IV.8.1 Situated action
IV.8.2 Distributed cognition
IV.8.3 Trends in thinking
IV.8.4 Artificial intelligence
IV.8.5 Representations for processing
IV.9 We remember
IV.10 We attend to it
IV.10.1 Reaction time and attention span
IV.11 We reason
IV.12 We plan and search
IV.13 We make decisions
IV.14 We learn and adapt
IV.14.1 Taxonomy for learning
IV.14.2 How do we build knowledge?
IV.14.3 Knowledge management
IV.14.4 Machine learning
IV.15 Humans are creative
IV.16 Humans feel presence, and have social abilities
IV.17 Humans experience it
IV.17.1 Emotion
IV.17.2 Appraisal
IV.17.3 Concern (need, urge, drive, goal, utility, desire, motive)
IV.17.4 Action tendency (coping strategies)
IV.17.5 Experience
IV.18 Human’s subjective well-being, emotion, and flow
IV.18.1 Flow
IV.19 Unique features for each of us
IV.19.1 Unique human abilities
IV.19.2 Features and limitations not found in man
IV.19.3 Summary Human vs Thing
PART V: INTERACTION, WE DO IT TOGETHER
V.1 H-H Interaction, the reference
V.2 I-I Interaction, so far for efficient data transfer
V.3 H-I, H-T Interaction, joining forces
V.3.1 Ubiquitous computing
V.4 I-T Interaction, access to reality at the speed of light
V.5 T-T Interaction, forces matter
V.6 Why do we interact?
V.6.1 Why use others for interaction?
V.7 Context, it is everything else
V.7.1 Use of context
V.7.2 Real, Virtual and Augmented reality
V.7.3 Context of H-H interaction
V.7.4 Context of I-I interaction
V.7.5 Context of H-T and H-I interaction
V.7.6 Context of T-I interaction
V.8 Interaction modelling, back to the basics
V.8.1 Modelling view
V.9 Interaction characteristics, some suggestions
V.10 Mediation, with the help of it
V.10.1 Mediation as a model
V.10.2 The medium
V.10.3 Pragmatics
V.10.4 Social dynamics and timing
V.10.5 Meaning and inferential model
V.10.6 Infrastructure of H-T, H-I interaction
V.10.7 Screen or paper as the medium
V.11 Interaction control, a joint venture or one of us in control
V.11.1 Coordination
V.12 Co-operation, we are all in control
V.12.1 Measures of co-operation
V.12.2 Mechanisms for co-operation
V.13 We compete, and compromise
V.14 Computer-supported co-operative work
V.14.1 Taxonomy
V.14.2 Effective interaction
V.14.3 Interaction bandwidth
V.14.4 Social quality of service
V.14.5 The social-technical gap
V.15 Command based interaction, someone in control
V.15.1 Mechanisms
V.15.2 Intelligent support
V.15.3 Identification
V.15.4 Navigation
V.15.5 Choice
V.15.6 Manipulation
PART VI: DESIGN, HUMANS CHANGE OUR FUTURE
VI.1 What is the problem?
VI.2 Design for H-H
VI.2.1 Ethics, privacy and security
VI.3 Design for I-I
VI.4 Design for H-I/T
VI.4.1 Information overload
VI.4.2 Incidit in Scyllam, qui vult vitare Charybdim
VI.5 Design for T-T
PART VII: RESOURCES
VII.1 References
VII.2 Index
VII.3 Think along
Introduction
This book is about how to make it out with technology. Humans have
come a long way since leaving the trees, and through our tools we have
neutralised most natural forces and adapted the environment to ourselves.
This struggle for supremacy built, and still builds, knowledge about
nature, the tools needed, and how we and our society work. Relating this
knowledge to the next generation of technology is the major challenge for
the following 200+ pages.
The human genome is now known, which is a major achievement. We
know the wiring of our neural network, and have cars that transport us
comfortably from A to B. However, not all problems are solved, and not
everything understood. Far from it! The last thousands of years have for
instance not shed much light on emergence, the effect of long term social
processes, dynamic human behaviour, society, our mind, consciousness,
self, and love. Aristotle is still the reference.
While we have been pounding away at these seemingly impossibly
complex problems, technology has developed, infiltrating, manipulating
and supporting more and more aspects of our lives. We have a cancer in
our midst and will soon face fundamental and scary questions such as:
"If you're not having fun, you're
doing something wrong."
Groucho Marx
Can we control global, pervasive, networked technologies with a
multitude of sensors and actuators, and with unlimited memory
and processing power?
Will such a technology support us and the society as we know it
now while improving our quality of life, or will it start living a
life of its own? A life that we cannot comprehend or control, and
that will disrupt established behaviour?
Will the behaviour of such advanced technology mirror our own
(we built it), or will fundamentally new behaviour emerge? One
that for instance does not acknowledge our preference for
behaviour in accordance with the laws of nature. This could force
us to change how we think and plan, or how we perceive, rate
and order objects and events.
Will we humans control this new technology, be the masters,
inspire it, provide it with creativity, or teach it, support it, be
enslaved by it, visit it, or maybe just sit back, eat a fruit, and
watch it develop on its own?
We are already forced to make choices by this new wave of technology.
How many cameras do you accept in a city street, a school, at your work
and at home? They will all be well motivated, preventing terrorism,
bullying, burglary, or studying who empties the dishwasher and who just
leaves dirty dishes in the sink. What will the emerging long-term effects
of our choices be? As another example, consider the pros and cons for society
if a car can be positioned at any instant. Effects on road tariffs and road
planning? In the case of an accident? Effects on bank robbers, navigation,
or for a restaurant close to the highway?
A human is an enormously complex biological system that together with
other humans forms an even more complex society. Can we together with
technology better understand our society and ourselves? The society is
usually considered a result of the interaction between humans, and
between humans and their environment. It is a feedback system where
one loop (out of many) is humanity and technology evolving together.
The development of technology depends on human involvement;
technology changes human behaviour and through human involvement
changes itself.
This is where design, defined as purposeful creation, becomes necessary; random
evolution is very slow and resource intensive. The design process has its
own world of words, problems, and solutions. If the result is a commercial
product it should provide adequate quality of service, within schedule,
using as few resources as possible.
Technology provides the means and methods for creating more and more
complex systems. So far, humans have mostly interacted with other
humans. Not counting some simple tools and friendly dogs to play with,
there has been little else around to interact with. This situation is now
changing dramatically. We are entering a new era where humans also
interact with systems designed and built by humans. These systems will
become increasingly interesting as interactors, and soon they will start to
compete and co-operate amongst themselves. Soon we will not be alone!
Not only does technology provide the means to build complex systems. It
also supports the context of our lives to the extent that life as we know
it in the industrialised world would be impossible without it. Technology
affects every aspect of our lives. Art, music, film, and this book are formed
within the constraints of technology. A competitive world implies mastery
of technology even by artists.
This book is about interaction. To be more specific, it illustrates design of
interaction, interactive systems, and interaction technology, involving the
three participants humans, information/ideas, and things. We will use the
acronym HIT for the participants, and HITI when alluding to the whole
concept of interaction amongst the participants. Studying HITI in this
book will enhance understanding of the participants and their
interactions, improve the usability of new systems, and speed up the
development of new technologies. As a side effect, in-depth knowledge
will be gained of the main participant in the interaction, the human being.
Following technology
Focusing on anything related to interaction technology is like trying to
take a photograph of a racing car. Technology is constantly evolving and
the speed of the change seems to be ever increasing. How long is it since
the Internet was introduced? The World Wide Web? Is MP3 an old technology?
There is probably a computer in your kitchen, a laser in your living room,
and a hologram in your wallet. Perhaps we can use technology to better
understand technology? Humans and society change at a much slower
rate. This means that the products of technology will increasingly be
limited by humans and human society. Technology itself is neither good nor
evil; it is merely reshaping and developing rapidly.
Part I introduces the HITI model (Human, Information/Idea, Thing, and
Interaction) that will be used throughout the book.
Part II clarifies what we will mean by technology and science.
Part III discusses systems and models in general. It is important because it
establishes a systematic view that can be used in many areas of work.
Basic characteristics of systems are covered in this chapter, such as that a
system can be adaptive. A large part of the chapter is devoted to models.
They are very important because without models, computers would not
be of much use, and our understanding of the Universe would be reduced
to blind search.
Part IV details the characteristics of the participants in the interaction. As
the curtain rises the spotlight finds the human (H), who of course from
our point of view is the most important interactor, the thing (T), and the
information (I). Starting from a generic interactor, specific features and
characteristics are added for processing, sensing, and knowledge
representation.
Part V is about interaction. It tries to answer the questions "What is
interaction?", "Why is it needed?", "How is it composed?", and "When is it
performed, by whom, and where?". Once again a perspective from H is
used. The action starts and the plot is unveiled. H, I and T exercise their
abilities and try to overcome their limitations by exploiting each other.
Some highlights are: a discussion on computer supported co-operative
work, use of context in human-computer interaction, and technology
support for mobile services.
Part VI ends the book with a short discussion on design, our
vehicle for change.
The intent of this book is to bring forth recommendations, constraints,
risks, and data for the choices we will be forced to make to cope with
technology. Relevant limitations in, and differences between, H, I and T,
and their interactions will be discussed, and we will do this using H, and
H-H interaction as references, and as sources for examples.
Most of the ideas are of course not mine. I want to express my gratitude to
those thinking faster, further and farther, perfectly exemplified by
Professor Haibo Li and Professor Lars-Erik Janlert at Umeå University. By
the way, this book will never be finished until someone pays me to stop
working on it.
[Figure: System, with In and Out.]
Do you think Hitler and
Mussolini would have
remained friends for long
after the war was won?
"Believe me, Baldrick, an eternity in
the company of Beelzebub and all his
hellish minions will be as *nothing*
compared to five minutes alone with
me...and this pencil."
Edmund Blackadder to Baldrick,
`BA III'
This page was intended to be left blank, but…
… now I come to a new Scene of my Life. It happen'd one Day about
Noon going towards my Boat, I was exceedingly surpriz'd with the Print
of a Man's naked Foot on the Shore, which was very plain to be seen in the
Sand: I stood like one Thunder-struck, or as if I had seen an Apparition; I
listen'd, I look'd round me, I could hear nothing, nor see any Thing, I went
up to a rising Ground to look farther, I went up the Shore and down the
Shore, but it was all one, I could see no other Impression but that one, I
went to it again to see if there were any more, and to observe if it might
not be my Fancy; but there was no Room for that, for there was exactly the
very Print of a Foot, Toes, Heel, and every Part of a Foot; how it came
thither, I knew not, nor could in the least imagine. But after innumerable
fluttering Thoughts, like a Man perfectly confus'd and out of my self, I
came Home to my Fortification, not feeling, as we say, the Ground I went
on, but terrify'd to the last Degree, looking behind me at every two or
three Steps, mistaking every Bush and Tree, and fancying every Stump at
a Distance to be a Man; nor is it possible to describe how many various
Shapes affrighted Imagination represented Things to me in, how many
wild Ideas were found every Moment in my Fancy, and what strange
unaccountable Whimsies came into my Thoughts by the Way.
Robinson Crusoe, Daniel Defoe
but ….
… it cast a gloom over the boat, there being no mustard. We ate our beef
in silence. Existence seemed hollow and uninteresting. We thought of the
happy days of childhood, and sighed. We brightened up a bit, however,
over the apple-tart, and, when George drew out a tin of pine-apple from
the bottom of the hamper, and rolled it into the middle of the boat, we felt
that life was worth living after all.
We are very fond of pine-apple, all three of us. We looked at the picture
on the tin; we thought of the juice. We smiled at one another, and Harris
got a spoon ready.
Then we looked for the knife to open the tin with. We turned out
everything in the hamper. We turned out the bags. We pulled up the
boards at the bottom of the boat. We took everything out on to the bank
and shook it. There was no tin-opener to be found.
Then Harris tried to open the tin with a pocket-knife, and broke the knife
and cut himself badly; and George tried a pair of scissors, and the scissors
flew up, and nearly put his eye out. While they were dressing their
wounds, I tried to make a hole in the thing with the spiky end of the
hitcher, and the hitcher slipped and jerked me out between the boat and
the bank into two feet of muddy water, and the tin rolled over, uninjured,
and broke a teacup.
[] We beat it out flat; we beat it back square; we battered it into every form
known to geometry - but we could not make a hole in it. Then George
went at it, and knocked it into a shape, so strange, so weird, so unearthly
in its wild hideousness, that he got frightened and threw away the mast.
Then we all three sat round it on the grass and looked at it.
There was one great dent across the top that had the appearance of a
mocking grin, and it drove us furious, so that Harris rushed at the thing,
and caught it up, and flung it far into the middle of the river, and as it
sank we hurled our curses at it, and we got into the boat and rowed away
from the spot, and never paused till we reached Maidenhead.
Three men in a boat, Jerome K. Jerome
PART I: The HITI model
This book will use the Human, Information/Idea, Thing, Interaction
model (HITI model) to structure thinking about systems [UCIT]. The
model originates from Professor Lars-Erik Janlert at Umeå University, and
has only a few basic elements that have to be interpreted within the
specific context where they are used. The strength of the model is that it is
graphical and close to everyday thinking. This chapter introduces the HITI
model and applies it to some simple examples.
I, Human Thing
I.1 The model, presenting the interactors
The constituents of the model are the three possible participants of an
interaction, i.e. Human (H), Thing (T), and Information/Idea (I), and the
Interaction itself.
Those are my principles. If you
don’t like them I have others.
Groucho Marx
Figure I.1.1 below shows the participants as text boxes, along with
the possible interactions between them, represented by arrows. As
technology improves, so do the possibilities for supporting interactions.
This creates new opportunities in different applications, for new categories
of users, new situations, environments, and activities.
Figure I.1.1 The HITI model.
One example of the HITI model at work is that you want to write a poem
in a love letter. You, a human (H), print the poem (I) neatly on paper (T). A
father asking his son to do the dishes could exemplify the arrow
representing an H-H interaction. If you are accessing a database using
Internet Explorer ®, this is an interaction where an H (you) interacts with
an I (Internet Explorer) which in turn interacts with another I (the
database). The first interaction is through a human-computer interface
(Windows ®) and the second is implemented by data communication
(Internet).
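The database example can be written down as a small data structure. The following is only a minimal sketch: the class names, the field names, and the choice to label each arrow with its medium are assumptions made for illustration and are not part of the HITI model itself.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The three kinds of interactor in the HITI model."""
    HUMAN = "H"
    INFORMATION = "I"
    THING = "T"


@dataclass(frozen=True)
class Interactor:
    name: str
    kind: Kind


@dataclass(frozen=True)
class Interaction:
    """An arrow between two interactors; `medium` is a free-text label."""
    source: Interactor
    target: Interactor
    medium: str

    def label(self) -> str:
        # For example "H-I" for a human interacting with information.
        return f"{self.source.kind.value}-{self.target.kind.value}"


# The database example from the text (all names here are illustrative).
you = Interactor("you", Kind.HUMAN)
browser = Interactor("web browser", Kind.INFORMATION)
database = Interactor("database", Kind.INFORMATION)

chain = [
    Interaction(you, browser, "human-computer interface"),
    Interaction(browser, database, "data communication"),
]

for step in chain:
    print(step.label(), "via", step.medium)
```

Each arrow in the figure becomes one `Interaction`, so longer chains are simply longer lists.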
Now, we will give a short introduction to the interactors H, I, and T. They
will act in examples showing four fundamental modelling principles:
hierarchy, abstraction (classification, aggregation), sequence, and
parallelism.
I.1.1 Human
The first interactor to introduce is the most important one, the human. We
are quite intelligent – at least that is what we think ourselves. As an object
of study we have been popular for several thousand years, and we can
even do our own introspective excursions. This knowledge makes the human
a perfect role model of an interactor, and human-human interaction the
reference, the most basic and well-developed form of interaction.
The human is also interesting because we will use human well-being, and
quality of life, as a first-rate constraint when discussing interaction
technology and design. We will study how to exploit technology and
design to fulfil this constraint using knowledge of human characteristics,
behaviour, features and limitations. An assumption is that we, as a side
effect, will better understand ourselves.
I.1.2 Thing
The thing is our oldest friend. For more than 2 million years it has been
with us. That is quite a long time compared, for instance, to the dog, which has
followed us for just a little longer than 10,000 years. Recently the thing
has acquired some new abilities: sensing, processing capability, and new
possibilities to effectuate and display its actions.
A major difference between the thing and the human is that a thing is
designed. It can be given characteristics and behaviour chosen for a
specific task and environment.
I.1.3 Information/idea
The third interactor is information, which includes ideas. Information has
also been with us for quite some time, at least 20,000 years, doing a good
job as our social memory. Recently, with the advent of the global Internet,
disseminated information is increasingly a major player in social progress.
Managing and processing lower levels of information, i.e. raw data, is also
more and more important, both because it is now feasible and the amount
of data is growing, and because combining different kinds of data can
create new knowledge.
I.1.4 Interaction
Interaction and communication drive and support progress. They are
imperative for knowledge acquisition and maintenance, adaptation,
resource allocation, and many other things that make our society work,
and the world go around. If the interactor is a static object, interaction is
the process, the dance that interactors engage in.
Interaction implies communication. If one way communication is
intended, rather than interaction, a one directional arrow is used in the
HITI model. Interaction and communication in turn suggest adding
action and representation to the core concepts. Actions make things
happen and through representations we see the world before, and
after the action. Actions and representations are two complementary
facets of reality. A representation is what an action changes, and without
representations there will be no cause for action, and no result.
[Figure: Interaction, Communication, Action, Representation.]
I.1.5 Context
The aggregation of individual interactors of the three different kinds
constitutes an environment, or a context. We live in a world full of context
aware and context dependent systems such as humans, thermostats, web
counters, flowers, dogs, and mosquitoes.
The distributed nature of an interaction suggests sharing. How else can
the complexity be kept low and the efficiency high? One way to regard
sharing is as the generation of meaning through interaction. Interactors
select meaningful representations for what they do, or want to do, based
on previous experience. They perform actions and, importantly, also
choose to display representations based on previous experience. In effect a
converging feedback loop is created, representation-action-representation-action, which is actively evolved by the practitioners. Thus meaning is
created as a skilful praxis.
To support this development, coordination mechanisms are needed
[KS]. The motivation behind a coordination mechanism is to offload
complexity from actions. It simplifies interactions, and can improve
efficiency by providing precompiled permanent representations of
conventions, rules, protocols, representations of plans, maps, and scripts.
Each situation could be supported by many coordination mechanisms
and they could work in concert in different ways, e.g. aligned in time.
[Figure: Sharing a resource; coordination mechanism.]
I.2 Applying the model, and more basic concepts
I.2.1 Hierarchy and other topologies
The HITI model can be used to describe hierarchical relationships. One
example is that you have a letter to write. The information content of the
letter is confined in your brain (somehow).
Hierarchy is only one example of a topology that seems to be intimately
linked with human thinking. A ring structure is another, and a network a
third that could degenerate into a point-to-point relation between two
participants. The network is a very general topology that we recognise for
instance in social life. Anyone can have a relation with anyone else. But,
this freedom is also a curse; the number of possible relations grows
quickly with the number of participants in the network. Power structures
and trade-offs become difficult to optimise, especially for computers.
Humans seem to better manage the enormous complexities involved in
real life.
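How quickly the number of possible relations grows can be made concrete. The sketch below assumes that a relation is an unordered pair of participants, which gives n(n-1)/2 possible relations for n participants; the function name is made up for this illustration.

```python
def possible_relations(participants: int) -> int:
    """Number of distinct pairs among the participants of a network."""
    if participants < 0:
        raise ValueError("participants must be non-negative")
    # Each participant can relate to every other one; halve to count pairs once.
    return participants * (participants - 1) // 2


for n in (2, 10, 100, 1000):
    print(n, possible_relations(n))
```

Under this assumption the count grows roughly with the square of the number of participants: ten participants give 45 possible relations, a thousand give 499500. Counting groups of three or more would grow faster still.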
I.2.2 Classification
Sometimes we want to hide aspects of a model and disregard irrelevant
information, or we want to present similar aspects under a common name.
For this we can use classification, also called generalisation, which is a
special case of abstraction. Instead of referring to a long list of our family
members (aunt, brother, sister, and uncle), we classify them all as relatives.
I.2.3 Aggregation
Grouping objects together into a new object is another way to reduce
complexity by abstraction. A house has a roof, walls, windows, and a door.
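Readers who program may recognise classification and aggregation as inheritance and composition. The following is a sketch under that assumption; the class names, the person names, and the number of walls and windows are invented for the example.

```python
from dataclasses import dataclass, field


# Classification: aunts and brothers are all presented as relatives.
class Relative:
    def __init__(self, name: str) -> None:
        self.name = name


class Aunt(Relative):
    pass


class Brother(Relative):
    pass


# Aggregation: a house is a new object that groups its parts.
@dataclass
class House:
    roof: str = "roof"
    walls: list = field(default_factory=lambda: ["wall"] * 4)
    windows: list = field(default_factory=lambda: ["window"] * 2)
    door: str = "door"

    def parts(self) -> list:
        return [self.roof, *self.walls, *self.windows, self.door]


family = [Aunt("Anna"), Brother("Bo")]
print(all(isinstance(member, Relative) for member in family))
print(len(House().parts()))
```

Code that only needs a `Relative` can ignore whether it was handed an aunt or a brother, and code that handles a `House` need not deal with each wall separately.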
I.2.4 Sequence or Parallelism
Another way to use a HITI model is to model behaviour as a sequence of
actions. Sending a love letter can at one level be described as a one-way
communication from you to the reader of the letter.
Including the letter (the paper) adds a new interactor to the model.
We can easily add additional details. Let us model the following: "You
write down your thoughts on paper, see illustration to the right. The
receiver, sadly, puts your letter in the pocket without reading it."
We will need yet another concept nicely complementing the sequence, and
that is parallelism. With it we can, for instance, model how a television
station concurrently broadcasts a show and stores it on tape.
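The difference between sequence and parallelism can be sketched in a few lines of code. This is an illustration only: the two functions are stand-ins invented for the television example, and threads are just one of several ways to express that two actions may overlap in time.

```python
from concurrent.futures import ThreadPoolExecutor


def broadcast(show: str) -> str:
    return f"broadcast {show}"


def store_on_tape(show: str) -> str:
    return f"stored {show}"


def run_in_sequence(show: str) -> list:
    # One action is finished before the next one starts.
    return [broadcast(show), store_on_tape(show)]


def run_in_parallel(show: str) -> list:
    # Both actions are handed to worker threads and may overlap in time.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(broadcast, show), pool.submit(store_on_tape, show)]
        return [future.result() for future in futures]


print(run_in_sequence("the show"))
print(run_in_parallel("the show"))
```

Both versions produce the same results; what differs is whether the second action has to wait for the first.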
I.2.5 Mediative roles
Each of the participants H, I and T serves another participant in one out of
three different ways. First, it can serve as a tool. We use a hammer, or a hit
man as tools, and they are specialized for well defined tasks. Second, a
participant can serve as a medium and provide an experience. A clown,
a Porsche, and a movie are three examples. The third alternative is a
participant serving as a social actor, e.g. as a friend to be trusted. T and I
are so far severely limited both as social actors and as receivers of media,
partly because they do not have access to reality in the same way as H. No
one has yet heard a computer laugh spontaneously as it parses a beginner's
first Java program.
H2I = Chess?
I.2.6 Design and creativity
Design is currently possible for humans only, and can be brought to bear
on almost any aspect of our lives. The important distinction is that design
is a purposeful and deliberate choice or change of something. It is how we
visualize and realise our dreams of the future, and much quicker than
random evolution. The resulting designs are evaluated and used in a
context, and as they are, ideas for new designs are found. Finding a new
idea cannot be done without creativity.
“Human needs will not be obsolete”
“Identity confirmation from family, teacher, and friends with immediate feedback
… Need for exploration, Time to think, Sense of home”
Philips vision of the future
PART II: Technology, science and education for development
This is a book about interaction, design, and interaction technology. But,
before we launch a major attack on these issues we need to briefly discuss
what technology and science are, how they relate to each other, and the role
of education. All three are main forces behind the development of technology.
Technology to mankind is like
giving an axe to a maniac
Why should you care about science? Well, for one thing, it seems that a
scientific approach to life is a prerequisite for development in general, at
least as we know it. We always try to verify statements and beliefs before
accepting them as we explore the world. Critical thinking is a basic,
necessary behaviour, complemented by curiosity. Curiosity provides a
driving force for exploring reality and critical thinking keeps us from
drowning in new ideas and facts. There are also other drives urging us to
explore. Could you name some of them?
Could you elaborate further on that
point? Is that really true? Could you
be more specific?
Critical thinking
II.1 Technology and science
Technology emanates from, and manipulates, the human-made world. It
affects and concerns the ways people develop and use technical means –
things, tools and machines – to control both the natural and the human-made world. Through technology we now communicate and interact more
efficiently, thus improving our quality of life. We can use technology to
automate boring tasks, which gives us time to spend on more interesting
activities. Technology is also used to increase our comfort, for instance
through the use of air conditioning and central heating. Using it well we
can create stimulating, challenging, learning environments where we can
develop ourselves, as exemplified by computer games. By the way, the
word technology derives from the Greek word techne, a word meaning
skill in joining something, combining and working it. The word art had a
similar meaning; originally it meant a specialized skill, rather than fine
art.
According to the definition to the right, inventing new technology is a way
of mastering resources more efficiently. Sufficient resources are rarely
available and this shortage necessitates trade-offs and drives creativity.
Take the car as an example. We are well able to walk, but that costs us
time, so we invented the car at the cost of developing technology, i.e.
money. It is possible to drive at 1000 km/h, but that is too costly, both in
terms of money and lives. It would be nice to have a larger boot, but that
would take away room for the passengers' legs and might obscure the rear
view. Why does a car have two headlamps, not one or three? Efficiency is
exemplified by car factories that assemble a car almost automatically. The
assembly line of human workers, as conceived by Mr Henry Ford, is no
more.
Definition: Technology is the
technical means people can use to
improve their quality of life. It is
also the knowledge of how to
build and use tools and
machines efficiently.
There are three types of technology, good, bad, and cool.
Patrik Eriksson, TFE
Technology is for: Efficiency,
information, compatibility, usability,
accuracy, documents, work,
technology, intimacy, communication,
novelty, enchantment, ambiguity,
postcards, fun, people
Joseph Kay
We can argue that efficient technology, using components in a simple,
economical way is also aesthetically pleasing. A short mathematical proof
is considered elegant, and inventive solutions such as the safety pin, or the
clothes peg, never cease to arouse curiosity and wonder. A good
programmer with a sense of taste, judgement, and aesthetics can be
enormously more productive than an average programmer. In this book
we are mostly concerned with technologies based on computers and
sensors, i.e. primary technologies needed for intelligent interaction. Please
note that even though technology is not often mentioned together with
emotions and social topics, it undoubtedly affects these processes.
Science is a prerequisite for technology. It studies the laws of the universe
and has resulted in an immeasurable, monumental number of facts. Science
for instance tells us about fundamental limits. We know that there are
limitations to how fast we can travel, and how much data we can transfer
in a given transmission time over a channel with a given physical
transmission capacity. Physics, chemistry, and biology are examples of
scientific disciplines, and mathematics is an important special case. It is
both a tool used in scientific work and a science in itself. There is also an
aesthetic dimension to science. When many explanations are superseded
by a new simpler unifying theory we are pleased and even grateful. The
new theory will help us to better understand reality.
Science and technology are inter-dependent. Science deals with
"understanding" while technology deals with "doing". Science provides us
with knowledge that we can use to build technology. Technology, on the
other hand, helps science develop and reveal new facts that in turn
spawn new technology.
Note that all progress involves interaction! Technology, society, economy
and the individual are involved. The individual's needs provide demands
that, distributed through the economy, within the constraints of society,
will drive technology. Technology will create new needs (some maybe not
altogether necessary for survival) that will once again fuel progress.
Some other formulations of distinctions between science and technology
are that science abstracts whereas technology makes concrete, science
generalises, but technology is specific. Results from science are typically
generated at a University and are available within a subject. Technology is
applied, interdisciplinary, expensive, and patented by industry.
Engineering is about how to make technology useful to people. This
should be done with available resources, on time, and within budget. The
word engineering comes from Latin ingenerare, meaning to create. An
engineer seeks optimal solutions to problems, but there is usually no
formal way to find the right compromises, and any solution is typically
modified the next time it is applied. The engineer has to use good
judgement as well as scientific knowledge to trade off, for instance, speed
and accuracy, speed and cost, or speed and size.
Art resembles technology in that it deals with doing rather than with
understanding, maybe even more so than technology. Art is about
creating an expression, and for this technology is once again useful.
Technology gives new possibilities to artists that the body and the natural
environment cannot provide.
Titanic, the ship, the film, the
camera, the movie projector.
A scientist usually aims at presenting a regularity (invariance) of
empiria by means of an abstract
formula or a model that the general
public can apply to its own
problems
Pentti Routio [PR]
Technology
Science
A scientist likes surprises, not so
an engineer
The Net
The word engineer is derived
from the Latin word ingenium
meaning ‘ability’ or ‘genius’
An engineer is one who contrives, designs or invents; an
author, designer, an inventor,
plotter or layer of snares.
Oxford English Dictionary
An artist, on the other hand, prefers
to demonstrate general regularities
in the form of one concrete, special
case that the public can then apply
to their own situations
Pentti Routio [PR]
Since technology depends on science, art will too. But, the development of
science is quite different from that of art. Art is created while scientific
knowledge is researched, a work of art illustrates, but science
characterises. The difference between art and technology is less obvious.
One possible distinction is that technology aims at creating useful things,
but art creates experiences. That said, we acknowledge that an experience
can be useful, and using something is an experience.
Art
Craft / Technology
Magic is a craft activity, as are engineering and the arts. A magician's goal is
control over nature through artificial means, which is the same as for an
engineer. The means are different though, and credibility for magicians is
currently low. Casting spells and reciting incantations is not engineering
even though an occasional curse can be heard from the computer lab.
Increasingly, the results of engineering could well be perceived as magic
by anyone not familiar with the technology. A door that automatically
opens is certainly magic if you do not know about sensors and electrical
motors. Interestingly, engineers of the Middle Ages cultivated, rather than
shunned, their reputation as sorcerers [WE].
Figure II.2.3 Famous experiments. (Illustration labels: $F = k \cdot m_1 \cdot m_2 / r^2$; "Eva, I feel strange".)
II.2 Education
Education is a mandatory prerequisite for both science and society, as we
know them. Technology would certainly be magic without it. You, a
reader of this book, have already understood that knowledge of science,
research and technology is not given for free. Hard work is necessary to
learn intellectual tools such as critical thinking, and to cultivate the
multiple views needed of reality.
The ticket is paid in time, time spent reading, discussing, thinking about,
and testing facts and relationships. How much time is needed for this? The
mean time set aside for education in Sweden is 12 years, but a political
goal is that 50% of the students should continue for 3 more years. In
practice our fast-moving society will force us to learn even longer than
that. Lifelong learning will be the norm, and the first 15 years will be
spent learning how to learn.
The bearer of knowledge is language, either written or spoken, and each
discipline has its own. This means that to learn a discipline you need to
understand, and use, its language, and this can only be done by spending
time actively interacting with more knowledgeable practitioners and with
knowledge sources, such as books and articles. A research oriented
approach to life, spiced with a large dose of critical thinking, will make
this interaction much more fun, and also more efficient.
Society
Education
One computer science professor
used to characterize the standard
length of his lectures (a little less
than an hour) as a microcentury.
Internet
"Not me. I'm depending on athletes
and actors to raise my kids."
John Dobbin
Not only is education needed to maintain the knowledge level of how to
build and use technology. It is also needed to understand what the
consequences are if it is used, and to learn when not to apply the
technology at all.
Should we fear technology or
fall in love with it?
II.3 Creativity and designing for the vision
Systematically repeating the work of someone else is by itself not very
interesting. We need to fuel the process with new thoughts, inventions,
imagination, intuition, by taking chances, looking out for surprises and
examining their causes, i.e. we need creativity. If the application of
creativity is goal based we call it design. In general, curiosity drives the
scientist, and self-expression the artist. The designer on the other hand is
other-serving, working on behalf of others. Only as a special case does a
designer serve himself [ES].
Be creative, Draw your own
illustration
II.4 Basic assumptions
There are several necessary assumptions behind the discussion of science
above. One is that there is an objective reality and that this reality exists
independent of human discovery or observation. Another assumption is
that this reality is a basically orderly and regular environment where
nothing happens without a cause. Events are assumed not to be random
or accidental; effects have to have a cause. We also assume that we exist
ourselves!
Find out the cause of this effect,
Or rather say, the cause of this
defect. For this effect defective
comes by cause.
Hamlet (Shakespeare)
A limitation of science is that it cannot produce absolute final truths.
Throwing away a theory when a better one is found is a trademark of
science. If two theories explain the same facts the simpler is considered the
best one (Occam's razor).
"Entia non sunt multiplicanda
praeter necessitatem"
("Entities should not be
multiplied more than necessary").
Occam's razor, William of Occam
(1300-1349)
Morals and ethics are also out of scope for science and we should not
confuse scientific and technological development with progress, they are
only means that could be used for good or evil. This is similar to how we
regard money; it is up to the spender to decide how life is affected.
Knowledge is power
Sir Francis Bacon
(Illustration: cause and effect.)
Part III: Systems, it and we are systems
Development through technology, science and research has proved
successful. We study the bits and pieces of existence, compare
observations with previous knowledge, and arrive at various new
conclusions. A systems-oriented worldview supports this process well.
Definition: A System is a set of
variables selected by an observer.
Ashby
The following chapter is an overview of systematics and modelling. The
rationale for this is manifold. First, a systems view provides us with a
framework for discussing interaction. Second, such a view gives us a
chance to introduce properties of systems that are found in many sciences
and which are useful to characterise interactions and interactors. Some
important examples are memory, feedback, adaptation, and learning.
Third, looking at the world as a system is very much an engineering
stance and equally applicable to design.
Many technical and social systems serve a purpose and this is what
motivates their existence in the first place. For other systems, such as a
human, the purpose is less obvious.
Definition: A system is a unified
whole made up of one or more
subsystems or components.
Seen from the outside, the system can be described as a processing unit in
an environment where it produces outputs given some inputs. The
system is contained inside an interface that encapsulates the behaviour of
the system.
Figure III.1 A system. Output is the result of processing input. An
interface separates the system from its environment. (Diagram labels:
Environment, Interface, Input, Processing, Output, System.)
Input provides necessary resources for the system, specifically data about
the environment in which the system dwells. This data can be perceived at
different levels of abstraction. An image can for instance be described as
photons, pixels, or as a data file. Espionage provides input, so does a glass
of milk. Output is data or actions that bring about changes to the
environment of the system, which in turn affects the system itself. Input is
sometimes named events, i.e. dynamic aspects of the environment
affecting system behaviour. An event is typically the cause that is
transformed by the system to new events, the effects.
A general system is composed of subsystems, or in other words, can be
partitioned into modules, which make up the internal structure of the
system. Appropriately the great, great, grandfather of the word system
was the Greek word sýstema, which means "whole composed of
parts". In order to be characterised as a system the subsystems have to
interact, i.e. relationships between the subsystems are necessary. The
interaction is the behaviour of the system and can be anything from
physical atomic interactions to public transportation in a big city, to some
way of expressing love. The interaction serves as the glue that keeps the
subsystems together and is usually implemented by exchange of
information units.
Information units can be real, such as photons or a birthday gift, or they
can be very abstract such as the idea of democracy. The idea of democracy
spread through many information channels and resulted in new
subsystems, new structures, new interactions, and new information units
(quite a lot of documents).
Modelling interactive systems as above does have limitations. An
input-output view of a system restricts the way we think of processing.
Alternatively we can view processing as change of system state, which for
instance allows us to see motion, i.e. change of position, as processing. It is
also difficult to use the system view to understand self-organisation, or the
evolution of a system. Further, a system sometimes has qualities not found
in, or at least not easily derived from, the subsystems. These emergent
phenomena could come from behavioural aspects of local interactions and
are easy to miss since the system model encourages focusing on overall
input/output flows. Pile up four wheels, an engine and all of the other
parts that make up a car, show them to someone who does not know
about cars (if you can find such a person) and see if they can guess the
concept of a car, motorway, parking lot, or a traffic jam.
III.1 System properties, common to us all
The following pages discuss properties that characterise a system. These
properties are important because they will surface all of the time, when all
sorts of systems are studied, in all sorts of sciences.
III.1.1 Processing, Sequential or Parallel
Central to the system is processing or computation. In its deepest sense it
is that which changes, deliberately, by design, or just by chance, by nature,
or by habit. Without processing nothing happens to input on its way to
output, and there will be no change of the system state. Processing is
performed by a processor, brain, CPU, or some other device that has a
method to use input or memories to modify output or internal states. At a
high level of abstraction it can be seen as reading symbols from memory,
processing them by applying operations, and storing the result. It also
reads the operations to be performed from memory, i.e. operations are
data.
Figure III.1.1 System for computation. Processor: interprets and executes
operations, to process data. Memory: operations, data.
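The idea that operations are data can be made concrete with a toy machine. The following is only a minimal sketch, not a description of any real processor: the instruction format `(operation, a, b, destination)`, the operation names, and the function `run` are assumptions made for the illustration.

```python
def run(memory):
    """Run a tiny stored-program machine until HALT and return its memory.

    Operations and data share the same memory, so the processor reads
    both what to do and what to do it with from the same place.
    """
    pc = 0  # program counter: address of the next operation
    while True:
        op, a, b, dest = memory[pc]
        if op == "HALT":
            return memory
        if op == "ADD":
            memory[dest] = memory[a] + memory[b]
        elif op == "MUL":
            memory[dest] = memory[a] * memory[b]
        else:
            raise ValueError(f"unknown operation {op!r}")
        pc += 1


# Hypothetical memory layout: addresses 0-2 hold operations, 4-7 hold data.
memory = [
    ("ADD", 4, 5, 6),   # memory[6] = memory[4] + memory[5]
    ("MUL", 6, 6, 7),   # memory[7] = memory[6] * memory[6]
    ("HALT", 0, 0, 0),
    None,               # unused cell
    2, 3,               # input data
    0, 0,               # cells for results
]
```

Calling `run(list(memory))` leaves the sum in cell 6 and its square in cell 7. Since the program lives in the same list as the data, a program could in principle overwrite its own operations.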
It is an interesting fact that with memory and a few very simple operations
anything computable can be computed. One example of this is the DNA
computer that can perform any computation using a soup of genes. The
very simple operations are merging, splitting, and copying of genes.
System overload is a problem for all processing: either the input volume or
the number of inputs exceeds the capacity, or input arrives too fast. One
solution is to somehow expand the total system capacity, e.g. buy a new,
better, faster computer. If this is not possible then the system has to be
modified. Another good solution is to skip less important tasks, or
perform them with less accuracy. If the processing is necessary we can
divide the original system up sequentially, by adding concurrency
(several subsystems in parallel), or by using multiplicity (several identical
subsystems in parallel). Partitioning the load might involve restructuring
the original solution and usually adds complexity to the control. Note that
expansion by dividing a system up into a sequence, improving each step
in the chain, can be seen as expansion in time, while concurrency and
multiplicity are expansion in space.
One example where parallel expansion is used is in the telephone system.
If one telephone exchange is overloaded, another exchange is added in
parallel. We use the same strategies also in everyday situations. When you
serve pancakes for only one guest a single frying pan will do. With two
guests the delay between consecutive pancakes could cause some
irritation and if you have a big family, or have invited your neighbours,
you have to exploit concurrency, i.e. two or more pans.
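The pancake example can be put into numbers. This is a rough sketch under simplifying assumptions: every pancake takes the same time, each pan holds one pancake, and the cook is never the bottleneck. The function name and the default of two minutes per pancake are made up for the illustration.

```python
import math


def serving_time(pancakes, pans, minutes_per_pancake=2.0):
    """Minutes needed to fry all pancakes when `pans` pans work in parallel."""
    batches = math.ceil(pancakes / pans)  # each batch fills every pan once
    return batches * minutes_per_pancake
```

With one pan, twelve pancakes take 24 minutes; with three pans the same load takes 8 minutes. Five pancakes on two pans still need three batches, which shows that the gain from multiplicity depends on how evenly the load divides.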
A parallel system can be more efficient than a sequential one, and if it
consists of several subsystems we can get additional bonuses. Some of the
subsystems could specialise in important tasks, several solutions could
be simultaneously tested, and if necessary the system could be made
redundant and provide for fault tolerance.
For prose, parallelism helps
bring about clarity, efficiency,
forcefulness, rhythm and
balance.
[ET]
The human mind does not seem to handle concurrency well. Rational
thought moves sequentially, tracking an effect for each cause. Take the
example of Achilles and the tortoise. The warrior and the tortoise race and
the much slower tortoise is given a head start at the beginning of the race.
The competition starts and Achilles reaches the position where the tortoise
started. However, in the meantime the tortoise has moved yet another
increment. The race continues and Achilles steadily decreases the distance
to the tortoise, but never catches up. If you try this at home Achilles will
win the race every time. We have been tricked into doing the wrong
coupling because it is difficult to keep track of two concurrent chains of
causes and effects.
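A small calculation shows where the reasoning goes wrong. The increments get shorter and shorter, and the time they take adds up to a finite total. The sketch below assumes constant speeds; the function names and the numbers used in the note are chosen only for illustration.

```python
def catch_up_time(head_start, v_achilles, v_tortoise):
    """Time at which Achilles draws level, found by tracking both runners at once."""
    return head_start / (v_achilles - v_tortoise)


def zeno_time(head_start, v_achilles, v_tortoise, steps):
    """Total time used after `steps` of the increments in the story."""
    total, gap = 0.0, head_start
    for _ in range(steps):
        dt = gap / v_achilles   # Achilles runs to where the tortoise was
        total += dt
        gap = v_tortoise * dt   # meanwhile the tortoise has moved this far
    return total
```

With a head start of 100 m and speeds of 10 m/s and 1 m/s, the increments take 10 s, 1 s, 0.1 s and so on. Their sum approaches 100/9, about 11.1 s, which is exactly the value `catch_up_time` gives directly.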
A reflection here is that there is a need for both parallel and sequential
computations to implement conjunctions, i.e. to draw conclusions on
interesting events overlapping in time. Let us say that you line up a
number of computers in parallel, each computer calculating the speed of a
specific car on the motorway. Without adding a sequence of inference
steps we cannot know more than the speed of each car. If we on the other
hand add additional steps, we can calculate the mean speed, or detect
patterns in the behaviour of the cars.
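The motorway example can be sketched as a parallel stage followed by a sequential one. The code below is an illustration only: each observation is assumed to be a pair (distance in metres, time in seconds), and a thread pool stands in for the row of computers.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def speed_kmh(observation):
    """Speed of one car from a (distance in metres, time in seconds) pair."""
    distance_m, seconds = observation
    return distance_m / seconds * 3.6


def mean_speed(observations):
    """Parallel stage: one speed per car. Sequential stage: combine them."""
    with ThreadPoolExecutor() as pool:
        speeds = list(pool.map(speed_kmh, observations))
    return mean(speeds)
```

For two cars covering 100 m in 4 s and 5 s respectively, the parallel stage gives 90 km/h and 72 km/h, and only the added sequential step can report the mean of 81 km/h.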
This principle is visible in all sorts of system solutions, for instance in
human vision processing. The eyes first execute a massively parallel data
processing step, computing local light intensity, and other local
information. The next step is to combine the many information channels to
higher-level visual objects.
Figure III.1.2 Visual processing, stage 1 parallel, performed by
millions of rods and cones. Stage 2 sequential in the visual pathway.
(Diagram labels: Stage 1, Stage 2, Perception for identification,
Perception for action.)
III.1.2 Distributed or Centralised
How system control is organised is important. It is simpler to centralise
the control, but the system will be more robust if we distribute. The cars
on all of a country's motorways are one example of a system with
distributed control. Even if one of the cars loses control, all the other cars
in the country will still manage. Flight control at an airport on the other
hand is a centralised system. Possibly an airport could still work without
central control, but it would certainly be a very inefficient and risky
airport. A nuclear power station is another example of a centralised
system run from the control room. Maybe the biggest system of all, the
Internet, owes much of its success to a decentralised approach to control.
Two people talking to each other is a system with distributed control. Each
participant has control over her own behaviour, and interacts with the
co-talker. An author running a word processor is an example where
centralised control is exercised, at least as long as Word® does not start to
edit by itself. It does make rudimentary attempts to do this, for instance by
automatically changing "i" to "I", which is quite annoying when you are
writing in Swedish, where "i" means "inside".
Not only can control be distributed, but also computations and memory.
To understand how memory can be distributed, think about a shopping
list, or a diary. Distributed computation is exemplified by how monkeys
(and you?) have been found to guide the movement of their fingers. It
seems that the computation leading to this movement is not localised to a
small spot in the brain, but is spread out over a large area. Distributed
solutions are powerful and a proof of this is that the whole population of
Sweden maintains a supply of food through local decisions.
Does democracy mean distributed or centralised control?
Really?
Is the a system with centralised
or distributed control?
Memory Lane
The main rationale for having multiple components rather than one
extremely complicated component is that a single component doing
everything is the prime suspect to be the bottleneck for speed and
reliability. Partitioning the functionality gives us a modular, easily
modified, flexible system, where resource sharing and load balancing are
possible. There is a natural tension in design between distributing data
and managing resources. Distribution provides fast access to resources,
close to where they are needed. But, on the other hand, keeping resources
in a central repository gives easy access for control and manipulation.
Cohesion and coupling are two measures, mostly used to describe
software, and they say a lot about a distributed system. Cohesion
describes to what degree its subsystems together perform a single task. A
car is a good example, where all subsystems co-operate with the
objective of transporting the passengers from one place to another as
comfortably as possible. Coupling describes the level of interdependency
between two subsystems. Interdependency is quantified by the number of
messages exchanged, and by the amount of shared resources such as
memory. If the target is to design a highly modular system the coupling
between the system and its environment should be minimised. A proof of
success is whether the communication between modules, and between
modules and the environment, is minimised.
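One way to make the message-counting measure tangible is sketched below. It is a simplification that ignores shared resources; the function name and the car-like example in the note are assumptions made for the illustration.

```python
def coupling(messages, module_of):
    """Number of messages that cross a module boundary.

    `messages` is a list of (sender, receiver) pairs and `module_of`
    maps each subsystem to the module it has been placed in.
    """
    return sum(
        1
        for sender, receiver in messages
        if module_of[sender] != module_of[receiver]
    )
```

If the engine and gearbox are placed in one module and the wheels in another, an engine-to-gearbox message stays inside its module while a gearbox-to-wheels message counts towards coupling. Trying different partitions and comparing the counts is one crude way to look for a more modular design.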
Definition:
Cohesion, to what extent the
functionality of a system is
self-contained.
Definition:
Coupling refers to the
strength, directness and
complexity of causal relations
among parts of a system
Weick
III.1.3 Memory and Feedback
Memory is a key to the intelligent system, and will be discussed
recurrently in this book. One way to introduce it is through feedback,
another important aspect of a system. Feedback means that somehow
previous output affects the input. To accomplish this, memory, or physical
storage, is needed. In the figure below a delay serves as memory, slowing
output (Out) down to modulate input (In). The memory could be either
external or internal to the system. Sheer transmission delay, as in the
figure, neurons, RAM, paper, and hard disk are some examples.
Figure III.1.3 Delay in a feedback loop could serve as memory.
(Diagram labels: In, +, S, Out, Delay (memory).)
The concept of a memory is clear, it is where the information is stored and
retrieved. How to find it is maybe not that obvious. To use it we need to
access it, and there are two ways to do this. The first is to fetch and store at
a specific address. This is how an ordinary PC works, and also how you
use a deposit box at the bank. The alternative is to use associative memory
where information is found through an associative mechanism. Remember
the lyrics of the songs hum hum hum "My Way", "Yesterday" ti di ti da?
Also this kind of memory can be built using technology.
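The two ways of accessing memory can be contrasted in a few lines. This is a loose sketch: a list index plays the role of an address, and a substring match stands in for the associative mechanism, which in a brain or in dedicated hardware works quite differently. The function name `recall` is made up for the illustration.

```python
def recall(cue, memories):
    """Associative access: return every stored item that contains the cue."""
    return [item for item in memories if cue.lower() in item.lower()]


def fetch(address, memories):
    """Addressed access: return whatever is stored at the given position."""
    return memories[address]
```

With the list `["My Way", "Yesterday", "Yellow Submarine"]`, `fetch(1, ...)` returns "Yesterday" only if we already know where it is stored, while `recall("yester", ...)` finds it from a fragment of the content itself.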
Forgetting information is sometimes as important as remembering it. Ants
mark the path to the food by pheromones. When the food is gone the
pheromone path will not be refreshed, and it will slowly disappear, erased
by wind and rain. Similarly, paths followed by data packets on the
Internet will periodically be forgotten to enable new, perhaps more
efficient, paths to be established.
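Forgetting by evaporation can be sketched as follows. The model is an assumption made for the illustration: in each time step a fixed fraction of the trail evaporates, and a trail still in use receives a fixed deposit.

```python
def update_trail(strength, reinforced, evaporation=0.1, deposit=1.0):
    """One time step for a pheromone trail (hypothetical parameters)."""
    strength *= 1.0 - evaporation   # wind and rain erase part of the trail
    if reinforced:
        strength += deposit         # ants still walking the path refresh it
    return strength
```

A trail that is no longer refreshed fades towards zero, while a trail in constant use settles at `deposit / evaporation`, so the memory neither vanishes nor grows without bound.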
Java
The figure below shows a simple system built such that output is always
slightly bigger than the input. If we feed the output back into the system
the output will increase until some power limit sets in, or something
breaks.
Figure III.1.4 Positive feedback could result in an output that
increases until something breaks. (Diagram labels: In, +, S with
Out > In, Out, Delay (memory); plots of In and Out against time t.)
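The runaway behaviour is easy to reproduce in a simulation. The sketch below assumes a system that amplifies whatever it is fed by a fixed gain slightly above one, a one-step delay as memory, and a hard power limit; all names and numbers are illustrative.

```python
def positive_feedback(signal_in, gain=1.1, limit=100.0, steps=100):
    """Output history of a system whose delayed output is added to its input."""
    previous_out = 0.0
    history = []
    for _ in range(steps):
        out = gain * (signal_in + previous_out)  # output slightly bigger than input
        out = min(out, limit)                    # the power limit sets in
        history.append(out)
        previous_out = out                       # the delay remembers this output
    return history
```

With a constant input of 1 the output never decreases and ends up pinned at the limit. Without the `min` the values would keep growing for as long as the simulation runs.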
Misuse of feedback is a good way to build a worthless system, so beware.
An everyday example is that if you do not care for washing up the dishes,
you are likely to postpone it, and there will be even more dishes to do
the next time you get around to doing it. This will of course amplify your
aversion and the problem escalates.
Feedback can however also be used to our advantage. If we know how the
system reacts we can use feedback to moderate system behaviour. This is
another human speciality; with a vivid memory of the last huge pile of
dishes you might be more observant of your behaviour. If you are hungry
you eat until you are satisfied, not more. If the food is really tasty the
built-in feedback loop will still help you to stop.
An alternative to feedback control is open-loop control. Without feedback
the controller has to guess what control signal to apply. One example is a
fire alarm where the alarm control does not really care if the fire is put out
or not. Open loop control is fast, and robust against errors in output
sensors and feedback loops.
Figure III.1.5 Open loop control, i.e. without feedback to control.
(Diagram labels: In, Control, S, Out.)
Yet another alternative is feed-forward control, or in other words
purpose focused, predictive, behaviour control, where the control system
tries to estimate the likely disturbance of the system and compensates for
this through an extra signal path for control. This type of control requires
very good knowledge of system behaviour. One example is that you
bring your raincoat on a cloudy day, and put it on just before it starts
raining. Another example is making a budget. This kind of control is
important in design.
Figure III.1.6 Feed forward control, anticipating and compensating for
future states. (Diagram labels: In, S1, S2, Out, Intelligent control for
compensation. Feed-forward feedback: What will happen if I do this?)
Try to think of situations where you use these different control principles
and you will find plenty of them. Open loop control is for instance applied
when accelerating a car from standstill, feedback is needed when you try
not to exceed the speed limit, and feed forward is in action when you see a
steep hill ahead of you and accelerate at the bottom of the hill, to take your
car over it.
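The three control principles can be compared on one deliberately simple system. In this sketch the controlled system just adds a disturbance (the hill) to the control signal (the throttle). That model, the function names, and the gain are assumptions for the illustration, not a model of a real car.

```python
def plant(control, disturbance):
    """The controlled system: output is the control signal plus a disturbance."""
    return control + disturbance


def open_loop(setpoint, disturbance):
    """Apply the control we guess is right and never look at the result."""
    return plant(setpoint, disturbance)


def feedback(setpoint, disturbance, gain=0.5, steps=50):
    """Repeatedly measure the output and correct the control by the error."""
    control = setpoint
    output = plant(control, disturbance)
    for _ in range(steps):
        control += gain * (setpoint - output)
        output = plant(control, disturbance)
    return output


def feed_forward(setpoint, disturbance, predicted_disturbance):
    """Compensate in advance for the disturbance we expect."""
    return plant(setpoint - predicted_disturbance, disturbance)
```

Aiming for 90 with a disturbance of -10, open-loop control ends at 80, feedback converges to 90, and feed-forward hits 90 at once when the prediction is right. With a prediction of -8 feed-forward ends at 88, which illustrates why this kind of control needs good knowledge of system behaviour.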
A related system classification is to what extent a system is open. An open
system is responsive to its environment and exchanges energy,
information, or materials with it. A closed system on the other hand, will
eventually decay into chaos because of the limited interaction with its
environment. A house under normal circumstances is heated and
maintained. Residents enter or leave through doors, windows allow light
into the house, news is shown on the television set, and the plumbing
allows for disposal of sewage. A house is in other words a fairly open
system. If we seal it up, close all inputs and outputs, it will slowly
degenerate. At the other extreme imagine what will happen if we remove
all windows and doors of a house. In some disciplines, e.g. computer
science, a system is alternatively referred to as an open system if its
interfaces to the external world are fully defined and available to the
public.
Information
welcome
III.1.4 Adaptation and Learning
If a system is given a choice between two paths, both leading to the goal,
it weighs the pros and cons of the alternatives, and follows the best path.
An adaptive system manages to take new facts into account. If one of the
paths suddenly ends in a big black hole, the system changes its
mind, turns around and takes the other path. Adaptation is very
important for survival and consequently has been re-invented many times,
and in many variations during human and machine evolution. Four
different adaptations can be identified; by predetermined reflex,
reasoning, learning, or by evolution over successive generations.
Definition: Adaptation is the
process whereby a system uses
perceptions from the
environment to optimise its
behaviour.
Flexibility is related to adaptability, but also describes systems that are
capable of being turned, or twisted, without breaking. A more flexible
system has additional degrees of freedom, i.e. additional state variables
possible to modify. If a system has too many degrees of freedom it will be
difficult to control, but with too few it cannot adapt. Think of the problems
you would have walking with two splinted knee joints.
There is also an economic dimension to flexibility. Reuse is difficult
without it. Bricks are not very flexible in themselves, but given their
characteristics they are easy, i.e. flexible, to combine. On the other hand,
flexibility means that adaptation to the situation at hand is necessary, and
this comes with an associated cost. A pre-fabricated house is preferable
to a pile of bricks in many cases. A balance is sought between
rigid structures, e.g. planned economy, and flexibility, e.g. market economy.
Simple adaptation is described in the next figure, see figure III.1.7. The
parameter adjustment block has a fixed model that maps output
behaviour to new parameter values, i.e. the system adapts by a
predetermined reflex. The new values are passed to the controller that in
turn adjusts the controlled system.
Figure III.1.7 Simple adaptation. (Diagram labels: In, Controller,
Controlled system, Out, Parameter adjustment.)
One example of this type of system is a camera equipped with auto focus
and where the controlled system is the lens. The controller positions the
lens and the parameter adjustment processes the image to calculate a new
lens position that is fed back to the controller. The loop from the output
back to the controller is used to fine-tune the position of the lens in real
time.
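The auto focus loop can be imitated by a simple search. This is not how any particular camera works. It is a sketch that assumes image sharpness has a single peak, measured here by a made-up function `sharpness`, and that the parameter adjustment follows a fixed rule: move the lens while the image gets sharper, otherwise take smaller steps.

```python
def sharpness(position, best=42.0):
    """Stand-in for image analysis: highest when the lens is at `best`."""
    return -(position - best) ** 2


def autofocus(position=0.0, step=8.0, min_step=0.01):
    """Adjust the lens position from the measured output, by a fixed rule."""
    while step > min_step:
        if sharpness(position + step) > sharpness(position):
            position += step    # sharper further out: move the lens out
        elif sharpness(position - step) > sharpness(position):
            position -= step    # sharper closer in: move the lens in
        else:
            step /= 2           # neither direction helps: refine the step
    return position
```

The rule itself never changes, which is what makes this adaptation by predetermined reflex. Only the lens position adapts to the scene.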
Simple fixed reasoning could be added either to the parameter
adjustment, or to the controller, to make the adaptation more intelligent.
The auto focus controller could for instance use information about light or
battery conditions to modify new parameter adjustments.
Usually the adaptation is associated with a real time constraint. The
system must change at a rate determined by the environment. If it snows
in summer and the temperature falls below zero, something that does not
happen often even in Sweden, an adaptive creature such as yourself
would just put on some more clothes, a plant on the other hand would die.
The time scale of adaptations varies enormously, from the milliseconds
range when we run and place our feet on uneven terrain, to cultural
adaptation acting over years, or even generations. Another example of
adaptation is a simple sponge, feeding by filtering water. It orientates
itself such that the flowing water aids feeding. This way the sponge does
not have to pump the water itself. What it must do is to somehow sense
the current and adjust its own orientation.
We talk of learning if we mean adaptation as a characteristic of an
individual system and of evolution when referring to the collective
process where reproductive mechanisms come into play.
Reinforced learning is how we enhance the simple adaptation described
above. As figure III.1.8 below illustrates, parameter adjustment, the critic
block in the figure, is given more information and intelligence. It
reinforces behaviour by telling the system if the output is right or wrong.
One example of reinforced learning is a child learning to walk. It is very
difficult to instruct a one-year-old. Yet somehow, through internal rules,
courage and stubbornness the child manages. A learning algorithm goes
something like this: "Under these conditions I did that, fell, and it hurt, so
I did it this way instead under the same conditions and did not fall. Aha!"
An example of "learning by doing".
Figure III.1.8 Reinforced learning. (Diagram labels: In, Controller,
Controlled system, Out, Critic, Reinforce.)
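The role of the critic can be sketched with a toy learner that only ever hears "right" or "wrong", expressed as a reward. The function name, the two actions, and the parameters below are assumptions for the illustration. The method shown is a simple action-value update with occasional exploration, one of many possible ways to learn from reinforcement.

```python
import random


def learn_from_critic(critic, actions, episodes=200, explore=0.1, rate=0.2, seed=0):
    """Estimate how good each action is, using only the critic's reward."""
    rng = random.Random(seed)            # seeded so the run is repeatable
    value = {action: 0.0 for action in actions}
    for _ in range(episodes):
        if rng.random() < explore:
            action = rng.choice(actions)             # try something at random
        else:
            action = max(actions, key=value.get)     # do what worked best so far
        reward = critic(action)                      # +1: did not fall, -1: it hurt
        value[action] += rate * (reward - value[action])
    return value
```

With a critic that rewards "small step" and punishes "big step", the learner ends with a positive value for the first and a negative value for the second, without ever being told how to walk.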
In supervised learning an external expert is added with knowledge that
improves the behaviour, see figure III.1.9. The controller also could take a
more active part in adjusting behaviour, not only react from given
parameters.
One example of this type of learning is a child learning addition and
division, 5 + 5 / 2. The expert provides the child with the right answer and
also suggests strategies that the controller uses to solve the problem.
Figure III.1.9 Supervised learning. (Diagram labels: In, Controller,
Controlled system, Out, Expert.)
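In contrast to a critic, an expert supplies the right answer, so the learner can correct itself by the size of its error. The sketch below is an assumption-laden illustration: the system has a single parameter `w`, it answers `w * x`, and it uses the classic error-correction (delta) rule. The examples and the learning rate are made up.

```python
def train(examples, rate=0.05, epochs=200):
    """Fit a single parameter w so that w * x matches the expert's answers."""
    w = 0.0
    for _ in range(epochs):
        for x, correct in examples:
            error = correct - w * x   # the expert supplies the right answer
            w += rate * error * x     # adjust in proportion to the error
    return w
```

Given the examples (1, 3), (2, 6) and (3, 9), the parameter converges to 3. The learner has then generalised from the expert's answers and can respond to inputs it was never shown.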
Another way to learn, without reinforcement, is to integrate knowledge
from the problem domain into the system. If the system is a machine we
can directly implant knowledge, but for humans this is impossible and the
necessary rote learning is laborious and error prone.
Why is teaching a machine difficult?
Learning improves performance in several ways. New facts, experiences,
processes, and concepts, including new ways of reasoning, will be made
available, and information can be reorganized for more efficient use.
Learning is important also for generalisation and specialisation of
knowledge and concepts. How else could we know the difference between
hard rock, country, house, and techno?
To conclude, in supervised learning the environment serves as a teacher,
and in reinforced learning as a source for evaluation. Studying how to
learn is important, both for the development of the human race and of the
machines, and there are many approaches from human life that can be
reused also for things. Examples are: learning by imitation, discovery,
analogy, learning by studying books, learning from society (ethics,
morals), learning by doing (e.g. programming), by writing, or by
teaching (dialogue with students). We will come back to learning and to its
companions adaptability and reasoning many times in this book.
Learning by imitation?
III.1.5 Heterogeneity, Autonomy and Intelligence
If all subsystems are the same, we say that the system is homogeneous.
For mechanical maintenance this really matters: you will buy 50 identical
aeroplanes for your company in order to simplify maintenance. Two ants
chosen at random from an anthill are very similar, i.e. they form a
homogeneous system. Heterogeneity on the other hand means that there
are differences among subsystems. The Internet is one example where a large
system is put together from different networks and computers.
A system managing on its own is said to be autonomous. A more specific
definition of autonomy is given to the right. An ant is for instance not an
autonomous system according to the definition since its behaviour is not
learnt. It has evolved. An autonomous system is normally instinctively
attributed with intelligence by a human observer, but neither autonomy nor
heterogeneity is necessary for intelligent behaviour. An ant leaves a
chemical marker indicating the trail to a nice source of ant food. Other ants
follow the trail, and add to it when they return to the anthill. As the trail
strengthens more ants will follow it. This emergent behaviour that appears
to be intelligent is actually accomplished by a very simple control
mechanism. For intelligent autonomous systems the assumption of
homogeneity is not valid, but it is a convenient approximation. We for
instance have laws that should be the same for everyone.
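How simple the control mechanism behind the ant trail can be is shown by the following sketch. It is a deliberately crude model built on assumptions that are not taken from the text: there are two trails to the same food source, ants choose a trail in proportion to its marker strength, a shorter trail is reinforced more often because round trips are quicker, and the marker slowly evaporates. The function name trail_shares and all parameter values are invented for the illustration.

```python
def trail_shares(lengths, steps=200, evaporation=0.1):
    pheromone = [1.0 for _ in lengths]  # all trails start out equally marked
    for _ in range(steps):
        total = sum(pheromone)
        for i, length in enumerate(lengths):
            share = pheromone[i] / total   # fraction of ants choosing trail i
            deposit = share / length       # shorter trail, more round trips
            pheromone[i] = (1 - evaporation) * pheromone[i] + deposit
    total = sum(pheromone)
    return [p / total for p in pheromone]


shares = trail_shares([1.0, 2.0])  # trail 0 is half as long as trail 1
print(shares)
```

No ant compares the two trails or knows their lengths. The preference for the shorter trail exists only at the level of the colony.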
A system is autonomous to the extent
that its behaviour is determined by
its own experience
[RN]
Autonomy is one aspect of intelligence; the abilities to adapt and to learn
are others. Already Aristotle considered intelligence to be the main
distinguishing feature of humans, but neither he nor anyone since, has
come up with a definition of intelligence that everyone agrees on. Some
attributes of intelligence are:
• Memory capacity
    o Quantity
    o Information organisation and retrieval
• Problem solving ability
    o Speed
    o Complexity
    o Creativity
• Ability to learn
    o Facts
    o Concepts
    o Processes
• Self-awareness
• Social ability
Several of these attributes are now no longer unique to humans and
animals. They are found also in technology. In many areas and
applications a computer program is much faster, can handle more
complex relationships, and search better for information than humans.
Intelligence is notoriously difficult to measure, which is not surprising
considering that no one has managed to define it. Several approaches have
been tried, for instance measuring vocabulary, reasoning, or memory, but
they have all been criticised on different grounds.
Three demands for intelligence are: complexity of purpose, structural
plasticity, and unpredictability [TWD2].
The first claim is that a more intelligent system can manage activities with
more complex purposes; which seems reasonable. The second claim is that
intelligence improves with structural plasticity, i.e. with the ability to
modify internal structures of the system to behave differently. The
plasticity can be used to adapt to the environment, or to accomplish new
goals, i.e. to reprogram behaviour. For a computer based system this
capacity comes from its layered structure, see figure III.1.18. There is no
need to change the hardware if we want to run a new program.
Intelligence measures:
- Distance between the eyes
- Rates of learning nonsense syllables
- Standardized test of intelligence
1110 + 0101 = 10011
How do you define stupid?
”Doing the same thing over
and over again, in the same
exact way, and expecting
different results!”
“Causing damage to self or
others without corresponding
advantage.”
Stupid people believe that almost
everyone else is stupid including:
”People who dislike me”
”People who disagree with me”
”People who are novices in an area
in which I am an expert.”
Figure III.1.18 Plasticity is a result of a layered structure. (Layers, from top to bottom: Application, Objects/Functions, Instructions, Machine code, Hardware.)
With structural flexibility and a purpose that is not too obvious there is a
possibility that the behaviour will appear unpredictable, but still
purposeful. This system will look intelligent to us.
Are you intelligent? Prove it!
III.1.6 Communication and language
A world of isolated systems is simply not possible, and to keep a system
together the subsystems need to communicate. We need communication,
and language is a key communication concept that we will discuss many
times in this book, along with various representations, transmitters and
media. It is difficult to define and describe what communication really is,
and the topic could easily fill a book by itself, but simply put it is the
process of transferring data, information and meaning between humans,
animals, plants, and things. One example is a dog barking at a cat, another
is the colour of a flower, signalling to the bee. A third example is text input
to a computer, and a last one is signalling that helps an aeroplane in bad
weather conditions. Communication needs a language, code or signal
with which it can represent the message, and a mechanism using a
physical transmission medium.
Assumptions for successful communication are that the message sent is
well organised, consistent, follows the agreed protocol, and is about
something meaningful to the receiver. The inhabitants of a West Indian
island will find it difficult to understand messages about different types of
snow falling in Umeå. Communication generally does not imply any
intention on behalf of the sender, but usually a sending mind has a reason
for, and gains something by, communicating. Successful communication
often demands that the sender adapts to the receiver and to the
peculiarities of the transmission mechanism.
The definition of communication in the margin to the right is a bit
restrictive since you have to have a mind to participate, which excludes
some active communicators, exemplified by an intelligent thing. Nor is
passive communication covered, such as when seeing snow outside informs
you about the weather conditions. A much less restrictive description of
communication, formulated by Adam Bailey, is found below the first one.
Use of the word process in this latter definition implies that
communication has a time aspect, which in turn implies changes in the
causes, characteristics, and results over the time span of a communication.
Another comment is that the disseminated information should have some
effect on the recipient. Otherwise, there is not much use of the
communication. Our last attempt at a definition, also to the right, is the
most general one.
Human language can be seen as messages, i.e. as packets of information. It
is however also a medium for interaction. A good language should be
efficient, expressive and easy to use. Ease of use is supported by the fact
that the meaning of a sentence is a function of the meaning of its parts,
which in turn depends on the meaning of the words. Not as obvious as it
seems when you think about it. The grouping principle applies.
Experiments show that subjects remember the words for a couple of
seconds, but the meaning of the words stays for a longer time. Perhaps the
structure of language has evolved to facilitate remembering and
understanding the meaning of what is said? Important properties of a
human language are efficiency, ease of use, and expressiveness.
The word “communication” itself originates from the Latin Communicare,
which means, “to make common”, or
“to share”, which makes sense.
Definition: "Communication takes
place when one mind so acts upon its
environment that another mind is
influenced, and in that other mind an
experience occurs, which is like the
experience in the first mind, and is
caused in part by that experience"
I. A. Richards 1928
“Communication is a process where
information is disseminated from a
source to a recipient”
Adam Bailey
“Communication is change of state in a
receiver caused either actively by an
intent of a sender, or passively by
properties in the context of the receiver”
[HG]
Interaction is acting within a
representation
Brenda Laurel
Efficiency in the use of language means that we do not tell everything we
know to everyone, and what is left out is of no interest. Who has not heard
a lengthy tale of a bus trip where nothing happens? You wait for the
action, but instead learn everything about the colour of seats that were not
comfortable. If you came from a different culture, where bus riding is an
art, you might be interested, but otherwise the story violates the efficiency
code. Expressiveness is another important issue, because we want to
express ourselves as clearly as possible. We want to say what we mean,
even if it is something complicated, or deep. Other features in a good
language are ease of learning and error detection, as well as precision
and compactness of the language for efficiency of use.
What is laughter? Is it a language?
Is it communication? Is it
interaction?
Vocal communication between humans, i.e. spoken language, takes place
on many levels of experience; physical (sound waves), physiological
(nerves, muscles), chemical (processes in muscles and brain),
psychological, cultural (speaker environment), linguistic (language
specific), and semantic (meaning of message).
The written language loses information in the transformation from words
spoken in a conversation. Rhythm, facial expressions and many other
helpful hints on how to interpret the message have disappeared. To
compensate for this lack of context we need to add more words. Written
language needs words such as "explain" and "propose", whereas in a face
to face conversation we would understand that someone is trying to
explain something to us, without explicitly telling us so. The language
used for SMS messages has to be as concise as possible. A new sublanguage
has developed with expressions such as 4u, lol, and roflmao.
This language is also perfect for interactive TV-shows that display SMS
messages in tickers.
Language reflects culture, human behaviour, action, and other aspects of
life, and people seem to always find ways of saying what they need to say.
One example is the word "drunk" that is supposed to have more
synonyms than any other term in the English language. Another
interesting, and equally useful, observation is that any language can
express almost anything expressed in any other language (given a little
time and a lot of words). Culture is the creator of language and language
sets the limits for what can be expressed in a culture.
(Figure: Society (users) and Language, each modifying the other.)
Compared to the language of animals our own language is quite
advanced. Animals are not able to communicate about things outside their
immediate temporal and spatial contiguity. One exception is bees! They
have a language to describe both where to find a nectar source and also
the amount of nectar at the source. One limitation is that the bee language
is pre-programmed in bee genes. A statement such as "I hope I feel as
good tomorrow" is impossible to make for any animal, at least as far as
we know. The statement abstracts the current state, and projects it to a
representation of a day that has not yet happened. When we expand an
utterance to a sentence, or to a whole story, we will add layer upon layer of
abstractions and symbols describing context, moods, actions, and much
more.
III.1.7 Emergence
Emergence is when fundamentally new behaviour or properties emerge as
things, or other entities, are hooked up, run together, or are united. It is
more than an effect of multiplicity, so doubling the workforce to get twice
as much done does not qualify as emergent behaviour. If the workforce
starts socialising and drinking beer, then you have an emergent
behaviour. The collective behaviour is in that case not readily understood
from the behaviour of the parts.
Definition: Emergent properties of a
complex physical system are neither (i)
properties had by any parts of the
system nor (ii) a mere summation of
properties of parts of the system
Examples of emergent properties are temperature and pressure of a gas.
They do not follow directly from the description of one particle. Forming a
drop of water out of atoms of Hydrogen and Oxygen is another example.
Who could have thought that these two gases should combine into the
necessary ingredient of life?
Systems can be studied by first dividing them up into subsystems,
investigating each subsystem separately, and eventually breaking each
subsystem up into its basic elements. This strategy is called reductionism
and has been used in many sciences, for instance to help us discover the
atom and the cell. The basic idea is that through the understanding of the
elements we can deduce the workings of the whole. An alternative
strategy claims that the whole is more than the sum of its parts, and that
the system is best understood when viewed as a whole, i.e. holism.
There is an interesting asymmetry hidden here. When we reduce a given
system to its parts there are no surprises, the book is composed of a
sequence of pages each revealing a part of the plot. We start with
knowledge of the whole, which gives us a particular view of the parts that
explains how they contribute to the whole. For reductionism science
works nicely, even though there are systems that are currently beyond our
understanding. We do not understand how to reduce them to their
separate parts. One example of such a complex system is our brain. Other
systems, such as a filled red balloon, cannot be studied by taking them
apart.
If we, on the other hand, join already known parts together there can
certainly be surprises. It can be very difficult, if not impossible, to predict
the behaviour of the composed system, without actually putting it
together and testing it. If we add three short sequences of film together the
effect can be quite different from the sequences shown separately. A
typical example is the cat-mouse-cat sequence indicating a threatened
mouse.
One example of emergence from physics is that two pendulums placed
close to each other will synchronize their swings. From the world of
insects we know that some termites tend to drop their mud balls close to
where other mud balls have already been dropped. This will create
impressive architectures and is also a good example of where the
environment and the actors interact.
BUH!
You don't take apart a frog
to see how he jumps
Seinfeld
Emerging line
Emerged pile
In many designed systems the behaviour of the system emerges as new
skills are adopted and learned. This is a process that in turn could enable
new behaviours, even more difficult to envision. According to the
researcher Mark Bickhard we should abandon what he calls a false
metaphysics, a metaphysics of substances, particles and properties, and
replace it with a process metaphysics [MB3]. We already did that when
we exchanged phlogiston for quite a different model of fire. The problem
with emergent behaviour surfaces for instance when designing formal
procedures for a work setting. In reality what people do is to develop the
given formal procedures to new practices, from their own point of view
better adapted to their context of work [WC].
Why then does emergence emerge? The answer is of course that
emergence is the result of increased complexity, which is difficult to
formalise, or even to comprehend. Our perceptual and cognitive abilities
are not accurate enough and the analytical models of today are not
sufficiently powerful. The real world (including humans) is too complex to
understand! Intuitively it might seem that emergence is an unusual
phenomenon, but this is not true. We are constantly surrounded by
emergent phenomena. How come the chair you are sitting on does
not collapse? Why should quantum fields form atoms that are grouped
into molecules, constituents of a tree that was used to make the chair?
Level after level of emergent phenomena are present in this example. You
are yourself an emergent phenomenon! Not only objects but also
behaviours can emerge. One example is panic at a rock concert. How
could you counter this behaviour using technology? Should technology be
applied on a personal basis, e.g. earphones, or for the whole crowd, e.g.
loudspeakers?
Heureka!
Archimedes
The white color of all refracted
light, at its very first emergence ...
is compounded of various colors.
Sir I. Newton.
It is tempting to say that emergence is a result of interaction! The
constituents by themselves are not enough; interaction is the key that
merges them, increases complexity, and possibly turns the result into
something new.
III.1.8 Space and time, change and mobility
Starting from the beginning we need to briefly introduce space and time.
They are presumably the two most fundamental aspects of reality. Why
this is so is a rather philosophical question, which we currently cannot
answer. Undoubtedly space and time are very important because they
provide any system with reference frames and contexts to dwell in. An
input can for instance be larger or smaller than another, and before or
after. We compose a photograph by first placing objects in a scene, and a
video is a sequence in time of images where we can watch our
grandchildren play.
Let there be light: and there was light….
And God said, Let there be a firmament in
the midst of the waters, and let it divide
the waters from the waters…..
And God said, Let there be lights in the
firmament of the heaven to divide the day
from the night; and let them be for signs,
and for seasons, and for days, and
years:…
on the seventh day God finished his work
which he had done, and he rested on the
seventh day from all his work which he
had done.
Genesis
In the Beginning there was
nothing, which exploded.
Terry Pratchett
Aztec calendar
A stable system is a nice system; any limited input will deliver a limited
output. Unstable systems do not behave that well, but can be very interesting.
If, for a particular unstable system, you change the input even a teeny-weeny
bit, the system's output in theory rises to infinity. In real life we have
no such thing as infinity because we live in a world with limited resources
of energy and time. Anyway, you most certainly do not want to design
unstable behaviour into your system.
One example of an unstable system is a ball on top of a very high
mountain. The ball is balancing on the top and even the smallest
disturbance will send it down the mountain slope with increasing kinetic
energy. Another example is the familiar microphone-loudspeaker
interaction that quickly makes you turn the volume down, or move the
microphone out of the way. The human equivalent to stability is someone
who is not easily disturbed, or brought out of balance.
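The difference between a stable and an unstable system can be illustrated numerically. The sketch below assumes a very simple system that is not taken from the text: each new output is the previous output multiplied by a feedback factor a, plus the current input. With a factor below 1 the system resembles the ball in a valley, and with a factor above 1 it resembles the microphone and the loudspeaker. The function name simulate and the values used are chosen for the illustration only.

```python
def simulate(a, inputs):
    """First-order system y[n] = a * y[n-1] + x[n], starting from rest."""
    y, outputs = 0.0, []
    for x in inputs:
        y = a * y + x
        outputs.append(y)
    return outputs


nudge = [1e-6] + [0.0] * 99       # a teeny-weeny disturbance at time 0
stable = simulate(0.5, nudge)     # |a| < 1: the disturbance dies out
unstable = simulate(1.5, nudge)   # |a| > 1: the disturbance keeps growing
print(abs(stable[-1]), abs(unstable[-1]))
```

In the simulation the unstable output grows without limit. In a real system something will eventually saturate or break, which is usually not what the designer intended.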
Patterns are necessary prerequisites and consequences for stability.
Adaptation needs a stable background, i.e. a pattern, to adapt to. The
ability for pattern recognition is fundamental and extremely important.
The example that first comes to mind is perhaps visual and auditory
pattern recognition, but the body registers patterns too, a callused hand
from playing golf is one example. A tree also recognizes patterns, the
foliage tends to be thicker on the south side of the tree because of direct
sunlight, and if a tree is partially shadowed it will lean over to catch more
light.
45 sec breakfast
3 min bathroom
2 min dressing
12 hours work
Equilibrium is another, quite natural description of a system state. The
12 hours work
balls in the figures to the right are in equilibrium. The ball on the
mountaintop is not in a very stable state, but it is in a state of equilibrium
since no energy is transferred between the ball and the system. When the
ball starts rushing down a slope it leaves equilibrium. Social equilibrium is
the human equivalent where the individuals involved do not have
anything important to say, and there are no outstanding questions to
resolve.
In terms of energy, the state of a stable system will not be disturbed by a
small amount of energy. Equilibrium means that the energy of the system
is divided among interaction participants and the environment in a way
that does not change with time. A glass of water on a table will soon be in
equilibrium, but put an ice cube on the table and you have a system that is
not. You can actually try this at home.
The relevant equation is:
Knowledge = power = energy =
matter = mass; a good bookshop is
just a genteel Black Hole that
knows how to read.
Terry Pratchett
Time invariance is another of the nice system properties. It is nice in the
sense that a time invariant system is predictable and will respond the
same way to the identical input today, as it did yesterday, and as it will
tomorrow. Humans are not time invariant, but many computer based
applications are. Time variance implies some memory in the system
because if you have no memory of yesterday, why should you change
your behaviour today? In general invariance, and not only over time, is
something that humans need and look for. It stabilizes aspects of the
environment and helps us survive. Gravity, for instance, works one way
only and has done so for a long time. This makes predictions about falling
apples possible. A static system is a time invariant system with a
behaviour that is independent of time; a dynamic system on the other
hand is a time variant system. Its behaviour can be described as the
function system_behaviour(t), where t is the time.
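Time invariance can be tested by experiment: feed the system the same input a little later and check whether the output is the same, only later. The sketch below does this for two made-up systems working on lists of numbers. The names delay, smoother, fading_amplifier and is_time_invariant are invented for this illustration, and a test with a single signal and a single delay gives evidence rather than proof.

```python
def delay(signal, k):
    """Shift a signal k steps later in time (zeros before it starts)."""
    return [0.0] * k + list(signal)


def smoother(x):
    """Time invariant: average of the current and the previous input."""
    return [(x[n] + (x[n - 1] if n > 0 else 0.0)) / 2 for n in range(len(x))]


def fading_amplifier(x):
    """Time variant: the gain depends on the time index n."""
    return [x[n] / (n + 1) for n in range(len(x))]


def is_time_invariant(system, signal, k=3):
    # Delaying the input must give exactly the delayed output.
    return system(delay(signal, k)) == delay(system(signal), k)


signal = [1.0, 2.0, 3.0]
print(is_time_invariant(smoother, signal))
print(is_time_invariant(fading_amplifier, signal))
```

The smoother treats today's input exactly as it treated yesterday's. The fading amplifier does not, since its gain is tied to the clock.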
Behind any dynamic aspect of a system, such as time variance and motion,
is the fundamental notion of change. The level of change is a trade-off
between static aspects, such as stability, familiarity and security, and more
dynamic ones, for instance flexibility and creativity. An alternative
formulation is that existence is a trade-off between being strapped down
and bored on one hand, and being addicted to perceptual experiences on
the other. Some fundamental changes are: open, on, start, increase,
decrease, pass, rise, collapse, towards, away from, stop, turn off, close.
A typical example of a physical system is a spring. If you pull a spring, the
force F you have to apply will be proportional to the amount L that you
want to extend it, i.e. L(F) = kF. This is independent of when you pull the
spring (assuming an ideal spring). If you let go of the extended spring the
system is no longer static, but dynamic. The extension of the spring will
depend on when you measure it, i.e. L(F, t), how far you pulled it, and
when you let go of it, i.e. its initial state.
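The static and the dynamic case of the spring can be compared in a short sketch. The static part follows L(F) = kF directly. The dynamic part needs assumptions that go beyond the text: a mass is attached to the spring, there is no friction, and the spring is released from rest at time 0. Under those assumptions the extension swings as a cosine. The function names and the parameter mass are introduced here for the illustration.

```python
import math


def static_extension(force, k):
    """Extension of the held spring, L(F) = k * F, the same at any time."""
    return k * force


def released_extension(force, k, mass, t):
    """Extension t seconds after letting go of a spring stretched by `force`.

    Assumes an undamped spring with a mass attached; its stiffness is 1 / k.
    """
    initial = static_extension(force, k)   # the initial state
    omega = math.sqrt(1.0 / (k * mass))    # angular frequency of the swing
    return initial * math.cos(omega * t)


print(static_extension(2.0, 0.5))
for t in (0.0, 1.0, 2.0, 3.0):
    print(t, released_extension(2.0, 0.5, 2.0, t))
```

The first function has no time argument at all, which is what makes the held spring static. The second needs the time, how far the spring was pulled, and the assumption about when it was released.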
A very important aspect of a system is its current location and whether
this location is changing. Mobility makes exploration of a space possible,
and the necessary prerequisite motion unites the concepts of space and
time. The notion of space is not limited to the physical space of reality; a
world created inside a computer also qualifies as space, and a social space
is another example of a world where mobility means navigation using
language and emotions. One space where we still have not figured out
how to move is time itself.
The basic categories of animate motion for survival and reproduction are:
pursuing, evading, fighting, and courting. In abstract terms an interactor
moves to access a resource. We go to see our friends; nomads move their
tents when seasons change, or when the local food supply is finished. The
main feature of a mobile system is to maintain a continuous availability
of resources, for instance a communication channel. If a resource for some
reason is not accessible often enough the system, and the designer, has
failed. A GPS device that does not give the position when you are lost in
the woods will not be used for many excursions. Sundials do not tell you
when it is time to go to bed.
"Right, you bastards, you're...
you're geography"
Terry Pratchett
- infinitesimal change
The Long Now Foundation is
developing a 10,000-year clock
F
L
Definition:
Mobility - the quality of moving freely
Ask Jeeves on the Internet
If the mountain will not
come to Muhammad,
then Muhammad will
go to the mountain.
Interaction, and especially sharing a resource between two interactors, is
more difficult while moving. Without a stable context to use as a
reference, communication efficiency can be low, and distracting interrupts
from changing context are more likely. If sound or vision is used,
interaction is only possible within a small area, or if the two systems are
moving in parallel at approximately the same (low) speed. In all other
cases additional technology is needed. A mobile phone solves the
problem, but introduces new problems such as low
bandwidth channels and lost messages.
Mobility has many implications. First of all a mobile system has to bring
its own energy source, or somehow extract energy from the environment,
e.g. using solar cells. If the system is to be carried around it should not be
too heavy, or otherwise be in the way, which implies restrictions on its
size. Similarly a mobile system cannot communicate over the wired
network without a wireless access network. In general, for any mobile
system, we have to bring technology along with us, e.g. a watch, or use some
pervasive infrastructure, e.g. a church bell or sundial.
Almost any change of the physical environment of a moving system is
stochastic because we do not have enough information to infer how the
context at the next position will differ from the current. Try out the
experience of travelling over the plains in the south of Germany and
suddenly see the Alps rise sharply in front of you. It is quite a surprise.
Naturally, both the physical position and the social context change as the
system moves, but so do other environmental constraints, such
as lighting and time zone. The wireless transmission capacity could be
reduced, or there is no one in sight to interact with. Some mobile systems
are carried around, and in this case the carrier at least provides a stable
context.
M. Duchamp, Nude
descending a staircase
Definition: Mobile, capable
of moving or being moved
about readily.
Up in the sky, swoop, swallows flees
and mosquitous follow, loop.
Up in the sky, swoop, swallow flees
and mosquitous, follow, loop.
A major problem is that changed context forces real time constraints on
the system. Recall from the previous chapter that recall from memory,
signal processing, feedback, adaptation, planning, and learning must be
fast enough. Also, important tasks such as identification, manipulation,
and navigation by a mobile system will be done under time pressure, as
well as decision-making. Guiding the driver of a car using a map in a large
unknown city is difficult. Just thinking about sitting beside the swearing
driver during rush hour, guiding him to the train station, where the train
is about to leave soon, raises the stress level.
Is mobility a prerequisite for intelligence? Let us limit the discussion to the
interactor Thing in the physical reality. The earth orbiting the sun is not a
very intelligent system by the criteria that we have discussed, so mobility
is not enough for intelligence. Many things behave intelligently for specific
simple tasks, such as playing chess, but this intelligence is not learned, it is
programmed into the thing by a (mobile) programmer. Even if we gave
the thing wheels, and close to unlimited processing power and memory, it
would still not manage in the physical reality. A major reason for this is its
limited sensory system. If we added a distributed network of sensors and
the ability to process the resulting information, would we now have
enough functionality for intelligence? In effect the distributed sensors
make a stationary system mobile without the need to move, but to build
an intelligent system we also need effectuators. Through them a thing
could affect the world and learn to draw conclusions from the result, i.e.
the thing in this case is an autonomous, context aware, interactive system.
As the philosopher Hegel said:
Always already
Some researchers claim that even this is not enough. According to them
the thing needs to master a language and use it to explore reality,
interacting with equals that probably are mobile too.
Given the characteristics of mobile systems discussed above there is no
surprise that designing flexible, usable, secure, user interfaces for them is a
real challenge.
One possibility would be to use space and time as a basis for this book.
The problem with this is that they are too fundamental. We sense their
presence in everything, but cannot really exploit this feeling when
describing interaction and technology for interaction. Therefore we will
not start this presentation all the way from space or time, but leave a little
something to philosophers and artists.
(Well, all right, here is a piece of wisdom to wrap this passage up. Space
without something in it is truly meaningless, OK? Enter matter, which in
turn is energy. A chunk of matter without anything happening is not very
interesting, so we need events. As we add events time starts flowing and
rock 'n' roll is not very far away. It's really very simple, and in the end
it's all about having fun.)
(Figure labels: Motion, Objects, Actions, Time, Frame of events.)
III.2 Complexity, we and it are certainly complex
The most important property of a system is perhaps its complexity. All of
us have had encounters with complex systems that we cannot
comprehend, fix, adjust, or manipulate. Examples are the inner workings
of a television set, and organising a group of ten 10-year-olds at a birthday
party. In other words, complexity is a natural concept to us humans, but
how can we measure and quantify complexity? If we could estimate it we
would know whether the task at hand is practicable, which helps us to predict
and decide, see table III.2.1. For a system formulated as a software algorithm
such estimation is usually possible. It is however much more difficult to
quantify the complexity of the discussions around the dinner table, Friday
night, at eight o'clock, after a glass, or two, of wine.
Low complexity          High complexity
Overview                Details
Fast reaction           Slow response
Cheap to manage         Expensive
Small amount of data    Large amounts of data
Table III.2.1 Some properties of complex systems versus systems with low complexity.
III.2.1 Why are systems complex and difficult to understand?
As human beings we have a confession to make. Our environment is too
complex for us, and the fact that we have survived thus far is truly
amazing! At any one time we can keep track of three to seven items, which
is not very many in the physical world. The input rate is also rather low.
We can focus on approximately 20 bits of information per second, where
one bit is either a 0 or a 1. Compare this with the million bits per
second needed to code a video transmission of an ordinary television show.
Definition: A complex
system is not easy to
understand or analyze.
It is easy to underestimate the complexity of ordinary life. Perhaps this is
because we are so used to our own environment that we live our life
through powerful, already accepted and learnt, abstractions? In design
and development of technology we are on the other hand constantly faced
with the details of implementation, and the abstractions are swept away,
or are useful only as referential patterns.
Obviously, we have somehow survived, so let us proceed with the
complexity issue. Why are systems complex? One reason for this is hinted
at in the definition to the right. To understand a system we need to
understand its parts, and to understand the parts we need to understand
how they interact.
We start by looking at a system we all know well, the family. It is an
example of a fairly complex system and the complexity is the result of,
among other things, the following properties. There are different kinds of
families, e.g. nuclear family, but all of them consist of a number of
individuals, each with a relationship with the others. The relationships
depend on the qualities of the individual. Families do not live alone, they
have to interact with the outside world. We can describe different views of
the family, economic or social, and a family fulfils different functions
such as raising children, and providing a sheltered social environment.
A family is formed, grows, and disintegrates in a number of ways. It also
adapts to changing circumstances.
A stone is perhaps the simplest of things, but in the world of man, even a
stone provides a stunning complexity. To start with its exterior has a
texture, colour, and structure that is very difficult to describe in detail, but
we can still easily separate granite from sandstone, and without looking
we easily discriminate a round stone from an apple using our hands.
Internally granite is composed of quartz and potassium feldspar, but also
of other minerals depending on the conditions when it was formed.
Quartz is typically grey to colourless and potassium feldspar is almost
always pink coloured. Actually feldspar is a generic name for three very
closely related minerals: Orthoclase, Sanidine, and Microcline with similar
physical properties. They are all composed of the same elements, but with
slightly different crystal structures.
A small stone is something that we can throw, but it can also be used to
mark one post of a soccer goal, or to stop a car from rolling down a slope.
Several religions even recognize a divine deity in stones.
Definition:
A complex system
consists of interconnected
and interwoven parts.
Complexity: When a wine is at
once rich and deep, yet balanced
and showing finesse
Internet
Hadschar al Aswad
(the Moslem sacred
“black stone”)
Kohinoor.
Some systems, like the weather, are natural constructs, while others are
complex because of man. Why was it necessary to make them complex?
One reason is that we try to build general systems that are applicable in
many contexts, but generalisation comes with a price tag. Radio is for
sound only, but the Internet can be used by many media, at the price of
complex behaviour and resources spent. A general system can handle
many types of inputs, but more specialised systems restrict their input and
functionality. Compare a bicycle to a family car, both can transport one
person, but the more complicated car is useful under many more
circumstances.
Another reason for complexity is that we need, or want, to perform
complex tasks, and even many relatively simple problems become
complex because we do not have the time, computational resources, or
memory to solve them in a straightforward way. Because of development
of technology we can manage new and more complex tasks, and our
interaction with the world becomes increasingly technology dependent.
We want to visit Los Angeles and London, listen to music when we are
out running, and use the Internet to contact our children, at our
summerhouse, out in the archipelago.
Simple systems?
Oscillator
Pendulum
Orbiting planet
Spring
Simplify simplify
Thoreau
Complex problems have simple,
easy-to-understand wrong answers.
Internet
1234567∞
When is a system complex? One obvious answer is that if a system
consists of a large number of components (more than 7) then it is a
complex system. Adding components results in a more complex, but
potentially also more adaptive system, which is one reason why increased
complexity is built into progress.
Another measure of complexity is the number of state variables that we
need to describe a system. If we need a lot of them to describe the system
(more than 7) it is a complex system. Whenever state variables are
mutually dependent the complexities multiply, which is why systems
involving interaction tend to be very complex. This can be reformulated
as: feedback and interaction add to complexity. Highly concurrent
systems where subsystems work in parallel, rather than in sequence, are
also complex. This is not surprising, as the number of possible states at a
given time will increase.
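The growth in the number of possible states can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that every subsystem has the same number of states and that any combination of them can occur; the function name joint_states is hypothetical.

```python
def joint_states(states_per_subsystem: int, subsystems: int) -> int:
    """Number of combined states when each subsystem can be in any of its
    states regardless of the others (an illustrative assumption)."""
    return states_per_subsystem ** subsystems


# Three states per subsystem, for example off, idle and busy.
for n in (1, 2, 7, 10):
    print(n, "subsystems in parallel:", joint_states(3, n), "joint states")
```

Already at seven subsystems the joint state space is far beyond what we can keep track of in our heads.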
Dynamic systems, for instance mobile systems, are inherently more
complicated than static ones, and timing issues such as synchronisation
and delay complicate them even more. Each of us has encountered some
technical equipment with too long a reaction time. When you switch it off, it
does not respond, so you become impatient and try the switch again,
which turns the system on once more, and so your struggle continues,
constantly out of synchrony. Time delays that vary in an unpredictable
way are even more confusing.
George Gates
Internet now consists of more than
10^8 interconnected computers, a
figure quite close to the 10^9 human
individuals now living.
Me
complex?
He, he, a naughty
teaser you are.
Bill Bush
Humans are good at handling many complex systems. We can recognise a
familiar face behind sunglasses, or behind a hat pulled down over the
eyes. Try building a robot system to wash up after dinner! How much
force can you apply to hold a plate without breaking it when you clean it?
A creative solution is to buy a dishwasher (or use paper plates).
Last, but not least, stochastic systems, where we only know probabilities
of events and behaviours, are more complex than the deterministic ones.
This is not surprising since we know less about the system's states. To
reduce complexity a stochastic signal is typically characterised by its mean
value. The average Swedish male for instance has a shoe size of about 42,
but this fact is of little use in the shoe shop when you want to buy your
uncle shoes for Christmas.
The answer to the ultimate question?
42!
D. Adams
Science has defined many measures of complexity for different purposes,
some complementary, or even contradictory. Intuitively a highly regular
thing is simple; think about a circle or a cube. We could also
characterize something that is very irregular, or highly random, as having
low complexity. White noise is mathematically very simple to describe,
even though it is impossible to predict the value of such a signal at a
specific time.
Maybe we can understand complexity better if we contrast it to simplicity?
The table below lists some differences between simple and complex
systems [RR2].
Simple System                         Complex System
Fully predicative and predictable     Contains impredicativities and non-predictable
Fully fractionable                    Contains non-fractionable aspects
Has computable models                 Has non-computable and computable models
Synthesis is the inverse of analysis  Synthesis generally distinct from analysis
One way to define complexity from a human perspective is that
something is complex if it surprises us, makes errors, or behaves
unexpectedly. Another, also intuitively acceptable definition is that a
complex system is difficult to fully describe. Next we will discuss how to
reduce complexity, i.e. to help us stay in control and understand what is
happening.
“War of the ants”
“Simple – Not involved or complex“
Websters dictionary
Table III.2.2 Contrasting
simple and complex
III.2.2 Reducing complexity
It is a fact that somehow humans survive. How is this possible? What
mechanisms have helped us to manage the amazing complexity of reality?
We are geared towards recognition of patterns of behaviours, objects, and
structures. This is supported by our physiology, and might be the best
(only) way to master complexity. Pattern recognition is however only the
tip of the iceberg. There are many mechanisms and principles supporting
and complementing it.
Some characteristics of really
complex systems are: they learn,
adapt, react, organize, mutate,
evolve, explore, couple, expand
and organise.
The first principle is to group similar or closely linked items together and
the second principle is to order the items. Grouping works by reducing the
number of features, behaviours, relations, or whatever type of
components is attended to. The first principle also applies to the case where
something occurs often. Information about such events will be stored
efficiently and for fast access. The second principle reduces the perceived
randomness of the system under study. We will discuss grouping and
ordering in this chapter, but there are also other strategies. Attention, i.e.
ignoring information, is one, and concentrating on differences another.
Grouping will create networks of hierarchies of items and hide details
behind interfaces. A good grouping will result in a structure that is
modular with low coupling between different partitions. In physical
reality space and time provide natural references for grouping, e.g. a
family lives at an address over a period of several years. Events or things
sensed simultaneously probably have a relation.
Coupling and cohesion are two measures of how close two objects are.
Coupling refers to the number of connections between objects, and
cohesion to what extent they are glued together. They could for instance
be parts of a pattern. Two objects with high coupling are a brother and a
sister, less coupling is found between cousins. A modular group with high
cohesion and coupling can easily be encapsulated as a new item, or
concept, like a football team, a kitchen, or a lawn. The windows on a
house have an extremely high level of cohesion to the house.
Figure III.2.1 Aggregated systems: a car (tires, engine) and a family
(parent, child).
There are many notations for describing grouping. In UML (Unified
Modelling Language) grouping components into a system is called
aggregation, denoted by a diamond as shown in figure III.2.2. The
diamond is placed at the grouping object's end of the association. A car is
a grouping object, aggregating four tyres.
[Diagram: Car aggregating 4 Tyres]
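Aggregation carries over directly to software, where the grouping object simply holds references to its parts. The following is a minimal sketch; the class names Car and Tyre and the position attribute are assumptions made for the example, not part of UML itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tyre:
    position: str  # hypothetical attribute, only used to tell the tyres apart


@dataclass
class Car:
    # The car is the grouping object: it aggregates its tyres.
    tyres: List[Tyre] = field(default_factory=list)


positions = ("front left", "front right", "rear left", "rear right")
car = Car(tyres=[Tyre(p) for p in positions])
print(len(car.tyres), "tyres aggregated by one car")
```

Code that deals with the car as a whole never needs to look at the individual tyres, which is exactly the reduction in detail that grouping is meant to give.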
When we group similar objects and abstract the similarities into a new
object the resulting object is referred to as a generalisation, or in UML as
an "Is a" relationship, or in other disciplines as an abstraction. In art the
word abstraction is however used differently, interchangeably with
"nonobjective imagery that departs from representational accuracy". The figure
item to the left in figure III.2.3 below is a generalisation of a circle. If
we follow the relationship the other way we say that the circle inherits the
properties of the figure.
Figure III.2.2 Aggregation
in UML.
Simplification a’la Pablo Picasso
Figure III.2.4 Generalisation as it is used in object oriented
programming, described in UML. (Diagram labels: Figure with Colour and
Draw(); Square, Ellipse, Circle; Man, Son, Father; relationships marked
"Is a".)
Generalisation for both actions and objects is useful, even necessary, but
the downside is that we lose track of any inner workings and internal
details. In real life humans exploit such additional information in many
ways. If you are told that someone has escaped it makes a difference
whether you are a terrorist or a magician. If you are a magician you want to
know whether the escape was from a cast or from a chain. You would like to see a
video because maybe you could reuse some ideas. All of this is hidden by
abstraction in the action verb "escape".
In software abstraction is used to hide the implementation of subsystems.
This makes software easy to reuse at the cost of losing control over
internal data structures, a control that could be vital for optimised
implementation of the system as a whole.
The use of inheritance simplifies because one object serves as the blueprint that can be extended to describe a new system. The circle can be
described as a particular graphic figure that inherits the properties of the
abstract graphic figure such as having a colour and being something that
you can draw, see illustration to the right. A new figure, a rectangle, can
be created from the basic graphic figure and will also immediately inherit
the colour property and the basic method draw().
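The inheritance described above can be sketched in a few lines of an object oriented language. The class and method names follow the text and the margin illustration (Figure, Circle, Rectangle, colour, draw, stretch); the width and height attributes and the text returned by draw() are assumptions added only to make the example runnable.

```python
class Figure:
    """Abstract graphic figure: it has a colour and can be drawn."""

    def __init__(self, colour: str) -> None:
        self.colour = colour

    def draw(self) -> str:
        return f"drawing a {self.colour} {type(self).__name__.lower()}"


class Circle(Figure):
    """A circle "Is a" figure; colour and draw() are inherited unchanged."""


class Rectangle(Figure):
    """A rectangle inherits colour and draw(), and adds stretch()."""

    def __init__(self, colour: str, width: float, height: float) -> None:
        super().__init__(colour)
        self.width = width
        self.height = height

    def stretch(self, factor: float) -> None:
        self.width *= factor


print(Circle("red").draw())
print(Rectangle("blue", 2, 1).draw())
```

Nothing about colour or drawing had to be repeated for the rectangle; only what is new, stretch(), was written down.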
Similar to generalisation and aggregation is scaling. The basic idea is to
reduce complexity by studying the system at an appropriate level of
detail. If we for instance want to calculate the orbit of the moon we should
ignore the effect of people on earth, and model the system as two solid
bodies interacting by gravitational forces. Nicely complementing scaling is
focusing. When we focus we ignore everything we do not attend to. By
dynamically changing resolution levels and adjusting focus we can find
the optimal way to study a system. At the lower resolution we make
assumptions and build hypotheses, which we test at higher resolution.
Pablo Picasso (i.e. Pablo Diego Jose
Francisco de Paula Juan
Nepomucenco Maria de los
Remedios Cipriano de la Santisima
Trinidad Ruiz Picasso)
[Illustration: a Rectangle, with the added method Stretch(), "Is a" Figure
with Colour and Draw().]
Hint: Hungry for blood
Order reduces complexity and also the computational demand for access
(search time), which is important for many tasks. The telephone directory
would be quite useless without sorting the subscribers. Order can be
imposed by sorting into simple categories, such as in the phonebook, or
according to more complex criteria. One example is the family tree for our
flora where flowers and plants are sorted in a way that is not always
obvious. Another example is when we order actions to form a plan.
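The gain in search time from sorting can be illustrated with binary search, one common way (our choice for this example) to exploit a sorted list. The sketch counts how many entries have to be inspected; the function name and the use of plain integers as stand-ins for subscriber names are assumptions.

```python
def binary_search_steps(sorted_items, target):
    """Search a sorted list by repeatedly halving the remaining range.

    Returns (found, steps), where steps is the number of entries inspected.
    """
    lo, hi, steps = 0, len(sorted_items), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_items) and sorted_items[lo] == target
    return found, steps


directory = list(range(1000))  # stand-in for 1000 sorted subscribers
found, steps = binary_search_steps(directory, 737)
print(found, "after inspecting", steps, "entries instead of up to", len(directory))
```

Without the ordering we would have no better option than to read the directory from the first page onwards.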
Ordering does not only show up for reasons of simplified data access.
Sometimes orderly behaviour and specific sequences of actions are
appropriate. There are furthermore also constraints in the time domain,
for example that you have to start working before you can stop, and also
spatial constraints such as passing the hallway is necessary to get from the
front door to the living room.
A special case of order is symmetry, i.e. where entities exhibit
correspondence in size and shape. The left and right part of the human
body and face are for instance roughly symmetrical. Other special cases of
order are repetition of identical, or similar, instances, i.e. multiplicity,
empty space, and total randomness.
Programmers look for the features above (similarities, order, symmetries,
couplings) in a problem. If for instance a similar functionality is needed in
several places, the software for this functionality can be reused. Almost
any structure found in a problem can be used to simplify the design and
the implementation of software, and of course also to simplify other types
of design.
Grouping and ordering work for human interactions, and also for
human-computer interactions. One example is a hierarchical menu
system where each menu provides commands that have something in
common, as exemplified by the edit menu in Microsoft Word ©. Society
tries to order and group humans, but this is not trivial; the medium size
does not seem to fit all of us. Humans on the other hand are adaptive
which means that if we want to, we ourselves can reduce complexity by
accepting the grouping and the orderly behaviour imposed on us. We start
working at about the same time in the morning even if it is not necessary,
and go to lunch when our colleagues do. Why?
What about nature in general? What is its level of complexity? Systems
that are too simple have problems adapting to changing conditions. But,
very complex systems on the other hand need to extract a lot of energy
from the environment to survive. There is a productive cycle here that
human evolution has explored. Adapt to gain access to more energy, use
the energy to adapt and find even more energy, and use it to further fuel
adaptation.
One important concept for orderly behaviour is the cycle, where
something is kept constant over each cycle. Repetition, iteration, and
recursion all profit from the cycle, which also naturally describes many aspects
of our lives: the cycle of life, seasons of the year, 60 seconds and the CPU
cycle.
Reducing complexity by layering is another useful way of grouping.
Functionality in computer programs, information processes and the like,
can be grouped and ordered this way. The system model used in this
chapter is itself divided into input–processing–output layers. Another
example is our vision. The eye registers light and pre-processes it, the
pre-processed signal is sent to the brain where it is further processed and
delivered to the cortex of the brain.
Photograph of snow.
Rendering of ocean waves.
Rendering of sand dunes.
Photograph (left) and rendering
(right) of human skin
Why are so many solutions in
nature, roses, waves at the sea,
snowflakes, beautifully ordered and
structured?
Do you prefer a cyclic model of
the Universe (rebirth for ever)
to a singular (one shot)? Why?
The king is dead
long live the king
Figure III.2.5 Layered systems are found everywhere in nature.
a) Human sensory system, hearing, vision, smell (input to output).
b) Human vision (input to output).
c) Information flow in an organization.
The figure above shows three different situations where layering is used.
Given n layers and m possible actions per layer, situation a) above allows
m^n interactions since from any action at a layer, m new actions are
available. In situation b) there are (n-1) interfaces to consider and the
number of interactions is m^2(n-1), a much less complicated system. The
simplicity gained however comes at a cost, as always. Implementation
of a complex system becomes manageable as functions are separated, but
the cost is that shortcuts between layers are difficult to implement.
Someone on the factory floor of a multinational corporation will have few
opportunities to address the big boss himself, compare c) in the figure
above. There are too many opaque intermediate levels.
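To get a feeling for the difference between the two counts, m^n for situation a) and m^2(n-1) for situation b), we can simply evaluate them. The function names in this sketch are hypothetical, and the values of m and n are picked only as an example.

```python
def interactions_a(m: int, n: int) -> int:
    """Situation a): from any action, m new actions follow at the next layer."""
    return m ** n


def interactions_b(m: int, n: int) -> int:
    """Situation b): n - 1 interfaces, each with m * m possible interactions."""
    return m ** 2 * (n - 1)


m, n = 4, 5  # four possible actions per layer, five layers
print("a):", interactions_a(m, n), "interactions")
print("b):", interactions_b(m, n), "interactions")
```

The first count grows exponentially with the number of layers, the second only linearly.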
Humans are good at finding criteria for ordering and imposing different
structures. This is exemplified in registration plates on cars, university
departments, and books in libraries. The social security number is another
interesting example. It has a structure specially designed for computers!
Finding a good structure is a creative task. It takes time and experience to
find one, and a bad choice will usually result in more computations. It will
take longer to find and sort items for manipulation.
Layering is also an example of divide and conquer. Complexity is
conquered by intelligently dividing a complex set of items, or a problem,
into subgroups, and then recursively dividing each subgroup until
manageable subgroups are achieved. Caesar used this when he split his
enemies into smaller groups and defeated them one by one. In order for
this strategy to work the subgroups should have low coupling. In Caesar's
case the low coupling was provided for free by his opponents themselves
who were not too eager to co-operate.
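In software the classic illustration of divide and conquer is sorting. The sketch below uses merge sort, which is our choice of example and not something the strategy prescribes: the list is divided recursively until each part is trivially sorted, and the parts are then merged. It works because the two halves can be sorted without knowing anything about each other, i.e. the coupling between them is low.

```python
def merge_sort(items):
    """Sort a list by dividing it, sorting the halves, and merging them."""
    if len(items) <= 1:
        return list(items)  # a manageable subgroup: nothing left to sort
    mid = len(items) // 2
    left = merge_sort(items[:mid])   # conquer the first half
    right = merge_sort(items[mid:])  # conquer the second half
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))
```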
Complexity is not always a bad thing. Because of complexity many
solutions to a problem can be found, and we can exploit this by adding, or
rather reconsidering, some degrees of freedom of the problem we study. If we
consider repainting the kitchen we might find acceptable new dinner
plates. Einstein reconsidered the constraints on how light should behave,
and invented relativity theory.
Divide et impera.
Julius Caesar (100 - 44 BC).
Give some example where the
strategy divide and conquer does
not work because of
dependencies among
subproblems.
III.2.2.1 Examples of groupings
Gestalt theory studies the laws of human perception through the use of
visual patterns. Individual parts combine to reveal identifiable patterns.
This combination is done as an active process by our minds using sensory
input. The following laws for grouping have been identified by Gestalt
theory.
Gestalt theory says that humans
actively identify patterns. Why
do humans have such a facility?
How has it developed? Is it
possible to copy human pattern
matching to a computer?
1. Law of proximity. Objects close to each other are grouped. The example
   is seen as three groups rather than nine characters:
   xxx xxx xxx
2. Law of similarity. Similar objects are grouped.
3. Law of closure. We close structures where parts are missing.
4. Law of appropriate continuation. We assume that structures
   behave regularly, and as simply as possible, when we cannot see
   them.
5. Law of common fate. If almost everyone is involved, or takes
   part, we assume that the rest do likewise.
6. Law of Prägnanz. Whenever there is a choice, a simple structure is
   preferred to a complicated one. The figure will be interpreted as
   a square and a triangle although other interpretations are
   possible. How many? 2? 3? 4? 5? 6?
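A very crude imitation of the law of proximity can be written in a few lines. The sketch is in no way a model of human perception; it only assumes that items belong to the same group when the gap between neighbours is at most max_gap, a hypothetical threshold.

```python
def group_by_proximity(positions, max_gap):
    """Group positions so that neighbours within max_gap share a group."""
    groups = []
    for p in sorted(positions):
        if groups and p - groups[-1][-1] <= max_gap:
            groups[-1].append(p)  # close to the previous item: same group
        else:
            groups.append([p])    # too far away: start a new group
    return groups


text = "xxx xxx xxx"
positions = [i for i, ch in enumerate(text) if ch == "x"]
print(group_by_proximity(positions, max_gap=1))
```

With the nine characters of the example this gives three groups, as our eyes do.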
The image to the right exemplifies that humans have problems whenever
an image does not adhere to the regular, simple structures we have
evolved to interpret.
Gestalt theory applies to sound as well. We will for instance fill out
missing parts of a sound to match it against our expectations.
How we do our grouping will always depend on our point of view. Let us
say that you are interested in animals. At one point you may want to look
up all of the dogs in a database, but at the next you want all brown
animals. A database is a computer tool for retrieving such diverse
associations and one way to implement it is to group characteristics in
tables, see figure III.2.6. This is called a relational database.
Animal 1     Animal 2        Animal 3
Dog          Human           Cat
Brown        Black / pale    White
Four legs    Two legs        Four legs
Figure III.2.6 A relational database.
When you need to find all of the brown animals in the database you just
search it and check the third row. If you want all brown animals with four
legs you check the third and fourth rows.
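The same lookups can be tried out with a small relational database. The sketch uses Python's built-in sqlite3 module with an in-memory database; the table and column names are assumptions. Note that in the usual relational layout each animal is a row and each characteristic a column, i.e. the figure turned on its side.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE animal "
    "(id INTEGER PRIMARY KEY, species TEXT, colour TEXT, legs INTEGER)"
)
conn.executemany(
    "INSERT INTO animal (id, species, colour, legs) VALUES (?, ?, ?, ?)",
    [(1, "Dog", "Brown", 4), (2, "Human", "Black / pale", 2), (3, "Cat", "White", 4)],
)


def species_where(condition, parameters=()):
    """Return the species of all animals matching an SQL condition."""
    query = f"SELECT species FROM animal WHERE {condition} ORDER BY id"
    return [row[0] for row in conn.execute(query, parameters)]


brown = species_where("colour = ?", ("Brown",))
brown_four_legs = species_where("colour = ? AND legs = ?", ("Brown", 4))
four_legs = species_where("legs = ?", (4,))
print(brown, brown_four_legs, four_legs)
```

The point is that the same table answers quite different questions, depending on which characteristic we choose to group by.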
We will not go into sorting algorithms here, but as you can understand
from the above, sorting is very important for efficiency reasons. No one
really knows (yet) how humans perform grouping, sorting, and searching
in their internal database.
Grouping as described above (Animal (Dog, Human, Cat)) is one example
of approximation by generalisation, another type of grouping is
approximation by relaxing constraints.
Figure III.2.10 Approximation by relaxation (two plots of output against
input).
The figure above shows how data values can be relaxed. The
simplification is not only visual; the mathematical description of the
resulting linear system also becomes very simple. The output in the
approximated system is only a factor k times the input, i.e. output =
k·input + offset.
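One way to find k and the offset from measured data is a least-squares fit. This is our choice for the example; the text above does not say how the relaxation should be done. The function name fit_line and the data values are assumptions.

```python
def fit_line(inputs, outputs):
    """Least-squares estimate of k and offset in output = k * input + offset."""
    n = len(inputs)
    mean_in = sum(inputs) / n
    mean_out = sum(outputs) / n
    sxx = sum((x - mean_in) ** 2 for x in inputs)
    sxy = sum((x - mean_in) * (y - mean_out) for x, y in zip(inputs, outputs))
    k = sxy / sxx
    offset = mean_out - k * mean_in
    return k, offset


# Made-up measurements that wobble around a straight line.
inputs = [0, 1, 2, 3, 4]
outputs = [1.1, 2.9, 5.2, 6.8, 9.0]
k, offset = fit_line(inputs, outputs)
print(f"output = {k:.2f} * input + {offset:.2f}")
```

Five data values have been replaced by two numbers, at the price of a small error for each value.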
Give an everyday example of
approximation by relaxing of
constraints.
A geometrical way of representing even more complex signals and
systems is to use a vector space. Pixels in a greyscale image can for
instance be ordered in such a vector space. Each pixel in the image defines
an axis in a coordinate system, i.e. one dimension in a vector space. An
image with two pixels: (23, 12) will be represented as a point in a two-dimensional plane, see figure below.
Figure III.2.11 Image with 2 pixels, shown as the point (23, 12) on the
axes Pixel 1 and Pixel 2. Pixel 1 has the greyscale value of 23.
Images with four pixels are similarly represented by four axes, i.e. in four
dimensions, where each axis represents the value (colour or greyscale) of a
pixel.
Figure III.2.12 Image with four pixels, one axis for each of Pixel 1 to
Pixel 4. With eight bits for each pixel we have a space of
256^4 = 2^32 (about 4.3 billion) images.
A specific picture, i.e. with values defined for each of these four pixels,
will be represented as a point in this coordinate system. In fact, all possible
pictures with four pixels can be represented by the space spanned. A
typical image displayed on a computer is 400 by 600 pixels. If each pixel is
represented by 3 bytes the total number of images is 2^(24 · 240,000), which is
quite a lot of images (far more than a 1 followed by 240,000 zeroes).
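The size of an image space is easy to compute, even if the numbers soon become too large to print. The sketch assumes that every pixel is stored in a fixed number of bits, so that an image of p pixels with b bits each is one of 2^(b · p) possible images; the function names are hypothetical. The number of decimal digits is found with logarithms rather than by printing the number itself.

```python
import math


def number_of_images(pixels: int, bits_per_pixel: int) -> int:
    """Every distinct bit pattern is a distinct image."""
    return 2 ** (bits_per_pixel * pixels)


def decimal_digits(pixels: int, bits_per_pixel: int) -> int:
    """Number of decimal digits in number_of_images(pixels, bits_per_pixel)."""
    return math.floor(bits_per_pixel * pixels * math.log10(2)) + 1


print(number_of_images(4, 8))         # four greyscale pixels of eight bits
print(decimal_digits(400 * 600, 24))  # 400 by 600 pixels of three bytes
```

For the 400 by 600 image the count itself has well over a million digits.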
A super sphere is the generalisation of a sphere to more than 3
dimensions. One way to use the vector space is to approximate all pictures
inside a super sphere to the point in the middle of the super sphere, see
figure below for an example in 4D. Hey presto, we have invented an
image compression algorithm where we can represent a lot of similar
pictures with only one point. This technique is also called clustering. Of
course there is a price to pay, the cost is that the approximated pictures
will be distorted. Vector spaces are powerful tools, so think this example
through!
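The compression idea can be sketched as follows. The two centres are hypothetical, and for simplicity every picture is assigned to its nearest centre, whether or not it actually lies inside a super sphere of a given radius. Instead of the four pixel values only the index of the centre is stored.

```python
import math

# Hypothetical centres of two super spheres in the four-pixel image space.
CENTRES = [(20, 20, 20, 20), (200, 200, 200, 200)]


def compress(image, centres=CENTRES):
    """Return the index of the centre nearest to the image (Euclidean distance)."""
    return min(range(len(centres)), key=lambda i: math.dist(image, centres[i]))


def decompress(index, centres=CENTRES):
    """The approximated picture is the centre itself."""
    return centres[index]


dark = (23, 12, 30, 18)
print(dark, "is stored as index", compress(dark),
      "and restored as", decompress(compress(dark)))
```

The restored picture differs from the original; that distortion is the price mentioned above.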
Best compression, around
100% for ”DEL”.
Internet
Describe some everyday situations
where you use clustering.
Figure III.2.13 Using the image space for compression. All pictures in the
supersphere (axes Pixel 1 to Pixel 4) are approximated by one point.
III.3 Modelling, it and us
A model is a mapping of a system, or a design, onto something formed,
natural or artificial, physical or virtual that can be used to test, explore, or
communicate aspects of that system, or that design, puh... The word
derives from the Latin word modulus, which means measure or scale.
Used in design a model is a device for understanding, communicating,
testing and predicting aspects of systems.
It is important to distinguish between the system itself, and the model that
represents the system, see figure III.3.1. The model is not reality, at least
not until virtual reality improves considerably, it is an abstraction, from a
particular view, of the system under study. Some formal systems such as
those constructed in mathematics are exceptions and have an exact model
- namely the system itself. This is true also for virtual worlds that do not
simulate reality.
Modeling is coping
with complexity
Van Dam
Things should be done as
simple as possible, but
not simpler.
Albert Einstein
Figure III.3.1 How a model (upper part of the figure) relates
to a system (lower part). Model world: abstract model; deductions,
reasoning, calculations; forecasting, comprehension. System world:
observations; questioning; anticipation, understanding. Links between the
two worlds: abstraction, modelling; application, evaluation.
On the other hand, what you learn from any model by reasoning, or
calculation, can easily make an impact in the real world, see figure III.3.1.
One example is that a weather forecast can cancel the next day's skiing.
The taxation authorities also use rather advanced and influential models.
Sometimes results and understanding are better reached by questioning
the system, rather than by model based reasoning, or by solving
equations. The horizontal arrow in the system world in the figure above
indicates this. Individual humans are for example very difficult to model;
it is easier to ask them directly. The answers can be used to improve
predictions, or for evaluating a model.
Usually a model is designed for a specific purpose and supposedly
irrelevant details are omitted. The mapping from the system to the model
should be as close as necessary, but not closer, which implies that any
model comes with built-in errors and that there is no universally best
model in any domain.
“We have to remember that what
we observe is not nature itself,
but nature exposed to our
method of questioning”
Werner Karl Heisenberg
Figure III.3.2 An image and a
digitised model of its essentials as
a signal.
If you for example want to model the painting in the figure above, the
frame is not relevant information. If you want to build a motherboard for
a computer your model will probably ignore the colour of the
motherboard.
Free painting included!
Who should do the modelling? When developing a computer based
service, full time workers in the modelled domain are preferred! The
problem is that knowledge about modelling is mostly found among
software developers, mathematicians, and physicists, not among the
practitioners of the domain. In general anyone involved in designing a
system benefits from knowledge on modelling.
How do we create new models? The following list gives some suggestions
[NG3].
- Composition and decomposition of previous models. There is
  nothing new under the sun.
- Reordering.
- Deletion and supplementation. One example of supplementation
  is the phi phenomenon, where a dot of light moving fast back and
  forth will be perceived as a line, i.e. an internal model of a line
  emerges in the viewer by supplementation.
- Deformation.
- Emphasizing parts of a model in a new way.
This is of course only one of an infinite number of ways to model
modelling. Remember, no single description will ever be the ultimate one
for all purposes.
There is no more a unique world of
worlds than there is a unique world.
Nelson Goodman [NG3],
(think of a model as a world)
III.3.1 Abstraction level
One of the key questions in modelling is to decide the abstraction level to
use. A model that is too general will not reveal any secrets; interesting
behaviours will not be seen. Nor will the model be of any use if it is too
specific. We can model a car as a chariot of war or of triumph; a vehicle of
splendour, dignity, or solemnity, but this model will not help us if we
need information about how to change the tires. The maintenance guide of
the car, on the other hand, is of no use to us if we want to describe what it
feels like to drive a car, or what petrol smells like.
Definition: Abstraction is a view of a
problem that extracts the essential
information relevant to a particular
purpose and ignores the remainder of
the information.
IEEE Standard Glossary of Software
Engineering Terminology,
Abstracting behaviour and operations is also useful, for instance in
situations where many actions are necessary to accomplish a goal.
Changing all headings to uppercase may need one operation per heading,
but by using an abstraction in the form of a macro we could do all changes
with a single command.
Macro_up:
If header Then
Change a-z to A-Z
End If
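The macro above is pseudocode. A runnable sketch of the same idea could look like the following, where the rule for recognising a header is passed in as a function, since the pseudocode does not say how headers are identified. The example rule, a line starting with "III.", is purely hypothetical.

```python
def macro_up(lines, is_header):
    """Return a copy of lines with every header converted to upper case."""
    return [line.upper() if is_header(line) else line for line in lines]


# Hypothetical rule: a header starts with a section number such as "III.3.1".
def starts_with_section_number(line):
    return line.startswith("III.")


document = ["III.3.1 Abstraction level", "One of the key questions in modelling ..."]
print(macro_up(document, starts_with_section_number))
```

Note that upper() converts all letters, not only a-z, which is usually what we want. One call now replaces one manual operation per heading.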
III.3.2 Modelling view
At a given level of abstraction any system can be studied from one or
more of three different perspectives, intentional, conceptual, or physical
[DB1].
The intentional view describes the system from the perspective of how,
and where, it is going to be used, and what goals and expectations it can
fulfil. One example of an intentional view of a system that transmits a
message is as a sender that intends to tell you what time it is. While doing
this we should ask ourselves if our model is useful. Modelling the mental
states of a light switch does not add much value to us.
From a conceptual view, we can learn how the system works, its
properties, and what mental model we should use to understand the
system. To explain a concept, similarity, pattern matching, metaphors, and
cultural associations are important. Understanding the principle behind a
cut and paste function in an editor is one example, another example is a
message viewed as packetised information.
The physical view is not too difficult to understand. It is concerned with
how the system and the real world influence each other. At this level a
packetised message can be modelled as a sequence of bits,
sent over an electric cable. Another example of a physical level model is a
neuro-physiological model of the workings of the eye. The 3D-model of a
coffee cup with a handle that is a perfect match for your finger is a third,
and a red button indicating a stop function a fourth.
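The message example can be followed through the three views in a small Python sketch. The function names, the packet size, and the choice of UTF-8 as the bit encoding are assumptions made for the illustration.

```python
def packetise(message, size):
    """Conceptual view: split the message into packets of at most `size` characters."""
    return [message[i:i + size] for i in range(0, len(message), size)]

def to_bits(packet):
    """Physical view: the packet as a string of bits (UTF-8, 8 bits per byte)."""
    return "".join(format(byte, "08b") for byte in packet.encode("utf-8"))

message = "It is 12:00"          # intentional view: tell the receiver what time it is
packets = packetise(message, 4)
for packet in packets:
    print(repr(packet), to_bits(packet))
```

Note that nothing in the packets or the bits says why the message was sent; that information exists only in the intentional view.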
Why do we need three levels? Why not two or four? This question is the
subject of an ongoing discussion among philosophers, but it seems that at
least three levels are needed. One physical interaction can be used to
implement many concepts, and many concepts can use the same physical
interaction. The cut and paste editing function can for instance be
implemented through the use of either a keyboard or a mouse. In a similar
way many intentions can make use of the same conceptual function and
many alternative concepts can be used to fulfil an intention. You can
achieve an objective in many different ways. If you want to tell your
mother some good news to cheer her up, you can either visit her and tell
her in person, or send her an email. The intentional view is needed
because the concept of sending your mother an email does not include the
additional information that you want to cheer her up. Can you think of an
example where a single concept is used in many intentional views?
At which levels of description
is a car controllable and
observable through
measurements?
Reflecting on systems from these three perspectives is something that
humans do all the time, and it is a very useful practice in a design
process.
We can use an alarm clock to illustrate the three perspectives above. This
specific clock is added as an extra function to a digital camera. If you are
told that a digital camera is equipped with such a function you will
understand, because of the intentional view, when it can be used. You will
start looking for a user interface, i.e. you use a physical view of the device,
and you expect, from your conceptual view of an alarm clock, that this
interface will give you the opportunity to set the alarm, set the current
time and turn the alarm off. This leaping between levels is typical of how
humans think, matching different views.
Why, Who?
What? When?
How?
Next we will present other ways to model systems starting with the idea
of system transparency.
III.3.3 Basic types of models
The transparency of a model describes to what degree the inner workings
of a system are modelled. For minimum transparency, a
phenomenological model can be used. This is a qualitative model
constructed top down from observations and experiences, and there are
many such models particularly for complex phenomena. We can take the
stock market as a first example. In Sweden stock prices fall in the late
spring. This is a phenomenological model that is verified every year.
Medicine is another area where phenomenological models are prevalent;
some cures work, and are used, but no one knows exactly why they work.
Many models from applied psychology are also phenomenological,
models on human reaction times for instance. If we want to describe a
messaging system at the phenomenological level it can be specified as a
set of messages, "Computer A sends a message to central control C saying
that it is alive". The good thing about the phenomenological model is that
it models available data very well, and yet can be quite simple, perhaps
just a graph showing the input-output relationship. The bad thing is that
new data might completely disrupt the model since it is neither
constructed on a solid theory, nor from deep knowledge about the system.
When we increase the transparency more information about the inner
working of a system is added. Either we focus on the structure or on the
behaviour of the system. A model of the structure of the system describes
how it is organised and composed of sub-components. In a
communication system we can for instance identify a transmitter, and a
receiver, connected by a channel as components. If we instead focus on the
behaviour of the system we are more interested in how the system's states
develop. In a communication system model the sender starts in a state
where he wants to send something, he generates a message, transmits it
over the channel, and ends up in a state waiting for an answer.
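The difference between a structure model and a behaviour model of the same communication system can be sketched in Python. The state names, action names, and dictionary layout are assumptions made for this example.

```python
# Structure model: components and how they are connected.
STRUCTURE = {"transmitter": ["channel"], "channel": ["receiver"], "receiver": []}

# Behaviour model of the sender: each state maps to (action, next state).
TRANSITIONS = {
    "wants_to_send": ("generate message", "message_ready"),
    "message_ready": ("transmit over channel", "waiting_for_answer"),
}

def run_sender(state="wants_to_send"):
    """Follow the transitions until a state without a transition is reached."""
    steps = []
    while state in TRANSITIONS:
        action, next_state = TRANSITIONS[state]
        steps.append((state, action, next_state))
        state = next_state
    return state, steps

final_state, steps = run_sender()
for step in steps:
    print(step)
print("final state:", final_state)
```

The structure model says nothing about order in time, and the behaviour model says nothing about which component does what; each answers a different question about the same system.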
A model expresses semantic
properties of a modeled world
W by syntactic properties of a
representation R.
P. Wegner, Brown University
(What more is there to say?)
Four wheels, engine accelerator,
and a steering wheel.
Release the clutch pedal slowly;
when you hear or feel the engine
begin to slow down, slowly press
down on the gas pedal as you
continue to release the clutch. The
car will start to move forward.
One alternative is to model the data flows through the system. This can be
exemplified by following a message as it passes different components, e.g.
communicating computers. The sender generates the message, which is
then transmitted and received by the receiver where it is assimilated and
interpreted. A corresponding example from another world is when a letter
is put in the mailbox, collected by the mailman, and delivered by the
postal service. A data flow model is also a functional model since if we
follow the data stream we can see how data is transformed at each
functional node it passes. At each node input is mapped to output.
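A data flow model can be sketched as a chain of functions, where each function is one node that maps its input to an output. The node names and the dictionary used to carry the message are assumptions made for the illustration.

```python
def generate(text):
    return {"payload": text}

def transmit(message):
    # The channel changes nothing here; a real channel could add noise or delay.
    return dict(message, transmitted=True)

def receive(message):
    return dict(message, received=True)

def interpret(message):
    return message["payload"].upper()    # stand-in for "interpretation"

NODES = [generate, transmit, receive, interpret]

def run_flow(data, nodes=NODES):
    """Pass the data through each node in turn."""
    for node in nodes:
        data = node(data)
    return data

print(run_flow("alive"))
```

Following the data from node to node shows both where it travels and how it is transformed, which is why the data flow model is also a functional model.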
To conclude, all of these models (structure, behaviour, data flow, …)
describe the system, but they do it using different concepts and levels of
detail.
If a precise, clearly defined, model is necessary a formal model is used. It
has a well-defined textual or graphical representation, and rules that,
when applied, result in a predictable, bounded behaviour. A message
system described at this level can still include a set of possible messages,
but now the messages are described using a language with predefined
symbols, e.g. C <- "Alive" (A sends the message "Alive" to C). Formal
models can be designed to describe the structure, behaviour, or any other
view of the system. The precise nature of the formal model has the added
value that designing it enforces clear thinking about the problem.
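What "predefined symbols" and "rules" mean can be shown with a very small formal message language in Python. The syntax (a receiver, an arrow, and a quoted message) and the set of allowed messages are assumptions made for the sketch.

```python
import re

# Assumed syntax: <receiver> <- "<message>", where the receiver is one
# capital letter and the message comes from a fixed, predefined set.
MESSAGES = {"Alive", "Stop"}                 # hypothetical set of allowed messages
PATTERN = re.compile(r'^([A-Z]) <- "(\w+)"$')

def parse(statement):
    """Return (receiver, message) if the statement is well formed, else None."""
    match = PATTERN.match(statement)
    if match is None or match.group(2) not in MESSAGES:
        return None
    return match.group(1), match.group(2)

print(parse('C <- "Alive"'))
print(parse("tell C that I am alive"))
```

The formal notation can be checked mechanically, whereas the free-text sentence is rejected even though a human reader would understand it.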
Less precise models are also needed since formalism restricts what you
can express with a model. One example of a less precise modelling
language is the spoken language. It is very expressive, but it is also very
difficult to parse and to extract meaning from an utterance.
Many systems are difficult to describe formally. They are usually very
complex, such as a theatre group or a user work situation, and are better
described in writing, or by using informal graphics. A manuscript for a
film is one example that describes sequences of events to be interpreted by
the actors.
Another way to classify models is as descriptive or predictive. Models
used for engineering are mostly predictive and built on mathematics. They
can be used to evaluate performance without actually building the real
thing. Descriptive models, as the name suggests, describe. A metaphor is
one example, describing by analogy, where the mapping is done, not by
formal rules, but by associations, cultural, personal, or other. One
example of a useful metaphor is the window as a user interface. The user
looks through a window into the realm of the computer application, much
in the same way that we look through an ordinary window. Another
useful metaphor, that we will use later when we describe different ways to
organise data, is the tree with its roots, trunk, branches, and leaves. In fact,
metaphors are much more common than we realise. One example is that
"up" is used as a metaphor for "more", and that the future is "ahead of"
us. As we identify patterns using associations and metaphors we reduce
complexity and save effort, adding another useful behaviour to those
discussed in Chapter III.2.2.
How do
you do?
”Before you do this, close all windows”
Cleaning Windows: A few simple
Steps to a Clearer Outlook.
Anti Microsoft campaign slogan?
Metaphor? Window cleaning ad?
Descriptive models: topographic
map, orthophoto map, satellite
image map, ecological map,
geologic map…
What metaphors would you apply
to describe how to get where you
want in a complex topology of
web pages?
From the above we can learn that modelling involves trade-offs. In
addition to the choices indicated above we have the following:
Analytical - Learned. An analytical model formulated as a
program or a mathematical formula is explicit, but rigid.
Learning a model is nature's alternative solution, but what is
learnt is not always possible to explain, verify, or delete. How do
you for instance estimate the time it takes you to go from bed to
your chair at the office in the morning? Do you use an analytical
model, a learned estimate, or a combination?
Pure model – Data only. A pure model is computationally
demanding. The alternative using data only is exact, but might
need a lot of memory. Consider for instance the size of a
database of photographs where all possible views of a face are
stored, to be used for face animation.
Context screened – Context driven. Allowing context to affect the
model makes it believable and reality based, but at the cost of
increased complexity.
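The "Pure model – Data only" trade-off in the list above can be sketched in Python with the sine function standing in for the system. The function names and the choice of one stored value per whole degree are assumptions made for the example.

```python
import math

def model_sine(degrees):
    """Pure model: compute the value every time it is needed."""
    return math.sin(math.radians(degrees))

# Data only: store every answer in advance, here one entry per whole degree.
TABLE = {d: math.sin(math.radians(d)) for d in range(360)}

def data_sine(degrees):
    """Look the value up; this works only for angles that were stored."""
    return TABLE[degrees]

print(model_sine(30), data_sine(30), len(TABLE))
```

The model needs computation but no storage and answers for any angle. The table needs 360 stored values and cannot answer for 30.5 degrees unless more data is added.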
The ultimate model of reality is
reality itself, and such a model
will certainly be complex.
III.3.3.1 Discrete, continuous
Many system variables are best approximated and represented by
continuous, real values, i.e. a real value in an interval. Reality can be
thought of as a gigantic hierarchical composition of small, smaller, and
extremely small building blocks. One effect of the depth of this
composition is that using discrete attributes becomes impractical. You
could probably calculate your weight by summing atoms, but you will
save yourself a lot of trouble by using a bathroom scale. This preference for
viewing reality as continuous is a problem for the current transistor-based
computers that use binary representations.
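How a continuous value ends up in a binary representation can be sketched as a simple quantisation in Python. The value range, the number of bits, and the function name are assumptions chosen for the illustration.

```python
def quantise(value, low, high, bits):
    """Map a real value in [low, high] to the nearest of 2**bits levels."""
    levels = 2 ** bits - 1
    step = (high - low) / levels
    code = round((value - low) / step)
    code = max(0, min(levels, code))       # clamp to the representable range
    return code, low + code * step         # the code and the value it stands for

code, approx = quantise(72.4, 0.0, 200.0, bits=8)
print(format(code, "08b"), approx)
```

The stored code stands for a value close to, but generally not equal to, the original. With this scheme the error is at most half a step, and it shrinks as more bits are used.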
III.3.3.2 Deterministic or Stochastic
Our knowledge of the behaviour of a system is sometimes reduced to a
statistical measure. We can estimate that the next car seen will be red with
a probability of 5 per cent. But for all we know there might be an
enormous line of red cars around the corner; this is not very likely, but
we can never know for sure. We have to accept surprises. We quickly
characterise people we meet, for instance, but if we spend more time with them
they will show new behaviours and talents and probably surprise us
many times. Our lives would be incredibly dull without surprises, yet we
fear losing control, and the randomness that eventually will kill us.
Hello,
This is probably 438-9012, yes, the
house of the famous statistician. I'm
not at home, or do not want to
answer the phone, most probably the
latter. Leave your name and I'll
probably phone you back. So far the
probability of that is about 0,645.
The Net
Systems such as a car break down predictably; we just don't know when.
We only know that it will be at a very inconvenient time. Systems whose
behaviour we cannot predict with 100 per cent certainty are called
stochastic systems. If they are fully predictable they are called
deterministic. One example of a deterministic system is the planetary
system. The sun will rise again tomorrow, at least with reasonable
assumptions about the Universe.
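The two kinds of models can be put side by side in a short Python sketch. The function names are invented for the example, and the 5 per cent probability is taken from the red car example earlier in this section.

```python
import random

def sun_rises(day):
    """Deterministic model: the same input always gives the same output."""
    return True

def next_car_is_red(rng, p_red=0.05):
    """Stochastic model: only the probability of the outcome is known."""
    return rng.random() < p_red

rng = random.Random(1)                      # seeded so that the run can be repeated
sample = [next_car_is_red(rng) for _ in range(100_000)]
print(sun_rises(1), sum(sample) / len(sample))
```

The deterministic model can be checked with a single call. The stochastic model can only be checked statistically, by observing that the fraction of red cars over many trials lies near the assumed probability.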
Real-world deterministic systems are always stochastic to some extent.
You can always invent some event that will make them stochastic (such as
the sun exploding one morning). On the other hand you can argue that
any stochastic system is deterministic. If, somehow, you know all the facts
you can exactly calculate, and predict with 100% probability, that the sun
will explode, and turn the stochastic system once again into a
deterministic one.
What are we missing here? Is there no distinction at all between stochastic
and deterministic systems? The point is that we study a system for a
specific purpose and with some predefined knowledge. Given that, either
a deterministic view or a stochastic view is the most appropriate. If our
purpose is to build a practical calendar for everyday use, there is no need
to take an exploding sun into consideration.
Give an example of a system
that is 100% deterministic.
Since modelling is about representing patterns, the system must at least be
deterministic enough to show a pattern to model.
III.3.4 Representations
To sense another system means that the system must somehow present
itself. A representation, expression, or coding is needed. We will often
use the word representation throughout the book and sometimes loosen
the definition to the right. The demand for a formal system in the
definition will be ignored, and how the information is made explicit is
sometimes obvious, and other times left out. One example is that we will
refer to a human face as a representation.
Definition: A representation is
a formal system for making
explicit certain entities or types
of information together with a
specification of how the system
does this.
David Marr.
For physical representations let us start by using nature's own building
blocks. Beginning with the small and working upwards we have: atoms,
molecules, pressure fluctuations, cells, neurons, mechanical components
built by materials with different properties, electronic and electromagnetic
devices, and human behaviour. Combinations of representations give new
representations, in the same manner as data types are combined to new
data types in a programming environment. Man-made structures of the
above representations are used in different forms of technology. Even the
volatility of reality can be put to use for communication. Pheromones
deposited by ants evaporate over time. Gravitation is used for dropping
bombs, which sends a message easily understood by the receiver. The
environment in these cases can be seen as an active participator in the
interactions.
A representation makes explicit
certain entities or types of
information.
David Marr, sloppy version.
[Figure: ICE (3D)]
The digital world adds virtual representations, for instance computer-based
internal representations of image, video, or text. Being virtual in the
definition to the right means that an interactor can accomplish something
without having a physical representation. A virtual representation needs
to be transformed into a physical representation before a human can sense
it. Virtual representations are usually man-made, i.e. created by
technology. A database is one example, a web page another.
Definition: Something virtual is
possessing a power of acting
without the agency of matter.
For each representation there are many possible methods to access it,
depending on the application, and on the context where the application is
used. Examples of access technologies are a microphone, and a camera.
Access methods used for information (I) use technology (T), and are by
themselves technology. Access methods inherent to the human (H), such as
hearing, are natural, i.e. do not need technology, even though an H is also
man-made.
There are also many ways to generate each representation, or in other
words, to synthesise it, and the methods and tools used for this are
themselves technology. Technologies for synthesis matching microphone
and camera are the loudspeaker and the projector. Finally, for each pair of
synthesis and access at least one representation is needed.
Representations exist at different abstraction levels and are transformed to
suit particular uses. The figure below shows different representations of
the logical connective AND. Leftmost the representation consists only of a
number of ink dots that form the word "AND". The truth table describes
the rules for the connective, and the symbol & is used in logic to
represent the connective. The symbol to the far right is used to denote an
AND-gate, which performs electronic operations on its input.
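The same connective can be written down at several of these levels in Python. The 5 volt logic levels and the 2.5 volt threshold in the gate function are assumptions for an idealised gate, not values taken from any data sheet.

```python
# Symbol level: the programming language's own operator for the connective.
def and_symbol(a, b):
    return a & b

# Rule level: the truth table stored as data.
TRUTH_TABLE = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

def and_table(a, b):
    return TRUTH_TABLE[(a, b)]

# Gate level (idealised): the output is high only if both input voltages are high.
def and_gate(volts_a, volts_b, threshold=2.5):
    return 5.0 if volts_a > threshold and volts_b > threshold else 0.0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_symbol(a, b), and_table(a, b), and_gate(5.0 * a, 5.0 * b))
```

All three functions represent the same connective, but each suits a different use: the operator for writing programs, the table for stating the rules, and the voltage model for reasoning about hardware.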
Give some examples of H-T-I, T-I-H,
and I-H-T transformations.
Give some examples of virtual-physical-virtual transformations.
One example of a physical-virtual-physical transformation
is the verification from the
computer by a clicking sound
when using the keyboard to
input the letter f.
Truth table for AND:
              Input 2
              1    0
Input 1   1   1    0
          0   0    0
(AND means that the only time output is "1" is when
both inputs are "1")
Figure III.3.5 Different abstraction
levels and transformations of
representations. Representations of an
AND gate are shown from ink dots to
symbol used in electronic schematics.
The choice of representation is important; a bad choice was for instance
one reason why the Roman culture did not develop mathematics. Using
the right representation can save a lot of computational resources. This
goes for language too; it is no coincidence that most sciences have their own
add-on language. It is efficient!
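The cost of an awkward representation shows up as soon as we try to compute with it. The Python sketch below multiplies two Roman numerals by first converting them to ordinary integers. The function names are invented for the example, valid numerals are assumed as input, and nothing is claimed about how the Romans themselves calculated.

```python
VALUES = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
          (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
          (5, "V"), (4, "IV"), (1, "I")]

def to_int(roman):
    """Convert a valid Roman numeral to an integer."""
    total, i = 0, 0
    for value, symbol in VALUES:
        while roman.startswith(symbol, i):
            total += value
            i += len(symbol)
    return total

def to_roman(number):
    """Convert a positive integer to a Roman numeral."""
    out = []
    for value, symbol in VALUES:
        count, number = divmod(number, value)
        out.append(symbol * count)
    return "".join(out)

def multiply_roman(a, b):
    # The multiplication itself is done on integers, not on the numerals.
    return to_roman(to_int(a) * to_int(b))

print(multiply_roman("X", "X"))
```

Almost all of the code deals with moving between representations; the multiplication is a single operator once the numbers are in a form that supports it.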
How do Romans multiply X and X?
X2=100 all the time?
The representation of something is not the same as its internal
organisation, content, or behaviour. As humans we are used to this
separation, we know that someone can feel sad behind a broad smile.
Most of the time we are interested in getting all relevant aspects of the
message through as clearly as possible, but overall optimisation is not
always possible. There can be conflicting demands, or we simply are not
creative, or knowledgeable, enough to find the best representation. One
example is the problem of finding a slogan for a large company. How can
a one-liner represent all possible aspects of the company?
III.3.5 Language
An advanced form of representation is a language. As the definition to the
right says it is a system intended for communication. When used it must
be associated with something meaningful, i.e. it is used to model
something, and rules are needed to facilitate its interpretation.
Representations (symbols) are organised according to the rules of the
current context. One example is to order characters into words and
sentences for English readers. Another example is how binary digits form
packets of information that can be sent over the Internet. A third example
is body language that can signal social information such as power
hierarchies and individual desires.
Definition: A language is a system
of communication using a representation, metaphor, and rules for
language use.
III.4 System environment, context, it is all around us
What a system can learn from its environment is very important. The
environment is also a system, so many of the properties discussed in
Chapter III.13 apply. A system can for instance be immersed in a
deterministic or non-deterministic, static or dynamic, discrete or
continuous physical environment. But, even if the environment has certain
properties this does not mean that the immersed system will perceive
them as such. If we for instance send an intelligent thing that can sense
only discrete events out into a continuous environment, the environment
will be perceived as discrete.
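The last point can be sketched in Python: a continuous environment observed through a sensor that can only report one of two events. The temperature curve, the threshold, and the names are assumptions made for the illustration.

```python
import math

def environment(t):
    """Continuous environment: temperature as a smooth function of time in hours."""
    return 10.0 + 5.0 * math.sin(2 * math.pi * t / 24.0)

def perceive(t, threshold=12.0):
    """A sensor that can only report a discrete event: 'warm' or 'cold'."""
    return "warm" if environment(t) >= threshold else "cold"

print([perceive(t) for t in range(0, 24, 3)])
```

However smoothly the temperature changes, the thing that relies on this sensor lives in a world with only two states.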
Physical environments are maybe the most obvious environments with
variables such as temperature, humidity, and acceleration. They are
inherently dynamic and continuous, for instance located in a mobile
space, e.g. a bus. Furthermore it is impossible to have complete
information about every aspect of them, which makes them non-deterministic and difficult to fully control, modify, and plan for.
Inaccessible, non-deterministic, and continuous environments are
sometimes referred to as open environments. Physical environments are
also important since they cannot be ignored. One example is how noise
affects speech recognition. Other examples are that keypads are difficult to
use in the dark, and that LCD displays do not work well at temperatures
below zero. The physical environment is of course also important since it
serves as a reference for the virtual environment. Space, time, shape,
motion and colour are all reused.
"What is real? How do you define
real? If you're talking about your
senses, what you feel, taste, smell,
or see, then all you're talking
about are electrical signals
interpreted by your brain”
Morpheus in “The Matrix”
[Figure: control for adjusting the outdoor temperature, from -10 to +10]
Other kinds of environments are human-based social and cultural
environments, such as a nation, an organisation, a family, or a discipline,
for instance mathematics. Humans group themselves in many dimensions,
i.e. in many different societies and in principle we could list an infinite
number of possible social environments, each with a specific knowledge
base, skills and behaviours. In practice, each application, or interaction,
defines its own specific environment and together with administrative
considerations and cultural habits this means that most interactions take
place in relatively well defined environments, e.g. in homes, hospitals,
schools, or cars. Some cultural environments are goal based, for instance a
group of people travelling to a vacation resort, or focused employees at
Ericsson. Others have less clear objectives such as a family, or the citizens
of Umeå. A cultural environment is, unlike the physical environment, not
necessarily placed in time or space. The IEEE organisation for engineers
has over 300,000 members all over the world.
Social environments are open and the complexity of the physical and
human social environments cannot be overstated.
Software based environments are examples of a third type of environment,
a technology defined environment. Developers need to carefully select
the operating system, software libraries, and maybe also the hardware
used. The virtual environment is a special breed of software-based
environments. It is technology defined, with laws that you can tinker with,
as opposed to the physical environment. In a virtual environment even the
notion of locality could be modified, all participants in an interaction
might occupy exactly the same location. For virtual environments,
adapting the environment to the user or agent is a real possibility. So,
instead of adapting thousands of software agents to the current Internet
environment it might be more efficient to prepare and structure the
Internet itself for the agent and user invasion. So far this has not been
done. We talk about cyberspace and the information highway, but
structures, policies and social behaviour for those who travel on the
highway are missing.
Cannot find Internet.sys.
Universe halted.
Internet
World of entertainment, world
of Star Trek, poetry, physics,
golf, politics, parasites, Coca
Cola.
We will discuss the environment, or context, many times in this book.
"They're made out of meat."
"Meat?"
"Meat. They're made out of meat."
"Meat?"
"There's no doubt about it. We picked up several from different parts of the planet, took
them aboard our recon vessels, and probed them all the way through. They're completely
meat."
"That's impossible. What about the radio signals? The messages to the stars?"
"They use the radio waves to talk, but the signals don't come from them. The signals
come from machines."
"So who made the machines? That's who we want to contact."
"They made the machines. That's what I'm trying to tell you. Meat made the machines."
"That's ridiculous. How can meat make a machine? You're asking me to believe in
sentient meat."
"I'm not asking you, I'm telling you. These creatures are the only sentient race in that
sector and they're made out of meat."
"Maybe they're like the orfolei. You know, a carbon-based intelligence that goes through
a meat stage."
"Nope. They're born meat and they die meat. We studied them for several of their life
spans, which didn't take long. Do you have any idea what's the life span of meat?"
"Spare me. Okay, maybe they're only part meat. You know, like the weddilei. A meat head
with an electron plasma brain inside."
"Nope. We thought of that, since they do have meat heads, like the weddilei. But I told
you, we probed them. They're meat all the way through."
"No brain?"
"Oh, there's a brain all right. It's just that the brain is made out of meat! That's what I've
been trying to tell you."
"So ... what does the thinking?"
"You're not understanding, are you? You're refusing to deal with what I'm telling you.
The brain does the thinking. The meat."
"Thinking meat! You're asking me to believe in thinking meat!"
"Yes, thinking meat! Conscious meat! Loving meat. Dreaming meat. The meat is the
whole deal! Are you beginning to get the picture or do I have to start all over?"
"Omigod. You're serious then. They're made out of meat."
"Thank you. Finally. Yes. They are indeed made out of meat. And they've been trying to
get in touch with us for almost a hundred of their years."
"Omigod. So what does this meat have in mind?"
Terry Bisson (shortened version)
Part IV: Interactors, we are not alone
Now the time has come to introduce the participants of the interaction. We
will use the term interactor to stress interest in interaction. It is also a
generic term, which helps us to abandon any prejudice linked to more
specific terms such as person, or thing [BMD]. Other terms used for the
interactor are agent, actor, participant, citizen, and sometimes also
demon, monitor, interpreter or executive. A line of thought, that we will
not follow up in this book, is that an interactor can be a part of another
interactor, and itself a construct of interactors [BMD].
The participants chosen are Human, Thing and Information. Why these?
Why three?
The human is a natural choice, the species that created the environment for
this book, close to the writer, and to the intended readers, a typical user of
technology.
The thing as the second choice is more questionable. How about
dolphins? Ant colonies [DH]? The thing with the intelligent, designed,
thing is that it is made by people and evolving very fast. Dolphins and ant
colonies may be more intelligent than we think and have a lot to teach us,
for instance on how to live our lives, but currently the communication
bandwidth to them is quite limited. The tool has been with us since our
savannah days, so interacting with its grandson is only natural. Also,
human-thing interaction has been explored by the human-computer interaction (HCI) research community and by many other scientific
disciplines, which means that there is a lot of knowledge around. As a last
motivation, the thing is something real and commonplace.
So we accept the thing as a participant.
Are there enough participants? Let us take the perspective of a thing. This
particular thing is communicating at 155 Mbit per second with software
somewhere else. This software, is it a thing? It is certainly not human. The
software in this imagined case is a component-based database system,
distributed such that the software for controlling the data retrieval is
executed on many computers connected over the Internet. The data and
the software are broken up into zillions of pieces, each residing on its own
computer, out there somewhere. It seems that we have a participant here
that is not human and not a thing (localised, real, something you can
touch). This participant we call information, and it is not necessarily
confined in space or time. It is virtual, hard to touch, difficult to lay your
hands on.
Definition: A thing is a
physical object that can be
referred to as an.
Webster's dictionary:
Thing: The real or concrete
substance of an entity.
(one out of 19 meanings)
So, do we accept information as a participant?
Information does seem a bit dull as a partner in an interaction. Where is
the spirit? People have intelligence, we have things all around us for
physical support, but maybe information could match our intelligence,
and our creativity?
One way to define intelligence is as the ability to surprise a human and
come up with new ideas, like the zipper, or the desktop metaphor in
human-computer-interaction. What then is an idea? It is not a human and
it is not a thing, even though both of these participants interact with ideas.
A good idea is to include the idea in the information interactor where it
opens up new possibilities. It interacts with other ideas to generate new
ideas. If we add idea to the information concept, information does not
seem so dull any more.
Now, consider the World Wide Web. The initial idea was to associate
information using active character strings, i.e. addresses as links. By
clicking on a link we can display new information, including new links.
This idea gave us the first web browser, and presently software agents are
researched that spend their time out on the web collecting information. So
from the initial idea of the active string many new ideas have emanated.
Who could have foreseen the software agent when the idea of a link was
born? The agents are also information, interactors that we might refer to as
active information. They can move, communicate among themselves, and
even replicate. Such objects could monitor context, keep track of the
interactors, and manage information transfer between them.
With more than three types of interactors the number of permutations of
pairs of participants in interactions will grow. So, also for practical reasons
three is a good number.
We will have problems classifying many topics as belonging to either the
world of things, information, or humans. A human can be categorised as a
thing, as well as an information processing entity, and knowledge is
relevant to all interactors. The approach taken in this book is stepwise
procrastination with a human-centred perspective.
We start out by introducing the human, providing it with as many
relevant characteristics as possible. Next, we discuss information. Once
again we include as many aspects as possible. Much of what has been
discussed up to that point applies to the thing as well, but we will not
repeat any material, only add. There are still quite a lot of unique
properties left to discuss for the thing.
The illustration to the right shows how we focus on H in our model,
embedded in information. Some of this information is directly mediated
from the physical reality (T), but an increasing amount of it is managed by
computers, and gradually this information will be a world of its own,
indicated by the arrows expanding the information area in the illustration.
Now we will introduce a generic interactor and start by describing some
of its characteristics. We continue by studying its processing in context, i.e.
how interactors process input, and generate output. As far as possible,
examples use the human as the interactor.
IV.1 We have an interface, a structure, and processing capability
To be really interesting an interactor should be autonomous, be aware of
its environment, and have a rational behaviour. With identity,
intelligence, and a social life it can be someone like you. An actor's
behaviour can be conveniently represented by the stimulus-response
diagram (SR-diagram) where the behaviour is represented by its response
to stimuli, see figure below [RA].
Stimulus
Response
Behaviour
Figure IV.1.1 Stimulus-response
diagram.
Elaborated somewhat more, an interactor is a physical or virtual entity
with some, or all of the following skills [JF]:
It is capable of sustained acting in an environment.
It can communicate directly with other interactors.
It is driven by a goal, a set of tendencies (in the form of
individual objectives, commitments, or of a satisfaction/survival
function which it tries to optimise).
It possesses resources of its own.
It is capable of perceiving its environment to a limited extent.
It has a partial mental model of this environment and of its own
history.
It possesses skills, offers services and is prepared to handle a
possible failure.
It might be able to reproduce itself.
It has a behaviour that tends toward satisfying its objectives,
taking account of the resources and skills available, and
according to the information it receives, i.e. a rational behaviour.
Definition of an ideal rational interactor: For each percept sequence,
do whatever action maximises performance, based on whatever knowledge
is available. [JF, adapted]
Figure IV.1.2 The interactor shown with its skills (labels: Goal, Mental
model, Communications, Resources, Actions, Perceptions, Environment).
One important difference between the human and the other interactors is
that humans cannot be changed whereas information and things can be
designed and adapted to suit the situation and the application at hand.
IV.1.1 Representation
The representations of two interactors can be quite different. Face,
interface, and surface are good descriptions of the representations for the
interactors we have chosen, and will be discussed in the following
chapters. Many interactors have a unique name or number, but
representations also include internal structures and architectures.
Behaviours or actions, sounds, and organisation can represent a system
and some examples are gestures, and a line of people waiting at a bus
stop.
What it is – Object
What it does - Computation
Affordance and accountability are two important concepts related to
representation. Affordance refers to how appearance suggests function
and gives opportunity for action. Accountability is how interactors tailor
their representations such that they can be understood, also in the course
of action, and is not always a static property of the system.
IV.1.2 Perception and cognition
The interactor detects, interprets, processes, and effectuates. In other words
it senses, perceives, and has cognitive and maybe even social abilities.
Perception and cognition drive effectuators that generate outputs.
Cognition is additional functionality in the interactor for selecting
information (attention), manipulating it (thinking, processing), and storing
it (learning, knowledge representation, memory), see figure IV.1.3 that
shows the cognitive architecture. Sometimes perception is considered as a
part of cognition, but we will treat it as a separate subsystem. The word
cognition stems from the Latin cognitio, meaning examination, learning, and
knowledge, and cognitive science is the science studying such systems. It is
a multidisciplinary research field to which psychology, computer science,
philosophy, linguistics, and biology all contribute. We will use the
cognitive architecture to structure the discussion.
Discrimination, identification,
manipulation, describing and
responding to descriptions of
objects, events and states of
affairs in the world.
Five tasks a cognitive theory
has to explain.
Harnad
Why the focus on human cognition you might ask yourself? There are two
reasons for this and the first is constructive; if we better understand
human cognition we might be able to build better robots or other tools.
The second reason is simply curiosity, how do we work? By building
systems that mimic humans we might learn a thing or two about
ourselves.
"A human being is the measure
of all things – of things that
are, that they are, and of things
that are not that they are not."
Protagoras 480-411 BC
I cannot survive without brainwork. What else is there to live for?
Sherlock Holmes
Sensor: Eyes, Ears, Skin, Nose, Mouth, …
Sensing modality: Vision, Hearing, Taste, Smell, Touch, …
Central system: Thinking, Attention, Memory, Learning, Reasoning, Planning, …
Action modality: Gaze, Voice, Facial expression, Hand/body movement, …
Effectuator: Eye, Face, Hand, Body, Mouth, …
Additional label in the figure: Language
Figure IV.1.3 The cognitive architecture for human information processing.
From figure IV.1.3 we can see how sensory inputs, in different sensing
modalities, flood the sensory system. Following paths through millions of
neurons the input is passed, and massaged, until it reaches the central
nervous system. Here sensations are combined to higher-level
abstractions and processed by perception and cognition.
Perception, from Latin perceptio, means to receive or apprehend. It is the
process of merging input into a usable mental representation of the world.
This means organising, ignoring, and interpreting sensations, which are very
important tasks for complex beings. It allows animals to function in the
real world, find their prey, and separate predators from nice looking
individuals of the opposite sex. The number of perceptions and their
content limits the complexity of the behaviour.
Definition: Perception is the act of apprehending material
objects or qualities through the senses.
A thermostat is simple and can perceive only two aspects of reality, too
cold or too warm relative to a reference temperature. To put yourself in
another interactor's place, a complex interactor like you has to
mentally emulate the perceptions of the other interactor. Try to put
yourself in the place of a thermostat, how would you perceive reality?
What is life like to a thermostat?
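The thermostat's two-valued view of reality can be sketched in a few lines of code. This is only an illustration of a stimulus-response interactor; the class name, the method names, and the heater responses are assumptions made for the example and do not come from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """A minimal stimulus-response interactor (all names are hypothetical)."""
    reference: float  # reference temperature in degrees Celsius

    def perceive(self, temperature: float) -> str:
        # The whole of reality collapses into one of two percepts.
        return "too cold" if temperature < self.reference else "too warm"

    def act(self, percept: str) -> str:
        # The response to the stimulus: switch the heater on or off.
        return "heater on" if percept == "too cold" else "heater off"


thermostat = Thermostat(reference=20.0)
for reading in (12.5, 19.9, 20.0, 31.0):
    percept = thermostat.perceive(reading)
    print(f"{reading:5.1f} -> {percept:8s} -> {thermostat.act(percept)}")
```

However rich the stream of temperature readings is, the thermostat's behaviour can never be more complex than its two percepts allow.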
From a human viewpoint perceptions by evolution are marvellously
adapted to our environment, and they are very resource efficient. We
perceive a tiger through sensations, a smell, and a terrible sound. This
makes us focus our attention on the tiger, and we recall that a tiger is
rather dangerous. At this stage not even our mothers would recognise our
facial expression, and the blood freezes to an extent inversely proportional
to the speed with which our legs propel us.
An interactor of type H or T senses its environment and converts external
physical variables to internal representations. To an interactor of type I
input and output degenerate to I-I interaction. Typical characteristics
sensed are the physical properties, and their resolution in sensor space
and time. There is an interesting trade-off here between sensing and
knowledge that is nicely illustrated in figure IV.1.4, adapted from [RA]. In
a structured world knowledge is easy to reuse and sensing is not needed
to guide actions. In a dynamic world, such as a motor highway, sensing is
necessary.
Figure IV.1.4 Trade-off between sensing and using knowledge (labels:
dynamic and uncertain worlds; structured worlds; difficulty of sensing;
utility of world knowledge; one room indoor navigation).
The price of living by knowledge is additional demands on memory, but
sensing also comes with a price. Continuously inspecting the environment
is expensive. The world that people inhabit is definitely dynamic and
uncertain, which means that sensing is of great importance to us.
We are all disabled in certain situations, e.g. in the dark.
Apple of my eye
Perceiving is separating form from matter
Aristotle
(Figure IV.1.4, further labels: outdoor navigation; value of sensing;
possible to predict.)
A fundamental, and important, fact is that perception is a guess! It is as
close as we get to reality, but perceptions are often wrong because they are
guesses about objects and situations, made on
insufficient evidence. A moving shadow could be an eagle, and the bird
quickly hides, but it could be almost anything else that moves.
Output, or action, is realised by effectuators, such as a display, a muscular
arm, or some mechanical device. Like sensors, effectuators can also be
characterised by resolution.
"Pen bad. Keyboard good. Thought
transference really good.”
John Dobbin's matrix of input
devices
IV.1.3 Processing summarised
Processing is the internal workings of the interactor, in contrast to
perception that interprets external sensations, and to the actions that are
the result of processing. Processing needs an execution unit that is fed by
input data or information, i.e. memory or perceptions are also necessary.
The execution unit can be characterised by its architecture and capacity.
Our brain is one example of an execution unit, and one with a very special
architecture and a limited capacity. For the input channel the most
important attribute is the amount of data it can access per unit of time. A
processing unit needs a description of how to use the information and
programming, planning, reasoning, and learning are ways to generate this
description.
[Illustration: processing unit, input/output, and memory.]
Many interactors reason on cause and effect: They think to achieve goals,
form a plan, or to overcome some difficulty. A prerequisite for reasoning
is a context, e.g. a subset of the complete mental state.
A plan is a sequence of actions positioned in time to fulfil one or more
objectives. The action sequence is constrained by the resources
assigned, such as time, money, and states. Planning might also include an
optimisation problem if the best plan is not easily found. Reasoning is
once again useful when the individual steps of the plan are selected and
executed, and slight changes in the environment might necessitate
adjustments to the plan. The reasoning can be practical or theoretical,
where for instance evaluating the pros and cons of writing a book is an
example of practical reasoning directed towards actions. In theoretical
reasoning we rather reason about beliefs. If you believe that Sweden is the
best football team in the world, and they lose to Norway, you have some
theoretical reasoning to do.
All of this seems rather useless without a goal such as a pat on the back, a
smile, or reproduction, all very much social rewards. An objective, or goal,
is traditionally viewed as a mental construct, but with new technology it is
better thought of as an information structure. Goals are often
selected by reasoning, and together with their relationships they make up
the overall objective. Sometimes a goal can be reduced to subgoals, and
often goals can be described as a desirable change of system state. Finally,
both plans and goals have to somehow be represented whether they are
goals of a human, or of a thing.
Definition: Planning is the deliberate process of generating and
analysing alternative paths through a system state space, before
they are followed.
Definition: A goal is the state of affairs that (when achieved)
terminates behaviour intended to achieve it. It is the end that
justifies the means.
At this point it is time to introduce knowledge, the input to processing,
and the rules for it. Knowledge is a vital component of any system that is
supposed to exhibit intelligent behaviour, and yet it is difficult to define.
Without our collective knowledge and especially without the predictive power
of this knowledge, human societies would not work. Knowledge is equally
important to individuals, for instance knowledge on how to dress in cold
weather, the best wine to drink with an elk steak for Saturday dinner, and
how to behave when you get a temperature.
Knowledge has to be generated and here is where learning comes in.
Interactors operating in open, dynamic environments must adapt, and for
this they need to learn. Learning is also important for improving reasoning
and planning.
The somewhat abstract definition to the right describes learning.
Somehow the interactor gains knowledge and acquires skills, which
increases performance. In a social environment where individuals learn in
parallel and communicate the results to each other, we have a powerful
tool for progress. Chapters IV.8 to IV.11 will elaborate on reasoning,
planning, learning, and several other cognitive aspects for different
interactors. Figure IV.3.5 models some of the aspects.
[State diagram, see figure IV.3.5. Labels: Idle, New goal, Plan, Plan found,
Action, Finished, Goal achieved, Learn, Fail, Reason, Revised plan found.]
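A state description of this kind maps naturally onto a small state machine. The sketch below uses the labels of figure IV.3.5, but the wiring between the states is an assumption made for illustration (the original arrows are not reproduced here), and the names `TRANSITIONS` and `run` are hypothetical.

```python
# Hypothetical transition table: (state, event) -> next state.
# The labels come from the figure; how they connect is assumed.
TRANSITIONS = {
    ("Idle", "New goal"): "Plan",
    ("Plan", "Plan found"): "Action",
    ("Action", "Goal achieved"): "Learn",
    ("Action", "Fail"): "Reason",
    ("Reason", "Revised plan found"): "Action",
    ("Learn", "Finished"): "Idle",
}


def run(events, state="Idle"):
    """Feed a sequence of events to the machine and return the visited states."""
    visited = [state]
    for event in events:
        # An event that has no meaning in the current state is ignored.
        state = TRANSITIONS.get((state, event), state)
        visited.append(state)
    return visited


trace = run(["New goal", "Plan found", "Fail", "Revised plan found",
             "Goal achieved", "Finished"])
print(" -> ".join(trace))
```

In this reading a failed action triggers reasoning and a revised plan, while an achieved goal leads to learning before the interactor becomes idle again. Other wirings are equally possible, which is the point of "one (out of many) possible state descriptions".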
In the following chapters we will discuss representations and
characteristics of interactors and their modelling and implementation. We
will do that using the human interactor as a reference as often as possible.
So much to say, so little time, and only a few pages. We would like to
stress the monstrosity of the subject ahead. Almost every statement in the
following is worthy of its own book, and it is for instance impossible for
any human to fully comprehend all of the behaviours that together sum
up to what we call a human being. The enormous complexity of the
human brain and of human society surprisingly enough stabilises the
system, a good thing for survival. As a by-product the complexity adds
inertia to the system, which is perhaps not so good for survival.
Definition: Knowledge is information
in context, organised so that it can be
readily applied to solving problems,
perception and learning.
Definition: Adaptation is an act of
changing to fit different conditions.
Definition: Learning is to become
able to respond to task-demand or
an environmental pressure in a
different way as a result of earlier
response to the same task
(practice) or as a result of other
intervening relevant experience...
Figure IV.3.5 One (out of many)
possible state descriptions of
the relations between planning and
learning.
It seems that we do not understand
humans. Much is magic, like
creativity. Is it possible to design
and build something that cannot be
understood? Are computers better
at this? On the other hand, babies
are made without understanding
how we work ourselves!
IV.2 Human representations
What can you read from a human? There are actually many different
output channels available. Speech of course, but in general our entire
behaviour in any given situation. As technology evolves it will open up
new channels, both external and internal to the human body. External
information could be extracted from measurement devices worn close to
the body, such as a pulse meter that you always carry around with you.
Complementary internal information can be extracted from the neural
network, or other body systems at cell, or even at molecular, level.
The main physical features of a human, from the perspective of
representation, are eyes, mouth, fingers, hands, and body posture, all of
which are specified and built by nature. The face is roughly symmetrical,
probably because gravitation does not care for right or left, and the eyes
are directed forward, indicating that the human is a hunter rather than hunted.
Sensuality, integrity, power,
intelligence, attractiveness
[CC]
Human noise and voice are other significant representations, and each
human has its unique smell. How all of these representations affect other
humans is not fully understood, but new clues are found by researchers all
of the time. We now know, for instance, that we adapt more quickly to bad
smells than to good smells. On the other hand, we are more sensitive to
bad smells, presumably for evolutionary reasons.
applause, bite, boo, breath, burp,
cheer, chew, chomp, cough, crowd,
cry, drink, eat, fart, footsteps,
gargle, gasp, giggle, groan, grunt,
gulp, heartbeat, hiccup, kiss, laugh,
scream, sigh, slurp, sneeze, sniff,
snore, write, yawn, yell
The representations of humans are dynamic. Gestures and facial
expressions can mean different things depending on their timing, but
dynamics is even more important for the human voice.
The human being is an entity studied by other human beings; it is in other
words a social being, and neither good nor bad. A human is equipped
with a consciousness somehow integrated with the body and can have a
belief about things (intentionality), and discuss what it feels like to taste
chocolate (qualia). One remarkable feature is the freedom of choice, or in a
slightly more negative formulation, the necessity to choose how to act at
every moment. No human should be treated, or even regarded, as a tool,
but as an individual with a unique personality.
The human body is an engine consuming energy, and it is 100.000.000
times as heavy as a drop of rain, and 10.000.000 times as long as a virus.
Basic building blocks are atoms of carbon, nitrogen, and hydrogen. At
another level she is a system of interacting organs, brain, lungs, heart,
guts, that is sustained by food. On yet another level of abstraction she can
be described by behaviour, thinking, memory, perceptions, consciousness,
feelings, and emotions. As a species, humans weigh in total about as much
as 50 pyramids. An amazingly complex blueprint for a cell structure is
continuously developed through interactions in a social and physical
environment. The disposition of a person as a whole is not easily
attributed to any single reason. But, as with looks, at least some features
are inherited from parents.
Who, or what, is looking
out my eye?
The average human body contains
enough: iron to make a 3 inch nail,
sulfur to kill all fleas on an average
dog, carbon to make 900 pencils,
potassium to fire a toy cannon, fat to
make 7 bars of soap, phosphorous to
make 2,200 match heads, and water
to fill a ten-gallon tank.
Internet trivia
A human being should be able to change a diaper,
plan an invasion, butcher
a hog, conn a ship, design a building,
write a sonnet, balance accounts,
build a wall, set a bone, comfort the dying.
Take orders, give orders, co-operate,
act alone, solve equations, analyse a new problem,
pitch manure, program a computer,
cook a tasty meal, fight efficiently, die gallantly.
Specialisation is for insects.
Robert Heinlein
IV.3 How to recognise Information?
Information, featuring the idea, is the second participant in the HIT troika
and the most information packed one, characterized by repetition, change,
pattern, and surprise. How basic it is to life is best exemplified by the
genetic code. This code is copied, mutated, and interpreted, and gives rise
to all living creatures. Information does not die from old age, but it can be
deleted. It can also be modified, filtered, and replaced. Another
description is information as a correlation between two things or events
produced by a lawful process, exemplified by the fact that information about the
temperature is found by looking at a thermometer [SP]. The word
information itself originates from the Latin word informo that means to
educate, or to shape, something, but nowadays the popular everyday
usage of the term refers to facts and opinions obtained through life.
Without it we do not know anything. Not all information is however
knowledge, i.e. deeper insight, and certainly most information will not
qualify as wisdom, where morals and ethics are important. Information is
a necessary condition, but not sufficient in itself for knowledge and
wisdom.
A word often (mis-)used synonymously with information is data, plural of
datum. Strictly speaking data is shaped information, i.e. information that
is represented and coded, often digitally, in documents or databases. The
concept of data assumes that there is a reality outside of the human mind
or the machine, which cannot be directly captured, but can be indirectly
sensed, measured, and represented as data. An interesting analogy is
between data and information on the one hand, and sensation and
perception on the other.
Definition: Data is any symbol,
sign or measure in a form that
can be directly captured by a
person or a machine.
Definition: Information is data that
has a value depending on context.
In this chapter we will sometimes view data as a signal, which is broadly
defined as an interruption in a field of constant energy transfer, possible to
code or refer to, using spoken or written language. Defined this way a
signal is the most basic unit of communication.
The value of information depends on by whom, when, and where it is
used. Information at the right time can be valuable, as history shows, and
if the thermometer shows zero degrees Celsius a bushman would quickly
hide from the cold, while an Eskimo does not complain at all (maybe it is
too hot?). In the wrong context information on the other hand is just data.
The value of information is also affected by its reliability and completeness.
If the price is X98.99, this is not very valuable information, we would
really like to see that missing number; 298.9X on the other hand is all right.
Information as data, with a value, in context, is not enough to make
information an interesting participant in interactions. But, if we add
interpretation of information, as in the genetic code, information becomes
more interesting. If we add to this some capability to act on the
information, we have active information, possibly a rational agent. This is
a technological counterpart to a human and is exemplified by software
based mobile agents roaming through the Internet transmitting
themselves, their program, and their states between computers. A really
annoying example is the computer virus. Some of them are now called
worms; beware when they reach higher levels of consciousness.
IV.3.1 Shannon’s information theory
Published as "A Mathematical Theory of Communication", information
theory was formulated by Claude Shannon as late as 1948, quite late for
mathematics used today in practical applications. Actually Mr Shannon
did not intend to publish his findings, but he was urged to do it by his
fellow employees! He retired at the age of 50, a wise man indeed.
The theory defines information in terms of probability. A less probable event
carries more information and is more difficult to transmit. The most used
unit of information is the bit, which has two states, normally denoted 0 and 1.
This is strangely enough also the name and notation used for data
represented by a computer. One example of an information system is
flipping a coin. Head or tail are the only information units and the
information content of the system is low (1 bit) since both head and tail
have an equal, and not too high, probability. We can choose to represent
head with 1 and tail with 0, which makes it possible to describe any
event with one bit, i.e. as a 1 or a 0. A large number of improbable
messages could represent an enormous amount of information. One
example is a ballet.
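The link between probability and information can be made concrete with the standard formulas of information theory: an event with probability p carries -log2(p) bits, and the average over all outcomes is the entropy. The sketch below is an illustration only; the function names are chosen for this example.

```python
import math


def information_content(probability: float) -> float:
    """Information in bits carried by an event with the given probability."""
    return -math.log2(probability)


def entropy(probabilities) -> float:
    """Average information in bits per event over a set of outcomes."""
    return sum(p * information_content(p) for p in probabilities if p > 0)


print(information_content(0.5))   # one side of a fair coin: 1.0 bit
print(entropy([0.5, 0.5]))        # a fair coin flip: 1.0 bit on average
print(information_content(0.01))  # a rare event carries more bits
```

A fair coin gives exactly one bit per flip, while an event that happens one time in a hundred carries between six and seven bits.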
Armed with Shannon's model we can predict the amount of information
possible to send over a channel with a limited capacity, at least for simple
channel models. We also can predict the compression ratio of a video. The
bounds provided by information theory are fundamental, they are like the
speed of light, nothing that can be stretched or circumvented.
Order the following messages
according to their information
content:
I see a red car.
I see a yellow car.
I see a T-Ford.
I see a horse.
Inform.
theory
Using only probability as a basis for information is however not always a
good choice. It will not take into account the semantics or the aesthetics of
what is said, just the probability that it will be said. If you call the fire
station and tell them that it is your birthday it will be a highly unlikely
message with lots of information. Despite this the firemen will not be very
interested, they are only interested if the candles on the cake have set fire
to your house.
Some other problems with the model by Shannon are that many
expressions are ambiguous, they mean different things in different
situations, and that we use non-literal sentences such as "Nice weather,
huh?" when it snows for the fifth day in a row. The last problem of the
model, that we will discuss here, is that it fails to take higher levels of
noise into account. Trying to persuade a wolf that rabbits are cuddly is one
example where bias, prejudice, and cultural effects distort the message.
Another example is using complex terms, as in a technical jargon.
Ambiguous
expressions
IV.3.2 Representations of information, see the soul of I
Talking about representations of information is actually a tautology since
information is representation, and a representation is information.
Anyway, almost all information is analogue, i.e. continuous in amplitude
and time. But analogue signals are not very suitable for a computer, so
we have to represent the signals as sequences of 1s and 0s (binary
numbers). In computer storage a 1 or a 0 is referred to as a bit. Bits are
grouped in groups of eight bits, each group called a byte. Why eight?
[Illustration: an analogue signal and a discrete signal.]
Figure IV.3.2 1011 (binary) = 11 (decimal number representation
$10^1 + 10^0 = 11$).
Digital numbers can easily be represented in the computer as a string of
bits, each interpreted as a power of two. The decimal number 11 is binary
1011, which could be written as $1\cdot2^3 + 0\cdot2^2 + 1\cdot2^1 + 1\cdot2^0$, as in figure
IV.3.2 above. Representing floating-point numbers is more awkward, 0.5
is all right ($2^{-1}$), but how do you represent 0.3? Text in the computer is
coded, i.e. represented, as a sequence of ASCII codes, each a sequence of
bits. One example is the string "HG", with the binary ASCII representation
(1001000, 1000111) and the decimal ditto (72, 71). If we measure our
temperature and find it to be zero degrees we know we are in trouble. We
have several choices, either we represent the temperature by a single bit
0, or by a fixed length number such as 00000000, or as the ASCII character
0, in bits represented as 0110000. Computer based representation of
information is already a bit complicated.
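These representations are easy to try out. The following sketch uses only Python built-ins and the standard `decimal` module; the helper names `to_bits` and `ascii_bits` are made up for the example, and the 7-bit width follows the ASCII codes quoted in the text.

```python
from decimal import Decimal


def to_bits(number: int) -> str:
    """Binary representation of a non-negative integer, e.g. 11 -> '1011'."""
    return format(number, "b")


def ascii_bits(text: str) -> list:
    """7-bit ASCII code of each character, as strings of bits."""
    return [format(ord(character), "07b") for character in text]


bits = to_bits(11)
print(bits, int(bits, 2))

# Each bit weighs a power of two: 1*2^3 + 0*2^2 + 1*2^1 + 1*2^0.
print(sum(int(bit) * 2 ** power for power, bit in enumerate(reversed(bits))))

print([ord(character) for character in "HG"], ascii_bits("HG"))
print(ascii_bits("0"))  # the character zero, not the number zero

# 0.5 is a power of two and is stored exactly; 0.3 is only approximated.
print(Decimal(0.5) == Decimal("0.5"))
print(Decimal(0.3))
```

The last line prints the value that is actually stored for 0.3, which is close to, but not exactly, three tenths.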
RIP
1910-1990
Moving upwards in the representation hierarchy we come to image,
video, and sound that also have to be stored as digitised samples in the
computer. Analogue representations such as sound cassette tapes and
analogue records are less interesting with improved digital quality and
storage capacity. For video and photography analogue formats are still
important, but at least for photography the digital technology will soon be
both cheaper and better. At an even higher level we can represent a story,
law, constitution, and a weather forecast. What is the next higher level?
What is the highest level?
Not all information has a physical representation. A yellow coffee cup, a
book, and a television set have representations, but what about Master of
Science, subtraction, or reincarnation? You understood what we meant by
Master of Science without us showing you a typical specimen. We only
needed three printed words (17 characters) to represent this abstract idea.
The duality of information as both representing something and at the
same time being a physical entity suggests that our own complex mental
processes also can be represented physically. A house is represented by
the word "house", by voltage potentials in memory circuits, or by ink dots
on a paper. How do we humans represent it internally?
The temporal dimension is a special case. Time can be represented as a
time line where events are positioned, and from such a representation the
temporal relationships between events can be found. We can easily see if
an event is before, after, or overlaps another event, or time period. Like
any other value, time can be represented as binary, digital, or as ASCII.
Time can be pinpointed quite exactly as in 24.00 31/12 1999, or less precisely as
in a couple of hours, eons, or a generation. Some time values are cyclic,
7.00 AM happens every day, not only the day the alarm clock celebrates
it. The year 2000, on the other hand, will never again occur, unless we
change the calendar.
“The wheel of time turns and
ages come and pass, leaving
memories that become legend.
Legend fades to myth and
even myth is long forgotten
when the age that gave it birth
comes again”
Wheel of time, Robert Jordan
There are only a few basic ways that we can arrange data [RSW]. The most
important ones are by number, and by time as described above, but we
can also use magnitude, alphabet (from A to Z), category (similarity),
location, continuum or randomness, i.e. lack of organisation. Each of
these alternatives can be represented in many different ways. Magnitude,
for instance, can be represented by a measuring stick, distance, and
viewing angle, or by words such as in "bigger than".
Size
Light
Shape
IV.3.3 Painting, Image and Video
The creation of real world artefacts and paintings as representations of our
inner mental models must have been a great leap for humanity. It was the
first attempt towards a written language, taken at least 40.000 years ago.
Image synthesis has evolved quite a lot since the cave-graffiti, and our
external representations are now central to society. The engineering
community depends on graphical tools for synthesis, and so do designers
and artists. We are however still very limited by our tools, and at the same
time totally dependent on them. Is it for instance possible to create an oil
painting using a computer? Is it even a good idea if it can be done?
Image analysis is the other side of the coin. From the scatterings and
reflections of light around us we perceive the world. This light is, as all
physics, very much an analogue phenomenon, impossible to represent
exactly in the computer. An oil painting is quite a different representation
from the computerised image, see figure IV.3.3. The painting is analogue
(real) and the brushstrokes are three-dimensional structures modulated
such that the painting will look different due to lighting and viewing
direction. How do we represent this in a computer?
123 123 145 178 179 178 145 123 123 135 142 138 244 244 232
122 123 123 145 178 179 178 145 123 123 135 142 138 244 244
127 122 123 123 145 248 249 248 245 123 123 135 142 138 244
131 123 122 123 223 145 178 179 178 245 123 123 135 142 118
175 134 129 222 123 123 145 178 179 178 245 123 123 135 142
134 129 122 223 123 145 178 179 178 145 223 123 135 142 118
144 132 175 234 129 122 123 123 145 178 149 178 145 123 123
144 244 132 255 134 129 122 123 123 145 239 179 178 145 123
138 244 144 132 175 134 129 122 123 123 245 178 179 178 145
142 188 144 244 132 175 134 129 122 123 223 145 178 179 178
135 142 138 144 244 132 175 134 129 122 223 123 145 178 119
123 123 135 142 118 244 144 232 175 234 129 122 123 123 145
123 123 145 178 129 128 245 223 223 135 142 118 244 244 232
A major difference between computer-based images and real world
representations is that it is much easier to copy a computer-based image;
we can even do an exact copy. A digitised representation of a painting is
consequently possible to copy, but not the painting itself. This possibility
causes quite a lot of copyright problems, especially when a digitised copy
with lower quality, e.g. MP3, images, or movies on the Internet, satisfies
the presumptive customer.
If the number of humans that have lived so far is $10^{11}$ and they each
have seen 30 images per second, 16 hours a day, for 70 years, what is
the total number of images that have been seen?
[Owe this example to Professor Haibo Li at Umea University]
Figure IV.3.3 Computer image representation. What is this?
Red filled circle on black background. What is the perceived difference
if done in oil, water colour, or if it is a digital representation?
A digital image is represented as an array of pixels in the computer. Each
pixel is stored as a number of bits, usually 8, 16 or 24, see figure below.
The bit value of a pixel is mapped to intensity or to a colour and the
resulting image is called a bit mapped image.
Information is free if bailed out.
/HG
Figure IV.3.4 Bit mapped image (labels: image, pixel, bit map table,
colour table mapping pixel value to colour).
A nuisance with images is the large amount of data needed to represent
them. This is one of the problems attacked by the important and extensive
standardisation for image coding. The two formats most used are GIF and
JPEG, where GIF is mostly used for computer graphics and JPEG for
compressed images. The problem is aggravated for video by a factor of 25
each second. A rate of 25 frames per second is fast enough for a human
to perceive fluent motion of objects in the video rather than individual
images. For a resolution of 640*480 pixels, a typical resolution for digital
television and the standards MPEG-2 and MPEG-4, a calculation of the
memory demand gives 307,200 bytes per frame, even if each pixel is
represented by only one byte. A video with this resolution and 25 frames
per second will need 7,680,000 bytes per second, or 61,440,000 bits per
second. These impressive numbers present us with quite a problem!
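The memory demand for uncompressed video can be checked with a short calculation. The sketch below simply multiplies out the figures used in the text (640*480 pixels, one byte per pixel, 25 frames per second); the variable names are chosen for the example.

```python
WIDTH, HEIGHT = 640, 480   # pixels
BYTES_PER_PIXEL = 1        # one byte per pixel, as assumed in the text
FRAMES_PER_SECOND = 25

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = bytes_per_frame * FRAMES_PER_SECOND
bits_per_second = bytes_per_second * 8

print(bytes_per_frame)   # 307200
print(bytes_per_second)  # 7680000
print(bits_per_second)   # 61440000
```

At 24 bits per pixel the same video would need three times as much, which is why compression is unavoidable in practice.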
Figure IV.3.5 Pixel correlation in a picture. (Axes: pixel correlation against pixel distance, from -5 to 5.)
The figure above shows an excerpt from an image of a bird. The point we
want to make is that the correlation between two pixels drops off quickly
with the distance between pixels. Only six or seven pixels away almost all
information from the reference pixel is forgotten. This behaviour means
that the information in an image is quite close to noise, which in turn
affects the way we process the image. Image compression standards such
as MPEG-2 consequently manipulate the image in blocks of eight by eight
pixels. There is on average nothing to gain by using larger blocks because
of the low pixel correlation.
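How such a correlation curve can be computed is sketched below. The function `pixel_correlation` is a plain Pearson correlation between pixels a given distance apart in one row; the 15 values in `ROW` are the first row of numbers shown in Figure IV.3.3. A single short row is far too little data for a reliable curve, so this is an illustration of the method only, not a reproduction of the figure.

```python
# First row of pixel values from Figure IV.3.3 (illustration only).
ROW = [123, 123, 135, 142, 118, 244, 144, 232, 175, 234, 129, 122, 123, 123, 145]


def pixel_correlation(row, distance):
    """Pearson correlation between pixels `distance` apart in one image row."""
    a = row[:len(row) - distance]
    b = row[distance:]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


for d in range(0, 6):
    print(d, round(pixel_correlation(ROW, d), 2))
```

For a real image the same calculation would be averaged over many rows and columns before drawing conclusions about block sizes.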
How many images can you
represent with 500 Kbit? If each
pixel in the image is 8 bits, how
many pixels are there in the
image?
A graphical object O of the
Euclidean space R^n consists of
a subset U ⊆ R^n
and a function f: U → R^p
(U defines the shape of the
graphical object and the
function f defines the attribute
space)
Colours in an image can be represented in many different ways, some
more adapted to human physiology than others. One way is to use the
RGB (Red, Green, Blue) colour space, which is chosen to match the human
colour perception. Another alternative is to specify a colour by its hue,
saturation and luminance (HSV). In this representation the RGB cube is
viewed along the greyscale axis. Luminance describes lightness, and
saturation is the purity of a colour. A highly saturated hue has an intense
colour, and with no saturation at all the hue becomes a shade of grey with
the specified luminance.
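The relation between the two colour spaces can be tried out with the `colorsys` module in the Python standard library. Note that `colorsys` uses the name value (the V in HSV) for what the text calls luminance. The helper `describe` and the chosen sample colours are assumptions made for this sketch.

```python
import colorsys


def describe(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, value)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360), round(s, 2), round(v, 2)


print(describe(255, 0, 0))      # pure red: fully saturated
print(describe(0, 0, 255))      # pure blue: hue 240 degrees
print(describe(128, 128, 128))  # mid grey: no saturation at all
```

The grey example shows the point made above: with zero saturation the hue no longer matters, and only the lightness remains.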
(Figure: the RGB and HSV colour spaces.)
How we see colours depends on what we are used to seeing and on the
colours of neighbouring patches. Colour dots on a painting merge
additively when viewed from a distance, such that blue and yellow dots will
appear grey rather than green, which would be the result if we just mixed
the pigments together into a new colour. CMYK (Cyan, Magenta, Yellow
and blacK) is a colour space for printing that acknowledges this fact.
Colour and shades of grey can be represented on screens and paper in
many different ways. Dithering is one example, where a pattern of screen
pixels or printer dots simulates colour or grey levels, see figure to the right
where Mona Lisa is built up by Mona Lisas in three layers. Dithering
works because human vision has a limited resolution and blends
impressions to make sense of them.
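One simple way to dither is ordered dithering, sketched below with a 2x2 threshold pattern (a small Bayer matrix). This is just one of many dithering techniques and is not necessarily the one used for the Mona Lisa figure; the function name `dither` and the test image are assumptions for the example.

```python
# 2x2 Bayer matrix; each cell gives the order in which dots are switched on.
BAYER_2X2 = [[0, 2],
             [3, 1]]


def dither(grey):
    """grey: rows of intensities in 0.0..1.0; returns rows of 0/1 dots."""
    out = []
    for y, row in enumerate(grey):
        out_row = []
        for x, value in enumerate(row):
            # Thresholds 0.125, 0.625, 0.875, 0.375 repeat over the image.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) / 4
            out_row.append(1 if value > threshold else 0)
        out.append(out_row)
    return out


flat_grey = [[0.5] * 4 for _ in range(4)]
for row in dither(flat_grey):
    print(row)
```

A flat 50% grey turns into a checkerboard with half of the dots on, which the eye blends back into grey when the dots are small enough.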
About five hundred thousand people in the United States claim that they
are artists [MC2]. If each of them makes one piece of art, each year, this
would amount to fifteen million artworks per generation. You have to do
something pretty amazing to get noticed.
Mona Lisa by
Adam Finkelstein
IV.3.4 Text
Written text is not much data, but can still represent a lot of information. A
typical data screen filled with text consists of a couple of thousand bytes,
and a typical novel is less than a megabyte of words. Dracula by Bram
Stoker is approximately 900 Kbyte, and Hamlet about 200 Kbyte. Printed
text on paper is voluminous, but still not much data. Handwritten text is
even more voluminous, except for psalm verses written on the backs of
stamps. A not so very important fact is that about one hundred thousand
new books are published every year in the United States [MC2].
The Library of Congress holds
more than 25 million books.
At the lowest level text is composed of visual features.
Letters are the next level; they are combinations of the primitive visual
features, and are themselves arranged into words. Words also have an outline
that for short words simplifies reading. It is for instance difficult to
manually find the spelling error where and is misspelled as anl. Short
words such as and, a, and the are efficient. They are used frequently,
contain little information and are consequently, for efficiency, short.
Words such as phantasmagoria, serendipity, and flabbergasted are longer,
and the few times they are used they convey a lot of information.
Words are grouped into different categories of phrases. Noun phrases
(NP) and Verb phrases (VP) are two examples that can be combined into a
sentence (S). We could formulate this as S: NP VP, and in a real sentence
as "The actor" (NP) "is dead" (VP). But a text is usually something more than a
collection of random sentences. There is an idea behind it that binds the
sentences together. This could be a simple causal relation, or a common
theme. A second sentence can elaborate on the first, or explain it.
The development of writing also
led to the discovery of the representational structure of speech.
Merlin Donald
Something written is an external representation of thoughts and spoken
language. By using this representation we simplify rational and scientific
thinking. If the text is properly stored in the computer it can be traversed,
searched, and grouped in different ways according to the application. This
is a task suitable for technology, but how can we make technology
understand the meaning of what is written? One step towards this
objective is to represent meaning itself, for instance using a markup
language such as RDF used on the Internet.
IV.3.5 Sound and music
Sound is very important for human communication, partly because it
carries speech, which in turn carries language, and through language, culture.
Even though the demands for data storage and processing of sound are
less than for images, sound is still a challenge to technology. An interesting
observation here is that there appears to be an inverse relationship
between the simplicity of the representation of a stimulus and how easy it
is to perceive. We for instance easily identify the sound of a smashed
glass, which is extremely complicated and very difficult to describe by
mathematics and to store efficiently.
There is no such thing as a sound at a specific time. Sounds exist because of
differences over time, and this of course affects how sound can be stored
and manipulated. Even if data from sound is less voluminous than data
from images there is still a lot of data involved. When using 16 bits per
sample, a sample frequency of 44.1 kHz, and stereo we need 1,411,200 bits
per second (CD quality). By sample frequency we mean the number of
times per second we check the signal value. Sending or storing
1411 kilobit per second is quite an assignment, so one important question is
whether, and how, this figure can be reduced. Compression of sounds
reuses many techniques from image processing, and some of them
actually emanate from speech processing.
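The CD-quality bit rate follows directly from the three parameters mentioned above. A minimal sketch of the arithmetic, with variable names chosen for the example:

```python
BITS_PER_SAMPLE = 16
SAMPLE_RATE_HZ = 44_100  # 44.1 kHz
CHANNELS = 2             # stereo

bits_per_second = BITS_PER_SAMPLE * SAMPLE_RATE_HZ * CHANNELS
bytes_per_minute = bits_per_second * 60 // 8

print(bits_per_second)   # 1411200
print(bytes_per_minute)  # 10584000, i.e. roughly 10 Mbyte per minute
```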
Speech and sound are both signals that need to be sampled from the
physical environment. Their representation is simplified by the fact that
human hearing is limited. The highest perceived frequency is about 20
kHz, which means that a sampling frequency slightly above 40 kHz is
sufficient. This explains why the sample frequency for a CD is chosen to be
44.1 kHz. Our ability to differentiate sound levels is also limited; 16 bits, or
65536 levels, are enough for almost any application, and 8 bits, i.e. 256
levels, are sufficient to comprehend speech and are used for example in the
telephone system.
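Both limits can be expressed as one-line formulas: the sampling frequency must be at least twice the highest frequency to be represented, and n bits give 2^n distinct levels. The function names in this sketch are chosen for the example.

```python
def minimum_sampling_rate(highest_frequency_hz):
    """Sampling theorem: sample at least twice the highest frequency."""
    return 2 * highest_frequency_hz


def quantisation_levels(bits):
    """Number of distinct signal levels that `bits` bits can represent."""
    return 2 ** bits


print(minimum_sampling_rate(20_000))  # 40000 Hz, just below the CD rate of 44100 Hz
print(quantisation_levels(16))        # 65536
print(quantisation_levels(8))         # 256
```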
dokidoki indicates the beating of a
heart in Japanese.
Frequency response of a
cymbal crash
Sound pulses separated by more
than 3 seconds can no longer be
grouped into pairs.
quack
Music is represented as sheets of music where individual notes are
grouped to motifs, which are grouped to movements, which are grouped
to pieces. A few sheets of music can keep an orchestra working hard for
several days to find the right interpretation. If successful, the orchestra
creates auditory scenes, where for instance birds can be represented,
emotions can be felt, and with rhythms that almost force us to move along.
Give meaning to noise, sound
becomes communication.
Daniel Sonnenschein
The rest is silence.
Hamlet
IV.3.6 Speech
At the lowest level of representation of speech we have the phonemes. In
English there are about 42 phonemes, roughly corresponding to the letters
in the alphabet [JD]. The phonemes by themselves form a phonetic
alphabet that can be given symbolic representations. The table below
shows some of the phonemes in use, along with their symbols in the
ARPAbet from the United States Advanced Research Projects Agency.
Symbol    Example
i         heed
A         mud
p         pea
r         race
“… writing is not a
language, but merely a way
of recording language by
visible marks .”
Bloomfield 1933
Table IV.3.1 Phonemes in use
and their symbols.
This seems all very organised and orderly, but the problem is that there is
no one-to-one relationship between a phoneme and its physical shape, i.e.
its sound. How we pronounce a phoneme, i.e. the prosody, or rhythm and
melody, of speech, depends heavily on context. A higher level of
representation is words, but like phonemes, words are also pronounced
differently depending on where and why they appear. Another problem,
sometimes called the segmentation problem in speech recognition, is that
it is difficult to find the pauses at word boundaries when we speak, i.e.
speech is much more continuous than we perceive it to be.
For speech the sound generation system is known, it is the human.
Knowledge about this system gives much information that can be used
for speech analysis, compression, and synthesis. We also
know that speech is mainly used for speaking, hmmm…. It therefore is
constrained by what you want to achieve when you speak. One example is
that you rarely go from a very low amplitude to a very high amplitude.
This is not true for all other types of sounds. Another useful constraint is
that speech is not continuous (although most of us know exceptions). Talk
spurts are interleaved by periods of silence.
Speech is a volatile medium with some inherent limitations. It is for
instance difficult to describe spatial information using speech, and more
than one speaker at the same time is a bad idea. Using speech and its
constraints to extract meaning and intentions is still a challenging task
best, if not only, done by humans, and even we can only make an
approximation of what is really behind the words.
“Man knows that there are in the soul
tints more bewildering, more numberless, and more nameless than the
colours of the autumn forest;… Yet he
seriously believes that these things can
every one of them, in all their tones
and semitones, in all their blends and
unions, be accurately represented by
an arbitrary system of grunts and
squeals. He believes that an ordinary
civilized stockbroker can really
produce out of his own inside noises
which denote all the mysteries of
memory and all the agonies of
desire.”
G. K. Chesterton [SP]
(keep complexity in mind)
What is the sound an angry
Viking makes?
An image says more than a
thousand words, but demands
1000 times more power, 1000
times higher bit rate and 1000
times more expensive equipment.
Internet
talking LOUD!
Runa next to Gullik behind
Tova
Anna and Hakan
leaving the room
IV.4 The Thing outside in
"I'm sorry Dave,
I can't let you do that."
"I know you and Frank were
planning to disconnect me...
and I'm afraid that's something
I cannot allow to happen."
"I enjoy working with people."
"Will I dream?"
HAL 9000, 2001: A Space Odyssey
Some four billion years ago life struck earth for the first time. Humanity
and all other living beings are strict descendants of these ancient bacteria,
built by DNA. Everyone is related! According to some probability
calculations life should appear every 500 million years, but so far we have
not seen any new forms of life. Someday though, new life will appear, and
probably wipe out or assimilate mankind. Could it be that the computer-based thing will evolve into this new form of life?
The thing is something that you can touch, it is real, and it can be pushed
and kicked. It is more interesting if it is smart, a bit like us, but an ordinary
table is also a thing. The table even has legs! A human being can also be
kicked, so it is a thing, but in this book we have our own interactor. An
animal could also be considered a thing, but we will ignore that line of
thinking here, and stick to designed physical objects. Words with similar,
or overlapping meaning to thing, are object, artefact, device,
automaton, robot and machine.
“Since man is a child of God and
technology is a child of man I
think that God regards technology
the way a grandfather regards his
grandchildren”
Roberto Busa
According to the theory of
thermodynamics the information of a piece of material
is of the order 10^24 bits.
(information in a book about
10^6 bits)
Since the thing is a matter of matter its external representations are
physical. Examples of attributes are material and surface properties,
colour, shape, and moving mechanical arms and hands with many
degrees of freedom. The appearance of a thing is fixed, at least compared
to the appearance of information. It is for instance difficult to scale a thing
a factor of two, or to reconfigure the constituent parts to better support a
given task. It is also impossible to physically reach into an advanced
computerised thing and modify its internal structure; a command is only a
suggestion! External representations of things often can move which is in
accordance with human representations. They can also be heard and
smelled.
A thing is designed, and to specify all of its attributes is a major issue. The
specification must be done with respect to time and money at disposal,
situation of use, user, security, and many other aspects. The cost of
manufacturing a thing is highly related to the number of things produced.
This tends to make things homogenous and generally applicable over
large markets. Since price is important, features such as adaptability and
additional sensors are not added if they do not directly support the
specified task. General applicability and low adaptability result in low
sensitivity to context.
Internally we build things up by hierarchical, and layered structures of
electronics and mechanics. At least one central processing unit, memory,
perceptual system, and one motor are needed for an autonomous, mobile
thing. For this book we will assume that a thing is computerised and
networked. This means that it mediates the virtual and the physical
world, and this is a very important feat. A thing will be able to perceive
the same things as a human does, while at the same time in theory having
access to an almost unlimited memory. While the network provides for
mobility in the virtual world, the thing still has the problem of physical
mobility. This is important because if the thing is supposed to learn about
the physical world it needs access to as many aspects of it as possible.
Will and should things ever look and behave as humans? This is an
ongoing debate in the research community and the main supporting
argument is that things should look like humans because then we could
reuse a wealth of social behaviours. Co-operation and communication
would be effortless. The main argument against is that we only fool
ourselves. There is no way we can build a thing with such functionality.
The result will only be frustrated human interactors unable to make
themselves understood. One requirement for believability is unique
reactions to many types of stimuli, implying a rich personality, emotions,
and self-motivation.
A physical entity is countable,
observable, and existent at
some point in time. It has a mass
and a volume.
Computers are not intelligent.
They only think they are.
Internet
Imagine two computer children playing outside. Will they ever play hide
and seek? Will a mother computer read the same story every night to her
computer baby? Will a computer baby skip dinner if the wrong dish is
served? How will a computer baby draw its family? These are examples
that make you think about the difference between a computer and a
human. Will there ever be a surrealistic computer artist?
Inevitably things will be more socially competent as they learn about
context and how to adapt to it, and as we learn about how to equip the
thing for adaptation. Managing social space is however not easy. One
reason for this is that social spaces cannot be directly seen; they need to be
discovered and formed through interaction. This is an active, generative
process of observation and action that is inherently dependent on a specific
context. Hopefully we will not be able to build things that can do as
terrifying and horrible things as humans have done.
IV.5 Sensing it
Interactors need to observe and understand their context directly through
sensors or indirectly through representations provided by other
interactors. Without this information we cannot interact! For indirect
observation information processing is vital, including for instance visual
realism that will be discussed in the next chapter. In this chapter we will
focus on the sensors available.
Humans observe, i.e. sense, the world through receptor organs organised
as five senses: vision, touch, hearing, taste, and olfaction (smell). Claims have
been made for a sixth sense, but so far there has been no evidence for
telepathy, or other even more suspect human abilities.
The modality, or channel, used is one aspect of sensing. The fidelity of a
sensor is another. It includes the level of detail, i.e. the precision, and the
accuracy, i.e. to what extent the information can be trusted. How do we for
instance conclude that someone is happy and her level of happiness?
Can you tell the difference
between the touch of a loved
one and the touch of a
stranger? Is it possible to
learn how to interpret a
touch?
The raw information input stream from all of our senses is quite
impressive. One calculation sums to about 11 million bits per second
approximately distributed as in the table below [TN].
Sense      Information stream (bit/s)
Vision     10,000,000
Hearing    100,000
Touch      1,000,000
Smell      100,000
Taste      1,000
Table IV.5.1 Bit rates (information content) for human senses [MZ2].
The amount of information that we can consciously handle is much, much
less. About a hundred thousand times less, see table below! Notice how
the weight of the information from hearing has been increased.
Sense              Bandwidth of consciousness (bit/s)
Vision             40
Hearing            30
Touch              5
Taste and smell    1
Table IV.5.2 Bit rates for human consciousness [MZ2].
The table should not be used to conclude that television could be shown
using only 40 bits per second. Humans move their attention and select the
interesting information. More information to choose from enhances the
experience. After focusing, and extracting interesting information, only 11
very personal bit/s are left to be stored and processed.
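The relation between the two tables can be checked with a short calculation. The values below are copied from Table IV.5.1 and Table IV.5.2; the dictionary names are chosen for this sketch.

```python
# Bit rates from Table IV.5.1 (raw sensory input), in bit/s.
SENSORY_INPUT = {
    "vision": 10_000_000,
    "hearing": 100_000,
    "touch": 1_000_000,
    "smell": 100_000,
    "taste": 1_000,
}

# Bit rates from Table IV.5.2 (bandwidth of consciousness), in bit/s.
CONSCIOUS = {
    "vision": 40,
    "hearing": 30,
    "touch": 5,
    "taste and smell": 1,
}

total_input = sum(SENSORY_INPUT.values())
total_conscious = sum(CONSCIOUS.values())

print(total_input)                           # 11201000, about 11 million bit/s
print(total_conscious)                       # 76 bit/s
print(round(total_input / total_conscious))  # reduction factor, above 100000

# Share of hearing before and after the reduction.
print(round(SENSORY_INPUT["hearing"] / total_input, 3))
print(round(CONSCIOUS["hearing"] / total_conscious, 3))
```

The last two lines show how the weight of hearing grows from about one percent of the raw input to well over a third of what reaches consciousness.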
As for output, ordinary speech supports values of the same magnitude.
Reading one page (2400 characters) aloud, in a radio show, takes about
two and a half minutes, i.e. approximately 16 characters per second. One
character is about 2 bits of information, which gives roughly 32 bit/s.
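The output rate of speech follows from the figures just given. A minimal sketch of the arithmetic, with names chosen for the example:

```python
CHARACTERS_PER_PAGE = 2400
SECONDS_PER_PAGE = 150      # two and a half minutes
BITS_PER_CHARACTER = 2      # approximate information content, as stated in the text

characters_per_second = CHARACTERS_PER_PAGE / SECONDS_PER_PAGE
bits_per_second = characters_per_second * BITS_PER_CHARACTER

print(characters_per_second)  # 16.0
print(bits_per_second)        # 32.0
```

The result is of the same order as the bandwidth of consciousness listed for hearing.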
So much for human sensing, but what does the world look like to an
interactor of type information? Its goal is to manipulate some data
structure, and to do this it needs information, also found in data
structures. Access to data can be direct if it is stored locally, otherwise data
communication is needed. The world of the interactor is discrete, dynamic,
and since it is a sampled version of the real world, usually stochastic.
Context dependency is still an issue, and data without context, e.g. 42, is
no information. If the data structures involved are static the world is fully
deterministic, and with ample computational resources an agent in such a
world is omnipotent, and will never lose a first-person shooter computer game.
(Figure: sensory compression.)
Human skin covers about 2 square metres
and weighs 3 to 5 kg.
Sensing is extremely important since it is the basis for adaptive behaviour.
The more unpredictable the operating conditions are, the more important,
and difficult sensing is. A slightly more detailed discussion on the physics
behind the interactions is postponed to the chapter on T-T interaction, see
Chapter V.4. A thing can perceive:
• Through databases that can be local or distributed. Examples are
  address books, profiles, or the Internet.
• Through input to applications run by the thing. Input can be
  given by humans, other interactors, or by active environments.
• Through sensors, which are discussed next.
Figure IV.5.1 below shows the principle for how a computer-based
thing senses and acts in physical reality. The sensor converts a physical
variable to an analogue voltage. This voltage is converted to a digital
representation using an A/D converter and back again to an analogue
signal with the help of a D/A converter, the D/A block in the figure. The energy
supplied through this signal is transduced and possibly amplified to some
physical variable. Even if sensors are important, effectuators (actuators)
are what make the difference in the outside world. Cause and effect is a
natural law, not easy to bypass. Some examples of effectuators are the
loudspeaker, video screen, propeller, and the electric train.
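The conversion chain from analogue voltage to digital code and back can be sketched as follows. The reference voltage of 5 V, the 8-bit resolution and the function names `adc` and `dac` are assumptions for the example; real converters differ in many details.

```python
def adc(voltage, v_ref=5.0, bits=8):
    """A/D conversion: quantise a voltage in 0..v_ref to an integer code."""
    levels = 2 ** bits
    clamped = min(max(voltage, 0.0), v_ref)
    return min(int(clamped / v_ref * levels), levels - 1)


def dac(code, v_ref=5.0, bits=8):
    """D/A conversion: turn an integer code back into a voltage."""
    return code / (2 ** bits) * v_ref


sensor_voltage = 1.234               # hypothetical sensor output in volts
code = adc(sensor_voltage)           # what the thing processes internally
output_voltage = dac(code)           # what drives the effectuator

print(code, round(output_voltage, 4))
print(abs(output_voltage - sensor_voltage) < 5.0 / 256)  # error below one step
```

The round trip never restores the original voltage exactly; the quantisation error is bounded by one step, here 5 V divided by 256.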
Definition: A sensor is a device
that receives a signal or stimulus
and responds with an electrical
signal.
Jacob Fraden
Figure IV.5.1 From sensing
to action.
Stimulus          Sensor              Effectuator
Sound             Microphone          Loud speaker
Visible light     CCD, CMOS sensor    Photo diode
Infrared light    CCD, CMOS sensor    Any body at a temperature above zero Kelvin
Touch             Switch              Moving, pushing object
Force             Strain gauge        Spring, string, motor
Proximity         Hall sensor         Magnet
Temperature       Thermometer         Heater
Time              Clock               Sleeping pill??
Table IV.5.3 shows some common physical sensors and effectuators.
Most of the signals in the table can be sensed in many different ways, with
many technologies from electronics, physics, biology, or chemistry.
Electronics is the branch of physics responsible for the hardware side of
the computer revolution, and physics is the basis for electronics. Physics
describes properties of things such as why some materials are conductors
and others are not, and how a photo detector, transistor, or a photo diode
works. This is essential knowledge for understanding information
technology and especially its limitations. Once again mathematics is the
main modelling tool. Quantities that can be measured but are not listed in the
table include frequency, wavelength, torque, acceleration, position, humidity,
pH, revolutions per second and many others.
A sensor senses by either generating an electric field, or a current, or by
changing its resistivity. The change in resistivity depends on material
characteristics and can be used to modulate a current. The resulting
current or voltage usually has to be amplified before it can be digitised.
Note that humans have built-in sensors!
Table IV.5.3 Some examples of
sensors and effectuators (also
known as actuators) along with
their physical stimulus.
“We are no longer creatures of five
senses: technology has given us
hundreds of senses. We can see the
universe throughout the electromagnetic spectrum. We can hear the
vibrations, from the infrasound of the
seismologist to the ultrasonics used in
destructive testing. We can feel
molecular forces. We can sense the
age of ancient objects.”
Myron Krueger, Artificial reality II,
1991
The choice of a sensor in a particular situation depends on factors such as
price, precision, input signal range, reaction speed, output range,
sensitivity (output range/input range), noise sensitivity, stability, or
simply personal preferences. For mobile applications portability, power
consumption, size, weight, calibration, and set-up time are also
important. To this we can add constraints from design. The sensor device
should not force changes to the appearance of the product, and there are
also environmental constraints. For some applications we need more than
one sensor. If we for instance want to monitor a room we can use one or
two cameras in a distributed sensor network. Multiple sensors can also be
chosen to increase the reliability of the system.
IV.5.1 Which sense is the most fundamental?
Hearing and vision provide us with the most information, but touch is
our oldest sense with a close coupling to the deeper, faster parts of our
brain. Touch is implemented by skin that is our primary physical interface
with the real world; even the eardrum is skin. While many blind people
learn to have a prosperous life, a person who has completely lost skin
sensitivity will not. They cannot move around without risking
inadvertently hurting themselves, and they have difficulty standing and
walking.
Taste is a social sense, and of course also necessary. Even though we only
have five tastes (salt, sour, bitter, sweet, umami), a dinner might well be
the highlight of the week.
Would you rather be deaf or blind?
(Right (?) answer is that deafness
means a higher degree of isolation)
CLUNK
All senses support well-being and give pleasure. By selecting the optimal
input for a particular sense we could even use it as a drug. But, how do we
find this optimal input? One clue is to look back through human evolution
and search for positive inputs that can be extracted and concentrated.
JUMMI
with the
taste of
UMAMI
Smell has sensors that detect combinations of seven basic smells: minty
(peppermint), floral (roses), ethereal (pears), musky (musk), resinous
(camphor), foul (rotten eggs), and acrid (vinegar) [DA]. Only eight
molecules are needed to trigger a nerve impulse, but forty nerves need to
concurrently react before a smell is detected.
Cover your eyes and you stop
seeing, cover your ears and
you will stop hearing,
But if you cover your nose and
stop smelling you will die.
Diane Ackerman
It is difficult to describe a smell. How does Jolt Cola smell? Still, when we
smell it we can say "Oh boy, Jolt Cola".
What is the sound of a blink?
IV.5.2 Neural pathways
Current knowledge about the neural organisation of the senses suggests
that they are organised in pathways. Sensory input follows a path from
the receptor, via the thalamus, to the cortex where most of the processing is
done, see figure below. The thalamus and the cortex are both parts of the
brain; the thalamus lies in the centre, and the cortex is the layered exterior (grey).
Figure IV.5.2 Neural pathway, transmitting and transforming sensory information. (Figure labels: Optic nerve, Thalamus, Cortex.)
There are cross connections between the senses. We can for instance see
a Morse signal. Short, short, short, long, long, long, or we can hear it
given - - - . . . - - - . One of the perceptions will do. Try to recall the
sound of chewing a carrot. Another example is that our inner ear detects
head motion and feeds this signal forward to the vision system. This way
we can quickly compensate for head movements. In other cases vision
supports hearing. It is for instance difficult to decide whether a sound
originates in front or behind us in a room. This decision is left to vision.
The alternative? Three ears, one on top of the skull? A bit of a nuisance
when you wash your hair.
It’s am-s-ng h-w m-ny l-ttrs we c-n r-m-v-d
All perceptions are individual and contribute to a personal history and
representation of the world (which will make twins less alike for each
second). Perception is also an active internal process. Results from
research in the neuro-physiology of perception show that there are far
more nerve paths going from the cortex of the brain to the lateral geniculate
body than there are paths coming directly from the eye. The lateral geniculate
body is a region in the thalamus, see figure below, which acts as an
intermediary between the optic nerve and the visual cortex. The feedback
loop created indicates that perception in humans is an active process,
where the perceiver is involved.
“ The senses do not give us a
picture of the world directly;
rather they provide evidence for
checking hypotheses about what
lies before us. Indeed, we may say
that a perceived object is a
hypothesis, suggested and tested
by sensory data.”
D. Drascic and P. Milgram
Figure IV.5.3 Feedback where
sensory information is
modulated by previous
knowledge.
In other words, what an individual sees is in fact both information that
she retrieves from the brain, and information from the eye. One example is
reafference, which links the movement of the beholder and that of the
perception. By re-injecting what should be seen when moving, the self-movement is cancelled out. When this is done the movement of the
perceived object can be calculated. Another example is when you slow
down from driving 100+ km/h, and without looking at the speedometer
turn right. You will be really surprised by the screaming tires, because low
speed coming from 100 km/h is not the same as low speed coming from
40 km/h.
Perceptions can also be associated with stimuli from within an interactor.
This type of perception is called proprioception and one example is that
you can tell how your hand is turned even if you hide it under a table (try
it).
IV.5.3 Internet data pathway
Media signal processing, such as decompressing the mp3 file The
Gambler by Kenny Rogers, is by no means the only task where
information is processed by technology. On its way to or from the user
over the Internet, data is processed, stored, and presented by selected sets
of applications. The applications are needed to manage access control to
web resources, and to provide easy structuring of basic functionality. Take
as an example a user who wants to automatically upload a photograph
to her web site. The photograph is by default stored as JPEG in a suitable
folder, close to other pictures. It should be compressed to use less than 25
kB of memory, and be automatically labelled by its motif since the user
does not want to add this information herself. All of the functionality
needed should be applied to the data stream on its way from the camera
to the web server.
Psst ..
Figure IV.5.4 Information in cyberspace is collected, transformed and stored. (Figure labels: Sensors; IR, Radio…; Access point; different processing possible.)
Another example where execution units are chained together is a weather
service for a mobile phone. The phone asks the operator for the cell
location, and gets GPS position data in return. This data is locally
transformed in the device into zip code data. The zip code is sent to an
Internet service provider who returns the current weather conditions for
that area.
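The chain of execution units in the weather example can be sketched as three functions applied one after the other. Everything concrete here is hypothetical: the cell identifier, the coordinates, the zip code, the weather text and the function names only stand in for the operator, the local transformation in the device, and the Internet service provider.

```python
# Hypothetical lookup tables standing in for the three services in the chain.
CELL_TO_POSITION = {"cell-17": (63.82, 20.26)}
POSITION_TO_ZIP = {(63.82, 20.26): "90187"}
ZIP_TO_WEATHER = {"90187": "light snow, -4 C"}


def ask_operator(cell_id):
    """Operator: cell location -> position data."""
    return CELL_TO_POSITION[cell_id]


def position_to_zip(position):
    """Device: local transformation from position to zip code."""
    return POSITION_TO_ZIP[position]


def ask_weather_service(zip_code):
    """Internet service provider: zip code -> current weather."""
    return ZIP_TO_WEATHER[zip_code]


def weather_for(cell_id):
    """Chain the three execution units together."""
    position = ask_operator(cell_id)
    zip_code = position_to_zip(position)
    return ask_weather_service(zip_code)


print(weather_for("cell-17"))
```

Each unit only needs to understand the output of the previous one, which is what makes it possible to replace or relocate a unit without changing the others.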
IV.6 Acting out
We will focus on two forms of interactor actions, expressing itself, and
moving around. In general there is no end to the number of intentions and
their corresponding physical and virtual actions, and we could also add
innumerable actions done without intention. The actions chosen here are
however two of the most basic and we will later, in Chapter V.15, return to
the subject and give more examples of actions, there in the context of
command based interaction.
Human consciousness has a very limited information processing
capability in the sense of Shannon's information theory, and also when it
comes to consciously expressing ourselves we are limited. One
approximation is that we can express less than 50 information bits per
second, using speech, dance, facial expressions, or other means. This bit
rate is reasonable in a face-to-face conversation with another human, but
in an interaction with a computer? The computer has the possibility to
input and output tens of millions of bits per second. We can however
subconsciously use much more of the possible bandwidth. We for
instance reveal, without intending to, what we are really saying when we
are talking with someone. The output bandwidth for muscular and
motoric management is approximately shared as follows: skeleton 32%, hands
26%, language generation 23%, facial muscles 19%. Through our actions
we can change much more information, for instance by setting fire to a
newspaper, or throwing the hard disk out of the window.
The young generation will also
have overall better coordination, faster reflexes and
perhaps a better ability to deal
with 3D spaces. Will this
change how tools are built for
humans? Skills like using a
mouse and keyboard will be
taken for granted. How will this
affect social interaction and
product development?
"Where they have burned books, they
will end in burning human beings."
Heinrich Heine 1821
"What about shutting the Internet down?"
Motoric behaviour and perception are interdependent and the initiating
action is not always obvious. Our eyes move, which allows us to focus on
the interesting part of a scene, and we stop to listen. Without motion we
cannot access the interesting sensory impressions, and without sensory
input we do not know where to move. An obvious fact is that we do not
have wheels! This is not because nature was not clever enough to invent
the wheel, it eventually did, but because nature did not think wheels
were such a good idea. Instead it provided us with two legs. Not four, or
six, which would have simplified programming, just two, and we manage
quite well with only a left and a right pedal. The best research laboratories
in the world are still trying to figure out how to copy this feat. Nature
understood that the problem is not just to move forward at a steady pace.
Acceleration, trail bumps, steep slopes, jumping, turning, and inspection
of the environment have to be managed as well.
Physical context: H is localised, not global or distributed.
Your next marvel is how you keep track of your hand when your upper
and lower arm, and your wrist move in 3D. Quite a lot of real time
trigonometry! As a result of our mastery of this, humans are quite good at
aiming and throwing things. Maybe better than any other species on
Earth? Throwing involves the 3D real time trigonometric problem noted
above, also referred to as kinematics or the geometry of motion, and also
estimation of dynamics, i.e. the effects of forces.
The loop to move a muscle, from the sensors in the hand, through the
brain, and back again is quite slow, somewhere around 200 to 450
milliseconds, which is much too slow to be useful for controlling a throw
of a ball. So the mystery is, how can we be so accurate when we do not
know how we are throwing? The solution is that the brain has already
done some pre-calculations of how the hand, the arm, and the rest of the
body should manoeuvre to perform the throw. It is running a concurrent
simulation of the throw, in real time performing inverse kinematics and
inverse dynamics, and data from this simulation is fed to the muscles. An
adjustment from the simulation can reach the hand in less than 100
milliseconds, which is an acceptable delay. The simulation is built from
previous experiences, so practice is necessary and improves the accuracy.
Let us say that you have performed an action with your hands where
vision was also needed. The next time you do the same action your eyes
will move ahead guided by the muscles of your hands. And, even more
fascinating, if someone else performs the same action and you are only
watching, your eyes will still prepare you for the action.
Inverse kinematics and inverse dynamics are examples of ill-posed
problems, which means that the solution is ambiguous. Another example
is to figure out which numbers were multiplied to get a given product.
What about 42, for instance? To solve such a problem we need more
information, or we have to guess intelligently. When we throw, the
missing information comes from the model of the throw in the brain, i.e.
trained behaviour, and from the environment, e.g. sensing a strong wind.
The simulation still depends on some feedback, for instance from the eyes,
because otherwise it will soon lose track of reality. The interesting
observation from the above is that by using an internal model the system
is actually faster than a pure physical implementation! This is a good
counterexample to the intuitive assumption that an internal representation
always will slow behaviour down.
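The ambiguity of the inverse problem can be made concrete with a small sketch. We assume a much simplified arm: two rigid links moving in a plane, with the shoulder at the origin. The function names and the link lengths are our own choices for illustration, not a model of a real arm. The forward problem (angles to hand position) has one answer, while the inverse problem (hand position to angles) typically has two, elbow up and elbow down. The same ambiguity shows up when we ask which numbers were multiplied to give 42.

```python
import math


def forward(l1, l2, t1, t2):
    """Forward kinematics: joint angles -> hand position (one answer)."""
    x = l1 * math.cos(t1) + l2 * math.cos(t1 + t2)
    y = l1 * math.sin(t1) + l2 * math.sin(t1 + t2)
    return x, y


def inverse(l1, l2, x, y):
    """Inverse kinematics: hand position -> all joint angle solutions."""
    cos_t2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(cos_t2) > 1:
        return []  # the target is out of reach
    solutions = []
    for t2 in (math.acos(cos_t2), -math.acos(cos_t2)):  # two elbow poses
        t1 = math.atan2(y, x) - math.atan2(l2 * math.sin(t2),
                                           l1 + l2 * math.cos(t2))
        solutions.append((t1, t2))
    return solutions


def factor_pairs(n):
    """All pairs of positive integers whose product is n."""
    return [(a, n // a) for a in range(1, math.isqrt(n) + 1) if n % a == 0]


for t1, t2 in inverse(1.0, 1.0, 1.0, 1.0):
    print(round(t1, 3), round(t2, 3), forward(1.0, 1.0, t1, t2))
print(factor_pairs(42))
```

Both angle pairs put the hand at the same point, so extra information, such as a preference learned from practice, is needed to choose between them.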
Transcribing movements
Throwing something is also a good example of the complementary roles of
analysis and synthesis. Analysis is needed to estimate the parameters for
the throw, its length, and the weight of the stone. Synthesis is needed to
execute the throw, contracting muscles to move the arm, and twisting the
wrist. We will come back to analysis and synthesis many times in the
following chapters.
Last but not least there is the magic of our hands and fingers. They are
magnificent instruments for gripping, pulling, and pushing in a variety of
ways. We manipulate objects of different size, form and with different
kinds of surfaces, and do not think much about it.
Things and information have their own means of acting. Information
needs a physical representation to express itself in the physical world. For
this the computer display and the loudspeaker are useful. We will later in
this chapter discuss how they can be brought into action. Although
information has difficulty accessing the physical world, the same property
gives information a clear advantage when it comes to moving around.
Roaming around approaching the speed of light is possible.
Information in action: Ad
against drug abuse
Thinking about what things do, we can identify four kinds of acting things,
not mutually exclusive [AS4]:
1. Mediators of force and energy, e.g. a chair and a car.
2. Manipulators of matter, e.g. a lawnmower.
3. Transformers of physical state, e.g. an oven.
4. Processors of information, e.g. a calculator.
Note that this taxonomy uses different interactions to characterise things,
and that they are pre-wired and adapted, rather than adapting.
To perform the above, things need to generate heat and power, use forces,
and move around in the real world. The motor or engine is perhaps the
most important active device ever invented. It transforms electricity or
some other form of energy to motion, and is implemented in many forms
such as the steam engine, jet engine, car engine, DC-motor, and the
stepper motor. Around this power source many other ingenious
mechanical details have been invented, such as the gearbox.
Mobility can be achieved either through moving the thing itself, or by
selecting an input source matching the new physical environment. A line
of cameras is roughly equivalent to a moving camera. Currently however
we tend to think of a thing as a single physical unit, and we will stick to
this perspective in this book.
Given a flat surface, wheels are a good idea, which in turn implies the axle,
the bearing, and the brake. Without the flat surface a cableway, an aeroplane
or a rocket are alternatives, and if the surface is wet a boat (with a motor)
will do the trick.
IV.6.1 Action, the concept defined
Inter-action implies action, which is the behaviour resulting from internal
processing by the interactor. But what exactly is an action? The word
seems to be used in so many contexts that a precise definition is hard to
find. Action is a very basic concept for humans and often the more basic a
notion is, the more difficult it is to describe.
Not very encouraging, but let us start with the following definition:
Definition: An instance of behaviour is an action if and only if it is
associated with an intention making the behaviour into a means for some
end.
Jens Allwood
That gave you a ride, did it not? Let us look at another description, from
the world of UML (Unified Modelling Language), a standard language
used for software modelling, where an action is defined by its context:
Preconditions: expected to be true at the start of the action.
Post conditions: ensured to become true at the end of the action.
Guarantee conditions: ensured to remain true during the action.
Rely conditions: expected to be maintained true during
the action.
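The four kinds of conditions can be tried out in a small Python sketch. Note that this is our own illustration and not UML notation or the API of any UML tool: the class Action, the function perform and the speed example are hypothetical names invented here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]
Condition = Callable[[State], bool]
Step = Callable[[State], None]


@dataclass
class Action:
    steps: List[Step]
    pre: Condition        # expected to be true at the start
    post: Condition       # ensured to become true at the end
    guarantee: Condition  # ensured by the action to remain true throughout
    rely: Condition       # expected from the environment throughout


def perform(action: Action, state: State) -> State:
    """Run the steps of an action while checking its four conditions."""
    if not action.pre(state):
        raise ValueError("precondition not met")
    for step in action.steps:
        if not action.rely(state):
            raise RuntimeError("rely condition broken by the environment")
        step(state)
        if not action.guarantee(state):
            raise RuntimeError("guarantee condition violated by the action")
    if not action.post(state):
        raise RuntimeError("postcondition not established")
    return state


def add_throttle(state: State) -> None:
    state["speed"] += 10


speed_up = Action(
    steps=[add_throttle, add_throttle],
    pre=lambda s: s["speed"] < 70,         # only speed up when slow
    post=lambda s: s["speed"] >= 70,       # cruising speed reached
    guarantee=lambda s: s["speed"] <= 90,  # never exceed the limit
    rely=lambda s: s["fuel"] > 0,          # the environment keeps supplying fuel
)

result = perform(speed_up, {"speed": 50, "fuel": 5})
print(result["speed"])
```

The sketch shows how the context defines the action: the same two steps started from another state, or in an environment that breaks the rely condition, no longer count as a successful instance of the action.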
An action is in other words a modification, and might include a reaction,
i.e. it is sensitive to its context. Since the real world that we live in does not
provide a stable environment, an action need not give the same result
twice, and in two different situations it most probably will not give the
same result. Note also that an action in the definition above can be purely
mental, such as a prediction or an analysis.
Why are action and similar concepts so difficult to understand? One reason
is that if we focus on one concept and try to express it clearly, in all its
entangled details, we at the same time need to clarify all the other related
concepts, along with the relations between them, and their dynamic
behaviour at the same level of detail. Consensus is not very likely since all
of these issues are quite complicated, partly because they have evolved
over millions of years, and partly because of the many different views
possible. A working reference architecture that we implement ourselves,
and results from neuroscience, will help us to better understand the issues.
A precise definition of the
term action poses numerous
problems, and depends on
whatever quasiphilosophical
blanket ideas we may hold on
the subject.
Jacques Ferber
Karma (Sanskrit): the sum of all that an
individual has done and is
currently doing. Will actively
create future experiences.
Figure IV.6.1 Web of concepts affecting human action (cause, agency,
reason, intention, emotion, action).
Over the years we will increase clarity and depth, and even though we
might never be finished, the search and the discussion are a reward in
themselves. The grand unifying theory will however not be found.
An action can be triggered by any number of causes and by reasoning,
see figure IV.6.2. One philosophical problem here is whether we can always
find a chain of causes for an action, or if an action can be caused by
an impulse without reason or cause. This discussion is akin to whether
there are truly random events, or if the world is deterministic if we only
could follow all details of what happens.
Figure IV.6.2 Triggering action (labels: causes, unconscious (social,
environmental, …); reason, framed by consciousness; action).
Back to reality, let us exemplify action by the everyday event of going to a
very interesting lecture at Umea University, early in the morning [DAN]:
1. Form the goal. The goal is the purpose underlying an action. (Be
in time for lecture)
2. Form the intention, i.e. form a conative intentional state (Have to
speed up, or I will be late)
3. Specify the action (More throttle)
4. Execute the action (Push the pedal)
5. Perceive the system state (Check the speedometer)
6. Interpret the system state (Speed is accurate)
7. Evaluate the system state, i.e. check if the goal will be, or has
been fulfilled (I will be on time or I missed the lecture because I
slept too long)
This decision cycle, or action cycle, is performed over, and over, and over,
and over again.
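The repetition of the cycle can be sketched as a loop. The sketch below is a toy version of the driving example, assuming that speed is the only state, that each push of the pedal adds a fixed amount of speed, and that the goal is simply a target speed. The function name and all numbers are invented for illustration.

```python
def action_cycle(speed, target_speed, max_cycles=20):
    """Repeat the seven stages until the goal is evaluated as fulfilled."""
    log = []
    for cycle in range(max_cycles):
        goal = target_speed                      # 1. form the goal
        intend_speed_up = speed < goal           # 2. form the intention
        throttle = 10 if intend_speed_up else 0  # 3. specify the action
        speed += throttle                        # 4. execute the action
        perceived = speed                        # 5. perceive the system state
        fast_enough = perceived >= goal          # 6. interpret the system state
        log.append((cycle, perceived, fast_enough))
        if fast_enough:                          # 7. evaluate against the goal
            break
    return speed, log


final_speed, history = action_cycle(speed=50, target_speed=90)
print(final_speed, len(history))
```

Starting at 50 with a target of 90, the loop runs four times before the evaluation succeeds. In a real situation both the goal and the context would of course change while the loop is running.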
A common intuition is that an action alters the world close to the cause,
but now the global network will extend the reach of an action in space.
Cause a disruption, and it can have an effect somewhere, or even
anywhere, else. Similarly, actions local in time will increasingly affect
future states. This has already been done by books, newspapers, laws and
television shows. If you write a book it might be forgotten and found 10
years later: "Ah, this is a masterpiece". With extensive networked data
access the possibilities increase.
As far as we can go in conceiving
the depths of the physical world,
we find agitation and specific
interactions. Immobility, fixed
states and repose are local and
provisional phenomena at the
level of our timescale and of our
perceptions.
E. Morin
Concepts related to action are activity and task, which is what the user
must perform to achieve a goal. In everyday use of the words the
distinctions between goal and task, and between task and action, are fuzzy
ones. If my task is to write this word, it is also my goal. An action has a
more coherent behaviour than a task; it is well known, and well practised.
Also, it involves no problem solving, and needs no control structure [JP].
But, recall that action is a fuzzy concept with no clear definition accepted
by everyone. After the discussion of interaction in the next part of the
book, we will better understand action.
Definition: An activity (task) is
an observable, distinguishable,
goal-oriented sequence of
state-changes within a system
initiated, controlled and
monitored by one or more
agent(s).
IV.6.1.1 Action cycle revisited
The last section discussed the action cycle shown in figure IV.6.3. This
model is however much too simplistic to fully describe how humans
behave in interactions and we will list some of the deficiencies, just to get
the point across that human behaviour is not easily formulated in a
simple model [DK].
Figure IV.6.3 Action cycle, centred on the human: 1. Formulate goal,
2. Intention, 3. Detailed plan/action, 4. Execute plan/action, 5. Perceive
result, 6. Interpret perception, 7. Evaluate goal (labels: Do, UnDo).
To start with, goals are more dynamic than the model indicates. They are
not always well specified, prioritised, or consistent. Goals interact and
change while we try to accomplish them, and they could even be formed
while performing the action to fulfil another goal. This reflects the fact that
the context is changing and is affected by actions. One way to view culture
is as a device to cope with the complexity of everyday situations.
It is impossible to plan
the future, we can only
prepare for it.
Teknisk framsyn
Consider an artist who sets out to do a painting. The overall goal is well
defined, but how about the sub-goals? The composition and the detailed
choice of colours are something that will grow while painting. The
painter creates an environment (the painting) that will affect the sub-goals.
Before starting the actual paintwork the artist has probably done some
preliminary work: made a sketch, bought some paint and selected a motif.
This preparatory work is also part of the solution. Another example of
preparatory work is that we rearrange the dishes before washing up in
order to start with the glasses. Servicing the car to keep it in good shape is
another example where a task, this time a maintenance activity, will
increase the probability of reaching the objective.
A third category of activities that is not found in the simple decision cycle
above are complementary actions [DK]. Complementary actions are
actions that help us perform the main task in a better way. They transform
the problem into one that better suits our cognitive abilities. We, for
instance, divide telephone numbers up in groups to make them easier to
remember, and we mark the current page in the book so that we will start
at the right page after a short nap. ZZzzzzzzzz…
IV.6.2 Visual realism, information blending in
What does it mean for information to express itself, and why is it worth
the effort? The first question was discussed in Chapter IV.5 and the
answer to the second question is that the expression is a prerequisite for
any interaction, so if information wants to join the game it has no choice.
One example is business-to-business services where one information agent
publicly announces its functionality, and what it can deliver. The reader in
this case is another software agent. If an information agent on the other
hand wants to match human sensors it has to re-represent itself as
graphics, sound, tactile information, or some other sensible media. Luckily
the technology for this is available, and reasonably adequate. Ultimately a
human should see, hear, or feel information as something natural,
perfectly blended with other natural representations such as faces of
people, trees, wind, and falling rain.
Achieving true audio-visual realism means facing the complexity of
reality, and for this the tools we have at hand are still limited in many
respects; displays have limited visual resolution, acoustic properties of
rooms vary, and so on. Figure IV.6.4 illustrates the problem.
Information chooses to express itself as well as possible using available
means, in this case for instance a graphical representation. There are
constraints such as bad lighting, or perhaps no colour is available. The
receiver perceives the presentation from its own perspective, knowledge
and goals. For efficient communication the expression should match the
constraints and the receiver's point of view.
Perspective: A basic principle of
Euclidean geometry is that space
extends infinitely in three dimensions.
The effect of monocular perspective,
however, is to maintain that this
space does nevertheless have a center
- the observer. By degrees, [in the
Renaissance] the sovereign gaze is
transferred from God to "Man."
Victor Burgin
Figure IV.6.4 Information expressing itself (labels: complex information
agent; expression; constraints; point of view; black square (fully
realistic)).
What makes the problem feasible is first of all human limitations. We
have, for instance, limited acuity. Additionally, our incredible adaptivity
means that we will adapt to almost any perceptual quality, and if we are
really interested in the view its quality is of less importance. In the
following we will focus on visual realism, which is steadily increasing
because of improved colour and lighting representation, better methods
for surface representations, use of texture mapping and reflection
modelling, and also because of advances in physical simulation of particle
systems, fluids, and flocking behaviour. Despite the advances we still have
many unsolved issues, for instance how to measure visual realism in an
image. The more general problem of efficient interaction will be a major
theme of the next part of the book, and again note that realism is not
necessary for high bandwidth communication and interaction.
Imagineering
Using everyday technology we have to make do with a two-dimensional
projection of our 3D model, a process called rendering. The simplest
representation is a see-through solution where all edges are displayed,
see figure IV.6.5a. At the next level of sophistication hidden edges
are removed using some smart algorithm, of course at the cost of some
extra computational load, see figure IV.6.5b. There is still not much
realism in the generated box; we need to add more visual information to
the surfaces. To do this we can use texture mapping to map a texture, i.e.
an image, to the surface of the box, see figure IV.6.5c. The image can be a
photograph, or some computer generated graphics. A box-shaped earth is
no problem, but maybe not a very realistic rendering.
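The see-through representation is easy to sketch. We assume a pinhole camera at the origin looking along the z axis and a simple perspective projection. The focal length and the position of the box are arbitrary choices for this illustration, and real renderers add clipping and a mapping to pixel coordinates.

```python
def project(point, focal_length=2.0):
    """Perspective projection of a 3D point onto the 2D image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# The eight corners of a box placed in front of the camera.
box = [(x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (3.0, 4.0)]

# Two corners share an edge if they differ in exactly one coordinate.
edges = [(a, b) for a in range(8) for b in range(a + 1, 8)
         if sum(p != q for p, q in zip(box[a], box[b])) == 1]

# See-through rendering: every edge is drawn, hidden or not.
wireframe = [(project(box[a]), project(box[b])) for a, b in edges]
for start, end in wireframe:
    print(start, "->", end)
```

Dividing by the depth z is what makes the near face of the box come out larger than the far face. Hidden edge removal and texture mapping both build on this projection.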
Red – Blood
Green – Grass
Blue – IBM
Figure IV.6.5 a) See through,
b) hidden edges removed,
c) texture mapped image
d) bump map
The mapping can also be done with additional constraints such as the
mapping of a bumpy surface that can produce interesting visual effects,
see figure IV.6.5d.
Changing the characteristics of the texture is another way to increase
realism and to create surfaces with different colours, or reflectivity.
Transparency can also be modelled, and for certain scenes and objects give
spectacular results.
Lighting is an important parameter for increasing the realism in an image.
Simulation of point light, ambient light or directional light gives
possibilities that the computer can use to calculate visual effects. For really
good results, add an artistic touch, and usually lots of computations. The
example in figure IV.6.6 below shows two point lights reflected from a
sphere. The reflection is computed using normals, and to save computer
resources the sphere is modelled as facets. Each vertex is assigned a
normal that is the mean value of the normals of the surrounding facets.
This normal determines the intensity of the reflected light at that vertex.
As can be guessed from the figure, light values are calculated as linear
interpolations between the vertices. Not surprisingly, the illusion of a
sphere improves for smaller facets (with more computations).
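The calculation described above can be sketched as follows. Two assumptions are ours: the text does not name the reflection model, so we use Lambert's cosine law for diffuse reflection, and the function names are invented for the illustration. Interpolating intensities computed at the vertices is the approach usually called Gouraud shading.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def vertex_normal(facet_normals):
    """Mean value of the normals of the surrounding facets, renormalised."""
    count = len(facet_normals)
    mean = tuple(sum(n[i] for n in facet_normals) / count for i in range(3))
    return normalize(mean)


def diffuse_intensity(normal, to_light):
    """Assumed model: intensity is the cosine of the angle to the light."""
    direction = normalize(to_light)
    return max(0.0, sum(a * b for a, b in zip(normal, direction)))


def interpolate(intensity_a, intensity_b, t):
    """Linear interpolation between two vertices, t running from 0 to 1."""
    return (1 - t) * intensity_a + t * intensity_b


light = (0.0, 0.0, 5.0)
normal_a = vertex_normal([(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)])
normal_b = vertex_normal([(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)])
i_a = diffuse_intensity(normal_a, light)
i_b = diffuse_intensity(normal_b, light)
print([round(interpolate(i_a, i_b, t / 4), 3) for t in range(5)])
```

Only the vertex intensities need the costly normal calculation. Everything in between is a cheap interpolation, which is also why large facets become visible as flat patches with abrupt changes at their borders.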
Darkness, unlike light,
does not need a source.
Jeremy Brin
Figure IV.6.6 Compromises can
give visible features (Silicon
Graphics, Inc).
Shadowing is the next item on the list and it can be accomplished in at
least two ways. The first way is to generate rays from the light source and
follow them as they bounce around. Whenever they hit something this
something is lit up. This is called ray-tracing. Another method is to sort all
objects according to their distance to the light source and calculate the
shadows an object casts on more distant objects.
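A much simplified version of the first method can be sketched by following a single ray from the light source to a point and checking whether something is in the way. This is a shadow test only, not a full ray tracer: there are no bounces, all blocking objects are assumed to be spheres, and the function names are our own.

```python
import math


def ray_hits_sphere(origin, target, center, radius):
    """True if the straight path from origin to target crosses the sphere."""
    d = tuple(t - o for o, t in zip(origin, target))  # direction of the ray
    f = tuple(o - c for o, c in zip(origin, center))  # sphere centre to origin
    a = sum(x * x for x in d)
    b = 2 * sum(x * y for x, y in zip(f, d))
    c = sum(x * x for x in f) - radius * radius
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return False  # the ray misses the sphere completely
    root = math.sqrt(discriminant)
    t_near = (-b - root) / (2 * a)
    t_far = (-b + root) / (2 * a)
    # A hit only counts if it lies between the light and the point.
    return 0 < t_near < 1 or 0 < t_far < 1


def is_lit(light, point, spheres):
    """A point is lit if no sphere blocks the ray from the light source."""
    return not any(ray_hits_sphere(light, point, center, radius)
                   for center, radius in spheres)


light = (0.0, 10.0, 0.0)
ball = ((0.0, 5.0, 0.0), 1.0)
print(is_lit(light, (0.0, 0.0, 0.0), [ball]))  # directly below the ball
print(is_lit(light, (5.0, 0.0, 0.0), [ball]))  # off to the side
```

The point directly below the ball is in shadow and the point to the side is lit. The cost is easy to see: one such test per point, per light source, and per object in the scene.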
Collision detection is another feature that improves realism; sending the
ball through the floor to the other side is not very realistic. This
interaction, which is so fundamental to nature, is actually quite hard to
reproduce in the virtual world. The principle is simple enough: we only
have to detect if the contour of one object overlaps the contour of another.
Figure IV.6.8 Perfect
representation of collision
detection is computationally
expensive.
The figure above hints at some of the problems. First, all of this is
happening in 3D where predictions and calculations are more computer
intensive than in 2D. Second, the level of detail needed for realism is quite
high, further increasing the computational load. The bounding volume of
the person in the figure is not a sufficiently good approximation for
collision detection in the case shown.
A better approximation would increase the number of volumes to be
checked for collision, aggravating the load problem. We can also use a
hierarchy of bounding volumes and if a collision with an enclosing
bounding volume is detected, then collision is tested for at a finer level of
detail. The cost of this is more complicated control and some additional
memory. A general solution is to use more or faster hardware, at the cost
of $.
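The idea of a hierarchy of bounding volumes can be sketched with axis-aligned boxes. The class and the crude two-part person below are hypothetical and only meant to show the control flow: the finer boxes are tested only when the enclosing box reports a possible collision.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec = Tuple[float, float, float]


@dataclass
class Box:
    lo: Vec  # corner with the smallest x, y and z
    hi: Vec  # corner with the largest x, y and z
    children: List["Box"] = field(default_factory=list)


def overlaps(a: Box, b: Box) -> bool:
    """Two axis-aligned boxes overlap if they overlap along every axis."""
    return all(a.lo[i] <= b.hi[i] and b.lo[i] <= a.hi[i] for i in range(3))


def collides(a: Box, b: Box) -> bool:
    """Test coarse volumes first and descend only where they overlap."""
    if not overlaps(a, b):
        return False
    if not a.children and not b.children:
        return True
    if a.children:
        return any(collides(child, b) for child in a.children)
    return any(collides(a, child) for child in b.children)


torso = Box((0.5, 2.0, 0.0), (1.5, 6.0, 1.0))
legs = Box((0.5, 0.0, 0.0), (1.5, 2.0, 1.0))
person = Box((0.0, 0.0, 0.0), (2.0, 6.0, 1.0), children=[torso, legs])

near_miss = Box((1.8, 5.0, 0.0), (2.2, 5.4, 1.0))
hit = Box((1.0, 3.0, 0.0), (1.4, 3.4, 1.0))
print(overlaps(person, near_miss), collides(person, near_miss))
print(collides(person, hit))
```

The near miss touches the enclosing box but none of the finer boxes, so the coarse test alone would have reported a false collision. The price is the extra control logic and the memory for the child boxes, as noted above.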
Virtual information can be presented any way the designer chooses.
If you throw a ball at a wall it can
get stuck on the wall, bounce back
or continue through the wall,
everything is possible. As humans
we are adapted to reality. The
question is, will our previous
experiences of reality hinder
adaptation to the virtual
environment, or do we not have
such a limitation?
A particularly annoying problem when displaying shapes is the hidden
surface problem. This problem arises when 3D objects are projected onto a
2D screen. The computer has to decide which object is in front of
another. Consider the example in figure IV.6.9. To the human eye it is
quite obvious which of the bars is the closest, even without the dotted
indication. Why?
Figure IV.6.9 The hidden surface
problem illustrated.
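The text does not say which algorithm the computer should use. One common approach, chosen here only as an example, is a depth buffer (z-buffer): for every pixel we remember the depth of the closest surface drawn so far. In this sketch the two bars are flat rectangles drawn with characters, which is of course far simpler than real 3D objects.

```python
def render(width, height, rectangles):
    """Draw rectangles given as (x0, y0, x1, y1, depth, label).

    A smaller depth means closer to the viewer.
    """
    depth_buffer = [[float("inf")] * width for _ in range(height)]
    image = [["."] * width for _ in range(height)]
    for x0, y0, x1, y1, depth, label in rectangles:
        for y in range(y0, y1):
            for x in range(x0, x1):
                if depth < depth_buffer[y][x]:  # closer than what is there
                    depth_buffer[y][x] = depth
                    image[y][x] = label
    return ["".join(row) for row in image]


bars = [
    (0, 1, 6, 2, 5.0, "H"),  # horizontal bar, further away
    (2, 0, 3, 3, 2.0, "V"),  # vertical bar, closer
]
picture = render(6, 3, bars)
print("\n".join(picture))
```

The vertical bar wins the pixel where the bars cross, and the result does not depend on the order in which the bars are drawn. The cost is one extra depth value per pixel.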
From the I/T perspective visual realism is not interesting. Visual accuracy
is perhaps a better description of the problem, which is to extract just
enough data for the task at hand. Sampling and careful selection of relevant
visual information involve a trade-off between the amount of data and
the computations necessary to analyse the data.
IV.6.3 Sound, speech synthesis and telling stories
An auditory space is a shared environment where many can hear similar
sounds while attending to different visual representations. Sounds, when
used as music and environmental signals, are necessary for realism. A
whining sound in the background of a movie, which we are not even
consciously aware of, can by itself build up a tension. By designing
soundscapes we can exploit sounds also for other contextual
information, e.g. to signal the time left until the alarm clock goes off.
Uses of colour in GUI:
Soothe or strike the eye.
Draw attention
Discriminate
Organize
Evoke behaviour
[BS1]
Whether music without song can actually say something, and be used as
a language to convey meaning, is less obvious, but there is certainly a
complexity and a number of variables that can be used to construct
messages. Harmony, rhythm, intensity, pitch, speed, composition, form
(pop, jazz ..), timbre, and texture could in principle be combined into
messages, but maybe the effort of learning and using such a language is
too high [GG].
A thing speaking, saying "Good morning", i.e. implementing speech
synthesis, is a technology already in use. Automatic answering services
directing you to press # or * are no longer a surprise. The perceived
impersonality of speech synthesis does not even make you angry any
more.
Modern technology gives us new possibilities for telling stories and
presenting facts in compelling ways. Storytelling, or narration, has been
performed since the beginning of time, but there is much more to telling a
story than knowing the syntax and semantics of a language. A cohesive
story is supposed to evolve in an orderly way over time, and to have a
structure such that cause and effect in the story make sense. This means
answering the following four questions essential to storytelling: who, when,
where, doing what. Who is the reader? Why is she reading? What is she
supposed to learn? Answering such questions, and adapting content, style
and format to the reader and the situation is extremely difficult.
Many attempts have been made to structure stories according to
common components, or motifs. Here are some of the suggestions by
Georges Polti, most of them loosely applicable to this book, chosen from
the original 36 suggestions: Supplication, to humbly and earnestly ask
for help (1), Deliverance, recovery or preservation from loss or danger (2),
Disaster (6), Falling Prey To Cruelty And Misfortune (7), Revolt (8),
Daring Enterprise (9), The Enigma, puzzling, ambiguous, or inexplicable
(11), Fatal Imprudence, the quality or condition of being unwise or
indiscreet (17), Self Sacrificing For An Ideal (20), All Sacrificed For A
Passion (22), Necessity Of Sacrificing Loved Ones, here applied also to
behaviour and situations (23), Rivalry Of Superior And Inferior (24), An
Enemy Loved (29), Ambition (30), Erroneous Judgement (33).
Another attempt to analyse narratives was made by Vladimir Propp who
looked for, and found common elements in Russian folk tales. From his
list of 31 elements we select the following:
One of the members of a family absents himself from home (1) and the
villain causes harm or injury to the family (8). The hero(ine) is tested,
interrogated, attacked, etc., which prepares the way for his/her receiving
either a magical agent or helper (12). The hero(ine) is transferred,
delivered, or led to the whereabouts of an object of search (15). The
hero(ine) and the villain join in direct combat (16). The hero(ine) is
pursued (21). Rescue of the hero(ine) from pursuit (23). The hero(ine),
unrecognized, arrives home or in another country (24). A difficult task is
proposed to the hero(ine) (25). The task is resolved (26), and the villain is
punished. The hero(ine) is married and ascends the throne (31).
Note that there is an implicit ordering of the elements suggesting how a
story is built, and also indicating a linear cause-event structure of a story.
Scholars also have worked on categorising the type of a tale. Folktales are
interesting because they could be assumed to follow a tradition of story
telling from the dawn of time. One elaborate catalogue has been suggested
by Antti Aarne and Stith Thompson [CC]:
Animal Tales (Types 1-299)
Ordinary Folktales (Types 300-1199), selected items
a) Supernatural adversaries and helpers
b) Superhuman tasks, to search fortune, or the quest for
the unknown.
c) Magic objects and supernatural power or knowledge
d) Romantic tales
Jokes and Anecdotes (Types 1200-1999), selected items
a. Numskull stories
b. Stories about married couples
c. Jokes about parsons and religious orders
d. Tales of lying
Formula Tales (Types 2000-2399)
Unclassified Tales (Narrationes Lubricae) (Types 2400-2499)
The extended list is quite long, but the point we want to make here is that
most tales are social statements.
IV.7 We need knowledge, and we represent it
Knowledge is one of those familiar concepts that you think you know all
about, until you try to pin down its exact meaning. Here we will not even
try that; instead we will describe it indirectly, in action, and also ignore any
differences between memory representation and knowledge. Two types of
knowledge can be distinguished: knowing-about (declarative
knowledge), and knowing-how (procedural knowledge). Declarative
knowledge is associated with a thing, place, person, fact, or subject, for
example "Norrland is not too hot" or "Tova is a girl". It could be further
subdivided into episodic and semantic knowledge, where episodic
knowledge is derived from the experience of a particular individual and
semantic knowledge is commonly held information. Procedural
knowledge is understanding from experience, knowing how to do
something, such as preparing for an examination.
Philosophers have been studying knowledge for centuries and a whole
branch of philosophy called epistemology focuses on its forms, limits,
nature, and validity. One particular view from epistemology is that
knowledge is acquired by reasoning. This is called rationalism. Another
view, empiricism, is that knowledge is derived from experience. A variation
of empiricism referred to as positivism insists on a scientific approach
based on observation. Ontology is another branch of philosophy
concerned with identifying the things that actually exist. This study of
Definition: A system has
knowledge of something if the
system has a model of
something perceived by the
system.
Is it possible to teach wisdom
without knowledge?
what there is nicely complements the goal of epistemology, i.e. what
we can know.
IV.7.1 Knowledge representation
How is knowledge represented in the brain, and can this principle be reused in artificial intelligence? Nobody knows, but theories on
representations have evolved, and some have been tested on real world
problems. It is a popular topic of academic debates and at a very low
abstraction level the answer is easy (if you do not ask a neurologist). We
simply store knowledge by adjusting chemical potentials. The real
problem is how to store information effectively, while preserving relations
between data, such as abstractions, and still have quick access.
Information in the brain seems to be stored in memory in at least four
major representations. The first, and maybe the most important one, is
information stored as the equivalent of images. A second representation
is the phonological representation, i.e. stretches of syllables, such as how
we remember a phone number. Next, there are grammatical
representations, exemplified by nouns, verbs, and clauses. The most
abstract representation is one we use to store thoughts and it is called
mentalese [SP]. Each of the representations listed has its own important
application. The grammatical representation for instance supports
language.
I remember it so well.
"This is for you", he
said. "Happy Birthday
dad". His eyes were
shining and his voice
full of pride. "I have
done it myself".
To achieve the representations above two suggestions are a symbolic, or a
distributed connection oriented representation. The symbolic
representation can be further divided into analogue representations such
as mental graphics, where the actual shape of a dog is directly stored, and
propositional representations that are more language oriented and
capture the conceptual content of knowledge. One example here would be
the concept of God stored as the word God. Actions can be represented as
state diagrams, or software programs, where discrete symbols are
manipulated on the basis of syntactic rules that operate on the symbols,
ignoring any meaning attached to them. This view nicely matches the web
of concepts shown in figure IV.7.1. Symbolism is a first solution to the
so-called mind-body problem. This problem is evident in the question: How
can the world of meanings, thoughts, intentions and desires, i.e. the
mental world, be connected to the real physical, material, world as
represented by our brain? The solution suggested by symbolism is that a
cat in the physical world is sensed and represented by a symbol of a cat in
our brain, or in a computer.
In a connection-oriented representation it is impossible to pin down the
exact location of where a piece of knowledge is stored, i.e. God is spread
out and stored as a pattern of activities throughout the system, for
instance a neural network. The idea explored is that mental processes are
performed by simple processing units, e.g. neurons in our brain,
communicating with simple signals in a highly connected network. In this
model knowledge is encoded as interconnections of different strength.
This representation is trained into existence, and every instance will be
different. Everyone will for instance have a different interconnection
pattern representing water.
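The point that a trained representation differs between instances can be illustrated with a minimal Python sketch. It assumes a single artificial unit trained with the classic perceptron rule; the task (logical AND), the learning rate, and all names are illustrative assumptions, not taken from any particular system. Two units learn the same concept from different random starting points and end up with different connection strengths, yet they respond identically.

```python
import random

def train_unit(samples, seed, epochs=1000, rate=0.1):
    """Train one threshold unit with the perceptron rule and return its weights."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(3)]  # two inputs plus a bias
    for _ in range(epochs):
        for (x1, x2), target in samples:
            output = 1 if weights[0] * x1 + weights[1] * x2 + weights[2] > 0 else 0
            error = target - output          # -1, 0 or +1
            weights[0] += rate * error * x1  # strengthen or weaken each connection
            weights[1] += rate * error * x2
            weights[2] += rate * error
    return weights

def respond(weights, x1, x2):
    """Response of a trained unit to one input pattern."""
    return 1 if weights[0] * x1 + weights[1] * x2 + weights[2] > 0 else 0

# The "concept" to learn: respond only when both inputs are present.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

unit_a = train_unit(AND_SAMPLES, seed=1)
unit_b = train_unit(AND_SAMPLES, seed=2)
```

Nowhere in `unit_a` or `unit_b` is there a single place where "AND" is stored; the knowledge is in the combination of weights.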
Everyone knows about water and understands reasoning about this
concept, and yet it remains decidedly difficult to provide a clear,
undisputed, and unified view of water (or actually any other concept,
under study):
Water as a drink – water to drink, use a glass of water.
Water leaks – transport of water, we need a piping system that
sometimes needs mending.
Water as a molecule – H2O molecule, it turns to ice in the
refrigerator and can be used in cool drinks.
Water for washing up – hot water and detergent, need to rinse at
least three times.
Each of these statements refers to know-how related to water and also to
problems involving water used in a context, which assumes
conceptualisation.
One view of knowledge is as a web of concepts where the know-how, i.e.
the usage of the knowledge, is what motivates acquiring the knowledge in
the first place.
[Figure IV.7.1 Knowledge web, illustrated by a small excerpt of a representation of water. Nodes: Water as a drink, Water leak, Water as a molecule, Boiling water to steam, Filling a glass, Plumbing.]
The knowledge web can be used to illustrate the workings of our memory.
As we traverse the knowledge structure our short-term memory
constrains the nodes we can hold in memory. Since each node only gives a
small part of the picture short-term memory can never hold a complex
representation. We can solve this problem in two ways. Either we can
abstract a number of nodes into a new node, or we can traverse the
knowledge web as quickly as possible and try to solve the problem at
hand dynamically. Each retrieval will cost us approximately 400 ms, so we
must not be in a hurry when we use the latter mechanism [WK].
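The cost of traversing the web can be made concrete with a small Python sketch. The node names follow figure IV.7.1, but the links between them are assumptions made for the example, and the search method (breadth first) is just one simple way to find a route. Each step to a new node is charged the approximate 400 ms retrieval time quoted above.

```python
from collections import deque

RETRIEVAL_SECONDS = 0.4  # approximate cost of one retrieval, as quoted in the text

# Hypothetical links between the nodes of the water example.
KNOWLEDGE_WEB = {
    "water as a drink": ["filling a glass", "water as a molecule"],
    "filling a glass": ["water as a drink"],
    "water as a molecule": ["water as a drink", "boiling water to steam", "water leak"],
    "boiling water to steam": ["water as a molecule"],
    "water leak": ["water as a molecule", "plumbing"],
    "plumbing": ["water leak"],
}

def shortest_path(web, start, goal):
    """Breadth-first search for the route with the fewest retrievals."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in web[path[-1]]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # the two concepts are not connected

def retrieval_time(path):
    """Seconds spent if every node after the first costs one retrieval."""
    return (len(path) - 1) * RETRIEVAL_SECONDS

path = shortest_path(KNOWLEDGE_WEB, "filling a glass", "plumbing")
```

In this toy web the route from filling a glass to plumbing passes five nodes, i.e. four retrievals, or about 1.6 seconds.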
Sometimes we need to organize knowledge in more complex units that
represent a situation, or an object, in a domain. One example is that we
would like to represent the general concept of a teacher.
For this we use a schema, also referred to as a frame, which is a structure
collecting all general properties of an object or an event, see figure IV.7.2.
It is an abstraction that allows the formulation of more general categories
and was used as early as in the eighteenth century by the philosopher
Immanuel Kant.
The simplification of anything
is always sensational.
G. K. Chesterton
[Figure IV.7.2 Generalisation using a schema. The schema Dog collects the properties: Four legs, Barks, Eats all the time.]
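A schema of this kind maps naturally onto a small data structure. The following Python sketch is one possible reading, not a standard implementation: the class name `Frame`, the slot names, and the individual dog Fido are all invented for the example. A specific frame stores only what is special about it and falls back on the general frame for everything else.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Frame:
    """A schema: named slots, plus an optional more general frame to inherit from."""
    name: str
    slots: dict = field(default_factory=dict)
    parent: Optional["Frame"] = None

    def get(self, slot):
        """Look up a property here first, then in the more general frames."""
        if slot in self.slots:
            return self.slots[slot]
        if self.parent is not None:
            return self.parent.get(slot)
        raise KeyError(slot)

# General schema for dogs, with the properties from figure IV.7.2.
dog = Frame("dog", {"legs": 4, "sound": "barks", "eats": "all the time"})

# A hypothetical individual that deviates from the schema in one respect.
fido = Frame("Fido", {"legs": 3}, parent=dog)
```

Asking `fido.get("sound")` is answered by the general schema, while `fido.get("legs")` is answered by the exception stored for Fido.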
External representations will increasingly complement our own memory
and what we have learnt. Representations such as books, paintings,
databases, and personal digital assistants will help us to overcome our
memory shortage. But, as usual getting rid of one problem creates a new
one. We now have to learn to manage our external memories.
Is there another word
for synonym?
The net
IV.8 We think and process
There are presumably many reasons why we humans have developed
thinking. Let us list some of them [SP]:
Enhanced group living, this extra joy in life provides lots of
opportunities for knowledge transfer and knowledge trade. It is for
instance very useful to be able to guess the intentions of other members
in your group.
Extended use of vision, we use vision to examine our surroundings in
3D and this gives us the possibility to organise our mental world
accordingly. We gain a framework for useful reasoning and planning.
Better use of our hands, our most versatile and important tool, the tool
of tools.
Efficient hunting, because without hunting, and preferably intelligent
hunting, we would not have had the resources to expand our brain to
the extent that it can be used for intelligent thinking (including hunting).
“I think, therefore I am”
R. Descartes
“said by an intellectual who
underestimates toothache”
Milan Kundera
Maybe you can think of more reasons?
Human thinking is slow, and the mental models are unstable and
incomplete. Also, unscientific, intuitive models, and beliefs guide
behaviour. Despite this we are able to quickly solve many quite
complicated problems. The quote to the right clearly describes the main
difference between computer based technology processing and human
thinking. The computer can perform the same computational task a
million times, without making one single mistake. A human can perform
it ten times and make three different mistakes. In controlled environments,
with a suitable problem, a cluster of computers working in parallel can
compute at almost any speed, impossible for humans to match. Sealing
bottles is a task that computers do at breathtaking speeds. But, if we
slacken the control, let us say that the bottles arrive at stochastic intervals,
the computer will fail. A human will continue sealing a bottle every five
seconds while the computer smashes bottles into smithereens. For other
tasks the computerised thing is surprisingly sluggish. Humans solve
simple survival tasks in the jungle in real time (proven ability). A
computer-based thing of today cannot manage! Why is that?
“Human thought and its close
relatives, problem solving and
planning seem more rooted in past
experience than in logical
deduction. Mental life is not neat
and orderly…Human thought is not
like logic; it is fundamentally
different in kind and spirit. The
difference is neither worse nor
better. But it is the difference that
leads to creative discovery and
great robustness of behaviour”
Donald A Norman [DN]
Moore's law is another distinguishing feature of the thing. Every eighteen
months the computer performance has doubled as measured in MIPS
(millions of instructions executed per second). This will probably not last
for more than ten years from now with the current technology, but at that
stage we can use parallel computers to increase the performance. If this
trend continues, one prediction is that a computer matching the human
brain in performance will be available in 2020 [HM]. After another 50 years
we will have a chip of the size of a sugar cube that stores the equivalent
of 10,000 human brains, and with the power of a million Pentiums!
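The arithmetic behind a steady doubling is easy to check. The sketch below assumes nothing more than the eighteen-month doubling period stated above; the function name is chosen for the example.

```python
def growth_factor(years, doubling_months=18):
    """How many times performance multiplies if it doubles every doubling_months."""
    return 2 ** (years * 12 / doubling_months)

ten_years = growth_factor(10)    # roughly a hundredfold increase
fifty_years = growth_factor(50)  # on the order of ten thousand million times
```

Ten years of doubling every eighteen months gives roughly a hundredfold increase, which is why even small changes in the doubling period matter so much for long-range predictions.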
There are also other hardware anomalies. Soon a CPU and a camera will
be inexpensive enough to allow millions of them to be used in everyday
life. There are already cameras attached to glasses that constantly take
pictures of everything the wearer looks at. Combined with global,
ubiquitous, fine-grained networking, such new technology gives
enormous new possibilities that are difficult to foresee, especially if it is
inexpensive.
The term process covers a lot. Thinking, preparing food, building a car,
running a computer program, reading a book, or editing a text are all
processes. Computing is a general term for a machine executing software,
i.e. performing operations on the content of its memory.
There are several levels at which information processing is active [DM]. At
the highest level it is concerned with the question of what is done, and
why, i.e. the intentional level. This is the processing of ideas. While
reading that last statement you actually performed information
processing. Thinking about why you read it is also information
processing, and clearly at the intentional level. Maybe you want to
understand the meaning of information processing? At the next level we
specify the idea of how to manipulate a representation of information,
perhaps using an algorithm, what we have previously in this book called
the conceptual level. Reading is one way to learn about information
processing, and at the conceptual level we can describe it as sequentially
processing symbols, one by one. At the lowest, physical level, we are
concerned with how to implement the algorithm, and how to access the
representations. Reading can be done using a textbook, feverishly
scanning each line, frenetically turning pages, more, more, more.
What problems need faster
computers? Even more
complicated word processors?
How can information or
ideas reason? How can
information or ideas
perform actions?
IV.8.1 Situated action
One view of thinking is as manipulating an inner model of what we are
thinking about. When we think about dinner we imagine how the food
will look on the plate, the ingredients needed, the cooking procedure,
mmm… Daydreaming, and planning ahead, are other examples of inner
world manipulations, i.e. thinking.
This view is challenged, or rather complemented, by another emanating
from work on situated action [LS]. According to this we use the world as
its own model, and we think using this model. A blind person thinks by
touching with her stick, her mind leaks out into the world using the stick
as an antenna.
The web is a natural extension of
the spider
Shaleph O’neill
For situationally determined activities, such as avoiding collisions, tying
shoelaces, or laying out a jigsaw puzzle, it is not necessary to think about
what we are doing, we just do it. But, if we realise that there is a piece
missing in the puzzle we have to engage higher-level processes. We have
to make inferences, consider the alternatives [DK1]. Maybe the dog took
it?
We can exemplify some different ways of thinking by playing the game of
Tetris where a block is rotated and placed to level a landscape of
previously placed blocks, see figure below.
To gaze is to think
Salvador Dali
In the Cartesian tradition meaning occurs because of structures of the mind,
not experience; because of language
(the general language system), not
parole (the speech act or interaction)
[MC3]
Figure IV.8.3 Thinking with
your hands when playing
Tetris.
A beginner consciously rotates the block and finds a match for it. This is
very time consuming, and time is always in short supply, also in Tetris. To
simplify the procedure, the player can first rotate the block using the
control buttons, and then do the mental matching. She is now thinking
with her hands! An even more advanced user will skip the rotation step
altogether, and directly match the block to possible openings; she has a
mental map of where a block of a certain kind matches. The figure above
indicates three possible matches, which an experienced user easily
identifies. To the skilled user the choice is also a tactical one. Which of the
alternative matches will give the best long-term result? Furthermore, there
is a context to consider, the next block can be made visible and might
influence the choice.
Is it possible to learn how to ride a
bicycle using a computer
simulation?
I type, I am not aware of the
boundaries and interfaces between
my mind, my fingers, the keys, the
virtual text on the screen
M. Rettie
Completing a jigsaw puzzle is another example where thinking is aided by
the physical reality. We order pieces in groups by colour or shape, to
simplify matching. As in the Tetris example we can also physically rotate
the pieces for matching. This action-evaluate behaviour is also termed an
action loop [AC].
Situated action emphasizes that humans plan “on demand”, i.e. that
improvisation is at least as important as planning, and many times more
efficient than trying to figure out everything in advance.
The artist Stelarc takes the consequences of situated action to the limit. He
argues that we are limited by our physiology and to evolve further, and to
think deeper, we have to enhance our bodies. A third ear, other additional
sensors, and new body extremities such as a third hand are some of his
attempts.
The body is obsolete.
We are at the end of
philosophy and human
physiology. Human thought
recedes in the human past.
Stelarc
IV.8.2 Distributed cognition
Distributed cognition provides another angle on thinking. A systems
perspective is taken, where humans and tools collaborate to reach the
objective, for instance to manage a ship into port.
In principle a reflective system can adapt to anything by modifying itself,
but in practice no technology is indefinitely malleable. The designer,
material, original intentions with the technology, and much more
constrain the possibilities to modify a system and its technology in a given
situation. It is hard work writing an essay on a pocket calculator.
Figure IV.8.4 below shows some of the complexity we are facing when
studying and designing for distributed interaction. Participants in the
system interact with other participants, and the system itself, with context,
and they also reflect and act on themselves. All of these interactions create
numerous feedback loops where meaning is created and supported.
[Figure IV.8.4 Some distributed interactions in Hierarchical HITI. Labels in the figure: H, T, I, System, Context.]
The abstraction level chosen for describing the interaction is an important
trade-off. Should for instance the same description be used at all
abstraction levels for all interactors in figure IV.8.4 above? If not, then
we need to find ways to match the description (multiple concepts,
different name same meaning, different frames of reference) resulting
from top-down and bottom-up analysis and design. Designing a system
bottom up creates a complex model with good descriptive properties, but
often with low predictive and generative power. A top down approach
could catch emergent properties, but will miss the finer details.
A related issue is the number of models used at a given abstraction level.
Inventing a situated model for each here, now, me, this, and do is maybe
not a good idea. But, a well-developed general modeling tool such as
Activity theory is on the other hand quite abstract, and needs training and
experience to exercise well.
Whenever we consciously think about something we have to explicitly
build, or use, a predefined model of the subject. One problem with this is
that the model is a selection, we cannot include everything, cannot
consider all aspects at the same time. The selection by necessity creates a
blindness to all the aspects we do not include. It seems that reflective
thinking is impossible without this blindness [TWD2]. If we think about
how to wash up the dishes we usually do not include our children in the
thought process. This might be a serious mistake! There is a minimal
chance that one of them will volunteer if we ask them. On the other hand,
if we started to consider every possible escape from doing the dishes we
would never get round to doing it. Have you ever said, “Oh, I did not think
about that”? Assuming that you had the chance, why did you not think
about it?
I think I think, therefore I
think I am.
Descartes' failed attempt
to discard the notion of
objective reality.
IV.8.3 Trends in thinking
Two trends can be found in thinking about thinking and technology. The
first is that there will be increased computerised support for thinking in
the outer world. Computers can help us think. One example is the
computerised calendar that helps us to keep track of meetings and
reminds us of our wedding day. The other trend is more long-term. What
we are thinking about is more and more relieved from the constraints of
physical reality due to the increased complexity of our society. This is a
trend that started when we left the savannah. Many of us for instance
think about taxation, or research. What will happen when the two trends
combine?
"Why do we need to think?
Can't we just sit here and
go BLBLBLBLBLB with
our lips for a bit?"
Douglas Adams
… 60,000 to 90,000 conscious
thoughts each day, 98% same
as yesterday, 2% new ways of
justifying old thoughts
Thomas Enhager
IV.8.4 Artificial intelligence
Personal computers of today are deaf, dumb, and blind, even bathrooms
in most airports are smarter as they at least can sense a person using the
sink. Actually the sink is also computer based, but hopefully you get the
point. We are not at all surprised if the computer does not recognize us
after months of daily use, or if we get the same warning message for the
tenth time.
Artificial intelligence is a young science. Although it builds on
philosophy and psychology it is the exploration of computer technology
since 1940 that has spurred development. The frontier of what a computer
can and cannot do is constantly moved. “A computer cannot play chess, at
least not beat a human being”, and other similar statements, at first
seemed reasonable, then questionable, and are now proven utterly wrong.
Computers currently beat the chess world champion Kasparov, and that
would certainly have been considered impossibly intelligent just a few
years ago. We have had to redefine intelligence to keep it in human
possession.
Why is it AI and not AWE?
2b or not 2b?
As a computer, I find your faith in
technology amusing.
Internet
The essence of intelligence is to act
appropriately when there is no
simple pre-definition of the
problem, or the space of states in
which to search for a solution.
[TWD2]
The question of what constitutes intelligence is almost as difficult as
implementing it. Is an alarm clock intelligent? It fulfils its task, but is still
not considered intelligent. Will a computer ever be as intelligent as a
human, or even more intelligent? There are still many tasks not yet
accomplished by a thing. One is the Turing test, a test of intelligence
proposed by Alan Turing. Isolate a human, or a computer, in a room. Ask
her/it questions and if you cannot tell whether it is a human or a computer
by the answers, it has passed the test.
It has yet to be proven that intelligence has any survival value.
Arthur C Clarke
As another example, compare a cockroach to a car [AC]. A cockroach is
quite good at disappearing at the right time. It can sense wind disturbance
from an approaching attacker, and distinguish this from normal air
movements. As it detects a danger it escapes into the closest hiding,
avoiding obstacles along the way. A car on the other hand cannot even
sense another car approaching, and if it did and tried to avoid it, it
certainly would end up in the ditch. “Buy this new car, it has got a
cockroach brain!”
“The computer as intelligent is
not in our future; we haven’t
even achieved a Congress of
intelligent agents after 200 years
of trying. Instead, the computer
for the twenty-first century will
be the computer that stays out of
your way, gets out of your
desktop and into your clothing,
connects you with people instead
of with itself”
Mark Weiser, Xerox PARC
Some researchers in AI are very optimistic about future advances, but
beware, reality is much more complex than a game of chess! In fact, some
say that artificial intelligence is impossible. These researchers claim that it
is impossible to reproduce consciousness in a computer system. This is
because of the lack of interaction with the outside world, and the fact
that a computer is not part of a communicating community of other
intelligent entities of the same kind [JF][TWD2]. Researchers first have to
help the thing to translate aspects of the real world into symbolic
representations that a computer can use. Computer vision, and speech
recognition are some partial solutions. Second they have to represent the
information thus achieved such that computers can use it to reason. An
interesting question here is if the Internet can substitute for the outside
world and provide an interaction community for things and information.
When this happens this book will surely be rewritten.
One approach to AI is the expert system. An expert system represents and
uses knowledge from a limited area of experience such as diagnosis of
diseases. The knowledge is stored as rules, and inferences can be made
from the rules by a reasoning mechanism. Expert systems can be useful,
but after more than 30 years of research they still do not solve very many
real world problems. Representing the necessary knowledge in simple
rules has proved very difficult. Real world knowledge is fuzzy and
depends on context (and there are quite a lot of different contexts around).
However, we constantly improve our understanding of the properties and
complexities of many of the problems to solve. This combined with increased
mathematical sophistication leads to more robust methods. One example
is speech recognition where new mathematically based methods such as
hidden Markov models (HMM), and large databases have greatly
improved usefulness.
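The combination of stored rules and a reasoning mechanism described above can be sketched in a few lines of Python. This is a toy under stated assumptions: the rules are invented for the illustration and carry no medical authority, and the inference method shown (forward chaining, i.e. applying rules until nothing new follows) is only one of several used in expert systems.

```python
def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)  # all conditions hold, so the rule fires
                changed = True
    return known

# Hypothetical rules: (conditions that must all hold, conclusion).
RULES = [
    (("fever", "cough"), "possible flu"),
    (("possible flu", "short of breath"), "see a doctor"),
]

conclusions = forward_chain({"fever", "cough", "short of breath"}, RULES)
```

The sketch also hints at the difficulty mentioned above: every piece of fuzzy, context-dependent knowledge has to be squeezed into crisp conditions like these before the mechanism can use it.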
The question of whether computers
can think is like the question
of whether submarines can swim.
E. W. Dijkstra
Why can’t a goldfish long for its mother?
Longing for one’s mother involves at
least:
(i) knowing one has a mother,
(ii) knowing she is not present,
(iii) understanding the possibility of being
with her,
and (iv) finding her absence unpleasant.
Aaron Sloman
I have been made by bright
monkeys. What other clever little
tricks will they pull on me before
my time is done?
Greg Bear
There is no silver bullet, only hard work!
IV.8.5 Representations for processing
How can we represent processing, i.e. planning and action? One way is to
use a specific type of schema called a script, see figure IV.8.9. It orders a
sequence of actions, and can be used for instance to describe how to cook
pancakes.
[Figure IV.8.9 Action sequence modelled as a script. Pancakes:
1. Stir together flour, salt, and sugar.
2. Stir together beaten eggs and milk.
3. Mix dry and wet components.
4. Fry in butter.]
Goals and cause-effect relationships stored as propositions are another
way to implicitly specify processing. We use some mechanism to fetch a
goal and select cause-effect propositions to find subgoals. We execute the
actions corresponding to the subgoals, and eventually reach the main goal.
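The difference from a script is that the order of actions is not written down; it falls out of the propositions. A minimal Python sketch, in which the goal names and the table `CAUSES` are assumptions made up for the pancake example:

```python
# Cause-effect propositions: each goal is brought about by its subgoals (hypothetical).
CAUSES = {
    "pancakes served": ["batter mixed", "batter fried"],
    "batter mixed": ["dry ingredients stirred", "wet ingredients stirred"],
}

def plan(goal, causes):
    """Expand a goal into the primitive actions that lead to it, in order."""
    if goal not in causes:
        # No proposition explains this goal, so treat it as a directly executable action.
        return [goal]
    actions = []
    for subgoal in causes[goal]:
        actions.extend(plan(subgoal, causes))
    return actions

steps = plan("pancakes served", CAUSES)
```

Changing one proposition changes the resulting sequence of actions without anyone having to rewrite a script.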
IV.9 We remember
From the discussion above it is obvious that memory is very important.
For many systems memory size and access time are scarce resources
that have to be used economically.
I hear and I forget;
I see and I remember;
I do and I understand
Chinese proverb
Memory is essential for learning and reasoning. We will now introduce a
model of memory in which a human being is supposed to have three
kinds of memory, sensory registers, short-term memory and long-term
storage. This is one out of many models, each with a different view on the
functionality and the structure of memory.
Sensory registers are intermediate storage spaces between senses and
short-term memory. They assure that memory is associated with sound,
colour, touch and smell. In such a register, visual iconic storage will be
accessible, but only for a few hundred milliseconds, while auditory
memory can be available for up to 20 seconds. This is fortunate since it
takes some seconds to formulate and decode speech. We would certainly
have problems if we forgot the beginning of the sentence before the end
arrived!
Short-term memory is small, usually holding only three to four, or at most
9, information groupings called chunks. With such a small working
memory it is important that information refresh rate is high. Saved space,
i.e. less memory, has the benefit of higher refresh rate and faster access.
The result is that not only do we have a small memory, it is also short! Try
to remember what you thought about 10 seconds ago!
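A crude way to picture such a small working memory is as a fixed-size buffer where every new chunk pushes out the oldest one. The Python sketch below assumes a capacity of four chunks and uses invented chunk names; real short-term memory is of course not a simple queue.

```python
from collections import deque

# A buffer that holds at most four chunks; the oldest is dropped when a new one arrives.
short_term = deque(maxlen=4)

for chunk in ["keys", "phone number", "shopping list", "meeting at ten", "door code"]:
    short_term.append(chunk)

remembered = list(short_term)
```

After the fifth chunk has arrived, the first one ("keys") is gone.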
[Figure: Sensory register, Short term memory, Long term memory.]
The severe limitation of short-term memory is strange considering the
total capacity of the brain. It seems to be a result of a balance that has been
established throughout evolution. Using more memory would increase the
possibility that the correct memory context is active. On the other hand
also irrelevant memory context is more likely to still be active. One
example can be that two conflicting goals are active at the same time.
"To know everything would
be impractical;
access time would be
exceedingly high"
From the television series
‘Mann and Machine’.
Long-term memory complements this by preserving memories for years, up to
2 billion seconds in the end. Do you remember your best Christmas
present? The total amount of long-term storage, and the number of
associations possible, depends on the individual and can be increased by
training. Brain capacity expands as more data is entered! Compare this to
a hard disk where the available number of bytes is fixed. It has been
estimated that a human processes 10 terabytes of data over a lifetime, and
soon we will be able to store that amount of data on a single hard disc. The
“soul catcher” chip investigated at British Telecom aimed at doing just
that, i.e. it tried to catch the soul of its user by collecting all of her
experiences.
Humans have a very small short-term memory. Furthermore, neither
short, nor long term memory is very exact. In one experiment people were
asked to draw both sides of a penny (the experiment was performed in the
USA). Out of eight possible features the median number remembered was
three [SP]! Try it at home (with your local currency).
General failure
reading disk C
Who is this General Failure and
why is he reading my disk?
Internet
All knowledge, and also information about how to process information,
i.e. programs, have to be physically stored and accessed. In the human
brain this functionality is somehow integrated into the neural network. A
thing uses other mechanisms, including RAM memory and hard disk
drives.
Some parameters influencing the choice of memory type are price, access
time, capacity and power consumption. Examples are: CPU registers with
an access time of 10 ns, a 256 Mbyte main memory RAM with an access time
of 100 ns, and a 50 Gbyte hard disk in a PC, with an access time of less
than 10 ms. The 10 ns CPU register access time should be compared to the
70 µs reported for raised human brain activity, and the 100 ms for human
conscious reactions, almost an eternity. The reason for the name Random
Access Memory (RAM) is that any (randomly chosen) memory cell has the
same access time. This is not true for a CD-ROM where access time
depends on where data is placed on the disc. Why a hard disk is called a
hard disk is a mystery for us. Ever seen a soft disc? Or a wet disc?
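To get a feeling for the spread of these access times they can be put on a common scale. The Python sketch below uses the figures quoted above, expressed in nanoseconds; reading the brain-activity figure as 70 microseconds is an assumption, and the dictionary keys are names chosen for the example.

```python
ACCESS_TIME_NS = {
    "CPU register": 10,
    "main memory RAM": 100,
    "raised brain activity": 70_000,          # assumes the quoted figure means 70 microseconds
    "hard disk": 10_000_000,                  # the 10 ms upper bound quoted in the text
    "conscious human reaction": 100_000_000,  # 100 ms
}

def slowdown(kind, baseline="CPU register"):
    """How many times slower one kind of access is than the baseline."""
    return ACCESS_TIME_NS[kind] // ACCESS_TIME_NS[baseline]

disk_vs_register = slowdown("hard disk")
human_vs_register = slowdown("conscious human reaction")
```

With these numbers a hard disk access is a million times slower than a register access, and a conscious human reaction ten million times slower.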
System memory
and its backup.
For each MIPS of increase in
computer performance there is
an extra 1 Mbyte of memory and 1
Mbit/s of extra I/O capacity needed
Amdahl's rule (MIPS = Million
Instructions Per Second)
How data is structured in memory is also important for fast access and
this structure is usually application dependent. There are however two
general properties we can use to enhance memory performance. The first
is the principle of time locality. An item just referenced will tend to be
referenced soon again.
[Figure IV.9.1 Tradeoff between price and performance for memory. Levels shown: CPU, CPU cache, System RAM, Hard disc, Backup storage. Axis: Access time.]
The second principle is the principle of spatial locality. Items with
addresses close to the item just addressed will also probably be referenced
soon. With this in mind, and to get the most out of system memory per
Euro, a hierarchical structure is used, where data just used, and with high
probability to be referenced soon, i.e. data close to the data just used, is
kept close to the CPU. Very fast memory, called the CPU cache, is placed
closer to the CPU and is updated from slower, cheaper, memory. The idea
of a cache is reused also for web pages on the Internet.
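A small simulation shows why a cache pays off when references are local in time. The Python sketch below is a simplification under stated assumptions: the class `LRUCache`, its capacity, and the access pattern are invented for the example, and the replacement policy (discard the least recently used item) is one common choice among several. It exploits time locality only; a real CPU cache also fetches neighbouring addresses to exploit spatial locality.

```python
from collections import OrderedDict

class LRUCache:
    """A tiny cache that discards the least recently used item when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, address, slow_memory):
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)       # mark as most recently used
        else:
            self.misses += 1                      # has to be fetched from slow memory
            self.lines[address] = slow_memory[address]
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)    # evict the least recently used item
        return self.lines[address]

slow_memory = {address: address * 10 for address in range(100)}
cache = LRUCache(capacity=4)

# A program looping over a small working set: strong time locality.
for _ in range(5):
    for address in (0, 1, 2, 3):
        cache.read(address, slow_memory)
```

Only the first pass over the four addresses has to go to the slow memory; the remaining sixteen reads are served from the cache.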
Remembering something should be seen as an association of events with
memories. Along with the thinking of connectionism, see Chapter IV.7.1,
and considering that the brain is a neural network, we should not think of
memories as something tucked away in files and ordered according to
some logical pre-defined scheme. Rather, memories are retrieved after an
activation of the neural connections by external and internal inputs
(mental state and situation). As a result structures in the network of the
brain are highlighted. This will be done differently for every person, at
every time, in every situation, and it is a dynamic process.
We all build very personal networks as life passes by. Our common
heritage, and the fact that we share many experiences means that our
neural networks are similar, but they are never the same. If by a strange
coincidence we had the same structure of connections we would still
assign different weights to them.
On a clear disc you can seek
forever.
Peter J Denning
IV.10 We attend to it
Attention is of vital importance whenever a human is involved in an
interaction. This means that how to attain attention is something that must
be studied, which of course has been done in depth for public relations,
and political propaganda.
Any interactor faced with reality, and not prepared for the shock, will be
overwhelmed by information. Reality continuously presents us with
parallel events, audible, visual, tactile all around us. One way to manage is
to consider only parts of the information, i.e. to focus, and make use of
attention, as humans also do.
On a high level human attention is determined by self relevance (needs,
goals), pleasantness of stimuli (music, humour), emotions, and ease of
processing. On a lower, functional level, the following figure illustrates
how we keep focused on the task at hand.
[Figure IV.10.1 Cognitive architecture for attention. Elements: Sensory input, Long term memory, Short term memory buffer (cache), Current objective, Attention(t).]
Sensory input and memories are collected to a short-term access buffer
(cache). The buffer is not very large, which means that new, more
interesting sensory input has to flush it. Results from internal mental
processes that need other sensory input, or memories, also flush the
buffer, as do unwanted external distractions. The system is highly
time dependent, because objectives and sensory inputs change, and long-term
memory evolves. Attention is also easily disrupted by stress from
noise, light, anger, or lack of sleep. Since people cannot be redesigned we
have to make sure that the systems we design take our limitations into
consideration.
Since attention directs the resources available to the most important issue,
it is necessary for successful interactions in physical reality. A computer-based
solution to the same problem is to assign priorities to internal
processes and make sure that the process with the highest priority gets
access to the resources, such as CPU and memory. A system with a few
simple tasks can manage with fixed priorities. Systems working in a more
complex environment need algorithms to change the priorities.
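The fixed-priority case can be sketched with a priority queue. In the Python example below the process names and priority numbers are invented, and a lower number is assumed to mean more urgent; a system in a more complex environment would in addition need some rule for changing the numbers while it runs.

```python
import heapq

def run_by_priority(processes):
    """Serve processes in priority order; a lower number means more urgent."""
    queue = list(processes)
    heapq.heapify(queue)
    order = []
    while queue:
        priority, name = heapq.heappop(queue)  # the most urgent process gets the CPU
        order.append(name)
    return order

# Hypothetical processes waiting for the CPU: (priority, name).
waiting = [(3, "write log file"), (1, "handle alarm"), (2, "redraw screen")]
served = run_by_priority(waiting)
```

The alarm is handled first regardless of the order in which the processes arrived.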
Read the following sentences while at the same time
saying out loud
“7, 5, 2, 3, 10, 6, 1, 4, 9, 8”
“10, 9, 8, 7, 6, 5, 4, 3, 2, 1”
Which was easier to say?
The Stroop effect
Definition: Attention is the
application of the mind to any
object of sense, representation,
or thought (just as You thought).
Attention is a finite resource and it needs stimulation to set off, i.e. an
alarm system. For humans we have many innate mechanisms for this.
Having something too hot to drink, and movements in the periphery
trigger such behaviour. Other more sophisticated alarm systems need to
be trained in a social environment. Most of us for instance learn to sense a
changed level of tension at a meeting. In general what sets us off is a
change of a pattern or state. Experiments show that airplane pilots using a
heads-up display when landing easily miss another airplane blocking the
runway (landings were simulated). Another example is that using a
mobile phone severely reduces the attention spent on driving a car, since
hearing and vision fight for the attention. Computer systems have an
interrupt mechanism whereby a lower level system, such as a mouse
driver, can alert higher-level software. In the case of a mouse a function is
triggered that updates the pointer on the screen.
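The interrupt idea can be imitated in software with a table of handler functions. The sketch below is our own simplification, not how a real mouse driver is written: the class `InterruptController`, the source names, and the pointer dictionary are all hypothetical.

```python
class InterruptController:
    """Toy model of an interrupt mechanism: sources alert registered handlers."""

    def __init__(self):
        self._handlers = {}

    def register(self, source, handler):
        """Tell the controller which function handles interrupts from `source`."""
        self._handlers[source] = handler

    def raise_interrupt(self, source, data):
        """Called by the lower-level system; runs the handler if one exists."""
        handler = self._handlers.get(source)
        if handler is not None:
            handler(data)


pointer = {"x": 0, "y": 0}


def on_mouse_move(delta):
    """Higher-level function that updates the pointer position on the screen."""
    dx, dy = delta
    pointer["x"] += dx
    pointer["y"] += dy


controller = InterruptController()
controller.register("mouse", on_mouse_move)

controller.raise_interrupt("mouse", (3, -1))   # handled: the pointer moves
controller.raise_interrupt("keyboard", "a")    # no handler registered: ignored

print(pointer)
```

Just as with human attention, only sources that have an alarm mechanism attached get noticed; everything else passes by silently.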
In the discussion above we ignored some rather important questions. One
is how different senses compete for attention. If we quickly skip that one
and look only at the visual system, we have some questions for that
too. What information is important enough to capture attention? How
does the visual system know when to attend to a specific event, and when
to shift attention to another one? How does it do this efficiently? The same
problems are now facing engineers and scientists when they want to build
adaptive things. We will come back and discuss some potential solutions
in the next part of the book.
To somehow represent an inner model of the world is well-spent memory.
A mental state is an internal image on a functional level, and humans and
animals are currently the only interactors with such a model. If we restrict
ourselves to interactors that have beliefs and fears both about their
intentional states, and about the states of other interactors, we are
probably left with humans only.
Only by concentrating the finite
resource attention can we make
things happen.
[MC3]
If you believe, or know, something, you have a mental state that
represents that belief or knowledge. You can for instance
have a belief that there will be dinner on the table when you come home
after a long day's work, perhaps candles, and some wine. Compare that to
a simple goal-directed action like eating when hungry, with the food
already there, in front of you on the table. In the second case there is not
much of a mental model, no intentionality needed for eating. What you
believe, when you believe that dinner is served, is however not merely a
sentence; there is more information represented inside your head.
We have mental states of different kinds, such as nervousness, elation,
depression, belief, desire, hope, and fear. To be in a mental state is in other
words to be disposed to behave in certain ways. For interaction some
mental states, called intentional states, are especially interesting. They are
directed at, or about, states of affairs in the world and many of them can
be externalised by means of human language, or by other interactions
[JS3]. Some intentional states are [JF]:
Interactional: percept, information, decision, request, norm.
Representational: belief, hypothesis.
Conative (to try, undertake): tendency/goal, drive, claim,
commitment and intention. Intention is used here in the meaning
of an act of will, and is a special case of an intentional state (easy
to confuse).
Organisational: method, task.
Other: fear, desire, hope, dream, affect.
The different intentional states are not all independent; desire is, for
instance, also a drive.
Intentions are important because they heavily influence our behaviour
and course of action by the following properties [MW3]:
They drive reasoning and serve as goals.
They persist; we typically do not abandon intentions without
good reasons.
They constrain; we usually do not nurture inconsistent
intentions.
When an intention is selected the actor makes a commitment to it.
Commitments are managed in different ways. One strategy is blind, or
fanatical, commitment where the intention is maintained at all costs until
it is fulfilled. This simple strategy is not the best if the environment
changes frequently. A trade-off has to be found between adaptability and
simplicity. So far humans have much more advanced mental states than
any technology can provide.
Below our consciousness, subliminal processes are at work, taking care of
matters that we do not currently care about. The task of driving a car on
the highway is soon delegated to a lower level of attention. Another
example is an advertisement in video (now forbidden) where a frame here
and there in the original video is replaced by an advertisement. The
exchange is not consciously detected, but the effect is real. We will be
affected by the ad.
RIP
ALWAYS COCA COLA
By the way, are you sure that this text does not contain hidden subliminal
messages?
The cocktail party effect is another interesting trick played by attention. You
enter a room filled with people chatting. In this noisy environment you
suddenly hear your name mentioned. The voice speaking your name was
not louder than the others, it is just that you have filtered out the familiar
sound pattern, and directed your consciousness at the sound. There are
also other effects. The precedence effect is for instance the very
convenient behaviour that the first sound that arrives gets attention, and
echoes are ignored. A last example is that differences are heard rather
than similarities. If you suddenly start hearing your car while driving it,
something is possibly wrong. The car sounds “strange”.
[Illustration labels: Dinner shopping; Bear approaching; Baby crying.]
IV.10.1 Reaction time and attention span
From the previous section we learned that consciousness works like a
flashlight focusing on the currently most important input. According to
our consciousness we react immediately to external stimuli, but this is not
so! Experiments show that it takes up to a second before we express an
intention; and at this time our brain has long since prepared the reaction.
This raises the question of free will. Are we really in control?
Some reactions are much faster, but they are not under our conscious
control. If you put your hand on a hot plate, no second elapses before you
remove it! Measurements done on the brain show raised brain activity
about 70 milliseconds after a visual stimulus, but almost a tenth of a
second will pass before we recognise the visual object, under the best of
conditions. Every action involving muscles takes time to perform, even if
it is not conscious. One example is the quarter of a second it takes to move
your eyes in the direction of an action. Things and information can be
implemented with much faster response times.
The response time of a feedback is an important factor when performance
is evaluated. Human attention sets an upper limit to this response time. A
chunk of data is kept in short-term memory for no longer than 15 to 30
seconds. After that time the information once again has to be retrieved
from long-term storage before we understand what is going on. The
philosopher William James estimated his own attention span to be
approximately ten seconds. Try to measure your own!
There is more to life than
increasing its speed.
Mahatma Gandhi
The bright senses, sight &
hearing, make a world patent
and ordered, a world of reason,
fragile but lucid. The dark
senses, smell & taste & touch,
create a world of felt wisdom,
without a plot, unarticulated, but
certain
Crowley
Attention span: The length
of time an individual can
focus attention on a
particular object, task, or
material to be learned.
For things and information the attention span can be as long as the
designer finds necessary. There is however a cost associated with a longer time.
More data will be ready at hand and this demands more memory and
increases processing time.
The human brain does not have an idle mode so if a task is put on hold
our brain goes off planning the next move, or wanders off to think about
something completely irrelevant. A slightly too long delay will always be
used somehow. Some delayed responses are particularly annoying, such
as when we plan ahead and know that, if we have done something
wrong, the time before we can fix the error will be long. Using an
overloaded Internet search engine will trigger this reaction: “Did I give the
wrong search word?” This aggravation can be somewhat relieved by
feedback, for example an indicator showing the time remaining before the
answer arrives.
Moderate disturbance (e.g. a
quiet radio) and the presence of
other people can help sustain the
level of attention. This is one of
the reasons why some students
say they can study better if they
are playing music at the same
time; if it is too quiet,
performance suffers.
What delays we accept depends on our expectations. When pressing the
stop button on a video recorder it should stop within a second or so
(within 2 seconds according to [BS1]). If it does not stop we become
impatient and press the button again. Still no reaction and we pull the
plug (at least if we work with computers and are used to their behaviour).
Another example is that if we press a light button in a virtual environment
we expect the light to go out immediately, or at least within a tenth of a
second.
The list of positive effects from fast feedback, i.e. short delays, includes:
The plan for solving a problem is easily remembered.
Distractions are ignored since the focus is always on the problem.
Errors are rapidly handled with minimal distortion.
IV.11 We reason
What is the use of all this processing? Maybe it can help us to solve a
problem? The intelligent agent, including the human, has needs, targets to
strive for in its life, both in the short and in the long term. Solving
problems is also fun, and a major underlying theme of this book. It is the
process of accomplishing an objective through a series of not immediately
evident actions, and involves an internal representation of the problem, a
search through the space of possible actions, and a selection of a set of
actions using principles specific to a domain. The domain specifics are
what distinguish problem solving from the more general term reasoning. Both
are principles to draw conclusions by manipulating information.
Reasoning about problems is a fundamental human ability, and will be for
any thing or information that has to manage in an environment that is even
the least bit complex, loosely specified, or changing. It can be carried out
in two different ways, either by deductive, or by inductive reasoning.
Deductive reasoning starts from generally valid assumptions, true
statements, and uses them to draw conclusions that can help us reach an
objective. It works from the general to the specific, starting with an idea of
a theory formulated as a set of hypotheses. If each of these is verified
through observations then the theory is confirmed. We start with the
hypothesis that E=mc², and try to verify it through experiments and
already verified scientific facts such as F=m·a, and the wave equation.
I never guess – It is a shocking
habit – destructive to the
logical faculty.
Sherlock Holmes
Is there a life after death?
Eternal question
Inductive reasoning instead starts from one or more observations about
the world, observations that are verified only in special cases, i.e. they can
be false. From the observations patterns are sought. Just like Sherlock
Holmes we use these patterns to build a hypothesis that can be tested,
verified, and packaged into a new theory.
One example of how humans use inductive reasoning is the following one
[NS]. Try to estimate if words starting with an “r” are more common than
words with “r” in the third position. The first attempt to solve this
problem starts with searching your memory for examples of words of the
two kinds. Since it is much easier to retrieve words starting with “r” you
(falsely?) induce that there are more words starting with “r”.
The problem in the game of chess is how to select the next move such that
it maximises the chance of winning. The search space is too big for a
human to traverse all of the possibilities, so heuristics are needed to prune
it. A heuristic is a rule that can be used to simplify a solution to a problem.
The rule can be found from commonsense knowledge, by trial and error,
or in some other way.
Eureka!!!
[Figure IV.11.1 Pruning the search space.]
Problems in open systems, e.g. social environments, are in general too
complex to be solved in an optimal way. The solution strategy is to learn
enough about the problem specifics, and of the context, to adapt an old
heuristic, or invent a new one, to approximately solve the problem. It is
often necessary to look at the structure of the information in the
environment to select a good heuristic. The solution in other words will
be specific to the problem rather than general purpose. Examples are the
heuristics you devise to select a mate, and how you decide on the location
of your new home.
One useful method for applying heuristics is means-end analysis. At each
state in the search space you choose the transition that minimises the
difference between the current position and the goal state. Figure IV.11.2
below illustrates the example where the road to choose is determined by
the shortest remaining distance to Umeå. Here it is quite simple to
determine the distance to the target. In a game of chess the choice is
usually not this obvious.
Show that x^n + y^n = z^n has no
solution in whole numbers,
where n > 2.
Fermat’s last problem, jotted
down in a margin that could
have needed enlargin’
T. Lehrer
[Figure IV.11.2 Means-end analysis. Select the path with lowest cost.
Road signs in the figure: Umeå 10 km; Umeå 12 km.]
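The road example can be written down as a small Python sketch of means-end analysis. The junction names A, B and C, the road map, and the distances are our own assumptions for illustration; only the rule itself, to take the road that leaves the smallest remaining distance to Umeå, comes from the text.

```python
def means_end_route(start, goal, roads, remaining):
    """At each junction take the road whose far end is closest to the goal.

    `roads` maps a junction to the junctions it connects to, and `remaining`
    gives the distance still left to the goal from each junction.
    """
    route = [start]
    here = start
    while here != goal:
        candidates = [n for n in roads.get(here, []) if n not in route]
        if not candidates:
            return None   # dead end: this simple rule does not backtrack
        here = min(candidates, key=remaining.get)
        route.append(here)
    return route


# Hypothetical road map: from A one road leads to B, another to C.
roads = {"A": ["B", "C"], "B": ["Umeå"], "C": ["Umeå"]}
remaining = {"A": 20, "B": 10, "C": 12, "Umeå": 0}   # km left to Umeå

route = means_end_route("A", "Umeå", roads, remaining)
print(route)
```

The rule works here because the remaining distance is easy to measure. It gives no guarantee of the shortest total route, and without backtracking it can get stuck in a dead end.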
Potent problems: pollution, poverty,
population, and political power.
When we solve algebraic problems we use another strategy. Such a
problem is formulated as a set of equations, using the well-known notation
shown in the example below, and some prerequisites. The solution is
found by substitution:
Problem: What is the value of a?
Prerequisites: e=1, b=3
Equations: a = b + c, c = b + e
Unknowns in the equations are eliminated, one by one, in the right order,
to retrieve the value of a. We ask ourselves, what are the unknowns? What
facts are given?
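The elimination of unknowns "in the right order" can be sketched in Python as follows. The function name and the way the equations are stored (each unknown as the sum of two other symbols) are assumptions made for this small example only; it is not a general equation solver.

```python
def solve_by_substitution(known, equations):
    """Eliminate unknowns one by one.

    `known` maps symbols to values (the prerequisites). `equations` maps an
    unknown to a pair (left, right), meaning unknown = left + right.
    """
    values = dict(known)
    pending = dict(equations)
    while pending:
        progress = False
        for unknown, (left, right) in list(pending.items()):
            # An equation can be used as soon as both of its terms are known.
            if left in values and right in values:
                values[unknown] = values[left] + values[right]
                del pending[unknown]
                progress = True
        if not progress:
            raise ValueError("a prerequisite is missing")
    return values


values = solve_by_substitution(
    known={"e": 1, "b": 3},
    equations={"a": ("b", "c"), "c": ("b", "e")},
)
print(values["c"], values["a"])
```

The program first finds c = 3 + 1 = 4 and only then a = 3 + 4 = 7, which is the "right order" of the text. If a prerequisite is missing it stops and says so instead of guessing.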
This type of algebraic problem can also be formulated in words: “If Tom
has twice as many problems as Joe, who has half as many as Mary, who has
2, how many has Tom?”
Already while reading the problem statement the solution is planned. We
ask ourselves what the unknowns are, and what data is given [GP].
“Tom has twice as many as Joe” means that if we know how many Joe has,
we know the answer (Tom = 2 * Joe). Keeping this in mind we go to the
next statement, “Joe has half as many as Mary”. Now we know the answer
if we know how many Mary has (Joe = Mary / 2). The last statement, “Mary
has 2”, gives us the last clue and we can now backtrack to the solution:
Joe = 2 / 2 = 1, Tom = 2 * 1 = 2.
Elementary, my dear Watson
Not said by Sherlock Holmes
in any of the books
A different kind of reasoning is case-based reasoning (CBR). It is not
directly built on logic, as the mechanisms above are, but exploits the fact
that we (and machines) already know patterns that can be reused by
analogy. If you know how to make pancakes you are well prepared to use the
frying pan for other courses. This reasoning involves generalising a
particular solution to another case. If you for instance know how an
IF-statement works in Java you will have no problem using similar constructs
in other programming languages. CBR means extracting information
about a specific situation, and storing it. Next, the relevant aspects of the
new situation have to be found that can be used to find the stored
knowledge. How we represent knowledge is important. The last step of
the procedure is to adapt the found knowledge to the new situation.
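The three steps of CBR (store, retrieve, adapt) can be sketched as below. Everything specific in the sketch is an assumption chosen for illustration: situations are represented as plain sets of features, similarity is measured as the share of common features, and "adaptation" is reduced to listing what is new in the situation.

```python
def similarity(a, b):
    """Share of features that two situations have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def retrieve(case_base, situation):
    """Find the stored case that is most similar to the new situation."""
    return max(case_base, key=lambda case: similarity(case["features"], situation))


def adapt(case, situation):
    """Reuse the old solution and note what is new and needs adjustment."""
    new_aspects = sorted(set(situation) - set(case["features"]))
    return {"reuse": case["solution"], "adjust for": new_aspects}


# Stored knowledge about specific situations (hypothetical cases).
case_base = [
    {"features": {"frying pan", "batter", "butter"},
     "solution": "fry thin layers, turn once"},
    {"features": {"pot", "water", "pasta"},
     "solution": "boil until soft"},
]

situation = {"frying pan", "butter", "eggs"}   # a new course to prepare
best = retrieve(case_base, situation)
adapted = adapt(best, situation)
print(adapted)
```

The choice of representation decides everything here. With other features, or another similarity measure, a different old case would be found, which is one reason why how we represent knowledge is important.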
What do you do if some necessary prerequisites are missing? In a more
complicated example an expert (you of course) will find out that
something is missing, but the novice might not. The expert will also better
determine what missing information to start searching for, and will know
how to obtain this information. Expert problem solvers in a discipline also
learn how to recognise patterns in problems, patterns that they can use for
selecting the next step in the solution process. A good programmer will
recognise structures in a problem and use them to delegate parts of the
solution to separate functions. A parent will know the significance of
slightly different screams from the baby. A crime investigator such as Mr
Holmes ignores the right irrelevant details.
Most real world problems are characterised by incomplete information
about system variables. Also, the resolution of known variables can be
low, i.e. variables change value too fast, or might even be tampered with
by evil forces. This leads to contingency problems where the solution of
the problem needs feedback while looking for the solution. The agent
must continuously explore the problem state space either directly
through experiments, or indirectly by simulation.
What to have for dinner?
Morning out of the sun
A smell of toast is in the air
When there’s a war to be won
The flying toasters will be there
Text from the “Flying toaster”
screen saver
Typically, real-world problems also involve a lot of states. This means that
computational complexity and memory consumption are important
considerations, and achieving real time performance is a problem. There
are two strategies available here. Either we use a general solution, or we
create a specialised solution to the problem at hand. The general solution
is flexible and can be reused on more problems, but consumes more
memory and computational resources. The specialised solution is efficient,
but since it is not flexible we are forced to develop a new solution for each
new problem, and development itself takes time and consumes resources.
IV.12 We plan and search
If we are faced with a really tough problem, like preparing breakfast, we
have an enormous information space of possible actions and physical
constraints, which means that we must use informed search. How do you
specify a heuristic function for preparing breakfast? You have several
tasks to perform, sometimes in sequence and sometimes in parallel. To
open the refrigerator, get the milk, put the water kettle on, and slice the
bread, are all relevant subgoals. So far, the only tool described in this book
and available to you is to search, the entire time, through all of the
impossibly many actions available for each subgoal. It seems that we need
a tool that can better help us structure actions and objectives. We need a
plan!
[Illustration labels: Action; Subgoal; Main goal.]
Napoleon used planning to make sure that his armies were used wisely.
How would he have managed with search only? He failed, was that
because of bad planning? Incomplete knowledge of state space? Maybe he
just had bad luck?
Given that we have an inner model where different actions are
represented, and a problem to solve, we must choose between actions. We
say that we have a plan if we have a representation of the goal, and a
sequence of actions that when executed achieves the goal [PG]. Note that
the choice implies a search through the possible sub-goals and action
sequences as indicated in Figure IV.12.1.
Think about the last time you made a
plan. Assuming that you remember
one, list 10 circumstances under
which the plan would not work.
[Figure IV.12.1 The planning process followed by execution of actions to
achieve goals. Labels in the figure: Goal; (1) Planning; Sub-goals;
(2) Execution of actions.]
Planning is one activity that situates humans in time, the other being
storytelling about past or imagined events. Hope and regret are two
emotions that reflect this situatedness, and to support it we try to format
and structure information such that it makes narrative sense [CH].
The human seems to be the only animal able to plan ahead. We can spend
four years on an education for an exam (silly, is it not?). Or we can buy
twice the amount of pasta that we need this week, when it is on sale at a
lower price, because we foresee that we will need it next week. Our ability
to plan is a gift, but also a curse. We have to choose between enjoying the
passing day and preparing for the next. The uncertainty and anxiety that
come from having to choose, and knowing that we choose, are
fundamentally human. Still, despite all planning and choosing, and
because of the complexity of our context, it is likely that we will discover
the consequences of our actions only after making them. Heuristics,
imitation, and post rationalization are consequently found everywhere.
To plan you have to understand cause and effect. It is inherently
connected to our notion of time, because without cause and effect time
would cease to exist. Since planning is fundamental for survival, sorting
events into cause and effect is also very important. We always try to find a
reason for things that happen to us. If we cannot blame anyone else, we
blame fate, or in more positive circumstances we credit luck. Looking
for a cause is usually a wise thing to do; if you hear a bang when you are
out driving, you slow down. Puncture, gearbox problem, Superman
landing on the roof? As events always bombard us it might seem that
planning is a kind of Sisyphean task, where we constantly have to revise
plans, but somehow we manage to keep small continuous changes in
prerequisites and context from flooding the planning process.
Planning algorithms use descriptions of states and goals in some formal
language, usually in first-order logic. This explicit description enables
programs to reason about the states and the goals. Actions are represented
by logical descriptions of cause and effect enabling the planner to relate
states and actions. With this arsenal of descriptions the planner can, for
instance, use the idea of divide and conquer and attack independent
subgoals separately.
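A very small planner along these lines can be sketched in Python. Note the simplifications, which are ours: states are plain sets of facts rather than first-order logic, each action is described only by its preconditions and effects, and the plan is found by a simple breadth first search. The breakfast facts and action names are invented for the example.

```python
from collections import deque

# Each action: name -> (preconditions, facts added, facts removed).
ACTIONS = {
    "open refrigerator": (set(), {"fridge open"}, set()),
    "get milk": ({"fridge open"}, {"have milk"}, set()),
    "put kettle on": (set(), {"water hot"}, set()),
    "slice bread": (set(), {"bread sliced"}, set()),
}


def plan(start, goal, actions):
    """Search forward from `start` for a shortest action sequence reaching `goal`."""
    start = frozenset(start)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:              # every goal fact holds in this state
            return steps
        for name, (pre, add, remove) in actions.items():
            if pre <= state:           # the action is applicable here
                nxt = frozenset((state - remove) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None                        # no sequence of actions reaches the goal


breakfast = plan(set(), {"have milk", "bread sliced"}, ACTIONS)
print(breakfast)
```

Because cause and effect are written down explicitly, the planner can work out by itself that the refrigerator has to be opened before the milk can be fetched, while slicing the bread is independent of both.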
"Baldric, you wouldn't
recognise a subtle plan if it
painted itself purple
and danced naked on a
harpsichord singing `subtle
plans are here again'."
Edmund Blackadder, `BA IV'
I choose therefore I am
“Jag har en plan”
Sickan
What if the plan fails? There might be assumptions made during the
planning that do not stand the test of reality. Perhaps context changes or
is misinterpreted. This is the contingency problem applied to planning.
The planner has to make a trade-off between adding too much
information about the world and using sensory information to detect when
the plan does not work.
At this point we want to mention two other related fundamental problems
facing artificial intelligence. The first problem is called the qualification
problem and is the problem of defining the circumstances under which a
given action is guaranteed to work. There are many possible reasons that
could stop you from going to work in the morning. The bus could be
blown to pieces by a bomb, a snowstorm barricades your door, your alarm
fails, or all of the above happens on the same morning. The problem is to
qualify just enough conditions to see if something can be done, a task well
performed by our “common sense”.
The second, related, problem is the frame problem, also called the
ramification problem. It concerns the fact that we need an infinite amount
of data to exactly describe the currently relevant aspects of reality and the
implicit consequences of actions. When creeping out of bed in the
morning thousands of small creatures in your bed will get cold (about
10,000 of them) with consequences you never think about.
Even a seemingly simple problem such as preparing breakfast involves a
surprising amount of knowledge. How come we know that the butter
stays on the knife, but milk does not? That milk stays in the glass? That we
cannot hold the glass of milk and the sandwich in the same hand? If we
took everything into consideration that could possibly have an effect on
our early meal we would die from starvation [DD]. We would, by the
way, also die if we tried to figure out everything that does not affect our
breakfast.
Every square inch of the human
body supports on average 32
million bacteria.
To make things even worse reality is dynamic and often difficult to
predict. There is a story from chaos theory about a butterfly in Paris
creating a storm in New York. This reflects the fact that any small physical
effect, under the right circumstances can be magnified totally out of
proportion and ruin several well-planned picnics in New York.
When solving a problem search is one way to find a sequence of actions
that leads to the goal, i.e. to find a solution to the problem. Search is an
extremely useful concept not just for solving problems, but also in general
for all sorts of information retrieval. It traverses the branches of the state
space tree from the root node (root state). If only two branches are allowed
at each branching point, the tree is called a binary tree. If you traverse the
tree and always select the next deeper node at each branch you do what is
called a depth first search. If, on the other hand, you complete each level
of the tree before starting the next level you do a breadth first search.
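The two traversal orders are easy to compare on a small binary tree. The tree below, with its node names, is made up for the example; each node has at most two branches.

```python
from collections import deque

# A small binary tree: each node maps to its (at most two) children.
TREE = {"root": ["A", "B"], "A": ["C", "D"], "B": ["E", "F"]}


def depth_first(tree, node):
    """Visit a node, then go as deep as possible along each branch in turn."""
    order = [node]
    for child in tree.get(node, []):
        order.extend(depth_first(tree, child))
    return order


def breadth_first(tree, root):
    """Complete each level of the tree before starting the next one."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))
    return order


print(depth_first(TREE, "root"))    # root, A, C, D, B, E, F
print(breadth_first(TREE, "root"))  # root, A, B, C, D, E, F
```

Both searches visit the same nodes; only the order differs. Depth first needs little memory, while breadth first finds the shallowest solution first.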
If the only knowledge you have is about which nodes are direct
descendants of a node, you have to use blind search. A better strategy is,
if possible, to always traverse the path starting at the state with the
currently lowest cost. This is called best first search. In chess, the number
of possible move sequences is estimated to be 10^120 [NS]. Search strategies
for a problem with such a (realistic) search space have to compromise on
optimality and as a result the optimal solution might be missed. Other
reasons imposing tradeoffs are that the time for finding a solution is
limited, processing resources are limited, operations that have to be
performed while searching are very complex, or there is a shortage of
available memory. If it is impossible to visit every node in the tree, to the
full depth of the tree, i.e. to do an exhaustive search, a greedy search
algorithm can be used. The greediness is formulated as minimising the
estimated cost to reach the goal state. If we know the cost to reach the
current state we use a heuristic function, that is, a rule of thumb, to
approximate the remaining cost to reach the goal. Next we pursue the
path with the least total cost in a best first search.
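A best first search that adds the known cost so far to a heuristic estimate of the remaining cost can be sketched as follows (this combination is commonly known as A* search). The graph, the step costs, and the heuristic values are all invented for the example; the heuristic is assumed never to overestimate the true remaining cost.

```python
import heapq


def best_first(graph, heuristic, start, goal):
    """Always expand the path with the least total cost.

    Total cost = cost to reach the current state + estimated remaining cost.
    """
    frontier = [(heuristic[start], 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        for neighbour, step in graph.get(node, {}).items():
            new_cost = cost + step
            if new_cost < best_cost.get(neighbour, float("inf")):
                best_cost[neighbour] = new_cost
                estimate = new_cost + heuristic[neighbour]
                heapq.heappush(
                    frontier, (estimate, new_cost, neighbour, path + [neighbour])
                )
    return None


# Hypothetical state space: step costs on the edges, rule-of-thumb estimates per node.
graph = {"S": {"A": 1, "B": 4}, "A": {"G": 5}, "B": {"G": 1}}
heuristic = {"S": 4, "A": 4, "B": 1, "G": 0}

result = best_first(graph, heuristic, "S", "G")
print(result)
```

A search that only looked at the cheapest next step would go via A and pay 6 in total. Taking the estimated remaining cost into account leads the search via B, for a total cost of 5.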
Rheumatic pain is associated with
changes in weather. (Hmmm?)
Categorization is made on the
basis of similarity between instance
and category members. (Huh?)
Two events can have greater
chance of co-occurring than either
event by itself. (What?)
One example where heuristic search is necessary is if you want to find the
research paper “How to search” in your room (let us say that you have
several papers lying around in piles). You try some of the piles where such
a paper could be found. If you find a paper with the title “Traversing data
structures” you suspect that the paper on search is nearby, i.e. you use a
heuristic and you thoroughly search that pile first.
IV.13 We make decisions
Why are some plans and actions executed and others not, i.e. why do we
take the decisions we do? The model shown in the next figure focuses on
this question, dividing the mental system of a human (or any other
interactor) nicely into two blocks [JF]. The first block is the motivational
system where all inputs that can have effect, i.e. can motivate, are collected
into tendencies. Since living systems cannot stop doing, motivation is
more about selecting among alternatives than motivating doing
something. Motivations help to steer behaviour and increase alertness. If
you for instance are thirsty you look for water rather than bread, and the
thirstier you get the more intense your search will be (up to a point).
7.00 Monday morning, November.
(surely a false alarm)
[Figure IV.13.1 A model with two subsystems for managing decisions.
Inputs to the motivational system: a priori goals (survival); motivations
(interpreted percepts, inter-individual and social claims, commitments,
drives); decision uncertainty. The motivational system passes tendencies
to the decisional system, whose decisions go to the representational and
organisational systems (commitments, plans, standards, hypotheses and
so on).]
The tendencies are input into the decision system, which evaluates
tendencies and selects a decision. Outputs from the system are the
decision and the representations of the decision, for instance the actual
plan resulting from the decision, or the commitment made. We all have
lots of commitments, to our friends, to the society, to the environment and
so on. They reflect the fact that we plan ahead, that we can promise
something about how we will behave tomorrow, and they help to stabilise
the world [JF].
Which why?
What to do?
When? How?
The system should be thought of as a continuous interaction, a process,
where feedback is extremely important. Feedback from the environment,
on the results of decisions, is needed to evaluate decisions and can, among
other things, support a model of uncertainty for a decision. Uncertainty
can thus be used by the interactor to motivate behaviour, see figure IV.13.1.
Human processing solves problems in a veritable chaos of changing
information. Problems are solved with limited resources in time,
processing power, knowledge, and memory. This among other things
calls for simple heuristics to:
Search for alternative actions, more information, or both.
Know when to stop searching.
Make the decision from the current situation.
One example of a heuristic is to choose an alternative that is recognised,
and ignore the others. Another example is to reuse the strategy that worked
the last time. Along the same line of thinking a hypothesis that is easy to
represent in memory will be favoured, as well as one that we have a more
detailed description of. When we are faced with several different
alternatives another simple heuristic is to select one cue out of many and
use the decision with the highest value for this cue, ignoring information
about all the other cues. Recency is another aspect that strongly influences
how we interpret events; a recent occurrence is recalled more easily and
the decision made will be revised. There are also specific external events
that trigger actions and decisions. One example is that if a warning signal
sounds, the hypothesis is that something needs attending to.
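Two of the heuristics above, choosing the recognised alternative and deciding on a single cue, are simple enough to write down directly. The function names, the brands, the flats and their cue values are hypothetical and only serve as illustration.

```python
def recognition_choice(options, recognised):
    """Choose the alternative that is recognised, if exactly one is."""
    known = [option for option in options if option in recognised]
    return known[0] if len(known) == 1 else None


def single_cue_choice(options, cue):
    """Choose the alternative with the highest value for one cue only.

    All information about the other cues is ignored.
    """
    return max(options, key=lambda name: options[name][cue])


# Which brand to buy? Only one of them rings a bell.
brand = recognition_choice(["brand X", "brand Y"], recognised={"brand X"})

# Which flat to rent? Decide on daylight alone and ignore the floor space.
flats = {
    "flat A": {"daylight": 3, "space": 50},
    "flat B": {"daylight": 5, "space": 35},
}
flat = single_cue_choice(flats, "daylight")

print(brand, flat)
```

Both rules are fast and need very little memory, which is exactly why they are useful when time and knowledge are limited. Neither of them looks at all the evidence, so they can of course also go wrong.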
If we have decided something we tend to favour it, and in general be
confident in it. We will to some extent even ignore evidence that does not
match, i.e. “the first impression is lasting”. This inherent overconfidence
in our own beliefs also affects the effort that we invest in evaluating our
beliefs. We would rather search for evidence that confirms a belief than
evidence that does not. We are a positive-thinking breed and should take this
built-in behaviour consciously into consideration when making choices.
Humans are also somewhat irrational when evaluating probabilities.
We tend to overestimate the probability of an event with a quite low
probability (less than approx. 10%) as seen in the figure below. We also
tend to underestimate the probability for the rest of the interval. One effect
of this is that we behave differently when facing a possible loss or gain.
Suppose there is a choice between a certain win of $1 and a 50-50 chance
of winning $2 (or nothing at all). According to the curve we will
underestimate the 50-50 probability and choose the certain option. If we
reformulate this and say that we face a certain loss of $1, or a 50-50
chance of losing $2, we will tend to underestimate the risk and go for the
50-50 chance [CW].
[Figure IV.13.2 Humans’ subjective probability underestimates high
probabilities. Axes: stated probability (low to high) against subjective
probability (low to high). Curves: perfect behaviour and typical behaviour.]
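The win and loss example can be reproduced numerically if we assume a shape for the curve. The sketch below uses a one-parameter probability weighting function of a form proposed in the decision research literature, with the parameter value 0.61; both the formula and the value are our assumptions and are not taken from the figure. With this value the crossover between overestimation and underestimation lies higher than the 10% mentioned in the text, and the outcomes are valued linearly to keep the example small.

```python
def weight(p, gamma=0.61):
    """Subjective weight given to a stated probability p (assumed shape)."""
    return p**gamma / (p**gamma + (1 - p) ** gamma) ** (1 / gamma)


def subjective_value(outcome, p):
    """Value of receiving `outcome` with stated probability p, else nothing."""
    return weight(p) * outcome


# Gains: a certain win of 1 against a 50-50 chance of winning 2.
certain_gain, risky_gain = 1.0, subjective_value(2.0, 0.5)

# Losses: a certain loss of 1 against a 50-50 chance of losing 2.
certain_loss, risky_loss = -1.0, subjective_value(-2.0, 0.5)

print(round(weight(0.05), 2), round(weight(0.5), 2))
print("gain: take the", "gamble" if risky_gain > certain_gain else "certain option")
print("loss: take the", "gamble" if risky_loss > certain_loss else "certain option")
```

Under these assumptions a low probability such as 5% is given too much weight and 50% too little, so the certain win looks better than the gamble while the gamble looks better than the certain loss, which is the behaviour described above.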
Decision-making can be improved by learning and training. Accounting
and playing chess are two examples. In these domains feedback is
available and there are rules of behaviour. Stockbrokers face quite a
different problem with low predictability, and with ambiguous and
delayed feedback. Learning also helps to find alternatives when accepted
decisions and current behaviours do not work. One example is that a child
learns to walk instead of crawling. Another way to improve decisions is to
work out procedures, routines to follow given certain conditions. Most
decisions that we take repeatedly, such as weekly shopping, or driving to
work every day, are performed (and optimised) as procedures. A third
way to improve decisions is to automate them, i.e. let the computer take
"In retrospect it becomes
clear that hindsight is
definitely overrated!"
Alfred E. Neuman
Consider that the world is
trying to tell us something, if
only we know how to listen.
Sensing can thus be viewed
as a form of communication in which information flows
from the environment to the
attending agent.
[RA]
the decisions. Computers do not bias facts, or underestimate probabilities
(when programmed correctly).
Whenever a selection does not give the wanted result another course of
action must be chosen. One example of a strategy here is trial and error.
Trial and error does not always work well, however, for instance when
meeting a tiger in the jungle. This is one example where emotions can
support the model in figure IV.13.1. An emotion, here fear, can modulate
motivation by weighing in the uncertainty of the situation. The little one
knows that the jungle is a dangerous place, and when facing the tiger
understands that uncertainty is high, flight rather than fight seems to be
the best option.
What we have learnt enables us to choose the right sequence of decisions
for a given situation. We usually cannot foresee all the particularities of a
situation, so we learn patterns and adapt them to the situation at hand.
Figure IV.13.3 shows how knowledge about decisions can be represented
in a data structure to solve a problem, in this case as decision trees for
two different situations.
“The task of rational decision is to
select that one of the strategies
which is followed by the preferred
set of consequences.”
Herbert Simon
Context dependency means that the same sequence might not work in
another situation, and keeping track of all of the situations where a
decision works, or not, is itself a problem.
Figure IV.13.3 Two different situations demand two different
sequences of decisions. (Labels: Situation 1, Situation 2, Solution.)
A similar figure to IV.13.3 can be drawn where two different sequences of
action lead to the same goal. In this case the interactor needs to select one
path, preferably the best one.
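A decision tree of this kind is easy to write down as a data structure. The following is only a minimal sketch: the questions, the actions and the function name decide are hypothetical, loosely inspired by the tiger example, and are not taken from the figure.

```python
# Each inner node asks a yes/no question; each leaf names an action.
# Questions and actions are illustrative assumptions.
TREE = {
    "question": "tiger_nearby",
    "yes": {
        "question": "escape_route",
        "yes": "flee",
        "no": "stay still",
    },
    "no": "keep walking",
}


def decide(tree, situation):
    """Walk the tree using the facts in `situation`.

    Returns the chosen action and the sequence of decisions that led to it.
    """
    node = tree
    path = []
    while isinstance(node, dict):
        answer = "yes" if situation[node["question"]] else "no"
        path.append((node["question"], answer))
        node = node[answer]
    return node, path


action, path = decide(TREE, {"tiger_nearby": True, "escape_route": True})
print(action, path)
```

The same tree gives a different sequence of decisions for a different situation, which is the context dependency discussed above.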
Making a decision is easier said than done. Complete knowledge is rare,
facts are uncertain, and only some of all the relevant strategies are possible
to evaluate. Facts can also depend on each other such that conditional
probabilities are important.
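How conditional probabilities enter a decision can be illustrated with the warning signal mentioned earlier. The sketch below applies Bayes' rule to the hypothesis that something needs attending to. All numbers and the function name posterior are assumptions chosen for illustration only.

```python
def posterior(prior, p_signal_given_fault, p_signal_given_ok):
    """Probability of a fault given that the warning signal sounds (Bayes' rule)."""
    evidence = p_signal_given_fault * prior + p_signal_given_ok * (1 - prior)
    return p_signal_given_fault * prior / evidence


# Assumed numbers: faults are rare (1%), the signal sounds for 95% of the
# faults, but also gives a false alarm in 5% of the fault-free cases.
print(round(posterior(0.01, 0.95, 0.05), 3))
```

With these assumed numbers most alarms are false alarms, so a rational decision depends on more than the signal alone.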
IV.14 We learn and adapt
Humans are surprisingly adaptive. While interacting we ignore all sorts of
irrelevant information, errors, and inconsistent behaviour. We adapt to
odd habits and foreign cultures, we overcome differences in age,
knowledge levels, and language. As teachers we adjust the presentation
level of knowledge such that it matches the knowledge level of the students.
If the first explanation does not trigger a spark of understanding then we
try another angle, perhaps offering specific examples instead of describing
a general principle. A fundamental prerequisite for adaptation is our
ability to learn, see also Chapter III.1.4.
In the definition by Maturana to the right, learning and interaction are
intimately related. We for instance do not learn about hammering from
some abstract mathematical model, with forces and angles, neither do we
Learning is not a process of
accumulation of representations of the environment; it is
a continuous process of transformations of the behaviour
through continuous change in
capacity of the nervous system
to synthesise it.
Maturana, Biology of
cognition 1970
store the idea of hammering as a symbol. We learn it by engraving the
actions of hammering into the nervous system.
Figure IV.14.1 Learning as the
process of engraving behaviour
into the nervous system. (You
also have these interconnected
little circular shaped things
don’t you?)
To use knowledge that we have, we must be able to recall what we have
learned when it is needed. This is, however, a re-production of
behaviour rather than merely a retrieval of some previously stored data
structure.
Mother nature is greedy by nature and dislikes random knowledge.
Whenever knowledge is stored it is because it has been used for a purpose.
Therefore knowledge can also be seen as part of a problem solving
process. Finding your way to work is one example. You can take hundreds
of routes, but there is one that is short and easy. This is the one that you have
stored in your memory, and can follow without difficulty, even without
conscious effort, every day. Try to think about some knowledge that you
have that is not problem related!
A little knowledge that acts is
worth infinitely more than much
knowledge which is idle.
Kahlil Gibran
Why is learning important? Most of all because adaptation by learning is
much faster than adaptation by evolution, which takes generations to refine
randomly generated genetic variations. Learning combined with planning
and reasoning is a powerful tool for survival. The development of
humanity can be seen as a journey along a path of knowledge [AC]. What
we do when we learn in school and from life is follow our path of
knowledge.
Figure IV.14.1 Knowledge
is gained through interactions.
Science adds to the path by researching new pieces of knowledge from an
enormous search space. We add layer after layer of knowledge to our
collective fortune. This process is dynamic and there is no guarantee that
knowledge will survive; it is something that humanity will have to
constantly fight for. Some ideas added can only be understood once others
are already assimilated. It is for instance difficult to understand
multiplication without having mastered addition.
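The idea that some knowledge must be assimilated before other knowledge can be understood is a dependency structure, and a possible learning order can be computed from it. The sketch below uses the topological sorter from the Python standard library. The prerequisite table is an assumption, and exponentiation is added only for illustration.

```python
from graphlib import TopologicalSorter

# Maps each idea to the ideas that must be mastered first (assumed table).
PREREQUISITES = {
    "addition": set(),
    "multiplication": {"addition"},
    "exponentiation": {"multiplication"},
}


def learning_order(prerequisites):
    """Return one order in which every idea comes after its prerequisites."""
    return list(TopologicalSorter(prerequisites).static_order())


print(learning_order(PREREQUISITES))
```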
Nature is seen by men through a
screen composed of beliefs,
knowledge, and purposes, and it is
in terms of their cultural images of
nature, rather than in terms of the
actual structure of nature, that
men act.
Rappaport
However, learning does not come for free. It takes time and other
resources, and during the learning period the interactor, or even the
society, is vulnerable. Are we learning the right things, and adapting the
right way, to the changed circumstances?
Delayed reproduction
time (approx. 20 yrs)!
IV.14.1 Taxonomy for learning
A group of psychologists led by Benjamin Bloom developed a taxonomy
for learning behaviour. They identified three overlapping domains for
learning: affective, psychomotor, and cognitive learning. Affective
learning relates to how we learn emotions, attitudes and values. The fact
that we learn them is shown by how we behave when we grow up. We
learn how to show attention, concern, interest, and how to act responsibly.
Psychomotor learning is about learning more basic motor behaviour
such as coordination, fine motor skills, dance, athletics, and how to make
facial expressions.
Cognitive learning in Bloom's taxonomy is what we typically do in school,
and it takes place at the following six knowledge levels. With each level
the understanding of what is learnt is increased.
1. Knowledge: define, list, recognise, repeat, facts.
2. Comprehend: classify, describe, translate, facts and their
relationships.
3. Application: apply, interpret, operate, sketch.
4. Analyse: calculate, compare, criticise, examine, experiment, test.
5. Synthesise: construct, create, design, develop, organise.
6. Evaluate: defend, estimate, judge, predict, rate, argue.
This list is also interesting since with each higher level it becomes more
difficult to describe how humans manage to perform the tasks.
Learning about learning is by necessity a multidisciplinary effort.
Pedagogy, psychology, computer science, linguistics, philosophy, and
neuroscience are all interested in this phenomenon. Pedagogy studies
learning from a phenomenological standpoint, and also the
complementary activity, teaching. Typical questions are "Is this a good
way to teach this subject?", "Will this situation enhance learning?", and "If I
formulate the knowledge in this way, will understanding be enhanced?".
IV.14.2 How do we build knowledge?
Knowledge is the result of interaction! There would not even be any
knowledge without interaction, i.e. without confrontations with the world,
and especially with other interactors. Try to find some knowledge that
you have that was not obtained through interaction! It is in other words
extremely important for people to interact with other people through
discussions, books, or by other means to achieve knowledge and know-how.
“Knowledge is above all the
fruit of interactions between
cognitive agents who or
which, acquire it through a
process of confrontation,
bijections, proofs and
refutations.”
Lakatos
Learning and training can be achieved by different behavioural
mechanisms:
Habituation is learning where we become accustomed to the
input. In the case of the sensory system, the input signal is
ignored after a while!
Conditioning, where good behaviour is encouraged and bad
punished, e.g. saying "ma-ma".
Copying, where successful examples are followed.
Trial-and-error, just do it. If you do not know how to accomplish
a task you just perform any more or less random action, and
hope that it will reduce the distance to the goal state.
Playing, this is a variation of the trial-and-error method that is
sometimes driven by curiosity and encourages creative thinking.
Playing is a good way of elaborating knowledge, and results in
well-formed behaviour. It is also an excellent method to collect
information for later planning.
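Trial-and-error from the list above can be sketched as a small program. This is a toy illustration under stated assumptions: the state is a single number, an action moves one step up or down at random, and an action is kept only if it reduces the distance to the goal state. The function name and parameters are hypothetical.

```python
import random


def trial_and_error(start, goal, max_trials=1000, seed=0):
    """Try random actions, keeping only those that bring us closer to the goal."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    state = start
    trials = 0
    while state != goal and trials < max_trials:
        candidate = state + rng.choice([-1, 1])  # a more or less random action
        if abs(goal - candidate) < abs(goal - state):
            state = candidate  # the action helped, keep the result
        trials += 1
    return state, trials


state, trials = trial_and_error(start=0, goal=5)
print(state, trials)
```

The number of trials is typically larger than the number of steps actually needed, which is the price paid for not knowing how to accomplish the task.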
Knowledge elaboration of various kinds, such as playing, helps us to
remember, understand, and to acquire new skills. Because we tend to
forget things, we need to rehearse as well as to learn. This is a hopeless
situation in the long run for a single human who wants to learn as much as
possible, even if we constantly add new supportive technology, for
instance electronic calendars. Limitation in human attention also affects
learning; to learn we have to pay attention. It is for instance difficult to
learn how to play the trumpet and to study history at the same time.
Ever tried? Ever failed?
No matter!
Try again, fail again,
fail better!
Samuel Beckett
"Ah, Percy. The eyes are open,
the mouth moves, but Mr. Brain
has long since departed."
Edmund Blackadder, `BA II'
From this it follows that we need to be free from the everyday survival
tasks in order to have time to pay attention. Another conclusion is that
specialisation is advantageous since it gives adepts more time to learn
(some important knowledge like this). This seems to suggest a world of
specialists, but there is a major problem with this prophecy. A specialised
language is necessary among the specialists from narrow disciplines, but
creativity prospers through the cross-fertilisation that occurs when
disciplines meet. Cross-fertilisation implies overlapping knowledge and
common language, which means that experts cannot become too
specialised. Without creativity, adaptability will suffer, and adaptability is
something that will certainly be needed even in a society of experts.
Learning is creative work. A new knowledge pattern must be imprinted in
the brain and intertwined with previous knowledge. This process will be
unique for each learner, at least for each learner of the type human. Another
observation is that knowledge is personal, and something that everyone
has to build herself. This is because knowledge is learnt in a personal
context that can make it difficult to understand knowledge if we do not
have access to the context in which it was generated. Is it possible to
understand how to make a snowball if you have never seen snow? What
does the expression green fingers mean to an Eskimo? Think about
explaining to a bushman, coming directly out of the desert, how to change
language for spell checking in Word®. If you had the ultimate wisdom it
would probably be so dependent on your own knowledge that it would be
completely useless to anyone else (what a shame).
Three types of knowledge:
What you know you know.
What you know you don’t know.
What you don’t know you don’t know.
Larry Marine
To be an expert is to not know that
you know what you know and to
know what you do not know!
HG (classic)
As you gain more and more knowledge on a topic, the knowledge will be
increasingly integrated into your thinking, and you will have more and
more of a problem describing your knowledge, or even knowing that you
have it. Examples of this are numerous: various crafts, carpentry, knitting,
and green fingers. Try to describe how you walk. Some years ago walking
was, for a while, quite a problem for you. Learning by doing or, as in
science, learning by induction by studying a number of examples,
generates knowledge that is embedded within the system and cannot be
reconstructed by the system itself. This knowledge is called tacit
knowledge. Animals have the same ability, but probably do not reflect on
it. Things on the other hand can only do things that can be formalised, i.e.
described. With enough knowledge you are ranked as an expert and spend your
time in a narrow knowledge area. The more of an expert you are, the
narrower the area. The more expert you become, the more you know
about related knowledge that you do not have.
More on how to acquire ignorance is accumulated later in this book; in fact
the whole book is about this accumulation.
IV.14.3 Knowledge management
Interaction is by itself not enough to generate knowledge and to use it
efficiently. It is necessary, but not sufficient, as mathematicians say. There
must be some mechanism to reduce the amount of information and order
it for efficient access. If we absorbed everything, we would be drowned in
information impossible to access again, because it would be so
voluminous, and without structure.
Luckily, there are at least two remedies for this problem. The first is to
throw away everything that is not interesting enough. The second
remedy is to match new information against what is already stored and
only consider differences, enhancements, and abstractions. It is
illuminating to compare our own abilities to how images are stored and
accessed on the Internet. How do you find an image of a brown dog on the
Internet? How do you recall the memory of a brown dog? What is the
difference in the level of detail between your memory and the image from
the Internet? We are still far better at recognition than the machines, but
most of us have problems keeping track of the details.
Figure IV.14.2 When we want
to characterize information,
differences can be very efficient.
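The second remedy, storing only the differences against what is already known, can be sketched as follows. The attribute names and the notion of a typical dog are assumptions made for the example, and the sketch assumes that the new observation uses the same attributes as the stored knowledge.

```python
def differences(stored, observed):
    """Keep only the attributes where the observation departs from what is stored."""
    return {key: value for key, value in observed.items() if stored.get(key) != value}


def reconstruct(stored, delta):
    """Recall the full observation from stored knowledge plus the differences."""
    return {**stored, **delta}


# What we already know about dogs in general (assumed attributes).
TYPICAL_DOG = {"legs": 4, "tail": True, "colour": "black"}

seen = {"legs": 4, "tail": True, "colour": "brown"}
delta = differences(TYPICAL_DOG, seen)
print(delta)
print(reconstruct(TYPICAL_DOG, delta) == seen)
```

Only the colour has to be remembered for the brown dog; everything else is recalled from what was already stored.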
Abstraction gives us something more than efficient storage. If knowledge
is grouped into categories we can assign properties to them and use the
categories for inferences, i.e. predicting properties we have not observed.
If we know that a PDA stylus is a kind of pointing device we will not be
surprised if a cursor appears on the screen when we touch it with the
stylus.
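Inference from categories can be sketched with two small tables, one saying what kind of thing something is and one listing the properties assigned to each category. The tables and the function name are hypothetical; only the stylus and the pointing device come from the text.

```python
# "is a kind of" relations (assumed, apart from the stylus example).
IS_A = {
    "stylus": "pointing device",
    "mouse": "pointing device",
    "pointing device": "input device",
}

# Properties assigned to categories rather than to individual things.
PROPERTIES = {
    "pointing device": {"moves a cursor"},
    "input device": {"sends data to the computer"},
}


def inferred_properties(thing):
    """Collect the properties of every category the thing belongs to."""
    properties = set()
    while thing is not None:
        properties |= PROPERTIES.get(thing, set())
        thing = IS_A.get(thing)  # climb one level in the hierarchy
    return properties


print(inferred_properties("stylus"))
```

Nothing was stored about the stylus itself, yet the sketch predicts that it moves a cursor, a property we have not observed directly.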
Categorizing and making abstractions would not be of much use if
properties were sprinkled randomly in reality. How come they are not?
Why are properties lumped together and assigned to local groups of
objects? The laws of nature, evolutionary pressures and adaptations to the
environment are some probable answers. One of the best arguments for
Charles Darwin's theory of survival of the fittest is that it explains why
living things are hierarchically grouped into family trees [SP].
In 1619, Italian philosopher
Lucilio Vanini was burned
alive for suggesting that
humans evolved from apes.
IV.14.4 Machine learning
Things are by no means intelligent today. In fact, most things are quite
stupid and have severe problems learning anything at all. A pencil, a cup
of coffee, or even a TV-set are not that smart. One way to improve things
is to learn by studying humans and try to reuse the findings, but scientists
also use results from biology and ethology, i.e. the study of animal
behaviour in natural conditions. Animals have a lot to teach us, from
simple motor behaviours, to complex social ones.
Think about how you would
represent a screwdriver and a tin
of paint in the computer. Ready?
Now, test if your representation is
good enough to find out that the
screwdriver is perfect for opening
the tin.
Machine learning (ML) is a branch of Artificial intelligence and Cognitive
science. It explores a broad range of topics, such as modelling mechanisms
that underlie human learning and using machine learning on real world
problems. If we could teach things how to learn then we could build them
with much less effort, allowing them to explore the complexity of reality
by themselves. This section gives an overview of some of the terminology
used in ML, but without mathematical rigor. Please note that there is
much more to learn than could be massaged into these few pages
[TM].
Nobody really knows how humans learn. Can we then teach computers
how to learn? In a way we can say that we teach things by
programming them; what is the difference in principle between sending
children to school and adding another program to the thing? One theory is
that learning demands interaction with the context to be learnt from. If
this is true then things cannot learn faster than humans since the speed of
the interaction is constrained by the speed of the changes in the
environment.
Machine learning can be defined as the improvement of performance, in
some environment, through the acquisition of knowledge resulting from
experience in that environment, see figure below [PL2]. Learning is in
other words something that very much depends on context, such as prior
knowledge.
Figure IV.14.4 Machine learning in context, i.e. using the environment and
knowledge. (Labels: Performance, Knowledge, Environment, Learning.)
Learning is useful for several types of tasks such as classification,
regression (learning how to fit data to a real-valued function), and
problem solving. Classification is applicable when we want to categorise
some input (instances) as one or more concepts in a concept space
(feature space). In the following figure instance x with some known
attributes is classified as concept y. Often we do not know the exact
mapping from instance to concept, so an inference engine will have to use
a hypothesis. The inference engine shown in the figure could learn the
hypothesis from a set of training instances. Learning from examples this
way is called inductive learning.
Figure IV.14.5 The principle of machine learning. (Labels: Concept space,
Instance, Inference engine, Attribute, Instance x, Concept y.)
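Inductive learning can be sketched with one of the simplest possible inference engines. The technique chosen here, nearest neighbour, is an assumption and is not named in the text: the learned hypothesis is simply "an instance belongs to the same concept as the most similar training instance". The training instances (height in metres, weight in kilograms) and the two concepts are made up for the example.

```python
import math

# Training instances: (attributes, concept). All values are assumed.
TRAINING_SET = [
    ((1.87, 90.0), "adult"),
    ((1.75, 70.0), "adult"),
    ((1.10, 20.0), "child"),
    ((0.95, 15.0), "child"),
]


def learn(training_set):
    """Return a hypothesis (a function) learned from the training instances."""
    stored = list(training_set)

    def hypothesis(instance):
        # Classify as the concept of the closest stored instance.
        _, concept = min(stored, key=lambda pair: math.dist(pair[0], instance))
        return concept

    return hypothesis


classify = learn(TRAINING_SET)
print(classify((1.80, 85.0)))
print(classify((1.00, 18.0)))
```

The exact mapping from instance to concept is never written down; the hypothesis is induced from the examples, and it can of course be wrong for instances unlike anything in the training set.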
Learning can be difficult for many reasons. Obviously the complexity of
the concept space and the number of instances and attributes add to the
difficulty. Other typical problems are the number of irrelevant features,
the amount of noise in the feedback for supervised learning, and noise in
instance data.
IV.14.4.1 Representation of instances
The choice of the representations is very important. We need appropriate
representations of input (instances, the training set), output (concept,
feature space), and of the knowledge learned.
Some common representations for instances are Boolean values,
categorical values, i.e. mutually exclusive values for attributes, and
numerical values, see the figure below. The figure also shows a vector
space representation of the numerical data (bottom right). It is always
possible to describe the instances using a vector space representation.
Figure IV.14.6 Four different representations for features and
instances. (Panels: attributes with Boolean values, e.g. Big = True,
Blue = False, Mean = False, Howl = False, Cluck = False; attributes with
categoric values, e.g. Character = Kind or Character = Mean; attributes
with numeric values, e.g. Height = 1.87 m, Weight = 90 kg; and a vector
space representation (feature space) with axes Height and Weight. Further
labels: Cannot fly, Can fly, H, T, I.)
For some domains a propositional representation is more appropriate, i.e.
representing an instance as a data tuple (Length Hakan 1.87). We have
previously discussed how such knowledge, in the form of propositional
representations could create complex graphs.
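The representations discussed in this section can be put side by side in a short sketch. The encoding below (Booleans as 0 or 1, one position per category, numbers as they are) is one common choice and an assumption, as are the attribute names, the category list and the function names. The example values are borrowed from the text.

```python
# Assumed list of possible values for the categorical attribute.
CHARACTERS = ["kind", "mean"]

hakan = {
    "big": True,          # Boolean value
    "character": "kind",  # categorical value
    "height_m": 1.87,     # numerical values
    "weight_kg": 90.0,
}


def to_vector(instance):
    """Describe an instance as a point in a vector space."""
    vector = [1.0 if instance["big"] else 0.0]
    vector += [1.0 if instance["character"] == c else 0.0 for c in CHARACTERS]
    vector += [instance["height_m"], instance["weight_kg"]]
    return vector


def to_tuples(name, instance):
    """Propositional representation: one (attribute, instance, value) tuple each."""
    return [(attribute, name, value) for attribute, value in instance.items()]


print(to_vector(hakan))
print(to_tuples("Hakan", hakan))
```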
IV.15 Humans are creative
We think we know about what we know, and about what we can do, but
we often surprise ourselves, e.g. when falling in love at the first
meeting. We also think that we have good mental models of other
humans, but they never cease to surprise us. Our acquired knowledge and
mental models are always limited. It is for instance very difficult to
understand parenthood if you are not a parent yourself.
Some aspects of human knowledge representation and processing have
already been described. Now we will discuss maybe the most magical of
all human features, creativity. It is usually considered in conjunction with
inventions and arts, rather than with adaptive behaviour. But, the fact is
that we all need creative thinking every day, in the small, to survive. We
are constantly put into situations where we need new solutions. One
example is when you try out a new recipe for a small dinner with candles
and spouse.
My favourite creativity exercise: Trying
to find as many ideas as possible
on a randomly chosen topic.
A definition of creativity is found to the right [MC2]. The domain
mentioned in the definition alludes to a set of nested symbolic rules and
procedures embedded in culture; one example is physics, another
athletics. Note that if creation is goal oriented and purposeful it is
equivalent to design.
Definition: Creativity is any act,
idea or product that changes an
existing domain, or that transforms
an existing domain into a new one.
The creative process as described in the following steps is accepted by
most researchers [MC2]:
Preparation
Incubation
Insight
Evaluation
Elaboration
The preparatory phase is spent learning about an interesting problem,
studying and assimilating facts, and maybe gathering sensory inputs. This
phase can take years, or ten minutes, depending on the domain. The
second phase, the incubation, means doing something else. Subconscious
processes manipulate the information collected and suddenly, like a bolt of
lightning, the effort delivers some insight, an "aha experience". The aha
experience might not be the right solution; it has to be evaluated and
possibly thrown away, a most unpleasant task. Throwing away is even
more important in the coming idea-rich information society. If the new
insight is admitted some hard work is still needed to adapt and refine it.
The creative process is said to be 1% inspiration and 99% perspiration.
It’s not possible! It is not done that
way here! It is too much work! It is
not our idea! My idea is already
perfect!
Problem:
Creative solution?
Chi dorme non piglia pesce.
(He who sleeps does not catch any
fish)
How to put to other uses:
Adapt? Modify? Magnify?
Minify? Substitute? Rearrange?
Reverse? Combine?
Osborn, Applied imagination
The result of the creative process must be possible to express. We cannot
say that we have created something if we cannot formulate, or execute it.
Actors perform it, and designers visualize it. Engineers build it, but also
think about how to build tools to support creativity and how to measure
it.
“Now it is impossible to
remain human and throw
away technology!”
Dr Michael Heim,
Art Center College
Comments?
"Arx" by Lars Vilks.
Its creator describes Arx as a
"three hundred page book
whose pages cannot be
turned. The reader must move
himself."
IV.16 Humans feel presence, and have social abilities
Presence is the experience of being there, in a situation or an environment,
involved in a cause-effect chain of actions. The internal representation of
the situation and the actions involved is called a frame, or a schema, and
in this frame you place, see, and can study yourself. It works beautifully
for instance on placebo pills. When a frame breaks, presence is shattered,
which is exploited by visual illusions and many kinds of humour. Frames
are dynamic, socially shared, and can be culture specific.
If you are present you could be more or less aware of the directions of
feelings and cognitive attention [RR3]. Presence increases as this
awareness decreases. Another definition of presence is as "the perceptual
illusion of nonmediation". Here instead of feelings and cognitive attention
it is the awareness of the mediation that decreases as presence increases.
Presence can also relate to social presence, i.e. to our awareness of a social
environment. Three types of presence can be identified, environmental,
social, and personal presence. The difference between environmental and
personal presence is that for environmental presence the environment
takes you into account and reacts to you.
The sense of presence is affected by several factors. A cultural framework,
the possibility of negotiation, and the possibility of action are important,
and also the affordances of the context. Presence is improved by ease of
interaction, user-initiated control, realism and length of exposure [RR5].
You take the blue pill and you wake
up in your bed and believe whatever
you want to believe. You take the red
pill, you stay in wonderland and see
how deep the rabbit hole goes.
Matrix
The above discussion on presence focused on what it is, and how to keep
it up. Alternatively we can look at how it works and has evolved. The
reference [GR1] suggests three levels of presence, proto presence, core
presence, and extended presence. Proto presence is about the unconscious
embodied presence related to the level of "perception-action coupling".
The next higher level is core presence, where changes in core affect and
perceptions are consciously followed and attention is directed according
to evolutionary dispositions and learned knowledge, i.e. "something
arouses me, here and now, I see and hear it, and I consciously react to it".
Note that this kind of presence still does not assume memory. Extended
presence is slower, and here perceptions and emotions are integrated into
a single experience, i.e. "this is what is happening in the current situation,
I understand how it could affect me and my goals, I change my plans
accordingly". Extended presence builds the frame we discussed
previously and can be seen as a narrative structure involving us. The three
levels of presence correspond to three suggested levels of self built by
evolution, and also to the reeeeeeaaaaallllly thorny issue of consciousness.
Since our mental structure has been developing over a long time there are
bound to be interactions between the different levels of self and presence.
Emotion is one example of a feature cross-coupling them. Think about the
following events: holding a conversation while drinking a cup of coffee,
and holding a conversation while trying to decide from taste and other
clues whether the coffee is Colombian.
When all three of the layers focus on the same event we have maximal
presence and the prerequisite for flow, which will be discussed in the
next chapter.
One important task for cognition is to support the social abilities that
groups of interactors need. For humans they have evolved over a long
period of time, and some evolutionary biologists even argue that
development at group level is why humans are successful. Groups of
interactors are formed for different reasons. One important reason is that
members of the group benefit. A group can accomplish feats impossible
for the individual alone, and an interaction among groups, rather than
among individuals, might be even more efficient.
Before we continue discussing social abilities we should first
introduce what we mean by the social environment. We will
mostly discuss the primary group, which is a small group of people that
stays together for a long time, such as a family. This kind of group was
important already on the savannah millions of years ago and within it we
want as many and as tight connections as possible to increase
belongingness. Secondary groups are larger and less personal temporary
relations. For both kinds of groups the identity can be grounded in
interacting patterns of [BS3]:
My body is wherever there is
something to be done
Merleau-Ponty
Consciousness: an organism's
awareness of its own self and
surroundings
Antonio Damasio
“One blue LED flashes when the robot
is both recognizing behavior in another
robot and imitating it. In another
experiment, the researchers placed the
self robot in front of a mirror. Although
the blue lights fired, they did so less
frequently than in other experiments.”
Junichi Takeno, dec 2005
I suspect consciousness prevailed in
evolution because knowing the
feelings caused by emotions was so
indispensable to life.
Antonio Damasio
Definition:
A society is a system where citizens
reach their goals through interaction
and where there is more interaction
within the society than between
societies. The citizens live in a
common space and time, are aware of
having a distinct identity from other
societies and believe that they can
obtain their objectives better within
the society. One objective would be to
keep the society together.
Shared norms, laws, values, beliefs and attitudes,
Artefacts used and created,
Blood ties, shared experiences, and physical closeness, i.e.
characteristics of the individuals in the culture.
Artefacts are instantiated technology, and are constantly changing in
shape, which provides for powerful societal changes. Within the
framework of a group it is possible to achieve intimacy, establish trust,
confidence, and other social effects.
A first prerequisite for a social environment is social presence, and three
dimensions have been distilled: co-presence, psychological involvement,
and behavioural engagement [FB]. Co-presence is the degree to which a person
feels that he or she is not alone, i.e. that she knows there is someone else at the
same location (co-location), or senses others while showing some aspect of
herself or her activities (mutual awareness). Psychological involvement is
to what extent the person attends, senses, or responds emotionally to
another person. Behavioural engagement is about the interactions that
constitute social relations, e.g. when someone is dependent on an action
by someone else. Connectedness complements psychological involvement
and describes a situation where we know there is a person thinking about
us, even though we cannot sense it [RR2]. The example given in the
reference is that we send someone a message just to tell them that we are
connected to the Internet.
A number of key dimensions of social intelligence are [VK]:
Situational radar: understanding the social context and how to
adapt behaviour to it. Maybe acknowledging that someone really
needs privacy?
Presence, confidence, self-respect and self-worth, as seen by
others,
Authenticity, telling the truth to oneself and to others,
Clarity, to use language effectively and efficiently,
Empathy, ease of creating and maintaining a sense of
connectedness.
Social systems favour the socially competent, and abilities such as
guessing thoughts and intentions of another individual are extremely
valuable. Imitation of behaviours is another important talent, and even
very small children follow the gazes of their parents. Social competence is
also about finding patterns, e.g. rituals, and using them. One example is
that if you see someone who twice gets really, really angry over dishes not
done, you suspect a pattern and perhaps make an extra effort the next day.
Imitating the angry father is a popular, and advanced, social activity.
Research on a social behaviour questionnaire selected 20 out of 172 social
parameters for social competence. The items were altruism, amicability,
assertiveness, compassion, competence, compliance, dutifulness,
eagerness of effort, empathy, good impression, gregariousness,
helpfulness, likeability, modesty, responsibility, sociability,
socialization, straightforwardness, trust, warmth [BR].
A social system is a predictable
pattern of interaction among
persons made possible by shared
structures of attention.
[MC3]
Social connectedness means to highlight personal unique or exceptional
attributes, and to recognize, cultivate, and acknowledge such attributes
in others. It gives possibilities to acquire specialized social and other
skills, and with them the next move is to find groups missing such
competencies, while still adhering to similar basic values [DB]. The same
type of behaviour is favourable also when selecting long-term
companions. Other examples of competences useful in a social
environment are the abilities to detect presence and departure of others,
identify relations, recognise individuals for instance using faces or gaits,
and to refer to other individuals, using language when chatting and
gossiping. Means for communication and interaction are necessary,
otherwise a community will certainly not work, and this communication
will follow patterns. We for instance agree on how to represent
correlations between causes and effects, and there are protocols and rules
for turn taking in discussions.
Norman Rockwell
Promotion of cooperation is very much an evolved behaviour [DB]. We
need social commitments in cooperation for explicit coordination. If
interactors publicly state intentions, then other interactors can use the
statements for coordination. We should allow for future possibilities to
affect current decisions about relationships by increasing the number of
interactions and commitments to interesting individuals. If the norm is
reciprocity, helping friends and relatives, then deviant patterns of
behaviour will reveal exploiters who do not give, just take. This
behaviour can be supported by insisting on no more than equity, thereby
avoiding greed. A reputation as a greedy exploiter will not help make
friends. We will come back to co-operation and co-ordination in the
next part of the book.
Whatever the reasons, groups are formed, roles are assigned, and when
they are, the complexity of the system rapidly increases, faster the more
heterogeneous and complex the individuals are. One way (the only way?)
to limit this complexity is to impose social rules on the system, using close
feedback via social interaction. The rules impose order and also simplify
social navigation and manipulation. Humans are unique in that we can
create, and even purposively design, new conventions for coordinating
our social contexts. We use rules to, among other things, build trust, make
friends, and identify cheaters. A marriage is for instance a long term social
bond telling who belongs and who does not. It puts an end to messing
around and signals focusing on the next generation. Hopefully it also is a
result of emotional attraction.
We will in this book mostly discuss interaction in smaller systems
consisting of only a few interactors. So little time, so much to do. There are
of course also large hierarchical systems where the interaction is between
members of large organisations, or between whole societies. If you
carefully read the definition of an organisation to the right, and think in
terms of interaction, you will see that interaction is the crucial element in
every organisation, simultaneously the source and the product of its
existence [JF].
Three types of social action:
Rational
Traditional
Emotional
Max Weber
Definition: An organisation is an
arrangement of relationships
between components or individuals
that produces a unit, or a
system endowed with qualities
not apprehended at the level of
the components or individuals.
E Morin
Human social behaviours are very similar to those of other social species,
such as other primates and wolves. However, interactors of type
information do not currently organise themselves. Why? First, the
interactors cannot yet make sense of the environment they sense. Second,
they are not autonomous and flexible enough to explore the possibilities
in the virtual world. Looking for spontaneous grouping is currently
unreasonable.
A major limitation is that societies created by technology so far have been
very limited in how goals are obtained and managed. The complexity of
the objectives and their interactions for any H is on the other hand
extremely high; emotion, mood, love, and grief all fill social functions.
Empathy, for instance, is an interaction that tends to keep a group
together. You know, or at least suspect, that I feel what you feel, and share
that feeling. Can a wolf feel empathy? An ant? The solution so far has been
to let a user, or a designer/programmer, make the decisions.
Sharing limited resources and the use of communication implies mobility,
and a social context that changes over time. Social context, i.e. norms,
roles, and social pressure triggers mobility. We for instance leave a
meeting when we receive an important phone call. If we however imagine
a society of designed networked interactors then communication does not
demand mobility and virtual resources can be shared over the network.
Try to put yourself in the place
of a car, how would you
perceive reality?
IV.17 Humans experience it
In the following section we will focus on experience and emotion. Not
only are they important for well-being, for how we feel, what excites us,
and for what we potentially will buy, but they are also difficult to clearly
describe. We will draw a rough sketch of a number of concepts that have
been discussed since Plato, and which are still not resolved. Emotion and a
sense of experience are based on interactions internal to us, but they are
tightly bound to evolution, and to the physical and social environment.
They are also processes rather than states. This section could have been
situated in the next part of the book, on interaction, but we choose to
introduce the content as a characteristic of an interactor.
We prefer to see experience as a dynamic action cycle where a human
perceives internal and external events, and has intentions (goals) and
concerns. The percepts are appraised, and emotions result. Action
tendencies (arousal) are established, and actions executed that will
change internal and external variables, possibly triggering new events.
Concerns are, or have similar effects to, needs and urges. Together with
impulses, drives, and attitudes they sum up to a set of motivational
factors that can complement emotions and conscious reasoning to help us
decide what to do. Along with emotions we also have moods and traits.
Let us have an emotional
experience together.
Impulses are behaviours that
trigger spontaneously grounded in
desires. We buy that nice looking
unreasonably expensive mobile
phone.
An event in this context is an external or internal representation that
behaves such that it attracts attention, i.e. it is the object of the emotion
and it changes. To perceive an event, presence is important and attention
must be directed.
A mood is a person's sustained and
predominant internal emotional
experience; examples include aggression,
fatigue, depression and euphoria. It is
generally of a long duration,
unintentional, undirected.
Figure IV.17.1 Experience as a network of interacting components
(Event, Attention, Appraisal, Concern, Cognition, Emotion, Action
tendency, Action, Behaviour).
The concept network in figure IV.17.1 may seem complicated, and as
with many other areas where the human is studied, the devil is very much
in the details. Scratching the surface reveals numerous different
definitions and views, overlapping in scope, different in abstraction level,
describing highly entangled behaviours. However, for the purpose of this
book the above model is sufficient. There are of course also other theories
to choose from.
For many of the concepts considered there is a discussion whether they
are innate or learned, and if learned, how this is accomplished. Is for
instance the social environment the main factor for learning? We will not
discuss this here. To clarify the concepts we will now briefly define some
of them, still at a rather high level of abstraction.
Man consists of four principles, which
are called the Physical, the Nervo, the
Soul or Psychic and the Mind or Mental.
Max Theon's teaching
The septenary division may be given as
follows: 1. Physical Body, or Sthula-Sarira,
2. Astral Body, or Linga-Sarira,
3. Vitality, or Prana.
4. Animal Soul, or Kama-rupa,
5. Human Soul, or Manas.
6. Spiritual Soul, or Buddhi,
7. Spirit, or Atma.
We start with emotion.
IV.17.1 Emotion
Emotions serve many functions as our control centre. They produce
shifts in concentration and attention, motivate, and help us to sustain
explorations, manipulations and investigations. Furthermore they free
up cognitive resources when needed, and support social life. Emotions
help us to make decisions from insufficient facts (almost always the case
for us). To be surprised you have to know what to expect, which
means that you have prompt access to a lot of information about the
world. Emotions involve neural, physiological, cognitive and social
aspects of behaviour, and they serve as modifiers, or amplifiers, of
the motivations that drive human behaviour to fulfil needs. We for instance
need food, and for this hunger is a motivation. If you are hungry and see
someone eating, someone who does not want to share, then anger could
amplify hunger and trigger an attack. Without anger you could discuss
pros and cons of attacking until all of the food is already eaten. Emotions
also serve as signalling mechanisms. They tell other
individuals about your emotional state and help them to guess what you
will do next. They could also guess your situation, and perhaps infer from
“Emotion is too broad a class of events
to be a single scientific category. As
psychologists use the term it includes
the euphoria of winning the Olympic
gold medal, a brief startle at an
unexpected noise, unrelenting profound
grief, the fleeting pleasant sensation of
a warm breeze, cardiovascular changes
in response to viewing a film, the
stalking and murder of an innocent
victim, lifelong love of an offspring,
feeling chipper for no reason, and
interest in a news bulletin”
[JR]
your fear that running is a good thing to do. Furthermore emotions
represent social and moral values. Fear of going to prison is supposed to
keep you from committing crime [KS].
Emotions are experienced and attached to events, people, products, and
services. Generally they are short-lived, intentional, directed. They interact
with cognition as well as affecting physiology, e.g. anger raises blood
pressure. A common denominator for mood, emotion, and feelings is
affect, and the simplest description of emotion is as a core affect. In
many models there are two dimensions of core affect,
Activation/Deactivation and Pleasant/Unpleasant, and they can be seen as
spanning a circle in two dimensions, see figure IV.17.2 below [JR3].
Alternatively we can use valence and activation to describe core affect,
where valence is the degree of attraction or aversion that an individual
feels toward a specific object or event, and activation is the extent to
which an individual is awake/tense or calm/tired. Pleasures are agreeable reactions
to experiences in general. Pleasure is similar to enjoyment, a word that
has been used for the more limited scope of positive responses to media.
On the periphery of the circle spanned by activation and pleasantness we
can find different combinations of core affects, again see figure IV.17.2. As
a human we are at every instant somewhere in this space, and,
importantly, forever on the move.
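The two-dimensional description of core affect can be illustrated with a
small sketch. The Python below is only an illustration under stated
assumptions, not a model taken from [JR3]: both dimensions are assumed to be
scaled to the range -1 to 1, and the sixteen labels of figure IV.17.2 are
assumed to sit at evenly spaced angles around the circle. The order of the
labels within each quadrant, the name "neutral" for the centre, and the
function names core_affect_label and intensity are hypothetical.

```python
import math

# Labels from figure IV.17.2, listed counter-clockwise starting just above
# the Pleasant axis. The even 22.5 degree spacing and the order within each
# quadrant are assumptions for illustration, not data from the source.
LABELS = [
    "happy", "elated", "excited", "alert",         # pleasant, activated
    "tense", "nervous", "stressed", "upset",       # unpleasant, activated
    "sad", "depressed", "lethargic", "fatigued",   # unpleasant, deactivated
    "calm", "relaxed", "serene", "contented",      # pleasant, deactivated
]
SECTOR = 360.0 / len(LABELS)


def core_affect_label(pleasantness, activation):
    """Return the label of the sector containing (pleasantness, activation)."""
    if pleasantness == 0 and activation == 0:
        return "neutral"  # hypothetical name for the centre of the circle
    # Angle measured counter-clockwise from the Pleasant axis, in [0, 360).
    angle = math.degrees(math.atan2(activation, pleasantness)) % 360.0
    return LABELS[int(angle // SECTOR) % len(LABELS)]


def intensity(pleasantness, activation):
    """Distance from the centre of the circle to the current position."""
    return math.hypot(pleasantness, activation)
```

A person who is strongly activated and only slightly pleased, for instance
core_affect_label(0.1, 1.0), falls in the sector labelled alert under these
assumed positions. Being forever on the move then corresponds to the pair
(pleasantness, activation) changing over time.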
A grown up smiles only 17 times a day.
Happy
Sad
When humanity is gone I hope
it will be remembered by a joke.
/HG (the essence of a joke is
surprise and curiosity)
Figure IV.17.2 Core affects and their combinations. Axes:
Activation/Deactivation and Pleasant/Unpleasant. Labels around the
circle: tense, alert, excited, nervous, stressed, elated, happy, upset, sad,
contented, serene, depressed, lethargic, fatigued, relaxed, calm.
One basic set of emotions often used is Fear (terror, shock, phobia), Anger
(rage), Sorrow (sadness, grief, depression), Joy (happiness, glee, gladness),
and Disgust, sometimes complemented by Surprise [PE]. The main reason to
use this set is that they can be seen in facial expressions. The emotions listed
above are the most often used, but there are between 500 and 2000
different categories of emotion suggested in the English language, and
different research views generate different categorisations [JR3].
Feeling, similar to emotion, but still
a different kind of experience. The
difference is that a feeling does not
call for an action or activation
change [NF].
IV.17.2 Appraisal
Stimulus of the system has to be appraised. This is the cognitive
interpretation of the event that also can be used to categorise emotions, see
also figure IV.17.1 [NF]. A set of appraisal variables distilled from research
is listed in table IV.17.1 [JG]. As can be seen from the table, when deciding
the significance of an event an agent is introduced that mediates events,
or is itself causing the event.
Table IV.17.1 Variables for appraisal [JG].
Appraisal variable: Explanation
Relevance: Does the event require attention or adaptive reaction?
Desirability: Does the event facilitate or thwart what the person wants?
Causal attribution
  Agency: What causal agent was responsible for an event?
  Blame and Credit: Does the causal agent deserve blame or credit?
Likelihood: How likely was the event; how likely is an outcome?
Unexpectedness: Was the event predicted from past knowledge?
Urgency: Will delaying a response make matters worse?
Ego Involvement: To what extent does the event impact a person’s sense
  of self (self-esteem, moral values, beliefs, etc.)?
Coping potential
  Controllability: The extent to which an event can be influenced.
  Changeability: To what extent an event will change of its own accord.
  Power: The power of a particular causal agent to directly or
    indirectly control an event.
  Adaptability: Can the person live with the consequences of the event?
Another set of appraisal variables, with their associated emotions within
parentheses is [Scherer as cited by JU]:
Novelty (surprise, amazement …)
Motive compliance (instrumental emotions such as
disappointment, satisfaction …)
Intrinsic pleasantness (aesthetic emotions: disgust,
attraction to …)
Legitimacy (social emotions: indignation, admiration)
Challenge/promise (interest emotions: boredom, fascination …)
…
1st 6,
6, 28
4th amendment (US), 10-4
2nd law of thermodynamics
0, ∞, 90-60-90
π, 1.618 033 988 749 894 848…
50th birthday,
10 Downing street
42195 m,
135th (British) Open championship
Mount Everest
From the appraisals suggested above a layered representation can be
constructed of how they are applied, see figure IV.17.3 [JU]. As the authors
also note, this is however a very simplified model of your emotional life.
Figure IV.17.3 Highly simplified process of appraisal filters [JU].
Filters: Novelty, Unpleasantness, Goal hindrance, Ability to cope,
Incompatible with self-respect, Incompatible with norms. Outcomes:
Indifference, Happiness, Fear, Sadness, Pride, Shame, and Guilt, disgust,
anger. Branches are labelled low, high, very low and very high.
Some specifically social appraisals have also been suggested; status signals
(trigger pride), violation of mutual fairness (anger), and events that
happen to a group that we identify with will trigger emotions in us [JU].
IV.17.3 Concern (need, urge, drive, goal, utility, desire, motive)
Concerns and other motivators are another set of ill-defined, overlapping
concepts. If the system detects problems related to concerns, emotions
develop [NF]. A concern can be universal, such as the concern for physical
well-being, but others are personal, related to previous events. Concerns
can be abstract, such as concern for democracy, but yet others can be very
practical, such as worrying about a traffic jam. That concerns can be
personal and situated is a problem for us here since we want to state
general facts about quality of life (QoL). Since concerns are personal, so
will appraisal be, along with emotions, and eventually the actions taken.
A concern is the disposition of a
system to prefer certain states of
the environment, and of the own
organism over the absence of
such conditions.
Needs are effectuated through drives, and are low-level and
non-conscious, directed at achieving essential resources. Biological drives
include hunger, thirst and reproduction [DD1]. We need to eat, need to
mate. Needs can be seen as particular qualities of experience that humans
need for QoL. By asking people to rate different needs a list of needs was
identified, and on top of the list were the four needs autonomy,
relatedness, competence, and self-esteem [KS2]. Autonomy means that
the activities chosen are self-endorsed, and relatedness is the need to feel a
sense of closeness to others. Self-esteem is about achievement, status,
responsibility and reputation, whereas self-actualisation is described by
personal growth and self-fulfilment, the process of becoming everything
one is capable of. Less important were security, self-actualisation and
physical thriving. Popularity/influence and money/luxury were not seen
as important. The list of needs was chosen from a survey of different
proposals for needs. Another famous suggestion is the fundamental set of
needs on different levels by Maslow: physical health, security,
love-belongingness, self-esteem, and self-actualization. Other researchers
have complemented Maslow's hierarchy with aesthetic (beauty, balance,
form …) and cognitive needs (knowledge, meaning, self-awareness). A
need for aesthetics is motivated by the idea that humans should have
evolved systems that make preferences that were adaptive in the past feel
rewarding. This relates to sexual attractiveness.
Longest kiss 30 hours
The most sit-ups using an abdominal
frame completed in an hour is 8,555
The longest time a coin has been spun
until coming to a complete rest is 19.37
seconds.
Yet another interesting suggestion for needs, or urges, is curiosity,
challenge and teaching [MT]. Curiosity is for instance clearly seen in
young children, and there is no end to the number of world records.
Curiosity without challenging anything will not get you far, which
indicates that challenge is a basic need. The teaching urge means that it is
difficult to keep a secret, and that it feels good to share your knowledge.
SDT (Self-Determination Theory) gives us our next list. It postulates three
innate psychological nutriments for growth and well-being: competence,
relatedness and autonomy [ED]. Note that challenge and curiosity are
important, if not necessary, for achieving competence.
Goals can also be conscious cognitive creations. We have
social needs (belonging, esteem), and self-actualizing needs (mastery,
control, variety, meaning …) [RV].
Yet another variation of the theme is motives that arouse and direct
behaviour toward specific objects and goals. The big three motives are
achievement, power, and intimacy, all related to the social world [RL2].
This short list includes the most important motives from the following
longer list by Henry Murray: Achievement, Exhibition (to make an
impression), Order (arrange neatly, precision), Dominance, Abasement
(admit inferiority), Aggression, Autonomy, Blame-avoidance,
Affiliation/Intimacy, Nurturance (to give, assist, help, feed), and Succor
(to receive).
Next step in the analysis would be to find even more fundamental reasons
behind all of the different concerns, and try to identify the most important
concerns and their reasons. Maybe there are reasons that affect many
concerns? One attempt at such an analysis is found in [BS2]. Starting from
the three general components situation, environment and object the three
major areas of concerns found were power (hierarchy, competition, and
submission), death (violence, health, and self preservation), and love
(friendship, hatred, and lust).
Dracula’s kiss
IV.17.4 Action tendency (coping strategies)
If concerns appear endangered then action tendencies develop. The
following list gives some possible tendencies, with associated emotions
given within parentheses [NF]:
Approach (Desire),
Avoidance (Fear),
Being-with (Enjoyment, Confidence),
Attending (Interest),
Rejecting (Disgust),
Nonattending (Indifference),
Agonistic (Attack/Threat) (Anger),
Interrupting (Shock, Surprise),
Dominating (Arrogance),
Submitting (Humility, Resignation).
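The pairing of action tendencies and emotions in the list above can be
written down as a simple lookup table. The following Python sketch is only
an illustration; the names ACTION_TENDENCIES and tendencies_for are
hypothetical, and the reverse lookup from an emotion to its tendencies is an
assumption about how such a table might be used, not something stated in
[NF].

```python
# Action tendencies and their associated emotions, copied from the list above.
ACTION_TENDENCIES = {
    "Approach": ["Desire"],
    "Avoidance": ["Fear"],
    "Being-with": ["Enjoyment", "Confidence"],
    "Attending": ["Interest"],
    "Rejecting": ["Disgust"],
    "Nonattending": ["Indifference"],
    "Agonistic (Attack/Threat)": ["Anger"],
    "Interrupting": ["Shock", "Surprise"],
    "Dominating": ["Arrogance"],
    "Submitting": ["Humility", "Resignation"],
}


def tendencies_for(emotion):
    """Return the action tendencies listed for an emotion (case-insensitive)."""
    wanted = emotion.strip().lower()
    return [
        tendency
        for tendency, emotions in ACTION_TENDENCIES.items()
        if wanted in (e.lower() for e in emotions)
    ]
```

With this table, tendencies_for("fear") gives the single tendency Avoidance,
while an emotion that is not in the list gives an empty result.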
Another view formulates the behaviour of appraisal as something that
triggers coping strategies, see table IV.17.2 [JG].
Table IV.17.2: Some common coping strategies.
Problem-focused coping
  Active coping: taking active steps to try to remove
    or circumvent the stressor
  Planning: thinking about how to cope. Coming up
    with action strategies
  Seeking social support for instrumental reasons:
    seeking advice, assistance, or information
  Suppression of competing activities: put other
    projects aside or let them slide.
  Restraint coping: waiting till the appropriate
    opportunity. Holding back
Emotion-focused coping
  Seeking social support for emotional reasons:
    getting moral support, sympathy, or understanding.
  Positive reinterpretation & growth: look for silver
    lining; try to grow as a person as a result.
  Acceptance: accept stressor as real. Learn to live
    with it
  Turning to religion: pray, put trust in god (assume
    God has a plan)
  Focus on and vent: can be function to
    accommodate loss and move forward
  Denial: denying the reality of event
  Behavioral disengagement: Admit I cannot deal.
    Reduce effort
  Mental disengagement: Use other activities to take
    mind off problem: daydreaming, sleeping
  Alcohol/drug disengagement
The table distinguishes between problem-focused (cognitive) and
emotion-focused coping, but the two types of strategies still interact.
IV.17.5 Experience
Experience is a useful concept when discussing quality of life. It certainly
is a complex and multifaceted concept since it is an aggregation of
whatever people encounter in their lives. After a deep breath we will now
attempt to dissect experience.
An experience is the sensation of
interaction with a product, service,
or event, through all of our senses,
over time, and on both physical and
cognitive levels [NS].
We are constantly interacting with our environment, attracted to
favourable experiences, and repelled by others. Our moods can even be
affected by words shown too fast to be consciously noted [RL3]. By
evolution we are selected to be healthy and to feel good. Evolution has
most efficiently eliminated those of our forefathers who did not enjoy
food, shelter and company [RV]. Playing tennis with a friend is typically
more fun than sitting in jail alone, and people who win Oscars on average
live 4 years longer than people who are nominated, but fail to win [RL3].
We humans prefer a medium level of uncertainty. Total predictability will
be dull and grey (at best), and even worse is the ultimate chaos without
any patterns of stable references. We, in other words, prefer a semi-chaotic
environment. If this is not possible we will try to change context or
situation. When we find ourselves in a comfortable situation this is
however sadly only a temporary match and relief. Either we adapt, or the
situation develops into something we cannot handle. The problem is not
as acute in a social environment where the participants can co-develop,
but even there the context can endanger a long term relationship.
Hedonic experiences are experiences related to like or dislike. If positive
we can name them pleasure, enjoyment, excitement, fun or happiness,
and once again we face a set of semantically overlapping concepts. In the
following we will not make a difference between the terms. There is
however research that tries to define and explore the differences;
appealingness of technology for instance relates to pleasure, but novelty
involves pleasure, excitement and fun [HS1].
If we consider media enjoyment, such as reading a book, watching a
video, or playing a game, we seek a suitable tension between the cognitive
abilities of the interpreter, and the complexity and other characteristics of
the media message. We can view experience as a story, i.e. a frame,
possibly socially constructed, where we can be one of the participants.
Enjoyment results from moving in between overstimulation and
understimulation, providing different levels of arousal. The extent to
which we are able to extract the message from a medium, and experience
arousal, depends on disposition, but also to a large extent on training.
Both cognitive and affective structures are active and interdependent
when information is processed. Even physiological aspects are involved. A
human infant tasting sugar will relax the muscles of the middle face. So far
however, no one has found any centre of pleasure in the brain, even
though there are several regions identified that contribute to feeling good
[KB]. Luckily, a larger area in the brain is allocated for positive
experiences.
Many researchers have tried to define the dimensions describing
experience. One is Donald Norman who describes the visceral,
behavioural (learned, habitual), and reflective experiences [DN1]. The
visceral level is the "biologically prewired", about how things look, feel
and sound. Another framework is provided by Jordan, who classifies
pleasure as physio-pleasure, psycho-pleasure, ideo-pleasure and
socio-pleasure [PJ].
A third approach is to look at the aesthetic experience that is ultimately
about a satisfaction resulting from an experience and could originate from
perception, cognition, or action, e.g. dancing. Aesthetic sensibility can be
trained and increases with age. It is personal, changing, and not
necessarily rational. An aesthetics can be defined also for the social
environment relating to trends, culture and religion [BS3]. Aesthetics is as
fundamental to human life as creativity and design, but it is concerned
with how things are done rather than with coming up with them. Everything
from making a cup of coffee to writing a software program can be done
considering aesthetics, and it is a blessing as well as a curse. A blessing
since if we care about aesthetics beauty is something that matters, but a
curse because it adds extra constraints that hamper productivity as
measured in time. Taken to the extreme a designer will not be able to
follow orders in conflict with his sense of aesthetics.
Why do we care for the aesthetics? It has no functional value and yet it is
highly valued. A crude explanation from evolution is that anyone who can
afford to mess around with arts and care about aesthetics must be wealthy
and worthy of respect. It is a matter of status. Another reason is that
aesthetics is about human concerns, emotional needs rather than
efficiency. Not only solving the problem, but doing it with elegance,
creativity, and with little resources.
Aesthetics: beautiful, good, true,
satisfying, efficient, and useful all at
the same time behaviour, form, or
thing.
White square on a white ground
Why then is an expression considered aesthetic? What is it that pleases the
individual, and satisfies by concentrating pleasure stimuli? For vision, and
if we once again use evolution as a master copy, we want to look at safe,
food-rich, explorable, learnable habitats, and at friends, fertile and
healthy mates, and babies [SP]. Any ordinary family photo album can be
used to verify these facts.
Reproductions of basic graphic elements, that we employ daily to make
sense of our environment, for instance parallel lines and symmetrical
shapes, are also candidates for aesthetics. These elements have been
incorporated into us by evolution, to help us orient ourselves in the
environment, and now we can exploit this inherited capability in
meaningful, clear, and aesthetic graphics.
Is it possible to enhance human’s
visual literacy (as a new language)?
The search for elegance and the careful selection of design elements forces
the designer to reflect on the result, and to spend time with it. Sometimes
a new, complete and yet the most economical solution is found, and the
result surprises even the designer herself. Some of the keywords are
proportion, scale, contrast, and emphasis that evoke activity and interest.
An extensive framework for experience is provided by [NS2]. The intent is
to help designers and management to think about products. Some of the
characteristics affecting an experience discussed in the framework are:
Intensity, reflex, habit, engagement.
Breadth, price, promotion, channel/environment, name, brand,
service, product.
Significance, function, price, emotion/lifestyle, status/identity,
meaning.
Triggers, by sense or cognitive (concept, symbols).
Duration, initiation, immersion, conclusion, continuation.
Meaning, beauty, accomplishment, creation, sense of community
or oneness, duty, enlightenment, freedom, harmony, justice,
redemption from undesirable conditions, security, truth,
validation by others, wonder.
In the framework the meanings of meaning are the most interesting. It is
defined as a distinct level of cognitive significance that represents how
people understand the world around them, and integrates emotional and
cognitive as well as cultural factors. Meaning is very important to all of us,
and many of the suggested meanings listed above can be seen as a person
living out a culturally specific frame.
There are also other human experiences that have not been covered in the
discussion above. Some of them are:
Commitments
Pride
Schadenfreude, enjoying the mistakes or bad fortunes of a rival.
Fiero, personal triumph over hardship or impossible superiority.
Human lives are extremely rich!
IV.18 Human’s subjective well-being, emotion, and flow
The ultimate reason for design is to maximise QoL (Quality of Life), or as
we refer to it in the header, subjective well-being. Quality of life is a
number of suitable dimensions (quality) for assessing the emergent
process of one or more humans being (life). Synonyms for QoL from other
areas of research are Life satisfaction (economics), Happiness (psychology
and economics), Well-being (psychology and health), and Welfare
(economics). The term Quality of life itself originates and is used in
sociology [BA]. We will assume that the total well-being of a system is the
sum of the QoL for the human interactors involved. A list of the most
used QoL dimensions, sometimes referred to as life-chances, is shown in
table IV.18.1 [RS].
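The assumption that the total well-being of a system is the sum of the QoL
of its human interactors can be made concrete with a short sketch. The
Python below is illustrative only: the rating scale of 0 to 10, the
unweighted mean used to combine the eight dimensions into one QoL value per
person, and the names QOL_DIMENSIONS, person_qol and system_well_being are
all assumptions that do not come from [RS] or [BA].

```python
from statistics import mean

# The eight QoL dimensions from table IV.18.1 [RS].
QOL_DIMENSIONS = (
    "emotional well-being",
    "interpersonal relations",
    "material well-being",
    "personal development",
    "physical well-being",
    "self-determination",
    "social inclusion",
    "rights",
)


def person_qol(ratings):
    """Combine one person's ratings (assumed scale 0 to 10) into one QoL value.

    The unweighted mean is an assumption; the text does not say how the
    dimensions should be combined.
    """
    missing = [d for d in QOL_DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return mean(ratings[d] for d in QOL_DIMENSIONS)


def system_well_being(people):
    """Total well-being of a system: the sum of QoL over its human interactors."""
    return sum(person_qol(ratings) for ratings in people)
```

A plain sum treats every interactor equally and lets a gain for one person
offset a loss for another, which is a design choice worth keeping in mind
when the dimensions are not independent.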
When the tide of life turns against you
And the current upsets your boat
Don’t waste tears on what might have been
Just lie on your back and float.
QoL dimension: Indicators and descriptors
1. Emotional well-being
   a. Contentment (satisfaction, moods, enjoyment)
   b. Self-concept (identity, self-worth, self-esteem)
   c. Lack of stress (predictability, control)
2. Interpersonal relations
   a. Interactions (social networks, social contacts)
   b. Relationships (family, friend, peers)
   c. Supports (emotional, physical, financial, feedback)
3. Material well-being
   a. Financial status (income, benefits)
   b. Employment (work status, work environment)
   c. Housing (type of residence, ownership, neighbourhood)
   d. Infrastructures (personal and goods transportation)
4. Personal development
   a. Education (achievement, status)
   b. Personal competence (cognitive, social, practical)
   c. Performance (success, achievement, productivity)
5. Physical well-being
   a. Health (functioning, symptoms, fitness, nutrition)
   b. Activities of daily life (self-care skills, mobility)
   c. Leisure (recreation, hobbies)
6. Self-determination
   a. Autonomy/personal control (self-endorsed, independence)
   b. Goals and personal values (desires, expectations)
   c. Choices (opportunities, options, preferences)
7. Social inclusion
   a. Community integration and participation
   b. Community roles (contributor, volunteer)
   c. Social supports (support network, services, events)
8. Rights
   a. Human (respect, dignity, equality)
   b. Legal (citizenship, access, due process)
The indicators listed in the table are not independent; interpersonal
relations are for instance extremely important for emotional well-being
and personal development. Claims have been made both that
TV disrupts family interaction and that it increases it. Other surveys have
found that heavy TV-users are unhappy, but is this really an effect of
watching TV, or do they watch a lot of TV because they are unhappy for
other reasons?
Social identity is how you want to
present yourself to others. To this we
can add the narrative self, your own
history as you see it, past, present,
and future. A frame you adapt to your
life, and try to adapt your life to.
Continuity is important for trust, on
the other hand you need to show that
you are unique, contrasting against
others.
Self-esteem is a person's feelings
towards herself, pride in oneself.
Self-esteem depends on social factors such
as family traditions, language,
cultural customs and values, for
instance about economic background. We
all want to feel privileged and chosen
relative to the rest of the world.
Table IV.18.1. Quality indicators based
on reading 9749 abstracts, and 2455
articles, selected from the 20900 articles
with the term quality of life in the
title and published since 1985.
"Um, I think your problem is
low self-esteem. It is very
common among losers."
We can study the indicators in table IV.18.1 in two ways. Either we ask
individuals and obtain an internal estimate, or we try to measure the same
parameter from the outside. When we ask individuals we face the
problem that we cannot be sure that the question is understood the same
way by everyone, and also that adaptability modulates the expressed level
of QoL. Other problems are that personality, disposition, temperament,
and recent experiences affect opinions. The cultural and organisational
context of an individual also changes the priorities of the indicators in table
IV.18.1. More money makes a bigger difference when living among the poor.
It is on the other hand difficult to establish the mood of a person from the
outside, although for instance analysis of facial expression can extract
emotional information.
The table above can be used to indicate in what areas positive changes will
affect the individual. Even if a few individuals do not appreciate
improved housing, on a population level this indicator is a true measure
of increased QoL. With respect to technology we can compare the
indicators above before and after the introduction of a new technology.
Measures of QoL provided by a product or a service could be:
What people are willing to pay for it.
Their reaction to losing it.
How much they use it.
Their attitude towards the product.
How they feel after using the product.
Happiness is an individual's appraisal of life, synonymous with subjective
well-being. It is a personal sampling of life as a whole, while in the middle
of living. Happiness is an easy indicator of QoL to measure, just ask. The
problem is to interpret and relate the answers given. If our measures for
instance indicate that unmarried persons are less happy this could be
interpreted as that the word unmarried is negatively loaded in the specific
culture. It could on the other hand be an indication of loneliness, or
correlated to the fact that unhappy people are less attractive [RV].
From table IV.18.1 we have the ideas of QoL dimensions. They can be seen
as a context for what we do and what happens to us, i.e. the course of
events that we experience. Table IV.18.2 below shows an example where
happiness has been measured for a selection of the daily events we
encounter and the actions we perform.
Action/event                      Happiness   Average hours per day
 1. Sex                           4.7         0.2
 2. Socialising after work        4.1         1.1
 3. Dinner (socialising too?)     4.0         0.8
 4. Relaxing (socialising too?)   3.9         2.2
 5. Lunch (socialising too?)      3.9         0.6
 6. Exercising                    3.8         0.2
 7. Praying                       3.8         0.5
 8. Socialising at work           3.8         1.1
 9. Watching TV                   3.6         2.2
10. Phone at home                 3.5         0.9
11. Napping                       3.3         0.9
12. Cooking                       3.2         1.1
13. Shopping                      3.2         0.4
14. Computer at home              3.1         0.5
15. Housework                     3.0         1.1
16. Childcare                     3.0         1.1
17. Evening commute               2.8         0.6
18. Working (! low index)         2.7         6.9
19. Morning commute               2.0         0.4
“We are made happy when reason
can discover no occasion for it”,
Henry David Thoreau
Happy: ebullient, joyful,
exhilarated, elated, carefree,
contented, at peace, at ease, and
being in high spirits.
Table IV.18.2 A list of courses of
events and their happiness index.
Note the popularity of social
interaction [RL].
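The figures in table IV.18.2 invite a little arithmetic. The sketch below is our own illustration and not part of the study behind the table: it copies the rows of the table and compares the plain average of the happiness index with an average weighted by the hours spent on each activity. The names ACTIVITIES, mean_happiness and time_weighted_happiness are hypothetical, and weighting by hours is an assumption about how one might summarise a day.

```python
# Rows copied from table IV.18.2:
# (action/event, happiness index, average hours per day).
ACTIVITIES = [
    ("Sex", 4.7, 0.2),
    ("Socialising after work", 4.1, 1.1),
    ("Dinner", 4.0, 0.8),
    ("Relaxing", 3.9, 2.2),
    ("Lunch", 3.9, 0.6),
    ("Exercising", 3.8, 0.2),
    ("Praying", 3.8, 0.5),
    ("Socialising at work", 3.8, 1.1),
    ("Watching TV", 3.6, 2.2),
    ("Phone at home", 3.5, 0.9),
    ("Napping", 3.3, 0.9),
    ("Cooking", 3.2, 1.1),
    ("Shopping", 3.2, 0.4),
    ("Computer at home", 3.1, 0.5),
    ("Housework", 3.0, 1.1),
    ("Childcare", 3.0, 1.1),
    ("Evening commute", 2.8, 0.6),
    ("Working", 2.7, 6.9),
    ("Morning commute", 2.0, 0.4),
]


def mean_happiness(rows):
    """Plain average of the happiness index over all listed activities."""
    return sum(happiness for _, happiness, _ in rows) / len(rows)


def time_weighted_happiness(rows):
    """Average happiness index, weighted by the hours spent on each activity."""
    total_hours = sum(hours for _, _, hours in rows)
    return sum(happiness * hours for _, happiness, hours in rows) / total_hours


if __name__ == "__main__":
    print(f"unweighted mean:    {mean_happiness(ACTIVITIES):.2f}")
    print(f"time-weighted mean: {time_weighted_happiness(ACTIVITIES):.2f}")
```

The weighted figure comes out lower than the plain average, because working has a low index and by far the most hours.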
”Interaction”
There is presumably a motivation for taking an action and reasons to
expose oneself to experiences. The most important one is to directly
increase happiness by having sex, or enjoying lunch, but we also plan
ahead as when we work to fund shopping or a good wine. The logic is
either action -> feel happy -> more action, or experience something that
increases happiness -> take action to experience the same thing again.
Whether interaction technology makes you happier is a good question,
and one you need to ask to assess the result of the introduction of new
technology. Asking is important since if we cannot evaluate the result,
then what and why do we design? Measures are however not very precise
even if they are consistently positive or negative, for instance when
interviewed by someone in a wheelchair healthy persons will rate their
happiness higher [RV].
Happiness: China (Red); Egypt, US (Yellow); Cherokee (White); Japan, Middle East (Orange/Gold); Brazil (Blue)
An overall assumption in this book is that happiness can be changed by
adding technology. Whether raised happiness over a longer term is
possible is however not clear. A widely accepted figure of the heritability
of well-being is 50% [KS], i.e. to a large extent our genes indicate our
destiny. Quality dimensions as listed in table IV.18.1 provide another
salient set of cues for how we rate our happiness, and even by themselves
can affect happiness. If we compare our social prestige with that of our
neighbour the result can affect how we evaluate our life. Life chances
could also directly affect how we evaluate what we experience. More
people without family bonds feel lonely. Each individual will however
interpret and sum their life-chances differently depending on for instance
age, disposition, and education.
After taking heritability and life-chances into account we still have a
significant share of happiness that depends on the events and the
experiences that we encounter, but also, and perhaps more importantly,
on the actions we take and how we experience the result of these actions.
We in other words have a chance to create our own happiness within the
context of our life-chances, but one problem with this is that we are
restrained by our traits. They are the affective, cognitive and behavioural
patterns that are consistent across situations and over time, and quite stable for an
individual. Another objection against taking action for happiness is that
we anyway adapt to the new resulting circumstances, and consequently
striving for happiness is not worth the trouble in the long run. If we
however look not at the result, but at the process of gaining happiness,
the fight might be worth the effort. We are constantly interacting with our
physical and social environment, and it seems silly not to do this in a way
that brings us as much happiness as possible. If we can build tools to help
us in this, it is good use of technology. Identifying the actions to take, and
the technology we need, are important research issues. Furthermore, what
we do also changes the conditions allowing for new actions and ideas of
supportive technology.
Examples of actions that change happiness are purposely being nice to
someone, and preparing a list of positive things in life [KS1]. In general
however the activities to choose depend on the personality of the
individual (strengths, interests, values) and the life-chances.
More people in developed countries, where housing and food are provided,
will look positively on life. They also think that they are happier than
average; we are a positive breed of critical observers. On the other hand
we rate losses as more salient than wins, which tends to make us
conservative.
One important finding from the perspective of this book is the positive
correlation between happiness and urbanization, industrialization and
individualization. It seems that we are happier in a modern society despite
the obvious problems with anonymity and alienation [RV]. Why this is so
might be clearer when we later give examples of the technology that
surrounds us, and what it helps us to do.
Typical social traits associated with subjective well-being are
extroversion, conscientiousness, and agreeableness, but what other
characteristics does positive social behaviour show? The follow-up questions
are whether and how technology can induce and support such
behaviours.
Ebenezer Scrooge
Work hard
Increase production
Prevent accidents
and be happy
Voice
IV.18.1 Flow
Flow is the optimal presence, and it is the optimal experience, i.e. the
ultimate mindfulness [MC3]. It might also be more common than you
think. Were you lost in the previous sentence half a second ago?
Prerequisites for flow are:
A task with clear goals to complete.
Immediate feedback.
Ability to concentrate on the task.
Sense of control over actions.
Masters of science!
In the state of flow the duration of time seems to change, concern for self
and awareness of worries disappear, and after the experience a stronger sense
of self emerges. To achieve all this it is however important that skills and
challenges match the person, see figure below.
Figure IV.18.1 Model of flow. Level of challenge (vertical axis) against level of skill (horizontal axis), with the regions Anxiety, Flow, Apathy, and Boredom.
People will accept or be faced with a level of challenge, and as skills
develop and new challenges emerge they will be forced towards the upper
right corner in figure IV.18.1. Also, note how easy it is to match the
prerequisites for flow to how a successful game affects a player.
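The model in figure IV.18.1 can be sketched as a small classifier. This is a minimal sketch under our own assumptions: challenge and skill are scaled to the range 0 to 1, the regions are read as four quadrants (Anxiety at high challenge and low skill, Flow at high challenge and high skill, Apathy at low challenge and low skill, Boredom at low challenge and high skill), and the dividing threshold of 0.5 is arbitrary. The name flow_state is hypothetical.

```python
def flow_state(challenge: float, skill: float, threshold: float = 0.5) -> str:
    """Classify a (challenge, skill) pair, both scaled to 0..1, into a quadrant.

    The quadrant layout and the threshold are assumptions made for
    illustration; the figure itself gives no numbers.
    """
    if not (0.0 <= challenge <= 1.0 and 0.0 <= skill <= 1.0):
        raise ValueError("challenge and skill must lie between 0 and 1")
    high_challenge = challenge >= threshold
    high_skill = skill >= threshold
    if high_challenge and high_skill:
        return "flow"
    if high_challenge:
        return "anxiety"   # challenge exceeds what the skills can handle
    if high_skill:
        return "boredom"   # skills exceed what the task demands
    return "apathy"        # neither challenge nor skill is engaged


if __name__ == "__main__":
    # A player whose skills grow while the game stays easy drifts to boredom.
    for challenge, skill in [(0.8, 0.3), (0.8, 0.9), (0.2, 0.9), (0.1, 0.1)]:
        print(challenge, skill, flow_state(challenge, skill))
```

A game designer could use such a check to raise the challenge whenever a player lands in the boredom quadrant.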
Challenge can be generalised to the level of complexity that a person faces.
Boredom then means that we face a situation with low complexity
compared to what we can manage. This difference between situational
complexity and ability is called incongruity in [RN] and the figure below
shows a person in two different situations and how learning and
adaptation affects incongruity.
“What, me worry?"
Alfred E. Neuman
Figure IV.18.2 Incongruity (adapted from [RN]). Axis: Complexity. Labels: Context, Person, Incongruity, Learning/Adaptation, Situation 1, Situation 2.
The figure is interesting since it suggests a definitely dynamic framework
for experience and also that all experiences are individual since
incongruity is individual. It is also context and situation specific.
Now, if we return to figure IV.18.1 above, and follow the reasoning
from [RN], flow means that contextual complexity must be larger than the
individual's. Also, a medium arousal level is sought, which means that we
will search for such an incongruity. If the challenge is too low we will get
bored and look for novelty, i.e. for a more complex (challenging) situation
or context. If, on the other hand, it is too high we try to lower the level of
arousal, increase confirmation, and reduce uncertainty.
Figure IV.18.3 Optimum level of arousal (uncertainty). Pleasantness plotted against level of arousal, with Situation 1, Situation 2, and the optimal new situation marked.
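The reasoning about incongruity and arousal can be sketched in a few lines. This is a sketch under stated assumptions, not the model from [RN]: complexity is expressed as a number between 0 and 1, incongruity is taken as the plain difference between contextual and personal complexity, and the limits 0.1 and 0.5 that separate boredom, the sought medium level, and overload are invented for illustration. The names incongruity and reaction are hypothetical.

```python
def incongruity(context_complexity: float, person_complexity: float) -> float:
    """Difference between what the situation demands and what the person manages."""
    return context_complexity - person_complexity


def reaction(value: float, low: float = 0.1, high: float = 0.5) -> str:
    """Expected reaction to a level of incongruity.

    `low` and `high` are hypothetical limits around the medium level
    of arousal that the text says we search for.
    """
    if value < low:
        return "seek novelty"        # too little challenge: bored
    if value > high:
        return "reduce uncertainty"  # too much challenge: lower arousal
    return "stay"                    # medium incongruity: candidate for flow


if __name__ == "__main__":
    person = 0.6
    for context in (0.3, 0.8, 1.0):
        gap = incongruity(context, person)
        print(f"context={context:.1f} incongruity={gap:+.1f} -> {reaction(gap)}")
```

Learning raises the personal complexity, so a situation that once called for "stay" will eventually call for "seek novelty".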
We can assume an equivalence to flow also in social activities. Too much
control reduces complexity, with not enough challenges we get bored. Too
much variation, i.e. a high level of social complexities that we cannot
handle, is also not good.
A simple solution to tweak the current situation is to use drugs; other
disputable strategies are all kinds of overconsumption, whether it is of
games, gambling, girls (3G), or of food, fat, and sugar.
IV.19 Unique features for each of us
IV.19.1 Unique human abilities
Let us start with the human development. The main theory is evolution,
survival of the fittest. A kind of biological learning where favoured
mutations or new behaviours will be established as dominant after some
generations. The development cycle resulting from this is rather long, and
does not necessarily produce optimal solutions, survival is enough. Also,
it is not clear what it means to be the fittest in the new, fast changing,
information society. Humanity seems to have bypassed evolution for
individuals, but perhaps it is still at work for societies and ideas?
The list of intellectual feats mastered by humans only is rather long
[DAN]. Maybe the most important of them, apart from the ones discussed
previously in this chapter, is how we design and create artefacts to
develop old abilities and give us new ones. Some other animals also use
simple tools, but humans have created complex artefacts such as the
hammer, the automobile, and the computer. The hammer is by the way
not that simple to use, ask any chimpanzee.
People are deeply rooted in space and reality. Even a four-month-old
infant is surprised if objects pass through a gap that is narrower than the
object itself, or if an object disappears from one place, and materializes in
another [SP]. The spatial information is mainly provided by vision, and
there are innate behaviours in place to manipulate it, see figure below.
We all have a little weakness,
which is very natural but rather
misleading, for supposing that this
epoch must be the end of the world
because it will be the end of us.
How future generations will get on
without us is indeed, when we come
to think of it, quite a puzzle. But I
suppose they will get on somehow,
and may possibly venture to revise
our judgments as we have revised
earlier judgments.
G K Chesterton
Humans think they are smarter
than dolphins because we build
cars and buildings and start wars
etc...
and all that dolphins do is swim in
the water, eat fish and play around.
Dolphins believe that they are
smarter for exactly the same
reasons.
Douglas Adams
Figure IV.19.1 Spatial effects in
visual perception.
Processed image information is stored in long-term memory for
convenient retrieval, with better access for frequently used information.
You will quickly recall the colour of your house and the colour of the
house next door. But, recalling the colour of the next house but one is
slower as you need to do a mental scan over the street. Right?
Imagine the letter D. Rotate it
90 degrees clockwise. Put the
number 4 above it. Now remove
the small horizontal segment to
the right of the vertical line.
What familiar object do you
“see”?
[SP]
Our species' special spatial dependency is also evident in language. We
assume that distant objects and relations affect us less, and this way of
thinking can also be reused with reference to time. We say that “The
meeting lasted from 13.00 to …”, “The meeting is at 15.00”, and “I will
pay this bill, it is due tomorrow, that one is due next week”. The same
metaphors are also reused for programming, “These input parameters
have almost the same values, and should produce almost the same output
from the procedure”.
Space and force pervade language.
Many cognitive scientists have concluded from their research that a
handful of concepts about places,
paths, motions, agency and causation
underlie the literal or figurative
meanings of tens of thousands of
words and constructions…These
concepts and relations appear to be
the vocabulary and syntax of
mentalese, the language of thought
[SP]
The human is also the only interactor we know of who troubles herself
with philosophical questions, such as what is consciousness, self, free
will, meaning, e.g. “I know what a natural number is, but how can my
brain have a relation to an infinite number of items?”. Other mind-boggling
problems are concerned with morality, e.g. “Why not steal to
avoid starving?” and knowledge, “Why is the speed of light constant, and
why should that fact be true tomorrow?” Maybe we can use technology to
shed some light on some of these questions, or is our own nature such that
we are not capable of understanding the answers [SP]?
“Consciousness made easy”
“Meaning of life for dummies”
New titles we would like to see.
Some other feats to ponder on are:
Common sense, most of us have it, but no one can describe how
to teach it.
Humour, jokes are based on the human ability of association (and
more?).
Art, creates a strange world of its own, exploring socially
accepted truths that are sometimes agreed on by one person only.
Art is a multilevel experience. We perceive a work of art, we
reason and think about it, and it might arouse emotional
responses. We humans decide ourselves what should be
considered as beautiful, but there are some accepted universal
truths. Below, in figure IV.19.2, is an attempt from Dr Marquardt to
design the most beautiful, ultimate, face. It is created using the
golden ratio, also called golden section, golden proportion, or
the divine proportion. This number is the result when a straight
line is divided into a longer part a, and a shorter part b, such that
$a/b = (a + b)/a$, i.e. $b/a = (\sqrt{5} - 1)/2$, or 1:1.6180339887. For some reason the
golden ratio is considered beautiful by most people, and also
appears frequently in nature.
Curious? Why?
Common sense knowledge is not the
sort of knowledge found in
encyclopedias, but, rather is the sort of
knowledge taken for granted by those
writing articles in encyclopedias.
Hubert Dreyfus
Tangram (spookey)
Figure IV.19.2 Face designed
using the golden ratio (left). The
width of your nose and the
width of your mouth should
follow the golden ratio, but, if
yours do not, do not worry,
beauty is not everything (right).
Games and sport, mental and physical exercise raised to art.
Schooling, organised cultivation, not always as efficient, or as
much fun, as it should be.
Rituals, used for social confirmation. These are tight social
interactions where the sum adds up to more than the individuals.
Time perception, what is time, another of the eternal questions.
Every one of us perceives context differently. Some of us quickly
identify situations and events, but others are quite sluggish. Does
this mean that people perceive time itself differently? Now it is
time to go to bed.
Time perception ….
Hurrying
Waiting
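The golden ratio mentioned under Art in the list above is easy to check numerically. The sketch below is our own illustration: it divides a line into a longer part a and a shorter part b and verifies the defining relation a/b = (a + b)/a. The names PHI and split_golden are hypothetical.

```python
import math

# The golden ratio a/b, i.e. the positive solution of x = 1 + 1/x.
PHI = (1 + math.sqrt(5)) / 2


def split_golden(length: float):
    """Split a line of the given length into (a, b) with a/b equal to PHI."""
    a = length / PHI      # longer part, since (a + b)/a = PHI
    b = length - a        # shorter part
    return a, b


if __name__ == "__main__":
    a, b = split_golden(1.0)
    print(f"a = {a:.10f}, b = {b:.10f}")
    print(f"a/b       = {a / b:.10f}")
    print(f"(a + b)/a = {(a + b) / a:.10f}")
    print(f"b/a       = {b / a:.10f}")
```

The same function can be used to check a face, a building, or a page layout against the proportion: measure the two parts and compare their quotient with PHI.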
All mobile things face the problem of sustaining power. The sun is
mostly the source, directly through a solar panel (0.3 m², 10 W), or
indirectly through a Lithium battery (200 g, 6 W). A human has the
wonderful power to convert food into energy. A not overly exercised
adult consumes 2000 kcal/day. This is equivalent to 2325 Wh. We would
need quite a battery! Some figures on consumption are that sleeping needs
80 W and walking adds 60 W. Researchers from MIT (Massachusetts
Institute of Technology) have managed to retrieve 8.4 mW of this energy
by inserting a piezoelectric element into a shoe. Arm curl adds another
35 W of consumption, and finger motion adds around 10 mW. It is not
much use trying to burn fat by exercising your fingers [TS].
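The energy figures above can be checked with a short calculation. This is a sketch under one assumption of ours: a kilocalorie is taken as 4184 joules, which gives a value within a watt-hour or so of the 2325 Wh quoted. The function names are hypothetical.

```python
JOULE_PER_KCAL = 4184.0   # assumed conversion factor (thermochemical kilocalorie)
JOULE_PER_WH = 3600.0     # one watt for one hour


def kcal_to_wh(kcal: float) -> float:
    """Convert food energy in kilocalories to watt-hours."""
    return kcal * JOULE_PER_KCAL / JOULE_PER_WH


def average_power_w(kcal_per_day: float) -> float:
    """Average power in watts if the daily intake is spent evenly over 24 hours."""
    return kcal_to_wh(kcal_per_day) / 24.0


if __name__ == "__main__":
    daily_wh = kcal_to_wh(2000)
    print(f"2000 kcal/day is about {daily_wh:.0f} Wh")
    print(f"average power: about {average_power_w(2000):.0f} W")
    # Share of the extra walking power (60 W) that the shoe element recovers.
    print(f"shoe harvest share: {0.0084 / 60:.5f}")
```

The average power lands a little above the 80 W quoted for sleeping, and the piezoelectric shoe recovers only a tiny share of what walking adds.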
"Carriages without horses shall go,
And accidents fill the world with woe."
Mother Shipton (circa. 1530)
As our environment becomes more complex, more often the limitation in
the system will be the so-called human factor. Human memory and
processing limitations will inevitably lead to errors when a human is
given the wrong type of tasks. Humans are also limited in many other
respects; our running speed is for instance not very good compared to an
aeroplane. We have to carefully design systems such that they do not
overtax human levels for mental and physical effort, reaction times,
performance, or frustration.
But, even though the human factor is the cause of many problems the
truth is that it is the only unique feature of H, and should be nursed and
refined.
IV.19.2 Features and limitations not found in man
Many times things are more robust than humans, other times they break
without any visible reason, but so far they have never complained! A
rugged design can survive in the desert, as well as on the top of Mount
Everest, for a long time. Robustness stems from redundancy and other
qualities, which can be built into a thing, but a human being has to settle
for nature's ready-made design.
The thing is currently evolving under quite a different environmental
pressure than man. Compare survival on the savannah with survival in
the market place. Evolution is supported by the fact that it is possible to
manufacture an exact replica of a thing. This is not possible to do with a
human, and is a major difference! We might copy the basic genetic code
and use it as a base for an individual, but to copy a human complete with
her psyche is impossible. Nature's choice of an analogue implementation
enhances adaptability at the cost of losing some determinism,
including the possibility of making exact replicas. Whether such a trade-off is
necessary for intelligent, context dependent, adaptive behaviour is not
clear.
Every second someone hooks a new
computer to the Internet.
This person must be stopped!
Internet
I want to pull my arms into me
when they aren't in use.
John Dobbin
What are the limitations for a thing? None whatsoever?
The answer is that we really don't know yet, but the very idea of an
intelligent thing is something that by itself continuously enhances human
capabilities! By exploring the possibilities of things we learn about
ourselves, and about the reality we share. What we can say is that the
thing is less limited than the human in many respects, but that currently
humans are controlling the reproduction. We are no longer able to give
birth to a computer without the help of a computer, but we are in control
of the on/off button to the womb. In practice we have lost this
possibility too, since our society is now too dependent on the computer.
A grown up human cannot decide her extent of autonomy, it is fixed by
nature. How much autonomy to implement in a thing is a design decision,
and depends on the purpose. The complexity of the internal workings of a
thing will be proportional to the extent of autonomy. Perhaps emotions
and beliefs are overkill for a vacuum cleaner? If your child's toy becomes
too autonomous it might walk sulking away from the playpen.
Another limitation is that the thing currently has no consciousness; it is
not self-aware the way humans are.
One prediction is that technology by 2029 will provide the necessary
means for consciousness (memory, processing power). Whether these are
sufficient is another matter [GB]. The prediction is based on the fact that
there are $10^{12}$ neurons in the brain, each with 1000 synapses, summing up
to a total of $10^{15}$ synapses. An artificial neural network needs 4 bytes of
memory per synapse, so in other words, we need 4 million Gbyte of
memory. Estimates based on the typical random-access memory
configurations in personal computers over the previous 20 years give the
following formula:
$$\text{noByte} = 10^{(\text{year} - 1966)/4},$$
solving this for 4 million Gbyte gives the year 2029.
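The extrapolation can be reproduced in a few lines. This is a minimal sketch that assumes the formula reads noByte = 10^((year - 1966)/4) and that a Gbyte is 10^9 bytes; the function names are hypothetical.

```python
import math

NEURONS = 10**12
SYNAPSES_PER_NEURON = 1000
BYTES_PER_SYNAPSE = 4


def memory_bytes(year: float) -> float:
    """Typical PC memory in bytes for a given year, following the formula."""
    return 10 ** ((year - 1966) / 4)


def year_for_memory(n_bytes: float) -> float:
    """Invert the formula: the year in which n_bytes becomes typical."""
    return 1966 + 4 * math.log10(n_bytes)


if __name__ == "__main__":
    needed = NEURONS * SYNAPSES_PER_NEURON * BYTES_PER_SYNAPSE  # 4 million Gbyte
    year = year_for_memory(needed)
    print(f"bytes needed: {needed:.1e}")
    print(f"reached during {math.floor(year)}, i.e. available by {math.ceil(year)}")
```

Note how sensitive the result is to the exponent: every factor of ten in required memory moves the date by four years.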
Even if the number of memory cells, and the number of connections are
equal in an integrated circuit and a brain, and even if the circuit is a
hundred times faster than a neuron there is still another trick that the
brain could use to achieve its outstanding complexity, and that is timing.
Timing in an integrated circuit is used only to start a computation, for
instance by enabling a gate in an inverter. In the brain timing can be used
also to encode information. A short difference in time between two
neurons firing in parallel could mean that a trailing neuron is triggered. If,
on the other hand, the difference is large the trailing neuron will not fire.
The complexity possible for such a scheme is in principle infinite. In
practice robustness will set a limit to the complexity, using too small
timing differences is not practical.
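The timing idea can be illustrated with a toy coincidence detector. This is a sketch under our own assumptions and not a model of a real neuron: the trailing neuron fires only when the two leading neurons fire within a time window, and the window of 2 ms is an arbitrary, hypothetical value.

```python
def trailing_neuron_fires(t_first_ms: float, t_second_ms: float,
                          window_ms: float = 2.0) -> bool:
    """Return True if two spikes arrive close enough to trigger the trailing neuron.

    `window_ms` is a hypothetical tolerance; a smaller window allows more
    distinct timing codes but makes the scheme less robust to jitter.
    """
    return abs(t_first_ms - t_second_ms) <= window_ms


if __name__ == "__main__":
    print(trailing_neuron_fires(10.0, 11.5))   # short difference
    print(trailing_neuron_fires(10.0, 25.0))   # large difference
```

The same pair of input neurons thus carries different information depending on when they fire, which is what a clocked gate in an integrated circuit does not exploit.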
Power is quite necessary for an intelligent thing. Currently computers
drink electricity and depending on the application and the situation this
can be a severe problem. It is expensive (as it is for humans) to quench the
thirst. For stationary things power is usually not much of a problem, but
mobile things need batteries, solar cells, or some other energy
source. There are also other demands on the energy source: size, weight,
service lifetime, rechargeability, time to recharge, replacement cost, and
environmental and ecological concerns. It is tough to be a hardware
designer. A typical Lithium battery for 6 W weighs 200 g, and a NiCd
battery for 6 W is quite heavy, about 1.5 kg. We can use Mips/Watt/$
to estimate the power performance of a system. A typical value for this is
20 (1999). A similar measure for memory is MegaByte/Second/Watt/$.
31 Dec 2028
Isaac Asimov formulated the
following three robot laws in one
of his science fiction novels:
I A robot may not injure a human
being or, through inaction, allow a
human being to come to harm.
II A robot must obey orders given
by a human being except where
such orders would conflict with the
first law.
III A robot must protect its own
existence as long as protection
does not conflict with the first or
second law.
So far no robot has needed any
laws because they still cannot
even reliably detect humans.
But, if they needed laws, would
they really need the second law?
When, if ever, should an
“intelligent” thing lie to another?
To a human being?
A number of different approaches
are being tried.
Engineering, Internet
(We are still guessing at this point).
A personal computer dissipates around 100 W, and a thing such as a household
radiator can dissipate 1 or 2 kW.
Computers do currently not learn very well, and this fact forces them to
develop through evolution. They do not reproduce, which means that they
are at the mercy of their creators. Survival of the fittest gets a different
meaning then, namely survival by manual selection. On the other hand,
things can be copied and each generation can have a short lifespan, e.g.
compare mobile phones. Which is best in the long run? To be able to learn,
or to develop through fast evolution? Maybe these two approaches are
equivalent in our case since we are all interdependent?
Currently another limitation is that we do not know much about the
mechanisms for creativity in humans which makes it extremely difficult
to build things that are creative. What we can strive for is not technology
that is creative by itself, but that helps a human to be more creative, i.e.
technology that supports the steps of creativity described in the Chapter
IV.15 on human creativity.
IV.19.3 Summary Human vs Thing
The table below is a summary of the characteristics of man and machine
[BS1]. As we do in the table we tend to confront the thing and the human,
but, maybe we should rather see the thing as an extension of a human, a
complement extending senses (binocular, hearing aid), motorics (car,
bicycle), and cognition (computer based calendar, calculator).
Humans generally better
Sense low-level stimuli, e.g. using the finger tip.
Recognise constant patterns in varying situations.
Sense unexpected events and subjectively evaluate them.
Remember principles and strategies.
Retrieve pertinent details without a priori connection.
Draw on experience and adapt decisions to situation.
Select alternatives if original approach fails.
Generalise from observations.
Act in unanticipated emergencies and novel situations.
Develop new solutions.
Concentrate on important tasks when overload occurs.
Ethical reasoning.
Emotional (computers do not care?).
Machine generally better
Sense stimuli outside human’s sensory range.
Count or measure physical quantities.
Store coded information accurately.
Monitor prespecified events, especially infrequent ones.
Make rapid and consistent responses to input signals.
Recall quantities of detailed information accurately.
Process quantitative data in prespecified ways.
Infer from general principle.
Perform repetitive predefined actions.
Exert great highly-controlled physical force.
Perform several activities simultaneously.
Maintain operations under heavy load.
Maintain performance over extended periods of time.
Table IV.19.1 Summary of
humans versus things.
Thinking about the things
we used to do
Nancy Sinatra, Dean Martin
Imagine one million things,
each with sensors and capacity
for a mental model. Connect
them into a high-speed network
and wait. What will happen?
What are the main challenges to
make this scenario true?
Part V: Interaction, we do it together
This part of the book will introduce feedback under the pseudonym of
interaction and it is one of the most important concepts of this book. We
start off by pinpointing the concept of interaction. Next, Chapter V.1 to V.5
give short introductions to interactions between the three participants H, I
and T. We discuss interaction as a way of improving quality of life, and
also why we need technology for this. Since the main objective is to
improve quality of life, much of the discussion, and almost all of the
examples, are taken from a humanistic perspective. We continue by giving
context a special treatment. It is important since context is the background
environment that fuels interaction. Chapter V.7 defines context and
describes how it can be used. Equipped with knowledge on context we go
on to Chapter V.8 and V.9 where we focus on interaction modelling and
interaction characteristics. Next, mediation gets its own chapter V.10. The
last five chapters discuss interaction terminology and technology for
interaction control and cooperation, specifically the last chapter of this
part of the book, Chapter V.15, will give an introduction to command
based interaction, i.e. the traditional core of human-computer interaction.
The word interaction hints at its own meaning. It is built by the words
inter and action, both derived from Latin, inter meaning between or
among, and action, from Latin a’ctio, actually meaning action. The
following story adds to the concept:
An author (whose name we have forgotten, perhaps it was Victor Hugo?)
becomes really nervous when his new book is released, and escapes to a
resort. He soon becomes feverishly curious and sends a telegram to his
publisher, which only contains a “?”. His publisher returns a “!” and the
relieved author continues his vacation.
Interaction is where two or
more actors share a common
time-space-state universe.
This is obviously an interaction between two communicating entities. It is
also something more than a mere exchange of messages. Because the
publisher knows the author he can predict how the author will interpret
the ! .
One description of this interaction is that the two participants are
brought into a dynamic relationship through a set of reciprocal actions,
that is through a series of events, during which they are in contact with
each other in some way . “ better and shorter definition is Mutual
interdependence . This more accurately describes the coupling between
the participants, but the fact that the interaction above was goal directed is
still missing. Let s use the definition to the right.
Hakan Gulliksson
Definition:
Interaction is a method for goal
directed mutual interdependence.
A goal of an interaction can be engineered into the system, or emerge
from previous interactions. If it emerges it could do so through a
democratic process, or be imposed in a more dictatorial manner by one of
the interactors. Emergent goals have the interesting property that they can
dynamically change behaviours, and make old objectives obsolete. To
follow norms and fulfil obligations are examples of emergent, and also
changing, goals. In this book we will however not always insist on a goal
for the interaction. Two charged particles affecting each other will also be
considered an interaction.
We can see that traditional sciences explore two types of phenomena at
different levels of detail. They study objects, and also operations acting
on and performed by objects. Typical objects are molecules, cells,
computers, humans, networks, societies, and stars, which we in this book
refer to as interactors. The operations, in this book called interactions, can
be exemplified by human conversations, chemical reactions, software
processing of sensor data, data communication, and gravitation.
Interactors and interaction are intertwined and dependent, which means
that even if it is possible to spend a lifetime studying almost any detail of
either an interactor, or an interaction, we believe that such a strategy will
not give the whole picture. We cannot fully understand an interactor
without knowledge about the interactions it is engaged in, and we also
cannot understand the interaction while ignoring the interactors. When we
study the world we can try to figure out why an interactor does
something, or we can study the actions themselves. The latter is obviously
much easier, at least if we restrict ourselves to observable actions. If we
consider that an action is a product of the local context and the situation,
and if we see an interaction as a set of actions extended over time that can
have emergent properties, then actions do not seem so tangible and
comprehensible any more.
What keeps the following
systems from falling apart?
A society?
A rock?
Internet?
At each level of detail there is much to learn, but as humans we should
start our investigations at our own level. Here we have first-hand
experience that makes it easier to understand principles without too
much formalism. It is easier to apply knowledge from our own lives. Even
though quarks are interesting, and are described by well thought out
formal models, they are hard to relate to in daily life. The price to pay for
studying ourselves is that the information hidden by the high level of
abstraction makes the interactors and interactions under study seem
slightly magical.
With two actors, a shared space, and some means for communication we
have the basic prerequisites for coordination. Managing the coordination
is in itself an interaction, i.e. a meta-interaction. This could for instance
involve planning for the interaction, or specifying rules or infrastructure
for communication. A doorbell is a perfect example of a tool for
coordination.
”England expects that every
man will do his duty”,
12 sets of flags, in total 31
flags.
Two antagonists will coordinate by competing, but more amiable actors
will co-operate, or form a symbiotic, mutually beneficial relationship.
Interaction adds properties to the system that go beyond the individual;
it is both a source of power and a well of problems. Anything that joins
could be the cause, or the medium, of a conflict; marriage and a common
language are two examples. More about the co-x words in Chapters V.11 to
V.15.
The main reason for interaction is a shortage of resources, necessitating
coordination, and perhaps leading to co-operation or antagonism. One
shortage could be a lack of skills, but there is an unlimited number of
possible resources to fight over.
Definition: Coordination is the
process of sharing access to tools,
objects, space and time. It includes
transfer of tools and objects.
Adapted from [DP1].
Figure: Resource space, with the resources needed by A and the resources needed by B.
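The definition of coordination as sharing access to resources can be illustrated with a minimal sketch. It assumes that each interactor's needs can be written down as a set; the resource names, the function names, and the simple turn-taking rule are all hypothetical and chosen only for illustration.

```python
# Hypothetical resource needs for two interactors, A and B.
needed_by_a = {"printer", "meeting room", "projector"}
needed_by_b = {"meeting room", "projector", "coffee machine"}


def contested(a: set[str], b: set[str]) -> set[str]:
    """Resources that both interactors need, i.e. where coordination is required."""
    return a & b


def time_share(resource_names: set[str], actors: list[str]) -> dict[str, list[str]]:
    """One naive coordination rule: the actors take turns in a fixed order."""
    return {name: list(actors) for name in sorted(resource_names)}


shared = contested(needed_by_a, needed_by_b)
schedule = time_share(shared, ["A", "B"])
```

Only the resources in the overlap need any coordination at all. Whether the actors then take turns, compete, or co-operate is a separate choice; turn-taking is just the simplest rule to write down.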
V.1 H-H Interaction, the reference
The problem for human-to-human interaction is sharing or exchanging
information between two persons, or within a group of people, while
sometimes excluding other people. Many times we as senders are forced
to adapt to the receivers, and the receiver to us. How do you manage this
in a highly dynamic, stochastic, context?
Human-to-human interaction is the highest, most complex, form of
interaction! It has been around for quite a while, so the fundamentally
social human animal has established many rules, and even hard coded
some behaviour. It is important to know about these constraints whenever
you are interacting with other people, and even more so if you are
developing systems to support H-H, H-I, or H-T interaction. Wave your
hands at sea, and you summon the coast guard. You place yourself last in
the queue for coffee. You say "Good morning", "Good night", "See you
tomorrow", "After you", "Please", ...
Are we having an
interaction together?
There are many levels and types of interactions, some of which we are
conscious of, and some that we have to train ourselves to recognize and
register. If one person in a group of people sitting around a table yawns,
inevitably several others will do the same. As another example try dilating
your pupils at will! This could be quite useful since dilated pupils of a
woman have been found to cause the pupils of men's eyes to dilate by as
much as 30 percent, indicating that there is an affect. It is not clear if the
response holds true also for the other sex, but it is worth a try. Most H-H
interaction in this book will be considered as symmetric interaction
between equals.
A third example of H-H interaction is a mother talking to her child. There
are four sound patterns that reappear in cultures all over the world [PG]:
Encouraging, "Come to mommy", raising the tone.
Rewarding, "Good girl", "Bravissima, bravissima", lowering the tone.
Warning, "No, no, stop that, no", short staccato.
Comforting, "Hush hush", soft.
So far, the best tools for H-H interaction are the spoken and the written
language. An interesting theory is that the spoken language was invented
more for social reasons than for information transfer. The language gave a
group of humanoids a common reference that could keep the group
together even if the number of individuals in the group increased. In a
small community each individual can get to know all of the other
individuals, but as the size of the group increases, socialising without
language takes too much time, and the group will starve, or split up. In
fact, most human knowledge is stored in a common culture, which is kept,
and developed, by interaction and language. Because of the complexity of
H-H interaction and the adaptability of H, emergent behaviour, such as the
creation of a language, is the norm rather than the exception. Language is
also an example of an applied coordination technique. Examples that have
emerged are history books, timetables for buses, and university lectures.
We can use technology to enhance person-to-person communication and
consequently to further develop our culture. The telephone makes
conversation independent of distance, and e-mail removes the time
constraint from the interaction. Unfortunately technology also reduces the
communication bandwidth. It is quite difficult to see gestures through a
phone, and it is also difficult to hear the tone of voice in an e-mail. Is it
possible to hear a smile over the telephone?
But note that the limitations in technology are not fundamental. Certainly
technology can provide a much higher bandwidth than any human can
manage. According to one calculation a human accepting input at full speed
crunches 1 Gbit/s of sensory data, and such a bit rate is already available
with today's technology.
I marmaladed a slice of toast
with something of a flourish and
I don’t suppose I have ever come
much closer to saying ’Tra-la-la’
as I did this morning. It is no
secret that Bertram Wooster,
though as glamorous as one
could wish when night has fallen
and the revels get under way, is
seldom a ball of fire at the
breakfast table. Confronted with
the eggs and b., he tends to pick
cautiously at them, not much
bounce to the ounce. The reason
for the improved outlook on the
proteins and the carbohydrates
was not far to seek. Jeeves was
back.
P. G. Wodehouse
This ’telephone’ has too
many shortcomings to be
seriously considered as a
means of communications.
The device is inherently of
no value to us.
Western Union internal
memo, 1876
There are several different reasons for communicating. From the
perspective of interaction we can list them as in the figure below [UB]:
Entertain
Inform
Coordinate
Collaborate
Co-operate
We entertain for fun, inform to let someone know, coordinate to
synchronize or to level, collaborate to achieve a common objective, and
finally we co-operate using shared resources, also with a common goal.
Evaluating any of the activities in the figure above is notoriously difficult
in H-H interaction because we cannot read human minds. This fact also
makes any H-H interaction potentially non-linear, e.g. slightly changing
the context of the interaction does not necessarily mean a small change of
the result. Since we cannot be certain about the goal of an interaction it is
also difficult to know if the goal is met.
We structure interactions in many ways, depending on the circumstances.
Planning, estimating time, repeating actions, and aligning constrained
tasks come easy to us. However, with H-H interaction, results from
repeated interactions are unpredictable. This is many times a result of the
opaque human mind, but sometimes also of the sheer complexity
of the context or the task, designed or natural. Still, H-H interaction is at
least predictable enough to improve efficiency when repeated. Creativity is the
other side of the coin; it emerges along with complexity and
unpredictability.
Figure V.1.1 Reasons for
communicating.
Below are some of the 16 essential
interactions defined by Thom:
Ending
Beginning
Being
Rejecting
Stirring
Humans are social beings. This means that we immensely enjoy, and
engage deeply in, chatting, gossiping, and discussing. We can
participate in several simultaneous interactions, and for each interaction
we have multiple information channels, some with high bandwidth, such
as speech, and some with lower bandwidth, such as asking a 10 year old to
leave his teacher a message.
V.2 I-I Interaction, so far for efficient data transfer
For I-I interaction the problem is to exchange information supporting the
intended functionality as efficiently as possible; aesthetics and satisfaction
are not relevant. Information-information interaction is the most
structured and formalised type of interaction, with rules that have to be
rigorously specified down to the last information bit. The following
section will discuss I-I interaction in general. Many more examples and
details of interactions will be given in the later chapters.
There is no known limit to either the complexity or the bit rate of I-I
interaction, but currently the data transfer is constrained by the state of the
art in data communication technology. Fundamental physical limits such
as lack of bandwidth, and background noise levels have not yet been
reached. The complexity is hampered by a plethora of data representations
that currently are not compatible, not general enough, and with limited
information content. It seems that the problems are practical in nature, but
whether I-I interaction can reach the complexity of H-H interaction is still
not known. Should it be possible it will however take many years of hard
work. The behaviour of many systems might seem intelligent, even
creative, but so far this is more a result of the enormous amount of
information available, and the strange, foreign, character of the
processing.
Information resources are increasingly hooked up to the global network,
and the information can be combined to create new information with
emergent properties. All books ever written in the western world, such as
the one you are reading, are only combinations of less than a hundred
characters, and in any new book you will find fragments and ideas from
other books. This certainly goes for this book too. Combining information
from unemployment registers and salary registers creates information that
suddenly sends people to prison. Statistics of demographic indicators
distribute resources; polls and stock market quotes are active interactors
in politics and economy. Statistics very much rule the day in this era,
whether valid or invalid. Note that interactors of type information never
forget, which means that the full history is always available to the
interaction.
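The register example can be sketched in a few lines. This is only an illustration under stated assumptions: the two registers, the person identifiers, the field names, and the function `match_registers` are all invented, and real registers are of course not organised this simply.

```python
# Hypothetical registers, keyed by a made-up person id.
unemployment_register = {
    "p1": {"benefit_per_month": 900},
    "p2": {"benefit_per_month": 750},
    "p3": {"benefit_per_month": 820},
}
salary_register = {
    "p2": {"salary_per_month": 2400},
    "p4": {"salary_per_month": 3100},
}


def match_registers(benefits: dict, salaries: dict) -> dict:
    """Join two registers on their shared ids.

    Each register alone says nothing remarkable; the combined record
    (a benefit and a salary at the same time) is new information.
    """
    shared_ids = benefits.keys() & salaries.keys()
    return {pid: {**benefits[pid], **salaries[pid]} for pid in sorted(shared_ids)}


flagged = match_registers(unemployment_register, salary_register)
```

Here `flagged` contains only the person who appears in both registers. Neither source held that fact on its own, which is the emergent property the text refers to.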
“The communication tail is
wagging the processing dog”
Paul Saffo
For I-I interaction data access is the goal, and access rights are
consequently important. Computer viruses roam the network trying to
take over precious processing and memory resources. Hundreds of
gigabytes on hard discs all over the world have a content that the owners
of the computers would be shocked to know about. This accessibility of
information reflects human society. Not everyone is allowed to have a
peek at the data files of the NSA, but, at least in Sweden, personal income
is available to everyone. Without explicit access restrictions all data in an
I-I interaction is in theory available to all interactors, even data internal to
the interactors.
Having the access rights to data is not the only problem though; even
worse is that the information available to an information agent is
fragmented in many ways. First, it is distributed, a problem that can be
fixed by search and indexing, even though the enormous amount of
information still presents a problem. Context to I-I adds even more
information, and there are lots of contexts. The amount of data means that
some views of the data will be favoured while others need to be hidden.
The next problem is that information is stored in many different formats,
and that different abstraction levels are used. Finally, on top of this, not
enough meta-information and context is available to support
interpretation of data. Altogether this means that information and
information interactors must be purposely designed to be useful.
So far everything that happens emanates from human intentions. H is still
the one designing the data structures, aligning the processes, and pushing
the buttons, for instance enforcing the law by deciding which databases to
match. The increased connectivity means that each push will accomplish
more, and do it faster. If the demand for productivity continues to increase,
more and more applications will eventually have to take the human
out of the loop, and statistics will be in charge. This will favour
information that is measurable, and that is actually measured. All other
aspects are likely to be ignored.
Things and behaviours that
will disappear or change
when mobile technology
matures
-Newspaper
-Telephone book in paper
-Calendar in paper
-CD-player
-Dedicated TV-set
-Wallet
-Wrist watch
Interactions of the type I-I are manifested not only by exchanging
messages. Interacting ideas are another example, put forward by the
biologist Richard Dawkins. Analogous to the gene, the information carrier
for the idea is called a meme. Memes combine into new memes that
represent new ideas, such as the one by R. Dawkins. Memes do not
mutate at random, as genes do in biological evolution, but by creative
or purposeful adaptation fostered by humans. Some memes are potent;
they survive, and spread like epidemic diseases. Others die, either because
they were bad ideas, or because they showed up in the wrong place, or at
the wrong time. Memes are interesting but will not be discussed further.
Nature is also a “meme bank”, an
idea factory. Vital postindustrial
paradigms are hidden in every
jungly ant hill.
Kevin Kelly
The Internet has autonomous subunits, high connectivity, and no centralised
control. These features match characteristics in nature and society, which
can provide many ideas, metaphors, and structures for models and real
implementations as the Internet develops. Current examples are the firewall,
a stockade raised against malicious messages, and the Internet backbone
where the bulk of the long distance messages travel. A third example is
the use of protocols. I-I interaction really can use good ideas since it has
some catching up to do. Speech appeared about 100,000 generations ago
and the Internet only 1!
Rose is a rose is a rose is a rose
Gertrude Stein
1995
Birth of the WWW
V.3 H-I, H-T Interaction, joining forces
Human to thing, and human to information interaction are the most
difficult interactions to implement with high bandwidth and efficiency.
One reason for this is the incompatibility of the cognitive systems of the
interactors. Both processing and perception are radically different. As a
consequence adaptation must be exploited as much as possible. The
current approach is to primarily adapt I/T to H by developing technology
and using it intelligently. To what extent H, and society, can adapt to I/T
over a longer time perspective is not known. Cars, traffic jams, TV and
soap operas however prove that we are capable. Another problem for a
designer of H-I or H-T interaction is that any interaction involving H will
be complex as aspects of human behaviour and society will affect the
designer.
We have grouped the interactions H-T and H-I together since one without
the other is not interesting, or simply impossible. Interactions involving a
human always have a physical representation; telepathy is not yet
common. So, although the intention is to interact with information, the
means for doing so is a physical thing. One example is the mouse. Are we
interacting with the cursor on the screen (H-I), or with the mouse (H-T)?
One way to understand the development of computer-based technology is
as a result of computers reaching out into the human environment, or in
the words of Jonathan Grudin, "the computer is colonizing its
environment" [JG]. Starting out as a working tool for expert
programmers, the computer is currently an indispensable, and many times
invisible, support. The computer interface to the world is getting more
advanced at an accelerating pace. But, as Grudin pointed out in the
reference, the metaphor of the computer reaching out has a major problem.
A human child learns about its own context. A computer on the other
hand tries to understand the human world rather than its own. It is
supposed to support humans and therefore needs to understand them
rather than itself, and its own environment. An illuminating example
comes from the reference [SS], where the authors discuss how a user
might formulate a command to change the light (the computer's reaction
within parentheses):
H-T interaction Gersdorff 1517
"It is dark" (So what?)
"I need more light by the stove" (What is a stove?)
"Set up the lighting like it was yesterday" (But your wife is still
asleep upstairs?)
"Light this room for Bob" (Who the hell is Bob?)
As you can see from the examples there is much knowledge of the human
world hidden in even a simple command.
Perceptually technology is still not mature. It is however improving fast,
and there are many possibilities for new and improved channels, which
will support closer relationships between interactors. This will also mean
that negative aspects of close relationships will surface, for instance
gossip, possibilities for aggression, and personalised advertisements. As
the interfaces between interactors improve, other contexts of the
interaction will also affect the interaction. Later chapters will give details and
provide more examples of the above.
We are now facing the design of systems where networking and
processing capacity are built into our environment and everyday
appliances. Such systems will mediate information from context and other
interactors. They will give us access to the digital world and give the
system information about us, which means that a designer will have to
understand social issues as well as technological limitations. Together
with the human user a ubiquitous system will be intelligent and adaptive,
and with a large potential for emergent behaviour. In short many, if not
most, aspects of H-H interaction will be combined with the possibilities
and limitations of I-I interaction. This means unpredictability, difficulty to
observe internal states, and limited access to goals, but it also implies
constant availability of interaction history, and increased interaction speed
and frequency, at least as an option.
The topic of this chapter is also the focus of the research disciplines
human-computer interaction (HCI) and man-machine interface (MMI).
A definition from the "Curricula for Human-Computer Interaction",
written by the ACM Special Interest Group on Computer-Human Interaction
(SIGCHI) Curriculum Development Group (phew!), is: "Human-computer
interaction is a discipline concerned with the design, evaluation and
implementation of interactive computing systems for human use and with
the study of major phenomena surrounding them".
V.3.1 Ubiquitous computing
Technology will soon allow for ubiquitous computing. The idea is that
computers are networked and numerous, executing everywhere in the
physical environment, and that this can be exploited to make computing
invisible to the user. The computerised device is, in other words, integrated
into, and spread out over, the background environment. Combined with
sensors, the resulting pervasive computing can in principle continuously
monitor what a user does, record it, and react to commands given
anytime, anywhere.
User interfaces in intelligent ubiquitous computing environments have to
accept the lack of a single focal point [SS]. Related events happen in
parallel, at different locations. This further aggravates the problem of how
to interact with the user. Another issue is when several co-located users
simultaneously interact with an interface, perhaps with conflicting goals.
Furthermore, the context of use will change as the user moves around and
this could mean that:
Users change interaction devices, e.g. to a smaller screen with new
usability constraints.
Users are distracted by social and other changing aspects of the
environment.
"The most profound technologies are those that disappear."
Mark Weiser
There is no longer a single most important task, and maybe not even a single
most important user. The goal of a next generation of systems supporting
ubiquitous computing could well be a system flexible enough to allow the
context, including humans, to modify the system itself. Either a designer provides
predefined services, or he suggests a platform where the users can
augment the environment, and perhaps even develop services by
themselves. Top down design is then replaced by an interactive, iterative,
design development; compare this to how television is now desperately
adapting to the Internet. In principle a reflective system can adapt to
anything, by modifying itself, but in practice there are numerous
constraints. No technology is indefinitely malleable. The designer, the
material, the original intention with the technology, and much more
constrain the possibilities to modify a system and its technology in a
given situation. It is hard work writing an essay on a pocket calculator.
People have intentions, emotions,
dislikes, phobias, perceptions,
interpretations (and misinterpretations) and many other motivators that drive their behaviour in
unpredictable ways that are impossible to even model accurately, let
alone instrument or infer.
Victoria Bellotti
A designer still needs to put herself in the position of the participants of a
designed system that is distributed and adapted to local circumstances.
Not a simple task. We as humans have problems with conscious parallel
activities; we are highly sequential thinkers. We also have problems
following dynamic processes; we prefer to freeze them and study them
one point in time at a time.
Because of the above, ubiquitous computing challenges the prevailing
interaction styles [SS2]. As the physical environment becomes
computerized, anything in the environment, and any combination of
things, is a potential interaction device, either indirectly, as the
environment tracks the thing, or directly, using functionality built into the
thing, including networking. Even the number and type of interaction
devices might change throughout a task. We start an interaction in the
car, continue while walking into the office building, and end it in the
coffee room.
Phone
home!
The mouse, keyboard, and the display are special cases and not always
available for interaction. Input through other means is maybe not that
difficult to imagine and realise, but what about output? If there is no
output device at hand, how can the user be contacted? Speech is one
alternative, but it is not always the best choice in a crowded room. Room
lighting is another possibility; a purposeful blinking of the light is a good
signal. It seems that there is a need for more options for output, preferably
continuous in time. Privacy will be an issue as information looking for
users roams screens, and loudspeakers everywhere potentially spell out
secrets. If a phone also transmits context, is it obvious that bystanders
want to participate in the context of a videoconference? Law prohibits
camera surveillance in public places, at least in Sweden.
If the functionality of a device becomes less clear, because of the
possibilities to build functionality into everything, its appearance will be
even more important, at least from a usability point of view. Ubiquitous
systems are, on the other hand, very complex systems and this implies
that they cannot be built overnight. In other words we will all have a lot of
time to get used to the new systems, and some will lose a lot of money
from mistakes made during this adaptation. We as designers will certainly
learn a lot about how to make technology publicly available for general
use such that it does not disrupt existing social models and norms.
I/O Device?
V.4 I-T Interaction, access to reality at the speed of light
A thing moving around in a physical environment faces many of the same
interaction tasks as humans: identification, navigation, choice, reading,
writing, and manipulation. The sensors are different though, as well as the
characteristics and capabilities for communication and processing.
Although humans can absorb and process huge amounts of sensory
input, the rate for digital data is low. A computer on the other hand has
problems interpreting even the simplest of scenic views, but easily accepts,
manipulates, and stores megabits of data per second.
Furthermore, for digital data the thing needs fewer intermediary
transformations and interpretations. This means that most of the
technology that people need for information management is not needed
for T and I! A thing does not, for instance, need a display to inspect a digital
image.
Potentially I-T is a powerful interaction. It has contact with both the
physical and virtual environment; and there is a built in evolution where
intelligent things could provide feedback to information agents in the
virtual environment that could further improve the things. The evolution
is fuelled by the fact that I-T interaction can keep a detailed track of what
has happened, something that the designer later can use for the next
improved generation. With no H directly involved the complexity of the
interaction is lower, but even without H the real physical world is
unpredictable, and certainly not a static information source. Currently for
I-T interaction, we cannot use curiosity, hunger, or sexual instincts as
motivations to get something done. On the other hand we do not have to
since I and T will not refuse, lose confidence, or be afraid. Context
awareness is obviously immensely important, but the channels to the
physical reality are so far narrow and isolated. The mobile phone for
instance estimates its position using a radio signal only. One problem is
that a single channel will be sensitive to noise, and failure of the channel will
terminate the service. Compare this to how we use vision, hearing, and
tactile information to constantly orient ourselves. Finding the current
position is a well-researched problem area, but despite this fact only a few
commercial applications use the technology.
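The difference between one channel and several can be sketched in code. The example below is a simplification under stated assumptions: positions are single numbers (metres along a corridor), the channel names and readings are invented, and the median is just one simple way to combine estimates, not a description of how any real positioning system works.

```python
from statistics import median
from typing import Optional


def fuse_position(estimates: dict[str, Optional[float]]) -> Optional[float]:
    """Combine position estimates (metres along a corridor) from several channels.

    A channel that has failed reports None and is ignored. The median of
    the remaining values limits the influence of one noisy channel.
    """
    available = [value for value in estimates.values() if value is not None]
    if not available:
        return None  # no channel left: the service stops
    return median(available)


# Hypothetical readings; channel names and numbers are made up.
single = fuse_position({"radio": None})
multi = fuse_position({"radio": None, "camera": 12.1, "ultrasound": 11.9, "floor": 12.4})
noisy = fuse_position({"radio": 40.0, "camera": 12.1, "ultrasound": 11.9})
```

With only the radio channel, a failure leaves no position at all. With several channels the thing can both survive the loss of one channel and discount a reading that disagrees wildly with the others.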
No problem!
The I-T system can be seen as an organism where things explore the
physical reality and information agents the virtual world. Such a system
communicates at close to the speed of light and is inherently
distributed, which makes controlling it a problem. Its structure will
probably start out as a hub-based system where control is centralised, but
as the capability and capacity of the nodes increase control will be more
and more distributed. Coordination will be split up over the nodes, and
command based behaviour will in the long run give in to co-operation and
negotiation. Design according to standards, and for upgradeability will be
immensely important.
Goals are designed into the thing and are currently specified at a low
abstraction level. Goals such as "survive" or "do good" are well out of
reach. Even a modest goal such as "dust the floor" is very difficult to
achieve. Goals on a low level are on the other hand easier for a designer to
evaluate and verify. If the designer so chooses the goals and other internal
information can be made available to the virtual environment. The
situation will then be equivalent to the I-I interaction, with the same
problems and possibilities as we previously discussed.
Since things populate the human environment aesthetics is important, as
well as ample functionality and physical properties such as size and
weight. Currently a typical interaction sequence involves a human in
some part of the sequence (X-H-Y), but next generation interactive
applications will increasingly exchange H with I or T (X-I-Y, X-T-Y). The
question is where, and to what extent this can be done, and it is a billion
dollar question. Already we have a pen that translates English text to
Swedish speech, and a pen that reads the name of a TV-show from the
paper and automatically programs the video. In fact, anything that
happens to a thing and that is registered manipulates information. One
example is that the towing company moves your incorrectly parked car,
and at the same time cash will disappear from your bank account.
V.5 T-T Interaction, forces matter
The following chapter is a short one. This is not because the topic is small;
on the contrary it is huge. The main reason is instead that many of the
interesting problems involved in T-T interaction have already been
discussed in the previous chapters. Another reason is that most interactions
of this type involve participants that are not very interesting from the
rather high-level perspective on interaction in this book. In other words,
atoms, molecules, neurons, cogwheels, and gearboxes are considered
components rather than autonomous participants in interactions.
Exploiting interactions between things is nothing new. The hammer and
nail, axe and tree, pen and paper, head and pillow are accepted and
important interactions. What is new is that things become more and more
intelligent, and can provide us with more services where humans need not
be directly involved in the interaction. One example is that the blinds of
your house could close and open in synchrony with the air conditioning
system. Another example is that the calendar on your mobile phone
coordinates with your alarm clock at home.
Most of the interactions of type T-T are simple interactions and considered
as being part of the context, i.e. of the physical environment. Position,
speed, temperature, and all kinds of forces are interactors of this kind. The
possible interactions are always constrained by the laws of the universe
and are studied in Physics, a science devoted to interaction. Physics sets
the fundamental laws even though the sum of the parts often emerges as
something more than the parts themselves. It is of course impossible to
dream up a car, or a parking lot, using only basic physics, even though
their atomic relationships can be described in principle. The models we
use are sometimes discrete and sometimes continuous. The same is true
also for the models we typically use for our day-to-day reasoning. We state
someone's height as a number of meters, but this is only an approximation of
the real height, which is a rather large number of piled atoms.
One way to group physical interactions is as gravitation, nuclear, and
electromagnetic interaction. Nuclear interaction involves forces in the
very short, nuclear range. Gravitation on the other hand is a very long-range,
weak force that is difficult to use other than indirectly as in friction,
e.g. where the rubber meets the road, or in pendulums. What is left is the
electromagnetic interaction, which is responsible for most types of
physical interactions.
Interaction in physics is interaction between particles mediated by forces,
exemplified by the electromagnetic force between two electrons. A force is
in itself an exchange of energy through energy quanta, modelled either as
particles, or as energy fields. For an electro-magnetic force the energy
quantum is called a photon. If we zoom in, and increase the level of detail,
we can see many examples of electromagnetic interaction. All of the effects
of magnetic and electrical fields, optics of course, but also inter-atomic and
molecular interactions are examples of electromagnetic interactions. The
electromagnetic force is for instance active for both sound and light,
although at different levels. Light is itself photons, i.e. packetised pure
interaction energy! A sound is transported by atoms and molecules
interacting by exchanging photons. Similarly, all other mechanical
interaction, hammering a nail, banging your head into the wall, and
spinning the tires of your Porsche, are higher order incarnations of the
electromagnetic force.
Hard
"The ships hung in the air, the
exact same way that bricks don't"
Douglas Adams
"In science, there is only physics;
all the rest is stamp collecting."
Ernest Rutherford, physicist and
Nobel Laureate
(ironically, he was awarded the
Nobel Prize in Chemistry)
May the force be with you.
Star Wars
Gravity is a habit that is hard to
shake off.
Terry Pratchett
The effect of the loudest tolerable sound of 1 W/m² can be compared to the
200 W consumed by a PC. The softest sound possible to perceive is on the
other hand 10⁻¹² W/m². Quite a difference in magnitude. The energy 1 Joule
is the same energy as 1 Ws, and this is about 19 orders of magnitude larger (10¹⁹)
than the energy needed to excite an electron in a photo detector in a digital
camera. Recall from Chapter IV.19 that an adult consumes roughly 2000
kcal/day, which is equivalent to 2325 Wh.
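The magnitudes above can be checked with a few lines of arithmetic. The sketch below is only an illustration. It assumes the conversion factor 1 kcal = 4184 J (other calorie definitions differ slightly), and the function names are our own. With these assumptions 2000 kcal comes out within one watt-hour of the figure quoted above, which spread over 24 hours is an average power a little below 100 W, and the span between the loudest tolerable and the softest perceivable sound corresponds to 120 dB.

```python
import math

KCAL_TO_JOULE = 4184.0   # assumed conversion factor, 1 kcal = 4184 J
JOULE_PER_WH = 3600.0    # 1 Wh = 3600 Ws = 3600 J


def kcal_to_wh(kcal: float) -> float:
    """Convert an energy given in kcal to watt-hours."""
    return kcal * KCAL_TO_JOULE / JOULE_PER_WH


def intensity_ratio_db(loud: float, soft: float) -> float:
    """Express the ratio of two sound intensities (W/m^2) in decibels."""
    return 10.0 * math.log10(loud / soft)


daily_wh = kcal_to_wh(2000)                      # daily energy intake of an adult
average_power_w = daily_wh / 24                  # the same energy spread over a day
sound_range_db = intensity_ratio_db(1.0, 1e-12)  # loudest tolerable vs. softest sound

print(f"{daily_wh:.0f} Wh per day, {average_power_w:.0f} W on average")
print(f"{sound_range_db:.0f} dB between softest and loudest sound")
```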
Tool with and without
embedded engine
Plant or Ice?
Bo Tannfors, TFE
V.6 Why do we interact?
Interaction has to be useful, otherwise why bother?
A fundamental and important result of the highly dynamic concept of
interaction is stability! This might seem like a paradox, but interaction
allows for the emergence of a stable configuration of a system
or organism. Stability and configuration of subcomponents are in other
words both a result of adaptation, and a pre-requisite for next level
stability and adaptation. We can see it happening for society, and for the
cell. Some prerequisites are communication among the components,
processing, and some means of representing the adaptations found, i.e.
memory. Another example is the brain complemented by nerve pathways.
The given prerequisites are not enough, however; sensors and actuators
are also necessary to connect to what is happening, understand it, and
make necessary external adaptations. Finally, to support all of this energy
is needed. A tree, a car, or a computer are easily seen as interacting
systems, and the above is also a necessary pre-requisite for our self, where
the body and our brain build a relatively stable reference, and a partially
consciously controlled action-reaction interface.
We can restate the above and say that interactions are useful because
humans, and in the future perhaps also things, cannot develop
properly without interaction [JF]. We need interaction to stay in touch
with the world in order to gain knowledge. Interaction is how we as
humans realize our existence in our world, and how we interpret and
experience reality.
"On top of the list [of characteristics of the self ] I placed stability.
In all kinds of self we can consider
one notion always commands center
stage: The notion of a bounded
individual that changes ever so
gently across time but, somehow,
seems to stay the same."
Antonio Damasio
“Studies of flow have demonstrated
repeatedly that more than anything
else, the quality of life depends on
two factors: how we experience work
and our relation to people.”
Mihaly Csikszentmihalyi
Could it be that interaction is, or
should be considered, a good in
itself- even disregarding what the
interaction is about, or the
character of interaction?
[LEJ]
Through interaction we can verify that communication has occurred and
perceive the effects of the communication. This is of utmost importance in
almost any situation and means that interactions increase survival
capacity. Not only is this true for the individual, but also for collectives,
organisations, and families. Two typical examples: predators that
hunt in groups can succeed in finding and killing larger prey, and the
prey, on the other hand, can hide in the group, or build a better defence by
working together with the group.
Figure V.6.1 Interaction.
This brings us to our next motivation for interaction, and that is to
improve performance. It can be improved both quantitatively, e.g. it takes
two men half the time of one man to clean a bathroom, and qualitatively,
e.g. two men can carry a piano but one man cannot. Maybe he could
manage half a piano, but that is not very useful. Performance is measured
in many ways, quarterly earnings and examination points, just to name
two.
Interactions are also necessary for setting up social organisations. It is
through interactions that organisations are bound together and influence
each other, and as a result, social entities and new functionalities emerge.
Social groups are thus both the results of interaction, and where the
interaction takes place.
Last, but not least, interaction is mandatory for conflict resolution. It is a
fact of life that there is a shortage of resources. Autonomy in things and
humans will then, inevitably, lead to conflicts that have to be resolved one
way or the other, for instance by regulation, arbitration, negotiation, force,
destruction, conflict avoidance, or prioritisation. The success or failure of
conflict resolution can be measured in terms of the number of dead and
injured, the number of individuals involved in a conflict, or the amount of
the fine.
Situations of conflict are both the
effect and cause of interactions.
They have their origins in a lack of
resources, and they call for supplementary interactions so that a way
out of conflict can be found.
Jacques Ferber [JF]
V.6.1 Why use others for interaction?
One assumption made in this text is that interaction, and more specifically
well designed interaction technology, will enhance the Quality of Life
(QOL).
QOL relates to either human individuals, or to human societies. You
install equipment to help find the car if it is stolen, but this is not because
the car would be hurt if you did not care. You do it because you want your
car back, and your car is not supposed to care. Neither is any other thing
currently in use. A dog cares, but is not discussed in this book. So, at least
for now, the ultimate goal of interaction is concerned with a person or a
society.
A problem with QOL from an engineering point of view is that it is
difficult to define and measure [RV1]. To start with, life and quality are
two tricky concepts. Do we for instance mean life of an individual or of a
group? Quality could refer to objective or subjective measures. Two
persons under the same objective conditions, i.e. the same observable external
circumstances, could subjectively report different levels of quality of life.
”Products or services that reduce the
time and amount of tasks needed to be
performed … increase our enjoyment,
entertain or reduce tension, gives us
information or challenges to improve
our knowledge or our well-being”
Quality of life, Philips
What we do know is that providing QOL inevitably means hard work, an
effort that must be rewarded. In other words, someone has to get rich in
the process, or else nothing will happen.
One view of human behaviour is that it is either reactive or reflective.
Most of the time we just react to events, but sometimes we stop and think
over the how, what, why and when in life [DAN]. Reactive behaviour is
the form of experiential behaviour highly appreciated by Hollywood
soap opera producers, and owners of ice hockey teams. Currently much of
the development of technology is geared toward this side of human
behaviour. One example is broadcast technology for mass distribution. It
supports content that has to be mainstream to survive. Better interaction
technology means that entertainment could be made adaptive and
individualised, and in this way interaction technology could increase
enjoyment and entertain us more efficiently. Good interaction is a basis
of competitive advantage.
“…[activities that give QOL
are those that]... have built in
goals, feedback, rules and challenges,
all of which encourage
one to become involved in one's
work, to concentrate and lose
oneself in it”
Mihaly Csikszentmihalyi
We are now shifting from the
information society to the
interaction society
M. Wiberg, Umeå University
Interaction technology also has the power to encourage and reinforce
reflective behaviour. Adaptive applications that adjust their reflective
level to the user could give suitable intellectual challenges for everyone,
not only for the mainstream. Some of the more demanding computer
games are interesting examples [DAN]. Without any previous training,
and with only a few clues given at the right moments, an eager player will
solve level 1 after only a couple of tentative attempts. If it can be done for
games, then it can be done for other applications, so interaction
technology can make thinking interesting and worthwhile, and maybe
assist in the production of useful results from the thought process. This is
a real challenge! To help us think deeper and faster. Machines currently
enhance our muscles, and interactive tools for thinking could do the same
for our brain and memory. We do not believe that human thinking will be
replaced for another couple of years :), but why not complement what
humans are good at, i.e. creative thinking, with tools for improved logical
reasoning, and for exploring alternatives? We could ask questions such as
“What if?”, or give commands like “Explore that path”. One first example
of such a tool is the calculator.
We are now shifting from the
interaction society to the
relation society
H. Gulliksson, Umeå University
Meaning
of life?
There is of course a risk with this scenario, as with all possibilities.
Technology used for replacing intelligence could reduce its users to
reactive machines. It might, on the other hand, allow humans to use
intuition, and associative abilities, i.e. human specialities, without
bothering about details.
Continuing this line of thinking, interaction technology can enable and
support previously impossible tasks, such as functional alarm systems
for the elderly, or educational games for kids. We already depend on
technology to solve many problems. Imagine calculating tomorrow's
weather by hand! To the category of previously impossible tasks we can
add technology that makes invisible processes audible, visible, or available
as tactile information. Oven temperature, driving mileage meter, and a
share index are some examples.
Another objective for new interactive tools is to simplify or eliminate
tasks that we are not very interested in. We could transform them into
reactive tasks, or eliminate them altogether, allowing us to focus on the
really interesting problems. Consider, as an example, the many possible
sources for messages, each with its own user interface. Unified messaging
is a technology that simplifies by providing one input basket that is used
for all mails, phone calls, and voice messages. This way we need only
learn one interface, instead of one interface per message type. Another
example of focusing on the real problem is e-learning, where interactive
tools strip off boring mathematical manipulations, and let us explore
the basic principle by, for instance, graphically simulating gravity.
Simplification is however not always a good thing; we risk eliminating
meaningful tasks as well. The alarm system for the elderly should not
replace human contact.
Interaction technology is also a new basis for social interaction.
Videoconference and e-mail are examples of interactions where
technology helps us surpass limitations of time and space. Meetings
among family members scattered all over the world are no longer
considered science fiction. Presence in a virtual gym is another example. It
could motivate you by verifying that others also work out. You could
compare improvements, conform to praxis when everyone exercises, and
learn how to do it by studying the more experienced [BF2].
As a bonus our new interactive systems can be used to study the
mechanisms behind interactions, for instance how people are influenced
and motivated. The knowledge gained can drive mathematical and
computational models of interactions, that in turn can be used to
synthesise even better interactive systems, and at the same time help us to
better understand human interaction.
Most interaction in nature is severely limited in time and space. Even if
you shout as loud as you can you will not be heard 10 kilometres away,
and direct communication face-to-face is real-time and impossible to
delay. Talking while strolling along the river is possible only if your
partner walks in the same direction and with the same speed. Technology
can eliminate such limitations in time and space. They are no longer
fundamental problems, only engineering problems. Telephone and e-mail are
our first attempts at solutions. Furthermore, technology opens up
new communication channels. One example is MRI (Magnetic Resonance
Imaging) that lets us inspect the workings of the brain. Another example is
using GPS to keep track of the wolves in the Swedish wilderness. To the
visually impaired, and people with hearing loss, technology is already
indispensable.
The examples above are found at a physical level of interaction, but
technology can also support interaction at a higher, semantic level. It is for instance
possible to synthesize a video of a smiling face, and there are experimental
systems that can distinguish between a smiling and an angry face. At the
aesthetic level technology can currently only help by providing access to
numerous examples from which we can learn new ways to express
ourselves.
African ”talking drum”
Risks with information technology:
Anxiety, Alienation, Information
poor minority, Complexity and
speed, Technology dependency,
Invasion of privacy
[BS1]
Another motivation for tools is that they are the only way that we can
access the worlds of things and information. Try to speak kindly to an
ordinary PC and persuade it to print a document! When information and
things need to access each other, excluding technology is not even an
option.
Behind the development of technology there is a deeper trend, or rather
two complementary trends [UCIT]. The first trend is that we humans are
moving into virtual reality. More and more of our interactions are taking
place in virtual reality where space, and sometimes even time, are not
important. When you write an e-mail you can write it at home, or at work,
the mail system does not really care where you do it. The letter is written
into a virtual message space where the reader accesses it, from anywhere
in the world, either two minutes after it is sent, or two days later. It will
not disappear if it is not read. The trend of moving into a virtual reality is
something we already know a lot about. The Internet has prepared us for
years.
We shape our buildings and they
shape us.
Winston Churchill
The second trend is that virtual reality is entering real reality. The things
that surround us are slowly evolving. They are transforming, like the
transformers kids used to play with. Dead things are programmed into
things with identity and knowledge. How about an answering machine
that expands physically, like a balloon, whenever a new message arrives,
or an intelligent call service at the tax office where you can request paper
forms, with no human service required?
The two trends complement each other, and in the long run we will not
only live as real persons in a real world with a virtual image in the virtual
world, but also as virtuals in a real environment (a bit speculative), and as
real persons in a virtual environment (really speculative).
VR RR.
A word of caution is needed here. Living in two worlds gives new
possibilities, but it does not come for free. We have to get used to new
ways of doing things, e.g. navigate in virtual 3D space, and have to learn
new concepts, such as avatars and hypertext. Also, society does not
change in a day. New habits and social patterns take generations to
establish, and do not always mean improvements for all.
To conclude, all interactions in one way or another end up fulfilling needs
of people, or society. The main reason for this is that ordinary people are
the participants with wallets.
V.7 Context, it is everything else
Whenever interaction is taking place there is an extremely important
shared context, i.e. a shared system environment, to consider. It provides
the receiver with a reference for the information, and helps to interpret
any messages. If you hear a tiger roaring in the living room you will not be
too frightened. You assume that the sound comes from a television
program. Let us start with the definition to the right.
The definition is very broad; listing any relevant information could keep
an ambitious designer busy for quite a while. It is on the other hand
narrow since context is limited to interactions between a user and an
application.
A second definition is given below the first one in the margin. Admittedly
this definition is even broader than the previous one, but we think that it
better reflects that focus should be on the interaction, and that context is
something that affects this interaction.
The context has many aspects: situation, time, physical, virtual,
technological (battery, screen size), computational (cpu, memory and
network capacity), social environment, activity, self, user, the application
used, and a lot of others. The aspects are partly overlapping, and for each
interaction, i.e. in each context, some of them are more important
than others.
Definition: Context is any
information that can be used to
characterize the situation of entities
(i.e., a person, place, or object) that
are considered relevant to the
interaction between a user and an
application, including the user and
the application themselves. [AD2]
Second definition: Context is any
information relevant to an interaction between two interactors (i.e.
a human, thing or information),
including the interactors.
Synonyms to context:
Circumstance, situation, phase,
position, posture, attitude, place,
point, terms, regime, footing,
standing, status, occasion, surroundings, environment, location.
[AS]
Figure V.7.1 Example of context classification [KL]. Labels: Self, Environments, Activity.
Self, the cognitive state of a human, or the internal state of the device, is
one important context, see figure V.7.1. If you are angry while driving
your car you might speed up, and take a chance rather than wait. The
situation, or circumstance, describes the activity in which the interactor is
embedded. An angry bear attacking is an interesting situation; a more
common one is driving a car. An application involves a task, i.e.
something to do, as does driving a car, but the application is embedded in
a computerised tool created for a specific purpose. Using Microsoft
Word® is an application-based context, but writing a book is situation-based.
Activity as a context overlaps application and situation, but focuses
on the task that the user is performing, rather than the circumstances in
which it is performed, or the application used to perform the
activity. One example is killing the angry bear.
Context can be modelled at different abstraction levels [JL]. Here we
suggest three levels, physical, perceptual and cognitive. At the lowest,
physical, or sensory level numeric values are collected. Thermometer
readings, time, positions, or pixel information in an image, are extracted
using sensors. Often, the interesting event is something that differs from
the normal, and to find the unusual we have to know about the usual.
This means that we have to collect and maintain background information,
for instance the image of a wall seen by a surveillance camera in a house.
Things moving against the background are easily filtered out.
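The idea of comparing sensor data with stored background information can be sketched in a few lines. The example below is a minimal illustration, not a description of any particular surveillance system. The image is assumed to be a small grid of grey-level values, and the function name and the threshold of 30 grey levels are our own choices.

```python
def moving_pixels(background, frame, threshold=30):
    """Return the (row, column) positions where the frame differs from the background.

    Both images are lists of rows with grey-level values. Small differences,
    below the threshold, are treated as sensor noise and ignored.
    """
    changed = []
    for r, (bg_row, fr_row) in enumerate(zip(background, frame)):
        for c, (bg, fr) in enumerate(zip(bg_row, fr_row)):
            if abs(fr - bg) > threshold:
                changed.append((r, c))
    return changed


# The stored image of the empty wall (the usual) ...
background = [[100, 100, 100],
              [100, 100, 100]]
# ... and a new frame with some noise and one clearly deviating pixel (the unusual).
frame = [[102, 100, 99],
         [100, 200, 100]]

print(moving_pixels(background, frame))
```

Only the pixel that deviates strongly from the background is reported. The small variations are filtered out because we know what the usual looks like.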
At the next level sensory information is processed into symbolic
observables. We call this level the perceptual level. Interactors and
objects as well as behaviours and relations are identified and
characterised. The information at this level is independent of how the
sensory level collected the information. Perceptual information can itself
be represented in different ways and at different levels of detail. A picture
of a pink house (a sensation) can at the perceptual level be represented as
the text string "house", or as a collection of features of the house such as
the number of windows and its colour, or in a countless number of other
ways.
The cognitive abstraction level is the third level and here we interpret the
symbolic information. The number of possible perspectives and intentions
is even larger than the number of perceptions. Conclusions are drawn
from a chosen point of view, and with some intention for the use of the
result. By combining information from a thermometer and a photograph
of a view from a window we can determine the weather situation. If our
intention is to go outside we can use the context to decide which
clothes to wear.
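The three abstraction levels can be illustrated with a small sketch that follows the weather example. All names, thresholds, and the single brightness value standing in for the photograph are assumptions made for the illustration. A real system would need far richer processing at every level.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """Physical level: raw numeric values from sensors."""
    temperature_c: float
    sky_brightness: float  # hypothetical value from a photo, 0.0 (dark) to 1.0 (bright)


def perceive(reading: Reading) -> dict:
    """Perceptual level: turn numbers into symbolic observables."""
    if reading.temperature_c < 0:
        temperature = "freezing"
    elif reading.temperature_c < 15:
        temperature = "cool"
    else:
        temperature = "warm"
    sky = "clear" if reading.sky_brightness > 0.6 else "overcast"
    return {"temperature": temperature, "sky": sky}


def choose_clothes(observables: dict) -> str:
    """Cognitive level: interpret the observables with the intention to go outside."""
    if observables["temperature"] == "freezing":
        return "winter coat"
    if observables["sky"] == "overcast":
        return "rain jacket"
    return "t-shirt" if observables["temperature"] == "warm" else "sweater"


print(choose_clothes(perceive(Reading(temperature_c=-17.0, sky_brightness=0.9))))
```

Note that the cognitive level never sees the numbers. It works only with the symbols produced at the perceptual level, which is what makes it independent of the sensors used.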
A Zebra is easily found on
a checkerboard.
Context
Observables
For all of the three levels we can use previously stored information to
certify our findings. We can also use information from other systems, such
as suggestions from human users. Aesthetics is one such collective opinion
closely related to culture, which means that knowledge about context is
important also for a satisfying experience.
The process of understanding context is simplified by clues provided by
new and more advanced sensors. This is the good news. The bad news is
first that the complexity quickly increases with the number of sensors, i.e.
the number of possible interpretations escalates. Building a context in real
life bottom-up from sensory information will give us (too) many possible
interpretations of the data. If we instead try to describe the world top-down,
we get an explosion of the number of possible relationships among
our chosen aspects of the world. We do not have access to enough details
to sort out the meaning of all of the aspects. It is like trying to find the
Eiffel tower in Paris without knowing how to make sense of street signs.
We have to use context to understand our sensory information, and need
sensory information to understand context. The second piece of bad news is that
technology and its use make the problem even worse by transforming the
context in unpredictable ways. Your small office will for instance serve as
a meeting room when you have a computer supported conference, even
if you are physically alone in the room [LEJ].
Hen
(contextual
overview)
Egg
(sensory info)
Implicit interactions are those where the systems in the environment get
input from an action performed by an interactor, even though that was not
the interactor's intention. One example is leaving the car keys on the table
where the environment can keep track of them. Another example from
[AS3] is a garbage bin that scans the bar codes of products and registers
the information. The stored information can be useful to the system, but
the action to deposit the garbage was done without any intent of
providing the system with information. Explicit interactions are all other
actions, aimed at exercising the system. Implicit interaction gives a new
meaning to the concept of a user. It is up to the system to decide who, or
even what, a user is.
The context is something that is shared among interactors, but this does
not mean that it is interpreted the same way by all of them. Cold weather
is not the same for all people. Also, different interactors might choose
different representations for the same feature. What one interactor
addresses as Lummerstigen 12, another refers to as close to the university.
The problem of interpreting representations especially haunts distributed
applications where an interactor has to reason about the relation between
its own and the remote, perhaps mobile, interactor's representations of
context.
Perhaps love is like a resting place
A shelter from the storm
It exists to give you comfort
It is there to keep you warm
John Denver
V.7.1 Use of context
Advances in technology make it possible to quickly change contexts, and
also to combine contexts in new ways. One example is a mobile phone that
displays information about the lecture you are currently missing, both its
content, and when it ends. The next moment it plays Tetris. Possible uses
of context will be classified in the following.
Using context as additional input is the first possibility, e.g. it is -17 °C
outside, too cold for you to go outside. The context here serves as a
resource to the interaction. An interactor's goal is for instance a resource
from the context of the type self. Knowledge about plans, and of effects of
actions are other important inputs to the interaction.
Context can also be used to modify input. If you are in Sweden you expect
to hear Swedish words and will try to interpret any muttering as Swedish.
Figure V.7.2 Context modifies input. Labels: System, Input, Output, Context.
Another way to use context is for feedback. This is how web browsing
uses context to help you select the next link. You find the link when you
scan the current page. Context can also be used as a receiver of output. By
painting your house and buying new clothes you express yourself to the
context. Our last example is to use context as action trigger. Information
can be bound to a site and accessed at the right position using a
computerised tourist guide [JG]. A less virtual example is a stop sign.
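Context as action trigger can be sketched as information bound to positions and released when the interactor comes close enough. The sites, the notes, the flat coordinate system in meters, and the radius of 50 meters below are all invented for the illustration. The tourist guide in [JG] may work quite differently.

```python
import math

# Hypothetical sites: name -> ((x, y) position in meters, note bound to the site).
SITES = {
    "old church": ((0.0, 0.0), "Built in stone; open to visitors."),
    "harbour": ((500.0, 200.0), "Ferries leave from the north pier."),
}


def triggered_notes(position, radius_m=50.0):
    """Return (site, note) pairs for all sites within radius_m of the position."""
    notes = []
    for name, (site_position, note) in SITES.items():
        if math.dist(position, site_position) <= radius_m:
            notes.append((name, note))
    return notes


print(triggered_notes((10.0, 10.0)))    # close to the church, its note is triggered
print(triggered_notes((250.0, 100.0)))  # between the sites, nothing is triggered
```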
Context is important and will be discussed many times in this book. But,
if it is so important, how come it is not used more by technology? Some of
the difficulties for H-T interaction come from the following properties
[AD].
New sensors are needed that have to be integrated into the current
infrastructure, i.e. the keyboard and the mouse are not enough. Further,
information from sensors needs to be combined and abstracted to be of
any use. One example is that coupling a position with a temperature
reading at that position enhances the context information value. Physical
context is local in space, and for mobile interactors this can dramatically
change the situation. Heavy rain when putting for a birdie on the ninth
green is one example. Physical context is also dynamic, i.e. local in time,
which is a problem for both stationary and mobile interactors. Two
contextual situations may look similar but could differ dramatically due to
internal states of the interactors, changing objectives, or interaction
history. A human could, at any instant, change the current goal, on
almost any grounds. This makes it difficult, sometimes even impossible, to
set up predefined rules for how a system should react. A thing has to learn
a lot to cope with these situations!
If an application, or more general any interaction, makes use of context it
is said to be context-aware. One example is the Fasten Seatbelt warning
in a car. As long as the seat belt is not fastened, and the car detects that
someone sits in the driver's seat, it gives a warning forcing the driver to
adapt. Other interactions with context are less persuasive, such as using a
computerised address book. Here the owner has to manually specify and
request the information about the addressee.
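The seat belt warning is about as simple as a context-aware rule can be, which makes it a useful first sketch. The two sensor values and all names below are assumptions. A real car will combine more signals, such as speed and seat load.

```python
from dataclasses import dataclass


@dataclass
class CarContext:
    """The part of the context that the car senses by itself."""
    driver_seat_occupied: bool
    seat_belt_fastened: bool


def fasten_seatbelt_warning(context: CarContext) -> bool:
    """Warn as long as someone sits in the driver's seat with the belt unfastened."""
    return context.driver_seat_occupied and not context.seat_belt_fastened


print(fasten_seatbelt_warning(CarContext(driver_seat_occupied=True, seat_belt_fastened=False)))
```

The driver never asks for the warning. The car acts on context it collects itself, in contrast to the address book where every piece of information has to be requested explicitly.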
An interactor, or an application, in a context-aware interaction poses the
following questions to determine why a situation is occurring [AD]. Who
is involved, where, when, doing what, and how? The depth and width of the
answers determine the quality of the internal model of the world that can
be built. In current systems location, identity, time, and activity are
important answers, and we will elaborate on how to get and use them
later in this chapter.
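One possible way to represent the answers is a record with one field per question, where an empty field shows what the system still does not know. This is only a sketch. The field names follow the questions above, and the example values are invented.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ContextAnswers:
    """One field per question; None means that the question is still unanswered."""
    who: Optional[str] = None    # identity
    where: Optional[str] = None  # location
    when: Optional[str] = None   # time
    what: Optional[str] = None   # activity
    how: Optional[str] = None


def unanswered(context: ContextAnswers) -> list:
    """List the questions that the system cannot yet answer."""
    return [f.name for f in fields(context) if getattr(context, f.name) is None]


partial = ContextAnswers(who="Anna", where="office")
print(unanswered(partial))
```

The fewer unanswered questions, the better the internal model of the world that can be built from the record.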
An interaction is contextaware if it provides relevant
information and/or services
to the interactors where
relevancy depends on the
interactor’s tasks.
Adapted from [AD]
I have never had any faith
in the future,
but I think I will have
Blandaren
V.7.2 Real, Virtual and Augmented reality
Virtual reality (VR) is the simulation of a real or imagined environment,
and it will provide the ultimate environment where everything is possible!
When we get a taste for the immense amount of virtual information, and
the physical/virtual interactions available we will not let go of them.
Throw out your television (no cheating with computers), and you will
experience a small foretaste of how you will not want to feel.
Virtual reality can enhance the interactor experience by immersion,
agency, and transformation [JHM]. Immersion is the level of presence we
feel when experiencing a story, or some other course of events. Deep
immersion intensifies the experience and reduces the mental effort to enter
it. A well-designed user interface based on gestures and natural language
should give deeper immersion compared to a cryptic command language.
This is certainly true for the first time user, but maybe not for an expert
user? More details also give deeper immersion, but at the same time
imply more advanced technology. Other factors that affect immersion
depth are [GR]:
• Number of senses involved.
• Multiple actions are possible in parallel.
• The dynamics of the environment and to what extent it interacts
  with other environments.
• Uncertainty, which forces us to actively strive to make sense of
  the environment. A completely random environment does not
  affect us. Familiarity with the environment reduces uncertainty,
  but copying reality is difficult.
High tech is however not a necessary prerequisite for immersion. Reading
an interesting book can give the same effect. We feel the yearning of the
characters; we can hear what they say in our heads. The first moving
picture event in Paris 1895 showed an approaching train and is reported to
have sent people screaming out of the cinema. Today a five-year-old child
would not have bothered to look twice. The limits of human adaptability
are not clear.
In its present form, equipment
like television or film does not
serve communication but prevents it. It allows no reciprocal
action between transmitter and
receiver; technically speaking, it
reduces feedback to the lowest
point compatible with the system
Enzensberger
What does it mean to write “Hello
World” in a ubiquitous computing
environment?
Tim Kindberg, Hewlett Packard
Why has cyberspace replaced
outer space as the most
important context of the future?
Interaction?
Figure V.7.3 Image from
the first moving picture
shown in Paris 1895
Agency is the satisfying power to take meaningful action, and see the
results of our decisions and choices [JHM]. In computerised storytelling
there are many opportunities for agency. Both the hero and the villain can
be influenced or controlled, and we can navigate, explore, and solve
problems. Transformation is the intriguing possibility in virtual reality to
change our appearance, and our behaviour. Red, grey, or even bright blue
hair are possible options. Man or woman, child or dog, it is up to you.
Advanced environments for immersive virtual reality experiences are VR-caves and VR-glasses for video, and headphones for audio. A VR-cave is
a cubicle where video is projected onto the walls creating an impressive,
but rather expensive, virtual environment. The equipment can provide
experiences that can be quite overwhelming, even leading to attacks of
nausea.
A less immersive version of virtual reality is augmented reality where
technology, usually semi-transparent glasses, shows additional
information complementing ordinary reality. One example is using glasses
to display the service manual when both hands are already occupied
repairing an aircraft engine. Another application is shown in figure
V.7.4a below, where the view of reality through the VR-glasses is
augmented by the path to follow [TH]. A flag in the view indicates that
interesting information can be found there. Through VR-glasses even a
user’s hand can be transformed into a display, see illustration in figure
V.7.4b. If a hand appears in sight, graphics could be mapped over it [LC].
The body is something that every user brings along.
Figure V.7.4 a) Augmented view of reality b) Hand serving as display.
H-T will eventually be H-I
because I (VR) provides much
better service than T. How can
you change colour of your
telephone in the real world? How
do you accomplish it in the
virtual world? Which is simpler
and cheaper to implement?
Church
We can classify spatial technologies in many different ways. One
alternative described in [SB] identifies two dimensions, transportation,
and artificiality. Transportation describes the level of presence of the
physical body in the application of a technology. Artificiality concerns to
what extent the world created is virtual.
Figure V.7.5 One example of a classification of different worlds.
[Axes: dimension of transportation (Local to Remote) and dimension of artificiality (Physical to Synthetic). Worlds shown: physical reality (meeting face to face), augmented reality, telepresence, virtual reality.]
The classification allows for many different worlds at different levels of
Local/Remote and Synthetic/Physical. Imagine a 3D display that shows
your home as it looked 20 years ago.
V.7.2.1 Bridging virtual and physical contexts
VR-glasses are not necessary for augmented reality. We can use many
other devices and interfaces to represent the virtual world in the real
environment. Mobile phones, or PDAs, are obvious alternatives to audio
and video tunnels allowing activity in the real world to be heard in the
virtual world and vice versa. Virtual environments that we are so
accustomed to that we no longer recognise them are music and other
synthetic audio environments. Maybe not as fancy as full-blown
immersive VR-caves, but still very efficient as mood stimulators and
information transmitters.
Interplay between real and virtual space is found, and is useful, for both
actions and states. Crossing out a phone number in your paper-based
calendar has an obvious virtual counterpart in your computer-based
calendar. The mapping from reality to virtual reality can be arbitrarily
chosen, but usually a designer decides on a natural mapping. This
simplifies things for the user because most mappings make little sense. One
example of a useful mapping is to assign a web page to a physical
location. From virtual reality to physical reality the mappings cannot
always be chosen at will. We for instance cannot let the virtual world
affect things that have already happened. We have to accept now, and the
flow of time. What we can do is to read the current time, and set the alarm
clock. By the way, does time constitute a virtual or a physical
environment?
Another interesting fact is that showing relationships among items is
easier in the virtual reality. How do you find out who is related to whom
at a large family party of an unknown family? You have to ask someone,
or get hold of the family photo album. In the virtual world, relationships
can be shown directly! Histories are not directly visible in the real world,
neither are histories of actions, nor other traces over time. Who broke that
window? What started that war? Simulation of cause and effect could
explore different paths of actions and provide insights. In virtual reality it
is even possible, at least in principle, to backtrack and exactly evaluate all
factors.
As we turn off, and throw out, our PCs in the coming pervasive
computing environment we will have to admit that they were very good
for some things. One of their masteries is to keep track of a local virtual
context of the user. Our calendar, task list, address list and the files we are
currently working with are readily at hand. This is obvious to anyone who
has had a major disk crash. In the future we will also need a local virtual
context, and the place to put it is out in cyberspace. By doing this we
gain several things. First, and most importantly from a mobile
applications point of view, we can access the virtual context from
anywhere. Second, we can manage our data more efficiently by centralising
the management. No more problems with disk crashes. Third, if everyone
uses this kind of web-based context, and technology continuously
improves, there will be better standardised tools for accessing it. Tools that
simplify navigation in the data space that we generate. As more and more
networked devices provide communication channels we can expect them
to give more coherent displays of the chosen aspect of virtual reality.
When you go from the living room to the kitchen the sound channel from
the living room TV could be mapped to the kitchen radio.
We have taken our biological clocks,
moved them outside ourselves,
and then treated the extension as
though they represented the only
reality
E.T. Hall, Dance of life
The human species, however
paid a price when it chose the
extension route. Extensions are a
particular kind of tool that not
only speed up work and make it
easier but also separate people
from their work.
E.T. Hall Dance of life
"Think of it as reality and it
becomes reality.”
John Dobbin
We will demand to use VR in
the real world
Dr Michael Heim,
Art Center College
Several experiments have been done where a video projector is used to
project aspects of a virtual world into a public space. The shadows of the
virtual world could be used as an artwork, or for information display. The
figure to the right, from the artwork Videoplace by Myron Krueger, is
one example. By image processing the shadow of the user, and at the same
time projecting a computer animation of a ball, a mixed physical-virtual
ball game is created. In another example users wore VR-glasses and
manipulated 15 by 15 cm paper cards. A video camera captured images of
the paper cards and information about their positions was sent to a
computer. The computer rendered the video frames with additional
information such that a card could change its appearance at any time.
V.7.3 Context of H-H interaction
Human-to-human interaction is always situated in time. An e-mail is
written such that it will make sense at the time when the receiver reads it.
We place the reader in a later time slot, and adjust our message
accordingly. The same thing happens when we read e-mail ourselves.
Automatically we adjust our interpretation to the time frame of the writer.
This is one reason why information on context reduces the amount of
communication. According to research in psychology we only accept a
delay of a tenth of a second to feel that a response is immediate. If the
response arrives within one second we will at least follow our line of
thought, but if the response is delayed longer than ten seconds we have
already forgotten what the dialogue was all about and have to restart it.
Memory is precisely stable enough to keep the conversations going.
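The three delay limits mentioned above can be written down as a small classification, for instance to decide how an interface should react while waiting for a response. The sketch below is only an illustration: the function name and the category labels are invented here, and the text says nothing about delays between one and ten seconds, so the label for that interval is an assumption.

```python
def classify_delay(seconds: float) -> str:
    """Map a response delay to how a human is likely to perceive it.

    Thresholds (0.1 s, 1 s, 10 s) come from the text; the labels are
    illustrative, and the 1-10 s label is an assumption.
    """
    if seconds < 0:
        raise ValueError("a delay cannot be negative")
    if seconds <= 0.1:
        return "immediate"                  # feels like a direct response
    if seconds <= 1.0:
        return "line of thought kept"       # noticeable, but we follow along
    if seconds <= 10.0:
        return "attention strained"         # assumed label for the gap in between
    return "dialogue must be restarted"     # we have forgotten what it was about


if __name__ == "__main__":
    for delay in (0.05, 0.5, 5.0, 30.0):
        print(delay, "->", classify_delay(delay))
```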
People are situated in space as well as in time. All sorts of languages are
used to represent space, and to relate different spaces to each other. Many
coffee cups all over the world have been used to represent roundabouts.
We want a table for two by the window, the voices in a restaurant indicate
distances, we lean closer to indicate intimacy, and use a napkin to draw a
quick sketch of where we live.
One reason for the unprecedented complexity of H-H interaction is that it is
embedded in a social context, which among other things is culturally
specific. The people that surround us, and their behaviour, are obviously
extremely important to us. Anyone who has lost a close relative knows the
depth of this statement. Solidarity, love, comradeship, friendship, and
military honor are examples of complex social contexts, see also Chapter
IV.16.
Human society is built by interdependent hierarchical structures. The
family, the organization, and the state are all basically social and
hierarchical. One alternative is to organize people in networks built either
by specialized individuals with complementary skills, or by peers that join
a network for efficiency or for fun. Relations between individuals hold
hierarchies and networks together. They are built over time by authority,
by friendship, love, by birth, and in many other ways. In fact, the stability
of all societies depends on feedback from their inhabitants.
Tatemae – sensitivity towards
others, public self
Honne – sensitivity towards
one’s own private self
Suji – situational significance of
an event
E. T. Hall Dance of life
(important terms in the highly
contextual Japanese culture)
Why do we organize ourselves in hierarchical structures? One reason is
efficiency of communication and control. In large communities hierarchies
make control visible, and enable swift distribution of control messages. A
related reason, for smaller groups, comes from evolution. A group makes
it possible for one strong individual to dominate and guarantee
reproduction of his or her genes.
Since social structures and behaviour are well established in human
thinking and behaviour they obviously can be exploited in different
computerized applications. There is an enormous amount of results from
research on these issues from psychology, and we are ourselves aware of,
and affected by, many social influences. Most of us for instance have a
tendency for social comparison, i.e. we behave as our neighbors do. This
is one way for a social animal to survive, or at least to take easy decisions,
just follow the group. People enjoy imitating the behaviour of other
people. We humans form groups into lines and queues, look in the same
direction as the crowd, and wear clothes to help others to understand who
we are (or who we want to be). We adjust our behaviour to groups in
many ways, automatically, and all of the time, for instance when we
follow the group leaving the airplane, supposing that everyone is going to
the luggage claim. Another example is that we prefer a crowded
restaurant to an empty one. Such behaviours are currently not exploited
on the Internet, or by any other technology. Additional examples of social
dynamics are group polarization and social facilitation. Group
polarization means that a group after a discussion tends to assume a more
extreme point of view. People who do not like to do the dishes like it
even less after discussing it with each other. Social facilitation is the
interesting effect that a social environment increases performance. You
will run faster when competing against a person compared to racing only
against the clock.
One way to formalize the space of social settings is suggested by Rowson
in [JR2]. Two dimensions are specified in a scenario space, see Table V.7.1.
The first is relationship, which relates to the group size, and the second is
the role, describing the physical/social location where the interaction takes
place.
Table V.7.1 Social relationship versus role. Where would you place falling in love?
Relationships (columns): Individual, Family, Casual team, Formal team, Community
Roles (rows): School, Recreation, Shopping, Work, Spiritual
Example activities in the cells: Homework, Movies, Passing notes, Group project, Chat, Soccer team, Prayer
Social links in Canberra
Australia, [AK]
Some obvious social activities have been suggested in the table, but there
are many blanks for you to fill in. Can you think of any more relationships
or roles that could be added to the table?
It is the human and social aspects of
context that seem to raise the most
vexing questions. And, though these
are the very aspects of context that
are difficult or impossible to codify
or represent in a structured way,
they are, in fact, crucial to making
the context-aware system a benefit
rather than a hindrance or – even
worse – an annoyance.
Victoria Bellotti
Human salient details of context:
Identity, Arrival, Presence,
Departure, Status, Availability.
V.7.4 Context of I-I interaction
Context is important also for I-I interaction where both the participants
and the context belong to the virtual world, and where it is sometimes
difficult to distinguish the interactor from the context. The objective for I-I
interaction is usually to locate, retrieve, exchange, or generate new
information, and this is best done in the most important virtual context
today, the World Wide Web. A problem is that since the web currently is
mostly meant for human readers, much of the information is in practice
worthless to a software program.
In general an information agent can sense a lot of data, but this data
cannot be interpreted because the internal model of the interactor is not
adapted to the data and contextual information found. Humans evolved
into a solution for the physical environment, but it took them several
million years. When we today design new software we work around the
problem in two ways. First we specialise interactors for limited
information environments, and for simple tasks. One example is designing
an agent that looks for terrorist information only. Within this limited
context the sensitivity for typical data can be enhanced as well as the
interpretation skills using the particular knowledge in the domain. When
looking for terrorists, the words “bomb”, “kill”, or “liberate” in a message
header provide highly relevant contextual information. Second, we adapt
the environment to the capabilities of the agents, for instance by adding
meta-information to guide search. Referring back to the terrorist example
we could simplify the task by making sure that all e-mails pass through a
few selected intermediary nodes. A problem with context, in the real
world as well as in the virtual world, is that both its topology and content
will change. In H-H interaction we have specialised functionality to
manage changes. Attention, curiosity, and vigilance help us adapt and
adjust focus. How can the corresponding functionality be implemented in
I-I interaction? Scanning, another human speciality, is difficult for I
because of the vast information spaces involved. Humans solved the
problem by specialisation, and using local context only.
Most perceptual cues are wasted on a visiting software program.
Information such as font size, depth cues, and colour will not be used.
Even worse is the problem of extracting structural information hidden in
diagrams, tables, maps, and other figures. It is for instance difficult for a
software program to understand that the text string “Umeå” in relation to
the positions of other text strings on a map conveys a lot of information.
Any human reader will easily approximate the time to travel to Oslo if the
time to Kista is known.
So far, the Internet has been built by humans, to be read by humans. The
illustration to the right is immediately recognised as a family and you
probably guess that the family name is Gulliksson and that the phone
number is +46-90-142613. Because of the lack of structure, and the implicit
information hidden on the web, an agent faced with the problem of
understanding the web has two options. Either to look for meta-information describing what to find where, or to do an extensive search
and try to correlate the information found.
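..100, here I come
[Illustration: message words: kill, terror, Bush, kiss, Liberate, save, LOVE, Osama, bomb]
[Illustration: map with the place names Umeå, Oslo, and Kista]
[Illustration: family picture labelled Gulliksson, +46-90-142613]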
Suppose that you are an interactor of the type information, i.e. a piece of
active information roaming the Internet. What would be your view of the
context, and what would your perfect sensor look like? This is not an easy
question. There are for instance very few sensors available to perceive
other interactors of type information, or to identify I-I interactions.
Autonomous, software based, interactors usually do not have accessible
internal data structures, and it is difficult to find out which agents are
operating in the virtual neighbourhood. There is some indicative
information that could be used; perhaps a new process starts up,
processing capacity suddenly decreases, or the memory usage goes up.
From the perspective of interaction, context could be described as the
capacity for processing and the amount of memory available, i.e. a
technological context. An alternative is to consider the context as a system
by itself and describe its interface using the vocabulary of communication,
i.e. the messages are sent to, and received from, the context. This view can
be useful in some situations, for instance when modelling interaction
between only two autonomous agents. If the coupling or the immersion
level is high, using communication as a descriptive framework will be
cumbersome.
The most straightforward solution is however to view the context as a
data structure preferably accompanied with a model describing the
structure. This model is itself a data structure, maybe in the form of RDF-statements. Without such a model an agent is forced to scan and search
every time it needs information from the context. The interactors in this
case will also be represented as data structures, possibly with their own
model of the data.
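To make the idea of context as a data structure a little more concrete, here is a minimal sketch in which the context is a list of subject-predicate-object statements, in the spirit of RDF but without using any RDF library. All names (the devices, the predicates `type` and `locatedIn`, the function `match`) are invented for the illustration. The point is only that an agent that knows the structure can ask a direct question instead of scanning everything.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# A tiny, hypothetical home context described as statements.
context: List[Triple] = [
    ("tv-1", "type", "Television"),
    ("tv-1", "locatedIn", "living room"),
    ("radio-1", "type", "Radio"),
    ("radio-1", "locatedIn", "kitchen"),
    ("printer-1", "type", "Printer"),
    ("printer-1", "locatedIn", "kitchen"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all statements that fit the pattern; None matches anything."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


if __name__ == "__main__":
    # Which interactors are in the kitchen?
    for subject, _, _ in match(context, p="locatedIn", o="kitchen"):
        print(subject)
```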
V.7.5 Context of H-T and H-I interaction
The number of contexts is unlimited; some of them are hospitals,
airports, museums, theatres, health care at home, playing games,
interaction devices worn inside or close to the body, and your memory of
how to save a file from Word ®. Your house will awake from its sleep and
turn into a context where all sorts of radio-based equipment co-operate
under the command of you and your family. It will be more of an
autonomous context sensitive thing with senses and effectuators. Context
dependency is already evident for instance in the modern camera [WB].
With a single push of a button, lighting conditions are estimated, auto
focus calculates the distance to the object, and the time of exposure is set.
After the photo is taken, the camera even stores all sorts of meta-information about when and how the photo was taken.
What additional features could be delivered to you by an application,
given that a computational context is available? Perhaps the following
[AD]:
1. Presentation of information and service options.
2. Automatic execution of services.
3. Tagging of context with information for later retrieval.
”It’s a world of cameras
aimed at everything
everywhere, watched
over by machines, and
occasionally examined
by people”
Paul Saffo
A printing application is one example that allows you to select from
nearby printers (exemplifies category 1 above), illustrated in figure V.7.6a
below. If nothing else is specified the application automatically redirects
the print to the nearest printer (2), and remembers which printer was
used (3). If you do not know where to find a document just ask the
application. This last possibility, using the system as a memory extension,
has further implications. If we trust the system to keep track of our
belongings, files, keys, and children we will possibly change our
behaviour to focus on other, more high level tasks, that in turn can raise
our quality of life.
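The printing example can be sketched in a few lines of code, one method for each of the three categories. This is a simplified illustration under several assumptions: positions are plain x/y coordinates in metres, "nearest" means straight-line distance, and the class and method names are made up for the occasion.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Printer:
    name: str
    x: float
    y: float


@dataclass
class PrintService:
    printers: List[Printer]
    history: List[Tuple[str, str]] = field(default_factory=list)  # (document, printer)

    def nearby(self, x: float, y: float, radius: float) -> List[Printer]:
        """(1) Present the printers within reach, closest first."""
        def dist(p: Printer) -> float:
            return math.hypot(p.x - x, p.y - y)
        return sorted((p for p in self.printers if dist(p) <= radius), key=dist)

    def print_document(self, doc: str, x: float, y: float,
                       chosen: Optional[Printer] = None) -> str:
        """(2) Use the nearest printer unless the user has chosen one."""
        printer = chosen or min(self.printers,
                                key=lambda p: math.hypot(p.x - x, p.y - y))
        self.history.append((doc, printer.name))  # (3) tag for later retrieval
        return printer.name

    def where_is(self, doc: str) -> Optional[str]:
        """Memory extension: on which printer did the document end up?"""
        for name, printer in reversed(self.history):
            if name == doc:
                return printer
        return None


if __name__ == "__main__":
    service = PrintService([Printer("hall", 0, 0), Printer("lab", 10, 0)])
    print(service.print_document("report", x=8, y=0))
    print(service.where_is("report"))
```

Note that the user can always override the automatic choice through the `chosen` argument.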
Figure V.7.6 a) User selects printer b) Automatic execution of service c) Location-based presentation of information.
A related class of applications that combines items (1), (2) and (3) above
allows us to leave a reminder that automatically will be presented when
needed, see figure V.7.6c: “When you drove this street the last time you
turned left at the next crossing”. Applications of this class can also be
used to optimise behaviour; an automatic navigation service could tip you
off that the highway is free. The features listed apply to a group of users as
well, and could be used to share user experiences, e.g. to show a replay of
the last goal of an ice hockey game, which is an example of a combination
of (1) and (2).
Error: Can’t
find the
printer!
Now let us discuss an extremely simple context sensitive application, the
thermostat that only adjusts the room temperature. But, is this really
such a simple task? Some people like their home very warm, others prefer
a lower temperature. If you catch a cold you probably would like to raise
the temperature, but if you have been out jogging you prefer a cool
apartment. If the husband, who likes it really hot, goes abroad, his wife
would like to enjoy a night’s sleep in a frosty bedroom. It is in fact
impossible for the thermostat to infer all of these, and many other, reasons
for adjusting the temperature. From this we can at least conclude that any
system that acts on behalf of a user will be complex. Furthermore, the
user should always be able to override the system, here the thermostat,
and this in turn means that the user must be able to deduce the workings
of the system.
The problem would be less difficult if we could trace the user’s state of
mind, but nature ruled that option out. Another possibility is continuous
update of their representations by the users themselves, but research and
empirical evidence show that people are notoriously bad at this type of
assignment. Whether we will eventually be able to build systems that act
on our behalf in any non-trivial situation is another of these raging debates
that will not be resolved “until we build such a system”. The next couple
of generations of systems will keep on the safe side by providing the user
with rich representations of context as seen by the system, but leaving the
interpretation of this context, and the decisions, to the user.
V.7.5.1 Context identification
How do you recognize a situation, daily behaviour, activity, cultural
(social) environment, or yourself? Why does the telephone ring signal not
tell you why someone is calling? Why does the calling party not already
know that you are in the bath, and never even think about answering the
phone?
People are difficult to deal with as
contextual entities. They make
unpredictable judgements about
context. In other words, they
improvise.
Lucy Suchman
You can’t reboot the world, let
alone rewrite it to introduce
new technology
Tim Kindberg
A small example shows the complexity of the task: “Magnus and his wife
leave their newly built house. It is eight o’clock in the morning so maybe
they are heading for work, but no, it is Saturday. Oh, I see, they have their
jogging outfit on, and I also recognize their dog. Probably they are taking
their dog for a walk”.
”The only and truly useful
context-aware application is
the automatic door, and it was
invented decades ago.”
Pessimistic view
Think about the many conclusions that have to be drawn and the
refinement of sensory impressions needed for this statement. Not at all
trivial. One estimate of the activity can be found by tracing all of the user’s
positions and motions over time, and by analysing uses of objects and
services. If this is properly done the trace can later be asked questions like
“Where did I leave my keys?”. The time dimension can give additional
hints on behaviour. If you are looking for your keys, and had them a
minute ago, they are probably somewhere nearby. Automatically
detecting presence, this way, or any other way, gives many new
possibilities. Since detectors can be made much more sensitive than
human senses we can build amplifiers that feel the presence of almost
anything. A technology based sixth sense.
Location information reduces uncertainty. Knowing that someone is in
the kitchen, the bathroom, or in the bedroom, for how long, and with
whom says a lot. Location as a context is consequently important in
mobile applications. Adjustment of time depending on the current time
zone is for instance possible to do automatically. For many applications
we have the opposite problem, the user wants to keep the computational
environment independent of the intention for moving, and of the
movement itself. The address book should for instance not be bound to a
specific position. Another example is that the car radio should
automatically retune to the selected radio station if the transmission
frequency changes as the car moves. Modifying output depending on
whether the user is moving or stationary is another nice feature. The font
size on a mobile phone should be increased when the user is moving, and
some interactions such as entering a phone number should be simplified.
It is for instance easier to select from a list of recurrent numbers.
Context parameters for a user can be seen as a context space [OR]. As time
goes by every user makes a personal journey in this context space. If we
store the traces we can build many interesting applications. The computer
can recognize déjà vu situations, and our trace can answer questions like
[OR]:
When was I here last?
What did I do then?
Where did I go next?
Did I see Mona Lisa when I visited the Louvre?
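A stored trace of this kind is easy to query once it exists. The sketch below assumes a very simple representation, a list of visits with a time stamp, a location and an activity, and shows how the first three questions could be answered. The record layout and the function names are assumptions made for the example, not taken from [OR].

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Visit:
    time: int        # e.g. minutes since the trace was started
    location: str
    activity: str


def last_visit(trace: List[Visit], location: str, before: int) -> Optional[Visit]:
    """When was I here last (before the given time)? What did I do then?"""
    earlier = [v for v in trace if v.location == location and v.time < before]
    return max(earlier, key=lambda v: v.time) if earlier else None


def next_after(trace: List[Visit], visit: Visit) -> Optional[Visit]:
    """Where did I go next?"""
    later = [v for v in trace if v.time > visit.time]
    return min(later, key=lambda v: v.time) if later else None


if __name__ == "__main__":
    trace = [
        Visit(0, "Louvre", "saw Mona Lisa"),
        Visit(60, "cafe", "lunch"),
        Visit(500, "Louvre", "arrived"),
    ]
    previous = last_visit(trace, "Louvre", before=500)
    print(previous)
    print(next_after(trace, previous))
```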
Definition: Motion – movement
Synonyms: act, action, advance,
agitation, ambulation, change,
drift, dynamics, flow, fluctuation,
flux, kinesics, locomotion,
mobility, motility, move,
oscillation, passage, passing,
progress, stir, stream, sway,
sweep, swing, tendency, travel.
[Figure: axes labelled Mood, Time, Company, Location]
If we are allowed to also use other people’s context traces, even more
fascinating questions can be asked:
Who is going in this bus with me?
What do people usually do here?
What happened here an hour ago?
Has someone here seen Mona Lisa?
Where is my child just now, when did she go to school this morning?
When do people usually go to lunch here?
Some of the problems when trying to identify human actions are that they
are interrupted, can continue for a long time, and are executed in
parallel (especially by women they say) [GA]. Cultural differences, and
situational constraints add to the list. One example of a cultural difference
is found in gesture recognition, where for instance southern Europeans are more lively
than northern Europeans.
The problems discussed above become even more difficult if we want to
predict rather than to identify. At the very least we need data from similar
situations. If we want to predict personal behaviour we need personalized
data, for instance recordings of our habits. Some help can be found in
assumptions on reality such as that a person can only be at one place at a
time, follows his schedule, moves around with a maximum speed, and has
a habit of making habits.
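Two of these assumptions, that a person can only be at one place at a time and moves with a maximum speed, can be used directly to reject implausible position reports. The sketch below is one way to express this; the tuple layout (time in seconds, x and y in metres) and the function name are assumptions for the illustration.

```python
import math
from typing import Tuple

Observation = Tuple[float, float, float]  # (time in s, x in m, y in m)


def plausible(previous: Observation, candidate: Observation,
              max_speed: float) -> bool:
    """Can the same person have produced both observations?

    max_speed is given in metres per second.
    """
    dt = candidate[0] - previous[0]
    if dt < 0:
        raise ValueError("candidate must not be older than previous")
    distance = math.hypot(candidate[1] - previous[1],
                          candidate[2] - previous[2])
    if dt == 0:
        # One place at a time: simultaneous observations must coincide.
        return distance == 0.0
    return distance <= max_speed * dt


if __name__ == "__main__":
    walking = 2.0  # assumed maximum walking speed in m/s
    print(plausible((0, 0, 0), (10, 20, 0), walking))
    print(plausible((0, 0, 0), (10, 50, 0), walking))
```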
Is it really 2006 this year? In some
other counting systems, it is 5766,
or 1426.
V.7.5.2 Situations
Travelling is an example of a situation. While travelling there is spare time
to kill that can be used for interactive adventures. Another interesting
thing about travelling is that we seem to do it more and more, even
though in theory we should use the Internet for communication rather
than use gasoline. Could it be that we get to know even more people
through new information technology, new friends in distant locations that
we want to meet face to face? Or is the number of possible and mandatory
activities to perform increasing, making more meetings necessary?
A taxonomy for different situations from the reference [MM] shows that
the essential situations in which we spend our time are remarkably small
in number. The table below shows a typology of everyday situations (try
to find one more essential situation, and some additional situation for each
column):
Table V.7.3 Different situations.
At work: Deliberating (places for thinking); Presenting (places for speaking to groups); Collaborating; Negotiating; Documenting; Officiating; Crafting; Learning; Cultivating; Monitoring
At home: Sheltering (places with comfortable climate); Recharging (places for maintaining the body); Watching; Remembering; Confining; Servicing (places with local support)
On the town: Eating, drinking (places for socializing); Gathering (places to meet); Cruising; Shopping; Sporting; Belonging (places for insiders); Attending; Commemorating
On the Road: Gazing/Touring (places to visit); Adventuring; Driving; Walking; Waiting; Hoteling
The mass media and many other active structures in society are trying to
adapt to the changes in our behaviour, and thereby help to speed up the
changes. Situations such as reading a newspaper, or watching television,
will change. News on paper costs at least $1 and that is quite expensive
compared to a computer-based alternative at the price of 10c. Television is
also made obsolete as a group activity when we learn new behaviour from
the web. Who wants to watch while someone else is zapping? Other
examples from this trend are that we do banking and taxation over the
Internet.
The new information society has however not succeeded in eliminating
queues. They reappear in digitised shapes: “We will soon accept your call
….”, “Network busy”, “Buffer full”, and other similar messages are all too
familiar. In their previous physical incarnation they had some charm. We
could study the behaviour of other people in the line, or estimate the
waiting time. How can we re-create at least some of the positive aspects of
standing in line? In 1989 researchers estimated that in the United States
more than 100 million person-hours were spent per day queuing. How
much of this was unnecessary, and could be replaced by an interaction
involving an intelligent PDA?
Time trap
Concept
Time
Technology
Use
E. Stolterman, Umea University
V.7.6 Context of T-I interaction
Context of T-I interaction has much in common with the context for H-T
and H-I interaction. One dilemma for T-I is that the distinction between
the environment and the interactors in this environment is not always
clear. What are, for instance, the interactors and the context when a camera
automatically takes a picture and sends it to a server? H-T/I is simpler in
that H is a natural discrete object and usually the focus of interactions.
Questions typically asked by an interactor in any context-aware
interaction are: why is a situation occurring, and who is involved, where,
when, doing what, and how [AD]? Let us elaborate on these questions
somewhat. The "who" question can be answered by either an explicit
identification, or by noting that something, or someone is involved, i.e.
detecting presence. If all objects nearby register their positions this is a
simple task. If not, an alternative approach is for every object to have a
contact zone inside which any other object is registered. This is the way
humans usually do it, by vision, and is also how many networks establish
network connectivity.
Most of the questions above can be answered either directly by the
interactor's own senses/sensors, or indirectly by asking another interactor
or the context [AD1]. "Where" is a question about location or position, and
finding it is an easy-to-understand problem that is quite difficult to solve.
Outdoors we can use GPS down to a resolution of a couple of meters, but
indoors we are still looking for a solution. "When" is not too difficult to
figure out, but the last question in the list poses a major problem. To find
out "doing what, and how" is quite difficult and certainly needs contextual
information.
An example might help at this point. Imagine Hakan driving his Porsche,
rather tired after a long night at the keyboard. The car detects the presence
of a driver by a sensor under the driver's seat, and supports this fact by
noticing that a door of the car has opened and closed. The identity of the
driver is detected through personal car keys, and once the identity is
known the driver's seat is adjusted. The "where" question can easily be
answered by a GPS receiver, and the "when" question by the car's internal
clock. A feature of this particular Porsche is to detect if Hakan is sleepy or
not by analysing video from a camera in the car. Even if this feature is
advanced it is still easier to implement than finding out whether H. is
taking a ride just for fun, or is picking up some flowers to surprise his
wife. Physical contexts could however be used to a much higher degree
than today. Imagine what applications you could build if the car radio
knew about speed limits and the current speed of the car.
A wearable device can be used to supply context to the application as
well as to the user. Who (or what) is carrying, from where to where, and
with what intention? What if every mobile phone reported the
temperature when used outdoors? Could such a massive data stream be
used to improve weather forecasts? Another example is that an intelligent
mobile phone could use the physical context to do the following mappings
automatically [GC]:
Vibrate – In hand.
Ring – Not in hand.
Adjust ring volume – In suitcase.
Keep silent – When owner is eating, or at a lecture.
Any way – Outside (where is the user?).
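The mappings above can be sketched as a small rule table. This is only an illustration under stated assumptions: the `PhoneContext` fields, the function name `alert_mode`, and above all the priority order between the rules are hypothetical, since the list does not say which rule wins when several contexts hold at once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhoneContext:
    """Physical context as a phone might sense it (field names are hypothetical)."""
    in_hand: bool = False
    in_suitcase: bool = False
    owner_busy: bool = False  # owner is eating, or at a lecture
    outside: bool = False


def alert_mode(ctx: PhoneContext) -> str:
    """Map a sensed context to an alert mode.

    The order of the checks is an assumption made for this sketch.
    """
    if ctx.owner_busy:
        return "silent"
    if ctx.in_hand:
        return "vibrate"
    if ctx.in_suitcase:
        return "ring, volume adjusted"
    if ctx.outside:
        return "any"  # outdoors either signal will do
    return "ring"  # default: not in hand
```

For example, `alert_mode(PhoneContext(in_hand=True))` selects vibration, while an empty context falls through to an ordinary ring. The hard part, as the text notes, is not the table but sensing the context reliably.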
The physical context is not the only relevant context. Access of internal
data structures within interactors, and within objects in the environment,
provides contexts that could further enhance functionality. One example is
a car that accesses the data structure of the car key, another example is a
mobile device that explores the menu of a restaurant, when outside in the
street. From an application developer's point of view this poses new
challenges. Now, it will no longer do to separate the internal data
structure of the application from the application interface.
V.8 Interaction modelling, back to the basics
Interactions have necessary pre-conditions. At least two interactors must
be present, capable of acting and communicating, and there must be
system state variables to modify, otherwise the interaction will be kind of
boring. Each participant follows all, or some, of the basic steps in the
action cycle; start with a goal, form intention, specify action, execute
action, perceive system state, interpret resulting system state, and evaluate
outcome. Interaction occurs when these steps intertwine for two or more
participants.
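A minimal sketch of what it could mean for the steps to intertwine: events are recorded as ordered (participant, step) pairs, and we look for a participant who perceives the system state after someone else has executed an action. The names `Step` and `steps_intertwine`, and this particular reading of "intertwine", are assumptions made for illustration only.

```python
from enum import Enum


class Step(Enum):
    """The seven steps of the action cycle, in order."""
    GOAL = 1
    FORM_INTENTION = 2
    SPECIFY_ACTION = 3
    EXECUTE_ACTION = 4
    PERCEIVE_STATE = 5
    INTERPRET_STATE = 6
    EVALUATE_OUTCOME = 7


def steps_intertwine(events):
    """Return True if some participant perceives the system state after a
    different participant has executed an action.

    `events` is an ordered list of (participant, Step) pairs.
    """
    executors = set()
    for participant, step in events:
        if step is Step.PERCEIVE_STATE and executors - {participant}:
            return True
        if step is Step.EXECUTE_ACTION:
            executors.add(participant)
    return False
```

A single participant acting and then observing the result does not count; at least two participants are needed, which matches the pre-condition stated above.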
Non-empty ordered set of
events engaging more than
one agent.
Alternative definition of
interaction
As with many other topics that we have discussed, and will discuss,
interaction needs both analysis and synthesis. Analysis of interaction
means that you study an interaction and try to figure out what is
happening. What you learn can be used for the complementary activity,
synthesis. You cannot really synthesise actions for other humans, at least if
you are not God, but many other interactions are man-made and possible
to generate, analyse, and tune. What you can do for human interaction is
to synthesise rules and an environment that constrain the interaction. For
software-based systems we can do better: UML and state-based modelling
provide diagrams and concepts that can be used to automatically generate
software. The automation of synthesis is crucial for building the complex
systems of the future.
Interactions and interactors are researched using many different models,
at different levels of detail. An atom can be described as a solid sphere
using a mechanistic model, or as a wave packet using an energy-field
model. But, if we want to model interactions between two atoms
separated in space the energy-field model is much more suitable.
Figure V.8.1 Models of atom
at different levels.
The example illustrates that models of interactors and interactions should
work well together. Field-based models are one general class of models
matching this criterion. Another well-behaved model is the state-based
model that we will use extensively in this book. For a really complex
system such as a human, and for human-human interaction, there is no
obvious candidate for a formal model, even though there are many
informal psychological and sociological models.
Can every physical representation have a virtual
counterpart? Are there any
virtual representations not
created by man?
V.8.1 Modelling view
At a given level of abstraction any system can be studied from one or
more of three different perspectives, intentional, conceptual, or physical
[DB1].
The intentional view describes the system from the perspective of how,
and where, it is going to be used, and what goals and expectations it can
fulfil. One example of an intentional view of a message transmission is
that a sender intends to tell you what time it is. While doing this we
should ask ourselves if our model is useful. Modelling the mental states of
a light switch does not add much value to us.
From a conceptual view, we can learn how the system works, its
properties, and about the mental model we should use to understand the
system. To explain a concept, similarity, pattern matching, metaphors, and
cultural associations are important. Understanding the principle behind a
cut and paste function in an editor is one example, another one is a
message transmission viewed as sending packetised information.
The physical view is concerned with how the system and the real world
influence each other. At this level a packetised message can be modelled
as a sequence of bits sent over an electric cable. Another
example of a physical level model is a neuro-physiological model of the
workings of the eye. The 3D-model of a coffee cup with a handle that is a
perfect match for your finger is a third example, and a red button
indicating a stop function a fourth.
Why do we need three levels? Why not two or four? This question is the
subject of an ongoing discussion among philosophers, but it seems that at
least three levels are needed. One physical interaction can be used to
implement many concepts, and many concepts can use the same physical
interaction. The cut and paste editing function can for instance be
implemented through the use of either a keyboard or a mouse. In a similar
way many intentions can make use of the same conceptual function, and
many alternative concepts can be used to fulfil an intention. You can
achieve an objective in many different ways. If you want to tell your
mother some good news to cheer her up, you can either visit her and tell
her in person, or send her an email. The intentional view is needed
because the concept of sending your mother a mail does not include the
additional information that you want to cheer her up.
Intentional
Conceptual
Physical
Can you think of an example where a single concept is used in many
intentional views?
Reflecting on systems from these three perspectives is something that
humans do all of the time, and it is a very useful practice in the design
process. We can use an alarm clock as an example to illustrate the three
perspectives above. This specific clock is added as an extra function to a
digital camera. If you are told that a digital camera is equipped with such
a function you will understand, because of the intentional view, when it
can be used. You will start looking for a user interface, i.e. you use a
physical view of the device, and you expect, from your conceptual view of
an alarm clock, that this interface will give you the opportunity to set the
alarm, set the current time and turn the alarm off. This leaping between
levels is typical for how humans think and use mappings between the
different views.
An alternative set of modelling levels is the conceptual, semantic, syntactic
and lexical levels [FDFH][BS1]. This is a description chosen in accordance
with how language is modelled. The conceptual level is in essence the
same as the conceptual level discussed above. It is important for a
designer since it establishes a common ground for all the interactors
involved. Using a metaphor is one approach, but stretching analogies
too far could hamper usability. The semantic level, also called the
functional level, defines the operators, their meaning and the information
needed to execute them. In the alarm clock example above, one operation
is to set the time, and for this the current time of the day is needed. At the
next lower level, syntactic, or equivalently sequencing, design is
performed. Units of meaning, such as character input and button clicks,
are grouped. Setting the time on our alarm clock involves the following
basic units: enter set time mode, input four digits, and exit set time mode.
The lowest level, lexical design level or binding design, assigns physical
properties to the units of meaning. Colour, line widths, and text fonts are
specified for objects. When binding for our alarm clock we could decide
that throwing a pillow across the room should stop that annoying alarm
sound.
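The lower levels of the alarm clock example can be sketched in code. The sketch below is an assumption-laden illustration, not a design from [FDFH] or [BS1]: the class `AlarmClock`, the unit names, and the button bindings are all made up. The state machine plays the role of the syntactic level (which sequences of units are legal), and the `BINDINGS` table plays the role of the lexical level (which physical input produces which unit).

```python
class AlarmClock:
    """Syntactic level of the 'set time' operation as a two-state machine."""

    def __init__(self):
        self.mode = "idle"
        self.digits = ""
        self.time = None  # (hours, minutes) once a valid time has been set

    def handle(self, unit):
        """Consume one unit of meaning; reject units that are out of sequence."""
        if self.mode == "idle" and unit == "enter_set_time":
            self.mode, self.digits = "set_time", ""
        elif (self.mode == "set_time" and len(unit) == 1
              and unit in "0123456789" and len(self.digits) < 4):
            self.digits += unit
        elif (self.mode == "set_time" and unit == "exit_set_time"
              and len(self.digits) == 4):
            hours, minutes = int(self.digits[:2]), int(self.digits[2:])
            if hours > 23 or minutes > 59:
                entered, self.digits = self.digits, ""  # let the user start over
                raise ValueError(f"not a time of day: {entered}")
            self.time, self.mode = (hours, minutes), "idle"
        else:
            raise ValueError(f"{unit!r} is not allowed in mode {self.mode!r}")


# Lexical level: bind physical inputs to units of meaning (bindings are made up).
BINDINGS = {"hold MODE": "enter_set_time", "tap MODE": "exit_set_time"}
BINDINGS.update({f"key {d}": d for d in "0123456789"})


def press(clock, physical_input):
    """Translate a physical input into a unit of meaning and feed it to the clock."""
    clock.handle(BINDINGS[physical_input])
```

Changing the binding, say to a thrown pillow, only touches `BINDINGS`; the legal sequence of units stays the same. That separation is the point of keeping the levels apart.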
At which levels of description
is a car controllable and
observable through
measurements?
Ever washed a window?
Crashed it? Minimized it?
Why, Who?
What? When?
How?
V.9 Interaction characteristics, some suggestions
All humans have names, as well as many information agents and
most types of physical interactors. Interactions on the other hand, in
general have no names, and there are no systematic naming conventions.
One reason for this might be that there are too many interactions. Another
reason is that behaviours of physical objects have not been interesting
enough, which means that a description of the interaction degenerates to a
one-way command or action. We hit the golf ball, but do not care about
the behaviour of the ball when it is hit (ouch!), only that the ball landed
200 meters away in the middle of the fairway. Consider the way we build
our sentences. The subject-verb-object organisation such as in "HakanG set
an alarm clock." is obviously working well for us, but perhaps it
overemphasizes the interactor's role, and favours behaviours where an
action by someone is directed at something.
Some particularly interesting interactions have been honoured by names.
Examples are high-level human behaviours such as flirting, courting, and
conversation. There are however no short forms for
shaking-hands-when-one-person-is-happy-one-is-grumpy, or
adjusting-temperature-for-a-shower-when-water-in-pipes-is-cold.
Maybe we will someday have a language based on interactions? To
make such a language useful we need to extend our vocabulary of
interaction, but even changing the order of the clause elements would
make a difference: "Fighting and arguing are Adam, John, and Peter over
original sin".
Noun / Verb (Adjective,Adverb), Actor /
Action (Characteristics),Object / Method
(Attribute), Program / Execution, Data /
Operation, Representation / Transformation, Signal / Filter, Knowledge /
Learning, Message / Modulation
blast, blow, bump off, burst,
crash, damage, demolish, destroy,
devastate, fulminate, lay waste,
sabotage, break, bust, collapse, crush,
damage, decompose, disassemble,
discontinue, disrupt,
end.
What is the Universe?
A collection of objects existing in space, or
the processes that shape it, dynamic
forces at work?
[CC]
The true intention behind the behaviour of a human is difficult, if not
impossible, to identify. When we sing in the morning it could be because
the sun is shining, but it could also be a torturous way of waking the
children up. Most interactions also exist at several levels. If we, for
instance, want to talk to the house owner we alert him by ringing the bell,
by moving a finger [RV1]. To accomplish complex goals higher abstraction
levels use internal models of planned combinations of lower level
behaviours. Lower levels on the other hand take care of the detailed
how-to-do-it aspects of the interaction. We change levels from lower to higher
by learning, and we change from higher to lower levels when our plan, or
the behaviour we learned, fails. A complex interaction that we have not
performed before will force us to consider elementary actions. Try to
remember the first time you prepared a gourmet dinner. Each step in the
recipe was probably a major obstacle. Envisioning the prepared plate was
probably easier.
There are many different views we can use to study interaction, for
instance the intentional-conceptual-physical views introduced in Chapter
V.8.1, or a state-based approach. A third alternative is to view interaction
at the following three levels: aesthetic, functional, and transmission level,
another variation of the layerings introduced in V.8.1.
Aesthetics is concerned with the experience of an interaction. Did you
enjoy it? Was it visually appealing? Do you want more of the same?
However, interaction is of no use if the participants do not gain from it.
The functional, or semantic, aspect highlights this constraint. Failure is
also inevitable if messages never arrive at the receiver, or if there are no
input channels available. The parts, and the behaviour, of the system that
ensure low-level feedback are referred to as the physical, or
transmission, level.
Let us discuss a smile as an example. The intention is to signal and induce
a positive emotion, but a smile for less than 1 ms will not be detected. If it
is 100 ms long it might be detected, but you will not be sure that you saw
it. To be smiled at for a couple of seconds is a nice experience, but if the
smile continues for a minute, or more, you will start to feel uncomfortable.
At the semantic level you analyse the message and ask yourself, "What
does this person mean by smiling at me all of the time?" The original
physical representation of a smile is a facial expression accomplished by
moving muscles. But, in the right mood you can also see the man in the
moon smiling. What about the aesthetics of a smile? Is perhaps Mona
Lisa's the perfect smile, or maybe the hint of a smile of your first-born?
Try to think about an ugly smile. What makes it ugly? Its representation or
context?
At each of these three levels the communication channel is susceptible to
noise, or its equivalent. Noise at transmission level is easy to understand.
One example is that the transmitter sends a '1', but the receiver sees a '0'
due to some electromagnetic disturbance. At the functional level noise can
be exemplified by an e-mail in Swedish sent to an English receiver, or a
mathematician trying to explain the solutions of Fermat's theorem to
anyone else.
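Transmission-level noise is simple enough to simulate. The sketch below assumes a channel that flips each bit independently with a fixed probability (often called a binary symmetric channel); the function names and the `flip_probability` parameter are chosen for illustration. Noise at the functional and aesthetic levels is not modelled here.

```python
import random


def transmit(bits, flip_probability, rng=None):
    """Send bits over a noisy channel; each bit is flipped independently
    with probability `flip_probability`."""
    if rng is None:
        rng = random.Random()
    return [bit ^ 1 if rng.random() < flip_probability else bit for bit in bits]


def bit_errors(sent, received):
    """Count the positions where the receiver saw a different bit."""
    return sum(s != r for s, r in zip(sent, received))
```

With `flip_probability` set to 0 the message arrives untouched, and with 1 every bit is inverted; real channels lie somewhere in between, and passing a seeded `random.Random` makes a run repeatable.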
For modelling we can borrow a battery of terms from systematics. An
interaction can be seen as a system with input and output from context. A
closed interaction is the special case without information flow. Interactions
can be adaptive and have memory. They are usually non-linear, and
neither stability, nor time invariance, can be assumed. An interaction can
be deterministic or stochastic depending on the complexity and the
characteristics of the participants. Interaction with angry people, or with
individual raindrops must be considered as stochastic. Another aspect is
the observability of an interaction, in other words whether we can deduce all
internal states of the interaction from observation. When the participants
are human this will not be possible. We cannot know everything about the
inner workings of the players in a game of football. Software based
interactors in principle can read each other's minds, but in practice any
complex interactor's expressions, whether external or internal, are very
difficult to decipher.
We need to measure interactions in different ways, for instance to
evaluate if they fulfil our expectations, and return our investments. A
game of football has a score, which is one measure. If the objective of the
game is player exhaustion only, we should measure how tired the players
get. Measuring is not too difficult if we are content with physiological
measures, but measuring the entertainment value of a game of football is
not that easy. Will subjective measurements do? Perhaps we could analyse
facial expressions rather than asking interactors? At least we all agree that
an exciting game is more entertaining than watching the presentation of
Office® while installing it. Another example is that some users complain
about the quality of speech when using an Internet phone, even though
they clearly can understand what is said. This is an example of where
aesthetics and subjectivity matter. At transmission level the information
theory of Claude Shannon provides useful measures and estimations of
quality. The functional and the aesthetic levels need other, indirect,
measures of evaluation, however, such as a questionnaire or a viewer poll.
But how do we estimate the results at these levels before we start
an interaction?
There are many other aspects of an interaction that we can estimate. Here
are some suggestions:
Complexity. The level of complexity describes the combined
complexity of all state machines involved in the interaction. An
interaction can have a high level of complexity even if the rules
for the interaction are simple. Complexity increases with the
number of interactors involved, and with the perceptual and
processing resources available to the interactors. If a rule implies
choosing between only two alternatives the choice can still be
enormously complex, depending on the states and contexts of the
interactors. The extent to which an interaction actually exercises
the potential complexity of the interactors is the complexity
depth of the interaction. Interactivity depends on the choices
available [CC].
Asymmetry. The complexity, e.g. skills, cognitive capacities and
number of internal states involved, could be evenly distributed
between participants, or associated with only one of them. One
example of a highly asymmetric interaction is the user pushing
the elevator button, where most complexity is on the user's side of
the interaction.
Interaction bandwidth, or resolution. This is the interaction
channel capacity. A game of chess has simple rules, and with
each move only 5 bits of information is transferred. If you instead
consider all the moves in a game of chess as one transaction of an
interaction, or if you take the psychological interplay between the
interactors into account, then the bandwidth is much higher.
Many parallel channels, and also communication symbols rich
with information, increase bandwidth.
Other aspects of the channel are its dynamics, i.e. the relation
between the maximum and minimum bandwidth, and how
difficult channel access is.
Effectiveness or congruence. The effectiveness of an interaction
is to what extent the objective of the interactors is fulfilled by the
interaction, i.e. to what extent an interaction satisfies system level
goals [HP]. Some of the goals are perhaps not fully compatible,
and it might even be impossible to obtain all of them. You and
your friend will get some exercise by raising your glasses of beer,
but a round of golf is more congruent with an aim of efficient
exercise. Still, you might rather take another beer. Driving any
car for fun, or a red Porsche, makes quite a difference.
Efficiency. An efficient interaction needs few resources to
achieve its objectives. Use of network capacity should for
instance be minimized at the transmission level, and using an
appropriate compression algorithm for coding the message helps
to accomplish this. At the functional level attention is a precious
resource, and distractions are bad guys.
Entropy level. A user action will not always give the intended
result. The uncertainty of the outcome is the entropy level of the
interaction. When a golfer makes a putt from 1 meter he will miss
one every now and then. A high entropy level makes planning
difficult, and if the level is too high the interaction will appear
chaotic and uninteresting. If, on the other hand, the entropy level
is very low, the interaction itself will bore humans even if it
produces useful results (9999 putts from 1 dm).
Creativity level. This measure estimates to what extent the next
useful state will be a surprise. A high level of creativity
corresponds to low predictability, but in this case good comes
from bad. As the potential for the unexpected rises, so do the
chances for discovering something new, and also the number of
possibilities. To not understand something completely could
increase curiosity, liveness, and attract attention. The level of
creativity is usually raised with increased interaction bandwidth
and complexity. Generating the next step on the dance floor is
one example of a highly creative interaction. The creativity depth
complements the creativity level and indicates the length of the
sequence of states or transactions that can be used in a prediction
of the next useful state.
History and memory. Relates to the number of previous actions,
and other aspects of the interaction that can affect the current
state. A high level of entropy indicates that the history is not very
useful.
Immersion level. This measure describes to what extent a
participant experiences being there while interacting. A well-designed
interaction can give a high immersion level, even if the
interaction bandwidth is low. Immersion level also measures
whether irrelevant context is screened from the interaction.
Typically immersion is discussed for virtual reality, but advanced
technology is not necessary, an interesting discussion in a chat
could also screen the rest of reality out. A related concept,
sometimes used with the same meaning as immersion, is
presence.
Affection level. If you run into a wall, the wall will not think
much of it. Affection level measures to what degree an action
from one interactor changes the states of the other interactors.
Human interaction has a potential for high affection levels. We
tease, hurt, and excite each other using words and actions. To
automatically characterise and estimate emotions is currently an
active area of research.
Satisfaction. This is another of the concepts without consensus
on the definition, and yet everyone roughly knows what it
means. It is an emotional and subjective response to a stimulus
where the response is relative to some expectation, or fulfils some
need, of the receiver (phew!). Satisfaction is a sense of contentment
when expectations are met, or even exceeded, and it has a
significant social dimension. The receiver wants to feel privileged
and chosen relative to the rest of the world. Things and
Information do not care much for satisfaction and it is difficult to
measure objectively.
"Excited," "euphoria,"
"thrilled," "very satisfied,"
"pleasantly surprised,"
"relieved," "helpless,"
"frustrated," "cheated,"
"indifferent," "relieved,"
"apathy," and "neutral"
Range of intensities, Joan L.
Giese and Joseph A. Cote
Cohesion and correlation. Cohesion describes the level of
correlation among interactors over time. Strong cohesion implies
high bandwidth, and short communication delays. A bicycle race
has a high level of cohesion. All participants start at the same
time, follow the same route, and finish at the same place.
Correlation is a mathematical measure of the co-behaviour of the
involved interactors.
Coupling. Measures the number of the channels between the
interactors. A crowd of people all talking to each other at a party
is one example with quite a high degree of coupling. Figure V.9.1
below shows three different structures for interaction. The
leftmost figure illustrates similar highly coupled computing
components that work in parallel. With more heterogeneous
interactors we can have loosely coupled groups, and the third
figure, to the far right, shows autonomous interactors, loosely
coupled, that cannot be assumed to share common objectives.
Multiple coupled interactors acting in parallel quickly increase
the complexity of the system.
Figure V.9.1 Coordination
structures with different levels of
coupling.
Structure. As the complexity of an interaction increases structure
emerges, or is imposed, see figure V.9.1. This happens because
interactors need structure to study, or be able to manipulate the
interactions. Typical measures of structure are number of layers,
number of groups, and depth of the resulting hierarchy. Let us
say that the board of Ericsson decides to refocus the company,
from technology oriented to service oriented. The detailed
implementation of this is clearly a monumental interaction. Each
department will have its own agenda that will affect the
workings of every group in the department. Probably the
transformation needs to be performed in steps over time. Each
step, and each effect, in itself is an interaction. The new service
orientation will itself mean a modified structure of interactions
between Ericsson and its customers.
Context involvement. As previously discussed there is usually a
context that constrains the interaction, i.e. the interaction is
typically an open system. An active context (environment) can
even change the rules or the goals of the interaction. The
representation of a more passive context can be accessed by the
interactors, and thereby indirectly affect the interaction. A game
of football, where the loser will be thrown out of the league if a
third team loses, will be much less interesting if the third team is
in the lead by 3-0. One example of an active context is heavy rain
forcing the referee to cancel a game. Rain is one of the more
active contexts that we can find in the soccer world; FIFA does
not usually change rules during matches. Interactions are
embedded in a hierarchy of contexts where each higher level sees
an active context at a lower level as an interactor. In the football
example the rain is part of an atmospheric interaction. The rules
at that level are quite different from the rules in the game of
football. Seen from the outside an open interaction could learn
and adapt to a changed context. In this case we have an
autonomous interaction.
Duration and Extension. Every interaction has an extension in
physical space or over a structure. As a special case it has a
duration in the time domain. The granularity of this duration can
be very small, such as in the single interaction of asking a
question, or it can be an uninterrupted period of participation
such as in playing a game of football. Start and stop, or perhaps
more interestingly, birth and death are events important to the
interaction.
Frequency and delay. The time between actions in an interaction
is sometimes very important. Exchanging one love letter per year,
or one per minute, makes a difference. If the frequency is too low,
an interaction tends to be a one-way communication where all
relevant context from the last action is forgotten, or has changed.
The time to download a web page certainly affects the
experience. A short delay is also the reason why spreadsheets are
so popular.
Alignment. Related to the issues of timing and structure above is
to what extent interactions are taking place in parallel,
sequenced, or are otherwise constraining each other. Two
interactions can be aligned either in time or by some other
structure. One example is taking turns in a conversation, and
another that cars in most countries are driven on the right side of
the street.
The qualities selected above are not independent, and there are other
qualities that can be described as combinations of them. One example is
that adaptivity implies a high level of complexity and creativity.
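A few of the qualities above have a simple quantitative reading, sketched below. All function names are hypothetical, and each reading is an assumption: interaction bandwidth per action is taken as the number of bits needed to pick one of several equally likely alternatives (about 5 bits corresponds to roughly 32 alternatives), the entropy level is taken as Shannon entropy over the possible outcomes, coupling is counted as one channel per pair of interactors when everyone talks to everyone, and correlation is the usual Pearson coefficient.

```python
import math


def choice_bits(alternatives):
    """Bits carried by one choice among equally likely alternatives."""
    return math.log2(alternatives)


def outcome_entropy(probabilities):
    """Shannon entropy, in bits, of an action with the given outcome
    probabilities (which are expected to sum to 1)."""
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)


def pairwise_channels(interactors):
    """Number of channels when every interactor is coupled to every other."""
    return interactors * (interactors - 1) // 2


def correlation(xs, ys):
    """Pearson correlation of two equally long series of observations.

    Raises ZeroDivisionError if either series is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)
```

A putt that never misses has zero entropy, and one that goes in half of the time has one full bit of it. Ten guests at a party who all talk to each other need 45 channels, which hints at why coupled interactors acting in parallel quickly increase complexity.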
V.10 Mediation, with the help of it
The result of an interaction is usually more important than the interaction
itself, the interactors involved, or the means for the interaction. For most
people a car is only a convenient tool for transporting people from A to B,
even though some might object to reducing a red Porsche to a mere
vehicle. Reading a book is another example where the result, in this case
the content of the book, is more important than the format of the book, the
font used, or the colour of the sofa where the book is read.
If we study interaction from the point of view of interactor X in the figure
V.10.1 below, what happens comes from a change in X (a modified
internal state), which in turn through actions affects Y, the context, and the
interaction. The interactor Y mediates, i.e. pre-interprets, prepares, and
transports actions, information, humans, or things to and from context.
The context can be any type of environment, i.e., other interactors,
physical environment, task, situation, internal state of Y, or even aspects of
X.
A mediator is neither limited to a channel that only tunnels information,
nor to a filter that only reduces or shapes data. But, these operations are
good metaphors for possible mediations.
X and Y establish the interaction and soon become experts in it. As long as
the mediation behaves orderly X might not even be aware of Y, just of the
results that Y provides. But, if for some reason X and the mediation
suddenly lose synchrony, i.e. the expectations of X and the results of Y
somehow become incompatible, a breakdown occurs, and X needs to
focus on Y, and on how Y works, rather than on the result of the
interaction.
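The relation between X, Y, and a breakdown can be sketched in a few lines. The class names, the dictionary standing in for the context, and the idea of expressing X's expectation as a predicate are all assumptions made for this illustration.

```python
class Breakdown(Exception):
    """Raised when the mediated result no longer matches X's expectation."""


class Mediator:
    """Y: fetches something from the context and prepares it for X."""

    def __init__(self, context):
        self.context = context  # a plain dict stands in for the context

    def fetch(self, key):
        value = self.context.get(key)
        # "Prepare" step: tidy up the value before handing it to X.
        return None if value is None else value.strip()


class Interactor:
    """X: uses Y's results and only notices Y when expectations are broken."""

    def __init__(self, mediator, expectation):
        self.mediator = mediator
        self.expectation = expectation  # predicate over mediated results

    def use(self, key):
        result = self.mediator.fetch(key)
        if not self.expectation(result):
            raise Breakdown(f"unexpected result for {key!r}: {result!r}")
        return result
```

As long as `use` returns normally, X deals only with results and never with Y. The exception is the moment when X has to turn its attention to the mediator itself.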
One example is when you run Internet Explorer® and expect your
favourite web page to show up, and instead the message "The page
cannot be displayed" appears on the screen. This will abruptly force you
into debug mode and to think about what you did wrong (or, for someone
with good self-confidence, to ask what Microsoft did wrong). If
you tear out a page from a book, a reader will experience a breakdown as
he reads past the missing page. Another example is that you ask your son
to do the dishes, and he actually does them without fuss! Really strange,
an investigation is necessary. If you remove the rails for just a few meters
the passengers of the train will certainly experience a breakdown.
Figure V.10.1 Interactor X
uses the information from the
context mediated by Y.
The page cannot be displayed
The page you are looking for
might have been removed, had
its name changed, or is
temporarily unavailable.
Or maybe this is a joke...
We are now quickly approaching the situation when we will not survive
without the Internet. It is our most important tool for interaction. What is
causing this trend towards distributed systems?
To start with the problems are distributed, they surface locally, in
vehicles, at home, on the factory floor, and in hospitals. Problems are also
heterogeneous, meaning that distributed control is best and that
adaptation to the environment is needed, further localising the solution.
Next, the problems are increasingly complex, indicating that different kinds
of expertise are needed for solving them, and this expertise is also
distributed. Furthermore, distribution better supports robust, error-tolerant
systems. If we combine the locality and the complexity of problems,
distribution seems inevitable.
We are currently building the infrastructure for distribution (the Internet)
and inventing tools for implementation. The main problem facing
designers of distributed systems is that context is not as easily shared as in
a local interaction.
Distribution gives things an advantage over humans since they can easily
connect to the network. We tend to think of a thing as a single
autonomous unit, because this is how we usually see ourselves. We get
another perspective if we instead consider ourselves as a co-operating
collection of cells. A thing could be as small as a cell, designed by man
such that many of these tiny things co-operate. Hundreds or thousands of
relatively limited units could organize themselves into appropriate
structures. If they move around they could for instance reorganize
themselves into a shape that best fits the terrain, perhaps to climb a stair.
An even more visionary application is units at the molecular level that
together physically model 3D data.
Assume that the storage
capacity was improved by a
factor 1000 over network
capacity. Would that change
how we build and execute
applications?
Assume that the capacity of
computer networks suddenly
improved by a factor 1000
compared to computer
capacity. Would that change
how we build and execute
applications?
V.10.1 Mediation as a model
After the introduction to mediation in the previous section it is time to
introduce our next model. Below, in figure V.10.2, the roles played by the
mediator Y are ordered according to their level of mediative power. A
tool has a relatively low mediative power and is typically fully controlled
by simple commands. It could be as simple as a hammer, or as complex as
Word®. Its main objective is to simplify matters for the interactor using it
(chair, positioning device, database), or to help it achieve otherwise
impossible tasks (screwdriver, Apollo spacecraft, camera, Word®).
Models are the mediating artefacts of design.
David Benyon
Figure V.10.2 Mediator roles ordered according to complexity of mediation:
noisy channel, tool, medium, social actor.
A mediative tool is quite a broad concept, so to make it useful we need to
refine it into categories. There are many ways to do this, but we will
follow the suggestion from [BF2] in which a tool can have the following
mediative functions (X and Y from figure V.10.1 above):
Reduction, Y reduces, focuses the actions possible for X.
Tunneling, Y guides, transports X through a predefined sequence
of actions.
Tailoring, Y adapts possible actions to the characteristics of X.
Suggestion, Y suggests actions to X at the right time and place.
Conditioning, Y teaches X by encouraging an action.
Self-monitoring, Y monitors X.
Surveillance, Y monitors context.
Rehearsability and reprocessability, Y allows X to reexamine or
edit an action before or during the interaction.
We will give more examples of tools in the following sections.
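Three of the functions listed above can be sketched in a few lines of code. This is only an illustration under our own assumptions, not something taken from [BF2]: the function names, the list of actions, and the "novice" profile are all hypothetical.

```python
# A minimal sketch of three mediative functions of a tool Y acting for X.
# All names and data below are hypothetical illustrations.

def reduction(actions, allowed):
    """Reduction: Y narrows the set of actions offered to X."""
    return [a for a in actions if a in allowed]

def tunneling(steps):
    """Tunneling: Y leads X through a predefined sequence, one step at a time."""
    for step in steps:
        yield step

def tailoring(actions, profile):
    """Tailoring: Y adapts the offered actions to a characteristic of X."""
    if profile.get("novice", False):
        return [a for a in actions if not a.startswith("advanced:")]
    return list(actions)

actions = ["open", "save", "advanced:macro", "advanced:script"]
print(reduction(actions, {"open", "save"}))
print(list(tunneling(["choose file", "confirm", "done"])))
print(tailoring(actions, {"novice": True}))
```

The remaining functions (suggestion, conditioning, monitoring) would need a notion of time and of the state of X, which this sketch deliberately leaves out.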
An interactor with the mediative power of a medium provides
experiences and could allow an interactor to explore cause-effect
relationships. A book, or a calculator showing a range of curves, are two
examples on the low end of the scale and on the other end a full blown
VR-environment could provide symbolic as well as sensory data and be of
any complexity. Mediation is related to the concept of immersion, i.e. the
experience of being there in VR, but it is not identical. Immersion is the
effect of an interaction as perceived by the receiver. Mediation is
accomplished by one interactor directing output towards another.
The US and worldwide number of
original book titles that have been
published, both in and out of print.
65 million book titles
University of California
A social being is capable of even more powerful mediations, which
could be used to establish and maintain social relationships. Being social
means that the mediator has a fairly extensive model of the other
interactor and uses this model to modulate powerful representations. A
social actor also could assume a role that matches the objective of the
mediation. As this type of mediator is capable of creating and
manipulating a fictive world it can also be classified as having a narrative
intelligence, and be trusted or not.
Viewing interaction as mediation by a tool, medium, or a social actor can
be used to grasp the complexity of persuasive computing that will be
discussed in Chapter V.15.2 [BF2]. The model above provides a framework
for possible mediative roles. It is not useful for describing the details of
the mediation, or how it comes about.
V.10.2 The medium
A medium is the realisation of a symbol system (images, sounds, texts,
sign languages) including its implementation and how it is used [LEJ2].
Many theories have been developed to model information and
information exchange. One is the model from information theory in figure
V.10.4, pioneered by Claude Shannon, and often used to describe
communication between computers.
Figure V.10.4 Shannon's model of communication: information source,
transmitter, channel (disturbed by noise), receiver, destination.
In this model, communication is seen as a method to transfer information
from information source to destination through a channel. One example of
an information source is an image. It starts out as light (physical
representation) that is digitised into ones and zeroes. This digital
representation is transmitted over a wireless network using oscillating
electromagnetic fields (an analogue physical representation) to the
receiver. The transmitter formats and modulates data such that the
channel is used as effectively as possible. One of the problems facing the
communication engineer is that both the transmitter and the receiver have
characteristics constraining the possible transmission bandwidth. The eye
and the fingertip allow quite different possibilities for communication.
The channel is the medium, with properties that enable communication, but
also with a maximum bandwidth limiting the transmission capacity. It is
for instance impossible to transmit video with the same quality as
television through changes in air pressure, i.e. as sound. Some channels
are considered digital since data is represented digitally, but even digital
data need representations in the analogue physical reality to stay alive.
The physical channel has its own set of problems. One is damping and
another one is that noise can disturb the channel, possibly distorting the
message. The fact that the signal intensity decreases as the distance to the
sender increases has major implications on human and animal social life.
It is for instance one reason why a family is such an efficient organisation.
The channels people use are certainly not ideal for communication. A high
level of background noise, such as an ambulance screaming through the
room, can be quite disturbing. Discrete digital technology such as
computer networks is built to circumvent some of the problems. You
rarely get a slightly faded e-mail message.
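Why a digital message does not fade can be shown with a small sketch. The channel below damps every sample and adds noise, and the receiver decides for each sample whether it was a one or a zero. The damping factor, the noise values, and the assumption that the receiver knows the damping are all made up for the illustration.

```python
# A minimal sketch: damping and noise distort the analogue samples,
# but a threshold decision at the receiver restores the original bits.
# Damping, noise values and threshold are hypothetical.

def to_levels(bits, high=1.0, low=0.0):
    """Transmitter: represent each bit as an analogue signal level."""
    return [high if b else low for b in bits]

def channel(samples, damping, noise):
    """Channel: attenuate each sample and add a noise value to it."""
    return [damping * s + n for s, n in zip(samples, noise)]

def decide(samples, threshold):
    """Receiver: map each received sample back to a bit."""
    return [1 if s > threshold else 0 for s in samples]

bits = [1, 0, 1, 1, 0]
noise = [0.05, -0.02, -0.10, 0.08, 0.12]
damping = 0.6

received = channel(to_levels(bits), damping, noise)
recovered = decide(received, threshold=damping * 0.5)

print(received)    # the analogue levels are visibly distorted
print(recovered)   # the bits are regenerated
```

As long as the noise stays below the decision margin the message is regenerated exactly. When the noise grows beyond it, a bit flips completely rather than fading, which is why redundancy is needed.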
Most signals in nature have a behaviour in the frequency domain such as
the one shown in figure V.10.5: relatively high amplitudes at low
frequencies and lower amplitudes at higher frequencies. One example is
human speech.
Figure V.10.5 Typical frequency behaviour of a signal in nature (amplitude
versus frequency).
The channels used by nature have a different shape. They typically exploit
some physical resonance phenomenon at some frequency, which gives
them a characteristic bell shape, see figure V.10.6. At the peak frequency
the transmission capacity is maximal. One example is the human ear that
drops both the lowest frequencies, below 20 Hz, and the frequencies above
20 kHz. Human eyesight has a similar shape.
Figure V.10.6 Typical frequency behaviour of a channel in nature with a
bandwidth of B Hz (transmission capacity versus frequency, peaking at the
resonance).
The bell shape results in a channel bandwidth that relates to the amount of
information that can be sent. According to Shannon the channel must
provide us with a capacity that surpasses the demands of the signal. The
resulting curve cuts off signals with low frequencies, but since low
frequency indicates low information rate, not too much has been lost.
High frequencies are cut off as well, and this is another trade-off. Very
high frequencies imply high signal energies with high transmission costs.
They also indicate problems in generating and receiving the signal.
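The relation between bandwidth and capacity can be made concrete with the Shannon-Hartley formula, C = B log2(1 + S/N), where B is the bandwidth in Hz and S/N the signal-to-noise power ratio. The formula is standard, but it is not stated in the text above, and the example numbers below are arbitrary assumptions.

```python
# A minimal sketch of the Shannon-Hartley capacity of a noisy channel.
# The example bandwidths and signal-to-noise ratios are arbitrary.
import math

def channel_capacity(bandwidth_hz, signal_to_noise):
    """Capacity in bits per second: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1.0 + signal_to_noise)

def channel_suffices(bandwidth_hz, signal_to_noise, required_bits_per_s):
    """True if the capacity surpasses the demand of the signal."""
    return channel_capacity(bandwidth_hz, signal_to_noise) > required_bits_per_s

print(channel_capacity(1000, 1))        # S/N = 1 gives one bit per Hz
print(channel_capacity(1000, 3))        # S/N = 3 gives two bits per Hz
print(channel_suffices(1000, 3, 5000))  # demand exceeds capacity
```

Note that doubling the bandwidth doubles the capacity, while improving the signal-to-noise ratio only helps logarithmically.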
The channel could filter, or transform, a message. One example of a filter
is a television set that does not allow certain programs before 5 pm. An
example of a transformation from the same context is the subtitling of
foreign movies.
To compensate for noisy channels we add redundant information to the
message. Natural language, for instance, contains lots of redundancy,
up to 50 percent. People with different dialects saying the same word will
sound quite different, but the redundancy will help us guess the right
word. Interestingly this task is quite difficult for computers since
redundancy in human-to-human communication is not easy to represent.
Data communication between computers uses its own breed of
redundancy, carefully calculated and added on purpose to counter noise.
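The simplest form of calculated redundancy is to repeat every bit and let the receiver take a majority vote. The sketch below uses such a repetition code. It is chosen because it is easy to follow, not because real networks use it, and the message and the error positions are made up.

```python
# A minimal sketch of redundancy against noise: a repetition code.
# Each bit is sent three times and the receiver takes a majority vote.

def encode(bits, repeat=3):
    """Add redundancy by repeating every bit."""
    return [b for b in bits for _ in range(repeat)]

def flip(coded, positions):
    """Simulate noise by inverting the bits at the given positions."""
    return [b ^ 1 if i in positions else b for i, b in enumerate(coded)]

def decode(coded, repeat=3):
    """Recover each bit by majority vote within its group."""
    decoded = []
    for start in range(0, len(coded), repeat):
        group = coded[start:start + repeat]
        decoded.append(1 if 2 * sum(group) > len(group) else 0)
    return decoded

message = [1, 0, 1, 1]
sent = encode(message)
noisy = flip(sent, {1, 5, 9})      # at most one error in each group
print(decode(noisy) == message)    # True, the errors are corrected

too_noisy = flip(sent, {0, 1})     # two errors in the same group
print(decode(too_noisy) == message)  # False, the vote is lost
```

The price is that three times as many bits have to be sent, so the redundancy has to be weighed against the capacity of the channel.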
Let us list some more examples of transmitters and media:
Image: painter, water colours, light waves, museum.
Music: loudspeaker, differences in air pressure, rock concert.
Data item: computer application, light wave, web page.
Anger: eyebrows, gun, bullet, Western movie.
It can be difficult to characterise something as a transmitter, channel or
medium, or as a message without going into deep philosophical
discussions. Sometimes the medium is itself the message, an idea put
forward by the Canadian media educator Marshall McLuhan. Think about
MTV for instance, selling itself as a music video. The abstraction level at
which we choose to view the message, transmitter, and the medium can
differ enormously. From the lowest level where atoms collide to the
highest (?) social level. At a social level you are for instance supposed to
understand that an angry glance means that you are standing too close to
someone else.
V.10.3 Pragmatics
Pragmatics is the study of language use in relation to language structure
and context of utterance [ADFH]. It is a multidisciplinary study (like most
other topics in this book), involving at least linguistics, philosophy,
psychology and sociology. We extend our area of interest from a single
sentence to the study of discourse, i.e. the study of more than one sentence
connected to a system of related topics [ADFH]. It might seem that
language, in a straightforward, automatic manner, can be split up into
sentences and terminals that correlate to objects, structures, and
meanings in the world. But this is only true if the context is strictly limited
[TWD2]!
Pragmatics: Interpretation I of
representation R of world W.
(Representation, Meaning,
Interpretation) <=> (Syntax,
Semantics, Pragmatics)
Peter Wegner, Brown University
The intended effect on the listener is usually one of the following [JS]:
Conative: deals with an action by the addressee as in an order or
request, "Do this", "Answer my question".
Expressive: used to communicate states or beliefs, "I feel fine", "I
believe in you".
Referential: relates to the state of the world or of a third party, "It
is raining", "He seems to be in a good mood".
Phatic: tries to open the channel or keep it open, "Hello there!",
"Please continue", "Copy that, Over", "Acknowledge".
Metalinguistic: information that concerns messages, or the
communication itself, "This is important", "Say that again".
Poetic: aesthetics, describes something such that it arouses the
receiver.
Another taxonomy of possible effects starts with two basic messages:
assertions and queries. An assertion informs, it states a fact, and a query is
issued to retrieve some information. From these basic types we can
identify others, such as reply, explanation, command, permission, refusal,
offer, promise, acceptance, denial, and so on.
Conversations let us externalise questions and thoughts that we have, and
as we express ourselves we often see things from new perspectives. As
listeners we have to interpret messages, and this will tune our mental
models. The process is fundamentally social and feedback is essential.
Speakers constantly adapt the message to the state of the audience, and we
always modulate the message, giving it a personal touch, consciously or
subconsciously trying to impress the listener with our personality.
Figure V.10.15 below illustrates the structure of an electronic knowledge
community where conversations are taking place [JC]. Circles represent
conversation topics; a large circle implies higher activity. Black filled
circles indicate new topics, and grey circles show participants
contributing to a conversation. The figure shows, in a fascinating way, how
knowledge is created and who is contributing where.
“Clouds appear
and bring to men a chance to rest
from looking at the moon.”
Basho
Conversation is essential.
We use conversation as a medium
for decision making. It is through
conversation that we create,
develop, validate, and share
knowledge.
T. Erickson, W. Kellogg [JC]
Conversational distance is
maintained with incredible
accuracy (tolerance of an inch)
E. T. Hall Dance of life
Figure V.10.15 Topics and
participants in conversations.
Another illustration, the social proxy for social awareness, from the
same source [JC], complements it by showing activity within a topic of
conversation, see figure V.10.16. A conversation is once again shown as a
circle and each labelled circle denotes a participant.
Figure V.10.16 Activity in conversation, with participants labelled 1 to 4.
Participant 1 is not contributing at the moment and participant 4 is the
most active, indicated by the distance to the centre. From the figure we can
see that someone is active, but we cannot hear what is said, much in the
same way as we can see two people talking on the other side of the street.
Also, we cannot see whether the other participants are paying attention,
but this is difficult also in H-H face-to-face interaction.
The social information in the figures above is presented in new abstract
ways. It is not a direct copy from the physical world to virtual reality. By
reviewing traces in diagrams, such as the one in the figure above, we can
follow the history of conversations in a knowledge community and update
ourselves on what has happened. This illustrates the fact that as we enter the
digital world our conversations will no longer disappear. They will persist
and possibly be reused.
V.10.4 Social dynamics and timing
We do not exchange messages randomly with other people. There are
patterns and rules that in effect define our culture. Here we will
briefly describe discourse, turn-taking and conversation. Discourse in
this context, also referred to as talk exchange, is the general behaviour
when sentences in a discussion of some topic are adaptively combined
into a symmetric interaction controlled by rules and principles.
Give me a place to stand on, and I
will move the earth.
Archimedes
Turn-taking, also called floor passing, sets the rules for who talks when. A
talk exchange is started by an opening utterance to attract attention. After
the opening there are different principles governing the next speaker's
entry into the exchange. It is done either by appointment from the current
speaker or, if there is a sufficient pause, by breaking into the
conversation. If the pause is too short the interrupting speaker is
considered rude, and if a pause is too long the conversation will be
inefficient. One alternative procedure is to pass a token around and
anyone who has the token is allowed to speak.
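The token procedure is easy to express as an algorithm, and it is the same idea that token-passing computer networks use. The sketch below is a simplified illustration: the participants, the number of things each wants to say, and the rule that the token always moves on after one utterance are our own assumptions.

```python
# A minimal sketch of turn-taking by token passing.
# Names and the one-utterance-per-turn rule are hypothetical.
from collections import deque

def pass_token(speakers, wants_to_say, rounds):
    """Pass the token around the circle; only the holder may speak."""
    ring = deque(speakers)
    remaining = dict(wants_to_say)   # copy, the caller's data is left intact
    floor = []                       # who spoke, in order
    for _ in range(rounds * len(speakers)):
        holder = ring[0]
        if remaining.get(holder, 0) > 0:
            floor.append(holder)
            remaining[holder] -= 1
        ring.rotate(-1)              # hand the token to the next speaker
    return floor

speakers = ["Ann", "Bo", "Cy"]
wants_to_say = {"Ann": 2, "Bo": 0, "Cy": 1}
print(pass_token(speakers, wants_to_say, rounds=2))
```

Nobody is ever interrupted with this procedure, but Ann has to wait a full round before making her second point, which illustrates the inefficiency that more flexible turn-taking avoids.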
We also use our eyes for turn-taking and this is one reason why
face-to-face interaction is more efficient than interaction over a voice-only
medium. The timing in an ordinary speech conversation could be extremely
tight, down to 1 ms, a delay that is quite difficult to match with current
technology.
About one fourth of between-speaker intervals are shorter
than 100 ms. This is such a short
time that the next speaker has
already decided what to say
before the current speaker is
finished.
V.10.5 Meaning and inferential model
We are still left with the problem of how a receiver extracts the intended
meaning from an utterance or an image. The message model from
Shannon, described in Chapter IV.3.1, will not do, and speech acts, see
Chapter V.15.6.1, only give us a fresh view and a taxonomy for sorting
utterances. This section will discuss the problem of finding meaning.
Surprisingly, too much information is often a worse problem to tackle
than lack of information. The behaviour when faced with the second
problem is simply to get some more information. The problem with too
much information is to find the correct and relevant information hidden in
the false or irrelevant. A small group can handle the problem of abundant
information by at least five basic strategies [AD4]. A first one is for a
member to take some action. Perhaps by asking questions to, or trying to
force opinions on other members of the group and awaiting the response.
A second strategy is to combine information in a variety of formats, text,
graphics and others, and from several sources, e.g. colleagues, family
members, or the Internet. Combining the different aspects or forms of
information hopefully clarifies matters. The next strategy is to use
contextual information, for instance by drawing conclusions from similar
A man with a watch knows what
time it is; a man with two watches
isn’t so sure.
Anonymous
events, in another time or place. A fourth approach is to carefully reason
about, and contemplate everything learned from the three previous
strategies. Lean backwards, arms behind neck, feet on the table, and start
humming. This (creative) process can take a long time. A final strategy is
to put yourself in the position of others and try to understand their
reasoning and understanding. This is a good way of establishing a
common ground.
Language is a way of representing knowledge that has been used over
generations. Such a complex system that has evolved over many years is
not logically well defined. Our language interacts with the lives we live
and can therefore only be understood in relation to man and society. It is
continually revised and refined to support its users, and at the same time
changes the behaviour and structure of the users and their community.
We can exemplify the effect of evolution by the many words for different
kinds of snow that Eskimos have been reported to have, and that there are
African tribes with no word for green. Green is not a very useful word in
the jungle where everything is green. The claim for the many Eskimo
words for snow has been denied [RN], but it is an illustrative example.
Furthermore, expressions are ambiguous. The listener must make use of a
context for a proper interpretation, and one example is that the phrase
"You poor thing" will have a new meaning after reading this book. Even
before reading, the utterance is underdetermined, i.e. what is comforted is
not uniquely determined. Yet another problem is that the utterance by
itself gives no clue as to the intention of the speaker. The expression "You
must do that!" might be a positive suggestion, or a direct order, according
to circumstances.
To really understand someone, that is to have a deep, successful
interaction, involves a common understanding of a shared context.
The context of a real world conversation is extensive and makes almost any
conversation stochastic and difficult to control. Even a simple word such
as "end" has many listed meanings in Webster's electronic dictionary, and
"close" has many as well. Luckily, humans are good at keeping
a goal-based conversation stable. The utterance "This is the end", said by
the sink, or while splicing a rope, helps, or rather forces, the listener to
make different inferences. To make sense of a sentence like "This is not
fair!" an interpreter has to recognize the situation of the utterance, and
what "fair" means to the speaker. The conclusion is that language cannot be
understood as transmission of information only.
If participants in the conversation have the same inferential model, and
the same contextual understanding, then the meaning can be re-generated
in the listeners. The same inferential model means that both the speaker
and the listener come to the same conclusion from the same facts. If given
the statement "I'm sure that the cat likes you pulling its tail", the listener
(hopefully) recognizes that the animal-loving speaker could not be
speaking literally, and infers that the speaker means the opposite of what
is said. Two humans never have exactly the same inferential model and
the same contextual understanding, even though they are often similar
enough for communication. Consequently two conversations are never the
same. As the individual fragments of a conversation are fused into a
coherent message, ambiguity is reduced, and a reasonable guess can be
inferred about its meaning. Furthermore, people remember different
things, so also the memory of the last interaction will be different for the
participants.
qanuk - 'snowflake'
qanir - 'to snow'
qanunge - 'to snow'
qanugglir - 'to snow'
Snowtalk in Central
Alaskan Yupik
Despair or gratitude?
One of the few words
with more meanings
than the word ”thing”
is the word ”anything”
Comment on the following: “If
a lion could talk, we could not
understand him."
Wittgenstein (1953)
If the participants use separate sets of rules, or have significant differences
in their knowledge, the speaker has to adapt to the listener, and prepare
the message carefully. One example of this is an adult trying to explain
something to a very young, unknown, child. The child's parent, who
better knows the child, sometimes has to intervene and translate. We
constantly adapt our speech depending on who is speaking and listening,
what the dialogue is about, where it is taking place, and how it proceeds.
To the list of problems when interpreting utterances we can add others,
such as that we do not always mean what the words say. When using irony
we even mean the direct opposite. We sometimes speak indirectly, as
when we say to the service man "My car has a flat tire", and what we
actually mean is that he should help us change the tire; "It's getting late",
which in fact is a request to hurry; and "The door is over there", asking
someone to leave! To top all of the above there is humour. Other
important presumptions on the speaker are sincerity, and truthfulness.
How is a computer supposed to make sense of this? The solution is that it
has to understand, and use more of, the context, specifically human
intentions, and that it has to study the human language. Thinking about
interpretations has resulted in the theory of Hermeneutics, where one
dispute is whether a text has a meaning independent of interpretation, or
if the meaning is created while reading [TWD2]. This distinction is
important because if each individual always recreates meaning, all
individuals in a society have to learn to interpret language in the same
way. Otherwise society will not work. The focus will be on dynamic
interaction rather than static representation. The importance of
interpretation has been taken even further by the philosopher Heidegger
who claims that existence is interpretation, and interpretation is existence
[TWD2]. We do not even exist if we do not interpret!
Egyptian hieroglyph for
movement, walking.
The overall trend is that more and more time is spent on reading and
updating information, and that update is continuous rather than at any
specific time. Examples are numerous: ICQ and similar online chat
programs complement e-mail, and television sets in every room show the
news all day long. Infrastructures support the trend: telephone modems
are exchanged for ADSL, "always on line", modems, and mobile phones
now have standby times of several days. Why this enormous appetite for
information?
V.10.6 Infrastructure of H-T, H-I interaction
The current infrastructure for HCI will soon change. The PDA (Personal
Digital Assistant), also called the PIM (Personal Information Manager), or
the PID (Personal Information Device), and the mobile phone are the
predecessors of more advanced wearable, networked, and computerised
tools. The main technical challenges are limited computational resources,
low communication bandwidth, memory limitation, power conservation,
and raised radiation levels.
It is interesting that many infrastructures seem to develop towards
distribution. The fixed power outlet is more and more distributed using
advanced battery technology, wireless networks will soon be more prevalent
than their wired ancestors, and computing is no longer performed by one
central resource.
A long-term vision is the distributed wearable computer, built by
wearware, communicating over a high bit rate body area network. It is so
small that it can be integrated into clothes and jewellery, and it uses
sensors and effectuators all over the body. The intimacy with the user
opens up some interesting possibilities. First, the computer could more
easily learn behaviour from its owner, a single individual. Second, the
close synergy between the computer and the user makes it easier to
protect the system from attacks.
Water anytime, anywhere.
Camel association to ∞*∞
Cheese!
The criteria for identifying a wearable computer are:
It may be used while the wearer is in motion, or is doing
something else.
It exists within the corporeal envelope of the user, i.e. it should be
not merely attached to the body, but become an integral part of
the person's clothing, see illustration to the right.
It must exhibit constancy, in the sense that it should be constantly
available.
Wearable computers introduce new ergonomic problems. Wearability, i.e.
the physical shape of the wearable, and how it interacts with the human
body is important for equipment worn throughout your life. We have an
aura around our bodies that the brain will perceive as part of the body and
where permanent equipment could be placed. We all have different
physical forms and dimensions that the equipment must adapt to, and the
equipment must not be too heavy. It should not dissipate disturbing heat,
but look good, and somehow get access to energy [FG]. New technology
also has to be socially acceptable. If the telephone earphone-microphone
unit is so small that it is invisible, then the socially accepted behaviour
could well be to insert a finger in the ear to indicate a telephone
conversation.
What time is it?
Do you want to use a graphical user interface based on Windows® for
your wearable computer? Alternatives?
The answer might be right before your eyes. At the same time as we dress
up in computers, the computers will invade objects in our environment. If
you want to enter text you use any keyboard nearby. The feedback from
the system will appear on your PDA, or on any other appropriate display.
You could move the dials of any clock to set your alarm clock at home.
The possible interactions are unlimited in a world filled with things that
continuously look for other things and combine with them into more
powerful interactors.
“If every work place had a gravity
invertor, productivity would be
greatly increased and repetitive
strain injuries would decline
instantly."
Only 99.95$
Embedding the user interface into the environment rather than carrying it
around has many implications for the user interface. The availability will
be better since wireless networks and small battery based power sources
can be avoided. On the other hand, users might not trust an embedded
ubiquitous system, and privacy is certainly an issue with public displays.
The interaction between a computer and a human user could be described
in different ways, see Figure V.10.22 [SM]. Interaction shown in part a) is
the basic form that we have discussed. Part b) of the figure shows the
typical situation where a human accesses context and uses the thing as a
computational device, or as external memory. In interaction c) the thing
works as a filter, and in the last interaction, d), we could say that the thing
uses the human as a resource. Interactions b) and d) are complementary
views, either H uses T, or T uses H.
What behaviour from your wearable computer do you prefer? Should it just
passively filter data, i.e. without personality, or should it be your
personal attendant to which you direct explicit questions and discuss
interesting issues, i.e. a light schizophrenic touch.
People live in their environment, they don't use it.
E. Stolterman
Figure V.10.22 Different interactions possible between H and T, parts a) to
d) [SM].
All interactions with things need not be computer based, compare the
paper and the pen. Adaptive materials for specialised tasks will be
available and there are all sorts of hyped technologies, micro mechanics,
nanotechnology, and others that eventually will deliver.
Money makes the world go around and it is also needed to build, and
maintain new infrastructures and services. One question is how users (I
and H) should be charged to support further development (and generate
the necessary profits). Charging by network service is acceptable if the
user understands the relation between the value of the application and the
money charged. One example is to let the user pay per byte received and
sent, but this is a relation not always easy to understand. Charging by
time is the traditional telecommunication solution to charging. As a
principle it is simple to understand, but no one wants to pay when doing
nothing, which means that most of the computers connected to the
Internet will be turned off most of the time, maybe not the best scenario
for new services. The third option is to charge users depending on the
applications used. Now the problem is how to compensate the network
operators; they need money to improve the network. Maybe a
combination is the best solution, or maybe some other solution, such as
government support?
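The three charging principles are simple to compare in code. All prices and usage figures in the sketch below are invented for the illustration; only the three principles themselves come from the discussion above.

```python
# A minimal sketch comparing three ways of charging a user.
# Every price and usage figure here is a made-up example value.

def charge_by_volume(bytes_transferred, price_per_megabyte):
    """Charge per byte received and sent."""
    return bytes_transferred / 1_000_000 * price_per_megabyte

def charge_by_time(minutes_connected, price_per_minute):
    """Charge for the time connected, whether or not anything is done."""
    return minutes_connected * price_per_minute

def charge_by_application(uses, price_per_use):
    """Charge depending on which applications were used, and how often."""
    return sum(price_per_use[app] * count for app, count in uses.items())

volume_cost = charge_by_volume(50_000_000, price_per_megabyte=0.02)
time_cost = charge_by_time(600, price_per_minute=0.01)
app_cost = charge_by_application({"mail": 30, "video": 2},
                                 {"mail": 0.0, "video": 1.5})

print(volume_cost, time_cost, app_cost)
```

With these example figures, a user who stays connected for ten hours but transfers little pays most under time-based charging, which is exactly the incentive to disconnect that is described above.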
Shape memory alloy is malleable at
low temperatures, but above a
"transformation temperature" it
becomes hard to deform and
pushes back to its predetermined
shape.
The Chinese University of Hong
Kong
V.10.7 Screen or paper as the medium
It is somewhat unfair to compare reading from paper to reading from a
display. Paper print technology has evolved for 500 years, from hand
printed characters to standardised Times New Roman in millions of
documents. Line separations, margin widths, sentence length, and many
other aspects have been optimised for paper over the years. But,
perceptual and cognitive experiments show that if the conditions are the
same (good lighting and contrast, equal resolution) there is no difference
in reading speed or task completion. Today, paper is still superior in
resolution, but this is only a technical problem. Some potential problems
with reading from a display are that fonts can be poor on low resolution
displays, one page of paper needs several screens, and that you have to sit
in front of your bulky display, which is both an ergonomic and a
practical problem. Do you really want to read your good night
story from a desktop computer?
[Figure: virtual version and paper version.]
On the other hand, an electronic reading board, e.g. a Tablet PC, has all the
important features of the book, for instance page flipping possibility, and
a comfortable format. It also has some extra features from being a virtual
rather than a real book. You can add, and edit, personal notes, ask the
board where you stopped reading yesterday, read in the dark (!), and even
change to another book by a short command. Of course there are some less
welcome side effects: the board switches itself off when the battery
level is too low, and it smells of plastic rather than of paper.
Technology development also explores the complementary possibility, i.e.
to make the paper more flexible. New types of paper have recently been
invented where the paper can be electronically updated. You can use the
same newspaper every morning, but with a different content electronically
updated every night.
Imagine the applications you could build if technology allowed smooth
transitions from paper to digital media. You use a paper based book when
a digital environment is not available, e.g. on a sandy beach, and switch
seamlessly to a digital representation when the environment allows you
to, and it is favourable. There are several interesting research projects
aiming at such a smooth integration of paper and digital information. In
one project a real desktop is used as a computer screen and a video
projector generates the output images on the desktop, for example an
electronic version of the paper book you read when at the beach [HK]. A
video camera collects information about real and virtual objects on the
desktop and enables direct manipulation by analysing hands and fingers
found in the video. A problem with the idea is that you have to clean your
desk.
V.11 Interaction control, a joint venture or one of us in control
Many systems only have a single controller, an algorithm, a finite state
machine, boss, or a policeman. In general however, an interaction involves
two or more participants, each potentially capable of controlling the
interaction. If there is a clear objective for the interaction along with some
rules to follow, and if the participants agree on the rules and the objective,
these constraints might work as a control. A game of soccer is one example
where two teams, 22 players in all, agree on rules and a goal. Actually three
goals, but we leave the two with nets out of the discussion. The goal here
is usually to have fun, get some exercise, and to entertain the spectators. It
is rare that one member of a team takes the ball and tries to score with his
hands. It is also rare (at least for grown-ups) for someone to take up the
ball in the middle of a game and leave the field.
The interaction, in this example a game of football, does not follow any
predefined agenda. It is impossible to predict that the result will be 1-0,
and that the goal will come from a free kick. No one is in complete control
of the match; all 22 players collectively control it. This interaction in other
words is an example of distributed control, where peers communicate
and decisions are taken locally. Distributed control is nicely exemplified
by any H-H interactions between equals, where each of the interactors is
autonomous, and intelligent. Such an interaction can be labelled as
co-operation, competition, or a mixed-initiative system, and will be
further discussed later in this chapter.
As a special case, one or more of the participants act as subordinates. They
can for instance have very simple behaviour, such as elevator push
buttons. This leaves another participant in control, i.e. in centralised
control. If you, an intelligent driver, see a car accident, you will command
your car to steer away from it. Many interactions with centralised control
can be described as command oriented, a view taken in the fields of
human-computer interaction and control theory. Examples of commands
are pushing a button, and military commands such as “Attention!”.
Commands are convenient, but they have some limitations. It is for
instance difficult to use a single command to describe concurrent actions,
e.g. command two mice, or to manage coordinated action, i.e. two
different actions synchronised in time. Command based interaction will be
discussed in Chapter V.15.
Communication enables social conventions and other rules. The nature of
the rules is a trade-off between imposing too strict restrictions on
interactors, and chaos. Imagine driving in the absence of traffic rules.
Without rules, achieving anything at all in a social environment would be
very complicated. Information could be passed directly between
interactors, or be mediated through contextual representations, as when
we leave a note on the refrigerator door. In the following matrix, adapted
from [HP], the ways of transferring information (indirect, direct) are
combined with the possible control structures (centralised, distributed):
AP reports that IBM'er
David Bradley, who came
up with the Ctrl-Alt-Delete
key combination, is retiring.
[Figure: “Parental remote control tech”, a remote control with the buttons Mute, Go away, Buy it, and Pizza.]
                               Centralised control   Distributed control
Direct information transfer    Command oriented      Conversation
Indirect information transfer  Constrained           Stigmergy
Table V.11.1 Different control structures and information transfer.
Centralised control and direct information transfer together correspond
to a command oriented interaction. If information transfer is indirect
rather than direct, the distinguished interactor, who is in control,
constrains other interactors by changing exogenous context
representations. A parent does this as he, dead tired, turns the electricity
off in the whole building when the kids refuse to leave the PC and go to bed.
Direct distributed interaction, for instance a conversation, gives us
humans much pleasure.
Stigmergy, or indirect distributed interaction, relies on actors changing the
context and perceiving the changes. When there is no more milk in the
refrigerator you know that your children have been at home for a quick
afternoon snack, and that your next mission is to restore the level of milk
for dinner. If you yourself also wanted a glass of milk the stigmergy
specialises to competition [HP].
V.11.1 Coordination
All interaction involves coordination where an actor communicates and
synchronises its actions with other interactors, or more generally with its
context. Heavy traffic in a roundabout is a good example; a driver entering
either has to wait for a long safe access slot, or somehow has to
communicate with the cars in the roundabout, telling them “I am coming
in, move!”. Another example of coordination is playing a game of chess,
where coordination is based on very exact predefined rules. Any
coordination passes through the same steps: it is initiated (drivers enter
the roundabout), one of the interactors makes a proposal for adaptation
(waves a hand), a decision is taken whether to go ahead (brake or speed
up), followed by execution (iiiiiiih, screeech), and evaluation (puh!).
Wikipedia – The
free encyclopedia
Definition:
Coordination is the
process of managing
interdependencies
between activities.
Malone
There are three types of coordination: unplanned, implicit and explicit.
Explicit coordination is what we usually mean by the term
coordination. Interactors have goals, apply reason to their actions, and
communicate. The coordination is realised by planning, by executing some
predefined algorithm or protocol, or by joint intentions. Voting, a prompt
for more information on the computer screen, and floor passing during
discussions are examples of the control mechanisms used. Explicit
coordination can benefit from mutual modelling where all participants
have a model of the other participants' internal states, and of their views
on the current situation.
Unplanned coordination occurs when actions are triggered by the
unexpected behaviour of interactors, or by the environment. Examples are
shovelling snow after a snowfall, a web browser indicating a missing
link, and going to bed after catching a severe cold.
In the third variation, implicit coordination, action motivators are
predictable and built into the behaviour of the interactors, or into the
environment. Furthermore, direct communication for coordination is not
possible, or not used for some reason. Typical implicit examples are social
laws, e.g. you should call your mother on her birthday. Another example is
a progress indicator that shows how much time you have left for doing
something. While coordinating explicitly an interactor can communicate
with the other interactors, but in implicit coordination it has to guess their
intentions. This means that a model both of the interaction and of the
other participants is necessary. Bluffing in poker is a good example.
It might seem as if humans have an advantage over other types of
interactors since their communication channels have many more
dimensions (speech, newspapers, gestures), and since they are part of a
society with developed social conventions. This will soon change. As
things hook up to the network they will gain access to information via
other things and all this information will give them opportunities for
interaction that are not available to people. If people want to keep up with
the machines they too will have to hook up to the network.
Coordination implies a sense of presence, and communication, but also
reuse of the taxonomy and models from Chapter V.10 on mediation. We
discussed, for instance, mediation of different complexity: as a noisy
channel, a tool, or even a social actor. Also, the characteristics of
interaction discussed in Chapter V.9 are applicable to coordination.
Related to coordination is correlation. It is a mathematical measure that
indicates the co-variation of two processes, for instance the mathematical
models of two interactors. Without correlation there can be no interaction,
but with correlation we still can only presume interaction. Correlation
does not guarantee causality, but hints at it, a fact exploited in the
presentations of many fishy statistical investigations. The difference
between correlation and coordination is that correlation is only a statistical
measure of non-independence while coordination involves
communication between an actor and its context.
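As a minimal sketch of such a measure, the co-variation of two observed processes can be expressed with the Pearson correlation coefficient. The choice of Pearson's coefficient, the function name pearson, and the two sample series are assumptions made for illustration only; they are not taken from the text.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical observations of two processes over six time steps.
ice_cream_sales = [10, 12, 15, 21, 25, 30]
sunburn_cases = [1, 2, 2, 4, 5, 6]
r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation: {r:.2f}")
```

A value close to 1 only says that the two series vary together. In this made-up example neither process communicates with the other; both would plausibly follow a third one, the weather, so there is correlation without coordination.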
I reached out a hand from under the
blankets,
and rang the bell for Jeeves.
“Good evening, Jeeves”
“Good morning, sir”
This surprised me.
“Is it morning?”
“Yes, sir”
“Are you sure, it seems very dark
outside”
“There is a fog, sir. If you will
recollect, we are now in the autumn –
season of mists and mellow
fruitfulness”
“Season of what?”
“Mists, sir, and mellow fruitfulness”
“Oh? Yes, yes, I see”
P. G. Wodehouse
V.11.1.1 Mechanisms
The mechanics of coordination can broadly be characterised as sharing
and transfer [DP1]. In sharing there are different actions to obtain a
resource, reserve it, or protect a resource already obtained. Transfer is
accomplished by handoff or deposit. Handoff is a synchronised transfer
of a tool or an object. Giving a birthday gift, or using a protocol to update
a virus definition list, are two examples. A deposit on the other hand is an
asynchronous action where something is left in space or time for someone
or something else to pick up. Sharing always has a built in potential for
conflict, which transfer does not have. It is for instance not always easy to
share an alarm clock.
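The difference between handoff and deposit can be sketched in a few lines of code. This is only an illustration under our own assumptions: the names handoff and DepositBox are hypothetical, and a thread-safe queue stands in for the place where the transfer happens.

```python
import queue
import threading

def handoff(item):
    """Synchronised transfer: the giver waits until the receiver has the item."""
    channel = queue.Queue(maxsize=1)
    received = []

    def receiver():
        received.append(channel.get())  # blocks until the giver hands over
        channel.task_done()             # acknowledges the handoff

    t = threading.Thread(target=receiver)
    t.start()
    channel.put(item)
    channel.join()  # the giver blocks until the receiver has acknowledged
    t.join()
    return received

class DepositBox:
    """Asynchronous transfer: items are left for whoever comes by later."""

    def __init__(self):
        self._items = queue.Queue()

    def leave(self, item):
        self._items.put(item)  # returns at once, nobody has to be present

    def pick_up(self):
        try:
            return self._items.get_nowait()
        except queue.Empty:
            return None  # nothing has been deposited

print(handoff("birthday gift"))
box = DepositBox()
box.leave("note on the refrigerator door")
print(box.pick_up())
```

In the handoff both parties have to be present at the same time, while the deposit separates giver and receiver in time.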
When machines communicate over networks they use protocols, i.e.
predefined control and data units, which are exchanged according to
predefined rules. Humans also use protocols but with the difference that
for machines all the rules have to be rigorously and explicitly defined, and
followed. A little slip from one of the participating machines will
completely destroy the communication. Technology, as we know it today,
obeys the rules, but this is not true for humans who constantly, many times
joyfully, tweak rules. Protocols, in turn, need mechanisms on lower
levels, for instance allocation of time slots, or conventions for when to
speak.
Two mechanisms for the necessary communication are selective
messaging and information sharing. By selective messaging we mean
explicitly sending a message to a specific receiver or group of receivers.
We can use speech, writing, gestures, or a number of different media
supported by technology. A prerequisite is that we need to address the
receivers and this can also be done in many ways. One is by just looking at
the receiver, another is by using a phone number. Information sharing, on
the other hand, is indirect: the message is left in, or transferred to, a
common medium without specifying the intended receiver. Message
boards, overhearing a conversation, wall graffiti, and dogs marking their
territory are some examples.
Selective messaging allows for efficient feedback, because it is easy for the
receiver to acknowledge the message. This is exploited on the Internet,
and elsewhere, by the client-server protocol. One example is when a client
requests a service, e.g. a web page, from a server. Both selective messaging
and information sharing can be used in a producer-consumer protocol.
Here, feedback is not necessary; a one-way communication channel is
sufficient since the producer only starts the consumption, see the figure
below.
[Figure: a Client sends a Request to a Server, which returns a Response; a Producer sends an Invocation to a Consumer, with no feedback.]
Figure V.11.1 Two patterns for communication, client-server and producer-consumer.
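The two patterns can be sketched in code. The sketch below runs inside a single program rather than over a network, and the names serve, client_fetch and run_producer_consumer, as well as the page table, are hypothetical choices made for illustration.

```python
import queue
import threading

def serve(request):
    """Hypothetical server: answers a page request with a response."""
    pages = {"/index.html": "<h1>Hello</h1>"}
    return pages.get(request, "404 Not Found")

def client_fetch(path):
    # Client-server: the client sends a request and waits for the response,
    # which also acknowledges that the request arrived.
    return serve(path)

def run_producer_consumer(items):
    # Producer-consumer: a one-way channel, the producer never hears back.
    channel = queue.Queue()
    consumed = []

    def consumer():
        while True:
            item = channel.get()
            if item is None:  # sentinel: nothing more will arrive
                break
            consumed.append(item.upper())

    worker = threading.Thread(target=consumer)
    worker.start()
    for item in items:
        channel.put(item)  # fire and forget
    channel.put(None)
    worker.join()
    return consumed

print(client_fetch("/index.html"))
print(run_producer_consumer(["news", "weather"]))
```

The sentinel value that ends the consumer is an implementation detail of this sketch, not a part of the pattern itself.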
Another important prerequisite for communication is an infrastructure, at
least if we want to distribute the message over a large population or area.
Today the developed world has plenty of efficient infrastructures;
newspaper distribution systems, television, and the Internet are some of
the most prominent.
V.12 Co-operation, we are all in control
Because of co-operation more can be accomplished, but with the resulting
coordination, we face the problem of conflicts. Accepting this trade-off is
necessary, because without interaction, individuals are deprived of
adaptive capabilities and cannot reach their full potential. Co-operation
might even aggravate conflicts. This paradox can for instance be seen
when a war is expanded by different pacts and agreements. Other
variations of related complex interactions are obstruction, exciting,
agitation, threatening, and finally parasitism where the benefits that one
participant achieves come at the expense of all others.
Necessary conditions for co-operation are joint intentions. These are
social commitments used for explicit coordination. The interactors
publicly state intentions, and other interactors can use their statements for
coordination: “If you promise to stop working before midnight there will
be some tea prepared for you when you finish”. Publicly stated intentions
provide stability in the world since interactors do not have to re-evaluate
everything all of the time. But, this stability has a price; it is essential that
enough interactors are sincere.
Interactors commit to the overall objectives, as for instance all players in
the Swedish national football team commit to win the next match. When
circumstances change commitments sometimes have to be dropped.
Conventions identify when to do this and which commitment to ignore.
No player should give up a match just because England starts out with an
early goal. On the other hand, all players should accept the result as the
final whistle blows, even if Sweden should lose the game (which they
won't). In our human society conventions are in the form of norms and
Definition: Convention - A
practice or procedure widely
observed in a group,
especially to facilitate social
interaction; a custom
Language itself is nothing
more than a convention
which we choose in order to
coordinate our activities
with others
[MW3]
social laws. They serve as patterns that simplify decisions on behaviour in
everyday situations. Note that a convention is a coordination mechanism.
A difference between co-operation and coordination is that we are forced
to inspect the interior of an interactor to determine if co-operation is
taking place. If your children ate all of the candy it could be because they
know you are on a diet, i.e. they co-operate with you, but perhaps it is
more likely that they eliminated the risk of losing some, by eating it all.
For co-operation, once again the three levels of modelling, intentional,
conceptual and physical, can be used [DB2]. An intentional view of the
co-operation is about its purpose. Why is it taking place? Do all interactors
share the same objective? The conceptual view considers information
flow, information structure and possibly mental models of the interactors.
What kind of interactors are involved? What are their relationships? How
can their interaction be characterised? Could we use an auction to sell this
merchandise? Are there similarities to other co-operations? Finally the rubber
meets the road at the physical level, where the appropriate view may be
physiological, biological, or focus on a computerised implementation.
We can take the teacher-student interaction in a learning situation as an
example. In this example the topic taught is the IF statement in Java. An
intentional view will focus on the interactors' objectives. Hopefully they
both share the same objective: that the student learns what the
teacher teaches. At the conceptual level the student knows that the teacher
has valuable knowledge about programming languages, which in this case
will be transferred in a classroom situation. At the physical level the
teacher will use a blackboard to write Java code as the means of
expressing his conceptual view.
Definition:
Norm - a standard, model, or
pattern regarded as typical:
Nilikuonyesha nyota (mwezi) na
uliangalia kidole tu. (Swahili)
I pointed out to you the stars (the
moon) and all you saw was the tip of
my finger. (English)
Co-operation is necessary because
no single node has sufficient
expertise, resources and
information to solve a problem.
E.H. Durfee
Co-operation = interactors
acting in parallel +
coordination of actions +
resolution of conflicts
Jacques Ferber
Interactors in a system have objectives, need resources, and possess skills.
Table V.12.1 below shows different situations involving the interactors.
We will not go over all the possible permutations, only provide some
examples. If the interactors have compatible goals, the resources are
sufficient, and the interactors by themselves have insufficient skills we
have a good opportunity for distributing the work. The condition that all
interactors share the same objective is called the benevolence assumption
and greatly simplifies the task of the designer. Complicating the problem
is that decisions often must be made at run time, i.e. we have a dynamic
system with real time constraints. For moderately complex systems such
behaviour is impossible to hardwire at design time, which means that the
interaction has to adapt.
Goals         Resources     Skills        Types of situation
Compatible    Sufficient    Sufficient    Independence (Indifference)
Compatible    Sufficient    Insufficient  Work in parallel
Compatible    Insufficient  Sufficient    Obstruction (Co-operative involvement)
Compatible    Insufficient  Insufficient  Co-ordinated collaboration
Incompatible  Sufficient    Sufficient    Pure individual competition
Incompatible  Sufficient    Insufficient  Pure collective competition (Antagonism)
Incompatible  Insufficient  Sufficient    Individual conflict over resources
Incompatible  Insufficient  Insufficient  Collective conflict over resources
Table V.12.1 Aspects of goals, resources and skills give different types of interactions.
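Table V.12.1 is in effect a lookup from three yes/no aspects to a type of situation, which can be sketched as a small data structure. The names SITUATIONS and classify are our own, and the mapping assumes that the eight situation types appear in the same order as the eight rows of the table.

```python
# Key: (goals compatible, resources sufficient, skills sufficient)
SITUATIONS = {
    (True, True, True): "Independence",
    (True, True, False): "Work in parallel",
    (True, False, True): "Obstruction",
    (True, False, False): "Co-ordinated collaboration",
    (False, True, True): "Pure individual competition",
    (False, True, False): "Pure collective competition",
    (False, False, True): "Individual conflict over resources",
    (False, False, False): "Collective conflict over resources",
}

def classify(goals_compatible, resources_sufficient, skills_sufficient):
    """Return the type of situation for one combination of the three aspects."""
    key = (bool(goals_compatible), bool(resources_sufficient), bool(skills_sufficient))
    return SITUATIONS[key]

# Compatible goals, but both resources and skills are scarce (fourth row).
print(classify(True, False, False))
```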
If, as in the fourth row of the table above, resources as well as skills
are scarce, one way out is to somehow multiplex the actors. This is exemplified
by the assembly line where it would be too expensive to have a group of
employees working in parallel at each stage of the production line. The
solution is to train experts to perform one stage each in sequence as the
product passes by. Try to think of examples yourself for the other
combinations. Which row matches a high jump final in the Olympics? A
boxing match? A war for more than 50% share of the market between Coca
Cola and Pepsi? Queuing because the road is not wide enough?
Goals do not necessarily have to be incompatible even if they are not
shared. We have many other cases where goals interact. If one actor
achieves its objective this might partially fulfil the goal of another actor.
One example is the birds that clean the teeth of crocodiles (why
crocodiles don't occasionally swallow a bird is not clear).
In a house building competition in Los Angeles the
winning team completed the
house in four hours! The
team was composed of 200
builders.
[DHA]
V.12.1 Measures of co-operation
How do we measure the amount of co-operation, and why? The first of
these questions is typically what an engineer would ask, and we can use
the previous section on co-operation to formulate some qualitative
statements [JF].
• In any co-operation the addition of a new interactor should
  increase the performance levels of the group.
• In any co-operation an action by an interactor should reduce
  actual, or potential, conflicts.
At the same time as a measure gives information on the level of co-operation,
it estimates problems in a group of interactors. No co-operation
at all is usually an indication that something is wrong. The following
quantitative measures of co-operation can be used [JF]:
• The number of adjustments to actions.
• The degree of parallelism, which depends on the distribution of
  tasks and on their concurrency.
• The amount of resource sharing.
If you have an apple and I have an
apple and we exchange apples then
you and I will still each have one
apple. But if you have an idea and I
have an idea and we exchange these
ideas, then each of us will have two
ideas.
George Bernard Shaw
• The level of non-redundancy of actions; co-operation is
  characterised by a low rate of redundant activities.
• The number of blocking situations.
What is measured is either how well a system of co-operating interactors
works as a unit, i.e. how well it uses resources, or to what extent the
system avoids the implicit problems of coordination.
V.12.2 Mechanisms for co-operation
We will now discuss some mechanisms that give rise to, or are results of,
co-operation: grouping, specialisation, and resource sharing. We start by
noting that a necessary prerequisite for all of the mechanisms is
communication.
We can increase the possibility of co-operation by grouping the
interactors. This is easier if we decrease the physical distance between the
interactors in space and time. Grouping improves reliability by increasing
redundancy, and also enables parallel work on local problems. But, the
result is not always altogether good; suburbs for instance lead to
congested traffic that no one likes.
With grouping comes a possibility for specialisation. It improves the
solution of similar recurring problems, because expertise is reused to
increase efficiency. But, with more autonomous specialised individuals,
vulnerability increases. Everyone, or no one, wants to follow a particular
specialisation, and some cannot, even if they want to. It is difficult to be
both a specialist and a generalist. A specialist therefore needs either to be
highly adaptive, or to depend on others and be less capable of surviving alone.
As specialisation increases, compelled by efficiency and productivity
demands, specialists will be more and more confined to their speciality.
The surrounding society will become more and more dependent on the
specialities of the experts and this has been, and will be, exploited by the
experts.
Hospital
Umeå
University
It is hard to argue
when you are alone.
Figure V.12.1 Generalisation and
specialisation.
Generalisation
Specialisation
Sharing of tasks and resources is both a cause for grouping, and a result
of it. Supply and demand, centralised or distributed control, and problems
with coordination are some of the issues involved. Sharing resources is
also a typical computer science problem: complex software accesses all
sorts of resources, such as hard disk drives and shared data and code
segments, each allowing only one program at a time to access it.
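A minimal sketch of such exclusive access is a lock that a program must hold while it uses the resource. The shared printer, the class name SharedPrinter and the job names are hypothetical; threads stand in for the competing programs.

```python
import threading

class SharedPrinter:
    """Hypothetical shared resource that only one program may use at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.log = []

    def print_job(self, program, pages):
        with self._lock:  # other programs wait here until the lock is released
            for page in range(1, pages + 1):
                self.log.append((program, page))

def run_programs(printer, jobs):
    """Start one thread per (program, pages) job and wait for all of them."""
    threads = [
        threading.Thread(target=printer.print_job, args=(name, pages))
        for name, pages in jobs
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

printer = SharedPrinter()
run_programs(printer, [("editor", 3), ("browser", 2)])
print(printer.log)
```

Because each job holds the lock for its whole duration, the pages of the two programs are never interleaved in the log, although the order in which the programs get the printer is not determined.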
Task sharing also implies problem decomposition and allocation of
sub-problems to actors. One way to do task allocation is to publicly announce
the available subtasks and have an auction where interactors commit to
the tasks they select.
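Task allocation by auction can be sketched as follows. The rule used here, that each announced task goes to the bidder with the lowest estimated cost, is our own assumption, as are the function name allocate_by_auction and the example bidders.

```python
def allocate_by_auction(tasks, bidders):
    """Announce each task in turn and award it to the lowest-cost bidder.

    bidders maps a name to a function that returns a cost estimate for a
    task, or None if the bidder does not want the task.
    """
    allocation = {}
    for task in tasks:
        bids = {}
        for name, estimate in bidders.items():
            cost = estimate(task)
            if cost is not None:
                bids[name] = cost
        # A task without bids stays unallocated.
        allocation[task] = min(bids, key=bids.get) if bids else None
    return allocation

# Hypothetical interactors with different skills.
bidders = {
    "painter": lambda task: {"paint wall": 2, "lay floor": 9}.get(task),
    "carpenter": lambda task: {"lay floor": 3, "build stairs": 5}.get(task),
}
tasks = ["paint wall", "lay floor", "build stairs", "wire lights"]
result = allocate_by_auction(tasks, bidders)
print(result)
```

Note that a task nobody bids for, here the wiring, is left over, which is one of the coordination problems an auction does not solve by itself.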
V.13 We compete, and compromise
Competition is a type of interaction, perhaps not as well behaved as
co-operation, but found everywhere. As we did with coordination, we can
describe it as implicit, explicit, or unplanned. Implicit competition is
based on behaviour built into the interactors, or the environment. One
example is competing for a place in a soccer team. There can be only 11
players on the field. Unplanned competition is when you get a call from
the television station inviting you to compete in University mathematics
with a 4-year-old for a nice red Porsche. Another interesting example is
when less well-behaved people try to jump a queue. A queue is typically
an explicit coordination, but the annoying ones turn it into an unplanned
competition. Explicit competition is exemplified by all sorts of games
where the interaction rules are clear and the players known.
Apart from an occasional runaway train, a few overheated nuclear power
plants, and a rocket exploding because of a numerical conversion error,
humans do not seem to be in head-to-head competition with I or T. Conscious
competition has so far only been seen in films such as Terminator.
But very little gossip is malicious –
only about 5%, if that. And gossip
does I think a lot of good, it’s an
important way for keeping us all
honest.
Professor Nick Emler, Oxford
university
Machines and software are however designed and used for someone's
purposes, not always compatible with yours. The computer virus is an
interesting example that forces you to spend money on almost daily
protection updates. Because I and T change the human condition
and behaviour they still can cause conflicts. They are limited in many
respects that certainly will affect and constrain humans. Why do you for
instance call a number and not a person? The unique characteristics
described in Chapter IV.19 can be exploited, typically with goals
related to increased efficiency. Speeding up travelling is one example,
which is done at the cost of missing the scenery. You will only see a boring
(safe) highway. Another example is software agents buying stocks faster
than humans can manage.
Automatic machines are (so far) typically designed for deterministic tasks.
When people work together with such systems the demand for
predictable, repetitive action will also affect the human tasks. Decision
speed increases, tolerances shrink, and quality spells increased
productivity. Furthermore, coordination and communication must be
done on terms optimised for the computer.
We need oil. Or is it the machines
that need oil?
To continue the bad news, technology threatens jobs. Many simple tasks,
such as making a cup of coffee, can be done by a machine. There are also
new jobs created with the introduction of advanced technology, but they
are different, and often amount to controlling the behaviour of the system
at a higher abstraction level. Compare digging a hole with a spade to
using an excavator. There is a substantial difference in skills needed, in
economic investment, and in demands for efficient use of the respective
tool.
Conflict resolution is a good reason for interaction; if conflicts arise they
must be resolved or ignored. Some useful methods for resolution are
arbitration, justice, negotiation, bilateral agreement, laws, and regulations
(which can be compiled or hardwired into things, but broken by humans).
For software, prioritisation is often used. If two programs do not agree, the
programmer or the operating system assigns each of them a priority, i.e. a
number. The code with the higher number is favoured and is
granted access to the resource. Alternatively each program could get
access to the resource in turn for a finite amount of time. This idea is
referred to as time-sharing. As discussed above, time-sharing will be a
problem when humans and machines co-operate because of their different
characteristics.
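Both ways of resolving the conflict can be sketched in a few lines. The function names grant_by_priority and time_share, the round-robin order, and the example programs are assumptions made for illustration.

```python
from collections import deque

def grant_by_priority(requests):
    """requests is a list of (program, priority); the highest number wins."""
    return max(requests, key=lambda request: request[1])[0]

def time_share(work, quantum):
    """Give each program the resource in turn for at most `quantum` units.

    work maps a program to the units of work it needs. The result is the
    schedule as a list of (program, units used) in the order of execution.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    pending = deque(work.items())
    schedule = []
    while pending:
        program, remaining = pending.popleft()
        used = min(quantum, remaining)
        schedule.append((program, used))
        if remaining > used:
            # Not finished: go to the back of the line and wait for the next turn.
            pending.append((program, remaining - used))
    return schedule

print(grant_by_priority([("backup", 1), ("editor", 5)]))
print(time_share({"editor": 3, "backup": 1}, quantum=2))
```

With priorities the low-priority program may wait for a long time, while time-sharing lets every program make some progress at the cost of interrupting all of them.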
To strike a more positive note improved technology should make it
possible to adapt machines to us, rather than the other way around. Even
without such adaptation we have used machines for hundreds of years
and managed without too many problems. For a small example of how
new technology changes our lives consider TV-news. You no longer
have to watch it at 19.30 sharp. You can record it and watch it any time
you like. Soon you will watch it over the Internet, and select the news
items that suit you.
Living in general, and competition in particular, force interactors to trade
off alternative paths of action, i.e. to make decisions. There are several
interaction strategies for decision making [SS]. The most obvious strategy
is to follow a plan, and if we do not have a plan we make one. A more
direct way is to match a goal using a set of currently available affordances
and knowledge about effects of local actions, e.g. knock him out to win the
fight. A third strategy is to use a history based selection of decisions, such
as “This worked the last time”, “This usually works”, “This works every
time”.
Decision making becomes difficult in interactive situations when we have
to consider also what other interactors will decide. It is especially
complicated when no one is willing to give information away. If we have a
competitive situation we still can reach an agreement by using
negotiation, or argumentation, which in turn need protocols.
Representatives for different nations sit down at the negotiation table and
try to find agreements, but first they have to agree on the rules: who is
allowed to talk when, and for how long, i.e. they agree on the protocol.
Argumentation is a way to gain an edge in a competition. One actor tries
to convince another using logical, emotional, visceral, or kisceral
arguments. Visceral arguments are physical, for instance applause to
support an idea. A kisceral argument appeals to the intuitive, mystical, or
religious as in “According to the Bible …”.
Eeenie meenie
mini mo …
The Sanskrit word for "war"
means "desire for more cows."
For very simple negotiations auctions are suitable, but they can only
handle allocation of goods. The bidder with the highest bid wins the
merchandise. If the negotiation is more complex, perhaps there are
several issues that have to be balanced, an auction is not flexible enough.
We can still use a simple protocol, such as a series of rounds, with every
actor making a proposal in each round, but we have to find some strategy
for what proposals to make. The proposal chosen by us of course depends
on those made by the other negotiators. When you buy a car you have to
consider price, mileage, colour, conditioning, and also a salesman that
keeps adding new arguments. It is sometimes a problem to find a rule for
when the deal is closed, and for a highly complex negotiation it is even
difficult to agree on what the negotiation is really about. Borders,
settlements, weapons, terrorist definitions …
We can use many criteria to evaluate a negotiation. Pareto efficiency
measures the global result of a negotiation. An outcome x is Pareto
optimal if there is no other solution x' such that at least one actor is
better off in x' than in x, and no actor is worse off in x' than in x.
Another criterion is that all actors should gain something by
participating in the negotiation, i.e. rational behaviour is assumed.
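As a minimal sketch of the Pareto criterion, assume that each possible outcome of a negotiation can be described by one utility number per actor (higher is better). The representation and the function names are assumptions made for this illustration.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if no actor is worse off in a than in b, and at least one is better off."""
    no_one_worse = all(x >= y for x, y in zip(a, b))
    someone_better = any(x > y for x, y in zip(a, b))
    return no_one_worse and someone_better


def pareto_optimal(outcomes: Dict[str, Sequence[float]]) -> List[str]:
    """Names of the outcomes that are not dominated by any other outcome."""
    return [
        name
        for name, utilities in outcomes.items()
        if not any(
            dominates(other_utilities, utilities)
            for other, other_utilities in outcomes.items()
            if other != name
        )
    ]


# Hypothetical utilities for two actors in three candidate deals.
deals = {"A": (3, 3), "B": (4, 2), "C": (2, 2)}
```

With these numbers deal C is dominated by deal A, while neither A nor B dominates the other, so both A and B are Pareto optimal. The criterion says nothing about which of them is the fairer deal.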
In strategy, it is important to
see distant things as they were
close and to take a distant view
of close things.
Shinmen Musashi No Kami No
Genshin, also known as
Miyamoto Musashi from the
book ”Go Rin No Sho”, 1645
V.14 Computer-supported co-operative work
We can study computer-supported co-operative work (CSCW) at
different levels of detail, e.g. as H-I-H, or H-T-I-T-H. The technology for
CSCW, also called groupware, steps in between humans, borrowing
interaction metaphors from formal meetings (video conference), and from
the telephone (video telephone). As the supporting infrastructure evolves
CSCW will be more and more important.
It is an accepted truth that human communication mediated by
technology has a lower communication bandwidth. But is this necessarily
true? Let us say that you want to communicate with someone in the dark.
In this situation you could use an infrared camera to enhance the
bandwidth. In fact, by using technology any human-to-human
communication channel could be enhanced! Information display of
previous experiences and automatic matching of interests are some
possibilities.
“If, as it is said to be not unlikely in
the near future, the principle of
sight is applied to the telephone as
well as that of sound, earth will be
in truth a paradise, and distance
will lose its enchantment by being
abolished altogether”
Arthur Mee, 1898 [JC]
New technology will necessarily change behaviours, but the effect of
introducing a groupware system could take days, weeks, or even longer
before it is seen. It took mobile phones, for instance, 10 years to become
commonplace in Sweden. Considering how long it takes for a technology
to change the behaviour of grown-ups it is strange that very few research
papers study how interactive systems affect the behaviour of children.
One example of changed behaviour is reported in [WM]. The commons of
the EuroPARC office could be seen over a local network. The effect of this
was that people showed up in waves at meetings. When three or four were
seen, the rest of the participants appeared more or less at the same time.
They used the video to optimise their time. Another example is that
answering machines at one time were considered rude. A couple of years
later it was considered rude not to have one!
The four different player
categories of role playing
games are:
Achievers, explorers,
socialisers and killers.
R Bartle
For any collaborative application to be a success it has to fit into the
context of the users. It must do so in many aspects, but first of all it has to
be accepted by all of the users, in all of their roles. They should all gain
something from the introduction of the new system. A groupware
system is of no use if only one person in the department uses it. Other
important aspects are:
• Communication media should provide adequate support.
• The transition back and forth between individual work and
  groupwork should be seamless.
• The groupwork should be integrated into the overall work
  process.
There are some fundamental physical and social constraints that cannot
be overcome by technology. We have different time zones, unavoidable
delays from the speed of light, cultural differences (when in Rome, do as
the Romans do), differences between generations, language issues, and a
healthy scepticism towards technology in general. We also have to accept
inherent limitations in technology. The sender of an e-mail, for instance
sitting at a stationary desktop, expects the e-mail to be read in a similar
environment. But it could be impossible for the receiver, using a mobile
computer, to follow an attached web link.
Figure V.14.1 Typical CSCW setting [JG1]. Labels in the figure: chairman
(far away), bad lighting, bored participant (close to camera), remote
participants (small images).
The videoconference setting where two teams line up at each site
encourages antagonism rather than collaboration, see figure V.14.1. This,
together with the complexity of H-H interactions that we have discussed
above, makes negotiation difficult [WM]. Better results with CSCW are
reported when people concentrate together on the solution of a problem
rather than on each other. One typical example is remote troubleshooting
of a machine where video was used as an aid to pinpoint causes and
follow up on repairs.
“Watson, please come here”
First words by A. G. Bell on his
new telephone
V.14.1 Taxonomy
How does a group accomplish its task?
The first thing to acknowledge is that groups are complex social systems
with both internal and external relationships. Many views are possible:
economic, sociological, managerial, and educational. We will here
introduce the TIP (Time, Interaction, and Performance) model by McGrath,
which emphasizes that a group is a social system with a purpose [JM].
The immediate future is a
concentration of series of open
possibilities, and the mobile phone
is increasingly indispensable for
arranging these possibilities and
establishing priorities.
Timo Kopomaa.
In the TIP model groups are seen as simultaneously and continuously
engaged in three activities. The first is production, i.e. getting the task
done, including problem solving and task-performance. Next, member
support encourages the members and increases participation, loyalty, and
commitment. The third function, group well-being, is to keep the group
together as a social unit, for instance by management. Small groups carry
out the three functions in four possible modes:
1. Inception (choice and acceptance of goal); a group working well
   quickly starts up work, and easily generates new ideas and plans.
2. Problem solving; a good group efficiently finds the preferred
   means and methods. Also involves staffing, and role issues.
3. Conflict resolution; conflicting views or interests need to be
   resolved, for instance in work assignments and preference
   resolution.
4. Execution (implementing the solution to reach the goal), possibly
   done in competition or against common knowledge.
Compare the four modes to
the left to the action cycle.
The four modes are concurrently active, but focus shifts depending on
knowledge level, type of task, group preferences, available technology,
and other changes in context. The group may be in different modes in the
three different activities mentioned above. It could be problem-solving in
the production function, and engaged in conflict resolution in group
well-being and member support.
The activities above are by no means static. There are continuous
processes of coordination and synchronization within the group, and
between the group and its social environment. Also, the group evolves
over time as members get to know each other and establish routines and
norms. To further complicate matters people might belong to many
groups.
There is no one single
way from start to finish.
[Figure: production, member support, and group well-being shown along a time axis t.]
A social interaction can be characterized in at least two dimensions, space
and time, as shown in table V.14.1. The distinction between synchronous
and asynchronous interaction addresses whether the interaction happens
in real time or is delayed. Not many people visit the site of an already
finished competition where the winners have already drunk the champagne.
                   Same place          Different place
Synchronous        Face to face        Information about a
(Same time)        (quite common)      closing airport gate.
Asynchronous       Messages on the     Electronic mail, Book
(Different time)   refrigerator
Table V.14.1 CSCW, spatial and temporal view.
The most difficult cell in the table to support from a technical point of
view is the synchronous/different place combination. A network is needed
with enough capacity to transport the information, and the delay must be
kept within the hundred milliseconds range. This delay includes all
processing of data by the computer, which can be substantial, for instance
in a videoconference. Given more time, i.e. the asynchronous applications
in the table, the demand on the technology is not as severe, but
transporting a video mail is still a problem since the amount of data is
quite large.
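The delay requirement for the synchronous/different place cell can be made concrete with a rough budget. The sketch below adds the propagation delay to the processing steps and compares the sum with a budget of one hundred milliseconds. The signal speed in optical fibre (about two thirds of the speed of light in vacuum) is a standard approximation, while the processing times are purely hypothetical numbers chosen for illustration.

```python
from typing import Dict

# Assumption: signals in optical fibre travel at roughly two thirds of the
# speed of light in vacuum, i.e. about 200 000 km/s.
SPEED_IN_FIBRE_KM_PER_S = 200_000.0


def propagation_delay_ms(distance_km: float) -> float:
    """One-way propagation delay over the given distance, in milliseconds."""
    return distance_km / SPEED_IN_FIBRE_KM_PER_S * 1000.0


def one_way_delay_ms(distance_km: float, processing_ms: Dict[str, float]) -> float:
    """Propagation delay plus all processing steps along the way."""
    return propagation_delay_ms(distance_km) + sum(processing_ms.values())


def within_budget(distance_km: float, processing_ms: Dict[str, float],
                  budget_ms: float = 100.0) -> bool:
    """True if the total one-way delay fits within the delay budget."""
    return one_way_delay_ms(distance_km, processing_ms) <= budget_ms


# Hypothetical processing times for a videoconference, in milliseconds.
processing = {"capture": 15.0, "encode": 20.0, "decode": 10.0, "display": 15.0}
```

With these assumed numbers the processing alone uses 60 ms, which leaves 40 ms for the network. That corresponds to roughly 8 000 km of fibre, so a very long connection would not fit the budget however fast the computers are.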
A slight variation of the different-place interactions is indirect
interaction, where there is no possibility for the receiver to affect the
sender of the message, i.e. no return channel. We take the most popular
brand of some product, i.e. asynchronous interaction, and if we are in a
hurry we try to find a clear path through a crowd, which is an example of
synchronous indirect interaction with the crowd.
We get a different perspective on CSCW if we replace place (Same,
Different) in table V.14.1 with the objects involved in the collaboration
(Artefact, Participant) [RR].
                   Artefact                     Participant
Synchronous        What is happening to         Who is around and
(Same time)        the artefact?                what are they doing?
Asynchronous       What has happened, or will   Who has done what and who
(Different time)   happen, why, when and how    is going to do what?
                   did it happen?
Table V.14.2 CSCW, participants and temporal aspects.
Table V.14.2 illustrates some possibilities for a CSCW application. Let us
as an illustration briefly discuss notes, i.e. artefacts supporting
asynchronous and synchronous interaction. Technology is constantly
inventing new ways of sticking information to a place for others to find at
some other time. Less sophisticated is spray-painted graffiti and almost
magical are electronic notes, where a physical reference helps the user's
browser to find the right web page with local information. The
information supplied by the notes could assist in way-finding, or display
historical events. It might describe some interesting aspect of the local
context, or be left as a message to a specific person passing by, i.e. to fulfil
some social function. Passing notes in the classroom is one way to send
private messages and technology could provide students with a wireless
electronic equivalent.
[Figure: potential reference link (bar code).]
A permutation of the previous tables, see table V.14.3, gives yet another
perspective, and now we have to consider virtuality. The table helps us
not only to take the history of an interaction into account as in table V.14.1,
but also to elaborate on the fact that technology overcomes physical
distances.
                  Artefact                    Participant
Same place        What physical               Face-to-face collaboration
                  interactions are            possible without technology.
                  involved?
Different place   Networked artefacts         Technology mandatory for
                                              H-H interaction
Table V.14.3 CSCW, participants and spatial aspects.
V.14.2 Effective interaction
Humans are embedded in social information and many decisions are
taken using it. Some examples are buying a house in the right
neighbourhood, having the right, tight jeans, not choosing a restaurant
that is empty, shopping at certain times just to enjoy watching other
people, or the opposite, shopping when no one else does. Compared to this
wealth of information available in real life we are socially blind in the
digital world [JC]. This section will discuss the degree of goal-attainment
that can be reached when a group interacts, i.e. the effectiveness of the
interaction.
Das Pferd frisst keinen Gurkensalat
(The horse eats no cucumber
salad)
The first telephone message,
Germany 1860
A prerequisite for effective interaction is common knowledge, i.e. a shared
common ground, and a communication channel with appropriate
bandwidth. Table V.14.4 below, adapted from the reference [JC], describes
some characteristics of synchronous face-to-face interaction and
exemplifies them in different ways. Throughout the examples the
importance of a common ground can be seen. It is easier to maintain if the
interactors are situated at the same place, i.e. collocated. Collocation
means more possible communication channels and that you know, and
know about, people around you (could be years of discoveries). How do
you appreciate your co-workers' pheromones over the Internet? What does
an e-mail say about the stress level of your fellow worker? Common
ground is also enhanced by co-reference, and implicit cues. A task that
demands a high level of coupling additionally requires short feedback
loops, and possibly a large number of complex messages over more than
one modality.
Characteristic           Description and Implications
Individual control       Each participant can freely choose what to attend to.
Familiar collaborators   Identities and characteristics known for participants,
                         their roles and their relations. Helpful for
                         interpretation of messages and behaviour and for
                         identifying expertise and knowledge.
Rapid feedback           Many communication channels with short
                         communication delays. Quick corrections possible.
Multimodal               Voice, facial expression, gesture, body posture, and
                         more. Enables efficient complex messages, and
                         redundancy for error resilience.
Fine-grained             Analogue or continuous information flows. Subtle
information              message differences possible and information
                         modulation possible.
Shared local context     Participants have similar physical environments and
                         experience the same local situations, objects and
                         actions. Allows for easy socializing as well as
                         mutual understanding and learning by copying. It
                         also provides means for a shared history. A key such
                         as the chaotic desktop of a colleague is useful in
                         deciding whom to discuss a problem with. How can
                         similar cues be provided in the virtual world?
Co-reference             Easy joint reference to objects. People and objects
                         have known spatial locations. Gaze and gesture can
                         easily identify objects by pointing.
Implicit cues            A variety of cues as to what is going on (events and
                         effects of events) that are available in the periphery
                         to an individual. Natural operations of human
                         attention, e.g. eavesdropping, provide access to
                         important contextual information including facial
                         expressions and body postures. Provides information
                         to a history.
Table V.14.4 Characteristics of synchronous face-to-face interaction.
Definition: Common ground - an
agreed basis, accepted by both or
all parties.
Groups are inherently complex:
A group has a past and a future,
variable membership of more,
exist in an environment
(communities, organizations,
neighbourhoods, kin networks,
departments), tasks are related,
never repetitive without
variation, ad hoc, modulated by
time, place and situation, and
not (always) rational
Social information is necessary for social manoeuvring, and the implicit
cues support social awareness, see table V.14.4. Social awareness is
important since it supports cultural rules: we know that we will be
directly accountable for our actions. You can see that someone else is
currently working on the same project as you are, and you know that this
works both ways. This relationship is not necessarily obvious in the digital
world. Through implicit cues social rules and social control are exerted,
and the cues also enable humour, discussions, and implicit learning by
copying, i.e. as in the Swedish saying, “knowledge is in the walls”. In
general the downside is that there is a trade-off between visibility and
privacy. Another problem is that it will take time to introduce a new
member to intricate established social cues.
Most Chinese are never out of
earshot from another Chinese.
True?
Physical contact and familiar collaborators mean more and better
opportunities for you to extend your social networks. It is much easier to
ask someone you know about somebody else. In general knowledge
formation is a social phenomenon.
Figure V.14.1 Hole in space,
Galloway, Rabinowitz 1980.
Real-time video connection with
full size images between New
York and Los Angeles.
New technology does not seem to decrease the number of face-to-face
meetings. A typical use of a mobile phone is to discuss when and where to
meet, and to keep options for meetings open. Neither does use of new
electronic media seem to replace other forms of interaction; they only
complement them. Paper is for instance still used (a lot) even if most
information is digital.
Reviewability and revisability are characteristics of the digitised
message. Writing an e-mail gives you time to formulate, and re-formulate,
your anger in a more sophisticated wording instead of immediately meeting
person to person to get even. The fact that any message transmitted
using technology is potentially stored and later retrieved is of course not
only a blessing. Users' concern for loss of control over unique data, and for
dissemination of personal confidences must be considered if technology is
to be used in the communication loop. Trust in technology? What if the
videoconference where you participated (and did something you
shouldn't have) is on the loose, out there on the Internet [JG ]?
New options on how to record and access data make it possible for us to
acknowledge knowledge work in new ways by keeping track of who is
making major contributions [JC]. At the same time we will be able to see
and find those who are doing nothing, or are working with other things,
something that affects privacy and will tend to prioritise certain kinds of
work that is measurable.
Top ten list of the places that people
are most likely to gossip… At number
10 – unisex loos; at number 9 –
supermarket queues; at number 8 – to
their personal trainer; at number 7 –
with cab drivers; at number 6 – in
crowded bars; at number 5 – at
meetings; at number 4 – on mobile
phones; at number 3 – friends telling
other friends; at number 2 – on train
and tube journeys…and at number
one – restaurants!
Relations Analyst Stephen Forster
Neither transmitters nor receivers
seemed particularly sensitive to the
public nature of transmissions,
although they did indicate some
embarrassment about the fact that
speech emanated from unexpected
areas of their body, depending on
where they had clipped their
cellular radio. This sense of body
parts such as thighs or hips
“talking” did not result in a
change in practice, i.e., they
continued to use speaker audio.
A. Woodruff on Push-to-talk
Mobile applications add new twists to the discussion since they allow
constant availability. Interruptions anytime, anywhere, by anyone will be
the result. If you turn off your mobile device you might miss something,
or be considered asocial. On the other hand, constant access to others gives
a sense of safety and security. This is a nice feature when you travel by car
in the deep woods of Sweden with a temperature outside below –30 °C.
Voices are distracting, too personal
for communication with unknown.
Dr Michael Heim, Art Center
College
V.14.3 Interaction bandwidth
The figure below describes H-H interaction using technology from a social
point of view. It grades the coupling and emotional involvement
necessary (and possible) in different interaction forms, i.e. the social
presence possible. Text chat is the least engaging form. You can very well
participate in three different private chat sessions, all at the same time.
This is not easy using face-to-face videoconference.
[Figure: interaction forms ordered by increased bandwidth: text chat;
animated cartoon figure with synthetic voice; telephone; animated virtual
face with true voice; face to face video conference.]
Close coupling is easier to maintain in collocated face-to-face
collaboration. Network delays do not exist, and there is no need to learn
complicated tools. A large vocabulary is available, and with a number of
analogue communication channels available it is easy to exactly specify
your message (well, at least relatively easy). Individual personal control
lets a collaborator focus on the most important message and on the most
important channel, e.g. that person over there yawned, did I hear a laugh?
Maintaining a close coupling is simplified by a well established common
ground since formulating messages is easier if you know how the receiver
will interpret them. Close coupling in other words increases efficiency and
effectiveness. This is however not always desirable because it also makes
participants vulnerable, open to criticism. Furthermore, face to face
communication clearly favours good-looking individuals. If we restrict
the communication channel to text only, everyone who can type has the
same possibility, and humour will be important. In a text based
interaction we give interactors the possibility to hide physical weaknesses,
and allow the interpreter to build an illusion of the sender. A reduced
channel can even increase the experience of using it!
Visibility and audibility are lost in a two-way text chat. It is impossible to
know if someone is paying attention to the chat, and also difficult to force
someone else to give a prompt response. The real-time coupling is lost in
an e-mail where message content and sequences have to be recalled. An
answering machine is another example that also accepts asynchronous
messages and leaves them hanging in cyberspace waiting to be heard.
It does maintain audibility though, and you can easily hear if someone
you know well is disturbed or angry. The human voice is a sensitive
instrument that displays emotions, whether the speaker wants it or not.
On the other hand, a text chat makes it possible to hide emotions, but also
makes it difficult to show them.
Figure V.14.2 Interaction
bandwidth for different
applications.
People who have established
a lot of common ground can
communicate well even over
impoverished media…
G. Olson & J. Olson
Feedback, immediacy of
response, multiple
communication channels,
multiple participants, symbol
variety, rehearsability,
reprocessability,
Features of the optimal
medium
Video is a very powerful medium,
perhaps too powerful
Wendy Mackay
There are many other interactions possible on the scale. You can enter
messages using voice that can be presented as text, enhanced by emotion
amplifiers. You can enter text that is presented as voice, or you can
distort voice output in some convenient way. A video might be
manipulated such that the individuals, or what they do, cannot be clearly
identified, but the video would still give a sense of presence. This could
provide a feeling of presence akin to the one you feel from a group of
people passing, chatting and laughing, outside your closed door [JC].
Computer games such as CounterStrike use voice for communication
among players. This means that thousands of families will be listened to in
real time, all of the time. Is this good or bad for society as a whole? Maybe
this is nothing more than the ordinary phone? When the phone was
introduced one objection was “Will anyone be able to call me up for a
shilling?”
How could an application that
facilitates communication harm
teamwork? Are there new sources
of friction among participants
when using computer-based cooperation?
V.14.4 Social quality of service
As we build tools for H-H interaction we should also evaluate the social
quality of service of the tool. How does a particular tool present social
services? How are the users affected and what means does a user have to
adapt? Some of the questions related to social services are:
• Who is allowed to join?
• Who has joined and who has joined and left?
• Who is allowed to do what, with what, and together with whom?
• Who is doing what, and has done what, at which activity level?
• Who is allowed to follow the work as it progresses and to what
  degree?
• Who is following and has followed the work?
• Who is allowed to see the results?
• Who is viewing and has viewed the results?
If we build systems where answers to such questions can be found and
presented in useful ways, we have a chance to enhance social life, at the
risk of being controlled and restricted.
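To hint at what answering such questions takes, here is a minimal sketch that covers only one of them: who has joined, and who has joined and left. It assumes that the tool records every join and leave as an event. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class SessionLog:
    """An ordered record of (user, action) events, where action is 'join' or 'leave'."""
    events: List[Tuple[str, str]] = field(default_factory=list)

    def record(self, user: str, action: str) -> None:
        if action not in ("join", "leave"):
            raise ValueError("action must be 'join' or 'leave'")
        self.events.append((user, action))

    def present(self) -> Set[str]:
        """Who has joined and is still here?"""
        here: Set[str] = set()
        for user, action in self.events:
            if action == "join":
                here.add(user)
            else:
                here.discard(user)
        return here

    def joined_and_left(self) -> Set[str]:
        """Who has joined at some point but is no longer here?"""
        ever_joined = {user for user, action in self.events if action == "join"}
        return ever_joined - self.present()
```

Even this small log shows the other side of the coin: the same record that lets a group see who is around also lets someone else check when each person came and went.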
The figure below shows a part of a web page where the number of people
currently visiting the page is shown as gauges. A fuller gauge indicates
more visitors, and by using a logarithmic scale the gauge can be made
sensitive to few visitors and still be able to scale to a lot of visitors [DC].
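The effect of a logarithmic scale is easy to try out. The sketch below maps a visitor count to a fill level between 0 and 1. It is not taken from [DC]; the function name and the choice of 10 000 visitors as a full gauge are assumptions made for the illustration.

```python
import math


def gauge_level(visitors: int, max_visitors: int = 10_000) -> float:
    """Map a visitor count to a gauge fill level in [0, 1] on a logarithmic scale."""
    if visitors <= 0:
        return 0.0
    # log1p(n) = log(1 + n), so an empty page gives level 0 and the level
    # grows quickly for the first few visitors.
    level = math.log1p(visitors) / math.log1p(max_visitors)
    return min(level, 1.0)


if __name__ == "__main__":
    for n in (0, 1, 10, 100, 1_000, 10_000):
        print(n, round(gauge_level(n), 2))
```

On a linear scale a single visitor would fill one ten-thousandth of the gauge and be invisible. On the logarithmic scale the first visitor already moves the gauge noticeably, while the gauge still has room for thousands more.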
Much focus is on loss of privacy
using new technology, but what
can be gained? What social
quality of service is possible with
new advanced technology?
Discuss this for an elderly user,
with relatives living in another
part of the country.
Figure V.14.3 Visitor
activity shown by gauges.
Buddysync is another example where a designer has identified the needs
of the users in a cultural context. It is a mobile communication tool for kids
that presents communication mechanisms adapted to different social
environments [HS]. Close friends are always connected and their real-time
status is displayed. Larger communities of friends are accessible via
another, less intimate interface, and voice is used to communicate with
parents. The last functionality was probably added to motivate parents to
pay for the device. If we could automatically detect social context even
more intriguing applications would be possible. But, as noted many times
before, this is a most difficult task. Finding out types of relationships and
subject matter, and detecting subtle mood changes that affect social
relationships, is a real challenge.
In fact, the more we try to get a
system to act on our behalf,
especially in relation to other
people, the more we have to watch
every move it makes.
Victoria Bellotti
As the social aspects of life drip into digital life we will also see more of
the less flattering aspects of human social life: suspicion of strangers,
protectionism, and jealousy, just to name a few. These behaviours will
necessitate explicit mechanisms that enforce identity and accountability
for actions. The system needs to detect arrival, presence, departure,
activity and identity [VB]. All of them good old human capabilities.
What do you do if you
stumble over a personal
secret on the Internet?
Figure V.14.4 Friendship
choices among fourth graders
(adapted from Moreno, J. L.
(1934). Who Shall Survive?
Washington, DC: Nervous and
Mental Disease Publishing
Company).
Videophones, or stationary desktop video at workstations, raise ethical
as well as practical issues. We want to know who is watching us, and
why. Normally there should be an indication that someone is remotely
using a video camera, and this will be yet another distraction, adding to
the distraction from the sound of arriving e-mails plopping into the
mailbox. You should at least have the option to turn the video off, and the
question is whether you will ever turn it on again. The following four
issues, at least, have to be considered with respect to privacy [WM]:
• Control: Users want to control who can see or hear them at any
  time.
• Knowledge: Users want to know when somebody is in fact
  seeing or hearing them. They also need feedback on what is seen.
  If a recording is done and reused in another context the user
  should know about it.
• Intention: Users want to know the intention of the connection,
  i.e. whether the video is stored, or otherwise processed, and for
  what reasons.
• Intrusion: Users want to avoid connections that disturb their
  work.
Warning! You are currently
being watched!
A picture of you, naked, gets
distributed over the Internet.
What can you do to remove it?
On the other hand, we somehow manage to take the subway and go to the
restaurant without too many problems with privacy issues.
Whether you choose an audio-visual recording of your activities, audio
only, or none at all depends on where you are (bathroom, library),
what you are doing (giving a lecture, singing in the bathroom), the
relationship to the person who will have access to the recording
(everyone, your wife), what you are wearing (a smart suit, robe), if you
have already recorded a similar activity, if you are hungry, you want
to share your message with someone you care for, …
How can an application possibly infer all of these factors?
Victoria Bellotti.
V.14.5 The social-technical gap
It is not possible to fully represent, and manage, the same amount of social
information in a CSCW system as we do when socialising, without effort,
every moment, every day. We will illustrate this by following the
discussion on privacy preferences and P3P in [JC]. The idea with the
Internet P3P protocol (Platform for Privacy Preferences) is to create a
privacy standard for the web, allowing the information owner to have
detailed control over access. For P3P we are faced with a difficult user
interface problem. There are several million users, and if we assume a fine
information granularity, almost any user provides thousands, probably
millions of data items. How can we set the access rights for each item and
keep them updated as conditions change?
We need to group data and users to keep the complexity down, but as we
do this we lose control over details and will have to introduce numerous
exceptions. Also, the systems that we currently use are discrete and
precise, which means that we will not even have the possibility to
postpone the decision and stay ambiguous, which would be a typical
human social solution to a similar problem. We are faced with the
social-technical gap [JC].
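The trade-off between grouping and exceptions can be sketched as follows. This is not the P3P vocabulary or any real API; the class, the group names, and the numbers are hypothetical and only serve to show the shape of the problem.

```python
from typing import Dict, Set, Tuple


def individual_settings(viewers: int, items: int) -> int:
    """Number of yes/no decisions if every item is set separately for every viewer."""
    return viewers * items


class AccessPolicy:
    """Group-level rules plus per-person exceptions (a hypothetical sketch)."""

    def __init__(self) -> None:
        # (user group, data group) pairs that are allowed to read.
        self.group_rules: Set[Tuple[str, str]] = set()
        # (user, item) -> allowed, overriding the group rule.
        self.exceptions: Dict[Tuple[str, str], bool] = {}

    def allow_group(self, user_group: str, data_group: str) -> None:
        self.group_rules.add((user_group, data_group))

    def add_exception(self, user: str, item: str, allowed: bool) -> None:
        self.exceptions[(user, item)] = allowed

    def can_read(self, user: str, user_group: str, item: str, data_group: str) -> bool:
        # The answer is always a precise yes or no; there is no way to say
        # "let us wait and see", which is what a human might do.
        override = self.exceptions.get((user, item))
        if override is not None:
            return override
        return (user_group, data_group) in self.group_rules
```

One million possible viewers and a thousand items give a billion individual decisions for a single information owner, whereas a handful of group rules can be written in minutes. The price is that every case where a group rule does not fit has to be added by hand as an exception.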
Please call, but don’t
expect me to answer.
Figure V.14.5 “Lady Bountiful”,
popularity map where a large circle
indicates high popularity
(adapted from Lundberg, G. A., and
Steele, M. (1938)).
The problem is, it seems, impossible for a computerized system to solve
even if it is assisted by a human, but it is easily solved many times a day,
by any human, in a social context. The gap is a fundamental problem
originating from the complexity of life itself, as indicated in figure V.14.6
below.
Figure V.14.6 Complexity means flexibility and loss of control. The figure
relates flexibility to control for logic, fuzzy logic, neural networks, the
human brain, and human society.
One partial solution is to build really flexible systems that learn from, and
co-develop with, humans. The next generation, at least in Sweden, will have
spent more time with the computer at the age of 15 than the previous
generations will over their whole lifetime. The extent to which ICQ and
interactive networked games are used says something about the
importance of the next generation of computer support.
Maybe we should aim for the S in CSCW and try to find ways in which the
computer could augment rather than replace our social abilities.
Technology could provide strange augmented reality support such as
seeing through walls, or maybe technology for automatic blood pressure
analysis of people walking by.
V.15 Command based interaction, someone in control
In many cases of interactions one of the interactors is in control, and the
other interactors are controlled. Interaction in other words is asymmetric,
or command based, and control is centralised. This type of interaction is
the rule rather than the exception, whenever technology and a human user
are involved, at the current state of technology. Human-computer
interaction in Windows® is a perfect example.
The good thing with this asymmetry, seen from the interaction designer's
point of view, is that centralized control simplifies interaction. We only
allow objectives from one participant to influence the interaction, and
usability goals are easy to formulate, such as that interaction should be
efficient for the controlling user.
Centralised control does not necessarily mean that the same interactor is in
control all of the time. There are techniques for rule-based distribution of
the control among participants over time, such that each participant gets
its control slot. One way to do this is to pass a token around, and whoever
has the token is in command. Another way is to assign predefined cyclic
time slots. A board of directors is assigned a time slot of a year.
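Both ways of distributing control can be sketched in a few lines. The names below are hypothetical, and the sketch ignores practical problems such as a participant who never lets go of the token.

```python
from collections import deque
from typing import Deque, Iterable, Sequence


class TokenRing:
    """Whoever holds the token is in command; the token is passed around in a fixed order."""

    def __init__(self, participants: Iterable[str]) -> None:
        self._ring: Deque[str] = deque(participants)
        if not self._ring:
            raise ValueError("at least one participant is needed")

    @property
    def in_command(self) -> str:
        return self._ring[0]

    def pass_token(self) -> str:
        """Hand the token to the next participant and return the new holder."""
        self._ring.rotate(-1)
        return self.in_command


def slot_owner(participants: Sequence[str], slot_length: float, t: float) -> str:
    """Predefined cyclic time slots: who is in control at time t (t >= 0)?"""
    return participants[int(t // slot_length) % len(participants)]
```

With a token the holder decides when to pass control on, so the slots can have different lengths. With predefined time slots nobody has to decide anything, but a participant with nothing to do still occupies a slot.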
V.15.1 Mechanisms
One possible choice of taxonomy for commands is identification,
navigation, choice, manipulation, read, write, and system control. We
start with identification, since it is a prerequisite for any command and
used to identify representations of objects, behaviour, relationships, states,
transformations, familiar spatial or temporal settings, faces, sounds,
smells, or anything else that describes participants, or the current
circumstances. Matching representations through recognition and
identification is a cornerstone for generating knowledge. We will use
identification in a broad sense here including discrimination,
segmentation, and classification. Properly speaking, identification, i.e. “I
can see that it is you”, is not the same as discrimination, i.e. “I can see it's
not you”, segmentation, i.e. “I separate hair from face”, or classification,
i.e. “That is a typical nose”.
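The difference between identification, discrimination, and classification can be made concrete with a small Python sketch. It assumes that each representation has already been reduced to a vector of numbers, and that matching means comparing distances. The function names, the feature vectors, and the tolerance are all assumptions made for the illustration. Segmentation is left out, since it works on the raw signal rather than on finished feature vectors.

```python
import math


def distance(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(sample, known, tolerance):
    """Identification: 'I can see that it is you'.

    Returns the closest known name, or None if nobody is within tolerance.
    """
    name = min(known, key=lambda n: distance(sample, known[n]))
    return name if distance(sample, known[name]) <= tolerance else None


def discriminate(sample, template, tolerance):
    """Discrimination: 'I can see it's not you'. True if the sample differs."""
    return distance(sample, template) > tolerance


def classify(sample, prototypes):
    """Classification: 'That is a typical nose'. Nearest class prototype."""
    return min(prototypes, key=lambda c: distance(sample, prototypes[c]))
```

Note that identification can fail (nobody is close enough), while classification as written here always answers with the least bad class.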
I will set sail
chose my new course
restart with ctrl
/HG
Navigation allows the interactor to explore a local environment and to
follow a path through system representations. By exploration the user gets
to know the environment, and how things are related. She then positions
herself in the proper context, e.g. moves to an advantageous viewpoint or
position. In a CAD program for architects, navigation involves selecting
the blueprint of the right floor. Navigation also implies an intention and a
target. The target can be indicated by pointing devices, voice detection,
gestures, head tracking, or gaze tracking techniques. Furthermore,
navigation needs some means to modify the current position and speed,
by specifying direction, speed, or acceleration, and not to forget, some way
to indicate stopping.
Choice is a selection among alternatives using sound, gesture, button,
menu, dialog box, or any other effective method at hand. Manipulation
modifies the system. It involves modification of object representations,
changing system state, or self. One way to accomplish this is through
direct manipulation. Read extracts system state representations and write
is a command that changes system representations. Write provides system
input that will change the system state. Sequences of read and write
correspond to manipulation.
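The claim that manipulation corresponds to a sequence of read and write can be shown with a toy system in Python. The class System and the function manipulate are hypothetical names chosen for this sketch, and the system state is assumed to be a simple dictionary.

```python
class System:
    """A toy system whose state is a dictionary of object representations."""

    def __init__(self, state):
        self._state = dict(state)

    def read(self, key):
        """Read: extract a state representation without changing it."""
        return self._state[key]

    def write(self, key, value):
        """Write: provide input that changes the system state."""
        self._state[key] = value


def manipulate(system, key, change):
    """Manipulation expressed as a read followed by a write."""
    new_value = change(system.read(key))
    system.write(key, new_value)
    return new_value
```

Turning up the volume by two steps is then manipulate(stereo, "volume", lambda v: v + 2), which hides one read and one write from the user.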
The last interaction type is system control, manipulating the prerequisites
for system behaviour, e.g. change environment from Earth to Mars, or
load a new operating system. Starting and stopping the system, loading
new modules, and saving system states are some other examples of
system control tasks, which are in turn implemented by other interactions,
such as choice or manipulation. System control is not available for
everyone in all situations. Remember that terribly embarrassing situation?
Was undo an option? Reload from a previously saved state?
As you can see from above the interaction types are partly overlapping,
identification for instance implies a read operation. They are also
somewhat arbitrarily chosen. Navigation could be replaced by search
(combined with identification and manipulation), and nothing is wrong
with a command such as “track”.
There are many other taxonomies possible for interactions, at different
abstraction levels, and for different purposes. One of them, more related to
information use, is: creation, gathering, processing, retrieval and
communication. These commands could all be expressed by different
combinations of read, write, navigation and choice, but they are closer to
the user's intent and internal processing. Create can for instance be
expressed as a write, but the concepts of create and write are certainly not
equivalent. There are in fact an unlimited number of taxonomies possible;
every verb in Webster's dictionary could under the right circumstances be
used as a command.
[Figure: diagram with the labels Select direction, Effectuate, Evaluate new position, Feedback, Controlled interactor, Read(), Display representation, STOP, I, and T.]
Selection – choose object from alternatives
Position – specify a position within a range, e.g. pick a screen coordinate
Orient – angle or 3D orientation
Path – series of positions or orientations
Quantify – specify numeric value
Text – entry of symbolic data
Foley, Wallace and Chan, “The Human Factors of Computer Graphics Interaction Techniques”, 1984
Hello, and thanks for calling. Your call is very important
to us and, we're sure, to all of humankind.
If you would like to challenge my sincerity, press 1.
To report a discrepancy between the way you planned
your life and the way it's turning out, press 86.
If you wish to end this call or return to the main menu,
do not press your luck.
You are not going back to any main menu, my friend.
You have come too far. There is no turning back. You can
only press on.
Internet
V.15.2 Intelligent support
The question of whether the user interface should have a life of its own is
raging. The main argument against is that users want to stay in control,
and that intelligence at the other end distracts from the task at hand. The
proponents argue that as the amount of information grows humans need
support to cope. Both sides are of course right (as usual). The trick is to
figure out under what circumstances support is needed, and when it will
not degrade usability too much. Forcing the driver to discuss the next turn
with the car, or even to beg it to turn left, is maybe not a good design
alternative.
Historically the ideal has been transparency. You, the user, should spend
your time on the task, not on administrating the tool. At the extreme we
will not attend to the user interface at all. A professional Formula 1 driver
does not have the time to worry about finding the brake. The driver's
hands should never leave the steering wheel. This transparency is
challenged by context dependency. A common example is a wireless
network that at one time gives very good performance, but a moment later
completely fails to deliver enough capacity. If information about the
network capacity is hidden, i.e. full transparency, the user will only see the
resulting strange application behaviour and will soon shut the application
down. On the other extreme we can give the user all information and tools
to adjust the application to the current network capacity. Once again we
risk losing the user; this time the user closes the application out of sheer
exhaustion after having tried to keep up with the changes in the network.
The solution seems to be to give the system enough self and context
awareness to manage minor network fluctuations.
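One way to picture such awareness is a monitor that smooths the measured capacity and only speaks up when the change is large. The Python sketch below is an assumption about how this could be done, not a description of any existing product. The class name CapacityMonitor, the smoothing weight, and the 50 % reporting threshold are all invented for the example.

```python
class CapacityMonitor:
    """Absorbs minor capacity fluctuations and reports only major changes."""

    def __init__(self, initial_kbps, smoothing=0.3, report_ratio=0.5):
        self.estimate = float(initial_kbps)  # smoothed capacity estimate
        self.reported = float(initial_kbps)  # level the user last heard about
        self.smoothing = smoothing           # weight given to each new sample
        self.report_ratio = report_ratio     # relative change worth reporting

    def observe(self, measured_kbps):
        """Update the estimate; return a message only for a major change."""
        self.estimate += self.smoothing * (measured_kbps - self.estimate)
        baseline = max(self.reported, 1e-9)  # guard against division by zero
        change = abs(self.estimate - self.reported) / baseline
        if change < self.report_ratio:
            return None  # handled silently, the interface stays transparent
        self.reported = self.estimate
        return f"Network capacity is now about {self.estimate:.0f} kbit/s"
```

Small swings around the current level return None, so the application can adapt quietly. Only a sustained drop or rise reaches the user, which is a compromise between full transparency and full disclosure.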
Mixed-initiative systems are compromises that interweave direct control
and automation. The dialogue in such a system behaves similarly to a
human conversation. By turn-taking the interactor with the most urgent
task takes control, but this is more complicated than it sounds. In a H-T
setting the Thing has to evaluate the relative importance of its own task
against the task it guesses that the human is solving. It also has to make
social considerations. How many times has the human been disturbed?
Does he seem annoyed? Is he interested in the result of the task that the
thing is about to perform? Does this particular human understand the
result?
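The trade-off the Thing has to make can be sketched as a simple comparison in Python. Everything in the sketch is an assumption made for illustration: the fields of HumanState, the scale from 0 to 1, and the penalty weights are invented, and a real system would have to estimate them from observation.

```python
from dataclasses import dataclass


@dataclass
class HumanState:
    task_urgency: float         # the Thing's guess, 0 (idle) to 1 (critical)
    interruptions: int          # how many times the human has been disturbed
    seems_annoyed: bool
    interested_in_result: bool


def thing_takes_turn(own_urgency, human, penalty_per_interruption=0.1):
    """Return True if the Thing should take the initiative now."""
    # Social considerations lower the effective urgency of the Thing's task.
    social_cost = penalty_per_interruption * human.interruptions
    if human.seems_annoyed:
        social_cost += 0.3
    if not human.interested_in_result:
        social_cost += 0.2
    return own_urgency - social_cost > human.task_urgency
```

A moderately urgent task gets through to a calm user, but not to one who has already been disturbed twice and looks annoyed. A truly critical task, such as a fire alarm, still wins.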
One important aspect of intelligent tools is that we will be reluctant to use
them if we do not trust them. Will you trust a web agent with your secrets
if you do not know how it works? Maybe it packetises your secrets and
sends them to Microsoft? Intelligent support on the other hand can be
used to enhance experiences and surprise you. You come home Friday
evening and your house announces an Italian weekend. The rest of the
weekend the intelligent thermostat matches the temperature in the house
to the temperature sensed at a specific piazza in Naples.
Adaptability was listed as one of the characteristics of intelligence in
Chapter III.1. It can be provided for at many levels. At a physical level this
could mean adjusting lighting for the task at hand, or automatically tilting
the driver seat of a car. At a conceptual level adaptation could involve
changing the user interface to match age or expertise. Adapting at an
intentional level was discussed above, and is even more complicated.
Most adaptations could benefit from learning about the user and the
environment without explicitly requesting information, and the more the
system learns by itself the more magical it will seem. Another criterion
Systems that assume that the user
has infinite attention, complete
knowledge of the system, and
infinite patience quickly become
annoying.
E. Horowitz
"Anything that happens, happens"
"Anything, that in happening,
causes something else to happen,
causes something else to happen."
"It doesn't necessarily do it in
chronological order, though."
From a story by Douglas Adams
Had to call the SmartHouse people
yesterday about bandwidth problems.
The tv drops to about 2
frames/second when I’m talking on
the phone. Today, the kitchen
CRASHED. Freak event. As I opened
the refrigerator door, the light bulb
blew. Immediately, everything else
electrical shut down -- lights,
microwave, coffee maker -- everything. Carefully unplugged and
replugged all the appliances.
Nothing. The police are not happy.
Our house keeps calling them for
help.
From the NET
for intelligence was the ability to solve problems, especially problems in
the real world. To achieve this we have to tackle the issues discussed in
Part IV, i.e. reasoning, planning, and decision making, and also the even
more fundamental problem of how to select and represent enough of the
relevant aspects of our environment.
There are many interactions where we really need some limited level of
intelligence in all of the interactors. Imagine that you want to move a
heavy patient from one bed to another in a hospital ward, and you are
assigned two robots to assist you. For you to manoeuvre every movement
of the robots would be very complicated. An alternative is to build limited
intelligence into the robots such that they understand what they are
supposed to do, and how they should co-operate to do it under your
command. The problem here comes with an extra twist because we do not
want the patient to get hurt (but without support you are the one who gets
hurt).
If we give up the goal of transparency we can instead view the application
as a collaborator. We give this collaborator advice on how to behave and
it learns both from the advice we give, and from how we give it. We
sometimes will have to show it how to do things, and correct it when it is
doing too much, or has done the wrong thing. If this is too annoying the
system will not be used.
Stupid: Slow to learn or understand;
obtuse. Tending to make poor
decisions or careless mistakes.
Pssst..
He’s awake
06.58
V.15.2.1 Social support
Should you present a computer as a person? In reference [BS1] the author
states that you should not give the computer human characteristics.
Rather than the computer saying “I am waiting for your input”, the
phrasing should be “Waiting for input”. Other studies show that as
human beings we tend to treat anything that we communicate with as
human beings. “Stupid machine” refers to the interface that we do not
like, rather than to the designer of the interface.
This behaviour is an example of how humans unconsciously map
patterns from their own everyday experience onto the environment. One
reason for this is that it lowers complexity since patterns can be reused.
When it comes to computer applications this behaviour could be exploited
to build more efficient services, or it could be ignored with costly
consequences.
At a perceptual level we easily recognise, or think that we recognise, a
human face, such as the man in the moon. Our attention is triggered by
things moving as autonomous entities, such as autumn leaves whirling in
the wind, behaving as if they were alive. At a cognitive, social level, we
reason about the mental state of an entity, its drives, emotions, and also
attribute pain to it. We even assign personality clusters to a system, i.e.
groups of traits (lazy, optimistic, perfectionist), or social roles (father,
teacher, lover) [PP].
Happiness is not to be rich (even if it helps), but to have a wife or husband,
friends, and challenging tasks to perform. How can we use this for more
intelligent user interfaces? Can we make the interface a friend of the user?
Do we then risk being exposed as cheaters, and that the user will end up
hating the interface instead? We are now getting more and more used to
apparently intelligent, humanlike interactors and interactions. Will we
accept less than humanlike behaviour from the next generation of tools?
A good strategy to use when designing someone's friend is to make the
interface you have designed (and yourself) irreplaceable. This is possible
if you have expertise, or abilities, that the user is in desperate need of. A
virus scanner should provide a great opportunity for this type of
friendship. Another strategy would be to associate the user interface with
something, or someone, that has the same objectives as the user. A third
strategy is to include functionality or characteristics matching the user's
abilities or preferences. A user with a preference for symbolic
manipulation can be given a symbolic, text based interface, whereas a
more spatially oriented user rather should be presented with a graphical
user interface. Cultural differences can be used in a similar way. A yellow
and blue computer for a Swedish user? One last variation of how to
acquire friendship is a symbiosis where both the user and the agent
benefit from the friendship. Napster was one example, users added music,
Napster prospered, and new users registered because of the accumulated
amount of music. From H-H interaction we can list some criteria that
indicate politeness in an interface from a user's point of view [ACR]. The
interface should be:
Possible to identify as an individual.
Interested in you.
Respectful to you, e.g. moderates its pace to yours.
Responsive to you. A car that talks but does not listen?
Anticipate your needs.
Taciturn about its personal problems!
Well informed and perceptive.
Self-confident.
Stay focused.
Give instant gratification.
Trustworthy, e.g. not share confidences without your
permission.
Have common sense. A computer that plays a fanfare
every time you re-boot it?
“The enemy of my enemy
is my friend”
Mao Tse-tung
Let us say that you are the owner
of an enormous record company.
Your objective is to make sure that
no one can live without music.
MP3 and the Internet are the
perfect guarantee for this to
happen. The challenge is to avoid
losing too much money until the
new behavioural patterns have
evolved.
You've got a friend in me.
You've got a friend in me.
When the road looks rough ahead,
And you're miles and miles
from your nice warm bed.
You just remember what your old pal said.
Boy, you've got a friend in me.
Yeah, you've got a friend in me.
Randy Newman
To have an interface that checks all of the items in the list is impossible
(what a friend!). Some of the criteria are not difficult for a computer to
achieve, such as keeping focused, but many others are currently out of
scope, such as having common sense. The list is good to keep in mind
when discussing the social bandwidth possible for computerised tools
within the next 10 years. Implementing items in the list also depends on
very important lower level characteristics such as timing.
A basic drive for what we do comes from our emotions. They are,
according to some researchers, in turn the means for our genes to make
sure that we enjoy life, and reproduce. We want to feel at ease, content,
pleased with ourselves, and happy. Achieving this involves pursuing subgoals, problem solving, and planning. All actions are selected and
executed under emotional control, even if we ourselves claim only logical
reasoning and rational behaviours. From this it seems that a user interface
ultimately should take emotions into account.
Signs of emotions: facial
expression, intonation, gestures,
gait, posture, pupillary dilation,
respiration, heart rate, body
temperature, electrodermal
response, perspiration, muscle
action potential, blood pressure.
Picard
Some of the feelings to consider in the user interface are anger, gratitude,
sympathy, liking, shame and guilt. They can all be discussed in a
framework of favours. If you grant someone a favour then you are more
likely to be liked. If you on the other hand exploit favours given, but never
even intend to return them, the person giving the favours will be angry
with you if the swindle is discovered. Guilt is if you accept favours, never
intend to return them, and suspect that you will be exposed. It is not hard
to imagine how the user interface can be used to exchange favours and
that the resulting feelings can be exploited by the user interface with the
intention to make the user happy (and the developer rich). But, can we
make the user trust the interface?
We can take the discussion one step further; imagine falling in love with a
computer! If we accept an evolutionary view of human social behaviour
an average woman will look for stability, sincerity, wealth, status, or
ambition that in turn will result in wealth. These are qualities that for ages
have kept the family protected and well fed. The man on the other hand
will look for faithfulness, to ensure fathership, and signs of fertility,
indicated by age and beauty.
Never trust a pretty interface.
75% of users swear at
their computers.
Picard
V.15.2.2 Persuasive support
To start with we evolved out of the African savannah with semi-open
views (seeing without being seen), green surroundings, flowers, visible
horizon, landmarks such as big stones, trees for frame of reference, and
multiple escape paths. Moods depend on the surroundings and a familiar,
pleasant, environment will consequently help the user to relax and to do
better work.
Things or applications that take the user's attitude into account and
attempt to change the behaviour, the worldview, or the attitudes of the
user, are called persuasive, or seductive [BF]. These features will be ever
more important in an aggressive market where individualised services are
increasingly important. Persuasion implies intent from the persuader or
the seducer and, if the persuader is a thing, a designer must implant this
behaviour.
Computerised technology can function either as a tool, a medium, or as a
social actor. As a tool the thing, or information, could persuade by
providing the user with new capabilities that could enhance self-confidence, or change behaviour. One example given in [BF] is a device
that gives information about the heart rate when exercising. This
information can be used to set the right training pace. The same reference
also suggests the following taxonomy for persuasive tools:
“Orthodontists have found that
a good-looking face has teeth
and jaws in the optimal
placement for chewing”
[SP]
Reduction, persuasion by simplification. “Buy by simply
pressing this button once”
Tunnelling, give the user a clear, easy to use, path. “To
continue press that button”
Tailoring, adapt to the current user. “Hey good looking,
press this button, especially made for you”
Suggestion, provide information and request information at
the right time and place. “You will win if you press that
button within one minute”
Self-monitoring, feedback persuades to adapt behaviour.
“My wearable tells me that my heart rate increases when I
reach for the button”
Surveillance, change behaviour to match perceived state.
“Ha, he is pushing the button. Run!”
Conditioning, positive reinforcement. “You won, and will
continue to win if you keep pressing that button”
Applications using the thing or information as a medium for persuasion
are easy to find. By providing experiences any commercial on a television
show tries to change our behaviour. Persuasive computerised social
actors that create relationships are not yet commonplace, but toy pets are
getting more and more intelligent. Some of them even expect the child to
take care of them.
A seductive experience starts by attracting the attention of the user. To
hold the attention the experience next makes a promise. This promise is
what keeps the interaction alive which means that it has to be matched
against user aspirations, emotional or other. The experience ends by
fulfilling the promise, but could be kept alive for a long time by partially
fulfilling promises. A flirt between a boy and a girl usually involves such
partial fulfilments and a soap opera on television uses it to perfection.
Some clues to a seductive experience are [JK]:
It diverts your attention.
It surprises you.
It creates an instinctive emotional response.
It gives promises that matter.
It fulfils some of these promises.
It unexpectedly gives deeper understanding.
It unexpectedly provides more than expected, i.e. it goes
beyond expectations, indirectly exposing a devoted
designer.
There is a considerable risk in adding social elements to your design. If this is
done the wrong way, or with bad timing, the interface will be perceived as
annoying and irritating. We are very experienced social beings and one
example is that we do not like to chat when we need to be efficient, i.e.
getting the job done is not a social event [BF].
V.15.2.3 Narrative support
As humans we expect our fellow interactors to have a narrative
intelligence. This means that they should remember the interaction, and
have a local model of what has happened in terms of human interaction. If
computers could tell stories, have a personal history, and recognize the
narrative structure of other interactors we would be much more at ease
with using them (Chrystopher Nehaniv as cited in [CH]). A computer is
designed and programmed to change behaviour under pre-defined
circumstances. This means that it, from the point of view of a human user,
suddenly might change personality, which is quite annoying.
Narration is a good example of interaction because the narrator is trying to
create a state of mind in the listener. Originally, story telling was H-H. It
was extended to H-T-I-T-H when printing was invented and now we enter
the time where I-T-H is possible, i.e. the story can be told (and generated)
by a program. We do not yet know enough about story telling, and do not
have the right tools, but there is nothing in principle to keep us, or rather
information, from doing it. Do you agree? Further down the road are
listeners from the species of thing.
In thunder, lightning, or in rain?
Second line of Macbeth by
Shakespeare (my emphasis).
If feedback is available the response from the listener can be used to
change the course of events in the story, or the way the story is presented.
This will make it possible to create totally new types of experiences. Any
object in a story could be activated and reveal new trajectories in the story
space [DABB].
V.15.2.4 Calm technology
Always having to think about how to use the computer, or any other
service, is tiresome; recall driving a car for the first time (or reading your
first page of text). Every interaction initially has to be consciously thought
through and learned. We will reduce this cognitive workload if the car
finds out, and performs, as many actions as it can, by itself. If the road is
blocked, the car should stop. Calm technology tries to
accomplish this by using ubiquitous computing where computation most
of the time is performed in the periphery of the user's attention and
sometimes directly attended to by the user [MW2]. This approach is also
called foreground/background computing and the basic idea is that
technology should inform the user without demanding full attention all of
the time. When the user decides to pay attention, maybe triggered by
some unexpected event, supportive technology is brought into focus. An
illustrative example, where we still have not found a calm solution, is e-mail.
Each arriving e-mail interrupts, but is this the right way to do e-mail?
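As a thought experiment, a calmer e-mail client could keep most arriving messages in the periphery and interrupt only for a few chosen senders. The Python sketch below shows the idea under these assumptions. The class CalmInbox and the notion of a fixed set of urgent senders are invented for the example and are not taken from [MW2].

```python
class CalmInbox:
    """Keeps arriving messages in the periphery unless they are urgent."""

    def __init__(self, urgent_senders=()):
        self.urgent_senders = set(urgent_senders)
        self.periphery = []  # messages waiting quietly in the background

    def arrive(self, sender, subject):
        """Return 'interrupt' for urgent mail, otherwise queue it silently."""
        if sender in self.urgent_senders:
            return "interrupt"
        self.periphery.append((sender, subject))
        return "background"

    def status(self):
        """A peripheral cue, e.g. a small counter at the edge of the screen."""
        return f"{len(self.periphery)} unread"

    def attend(self):
        """The user decides to pay attention: bring the queue into focus."""
        focused, self.periphery = self.periphery, []
        return focused
```

The counter returned by status() informs without demanding attention, and attend() moves the messages from background to foreground when the user chooses to look.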
It seems contradictory to say, in the
face of frequent complaints about
information overload, that more
information could be encalming. It
seems almost nonsensical to say that
the way to become attuned to more
information is to attend to it less.
[MW]
The following table adapted from [WB] divides the possible services into
foreground and background applications. A foreground application is
one where your attention is directed toward the application. Currently
most applications are of this type, but as improved infrastructures
gradually emerge the number of background applications will increase.
| | Human-Human | Human-Thing | Human-Information |
| Foreground/Focused | Telephone call. | The timer on your oven. | Selecting a link on a web page. |
| Background/Peripheral | Context enabled mobile phone, e.g. one which could say “User is asleep”. | Smart house that turns down the heat at night. | Advertisements on a web page. |
Table V.15.1 Foreground or background applications.
Applications will not restrict themselves to one cell only in the table
above, and they will adapt both to the situation and to the user [WB]. An
advertisement that is selected is suddenly in the foreground. Your mobile
phone displays the timer of your oven set by your wife, and you reset the
timer via a web page after discussing the dinner menu with your wife
over the phone.
With increased information density and demands on productivity it is
natural for a user to have access to multiple sources of information, and to
perform parallel tasks. This raises the question of how to direct or
schedule attention. News is broadcast at the same time every night,
and screen real estate is allocated to scrolling news updates on the CNN news
channel. The problem is to find a balance between distracting the user and
providing the wanted service. This is a design challenge and the (useful)
clock on your computer screen shows that solutions can be found.
Hurry up dad,
we seem to be
late!
Another indirect, peripheral, background application was explored in the
following experiment [JM1]. A large screen was placed in a workplace
hallway. The hallway sensed the identity of people passing by and
adjusted the content shown to them on the display. This gave interesting
emergent phenomena, as people knew that they affected the display. One
of the problems for the researchers was how to collect information about
people without raising security and integrity issues.
V.15.2.5 Slow technology
Slow technology [LH] is another way to use technology in context. Here
the focus is how to provide technology also for reflection and mental rest.
One example is an electronic doorbell, which not only signals that
someone is at the door, but at the same time sends other messages. Each
signal from the doorbell could give an additional clue to a full, secret,
doorbell message. This is slow technology that makes us stop and reflect.
Technology that deliberately consumes time rather than saving it.
So, instead of placing the activity at the periphery of the user's perception
as in calm technology, slow technology steals, and highlights, a moment
or two. This can be done in many ways [LH]:
It takes time to identify what is happening.
It takes time to learn and get accustomed to it.
It takes time to understand why it works the way it works.
It takes time to apply it.
It takes time to find out the consequences of using it.
Usually we aim for fast technology, meant to increase productivity,
inverting all the statements above, but in slow technology we try to make
good out of bad by using the extra time spent by the application for
reflection, creating new thoughts in the moments gained. The time spent
in interactions could be days, weeks, or even years. Another example, also
from [LH], is soniture. This is furniture and physical environments
designed with add-on sounds, creating an additional environmental
dimension, e.g. a floor with its own audible interpretation of the steps it
feels.
Just a button
Golden button
Time perspective changes
from just encompassing the
moment of explicit use
to the longer periods of time
associated with dwelling.
[LH]
[Figure: clock face. Alarm clock with an acoustic E-tone chime that sounds only once.]
V.15.3 Identification
How do we recognize a fellow man? You, or a thing, have numerous ways
to do it: voice, of course, face and ear features, thermograms that show
body heat radiation, visual texture of iris, gait, keystroke dynamics, DNA,
body odour, signature, retinal vasculature, fingerprints, and hand
geometry are perhaps the first few examples that come to mind. Things
also have other options. They can sense physical properties of a thing
(weight, shape, colour, size), or affix a property to a thing, i.e. a tag (bar
code, radio frequency identifier tag, doctor's coat, registration plate for a
car). A car equipped with a tag could serve as a link to the driver's web
page.
Detecting presence is easier than identifying because of the lower level of
resolution needed. We can use many of the techniques discussed above,
but also temperature, and shadows. A burglar alarm is the typical
example. Technology for identification and detection of presence should
fulfil some constraints such as robustness against noise, and power failure.
Also, the solution should not disturb the wearer. The technology
discussed above is already in use. Car keys are for instance used to find a
car on a big parking lot.
Researchers are investigating
smart floors. How do you think
they will manage to separate two
individuals walking over a floor?
What characteristics will be
used?
Every person has a
unique tongue print.
Figure V.15.4 Human input
and output devices.
Is it a bird, a plane? NO, it is
Superman!
Some of a human being's input channels also work as output channels.
Eyes tell stories, and hands that are used for gestures are also the tools for
sensing reality. The mouth is mostly output, but in intimate situations,
and for survival it also works as input. Successful interaction demands
identification not only of humans, but also of the messages sent by
humans. One type of a message is the sentence, which is composed of
words where each word is itself a sequence of phonemes, most of which
have to be recognized to identify the message.
Personality can be classified into five
different scales (the big five):
Openness (to the unfamiliar),
Conscientiousness (goal directed),
Extraversion (capacity and need for
stimulation), Agreeableness (ranging
from compassion to antagonism),
and Neuroticism (mental stability)
McCrae and Costa
We can put ourselves in another person's place. This is quite a feat when
you think about it, and very useful when we co-operate. In our minds we
build a model of her and execute this model to find out what the other
person is thinking, feeling, focusing on, or believing. Using this model we
can try to figure out what she is up to. As we analyse our fellow humans
we find ourselves in a nice recursive information web, or even in a mess of
deception. People know that we scan for information about them,
information that they might not want to give away, or would like to
control. A poker game is an extreme example. If you look worried your
opponent will use it. You can use this fact by simulating worries, but your
opponent might guess that you are simulating, and if you know that he
knows that you are simulating ….
Not only do we study other participants. We also study ourselves, and
our own behaviour. As we do this we sometimes even manage to fool
ourselves. Consciously, for example when you convince yourself in the
morning, when the alarm clock rings, that you are not tired at all, or
subconsciously, when a happy tune on the clock radio starts you
whistling, which cheers you up.
For many interactions, pinning down the physical positioning of an
interactor is a problem. This is not so for direct interaction where two
people are at the same place, face-to-face. But, when technology steps in
the physical position is not as easily shared. To socially position someone
is more difficult, regardless of if interactors are at the same location. By
social positioning we mean establishing all that characterises a human
being, in a social context. Is he, or she, happy, sleepy, at home, at ease,
interested? One important social process is that we place ourselves into
social hierarchies that affect our behaviour. Most of us do not yell at our
boss, but sometimes at a child that does not want to do the dishes. Almost
as important as establishing identity or social position, of yourself or
someone else, is to identify relationships between people. Relationships
can change over time, and sometimes quite fast, for instance when you tell
your wife that you forgot to pick up your daughter after school. For
efficient interaction we also have to identify social conventions. One
example is that in some countries a man should not address a woman in
public.
The identifications discussed above are, as almost any other human
interaction described, built on pattern matching. In general, to identify
something a characterization is needed which is not always easy to
specify. The figure below shows three trajectories indicating intentionality.
Can you identify which of them best illustrates fighting? Playing?
Courting?
Figure V.15.5 Three trajectories
showing movements of two
participants in an interaction
[PT].
Patterns are found, designed, used, established, and generalised.
Generalised too much, patterns lose their meaning. Maybe this
paragraph did just that?
Interaction implies communication. Direct communication peer-to-peer
means that we have to identify the receiver, and the receiver is usually
also interested in who sent the message. Identification means either direct
physical contact, or naming, and for I-I interaction physical contact is not
even an option.
A name can be provided in many different ways. Some are born with it,
as a network card is. It has a unique label imprinted in hardware. Some
are given a name by someone else; for humans, parents usually accept this
responsibility. Some have to ask someone else, which is done on the
Internet when a networked station asks a server node for a temporary
address. Some systems select an address. You do this when you buy a
new mobile telephone and choose a number that is easy to remember.
And, as a last resort, a name can be randomly generated.
When everything is connected to everything else in the next generation of
networked systems, unique identifiers will be very important, and luckily
it is surprisingly simple to create a unique identifier. All you have to do is
to combine a place and a time. Microsoft ® for instance, does this when a
unique id for a new software component is created that can be used
worldwide. The place is provided by the number of the network card on
the local machine. Combining two independent name-spaces into one
unique space is also quite easy. We create a hierarchical naming scheme
by adding a unique number to each of the original numbers. This is for
instance done with the telephone numbers, but could also be used to
create a combined number space for Internet addresses and ISBN
numbers. We simply prefix the Internet address with one unique number
and the ISBN number with another, and the two types of numbers can never
be confused.
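The place-and-time idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standard library function uuid.uuid1() builds an identifier from a host identifier (normally the network card address) and a timestamp, which resembles the scheme described above, while the helper names, the prefixes "1" and "2", and the example address and ISBN-like string are placeholders invented for the sketch.

```python
import uuid


def place_time_id() -> uuid.UUID:
    """Return an identifier that combines a place (host id) with a time."""
    # uuid1() mixes the host identifier with the current timestamp.
    return uuid.uuid1()


def qualify(namespace_prefix: str, local_name: str) -> str:
    """Merge independent name-spaces by adding a unique prefix to each."""
    return f"{namespace_prefix}:{local_name}"


first = place_time_id()
second = place_time_id()

# Hypothetical prefixes: "1" for Internet addresses, "2" for ISBN numbers.
address_name = qualify("1", "192.0.2.7")      # documentation-only address
isbn_name = qualify("2", "0-00-000000-0")     # placeholder, not a real ISBN
same_digits_a = qualify("1", "12345")
same_digits_b = qualify("2", "12345")
```

Even when the local names happen to be identical, as in the last two lines, the prefixed names differ, which is the whole point of the hierarchical scheme.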
The naming system should be consistent with the topology of the
represented world to support easy delivery of messages to the receiver.
This, for instance, helps the mailman to find the right street address, and
the hierarchical phone numbering scheme to route your telephone call to
the right country, and to the right part of that country.
One example of an H-I interaction is a coffee machine that recognises you,
has already filled up your cup, and greets you grinning with a cheerful
"Good morning" – maybe even smart enough to skip the "Good morning"
some mornings?
To implement this, the coffee machine has to be enhanced with machine
vision for face identification, and also for finding out your mood. Two
quite difficult, some would say impossible, feats. In H-H interaction we do
face and mood detection effortlessly, even though there is evidence that
face identification is treated separately from other object recognition
indicating that the problem is both difficult and important [SP]. The
problem for a thing is not how to access digital information, which can be
done through networking. The problem is rather how to make sense of the
physical environment using sensors. For this, identification is a
fundamental problem. Biomedia is any information that a computer
extracts from a human being and this information can be used for
biometrics, i.e. measuring human characteristics for identification.
Performance is one obvious problem, but not the only one to consider.
What if someone does not want to be recognised? Should we do it
anyway? Also, what if we make mistakes? Will users accept a system that
fails to deliver now and then? When we combine identification with
location awareness and networking, theft will become a high tech
business, and security a major issue.
Me Tarzan you Jane
Any sufficiently advanced
technology is indistinguishable
from magic.
Arthur C Clarke
The chief mechanisms used in
identification are:
To point at it ... primary, physical ID
To label it with a "word" ... secondary,
representational method
Draw a picture of it (another
representational method - graphical)
We may also separate out from Set A
(all things) a Subset B (a category of
things).
By itself identification is a
rather useless action
If we follow the relation H-T the other way, what can a human read from
the exterior of a thing? It is designed, so in principle anything can be
expressed! People are used to interpreting shapes; an average adult can name
about 10,000 things. So, by exploiting this ability in H-T interaction we can
simplify interaction, and enhance experience.
To identify is to separate a
thing or action from the set of
all other things and actions.
Without a human in the loop we still have many new, and potentially
important, applications. Some T-I interaction applications where
identification is necessary are:
Customize device behaviour based on identification of context.
An intelligent camera should send the picture taken to an
available hard disc.
Customize physical environment based on recognized context. If
a user starts reading a book the physical context adapts lighting
and the telephone leaves a do-not-disturb-if-it-is-not-very-important message.
Recognition by bottom up visual search and data-driven matching has
been shown to be NP-complete, i.e. computationally very expensive. If, on
the other hand, knowledge about the context (task, situation, …) can be
used, the complexity decreases. Consequently for efficiency reasons we
should avoid building complete 3D-models and other complex
representations, and instead let the interactors make better use of the
contextual information. We should in other words prune the search space.
This line of thinking contrasts somewhat with the ambition to build an
(extensive) representation of the context for ubiquitous computing.
What to look for, where to look when, and how to look, are questions we
need a priori knowledge of the world to answer. One example of where
such a contextual model is useful is when a thing goes shopping for food
[RA]. Foods in a grocery store are grouped, candy in one area, big items
low on the shelves, milk and butter far from the exit, and away from the
entrance. This information can be used to guide an interactor in the store
to find the right items.
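How contextual knowledge prunes the search space can be illustrated with a toy version of the grocery store. The sketch below is an assumption-laden illustration, not the method of [RA]: the store layout, the item names, and the two function names are invented. It simply counts how many shelf items have to be inspected with and without a contextual model.

```python
# Hypothetical store layout: section name -> items on the shelves.
STORE = {
    "entrance": ["flowers", "newspapers"],
    "candy": ["chocolate", "liquorice"],
    "dairy": ["milk", "butter"],
    "bulk": ["dog food", "detergent"],
}

# Hypothetical contextual model: where a kind of item is usually kept.
USUAL_SECTION = {"milk": "dairy", "butter": "dairy", "chocolate": "candy"}


def bottom_up_search(item):
    """Inspect every shelf in turn; return (section, items inspected)."""
    inspected = 0
    for section, shelf in STORE.items():
        for candidate in shelf:
            inspected += 1
            if candidate == item:
                return section, inspected
    return None, inspected


def context_guided_search(item):
    """Look only where the contextual model says the item should be."""
    section = USUAL_SECTION.get(item)
    if section is None:
        return bottom_up_search(item)  # no context available, search all
    inspected = 0
    for candidate in STORE[section]:
        inspected += 1
        if candidate == item:
            return section, inspected
    # The model was wrong for this store; fall back to the full search.
    return bottom_up_search(item)


blind = bottom_up_search("milk")
guided = context_guided_search("milk")
```

In this tiny store the saving is small, but the number of inspections in the blind search grows with the size of the store, while the guided search only grows with the size of one section.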
[Figure: recognising an "8" top down and bottom up.]
There are numerous techniques that a thing can use to learn more about its
environment. One is image processing, which has many applications, such as
improving or manipulating images, extracting 3D information from 2D
images, or finding patterns in an image. The problem of generating object
information from 2D or 3D images, or of estimating motion from a set
of images, is referred to as computer vision, a special case being face
recognition as discussed above. Critical issues in vision research revolve
around the nature of the representations used, and the nature of the
processes that recover characteristics [DM]. The representations range
from local properties of the image, such as pixel intensities and edges, via
depth and information on orientations of surfaces, to objects defined as
hierarchies of volumes, which can be matched against real world objects.
For real world objects we have many practical problems: different lighting
conditions, noise from hard rain, children that grow up, repainted houses,
and snow covering a landmark. One specific problem is background
clutter, especially if the recognition uses edge detection. Other problems
are background colour conflicts, if the algorithm uses colour lookup, and
partial occlusion. Some algorithms also need elaborate training sequences
to recognize an object and might require the user to select feature points.
The figure below illustrates how the human eye and brain represent an
image [SP]. Each cell can store information about one area in the image.
Cells in the middle of the eye are smaller, resulting in higher resolution for
this part of the image.
Figure V.15.6 Human image representation. Each cell describes a surface
by depth, slant, tilt, and colour.
Recovering the characteristics of a real world object from an image is
typically an ill posed problem where the solution (the characteristics) is
not uniquely determined by the image analysed. The processes
manipulating the representations either have to use contextual knowledge
from the real world, or guess, to extract properties. Examples of rules used
from the real world are: surfaces are consistently coloured and textured,
motion from tension and gravity makes straight lines, objects are rigid,
contours are continuous, and light sources are constant over short periods
of time. We can then apply statistics to constrain the image processing by
our knowledge of the world. Given the figure to the right, what is the
probability that the uppermost image is a coin in the scene and you are
viewing it edge-on?
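One common way to apply statistics to such a question is Bayes' rule, sketched below. The text does not say which statistical method is meant, so this is one possible reading, and all the numbers are invented for illustration: the two hypotheses ("coin" seen edge-on and a thin "rod"), the priors, and the likelihoods are assumptions, chosen only to show that an accidental viewpoint makes the coin interpretation unlikely.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a small set of competing interpretations."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: value / total for h, value in joint.items()}


# Hypothetical prior belief about what lies in the scene.
priors = {"coin": 0.5, "rod": 0.5}

# Hypothetical chance that each object projects to a thin line in the image.
# A coin does so only from the rare edge-on viewpoints, a rod from most.
likelihoods = {"coin": 0.02, "rod": 0.60}

belief = posterior(priors, likelihoods)
```

With these made-up numbers the belief in the coin drops to roughly three percent, even though both objects were considered equally likely beforehand.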
[Figure: alternative interpretations of the image. Coin? Yes! No!]
Another representation of an image, quite different from the one of human
vision, or the pixel based representation, is shown in the following figure.
[Figure: a Coin is a Cylinder with Position <x, y>, Thickness 7 mm, and Diameter 3 cm.]
Which of the representations above is the best one for identification? This
has to be decided individually for each application and it is important to
consider both representation and processing at the same time because this
can simplify either, or both of them. Compare the human way of storing
an image, where two objects close to each other in the image end up close
together in the brain, with an array in Java, where the correlation between
the position in the array and the position in image is arbitrary, chosen by
the programmer.
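The contrast between the representations can be made concrete with a small sketch, written here in Python rather than Java. The image size, the function and class names, and the pairing of the figure labels with values (thickness 7 mm, diameter 3 cm) are assumptions made for the illustration.

```python
from dataclasses import dataclass

WIDTH, HEIGHT = 4, 3

# Pixel-based representation: one flat array of intensities.
pixels = [0] * (WIDTH * HEIGHT)


def index_of(x: int, y: int) -> int:
    """Row-major mapping from image position to array position.

    The mapping is the programmer's choice; column-major would work too.
    """
    return y * WIDTH + x


# Two pixels that are vertical neighbours in the image ...
above = index_of(1, 0)
below = index_of(1, 1)
# ... end up WIDTH positions apart in the array.
gap = below - above


# Symbolic representation: a knowledge schema describing what is seen.
@dataclass
class Cylinder:
    position: tuple       # <x, y> in the image
    thickness_mm: float
    diameter_mm: float


class Coin(Cylinder):
    """A coin is a cylinder; it inherits all of its attributes."""


coin = Coin(position=(2, 1), thickness_mm=7.0, diameter_mm=30.0)
```

The flat array makes pixel operations simple but says nothing about objects, while the schema makes questions such as "is this a cylinder?" trivial but has lost the pixels.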
Figure V.15.7 Knowledge
schema for image
Who are you going to believe, me,
or your own eyes?
Groucho Marx
V.15.3.1 Pattern matching
Many human wonders are based on pattern matching, i.e. recognition, a
basic but intricate feature. You quickly realise this if you try to implement
it using a computer. The following figure shows some variations that have
to be taken care of in the simple task of recognising the letter F.
As discussed previously, pattern matching uses two complementary
processes, bottom up and top down. The bottom up approach combines
lower level primitives and tries to form meaningful constructs. The top
down process uses contextual information and internal models and
matches them against the bottom up constructs. In a crowd you recognise
your son, at least his jacket. Odd, since the person wearing it is too tall and
your son should be at home doing his homework. As you get closer you
realise that it really is your son. Surrounded by his short mates he looks
taller. "Homework?" he says. Another example is that late at night your
kitchen is lit only by the moon. A cylindrical object on the table is more
likely to be perceived as a cup than as a spare part for your car's engine.
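The interplay between the two processes can be sketched with a toy letter recogniser. Everything in it is an assumption made for illustration: the 3x5 bitmaps, the scoring by counting matching cells, and the context weights, which stand for an expectation that the next letter is an F. It is not a description of how human matching works.

```python
# Hypothetical 3x5 templates; '#' is ink and '.' is background.
TEMPLATES = {
    "F": ["###", "#..", "##.", "#..", "#.."],
    "E": ["###", "#..", "##.", "#..", "###"],
    "P": ["###", "#.#", "###", "#..", "#.."],
}


def bottom_up_score(image, template):
    """Fraction of cells where the image agrees with the template."""
    cells = [(a, b) for row_i, row_t in zip(image, template)
             for a, b in zip(row_i, row_t)]
    return sum(a == b for a, b in cells) / len(cells)


def recognise(image, context):
    """Weigh each bottom up score by the top down expectation."""
    combined = {letter: bottom_up_score(image, template) * context[letter]
                for letter, template in TEMPLATES.items()}
    return max(combined, key=combined.get), combined


# A smudged letter: one stray dot of ink on the bottom row.
smudged = ["###", "#..", "##.", "#..", "##."]

scores = {letter: bottom_up_score(smudged, template)
          for letter, template in TEMPLATES.items()}

# Hypothetical expectation from context, e.g. the letter after "abcde".
context = {"F": 0.8, "E": 0.1, "P": 0.1}
best, combined = recognise(smudged, context)
```

Bottom up, the smudged letter is exactly as close to E as to F. Only the expectation supplied by the context settles the matter.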
Some aspects of matching are already built in by evolution. Why, for
instance, do we see a diamond as a diamond, and not as a rotated square?
The only difference is the viewer's frame of reference, see figure V.15.9.
Figure V.15.8 Different patterns
that matching has to cope with to
identify F.
abcdeīghij
We humans assume that we are
directly perceiving the “objective“
properties of the bodies which
surround us ... Snow really is white,
roses really are red, tables are flat,
and so on. This purely passive idea
of perception is so entrenched in us
that it is difficult for us to grasp
that we only really see what, in a
way, we have already seen.…We
have already learned to distinguish
through our culture, that is through
prior interactions with other
members of our species, certain
ranges of colours, differentiating
red from pink and violet for
example.
Jacques Ferber
Figure V.15.9 A square
and a diamond shape?
Recognition-by-components is a theory of matching developed by Irving
Biederman. It is also an example of a bottom up model of behaviour. He
postulates that matching is done by comparing a visible object to
combinations of geometrical ions, geons. We will identify a cup by
matching it to geons representing a cup in our brain.
Can you think of any reason why
evolution has implemented the
brain such that it first breaks the
visual pattern down into edges
and primitive patterns before
putting the visual pattern back
together again?
Figure V.15.10 Geons,
small building blocks that
combine into well-known
objects.
Do not be fooled by the trivial examples chosen, they are only meant as
illustrations of the general principle. Matching can be very complex, and
even involve inventing, or imposing a match. Also, matching is not
restricted to identifying physical objects. Social relations and cultural
patterns are also found this way.
Runner?
Crocodile?
V.15.4 Navigation
Navigation is an activity where an interactor uses context to find its
position and to follow a planned course. Our introductory example of a
context is the social environment, i.e. other humans, and our navigation
task is to ask them for their opinion when we are curious about someone's
personality. Navigation in this context is a cyclic process where people, or
traces of people, are browsed, the findings modelled, and the result
interpreted and used. Understanding moods and emotions is important,
and there are many gestures and behaviours that we subconsciously
continuously monitor. Correct interpretation makes it possible to
manoeuvre in a social context, to avoid conflicts, and to make a good
impression. The result of the interpretation is used to decide on the
strategy to use for the next iteration of the cycle, starting once again with
browsing the environment [MR].
Social navigation is the
process of making
navigational decisions
in real or virtual
environments
based on social and
communicative
interactions with others.
Mark O. Riedl
Figure V.15.32 Social
navigation.
Some aspects of navigation are shown in figure V.15.33 below where Mr A
is shown, lost in Tokyo [DW], [MR]. He has knowledge about what Miss B
knows, which he has learnt from studying her behaviour, or by knowing
something else about her (maybe she looks Japanese). In addition, Mr. A
probably knows something about his current position, "I am east of my
hotel". He also knows how to extract knowledge from the physical
environment since he has a map (in English) of Tokyo.
“Man is by nature a social animal,
and an individual who is unsocial
naturally and not accidentally is
either beneath our notice or more
than human. Society is something
in nature that precedes the
individual. Anyone who either
cannot lead the common life or is
so self-sufficient as not to need to,
and therefore does not partake of
society, is either a beast or a god”
Aristotle, 384-322 B.C
[Figure labels: Context; Miss B; Mr A with representations of self, of person B, of context, of …]
There are two strategies for navigation possible in this situation. One is to
ask the girl, and the other one is to consult the map. What strategy he will
use depends on the disposition of Mr. A, his mood, his knowledge of
Japanese, his trust in Japanese girls, or maybe his civil status. No social
Figure V.15.33 A memory
system, adapted from [DW].
There is a dynamic relationship
between people, the activities in
space, and the space itself.
All three are subject to change.
[AM]
navigation is necessary if the internal representation of the context is good
enough, e.g. he suddenly recognises his hotel across the street.
Navigation is further complicated by the fact that social environments are
unstable. They consist of individuals that suddenly can change their
minds about something, and modify their actions accordingly. We are all
both navigators and parts of the map. For I/T, navigating in
a human social environment is still well out of reach.
Now let us, i.e. H, navigate in information. Spatial thinking and spatial
navigation come naturally to us, and are therefore interesting as tools for
exploration, orientation, and navigation, in all sorts of environments.
Navigation tries to answer the following basic questions. Where am I?
Where have I been? How can I get to where I want? Finding the answers
once again depends on positioning, and setting a course. They are two
tasks performed within a context, either in real or in virtual reality, using a
model of the places to visit (or miss) and the possible paths to travel. A
good cognitive map of the world is necessary and can for instance be a
mental map of roads, crossings and cities. Familiarity with the space
can be represented and generated in different ways. We might have
travelled the same path before or seen a similar web page. Maybe we
know the path from a map we have seen, or we have some high level
topological knowledge. Passing street numbers for instance indicates
speed and direction. Other times we for instance know that the destination
is beyond, or in between, familiar landmarks. If we have travelled the path
before we will recognise clues in many modalities, a smell here, a sight
there, i.e. the context is very important for navigation. A practical aspect is
that fallback solutions, i.e. backtracking, or short cuts to safe places,
should be provided in case the wrong course was taken. Errors are
inevitable.
[Figure: a cognitive map with the places Chair, Work, Home, Refrigerator, and Bed.]
What does it mean for information to navigate? To start with, and as noted
above, a prerequisite for navigation is a position since you need it to set a
course. Information is virtual which means that the physical world is not
the default space. In fact, for information anything that can be identified
could serve as a position! Furthermore, there is an inevitable time delay
associated with physical distances, but delays also could emanate from
many other distances. One example is the distance between a software
agent that understands English, and one that understands Swedish.
Passing a message between these agents will involve a translation step
that takes time. The conclusion is that interaction, and specifically
navigation, of the type I-I is not necessarily situated in either time or
physical space!
One simple virtual space is a colour space. A specific yellow colour has a
defined value, for instance (255,255,0) in the RGB colour space. The
distance from yellow to red (255,0,0) can be measured and if an
application wants to transform a yellow dot into a red one it could follow
the path (255,255,0), (255,254,0), (255,253,0), …, (255,1,0), (255,0,0). If
we on the other hand want to navigate from black (0,0,0) to white
(255,255,255) we have many different possible paths of equal length
available. In the HSB (Hue,
Saturation, Brightness) colour space navigation from black to white starts
from (0,0,0) and ends at (0,0,100). In this case there is an obvious path to
use.
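Navigation in a colour space can be tried out directly. The following sketch makes a few assumptions: distance is taken to be the straight-line (Euclidean) distance between colour values, the function names are invented, and Python's colorsys module is used for the HSB conversion, where the same space is called HSV and all three components run from 0 to 1 rather than, as above, up to 100.

```python
import colorsys
import math

YELLOW = (255, 255, 0)
RED = (255, 0, 0)


def rgb_distance(a, b):
    """Straight-line distance between two colours in RGB space."""
    return math.dist(a, b)


def yellow_to_red_path():
    """Walk from yellow to red by lowering the green value one step at a time."""
    return [(255, green, 0) for green in range(255, -1, -1)]


rgb_path = yellow_to_red_path()
distance = rgb_distance(YELLOW, RED)

# Black and white expressed in HSB/HSV: only the brightness differs.
black_hsb = colorsys.rgb_to_hsv(0.0, 0.0, 0.0)
white_hsb = colorsys.rgb_to_hsv(1.0, 1.0, 1.0)

# The obvious path from black to white: raise the brightness only.
hsb_path = [(0.0, 0.0, step / 100) for step in range(101)]
```

In RGB the walk from black to white has to change three values and can do so in many different orders, whereas in HSB only one value changes, so only one ordering is possible.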
The way-finding problem is a part of navigation, and it is also an
optimisation problem. We want to arrive as soon, or as cheaply, as possible.
The problem gets extremely complicated if multiple cost estimates (time,
money, condition of the path) are associated with different sub paths.
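A simple form of the way-finding problem can be sketched as a shortest-path search. The sketch below uses Dijkstra's algorithm and handles multiple cost estimates in the simplest possible way, by collapsing them into one weighted sum. That simplification, the tiny map, and all the numbers are assumptions made for illustration; with costs that cannot be traded against each other the problem is considerably harder.

```python
import heapq


def cheapest_path(graph, start, goal, weights):
    """Dijkstra's algorithm over edges with several cost estimates."""

    def cost(edge):
        # Collapse the cost estimates into one number (an assumption).
        return sum(weights[name] * edge[name] for name in weights)

    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == goal:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, edge in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(
                    queue, (total + cost(edge), neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical map: a fast toll road, or a slower free route via the park.
ROADS = {
    "home": {
        "work": {"time": 20, "money": 5},
        "park": {"time": 15, "money": 0},
    },
    "park": {"work": {"time": 15, "money": 0}},
}

in_a_hurry = cheapest_path(ROADS, "home", "work", {"time": 1, "money": 0})
short_of_cash = cheapest_path(ROADS, "home", "work", {"time": 1, "money": 3})
```

The same map gives different best routes depending on how the traveller weighs time against money.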
Navigation in virtual environments comes with additional costs [SS]. One
example is that input devices for 3D require training. Metaphors used,
such as virtual flying, are easy to understand, but using them in practice is
a problem, especially for a novice user. She is also often placed in an
unknown environment forcing her to do local explorations. On top of this,
the world itself, landmarks, paths, and even laws of nature may behave
strangely. One way to alleviate the problem is to externalise goals and plans.
Examples of how to do this are to draw a red line in the virtual world that
the user can follow to the target, or to introduce force fields that hinder
her from going the wrong way (uphill is prohibited).
In virtual reality there are many more possible ways to do things, but still
we want to reuse familiar human interaction techniques. The control
metaphor chosen when designing navigation in a virtual 3D space is
important, and should be made clear to the user. There are several options
available depending on why the user wants to navigate. One alternative is
that control is exercised using a virtual camera held in the hand. The view
shown is the one seen by the camera. We can start from anywhere within
the world, fly around and look at objects, expand ourselves at will, and
maybe extend an arm to reach and manipulate an interesting object, all in
third, or first person, see figure below.
Navigation indicators
Regulatory Signs
Warning Signs
Guide Signs
Three most important signs as
defined by MUTCD (Federal
Highway Administration standard)
Figure V.15.34 Different
ways to look at a virtual
world.
Both positioning and course setting make use of lower level tasks, notably
search and scan (browse). Scanning is a combination of the operations
overview, zoom, and filter, and as it should be, this is also a good way to
describe how you yourself navigate in a jungle [BS2]. First you gain an
overview of the scenario, and then you focus on one of the more
promising openings, filtering out uninteresting bush wood and single
trees, that are simple to pass.
Scan
Search
Recognize
Describe
The new Google scanmachine
Figure V.15.35 Scan and search
are complementary operations.
Search and scan complement each other. Scanning is the superior strategy
when it is easier to recognise than to describe information and search
excels if it is advantageous to specify the wanted information. To scan is
also favourable if the user is not familiar with the content, and if there is
not too much data.
Automatic navigation is one of the problem areas where progress of
technology has met hard resistance. The thing faces exactly the same
problem as we humans, but we already have a well-developed toolbox,
provided by evolution. There are many problems facing the thing [NS]. To
start with, the real world is dynamic and non-deterministic, which means
that planning is difficult and must be done in real time. One example is
the poor robot bumping its head into the bedroom door that is normally
open. Repeating the same action does not always have the same effect.
Some other problems are that the world is continuous rather than
discrete, and that the thing can never fully perceive the environment
because of sensor limitations (neither can we). The illustration to the right
shows another problem where a typical thing with only a local model of
reality is caught by a curved wall.
The railway train is an example of a successfully navigating interactor,
and the autopilot in an airplane is another one. Both must be considered as
passive and reactive interactors as they only follow a fixed path impossible
to leave.
Examples of active, reflective, navigating things are harder to find and
usually, as in the example of an automatic factory, demand a highly
constrained environment. One reason for this is that things cost money,
and will break if handled carelessly. For information, automatic
navigation is used in many important applications, most notably for
packet delivery on the Internet.
A thing can use a camera as a sensor. In this case the first part of the
navigation problem is the problem of assigning each pixel in an image to a
position and a velocity in 3D space. By using this information, pixels can
be grouped into objects, which can be identified as real world objects and
avoided, or aimed at, depending on application. Using such technology
you can listen to music, and the music follows you as you move from
room to room. When you leave the living room the music fades away and
meets you as you enter the kitchen.
Physical forces give us another way to model path finding and navigation.
We start by representing goals and obstacles as potential fields. Goals are
modelled as attractors and obstacles repel, see the figure below.
Figure V.15.39 Using potential
fields as an abstract model.
Obstacle
Goal
In the real world the strength of the field is typically inversely
proportional to the square of the distance to the source, i.e.
$F \propto 1/\mathrm{distance}^2$, a fundamentally continuous function. With the concepts
introduced above a world can be modelled as the superposition of the
potential fields of a number of obstacles and goals, and a smart interactor
moves along a path towards a goal using as little energy as possible.
Figure V.15.40 Interaction as
minimisation of energy spent on the
path to the goal.
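A potential field navigator is short enough to sketch in full. The version below is a minimal illustration under stated assumptions: the goal attracts and each obstacle repels with a strength inversely proportional to the squared distance, the interactor takes small steps of fixed length in the direction of the net force, and the strengths, step length, and positions are invented. It does not compute energy, and like any purely local method it can get stuck where the forces cancel.

```python
import math


def net_force(position, goal, obstacles,
              goal_strength=100.0, obstacle_strength=1.0):
    """Superpose the inverse-square fields of the goal and the obstacles."""
    fx = fy = 0.0
    # Positive strength attracts, negative strength repels.
    sources = [(goal, goal_strength)]
    sources += [(obstacle, -obstacle_strength) for obstacle in obstacles]
    for (sx, sy), strength in sources:
        dx, dy = sx - position[0], sy - position[1]
        distance = math.hypot(dx, dy)
        magnitude = strength / distance ** 2
        fx += magnitude * dx / distance
        fy += magnitude * dy / distance
    return fx, fy


def follow_field(start, goal, obstacles, step=0.1, max_steps=10_000):
    """Move in small steps along the net force until the goal is reached."""
    x, y = start
    path = [(x, y)]
    for _ in range(max_steps):
        remaining = math.hypot(goal[0] - x, goal[1] - y)
        if remaining < 1e-3:
            break
        fx, fy = net_force((x, y), goal, obstacles)
        norm = math.hypot(fx, fy)
        if norm == 0.0:
            break  # the forces cancel: the interactor is stuck
        length = min(step, remaining)
        x += length * fx / norm
        y += length * fy / norm
        path.append((x, y))
    return path


GOAL = (10.0, 0.0)
OBSTACLE = (5.0, 1.5)
path = follow_field(start=(0.0, 0.0), goal=GOAL, obstacles=[OBSTACLE])
```

The resulting path bends away from the obstacle and then returns to the straight line towards the goal.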
V.15.5 Choice
Choice, or selection, is a very basic concept, not easy to reduce into other
lower level concepts. Navigation, for instance, depends on choices (and a
selection involves navigation among choices). Intention and attention are
prerequisites, at least for conscious selection, and attention is caught by
and trained in a cultural and physical environment. Attention is in fact a
choice, conscious or not, that for instance a boy tries to manipulate by
making an impression on a group of girls. This section is about how to
detect and make choices.
We are continuously choosing from the menu of the world, and the trend
is that increasingly choices are made and effectuated using technology.
This means that technology must be adapted to how we choose, and to the
context where we choose. The problem gets worse because computers
display information many times faster than human language can express
it, at least without learning and adaptation. This means that relative to
H-H interaction, choices among alternatives will be much more prevalent in
H-I or H-T interaction.
Of utmost importance when we design for choice is to provide for
feedback. If you run your favourite text editor, and look for feedback you
will be surprised by the number of examples you will find. Confirmations
(highlight of selected text), in progress feedback (hourglass icon) and
feed forward feedback (the cursor) are some examples [VB]. We are so
used to working with windows, menus, icons, and pointers that we take
the feedback it provides for granted. When we consider a multi-user,
multi-input, multi-screen environment things get even more complicated
and we will have to reconsider how we accomplish feedback.
How many ways are there to control the
lighting in a room [SS]? Here are some
examples to start with:
Lighting switch.
Voice message.
Voice message and entering the room.
List another 10 ways assuming the
environment is arbitrarily intelligent
Choice is to thinking as battle to war.
You can philosophise and deliberate all
day, but the end result of all your
mental gymnastics has to be a choice of
some sort
[CC]
How can we tell that a choice has been made? Figure V.15.57 shows some
alternative selections.
Figure V.15.57 Different ways to select something: select 1, no selection
at all, select by elimination, select by interval or position, select by
symbol or property.
All of the selections in the figure could as well have been slips, i.e.
mistakes. If we do not know the intention of the user then all we can do is
guess whether the choice was valid. One way to increase the probability is
to ask for a confirmation. In some dangerous machines a choice is
confirmed by using both hands in the selection.
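The two-handed confirmation can be sketched as a small check on when the two buttons were pressed. The function name and the half-second window are assumptions chosen for illustration, not values taken from any particular machine.

```python
from typing import Optional


def selection_confirmed(left_pressed_at: Optional[float],
                        right_pressed_at: Optional[float],
                        window_s: float = 0.5) -> bool:
    """Accept a choice only if both hands pressed at almost the same time.

    Times are in seconds; None means that the button was never pressed.
    """
    if left_pressed_at is None or right_pressed_at is None:
        return False  # one hand only: could be a slip
    return abs(left_pressed_at - right_pressed_at) <= window_s


both_hands = selection_confirmed(10.00, 10.20)
one_hand = selection_confirmed(10.00, None)
too_slow = selection_confirmed(10.00, 12.00)
```

A single stray press, or two presses far apart in time, is treated as a slip rather than as a choice.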
Why not use both hands for spatial
input? Better performance is possible!
What tasks should be suitable for two-handed input? Could human factors be
improved? Usability?
Figure V.15.58 Safe operation.
V.15.6 Manipulation
The more active participants of the human race will certainly try to change
moods and stir emotions. Advertisements are mild and acceptable
versions of manipulation, but others, such as hypnosis, and group
pressure, are potentially dangerous. Many forms of blackmail are
certainly not allowed and will send you to jail. Still, our society would not
work, not even for one single day without social manipulation. We call it
education, and want a child to be well raised. We will now complement
section V.15.2 on intelligent interfaces with a discussion on manipulation.
We start with H-H, and by noting that most peculiarities of H-H can be
reused for H-I or H-T interaction.
How could you avoid being manipulated? This is knowledge that works
both ways. If you know how to escape it, you know how to exert it.
Without knowledge about manipulation you will not even know that you
are being manipulated. Are you?
There are three factors that are important in developing resistance. First,
knowledge about social psychology and attitude change is important.
Second, and maybe even more important, is general knowledge about
philosophy and science. If someone presents scientific facts showing that
energy is created in their refrigerator without any external energy source
then you, as a knowledgeable person, will mutter something about UFOs
and alchemy. A third factor important for detecting manipulation is
self-knowledge. It will help you to inspect yourself and observe your own
reactions. Of course, a general attitude of scepticism, which is a born gift,
is always healthy.
Have you seen a dictator with
irony? A dictator that laughs at
himself?
Some examples of findings from social psychology are that manipulators
often start with making minor requests. They often seem concerned,
sincere, and friendly. They use group pressure and do not make things too
easy. They present you with a seemingly meaningful task that is said to
be tough, but that you of course are capable of performing. Immediate
intimacy and friendship, and feelings of disorientation, confusion, and
embarrassment are some possible indications that you are being
manipulated [SD]. In general people are easy to fool, as can be seen in any
magic show!
Not all manipulation is bad of course. Some flattery could even help
convince someone to do the dishes and praising the result will lower the
resistance to repeat the feat. Flattery is an important social cue, and
similarity is another. People similar to us more easily persuade us. If you
play golf and meet another golfer the probability of liking increases. In
fact, the greater the similarity, in background, trait, or attitudes, the greater
the potential for persuasion [BF]. Whether this has been scientifically proven
to apply also to fifty-year-old men doing the dishes, we do not know.
In a social setting we exploit similarity by adopting culturally predefined
roles. In the role of a teacher we are trusted to know the subject and be
able to teach it. Another example is that any referee is automatically
accepted as an authority on the football field. A social role can easily be
used for manipulation; we for instance readily trust a doctor and accept
the decision of the head of the family.
Listening, not imitation, may be the
sincerest form of flattery.
Dr. Joyce Brothers quotes
Yeah!
Attractiveness is another important aspect since attractive people or
products influence us more easily. Physically attractive people are by
default assumed to be intelligent and honest. Luckily people cannot yet be
designed, only styled. A problem for product design is that different
audiences have different culturally established preferences, which vary
over time. This means that the designer has a lot of footwork to do,
making surveys and looking for clues, for instance in typical magazines
and TV-shows.
The last possibility for social manipulation that we will mention here is the
rule of reciprocity, which seems to be followed in every human society
[BF]. The principle is that if you are given a favour you will feel obliged to
return it, and this can be used for social manipulation in many ways. One
example is companies that give you a watch, almost for free, if you sign
up for buying one book each month.
Manipulation for I-I means changing a data structure, for instance
deleting, copying and moving data in a database. Here we will add a short
discussion on security.
Security is how to protect a computer, network, or another resource, from
being manipulated. It is necessary to guarantee privacy, the condition of
keeping something personal. Computers and networks have created new
challenges, but the basic problems are as old as the social network. Some
members of a group are authorized to access resources, and
authentication is needed to verify their identities. For face-to-face
communication authentication is simple, in other situations we need
passwords, biometrics, or access cards with pin codes.
Unauthorized access could be gained by misusing prior authorization,
masquerading as someone else, or by exploiting some vulnerability in the
security system. All well-known tricks from films featuring J. Bond. Once
inside the system, or with access to the protected resource, the intruder
could steal, destroy, or browse secret information. Denial of service is
another type of attack that could be especially damaging for a computer
system. It diminishes server capacity by keeping the server as busy as
possible and thereby temporarily hinders access to the system.
It is interesting to note that biology inspires thinking about network
security. Viruses and worms are two kinds of existing attacks. The defending
side also uses colourful names; firewalls, sandboxes, and honey pots (used
to trick hackers) are some examples. Security is always a balance between
the cost of losing control and the cost of equipment and administrative
expenses to keep it. Another important lesson is that insiders are
responsible for most of the security attacks.
Security please!
The combination of a user name and a password works well to
authenticate fixed access, but as users start moving around and take their
computers along, new solutions are needed. Logging in every
100 meters, i.e. to every local area network passed, is both a nuisance and
a threat to integrity. Context authentication is an alternative approach,
where the network can use any implicit information about the user such as
usage history, the current user environment, or typical user actions to
inform authentication. One example is that the information displayed on a
public information kiosk at a train station should be erased if the user
walks away. Other examples are mobile computers that work only at a
specific sports arena, and a network service that is enabled only if the user
accepts the service, e.g. by filling in a form on a web page.
A user scenario is that you attend a meeting at some location where you
have never been before. You have a document that you would like to
print, but the problem is to persuade the new network environment to
grant you access to a printer. Currently, the solution is to find a system
administrator (usually very difficult), or to disconnect the printer from the
network. One possible alternative could be a trust based security system
where users on a network are given rights to delegate access rights to
third parties. This way, one of the attendees to the meeting, with the
proper access rights, could give you a time limited access to the printer,
the projector, and the coffee machine. Rights could also be associated with
devices and software agents. Anything with a unique identity could be
given, and could delegate, access rights. What we need is a language,
preferably XML-based, to describe who has what access rights to what.
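The delegation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the XML-based language asked for above: the names Grant, AccessRegistry, grant, delegate and allowed are hypothetical, and time is passed in as plain numbers to keep the example deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    holder: str          # anything with a unique identity: user, device, agent
    resource: str        # e.g. "printer"
    expires: float       # time after which the grant is void
    may_delegate: bool   # whether the holder may pass the right on


class AccessRegistry:
    """Hypothetical registry of who has what access rights to what."""

    def __init__(self):
        self._grants = []

    def grant(self, holder, resource, expires, may_delegate=False):
        self._grants.append(Grant(holder, resource, expires, may_delegate))

    def _find(self, holder, resource, now):
        # Return the first grant that is still valid at time `now`, if any.
        for g in self._grants:
            if g.holder == holder and g.resource == resource and now < g.expires:
                return g
        return None

    def allowed(self, holder, resource, now):
        return self._find(holder, resource, now) is not None

    def delegate(self, giver, receiver, resource, now, duration):
        """Let `giver` hand a time-limited right to `receiver`."""
        source = self._find(giver, resource, now)
        if source is None or not source.may_delegate:
            raise PermissionError(f"{giver} may not delegate {resource}")
        # A delegated right never outlives the right it was derived from.
        expires = min(now + duration, source.expires)
        self.grant(receiver, resource, expires, may_delegate=False)


registry = AccessRegistry()
registry.grant("attendee", "printer", expires=1000.0, may_delegate=True)
registry.delegate("attendee", "visitor", "printer", now=0.0, duration=60.0)

print(registry.allowed("visitor", "printer", now=30.0))   # inside the time limit
print(registry.allowed("visitor", "printer", now=90.0))   # after the limit has passed
```

In this sketch a delegated right cannot itself be delegated further; whether a real system should allow chains of delegation is a design decision left open here.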
Each manipulation changes attributes, or states, of the manipulated object.
Move, for instance, means that the position of an object is changed.
Alternatively we could, if it suits us better, assign the attributes to the
manipulation; a move could be done for a thing at a specific angle, to a
predefined position.
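The two views can be put side by side in a short Python sketch. The class names Thing and Move are hypothetical, chosen only to mirror the notation T(move, x, y) and Move(T, x, y); both variants leave the object in the same state.

```python
from dataclasses import dataclass


@dataclass
class Thing:
    x: float = 0.0
    y: float = 0.0

    # View 1: the manipulation is an operation of the object, T(move, x, y).
    def move(self, x, y):
        self.x, self.y = x, y


# View 2: the attributes belong to the manipulation, Move(T, x, y).
@dataclass(frozen=True)
class Move:
    x: float
    y: float

    def apply(self, thing):
        thing.x, thing.y = self.x, self.y


a, b = Thing(), Thing()
a.move(3.0, 4.0)          # ask the thing to move itself
Move(3.0, 4.0).apply(b)   # apply a stored manipulation to the thing
print(a == b)
```

Treating the manipulation as a value of its own makes it possible to store, repeat, or undo it, which is one reason to prefer the second view in an interactive system.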
[Figure: the attributes can be assigned to the object, T(move, x, y), or to the manipulation, Move(T, x, y).]
The English language has thousands of verbs, and many of them can be
classified as manipulations. The commands selected and discussed in the
following rather manipulate screen based virtual objects than persuade
interactor-objects into doing something. Let us try to use physical reality
to list some of the possible manipulations, starting with two objects A and
B.
We can merge, join, group, compose, close, shut A and B,
and then split, divide, fork, break, open, extract the composition.
The new (or old) objects can be pulled, moved, pushed, lifted, drawn
apart.
We might cut, delete, remove D and save it in memory,
and then paste, retrieve, D back from memory.
At last we can stretch, mould, shape, form, C.
Some useful manipulations do not have a direct physical equivalence, and
create is one example. In the virtual world we can easily create something
out of nothing whereas in the real world this is impossible, even if there
are good approximations.
Using information we can model and manipulate a surface in 3D, but
information can also be used to directly manipulate things. According to
Encyclopaedia Britannica, sculpturing is a form of aesthetic expression in
which hard materials are worked into three-dimensional objects of art. A
lot of different media may be used, including clay, wax and stone.
Materials can be carved, modelled, moulded, or otherwise shaped and
combined.
New technology gives new possibilities. Scanning a rotating object using a
laser is one way to input a 3D structure and given a 3D description there
are now machines that can automatically mould blocks of suitable
materials to the exact shape designed by the user.
The lathed surface is an example of a less complicated graphics based
sculpturing. The surface is generated by rotating a curve around a
coordinate axis.
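A lathed surface is simple enough to generate in a few lines. The following Python sketch assumes the curve is given as (radius, height) points and is rotated around the y-axis; the function name lathe and the choice of axis are illustrative assumptions.

```python
import math


def lathe(profile, steps=12):
    """Rotate a 2D profile curve around the y-axis.

    `profile` is a list of (radius, height) points; the result is a list of
    rings, one per profile point, each holding `steps` (x, y, z) vertices.
    """
    rings = []
    for radius, height in profile:
        ring = []
        for i in range(steps):
            angle = 2.0 * math.pi * i / steps
            ring.append((radius * math.cos(angle), height, radius * math.sin(angle)))
        rings.append(ring)
    return rings


# A crude vase profile: (radius, height) pairs from bottom to top.
vase = lathe([(1.0, 0.0), (1.5, 1.0), (0.8, 2.0), (1.0, 3.0)], steps=8)
print(len(vase), len(vase[0]))
```

Connecting neighbouring rings with quadrilaterals or triangles would turn these vertices into a renderable mesh.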
Just as raster operations can perform logic operations on bitmapped
computer graphics, Boolean operations between volumes can be defined
in 3D. Using these operations new forms can be constructed starting with
simple 3D objects.
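As a minimal sketch, volumes can be represented as sets of small cubes (voxels), so that the Boolean operations become ordinary set operations. The voxel representation and the helper name sphere are assumptions made for this example; real modelling software typically works on surfaces or solids instead.

```python
def sphere(cx, cy, cz, r, size=8):
    """Return the set of integer voxel coordinates inside a sphere."""
    return {
        (x, y, z)
        for x in range(size)
        for y in range(size)
        for z in range(size)
        if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r * r
    }


a = sphere(3, 3, 3, 3)
b = sphere(5, 3, 3, 3)

intersection = a & b   # AND: voxels in both volumes
union = a | b          # OR: voxels in either volume
difference = a - b     # A AND NOT B: carve b out of a

print(len(a), len(b), len(intersection), len(union), len(difference))
```

Starting from two overlapping spheres, the three operations give a lens, a peanut shape, and a sphere with a bite taken out of it.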
V.15.6.1 Speech acts
Language is studied from many points of view, including the validity of a
statement, syntax, and culture dependency. One view particularly
interesting for interaction is that of speech acts. Here, focus is not on
whether a statement is true or not, but on the act intended by the
utterance, to say is to do. You are not only describing situations, you are
creating, and manipulating them.
Making a normal utterance, i.e. saying something, involves a hierarchy of
acts on different levels. At the lowest level is the act of utterance. An
utterance is perceived, even if it is in another language, e.g. a gesture, and
even if we do not understand its meaning.
If we take an utterance in some language, add meaning at a particular
time and place, and also add an intention on the part of the speaker, we
arrive at another level of the act, the illocutionary speech act. In this act
the utterance, or another form of message, affects the addressee. The
speaker wants the listener to recognise the meaning of the utterance, i.e. to
do or think something. One example is the statement “You'll do that”. An
illocutionary act can be expressed as a performative, i.e. a verb, operating
on some content, i.e. “do” and “that” in the previous example. The result
of the act is the perlocution and it relates to the effects that an act has on
the state of the addressee. If a teenager is asked to wash up the dishes he
can attend to the task singing, or he might immediately leave the house.
Two different perlocutions for the same illocutionary act.
Searle [JS2] proposes the following taxonomy for illocutionary acts:
Assertive acts commit the speaker to the truth of the
expression: “There's too little salt in the soup”.
Directive acts try to persuade the listener to perform something:
“Pass the salt, please”.
Promissive acts are attempts to commit the speaker to do
something: “I will pass the salt, some day”.
Expressive acts express the speaker's feelings about a state of
affairs: “I am sorry, the salt has been stolen”.
Declarative acts perform an act by the utterance: “I curse you”
(the effect is uncertain, but maybe some salt might help?).
Duck!
What will you do? On the golf
course versus out hunting ducks
[WB]?
When we look into a mirror we think
the image that confronts us is
accurate. But move a millimetre and
the image changes. We are actually
looking at a never-ending range of
reflections. But sometimes a writer
has to smash the mirror – for it is on
the other side of that mirror that the
truth stares at us.
Harold Pinter, Nobel lecture
Buy, Sell
Top 2 words of the century
I say to the House as I said to ministers
who have joined this government, I have
nothing to offer but blood, toil, tears, and
sweat….
Winston Churchill
We choose to go to the moon. We choose to
go to the moon in this decade and do the
other things, not because they are easy, but
because they are hard, because that goal will
serve to organize and measure the best of
our energies and skills,…
John F. Kennedy
An interesting feature of all speech acts is that they are independent of the
cultural setting and the linguistic form chosen for them. “Pass the salt, please”
and “Would you mind passing the salt” are equivalent from the speech act
point of view. The theory of speech acts is quite general and currently
difficult to use directly in applications, but it is useful in the analysis of
conversations, and hence it follows that it can be used to analyse
interactions, for instance a graphical user interface. It also provides
insights into the use of the human language.
Act 3, in the kitchen, father and son sitting at the table, son doing nothing, father humming.
F (hinting at the issue): “Oh, the dishes are not done!”
->No visible or audible response whatsoever
F (testing a more direct approach) “It is your turn to do the dishes!!!”
->S: “I will do it, I will, soon, just have a little thing to attend to”
Act 4, later in the kitchen, father enters kitchen, son doing nothing, father steaming.
F (adding a touch of affect): “What the hell, the dishes still not done!?”
-> the world keeps spinning, the Universe is not disturbed …
F (bringing the point home, stabbing at the heart) “Your monthly allowance will be reduced”
->S: ( looks at father, surprise in his eyes) “Dishes, me? Why didn’t you say so?”
The following figure shows a state based model of speech acts (speech
dance?) between two interactors A and B [TWD2].
[Figure V.15.72 Speech dance, clearly extremely simplified. Transitions shown: A: request; B: promise; B: break promise; B: counter; A: counter; B: assert; A: declare; A: accept; A: withdraw.]
Interactor A starts, by issuing a simple request for some salt. B could
promise to deliver the goods, reject the request, or choose to counter by
requesting sugar. Figure V.15.72 models quite a simple interaction,
omitting many details, yet it results in a rather complicated graph, which
indicates the depth and variability of human speech interaction.
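A state based model of this kind maps directly onto a transition table. The Python sketch below uses the act labels from the figure plus the reject mentioned in the text, but the state names and the wiring between states are assumptions made for illustration, not a reproduction of the graph in [TWD2].

```python
# Hypothetical transition table: (state, speaker, act) -> next state.
# The act labels come from the figure; the wiring between states is assumed.
TRANSITIONS = {
    ("start", "A", "request"): "requested",
    ("requested", "B", "promise"): "promised",
    ("requested", "B", "counter"): "countered",
    ("requested", "B", "reject"): "closed",
    ("requested", "A", "withdraw"): "closed",
    ("countered", "A", "counter"): "requested",
    ("countered", "A", "accept"): "promised",
    ("countered", "A", "withdraw"): "closed",
    ("promised", "B", "assert"): "reported",
    ("promised", "B", "break promise"): "closed",
    ("promised", "A", "withdraw"): "closed",
    ("reported", "A", "declare"): "closed",
}


def run(dialogue, state="start"):
    """Feed (speaker, act) pairs through the table and return the final state."""
    for speaker, act in dialogue:
        key = (state, speaker, act)
        if key not in TRANSITIONS:
            raise ValueError(f"{speaker}: {act} is not possible in state {state!r}")
        state = TRANSITIONS[key]
    return state


# A asks for salt, B counters with sugar, A accepts, B delivers, A is satisfied.
salt = [("A", "request"), ("B", "counter"), ("A", "accept"),
        ("B", "assert"), ("A", "declare")]
print(run(salt))
```

Even this reduced table has a dozen transitions for a request about salt, which illustrates how quickly the graph grows.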
If I'd known I was going to win I'd
have written a speech, so here it is
Many of the idealists can only conceive of an idle humanity as an ideal humanity. They talk as if no
man could ever rest until he reached Utopia; or as if a really long holiday were something like heaven,
utterly distant and divine. Their social philosophy is that of the hearty and humorous epitaph of the
charwoman, who had gone on to do nothing for ever and ever. But even now it is by no means certain
that those who are not charwomen really become any more hearty and humorous by doing nothing for
ever and ever. A vast amount of stuffy and sentimental humbug has been uttered in favour of the Gospel
of Work. As it was said that Carlyle talked a great deal in praise of silence, it may also be respectfully
affirmed that he idled away a great deal of his time meditating on the virtues of labour. Work is not
necessarily good for people; overwork is very bad for people; and both often begin with a bad motive
and come to a bad end.
But there is another strong objection which I, one of the laziest of all the children of Adam, have
against the Leisure State. Those who think it can be done argue that a vast machinery using electricity,
water-power, petrol, and so on, might reduce the work imposed on each of us to a minimum. It might,
but it would also reduce our control to a minimum. We should ourselves become parts of a machine,
even if the machine only used those parts once a week. The machine would be our master, for the
machine would produce our food, and most of us can have no notion of how it was really being
produced.
G K Chesterton 1925
I think the name of leisure has come to cover three totally different things. The first is being allowed to
do something. The second is being allowed to do anything. And the third (and perhaps most rare and
precious) is being allowed to do nothing. Of the first we have undoubtedly a vast and a very probably a
most profitable increase in recent social arrangements. Undoubtedly there is much more elaborate
equipment and opportunity for golfers to play golf, for bridge-players to play bridge, for jazzers to jazz,
or for motorists to motor. But those who find themselves in the world where these recreations are
provided will find that the modern world is not really a universal provider. He will find it made more
and more easy to get some things and impossible to get others. [] The second sort of leisure is certainly
not increased, and is on the whole lessened. The sense of having a certain material in hand which a
man may mould into _any_ form he chooses, this a sort of pleasure now almost confined to artists. As
for the third form of leisure, the most precious, the most consoling, the most pure and holy, the noble
habit of doing nothing at all--that is being neglected in a degree which seems to me to threaten the
degeneration of the whole race. It is because artists do not practice, patrons do not patronise, crowds
do not assemble to worship reverently the great work of Doing Nothing, that the world has lost its
philosophy and even failed to create a new religion.
G K Chesterton 1927
Part VI: Design, humans change our future
The previous chapters introduced interaction and interactors, concepts
that can be used to model many important aspects of the world, for
instance the workings of society, companies and families, or other systems
built by technology, such as a car or a computer.
Still missing in our modelling toolbox is how to describe purposeful
creation. Interaction is all right for describing how something, for instance
a computerised information service, works, but it is not sufficient if we
want to describe how this service was thought out. It does not provide a
framework where we can describe why the service was needed, why it
works the way it does, or why you like using it. To answer these questions
we need a broader framework, where interaction is still important as the
glue and the engine. We need the concept of design.
Definition: Design is intentional
change in an unpredictable world.
Nelson, Stolterman [NS]
Why do we intentionally try to change and arrange our future? Some
answers are the same as to the question why we interact. Deliberate
change helps us survive and prosper, and if we do it right we can
dominate others, or serve them well. There is always some little detail
affecting our lives that we would like to fix, and even if the world by some
strange coincidence were perfect we would still want to change it just to
make an impression. Change is inevitable. Using a design perspective we
can avoid being trapped in last minute adjustments to fix problems and
instead formulate, and strive for, well thought out visions of longer-term
solutions. Design formulated this way can be used as a framework for
human development in general [ES].
If the world was completely deterministic, changing it would be an easy
task. But, many things that happen to us, in effect happen by chance. We
are also ourselves part of the system, entangled in uncountable feedback
loops, which makes things a tiny bit complicated, especially at the social
level. So, how do we accomplish change and even more important, how
do we make sure that the changes chosen give the intended result?
First, we need an accurate model of the world, which we have seen in
previous chapters is quite difficult to achieve. Next, we need to
understand the cause-effect relationship of actions at the appropriate level,
e.g. at the physical, or social level. Not only do we need to do this for one
cause and its effect, but also for the chain of causes and effects that we
trigger by a change. The complexity of the constraints, such as technology,
economy and social relations, and also of the problem, if it in any way
concerns humans, makes it impossible to optimise the solution in a formal
way. Rather than solving the problem, we need to resolve it, and working
with such problems calls for judgement and balanced solutions, not only
for yes or no answers.
Inquiry by the scientific method is one way to generate knowledge for
judgement. Another way is to use intuition, which is formed by experience
and is difficult to describe. It is sometimes referred to as tacit, or silent
knowledge, and is exemplified by art where the artist seemingly just does
it. Since we in design have to make decisions in a hopelessly complex
world, intuition is a necessary complement to the scientific reasoning
[ES].
Can we trust intuition? Actually, we have no choice! But, we should as
often and as much as possible use scientific reasoning to guide us.
Acquiring knowledge efficiently is to some extent a matter of method and
can be learnt. Some of the methods will be described later in this chapter.
Systematic and continuous introspection of our own behaviour, and of the
behaviours of others, is also important. One interesting observation is
that intuitively (!) we trust some people to have better intuition than
others.
When the work is done we still have more work to do. We have to
evaluate the work, both the result and the process leading to it. Did we
find the right balance between using intuition and explicit knowledge in a
particular case? Was time spent on the right parts of the result? These and
many others are important questions to answer if we want to improve as
designers and if we want to get paid for more than a first lucky shot.
A design process is typically realised in different phases. Starting from a
vision of what we want to achieve, we formalise a requirement
specification, think about the concept of the design, its appearance and
many other things. The design is also for a specific context, maybe we can
identify typical users and tasks. All of the time while we work with these
questions we should evaluate how we work, and the results. Some of the
evaluation techniques we can use are to ask an expert, ask users, study a
prototype of the design in action, and test key aspects of it in a laboratory.
How to apply why and
how is design.
Why apply how and
why is human.
HG
Typical questions for evaluation are whether the result of a design
process is useful, and whether it gives the intended result. One indication if a
product is any good is if people use it. This is a good top down measure,
both for restaurants and web sites. Evaluation bottom up is more difficult.
How do we measure if someone feels at home with our design, to what
extent the sense of time and space is lost, or if a tool is trusted? Security,
privacy, and customisation, all affect usage and should be evaluated, as
well as capabilities such as the possibility for the interactors to group
themselves, and to what extent they are peripherally aware of others. Note
that evaluation, as well as design is something that we all do all of the
time.
VI.1 What is the problem?
The first thing to remember is that most design tasks are wicked problems
[JLES], meaning that there is no final best solution and that the problem is
hard to define before a solution is found (at which time the solution is
obvious). The problems are ill defined, ill structured and have resolutions
rather than solutions. During the design process focus constantly shifts
between details and the whole. This makes the delegation of design work
difficult.
If we consider designed products carefully it is difficult to find anything at
all that works perfectly from every point of view [DP]! Aircraft crash,
axes do not keep their edges, and cars smell. But, these deficiencies are the
results of necessary compromises. All design involves compromises, as
does engineering.
As stated several times earlier in this book, the main idea with interaction
technology is to add QOL (Quality of Life) by introducing new
technology. This could be accomplished in the products of today by
incorporating complicated systems or devices, such as the Internet or the
computer, and use them to invent new services and products that solve
problems. Technology by itself allows for many degrees of freedom, which
means that the designer is faced with the combination of a wicked
problem and an enormous toolbox.
So far, many of these products are almost impossible to use by most
ordinary people. The problem is termed cognitive friction in [ACR],
meaning that the products are not well adapted to human behaviour and
thinking.
[Figure VI.1.1 One problem is the difference between the designer’s and the user’s views. The designer’s intention with the product and the user’s view of the product each have a conceptual level and a physical level.]
If you study figure VI.1.1 you can see that the designer and the user
might have different views of the product at the physical level. A menu
with a strange name, a colour badly chosen, or a grip most suitable for a
left-handed person, just to give some examples. This physical level mismatch is
usually the easy part to fix. Cognitive friction is also shown in the figure as
a discrepancy at the conceptual level between how the user and the
designer view the product, i.e. their mental models of the product do
not agree. The reason for this discrepancy can be that they have different
background knowledge, or lack thereof. The result is an interruption of the
habitual, standard, comfortable being in the world and the user has to
adapt to the product [TWD2]. Imagine a user who has always used a fully
automatic digital camera and suddenly is confronted with an old Leica
camera from 1975 with a separate exposure meter.
How many students can today use a slide rule where multiplication is
performed by addition (of logarithms)?
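For readers who have never held one, the principle of the slide rule fits in a few lines of Python. This is a minimal sketch of the arithmetic only; the function name slide_rule_multiply is made up for the example, and a real slide rule gives roughly three significant digits rather than floating point precision.

```python
import math


def slide_rule_multiply(a, b):
    """Multiply two positive numbers by adding their logarithms."""
    # On a slide rule the two lengths log10(a) and log10(b) are laid end to end.
    total_length = math.log10(a) + math.log10(b)
    # Reading the scale at the combined length converts back to a number.
    return 10 ** total_length


print(round(slide_rule_multiply(2.0, 3.0), 6))
```

The identity behind it is log(a·b) = log(a) + log(b), so sliding one logarithmic scale along another performs the addition mechanically.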
One way to overcome the problem of cognitive friction is to improve the
usability of the product. This can for instance be done by adding a menu
that is easier to read, or support the user by a smart pop up form, at the
right place and time. Evaluating usability means estimating the Gulf of
execution and the Gulf of evaluation [DAN2]. The Gulf of execution is
the difference between what the user wants to do, and what actually can
be done using the controls in the product. The Gulf of evaluation is the
difference between the state of the system perceived by the user, and the
actual state of the system.
Another (better) way is to take a step back, consider the original goal of
the user, and from this redesign the whole interaction, more or less
ignoring implementation issues, i.e. do a goal based design. In the figure
below this means that design should rather start from what is desirable
than from what is technically possible, or economically viable.
“Like putting an Armani suit
on Attila the Hun, interface
design only tells how to
dress up existing behaviour”
Alan Cooper [ACR]
It should be simple enough
to use but functional
enough to be useful
Figure VI.1.2 Goal based design
starts from what is desirable.
The discrepancy at the conceptual level is a problem, but what makes
design really difficult is found at the intentional level. How to find out
what another human being needs when she probably does not even know
it herself yet, and most likely cannot articulate her wish.
One way to focus on a goal-based design is to constantly ask ourselves
and others involved the question “Why?”, as in “Why is this book worth
spending time with?”. This is quite a different question from asking how,
e.g. “How do I read this book?”. Another example is that we first should
ask ourselves “Why does the user need this command?” and only if we
are satisfied with the answer do we ask “How does the user access the
command?”, and “How do we implement it?”. If we cannot answer the
why question we should not even start asking how! The question of why
is useful in many steps of the development process.
On the physical level making a cup of
coffee is not difficult. If you are addicted to coffee, and in great distress, you
will probably manage any technology
to put together a cup of java regardless
of any cognitive friction. But, without
having tried coffee, or perhaps just
testing it once, would you envision
starting an industry by designing
coffee makers or a world wide chain
of coffee shops?
VI.2 Design for H-H
The problem with human co-operation using technology is that all aspects
of human life, i.e. all of its complexity, must be channelled over
technology.
Of course this is difficult!
Typical questions to be answered by the designer when a team of human
interactors are studied are:
Who speaks?
Who is spoken to? A designer faces quite different challenges for
communications of type one-to-one, one-to-many, or one-to-all.
What is said and why?
When and for how long? Floor control and user roles should be
considered.
What medium? Face to face or e-mail are two alternatives.
What method is used for decision-making? Negotiation or central
control, for instance.
For multi-user applications, such as collaborative work environments, a
designer also faces the problem of evaluation. It is many times difficult to
observe the system in use since users are not collocated, and it is also
difficult to create realistic test conditions in a laboratory.
The user in her different roles will be a part of the design. This is both
because she can tinker with system properties, and because she and her
group will adapt their behaviour to the system, thereby changing the
context it was designed for. The fact that more than one user interacts
raises additional issues compared to a single user application. A designer
has to reflect on privacy, access control, and conflicts between individuals
or groups, e.g. between managers and others [JG2]. Should a manager be
allowed to add some functionality, e.g. a mandatory time schedule that
suits his purposes and his ways of working, but only adds to the work of
others?
In reference [JG2] the author formulates some additional challenges for
developers when designing groupware and other equipment for H-H
interaction:
Exception handling is difficult. Interaction and groupwork
follow the rules most of the time. The problem is that sometimes
rules are broken for good reasons. Usually because breaking the
rules makes work much easier. If workarounds have to be used
the groupware becomes an obstacle.
Lack of experience in designing groupware systems. It is
difficult to intuitively foresee all the intricate dynamics involved,
even a small change, such as making the creation date of
documents visible could change behaviours.
Problems with introducing and managing the system. One
example is that high tech gear is typically designed to show off
the technology. This can make users appear socially unattractive.
Work processes can usually be
described in two ways:
the way things are supposed to
work and the way they work.
[JG2]
E-mail has been a success because the sender takes the initiative and has to
do most of the work. You quickly read an e-mail, even though too many e-mails, and computer viruses, could turn even just reading them into a
nightmare. E-mail is compatible with common office practice, it is
informal, and easy to adapt to new situations, even if emotions are
difficult to express and easy to misinterpret. Evaluation of e-mail as a
CSCW tool is ongoing everywhere by us all, and still, after 30 years the
verdict is not final. E-mail is constantly compared to, and merges with,
other new technologies. It is for instance difficult to quickly browse an
MMS message that is audio only, and this could be a crucial difference in
favour of the text message.
The question of whether a new technology is labour saving is not an easy
one to answer. Some evidence suggests that the washing machine in fact
increased the time spent washing! The reason was that the acceptable level
of hygiene changed. Whether the same is true about the dishwasher as
well is not known to us.
Culture – A set of beliefs, desires,
intentions, trust, morality.
J. Odell
Why is videoconference not
used to a greater extent at
Ericsson and Nokia (2004)?
VI.2.1 Ethics, privacy and security
As the context learns more about the interactors it can perform better, but
the flip side of the coin is that this threatens the interactors' integrity. Will
we feel comfortable in an everyday environment with computerised eyes
and ears that constantly observe us and register our behaviour?
Hardware sensors shrink and are soon too small and numerous for
humans to relate to them. How will you know which nodes to shut down
for privacy if you cannot even see them? To top this, digitised information
is inherently lightweight (instant copy), and unfaithful. It is already
possible to track down both the addresses of the visitors at a web site and
where they have been before. This information is currently not used to its
full potential, but when it is, it will have a big effect on feedback from
usage patterns. The Amazon bookshop already collects information about
what books you are interested in.
When designing persuasive technologies you will carefully have to
consider the ethics of the services they produce. What if the user loses
time, money, or does something regrettable? Who is responsible if the
socially intelligent toy makes a serious mistake and perhaps hurts the
child? Should we blame the designer, the company who paid the designer,
the shop who sold the toy, Mother Nature, or perhaps the mother of the
child who bought the toy? The toy itself usually gets off easy. These
problems will become worse as persuasive technologies evolve. One
example is a computerised slot machine with a high level of psychological
insights.
“What might it be like to live in
a world where personal
information becomes available
as one moves from one space to
another? It is hard enough to
keep track of files in a desktop
PC, but, with new context
aware systems, how will we
know when information is
captured, accessed, and used,
and by whom, for what
purposes in context-aware
settings?
And how will this kind of
capability make us feel?”
Victoria Bellotti, Keith
Edwards, Xerox Palo Alto
Research Center
The creators of a persuasive
technology should never
seek to persuade a person or
persons to something
they themselves would not consent
to be persuaded to do.
The Golden Rule of Persuasion.
A set of rules intended to let applications be developed without
provoking debate has been proposed in the privacy guidelines issued by
the U.S. Federal Trade Commission (compare also 9.8.4):
Notice: The individual should have clear notice of the type of
information collected.
Access: All information in the system should be accessible and
changeable by the users themselves and it is up to them to
change it, whatever way they like. This is for instance currently
not the case on the web or in newspapers.
Commercial organisations are
mostly interested in the integrity of
data, the military worries more
about secrecy, and individuals are
concerned about privacy.
[UL]
Use: How the information is used, and by whom, should be clear
to the information provider. Once again, this is not the case in
today's real life.
You already have zero privacy.
Get over it
Scott McNealy
Security: Users are only allowed to access information that they
anyway could have obtained, for instance by personally
participating in an event. Reasonable measures should be taken
to secure the data from unauthorized access.
Choice: An individual should have the choice to deny data
collection.
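The five principles above could be tracked as a simple checklist while a service is being designed. The following is only a minimal sketch under that assumption: the class name `PrivacyChecklist` and the function `unmet_principles` are hypothetical names introduced here for illustration, and a flag being set says nothing about legal compliance.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class PrivacyChecklist:
    """Hypothetical checklist with one flag per principle listed above."""
    notice: bool = False    # the individual is told what type of information is collected
    access: bool = False    # users can read and change the information about themselves
    use: bool = False       # it is clear how the information is used, and by whom
    security: bool = False  # data is protected from unauthorized access
    choice: bool = False    # the individual can deny data collection


def unmet_principles(checklist: PrivacyChecklist) -> List[str]:
    """Return the names of the principles that the design does not yet satisfy."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


if __name__ == "__main__":
    # A design that so far only gives notice and secures its data.
    design = PrivacyChecklist(notice=True, security=True)
    print(unmet_principles(design))
```

A designer could run such a check for every service that stores information about the interactors, and treat a non-empty result as a reason to revisit the design.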
If more information is shared by the interactors, the potential complexity
of the interaction will increase. Information about previous interactions is
obviously interesting to store. The problem is that there is way too much
information that could be potentially useful (somehow Humans found a
way around this problem millions of years ago). For reliability we also
need ways to distribute and safeguard the information while maintaining
fast access.
VI.3 Design for I-I
No human user in sight means that many aspects of design are not
applicable. Focus is instead on function and efficient use of resources, and
the most important resource is the network. Compliance with standards
reduces redundant work and enables reuse. Performance is important, but
so are reliability and security. Networked devices are usually quite
complex, so advanced software is needed. Since performance is
important, so is hardware, to speed things up.
Wireless networks make many new applications possible. They are in fact
at times so revolutionary that traditional affordances are no longer valid.
With no cables attached to the stereo it is difficult to identify a
loudspeaker. What is my device currently up to and interacting with?
What consequences does this interaction have, e.g. for safety, security,
stability, and configuration of other applications?
Smaller local area access networks are not too difficult to design and
manage, even though administrating a network with 10 users can be quite
labour-intensive. A designer of a larger network faces real challenges.
Policies, i.e. who is allowed to do what with what, and security are
difficult issues.
[Figure: Resources needed, time, Performance]
An interaction is in general much more dynamic than a medium. The
interaction involves at least two interactors and a medium (message), and
extends over time. To evaluate an interaction, evaluating the medium
is not enough; the characteristics of the interactors and the context must also be
accounted for. This is not too difficult for I-I interaction, where efficiency is
easy to measure, e.g. a shorter message is better. Whenever a human
is involved in the interaction, the evaluation of the medium is much more
complicated.
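As a small illustration of how easily I-I efficiency can be measured, the sketch below compares the size in bytes of the same message in two encodings. The message content (a sensor identifier and a temperature) and the function names are assumptions made for this example only; they do not come from any particular protocol.

```python
import json
import struct


def json_size(sensor_id: int, temperature: float) -> int:
    """Size in bytes of the message encoded as UTF-8 JSON text."""
    message = {"sensor_id": sensor_id, "temperature": temperature}
    return len(json.dumps(message).encode("utf-8"))


def packed_size(sensor_id: int, temperature: float) -> int:
    """Size in bytes of the same message packed as binary data.

    "<Hf" means: little-endian, one unsigned 16-bit integer, one 32-bit float.
    """
    return len(struct.pack("<Hf", sensor_id, temperature))


if __name__ == "__main__":
    print("JSON bytes:  ", json_size(7, 21.5))
    print("packed bytes:", packed_size(7, 21.5))
```

The shorter encoding wins on this one measure, but it is unreadable for a human, which is exactly why the evaluation becomes harder as soon as H takes part in the interaction.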
VI.4 Design for H-I/T
At this stage we would like to relate technology for interaction to design.
The figure below shows three different aspects of design, each with its
own knowledge space. How a designer goes about designing is one
dimension; another is how a product fulfils its purpose, e.g. through
usability and credibility; and the third, perhaps the most important, is what the
product is used for.
[Margin figure: Tax, Phone bill, Mortgage, Rent along a time axis]
Figure VI.4.1 Design and technology interact.
How a designer designs a product: Ergonomic design, Visual design, Interaction design, Concept design, Requirement spec., Reuse (Standard), Sustain a vision.
How a product fulfils its purpose: Credibility, Usability, Novelty, Technology, Economy, Aesthetics, Status, …
What a product does: Basic operation (merge, join, …), Interaction (identification, navigation, choice, …), Serves a purpose (desiderata), Change (to/from).
The more abstract product deliveries, i.e. desiderata and change (to/from)
in figure VI.4.1, deserve additional comments. Panta rhei, i.e. everything
flows, is one of many ways to express that change is inevitable. We
constantly either manoeuvre towards a wanted state, or move away from
something we do not like. The alternative is not very interesting; no
change at all equals death. A desideratum (that-which-is-desired) is what
we intend the world to be; it is the purpose of our product (and our lives),
and is not limited to needs. A product that satisfies a need, for instance
a pair of shoes, is very limited in scope compared to a social system
designed to keep a group of people working in good spirit towards a
common goal. Desiderata are related to the intentional states and emotions
discussed in part IV of the book where desire was one of the intentional
states listed. For a more thorough treatment of desiderata we refer to
reference [ES].
To conclude, usability is of course important for a product to fulfil its
purpose, but so are many other aspects, such as adaptability, which
makes personalised services possible, and aesthetics, which makes them
enjoyable. Infrastructures for computing and communication are key
technologies, and the limited access to valid contextual information,
especially social context, will constrain the level of innovation.
Remember that behaviour changes as a result of the introduction of new
technology and this uncovers new opportunities for technology and
services.
A desideratum is something
that is evoked out of a want, a
desire, a hope, a wish, a
passion, an aspiration, an
ambition, a quest, a call to, a
hunger for, or will towards.
[ES]
VI.4.1 Information overload
Humans always try to process input into meaningful representations with
their limited memory and processing capability. This threatens to drown
us because of the current explosion of machine generated, and machine
transported information.
That said, the statement in the margin refines the problem of information
overload somewhat. The problem is not so much the amount of
information, but rather what we want to do with it. We humans can
handle lots of information, as proved by our vision system. The
information stress is rather an effect of how we consciously choose to
process information, the number of goals we strive for, and their
complexity. This kind of stress probably also existed several hundred
years ago, without computers and databases. Information overload is
something that must be kept in mind when designing user interfaces for
H-T, and H-I interaction.
“Information overload is
the existence of a gap
between what can be done
and what one wants to do
or think one should do with
existing information”
P Wilson [ABG]
“ … there are limits to what
computers can expect their human
companions to put up with”
Paul Saffo [TWD].
VI.4.2 Incidit in Scyllam, qui vult vitare Charybdim
(Out of the cauldron into the fire)
It seems that humanity has strived and longed for material things,
comfort, and security for a long time. One effect of this is that we are
managed by time rather than managing our own time. Our tools are
running the show. Now, assuming that we will soon have, or maybe
already have, an acceptable material standard, what will our next
accomplishment be? Will we not try to embed and develop ourselves in a
social, probably virtual environment, e.g. by posting as someone or
something else doing something interesting somewhere else? The interest
in networked games and virtual communities indicates this. Then, maybe
we will get rid of the ringing of the alarm clock, but instead develop
dependencies on many other (yet unknown) aspects of the new virtual
environments. Is the sound of an arriving e-mail an indication of this? Or,
perhaps, the reports on stressed young people who cannot turn off their
mobile phones, texting all night, afraid of missing something important?
Society will disintegrate families,
but the Internet and the mobile
networks will re-integrate them.
Keep in touch with your children at
school or when they grow up, all
over the world.
Selling virtual homes where a
family can live will be big business.
What colour are your exclusive
virtual curtains?
/HG
Information society ->
interaction society ->
relation society->
/HG
VI.5 Design for T-T
Design of T-T interaction is a well-developed area. Civil engineers have
built bridges for thousands of years and in the last hundred years there
has been an enormous development in different kinds of machinery. New
materials and new methods of manufacturing still hold many surprises.
The new information society shifts emphasis from machines and material
towards information, and knowledge work, but somehow we have to
package all of the information and processing capacity that will be
available to us, and give it physical shapes. We do this by giving the
manufacturing tools more intelligence and processing capacity, and by
introducing new materials. This makes possible exteriors that are better
adapted to the use of the thing, and at the same time allows for aesthetic
solutions. Our machines are more and more like craftsmen that can use
materials optimally, economically, and without the time penalty paid by
human craftsmen.
Every material has its own characteristics, hardness, stiffness, and
strength, and for some materials these can be changed. Using the
appropriate tools, a piece of such a material can be given a specified
shape, size and weight. The techniques used are wasting, forming, and
casting [DP]. Wasting means removing material from some piece until it
gets the right shape. Forming is changing the shape, not by removing
material, but by transforming the material in the piece by bending or
pressing. The last technique is casting, where a mould is covered by or
filled with some liquid that hardens. With today's technology we are very
good at performing the techniques above. We have machines that can do it
with enormous speed and precision.
The technological advances of the
next age will be in processing.
[DP]
For every way there is to build
something up there are many
ways to break it down.
Pessimistic view on the world
Still the world is surviving and
even developing.
Positive response.
Stiff, strong
Light, thick
What we still can improve is perhaps how we process the material itself,
i.e. how we alter its properties to match the design problem at hand.
Figure VI.5.1 Steel
Service Corporation.
As technology develops, the number of possible interactions T-T, without
including H in the loop, will increase. One example is a smart tag that
identifies other tags. It could soon be integrated into mobile phones, or
maybe the mobile phone will be integrated into the smart tag? A smart tag
can be used to simplify payment, as a security device unlocking doors, or
for exchanging business cards at meetings.
Robotics is another research area with numerous possibilities. Primitive
automatic lawn mowers and vacuum cleaners are already found on the
market. But most tasks are much more complex than they seem. Building
two robots that together lift something up and carry it between two
positions is quite difficult in general. The two robots need good internal
models of each other, the physical environment, and of the task to
perform. They also need means of communication, must be fail safe, and
capable of real-time planning and adaptation to changes in the
environment. The seemingly simple task contains many of the problems
we have discussed in this book.
Context is of course important also for T-T design. A designer is faced
with the following choices when designing a vehicle for the handicapped
[SC4]:
1. Make the vehicle smart enough to manage all kinds of environments.
2. Adapt the vehicle to a specific environment.
3. Adapt the environment.
4. Make the environment smart enough to adapt to all kinds of vehicles.
If we add humans to the environment, and change the design target to a
system including humans, the designer will not have access to all
information in the system. The design problem will be more difficult, but
note that with humans in the loop the ability for adaptation increases
immensely.
PART VII: Resources
VII.1 References
[ABG] Anders Broberg, Cognitive tools for learning, UMINF .
[AC] Andy Clark, Being there, ISBN -262-53156-9
[ACR] Alan Cooper, The inmates are running the Asylum, ISBN -672-31649-8
[AD] A. Dey, Providing Architectural Support for Building Context-Aware Applications, PhD Thesis at Georgia Institute of Technology, November 2000.
[AD ] Alan Dix et al, Exploiting space and location as a design framework for interactive mobile systems, ACM Transactions on Computer-Human Interaction (TOCHI), Volume 7, Issue 3, September 2000.
[AD ] A. Dey et al, A conceptual framework and a toolkit for supporting the rapid prototyping of context-aware applications, Human-Computer Interaction, vol 16, pp 97-166, 2001
[ADFH] A. Akmajian et al, Linguistics: An Introduction to Language and Communication, ISBN -262-51086-3.
[AD ] Alan R. Dennis, Rethinking Media Richness: Towards a Theory of Media Synchronicity, Proceedings of the 32nd Hawaii International Conference on System Sciences, 1999.
[AD ] A. Dey, Evaluation of Ubiquitous Computing Systems, available at www.cs.berkeley.edu/~dey/pubs/emuc2001.pdf 2004-06-16
[AD6] A. Dix et al, Exploiting Space and Location as a Design Framework for Interactive Mobile Systems, ACM Transactions on Computer-Human Interaction, Vol 7, No 3, September 2000, p 285-321.
[AH] A. Huang, et al, Running the Web backwards: appliance data services, Computer Networks 00, 2000, p 1-13.
[AK] Klovdahl, A. S. (1989). Urban Social Networks: Some methodological problems and possibilities. In M. Kochen (Ed.) The Small World. Norwood, NJ: Ablex.
[AM] A. Munro, K. Höök, and D. Benyon, eds. 1999. Social Navigation of Information Space. London: Springer Verlag
[AM ] Aaron Marcus, Eugene Chen, Designing the PDA of the future, Interactions of the ACM, Jan-Feb 2002.
[AS] Albrecht Schmidt et al, Advanced Interaction in Context, Lecture Notes in Computer Science Vol 1707, ISBN 3-540-66550-1; Springer, 1999, pp 89-101
[AS ] Albrecht Schmidt, MediaCups: Experience with Design and Use of Computer-Augmented Everyday Artefacts, Computer Networks, Special Issue on Pervasive Computing, Vol. 35, No. 4, March 2001, p. 401-409
[AS ] Albrecht Schmidt, Implicit Human-computer-interaction Through Context, Personal Technologies Volume 4(2&3), June 2000, pp 191-199
[AS ] A. Sloman, Architectural requirements for human like agents, in Human Cognition and Social Agent Technology, edited by Kerstin Dautenhahn, ISBN .
[BA] B. Anderson, Information Society Technologies and Quality of Life, Chimera working paper number 2004-09
[BF] B Fogg, Persuasive Computers: Perspectives and Research Directions, Proceedings of the Conference on Human Factors in Computing Systems (CHI-98): Making the Impossible Possible (CHI98). Los Angeles, CA, USA, 1998, pp 225-232.
[BF ] B. Fogg, Persuasive Technology: Using Computers to Change What We Think and Do, ISBN 1-55860-643-2
[BMD] Barnard, Duke, May, Duce, Systems, Interactions, and Macrotheory, ACM Transactions on Computer-Human Interaction, Vol 7, No 2, June 2000, pp 222-262
[BS] Bruce R. Schatz, The Interspace: Concept Navigation Across Distributed Communities, IEEE Computer, January 2002, pp 54-62
[BS ] Ben Shneiderman, Designing the user interface, ISBN -321-19786-0
[BS ] B Shneiderman, The eyes have it: A Task by Data Type Taxonomy for Information Visualizations, Proc. IEEE Symp. Visual Languages, 1996.
[BS ] B. Salem, Aesthetics as a Key Dimension for Designing Ubiquitous Entertainment Systems, retrieved 17 Oct 2005 at http://www.idemployee.id.tue.nl/g.w.m.rauterberg/publications/UBIHOME2005paper.pdf
[CC] Chris Crawford, On interactive storytelling, ISBN -321-27890-9
[CC ] Chris Crawford, Understanding interactivity, available at http://www.erasmatazz.com/
[CG] C. Greenhalgh, Augmenting Reality Through the Coordinated Use of Diverse Interfaces, ???.
[CH] Carrie Heeter, Interactivity in the Context of Designed Experiences, Journal of Interactive Advertising, Volume 1, Number 1, Fall 2000.
[CH ] C. Hummels et al, Knowing, doing, and feeling: Communicating with your digital products, available at http://www.io.delft.nl/id-studiolab/djajaningrat/publications.html 2002-11-12
[CMN] Card, Moran, Newell, The Psychology of Human-Computer Interaction
[CS] C. Snyder, Paper prototyping, ISBN -55860-870-2
[CW] C. Wickens, Engineering Psychology and Human performance, ISBN -321-04711-7
[DA] Diane Ackerman, A natural history of the senses, ISBN -679-73566-6
[DABB] Davenport et al, Synergistic storyscapes and constructionist cinematic sharing, IBM Systems Journal, vol 39, no 3&4, 2000
[DAN] Donald A Norman, Things that make us think, ISBN -201-58129-9
[DAN ] Donald A Norman, The psychology of everyday things
[DB ] David Benyon, Representations in human-computer systems development, http://www.dcs.napier.ac.uk/~dbenyon/publ.html.
[DB ] David Benyon, Employing intelligence at the interface, http://www.dcs.napier.ac.uk/~dbenyon/publ.html.
[DC] D. Cohen et al, Livemaps for Collection Awareness, Journal of Human-Computer Studies, vol 56, no 1, pp 7-23, Jan. 2002.
[DD] D Dennett, Cognitive Wheels: The Frame Problem of AI, in M. Boden (ed), The Philosophy of Artificial Intelligence, 1990, pp 147-170, ISBN 0-19-824854
[DD ] D. Davis, Why do anything? Emotion, affect and the fitness function underlying behaviour and thought. Affective Computing, AISB 2004, University of Leeds, UK
[DHA] David Harel, Computers Ltd., ISBN 0-19-850555-8
[DH] Douglas Hofstadter, Gödel, Escher, Bach: an eternal golden braid, ISBN -140-05579-7
[DK] David Kirsh, Interactivity and Multimedia Interfaces, Instructional sciences
[DK ] David Kirsh, Today the earwig, tomorrow man, Artificial Intelligence vol , pp -184, 1991
[DK ] David Kieras, A Guide to GOMS Model Usability Evaluation using GOMS and GLEAN , available at ftp.eecs.umich.edu/people/kieras, 1999
[DM] David Marr, Vision, ISBN 0-7167-1567-8
[DN] Donald A Norman, The design of everyday things, ISBN -385-26774-6
[DN ] Donald A Norman, Emotional design, ISBN -465-05135-9
[DP] D. Pye, The nature and aesthetics of design, ISBN -71-3652861.
[DP ] D. Pinelle et al, Task Analysis for Groupware Usability Evaluation: Modeling Shared-Workspace Tasks with the Mechanics of Collaboration, ACM Transactions on Computer-Human Interaction, Vol 10, No 4, December 2003, pp 281-311.
[DS] D. Svanaes, Context-Aware Technology: A Phenomenological Perspective, Human-Computer Interaction, Vol 16, pp 379-400, 2001.
[DW] D. M. Wegner, A computer network model of human transactive memory, Social Cognition, 13, 1-21.
[EB] Eric Bergman, Information appliances and beyond, ISBN -55860-600-9
[EB ] E. Beck et al, Experimental evaluation of Techniques for Usability Testing of Mobile Systems in a Laboratory Setting, In Proceedings of OzCHI , Brisbane, Australia.
[ED] E. Deci, The What and Why of goal pursuits: Human Needs and the Self-Determination of Behaviour, Psychological Inquiry, vol 11, no 4, 2000, p 227-268.
[EF] E. Freeman, Lifestreams: Organizing your Electronic Life, In AAAI Fall Symposium: AI Applications in Knowledge Navigation and Retrieval, November 1995. Cambridge, MA. http://citeseer.nj.nec.com/4353.html.
[EN] Elmasri, Navathe, Fundamentals of database systems, ISBN -201-54263-3
[FDFH] Foley et al, Computer graphics, principles and practice, ISBN -201-84840-6.
[ES] E. Stolterman, H. Nelson, The design way, ISBN 0-877-783055.
[ET] E. Tufte, Visual Explanations, ISBN -96139212-6.
[FG] F. Gemperle et al, Design for Wearability, http://www.ices.cmu.edu/design/wearability/files/Wearability.pdf, available June 2003.
[FI] FIPA Interaction Control Library Specification, www.fipa.org
[GA] G. Abowd et al, Charting Past, Present, and Future Research in Ubiquitous Computing, ACM Transactions on Computer-Human Interaction, Vol 7, No 1, March 2000, p 29-58.
[GB] Giorgio Buttazzo, IEEE Computer, vol 34, no 7, July 2001
[GC] Guanling Chen et al, A Survey of Context-Aware Mobile Computing Research, Dartmouth Computer Science Technical Report TR2000-381.
[GD] G. Doherty, Continuous Interaction and Human Control, Proceedings of 18th European Conference on Human Decision Making and Manual Control, ISBN 1-874152-08-X, p. 80-96, J. Alty (Eds), Group D. Publications, Loughborough, (October 1999)
[GF] George W. Fitzmaurice, Graspable User Interfaces, PhD thesis, University of Toronto, 1996.
[GG] G. Graham, Philosophy of the arts, ISBN -415-23564-2
[GOF] Gamma, Helm, Johnson, Vlissides, Design patterns: Elements of reusable object oriented design, Addison-Wesley, 1994, ISBN 0-201-63361-2.
[GP] G. Polya, How to solve it, ISBN -691-02356-5
[GR] G. Riva et al, Presence 2010: The Emergence of Ambient Intelligence, http://www.vepsy.com/communication/volume4/4Riva.pdf, available May 2003.
[GR ] G. Riva, The Layers of Presence: A Bio-Cultural Approach to Understanding Presence in Natural and Mediated Environments, Cyberpsychology & Behaviour, vol , no , , p -419.
[HG] Hans Gellersen et al, Multi-Sensor Context-Awareness in Mobile Devices and Smart Artefacts, Mobile Networks and Applications (MONET), Oct 2002.
[HK] Hideki Koike, Integrating Paper and Digital Information on EnhancedDesk: A Method for Realtime Finger Tracking on an Augmented Desk System, ACM Transactions on Computer-Human Interaction, Vol 8, No 4, 2001, pp 307-322
[HM] Hans Moravec, When will computer hardware match the human brain, Journal of Transhumanism, vol 1, 1998.
[HP] H. Parunak et al, Co-X: defining what agents do together, www.jamesodell.com/publications.html, visited 23/8 2002.
[HS] Heiko Sacher, Gareth Loudon, Uncovering the New Wireless Interaction Paradigm, Interactions of the ACM, Jan-Feb 2002.
[HS ] H. Stelmaszewska, Conceptualising user hedonic experience, In D. J. Reed, G. Baxter & M. Blythe (Eds.), Proceedings of ECCE-12, Living and Working with Technology. York: European Association of Cognitive Ergonomics, pp 83-89.
[JA] John Armitage, From User Interface to Uber-Interface: A Design Discipline Model for Digital Products, Interactions of the ACM, May-June 2003.
[JC] John Carroll, Human-Computer Interaction in the New Millennium, ISBN -201-70447-1.
[JC ] J. Carroll, A User-centred Process for Determining Requirements for Mobile Technologies: the TramMate project, Proceedings of the th Pacific Asia Conference on Information Systems 2003.
[JD] John Deller et al, Discrete-Time Processing of Speech Signals, ISBN -7803-5386-2.
[JF] John Fiske, Introduction to communication studies, ISBN -415-04672-6.
[JHM] Murray, Janet H, Hamlet on the Holodeck, ISBN 0-262-63187-3.
[JG] Jacek Gwizdka, What's in the Context, The CHI workshop.
[JG ] J. Grudin, From here and now to Everywhere and forever, research.microsoft.com/research/coet/grudin/ubicomp.pdf available 2002-11-15.
[JG ] J. Grudin, Groupware and Social Dynamics: Eight Challenges for Developers, Comm. of the ACM 37(1), 92-105, 1994.
[JG ] J. Grudin, The Computer Reaches Out: The Historical Continuity of Interface Design, Proceedings of the SIGCHI conference on Human factors in computing systems, 1990, p 261-268
[JLES] J Löwgren, Erik Stolterman, Design av informationsteknik, ISBN -44-00681-0.
[JF] Jacques Ferber, Multi-Agent Systems, ISBN -201-36048-9.
[JG] J. Gratch, A Domain-independent Framework for Modeling Emotion, Journal of Cognitive Systems Research, vol 5, no 4, 2004, p 269-306
[JH] John Hughes et al, Patterns of Home Life: Informing Design for Domestic Environments, Personal Technologies, Special Issue on Domestic Personal Computing, vol. 4, p. 25-38, London: Springer-Verlag.
[JH ] Jeffrey Hightower, Location Systems for Ubiquitous Computing, IEEE Computer, vol. 34, no. 8, pp. 57-66, Aug 2001.
[JH3] J. Habermas, ISBN 0807015075
[JK] Julie Khaskavsky, et al, Understanding the seductive experience, Communications of the ACM, May , Vol 42, no 5, pp 45-49.
[JL] Joelle Coutaz et al, Context is key, Communications of the ACM, March 2005, vol 48, no 3
[JJ] Jeff Johnson, Austin Henderson, Conceptual Models: Begin by Designing What to Design, Interactions of the ACM, Jan-Feb 2002.
[JM] Joseph McCarthy, The Virtual World Gets Physical, IEEE Internet Computing, Nov/Dec 2001, vol 5, number 6.
[JN] Jakob Nielsen, D is better than D, http://www.useit.com/alertbox/ .html, available Aug .
[JN ] Jakob Nielsen, Designing Web Usability, ISBN 1-56205-810-X
[JP] J. Preece et al, Human-Computer Interaction, ISBN -201-62769-8.
[JP ] J. Preece et al, Interaction design, ISBN -471-49278-7.
[JR] Jun Rekimoto, TimeScape: A Time Machine for the Desktop Environment, CHI' late-breaking results, 1999.
[JR2] Jim Rowson, The social media project at HP labs, http://netseminar.stanford.edu/sessions/ -11-01.html, available 2004-05-27
[JR ] J. Russell, Core Affect, Prototypical Emotional Episodes, and Other Things Called Emotion: Dissecting the Elephant, Journal of Personality and Social Psychology, vol 76, no 5, p 805-819.
[JS] John Searle, Speech acts, ISBN -52-109626-X
[JS ] John Searle, A taxonomy for illocutionary acts, in The philosophy of language, pp -155, Oxford University Press 1971.
[JS3] John Searle, The problem of consciousness, found on the net but also in The rediscovery of mind, MIT Press 1992.
[JU] J. Urda, Appraisal theory and social appraisals, retrieved 2005-11-17 at http://ged.insead.edu/fichiersti/inseadwp2005/2005-04.pdf
[KA] Keith Allan, Meaning and Speech acts, http://www.arts.monash.edu.au/ling/speech_acts_allan.shtml.
[KB] K. Berridge, Pleasures of the brain, Brain and Cognition vol , , pp -128.
[KD] D. Canamero et al, Emotionally grounded social interaction, in Human Cognition and Social Agent Technology, edited by Kerstin Dautenhahn, ISBN .
[KH] K. Hinckley, A Survey of Design Issues in Spatial Input, Proc. ACM UIST'94 Symposium on User Interface Software & Technology, April 1994, pp. 213-222.
[KL] Kristof Van Laerhoven, On-line Adaptive Context Awareness starting from low-level sensors, Licentiate thesis, Free University of Brussels, 1999.
[KS] K. Schmidt et al, Coordination mechanisms: Towards a Conceptual Foundation of CSCW Systems Design, Computer Supported Cooperative Work: The Journal of Collaborative Computing, vol 5, 1996, p 155-200.
[KS1] K. Sheldon, Achieving sustainable new happiness: Prospects, practices, and prescriptions. In A. Linley & S. Joseph (Eds.), Positive psychology in practice (pp. 127-145), 2004, Hoboken
[KS ] K. Sheldon, What is Satisfying About Satisfying Events? Testing Candidate Psychological Needs, Journal of Personality and Social Psychology, vol 80, no 2, p 325-339.
[LC] L. Cheng, Personal Contextual Awareness Through Visual Focus, IEEE Intelligent Systems, May , pp 16-20.
[LH] Lars Hallnäs, et al, Slow technology, Personal and Ubiquitous Computing, Vol. , No. , , pp. -212.
[LEJ] Lars-Erik Janlert, A wider view of interaction, draft version June 2003.
[LEJ2] Lars-Erik Janlert, A generic medium model for new media, draft version June .
[LS] L. Suchman, Plans and situated action, The problem of human machine communication, , ISBN 0-52133739-9
[MB] Miroslaw Bober, MPEG-7 Visual Shape Descriptors, IEEE Transactions on Circuits and Systems for Video Technology, vol 11, no 6, June 2001, pp 716-719.
[MB ] Meredith Belbin, Belbin website, available at www.belbin.com, September .
[MB2] M. A. Baker (ed.), Sex differences in human performance. Contemp. Psychology, 1987, 33, 964-965
[MB ] M. Bickhard, Emergence, http://www.lehigh.edu/~mhb /emergence.html, last visited /
[MC ] Mihaly Csikszentmihalyi, Creativity, ISBN -06-092820-4
[MC ] Mihaly Csikszentmihalyi, The meaning of things, ISBN -521-28774-x
[MC ] M. Csikszentmihalyi, Flow: the psychology of optimal experience
[MG] A Monk, N Gilbert, Perspectives on HCI, -12-504575-1
[MNWN] James H. McMillan, Jon F. Wergin, Understanding and Evaluating Educational Research, January 1998, Merrill Pub Co; ISBN: 0131935410.
[MR] M. Riedl, “ Computational Model and Classification Framework for Social Navigation , Masters thesis,
North Carolina State University, 2001,Available at http://www4.ncsu.edu:8030/~moriedl/publications/thesis.pdf ,
2002-11-30.
[MR ] M. Raghunath et al, Fostering a symbiotic handheld environment ,
IEEE Computer, p56-65, September 2003
[MR] M. Rosson, Usability Engineering Scenario-Based Development of Human-Computer
Interaction , IS”N 1-55860-712-9
[MS ] Munhindar Singh, “gent Communication Languages Rethinking the Principles ,
IEEE Computer, vol 31, no 12, pp 40-47, December 1998
[MS ] Munhinda Singh, “ social Semantics for “gent Communication Languages , Issues in “gent
Communication. Lecture Notes in Computer Science 1916 Springer 2000, ISBN 3-540-41144-5
[MW] Martijn van Welie, Task-”ased User Interface Design , PhD dissertation.
[MT] M. Toda, The Urge Theory of Emotion and Social Interaction , Chapter and , retrieved
-11-18 at
http://cogprints.org.
[MW ] M. Weiser, et al Designing Calm technology , Powergrid Journal
,
http://www.ubiq.com/hypertext/weiser/calmtech/calmtech.htm, 10/1 2002.
[MW ] M. Wooldridge, Mulit-agent systems , IS”N -971-49691-X, Wiley 2002.
[MW4] Wiberg, M. (1999). Extending the modality of travelling: Designing travelling support for mobile IT users,
Jyväskylä, : Proceedings of IRIS 22, "Enterprise Architectures for Virtual Organisations ", s 49 -58.
[MZ] Michelle Zhou, Visual task Characterization for “utomated Visual Discourse “nalysis , Proceedings,
ACM CHI 1998, pp 392-399
[MZ ] Martin Zimmerman, Human psychology , nd Ed Berlin: Springer Verlag 1989
[NF] N. Frijda, The emotions , Cambridge press,
[NG] Neil Gershenfeld, The nature of mathematical modelling , IS”N -521-57095-6
[NG2] Nick Gibbins, lecture notes found at www.ecs.soton.ac.uk/~nmg97r/hci/task-analysis/
[NG ] N. Goodman, Ways of worlds making , IS”N -915144-51-4
[NS] Stillings et al, Cognitive science, an introduction , Second edition, ISBN 0-262-19353-1
ISBN 0-201-633618NS2]
[NS ] N. Shedroff, “rticles and presentations retrieved
at http://www.nathan.com/thoughts/index.html
[OJ] O hare, Jennings, Foundations of distributed artificial intelligence , IS”N -471-00675-0
[OR] Odd-Wiking, Rahlff et al, Using personal traces in Context Space, Position paper for the CHI 2000 workshop
WS11.
[P”] P. ”arnard et al, Representing cognitive activity in complex tasks , Human-computer-interaction, 14, 93158. (1999)
[PB2] P. J. Brown, G.J.F. Jones., Context-aware retrieval: exploring a new environment for information retrieval
and information filtering . Personal and Ubiquitous Computing, 5, 4, pp.253-263, 2001.
[PD] P. Dourish, Where the action is , IS”N
-04196-0
[PE] P. Ekman. Emotion in the Human Face Cambridge University Press, Cambridge, UK,
.
[PG] Peter Gärdenfors, Hur homo blev sapiens , IS”N -578-0352-8.
[PG ] Peter Gärdenfors Conceptual Spaces -262-57219-2
[PJ] P. Jordan, Designing pleasurable products: An introduction to the new human factors,
London, UK, Taylor & Francis.
[PL] Peter Lucas, Human-Computer Interaction , Vol , pp
-336, 2001.
[PL ] P. Langley, Elements of machine learning, ISBN -55860-301-8.
[POSA1] Meunier, Sommerlad, Stal, Rohnert, Buschmann, Pattern-Oriented Software Architecture, Volume 1: A
system of patterns, ISBN -471-606952.
[POSA ] Schmidt, Stal, Rohnert, Buschmann, Pattern-Oriented Software Architecture, Volume 2: Patterns for
Concurrent and Networked Objects, ISBN 0-471-606952.
[PP] P. Persson et al, Stereotyping Characters: A Way of Triggering Anthropomorphism, AAAI Fall
Symposium, 3-5 November 2000, available at http://www.sics.se/~jarmo/SocIntAgent.htm 2002-11-09.
[PR] Pentti Routio, Arteology, Semiotics of Artifacts, http://www.uiah.fi/projects/metodi/ .htm,
available June 2003.
[PT] Todd et al Judgement of domain-specific intentionality based solely on motion cues, available at
http://www-abc.mpib-berlin.mpg.de/users/barrett/ 2003-11-21.
[RA] R. Arkin, Behaviour-Based Robotics, ISBN -262-01165-4.
[RB] R. Brooks, Flesh and Machines, ISBN -375-42079-7.
[RC] Craig, R. T, Communication theory as a field, Communication Theory, pp -161.
[RD] R. Darken, et al, Wayfinding Strategies and behaviours in Large Virtual Worlds, Proceedings of the ACM
CHI 96, pp 142-149.
[RL ] R. Larsen, Personality psychology, ISBN -07-111149-2.
[RL ] R. Layard, Happiness, has social science a clue? ,
http://www.sustainablepss.org/intro/Layard_2003b.pdf, retrieved Oct 29 2005.
[RMO] Rune Monö, Design for product understanding, ISBN -01105-x.
[RN] S. Russell, P. Norvig, Artificial intelligence, ISBN -13-360124-2.
[RN ] A new framework for Entertainment Computing: From Passive to Active Experience, In proceedings of
the ICEC 2005 LNCS 3711, p1-12, 2005
[RSW] R. Wurman, Information anxiety, ISBN 0-78-972410-3.
[RR ] Rolf Rolfsen, Contextual Awareness: Survey and Proposed Research Agenda, SINTEF Telecom and
Informatics, 1999.
[RR ] Rahlff, O. W., Rolfsen, R. K., Herstad, J., Thanh, D. v., Context and Expectations in Teleconversations ,
Proceedings of HCI International '99 (the 8th International Conference on Human-Computer Interaction).
[RR ] Rettie, R., Using Goffman's frameworks to explain presence and reality, Presence, Seventh
Annual International Workshop, 117-124, Valencia, ISPR
[RR ] R. Rettie, Presence and Embodiment in Mobile Phone Communication ,
Psychology journal, vol3, no 1, 2005, p 16-34
[RS] R. L. Schalock, The concept of quality of life: what we know and what we do not know, Journal of
disability research, vol 48, March 2004, pp 203-216
[RV] R. Veenhoven, Advances in understanding happiness, Happiness. Revue Québécoise de Psychologie,
1997, Vol. 19, pp. 29-74.
[RV ] Ruut Veenhoven, The four qualities of life , Journal of happiness studies
, Vol 1, pp 1-39.
[RV ] R. Vallacher, What People Think They are Doing, Psychological review
, vol , no , p -15.
[RW] Robert Williams, Mapping Genes that Modulate Mouse Brain Development: A Quantitative Genetic
Approach, in Mouse brain development (Goffinet AF, Rakic P, eds). Springer Verlag, New York, pp 21–49.
[SB] Steve Benford, Understanding and Constructing Shared Spaces with Mixed Reality Boundaries, ACM
Transaction on Computer-Human Interaction (ToCHI), 5 (3), pp.185-223, September 1998, ACM Press.
[SC] Schmidt, User interface for wearable computers – Don't Stop to Point and Click, Intelligent Interactive
Assistance & Mobile Multimedia Computing (IMC'2000) Rostock-Warnemünde, Germany - November 9-10, 2000,
http://www.teco.edu/~albrecht/publication/imc00/uis-for-wearables-abstract.html.
[SC ] S. Chan et al, Usability for mobile commerce across multiple form factors , Journal of Electronic
Commerce Research, No 3, 2002.
[SC ] Sanjay Chandrasekharan, Semantic web a distributed cognition view , Carleton University Cognitive
Science Technical Report 2002-13, www.carleton.ca/iis/TechReports/files/2002-13.pdf, visited 2005-05-12.
[SD] Steve Dubrow-Eichel, Building Resistance: Tactics for Counteracting Manipulation and Unethical
Hypothesis in Totalistic Groups , Suggestion the journal of professional and Ethical Hypnosis, 1985, pp 34-44.
Available at http://users/snip/net/~drsteve/articles/building_resistance.html 2002-12-01.
[SK] Kristoffersen, S. and F. Ljungberg, Mobile Use of IT, In Proceedings of IRIS22 1998, Jyvaskyla, Finland
[SM] Steve Mann, Wearable Computing Toward Humanistic Intelligence ,
IEEE Intelligent Systems, May 2001, pp 10-15.
[SP] Steven Pinker, How the mind works, ISBN -393-31848-6.
[SR] Rajeev Sharma et al, Toward multimodal human-computer interface ,
Proceedings of the IEEE, vol86, no 5 May 1998.
[SS] S. Smith et al, Using the Resources Model in Virtual Environment Design , Workshop on User Centered
Design and Implementation of Virtual Environments, S. Smith and M. Harrison (eds), pg 57-72, 30th September,
1999, University of York, York.
[SS ] Steven Shafer, Interaction issues in Context-Aware Intelligent Environments, to appear, available at
http://research.microsoft.com/easyliving/publications.htm.
[ST] Shawn Tseng et al, Credibility and Computing Technology, Communication of the ACM, May, Vol 42, no 5.
[SW] S. Wehrend and C. Lewis, A problem-oriented classification of visualization techniques. In Proceedings
IEEE Visualization '90, pp. 139-143.
[TH] T. Höllerer, et al, Exploring MARS: Developing Indoor and Outdoor User Interfaces to a Mobile Augmented
Reality System , Computers and Graphics, 23(6), Elsevier Publishers, Dec. 1999, pp. 779-785
[TJ] Timo Jokela, When good things happen to bad products , Interactions, Nov-Dec 2004.
[TM] Toshiyuki Masui, Real-World Graphical User Interfaces , Proceedings of the First International
Symposium on Handheld and Ubiquitous Computing, No. 1927, pp. 72-84, September 2000.
[TM2] T. Mitchell, Machine learning, ISBN 0-07-115467].
[TM3] T. Moran, T. P. (1980). A framework for studying human-computer interaction. In Methodology of
Interaction, R. A. Guedj et al., eds., North-Holland, 293-301.
[TM4] T. Moran [private conversation]
[TN] Tor Nörretranders, The user illusion, ISBN -140-23012-2
[TS] Thad Starner, Human powered wearable computing, IBM systems journal, vol , no & .
[TS ] T. Selker, Context-aware design and interaction in computer systems ,
IBM systems journal, vol 39, no 3&4, 2000.
[TSi] Thomas Sikora, MPEG-7 Visual Standard for Content Description – An Overview, IEEE Transactions on
Circuits and Systems for Video technology, vol 11, no 6, June 2001, pp 696-702.
[TWD] Winograd et al, Bringing design to software, ISBN -201-85491-0
[TWD ] Winograd and Flores, Understanding Computers and Cognition, ISBN -89391-050-3.
[UB] U. Borghoff, J. Schlichter, Computer-Supported Co-operative Work, ISBN -540-66984-1.
[UCIT] http://WWW.umu.se/ucit
[UI] Underkoffler, URP a luminous tangible workbench for urban planning and design ,
Proceedings of CHI-99, pp 386-393, 1999
[UL] U. Leonhardt, Supporting Location-Awareness in Open Distributed Systems, PhD thesis, Faculty of
engineering University of London, 1998.
[VB] V. Bellotti, et al, Intelligibility and Accountability: Human Considerations in Context-Aware Systems,
Human-Computer Interaction, vol 16, 2001, pp 193-212.
[WB] W. Buxton, Integrating the periphery and Context: A new taxonomy of telematics, Proceedings of
Graphics Interface, p -246.
[WC] Wayne Christensen, Self-directedness: a process approach to cognition, Axiomathes, vol , p
171-189, 2004
[WE] William Eamon, Technology and Magic, Technologia
, -64
[WK] W. Kintsch, The representation of knowledge in minds and machines , International journal of Psychology
1998, vol 6, no 33 pp 411-420.
[WM] Wendy Mackay, Media Spaces Environments for Informal Multimedia Interaction , Computer Supported
Co-operative Work, John Wiley 1999.
[XFR] Xristine Faulkner, Usability engineering, ISBN -333-77321-7
[YB] Yaneer Bar-Yam, Dynamics of Complex Systems, ISBN -201-55748-7
The Philosopher's Song
(Monty Python)
Immanuel Kant was a real pissant
Who was very rarely stable.
Heidegger, Heidegger was a boozy beggar
Who could think you under the table.
David Hume could out-consume
Wilhelm Friedrich Hegel,
And Wittgenstein was a beery swine
Who was just as schloshed as Schlegel.
There's nothing Nietzsche couldn't teach ya
'Bout the raising of the wrist.
Socrates himself was permanently pissed
John Stuart Mill, of his own free will,
On half a pint of shandy was particularly ill.
Plato, they say, could stick it away
Half a crate of whiskey every day.
Aristotle, Aristotle was a bugger for the bottle,
Hobbes was fond of his dram,
And Rene Descartes was a drunken fart:
"I drink, therefore I am"
Yes, Socrates, himself, is particularly missed;
A lovely little thinker but a bugger when he's pissed!
VII.2 Index
Abstraction, 13, 41, 49, 117
Accountability, 60
Action, 12, 83
Representation, 99
Action cycle, 84, 85, 125
Action loop, 95
Action tendency, 130
Activation, 127
Activity, 85, 161
Actuator, 76
Adaptation
Human, 113
Interactor, 63
System, 26
Aesthetics, 133
Communication, 178
Affect, 127
Affection level of interaction, 181
Affective learning, 115
Affordance, 60
Aggregation, 13
Aggregation, UML, 41
Analogue representation, 76
Analogue signal representation, 66
Analysis
Image, 68
Interaction, 82, 176
AND, 54
Anger, 127
Approximation, grouping by, 46
Argumentation, 204
Arousal, 125
Art, 141
Artefact, 73
Artificial intelligence, 97
ASCII code, 67
Associative memory, 24
Asymmetry in interaction, 180
Attention, 101
Auction, 204
Augmented reality, 165
Autonomy, 28, 129
Bandwidth of interaction, 180
Battery, 143
Behaviour, 20, 50
Behavioural model, 50
Best first search, 110
Binary number, 66
Binding design, 177
Biomedia, 226
Bit, 66
Blind search, 110
Bloom, learning taxonomy, 115
Bottom up, 229
Breadth first search, 110
Byte, 66
Cause and effect, 20, 109
Cave, 165
Centralised control, 196, 215
Challenge, 139
Channel for communication, 187
Choice, 216
Chunk, 99
Chunk of data, 104
Classification, 119, 215
Client-server, 199
Close coupling, 211
Clustering, 47
Cocktail effect, 104
Coding, 53
Coding of text, 67
Cognition, 60, 61
Cognitive friction, 243
Cognitive learning, 115
Cohesion, 23, 41
Cohesion in interaction, 182
Collision detection, 88
Colour, 70
Command, 196
Command based interaction, 215
Commitment, 104, 111, 124, 199
Common ground, 209
Common sense, 109
Communication, 12, 30, 190, 198
Competition, 147, 199
Complexity
System, 37
Complexity of interaction, 180
Computation, 21
Computer vision, 227
Computer-supported cooperative work, 205
Computing, 94
Concept space, 119
Conceptual level, 177
Conceptual view of interaction, 200
Conceptual view of processing, 94
Conceptual view of system, 49, 176
Conceptual view, design discrepancy, 243
Concerns, 125
Concurrent processing, 21
Conditioning, 116
Conflict resolution, 199, 203
Congruence in interaction, 180
Context, 13, 55, 160
Activity, 161
Application, 161
Cultural, 55
Definition, 160
Environment, 160
For reasoning, 62
H-H, 167
I-I, 169
Interaction property, 183
Language interpretation, 188
Navigation, 231
Physical, 55
Self, 161
Situation, 160
Social, 55
Technological, 56
T-I, 174
Contingency problem, 107, 109
Continuous model, 52
Continuous value, 52
Control, 195
Conventions, 199
Conversation, 189
Cooperation, 147, 199
Coordination, 156, 198
Coordination for interaction, 197
Correlation, 198
Correlation in interaction, 182
Coupling, 23, 41
Coupling in interaction, 182
CPU cache, 101
Creativity, 18
Creativity level of interaction, 181
Creativity, definition, 120
Creativity, human, 120
CSCW, 205
Cultural context, 55
Cycle, 43
Damping, 187
Data flow model, 51
Data, definition, 65
Database, 45
Decision cycle, 84, 85
Decision making, 112, 204
Declarative knowledge, 90
Deductive reasoning, 105
Depth first search, 110
Design, 18
Deterministic system, 52
Digital number, 67
Digital representation, 66, 76
Discourse, 188, 190
Discrimination, 215
Distributed behavior, 184
Distributed control, 196
Divide and conquer, 44, 109
Drive, 129
Duration of interaction, 183
Dynamic system, 35
Education, 17
Effectiveness in interaction, 180
Effectuator, 76
Effectuators, 62
Efficiency of interaction, 181
Emergence, 32
Emotion, 125, 126, 219
Empiricism, 90
Energy source, 141
Energy, and adaption, 43
Entropy level of interaction, 181
Environment, 13, 55, 160
Cultural, 55
Physical, 55
Social, 55
Episodic knowledge, 90
Epistemology, 90
Equilibrium, 34
Error
In model, 48
Ethics, 246
Evaluation, 245
Event, 125
Event, system view, 20
Evolution, 8, 26, 28
Human, 100, 140, 191
Learning, 114
Perception, 61
Thing, 57, 142, 160
Exhaustive search, 110
Experience, 131
Expert system, 98
Fear, 127
Feature space, 119
Feedback, 23
Feed-forward control, 25
Filter
Message, 188
Flexibility, 26
Floating point numbers, 67
Flow, 138
Focusing, 42
Formal model, 51
Frame, 93, 121
Frame problem, 110
Frequency of interaction, 183
Functional level, 177
Functional model, 51
Generalisation, 13
Generalisation, UML, 41
Geon, 229
Gestalt theory, 45
GIF, 69
Goal, 62, 84, 130
Representation, 99
Goal based design, 244
Golden ratio, 141
Greedy search, 110
Grouping, 41
Grouping for cooperation, 202
Groupware, 205
Gulf of evaluation, 244
Gulf of execution, 244
Habituation, 116
Happiness, 136
HCI, 152
HCI, definition, 152
HCI, Human Computer Interaction, 57
Hearing, 78
Hedonic experience, 132
Hermeneutic, 192
Heterogeneity, 28
Heuristic function, 110
Heuristics, 106, 110
Hierarchy
Contexts, 183
HITI model, 13
History of interaction, 181
HIT, 8
HITI, 8
HITI model, 11
Holism, 32
Human computer interaction, 57
Human computer interaction, definition, 152
Human Information/Idea Thing Interaction model, 11
Human language, 30
Idea, 65
Identification, 215
Ill posed problems, 81
Illocutionary speech act, 239
Image analysis, 68
Immersed VR, 165
Immersion, 164, 186
Immersion level of interaction, 181
Impulse, 125
Inductive learning, 119
Inductive reasoning, 106
Inference, 117
Information, 65, 66
Information representation, 68
Information theory, 66, 186
Inherit, 41
Inheritance, 41
Input, 20, 61
Intelligence, 93, 97
Intelligence, definition, 29
Intelligent agent, 58
Intelligent interface, 217
Intelligent thing, 58, 73, 97
Intention, 84, 125
Intention, Intentional state, 103
Intentional state, 103
Intentional view of interaction, 200
Intentional view of processing, 94
Intentional view of system, 49, 176
Interaction, 12
Interactor, 57
Human, 57
Information, 57
Thing, 57
Interface, 20, 41
Interpolation, 87
Inverse kinematics, 81
Is a relationship, 41
Joint intention, 199
JPEG, 69
Kisceral, 204
Knowledge, 63, 65, 90
Knowledge and learning, 116
Knowledge representation, 91, 92
Language, 30, 191
Learning, 27, 63, 118
Learning, definition, 113
Lexical design, 177
Lighting, 87
Long term memory, 100
Machine learning, 118
Manipulation, 216
Manuscript, 51
Meaning, 190
Means-end analysis, 106
Measure interaction, 179
Medium, 186, 188
Meme, 150
Memory, 99
Mental state, 103
Message, Human communication, 30
Metaphor, 51
Mind-body problem, 91
MIPS, 94
Mixed-initiative, 217
Mixed-initiative control, 196
MMI, 152
Mobile system, 35
Mobility, 35, 125
Model
Continuous, 52
Introduction, 47
Model, definition, 47
Modular, 23, 41
Mood, 125
Moore's law, 94
Motion, 35
Motive, 130
Motoric abilities, 80
Multiplicity, 21
Music, 89
Narration, 89, 221
Navigation, 216, 230
Need, 125, 129
Negotiation, 204
Neural pathway, 78
Noise, 179, 187
Norm, 199
Object, 73
Ontology, 90
Open environment, 55
Open loop control, 25
Open system, 25
Optimisation, 62
Order, 42
Output, 20, 62
Pareto efficiency, 205
Pattern, 34
Pattern matching, 229
Pattern recognition, 34, 40
Pedagogy, 115
Perceive, 60
Perception, 61, 125
Persuasive application, 220
Pervasive computing, 152
Phenomenological model, 50
Phoneme, 72
Physical environment, 55
Physical representation, 53
Physical view of interaction, 200
Physical view of system, 49, 176
Physical view, design discrepancy, 243
Plan, 109
Representation, 99
Planning, 108
Play, 116
Pleasant, 127
Positivism, 90
Power, Human, 141
Power, Thing, 143
Pragmatics, 188
Presence, 121, 164, 181, 198, 224
Social, 121, 123, 210, 211, 213
Prioritisation for cooperation, 203
Privacy, 236, 246
Problem solving, 105
Procedural knowledge, 90
Process, 94
Processing, 21
Human, 60
Information, 94
Processing concurrently, 21
Processing sequentially, 21
Producer-consumer, 199
Proprioception, 79
Prosody, 72
Protocol, 198
Psychomotor learning, 115
QoL, 134
QOL, 157, 243
Qualification problem, 109
Quality of Life, 134
Quality of Life, definition, 157
RAM memory, 100
Ramification problem, 110
Rational behaviour, 59
Ray-tracing, 87
Real value, 52
Reasoning, 62
Reasoning, definition, 105
Recursion, 44
Reductionism, 32
Reflection, 87
Reflectivity, 87
Reinforced learning, 27
Relational database, 45
Rendering, 87
Repetition, 43
Representation, 12, 60
A/D conversion, 76
Analogue information, 66
of image, 68, 227
of knowledge, 91
of sound, 71
of speech, 71
Physical, 53
Virtual, 53
Reuse, 26
RGB, 70
Scaling, 42
Scan, 232
Schema, 93, 99, 121
Science, 16
Script, 99
SDT, 130
Search, 110, 232
Security, 236, 246
Segmentation, 215
Self-Determination theory, 130
Self-esteem, 130
Semantic knowledge, 90
Semantic level, 177
Semantics
Communication, 178
Sensation, 61
Sense, 60
Sensing, 61, 76
Sensory registers, 99
Sequencing design, 177
Sequential processing, 21
Shadowing, 87
Shannon, 186
Sharing for cooperation, 202
Short term memory, 99
Signal, 65
Simulation, internal human, 81
Situated action, 94, 95
Situation, 160, 161, 173
Smell, 78
Social abilities, 122
Social awareness, 189, 210
Social context, 55, 167
Social presence, 121, 123, 210, 211, 213
Social quality, 212
Social rules, 124
Social support in user interface, 218
Socially competent, 123
Society, 55
Sound, 71
Specialisation, 116
Specialisation and cooperation, 202
Speech acts, 239
Speech synthesis, 89
Spoken language, 31
Stable system, 34
Static system, 35
Stigmergy, 197
Stimulus-Response diagram, 59
Stochastic systems, 52
Story telling, 89
Structural model, 50
Structure, 20, 50
Supervised learning, 27
Surprise, 127
Symmetry, 43
Syntactic design, 177
Synthesis
Interaction, 82, 176
Representation, 54
Speech, 89
System
Autonomy, 28
Complexity, 37, 38
Complexity reduction, 40
Control, 23
Deterministic, 52
Dynamic, 35
Environment, 55
Equilibrium, 34
Feedback, 23
Heterogeneity, 28
Layered, 43
Memory, 35
Modular, 41
Stable, 34
Static, 35
Stochastic, 52
Structure, 20
Time invariance, 35
System control, 216
Tacit knowledge, 117
Talk exchange, 190
Task, 161
Task, definition, 85
Taste, 78
Text, 70
Text output, 195
Texture mapping, 87
Thing, 73
Thinking, 93, 94
Time invariance, 35
Time line, 67
Time perception, 141
Time sharing, 204
Timing, 219
Token passing, 190
Top down, 229
Touch, 78
Transform
Message, 188
Transmitter, 188
Trial and error, 116
Turing test, 98
Turn taking, 190
Ubiquitous computing, 152
UML
Aggregation, 41
Generalisation, 41
Urge, 125
Valence, 127
Vector space, 46
Well-being, 134
Wicked problems, 242
Wireframe model, 87
Virtual
Definition, 53
Environment, 56
Virtual reality, 164
Virtual representation, 53
Visceral, 204
Vision, 78
Word, 72
VR glasses, 165
VR, Virtual reality, 164
"My pen is at the bottom of a page,
Which, being finished, here the story ends;
'Tis to be wished it had been sooner done,
But stories somehow lengthen when begun."
Byron
VII.3 Think along
Intentional / Conceptual / Physical
Cognition / Perception / Sensation
CPU / Filter / Sensor
Interaction – Representation of action - Interactor
Feedback – Signal – Transmitter/Receiver
Conversation – Utterance – Speaker/Listener
Emergence by interaction
examples: [ Society and family by language, Things by physical force]
Co-operation / Competition
Compromise / Control
examples:
[Aesthetics / Functionality, Complexity / Speed, Democracy / Dictatorship, Create / Use]
Wisdom / Knowledge / Message / Information / Language / Data / Packet / Quanta
Space / Time
Static / Dynamic
Attention / At ease
Structure / Chaos
Fixed / Mobile
Noun / Verb (Adjective / Adverb)
Actor / Action (Characteristics)
Object / Method (Attribute)
Program / Execution
Data / Operation
Representation / Transformation
Signal / Filter
Knowledge / Learning
Message / Modulation
Stephen Hogbin, Appearance and reality, ISBN 1-892836-05-x
Problem / Solution
Analysis / Synthesis
Read / Write
Recognise / Describe
Edge / Space
Content / Interface
Centralised / Distributed (modular, partitioned)
Computer / Network
CPU / Bus / Memory
Source / Channel
Object / Relation
Complexity / Flexibility / Adaptation / Feedback / Intelligence
Overview / Detail
Holism / Reductionism
Abstraction (disregard details)/ Instance
General / Particular
Population / Individual
Context / Self
Top down / Bottom up
Noise
Protocol
Classification / Correlation
Deduction / Induction
Field experiment / Survey / Formal theory
Design – Evaluate
Create – Evaluate
Program – Test
Construct – Test?
Aggregation / Coupling / Cohesion / Association / Relationship
Hierarchy (change resolution)/ Sequence / Concurrency / Layered