M&M Task 3: Brains and Computers: Chaudhary (2016): Brain-Computer Interfaces for Communication and Rehabilitation
BCI
CONNECTIONISM
THAGARD
CHAPTER 1: APPROACHES TO COGNITIVE SCIENCE
Beginnings
• Plato: the most important knowledge comes from concepts such as virtue that people know innately, independently of sense experience
• Descartes + Leibniz: knowledge can be gained just by thinking and reasoning = rationalism
• Aristotle: knowledge = rules that are learned from experience = empiricism
• Kant: rationalism + empiricism = knowledge depends on both sense experience and the innate
capacities of the mind
• Then came experimental psychology (Wundt), which was later overtaken by behaviourism (Watson)
• I skipped the rest of the historical overview since it seems less relevant here
• Thinking can be best understood in terms of representational structures in the mind and
computational procedures that operate on those structures
• CRUM might be wrong → it might be inadequate to explain fundamental facts about the mind
• But it’s the most successful approach thus far at explaining the mind
• CRUM assumes that the mind has mental representations like data structures, and computational
procedures like algorithms
• CRUM: mind <-> brain <-> computation
• Thinking arises from applying computational procedures to mental representations
• Cognitive theory postulates a set of representational structures and a set of processes that operate on
these structures
• A computational model makes these structures and processes more precise by interpreting them by
analogy with computer programs that consist of data structures and algorithms
• Vague ideas about representations can be supplemented by precise computational ideas about data
structures, and mental processes can be defined algorithmically
• To test this model, it must be implemented in a software program in a programming language
• This program may run on a variety of hardware platforms (PCs, Macs…) or it may be designed for a
specific kind of hardware that has many processors working in parallel
• 3 stages of the development of cognitive theories: discovery, modification, and evaluation
• The running program can contribute to evaluation of the model and theory in 3 ways:
1. Shows that the postulated representations and processes are computationally realizable
2. In order to show not only the computational realizability of a theory but also its psychological
plausibility, the program can be applied qualitatively to various examples of thinking
3. To show a much more detailed fit between the theory and human thinking, the program can
be used quantitatively to generate detailed predictions about human thinking that can be
compared with the results of psychological experiments
Evaluating approaches to mental representations
CHAPTER 7: CONNECTIONS
1. Representational power
• In local connectionist networks, the units have specifiable interpretations, such as particular concepts or propositions
• The activation of a unit can be interpreted as a judgement about the applicability of a concept or the truth of a proposition
• Links can be one way or symmetric
• Links are either excitatory or inhibitory
• Distributed representations
o Feedforward network where info flows upward through the network
o Bottom row: input, top row: output
o Information is distributed all over the network, i.e. a concept is represented by a pattern of activation across many units rather than by a single unit
• Recurrent networks: activation from the output units feeds back into the input units
• Links between units suffice for representing simple associations, but lack the representational power to
capture more complex kinds of rules, ex: “for any x, if there is a y such that y is a geek and x likes y, then x
is a geek” → difficult to represent in connectionist networks
o But can try by using synchrony to link units that represent associated elements: a unit or
package of units that represents the x that does the liking can be made to fire with the same
temporal pattern as the units representing the y that is liked
o Or another way to represent relational info is to use vectors = lists of numbers that can be
understood as the firing rates of groups of neurons (see the sketch below)
▪ Vectors can distinguish between agents (the ones that do the liking) and objects
(the ones that are liked)
▪ They can be combined to represent the complex relational info needed for analogical
reasoning
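One concrete way vectors can be combined to represent relational info like "x likes y" is role-filler binding via circular convolution (holographic reduced representations). This particular scheme, and everything in the sketch below, is an illustrative assumption rather than something specified in the notes:

```python
import numpy as np

# Sketch: encode the relation "x likes y" as a single distributed vector by
# binding role vectors (agent, object) to filler vectors (x, y). The binding
# scheme (circular convolution, as in holographic reduced representations)
# is an illustrative assumption.
rng = np.random.default_rng(0)
dim = 512

def rand_vec():
    return rng.normal(0, 1 / np.sqrt(dim), dim)

def bind(a, b):          # circular convolution
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def unbind(trace, a):    # approximate inverse: circular correlation with the role
    return np.real(np.fft.ifft(np.fft.fft(trace) * np.conj(np.fft.fft(a))))

agent, obj = rand_vec(), rand_vec()        # role vectors
x, y = rand_vec(), rand_vec()              # filler vectors (the individuals)

likes_x_y = bind(agent, x) + bind(obj, y)  # distributed representation of "x likes y"

# Decode who the agent was: the unbound vector is most similar to x
retrieved = unbind(likes_x_y, agent)
sims = {name: float(v @ retrieved) for name, v in [("x", x), ("y", y)]}
print(sims)   # similarity to x should be clearly larger than to y
```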
2. Computational power
A. Problem solving
• Parallel network
1. Concepts such as 'outgoing' are represented by units
2. Positive internal constraints are represented by excitatory connections: if 2 concepts are
related by a positive constraint, then the units representing the elements should be linked by
an excitatory link
3. Negative internal constraints are represented by inhibitory connections: if 2 concepts are
related by a negative constraint, then the units representing the elements should be linked by
an inhibitory link
4. An external constraint can be captured by linking units representing elements that satisfy the
external constraint to a special unit that affects the units to which it is linked either positively
(excitatory links) or negatively (inhibitory links)
• The neural network computes by spreading activation between units that are linked to each other
o Unit with excitatory link to an active unit → gain activation
o Unit with inhibitory link to an active unit → decreased activation
• Relaxation = constraints can be satisfied in parallel by repeatedly passing activation among all units,
until after some number of cycles of activity all units have reached stable activation levels (see the sketch after the list below)
o Adjusting the activation of all units based on the units to which they are connected, until all
units have stable high or low activations
• Settling = achieving stability
i. Planning
ii. Decision
iii. Explanation
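A minimal sketch of relaxation by parallel constraint satisfaction, following the mapping described above (concepts as units, positive constraints as excitatory links, negative constraints as inhibitory links, an external constraint as a special active input). The specific update rule, decay value, and example concepts are my own illustrative assumptions:

```python
import numpy as np

# Units: 0 = "outgoing", 1 = "friendly", 2 = "shy"  (hypothetical concepts)
n_units = 3
weights = np.zeros((n_units, n_units))

# Positive internal constraint -> symmetric excitatory link
weights[0, 1] = weights[1, 0] = 0.5    # outgoing <-> friendly
# Negative internal constraint -> symmetric inhibitory link
weights[0, 2] = weights[2, 0] = -0.5   # outgoing <-> shy

# External constraint: a special, permanently active input pushes "outgoing" up
external = np.array([0.3, 0.0, 0.0])

activation = np.zeros(n_units)         # start with neutral activations
decay, a_min, a_max = 0.05, -1.0, 1.0

for cycle in range(200):
    net = weights @ activation + external              # spread activation in parallel
    grow = np.where(net > 0, a_max - activation, activation - a_min)
    new_act = np.clip(activation * (1 - decay) + net * grow, a_min, a_max)
    diff = np.max(np.abs(new_act - activation))
    activation = new_act
    if diff < 1e-4:                                    # settled: stable activations
        break

print(f"settled after {cycle} cycles:", np.round(activation, 2))
```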
B. Learning
o Connectionist networks can also identify statistical associations between input and output features that are more
subtle than rules
o Limitations:
▪ Requires a supervisor to say whether an error has been made
▪ Tends to be slow, requiring hundreds or thousands of examples to train even a simple network
C. Language
• Interconnected units can represent hypotheses about what letters are present and about what words
are present, and relaxing the network can pick the best
overall interpretation
• Construction-integration model of discourse comprehension: meaning is determined by activation flow in the network
o Which interpretation gets activated depends on how the input info affects the various units and links
• Children have a tendency to wrongly form past tenses (ex: goed, hitted)
o McClelland showed how a connectionist network can be trained to reproduce the children’s
error using distributed representations rather than rules
o But there is a debate about this result (not summarised here, as it does not seem essential)
3. Psychological plausibility
4. Neurological plausibility
• Real neurons are much more complex than the units in an artificial network, which merely pass
activation to each other
• They also have neurotransmitters and undergo changes in synaptic and non-synaptic properties
• Each artificial unit can be taken to represent a neuronal group, a complex of neurons that work together to play a
processing role
• Local networks use symmetric links between units, but the synapses connecting real neurons are one-way
o (Though neurons do have reciprocal neural pathways that allow them to influence each other)
o A real neuron has either excitatory links to other neurons or inhibitory links to other neurons, but not a mixture
• Actual neural networks have the feedforward character of backpropagation networks
o But there is no known neurological mechanism by which the same pathways that feed
activation forward can also be used to propagate error correction backward
• → despite all these differences, it’s still a good analogy
5. Practical applicability
• Education: reading → to read, you need to work out which letters and which words are present while simultaneously taking
into account meaning and context
o Reading is parallel constraint satisfaction → the constraints simultaneously involve spelling and
meaning and context
• Design = parallel constraint satisfaction
o Ex: architect’s design for a building must take into account numerous constraints: cost,
purpose of building, surroundings, aesthetic
• Engineering → ex: training networks to recognise bombs, underwater objects, and handwriting
ACTIVATION FUNCTIONS
ACTIVATION FUNCTIONS IN NEURAL NETWORKS
Activation function? It’s just a function that you use to get the output of a node
Why are derivatives/differentiation used? When updating the curve (the weights), the derivative tells us in which direction and by how much to
change or update, depending upon the slope
• If we do not apply an activation function, then the output signal is simply a linear function
• A neural network without an activation function is just a linear regression model, which has
limited power and does not perform well most of the time
• We want our neural network to learn and compute not just linear functions but something more
complicated than that
• Also, without an activation function our neural network would not be able to learn and model other
complicated kinds of data such as images, videos, audio, speech, etc.
→ We apply an activation function f(x) to make the network more powerful: to give it the ability to
learn something complex from data and represent non-linear, arbitrary functional
mappings between inputs and outputs
→ Hence, using a nonlinear activation, we are able to generate non-linear mappings from inputs to outputs
→ An activation function should be differentiable, so that we can find the slope of the curve (e.g. the sigmoid curve) at any point, which is what gradient-based updating relies on
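To illustrate the point above that a network without a nonlinear activation collapses into a single linear map, here is a small sketch (the matrices and sizes are arbitrary assumptions):

```python
import numpy as np

# Stacking purely linear layers collapses to a single linear map, so without a
# nonlinear activation the "deep" network is just linear regression.
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
x = rng.normal(size=3)

deep_linear = W2 @ (W1 @ x)        # two linear layers, no activation
collapsed = (W2 @ W1) @ x          # one equivalent linear layer
print(np.allclose(deep_linear, collapsed))   # True

# With a nonlinearity in between, no single matrix reproduces the mapping:
nonlinear = W2 @ np.tanh(W1 @ x)
print(np.allclose(nonlinear, collapsed))     # False
```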
Types of activation functions:
1. Sigmoid or logistic
2. Tanh (hyperbolic tangent)
3. ReLU (rectified linear unit)
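A small sketch of the three activation functions listed above together with their derivatives, written in plain NumPy as an illustration (the formulations follow the standard definitions):

```python
import numpy as np

# The three activation functions, with their derivatives (the derivatives are
# what gradient-based learning uses to decide how to update weights).
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)            # ranges over (0, 0.25]

def tanh(x):
    return np.tanh(x)

def tanh_prime(x):
    return 1.0 - np.tanh(x) ** 2    # steeper gradients than sigmoid near 0

def relu(x):
    return np.maximum(0.0, x)

def relu_prime(x):
    return (x > 0).astype(float)    # 0 for negative inputs, 1 for positive

x = np.linspace(-3, 3, 7)
print(np.round(sigmoid(x), 3), np.round(tanh(x), 3), np.round(relu(x), 3))
```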
GRACEFUL DEGRADATION
CONNECTIONISM: AN INTRODUCTION
• Graceful degradation is the property of a network whose performance progressively becomes worse as
the number of its randomly destroyed units or connections increases
• The alternative property might be called catastrophic degradation = the property of a system whose
performance plummets to zero when even a single component of the system is destroyed
• That the performance of biological neural networks degrades gradually -- "gracefully" -- not
catastrophically, is another reason why artificial neural networks are more accurate information
processing models than classical ones
o After all, altering even a single rule in a classical computer model tends to bring the computer
implementing the damaged program to a "crashing" halt
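A toy illustration of graceful degradation (my own construction, not from the text): zero out an increasing fraction of a weight matrix's connections and watch the output error grow gradually rather than collapse all at once:

```python
import numpy as np

# Randomly destroy an increasing fraction of a network's connections and
# measure how far the output drifts from the undamaged output.
rng = np.random.default_rng(1)
W = rng.normal(size=(50, 100))          # a "trained" weight matrix (stand-in)
x = rng.normal(size=100)
clean_output = W @ x

for fraction in (0.0, 0.1, 0.3, 0.5):
    damaged = W.copy()
    mask = rng.random(W.shape) < fraction    # destroy this fraction of connections
    damaged[mask] = 0.0
    error = np.linalg.norm(damaged @ x - clean_output) / np.linalg.norm(clean_output)
    print(f"{int(fraction * 100):>3}% connections destroyed -> relative error {error:.2f}")
```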
TECHTARGET
• Graceful degradation = ability of a computer, machine, electronic system or network to maintain
limited functionality even when a large portion of it has been destroyed or rendered inoperative
• The purpose: to prevent catastrophic failure. Ideally, even the simultaneous loss of multiple
components does not cause downtime in a system with this feature
• The operating efficiency or speed declines gradually as an increasing number of components fail
• Graceful degradation is an outgrowth of effective fault management, which is the component of
network management concerned with detecting, isolating and resolving problems
• In Web site design, the term refers to the judicious implementation of new or sophisticated features to
ensure that most Internet users can effectively interact with pages on the site
DELTA RULE
TECHOPEDIA
Definition?
The Delta rule in machine learning and neural network environments is a specific type of backpropagation that
helps to refine connectionist ML/AI networks, making connections between inputs and outputs with layers of
artificial neurons
Explanation
• In general, backpropagation has to do with recalculating input weights for artificial neurons using a
gradient method
o Delta learning does this using the difference between a target activation and an actual
obtained activation
o Using a linear activation function, network connections are adjusted
• Another way to explain the Delta rule is that it uses an error function to perform gradient descent
learning:
o Essentially, the technology compares an actual output with a targeted output and tries to find a
match
o If there is no match → the program makes changes
o The actual implementation of the Delta rule is going to vary according to the network and its
composition → but by employing a linear activation function, the Delta rule can be useful in
refining some types of neural network systems with particular flavours of backpropagation
WIKIPEDIA
• Delta rule is a gradient descent learning rule for updating the weights of the inputs to artificial neurons
in a single-layer neural network
• It is a special case of the more general backpropagation algorithm. For a neuron j with activation
function g(x), the delta rule for j's ith weight w_ji is given by:
Δw_ji = α (t_j − y_j) g′(h_j) x_i
where α is the learning rate, t_j is the target output, h_j = Σ_i x_i w_ji is the weighted sum of the neuron's inputs, y_j = g(h_j) is the actual output, and x_i is the ith input
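A minimal sketch of the delta rule in code, matching the formula above (the training data, bias handling, and learning rate are illustrative assumptions, not from the sources):

```python
import numpy as np

# Delta rule training of a single sigmoid neuron on the logical AND function.
def g(x):                             # activation function g
    return 1.0 / (1.0 + np.exp(-x))

def g_prime(x):                       # its derivative g'
    s = g(x)
    return s * (1.0 - s)

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
X = np.hstack([X, np.ones((4, 1))])   # append a constant bias input of 1
t = np.array([0., 0., 0., 1.])        # target outputs t_j (logical AND)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=3)     # weights w_ji (including a bias weight)
alpha = 0.5                           # learning rate

for epoch in range(2000):
    for x_i, t_j in zip(X, t):
        h = np.dot(w, x_i)            # h_j = sum_i x_i * w_ji
        y = g(h)                      # actual output y_j = g(h_j)
        w += alpha * (t_j - y) * g_prime(h) * x_i   # delta rule update

print(np.round(g(X @ w), 2))          # outputs move toward [0, 0, 0, 1]
```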