Decision Making, Models
Definition
Models of decision making attempt to describe, using stochastic differential equations which represent either neural activity or more abstract psychological variables, the dynamical process that produces a commitment to a single action/outcome as a result of incoming evidence that can be ambiguous as to the action it supports.
Background
Decision making can be separated into four processes (Doya, 2008):
1) Acquisition of sensory information to determine the state of the environment and the organism within it.
2) Evaluation of potential actions (options) in terms of the cost and benefit to the organism given its belief about the current state.
3) Selection of an action based on, ideally, an optimal tradeoff between the costs and benefits.
4) Use of the outcome of the action to update the costs and benefits associated with it.
Models of the dynamics of decision making have focused on perceptual decisions with only two possible responses available. The term two-alternative forced choice (TAFC) applies to such tasks when two stimuli are provided, but the term is now generally used for any binary choice discrimination task. In a perceptual decision, the response, or action, is directly determined by the current percept. Thus the decision in these tasks is essentially one of perceptual categorization, namely process (1) above, though the same models can be used for action selection given ambiguous information about the current state (process 3).
Evaluation of the possible responses in terms of their value or the resulting state's utility (process 2) (Sugrue et al., 2005), given both uncertainty in the current state and uncertainty in the outcomes of an action given the state, is the subject of expected utility theory and prospect theory. The necessary learning and updating of the values of different actions given the actual outcomes they produce (process 4) is the subject of instrumental conditioning and reinforcement learning, for example via temporal-difference learning (Seymour et al., 2004) and actor-critic models (Joel et al., 2002).
This article is primarily concerned with the dynamics of the production of either a single percept given unreliable sensory evidence (1), or a single action given uncertainty in the outcomes (3).
General features of discrimination tasks or TAFC tasks
In a TAFC task, a single decision variable can be defined, representing the likelihood ratio: the probability that evidence to date favors one alternative over the other. While TAFC tasks (Figure 1) have provided the dominant paradigm for analysis of choice behavior, the restriction to only two choices is lifted in many of the more recent models of decision making based on multiple variables, allowing for the fitting of a wider range of data sets.
The tasks can be based either on a free response paradigm, in which a subject responds after as much or little time as she wants, or on an interrogation (forced response) paradigm, in which the stimulus duration is limited and the subject must make a response within a given time interval. The free response paradigm is perhaps more powerful, since each trial produces two types of information: accuracy (correct or incorrect) and response time. However, by variation of the time allowed when responses are forced, both paradigms are valuable for constraining models, since they can provide a distribution of response times for both correct and incorrect trials, as well as the proportion of trials that are correct or incorrect with a given stimulus. These behavioral data can be modified by task difficulty, by task instructions (such as "respond rapidly" versus "respond accurately"), or by reward schedules and inter-trial intervals.
Most models of the dynamics of decision making focus on tasks where the time from stimulus onset to response is no more than one to two seconds, a timescale over which neural spiking can be maintained. Choices requiring much more time than this are likely to depend upon multiple memory stores, neural circuits and strategies, which become difficult to identify, extract and model in a dynamical systems framework (a state-based framework is more appropriate).
Figure 1. Scheme of the two-alternative forced choice (TAFC) task. Two streams of sensory input, each containing stimulus information, or a signal ($S_1$ and $S_2$), combined with noise ($\sigma_1\eta_1$ and $\sigma_2\eta_2$), are compared in a decision-making circuit. The circuit must produce one of two responses (A or B) indicating which of the two signals is the stronger. The optimal method for achieving this discrimination is via the sequential probability ratio test (SPRT), which requires the decision-making circuit to integrate inputs over time.
In the standard setup of the models, two parallel streams of noisy sensory input are available, with each stream supplying evidence in support of one of the two allowed actions (see Figure 1). The sensory inputs can be either discrete or continuous quantities and can arrive discretely or continuously in time. The majority of models focus on continuous update in continuous time, and so can be formulated as stochastic differential equations (Gillespie, 1992; Lawler, 2006). The sensory evidence, which is momentary, produces a decision variable, which indicates the likelihood of choosing one of the two alternatives given current evidence and all prior evidence. The primary difference between models is in how sensory evidence determines the decision variable.
While most models incorporate a form of temporal integration of evidence (Cain and Shea-Brown, 2012) and include a negative interaction between the two sources of evidence, differences arise in the stability of initial states, which determines whether integration is perfect, and in the nature of the interaction: feedforward between the inputs, feedforward between outputs, or feedback from outputs to decision variables (Bogacz et al., 2006). Models can also differ in their choice of decision threshold (the value of the decision variable at which a response is produced) in the free response paradigm (Simen et al., 2009; Deneve, 2012; Drugowitsch et al., 2012), and in particular whether this parameter or other model parameters, such as input gain, which also affect the response time distribution, are static or dynamic across a trial (Shea-Brown et al., 2008; Thura et al., 2012).
As the time available for acquisition of sensory information increases, the accuracy of responses increases in a perceptual discrimination task. Accuracy is measured as the probability of choosing the response leading to more reward, which is equivalent to obtaining a veridical percept in these tasks. All of the models to be discussed below can produce such a speed-accuracy tradeoff by parameter adjustment. If parameters are adjusted so as to increase the mean response time, then accuracy increases.
Such a tradeoff is observed in behavioral tasks when either instructions or the schedule of reward and punishment encourages participants to respond as quickly as possible while being less concerned about making errors, or to respond as accurately as possible while being less concerned about the time it takes to decide. The simplest way to effect such a tradeoff is to adjust the inter-trial interval, which, if long compared to the decision time, means that the accuracy of responses affects reward rate much more than the time for the decision itself. Models can replicate such behavior when optimal performance is based on the maximal reward rate.
Typical parameter adjustments to increase accuracy while slowing responses would be a multiplicative scaling down of inputs (and the concurrent input noise) or a scaling up of the range across which the decision variable can vary by raising a decision threshold (Figs 2-3) (Ratcliff, 2002; Simen et al., 2009; Balci et al., 2011). A similar effect can be achieved in alternative, attractor-based models through the level of a global applied current, which affects the stability of the initial "undecided" state (Figs 6-9) (Miller and Katz, 2013).
From a neuroscience perspective, the decision variable is typically interpreted as either the mean firing rate of a group of neurons or a linear combination of rates of many neurons (Beck et al., 2008) (the difference between two groups being the simplest such combination). There has been remarkable progress in matching the observed firing patterns of neurons (Newsome et al., 1989; Shadlen and Newsome, 2001; Huk and Shadlen, 2005) with the dynamics of a decision variable in more mathematical models of decision making (Glimcher, 2001; Gold and Shadlen, 2001; Glimcher, 2003; Smith and Ratcliff, 2004; Gold and Shadlen, 2007; Ratcliff et al., 2007). This has led to the introduction of biophysically based models of neural circuits (Wang, 2008), which have accounted for much of the concordance between simple mathematical models, neural activity and behavior.
Optimal Decision Making
An optimal decision-making strategy either maximizes expected reward over a given time or minimizes risk. In TAFC perceptual tasks, a response is either correct or an error. In the interrogation paradigm, with fixed time per decision, the optimal strategy is the one leading to greatest accuracy, that is, the lowest expected error rate. In the free response paradigm, the optimal strategy either delivers the greatest accuracy for a given mean response time, or produces the fastest mean response time for a given accuracy. In these tasks, the sequential probability ratio test (SPRT), introduced by Wald and Wolfowitz (Wald, 1947; Wald and Wolfowitz, 1948), and, in its continuous form, the drift diffusion model (DDM) (Ratcliff and Smith, 2004; Ratcliff and McKoon, 2008), lead to optimal choice behavior by any of these measures of optimality (see Bogacz et al., 2006, for a thorough review).
Using SPRT in the interrogation paradigm, one simply accumulates over time the log-likelihood ratio of the probabilities of each alternative given the stream of evidence, where the observed sensory input per unit time has a certain probability given alternative A and another probability given alternative B. Integrating the log-likelihood ratio over time, after setting the initial condition as the log-likelihood ratio of the prior probabilities, $\log[P(A)/P(B)]$, leads to a quantity $\log[P(A|S)/P(B|S)]$ which is greater than zero if A is more likely than B given the stimulus and less than zero otherwise. Thus the optimal procedure is to choose A or B depending on the sign of the summed, or in the continuous limit, integrated, log-likelihood ratio.
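The interrogation-paradigm procedure can be illustrated with a minimal sketch. It assumes, purely for illustration, that each evidence sample is drawn from a Gaussian whose mean depends on the correct alternative (`mu_a` or `mu_b`) with a shared standard deviation `sigma`; these names, the Gaussian form and the function names are assumptions and do not come from the article.

```python
import math
import random


def log_likelihood_ratio(x, mu_a, mu_b, sigma):
    """log[p(x | A) / p(x | B)] for Gaussian evidence with a shared sigma."""
    return ((x - mu_b) ** 2 - (x - mu_a) ** 2) / (2.0 * sigma ** 2)


def interrogate(samples, mu_a, mu_b, sigma, prior_a=0.5):
    """Sum the log-likelihood ratio over all samples and choose by its sign.

    The running total starts at the log ratio of the priors, log[P(A)/P(B)].
    Returns (choice, total).
    """
    total = math.log(prior_a / (1.0 - prior_a))
    for x in samples:
        total += log_likelihood_ratio(x, mu_a, mu_b, sigma)
    return ("A" if total > 0 else "B"), total


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical trial: alternative A is correct, evidence is weak and noisy.
    samples = [rng.gauss(0.2, 1.0) for _ in range(200)]
    print(interrogate(samples, mu_a=0.2, mu_b=-0.2, sigma=1.0))
```

With an unequal prior and no samples, the choice simply follows the prior, which is the role of the initial condition described in the text.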
In the free response paradigm a stopping criterion must be included. This is achieved by setting two thresholds for the integrated log-likelihood ratio, a positive one ($+a$) for choice A and a negative one ($-b$) for choice B. The further the thresholds are from the origin, the lower the chance of error, but the longer the integration time before reaching a decision.
Thus the thresholds reflect the fraction of errors that can be tolerated, with $a = \log\frac{1-\beta}{\alpha}$ and $-b = \log\frac{\beta}{1-\alpha}$, where $\alpha$ is the probability of choosing A when B is correct and $\beta$ is the probability of choosing B when A is correct.
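As a worked illustration of how tolerated error rates set the stopping criterion, the sketch below uses Wald's standard approximation for the SPRT thresholds, upper = log[(1 - beta)/alpha] and lower = log[beta/(1 - alpha)], where alpha is the probability of choosing A when B is correct and beta the probability of choosing B when A is correct. The function names and the toy sequence of log-likelihood-ratio increments are hypothetical.

```python
import math


def sprt_thresholds(alpha, beta):
    """Return (upper, lower) thresholds on the summed log-likelihood ratio.

    alpha: tolerated probability of choosing A when B is correct.
    beta:  tolerated probability of choosing B when A is correct.
    """
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    return upper, lower


def free_response(llr_increments, upper, lower, start=0.0):
    """Accumulate increments until a threshold is crossed.

    Returns (choice, number_of_samples_used); choice is None if the
    increments run out before either threshold is reached.
    """
    total = start
    n = 0
    for n, increment in enumerate(llr_increments, start=1):
        total += increment
        if total >= upper:
            return "A", n
        if total <= lower:
            return "B", n
    return None, n


if __name__ == "__main__":
    for err in (0.10, 0.05, 0.01):
        print(err, sprt_thresholds(err, err))
```

Lowering the tolerated error rates moves both thresholds further from the origin, so more samples are needed before a response is produced.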
The Models
Accumulator Models
The first models of decision making in humans or animals were accumulator models, sometimes called counter models or race models. In these models, evidence accumulates separately for each possible outcome. This has the advantage that if many outcomes are possible, the models are simply extended by addition of one more variable for each additional alternative, with evidence for each alternative accumulating within its allotted variable. In the interrogation paradigm, one simply reads out the highest variable, so the choice depends on the sign of the difference of the two variables in the TAFC paradigm.
Thus, if the difference in accumulated quantities matched the difference in integrated log probabilities of the two types of evidence, such readout from an accumulator model would be equivalent to an SPRT, and so would be optimal. In the free response paradigm, accumulator models produce a choice when any one of the accumulated variables reaches a threshold, so these models can be called "race to threshold models" or simply "race models". The original accumulator models included neither interaction between accumulators, nor the ability for variables to decrease.
However, for decisions in nature or in laboratory protocols, evidence in favor of one alternative is typically evidence against the other alternative. This is particularly problematic in the free response paradigm, because the time at which one variable reaches threshold and produces the corresponding choice is independent of evidence accumulated for other choices. Thus the behavior of simple accumulator models is not optimal.
Comparisons of response time distributions of these models with behavioral responses showed the models to be inaccurate in this regard: observed response-time distributions are skewed with a long tail, whereas the response times of accumulator models were much more symmetric about the mean. These discrepancies led to the ascendance of Ratcliff's drift diffusion model (Ratcliff, 1978).
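The race-to-threshold scheme described in this section can be sketched as a simple counter model. This is an illustrative assumption rather than any specific published model: in each time step, each accumulator gains one count with its own probability (`p_a`, `p_b`), counts never decrease, the accumulators do not interact, and the first to reach `threshold` determines the choice. All names and parameter values are hypothetical.

```python
import random


def race_trial(p_a, p_b, threshold, rng, max_steps=100_000):
    """Run one free-response trial of a two-accumulator counter model.

    Returns (choice, steps); choice is None if no threshold is reached.
    """
    count_a = count_b = 0
    for step in range(1, max_steps + 1):
        # Independent, non-decreasing accumulation for each alternative.
        if rng.random() < p_a:
            count_a += 1
        if rng.random() < p_b:
            count_b += 1
        hit_a = count_a >= threshold
        hit_b = count_b >= threshold
        if hit_a and hit_b:
            return rng.choice(["A", "B"]), step  # tie: pick at random
        if hit_a:
            return "A", step
        if hit_b:
            return "B", step
    return None, max_steps


def simulate(p_a, p_b, threshold, n_trials=500, seed=0):
    """Return (fraction of trials choosing A, mean number of steps)."""
    rng = random.Random(seed)
    trials = [race_trial(p_a, p_b, threshold, rng) for _ in range(n_trials)]
    frac_a = sum(choice == "A" for choice, _ in trials) / n_trials
    mean_steps = sum(steps for _, steps in trials) / n_trials
    return frac_a, mean_steps


if __name__ == "__main__":
    for threshold in (5, 20, 80):
        print(threshold, simulate(0.6, 0.4, threshold))
```

Note that the winning accumulator's crossing time does not depend on the count held by the loser, which is the property the text identifies as the source of suboptimal behavior.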
The Drift Diffusion Model
The drift diffusion model (DDM) is an integrator with thresholds (Figure 2), or more precisely, the decision variable, $X$, follows a Wiener process with two absorbing boundaries (Figure 3). It includes a deterministic (drift) term, $S$, proportional to the rate of incoming evidence and a diffusive noise term of variance $\sigma^2$, which produces variability in response times and can lead to errors:

$$\frac{dX}{dt} = S + \sigma\,\eta(t),$$

where $\eta(t)$ is a white noise term defined by $\langle \eta(t)\,\eta(t') \rangle = \delta(t - t')$.
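A minimal simulation sketch of this process is given below, assuming the model takes the form dX/dt = S + sigma * eta(t) with absorbing thresholds at +a and -b. It uses a simple Euler-Maruyama discretization, in which each time step of length `dt` adds a Gaussian increment of standard deviation sigma * sqrt(dt). The function name, step size and parameter values are illustrative assumptions.

```python
import math
import random


def ddm_trial(S, sigma, a, b, dt=1e-3, x0=0.0, rng=None, max_time=60.0):
    """Simulate one free-response DDM trial by Euler-Maruyama integration.

    S: drift rate; sigma: noise amplitude; thresholds at +a (choice A)
    and -b (choice B); x0: starting point (nonzero values act as a bias).
    Returns (choice, decision_time); choice is None if max_time elapses.
    """
    if rng is None:
        rng = random.Random()
    x, t = x0, 0.0
    noise_scale = sigma * math.sqrt(dt)
    while t < max_time:
        x += S * dt + noise_scale * rng.gauss(0.0, 1.0)
        t += dt
        if x >= a:
            return "A", t
        if x <= -b:
            return "B", t
    return None, t


if __name__ == "__main__":
    rng = random.Random(0)
    trials = [ddm_trial(S=1.0, sigma=1.0, a=1.0, b=1.0, rng=rng)
              for _ in range(500)]
    accuracy = sum(choice == "A" for choice, _ in trials) / len(trials)
    mean_rt = sum(t for _, t in trials) / len(trials)
    print(f"fraction choosing A: {accuracy:.3f}, mean decision time: {mean_rt:.3f}")
```

Because the time step is finite, threshold crossings are detected slightly late; a smaller `dt` reduces this discretization bias at the cost of longer run time.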
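[Figure 2 schematic: inputs $S_1$ and $S_2$, each with an added noise term, are combined into a single decision variable $X$. Choice A is made if $X > 0$ or $X = +T$; Choice B is made if $X < 0$ or $X = -T$.]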
Figure 2. The drift diffusion model (DDM). The DDM is a one-dimensional model, so the two competing inputs and their noise terms are first combined: in this case $S = S_1 - S_2$ and $\sigma^2 = \sigma_1^2 + \sigma_2^2$. If the model is scaled to a given level of noise then its three independent parameters are drift rate ($S$) and positions of each of the two thresholds ($a$, $-b$) with respect to the starting point. When the model was introduced, these parameters were assumed fixed for a given subject in a specific task. The threshold spacing determines where one operates in the speed-accuracy tradeoff, so can be optimized as a function of the relative cost for making an incorrect response and the time between trials. Any starting point away from the midpoint represents bias or prior information. The drift rate is proportional to stimulus strength.
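The dependence of the best threshold on the time between trials can be illustrated numerically. The sketch below assumes symmetric thresholds at +a and -a, an unbiased starting point, and the standard closed-form expressions for the error rate and mean decision time of a drift diffusion process (as reviewed in Bogacz et al., 2006). Reward rate is taken here as the fraction of correct responses divided by the mean decision time plus the inter-trial interval; this simple definition, the grid search and the parameter values are assumptions for illustration.

```python
import math


def error_rate(a, S, sigma):
    """Probability of hitting the wrong threshold (thresholds at +/-a)."""
    return 1.0 / (1.0 + math.exp(2.0 * a * S / sigma ** 2))


def mean_decision_time(a, S, sigma):
    """Mean time to reach either threshold from an unbiased start."""
    return (a / S) * math.tanh(a * S / sigma ** 2)


def reward_rate(a, S, sigma, iti):
    """Correct responses per unit time, including the inter-trial interval."""
    return (1.0 - error_rate(a, S, sigma)) / (mean_decision_time(a, S, sigma) + iti)


def best_threshold(S, sigma, iti, grid=None):
    """Grid search for the threshold that maximizes reward rate."""
    if grid is None:
        grid = [0.01 * k for k in range(1, 501)]  # a = 0.01 ... 5.00
    return max(grid, key=lambda a: reward_rate(a, S, sigma, iti))


if __name__ == "__main__":
    for iti in (0.5, 2.0, 8.0):
        a = best_threshold(S=1.0, sigma=1.0, iti=iti)
        print(f"inter-trial interval {iti}: best threshold ~ {a:.2f}")
```

In this sketch a longer inter-trial interval favors a higher threshold, that is, slower but more accurate responses.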
[Figure 3: decision variable $X$ with thresholds at $+a$ and $-b$ and starting point $X(0) = 0$; labels include $St$ and $P(X,t)$.]