Lecture 5: Estimation, Part 2
Copyright by Ahmad Hidayatno ST, MT, Teknik Elektro Undip Semarang
-
Estimation
Problems of Dimensionality
Computational Complexity
Component Analysis and Discriminants
Hidden Markov Models
Hidden Markov Model: Extension of Markov Chains
Part II - Component Analysis and Discriminants
Combine features in order to reduce the dimension of the feature space
Linear combinations are simple to compute and tractable
Project high-dimensional data onto a lower-dimensional space
Two classical approaches for finding an optimal linear transformation:
PCA (Principal Component Analysis): the projection that best represents the data in a least-squares sense (see the sketch after this list)
MDA (Multiple Discriminant Analysis): the projection that best separates the data in a least-squares sense
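As a concrete illustration of the PCA option above, here is a minimal sketch (not from the lecture; the function and variable names are our own) that projects data onto the leading eigenvectors of the sample covariance matrix:

```python
import numpy as np

def pca_project(X, n_components):
    """Project rows of X (n_samples x n_features) onto n_components axes."""
    X_centered = X - X.mean(axis=0)          # center the data
    cov = np.cov(X_centered, rowvar=False)   # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigendecomposition (ascending)
    order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
    W = eigvecs[:, order[:n_components]]     # top principal directions
    return X_centered @ W                    # least-squares-best projection

# Toy usage: reduce 5-dimensional data to 2 dimensions
X = np.random.randn(100, 5)
print(pca_project(X, 2).shape)  # (100, 2)
```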
- Hidden Markov Models: Markov Chains
Goal: make a sequence of decisions
Processes that unfold in time: states at time t are influenced by the state at time t-1
Applications: speech recognition, gesture recognition, part-of-speech tagging, DNA sequencing, and any temporal process without memory
$\omega^T = \{\omega(1), \omega(2), \omega(3), \ldots, \omega(T)\}$: a sequence of T states
We might have $\omega^6 = \{\omega_1, \omega_4, \omega_2, \omega_2, \omega_1, \omega_4\}$
The system can revisit a state at different steps, and not every state need be visited
- First-order Markov models
The production of any sequence is described by the transition probabilities:
$$P(\omega_j(t+1) \mid \omega_i(t)) = a_{ij}$$
-
The model: $\theta = (a_{ij}, T)$
$$P(\omega^T \mid \theta) = a_{14} \cdot a_{42} \cdot a_{22} \cdot a_{21} \cdot a_{14} \cdot P(\omega(1) = \omega_i)$$
Example: speech recognition
Production of spoken words
Production of the word "pattern", represented by phonemes:
/p/ /a/ /tt/ /er/ /n/ /-/ (where /-/ denotes the silent state)
Transitions from /p/ to /a/, /a/ to /tt/, /tt/ to /er/, /er/ to /n/, and /n/ to the silent state
A small numeric sketch of this product of transition probabilities follows below.
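A minimal sketch of the sequence-probability product above (our own toy numbers; the 4-state transition matrix A and the uniform initial distribution pi are assumptions, not values from the lecture):

```python
import numpy as np

def sequence_probability(seq, A, pi):
    """P(omega^T | theta) = pi[seq[0]] * product of A[seq[t-1], seq[t]]."""
    p = pi[seq[0]]
    for prev, curr in zip(seq[:-1], seq[1:]):
        p *= A[prev, curr]   # one factor a_ij per transition
    return p

# Assumed 4-state transition matrix (each row sums to 1)
A = np.array([[0.20, 0.30, 0.10, 0.40],
              [0.30, 0.20, 0.30, 0.20],
              [0.10, 0.40, 0.30, 0.20],
              [0.25, 0.25, 0.25, 0.25]])
pi = np.full(4, 0.25)  # assumed uniform initial-state probabilities

# The slide's sequence omega^6 = {w1, w4, w2, w2, w1, w4}, 0-indexed
seq = [0, 3, 1, 1, 0, 3]
print(sequence_probability(seq, A, pi))  # = pi_1 * a14 * a42 * a22 * a21 * a14
```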
- Hidden Markov Model (HMM)
Interaction of the visible states with the hidden states: $\sum_k b_{jk} = 1$ for all $j$, where $b_{jk} = P(v_k(t) \mid \omega_j(t))$.
Three problems are associated with this model:
The evaluation problem
The decoding problem
The learning problem
- The evaluation problem
This is the probability that the model produces a sequence $V^T$ of visible states:
$$P(V^T) = \sum_{r=1}^{r_{\max}} P(V^T \mid \omega_r^T)\, P(\omega_r^T)$$
where each r indexes a particular sequence $\omega_r^T = \{\omega(1), \omega(2), \ldots, \omega(T)\}$ of T hidden states, and
$$P(V^T \mid \omega_r^T) = \prod_{t=1}^{T} P(v(t) \mid \omega(t)) \qquad (1)$$
$$P(\omega_r^T) = \prod_{t=1}^{T} P(\omega(t) \mid \omega(t-1)) \qquad (2)$$
-
Using equations (1) and (2), we can write:
$$P(V^T) = \sum_{r=1}^{r_{\max}} \prod_{t=1}^{T} P(v(t) \mid \omega(t))\, P(\omega(t) \mid \omega(t-1))$$
Interpretation: the probability that we observe the particular sequence of T visible states $V^T$ is the sum, over all $r_{\max}$ possible sequences of hidden states, of the conditional probability that the system made a particular transition, multiplied by the probability that it then emitted the visible symbol in our target sequence.
Example: let $\omega_1, \omega_2, \omega_3$ be the hidden states, $v_1, v_2, v_3$ the visible states, and $V^3 = \{v_1, v_2, v_3\}$ the observed sequence of visible states. Then one term of the sum is
$$P(\{v_1, v_2, v_3\}) = P(\omega_1)\,P(v_1 \mid \omega_1)\,P(\omega_2 \mid \omega_1)\,P(v_2 \mid \omega_2)\,P(\omega_3 \mid \omega_2)\,P(v_3 \mid \omega_3) + \ldots$$
(the full sum has all $3^3 = 27$ possible terms!)
-
First possibility: the hidden sequence $\omega_1(t=1), \omega_2(t=2), \omega_3(t=3)$ emits $v_1, v_2, v_3$ (the term written above).
Second possibility: the hidden sequence $\omega_2(t=1), \omega_3(t=2), \omega_1(t=3)$ emits $v_1, v_2, v_3$, contributing the term
$$P(\omega_2)\,P(v_1 \mid \omega_2)\,P(\omega_3 \mid \omega_2)\,P(v_2 \mid \omega_3)\,P(\omega_1 \mid \omega_3)\,P(v_3 \mid \omega_1) + \ldots$$
and so on for the remaining hidden sequences.
Therefore:
$$P(\{v_1, v_2, v_3\}) = \sum_{\text{all } 3^3 \text{ possible sequences of hidden states}} \; \prod_{t=1}^{3} P(v(t) \mid \omega(t))\, P(\omega(t) \mid \omega(t-1))$$
[Figure: trellis diagrams of the first two possibilities, showing the hidden states visited at t = 1, 2, 3 and the visible states $v_1, v_2, v_3$ they emit.]
A brute-force sketch of this sum appears below.
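A minimal brute-force evaluation sketch (our own; the toy matrices A, B and uniform pi are assumptions). It enumerates every hidden-state sequence, exactly like the 27-term example, so it is feasible only for very small T:

```python
import itertools
import numpy as np

def evaluate_brute_force(obs, A, B, pi):
    """P(V^T): sum emission*transition products over all hidden sequences."""
    n_states, T = A.shape[0], len(obs)
    total = 0.0
    for path in itertools.product(range(n_states), repeat=T):  # n^T paths
        p = pi[path[0]] * B[path[0], obs[0]]
        for t in range(1, T):
            p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
        total += p
    return total

# Assumed toy model: 3 hidden states, 3 visible symbols
A = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.5, 0.3],
              [0.3, 0.2, 0.5]])   # transition probabilities a_ij
B = np.array([[0.6, 0.3, 0.1],
              [0.1, 0.6, 0.3],
              [0.3, 0.1, 0.6]])   # emission probabilities b_jk
pi = np.full(3, 1/3)

print(evaluate_brute_force([0, 1, 2], A, B, pi))  # sums all 3^3 = 27 terms
```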
- The decoding problem (optimal state sequence)
Given a sequence of visible states $V^T$, the decoding problem is to find the most probable sequence of hidden states.
This problem can be expressed mathematically as: find the single best sequence of hidden states
$$\hat\omega(1), \hat\omega(2), \ldots, \hat\omega(T) = \arg\max_{\omega(1), \ldots, \omega(T)} P(\omega(1), \ldots, \omega(T), v(1), v(2), \ldots, v(T) \mid \theta)$$
Note that the summation disappears, since we want to find only the one unique best case!
-
where $\theta = [\pi, A, B]$:
$\pi_i = P(\omega(1) = \omega_i)$ (initial state probabilities)
$A = \{a_{ij}\}$, $a_{ij} = P(\omega(t+1) = \omega_j \mid \omega(t) = \omega_i)$ (transition probabilities)
$B = \{b_{jk}\}$, $b_{jk} = P(v(t) = v_k \mid \omega(t) = \omega_j)$ (emission probabilities)
In the preceding example, this computation corresponds to selecting the best path among:
$\{\omega_1(t=1), \omega_2(t=2), \omega_3(t=3)\}$, $\{\omega_2(t=1), \omega_3(t=2), \omega_1(t=3)\}$,
$\{\omega_3(t=1), \omega_1(t=2), \omega_2(t=3)\}$, $\{\omega_3(t=1), \omega_2(t=2), \omega_1(t=3)\}$,
$\{\omega_2(t=1), \omega_1(t=2), \omega_3(t=3)\}$
A dynamic-programming sketch of this search follows below.
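The standard efficient solution is dynamic programming in the style of the Viterbi algorithm (our naming; the lecture only states the argmax). A minimal sketch, reusing the assumed toy model from the evaluation sketch:

```python
import numpy as np

def viterbi(obs, A, B, pi):
    """Most probable hidden-state sequence for the visible sequence obs."""
    n_states, T = A.shape[0], len(obs)
    delta = np.zeros((T, n_states))           # best path prob ending in state j
    psi = np.zeros((T, n_states), dtype=int)  # backpointers
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        for j in range(n_states):
            scores = delta[t - 1] * A[:, j]   # extend each path by one step
            psi[t, j] = np.argmax(scores)
            delta[t, j] = scores[psi[t, j]] * B[j, obs[t]]
    path = [int(np.argmax(delta[-1]))]        # best final state
    for t in range(T - 1, 0, -1):             # backtrack along the pointers
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

A = np.array([[0.5, 0.3, 0.2], [0.2, 0.5, 0.3], [0.3, 0.2, 0.5]])
B = np.array([[0.6, 0.3, 0.1], [0.1, 0.6, 0.3], [0.3, 0.1, 0.6]])
pi = np.full(3, 1/3)
print(viterbi([0, 1, 2], A, B, pi))  # best hidden path, here [0, 1, 2]
```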
- The learning problem (parameter estimation)
This third problem consists of determining a method to adjust the model parameters $\theta = [\pi, A, B]$ so as to satisfy a certain optimization criterion: we need to find the model $\hat\theta$ that maximizes the probability of the observation sequence,
$$\hat\theta = \arg\max_{\theta} P(V^T \mid \theta)$$
We use an iterative procedure such as Baum-Welch or gradient ascent to find this local optimum.
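A minimal, unscaled Baum-Welch sketch (our own; adequate for short sequences, whereas real implementations rescale or work in log space; the toy observation sequence and the random initialization are assumptions):

```python
import numpy as np

def baum_welch(obs, A, B, pi, n_iter=50):
    """Iteratively re-estimate theta = [pi, A, B] to raise P(V^T | theta)."""
    A, B, pi = A.copy(), B.copy(), pi.copy()
    obs = np.asarray(obs)
    n, T = A.shape[0], len(obs)
    for _ in range(n_iter):
        # Forward pass: alpha[t, i] = P(v(1..t), w(t) = i)
        alpha = np.zeros((T, n))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        # Backward pass: beta[t, i] = P(v(t+1..T) | w(t) = i)
        beta = np.zeros((T, n))
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        likelihood = alpha[-1].sum()          # P(V^T | current theta)
        gamma = alpha * beta / likelihood     # expected state occupancies
        xi = np.zeros((n, n))                 # expected transition counts
        for t in range(T - 1):
            xi += np.outer(alpha[t], B[:, obs[t + 1]] * beta[t + 1]) * A / likelihood
        # Re-estimation (the M step)
        pi = gamma[0]
        A = xi / gamma[:-1].sum(axis=0)[:, None]
        newB = np.zeros_like(B)
        for k in range(B.shape[1]):
            newB[:, k] = gamma[obs == k].sum(axis=0)
        B = newB / gamma.sum(axis=0)[:, None]
    return pi, A, B

# Assumed toy run: fit a 2-state, 3-symbol model to a short visible sequence
rng = np.random.default_rng(0)
obs = [0, 1, 1, 0, 2, 1, 0, 0, 2, 1]
A0 = rng.dirichlet(np.ones(2), size=2)  # random row-stochastic initialization
B0 = rng.dirichlet(np.ones(3), size=2)
pi0 = np.array([0.5, 0.5])
pi, A, B = baum_welch(obs, A0, B0, pi0)
print(np.round(A, 3), np.round(B, 3))
```

Each iteration does not decrease $P(V^T \mid \theta)$, which is why the procedure converges to a local (not necessarily global) optimum, as noted above.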