Meeting 05: Clustering Applications

Overview of Clustering
Rong Jin


Page 1:

Overview of Clustering

Rong Jin

Page 2:

Outline
- K-means for clustering
- Expectation-Maximization algorithm for clustering
- Spectral clustering (if time permits)

Page 3:

Clustering

Find the underlying structure of the given data points.

[Figure: scatter plot of data points, axes labeled $$$ and age]

Page 4:

Application (I): Search Result Clustering

Page 5:

Application (II): Navigation

Page 6:

Application (III): Google News

Page 7:

Application (IV): Visualization

Islands of Music (Pampalk et al., KDD'03)

Page 8:

Application (V): Image Compression

http://www.ece.neu.edu/groups/rpl/kmeans/

Page 9:

How to Find a Good Clustering?

Minimize the sum of distances within clusters:

$$\{C_j, m_{i,j}\} = \arg\min_{C_j,\, m_{i,j}} \sum_{j=1}^{6} \sum_{i=1}^{n} m_{i,j}\, \| x_i - C_j \|^2$$

$$m_{i,j} = \begin{cases} 1 & x_i \text{ belongs to the } j\text{-th cluster} \\ 0 & x_i \text{ does not belong to the } j\text{-th cluster} \end{cases}$$

$$\sum_{j=1}^{6} m_{i,j} = 1 \qquad \text{(any } x_i \text{ belongs to a single cluster)}$$

[Figure: data points grouped into clusters C1-C5]

Page 10:

How to Efficiently Cluster Data?

$$\{C_j, m_{i,j}\} = \arg\min_{C_j,\, m_{i,j}} \sum_{j=1}^{6} \sum_{i=1}^{n} m_{i,j}\, \| x_i - C_j \|^2$$

Memberships $m_{i,j}$ and centers $C_j$ are correlated.

Given the memberships $\{m_{i,j}\}$:

$$C_j = \frac{\sum_{i=1}^{n} m_{i,j}\, x_i}{\sum_{i=1}^{n} m_{i,j}}$$

Given the centers $\{C_k\}$:

$$m_{i,j} = \begin{cases} 1 & j = \arg\min_k \| x_i - C_k \|^2 \\ 0 & \text{otherwise} \end{cases}$$
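These two coupled updates are exactly the alternation that k-means performs. Below is a minimal NumPy sketch of the procedure, assuming squared Euclidean distance; the function name, initialization scheme, and stopping test are illustrative choices, not part of the slides:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Alternate the two updates above: memberships given centers, centers given memberships."""
    rng = np.random.default_rng(seed)
    # Start with a random guess of cluster centers: k distinct points from the data.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Membership step: m[i, j] = 1 for the center j closest to x_i.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # (n, k)
        labels = dists.argmin(axis=1)
        # Center step: C_j = mean of the points currently assigned to cluster j.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # memberships can no longer change, so we have converged
        centers = new_centers
    return centers, labels
```

Each pass of the loop keeps or decreases the objective on Page 9, which is why the alternation converges to a local minimum.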

Page 11:

K-means for Clustering

K-means:
- Start with a random guess of the cluster centers
- Determine the membership of each data point
- Adjust the cluster centers

Pages 12-13: repeat of Page 11 (animation frames of the same three steps).

Page 14:

K-means
1. Ask user how many clusters they'd like (e.g., k = 5).

Page 15:

K-means
1. Ask user how many clusters they'd like (e.g., k = 5).
2. Randomly guess k cluster center locations.

Page 16:

K-means
1. Ask user how many clusters they'd like (e.g., k = 5).
2. Randomly guess k cluster center locations.
3. Each data point finds out which center it's closest to. (Thus each center "owns" a set of data points.)

Page 17:

K-means
1. Ask user how many clusters they'd like (e.g., k = 5).
2. Randomly guess k cluster center locations.
3. Each data point finds out which center it's closest to.
4. Each center finds the centroid of the points it owns.

Page 18:

K-means
1. Ask user how many clusters they'd like (e.g., k = 5).
2. Randomly guess k cluster center locations.
3. Each data point finds out which center it's closest to.
4. Each center finds the centroid of the points it owns.

Any computational problem?

Computational complexity: O(N), where N is the number of points? Not quite: each iteration compares every point with every center, so an iteration costs O(kN) distance computations.

Page 19:

Improve K-means

Group points by region:
- KD-tree
- SR-tree

Key difference:
- Find the closest center for each rectangle
- Assign all the points within a rectangle to one cluster

A simplified sketch of this region-based assignment follows.
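To make the idea concrete, here is a crude stand-in for the tree-based method: a fixed grid plays the role of the KD/SR-tree rectangles, so there is one center lookup per cell instead of one per point. This is an illustrative simplification for low-dimensional data, not the algorithm from the slides:

```python
import numpy as np

def assign_by_region(X, centers, n_bins=8):
    """Approximate membership step: bucket points into grid cells ("rectangles"),
    then assign every point in a cell to the center closest to the cell's mean."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    cell = np.floor((X - lo) / (hi - lo + 1e-12) * n_bins).astype(int)
    cell = np.clip(cell, 0, n_bins - 1)                  # grid coordinates per point
    keys = np.ravel_multi_index(cell.T, (n_bins,) * X.shape[1])
    labels = np.empty(len(X), dtype=int)
    for key in np.unique(keys):
        mask = keys == key
        rep = X[mask].mean(axis=0)                       # representative of the cell
        labels[mask] = ((centers - rep) ** 2).sum(axis=1).argmin()
    return labels
```

Tree-based versions additionally prune candidate centers per rectangle using distance bounds, which keeps the assignment exact rather than approximate.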

Pages 20-29: Improved K-means (figure-only animation frames: rectangles of points are assigned, region by region, to their closest centers; Page 20 repeats the two key-difference bullets of Page 19).

Page 30:

A Gaussian Mixture Model for Clustering

Assume that the data are generated from a mixture of Gaussian distributions.

For each Gaussian distribution:
- Center: $\mu_i$
- Variance: $\sigma_i^2$ (ignored here: assumed known)

For each data point:
- Determine its membership: $z_{ij} = 1$ if $x_i$ belongs to the $j$-th cluster

Pages 31-32:

Learning a Gaussian Mixture (with known covariance)

Probability of a data point $x_i$:

$$p(x_i) = \sum_j p(x_i, \mu_j) = \sum_j p(\mu_j)\, p(x_i \mid \mu_j) = \sum_j p(\mu_j)\, \frac{1}{(2\pi\sigma^2)^{d/2}} \exp\!\left( -\frac{\| x_i - \mu_j \|^2}{2\sigma^2} \right)$$

Log-likelihood of the data:

$$\log \prod_i p(x_i) = \sum_i \log \sum_j p(\mu_j)\, \frac{1}{(2\pi\sigma^2)^{d/2}} \exp\!\left( -\frac{\| x_i - \mu_j \|^2}{2\sigma^2} \right)$$

Apply MLE to find the optimal parameters $\{\mu_j, p(\mu_j)\}$.

Page 33:

Learning a Gaussian Mixture (with known covariance)

E-step:

$$E[z_{ij}] = p(\mu_j \mid x_i) = \frac{p(x_i \mid \mu_j)\, p(\mu_j)}{\sum_{n=1}^{k} p(x_i \mid \mu_n)\, p(\mu_n)} = \frac{\exp\!\left( -\frac{1}{2\sigma^2} \| x_i - \mu_j \|^2 \right) p(\mu_j)}{\sum_{n=1}^{k} \exp\!\left( -\frac{1}{2\sigma^2} \| x_i - \mu_n \|^2 \right) p(\mu_n)}$$

Page 34:

Learning a Gaussian Mixture (with known covariance)

M-step:

$$\mu_j = \frac{\sum_{i=1}^{m} E[z_{ij}]\, x_i}{\sum_{i=1}^{m} E[z_{ij}]}$$

$$p(\mu_j) = \frac{1}{m} \sum_{i=1}^{m} E[z_{ij}]$$
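Put together, Pages 33-34 give one EM iteration. A compact NumPy sketch for spherical Gaussians with a known, shared variance, matching the slides' assumption; `em_gmm` and its defaults are illustrative names:

```python
import numpy as np

def em_gmm(X, k, sigma2=1.0, n_iter=50, seed=0):
    """EM for a mixture of spherical Gaussians with known, shared variance sigma2."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    mu = X[rng.choice(m, size=k, replace=False)]   # initial guess for the centers
    prior = np.full(k, 1.0 / k)                    # initial p(mu_j), uniform
    for _ in range(n_iter):
        # E-step: E[z_ij] proportional to exp(-||x_i - mu_j||^2 / (2 sigma2)) * p(mu_j).
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)   # (m, k)
        sq -= sq.min(axis=1, keepdims=True)        # shared shift per row; avoids underflow
        resp = np.exp(-sq / (2 * sigma2)) * prior
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted means and cluster priors, as on Page 34.
        weight = resp.sum(axis=0)                  # sum_i E[z_ij], shape (k,)
        mu = (resp.T @ X) / weight[:, None]
        prior = weight / m
    return mu, prior, resp
```

With hard 0/1 responsibilities instead of soft ones, this loop reduces to the k-means alternation from Page 10.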

Page 35:

Gaussian Mixture Example: Start

Page 36:

After First Iteration

Page 37:

After 2nd Iteration

Page 38:

After 3rd Iteration

Page 39:

After 4th Iteration

Page 40:

After 5th Iteration

Page 41:

After 6th Iteration

Page 42:

After 20th Iteration

Page 43:

Mixture Model for Doc Clustering

A set of language models $\Theta = \{\theta_1, \theta_2, \ldots, \theta_K\}$, where each

$$\theta_i = \{\, p(w_1 \mid \theta_i),\, p(w_2 \mid \theta_i),\, \ldots,\, p(w_V \mid \theta_i) \,\}$$

Pages 44-46:

Mixture Model for Doc Clustering

A set of language models $\Theta = \{\theta_1, \ldots, \theta_K\}$, $\theta_i = \{ p(w_1 \mid \theta_i), \ldots, p(w_V \mid \theta_i) \}$.

Probability of a document $d_i$:

$$p(d_i) = \sum_j p(d_i, \theta_j) = \sum_j p(\theta_j)\, p(d_i \mid \theta_j) = \sum_j p(\theta_j) \prod_{k=1}^{V} p(w_k \mid \theta_j)^{tf(w_k, d_i)}$$

Introduce a hidden variable $z_{ij}$: $z_{ij} = 1$ if document $d_i$ is generated by the $j$-th language model $\theta_j$.

Page 47:

Learning a Mixture Model

E-step:

$$E[z_{ij}] = p(\theta_j \mid d_i) = \frac{p(d_i \mid \theta_j)\, p(\theta_j)}{\sum_{n=1}^{K} p(d_i \mid \theta_n)\, p(\theta_n)} = \frac{\prod_{m=1}^{V} p(w_m \mid \theta_j)^{tf(w_m, d_i)}\; p(\theta_j)}{\sum_{n=1}^{K} \prod_{m=1}^{V} p(w_m \mid \theta_n)^{tf(w_m, d_i)}\; p(\theta_n)}$$

K: number of language models

Page 48:

Learning a Mixture Model

M-step:

$$p(\theta_j) = \frac{1}{N} \sum_{i=1}^{N} E[z_{ij}]$$

$$p(w_k \mid \theta_j) = \frac{\sum_{i=1}^{N} E[z_{ij}]\; tf(w_k, d_i)}{\sum_{i=1}^{N} E[z_{ij}]\; |d_i|}$$

N: number of documents
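A sketch of this EM procedure for the multinomial mixture, working in log space since per-document likelihoods underflow quickly; the random initialization, the smoothing constant, and the function name are illustrative assumptions:

```python
import numpy as np

def em_doc_mixture(tf, K, n_iter=50, seed=0):
    """EM for the document mixture model. tf is an (N, V) term-frequency matrix."""
    rng = np.random.default_rng(seed)
    N, V = tf.shape
    logp_w = np.log(rng.dirichlet(np.ones(V), size=K))   # log p(w_k | theta_j), (K, V)
    log_prior = np.full(K, -np.log(K))                   # log p(theta_j), uniform
    for _ in range(n_iter):
        # E-step: log p(d_i | theta_j) = sum_k tf(w_k, d_i) * log p(w_k | theta_j).
        log_post = tf @ logp_w.T + log_prior             # (N, K), unnormalized
        log_post -= log_post.max(axis=1, keepdims=True)  # stabilize before exponentiating
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)          # E[z_ij]
        # M-step: reestimate p(theta_j) and p(w_k | theta_j) as on Page 48,
        # with a small smoothing constant so that no log(0) appears.
        log_prior = np.log(resp.mean(axis=0))
        counts = resp.T @ tf + 1e-3                      # (K, V)
        logp_w = np.log(counts / counts.sum(axis=1, keepdims=True))
    return np.exp(log_prior), np.exp(logp_w), resp
```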

Page 49:

Examples of Mixture Models

Page 50:

Other Mixture Models
- Probabilistic latent semantic indexing (PLSI)
- Latent Dirichlet Allocation (LDA)

Page 51:

Problems (I)

Both k-means and mixture models need to compute cluster centers under an explicit distance measure. Given an unusual distance measure, the centers of clusters can be hard to compute, e.g.,

$$\| x - x' \| = \max\left( |x_1 - x'_1|,\; |x_2 - x'_2|,\; \ldots,\; |x_n - x'_n| \right)$$

[Figure: three points x, y, z with distances $\|x - y\|$ and $\|x - z\|$ under this measure]

Page 52:

Problems (II)

Both k-means and mixture models look for compact clustering structures. In some cases, connected clustering structures are more desirable.

Page 53:

Graph Partition

MinCut: partition the graph into two parts with the minimal number of cut edges.

[Figure: a graph bipartitioned with CutSize = 2]

Page 54:

2-way Spectral Graph Partitioning

Weight matrix $W$: $w_{i,j}$ is the weight between vertices $i$ and $j$.

Membership vector $q$:

$$q_i = \begin{cases} +1 & i \in \text{cluster } A \\ -1 & i \in \text{cluster } B \end{cases}$$

$$\text{CutSize} = J = \frac{1}{4} \sum_{i,j=1}^{n} w_{i,j}\, (q_i - q_j)^2$$

$$q^* = \arg\min_{q \in \{-1,+1\}^n} J$$

Page 55:

Solving the Optimization Problem

$$q^* = \arg\min_{q \in \{-1,+1\}^n} \frac{1}{4} \sum_{i,j} w_{i,j}\, (q_i - q_j)^2$$

Directly solving the above problem requires a combinatorial search of exponential complexity. How can we reduce the computational complexity?

Page 56:

Relaxation Approach

Key difficulty: each $q_i$ has to be either -1 or +1.

Relax $q_i$ to be any real number, and impose the constraint

$$\sum_{i=1}^{n} q_i^2 = n$$

Expanding the objective, with $d_i = \sum_j w_{i,j}$:

$$J = \frac{1}{4} \sum_{i,j} (q_i - q_j)^2\, w_{i,j} = \frac{1}{4} \sum_{i,j} (q_i^2 + q_j^2 - 2 q_i q_j)\, w_{i,j} = \frac{1}{2} \sum_i q_i^2\, d_i - \frac{1}{2} \sum_{i,j} q_i q_j\, w_{i,j}$$

$$J = \frac{1}{2} \sum_i q_i \Big( d_i\, q_i - \sum_j w_{i,j}\, q_j \Big) = \frac{1}{2}\, q^T (D - W)\, q, \qquad D_{i,i} = d_i$$

Page 57:

Relaxation Approach

$$q^* = \arg\min_q J = \arg\min_q\; q^T (D - W)\, q \quad \text{subject to} \quad \sum_k q_k^2 = n$$

Page 58:

Relaxation Approach

$$q^* = \arg\min_q\; q^T (D - W)\, q \quad \text{subject to} \quad \sum_k q_k^2 = n$$

Solution: the second minimum eigenvector of $D - W$:

$$(D - W)\, q = \lambda_2\, q$$

Page 59:

Graph Laplacian

$$w_{i,j} \ge 0, \qquad d_i = \sum_j w_{i,j}, \qquad L = D - W$$

- L is a positive semi-definite matrix: for any $x$, $x^T L x = \frac{1}{2} \sum_{i,j} w_{i,j} (x_i - x_j)^2 \ge 0$.
- Minimum eigenvalue $\lambda_1 = 0$ (its eigenvector is the constant all-ones vector).
- $0 = \lambda_1 \le \lambda_2 \le \lambda_3 \le \ldots \le \lambda_k$
- The second minimum eigenvalue $\lambda_2$ gives the best bipartition of the graph.

Page 60:

Recovering Partitions

Due to the relaxation, $q_i$ can be any real number (not just -1 and +1). How do we construct a partition based on the eigenvector?

Simple strategy:

$$A = \{\, i \mid q_i \ge 0 \,\}, \qquad B = \{\, i \mid q_i < 0 \,\}$$
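Pages 54-60 combine into a short recipe: build $L = D - W$, take the eigenvector of the second-smallest eigenvalue, and split by sign. A minimal dense-matrix sketch; the function name is an illustrative choice:

```python
import numpy as np

def spectral_bipartition(W):
    """2-way spectral partition of a graph given its symmetric weight matrix W."""
    d = W.sum(axis=1)
    L = np.diag(d) - W                    # graph Laplacian L = D - W
    vals, vecs = np.linalg.eigh(L)        # eigenvalues returned in ascending order
    q = vecs[:, 1]                        # second minimum eigenvector (lambda_2)
    A = np.where(q >= 0)[0]               # simple sign-based recovery of the partition
    B = np.where(q < 0)[0]
    return A, B
```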

Page 61:

Spectral Clustering

Minimum cut does not balance the sizes of the two sides of the bipartition.

Page 62:

Normalized Cut (Shi & Malik, 1997)

Minimize the similarity between clusters while maximizing the similarity within clusters:

$$s(A, B) = \sum_{i \in A,\, j \in B} w_{i,j}, \qquad d_i = \sum_j w_{i,j}, \qquad d_A = \sum_{i \in A} d_i, \qquad d_B = \sum_{i \in B} d_i$$

$$J = \frac{s(A, B)}{d_A} + \frac{s(A, B)}{d_B} = \sum_{i \in A,\, j \in B} w_{i,j} \left( \frac{1}{d_A} + \frac{1}{d_B} \right)$$

With $d = \sum_j d_j$ and

$$q_i = \begin{cases} +\sqrt{d_B / d_A} & \text{if } i \in A \\ -\sqrt{d_A / d_B} & \text{if } i \in B \end{cases}$$

this becomes

$$J = \frac{1}{2d} \sum_{i,j} w_{i,j}\, (q_i - q_j)^2$$

Page 63:

Normalized Cut

$$J \propto \sum_{i,j} w_{i,j}\, (q_i - q_j)^2 = 2\, q^T (D - W)\, q, \qquad q_i = \begin{cases} +\sqrt{d_B / d_A} & \text{if } i \in A \\ -\sqrt{d_A / d_B} & \text{if } i \in B \end{cases}$$

Page 64:

Normalized Cut

$$J \propto q^T (D - W)\, q, \qquad q_i = \begin{cases} +\sqrt{d_B / d_A} & \text{if } i \in A \\ -\sqrt{d_A / d_B} & \text{if } i \in B \end{cases}$$

Relax $q$ to real values under the constraints

$$q^T D q = 1, \qquad q^T D e = 0$$

Solution: the generalized eigenvalue problem

$$(D - W)\, q = \lambda\, D\, q$$
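This generalized eigenproblem can be handed to a standard solver. A sketch using SciPy's generalized symmetric solver, assuming a connected graph so that every degree is positive and $D$ is positive definite; the function name is illustrative:

```python
import numpy as np
from scipy.linalg import eigh

def normalized_cut(W):
    """2-way normalized cut via the generalized eigenproblem (D - W) q = lambda D q."""
    d = W.sum(axis=1)                 # vertex degrees; must be positive for D to be PD
    D = np.diag(d)
    vals, vecs = eigh(D - W, D)       # generalized problem, eigenvalues ascending
    q = vecs[:, 1]                    # skip the trivial solution q = e with lambda = 0
    return np.where(q >= 0)[0], np.where(q < 0)[0]
```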

Page 65:

Image Segmentation

Page 66:

Non-negative Matrix Factorization