Kernel Method
Kernel Method
Kernel Method
Methods (Part 2)
Steve Vincent
November 17, 2004
Overview
Algorithm
Example
Comparison
Bag of Words
Semantic Kernels
String Kernels
Tree Kernels
PCA algorithm
1.
2.
3.
4.
1
equals the desired variance to be captured
Project the data points on the eigenvectors
5.
1
a n E xk
N
x
i 1
2.
3.
4.
5.
K I
K I
N
N
7.
8.
a n / n
9.
j 1
1NxN
1NxN
T
ya I
... K
N
N
k ( x N , x)
Data Source
Problem Definition:
Data Selection
Attribute 86
[From MatLab]
#
Attributes
%
Variance
#
Attributes
%
Variance
25
73.29
55
98.20
30
80.66
60
98.86
35
86.53
65
99.35
40
91.25
70
99.65
45
94.94
75
99.83
50
97.26
80
99.96
Relative Performance
5.668
11
Overall Results
Properties:
k (d1 , d 2 )
k (d1 , d1 ).k (d 2 , d 2 )
Basic idea :
terms
terms
terms
terms
terms
documents
Size d
K(d1,d2)=?
Size t
2
Size k <<t
Concepts space
15
k(d1,d2)=(d1)SS(d2)
w(t ) ln
df (t )
df(t)=# of
documents
containing term t
16
~
k (d1 , d 2 ) (d1 ) RR ' (d 2 )'
~
k (d1 , d 2 ) (d1 ) PP' (d 2 )' (d1 ) i Qij (d 2 ) j
i, j
Semantic Distance
V
'
where is a diagonal matrix of the same
d (d )U k
19
k (d1 , d 2 ) (d1 )U kU k (d 2 )
( d )U k
1 / 2
i
(v )
i
k ( d j , d )
j 1
where k is the base kernel,
and i vi are i 1
eigenvalue, eigenvector pairs of the kernel matrix
Can represent the LSKs with the proximity matrix
P U kU k
20
Features of s = p-spectrum of s =
histogram of all (contiguous)
substrings of length p
Feature space indexed by all
elements of p
u(s)=number of occurrences of u in s
The associated kernel is defined as
k p ( s, t )
up
p
u
( s ) (t )
p
22
p-spectrum kernel
example
p-spectrum Kernels
Recursion
k ks ( s, t )
k p ( s, t )
| s| p 1 |t | p 1
s
k
p (s(s(i : i p), t ( j : j p)
i 1
j 1
in O(p |s| |t|) operations
The evaluation of one row of the table for the p-suffix kernel
corresponds to performing a search in the string t for the psuffix of a prefix in s.
24
All-subsequences kernels
Recursive implementation
k (s, ) 1
k (sa, t ) k (s, t )
k (s, t (1 : j 1))
j : t j a
Fixed-length subsequence
kernels
tables)
k0 (s, ) 1
k p (s, ) 0
for p 0
k p (sa, t ) k p (s, t )
j : t j a
p 1
(s, t (1 : j 1))
27
Gap-weighted subsequence
kernels (1)
28
Gap-weighted subsequence
kernels (2)
Example(2)
D1 : ATCGTAGACTGTC
D2 : GACTATGC
(D1)CAT = 28+and(D2)CAT = 4
k(D1,D2)CAT=212+214
Gap-weighted subsequence
kernels (3)
30
Motivation : the noisy stemming hypothesis (important Ngrams approximate stems), confirmed experimentally in a
categorization task
31
Extension of WSK:
32
Tree Kernels
33
Example :
NP
S
VP
John
VP
Mary
VP
NP
V
VP
N
N
loves
loves
A Parse Tree
Mary
a few among
the many subtrees
of this tree!
loves
Mary
VP
V
N
34
n1T1 n2 T2
35
Kco-rooted(n1,n2)=0 if n1 or n2 is a leaf
children i
References