Chapter 6 - Exercises
Exercise 6.10
Consider the table of term frequencies for 3 documents denoted Doc1, Doc2, Doc3 in Figure 6.9.
Compute the tf-idf weights for the terms car, auto, insurance, best, for each document, using the idf
values from Figure 6.8.
Solution
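The weights asked for here are the plain products tf * idf. A minimal sketch of that computation follows, assuming the term frequencies and idf values are the ones listed in the ntc.ltn tables later in this section (Figures 6.8 and 6.9 themselves are not reproduced here); the names TF, IDF, tf_idf and WEIGHTS are illustrative only.

```python
# Term frequencies and idf values copied from the tables later in this section.
TERMS = ["car", "auto", "insurance", "best"]
IDF = {"car": 1.65, "auto": 2.08, "insurance": 1.62, "best": 1.5}
TF = {
    "Doc1": {"car": 27, "auto": 3, "insurance": 0, "best": 14},
    "Doc2": {"car": 4, "auto": 33, "insurance": 33, "best": 0},
    "Doc3": {"car": 24, "auto": 0, "insurance": 29, "best": 17},
}


def tf_idf(tf, idf):
    """Return the natural tf-idf weight (tf * idf) for every term of every document."""
    return {doc: {t: counts[t] * idf[t] for t in counts} for doc, counts in tf.items()}


WEIGHTS = tf_idf(TF, IDF)

if __name__ == "__main__":
    # One row per term, one column per document, rounded to two decimals.
    for term in TERMS:
        print(term, [round(WEIGHTS[d][term], 2) for d in ("Doc1", "Doc2", "Doc3")])
```

The printed rows should agree with the tf-idf weights table given further down.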
==================================================================================
Exercise 6.15
Recall the tf-idf weights computed in Exercise 6.10. Compute the Euclidean normalized document
vectors for each of the documents, where each vector has four components, one for each of the four
terms.
Solution
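Euclidean normalization divides each component of a vector by the vector's length, sqrt(sum of squared components). The sketch below applies this to the tf-idf weights tabulated later in this section; the function name euclidean_normalize and the dictionary names are assumptions made for illustration.

```python
import math


def euclidean_normalize(vector):
    """Divide every component by the Euclidean length; leave an all-zero vector unchanged."""
    length = math.sqrt(sum(w * w for w in vector))
    return [w / length for w in vector] if length else list(vector)


# tf-idf weights in the order [car, auto, insurance, best].
TFIDF = {
    "doc1": [44.55, 6.24, 0.0, 21.0],
    "doc2": [6.6, 68.64, 53.46, 0.0],
    "doc3": [39.6, 0.0, 46.98, 25.5],
}

NORMALIZED = {doc: euclidean_normalize(vec) for doc, vec in TFIDF.items()}

if __name__ == "__main__":
    for doc, vec in NORMALIZED.items():
        print(doc, [round(w, 4) for w in vec])
```

Each resulting vector has unit length, and its components should match the norm' columns of the tables below.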
Solution
1. q = [1, 0, 1, 0]   // components: [car, auto, insurance, best]
score(q, doc1) = 0.8974*1 + 0.1257*0 + 0*1 + 0.4230*0 = 0.8974
score(q, doc2) = 0.0756*1 + 0.7867*0 + 0.6127*1 + 0*0 = 0.6883
score(q, doc3) = 0.5953*1 + 0*0 + 0.7062*1 + 0.3833*0 = 1.3015
Ranking: doc3, doc1, doc2
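The scores above are dot products between the binary query vector and the normalized document vectors. A short sketch of the same computation, using the rounded components quoted above (the names DOCS, QUERY, SCORES and RANKING are illustrative):

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Euclidean-normalized document vectors, components in the order [car, auto, insurance, best].
DOCS = {
    "doc1": [0.8974, 0.1257, 0.0, 0.4230],
    "doc2": [0.0756, 0.7867, 0.6127, 0.0],
    "doc3": [0.5953, 0.0, 0.7062, 0.3833],
}

# Binary query weights: 1 if the term occurs in the query, 0 otherwise.
QUERY = [1, 0, 1, 0]

SCORES = {name: dot(QUERY, vec) for name, vec in DOCS.items()}
RANKING = sorted(SCORES, key=SCORES.get, reverse=True)

if __name__ == "__main__":
    print(SCORES)
    print(RANKING)
```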
Solution
(i) nnn.atc
(ii) ntc.atc
tf-idf weights   Doc1    Doc2    Doc3
car              44.55   6.6     39.6
auto             6.24    68.64   0
insurance        0       53.46   46.98
best             21      0       25.5
ntc.ltn weight for doc1
           query                  doc1                           Product
Term       w(tf)  idf   tf-idf    tf   idf   tf-idf  norm'       w
car        1      1.65  1.65      27   1.65  44.55   0.8974      1.4807
auto       0      2.08  0         3    2.08  6.24    0.1257      0
insurance  1      1.62  1.62      0    1.62  0       0           0
best       1      1.5   1.5       14   1.5   21      0.4230      0.6345
Euclidean length of the doc1 tf-idf vector: 49.65
ntc.ltn weight for doc2
           query                  doc2                           Product
Term       w(tf)  idf   tf-idf    tf   idf   tf-idf  norm'       w
car        1      1.65  1.65      4    1.65  6.6     0.0756      0.1247
auto       0      2.08  0         33   2.08  68.64   0.7867      0
insurance  1      1.62  1.62      33   1.62  53.46   0.6127      0.9926
best       1      1.5   1.5       0    1.5   0       0           0
Euclidean length of the doc2 tf-idf vector: 87.25
ntc.ltn weight for doc3
           query                  doc3                           Product
Term       w(tf)  idf   tf-idf    tf   idf   tf-idf  norm'       w
car        1      1.65  1.65      24   1.65  39.6    0.5953      0.9822
auto       0      2.08  0         0    2.08  0       0           0
insurance  1      1.62  1.62      29   1.62  46.98   0.7062      1.144
best       1      1.5   1.5       17   1.5   25.5    0.3833      0.575
Euclidean length of the doc3 tf-idf vector: 66.52
ntc.ltn product
Term       doc1     doc2     doc3
car        1.4807   0.1247   0.9822
auto       0        0        0
insurance  0        0.9926   1.144
best       0.6345   0        0.575
Score      2.1152   1.1173   2.7012
Ranking: doc3, doc1, doc2
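The ntc.ltn scores can be recomputed from the raw term frequencies. The sketch below assumes the usual SMART reading of the notation: documents use natural tf, idf and cosine normalization (ntc), and the query uses logarithmic tf (1 + log10 tf), idf and no normalization (ltn). The query term frequencies [1, 0, 1, 1] are taken from the w(tf) column of the tables above; all function and variable names are illustrative.

```python
import math

# Components in the order [car, auto, insurance, best].
IDF = [1.65, 2.08, 1.62, 1.5]
DOC_TF = {
    "doc1": [27, 3, 0, 14],
    "doc2": [4, 33, 33, 0],
    "doc3": [24, 0, 29, 17],
}
QUERY_TF = [1, 0, 1, 1]


def ntc(tf, idf):
    """Document weights: natural tf times idf, then cosine (Euclidean) normalization."""
    weights = [f * i for f, i in zip(tf, idf)]
    length = math.sqrt(sum(w * w for w in weights))
    return [w / length for w in weights] if length else weights


def ltn(tf, idf):
    """Query weights: (1 + log10 tf) times idf for tf > 0, no normalization."""
    return [(1 + math.log10(f)) * i if f > 0 else 0.0 for f, i in zip(tf, idf)]


QUERY_W = ltn(QUERY_TF, IDF)
SCORES = {
    doc: sum(q * d for q, d in zip(QUERY_W, ntc(tf, IDF)))
    for doc, tf in DOC_TF.items()
}
RANKING = sorted(SCORES, key=SCORES.get, reverse=True)

if __name__ == "__main__":
    for doc in RANKING:
        print(doc, round(SCORES[doc], 4))
```

Because the code works with unrounded intermediate values, its scores may differ from the tabulated ones in the fourth decimal place.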