Chapter 6 - Exercises
Exercise 6.10
Consider the table of term frequencies for 3 documents denoted Doc1, Doc2, Doc3 in Figure 6.9.
Compute the tf-idf weights for the terms car, auto, insurance, best, for each document, using the idf
values from Figure 6.8.
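The computation can be checked with a short Python sketch. The tf and idf values below are copied from the solution tables later in this section, which reproduce the figures; the names `tf`, `idf`, and `tf_idf` are illustrative only.

```python
# Term frequencies per document and idf values, copied from the
# solution tables in this section (they stand in for Figures 6.9 and 6.8).
terms = ["car", "auto", "insurance", "best"]
idf = {"car": 1.65, "auto": 2.08, "insurance": 1.62, "best": 1.5}
tf = {
    "Doc1": {"car": 27, "auto": 3, "insurance": 0, "best": 14},
    "Doc2": {"car": 4, "auto": 33, "insurance": 33, "best": 0},
    "Doc3": {"car": 24, "auto": 0, "insurance": 29, "best": 17},
}

# tf-idf weight = raw term frequency * inverse document frequency,
# rounded to two decimals to match the precision of the idf values.
tf_idf = {
    doc: {t: round(counts[t] * idf[t], 2) for t in terms}
    for doc, counts in tf.items()
}

for t in terms:
    print(t, [tf_idf[doc][t] for doc in tf])
```

Each printed row lists the weights for Doc1, Doc2, Doc3 in that order.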
Exercise 6.15
Recall the tf-idf weights computed in Exercise 6.10. Compute the Euclidean normalized document
vectors for each of the documents, where each vector has four components, one for each of the four
terms.
1. q = [1, 0, 1, 0] //[car, auto, insurance, best]
score(q, doc1) = 0.8974 //[0.8974*1 + 0.1257*0 + 0*1 + 0.4230*0]
score(q, doc2) = 0.6883 //[0.0756*1 + 0.7867*0 + 0.6127*1 + 0*0]
score(q, doc3) = 1.3015 //[0.5953*1 + 0*0 + 0.7062*1 + 0.3833*0]
Ranking: doc3, doc1, doc2
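The scores and ranking above are plain dot products and can be reproduced with a minimal Python sketch. The document vectors are the normalized weights quoted in the score lines; the helper name `score` and the dictionary layout are assumptions made for illustration.

```python
# Euclidean-normalized document vectors, ordered [car, auto, insurance, best],
# copied from the score lines above.
docs = {
    "doc1": [0.8974, 0.1257, 0.0, 0.4230],
    "doc2": [0.0756, 0.7867, 0.6127, 0.0],
    "doc3": [0.5953, 0.0, 0.7062, 0.3833],
}
q = [1, 0, 1, 0]  # weight 1 for terms present in the query (car, insurance), else 0


def score(query, doc):
    """Dot product of the query vector and a document vector."""
    return sum(qw * dw for qw, dw in zip(query, doc))


scores = {name: round(score(q, vec), 4) for name, vec in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)  # highest score first
print(scores)
print(ranking)
```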
(i) nnn.atc
(ii) ntc.atc
ntc weight for doc1
tf-idf weights  Doc1   Doc2   Doc3
car             44.55  6.6    39.6
auto            6.24   68.64  0
insurance       0      53.46  46.98
best            21     0      25.5
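The normalized (ntc) weights used in the rest of this solution follow from this table by dividing each document's column by its Euclidean length. A minimal Python sketch, assuming the tf-idf values above; the function name `euclidean_normalize` is hypothetical:

```python
import math

# tf-idf weights from the table above, ordered [car, auto, insurance, best].
tf_idf = {
    "Doc1": [44.55, 6.24, 0.0, 21.0],
    "Doc2": [6.6, 68.64, 53.46, 0.0],
    "Doc3": [39.6, 0.0, 46.98, 25.5],
}


def euclidean_normalize(vec):
    """Divide each component by the vector's Euclidean (L2) length."""
    length = math.sqrt(sum(w * w for w in vec))
    return [w / length for w in vec]


normalized = {doc: euclidean_normalize(vec) for doc, vec in tf_idf.items()}
for doc, vec in normalized.items():
    print(doc, [round(w, 4) for w in vec])
```

Rounded to four decimals, the results should agree with the norm' columns in the tables that follow.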
ntc.ltn weight for doc1
           query                 doc1                        Product
Term       w(tf)  idf   tf-idf   tf   idf   tf-idf  norm'    w
car        1      1.65  1.65     27   1.65  44.55   0.8974   1.4807
auto       0      2.08  0        3    2.08  6.24    0.1257   0
insurance  1      1.62  1.62     0    1.62  0       0        0
best       1      1.5   1.5      14   1.5   21      0.4230   0.6345
ntc.ltn weight for doc2
           query                 doc2                        Product
Term       w(tf)  idf   tf-idf   tf   idf   tf-idf  norm'    w
car        1      1.65  1.65     4    1.65  6.6     0.0756   0.1247
auto       0      2.08  0        33   2.08  68.64   0.7867   0
insurance  1      1.62  1.62     33   1.62  53.46   0.6127   0.9926
best       1      1.5   1.5      0    1.5   0       0        0
ntc.ltn weight for doc3
           query                 doc3                        Product
Term       w(tf)  idf   tf-idf   tf   idf   tf-idf  norm'    w
car        1      1.65  1.65     24   1.65  39.6    0.5953   0.9822
auto       0      2.08  0        0    2.08  0       0        0
insurance  1      1.62  1.62     29   1.62  46.98   0.7062   1.144
best       1      1.5   1.5      17   1.5   25.5    0.3833   0.575
Term       doc1    doc2    doc3
car        1.4807  0.1247  0.9822
auto       0       0       0
insurance  0       0.9926  1.144
best       0.6345  0       0.575
Score      2.1152  1.1173  2.7012
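The per-term products and total scores in these tables can be reproduced with the Python sketch below. It assumes each query term occurs once (so the logarithmic tf weight is 1, matching the w(tf) column) and takes the normalized document weights from the norm' columns; `log_tf`, `query_tf`, and the other names are hypothetical.

```python
import math

# Term order: [car, auto, insurance, best]
idf = [1.65, 2.08, 1.62, 1.5]
query_tf = [1, 0, 1, 1]  # assumed raw counts: each query term occurs once

# Normalized (ntc) document weights, copied from the norm' columns above.
doc_weights = {
    "doc1": [0.8974, 0.1257, 0.0, 0.4230],
    "doc2": [0.0756, 0.7867, 0.6127, 0.0],
    "doc3": [0.5953, 0.0, 0.7062, 0.3833],
}


def log_tf(tf):
    """Logarithmic tf weight (the 'l' in ltn): 1 + log10(tf) if tf > 0, else 0."""
    return 1 + math.log10(tf) if tf > 0 else 0.0


# ltn query weights: log tf times idf, with no length normalization.
query_weights = [log_tf(tf) * w for tf, w in zip(query_tf, idf)]

scores = {
    doc: round(sum(qw * dw for qw, dw in zip(query_weights, vec)), 4)
    for doc, vec in doc_weights.items()
}
top_two = sorted(scores, key=scores.get, reverse=True)[:2]
print(scores)
print(top_two)
```

Sorting by score and keeping the first two entries gives the two top-scoring documents under this weighting.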