Course Plan: Department of Computer Science Engineering


DEPARTMENT OF COMPUTER SCIENCE ENGINEERING

Semester VIII
Academic Year: 2020-2021
Regulations 2017

COURSE PLAN

Prepared by

K.Rajalakshmi
Assistant Professor
Department of Computer Science Engineering
COURSE FILE INDEX
Course Code / Course Title: CS8080      INFORMATION RETRIEVAL TECHNIQUES

S. No | CONTENTS | SUBMISSION DATE | SIGNATURE (FACULTY) | SIGNATURE (HOD)
1 | Syllabus
2 | Lesson Schedule
3 | Time Table of the Faculty
4 | Student Name List
5 | Previous End Semester Question Papers (Last 5 Years)
6 | CIA I: Question Paper, Answer Key, Answer Scripts (Above Average / Average / Below Average), Result Analysis
7 | CIA II: Question Paper, Answer Key, Answer Scripts (Above Average / Average / Below Average), Result Analysis
8 | CIA III: Question Paper, Answer Key, Answer Scripts (Above Average / Average / Below Average), Result Analysis
9 | Internal Test Marks
10 | Submission of Course feedback
11 | Current Anna University Question Paper
12 | Submission of Anna University feedback

Work load Reports (Once a week)

1. Work load – 1
2. Work load – 2

3. Work load – 3
4. Work load – 4
5. Work load – 5

6. Work load – 6
7. Work load – 7

8. Work load – 8
9. Work load – 9
SYLLABUS

CS8080      INFORMATION RETRIEVAL TECHNIQUES            L T P C
                                                        3 0 0 3

UNIT I INTRODUCTION                                                    9
Information Retrieval – Early Developments – The IR Problem – The User's Task –
Information versus Data Retrieval – The IR System – The Software Architecture of the IR
System – The Retrieval and Ranking Processes – The Web – The e-Publishing Era – How the
web changed Search – Practical Issues on the Web – How People Search – Search Interfaces
Today – Visualization in Search Interfaces.
UNIT II MODELING AND RETRIEVAL EVALUATION      9
Basic IR Models – Boolean Model – TF-IDF (Term Frequency/Inverse Document
Frequency) Weighting – Vector Model – Probabilistic Model – Latent Semantic Indexing
Model – Neural Network Model – Retrieval Evaluation – Retrieval Metrics – Precision and
Recall – Reference Collection – User-based Evaluation – Relevance Feedback and Query
Expansion – Explicit Relevance Feedback.
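As an illustration of the TF-IDF weighting, vector model, and precision/recall topics in Unit II, the following is a minimal sketch and not part of the prescribed syllabus. It assumes one common weighting variant, (1 + log tf) * log(N / df), and pre-tokenized documents; the function names are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (1 + log tf) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (1 + math.log(f)) * math.log(n_docs / df[t]) for t, f in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In the vector model, documents are ranked by the cosine between the query vector and each document vector; precision and recall then compare the returned set with the known relevant set.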
UNIT III TEXT CLASSIFICATION AND CLUSTERING    9
A Characterization of Text Classification – Unsupervised Algorithms: Clustering – Naïve
Text Classification – Supervised Algorithms – Decision Tree – k-NN Classifier – SVM
Classifier – Feature Selection or Dimensionality Reduction – Evaluation metrics – Accuracy
and Error – Organizing the classes – Indexing and Searching – Inverted Indexes – Sequential
Searching – Multi-dimensional Indexing.
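The inverted index topic in Unit III can be illustrated with a short sketch. It is an assumption-laden example rather than course material: documents are tokenized by whitespace only, postings are kept as sorted lists of document ids, and the function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(p1, p2):
    """Merge-style intersection of two sorted postings lists."""
    i = j = 0
    out = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return out


def boolean_and(index, terms):
    """Return ids of documents that contain every query term."""
    postings = [index.get(t.lower(), []) for t in terms]
    if not postings:
        return []
    # Start with the shortest list so intermediate results stay small.
    postings.sort(key=len)
    result = postings[0]
    for p in postings[1:]:
        result = intersect(result, p)
    return result
```

Sequential searching, by contrast, scans every document for the query terms, which is why an index is preferred for large collections.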
UNIT IV WEB RETRIEVAL AND WEB CRAWLING        9
The Web – Search Engine Architectures – Cluster based Architecture – Distributed
Architectures – Search Engine Ranking – Link based Ranking – Simple Ranking Functions –
Learning to Rank – Evaluations – Search Engine Ranking – Search Engine User Interaction
– Browsing – Applications of a Web Crawler – Taxonomy – Architecture and
Implementation – Scheduling Algorithms – Evaluation.
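One well-known example of link based ranking is PageRank, which the syllabus does not name explicitly. The sketch below shows it as a plain power iteration; the damping factor 0.85, the fixed iteration count, and the uniform treatment of pages without out-links are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a {page: [linked pages]} graph."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread over all pages.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for node, targets in links.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

The scores form a probability distribution over pages, so they sum to 1; a page that many well-ranked pages link to ends up with a higher score.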
UNIT V RECOMMENDER SYSTEM                             9
Recommender Systems Functions – Data and Knowledge Sources – Recommendation
Techniques – Basics of Content-based Recommender Systems – High Level Architecture –
Advantages and Drawbacks of Content-based Filtering – Collaborative Filtering – Matrix
factorization models – Neighborhood models.
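A neighborhood model for collaborative filtering can be sketched as follows. This is a minimal user-based example under stated assumptions: similarity is the cosine over co-rated items, the prediction is a similarity-weighted average over the k most similar users, and the function names and data layout are hypothetical.

```python
import math


def cosine_similarity(ratings_u, ratings_v):
    """Cosine similarity of two users over the items both have rated."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_rating(ratings, user, item, k=2):
    """Predict a rating from the k most similar users who rated the item.

    Returns None when no similar user has rated the item.
    """
    candidates = []
    for other, other_ratings in ratings.items():
        if other != user and item in other_ratings:
            sim = cosine_similarity(ratings[user], other_ratings)
            if sim > 0:
                candidates.append((sim, other_ratings[item]))
    neighbors = sorted(candidates, reverse=True)[:k]
    if not neighbors:
        return None
    total = sum(sim for sim, _ in neighbors)
    return sum(sim * rating for sim, rating in neighbors) / total
```

Matrix factorization models replace the explicit neighbor search with learned low-dimensional user and item vectors; that approach is not shown here.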

TOTAL: 45 PERIODS
TEXT BOOKS:

1. Ricardo Baeza-Yates and Berthier Ribeiro-Neto, "Modern Information Retrieval: The Concepts and Technology behind Search", Second Edition, ACM Press Books, 2011.
2. Ricci, F., Rokach, L., Shapira, B., Kantor, P. B., "Recommender Systems Handbook", First Edition, 2011.

REFERENCES:

1. C. Manning, P. Raghavan, and H. Schütze, "Introduction to Information Retrieval", Cambridge University Press, 2008.
2. Stefan Buettcher, Charles L. A. Clarke and Gordon V. Cormack, "Information Retrieval: Implementing and Evaluating Search Engines", The MIT Press, 2010.

STAFF-IN-CHARGE HOD-CSE
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

COURSE PLAN OF INFORMATION RETRIEVAL TECHNIQUES


(ELECTIVE)

1 Class : VIII Semester BE (CSE)

2 Course Code & Name : CS8080 – INFORMATION RETRIEVAL TECHNIQUES

3 Course Type : Theory

4 Course Status & Credits : UG, Credits: 3

5 Aim/Course Descriptions : (Learning about the Machine Learning process) To understand the various components, operation, and applications of Machine Learning.

6 Prerequisites : Knowledge of object-oriented programming, simple data structures such as hash maps, and text processing.
LECTURE SCHEDULE
(Academic year: 2020 – 2021)

Name of the staff : Mrs.K.RAJALAKSHMI


Subject Code / Name : CS8080- INFORMATION RETRIEVAL TECHNIQUES
Course : BE
Semester / Branch / Section : VIII / CSE

S.No | Date Planned | Topic Name | Reference Book Page No | No. of Periods | Cumulative No. of Periods | Teaching Aids (LCD/OHB/BB)
UNIT I INTRODUCTION
1 | 22.12.20 | Information Retrieval | R2 (3-8) | 1 | 1 | PPT
2 | 23.12.20 | Early Developments, The IR Problem | R1 (17-40) | 1 | 2 | PPT
3 | 23.12.20 | The User's Task, Information versus Data Retrieval | R1 (61-65) | 1 | 3 | PPT
4 | 28.12.20 | The Software Architecture of the IR System | R1 (41-44) | 1 | 4 | PPT & video lecture
5 | 28.12.20 | The Retrieval and Ranking Processes | R1 (47-50) | 1 | 5 | PPT
6 | 29.12.20 | The Web, The e-Publishing Era, How the web changed Search | R1 (61-65) | 2 | 7 | PPT
7 | 30.12.20 | Practical Issues on the Web, How People Search | R1 (248-250) | 1 | 8 | PPT & video lecture
8 | 04.01.21 | Search Interfaces Today, Visualization in Search Interfaces | R1 (271-291) | 1 | 9 | PPT
UNIT II MODELING AND RETRIEVAL EVALUATION
1 | 04.01.21 | Basic IR Models, Weighting | R1 (61-65) | 1 | 10 | PPT
2 | 05.01.21 | Boolean Model | R1 (499) | 1 | 11 | PPT
3 | 06.01.21 | TF-IDF (Term Frequency/Inverse Document Frequency) | R1 (501-526) | 1 | 12 | PPT
4 | 06.01.21 | Vector Model, Probabilistic Model | R1 (139-147) | 1 | 13 | PPT
5 | 11.01.21 | Latent Semantic Indexing Model, Neural Network Model | R1 (120-125), R1 (123-126) | 1 | 14 | PPT
6 | 11.01.21 | Retrieval Evaluation, Retrieval Metrics | R1 (149-166) | 1 | 15 | PPT
7 | 12.01.21 | Precision and Recall, Reference Collection, User-based Evaluation | R1 (264) | 1 | 16 | PPT
8 | 13.01.21 | Relevance Feedback and Query Expansion, Explicit Relevance Feedback | R1 (264) | 2 | 18 | PPT & video lecture
UNIT III TEXT CLASSIFICATION AND CLUSTERING
1 | 18.01.21 | A Characterization of Text Classification | R1 (173-177) | 1 | 19 | PPT
2 | 19.01.21 | Unsupervised Algorithms: Clustering, Naïve Text Classification | R1 (221) | 1 | 20 | PPT
3 | 20.01.21 | Supervised Algorithms | R1 (221-240) | 1 | 21 | PPT
4 | 20.01.21 | Decision Tree, k-NN Classifier | R1 (242-246) | 1 | 22 | PPT
5 | 21.01.21 | SVM Classifier | R1 (484-492) | 1 | 23 | PPT
6 | 25.01.21 | Feature Selection or Dimensionality Reduction | R1 (477-484) | 1 | 24 | PPT & video lecture
7 | 25.01.21 | Evaluation metrics, Accuracy and Error, Organizing the classes | R1 (472-475) | 1 | 25 | PPT
8 | 25.01.21 | Indexing and Searching, Inverted Indexes | R1 (493-450) | 1 | 26 | PPT
9 | 27.01.21 | Sequential Searching, Multi-dimensional Indexing | R1 (493-450) | 1 | 27 | PPT
UNIT IV WEB RETRIEVAL AND WEB CRAWLING
1 | 01.02.21 | The Web, Search Engine Architectures | R1 (271-273) | 1 | 28 | PPT
2 | 01.02.21 | Cluster based Architecture, Distributed Architectures | R1 (273-291) | 1 | 29 | PPT
3 | 02.02.21 | Search Engine Ranking, Link Based Ranking | R1 (291-294) | 1 | 30 | PPT
4 | 03.02.21 | Simple Ranking Functions, Learning to Rank | R1 (435-440) | 1 | 31 | PPT
5 | 03.02.21 | Evaluations, Search Engine Ranking | R1 (294-314) | 1 | 32 | PPT
6 | 04.02.21 | Search Engine User Interaction, Browsing, Applications of a Web Crawler | R1 (435-440) | 1 | 33 | PPT
7 | 08.02.21 | Taxonomy, Architecture and Implementation | R1 (450-456) | 2 | 35 | PPT
8 | 09.02.21 | Scheduling Algorithms, Evaluation | R1 (463-471) | 1 | 36 | PPT
UNIT V RECOMMENDER SYSTEM
1 | 10.02.21 | Recommender Systems Functions | R2 (61-84) | 1 | 37 | PPT
2 | 10.02.21 | Data and Knowledge Sources | R2 (476-479) | 1 | 38 | PPT & video lecture
3 | 15.02.21 | Recommendation Techniques | R2 (466) | 1 | 39 | PPT
4 | 15.02.21 | Basics of Content-based Recommender Systems | R2 (483-484) | 1 | 40 | PPT
5 | 16.02.21 | High Level Architecture, Advantages and Drawbacks of Content-based Filtering | R2 (511-514) | 1 | 41 | PPT
6 | 17.02.21 | Collaborative Filtering | R2 (415-439) | 1 | 42 | PPT & video lecture
7 | 17.02.21 | Recommender Systems Functions | R2 (483-484) | 1 | 43 | PPT
8 | 22.02.21 | Matrix factorization models | R2 (511-514) | 1 | 44 | PPT
9 | 22.02.21 | Neighborhood models | R2 (415-439) | 1 | 45 | PPT

ASSIGNMENT TOPICS
• E-Publishing Era
• Scheduling Algorithms and Evaluation
• High Level Architecture, Advantages and Drawbacks of Content-based Filtering

TOPICS BEYOND THE SYLLABUS


• Machine problems: implementation details of different components in a retrieval system
• Computational artifacts
• Implementation efficiency and readability of your machine problem

ADDITIONAL RESOURCES FOR COURSE:

• Question Bank
• Animated PowerPoint presentation for each unit
• Course Material for all the units

PROFESSIONAL COMPONENTS
Engineering Topic : 60%
Algorithm : 30%
General : 10%
PORTIONS FOR INTERNAL ASSESSMENT TEST I, II & MODEL:
Test No | Unit No | Topics
I | Unit I and Unit II (Half) | Introduction about machine learning and techniques, Modeling and Retrieval Evaluation
II | Unit II (Half), Unit III | Text Classification and Clustering, Web Retrieval and Web Crawling
Model | Unit IV (Half) and Unit V | Recommender System, Web Crawling

E-RESOURCE LINKS:
1. https://csenotescorner.blogspot.com/2018/02/information-retrieval-techniques.html
2. https://www.smartzworld.com/notes/information-retrieval-system-pdf-notes-irs-pdf-notes/
3. https://www.cl.cam.ac.uk/teaching/1314/InfoRtrv/lecture1.pdf
4. https://studentsfocus.com/cs6007-ir-notes-information-retrieval-lecture-handwritten-notes-cse-7th-sem-anna university/
5. https://www.slideshare.net/AnandhArumugakan/cs6007-information-retrieval-5-units-notes

Prepared By Verified by

Mrs.K.Rajalakshmi Dr.C.Seelammal
AP/ CSE HOD / CSE
