Assignment 1

Uploaded by

jatinchowhan8

Assignment 1

1. Consider these documents:

Doc 1: breakthrough drug for schizophrenia
Doc 2: new schizophrenia drug
Doc 3: new approach for treatment of schizophrenia
Doc 4: new hopes for schizophrenia patients

a. Draw the term-document incidence matrix for this document collection.
b. Draw the inverted index representation for this collection.
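
As a way to check a hand-drawn answer, the two representations can be built programmatically. The following is a minimal sketch that assumes whitespace tokenization with no stemming or stop-word removal; the function names `build_inverted_index` and `incidence_matrix` are illustrative, not part of the assignment.

```python
# Hypothetical helpers for checking question 1; tokenization is a plain split().
docs = {
    1: "breakthrough drug for schizophrenia",
    2: "new schizophrenia drug",
    3: "new approach for treatment of schizophrenia",
    4: "new hopes for schizophrenia patients",
}


def build_inverted_index(docs):
    """Map each term to a sorted list of the docIDs that contain it."""
    index = {}
    for doc_id in sorted(docs):
        # set() so a term repeated within one document is posted only once
        for term in set(docs[doc_id].split()):
            index.setdefault(term, []).append(doc_id)
    # Sort the dictionary by term, as in a textbook inverted index
    return dict(sorted(index.items()))


def incidence_matrix(docs):
    """Map each term to a 0/1 row with one column per document."""
    index = build_inverted_index(docs)
    doc_ids = sorted(docs)
    return {
        term: [1 if d in postings else 0 for d in doc_ids]
        for term, postings in index.items()
    }


if __name__ == "__main__":
    for term, row in incidence_matrix(docs).items():
        print(f"{term:<14}{row}")
    for term, postings in build_inverted_index(docs).items():
        print(f"{term:<14}-> {postings}")
```

Each matrix row and each postings list carries the same information; the inverted index simply stores only the 1 entries.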

2. Recommend a query processing order for

(tangerine OR trees) AND (marmalade OR skies) AND (kaleidoscope OR eyes)

given the following postings list sizes:

Term            Postings size
eyes            213312
kaleidoscope    87009
marmalade       107913
skies           271658
tangerine       46653
trees           316812

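
One common heuristic for a query like this is to estimate the size of each OR clause conservatively as the sum of its terms' postings sizes, then process the AND in increasing order of those estimates. The sketch below assumes that heuristic; the helper names `estimate_or_size` and `processing_order` are illustrative.

```python
# Postings sizes copied from the table in question 2.
postings_size = {
    "eyes": 213312,
    "kaleidoscope": 87009,
    "marmalade": 107913,
    "skies": 271658,
    "tangerine": 46653,
    "trees": 316812,
}

# Each tuple is one OR clause; the clauses are ANDed together.
clauses = [
    ("tangerine", "trees"),
    ("marmalade", "skies"),
    ("kaleidoscope", "eyes"),
]


def estimate_or_size(clause, sizes):
    """Upper bound on the size of a union: the sum of the postings sizes."""
    return sum(sizes[term] for term in clause)


def processing_order(clauses, sizes):
    """Order the OR clauses by increasing estimated result size."""
    return sorted(clauses, key=lambda clause: estimate_or_size(clause, sizes))


if __name__ == "__main__":
    for clause in processing_order(clauses, postings_size):
        print(" OR ".join(clause), estimate_or_size(clause, postings_size))
```

The sum is only an upper bound on the true union size, so the resulting order is a reasonable guess rather than a guarantee.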
3. For a conjunctive query, is processing postings lists in order of size guaranteed to be optimal?
Explain why / why not.

4. Extend the postings merge algorithm to arbitrary Boolean query formulas. What is its time
complexity? For instance, consider:

(Brutus OR Caesar) AND NOT (Antony OR Cleopatra)

Can we always merge in linear time? Linear in what? Can we do better than this?
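
To make the question concrete, here is a minimal sketch of two extra merge operations, `union` and `and_not`, each of which walks two sorted postings lists once. The function names and the example postings are hypothetical and chosen only for illustration.

```python
def union(p1, p2):
    """Merge two sorted postings lists for x OR y in O(len(p1) + len(p2))."""
    i = j = 0
    out = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            out.append(p1[i])
            i += 1
        else:
            out.append(p2[j])
            j += 1
    # One list is exhausted; the rest of the other belongs to the union.
    out.extend(p1[i:])
    out.extend(p2[j:])
    return out


def and_not(p1, p2):
    """Merge for x AND NOT y: keep docIDs of p1 that are absent from p2."""
    i = j = 0
    out = []
    while i < len(p1):
        if j >= len(p2) or p1[i] < p2[j]:
            out.append(p1[i])  # p1[i] cannot occur later in p2
            i += 1
        elif p1[i] == p2[j]:
            i += 1  # excluded by the NOT operand
            j += 1
        else:
            j += 1
    return out


if __name__ == "__main__":
    # Hypothetical postings lists, not taken from any real collection.
    brutus, caesar = [1, 2, 4, 11], [1, 2, 5, 6]
    antony, cleopatra = [2, 4], [1, 7]
    print(and_not(union(brutus, caesar), union(antony, cleopatra)))
```

Each operation is linear in the combined length of its two inputs, which is one way to read "linear in what?" for this example; note that the NOT operand is never materialized as a complement over the whole collection.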

5. For the Porter stemmer rule group:

a. What is the purpose of including an identity rule such as SS → SS?

b. Applying just this rule group, what will the following words be stemmed to?

circus canaries boss

c. What rule should be added to correctly stem pony?

d. The stemming for ponies and pony might seem strange. Does it have a deleterious effect on
retrieval? Why or why not?
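
The rule group itself is not reproduced in this sheet. The sketch below assumes it is the usual first rule group of the Porter stemmer (SSES → SS, IES → I, SS → SS, S → empty) and that, within the group, the rule with the longest matching suffix is the one applied. Both points are assumptions, and `apply_rule_group` is an illustrative name.

```python
# ASSUMED rule group (not shown in the assignment text): Porter step 1a.
RULES = [
    ("sses", "ss"),
    ("ies", "i"),
    ("ss", "ss"),  # identity rule
    ("s", ""),
]


def apply_rule_group(word, rules=RULES):
    """Apply the single rule whose suffix is the longest match, if any."""
    matching = [(suffix, repl) for suffix, repl in rules if word.endswith(suffix)]
    if not matching:
        return word  # no rule in the group applies
    suffix, repl = max(matching, key=lambda rule: len(rule[0]))
    return word[: -len(suffix)] + repl


if __name__ == "__main__":
    for word in ["circus", "canaries", "boss", "ponies", "pony"]:
        print(word, "->", apply_rule_group(word))
```

Running it on a word ending in "ss" with and without the identity rule is a quick way to see what that rule protects against.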

6. Why are skip pointers not useful for queries of the form x OR y?

7. We have a two-word query. For one term, the postings list consists of the following 16 entries:

[ 2, 4, 9, 12, 14, 16, 18, 20, 24, 32, 47, 81, 120, 125, 158, 180 ]

and for the other term it is the one-entry postings list

[ 81 ]

Work out how many comparisons would be done to intersect the two postings lists with the
following two strategies.

i. Using standard postings lists.

ii. Using postings lists stored with skip pointers, with the suggested skip length of √P.
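
The count depends on what is treated as a comparison. The sketch below fixes one convention as an assumption: each test of the two current docIDs counts once, and each test of a skip target against the other list's current docID counts once. Skip pointers are placed every floor(√P) entries, and the names `intersect`, `intersect_with_skips`, `skip_target` and `advance` are illustrative.

```python
import math


def intersect(p1, p2):
    """Plain two-pointer merge; counts one comparison per loop step."""
    i = j = comparisons = 0
    answer = []
    while i < len(p1) and j < len(p2):
        comparisons += 1
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return answer, comparisons


def skip_target(postings, idx):
    """Index that a skip pointer at idx leads to, or None if idx has none."""
    step = math.isqrt(len(postings))
    if step > 1 and idx % step == 0 and idx + step < len(postings):
        return idx + step
    return None


def advance(postings, idx, other_doc_id):
    """Move past idx, following skips that do not overshoot other_doc_id."""
    checks = 0
    moved = False
    nxt = skip_target(postings, idx)
    while nxt is not None:
        checks += 1  # one comparison of a skip target
        if postings[nxt] > other_doc_id:
            break
        idx, moved = nxt, True
        nxt = skip_target(postings, idx)
    if not moved:
        idx += 1  # no usable skip: step to the next posting
    return idx, checks


def intersect_with_skips(p1, p2):
    """Intersection with skip pointers; also returns the comparison count."""
    i = j = comparisons = 0
    answer = []
    while i < len(p1) and j < len(p2):
        comparisons += 1
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i, extra = advance(p1, i, p2[j])
            comparisons += extra
        else:
            j, extra = advance(p2, j, p1[i])
            comparisons += extra
    return answer, comparisons


if __name__ == "__main__":
    long_list = [2, 4, 9, 12, 14, 16, 18, 20, 24, 32, 47, 81, 120, 125, 158, 180]
    print(intersect(long_list, [81]))
    print(intersect_with_skips(long_list, [81]))
```

Under this particular counting convention the two calls report 12 and 9 comparisons respectively; a different convention (for example, not counting a failed skip test) would give different totals, so the convention should be stated alongside any answer.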

8. Consider a postings intersection between this postings list, with skip pointers:

And the following intermediate result postings list (which has no skip pointers):

Trace through the postings intersection algorithm.

A. How often is a skip pointer followed (i.e., p1 is advanced to skip(p1))?

B. How many postings comparisons will be made by this algorithm while intersecting the two
lists?

C. How many postings comparisons would be made if the postings lists are intersected without
the use of skip pointers?

9. How is the inverted index used for document retrieval, and how is it updated with new
documents?

10. How are positional indexes different from traditional inverted indexes and what are the
benefits of using positional indexes?
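
As an illustration of the difference, the sketch below builds a small positional index over the four documents from question 1 and uses it to answer a two-word phrase query. It assumes whitespace tokenization with zero-based positions; `build_positional_index` and `phrase_query` are illustrative names.

```python
from collections import defaultdict

docs = {
    1: "breakthrough drug for schizophrenia",
    2: "new schizophrenia drug",
    3: "new approach for treatment of schizophrenia",
    4: "new hopes for schizophrenia patients",
}


def build_positional_index(docs):
    """Map term -> {docID -> [positions of the term in that document]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, term in enumerate(text.split()):
            index[term][doc_id].append(position)
    return index


def phrase_query(index, first, second):
    """Return docIDs where `second` occurs immediately after `first`."""
    first_postings = index.get(first, {})
    second_postings = index.get(second, {})
    hits = []
    # Only documents containing both terms can contain the phrase.
    for doc_id in sorted(first_postings.keys() & second_postings.keys()):
        second_positions = set(second_postings[doc_id])
        if any(pos + 1 in second_positions for pos in first_postings[doc_id]):
            hits.append(doc_id)
    return hits


if __name__ == "__main__":
    index = build_positional_index(docs)
    print(phrase_query(index, "schizophrenia", "drug"))
```

A non-positional index can only report that Doc 1 and Doc 2 both contain "schizophrenia" and "drug"; the stored positions are what allow the phrase "schizophrenia drug" to be matched in Doc 2 alone.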
