Irsw Project

This document describes a project to build a text summarization system using natural language processing and machine learning. It discusses extractive and abstractive summarization approaches and describes implementing the TextRank algorithm for extractive summarization. The dataset contains product descriptions and the task is to summarize them into shorter versions while maintaining context. Finally, the generated summaries are added to a dataframe and converted to a CSV file.

Uploaded by

kartike tiwari

Text Summarizer

Synopsis

Submitted by:

Shwetank Verma (19103209)

Ishaan Raj Mishra (19103210)

Varun Mittal (19103266)

Amritansh Gupta (19103305)

Department of CSE/IT

JAYPEE INSTITUTE OF INFORMATION TECHNOLOGY


Table of Contents

Abstract

Introduction

Background Study

Flowchart and Dataset Description

Description of the Project

References

ABSTRACT

The amount of text data available has increased dramatically in recent years from a variety of sources.
This large volume of literature has a wealth of information and knowledge that must be adequately
summarized to be useful.

One of the most difficult NLP tasks is summarization, which is the process of generating a shorter
version of a piece of text while keeping critical context information.

The goal is to provide a condensed representation of an input text that captures the original text's basic
meaning.

To produce a condensed version, most successful summarization systems use extractive algorithms that
crop out chunks of the text and stitch them together.

INTRODUCTION

Machine learning algorithms can be trained to interpret documents and identify the passages that carry
key facts and information before the required summary is produced.

Summarization improves the readability of publications, cuts down on time spent searching for
information, and allows for more information to be crammed into a given space.

We will be working on extraction-based summarization in this project.

Extractive text summarization selects essential phrases and sentences from the original document
and combines them to create a summary.

It works by assigning a weight to each sentence according to its importance and using those weights to
construct the summary.

Several algorithms and approaches can be used to determine these weights. They rank the sentences
according to their relevance and their similarity to one another, and then join the top-ranked ones into a summary.
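
As an illustration of how the similarity between two sentences might be measured, the following minimal sketch computes the cosine similarity of their word-count vectors. The function names `tokenize` and `cosine_similarity` are illustrative assumptions and are not taken from the project notebook; other similarity measures could be used instead.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine_similarity(sentence_a, sentence_b):
    """Cosine similarity between the word-count vectors of two sentences."""
    counts_a = Counter(tokenize(sentence_a))
    counts_b = Counter(tokenize(sentence_b))
    # Dot product over the words the two sentences share.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is similar to nothing
    return dot / (norm_a * norm_b)
```

A value close to 1 means the two sentences use nearly the same words, and 0 means they share none.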

Even though the output of extraction-based summarization is not always grammatically correct, it
still gives a concise and valuable summary.

BACKGROUND STUDY

RESEARCH PAPER 1

TITLE: Analytical study of Text Summarization Techniques

AUTHOR: Dr. Pooja Raundale, Himanshu Shekhar

PUBLISHER: IEEE

PUBLISHED IN: October 2021

SUMMARY: The authors implemented several automatic summarization methods and compared them to learn
how long each method takes to implement and how accurate and human-like the generated summaries are.

Extractive techniques (TF-IDF and TextRank) achieve very high ROUGE scores.

Abstractive techniques such as Seq2Seq with Attention and Pointer-Generator score considerably lower
than these two, since they generate human-like summaries in new wording, whereas ROUGE measures word
overlap with a reference summary.
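
To make the ROUGE comparison concrete, here is a minimal sketch of ROUGE-1, the unigram-overlap variant of the metric. It is a simplified illustration and not the evaluation code used in the paper; the function name `rouge_1` and the plain word tokenization are assumptions.

```python
import re
from collections import Counter


def rouge_1(candidate, reference):
    """Return (precision, recall, f1) for unigram overlap between two texts."""
    cand_counts = Counter(re.findall(r"\w+", candidate.lower()))
    ref_counts = Counter(re.findall(r"\w+", reference.lower()))
    # Clipped overlap: a word counts at most as often as it occurs in both texts.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

A summary that copies wording from the reference scores high on this measure, while a paraphrase with the same meaning but different words scores low.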

RESEARCH PAPER 2

TITLE: Extractive Text Summarization Using Sentence Ranking

AUTHOR: J.N. Madhuri, Ganesh Kumar R.

PUBLISHER: IEEE

PUBLISHED IN: August 2019

SUMMARY: In this work, the authors propose an extractive text summarization method that uses a novel
statistical approach based on sentence ranking. The summarizer selects sentences according to their
rank, and the extracted sentences form the summarized text.

The sentences are sorted by their weighted frequency rank in descending order, from highest to lowest.
The summarizer then extracts the sentences with the highest weighted frequency to produce the summary
of a document.
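
The ranking described above can be sketched as follows. The paper's exact weighting formula is not reproduced in this synopsis, so the sketch assumes a common variant: each word's weight is its count divided by the highest count, and a sentence's score is the sum of its word weights. The stopword list and the function names are illustrative assumptions.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not the paper's list).
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "it", "on", "for", "with"}


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize_by_frequency(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    if not words:
        return sentences[:num_sentences]
    freq = Counter(words)
    max_freq = max(freq.values())
    # Weighted frequency: each count is normalized by the highest count.
    weights = {w: c / max_freq for w, c in freq.items()}
    scores = [
        sum(weights.get(w, 0.0) for w in re.findall(r"[a-z]+", s.lower()))
        for s in sentences
    ]
    # Rank sentence indices from highest score to lowest.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:num_sentences])]
```

Because scores are summed, longer sentences tend to rank higher in this sketch; dividing each score by the sentence length is one possible way to reduce that bias.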

FLOWCHART REPRESENTATION

DATASET DESCRIPTION

It contains numerous paragraphs describing various types of medications available and how to consume them,
including the benefits and aftereffects of the medication. It also contains the doctor's directions on when to
consume them in various situations and what to avoid while consuming them.

DESCRIPTION OF THE PROJECT

In this project, automatic text summarization means summarizing a given paragraph using natural language
processing and machine learning. There has been an explosion in the amount of text data from a variety
of sources. This volume of text is an invaluable source of information and knowledge which needs to be
effectively summarized to be useful. This section describes the main approaches to automatic text
summarization.

The dataset used in this project contains long descriptions of products. The task is to make a text
summarizer that takes these descriptions as input and summarizes them into shorter versions without
losing the context. The length of the summary will also be adjustable by the user.

There are two general approaches to automatic summarization: Extraction and Abstraction.

Extractive Summarization: These methods rely on extracting several parts, such as phrases and sentences,
from a piece of text and stacking them together to create a summary. Therefore, identifying the right
sentences for summarization is of utmost importance in an extractive method.

Abstractive Summarization: These methods use advanced NLP techniques to generate an entirely new
summary. Some parts of this summary may not even appear in the original text. Such a summary might
include verbal innovations. Research has focused primarily on extractive methods, which are also
appropriate for image collection summarization and video summarization.

In the project's Jupyter notebook, extractive text summarization is implemented with the TextRank
algorithm, which applies Google's PageRank algorithm to a graph of sentences linked by their similarity
in order to rank the sentences.
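
A minimal sketch of this idea is given below. It is not the code from the notebook: it assumes cosine similarity of word counts as the edge weight, a damping factor of 0.85, and a fixed number of power iterations, and all function names are illustrative. The `num_sentences` parameter stands in for the user-adjustable summary length mentioned earlier in this section.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Cosine similarity between the word-count vectors of two sentences."""
    ca = Counter(re.findall(r"[a-z]+", a.lower()))
    cb = Counter(re.findall(r"[a-z]+", b.lower()))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def textrank_summary(text, num_sentences=2, damping=0.85, iterations=50):
    """Rank sentences with weighted PageRank and return the top ones in order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= num_sentences:
        return sentences
    # Graph edges: similarity between every pair of distinct sentences.
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge between j and i.
            incoming = sum(weights[j][i] / out_sums[j] * scores[j]
                           for j in range(n) if out_sums[j] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]
```

Sentences that resemble many other sentences receive higher scores, so a sentence that shares no words with the rest of the paragraph is ranked last.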

Finally, the generated summary for each paragraph is added to the DataFrame, and the DataFrame
is then written to a CSV file.
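
This last step might look like the following sketch, which assumes pandas is used for the DataFrame. The column names `description` and `summary`, the sample rows, and the placeholder `first_sentence` summarizer are all hypothetical and stand in for the project's actual data and summarizer.

```python
import pandas as pd


def first_sentence(text):
    """Placeholder summarizer (hypothetical): keep only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


# Hypothetical sample rows standing in for the real dataset.
df = pd.DataFrame({
    "description": [
        "Take one tablet daily. Do not exceed the stated dose.",
        "Apply the cream twice a day. Avoid contact with eyes.",
    ]
})

# Add one summary per paragraph as a new column.
df["summary"] = df["description"].apply(first_sentence)

# With no path argument, to_csv returns the CSV content as a string.
csv_text = df.to_csv(index=False)
```

Passing a file name instead, for example `df.to_csv("summaries.csv", index=False)`, writes the same content to disk.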

REFERENCES

1. Luís Gonçalves, Automatic Text Summarization with Machine Learning, Apr 12, 2020

https://medium.com/luisfredgs/automatic-text-summarization-with-machine-learning-an-overview-68ded5717a25

2. Shrivarsheni, Text Summarization Approaches for NLP, Oct 26, 2020

https://www.machinelearningplus.com/nlp/text-summarization-approaches-nlp-example/

3. Aravindpai, Comprehensive Guide to Text Summarization using Deep Learning in Python, June 10, 2019

https://www.analyticsvidhya.com/blog/2019/06/comprehensive-guide-text-summarization-using-deep-learning-python/

4. Alfrick Opidi, Gentle Introduction to Text Summarization in Machine Learning, Apr 15, 2019

https://blog.floydhub.com/gentle-introduction-to-text-summarization-in-machine-learning/
