
Uploaded by

Jimmy Teng

Brief History of Web Scraping

May 14, 2021

Data, web scraping






Web scraping is becoming a widely known term. Most people associate it with web data
extraction: an efficient and simple way of copying large amounts of information from the
web. But did you know that web scraping was born for a completely different purpose, and
that it took almost two decades for it to turn into the web scraping we are familiar with
now?

Here is the timeline:

The birth of the World Wide Web


The origins of very basic web scraping can be dated back to 1989, when the British scientist
Tim Berners-Lee created the World Wide Web. The original idea was a platform where
information could be shared automatically between scientists at universities and
institutes all around the world. The World Wide Web also brought three features that are
key elements of every web scraping tool today:

• URLs, which we now use to point a scraper at a specific website,
• embedded hyperlinks, which let us navigate through that website,
• and web pages containing various types of data: text, images, audio, video, etc.
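
To make these three elements concrete, here is a minimal sketch in Python using only the standard library. The page content, the example.org address, and the names `LinkCollector` and `extract_links` are made up for illustration; a real scraper would download the page from its URL instead of using a hard-coded string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return every hyperlink found in the given HTML as an absolute URL."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content standing in for a downloaded web page.
SAMPLE_HTML = """
<html><body>
  <h1>Physics preprints</h1>
  <a href="/papers/1.html">Paper 1</a>
  <a href="notes.html">Notes</a>
  <a href="http://other.example/index.html">Elsewhere</a>
</body></html>
"""

links = extract_links("http://example.org/docs/index.html", SAMPLE_HTML)
for link in links:
    print(link)
```

The URL tells the scraper where the page lives, the hyperlinks tell it where it could go next, and the page body is the data it would extract.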

First web browser


Continuing his work, over the following two years Tim Berners-Lee created the very first web
browser and the first http:// web page, served from his NeXT computer, giving people a way
to access and interact with the World Wide Web.

The Wanderer
Not long after, in 1993, the first concept of crawling was born. The Wanderer, or more
precisely the World Wide Web Wanderer, developed by Matthew Gray at the Massachusetts
Institute of Technology, was a first-of-its-kind, Perl-based web crawler whose sole purpose
was to measure the size of the web. The same year, the Wanderer was used to generate an
index called the Wandex. Even though its author does not claim it, the Wanderer together
with the Wandex had the potential to become the first general-purpose World Wide Web
search engine.

JumpStation
However, the same year, 1993, also saw the birth of JumpStation, the technology that laid
the groundwork for big names such as Google, Bing, Yahoo, and other web search tools in use
today. JumpStation became the very first crawler-based web search engine. As crawlers went
on to index millions of web pages, the internet turned into an open source of data in
various forms.
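
The idea behind a crawler-based search engine can be sketched in a few lines of Python. This is an illustration only, not a reconstruction of the Wanderer or JumpStation: the three-page `FAKE_WEB`, the example addresses, and the helper names are invented, pages are read from a dictionary instead of being fetched over HTTP, and only page titles are indexed to keep the example short.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# A tiny made-up "web": URL -> HTML. A real crawler would fetch pages over HTTP.
FAKE_WEB = {
    "http://a.example/": '<title>Home</title><a href="/about">About</a>'
                         '<a href="http://b.example/">B</a>',
    "http://a.example/about": '<title>About us</title><a href="/">Home</a>',
    "http://b.example/": '<title>Site B</title>'
                         '<a href="http://a.example/about">About A</a>',
}


class PageParser(HTMLParser):
    """Records the page title and every outgoing link."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(start_url, pages):
    """Breadth-first crawl; returns {url: title} for every reachable page."""
    index = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # dead link: nothing to parse
            continue
        parser = PageParser(url)
        parser.feed(html)
        index[url] = parser.title
        for link in parser.links:
            if link not in seen:  # visit each page only once
                seen.add(link)
                queue.append(link)
    return index


def search(index, word):
    """Return URLs whose title contains the word, ignoring case."""
    return sorted(u for u, title in index.items() if word.lower() in title.lower())


index = crawl("http://a.example/", FAKE_WEB)
print(search(index, "about"))
```

The crawler follows hyperlinks from page to page and remembers what it has already seen, and the search function answers queries from the index it built along the way rather than from the live pages.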

BeautifulSoup
A bit more than a decade later, in 2004, came BeautifulSoup, an HTML parser distributed as
a library for the Python programming language. BeautifulSoup helped programmers make sense
of a site's structure and parse the contents of its HTML elements, saving them hours of
work. By then the internet had become an immense, easily searchable source of information
that anyone with a computer and an internet connection could access, and people had started
to take advantage of this by extracting the information available to them. For some time
websites did not prohibit downloading the content of their sites; slowly that changed, and
given the amount of data being downloaded, manual copy-pasting was simply not an option.
Other ways of obtaining the information were bound to be developed.

Rise of visual web scrapers


Soon after, web scraping as we know it was born. The visual web scraping software Web
Integration Platform version 6.0, launched by Stefan Andresen, allowed users to highlight
the information they needed on a web page and structure that data into a usable Excel file
or database. This gave non-programmers an opportunity to join in and easily extract data
from the web.
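
What such a tool does behind the scenes can be approximated in code: pick out the relevant part of the page and write it out in a tabular format. The sketch below is an assumption-laden illustration using Python's standard library, not the way Web Integration Platform worked. The product table and the names `TableRows` and `table_to_csv` are invented, and the output is CSV text, which spreadsheet programs such as Excel can open.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Convert the table rows found in the HTML into CSV text."""
    parser = TableRows()
    parser.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical page fragment standing in for the "highlighted" data.
SAMPLE = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget, large</td><td>24.50</td></tr>
</table>
"""

csv_text = table_to_csv(SAMPLE)
print(csv_text)
```

The `csv` module takes care of quoting, so a cell that itself contains a comma still ends up in a single column.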

Nowadays, as technologies and industries progress, companies are looking to gain an
advantage over their competition. And because the amount of information available on the
internet is growing exponentially, web scraping is becoming one of the most prominent and
widely used methods of acquiring data at scale across various industries and business
spheres.

Future of web scraping


Web scraping has grown immensely in recent years and is almost guaranteed to keep growing.
Currently, commercial web scraping is mostly used to gain a competitive advantage by
collecting leads, scraping competitors' data, monitoring prices, etc. However, as
technologies such as Artificial Intelligence develop and data becomes even more accessible
and crucial to different aspects of life, web scraping will advance with them and produce
new and remarkable applications that we look forward to experimenting with.
