How the Web Works
Chapter 1
Randy Connolly and Ricardo Hoar Fundamentals of Web Development
© 2015 Pearson
http://www.funwebdev.com
Objectives
1 Definitions and History
2 Internet Protocols
3 Client-Server Model
4 Where is the Internet?
5 Domain Name System
6 Uniform Resource Locators (URL)
7 Hypertext Transfer Protocol
Section 1 of 8
DEFINITIONS AND HISTORY
Internet = Web?
The answer is no
The World Wide Web (WWW or simply the Web) is certainly what
most people think of when they see the word “Internet.”
But the WWW is only a subset of the Internet.
The Internet is the infrastructure
that connects computers.
The World Wide Web is one
method of accessing that
infrastructure.
Communication Definitions
We will begin with the telephone
Telephone networks provide a good starting place to learn about
modern digital communications.
In the telephone networks of old, calls were routed through
operators who physically connected caller and receiver by
connecting a wire to a switchboard to complete the circuit.
Circuit Switching
Circuit switching establishes an actual physical
connection between two people through a series of
physical switches.
Circuit Switching
Its Limitations
Circuit Switching Weaknesses
▪ You must establish a link and maintain a dedicated
circuit for the duration of the call
▪ Difficult to have multiple conversations
simultaneously
▪ Wastes bandwidth since even the silences are
transmitted
ARPANET
The beginnings of the Internet
The research network ARPANET was created in the
1960s.
▪ ARPANET did not use circuit switching
▪ it used packet switching
A packet-switched network does not require a continuous
connection. Instead it splits the messages into smaller
chunks called packets and routes them to the appropriate
place based on the destination address.
The packets can take different routes to the destination.
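The idea can be sketched in a few lines of Python. This is only an illustration: the packet size, the dictionary layout, and the destination address are assumptions, and shuffling the list stands in for packets arriving by different routes.

```python
import random

def to_packets(message, size, destination):
    """Split a message into chunks, each tagged with its position and destination."""
    return [
        {"dest": destination, "seq": i, "data": message[start:start + size]}
        for i, start in enumerate(range(0, len(message), size))
    ]

def reassemble(packets):
    """Rebuild the message by putting the packets back in sequence order."""
    return "".join(p["data"] for p in sorted(packets, key=lambda p: p["seq"]))

message = "Thou map of woe, that thus dost talk in signs!"
packets = to_packets(message, size=8, destination="209.202.161.240")

# Packets may take different routes, so they can arrive in any order.
random.shuffle(packets)

print(reassemble(packets) == message)
```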
Packet Switching
[Figure: the original message (“Thou map of woe, that thus dost talk in signs!”) is broken into packets, each carrying the sender address, and the original message is reassembled from the packets at the destination.]
Packet Switching
Isn’t this more complicated?
While packet switching may seem a more complicated
and inefficient approach than circuit switching, it is:
▪ more robust (it is not reliant on a single pathway
that may fail) and
▪ a more efficient use of network resources (since a
single link can carry multiple connections at once).
Short History of the Internet
Perhaps not short enough
The early ARPANET network was funded and
controlled by the United States government, and was
used exclusively for academic and scientific purposes.
The early network started small with just a handful of
connected campuses in 1969 and grew to a few
hundred by the early 1980s.
TCP/IP
Rides to the rescue
To unify the disparate networks and promote their
growth, a common suite of protocols was invented.
By 1981, new networks built in the US began to adopt the
TCP/IP (Transmission Control Protocol / Internet Protocol)
communication model (discussed in the next section),
while older networks were transitioned over to it.
Tim Berners-Lee
I meant Sir Tim Berners-Lee
The invention of the WWW is usually attributed to
the British Tim Berners-Lee, who, along with the
Belgian Robert Cailliau, published a proposal in
1990 for a hypertext system while both were
working at CERN in Switzerland.
Core Features of the Web
Shortly after that initial proposal Berners-Lee developed
the main features of the web:
1. A URL to uniquely identify a resource on the WWW.
2. The HTTP protocol to describe how requests and
responses operate.
3. A software program (later called web server software)
that can respond to HTTP requests.
4. HTML to publish documents.
5. A program (later called a browser) to make HTTP
requests from URLs and that can display the HTML it
receives.
W3C
The World Wide Web Consortium
In late 1994, Berners-Lee helped found the World Wide Web
Consortium (W3C), which would soon become the international
standards organization that would oversee the growth of the
web.
This growth was very much facilitated by CERN's decision not to
patent the work and ideas of its employee, and instead to leave
the web protocols and code base royalty free.
Web Apps Compared to Desktop Apps
First the advantages of web apps
Some of the advantages of web applications include:
• Accessible from any internet-enabled computer.
• Usable with different operating systems and browser platforms.
• Easier to roll out program updates, since the software only needs to be updated
on the server and not on every desktop in the organization.
• Centralized storage on the server means fewer concerns about local
storage (which is important for sensitive information such as health care
data).
Web Apps Compared to Desktop Apps
Now the disadvantages of web apps
Some of the disadvantages of web applications include:
• Requirement to have an active internet connection (the internet is not always
available everywhere at all times).
• Security concerns about sensitive private data being transmitted over the
internet.
• Concerns over the storage, licensing and use of uploaded data.
• Problems with certain websites on certain browsers not looking quite right.
• Limited access to the operating system can prevent software and hardware
from being installed or accessed (like Adobe Flash on iOS).
What is an “Intranet”?
A short digression
One of the more common terms you might encounter
in web development is the term “intranet” (with an
“a”), which refers to an internet network that is local
to an organization or business.
Intranet resources are often private, meaning that only
employees (or authorized external parties such as
customers or suppliers) have access to those
resources.
Thus Internet (with an “e”) is a broader term that
encompasses both private (intranet) and public
networked resources.
What is an “Intranet”?
Intranets are typically protected from unauthorized external access
via security features such as firewalls or private IP ranges.
Because intranets are private, search engines such as Google have
limited or no access to content within a private intranet.
Due to this private nature, it is difficult to accurately gauge, for
instance, how many web pages exist within intranets, and what
technologies are more common in them.
Some especially expansive estimates guess that almost half of all web
resources are hidden in private intranets.
Intranet versus Internet
[Figure: financial and other enterprise systems sit behind a firewall. Off-site workers might be able to access the internal system; customers and corporate partners might be able to access the internal system; the public can access the public corporate web system.]
Static Web Sites
Partying Like It’s 1995
In the earliest days of the web, a webmaster (the term popular
in the 1990s for the person who was responsible for creating
and supporting a web site) would publish web pages, and
periodically update them.
In those early days, the skills needed to create a web site were
pretty basic: one needed knowledge of the HTML markup
language and perhaps familiarity with editing and creating
images.
This type of web site is commonly referred to as a static web site,
in that it consists only of HTML pages that look identical for all
users at all times.
Static Web Sites
[Figure: serving a static web page]
1. The user asks: “I want to see vacation.html”
2. Server retrieves files (vacation.html, picture.jpg) from its hard drive
3. Server “sends” HTML and then later the image to browser
4. Browser displays files
Dynamic Web Sites
Within a few years of the invention of the web, sites
began to get more complicated as more and more
sites began to use programs running on web servers
to generate content dynamically.
Dynamic Web Sites
[Figure: serving a dynamic web page]
1. The user asks: “I want to see vacation.php”
2. Server recognizes that it must run a dynamic script (vacation.php) that is on its hard drive.
3. Server executes or interprets the script.
4. Script “outputs” HTML.
5. Server “sends” generated HTML and the image file to user.
6. Browser displays files.
Dynamic Web Sites
What are they?
These server-based programs would read content from
databases, interface with existing enterprise computer
systems, communicate with financial institutions, and
then output HTML that would be sent back to the
users’ browsers.
In this book, this type of web site is called a
dynamic web site because the page content is
created at run time by a program written by a
programmer; this page content can vary from user to
user.
Web 2.0 and Beyond
In the mid 2000s, a new buzz-word entered the
computer lexicon: web 2.0.
This term had two meanings, one for users and one
for developers.
Web 2.0
Its meaning for users
For the users, Web 2.0 referred to an interactive
experience where users could contribute and consume web
content, thus creating a more user-driven web experience.
Web 2.0
Its meaning for developers
For software developers, Web 2.0 also referred to a change in the
paradigm of how dynamic web sites are created.
Programming logic, which previously existed only on
the server, began to migrate to the browser.
This required learning JavaScript, a rather tricky programming
language that runs in the browser, as well as mastering the rather
difficult programming techniques involved in asynchronous
communication.
Section 2 of 8
INTERNET PROTOCOLS
What’s a Protocol?
The internet exists today because of a suite of
interrelated communications protocols.
A protocol is a set of rules that partners in
communication use when they communicate.
A Layered Architecture
The TCP/IP Internet protocols were originally
abstracted as a four-layer stack.
Later abstractions subdivide it further into five or
seven layers.
Since we are focused on the top layer anyhow, we will
use the earliest and simplest four-layer network
model.
Four Layer Network Model
[Figure: the four layers, from top to bottom: application, transport, internet, link.]
Link Layer
Save this for your networking course
The link layer is the lowest layer, responsible for both the
physical transmission across media (wires, wireless) and
establishing logical links.
It handles issues like packet creation, transmission,
reception and error detection, collisions, line sharing and
more.
Internet Layer
The internet layer (sometimes also called the IP Layer)
routes packets between communication partners across
networks.
Internet Protocol (IP)
The Internet uses Internet Protocol (IP) addresses to
identify destinations on the Internet.
Every device connected to the Internet has an IP
address, which is a numeric code that is meant to
uniquely identify it.
IP addresses and the Internet
[Figure: devices on the Internet and their IP addresses, including 22.15.216.13, 142.108.149.36, 192.168.123.254, 10.239.28.131, and 142.181.80.3. One device shows its IPv4 address in a Windows IP Configuration listing.]
IP Addresses
Two types
IPv4 addresses are the IP addresses from the original
TCP/IP protocol.
In IPv4, an address is written as up to 12 decimal digits
(implemented as four 8-bit integers), with a dot between each integer.
Since an unsigned 8-bit integer's maximum value is 255,
four integers together can encode 2^32, or approximately
4.3 billion, unique IP addresses.
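The arithmetic can be checked with Python's standard ipaddress module. This is a small sketch; the address is just the example used in these slides.

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.123.254")

# The dotted form is four integers, each between 0 and 255 (8 bits).
octets = [int(part) for part in str(addr).split(".")]

# The same address stored as a single 32-bit integer.
as_int = int(addr)

# Four 8-bit integers give 2**32 possible addresses.
total = 2 ** 32

print(octets)
print(as_int)
print(total)
```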
IP Addresses
Two types
To future-proof the Internet against the roughly 4.3 billion (2^32) limit,
a new version of the IP protocol was created, IPv6.
This newer version uses eight 16-bit integers for 2^128
unique addresses, over a billion billion times the number
in IPv4.
These 16-bit integers are normally written in
hexadecimal, due to their longer length.
IPv4 (32 bits): 4 × 8-bit components, 2^32 addresses
192.168.123.254
IPv6 (128 bits): 8 × 16-bit components, 2^128 addresses
3fae:7a10:4545:9:291:e8ff:fe21:37ca
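The same standard module handles IPv6. The sketch below takes the example address above, shows its eight 16-bit groups, and compares the sizes of the two address spaces.

```python
import ipaddress

addr6 = ipaddress.IPv6Address("3fae:7a10:4545:9:291:e8ff:fe21:37ca")

# "exploded" pads every group to four hexadecimal digits.
groups = addr6.exploded.split(":")

# Each group is one 16-bit integer written in hexadecimal.
values = [int(group, 16) for group in groups]

# How many times larger the IPv6 address space is than the IPv4 one.
ratio = 2 ** 128 // 2 ** 32

print(groups)
print(ratio > 10 ** 18)
```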
Transport Layer
The transport layer ensures transmissions arrive, in order,
and without error.
This is accomplished through a few mechanisms.
First, the data is broken into packets formatted according to the
Transmission Control Protocol (TCP).
Transport Layer
Secondly, each packet is acknowledged back to the sender so
in the event of a lost packet, the transmitter will realize a
packet has been lost since no ACK arrived for that packet.
That packet is retransmitted, and although out of order, is
reordered at the destination.
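The acknowledge-and-resend idea can be simulated in a few lines. This is a toy model, not real TCP: the function name, the drop_first_attempt parameter, and the way a lost packet is simulated are all assumptions made for illustration.

```python
def send_reliably(chunks, drop_first_attempt):
    """Deliver chunks over a simulated lossy link, resending until each is ACKed.

    drop_first_attempt is a set of sequence numbers whose first
    transmission is lost (a hypothetical way to simulate packet loss).
    """
    received = {}                                # seq -> data, in arrival order
    unacked = dict(enumerate(chunks, start=1))   # packets still waiting for an ACK
    already_dropped = set()
    transmissions = 0
    while unacked:
        for seq, data in list(unacked.items()):
            transmissions += 1
            if seq in drop_first_attempt and seq not in already_dropped:
                already_dropped.add(seq)         # packet lost, so no ACK comes back
                continue
            received[seq] = data                 # receiver stores the packet
            del unacked[seq]                     # its ACK reaches the sender
    # The receiver reorders the packets by sequence number.
    message = "".join(received[seq] for seq in sorted(received))
    return message, list(received), transmissions

chunks = ["Thou map of woe, ", "that thus dost ", "talk in signs!"]
message, arrival_order, transmissions = send_reliably(chunks, drop_first_attempt={2})
print(message)
print(arrival_order)
print(transmissions)
```

Packet 2 is lost once, so it arrives last and four transmissions are needed for three packets, yet the message is still reassembled in the right order.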
TCP Packets
[Figure: TCP packets]
1. Message broken into packets with a sequence number: [1] “Thou map of woe,” [2] “that thus dost” [3] “talk in signs!”
2. For each TCP packet sent, an ACK (acknowledgement) must be received back.
3. Eventually, sender will resend any packets that didn't get an ACK back.
4. Message reassembled from packets and ordered according to their sequence numbers.
Application Layer
With the application layer, we are at the level of protocols
familiar to most web developers.
Application layer protocols implement process-to-process
communication and are at a higher level of abstraction in
comparison to the low-level packet and IP address protocols
in the layers below it.
Examples: HTTP, SSH, FTP, DNS, POP, SMTP.
Section 3 of 8
CLIENT-SERVER MODEL
Client-Server Model
What is it?
The web is sometimes referred to as a client-server
model of communications.
In the client-server model, there are two types of
actors: clients and servers.
The server is a computer agent that is normally active
24 hours a day, 7 days a week (or simply 24/7),
listening for queries from any client that makes a
request.
A client is a computer agent that makes requests and
receives responses from the server, in the form of
response codes, images, text files, and other data.
Request-Response Loop
Within the client-server model, the request-response
loop is the most basic mechanism on the server for
receiving requests and transmitting data in response.
The client initiates a request to a server and gets a
response that could include some resource like an
HTML file, an image or some other data.
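A minimal sketch of the loop, with no real networking: the RESOURCES table, the paths in it, and the shape of the response dictionary are assumptions chosen for illustration.

```python
# Hypothetical resources the server can return, keyed by request path.
RESOURCES = {
    "/index.html": ("text/html", "<html><body>Hello</body></html>"),
    "/picture.jpg": ("image/jpeg", "<binary image data>"),
}

def handle_request(path):
    """Turn one request into one response."""
    if path in RESOURCES:
        content_type, body = RESOURCES[path]
        return {"status": 200, "content_type": content_type, "body": body}
    return {"status": 404, "content_type": "text/plain", "body": "Not Found"}

def serve(incoming_requests):
    """The request-response loop: receive a request, send back a response, repeat."""
    return [handle_request(path) for path in incoming_requests]

responses = serve(["/index.html", "/picture.jpg", "/missing.html"])
for response in responses:
    print(response["status"], response["content_type"])
```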
The Peer-to-Peer Alternative
Not actually illegal
In the peer-to-peer model, each computer is
functionally identical, and each node is able to send to
and receive from any other node directly.
In such a model each peer acts as both a client and
server able to upload and download information.
Peer-to-Peer Model
Request and Respond
Server Types
A server is rarely just a single computer
Earlier, the server was shown as a single machine,
which is fine from a conceptual standpoint.
Clients make requests for resources from a URL; to the
client, the server is a single machine.
However, real-world web sites are typically not
served from a single server machine, but by many
servers.
It is common to split the functionality of a web site
between several different types of server.
Server Farms
Have no cows
A single web server that is also acting as an application
or database server will be hard-pressed to handle more
than a few hundred requests a second, so the usual
strategy for busier sites is to use a server farm.
Server Farms
The goal behind server farms is to distribute incoming
requests between clusters of machines so that any
given web or data server is not excessively
overloaded.
Special routers called load balancers distribute
incoming requests to available machines.
Server Farms
Even if a site can handle its load via a single server, it is
not uncommon to still use a server farm because it
provides failover redundancy.
That is, if the hardware fails in a single server, one of
the replicated servers in the farm will maintain the
site’s availability.
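Both ideas, distributing requests and failing over, can be sketched with a simple round-robin balancer. The class, its method names, and the server names are hypothetical; real load balancers use more sophisticated strategies.

```python
from itertools import cycle

class LoadBalancer:
    """Hand out servers in turn, skipping any that are marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        """Record that a server has failed so it receives no more requests."""
        self.healthy.discard(server)

    def route(self):
        """Return the next healthy server for an incoming request."""
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")

balancer = LoadBalancer(["web1", "web2", "web3"])
before_failure = [balancer.route() for _ in range(4)]

balancer.mark_down("web2")          # simulate a hardware failure
after_failure = [balancer.route() for _ in range(4)]

print(before_failure)
print(after_failure)
```

After web2 fails, requests keep flowing to the remaining servers, which is the failover redundancy described above.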
Server Racks
In a server farm, the computers do not look like the
ones in your house.
Instead, these computers are more like the plates
stacked in your kitchen cabinets.
That is, a farm will have its servers and hard drives
stacked on top of each other in server racks.
A typical server farm will consist of many server racks,
each containing many servers.
Server Rack
Fiber channel switches
Rack management server
Test server
Keyboard tray and flip-up monitor
Patch panel
Production web server
Production data server
RAID HD arrays
Patch panel
Production web server
Production data server
Batteries and UPS
Data Centers
Server farms are typically housed in special facilities
called data centers.
Hypothetical Data Center
[Figure: a hypothetical data center, showing server racks, air conditioning, backup generators, and UPS (batteries).]
Section 4 of 8
WHERE IS THE INTERNET?
Is the Internet a Cloud?
The Internet is often visually represented as a cloud,
which is perhaps an apt way to think about the Internet
given the importance of light and magnetic pulses to its
operation.
From the Computer to the Local Provider
Our main experience of the hardware component of
the Internet is that which we experience in our homes.
In the House
The broadband modem (also called a cable modem or DSL modem)
is a bridge between the network hardware outside the house
(typically controlled by a phone or cable company) and the network
hardware inside the house.
These devices are often supplied by the ISP.
Routers
The wireless router is a device we typically need to
purchase and install.
Routers are in fact among the most important and
ubiquitous hardware devices that make the Internet
work.
At its simplest, a router is a hardware device that
forwards data packets from one network to another
network.
Routers and Routing Tables
[Figure: a packet (sender address 142.109.149.46, destination address 209.202.161.240, sequence number 1, data “Thou map of woe,”) passes through routers such as 65.47.242.9 and 140.239.191.1. Each router has a routing table with Address and Next Hop columns, with entries such as 208.68.17.3 → 140.239.191.1 and 142.109.149.146 → 66.37.223.130, etc.]
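A routing table lookup can be sketched with the standard ipaddress module. The network prefixes and the default route below are assumptions made for illustration, loosely based on the addresses in the figure; the rule used here is that the most specific matching entry wins.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
ROUTES = [
    (ipaddress.ip_network("208.68.17.0/24"), "140.239.191.1"),
    (ipaddress.ip_network("142.109.0.0/16"), "66.37.223.130"),
    (ipaddress.ip_network("0.0.0.0/0"), "65.47.242.9"),   # default route
]

def next_hop(destination):
    """Pick the next hop from the most specific route that contains the destination."""
    dest = ipaddress.ip_address(destination)
    matches = [(network, hop) for network, hop in ROUTES if dest in network]
    network, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop

print(next_hop("208.68.17.3"))
print(next_hop("142.109.149.146"))
print(next_hop("209.202.161.240"))   # no specific entry, so the default route is used
```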
Fiber Optic Cable
Fiber optic cable (or simply optical fiber) is a glass-
based wire that transmits light and has significantly
greater bandwidth and speed in comparison to metal
wires.
In some cities (or large buildings), you may have fiber
optic cable going directly into individual buildings; in
such a case the fiber junction box will reside in the
building.
Section 5 of 8
DOMAIN NAME SYSTEM (DNS)
Domain Name System
Why do we need it?
As elegant as IP addresses may be, human beings do
not enjoy having to recall long strings of numbers.
Instead of IP addresses, we use names, which the Domain
Name System (DNS) translates into IP addresses.
DNS Overview
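[Figure: the user needs to go to www.funwebdev.com. The computer asks “What's the IP address of www.funwebdev.com?” and a DNS server answers “Here it is, it's: 66.147.244.79”. The page is then requested from that address, and the web server answers “Here it is ...”.]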
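The lookup step can be sketched as a simple table from names to addresses. The DNS_RECORDS table and the resolve function are hypothetical stand-ins for a real DNS server; in a real Python program the standard socket.getaddrinfo function performs the lookup over the network.

```python
# Hypothetical DNS data: a real DNS server holds many such records.
DNS_RECORDS = {
    "www.funwebdev.com": "66.147.244.79",
}

def resolve(name):
    """Answer the question "What's the IP address of <name>?"."""
    key = name.lower().rstrip(".")   # domain names are not case sensitive
    if key not in DNS_RECORDS:
        raise LookupError(f"no address known for {name}")
    return DNS_RECORDS[key]

address = resolve("WWW.FunWebDev.com")
print(address)
```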
Domain Levels
server1.www.funwebdev.com
Most general
Top-Level Domain (TLD): com
Second-Level Domain (SLD): funwebdev
Third-Level Domain: www
Fourth-Level Domain: server1
Most specific
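Since the levels read from right to left, splitting a name on its dots and reversing the result lists them from most general to most specific. A small sketch (the function name is hypothetical):

```python
def domain_levels(hostname):
    """Return the labels of a domain name, most general (TLD) first."""
    labels = hostname.rstrip(".").split(".")
    return list(reversed(labels))

levels = domain_levels("server1.www.funwebdev.com")
for number, label in enumerate(levels, start=1):
    print(f"Level {number}: {label}")
```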
Types of TLDs
Generic top-level domains (gTLD)
Country code top-level domains (ccTLD)
Name Registration
How are domain names assigned?
Special organizations or companies called domain
name registrars manage the registration of domain
names.
These domain name registrars are given permission to
do so by the appropriate generic top-level domain
(gTLD) registry and/or a country code top-level domain
(ccTLD) registry.
Section 6 of 8
UNIFORM RESOURCE LOCATORS (URL)
URL Components
In order to allow clients to request particular
resources from the server, a naming mechanism is
required so that the client knows how to ask the
server for the file.
For the web that naming mechanism is the Uniform
Resource Locator (URL).
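Python's standard urllib.parse module can split a URL into its parts. The URL below is a hypothetical example; the component names (scheme, host, path, query, fragment) are the ones used by that module.

```python
from urllib.parse import urlsplit, parse_qs

# A hypothetical URL used only for illustration.
url = "http://www.funwebdev.com/index.php?page=17#article"

parts = urlsplit(url)
query = parse_qs(parts.query)

print(parts.scheme)     # the protocol to use
print(parts.hostname)   # the domain name of the server
print(parts.path)       # which resource on that server
print(query)            # extra data passed to the resource
print(parts.fragment)   # a location within the returned document
```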
Section 7 of 8
HYPERTEXT TRANSFER PROTOCOL
(HTTP)
HTTP
The HTTP protocol establishes a TCP connection on
port 80 (by default).
The server waits for the request, and then responds
with a response code, headers and an optional
message (which can include files).
HTTP
GET /index.html HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0.1
Accept: text/html,application/xhtml+xml
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cache-Control: max-age=0

HTTP/1.1 200 OK
Date: Mon, 22 Oct 2012 02:43:49 GMT
Server: Apache
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 4538
Connection: close
Content-Type: text/html; charset=UTF-8

<html>
<head> ...
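A response like the one above is plain text with a fixed layout: a status line, then headers, then a blank line, then the message. The sketch below parses a shortened copy of that response; the parse_response function is a hypothetical helper, not part of any library.

```python
# A shortened copy of the response above; lines end with CR LF in real HTTP.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 22 Oct 2012 02:43:49 GMT\r\n"
    "Server: Apache\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "\r\n"
    "<html>..."
)

def parse_response(raw):
    """Split a raw HTTP response into response code, reason, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")       # a blank line ends the headers
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")        # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body

code, reason, headers, body = parse_response(RAW_RESPONSE)
print(code, reason)
print(headers["content-type"])
```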
Browser Tools for HTTP
Modern browsers provide the developer with tools
that can help us understand the HTTP traffic for a
given page.
HTTP Request Methods
The HTTP protocol defines several different types of
requests, each with a different intent and
characteristics.
The most common requests are GET and POST,
along with HEAD.
Other requests, such as PUT, DELETE, CONNECT,
TRACE and OPTIONS are seldom used, and are not
covered here.
GET versus POST requests
[Figure: submitting a form sends a POST request to the web server; following a hyperlink sends a GET request.]
<form method="POST" action="FormProcess.php">
(form fields: Artist: Picasso, Year: 1906, Nationality: Spain, and a Submit button)
POST /FormProcess.php HTTP/1.1
<a href="SomePage.php">Hyperlink</a>
GET /SomePage.php HTTP/1.1
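One practical difference is where the data travels: a GET request carries it in the URL as a query string, while a POST request carries it in the message body. The sketch below builds both as text. The field names and the example.com host are assumptions; the values come from the form in the figure.

```python
from urllib.parse import urlencode

# Hypothetical field names for the form shown in the figure.
form = {"artist": "Picasso", "year": "1906", "nationality": "Spain"}
encoded = urlencode(form)

# GET: the data is appended to the requested URL.
get_request = (
    f"GET /FormProcess.php?{encoded} HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "\r\n"
)

# POST: the data is sent in the body, after the headers.
post_request = (
    "POST /FormProcess.php HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(encoded)}\r\n"
    "\r\n"
    f"{encoded}"
)

print(get_request)
print(post_request)
```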
WAMP Software Stack
Throughout this textbook we will rely on the WAMP
software stack, which refers to the Windows
operating system, Apache web server, MySQL
database, and PHP scripting language.
What You’ve Learned
1 Definitions and History
2 Internet Protocols
3 Client-Server Model
4 Where is the Internet?
5 Domain Name System
6 Uniform Resource Locators (URL)
7 Hypertext Transfer Protocol (HTTP)