Lecture 8 HTTP&Apache

The document provides an overview of HTTP (Hypertext Transfer Protocol), detailing its request/response structure, message formats, and the role of URIs in identifying web resources. It also discusses TCP connections, emphasizing the advantages of persistent connections introduced in HTTP/1.1, and the importance of caching for efficient web page retrieval. Additionally, it highlights Apache as a leading web server, noting its security features and market share.


Nasser Abouzakhar

23rd, Nov 2017

Content
Introduction
HTTP
– Request Messages
– Response Messages
– Uniform Resource Identifiers
– TCP Connections
– Caching
Apache

Introduction
WWW made the Internet accessible
– Originally designed to organise and retrieve information
using hypertext interlinked docs

Hypertext is about having one doc that can link to another doc
– HTTP and HTML were designed to meet that requirement

URLs provide information that allows objects on the Web to be
located (basis of the hypertext system)
– They point to files that may be located on other machines, which
is the core of the hypertext part of HTTP & HTML

Source: Peterson & Davie, 2012 p 708


HTTP (Hyper Text Transfer Protocol)

HTTP
It is a request/response protocol
– Web browsers use HTTP to fetch web pages from
web servers
– It is a text-oriented protocol running over TCP

Example: If you opened the UH URL
http://www.herts.ac.uk/index.html, your web browser would
open a TCP connection to the web server www.herts.ac.uk
– Your browser would immediately retrieve and display the file
called index.html
– Often webpages contain images, text and objects such as
audio, video clips, pieces of code, or URLs
Source: Peterson & Davie, 2012 p 709

HTTP Message
All HTTP messages have the general form:
    START_LINE <CRLF>
    MESSAGE_HEADER <CRLF>
    <CRLF>
    MESSAGE_BODY <CRLF>
where <CRLF> stands for carriage-return + line-feed.

START_LINE indicates whether this is a request message or a
response message. In the case of
– a request, it identifies the “remote procedure” to be executed
– a response, it identifies the status of the request

MESSAGE_HEADER lines carry options and parameters, e.g. the
server’s host name

MESSAGE_BODY is where a server would place the requested
page when responding to a request

Source: Peterson & Davie, 2012 p 710
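
The message layout described on this slide can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names build_message and parse_message are made up for this example, and real HTTP parsing has to handle many more cases (repeated headers, binary bodies, and so on).

```python
CRLF = "\r\n"  # carriage-return + line-feed

def build_message(start_line, headers, body=""):
    """Assemble START_LINE, the MESSAGE_HEADER lines, a blank line, then MESSAGE_BODY."""
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join([start_line, *header_lines, "", body])

def parse_message(raw):
    """Split a text message back into (start_line, headers, body)."""
    head, _, body = raw.partition(CRLF + CRLF)   # the blank line ends the headers
    start_line, *header_lines = head.split(CRLF)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")     # split on the first colon only
        headers[name.strip()] = value.strip()
    return start_line, headers, body
```

The blank line (two consecutive CRLFs) is what lets a receiver tell where the headers stop and the body starts.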


Request Messages
The first line of an HTTP request message specifies three things:
– The operation to be performed (e.g. GET, HEAD),
– The Webpage the operation should be performed on, and
– The HTTP version

Source: Peterson & Davie, 2012 p 711


Example (1)
The START_LINE
– Option 1: uses an absolute URL, which indicates that the client
wants the server on host www.cs.princeton.edu to return the page
named index.html
– Option 2: uses a relative identifier and specifies the host
name in one of the MESSAGE_HEADER lines

Host is one of the MESSAGE_HEADER fields



Source: Peterson & Davie, 2012 p 712
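
The two options can be written out as text. The sketch below is illustrative only: the function names are invented for this example, and the path is written as /index.html (with a leading slash), which is how request targets normally appear on the wire.

```python
CRLF = "\r\n"

def request_line(method, target, version="HTTP/1.1"):
    """START_LINE of a request: operation, page, HTTP version."""
    return f"{method} {target} {version}"

def get_absolute(host, path):
    """Option 1: the full URL goes in the START_LINE."""
    return request_line("GET", f"http://{host}{path}") + CRLF + CRLF

def get_relative(host, path):
    """Option 2: relative identifier in the START_LINE, host name in a Host header."""
    return request_line("GET", path) + CRLF + f"Host: {host}" + CRLF + CRLF
```

Both forms name the same page; the second is the one browsers usually send directly to a web server.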
Response Messages
Response messages start with a single START_LINE, which includes:
– The HTTP version,
– A three-digit code indicating whether or not the request
was successful, and
– A text string giving the reason for the response

Source: Peterson & Davie, 2012 p 712



Example (2)
The START_LINE either
– indicates that the server managed to satisfy the request, or
– shows that it was not able to satisfy the request

A response can also report that a page has moved, e.g. the
Princeton Computer Science Department webpage had moved
from http://www.cs.princeton.edu/index.html to
http://www.princeton.edu/cs/index.html

Source: Peterson & Davie, 2012 p 712
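
A response START_LINE can be taken apart as sketched below. The status lines used in the usage note (200 OK, 404 Not Found, 301 Moved Permanently) are standard HTTP examples chosen for illustration, not copied from the slide, and the function names are hypothetical.

```python
def parse_status_line(line):
    """Split a response START_LINE into (version, code, reason)."""
    version, code, reason = line.split(" ", 2)  # the reason text may contain spaces
    return version, int(code), reason

def outcome(code):
    """Map the first digit of the three-digit code to its general meaning."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")
```

For example, parse_status_line("HTTP/1.1 301 Moved Permanently") yields the code 301, which outcome classifies as a redirection, the kind of response a server would give for a page that has moved.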


Response Messages, cont.
If successful, the response message will carry the requested
page, which is an HTML document

The requested page may contain non-textual data (e.g. a GIF
image), which may be encoded using MIME (base64)

The MESSAGE_HEADER lines give attributes of the page
contents, including:
– Content-Length: the number of bytes in the contents
– Expires: the time at which the contents are considered stale
– Last-Modified: the time at which the contents were last
modified at the server
Source: Peterson & Davie, 2012 p 713
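
As a rough sketch, the three header fields above could be read like this in Python. The header values used in the usage note are invented for illustration; email.utils.parsedate_to_datetime is the standard-library parser for this date format.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def page_attributes(headers):
    """Extract the three attributes from a dict of response headers."""
    return {
        "length": int(headers["Content-Length"]),
        "expires": parsedate_to_datetime(headers["Expires"]),
        "last_modified": parsedate_to_datetime(headers["Last-Modified"]),
    }

def is_stale(attrs, now):
    """The contents are considered stale once the Expires time has been reached."""
    return now >= attrs["expires"]
```

With an Expires value of "Thu, 30 Nov 2017 12:00:00 GMT", a check made on 24 Nov 2017 finds the page still fresh, while a check made on 1 Dec 2017 finds it stale.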
URIs (Uniform Resource Identifiers)

Uniform Resource Identifiers (URIs)
A URI is a character string that identifies a resource
– URLs are one type of URI
– A resource can be anything that has identity such as a doc, video

The format of URIs allows different sorts of resource identifiers to
be incorporated into the URI space
– The first part of a URI is a scheme that names a particular way of
identifying a certain kind of resource
– The second part, separated from the first part by a colon, is the
scheme-specific part

Source: Peterson & Davie, 2012 p 714
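
The scheme / scheme-specific split can be illustrated with a short Python sketch. The mailto address below is a made-up example, and split_uri is a hypothetical helper; urlsplit is the standard-library parser for URL-style URIs.

```python
from urllib.parse import urlsplit

def split_uri(uri):
    """Return (scheme, scheme_specific_part), split at the first colon."""
    scheme, _, rest = uri.partition(":")
    return scheme, rest

# Two different schemes, two different kinds of resource
mail = split_uri("mailto:someone@example.org")
web = split_uri("http://www.herts.ac.uk/index.html")

# For http URLs the scheme-specific part has further structure
parts = urlsplit("http://www.herts.ac.uk/index.html")
host, path = parts.hostname, parts.path
```

Here the mailto URI identifies a mailbox, while the http URI is a URL whose scheme-specific part names a host and a file on it.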


TCP Connection
HTTP version 1.0 established a separate TCP connection for each
data item retrieved from the server

[Figure: HTTP 1.0 behaviour. Note: some of the TCP ACKs are not shown]

Source: Peterson & Davie, 2012 p 715



TCP Connection, cont.


HTTP version 1.1 introduced persistent connections
– The client & server can exchange multiple request/response
messages over the same TCP connection

Advantages:
– Eliminates the connection setup overhead
– TCP’s congestion window mechanism operates more efficiently,
because it is not necessary to go through the slow start phase
for each page
Source: Peterson & Davie, 2012 p 715
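
The setup saving can be estimated with a deliberately simplified model, sketched below. It assumes one round trip for the TCP handshake and one round trip per request/response, and it ignores transmission time, slow start, connection teardown and parallel connections; the function names are made up for this example.

```python
def round_trips_http10(num_objects):
    """HTTP 1.0 style: every object pays for a handshake plus a request/response."""
    return 2 * num_objects

def round_trips_persistent(num_objects):
    """HTTP 1.1 persistent connection: one handshake, then one exchange per object."""
    return 1 + num_objects

# A page made of one HTML file and three images (4 objects in total)
objects = 4
saved = round_trips_http10(objects) - round_trips_persistent(objects)
```

Under these assumptions the page needs 8 round trips with separate connections and 5 with a persistent one, and the gap grows with the number of objects on the page.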
TCP Connection, cont.
Disadvantages:
– Neither the client nor the server knows how long to keep a
particular connection open
– The server might be asked to keep connections open on behalf of
1000s of clients

Solution:
– The server must time out and close a connection if it has
received no requests on the connection for a period of time

Source: Peterson & Davie, 2012 p 716
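
The timeout idea can be sketched as a small bookkeeping table. This is an illustrative sketch, not how any particular server implements it: the class name, the connection identifiers and the explicit "now" argument (used instead of a real clock so the logic is easy to follow) are all assumptions.

```python
class IdleConnectionTable:
    """Tracks when each connection last carried a request."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout   # seconds a connection may stay silent
        self.last_request = {}             # connection id -> time of last request

    def on_request(self, conn_id, now):
        """Record activity; this resets the idle period for the connection."""
        self.last_request[conn_id] = now

    def close_idle(self, now):
        """Forget, and return, every connection idle for at least idle_timeout."""
        expired = [c for c, t in self.last_request.items()
                   if now - t >= self.idle_timeout]
        for conn_id in expired:
            del self.last_request[conn_id]
        return expired
```

A server loop would call close_idle periodically and close the sockets it returns, which bounds the number of silent connections the server has to hold open.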


Caching
Benefits:
– Faster retrieval and display of pages from a nearby cache
– Load reduction on the server

Caching can be implemented in various places:
– Internet browser: can cache recently accessed pages
– Website or proxy: can support a single site-wide cache to allow users
to take advantage of previously downloaded pages
– ISP’s router: can peek inside the request message and look at the
URL for the requested page. If it has the page in its cache, it returns it.

Source: Peterson & Davie, 2012 p 717



Caching, cont.
Caching Requirement:
– The cache needs to make sure that it is not responding with
an out-of-date version of the requested page

Example: the server assigns an expiration date to each page it
sends back to the client
– The cache remembers this date and does not need to reverify the page
until that expiration date has passed
– After that time, the cache can use the HEAD or conditional
GET operation to verify that it has the most recent copy

Source: Peterson & Davie, 2012 p 717
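
The expiry-then-verify behaviour can be sketched as follows. Everything here is a simplified assumption made for illustration: times are plain numbers, and "origin" is a hypothetical callable standing in for the web server, returning status 304 when the copy identified by if_modified_since is still current (as a conditional GET would) and status 200 with a fresh body otherwise.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    body: str
    expires: float
    last_modified: float

class ExpiringCache:
    def __init__(self, origin):
        # origin(url, if_modified_since) -> (status, body, expires, last_modified)
        self.origin = origin
        self.entries = {}

    def get(self, url, now):
        entry = self.entries.get(url)
        if entry is not None and now < entry.expires:
            return entry.body                   # still fresh: no contact with the server
        since = entry.last_modified if entry else None
        status, body, expires, last_modified = self.origin(url, since)
        if status == 304 and entry is not None:
            entry.expires = expires             # copy confirmed current: extend its life
            return entry.body
        self.entries[url] = Entry(body, expires, last_modified)
        return body
```

The server is contacted only on the first request and after the expiration date has passed; in the second case a 304 answer avoids transferring the page again.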


Apache


Introduction
Apache is the most popular HTTP web server; it is free,
open-source software
It has a web server market share of > 50%
Advantages:
– Stable, efficient and flexible software
– Operates on a large number of popular OS platforms

Used by major websites such as amazon.com and IBM

Combined with Python, Perl, and PHP, Apache allows you
to develop customised and dynamic Web applications

Process Ownership and Security
Each process has an owner with limited rights on the system
Whenever a process is started, it inherits the permissions of its
parent process
– e.g. as a root user, the shell in which you’re doing your work
and any of its processes will have the same rights as you

Apache starts with root permissions to carry out initial network
functions
– It binds itself to port 80 so that it can listen for client requests
– Once it does this, it can give up its rights and run as a non-root
user, as indicated in its configuration files
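
The bind-then-drop pattern can be sketched in Python for a Unix-like system. This is not Apache's code, only an illustration of the idea: the function name and its parameters are hypothetical, and the target uid/gid would normally come from a configuration file.

```python
import os
import socket

def bind_then_drop(port, uid, gid, host="0.0.0.0"):
    """Bind a listening socket, then give up root rights if we have them."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))      # ports below 1024 (e.g. 80) normally need root
    sock.listen()
    if os.geteuid() == 0:        # only root can (and needs to) drop privileges
        os.setgroups([])         # drop supplementary groups first
        os.setgid(gid)           # then the group, while we still have the right to
        os.setuid(uid)           # finally the user; after this, root is gone
    return sock
```

The socket stays usable after the switch, so the process can keep serving clients on the privileged port while running as an ordinary user.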

Process Ownership and Security, cont.
By limiting the permissions of the server, you reduce the
damage that malicious requests sent to the server can cause
– Any input coming from a client over the network shouldn’t be
allowed to make a CGI script perform unacceptable operations
– Improperly configured web servers make some attacks
easy to carry out

You can use APT to install Apache on a Debian-based Linux
distro as follows:
…:~$ sudo apt-get -y install apache2

Reference
Peterson, L. and Davie, B. (2012) Computer Networks: A Systems
Approach, fifth edition


Resources
www.apache.org
http://httpd.apache.org/
