Module 5
Networked Programs
ARUN K H, ISE, AIT
IP Address
An Internet Protocol address (IP address) is a unique numerical label (written as numbers separated by full stops, e.g., 192.168.1.1) assigned to each device connected to a computer network that uses the Internet Protocol for communication.
Public and Private IP Address
• The public IP address is assigned by the ISP (e.g., ACT, Vodafone, Airtel), while the private IP address is assigned by the router to each device on the local Wi-Fi network.
• Finding out both IP addresses:
✓ Public IP: search Google for "What is my IP?"
✓ Private IP: run ipconfig in the Command Prompt
✓ From Python: see the sketch below
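You can also find the private IP from Python itself. A minimal sketch (an illustration added here, not from the original slides) using the socket module:

import socket

# Print this machine's host name and the private IP address it resolves to.
# Note: depending on the hosts file, this may return the loopback address 127.0.0.1.
hostname = socket.gethostname()
private_ip = socket.gethostbyname(hostname)
print("Host name :", hostname)
print("Private IP:", private_ip)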
Static and Dynamic IP Address
A static IP address is configured manually and does not change, while a dynamic IP address is assigned automatically (typically by a DHCP server) and may change over time.
• Python provides two levels of access to network services. At a
low level, you can access the basic socket support in the
underlying operating system, which allows you to implement
clients and servers for both connection-oriented and
connectionless protocols.
• Python also has libraries that provide higher-level access to
specific application-level network protocols, such as FTP,
HTTP, and so on.
• This chapter gives you an understanding of the most famous concept in networking: socket programming.
Why use Sockets?
Sockets are the backbone of networking. They make the
transfer of information possible between two different
programs or devices.
For example, when you open up your browser, you as a client
are creating a connection to the server for the transfer of
information.
What are Sockets?
• Sockets are the endpoints of a bidirectional communications channel.
Sockets may communicate within a process, between processes on the
same machine, or between processes on different continents.
• A single network connection will have two sockets, one for each communicating device or program.
• These sockets are a combination of an IP address and a Port.
• A single device can have any number ('n') of sockets, based on the port number that is being used.
• Different ports are available for different types of protocols
• Sockets may be implemented over a number of different channel
types: Unix domain sockets, TCP, UDP, and so on
List of some important modules in Python network/Internet programming:

Protocol   Common function      Port No   Python module
HTTP       Web pages            80        httplib, urllib, xmlrpclib
NNTP       Usenet news          119       nntplib
FTP        File transfers       20        ftplib, urllib
SMTP       Sending email        25        smtplib
POP3       Fetching email       110       poplib
IMAP4      Fetching email       143       imaplib
Telnet     Command lines        23        telnetlib
Gopher     Document transfers   70        gopherlib, urllib
How to achieve Socket Programming in Python:
• To achieve socket programming in Python, you will need to import the socket module. This module consists of built-in methods that are required for creating sockets and for connecting them to each other.
Some of the important methods are as follows:
socket.socket(): used to create sockets (required on both the server and client ends to create sockets)
socket.accept(): used to accept a connection; it returns a pair of values (conn, address), where conn is a new socket object for sending or receiving data and address is the address of the socket present at the other end of the connection
socket.bind(): used to bind to the address that is specified as a parameter
socket.close(): used to mark the socket as closed
socket.connect(): used to connect to a remote address specified as the parameter
socket.listen(): enables the server to accept connections
Sequence of socket API calls and data flow for TCP
Server
• A Server is either a program, a computer, or a device that is devoted to managing network resources. Servers can be on the same device or computer, locally connected to other devices and computers, or even remote.
• There are various types of servers such as database
servers, network servers, print servers, etc.
• Servers commonly make use of methods like socket.socket(), socket.bind(), socket.listen(), etc., to establish a connection and bind to clients.
Server program

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((socket.gethostname(), 1234))
# port number can be anything between 0-65535 (we usually specify
# non-privileged ports, which are > 1023)
s.listen(5)

while True:
    clt, adr = s.accept()
    print(f"Connection to {adr} established")
    # an f-string is a literal string prefixed with f which
    # contains Python expressions inside braces
    # send info to the client socket
    clt.send(bytes("Socket Programming in Python", "utf-8"))
Client
• A Client is either a computer or software that receives information or services from the server. In the client-server model, clients request services from servers.
• The best example is a web browser such as Google
Chrome, Firefox, etc.
• These web browsers request web servers for the
required web pages and services as directed by the
user.
• Other examples include online games, online chats, etc.
Client program

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((socket.gethostname(), 1234))
msg = s.recv(1024)
print(msg.decode("utf-8"))

NOTE: gethostname() is used when the client and server are on the same
computer; otherwise use the server's LAN (local) IP or WAN (public) IP.
The World’s Simplest Web Browser
Perhaps the easiest way to show how the HTTP protocol works is to write a very
simple Python program that makes a connection to a web server and follows the rules
of the HTTP protocol to request a document and display what the server sends back.
import socket

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)

while True:
    data = mysock.recv(20)
    if len(data) < 1:
        break
    print(data.decode(), end='')

mysock.close()
Retrieving an image over HTTP

import socket
import time

HOST = 'data.pr4e.org'
PORT = 80
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect((HOST, PORT))
mysock.sendall(b'GET http://data.pr4e.org/cover3.jpg HTTP/1.0\r\n\r\n')
count = 0
picture = b""

while True:
    data = mysock.recv(5120)
    if len(data) < 1:
        break
    time.sleep(0.25)
    count = count + len(data)
    print(len(data), count)
    picture = picture + data

mysock.close()

# Look for the end of the header (2 CRLF)
pos = picture.find(b"\r\n\r\n")
print('Header length', pos)
print(picture[:pos].decode())

# Skip past the header and save the picture data
picture = picture[pos+4:]
fhand = open("stuff.jpg", "wb")
fhand.write(picture)
fhand.close()
Retrieving web pages with urllib
• While we can manually send and receive data over HTTP using the
socket library, there is a much simpler way to perform this common
task in Python by using the urllib library.
• Using urllib, you can treat a web page much like a file. You simply
indicate which web page you would like to retrieve and urllib handles
all of the HTTP protocol and header details.
import urllib.request

fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
for line in fhand:
    print(line.decode().strip())
Compute the frequency of each word in
the file romeo.txt
import urllib.request

fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
counts = dict()
for line in fhand:
    words = line.decode().split()
    for word in words:
        counts[word] = counts.get(word, 0) + 1
print(counts)
Parsing HTML using regular expressions
One simple way to parse HTML is to use regular expressions to
repeatedly search for and extract substrings that match a
particular pattern.
Here is a simple web page:
<h1>The First Page</h1>
<p>
If you like, you can switch to the
<a href="http://www.dr-chuck.com/page2.htm">
Second Page</a>.
</p>
We can construct a well-formed regular expression to match
and extract the link values from the above text as follows:
href="http://.+?"
import urllib.request
import re

url = input('Enter - ')
html = urllib.request.urlopen(url).read()
links = re.findall(b'href="(http://.*?)"', html)
for link in links:
    print(link.decode())
What is Web Scraping?
• Web Scraping is the technique of automatically extracting data from
websites using software/script.
• Because the data displayed by most websites is intended for public consumption, it is generally legal to copy this information to a file on your computer.
• Python is the most popular language for web scraping. It is an all-rounder and can handle most web-crawling-related processes smoothly.
• Scrapy and Beautiful Soup are among the widely used Python frameworks that make scraping with this language an easy route to take.
Some Python libraries for web scraping:
• Beautiful Soup
• Scrapy
• Requests
• LXML
• Selenium
Parsing HTML and scraping the web
• One of the common uses of the urllib capability in Python is to scrape
the web.
• Web scraping is when we write a program that pretends to be a web
browser and retrieves pages, then examines the data in those pages
looking for patterns.
• As an example, a search engine such as Google will look at the source of
one web page and extract the links to other pages and retrieve those
pages, extracting links and so on.
• Using this technique, Google spiders its way through nearly all of the
pages on the web.
• Google also uses the frequency of links from pages it finds to a
particular page as one measure of how “important” a page is and how
high the page should appear in its search results.
Beautiful Soup
• bs4 — BeautifulSoup 4.
• Beautiful Soup is a Python library for pulling data out of
HTML and XML files.
• It works with your favorite parser to provide idiomatic
ways of navigating, searching, and modifying the parse
tree.
• It commonly saves programmers hours or days of work.
Parsing HTML using BeautifulSoup
• BeautifulSoup is a Python library that can be used to parse HTML input
to extract links.
• You can download and install the BeautifulSoup code from
http://www.crummy.com/software/
• Most HTML is generally broken in ways that cause an XML parser to
reject the entire page of HTML as improperly formed.
• Beautiful Soup tolerates highly flawed HTML and still lets you easily
extract the data you need.
• We will use urllib to read the page and then use Beautiful Soup to
extract the href attributes from the anchor (a) tags.
import urllib.request
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')

# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    print(tag.get('href', None))
You can use BeautifulSoup to pull out various parts of each tag.
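As a minimal sketch (modeled on the py4e urllinks example, not shown in the original slides), the loop above can be expanded to print several parts of each anchor tag:

import urllib.request
from bs4 import BeautifulSoup

url = input('Enter - ')
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')

# Retrieve all of the anchor tags and look at the parts of each tag
tags = soup('a')
for tag in tags:
    print('TAG:', tag)                      # the whole tag
    print('URL:', tag.get('href', None))    # the href attribute
    print('Contents:', tag.contents[0])     # the text inside the tag
    print('Attrs:', tag.attrs)              # all attributes as a dictionary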
Reading binary files using urllib
Sometimes you want to retrieve a non-text (or binary) file such as an image or video file. The data in these files is generally not useful to print out, but you can easily make a copy of a URL to a local file on your hard disk using urllib.

import urllib.request

img = urllib.request.urlopen('http://data.pr4e.org/cover3.jpg').read()
fhand = open('cover3.jpg', 'wb')
fhand.write(img)
fhand.close()
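For larger files, reading the whole document into memory at once may not work well. A sketch (based on the corresponding py4e example, not part of the original slides) that copies the file in blocks instead:

import urllib.request

img = urllib.request.urlopen('http://data.pr4e.org/cover3.jpg')
fhand = open('cover3.jpg', 'wb')
size = 0
while True:
    info = img.read(100000)   # read up to 100,000 characters at a time
    if len(info) < 1:
        break
    size = size + len(info)
    fhand.write(info)
print(size, 'characters copied.')
fhand.close()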
HTML
• HTML stands for Hyper Text Markup Language
• It is the standard markup language for creating Web
pages.
• It describes the structure of a Web page
• It consists of a series of elements
HTML (cont..)
• HTML elements tell the browser how to display the
content
• The elements are represented by tags
• The tags label pieces of content such as "heading",
"paragraph", "table", and so on
• Browsers do not display the HTML tags, but use them
to render the content of the page
A Simple HTML Document
<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
• The <!DOCTYPE html> declaration defines this document
to be HTML5
• The <html> element is the root element of an HTML page
• The <head> element contains meta information about the
document
• The <title> element specifies a title for the document
• The <body> element contains the visible page content
• The <h1> element defines a large heading
• The <p> element defines a paragraph
HTML Tags
• HTML tags are element names surrounded by angle
brackets:
• <tagname>content goes here...</tagname>
• HTML tags normally come in pairs like <p> and </p>
• The first tag in a pair is the start tag, the second tag is
the end tag
• The end tag is written like the start tag, but with a forward slash inserted before the tag name
XML
• XML stands for eXtensible Markup Language.
• It is a Markup language much like HTML.
• It was designed to store and transport data.
• It was designed to be self descriptive.
• It plays an important role in many different IT systems.
• It is often used for distributing data over the Internet.
The Difference Between XML and HTML
• XML and HTML were designed with different goals:
• XML was designed to carry data - with focus on what data is
• HTML was designed to display data - with focus on how
data looks
• XML tags are not predefined like HTML tags are
Example
<?xml version="1.0" encoding="UTF-8"?>
<Wishes>
<to>Vinu</to>
<from>Kavitha</from>
<heading>Happy Birthday</heading>
<body>Many Many happy Birthday!</body>
</Wishes>
Yet another example
<person>
<name>Chuck</name>
<phone type="intl">
+1 734 303 4456
</phone>
<email hide="yes"/>
</person>
Tree representation of XML
XML Does Not Use Predefined Tags
• The XML language has no predefined tags.
• The tags in the example above (like <to> and <from>) are not
defined in any XML standard.
• These tags are "invented" by the author of the XML
document.
• HTML works with predefined tags like <p>, <h1>, <table>, etc.
• With XML, the author must define both the tags and the
document structure.
XML is Extensible
• Most XML applications will work as expected even if new
data is added (or removed).
• Imagine a newer version of the Wishes example above (call it Bday.xml) with added <date> and <hour> elements, and the <heading> element removed.
• The way XML is constructed, older versions of the application can still work, as sketched below:
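For illustration (the new element values are placeholders, not from the slides), the newer version of Bday.xml might look like this; an application written for the original structure can still find <to>, <from>, and <body> and simply ignore the new elements:

<?xml version="1.0" encoding="UTF-8"?>
<Wishes>
  <to>Vinu</to>
  <from>Kavitha</from>
  <date>2024-05-01</date>
  <hour>08:00</hour>
  <body>Many Many happy Birthday!</body>
</Wishes>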
XML Simplifies Things
• It simplifies data sharing
• It simplifies data transport
• It simplifies platform changes
• It simplifies data availability
Parsing XML
Here is a simple application that parses some XML and extracts some data elements from the XML:

import xml.etree.ElementTree as ET

data = '''
<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes"/>
</person>'''

tree = ET.fromstring(data)   # converts the string representation in "data" into a tree of XML nodes
print('Name:', tree.find('name').text)   # searches the tree and retrieves the node that matches the tag 'name'
print('Attr:', tree.find('email').get('hide'))
Looping through nodes

import xml.etree.ElementTree as ET

input = '''
<stuff>
  <users>
    <user x="2">
      <id>001</id>
      <name>Chuck</name>
    </user>
    <user x="7">
      <id>009</id>
      <name>Brent</name>
    </user>
  </users>
</stuff>'''
Looping through nodes (cont.)

stuff = ET.fromstring(input)
lst = stuff.findall('users/user')   # list of subtrees that represent the user structures in the XML tree
print('User count:', len(lst))

for item in lst:
    print('Name', item.find('name').text)
    print('Id', item.find('id').text)
    print('Attribute', item.get("x"))

Output:
User count: 2
Name Chuck
Id 001
Attribute 2
Name Brent
Id 009
Attribute 7
Is there a still simpler format?
JSON
• JSON stands for JavaScript Object Notation.
• JSON is a lightweight format for storing and transporting
data.
• JSON is often used when data is sent from a server to a
web page.
• JSON is "self-describing" and easy to understand.
Is JSON a programming language?
• JSON is a language-independent data format.
• It was derived from JavaScript, but many
modern programming languages include code to
generate and parse JSON-format data.
• Since Python was invented before JavaScript, Python’s syntax for dictionaries and lists influenced the syntax of JSON. So the format of JSON is nearly identical to a combination of Python lists and dictionaries.
• The official Internet media type for JSON is application/json.
What is the purpose of JSON
• The JSON format is often used for serializing and
transmitting structured data over a network connection.
• It is used primarily to transmit data between a server
and a web application
• It serves as an alternative to XML
JSON Syntax Rules
• Data is in name/value pairs
• Data is separated by commas
• Curly braces hold objects
• Square brackets hold arrays
JSON Data - A Name and a Value
JSON data is written as name/value pairs, just like JavaScript
object properties.
A name/value pair consists of a field name (in double
quotes), followed by a colon, followed by a value:
"firstName“ : "John"
Note: JSON names require double quotes. JavaScript names
do not.
JSON Objects
JSON objects are written inside curly braces.
Just like in JavaScript, objects can contain multiple
name/value pairs:
{"firstName":"John", "lastName":"Doe"}
A JSON encoding that is equivalent to the XML example seen earlier:
{
"name" : "Chuck",
"phone" : {
"type" : "intl",
"number" : "+1 734 303 4456"
},
"email" : {
"hide" : "yes"
}
}
Parse JSON - Convert from JSON to Python
If you have a JSON string, you can parse it by using
the json.loads() method.
The result will be a Python dictionary.
Example
Convert from JSON to Python:
import json

# some JSON:
x = '{ "name":"John", "age":30, "city":"New York"}'

# parse x:
y = json.loads(x)

# the result is a Python dictionary:
print(y["age"])

Output: 30
JSON Arrays
JSON arrays are written inside square brackets.
Just like in JavaScript, an array can contain objects:
"employees":[
{"firstName":"John", "lastName":"Doe"},
{"firstName":"Anna", "lastName":"Smith"},
{"firstName":"Peter", "lastName":"Jones"}
]
In the example above, the value of "employees" is an array. It contains three objects.
Each object is a record of a person (with a first name and a last
name).
Parse JSON (with arrays)

import json

data = '''
[
  { "id" : "001",
    "x" : "2",
    "name" : "Chuck"
  },
  { "id" : "009",
    "x" : "7",
    "name" : "Brent"
  }
]'''

info = json.loads(data)
print('User count:', len(info))

for item in info:
    print('Name', item['name'])
    print('Id', item['id'])
    print('Attribute', item['x'])

Output:
User count: 2
Name Chuck
Id 001
Attribute 2
Name Brent
Id 009
Attribute 7
Application Programming Interfaces (API)
• Exchange of data between applications is done using the HyperText Transfer Protocol (HTTP).
• A way to represent complex data is achieved through the eXtensible Markup Language (XML) or JavaScript Object Notation (JSON).
• The next step is to begin to define and document “contracts”
between applications using these techniques.
• The general name for these application-to-application contracts is Application Programming Interfaces (APIs).
• Whenever we use an API, generally one program makes a set of services available for use by other applications and publishes the APIs (i.e., the “rules”) that must be followed to access the services provided by the program.
Service Oriented Architecture
• A SOA approach is one where our overall application makes use
of the services of other applications.
• More formally - Service-oriented architecture is a style of
software design where services are provided to the other
components by application components, through a
communication protocol over a network.
• A non-SOA approach is where the application is a single
standalone application which contains all of the code necessary
to implement the application.
Example of an SOA (Service Oriented Architecture)
Advantages of SOA
• Always maintains only one copy of data (e.g., a hotel reservation system)
• The owners of the data can set the rules about the use of
their data
When an application makes a set of services in its API
available over the web, we call these web services.
Google Geocoding Web Services
• Google has an excellent web service that allows us to make
use of their large database of geographic information
• We can submit a geographical search string like “Mumbai”
to their geocoding API and have Google return its best
guess as to where on a map we might find our search string
and tell us about the landmarks nearby.
• The geocoding service is free but rate limited so you cannot
make unlimited use of the API in a commercial application.
The following is a simple application to prompt the user for a search string, call
the Google geocoding API, and extract information from the returned JSON.
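The slides show this program as screenshots. A sketch along the lines of the py4e geojson.py example (the py4e proxy endpoint py4e-data.dr-chuck.net and the dummy key 42 come from that course material, not from the slides) is:

import urllib.request, urllib.parse, urllib.error
import json
import ssl

api_key = False
# If api_key is False, use the py4e proxy of the Google geocoding API,
# which does not require a real key; otherwise use the real service.
if api_key is False:
    api_key = 42
    serviceurl = 'http://py4e-data.dr-chuck.net/json?'
else:
    serviceurl = 'https://maps.googleapis.com/maps/api/geocode/json?'

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

while True:
    address = input('Enter location: ')
    if len(address) < 1: break

    parms = dict()
    parms['address'] = address
    if api_key is not False: parms['key'] = api_key
    url = serviceurl + urllib.parse.urlencode(parms)

    print('Retrieving', url)
    uh = urllib.request.urlopen(url, context=ctx)
    data = uh.read().decode()
    print('Retrieved', len(data), 'characters')

    try:
        js = json.loads(data)
    except:
        js = None

    if not js or 'status' not in js or js['status'] != 'OK':
        print('==== Failure To Retrieve ====')
        print(data)
        continue

    # Pull the latitude, longitude, and formatted address out of the JSON
    lat = js['results'][0]['geometry']['location']['lat']
    lng = js['results'][0]['geometry']['location']['lng']
    print('lat', lat, 'lng', lng)
    location = js['results'][0]['formatted_address']
    print(location)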
Security and API usage
• It is common that we need some kind of “API key” to make
use of a vendor’s API.
• The general idea is that they want to know who is using
their services and how much each user is using.
• Both free and paid tiers of their services are available, which limit the number of requests an individual can make during a particular time period.
• Include the API key as part of the POST data or as a parameter on the URL when calling the API (see the sketch below).
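A minimal sketch of this idea (the endpoint and parameter names here are hypothetical, not from any particular vendor's API):

import urllib.request, urllib.parse

api_key = 'YOUR_API_KEY'                      # hypothetical key obtained from the vendor
serviceurl = 'https://api.example.com/data?'  # hypothetical endpoint

# Add the API key as a parameter on the URL when calling the API
parms = {'q': 'Mumbai', 'key': api_key}
url = serviceurl + urllib.parse.urlencode(parms)
print('Retrieving', url)
data = urllib.request.urlopen(url).read().decode()
print(data[:250])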
Security and API usage
• Vendors expect requests to come from a genuine source.
• They expect users to send cryptographically signed messages using shared keys and secrets.
• A standard protocol called OAuth is used for this purpose.
• OAuth - An open protocol to allow secure
authorization in a simple and standard method from
web, mobile and desktop applications.
• A few years ago, before OAuth, a user was typically authorized on a website by assigning a unique ID and a user-selected password. Nowadays, alongside the typical sign-up method, you also see dialogue boxes like the one below:
• Nowadays, you can sign up and log in on third-party sites using your existing accounts on Facebook, Twitter, GitHub, etc.
OAuth Protocol
• Officially it is stated as: “OAuth is an authorization framework that enables a third-party application to obtain limited access to an HTTP service.”
• OAuth is an open standard protocol for authentication that
allows a user to use Internet service functions, such as those
provided by Facebook or Twitter, within other applications
(desktop, web, mobile, etc.)
Twitter API
• Twitter moved from an “open and public API” to an API that
required the use of “OAuth signatures” on each API request.
Sample program demonstration:
• Download the files twurl.py, hidden.py, oauth.py, and twitter1.py
from www.py4e.com/code and put them all in a folder on your
computer.
• To make use of these programs you will need to have a Twitter
account, and
• Authorize your Python code as an application, set up a key, secret,
token and token secret.
• Edit the file hidden.py and put these four strings into the appropriate
variables in the file:
# Create a new App and get the four strings
def oauth():
    return {"consumer_key": "h7Lu...Ng",
            "consumer_secret": "dNKenAC3New...mmn7Q",
            "token_key": "10185562-eibxCp9n2...P4GEQQOSGI",
            "token_secret": "H0ycCFemmC4wyf1...qoIpBo"}
The Twitter web services are accessed using a URL like this:
https://api.twitter.com/1.1/statuses/user_timeline.json
But once all of the security information has been added, the URL will look more like:
https://api.twitter.com/1.1/statuses/user_timeline.json?count=2&oauth_version=1.0&oauth_token=101...SGI&screen_name=drchuck&oauth_nonce=09239679&oauth_timestamp=1380395644&oauth_signature=rLK...BoD&oauth_consumer_key=h7Lu...GNg&oauth_signature_method=HMAC-SHA1
The below program retrieves the timeline for a particular Twitter user and
returns it to us in JSON format in a string. We simply print the first 250
characters of the string:
import urllib.request, urllib.parse, urllib.error
import twurl
import ssl

# Create App and get the four strings, put them in hidden.py
TWITTER_URL = 'https://api.twitter.com/1.1/statuses/user_timeline.json'

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

while True:
    print('')
    acct = input('Enter Twitter Account:')
    if len(acct) < 1: break
    url = twurl.augment(TWITTER_URL, {'screen_name': acct, 'count': '2'})
    print('Retrieving', url)
    connection = urllib.request.urlopen(url, context=ctx)
    data = connection.read().decode()
    print(data[:250])
    headers = dict(connection.getheaders())
    # print(headers)
    print('Remaining', headers['x-rate-limit-remaining'])
In the following example, we retrieve a user’s Twitter friends, parse the returned
JSON, and extract some of the information about the friends.
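The slides show this program as screenshots as well. A sketch following the py4e twitter2.py example (it assumes the same twurl.py and hidden.py helper files as the previous program) is:

import urllib.request, urllib.parse, urllib.error
import twurl
import json
import ssl

TWITTER_URL = 'https://api.twitter.com/1.1/friends/list.json'

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

while True:
    acct = input('Enter Twitter Account:')
    if len(acct) < 1: break
    url = twurl.augment(TWITTER_URL,
                        {'screen_name': acct, 'count': '5'})
    print('Retrieving', url)
    connection = urllib.request.urlopen(url, context=ctx)
    data = connection.read().decode()

    js = json.loads(data)
    headers = dict(connection.getheaders())
    print('Remaining', headers['x-rate-limit-remaining'])

    # Loop through the friends and print each one's screen name
    # and the text of their most recent status (if any)
    for u in js['users']:
        print(u['screen_name'])
        if 'status' not in u:
            print('   * No status found')
            continue
        s = u['status']['text']
        print('  ', s[:50])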