Data Processing

The document provides an overview of data processing in Python, including reading and writing text and binary files, working with JSON, and using APIs. It introduces PyScript, a framework for running Python in the browser, and explains how to set up and use it for frontend applications. Additionally, it covers database interactions using SQLite, including creating tables and inserting data.

Uploaded by

Akkal Bista

Data Processing
ER.SUDAN PRAJAPATI
Reading and Writing Data in Text Format
f = open("m.txt", "r")  # open file for reading
words = []
for line in f:  # iterate over all lines in file
    words += line.split()  # add the words in this line to the list
f.close()
print(words)  # the file is closed now, so print the collected words
# Example: the with statement
with open("m.txt") as f:  # "rt" is the default mode
    words = [word for line in f for word in line.split()]

# Example: writing to a file
with open("output.txt", "w") as f:
    for word in words:
        f.write(word + "\n")  # one word per line

with open("output.txt") as f:
    print(f.read())
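The read and write steps above can be combined into a single round trip. The following is a minimal sketch, assuming a scratch file created with tempfile; the sample text and the variable names are illustrative and not part of the slides.

```python
import os
import tempfile

# Hypothetical sample text; any small text file would do.
sample = "the quick brown fox\njumps over the lazy dog\n"

# Create a scratch file so the example does not touch real data.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

# Write the sample text.
with open(path, "w") as f:
    f.write(sample)

# Read it back and split every line into words.
with open(path) as f:
    words = [word for line in f for word in line.split()]

# Write one word per line, then read the result back.
with open(path, "w") as f:
    for word in words:
        f.write(word + "\n")

with open(path) as f:
    lines = f.read().splitlines()

os.remove(path)
print(len(words), "words")
```

Because every file is opened in a with block, it is closed automatically even if an error occurs inside the block.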
Append Data
Append data to a file: When a file is opened in write mode ("w"), it is truncated and the cursor is placed at the start.
Writing therefore begins at the beginning of the file and overwrites the existing data.

In append mode ("a"), the cursor is always placed at the end of the file.
New data is therefore written after the existing content instead of overwriting it.
f = open("Abc.txt",'a')

f.write(" Programming")

f.close()
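The difference between the two modes can be checked directly. This is a small sketch, assuming a scratch file created with tempfile; the file contents are illustrative.

```python
import os
import tempfile

# Scratch file so that no real data is overwritten.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

# "w" starts from an empty file.
with open(path, "w") as f:
    f.write("Python")

# "a" keeps the existing content and writes at the end.
with open(path, "a") as f:
    f.write(" Programming")

with open(path) as f:
    after_append = f.read()

# Opening with "w" again discards everything written so far.
with open(path, "w") as f:
    f.write("Java")

with open(path) as f:
    after_write = f.read()

os.remove(path)
print(after_append)
print(after_write)
```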
Tell() and Seek()
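tell() returns the current position of the file cursor and seek() moves the cursor to a given position. A minimal sketch, assuming a temporary file opened in binary mode so that positions are plain byte offsets; the sample bytes are illustrative.

```python
import tempfile

with tempfile.TemporaryFile(mode="w+b") as f:
    f.write(b"Hello, world")
    end_position = f.tell()     # cursor sits after the last byte written

    f.seek(0)                   # jump back to the start of the file
    first = f.read(5)           # read the first five bytes
    middle_position = f.tell()  # cursor has advanced by five bytes

    f.seek(7)                   # jump to byte offset 7
    rest = f.read()             # read from there to the end

print(end_position, first, middle_position, rest)
```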
Writing multiple data
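Several strings can be written in one call with writelines(). A minimal sketch, assuming a scratch file created with tempfile; note that writelines() does not add line endings, so each string carries its own "\n".

```python
import os
import tempfile

# Each item already ends with a newline; writelines() adds none.
lines = ["first line\n", "second line\n", "third line\n"]

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(path, "w") as f:
    f.writelines(lines)

with open(path) as f:
    read_back = f.readlines()

os.remove(path)
print(read_back)
```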
Python JSON

JSON (JavaScript Object Notation) is a popular data format used for representing
structured data. It's common to transmit and receive data between a server and
web application in JSON format.

In Python, JSON exists as a string. For example:

p = '{"name": "Bob", "languages": ["Python", "Java"]}'


Parse JSON in Python
import json

person = '{"name": "Bob", "languages": ["English", "French"]}'

person_dict = json.loads(person)

# Output: {'name': 'Bob', 'languages': ['English', 'French']}

print(person_dict)

# Output: ['English', 'French']

print(person_dict['languages'])
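json.loads() raises json.JSONDecodeError when the string is not valid JSON. The sketch below shows one way to handle that case; the helper name parse_person is hypothetical and not part of the slides.

```python
import json


def parse_person(text):
    """Return the decoded object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        print(f"Invalid JSON: {err.msg} (line {err.lineno}, column {err.colno})")
        return None


good = parse_person('{"name": "Bob", "languages": ["English", "French"]}')
bad = parse_person("{'name': 'Bob'}")  # single quotes are not valid JSON

print(good)
print(bad)
```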
Python read JSON file
person.json

{"name": "Bob",
"languages": ["English", "French"]
}

import json

with open('path_to_file/person.json', 'r') as f:
    data = json.load(f)

# Output: {'name': 'Bob', 'languages': ['English', 'French']}
print(data)
Python Convert to JSON string
import json

person_dict = {'name': 'Bob',
               'age': 12,
               'children': None}

person_json = json.dumps(person_dict)

# Output: {"name": "Bob", "age": 12, "children": null}
print(person_json)
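json.dumps() also accepts formatting options. The sketch below uses the indent and sort_keys parameters to produce a more readable string and then decodes it again to confirm that nothing is lost; the variable names are illustrative.

```python
import json

person_dict = {"name": "Bob", "age": 12, "children": None}

# Default output: a single line.
compact = json.dumps(person_dict)

# Indented output with keys in alphabetical order.
pretty = json.dumps(person_dict, indent=2, sort_keys=True)

# Decoding the pretty string gives back the same dictionary.
restored = json.loads(pretty)

print(compact)
print(pretty)
```

Python's None becomes null in JSON and is turned back into None when the string is decoded.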
Writing JSON to a file
import json

person_dict = {"name": "Bob",
               "languages": ["English", "French"],
               "married": True,
               "age": 32}

with open('person.txt', 'w') as json_file:
    json.dump(person_dict, json_file)
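Writing with json.dump() and reading with json.load() are inverse operations. A minimal sketch, assuming a scratch file created with tempfile instead of person.txt:

```python
import json
import os
import tempfile

person_dict = {
    "name": "Bob",
    "languages": ["English", "French"],
    "married": True,
    "age": 32,
}

# Scratch file used only for this example.
fd, path = tempfile.mkstemp(suffix=".json")
os.close(fd)

# Serialize the dictionary to the file.
with open(path, "w") as json_file:
    json.dump(person_dict, json_file)

# Read the file and decode it back into Python objects.
with open(path) as json_file:
    loaded = json.load(json_file)

os.remove(path)
print(loaded)
```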
Binary data in Python
To read or write a binary file, you first need to understand the file modes for binary files in Python. If no mode is given, open() uses "r" (text mode), so binary access must be requested explicitly with "b".

Mode   Description

rb     Opens a file for reading only in binary format. The file pointer is placed at the beginning of the file.

rb+    Opens a file for both reading and writing in binary format. The file pointer is placed at the beginning of the file.

wb     Opens a file for writing only in binary format. Overwrites the file if it exists. If the file does not exist, creates a new file for writing.

wb+    Opens a file for both writing and reading in binary format. Overwrites the file if it exists. If the file does not exist, creates a new file for reading and writing.

ab     Opens a file for appending in binary format. The file pointer is at the end of the file if the file exists. If the file does not exist, creates a new file for writing.

ab+    Opens a file for both appending and reading in binary format. The file pointer is at the end of the file if the file exists. If the file does not exist, creates a new file for reading and writing.
# Open a binary file (raw string, so the backslash is not treated as an escape)
with open(r'D:\PythonLogo.png', 'rb') as f:
    # Read the whole file as bytes
    data = f.read()

# Display the data
print(data)
Write to a Binary File

The wb mode of the open() function is used to open a file for writing in binary format.

Note − Binary files are not human-readable; their content looks unrecognizable when opened in a text editor.
# Open a file in binary format for writing
f = open("abc.txt", "wb")

# Elements to be added to the binary file (each must be in the range 0-255)
a = [100, 200, 240]

# Convert the integer elements to a bytearray
myArr = bytearray(a)

# The byte representation is now written to the file
f.write(myArr)

f.close()
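The bytes written in "wb" mode can be read back in "rb" mode and turned into integers again. A minimal sketch, assuming a scratch file created with tempfile; the last lines illustrate that values outside 0-255 cannot be stored in a single byte.

```python
import os
import tempfile

values = [100, 200, 240]

fd, path = tempfile.mkstemp(suffix=".bin")
os.close(fd)

# Write the integers as raw bytes; write() returns the number of bytes written.
with open(path, "wb") as f:
    written = f.write(bytes(values))

# Read the raw bytes back.
with open(path, "rb") as f:
    data = f.read()

os.remove(path)

# Iterating over a bytes object yields integers again.
restored = list(data)

# A value above 255 does not fit in one byte.
try:
    bytes([256])
    out_of_range = False
except ValueError:
    out_of_range = True

print(written, data, restored, out_of_range)
```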
What is PyScript?
PyScript is an open source web framework that allows you to create frontend web
applications using Python.

With PyScript, you can either embed Python code in HTML or link to a Python file,
and the code will execute in the browser without running Python in the backend.

PyScript was created by Anaconda and was publicly announced on April 30 at
PyCon US 2022.

At the time of writing, PyScript is in an alpha state and under active development,
so breaking changes and new features are to be expected until a stable release
is available.
How does PyScript work?
PyScript builds upon Pyodide, which ports CPython to WebAssembly.

WebAssembly is a low-level binary format that allows you to write programs in
other languages, which are then executed in the browser.

With CPython in WebAssembly, we can install and run Python packages in the
browser, while PyScript abstracts most of the Pyodide operations, allowing you to
focus on building frontend apps with Python in the browser.
When would you want to use PyScript?
Move a Python backend to the frontend: If you have a Python application running
in the backend, you can use PyScript to move it to the frontend, which can save on
web hosting costs

Make use of Python’s ecosystem of libraries: Scientific packages such as
scikit-learn, numpy, and pandas are available in Python but not in frontend
JavaScript. With PyScript, you can use these packages in the frontend, or even use
your own Python modules

Interact with the local file system: JavaScript in the browser does not have APIs
for reading or writing files in the file system. With PyScript, you can read a file in
the file system, manipulate the data, and inject it into the DOM
Getting started
Now that our directory is set up for PyScript, we will first add links to the PyScript assets,
consisting of a CSS file and a JavaScript file, in the <head> section of an HTML page.

Once the assets have been added, you can use PyScript in an HTML file in either of two
ways:

Internal PyScript: You write your Python code inside the <py-script> tag in an HTML
file; the <py-script> tag can be placed in the <head> or the <body> tag, depending on
the task at hand

External PyScript: You write your Python code in a file ending with the .py extension
and reference that file in the <py-script> tag using the src attribute
Internal PyScript
<!DOCTYPE html>

<html lang="en">

<head>

<meta charset="utf-8" />

<meta name="viewport" content="width=device-width, initial-scale=1" />

<title>Hello World!</title>

<!-- linking to PyScript assets -->

<link rel="stylesheet" href="https://pyscript.net/releases/2022.12.1/pyscript.css" />

<script defer src="https://pyscript.net/releases/2022.12.1/pyscript.js"></script>

</head>

<body>

<!-- Put Python code inside the <py-script> tag -->

<py-script>display("Hello World!")</py-script>

</body>

</html>
In the <head> section, we link to the pyscript.css file, which contains styles for
PyScript visual components, REPL, the PyScript loader, etc. Next, we link to the
pyscript.js file, which sets up the necessary features for using PyScript, such as
creating tags like <py-script> where you can write your Python code.
External PyScript
While putting Python code in the <py-script> tag works, a better and more scalable approach, as you create
more HTML pages or your scripts get larger, is to put the code in an external file and reference it from the
HTML file.

The following are some of the reasons to keep PyScript code in an external file:

The file can be cached by the browser, leading to faster performance

You can reference the file in multiple pages, reducing duplication

Your Python code can be formatted with tools like black or checked with Python linters. These tools don’t
currently work on Python code embedded in an HTML file

To use PyScript externally, we will create an index.html file, create a Python file ending with the .py extension
that contains our Python code, and finally reference the Python file in the index.html file
Create an index.html file and link to the PyScript assets:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Greetings!</title>
<!-- linking to PyScript assets -->
<link rel="stylesheet" href="https://pyscript.net/releases/2022.12.1/pyscript.css" />
<script defer src="https://pyscript.net/releases/2022.12.1/pyscript.js"></script>
</head>
<body>
</body>
</html>
Create the main.py file and add the code below:

def greetings(name):
    print(f'Hi, {name}')

greetings('John Doe')
Linking the main.py file in the HTML file

Open index.html and add the following line inside the <body> tag:
<py-script src="main.py"></py-script>
demo.py
num1 = 5
num2 = 10
total = num1 + num2  # "total" avoids shadowing the built-in sum()
print('the sum of {0} and {1} is {2}'.format(num1, num2, total))


<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Greetings!</title>
<link rel="stylesheet" href="https://pyscript.net/releases/2022.12.1/pyscript.css" />
<script defer src="https://pyscript.net/releases/2022.12.1/pyscript.js"></script>
</head>
<body>
<!-- reference the external Python file -->
<py-script src="demo.py"></py-script>
</body>
</html>
Retrieve data from API
An API (Application Programming Interface) defines the way computer programs communicate with each other.

We use the Requests library to connect to the API of a web server and tell it what we want; the server then
returns the requested data. This is called the request-response cycle.

We can find a list of some free APIs (available without authentication) at
https://apipheny.io/free-api/#apis-without-key .

These APIs can be used for developing and testing our code.

Let’s make a request to the Cat Fact API. If we go to https://catfact.ninja/, it gives us the definitions:
◦ GET /fact is the API endpoint.
◦ GET is the type of request we make and
◦ /fact is the path.

You can even test this in your web browser: https://catfact.ninja/fact

Using the Requests library, we do this with get().


# Import

import requests

# URL

url = 'https://catfact.ninja/fact'

# Make a request

response = requests.get(url)

response_content = response.content

# Display

display(response_content)

Note: The requests.Response object tells us what the server said. We can access the raw response content
through its content attribute.
The response content is in JSON format, and Requests gives us the json()
method, which decodes it and returns the corresponding data as Python objects.
This is comparable to calling json.loads() on the response text.

response_json = response.json()

# Display

display(response_json)
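The decoding step can be tried without a network connection by applying json.loads() to a byte string. This is only a sketch: the payload below is made up, and its fact and length fields are an assumption about the shape of the response rather than output captured from the API.

```python
import json

# Hypothetical payload standing in for the bytes in response.content.
fact_text = "Cats sleep for a large part of the day."
payload = {"fact": fact_text, "length": len(fact_text)}
raw_body = json.dumps(payload).encode("utf-8")

# Decode the bytes to text, then parse the JSON text into Python objects.
decoded = json.loads(raw_body.decode("utf-8"))

print(type(raw_body).__name__)  # bytes
print(type(decoded).__name__)   # dict
print(decoded["fact"])
```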
API which requires parameters
Let’s then examine another API, which accepts parameters to specify the
information requested.

In particular, we will request a list of Finnish universities from
http://universities.hipolabs.com using the /search endpoint and a parameter
country with value Finland, like this:

http://universities.hipolabs.com/search?country=Finland
# URL

url = 'http://universities.hipolabs.com/search?country=Finland'

# Make a request

response = requests.get(url)

# Decode JSON

response_json = response.json()

# Display

display(response_json[:2])

Note: URLs containing parameters can always be constructed manually. A ? character
starts the query string, each parameter is written as key=value as above, and
additional parameters are separated with the & character.
Requests allows us to provide the parameters as a dictionary of strings, using the params keyword argument to get(). This is
easier to read and less error-prone.

# URL

url = 'http://universities.hipolabs.com/search'

# Make the parameter dictionary

parameters = {'country' : 'Finland'}

# Get response

response = requests.get(url, params=parameters)

# Decode JSON

response_json = response.json()

# Display

display(response_json[:2])
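To see how a parameter dictionary maps to a query string, the standard library's urllib.parse can be used without sending any request. This is a sketch of the encoding idea, not of what Requests does internally; the name parameter and its value are hypothetical and only serve to show the & separator.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

base_url = "http://universities.hipolabs.com/search"

# "name" is a hypothetical second parameter added for illustration.
parameters = {"country": "Finland", "name": "Helsinki"}

# Encode the dictionary as key=value pairs joined by "&".
query = urlencode(parameters)
full_url = f"{base_url}?{query}"

# Splitting the URL again recovers the parameters.
parsed = parse_qs(urlsplit(full_url).query)

# Special characters such as spaces are escaped automatically.
with_space = urlencode({"country": "United States"})

print(full_url)
print(with_space)
```

Leaving the escaping to a library is the main reason the dictionary form is less error-prone than building the URL by hand.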
http://www.boredapi.com/api/activity/
# Import module

import requests

# URL of the activity API end point

url = "http://www.boredapi.com/api/activity/"

# Send the request using the get() function

response = requests.get(url)

display(response.json())
Adding some parameters
◦ type
◦ participants

params = {
    'type': 'education',
    'participants': 1,
}

# Send the request using get() with parameters
response = requests.get(url, params=params)
display("Response")
display(response.json())
Interacting with Databases
In many applications, data rarely comes from text files, since they are a fairly inefficient
way to store large amounts of data. SQL-based relational databases (such as SQL Server,
PostgreSQL, and MySQL) are in wide use, and many alternative non-SQL (so-called NoSQL)
databases have become quite popular.

The choice of database usually depends on the performance, data integrity, and
scalability needs of an application.
import sqlite3

query = """

CREATE TABLE test

(a VARCHAR(20), b VARCHAR(20),

c REAL, d INTEGER

);"""

con = sqlite3.connect(':memory:')

con.execute(query)

con.commit()
data = [('Atlanta', 'Georgia', 1.25, 6),

('Tallahassee', 'Florida', 2.6, 3),

('Sacramento', 'California', 1.7, 5)]

stmt = "INSERT INTO test VALUES(?, ?, ?, ?)"

con.executemany(stmt, data)

con.commit()

cursor = con.execute('select * from test')

rows = cursor.fetchall()

rows
You can pass the list of tuples to the DataFrame constructor, but you also need the column names,
contained in the cursor’s description attribute:

cursor.description

import pandas.io.sql as sql

# read_frame() was removed from pandas; read_sql() replaces it
sql.read_sql('select * from test', con)
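The role of the description attribute can be shown with the standard library alone. The sketch below pairs each row with the column names taken from cursor.description; it builds plain dictionaries instead of a DataFrame so that it runs without pandas, and it reuses the table layout from the example above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE test (a VARCHAR(20), b VARCHAR(20), c REAL, d INTEGER)"
)
con.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [("Atlanta", "Georgia", 1.25, 6), ("Tallahassee", "Florida", 2.6, 3)],
)

cursor = con.execute("SELECT * FROM test")

# Each entry of cursor.description is a tuple whose first item is the column name.
columns = [col[0] for col in cursor.description]

# Pair the column names with the values of every row.
records = [dict(zip(columns, row)) for row in cursor.fetchall()]

con.close()
print(columns)
print(records[0])
```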


Connect to Database
import sqlite3

conn = sqlite3.connect('test.db')

print("Opened database successfully")

Here, you can also supply the special name :memory: as the database name to create a
database in RAM. Now, let's run the above program to create our database test.db in the
current directory. You can change the path as per your requirement.

Keep the above code in a sqlite.py file and execute it. If the database is
successfully created, it will display the message Opened database successfully.
Create a Table
import sqlite3

conn = sqlite3.connect('test.db')
print("Opened database successfully")

conn.execute('''CREATE TABLE COMPANY
    (ID INT PRIMARY KEY NOT NULL,
    NAME TEXT NOT NULL,
    AGE INT NOT NULL,
    ADDRESS CHAR(50),
    SALARY REAL);''')
print("Table created successfully")

conn.close()
Insert Operation
import sqlite3

conn = sqlite3.connect('test.db')
print("Opened database successfully")

conn.execute("INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY) \
    VALUES (1, 'Paul', 32, 'California', 20000.00 )")

conn.execute("INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY) \
    VALUES (2, 'Allen', 25, 'Texas', 15000.00 )")

conn.execute("INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY) \
    VALUES (3, 'Teddy', 23, 'Norway', 20000.00 )")

conn.execute("INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY) \
    VALUES (4, 'Mark', 25, 'Rich-Mond ', 65000.00 )")

conn.commit()
print("Records created successfully")

conn.close()
SELECT Operation
import sqlite3

conn = sqlite3.connect('test.db')
print("Opened database successfully")

cursor = conn.execute("SELECT id, name, address, salary from COMPANY")
for row in cursor:
    print("ID =", row[0])
    print("NAME =", row[1])
    print("ADDRESS =", row[2])
    print("SALARY =", row[3], "\n")

print("Operation done successfully")

conn.close()
UPDATE Operation
import sqlite3

conn = sqlite3.connect('test.db')
print("Opened database successfully")

conn.execute("UPDATE COMPANY set SALARY = 25000.00 where ID = 1")
conn.commit()
print("Total number of rows updated :", conn.total_changes)

cursor = conn.execute("SELECT id, name, address, salary from COMPANY")
for row in cursor:
    print("ID =", row[0])
    print("NAME =", row[1])
    print("ADDRESS =", row[2])
    print("SALARY =", row[3], "\n")

print("Operation done successfully")

conn.close()
DELETE Operation
import sqlite3

conn = sqlite3.connect('test.db')
print("Opened database successfully")

conn.execute("DELETE from COMPANY where ID = 2;")
conn.commit()
print("Total number of rows deleted :", conn.total_changes)

cursor = conn.execute("SELECT id, name, address, salary from COMPANY")
for row in cursor:
    print("ID =", row[0])
    print("NAME =", row[1])
    print("ADDRESS =", row[2])
    print("SALARY =", row[3], "\n")

print("Operation done successfully")

conn.close()
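The insert, update, delete, and select steps can also be run together against an in-memory database, which leaves no file behind. The sketch below is one possible variant rather than the slides' own code: it passes values through ? placeholders instead of writing them into the SQL text, and it reads the per-statement row count from the cursor's rowcount attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE COMPANY
       (ID INT PRIMARY KEY NOT NULL,
        NAME TEXT NOT NULL,
        AGE INT NOT NULL,
        ADDRESS CHAR(50),
        SALARY REAL)"""
)

people = [
    (1, "Paul", 32, "California", 20000.0),
    (2, "Allen", 25, "Texas", 15000.0),
    (3, "Teddy", 23, "Norway", 20000.0),
]

# Insert all rows with one parameterized statement.
conn.executemany(
    "INSERT INTO COMPANY (ID, NAME, AGE, ADDRESS, SALARY) VALUES (?, ?, ?, ?, ?)",
    people,
)

# rowcount reports how many rows the single statement affected.
updated = conn.execute(
    "UPDATE COMPANY SET SALARY = ? WHERE ID = ?", (25000.0, 1)
).rowcount
deleted = conn.execute("DELETE FROM COMPANY WHERE ID = ?", (2,)).rowcount
conn.commit()

remaining = conn.execute(
    "SELECT ID, NAME, SALARY FROM COMPANY ORDER BY ID"
).fetchall()

# total_changes counts every row changed since the connection was opened.
total = conn.total_changes
conn.close()

print(updated, deleted, total)
print(remaining)
```

Unlike rowcount, total_changes accumulates over the lifetime of the connection, so it equals the number of rows touched by the last statement only when that statement is the first change made on the connection.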
