FILE HANDLING IN PYTHON
INTRODUCTION
File handling is an essential aspect of programming, enabling a program to interact
with data stored in external files. Python provides built-in functions and libraries to
handle files efficiently, allowing you to read, write, modify, and store data in
various file formats such as text, binary, and compressed files.
In Python, file handling is performed through the open() function, which provides
different modes for accessing files, including reading, writing, and appending.
These operations can be applied to both text and binary files, and Python's standard
library offers additional modules such as zipfile, pickle, and json for compressing data
and serializing objects, making file handling even more versatile.
Effective file handling is crucial in various applications, including data processing,
file storage, configuration management, and data transfer between systems. By
mastering file handling techniques, you can efficiently manage large datasets, work
with external resources, and ensure that your Python programs can interact with
data stored outside the program itself.
1. Read and Write Text Files
In Python, text files are read and written through the file object returned by the
built-in open() function. Text files store human-readable characters (such as .txt
or .csv files), as opposed to binary files.
Reading Text Files
To read a text file, Python uses the open() function to open the file in read mode
('r').
The read() method returns the entire contents of the file as a single string, while
readlines() returns all lines as a list of strings. To process a file line by line,
iterate over the file object itself.
For example :-
with open('textfile.txt', 'r') as file:
    content = file.read()
    print(content)
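The difference between iterating over a file and calling readlines() can be sketched as follows. This is a minimal illustration, not the only way to do it: the temporary directory, the sample contents, and the explicit encoding='utf-8' argument are assumptions added so the example runs on its own.

```python
import os
import tempfile

# Create a small sample file in a temporary directory (assumed for the demo).
path = os.path.join(tempfile.mkdtemp(), 'textfile.txt')
with open(path, 'w', encoding='utf-8') as file:
    file.write("first line\nsecond line\nthird line\n")

# Iterating over the file object yields one line at a time,
# so the whole file never has to be held in memory at once.
lines = []
with open(path, 'r', encoding='utf-8') as file:
    for line in file:
        lines.append(line.rstrip('\n'))  # drop the trailing newline

# readlines() returns every line at once, newlines included.
with open(path, 'r', encoding='utf-8') as file:
    all_lines = file.readlines()

print(lines)
print(all_lines)
```

Iteration is usually the better fit for large files, while readlines() is convenient when the file is small and the full list is needed.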
Writing to Text Files
To write to a text file, Python opens the file in write mode ('w') or append mode
('a').
Both modes create the file if it does not exist. Write mode overwrites any existing
content, while append mode adds data to the end of the file.
For example :-
with open('output.txt', 'w') as file:
    file.write("Hello, world!\n")  # write() does not add a newline on its own
    file.write("This is a text file.\n")
Using 'with open(...)' ensures that the file is properly closed after reading or
writing, even if an error occurs.
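The behaviour of write mode, append mode, and the with statement can be checked with a short experiment. The sketch below assumes a temporary directory and made-up file contents; only open() and the file object's own methods are used.

```python
import os
import tempfile

# Hypothetical output location inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'output.txt')

with open(path, 'w', encoding='utf-8') as file:
    file.write("first run\n")

# Opening with 'w' again truncates the file before writing.
with open(path, 'w', encoding='utf-8') as file:
    file.write("second run\n")

# Opening with 'a' keeps the existing content and writes at the end.
with open(path, 'a', encoding='utf-8') as file:
    file.write("appended line\n")

with open(path, 'r', encoding='utf-8') as file:
    content = file.read()

print(content)
print(file.closed)  # the with block has already closed the file
```

The text "first run" is gone because the second 'w' open discarded it, while the appended line follows "second run".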
2. Read and Write Binary Files
Binary files contain data in a format that isn't readable as text. For example, image
files (like .jpg or .png) and executable files are binary files. In Python, binary files
are handled by opening them with the mode 'rb' for reading and 'wb' for writing.
Reading Binary Files
To read a binary file, we open it in read-binary mode ('rb'). The read() method
works as it does for text files but returns a bytes object instead of a string.
For example :-
with open('image.jpg', 'rb') as file:
    content = file.read()
    print(type(content))  # <class 'bytes'>
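Binary reads do not have to consume the whole file: read(n) returns at most n bytes, which is handy for inspecting a file header. The sketch below checks for the 8-byte PNG signature. The file it reads is a stand-in created for the demo (the signature followed by filler bytes, not a valid image), and the names path, header, and rest are illustrative.

```python
import os
import tempfile

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'  # the 8 bytes every PNG file starts with

# Write a stand-in file: the signature plus filler bytes (not a real image).
path = os.path.join(tempfile.mkdtemp(), 'sample.png')
with open(path, 'wb') as file:
    file.write(PNG_SIGNATURE + b'\x00' * 16)

with open(path, 'rb') as file:
    header = file.read(8)  # read only the first 8 bytes
    rest = file.read()     # read whatever remains

is_png = header == PNG_SIGNATURE
print(type(header), is_png, len(rest))
```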
Writing Binary Files
To write binary data, we open the file in write-binary mode ('wb'). When writing
binary files, data must be in the form of bytes.
For example :-
binary_data = bytes([0, 1, 2, 255])  # sample bytes to write
with open('output.bin', 'wb') as file:
    file.write(binary_data)
print("Wrote", len(binary_data), "bytes")
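The requirement that binary-mode writes receive bytes can be seen directly: passing a str raises TypeError, and encoding the string first fixes it. This is a small sketch under assumed names (path, text) with the file placed in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'output.bin')
text = "Hello, world!"

with open(path, 'wb') as file:
    try:
        file.write(text)  # a str is rejected in binary mode
    except TypeError as error:
        rejected = True
        print("TypeError:", error)
    else:
        rejected = False
    # Encoding the string produces bytes, which are accepted.
    written = file.write(text.encode('utf-8'))

# Reading back in binary mode and decoding recovers the original text.
with open(path, 'rb') as file:
    round_trip = file.read().decode('utf-8')

print(written, round_trip)
```

write() returns the number of bytes written, which is a quick way to confirm that the data went out as expected.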
3. Zip/Compress Binary Files
Python provides the zipfile module for working with .zip archives, which can be
used to compress binary or text files. The zipfile module allows us to create, read,
and extract zip archives.
Compressing Binary Files
We can compress binary files into a .zip archive by opening it with zipfile.ZipFile
and calling the write() method to add files to the archive.
For example :-
import zipfile
with zipfile.ZipFile('archive.zip', 'w', zipfile.ZIP_DEFLATED) as zipf:
    zipf.write('example.jpg', arcname='example.jpg')
Passing zipfile.ZIP_DEFLATED enables compression. The default, zipfile.ZIP_STORED,
adds files to the archive without compressing them.
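The effect of the compression argument can be measured by archiving the same file twice and comparing archive sizes. The sketch below is an illustration under assumptions: the input is a generated, highly repetitive file in a temporary directory, so the saving is much larger than it would be for data that is already compressed, such as JPEG images.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()

# Generate a highly compressible sample file (assumed for the demo).
source = os.path.join(workdir, 'example.txt')
with open(source, 'wb') as file:
    file.write(b'the same line repeated\n' * 1000)

sizes = {}
methods = [('stored', zipfile.ZIP_STORED), ('deflated', zipfile.ZIP_DEFLATED)]
for label, method in methods:
    archive = os.path.join(workdir, label + '.zip')
    with zipfile.ZipFile(archive, 'w', method) as zipf:
        zipf.write(source, arcname='example.txt')
    sizes[label] = os.path.getsize(archive)  # archive size in bytes

print(sizes)
```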
Extracting Files from a Zip Archive
To extract files from a zip archive, we use the extractall() method.
For example :-
import zipfile
with zipfile.ZipFile('archive.zip', 'r') as zipf:
    zipf.extractall('extracted_files')
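Extraction is not always necessary: an archive can also be inspected and read in place. The sketch below uses namelist(), read(), and testzip() from the zipfile module. The archive is built with writestr() from made-up member names and contents so the example is self-contained.

```python
import os
import tempfile
import zipfile

# Build a small archive in a temporary directory (contents are illustrative).
archive = os.path.join(tempfile.mkdtemp(), 'archive.zip')
with zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED) as zipf:
    zipf.writestr('notes/readme.txt', 'hello from the archive')
    zipf.writestr('data.bin', b'\x00\x01\x02')

with zipfile.ZipFile(archive, 'r') as zipf:
    names = zipf.namelist()                 # member names, in the order added
    readme = zipf.read('notes/readme.txt')  # member contents as bytes
    first_bad = zipf.testzip()              # None when every member passes its CRC check

print(names)
print(readme.decode('utf-8'))
print(first_bad)
```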
4. Serialization Using Pickle and JSON
Serialization refers to the process of converting an object into a format that can be
easily stored in a file or transmitted over a network. In Python, this can be done
using modules like pickle and json.
Using Pickle for Serialization
The pickle module allows Python objects to be serialized into a binary format and
later deserialized back to their original state. This is useful for saving Python
objects to disk (e.g., models, data structures).
For example :-
import pickle
data = {'name': 'Alice', 'age': 30, 'city': 'New York'}
with open('data.pkl', 'wb') as file:
    pickle.dump(data, file)
Deserializing Objects
To retrieve the original object from the file, use the pickle.load() function. Only
load pickle files from sources you trust, because unpickling can execute arbitrary code.
For example :-
import pickle
with open('data.pkl', 'rb') as file:
    loaded_data = pickle.load(file)
print(loaded_data)
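One reason to choose pickle is that it preserves Python-specific types such as tuples and sets across a dump and load. The following sketch shows a full round trip; the sample dictionary and the temporary file location are assumptions made for the illustration.

```python
import os
import pickle
import tempfile

# Sample object mixing several built-in types (made up for the demo).
data = {'point': (1, 2), 'tags': {'a', 'b'}, 'scores': [0.5, 1.5]}

path = os.path.join(tempfile.mkdtemp(), 'data.pkl')
with open(path, 'wb') as file:
    pickle.dump(data, file)

with open(path, 'rb') as file:
    loaded = pickle.load(file)

print(loaded == data)         # equal in value
print(loaded is data)         # but a new, independent object
print(type(loaded['point']))  # the tuple is still a tuple
print(type(loaded['tags']))   # the set is still a set
```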
Using JSON for Serialization
JSON (JavaScript Object Notation) is a lightweight format for storing and
transporting data. It is text-based, which makes it human-readable. The json
module can be used for serializing Python objects into JSON format.
For example :-
import json
data = {'name': 'Bob', 'age': 25, 'city': 'Los Angeles'}
with open('data.json', 'w') as file:
    json.dump(data, file)
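Reading the data back uses json.load(), the counterpart of json.dump(). The sketch below shows a round trip and one difference from pickle: JSON has no tuple type, so a tuple comes back as a list. The extra 'point' key, the indent=2 argument, and the temporary file location are assumptions added for the illustration.

```python
import json
import os
import tempfile

# Sample data; the 'point' tuple is added to show type conversion.
data = {'name': 'Bob', 'age': 25, 'point': (1, 2)}

path = os.path.join(tempfile.mkdtemp(), 'data.json')
with open(path, 'w', encoding='utf-8') as file:
    json.dump(data, file, indent=2)  # indent makes the file easier to read

with open(path, 'r', encoding='utf-8') as file:
    loaded = json.load(file)

print(loaded['name'], loaded['age'])
print(loaded['point'])  # the tuple was stored as a JSON array
print(loaded == data)   # so the round trip is not an exact copy
```

JSON suits data exchanged with other programs or languages, while pickle suits Python-only storage where exact types matter.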
Conclusion
File handling is a crucial skill in Python, enabling us to interact with different types
of files: text, binary, and compressed. Through the built-in open() function and the
standard-library modules zipfile, pickle, and json, we can read, write, and compress files
efficiently, as well as serialize and deserialize objects for storage or transmission.
These capabilities are fundamental for many real-world applications like data
processing, web scraping, and working with large datasets.