Credit goes to www.tutorialspoint.com

Python Pandas read_json() Method



The read_json() method in Python's Pandas library reads JSON data from a file, a URL, or a file-like object (such as a JSON string wrapped in StringIO) into a Pandas object, either a DataFrame or a Series. It supports several configurations, including different JSON orientations, date parsing, data type conversion, line-delimited input, and compressed files. Because JSON is a common format for APIs and data pipelines, this method is widely used in data analysis.

JSON (JavaScript Object Notation) is a popular lightweight data format for exchanging and storing structured information. It uses plain-text formatting, where each element is represented in a hierarchical structure, making it easy to read and process. JSON files have the .json extension and are commonly used in web applications, APIs, and data pipelines.

Syntax

The syntax of the Pandas read_json() method is as follows −

pandas.read_json(path_or_buf, *, orient=None, typ='frame', dtype=None, convert_axes=None, convert_dates=True, keep_default_dates=True, precise_float=False, date_unit=None, encoding=None, encoding_errors='strict', lines=False, chunksize=None, compression='infer', nrows=None, storage_options=None, dtype_backend=<no_default>, engine='ujson')

Parameters

The Python Pandas read_json() method accepts the following parameters −

  • path_or_buf: A file path (string or path object), a URL, or a file-like object to read JSON data from. Supported URL schemes include http, ftp, s3, and file.

  • orient: Specifies the expected layout of the JSON data. Supported values are "split", "records", "index", "columns", "values", and "table". For a DataFrame, the default is "columns".

  • typ: Defines the type of object to return (frame for DataFrame, series for Series).

  • dtype: Specifies the data types of the columns. If set to True, Pandas infers the data type of each column. If set to False, Pandas does not infer data types and keeps the data as it is. You can also pass a dictionary that maps column names to data types. By default, it is set to None, which behaves like True for every orient except "table".

  • convert_axes: If True, attempts to convert the axes (index and column labels) to appropriate data types. Default is None, which behaves like True for every orient except "table".

  • convert_dates: If True, attempts to parse date-like columns. If False, no dates are converted. If you pass a list of column names, those columns are converted, and the default date-like columns may also be converted (depending on the keep_default_dates parameter).

  • keep_default_dates: If True (the default) and date parsing is enabled (convert_dates is not set to False), the method automatically parses columns with date-like names. A column name is treated as date-like if it ends with "_at" or "_time", begins with "timestamp", or is "modified" or "date".

  • precise_float: Enables the use of a higher precision (strtod) function for decoding string to double values.

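  • date_unit: Specifies the timestamp unit to detect when converting dates. By default, the precision is detected automatically; pass one of "s", "ms", "us", or "ns" to force a specific unit.

  • encoding: Specifies the encoding used to decode bytes input. Default is "utf-8".

  • encoding_errors: Defines how encoding errors are handled. Possible values include "strict", "ignore", and "replace".

  • lines: If True, reads the file as one JSON object per line (line-delimited JSON).

  • chunksize: Returns a JsonReader object that iterates over the file in chunks of the specified number of lines (useful for large files). Can only be used when lines=True.

  • compression: Specifies the compression type (e.g., gzip, bz2, zip, etc.). Default is "infer", which detects the compression from the file extension.

  • nrows: The number of lines to read from a line-delimited JSON file. Can only be used when lines=True.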

  • storage_options: Additional options for connecting to certain storage back-ends (e.g., AWS S3, Google Cloud Storage).

  • dtype_backend: Determines the backend data type for the resultant DataFrame. This feature is experimental.

  • engine: Specifies the parser engine to use for reading JSON. Default engine is "ujson". The "pyarrow" engine is only available when lines=True.
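A few of the parameters above are easier to follow with a small sketch. The snippet below uses made-up sample data (the "id" and "joined" column names are assumptions for illustration) to show how dtype can keep a column as text and how convert_dates can parse a column whose name is not one of the default date-like names.

```python
import pandas as pd
from io import StringIO

# Hypothetical sample data: "id" holds zero-padded codes, "joined" holds ISO dates.
json_data = (
    '[{"id": "001", "joined": "2024-10-10"},'
    ' {"id": "002", "joined": "2024-10-12"}]'
)

# Default settings: Pandas may infer a numeric type for "id" (dropping the
# leading zeros), and "joined" is not a default date-like name, so it is
# left as text.
default_df = pd.read_json(StringIO(json_data))

# Keep "id" as text and explicitly ask for "joined" to be parsed as a date.
typed_df = pd.read_json(
    StringIO(json_data),
    dtype={"id": str},
    convert_dates=["joined"],
)

print(default_df.dtypes)
print(typed_df.dtypes)
```

Comparing the two dtypes listings shows the effect of each parameter without relying on any particular input file.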

Return Value

The Pandas read_json() method returns a Series, a DataFrame, or a pandas.api.typing.JsonReader object. A JsonReader is returned when chunksize is set to a value other than 0 or None. Otherwise, the typ parameter determines whether a DataFrame or a Series is returned.
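As a minimal sketch of the JsonReader case, the snippet below reads a small line-delimited JSON string in chunks of two lines. The sample records are invented for illustration; the only behaviour assumed is that chunksize requires lines=True and that iterating the reader yields DataFrames.

```python
import pandas as pd
from io import StringIO

# Hypothetical line-delimited JSON: one record per line.
json_lines = (
    '{"Name": "Kiran", "Age": 30}\n'
    '{"Name": "Hema", "Age": 25}\n'
    '{"Name": "priya", "Age": 35}\n'
)

# With chunksize set, read_json() returns a JsonReader instead of a DataFrame.
with pd.read_json(StringIO(json_lines), lines=True, chunksize=2) as reader:
    chunks = list(reader)

# Each chunk is a DataFrame holding at most two rows.
chunk_sizes = [len(chunk) for chunk in chunks]
combined = pd.concat(chunks)

print(chunk_sizes)
print(combined)
```

Processing each chunk inside the loop, rather than collecting them all, is what keeps memory use low for large files.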

Example: Basic Example of Reading JSON String

The following example shows basic usage of the read_json() method to read a JSON string into a DataFrame. The string is wrapped in StringIO so that it is passed as an in-memory buffer.

import pandas as pd
from io import StringIO

json_data = '{"col1": {"row1": "a", "row2": "b"}, "col2": {"row1": "c", "row2": "d"}}'

# Read the JSON string
df = pd.read_json(StringIO(json_data))

print("DataFrame from JSON String:")
print(df)

The output of the above code is as follows −

DataFrame from JSON String:
     col1 col2
row1    a    c
row2    b    d
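The same approach can return a Series instead of a DataFrame by setting the typ parameter. The sketch below assumes a flat JSON object (the keys and values are sample data) and reads it with typ="series".

```python
import pandas as pd
from io import StringIO

# A flat JSON object: keys become the index, values become the data.
json_data = '{"row1": "a", "row2": "b"}'

# typ="series" asks read_json() to build a Series rather than a DataFrame.
s = pd.read_json(StringIO(json_data), typ="series")

print(type(s).__name__)
print(s)
```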

Example: Reading a Simple JSON File

Here is a basic example demonstrating reading a simple JSON file using the pandas read_json() method.

import pandas as pd

# Reading a JSON file
df = pd.read_json('example_json_file.json')

print("DataFrame from JSON File:")
print(df)

When we run the above program, it produces the following result −

DataFrame from JSON File:
        Car Date_of_purchase
0       BMW       10-10-2024
1     Lexus       12-10-2024
2      Audi       17-10-2024
3  Mercedes       16-10-2024
4    Jaguar       19-10-2024
5   Bentley       22-10-2024
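The contents of example_json_file.json are not shown here. To try the example, one option is to create the file first. The sketch below writes a hypothetical version of it in the "records" layout, reconstructed from the output above; the real file may be structured differently.

```python
import json
import pandas as pd

# Hypothetical file contents, reconstructed from the printed output.
records = [
    {"Car": "BMW", "Date_of_purchase": "10-10-2024"},
    {"Car": "Lexus", "Date_of_purchase": "12-10-2024"},
    {"Car": "Audi", "Date_of_purchase": "17-10-2024"},
    {"Car": "Mercedes", "Date_of_purchase": "16-10-2024"},
    {"Car": "Jaguar", "Date_of_purchase": "19-10-2024"},
    {"Car": "Bentley", "Date_of_purchase": "22-10-2024"},
]

# Write the records to disk as a JSON array.
with open("example_json_file.json", "w") as f:
    json.dump(records, f)

# Read the file back into a DataFrame.
df = pd.read_json("example_json_file.json")
print(df)
```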

Example: Reading JSON Data from a URL

This example reads JSON data from a URL using the pandas read_json() method.

import pandas as pd

url = "https://raw.githubusercontent.com/domoritz/maps/refs/heads/master/data/iris.json"

# Read JSON into a Pandas DataFrame 
df = pd.read_json(url)

print("DataFrame from JSON file using a URL:")
print(df.head(5))

The output of the above code is as follows −

DataFrame from JSON file using a URL:
   sepalLength  sepalWidth  petalLength  petalWidth species
0          5.1         3.5          1.4         0.2  setosa
1          4.9         3.0          1.4         0.2  setosa
2          4.7         3.2          1.3         0.2  setosa
3          4.6         3.1          1.5         0.2  setosa
4          5.0         3.6          1.4         0.2  setosa

Example: Reading JSON with 'records' Orientation

This example demonstrates reading JSON data with 'records' orientation using the orient parameter of the pandas read_json() method.

import pandas as pd
from io import StringIO

# Sample JSON
data = """[
    {"Name": "Kiran", "Gender": "Male", "Age": 30},
    {"Name": "Hema", "Gender": "Female", "Age": 25},
    {"Name": "priya", "Gender": "Female", "Age": 35}
]"""

# Read JSON
df = pd.read_json(StringIO(data), orient='records')
print(df)

When we execute the above code, we get the following output −

    Name  Gender  Age
0  Kiran    Male   30
1   Hema  Female   25
2  priya  Female   35
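For comparison, the same kind of data can arrive in other layouts. The sketch below uses made-up JSON strings in the "split" and "index" orientations and reads each one with the matching orient value; both produce an equivalent DataFrame.

```python
import pandas as pd
from io import StringIO

# "split" layout: separate lists for column names, index labels, and row data.
split_json = (
    '{"columns": ["Name", "Age"], "index": [0, 1],'
    ' "data": [["Kiran", 30], ["Hema", 25]]}'
)
df_split = pd.read_json(StringIO(split_json), orient="split")

# "index" layout: each top-level key is a row label mapping to that row's values.
index_json = '{"0": {"Name": "Kiran", "Age": 30}, "1": {"Name": "Hema", "Age": 25}}'
df_index = pd.read_json(StringIO(index_json), orient="index")

print(df_split)
print(df_index)
```

If the orient value does not match the actual layout of the data, read_json() either raises an error or builds a DataFrame with the wrong shape, so it is worth checking the raw JSON first.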

Example: Reading Compressed JSON Data into a DataFrame

The read_json() method can read data directly from a compressed JSON file. Use the compression parameter to specify the compression type. This example first writes a DataFrame to a gzip-compressed file and then reads it back.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame([
    {"Name": "Kiran", "Gender": "Male", "Age": 30},
    {"Name": "Hema", "Gender": "Female", "Age": 25},
    {"Name": "priya", "Gender": "Female", "Age": 35}])

# Write the DataFrame to a gzip-compressed JSON file
# (the compression is inferred from the .gz extension)
df.to_json('dataframe_to_json_data.json.gz')

# Read compressed JSON
df = pd.read_json('dataframe_to_json_data.json.gz', compression='gzip')
print("DataFrame from compressed JSON file:")
print(df)

The output of the above code is as follows −

DataFrame from compressed JSON file:
    Name  Gender  Age
0  Kiran    Male   30
1   Hema  Female   25
2  priya  Female   35
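Compression also combines with line-delimited JSON. The sketch below is one possible variation: it writes the same sample data as gzip-compressed JSON Lines to a temporary directory (the file name people.jsonl.gz is a made-up example), then reads back only the first two lines, leaving compression at its default so that it is inferred from the .gz extension.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame([
    {"Name": "Kiran", "Gender": "Male", "Age": 30},
    {"Name": "Hema", "Gender": "Female", "Age": 25},
    {"Name": "priya", "Gender": "Female", "Age": 35}])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name; the .gz suffix triggers gzip compression.
    path = os.path.join(tmp_dir, "people.jsonl.gz")

    # Write one JSON object per line.
    df.to_json(path, orient="records", lines=True)

    # Read only the first two lines; compression is inferred from the extension.
    first_two = pd.read_json(path, lines=True, nrows=2)

print(first_two)
```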