import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl.styles import Font
from collections import Counter
import os
# --- Load the file ---
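# Expected input columns (any letter case): TRANS DATE, DESCRIPTION, CREDITS, DEBITS, BALANCE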
df = pd.read_excel("sample.xlsx")
# Keep a clean copy for Full_Statement
df_cleaned_original = df.copy()
# --- Step 1: Preprocess for Monthly Summary ---
df.columns = df.columns.str.strip().str.upper()
df = df.rename(columns={"TRANS DATE": "DATE"})
df['DATE'] = pd.to_datetime(df['DATE'], errors='coerce')
df = df.dropna(subset=['DATE'])
# Normalize
df['DESCRIPTION'] = df['DESCRIPTION'].astype(str).str.upper()
df['CREDITS'] = pd.to_numeric(df.get('CREDITS', 0), errors='coerce').fillna(0)
df['DEBITS'] = pd.to_numeric(df.get('DEBITS', 0), errors='coerce').fillna(0)
df['BALANCE'] = pd.to_numeric(df.get('BALANCE', 0), errors='coerce')
df['DAY'] = df['DATE'].dt.day
df['MONTH'] = df['DATE'].dt.month
df['YEAR'] = df['DATE'].dt.year
df['IS_IWRETURN'] = df['DESCRIPTION'].str.contains('IWRETURN', na=False)
df['IS_OWRETURN'] = df['DESCRIPTION'].str.contains('OWRETURN', na=False)
df['IS_NON_BUSINESS'] = ~df['DESCRIPTION'].str.contains('SALARY|NEFT|RTGS|UPI|IMPS', na=False)
df['IS_EMI_BOUNCE'] = df['DESCRIPTION'].str.contains('EMI BOUNCE|BOUNCE', na=False)
df['IS_ODCC_INTEREST'] = df['DESCRIPTION'].str.contains('OD|CC INTEREST', na=False)
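# NOTE: 'OD|CC INTEREST' is a regex alternation, so any description containing 'OD' is also flagged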
# Daily last balance
daily_last_bal = df.sort_values('DATE').groupby(df['DATE'].dt.date)['BALANCE'].last().reset_index()
daily_last_bal.columns = ['Date', 'Closing Balance']
# Monthly summary
summary = df.groupby(['YEAR', 'MONTH']).agg(
    SumOfCredit=('CREDITS', 'sum'),
    SumOfDebit=('DEBITS', 'sum'),
    NoOfCredit=('CREDITS', lambda x: (x > 0).sum()),
    NoOfDebit=('DEBITS', lambda x: (x > 0).sum()),
    IWReturns=('IS_IWRETURN', 'sum'),
    OWReturns=('IS_OWRETURN', 'sum'),
    NonBusinessCreditsSUM=('CREDITS', lambda x: (df.loc[x.index, 'IS_NON_BUSINESS'] * x).sum()),
    NonBusinessCreditsInNo=('IS_NON_BUSINESS', 'sum'),
    NoOfEMIBounces=('IS_EMI_BOUNCE', 'sum'),
    MonthlyODCCInterest=('CREDITS', lambda x: (df.loc[x.index, 'IS_ODCC_INTEREST'] * x).sum())
).reset_index()
# Add daily last balances (columns 01–31)
for day in range(1, 32):
summary[f"{day:02}"] = summary.apply(
lambda row: df.loc[
(df['YEAR'] == row['YEAR']) & (df['MONTH'] == row['MONTH']) &
(df['DAY'] == day),
'BALANCE'
].iloc[-1] if not df.loc[
(df['YEAR'] == row['YEAR']) & (df['MONTH'] == row['MONTH']) &
(df['DAY'] == day)
].empty else 0,
axis=1
)
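# Each '01'..'31' column now holds the last recorded balance for that day of the month (0 if no transactions that day)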
# --- Step 2: Prepare Full_Statement ---
drop_cols = ['DAY', 'MONTH', 'YEAR', 'IS_IWRETURN', 'IS_OWRETURN',
'IS_NON_BUSINESS', 'IS_EMI_BOUNCE', 'IS_ODCC_INTEREST']
df_cleaned = df.drop(columns=drop_cols, errors='ignore')
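# (The Full_Statement sheet below is written from df_cleaned_original, the untouched copy kept at load time)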
# --- Step 3: Create Top Transactions Sheet ---
df_top = df_cleaned_original.copy()
df_top.columns = df_top.columns.str.strip().str.upper()
df_top['CREDITS'] = pd.to_numeric(df_top.get('CREDITS', 0), errors='coerce').fillna(0)
df_top['DEBITS'] = pd.to_numeric(df_top.get('DEBITS', 0), errors='coerce').fillna(0)
df_top['TRANS DATE'] = pd.to_datetime(df_top['TRANS DATE'], errors='coerce')
df_top = df_top.dropna(subset=['TRANS DATE'])
# Classification logic
def classify_transaction(desc):
desc = str(desc).upper()
if 'NEFT' in desc:
return 'NEFT'
elif 'RTGS' in desc:
return 'RTGS'
elif 'UPI' in desc:
return 'UPI'
elif 'IMPS' in desc:
return 'IMPS'
elif 'CLG' in desc or 'CHEQUE' in desc:
return 'Cheque Outward'
elif 'SELF' in desc or 'CASH' in desc:
return 'Cash Withdrawal'
else:
return 'Other'
# Overall Top 5
overall_credits = df_top[df_top['CREDITS'] > 0].nlargest(5, 'CREDITS').copy()
overall_debits = df_top[df_top['DEBITS'] > 0].nlargest(5, 'DEBITS').copy()
for df_txn, col in [(overall_credits, 'CREDITS'), (overall_debits, 'DEBITS')]:
    df_txn['AMOUNT'] = df_txn[col]
    df_txn['Classification Transaction'] = df_txn['DESCRIPTION'].apply(classify_transaction)
    df_txn['TRANS DATE'] = df_txn['TRANS DATE'].dt.strftime('%d-%m-%Y')
credit_table = overall_credits[['TRANS DATE', 'DESCRIPTION', 'AMOUNT', 'Classification Transaction']]
debit_table = overall_debits[['TRANS DATE', 'DESCRIPTION', 'AMOUNT', 'Classification Transaction']]
credit_table.columns = debit_table.columns = ['Extracted Date', 'Description', 'Amount', 'Classification Transaction']
# --- Monthly Top 5 ---
df_top['Month_Str'] = df_top['TRANS DATE'].dt.strftime('%B-%Y')
monthly_credits = df_top[df_top['CREDITS'] > 0].copy()
monthly_debits = df_top[df_top['DEBITS'] > 0].copy()
monthly_credit_dict = {}
monthly_debit_dict = {}
# Iterate months in chronological (not alphabetical) order
for month in sorted(df_top['Month_Str'].unique(), key=lambda m: pd.to_datetime(m, format='%B-%Y')):
    top_5_c = monthly_credits[monthly_credits['Month_Str'] == month].nlargest(5, 'CREDITS')
    top_5_d = monthly_debits[monthly_debits['Month_Str'] == month].nlargest(5, 'DEBITS')
credit_rows = []
for _, row in top_5_c.iterrows():
credit_rows.append([
row['TRANS DATE'].strftime('%d-%m-%Y'),
row['DESCRIPTION'],
row['CREDITS'],
classify_transaction(row['DESCRIPTION'])
])
monthly_credit_dict[month] = credit_rows
debit_rows = []
for _, row in top_5_d.iterrows():
debit_rows.append([
row['TRANS DATE'].strftime('%d-%m-%Y'),
row['DESCRIPTION'],
row['DEBITS'],
classify_transaction(row['DESCRIPTION'])
])
monthly_debit_dict[month] = debit_rows
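# monthly_credit_dict / monthly_debit_dict map each 'Month-Year' label to its top-5 rows of [date, description, amount, classification]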
# --- Step 4: Save to Excel ---
wb = Workbook()
# Monthly_Summary
ws1 = wb.active
ws1.title = "Monthly_Summary"
for r in dataframe_to_rows(summary, index=False, header=True):
ws1.append(r)
# Full_Statement
ws2 = wb.create_sheet("Full_Statement")
for r in dataframe_to_rows(df_cleaned_original, index=False, header=True):
ws2.append(r)
# Top_Transactions
ws3 = wb.create_sheet("Top_Transactions")
# Section 1: Overall Top 5
ws3.append(["Top 5 Transactions Credit", "", "", "", "", "Top 5 Transactions
Debit", "", "", ""])
ws3.append([])
ws3.append(["Extracted Date", "Description", "Amount", "Classification
Transaction", "",
"Extracted Date", "Description", "Amount", "Classification
Transaction"])
for row_c, row_d in zip(dataframe_to_rows(credit_table, index=False, header=False),
dataframe_to_rows(debit_table, index=False, header=False)):
ws3.append(row_c + [""] + row_d)
# Section 2: Monthly Top 5
ws3.append([])
ws3.append([])
ws3.append(["Top 5 Transactions Credits Monthly", "", "", "", "", "Top 5
Transactions Debits Monthly", "", "", ""])
for month in monthly_credit_dict.keys():
# Add month headers
ws3.append([month, "", "", "", "", month])
ws3.append(["Extracted Date", "Description", "Amount", "Classification
Transaction", "",
"Extracted Date", "Description", "Amount", "Classification
Transaction"])
credit_rows = monthly_credit_dict[month]
debit_rows = monthly_debit_dict.get(month, [])
max_len = max(len(credit_rows), len(debit_rows))
for i in range(max_len):
c_row = credit_rows[i] if i < len(credit_rows) else [""] * 4
d_row = debit_rows[i] if i < len(debit_rows) else [""] * 4
ws3.append(c_row + [""] + d_row)
ws3.append([]) # spacing
# --- Sheet 4: Daily_Balance ---
ws4 = wb.create_sheet("Daily_Balance")
# Prepare day-wise balance data
day_cols = [f"{day:02}" for day in range(1, 32)]
summary['Month_Label'] = pd.to_datetime(summary[['YEAR', 'MONTH']].assign(DAY=1)).dt.strftime('%B-%Y')
# Prepare daily balance summary table
daily_bal_summary = summary[['Month_Label'] + day_cols].copy()
# Add an average row at the end
avg_row = ['Average'] + [round(daily_bal_summary[day].mean(), 2) for day in day_cols]
daily_bal_summary.loc[len(daily_bal_summary)] = avg_row
# Write headers and rows to Sheet 4
ws4.append(['Month'] + day_cols)
for _, row in daily_bal_summary.iterrows():
ws4.append([row['Month_Label']] + list(row[day_cols]))
# --- Step 5: Fraud_Transactions Detection ---
df_fraud = df_cleaned_original.copy()
df_fraud.columns = df_fraud.columns.str.strip().str.upper()
# Ensure numeric types
df_fraud['CREDITS'] = pd.to_numeric(df_fraud.get('CREDITS', 0), errors='coerce').fillna(0)
df_fraud['DEBITS'] = pd.to_numeric(df_fraud.get('DEBITS', 0), errors='coerce').fillna(0)
df_fraud['BALANCE'] = pd.to_numeric(df_fraud.get('BALANCE', 0), errors='coerce')
# Parse transaction date and sort
df_fraud['TRANS DATE'] = pd.to_datetime(df_fraud['TRANS DATE'], errors='coerce')
df_fraud = df_fraud.sort_values('TRANS DATE').reset_index(drop=True)
# Load holiday list
holiday_dates = pd.to_datetime([
"2023-12-04", "2023-12-12", "2023-12-13", "2023-12-14", "2023-12-18",
"2023-12-19", "2023-12-25", "2023-12-26", "2023-12-30", "2024-01-26",
"2024-02-19", "2024-03-29", "2024-04-01", "2024-04-17", "2024-05-01",
"2024-06-06", "2024-06-07", "2024-06-11", "2024-06-15"
])
# --- STEP 1: Calculate expected balance from FIRST row ---
expected_balances = [df_fraud.loc[0, 'BALANCE']]
mismatch_flags = [False] # First row has no mismatch
for i in range(1, len(df_fraud)):
prev_balance = expected_balances[-1]
credit = df_fraud.loc[i, 'CREDITS']
debit = df_fraud.loc[i, 'DEBITS']
actual_balance = df_fraud.loc[i, 'BALANCE']
expected = prev_balance + credit - debit
expected_balances.append(expected)
mismatch = abs(expected - actual_balance) > 1.0
mismatch_flags.append(mismatch)
df_fraud['Expected_Balance'] = expected_balances
df_fraud['Mismatch'] = mismatch_flags
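# A row is flagged as a mismatch when the recomputed running balance differs from the stated balance by more than 1.0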
# --- STEP 2: Flag suspicious holiday debits (excluding UPI) ---
df_fraud['DESCRIPTION_LOWER'] = df_fraud['DESCRIPTION'].str.lower().fillna('')
df_fraud['IsUPI'] = df_fraud['DESCRIPTION_LOWER'].str.contains('upi|unified payment')
df_fraud['IsHolidayWithdrawal'] = (
df_fraud['TRANS DATE'].isin(holiday_dates) &
(df_fraud['DEBITS'] > 0) &
(~df_fraud['IsUPI']) &
df_fraud['Mismatch']
)
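# Holiday withdrawals are only flagged when they are non-UPI debits that also carry a balance mismatch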
# --- STEP 3: Filter fraud transactions ---
fraud_rows = df_fraud[df_fraud['Mismatch'] | df_fraud['IsHolidayWithdrawal']]
# --- STEP 4: Prepare output ---
fraud_output = fraud_rows[['TRANS DATE', 'DESCRIPTION', 'CREDITS', 'DEBITS', 'BALANCE']].copy()
fraud_output.rename(columns={
'TRANS DATE': 'Extracted Date',
'CREDITS': 'Credit',
'DEBITS': 'Debit'
}, inplace=True)
# --- STEP 5: Classify Transaction Type ---
def classify_transaction(description):
if not isinstance(description, str):
return "Unknown"
desc = description.lower()
if "upi" in desc:
return "UPI"
elif "credit" in desc or "cr" in desc:
return "Credit"
elif "debit" in desc or "withdrawal" in desc or "dr" in desc:
return "Debit"
return "Unknown"
fraud_output['Party Name'] = ""
fraud_output['Transaction Type'] = fraud_output['DESCRIPTION'].apply(classify_transaction)
# --- STEP 6: Write to Excel ---
ws5 = wb.create_sheet("Fraud_Transactions")
ws5.append(fraud_output.columns.tolist())
for row in fraud_output.itertuples(index=False):
    try:
        ws5.append(list(row))
    except Exception:
        # Fall back to a placeholder if a value cannot be written to the cell
        ws5.append(["Invalid data in row"])
# --- Step 6: Fraud_Indicators Sheet ---
fraud_indicator_rows = [
    ["Amount Balance Mismatch", "Transaction whose amount/balance do not match the previous transactions."],
    ["Irregular Interest Charges", "Interest charges which are not present in all months within a narrow date range."],
    ["Irregular Transfers to Parties", "Transactions categorised as Fund Transfers which are not present in all months within a narrow date range."],
    ["Irregular Salary Credits", "Salary credits which are not present in every month within a narrow date range."],
    ["Suspicious ATM Withdrawals", "ATM withdrawals whose amount is not a multiple of 100 or is outside the permissible range."],
    ["Transactions on Bank Holidays", "NEFT, RTGS and cheque deposit transactions cannot happen on bank holidays."],
    ["Suspicious RTGS Transactions", "RTGS transactions have a minimum amount limit as prescribed by RBI."],
    ["Suspicious Salary Credits", "Salary credit on bank holidays."],
    ["Salary Credit Amount Remains Unchanged over extended period", "Salary amount usually changes over time, particularly during the tail end of the financial year, due to changes in the TDS amount."],
    ["Round Figure Tax Payment", "Tax paid amounts are usually not round figures (multiples of 100)."],
    ["Negative EOD Balance", "EOD bank balance on any day is unlikely to be negative in a savings account."],
    ["Interest Credit Transactions", "Interest credit transactions should be periodic (monthly/quarterly/half-yearly)."],
    ["More and frequent Cash Deposit than Salary", "A higher number or larger amount of cash deposits than salary is highly unlikely."],
    ["Immediate big transactions after Salary Credits", "Withdrawal of a big amount immediately after a salary credit can indicate forged salary entries."],
    ["Equal Credit Debit", "The total amount or total number of credits and debits being exactly equal is extremely unlikely."],
]
# Create the Fraud_Indicators sheet (detailed format)
ws6 = wb.create_sheet("Fraud_Indicators")
# Updated Header
ws6.append(["Sr No", "Fraud Indicator", "Description", "Identified ?", "Transaction
Count"])
# Add rows with Sr No and placeholders
for i, row in enumerate(fraud_indicator_rows, start=1):
ws6.append([i, row[0], row[1], "No", 0])
# --- Step 7: Irregular_Transactions Sheet ---
# Step 1: Determine threshold for large transactions (95th percentile)
debit_threshold = df['DEBITS'].quantile(0.95)
credit_threshold = df['CREDITS'].quantile(0.95)
# Step 2: Count occurrence of each description
desc_counts = df['DESCRIPTION'].value_counts()
# Step 3: Mark irregular if:
# - High amount or
# - Rare description
df['IS_IRREGULAR'] = (
(df['DEBITS'] > debit_threshold) |
(df['CREDITS'] > credit_threshold) |
(df['DESCRIPTION'].apply(lambda x: desc_counts.get(x, 0)) <= 1)
)
# Step 4: Extract irregular transactions
irregular_df = df[df['IS_IRREGULAR']].copy()
irregular_df = irregular_df[['DATE', 'DESCRIPTION', 'DEBITS', 'CREDITS', 'BALANCE']]
irregular_df.columns = ['Date', 'Description', 'Debit', 'Credit', 'Balance']
irregular_df = irregular_df.sort_values('Date')
# Step 5: Write to new sheet
ws7 = wb.create_sheet("Irregular_Transactions")
ws7.append(irregular_df.columns.tolist())
for row in irregular_df.itertuples(index=False):
ws7.append(list(row))
#Ensure column names are consistent and datetime is parsed
df.columns = df.columns.str.strip().str.upper()
if 'DATE' in df.columns:
df['DATE'] = pd.to_datetime(df['DATE'], errors='coerce')
else:
raise KeyError("Expected column 'DATE' not found in DataFrame.")
# Create YearMonth column for grouping
df['YEARMONTH'] = df['DATE'].dt.to_period("M")
# Ensure DEBITS and CREDITS are numeric
for col in ['DEBITS', 'CREDITS']:
if col in df.columns:
df[col] = pd.to_numeric(df[col], errors='coerce').fillna(0)
else:
df[col] = 0.0
# Optional classifier (customize based on your keywords)
def classify_transaction(desc):
desc = str(desc).lower()
if any(word in desc for word in ['atm', 'withdraw', 'debit']):
return 'Use'
elif any(word in desc for word in ['salary', 'refund', 'credit', 'transfer']):
return 'Source'
return 'Other'
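# NOTE: classify_transaction is redefined once more; only this 'Use'/'Source'/'Other' version is visible from here on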
# Function to extract top uses and sources
def get_top_usage_source(df_sub):
uses_data, sources_data = [], []
grouped = df_sub.groupby("DESCRIPTION")
for desc, group in grouped:
total_debit = group["DEBITS"].sum()
total_credit = group["CREDITS"].sum()
combined_amounts = pd.concat([group["DEBITS"], group["CREDITS"]]).round(2)
common_amounts = Counter(combined_amounts.dropna()).most_common(1)
        most_common_amt, count_common = common_amounts[0] if common_amounts else (0, 0)
total_count = len(group)
classification = classify_transaction(desc)
row = {
"Description": desc,
"Total Sum of Transaction": round(total_debit + total_credit, 2),
"Total Count": total_count,
"Similar Transaction Amount": round(most_common_amt, 2),
"Count of Transaction": count_common,
"Classification Transaction": classification
}
if total_debit > total_credit:
uses_data.append(row)
elif total_credit > 0:
sources_data.append(row)
uses_df = pd.DataFrame(uses_data)
sources_df = pd.DataFrame(sources_data)
    if not uses_df.empty:
        uses_df = uses_df.sort_values("Total Sum of Transaction", ascending=False).head(10)
    if not sources_df.empty:
        sources_df = sources_df.sort_values("Total Sum of Transaction", ascending=False).head(10)
return uses_df, sources_df
# --- Step 8: Top_Uses_and_Sources Sheet ---
# Reuse the existing workbook if one already exists; otherwise create a new one
try:
wb
except NameError:
wb = Workbook()
ws8 = wb.create_sheet("Top_Uses_and_Sources")
ws8.append(["Top 10 Uses (Overall)"] + [""] * 5 + ["Top 10 Sources (Overall)"])
# Overall data
overall_uses_df, overall_sources_df = get_top_usage_source(df)
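# Write the uses and sources tables side by side, separated by one blank column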
ws8.append(list(overall_uses_df.columns) + [""] + list(overall_sources_df.columns))
for i in range(max(len(overall_uses_df), len(overall_sources_df))):
    use_row = list(overall_uses_df.iloc[i]) if i < len(overall_uses_df) else [""] * len(overall_uses_df.columns)
    source_row = list(overall_sources_df.iloc[i]) if i < len(overall_sources_df) else [""] * len(overall_sources_df.columns)
    ws8.append(use_row + [""] + source_row)
# Monthly breakdown
ws8.append([]); ws8.append(["Top 10 Uses and Sources Per Month"])
for month, group in df.groupby("YEARMONTH"):
ws8.append([]); ws8.append([f"Month: {month}"])
monthly_uses_df, monthly_sources_df = get_top_usage_source(group)
ws8.append(["Top 10 Uses"] + [""] * 5 + ["Top 10 Sources"])
ws8.append(list(monthly_uses_df.columns) + [""] +
list(monthly_sources_df.columns))
for i in range(max(len(monthly_uses_df), len(monthly_sources_df))):
use_row = list(monthly_uses_df.iloc[i]) if i < len(monthly_uses_df) else
[""] * len(monthly_uses_df.columns)
source_row = list(monthly_sources_df.iloc[i]) if i <
len(monthly_sources_df) else [""] * len(monthly_sources_df.columns)
ws8.append(use_row + [""] + source_row)
# --- Step 9: Account_Summary Sheet with Key Financial Metrics ---
from dateutil.relativedelta import relativedelta
# Get last available date in the dataset
latest_date = df['DATE'].max()
six_months_ago = latest_date - relativedelta(months=6)
twelve_months_ago = latest_date - relativedelta(months=12)
# Average Balance (overall)
avg_balance = round(df['BALANCE'].mean(), 2)
# Average Balance on 5th, 15th, and 25th
avg_balance_5_15_25 = round(df[df['DAY'].isin([5, 15, 25])]['BALANCE'].mean(), 2)
# Average Balance (last 6 months)
avg_balance_last_6 = round(df[df['DATE'] >= six_months_ago]['BALANCE'].mean(), 2)
# Average Receipt (last 6 months)
avg_receipt_6 = round(df[(df['DATE'] >= six_months_ago) & (df['CREDITS'] > 0)]['CREDITS'].mean(), 2)
# I/W Return
iw_return_count = int(df['IS_IWRETURN'].sum())
# O/W Return
ow_return_count = int(df['IS_OWRETURN'].sum())
# Average Balance (last 12 months)
avg_balance_last_12 = round(df[df['DATE'] >= twelve_months_ago]['BALANCE'].mean(), 2)
# Average Receipt (last 12 months)
avg_receipt_12 = round(df[(df['DATE'] >= twelve_months_ago) & (df['CREDITS'] > 0)]['CREDITS'].mean(), 2)
# Total Gross Credits
total_gross_credits = round(df['CREDITS'].sum(), 2)
# Total Net Credits (excluding IW returns)
total_net_credits = round(df[~df['IS_IWRETURN']]['CREDITS'].sum(), 2)
# Total Gross Debits
total_gross_debits = round(df['DEBITS'].sum(), 2)
# --- Create 9th Sheet ---
ws9 = wb.create_sheet("New_Analysis")
# Format header
ws9.append(["Metric", "Value"])
# Append rows
rows = [
["Average Balance", avg_balance],
["Average Balance(5,15,25)", avg_balance_5_15_25],
["Average Balance(Last 6 Month)", avg_balance_last_6],
["Average Receipt(6 months)", avg_receipt_6],
["I/W Return", iw_return_count],
["O/W Return", ow_return_count],
["Average Balance(Last 12 Month)", avg_balance_last_12],
["Average Receipt(12 Month)", avg_receipt_12],
["Total Gross Credits", total_gross_credits],
["Total Net Credits", total_net_credits],
["Total Gross Debits", total_gross_debits]
]
for r in rows:
ws9.append(r)
# Leave space and label
ws9.append([])
ws9.append(["Monthly Detailed Metrics"])
ws9.append([])
# Ensure 'MONTH_YEAR' and 'DAY' columns exist
df['MONTH_YEAR'] = df['DATE'].dt.strftime('%b %Y')
df['DAY'] = df['DATE'].dt.day
# Group by Month-Year
monthly_group = df.groupby('MONTH_YEAR')
monthly_data = []
# Safe helper
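# count_and_sum returns [count, total amount] for rows where the boolean flag column is set,
# or [0, 0.0] when that flag column does not exist (most IS_* flags referenced below are never
# created earlier in this script, so those metrics fall back to zeros)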
def count_and_sum(group, col_flag, value_col):
if col_flag in group.columns:
return [
int(group[col_flag].sum()),
round(group.loc[group[col_flag], value_col].sum(), 2)
]
return [0, 0.0]
# Process each month
for label, group in monthly_group:
row = [label] # Month-Year
# Balances on 5th, 15th, 25th and Monthly Avg
for d in [5, 15, 25]:
row.append(round(group[group['DAY'] == d]['BALANCE'].mean(), 2))
row.append(round(group['BALANCE'].mean(), 2))
# Credit/Debit stats
    iw_flag = group['IS_IWRETURN'] if 'IS_IWRETURN' in group else pd.Series([False] * len(group), index=group.index)
    ow_flag = group['IS_OWRETURN'] if 'IS_OWRETURN' in group else pd.Series([False] * len(group), index=group.index)
row.extend([
group[group['CREDITS'] > 0].shape[0],
round(group['CREDITS'].sum(), 2),
group[group['DEBITS'] > 0].shape[0],
round(group['DEBITS'].sum(), 2),
int(iw_flag.sum()),
int(ow_flag.sum()),
group[(group['CREDITS'] > 0) & (~iw_flag)].shape[0],
round(group[(group['CREDITS'] > 0) & (~iw_flag)]['CREDITS'].sum(), 2),
group[(group['DEBITS'] > 0) & (~ow_flag)].shape[0],
round(group[(group['DEBITS'] > 0) & (~ow_flag)]['DEBITS'].sum(), 2),
])
# Other flags
row += count_and_sum(group, 'IS_CASH_WITHDRAWAL', 'DEBITS')
row += count_and_sum(group, 'IS_ATM', 'DEBITS')
row += count_and_sum(group, 'IS_CASH_DEPOSIT', 'CREDITS')
row += count_and_sum(group, 'IS_CHEQUE_RETURN_CHARGE', 'DEBITS')
row += count_and_sum(group, 'IS_CHEQUE_INWARD_BOUNCE', 'DEBITS')
row += count_and_sum(group, 'IS_CHEQUE_OUTWARD_BOUNCE', 'DEBITS')
row += count_and_sum(group, 'IS_PAYMENT_INWARD_BOUNCE', 'DEBITS')
row += count_and_sum(group, 'IS_PAYMENT_OUTWARD_BOUNCE', 'DEBITS')
row += count_and_sum(group, 'IS_PAYMENT_BOUNCE_CHARGE', 'DEBITS')
row += count_and_sum(group, 'IS_CHEQUE_DEPOSIT', 'CREDITS')
row += count_and_sum(group, 'IS_CHEQUE_ISSUE', 'DEBITS')
row += count_and_sum(group, 'IS_CREDIT_INTERNAL_TRANSFER', 'CREDITS')
row += count_and_sum(group, 'IS_DEBIT_INTERNAL_TRANSFER', 'DEBITS')
row += count_and_sum(group, 'IS_LOAN_DISBURSAL', 'CREDITS')
row += count_and_sum(group, 'IS_INTEREST_RECEIVED', 'CREDITS')
row += count_and_sum(group, 'IS_INTEREST_PAID', 'DEBITS')
# Salary
if 'IS_SALARY' in group.columns:
row += [
round(group[group['IS_SALARY']]['CREDITS'].sum(), 2),
round(group[group['IS_SALARY']]['DEBITS'].sum(), 2)
]
else:
row += [0.0, 0.0]
# Holiday and charges
row += count_and_sum(group, 'IS_HOLIDAY_TRANSACTION', 'DEBITS')
row += count_and_sum(group, 'IS_MIN_BAL_CHARGE', 'DEBITS')
# Cash deposit ranges
    row.append(((group['CREDITS'] >= 900000) & (group['CREDITS'] <= 1000000)).sum())
    row.append(((group['CREDITS'] >= 40000) & (group['CREDITS'] <= 50000)).sum())
# ATM withdrawals > 2000
if 'IS_ATM' in group.columns:
row.append(((group['IS_ATM']) & (group['DEBITS'] > 2000)).sum())
else:
row.append(0)
# Min, Max, Avg balance
row += [
round(group['BALANCE'].min(), 2),
round(group['BALANCE'].max(), 2),
round(group['BALANCE'].mean(), 2)
]
monthly_data.append(row)
# Final column headers - one header per value appended above (59 columns per month row)
monthly_headers = [
    "Month-Year", "5th Balance", "15th Balance", "25th Balance", "Monthly Avg Balance",
    "No. of Credit Txns", "Total Credit Amt",
    "No. of Debit Txns", "Total Debit Amt",
    "No. of IW Returns", "No. of OW Returns",
    "Net Credit Txns", "Net Credit Amt",
    "Net Debit Txns", "Net Debit Amt",
    "Cash Withdrawal Txns", "Cash Withdrawal Amt",
    "ATM Withdrawal Txns", "ATM Withdrawal Amt",
    "Cash Deposit Txns", "Cash Deposit Amt",
    "Cheque Return Charge Txns", "Cheque Return Charge Amt",
    "Cheque Inward Bounce Txns", "Cheque Inward Bounce Amt",
    "Cheque Outward Bounce Txns", "Cheque Outward Bounce Amt",
    "Payment Inward Bounce Txns", "Payment Inward Bounce Amt",
    "Payment Outward Bounce Txns", "Payment Outward Bounce Amt",
    "Payment Bounce Charge Txns", "Payment Bounce Charge Amt",
    "Cheque Deposit Txns", "Cheque Deposit Amt",
    "Cheque Issue Txns", "Cheque Issue Amt",
    "Credit Internal Transfer Txns", "Credit Internal Transfer Amt",
    "Debit Internal Transfer Txns", "Debit Internal Transfer Amt",
    "Loan Disbursal Txns", "Loan Disbursal Amt",
    "Interest Received Txns", "Interest Received Amt",
    "Interest Paid Txns", "Interest Paid Amt",
    "Salary Credit", "Salary Debit",
    "Holiday Txns", "Holiday Txn Amt",
    "Minimum Balance Charge Txns", "Minimum Balance Charge Amt",
    "Cash Deposit ₹9–10L", "Cash Deposit ₹40–50K",
    "ATM Withdrawal > ₹2000",
    "Min Balance", "Max Balance", "Avg Balance"
]
# Safe Excel row handler
def safe_row(row):
    return [str(x) if isinstance(x, (list, dict)) or pd.isna(x) else x for x in row]
# Add to Excel
ws9.append(safe_row(monthly_headers))
for row in monthly_data:
ws9.append(safe_row(row))
# --- Save ---
wb.save("Analysis_Output28.xlsx")
print("✅ Final file 'Analysis_Output.xlsx' saved with Daily_Balance sheet added.")