Assignment 01

The document outlines the assignment instructions for the Master of Science in Statistics Degree Program at the University of Kelaniya, specifically for the Advanced Regression Analysis course. Students are required to analyze three datasets using R statistical software to build regression models predicting GDP growth, taxi trip pricing, and house prices, with specific guidelines for submission. The assignment emphasizes the importance of including R codes, justifications for model suitability, and detailed interpretations of the results.

Department of Statistics & Computer Science

University of Kelaniya
Master of Science in Statistics Degree Program
Batch 3
STAT 54063 – Advanced Regression Analysis
Assignment 01
Instructions:
• Submit a report with answers to all the questions as a Word/PDF document on or before 27th April March 2025.

• Use R statistical software (RStudio) where necessary and include all R code and screenshots in your report under the relevant question numbers.

• Make sure to submit all R code along with the report when submitting the answers (use a zip folder that includes all documents when submitting the answers to the PGLMS).

• Remember to mention your student number and name in the report and the zip folder.

• Follow all the instructions given in the lecture when attempting and submitting the assignment.
---------------------------------------------------------------------------------------------------------------------

1. The dataset "economic_data.xlsx" contains macroeconomic information collected from a country over a certain period. An investigator is interested in predicting GDP growth based on selected economic indicators. The dataset includes variables such as GDP growth, inflation rate, unemployment rate, government spending, and exports. Variable descriptions of the dataset are given in Table 01. Fit a suitable regression model to predict GDP growth.
(Hint: follow all the steps in fitting a regression model and justify the suitability of the fitted model as well.)
Table 01: Variable description of "economic_data"
Variable              Description
Inflation_Rate        The annual percentage change in the general price level of goods and services.
Unemployment_Rate     The percentage of the labor force that is unemployed and actively seeking work.
Government_Spending   Total government expenditure (in million USD).
Exports               The total value of goods and services exported (in million USD).
GDP_Growth            The percentage increase in a country’s Gross Domestic Product (GDP).
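
The steps mentioned in the hint for Question 1 could be sketched in R roughly as follows. This is only an illustrative outline, not the prescribed solution. The data frame is a small synthetic stand-in generated with rnorm() so that the block runs on its own; the column names come from Table 01, while the simulated values and coefficients are made up. The commented read_excel() call assumes the readxl package is installed and that the file is in the working directory. The VIF is computed by hand to avoid depending on an extra package.

```r
# Illustrative sketch only: synthetic stand-in data, NOT the real economic_data.xlsx.
# For the real data one would typically use (assumes the readxl package):
# econ <- readxl::read_excel("economic_data.xlsx")
set.seed(1)
n <- 60
econ <- data.frame(
  Inflation_Rate      = rnorm(n, 5, 2),
  Unemployment_Rate   = rnorm(n, 7, 1.5),
  Government_Spending = rnorm(n, 500, 50),
  Exports             = rnorm(n, 300, 40)
)
# Made-up response, used only so the example runs
econ$GDP_Growth <- 2 - 0.2 * econ$Inflation_Rate - 0.1 * econ$Unemployment_Rate +
  0.004 * econ$Exports + rnorm(n, 0, 0.5)

# 1. Explore: summary statistics and pairwise correlations
summary(econ)
round(cor(econ), 2)

# 2. Fit the full multiple linear regression model
fit <- lm(GDP_Growth ~ Inflation_Rate + Unemployment_Rate +
            Government_Spending + Exports, data = econ)
summary(fit)

# 3. Check residual assumptions
res <- residuals(fit)
shapiro.test(res)   # normality of residuals
# plot(fit)         # diagnostic plots (residuals vs fitted, Q-Q, etc.)

# 4. Multicollinearity: variance inflation factors computed by hand
predictors <- c("Inflation_Rate", "Unemployment_Rate",
                "Government_Spending", "Exports")
vif <- sapply(predictors, function(v) {
  others <- setdiff(predictors, v)
  r2 <- summary(lm(reformulate(others, response = v), data = econ))$r.squared
  1 / (1 - r2)
})
vif
```

Whether the fitted model is suitable would then be argued from the summary output, the residual checks, and the VIF values, following the criteria given in the lecture.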

2. The “taxi_trip_pricing.xlsx” dataset represents data collected from a ride-hailing service, including information about various trips taken by passengers. It captures multiple trip characteristics, such as the time of day, traffic conditions, weather, and pricing information, as well as the distance, duration, and other relevant details for each trip. Each record represents a single ride or trip. Variable descriptions of the dataset are given in Table 02. Fit a regression model to predict the total price charged for the trip.
(Hint: follow all the steps in fitting a regression model and justify the suitability of the fitted model as well.)

Table 02: Variable description of "taxi_trip_pricing.xlsx"


Variable                Description
Trip_Distance_km        The total distance traveled during the trip, measured in kilometers.
Time_of_Day             The time of day during which the trip occurred. Categories include Morning, Afternoon, Evening, and Night.
Day_of_Week             Indicates whether the trip was taken on a weekday or weekend. Categories include Weekday, Weekend.
Passenger_Count         The number of passengers in the vehicle during the trip.
Traffic_Conditions      The traffic level encountered during the trip. Categories include Low, Medium, High.
Weather                 The weather conditions during the trip. Categories include Clear, Rain.
Base_Fare               The initial fare of the ride, typically a fixed cost before any other charges are applied.
Per_Km_Rate             The rate charged per kilometer traveled.
Per_Minute_Rate         The rate charged per minute of the trip.
Trip_Duration_Minutes   The total duration of the trip, measured in minutes.
Trip_Price              The total price charged for the trip.
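
Because several variables in Table 02 are categorical, one possible way to prepare them for lm() is to convert them to factors before fitting. The sketch below is illustrative only. It uses a synthetic stand-in with a subset of the columns from Table 02 (the real file has more), the simulated prices are made up, and the choice of reference levels is an assumption rather than something the assignment specifies.

```r
# Illustrative sketch only: synthetic stand-in, NOT the real taxi_trip_pricing.xlsx.
# For the real data (assumes the readxl package):
# taxi <- readxl::read_excel("taxi_trip_pricing.xlsx")
set.seed(2)
n <- 80
taxi <- data.frame(
  Trip_Distance_km      = runif(n, 1, 30),
  Trip_Duration_Minutes = runif(n, 5, 90),
  Time_of_Day        = sample(c("Morning", "Afternoon", "Evening", "Night"),
                              n, replace = TRUE),
  Traffic_Conditions = sample(c("Low", "Medium", "High"), n, replace = TRUE),
  Weather            = sample(c("Clear", "Rain"), n, replace = TRUE)
)
# Made-up response, used only so the example runs
taxi$Trip_Price <- 3 + 1.2 * taxi$Trip_Distance_km +
  0.3 * taxi$Trip_Duration_Minutes + rnorm(n)

# Convert categorical columns to factors; the first level is the reference level
taxi$Time_of_Day <- factor(taxi$Time_of_Day,
                           levels = c("Morning", "Afternoon", "Evening", "Night"))
taxi$Traffic_Conditions <- factor(taxi$Traffic_Conditions,
                                  levels = c("Low", "Medium", "High"))
taxi$Weather <- factor(taxi$Weather, levels = c("Clear", "Rain"))

# Check for missing values before fitting
colSums(is.na(taxi))

# lm() creates dummy variables for each factor automatically
fit_taxi <- lm(Trip_Price ~ ., data = taxi)
summary(fit_taxi)
anova(fit_taxi)   # sequential F-tests, one row per term
```

Each factor coefficient in the summary is then read as the expected difference in Trip_Price relative to that factor's reference level, holding the other predictors fixed.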

3. The “house_price.csv” dataset is intended for predicting house prices. It contains 1000 records, with each row representing a house and various attributes that influence its price. Variable descriptions of the dataset are given in Table 03.

Table 03: Variable description of "house_price.csv"


Variable               Description
Square_Footage         The size of the house in square feet.
Num_Bedrooms           The number of bedrooms in the house.
Num_Bathrooms          The number of bathrooms in the house.
Year_Built             The year the house was built.
Lot_Size               Size of the lot (in acres).
Garage_Size            The number of cars that can fit in the garage.
Neighborhood_Quality   A rating of the neighborhood’s quality on a scale of 1-10, where 10 indicates a high-quality neighborhood.
House_Price            The price of the house.

(a) Build a multiple linear regression model to predict House_Price using the forward selection
method and identify the final model.
(b) Based on the final model selected through forward selection, interpret the model summary
output in R.
(c) Discuss the significance of the selected predictors and their impact on the dependent variable, and evaluate how well the model fits the data.
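
One common way to carry out forward selection in base R is step() with direction = "forward", starting from an intercept-only model. The sketch below is illustrative only. The data frame is a synthetic stand-in with the column names from Table 03 and made-up values, and step() selects terms by AIC, which is an assumption here; the lecture may prescribe a different criterion (for example p-values or adjusted R-squared).

```r
# Illustrative sketch only: synthetic stand-in, NOT the real house_price.csv.
# For the real data: house <- read.csv("house_price.csv")
set.seed(3)
n <- 200
house <- data.frame(
  Square_Footage       = runif(n, 500, 5000),
  Num_Bedrooms         = sample(1:5, n, replace = TRUE),
  Num_Bathrooms        = sample(1:4, n, replace = TRUE),
  Year_Built           = sample(1950:2022, n, replace = TRUE),
  Lot_Size             = runif(n, 0.5, 5),
  Garage_Size          = sample(0:2, n, replace = TRUE),
  Neighborhood_Quality = sample(1:10, n, replace = TRUE)
)
# Made-up response, used only so the example runs
house$House_Price <- 50000 + 200 * house$Square_Footage +
  10000 * house$Num_Bedrooms + rnorm(n, 0, 20000)

# Start from the intercept-only model; the full model defines the upper scope
null_model <- lm(House_Price ~ 1, data = house)
full_model <- lm(House_Price ~ ., data = house)

# Forward selection: add one predictor at a time while AIC improves
forward_fit <- step(null_model,
                    scope = list(lower = formula(null_model),
                                 upper = formula(full_model)),
                    direction = "forward", trace = 0)

formula(forward_fit)   # predictors retained in the final model
summary(forward_fit)   # coefficients, t-tests, R-squared, overall F-test
```

Setting trace = 1 instead of 0 prints each selection step, which may be useful as evidence of the procedure in the report. Parts (b) and (c) would then draw on the coefficient table, the R-squared values, and the F-statistic in the summary output.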
