01 Univariate Linear Regression

This document outlines a module for implementing univariate linear regression to predict restaurant profits based on city populations. It includes sections on problem statement, dataset loading, data visualization, cost computation, and gradient descent for parameter optimization. The goal is to create a model that estimates potential profits for new restaurant locations based on population data.

Uploaded by

lakshmipoojaraj

9/7/25, 11:46 PM 01_univariate_linear_regression

01 - Univariate Linear Regression


In this module, you will implement linear regression with one variable to predict
profits for a restaurant franchise.

Outline
1 - Packages
2 - Linear regression with one variable
    2.1 Problem Statement
    2.2 Dataset
    2.3 Refresher on linear regression
    2.4 Compute Cost
        Exercise 1
    2.5 Gradient descent
        Exercise 2
    2.6 Learning parameters using batch gradient descent
        Exercise 3

1 - Packages
First, let's run the cell below to import all the packages that you will need during this
assignment.
numpy is the fundamental package for working with arrays and matrices in Python.
matplotlib is a widely used library for plotting graphs in Python.
utils.py contains helper functions for this assignment. You do not need to
modify code in this file.
In [1]: import numpy as np
import matplotlib.pyplot as plt
from utils import *
import copy
import math
%matplotlib inline

2 - Problem Statement
Suppose you are the CEO of a restaurant franchise and are considering different
cities for opening a new outlet.
You would like to expand your business to cities that may give your restaurant
higher profits.

file:///Users/pokeapokemon/Downloads/01_univariate_linear_regression (1).html 1/13



The chain already has restaurants in various cities and you have data for profits
and populations from the cities.
You also have data on cities that are candidates for a new restaurant.
For these cities, you have the city population.
You will use the data to help you identify which cities may potentially give your
business higher profits.
3 - Dataset
You will start by loading the dataset for this task.
The load_data() function shown below loads the data into the variables
x_train and y_train.
x_train is the population of a city.
y_train is the profit of a restaurant in that city. A negative value for profit
indicates a loss.
Both x_train and y_train are numpy arrays.
In [2]: # load the dataset
x_train, y_train = load_data()

View the variables


Before starting on any task, it is useful to get more familiar with your dataset.
A good place to start is to just print out each variable and see what it contains.
The code below prints the variable x_train and the type of the variable.
In [3]: # print x_train
print("Type of x_train:",type(x_train))
print("First five elements of x_train are:\n", x_train[:5])

Type of x_train: <class 'numpy.ndarray'>
First five elements of x_train are:
[6.1101 5.5277 8.5186 7.0032 5.8598]

x_train is a numpy array that contains decimal values that are all greater than
zero.
These values represent the city population in units of 10,000.
For example, 6.1101 means that the population for that city is 61,101.
Now, let's print y_train
In [4]: # print y_train
print("Type of y_train:",type(y_train))
print("First five elements of y_train are:\n", y_train[:5])


Type of y_train: <class 'numpy.ndarray'>
First five elements of y_train are:
[17.592 9.1302 13.662 11.854 6.8233]

Similarly, y_train is a numpy array that has decimal values, some negative, some
positive.
These represent your restaurant's average monthly profits in each city, in units of
$10,000.
For example, 17.592 represents $175,920 in average monthly profits for that
city.
-2.6807 represents an average monthly loss of $26,807 for that city.
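Because both arrays are stored in units of 10,000, it can help to convert a few entries back to raw figures when sanity-checking the data. The sketch below is only an illustration: `SCALE`, `to_population`, and `to_dollars` are hypothetical names that are not part of the assignment's utils.py, and the sample values are the ones quoted in the text.

```python
# Hypothetical helpers (not part of utils.py) for reading the scaled data.
SCALE = 10_000  # both x_train and y_train are stored in units of 10,000


def to_population(x_scaled):
    """Convert a scaled population value (in 10,000s) to a head count."""
    return round(x_scaled * SCALE)


def to_dollars(y_scaled):
    """Convert a scaled profit value (in $10,000s) to dollars."""
    return round(y_scaled * SCALE, 2)


print(to_population(6.1101))  # 61101
print(to_dollars(17.592))     # 175920.0
print(to_dollars(-2.6807))    # -26807.0 (a negative profit is a loss)
```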
Check the dimensions of your variables
Another useful way to get familiar with your data is to view its dimensions.
Print the shape of x_train and y_train and see how many training examples
you have in your dataset.
In [5]: print ('The shape of x_train is:', x_train.shape)
print ('The shape of y_train is: ', y_train.shape)
print ('Number of training examples (m):', len(x_train))

The shape of x_train is: (97,)
The shape of y_train is: (97,)
Number of training examples (m): 97

The city population array has 97 data points, and the monthly average profits array
also has 97 data points. Both are 1D NumPy arrays.
Visualize your data
It is often useful to understand the data by visualizing it.
For this dataset, you can use a scatter plot to visualize the data, since it has only
two properties to plot (profit and population).
Many other problems that you will encounter in real life have more than two
properties (for example, population, average household income, monthly profits,
monthly sales). When you have more than two properties, you can still use a
scatter plot to see the relationship between each pair of properties.
In [6]: # Create a scatter plot of the data. To change the markers to red "x",
# we used the 'marker' and 'c' parameters
plt.scatter(x_train, y_train, marker='x', c='r')

# Set the title
plt.title("Profits vs. Population per city")
# Set the y-axis label
plt.ylabel('Profit in $10,000')
# Set the x-axis label
plt.xlabel('Population of City in 10,000s')
plt.show()


Your goal is to build a linear regression model to fit this data.


With this model, you can then input a new city's population, and have the model
estimate your restaurant's potential monthly profits for that city.

4 - Refresher on linear regression


In this module, you will fit the linear regression parameters $(w, b)$ to your dataset.

The model function for linear regression, which is a function that maps from $x$
(city population) to $y$ (your restaurant's monthly profit for that city), is
represented as
$$f_{w,b}(x) = wx + b$$

To train a linear regression model, you want to find the best parameters $(w, b)$
that fit your dataset.

To compare how one choice of $(w, b)$ is better or worse than another choice,
you can evaluate it with a cost function $J(w, b)$.
$J$ is a function of $(w, b)$. That is, the value of the cost $J(w, b)$ depends
on the value of $(w, b)$.
The choice of $(w, b)$ that fits your data the best is the one that has the
smallest cost $J(w, b)$.

To find the values $(w, b)$ that give the smallest possible cost $J(w, b)$, you can
use a method called gradient descent.
With each step of gradient descent, your parameters $(w, b)$ come closer to
the optimal values that will achieve the lowest cost $J(w, b)$.

The trained linear regression model can then take the input feature $x$ (city
population) and output a prediction $f_{w,b}(x)$ (predicted monthly profit for a
restaurant in that city).
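To make the idea of comparing parameter choices concrete, here is a minimal sketch. It assumes the squared-error cost that the next section defines, uses only the first five training examples printed earlier as a stand-in for the full dataset, and introduces two hypothetical helpers, `predict` and `squared_error_cost`, that are not part of the assignment. The two candidate parameter pairs are arbitrary.

```python
# First five (population, profit) pairs printed earlier, both in units of 10,000.
x_sample = [6.1101, 5.5277, 8.5186, 7.0032, 5.8598]
y_sample = [17.592, 9.1302, 13.662, 11.854, 6.8233]


def predict(x, w, b):
    """Model function f_wb(x) = w * x + b for a single input x."""
    return w * x + b


def squared_error_cost(xs, ys, w, b):
    """Half the mean squared error between predictions and targets."""
    m = len(xs)
    return sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Two arbitrary candidate parameter choices, picked only for illustration.
cost_a = squared_error_cost(x_sample, y_sample, w=2.0, b=1.0)
cost_b = squared_error_cost(x_sample, y_sample, w=0.0, b=0.0)

# The choice with the smaller cost fits these five points better.
better = "(w=2, b=1)" if cost_a < cost_b else "(w=0, b=0)"
print(f"J(2, 1) = {cost_a:.3f}, J(0, 0) = {cost_b:.3f}, better fit: {better}")
```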

5 - Compute Cost
Gradient descent involves repeated steps to adjust the value of your parameters
$(w, b)$ to gradually get a smaller and smaller cost $J(w, b)$.
At each step of gradient descent, it will be helpful for you to monitor your
progress by computing the cost $J(w, b)$ as $(w, b)$ gets updated.
In this section, you will implement a function to calculate $J(w, b)$ so that you can
check the progress of your gradient descent implementation.


Cost function
As you may recall from the lecture, for one variable, the cost function for linear
regression $J(w, b)$ is defined as
$$J(w,b) = \frac{1}{2m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right)^2$$
You can think of $f_{w,b}(x^{(i)})$ as the model's prediction of your restaurant's profit,
as opposed to $y^{(i)}$, which is the actual profit that is recorded in the data.
$m$ is the number of training examples in the dataset.


Model prediction
For linear regression with one variable, the prediction of the model $f_{w,b}$ for an
example $x^{(i)}$ is represented as:
$$f_{w,b}(x^{(i)}) = wx^{(i)} + b$$
This is the equation for a line, with an intercept $b$ and a slope $w$.

Implementation
Complete the compute_cost() function below to compute the cost $J(w, b)$.

Exercise 1
Complete the compute_cost function below to:

Iterate over the training examples, and for each example, compute:
    The prediction of the model for that example
    $$f_{wb}(x^{(i)}) = wx^{(i)} + b$$
    The cost for that example
    $$cost^{(i)} = (f_{wb} - y^{(i)})^2$$
Return the total cost over all examples
    $$J(w,b) = \frac{1}{2m} \sum_{i=0}^{m-1} cost^{(i)}$$
Here, $m$ is the number of training examples and $\sum$ is the summation
operator.
In [7]: # GRADED FUNCTION: compute_cost

def compute_cost(x, y, w, b):
    """
    Computes the cost function for linear regression.

    Args:
        x (ndarray): Shape (m,) Input to the model (Population of cities)
        y (ndarray): Shape (m,) Label (Actual profits for the cities)
        w, b (scalar): Parameters of the model

    Returns
        total_cost (float): The cost of using w,b as the parameters for l
               to fit the data points in x and y
    """
    # number of training examples
    m = x.shape[0]

    # You need to return this variable correctly
    total_cost = 0

    ### START CODE HERE ###

    for i in range(m):
        f_wb = w * x[i] + b            # prediction for example i
        cost = (f_wb - y[i])**2        # squared error for example i
        total_cost += cost
    total_cost = total_cost / (2 * m)  # divide the summed squared error by 2m

    ### END CODE HERE ###

    return total_cost
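The explicit loop mirrors the formula term by term. As a cross-check, the same quantity can be computed without a loop using NumPy array arithmetic. The sketch below is an assumption-laden illustration rather than part of the graded exercise: `compute_cost_vectorized` is a hypothetical name, and the three-point dataset is made up so the result can be verified by hand.

```python
import numpy as np


def compute_cost_vectorized(x, y, w, b):
    """Vectorized squared-error cost: (1 / 2m) * sum((w*x + b - y)^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    errors = w * x + b - y  # prediction error for every example at once
    return float(np.sum(errors ** 2) / (2 * x.shape[0]))


# Tiny made-up dataset that lies exactly on the line y = 2x + 1.
x_demo = np.array([1.0, 2.0, 3.0])
y_demo = np.array([3.0, 5.0, 7.0])

print(compute_cost_vectorized(x_demo, y_demo, w=2.0, b=1.0))  # 0.0 (perfect fit)
print(compute_cost_vectorized(x_demo, y_demo, w=2.0, b=0.0))  # 0.5 (every error is -1)
```

With `b = 0` each of the three predictions is off by exactly 1, so the sum of squared errors is 3 and the cost is 3 / (2 * 3) = 0.5.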

You can check if your implementation was correct by running the following test code:
In [8]: # Compute cost with some initial values for parameters w, b
initial_w = 2
initial_b = 1

cost = compute_cost(x_train, y_train, initial_w, initial_b)
print(type(cost))
print(f'Cost at initial w: {cost:.3f}')

# Public tests
from public_tests import *
compute_cost_test(compute_cost)

<class 'numpy.float64'>
Cost at initial w: 75.203
All tests passed!

Expected Output:
Cost at initial w: 75.203

6 - Gradient descent
In this section, you will implement the gradient for parameters $w, b$ for linear
regression.
As described in the lecture videos, the gradient descent algorithm is:
repeat until convergence: {
$$b := b - \alpha \frac{\partial J(w,b)}{\partial b}$$
$$w := w - \alpha \frac{\partial J(w,b)}{\partial w} \tag{1}$$
}
where parameters $w, b$ are both updated simultaneously and where
$$\frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right) \tag{2}$$
$$\frac{\partial J(w,b)}{\partial w} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right) x^{(i)} \tag{3}$$
$m$ is the number of training examples in the dataset.
$f_{w,b}(x^{(i)})$ is the model's prediction, while $y^{(i)}$ is the target value.

You will implement a function called compute_gradient which calculates
$\frac{\partial J(w,b)}{\partial w}$ and $\frac{\partial J(w,b)}{\partial b}$.
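For readers who want to see where the two gradient formulas come from, the following short derivation applies the chain rule to the cost function defined in the Compute Cost section, with $f_{w,b}(x^{(i)}) = wx^{(i)} + b$. It is a supplementary sketch, not part of the original module.

$$
\begin{aligned}
J(w,b) &= \frac{1}{2m} \sum_{i=0}^{m-1} \left(wx^{(i)} + b - y^{(i)}\right)^2 \\
\frac{\partial J(w,b)}{\partial w} &= \frac{1}{2m} \sum_{i=0}^{m-1} 2\left(wx^{(i)} + b - y^{(i)}\right) x^{(i)} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right) x^{(i)} \\
\frac{\partial J(w,b)}{\partial b} &= \frac{1}{2m} \sum_{i=0}^{m-1} 2\left(wx^{(i)} + b - y^{(i)}\right) = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right)
\end{aligned}
$$

The factor of 2 from differentiating the square cancels the $\frac{1}{2}$ in the cost, which is why the gradients carry $\frac{1}{m}$ rather than $\frac{1}{2m}$.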

Exercise 2
Complete the compute_gradient function to:

Iterate over the training examples, and for each example, compute:
    The prediction of the model for that example
    $$f_{wb}(x^{(i)}) = wx^{(i)} + b$$
    The gradient for the parameters $w, b$ from that example
    $$\frac{\partial J(w,b)}{\partial b}^{(i)} = \left(f_{w,b}(x^{(i)}) - y^{(i)}\right)$$
    $$\frac{\partial J(w,b)}{\partial w}^{(i)} = \left(f_{w,b}(x^{(i)}) - y^{(i)}\right) x^{(i)}$$
Return the total gradient update from all the examples
    $$\frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \frac{\partial J(w,b)}{\partial b}^{(i)}$$
    $$\frac{\partial J(w,b)}{\partial w} = \frac{1}{m} \sum_{i=0}^{m-1} \frac{\partial J(w,b)}{\partial w}^{(i)}$$
Here, $m$ is the number of training examples and $\sum$ is the summation
operator.
In [9]: # GRADED FUNCTION: compute_gradient
def compute_gradient(x, y, w, b):
    """
    Computes the gradient for linear regression
    Args:
      x (ndarray): Shape (m,) Input to the model (Population of cities)
      y (ndarray): Shape (m,) Label (Actual profits for the cities)
      w, b (scalar): Parameters of the model
    Returns
      dj_dw (scalar): The gradient of the cost w.r.t. the parameter w
      dj_db (scalar): The gradient of the cost w.r.t. the parameter b
    """

    # Number of training examples
    m = x.shape[0]

    # You need to return the following variables correctly
    dj_dw = 0
    dj_db = 0

    ### START CODE HERE ###

    for i in range(m):
        f_wb = w * x[i] + b      # prediction for example i
        err = f_wb - y[i]        # (f_wb - y) for example i
        dj_db += err             # ∂J/∂b contribution
        dj_dw += err * x[i]      # ∂J/∂w contribution

    # divide the accumulated sums by the number of examples
    dj_db /= m
    dj_dw /= m

    ### END CODE HERE ###

    return dj_dw, dj_db
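One generic way to sanity-check an analytic gradient, independent of the public tests, is to compare it with a finite-difference approximation of the cost. The sketch below is only an illustration under stated assumptions: `cost`, `gradient`, and `numerical_gradient` are hypothetical stand-ins written with plain Python lists so the check runs on its own, the step size `eps` is an arbitrary choice, and the data are the first five training examples printed earlier.

```python
# Stand-ins for compute_cost / compute_gradient using plain Python lists.
def cost(xs, ys, w, b):
    m = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient(xs, ys, w, b):
    m = len(xs)
    dj_dw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / m
    dj_db = sum((w * x + b - y) for x, y in zip(xs, ys)) / m
    return dj_dw, dj_db


def numerical_gradient(xs, ys, w, b, eps=1e-4):
    """Central-difference approximation of (dJ/dw, dJ/db)."""
    dj_dw = (cost(xs, ys, w + eps, b) - cost(xs, ys, w - eps, b)) / (2 * eps)
    dj_db = (cost(xs, ys, w, b + eps) - cost(xs, ys, w, b - eps)) / (2 * eps)
    return dj_dw, dj_db


# First five training examples printed earlier (units of 10,000).
x_sample = [6.1101, 5.5277, 8.5186, 7.0032, 5.8598]
y_sample = [17.592, 9.1302, 13.662, 11.854, 6.8233]

analytic = gradient(x_sample, y_sample, w=0.2, b=0.2)
numeric = numerical_gradient(x_sample, y_sample, w=0.2, b=0.2)
max_gap = max(abs(a - n) for a, n in zip(analytic, numeric))

print("analytic:", analytic)
print("numeric: ", numeric)
print("largest difference:", max_gap)
```

Because the cost is quadratic in $w$ and $b$, the central difference agrees with the analytic gradient up to floating-point rounding, so a large gap would point to a bug in one of the two functions.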

Run the cells below to check your implementation of the compute_gradient
function with two different initializations of the parameters $w$, $b$.

In [10]: # Compute and display gradient with w initialized to zeroes


initial_w = 0
initial_b = 0

tmp_dj_dw, tmp_dj_db = compute_gradient(x_train, y_train, initial_w, init


print('Gradient at initial w, b (zeros):', tmp_dj_dw, tmp_dj_db)

compute_gradient_test(compute_gradient)

Gradient at initial w, b (zeros): -65.32884974555672 -5.83913505154639


Using X with shape (4, 1)
All tests passed!

Expected Output:
Gradient at initial w, b (zeros) -65.32884975 -5.83913505154639
Now let's evaluate the gradient at a second, non-zero choice of the parameters.
In [11]: # Compute and display cost and gradient with non-zero w
test_w = 0.2
test_b = 0.2
tmp_dj_dw, tmp_dj_db = compute_gradient(x_train, y_train, test_w, test_b)

print('Gradient at test w, b:', tmp_dj_dw, tmp_dj_db)

Gradient at test w, b: -47.41610118114435 -4.007175051546391

Expected Output:
Gradient at test w, b -47.41610118 -4.007175051546391

2.6 Learning parameters using batch gradient descent


You will now find the optimal parameters of a linear regression model by using batch
gradient descent.
A good way to verify that gradient descent is working correctly is to look at the
value of $J(w, b)$ and check that it is decreasing with each step.
Assuming you have implemented the gradient and computed the cost correctly
and you have an appropriate value for the learning rate alpha, $J(w, b)$ should
never increase and should converge to a steady value by the end of the
algorithm.
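The phrase "an appropriate value for the learning rate" matters in practice. The sketch below illustrates it on a made-up three-point dataset (points on the line y = 2x + 1), not on the restaurant data. The function name `run_gradient_descent` and the two alpha values are assumptions chosen for illustration: with the smaller step the cost falls on every iteration, while with the larger step the updates overshoot and the cost grows.

```python
def run_gradient_descent(xs, ys, alpha, num_iters):
    """Return the cost after each of num_iters batch gradient steps from w = b = 0."""
    m = len(xs)
    w, b = 0.0, 0.0
    history = []
    for _ in range(num_iters):
        # Gradients computed from the current parameters.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        dj_dw = sum(e * x for e, x in zip(errors, xs)) / m
        dj_db = sum(errors) / m
        # Simultaneous update of both parameters.
        w, b = w - alpha * dj_dw, b - alpha * dj_db
        # Cost at the updated parameters.
        new_errors = [w * x + b - y for x, y in zip(xs, ys)]
        history.append(sum(e ** 2 for e in new_errors) / (2 * m))
    return history


# Made-up data on the line y = 2x + 1.
x_toy = [1.0, 2.0, 3.0]
y_toy = [3.0, 5.0, 7.0]

small_step = run_gradient_descent(x_toy, y_toy, alpha=0.1, num_iters=20)
large_step = run_gradient_descent(x_toy, y_toy, alpha=0.5, num_iters=20)

decreasing = all(later < earlier for earlier, later in zip(small_step, small_step[1:]))
print("alpha=0.1: cost decreases at every step:", decreasing)
print("alpha=0.5: first and last cost:", large_step[0], large_step[-1])
```

If the recorded cost ever rises during training, a learning rate that is too large is one of the first things worth checking.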

Exercise 3
Complete the gradient_descent function below to implement the gradient descent
algorithm as given below:
repeat until convergence: {
$$w := w - \alpha \frac{\partial J(w,b)}{\partial w}$$
$$b := b - \alpha \frac{\partial J(w,b)}{\partial b}$$
}
In your implementation, however, the number of iterations of your loop should be
determined by the num_iters parameter that is passed to the function, and not
determined by a convergence test.
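The requirement that both parameters be updated simultaneously is easy to violate by accident. The sketch below contrasts a simultaneous update with a sequential one on a made-up three-point dataset. The helper `gradients`, the toy data, and the learning rate are all assumptions for illustration only.

```python
def gradients(xs, ys, w, b):
    """Return (dJ/dw, dJ/db) for the squared-error cost."""
    m = len(xs)
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    return sum(e * x for e, x in zip(errors, xs)) / m, sum(errors) / m


# Made-up data on the line y = 2x + 1, and an arbitrary learning rate.
x_toy, y_toy, alpha = [1.0, 2.0, 3.0], [3.0, 5.0, 7.0], 0.1

# Simultaneous update: both gradients come from the same (w, b).
w, b = 0.0, 0.0
dj_dw, dj_db = gradients(x_toy, y_toy, w, b)
w_sim, b_sim = w - alpha * dj_dw, b - alpha * dj_db

# Sequential update (not what the algorithm specifies): the gradient for b
# is evaluated after w has already changed.
w, b = 0.0, 0.0
dj_dw, _ = gradients(x_toy, y_toy, w, b)
w_seq = w - alpha * dj_dw
_, dj_db = gradients(x_toy, y_toy, w_seq, b)
b_seq = b - alpha * dj_db

print("simultaneous:", w_sim, b_sim)
print("sequential:  ", w_seq, b_seq)
```

Both variants give the same new `w`, but the sequential version takes a different step for `b` (about 0.273 instead of 0.5 here), so it is no longer following the gradient of the cost at the current point.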


In [12]: # GRADED FUNCTION: gradient_descent


def gradient_descent(x, y, w_in, b_in, cost_function, gradient_function,
    """
    Performs batch gradient descent to learn theta. Updates theta by taki
    num_iters gradient steps with learning rate alpha

    Args:
      x :    (ndarray): Shape (m,)
      y :    (ndarray): Shape (m,)
      w_in, b_in : (scalar) Initial values of parameters of the model
      cost_function: function to compute cost
      gradient_function: function to compute the gradient
      alpha : (float) Learning rate
      num_iters : (int) number of iterations to run gradient descent
    Returns:
      w (scalar): Updated value of parameter after running gradient desce
      b (scalar): Updated value of parameter after running gradient desce
      J_history (List): History of cost values
      p_history (list): History of parameters [w,b]
    """

    ### START CODE HERE ###

    w, b = w_in, b_in
    J_history = []
    p_history = []

    for i in range(num_iters):
        # compute gradients at current parameters
        dj_dw, dj_db = gradient_function(x, y, w, b)

        # parameter update (both gradients were computed before either changes)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db

        # track cost and parameters
        J_history.append(cost_function(x, y, w, b))
        p_history.append([w, b])

    return w, b, J_history, p_history

    ### END CODE HERE ###

Now let's run the gradient descent algorithm above to learn the parameters for our
dataset.
In [13]: # initialize fitting parameters (w and b are scalars in this module)
initial_w = 0.
initial_b = 0.

# some gradient descent settings
iterations = 1500
alpha = 0.01

w, b, _, _ = gradient_descent(x_train, y_train, initial_w, initial_b,
                              compute_cost, compute_gradient, alpha, iterations)
print("w,b found by gradient descent:", w, b)


w,b found by gradient descent: 1.166362350335582 -3.63029143940436

Expected Output:
w, b found by gradient descent 1.16636235 -3.63029143940436
We will now use the final parameters from gradient descent to plot the linear fit.
Recall that we can get the prediction for a single example as $f(x^{(i)}) = wx^{(i)} + b$.
To calculate the predictions on the entire dataset, we can loop through all the training
examples and calculate the prediction for each example. This is shown in the code
block below.
In [14]: m = x_train.shape[0]
predicted = np.zeros(m)

for i in range(m):
    predicted[i] = w * x_train[i] + b
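Since the inputs are held in a NumPy array, the same predictions can also be computed in one expression, because multiplying an array by a scalar and adding a scalar apply element by element. The sketch below is a self-contained illustration: it hard-codes the `w` and `b` values reported by the gradient descent run and uses only the first five inputs printed earlier (the name `x_sample` is an assumption) instead of the full `x_train`.

```python
import numpy as np

# Parameters reported by the gradient descent run in this module.
w, b = 1.166362350335582, -3.63029143940436

# First five inputs printed earlier, standing in for the full x_train.
x_sample = np.array([6.1101, 5.5277, 8.5186, 7.0032, 5.8598])

# Loop version, as in the cell above.
predicted_loop = np.zeros(x_sample.shape[0])
for i in range(x_sample.shape[0]):
    predicted_loop[i] = w * x_sample[i] + b

# Vectorized version: scalar-array arithmetic is applied to every element.
predicted_vec = w * x_sample + b

print(np.allclose(predicted_loop, predicted_vec))  # True
```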

We will now plot the predicted values to see the linear fit.
In [15]: # Plot the linear fit
plt.plot(x_train, predicted, c = "b")

# Create a scatter plot of the data.
plt.scatter(x_train, y_train, marker='x', c='r')

# Set the title
plt.title("Profits vs. Population per city")
# Set the y-axis label
plt.ylabel('Profit in $10,000')
# Set the x-axis label
plt.xlabel('Population of City in 10,000s')

Out[15]: Text(0.5, 0, 'Population of City in 10,000s')

Your final values of $w, b$ can also be used to make predictions on profits. Let's predict
what the profit would be in areas of 35,000 and 70,000 people.
The model takes in population of a city in 10,000s as input.
Therefore, 35,000 people can be translated into an input to the model as
np.array([3.5])
Similarly, 70,000 people can be translated into an input to the model as
np.array([7.])

In [16]: predict1 = 3.5 * w + b


print('For population = 35,000, we predict a profit of $%.2f' % (predict1

predict2 = 7.0 * w + b
print('For population = 70,000, we predict a profit of $%.2f' % (predict2


For population = 35,000, we predict a profit of $4519.77
For population = 70,000, we predict a profit of $45342.45

Expected Output:
For population = 35,000, we predict a profit of $4519.77
For population = 70,000, we predict a profit of $45342.45
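The two unit conversions (people to 10,000s on the way in, $10,000s to dollars on the way out) can be wrapped in a small helper so they are not repeated by hand. This is a sketch only: `predict_profit_dollars` is a hypothetical name, and `w` and `b` are hard-coded to the values reported by the gradient descent run above.

```python
def predict_profit_dollars(population, w, b):
    """Predict monthly profit in dollars for a city with `population` people."""
    x = population / 10_000      # model input is population in 10,000s
    return (w * x + b) * 10_000  # model output is profit in $10,000s


# Parameters reported by the gradient descent run in this module.
w, b = 1.166362350335582, -3.63029143940436

for population in (35_000, 70_000):
    profit = predict_profit_dollars(population, w, b)
    print(f"For population = {population:,}, we predict a profit of ${profit:.2f}")
```

With these parameters the helper reproduces the expected figures of $4519.77 and $45342.45.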
