
ML (Multi-Linear Regression)

The document outlines a Python program that implements a multiple linear regression model using the '50_Startups' dataset. It includes data preprocessing steps such as label encoding and one-hot encoding for categorical variables, followed by splitting the dataset into training and testing sets. The model is trained and evaluated, with the train and test scores printed to assess performance.

Uploaded by

kjasus520

Program:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer

# Load dataset
dataset = pd.read_csv('50_Startups.csv')
print(dataset.head())

# Independent variables: every column except the last (Profit)
x = dataset.iloc[:, :-1].values

# Dependent variable: the last column (Profit)
y = dataset.iloc[:, -1].values
# Categorical data: one-hot encode the State column (index 3).
# drop='first' omits one dummy column, which avoids the dummy variable trap,
# so no column needs to be sliced off by hand afterwards.
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(drop='first'), [3])],
                       remainder='passthrough')
x = np.array(ct.fit_transform(x), dtype=float)

# Split into training (80%) and test (20%) sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

# Fit the multiple linear regression model to the training set
regressor = LinearRegression()
regressor.fit(x_train, y_train)

# Predict on both sets
y_train_pred = regressor.predict(x_train)
y_test_pred = regressor.predict(x_test)

# score() returns the coefficient of determination (R^2)
print("Train Score:", regressor.score(x_train, y_train))
print("Test Score:", regressor.score(x_test, y_test))
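
The two scores printed by the program come from `LinearRegression.score`, which returns the coefficient of determination, R^2 = 1 - SS_res / SS_tot. As a rough illustration of what that number means, the sketch below computes R^2 and the mean squared error in plain Python. The helper names `mse` and `r2` and the sample values are hypothetical; they are not results from the 50_Startups data.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical values, for illustration only (not taken from the dataset)
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]

print("MSE:", mse(y_true, y_pred))
print("R^2:", r2(y_true, y_pred))
```

In the program itself, the already imported `mean_squared_error(y_test, y_test_pred)` and `r2_score(y_test, y_test_pred)` from `sklearn.metrics` compute the same quantities on the real predictions.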

Dataset:
R&D Spend,Administration,Marketing Spend,State,Profit
165349.2,136897.8,471784.1,New York,192261.8
162597.7,151377.6,443898.5,California,191792.1
153441.5,101145.6,407934.5,Florida,191050.4
144372.4,118671.9,383199.6,New York,182902
142107.3,91391.77,366168.4,Florida,166187.9
131876.9,99814.71,362861.4,New York,156991.1
134615.5,147198.9,127716.8,California,156122.5
130298.1,145530.1,323876.7,Florida,155752.6
120542.5,148719,311613.3,New York,152211.8
123334.9,108679.2,304981.6,California,149760
101913.1,110594.1,229161,Florida,146122
100672,91790.61,249744.6,California,144259.4
93863.75,127320.4,249839.4,Florida,141585.5
91992.39,135495.1,252664.9,California,134307.4
119943.2,156547.4,256512.9,Florida,132602.7
114523.6,122616.8,261776.2,New York,129917
78013.11,121597.6,264346.1,California,126992.9
94657.16,145077.6,282574.3,New York,125370.4
91749.16,114175.8,294919.6,Florida,124266.9
86419.7,153514.1,0,New York,122776.9
76253.86,113867.3,298664.5,California,118474
78389.47,153773.4,299737.3,New York,111313
73994.56,122782.8,303319.3,Florida,110352.3
67532.53,105751,304768.7,Florida,108734
77044.01,99281.34,140574.8,New York,108552
64664.71,139553.2,137962.6,California,107404.3
75328.87,144136,134050.1,Florida,105733.5
72107.6,127864.6,353183.8,New York,105008.3
66051.52,182645.6,118148.2,Florida,103282.4
65605.48,153032.1,107138.4,New York,101004.6
61994.48,115641.3,91131.24,Florida,99937.59
61136.38,152701.9,88218.23,New York,97483.56
63408.86,129219.6,46085.25,California,97427.84
55493.95,103057.5,214634.8,Florida,96778.92
46426.07,157693.9,210797.7,California,96712.8
46014.02,85047.44,205517.6,New York,96479.51
28663.76,127056.2,201126.8,Florida,90708.19
44069.95,51283.14,197029.4,California,89949.14
20229.59,65947.93,185265.1,New York,81229.06
38558.51,82982.09,174999.3,California,81005.76
28754.33,118546.1,172795.7,California,78239.91
27892.92,84710.77,164470.7,Florida,77798.83
23640.93,96189.63,148001.1,California,71498.49
15505.73,127382.3,35534.17,New York,69758.98
22177.74,154806.1,28334.72,California,65200.33
1000.23,124153,1903.93,New York,64926.08
1315.46,115816.2,297114.5,Florida,49490.75
0,135426.9,0,California,42559.73
542.05,51743.15,0,New York,35673.41
0,116983.8,45173.06,California,14681.4
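
The State column holds three text categories (New York, California, Florida), which have to be turned into numbers before a regression can use them. The following is a minimal plain-Python sketch of one-hot encoding with the first category dropped, roughly what `OneHotEncoder(drop='first')` does in the program. The function name `one_hot_drop_first` is hypothetical, and the categories are sorted alphabetically to mirror scikit-learn's default ordering.

```python
def one_hot_drop_first(values):
    """One-hot encode `values`, omitting the column for the first category."""
    categories = sorted(set(values))   # here: ['California', 'Florida', 'New York']
    kept = categories[1:]              # drop the first category as the baseline
    rows = [[1.0 if v == c else 0.0 for c in kept] for v in values]
    return kept, rows


# State values from the first three rows of the dataset
states = ["New York", "California", "Florida"]
kept, encoded = one_hot_drop_first(states)

print(kept)
for state, row in zip(states, encoded):
    print(state, row)
```

California becomes the all-zero baseline row. Keeping all three dummy columns would make them sum to 1 in every row, which duplicates the model's intercept term; dropping one column avoids that redundancy (the dummy variable trap).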

Output:
