DIRE DAWA UNIVERSITY
COLLEGE OF COMPUTATIONAL AND NATURAL SCIENCE
DEPARTMENT OF STATISTICS
ASSIGNMENT OF COMPUTING II
Group Assignment
Name                 ID
1. Bedasa Elemo      1401466
2. Negeraa Abdaata   1402733
3. Bikila Meseret    1401546
Question 1
Overview of Data Structures in R
R provides several types of data structures to store and manipulate data effectively. Here are the main
types:
Vectors:
The most basic data structure in R, which holds elements of the same type (numeric, character, logical,
etc.). They can be created using the c() function.
Matrices:
Two-dimensional arrays that hold elements of the same type. Matrices can be created using the matrix()
function.
Arrays:
Similar to matrices but can have more than two dimensions. They can be created with the array()
function.
Data Frames:
A table-like structure in which each column holds values of a single type, but different columns can
hold different types (numeric, character, etc.). Data frames are created using the data.frame() function
and are widely used for statistical analysis.
Lists:
A collection of objects that can be of different types. Lists can be created using the list() function.
Factors:
Used to represent categorical data and can be ordered or unordered. Factors are created using the
factor() function.
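The six structures above can be sketched in a few lines of base R. The object names and values below are made up for illustration and are not part of the assignment data.

```r
# Illustrative values only; none of these objects come from the assignment data.
v  <- c(2.5, 3.1, 4.8)                      # numeric vector
m  <- matrix(1:6, nrow = 2, ncol = 3)       # 2 x 3 matrix, filled column by column
a  <- array(1:24, dim = c(2, 3, 4))         # 3-dimensional array
df <- data.frame(id  = 1:3,
                 sex = c("F", "M", "F"))    # columns of different types
l  <- list(numbers = v, table = df)         # list holding objects of different types
f  <- factor(c("low", "high", "low"),
             levels = c("low", "high"))     # categorical variable with two levels

# Inspect what was created
class(v); dim(m); dim(a); str(df); names(l); levels(f)
```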
Question 2
A.
> A<-c(3,3,3,4,4,4,5,5,5,6,6,6)
> A<-matrix(A,3,4)
> A
     [,1] [,2] [,3] [,4]
[1,]    3    4    5    6
[2,]    3    4    5    6
[3,]    3    4    5    6
> A<-c(3,4,5,6,3,4,5,6,3,4,5,6)
> A
[1] 3 4 5 6 3 4 5 6 3 4 5 6
> A<-matrix(A,nrow=3,byrow=TRUE)
> A
     [,1] [,2] [,3] [,4]
[1,]    3    4    5    6
[2,]    3    4    5    6
[3,]    3    4    5    6
B.
> B<-c(7,5,6,9,8,7,4,6,3,3,4,6)
> B<-matrix(B,3,4)
> B
     [,1] [,2] [,3] [,4]
[1,]    7    9    4    3
[2,]    5    8    6    4
[3,]    6    7    3    6
> B1<-c(7,9,4,3,5,8,6,4,6,7,3,6)
> B1<-matrix(B1,3,4)
> B1
     [,1] [,2] [,3] [,4]
[1,]    7    3    6    7
[2,]    9    5    4    3
[3,]    4    8    6    6
> B1<-matrix(B1,nrow=3,byrow=TRUE)
> B1
     [,1] [,2] [,3] [,4]
[1,]    7    9    4    3
[2,]    5    8    6    4
[3,]    6    7    3    6
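The difference between the two B1 results comes from the fill order of matrix(). The sketch below uses the same twelve values; the names x, by_col, by_row and via_t are chosen here for illustration only.

```r
x <- c(7, 9, 4, 3, 5, 8, 6, 4, 6, 7, 3, 6)   # values listed row by row

by_col <- matrix(x, nrow = 3, ncol = 4)                 # default: fill down the columns
by_row <- matrix(x, nrow = 3, ncol = 4, byrow = TRUE)   # fill across the rows
via_t  <- t(matrix(x, nrow = 4, ncol = 3))              # same layout via a transpose

identical(by_row, via_t)   # both routes give the same matrix
```

When the values are typed row by row, byrow = TRUE is usually the less error-prone choice.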
C.
> B<-B[c(1,3),c(1,4)]
> B
     [,1] [,2]
[1,]    7    3
[2,]    6    6
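The same sub-matrix can be reached in more than one way. The following sketch rebuilds the full 3 x 4 matrix under the same name B and stores the results in new objects (corners, same and row1 are illustrative names), so the original matrix is not overwritten.

```r
B <- matrix(c(7, 5, 6, 9, 8, 7, 4, 6, 3, 3, 4, 6), nrow = 3, ncol = 4)

corners <- B[c(1, 3), c(1, 4)]     # keep rows 1 and 3, columns 1 and 4
same    <- B[-2, -c(2, 3)]         # drop row 2 and columns 2 and 3 instead
row1    <- B[1, , drop = FALSE]    # drop = FALSE keeps a 1 x 4 matrix, not a vector

identical(corners, same)
dim(row1)
```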
D.
> D<-c(3:6)
> D<-matrix(D,2,2)
> D
     [,1] [,2]
[1,]    3    5
[2,]    4    6
> solve(D)
     [,1] [,2]
[1,]   -3  2.5
[2,]    2 -1.5
> t(D)
     [,1] [,2]
[1,]    3    4
[2,]    5    6
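As a check on the inverse, a matrix multiplied by its inverse should give the identity matrix. A minimal sketch, where D_inv is a name introduced here for illustration:

```r
D <- matrix(3:6, nrow = 2, ncol = 2)
D_inv <- solve(D)

det(D)                             # non-zero determinant, so D is invertible
all.equal(D %*% D_inv, diag(2))    # compares with the identity up to rounding error
```

all.equal() is used instead of == because floating-point arithmetic can leave tiny rounding differences.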
Question 3
A.
> data <- data.frame(
+ ID = 1:20,
+ Sex = c("Female", "Male", "Female", "Male", "Male",
+ "Male", "Female", "Male", "Female", "Male",
+ "Female", "Male", "Male", "Female", "Female",
+ "Female", "Male", "Female", "Female", "Male"),
+ Weight = c(61, 55, 49, 58, 60,
+ 56, 47, 55, 54, 52,
+ 59, 56, 62, 48, 51,
+ 50, 64, 55, 56, 65),
+ Height = c(1.55, 1.70, 1.65, 1.65, 1.74,
+ 1.80, 1.52, 1.56, 1.57, 1.55,
+ 1.60, 1.59, 1.68, 1.52, 1.54,
+ 1.55, 1.70, 1.62, 1.66, 1.70),
+ Age = c(25, 21, 22, 21, 20,
+ 21, 22, 22, 20, 21,
+ 22, 21, 20, 20, 21,
+ 21, 22, 21, 20, 22),
+ Department = c("Marketing management", "Statistics", "Mathematics", "IT", "Civil",
+ "Statistics", "History", "Geography", "Biology", "Management",
+ "Accounting", "Surveying", "Economics", "Accounting", "Banking",
+ "Chemistry", "Mechanical", "Amharic", "Civics", "English"),
+ Income_Status = c(3, 2, 1, 3, 3,
+ 1, 1, 2, 2, 1,
+ 3, 1, 1, 2, 2,
+ 2, 1, 3, 3, 3)
+ )
> data
   ID    Sex Weight Height Age           Department Income_Status
1   1 Female     61   1.55  25 Marketing management             3
2   2   Male     55   1.70  21           Statistics             2
3   3 Female     49   1.65  22          Mathematics             1
4   4   Male     58   1.65  21                   IT             3
5   5   Male     60   1.74  20                Civil             3
6   6   Male     56   1.80  21           Statistics             1
7   7 Female     47   1.52  22              History             1
8   8   Male     55   1.56  22            Geography             2
9   9 Female     54   1.57  20              Biology             2
10 10   Male     52   1.55  21           Management             1
11 11 Female     59   1.60  22           Accounting             3
12 12   Male     56   1.59  21            Surveying             1
13 13   Male     62   1.68  20            Economics             1
14 14 Female     48   1.52  20           Accounting             2
15 15 Female     51   1.54  21              Banking             2
16 16 Female     50   1.55  21            Chemistry             2
17 17   Male     64   1.70  22           Mechanical             1
18 18 Female     55   1.62  21              Amharic             3
19 19 Female     56   1.66  20               Civics             3
20 20   Male     65   1.70  22              English             3
B.
> CGPA<-c(3.60,2.56,2.45,3.89,3.10,3.20,2.57,2.89,2.02,2.28,2.87,2.79,2.45,3.12,3.65,2.65,2.89,2.98,
+ 3.98,3.47)
> CGPA
 [1] 3.60 2.56 2.45 3.89 3.10 3.20 2.57 2.89 2.02 2.28 2.87 2.79 2.45 3.12 3.65
[16] 2.65 2.89 2.98 3.98 3.47
> data<-cbind(data,CGPA)
> data
   ID    Sex Weight Height Age           Department Income_Status CGPA
1   1 Female     61   1.55  25 Marketing management             3 3.60
2   2   Male     55   1.70  21           Statistics             2 2.56
3   3 Female     49   1.65  22          Mathematics             1 2.45
4   4   Male     58   1.65  21                   IT             3 3.89
5   5   Male     60   1.74  20                Civil             3 3.10
6   6   Male     56   1.80  21           Statistics             1 3.20
7   7 Female     47   1.52  22              History             1 2.57
8   8   Male     55   1.56  22            Geography             2 2.89
9   9 Female     54   1.57  20              Biology             2 2.02
10 10   Male     52   1.55  21           Management             1 2.28
11 11 Female     59   1.60  22           Accounting             3 2.87
12 12   Male     56   1.59  21            Surveying             1 2.79
13 13   Male     62   1.68  20            Economics             1 2.45
14 14 Female     48   1.52  20           Accounting             2 3.12
15 15 Female     51   1.54  21              Banking             2 3.65
16 16 Female     50   1.55  21            Chemistry             2 2.65
17 17   Male     64   1.70  22           Mechanical             1 2.89
18 18 Female     55   1.62  21              Amharic             3 2.98
19 19 Female     56   1.66  20               Civics             3 3.98
20 20   Male     65   1.70  22              English             3 3.47
C.
> data$Income_Status<-c("low","medium","high")[data$Income_Status]
> N1<-data.frame(ID=21,Sex="Female",Weight=55,Height=1.68,Age=20,Department="electrical",
+ Income_Status="high",CGPA=2.8)
> data<-rbind(data,N1)
> data
   ID    Sex Weight Height Age           Department Income_Status CGPA
1   1 Female     61   1.55  25 Marketing management          high 3.60
2   2   Male     55   1.70  21           Statistics        medium 2.56
3   3 Female     49   1.65  22          Mathematics           low 2.45
4   4   Male     58   1.65  21                   IT          high 3.89
5   5   Male     60   1.74  20                Civil          high 3.10
6   6   Male     56   1.80  21           Statistics           low 3.20
7   7 Female     47   1.52  22              History           low 2.57
8   8   Male     55   1.56  22            Geography        medium 2.89
9   9 Female     54   1.57  20              Biology        medium 2.02
10 10   Male     52   1.55  21           Management           low 2.28
11 11 Female     59   1.60  22           Accounting          high 2.87
12 12   Male     56   1.59  21            Surveying           low 2.79
13 13   Male     62   1.68  20            Economics           low 2.45
14 14 Female     48   1.52  20           Accounting        medium 3.12
15 15 Female     51   1.54  21              Banking        medium 3.65
16 16 Female     50   1.55  21            Chemistry        medium 2.65
17 17   Male     64   1.70  22           Mechanical           low 2.89
18 18 Female     55   1.62  21              Amharic          high 2.98
19 19 Female     56   1.66  20               Civics          high 3.98
20 20   Male     65   1.70  22              English          high 3.47
21 21 Female     55   1.68  20           electrical          high 2.80
D.
> data$Sex<-factor(data$Sex, levels=c("Female","Male"),labels=c(1,0))
> data
   ID Sex Weight Height Age           Department Income_Status CGPA
1   1   1     61   1.55  25 Marketing management          high 3.60
2   2   0     55   1.70  21           Statistics        medium 2.56
3   3   1     49   1.65  22          Mathematics           low 2.45
4   4   0     58   1.65  21                   IT          high 3.89
5   5   0     60   1.74  20                Civil          high 3.10
6   6   0     56   1.80  21           Statistics           low 3.20
7   7   1     47   1.52  22              History           low 2.57
8   8   0     55   1.56  22            Geography        medium 2.89
9   9   1     54   1.57  20              Biology        medium 2.02
10 10   0     52   1.55  21           Management           low 2.28
11 11   1     59   1.60  22           Accounting          high 2.87
12 12   0     56   1.59  21            Surveying           low 2.79
13 13   0     62   1.68  20            Economics           low 2.45
14 14   1     48   1.52  20           Accounting        medium 3.12
15 15   1     51   1.54  21              Banking        medium 3.65
16 16   1     50   1.55  21            Chemistry        medium 2.65
17 17   0     64   1.70  22           Mechanical           low 2.89
18 18   1     55   1.62  21              Amharic          high 2.98
19 19   1     56   1.66  20               Civics          high 3.98
20 20   0     65   1.70  22              English          high 3.47
21 21   1     55   1.68  20           electrical          high 2.80
> data$Income_Status<-ordered(data$Income_Status, levels=c("low","medium","high"),labels=c(1,2,3))
> data
   ID Sex Weight Height Age           Department Income_Status CGPA
1   1   1     61   1.55  25 Marketing management             3 3.60
2   2   0     55   1.70  21           Statistics             2 2.56
3   3   1     49   1.65  22          Mathematics             1 2.45
4   4   0     58   1.65  21                   IT             3 3.89
5   5   0     60   1.74  20                Civil             3 3.10
6   6   0     56   1.80  21           Statistics             1 3.20
7   7   1     47   1.52  22              History             1 2.57
8   8   0     55   1.56  22            Geography             2 2.89
9   9   1     54   1.57  20              Biology             2 2.02
10 10   0     52   1.55  21           Management             1 2.28
11 11   1     59   1.60  22           Accounting             3 2.87
12 12   0     56   1.59  21            Surveying             1 2.79
13 13   0     62   1.68  20            Economics             1 2.45
14 14   1     48   1.52  20           Accounting             2 3.12
15 15   1     51   1.54  21              Banking             2 3.65
16 16   1     50   1.55  21            Chemistry             2 2.65
17 17   0     64   1.70  22           Mechanical             1 2.89
18 18   1     55   1.62  21              Amharic             3 2.98
19 19   1     56   1.66  20               Civics             3 3.98
20 20   0     65   1.70  22              English             3 3.47
21 21   1     55   1.68  20           electrical             3 2.80
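Recoding with factor() and ordered() stores the labels as text levels, not as numbers. The sketch below uses three made-up values per variable (sex, sex_f and income are illustrative names) to show how to get numbers back and how an ordered factor can be compared.

```r
sex   <- c("Female", "Male", "Female")          # illustrative values
sex_f <- factor(sex, levels = c("Female", "Male"), labels = c(1, 0))

levels(sex_f)                       # the labels are stored as character levels
as.integer(sex_f)                   # internal position codes, not the labels
as.numeric(as.character(sex_f))     # the labels converted back to numbers

income <- ordered(c("high", "low", "medium"),
                  levels = c("low", "medium", "high"))
income < "high"                     # comparisons follow the level order
```

Converting through as.character() first is the safe route when the labels themselves are needed as numbers.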
E.
> data$AgeCategory<-ifelse(data$Age<21,"Age.Cat1", ifelse(data$Age==21,"Age.Cat2","Age.Cat3"))
> data
   ID    Sex Weight Height Age           Department Income_Status AgeCategory
1   1 Female     61   1.55  25 Marketing management             3    Age.Cat3
2   2   Male     55   1.70  21           Statistics             2    Age.Cat2
3   3 Female     49   1.65  22          Mathematics             1    Age.Cat3
4   4   Male     58   1.65  21                   IT             3    Age.Cat2
5   5   Male     60   1.74  20                Civil             3    Age.Cat1
6   6   Male     56   1.80  21           Statistics             1    Age.Cat2
7   7 Female     47   1.52  22              History             1    Age.Cat3
8   8   Male     55   1.56  22            Geography             2    Age.Cat3
9   9 Female     54   1.57  20              Biology             2    Age.Cat1
10 10   Male     52   1.55  21           Management             1    Age.Cat2
11 11 Female     59   1.60  22           Accounting             3    Age.Cat3
12 12   Male     56   1.59  21            Surveying             1    Age.Cat2
13 13   Male     62   1.68  20            Economics             1    Age.Cat1
14 14 Female     48   1.52  20           Accounting             2    Age.Cat1
15 15 Female     51   1.54  21              Banking             2    Age.Cat2
16 16 Female     50   1.55  21            Chemistry             2    Age.Cat2
17 17   Male     64   1.70  22           Mechanical             1    Age.Cat3
18 18 Female     55   1.62  21              Amharic             3    Age.Cat2
19 19 Female     56   1.66  20               Civics             3    Age.Cat1
20 20   Male     65   1.70  22              English             3    Age.Cat3
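Nested ifelse() calls can also be replaced by cut(). The sketch below assumes the three categories are meant to be age 20 or below, exactly 21, and 22 or above; these cut points are an assumption, and the vector age holds only the first five ages for illustration.

```r
age <- c(25, 21, 22, 21, 20)   # first five ages from the data set

# Assumed intervals: (-Inf, 20], (20, 21], (21, Inf]
age_cat <- cut(age,
               breaks = c(-Inf, 20, 21, Inf),
               labels = c("Age.Cat1", "Age.Cat2", "Age.Cat3"))
as.character(age_cat)
```

cut() returns a factor, and every value falls into exactly one interval, so no branch can be skipped by accident.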
F.
i.
> I<-data.frame(ID=1:5,Sex=c("Female","Male","Male","Female","Female"),Weight=c(40,65,63,55,53),
+ Height=c(1.52,1.65,1.68,1.45,1.63),Age=c(20,24,25,23,22),
+ Department=c("Statistics","Statistics","Statistics","Statistics","Statistics"),
+ Income_Status=c(1,2,3,1,2))
> I
  ID    Sex Weight Height Age Department Income_Status
1  1 Female     40   1.52  20 Statistics             1
2  2   Male     65   1.65  24 Statistics             2
3  3   Male     63   1.68  25 Statistics             3
4  4 Female     55   1.45  23 Statistics             1
5  5 Female     53   1.63  22 Statistics             2
ii.
> d<-rbind(data, I)
> d
   ID    Sex Weight Height Age           Department Income_Status
1   1 Female     61   1.55  25 Marketing management             3
2   2   Male     55   1.70  21           Statistics             2
3   3 Female     49   1.65  22          Mathematics             1
4   4   Male     58   1.65  21                   IT             3
5   5   Male     60   1.74  20                Civil             3
6   6   Male     56   1.80  21           Statistics             1
7   7 Female     47   1.52  22              History             1
8   8   Male     55   1.56  22            Geography             2
9   9 Female     54   1.57  20              Biology             2
10 10   Male     52   1.55  21           Management             1
11 11 Female     59   1.60  22           Accounting             3
12 12   Male     56   1.59  21            Surveying             1
13 13   Male     62   1.68  20            Economics             1
14 14 Female     48   1.52  20           Accounting             2
15 15 Female     51   1.54  21              Banking             2
16 16 Female     50   1.55  21            Chemistry             2
17 17   Male     64   1.70  22           Mechanical             1
18 18 Female     55   1.62  21              Amharic             3
19 19 Female     56   1.66  20               Civics             3
20 20   Male     65   1.70  22              English             3
21  1 Female     40   1.52  20           Statistics             1
22  2   Male     65   1.65  24           Statistics             2
23  3   Male     63   1.68  25           Statistics             3
24  4 Female     55   1.45  23           Statistics             1
25  5 Female     53   1.63  22           Statistics             2
G.
> ht<-head(data)
> ht
  ID    Sex Weight Height Age           Department Income_Status
1  1 Female     61   1.55  25 Marketing management             3
2  2   Male     55   1.70  21           Statistics             2
3  3 Female     49   1.65  22          Mathematics             1
4  4   Male     58   1.65  21                   IT             3
5  5   Male     60   1.74  20                Civil             3
6  6   Male     56   1.80  21           Statistics             1
> th<-tail(data, n=2)
> th
   ID    Sex Weight Height Age Department Income_Status
19 19 Female     56   1.66  20     Civics             3
20 20   Male     65   1.70  22    English             3
Interpretation
- Weight:
The mean weight (55.65 kg) is close to the median (55.5 kg), which indicates a roughly symmetric
distribution. Weights range from 47 kg to 65 kg, and no value lies far from the rest, so there are no
extreme outliers.
- Height:
The mean height (about 1.62 m) is slightly higher than the median (1.61 m), which points to a slight
right skew. Heights range from 1.52 m to 1.80 m, so most individuals fall within a narrow band around
the mean.
- Age:
The mean age is 21.25 years with a standard deviation of about 1.2 years. All individuals are between 20
and 22 years old except one 25-year-old, so the group is clustered in a narrow age range.
- Sex Distribution:
The original 20 records contain 10 females and 10 males. This balance is useful for analyses that
explore gender differences.
- Income Status:
The three income categories are almost evenly represented: 7 individuals in category 1 (low), 6 in
category 2 (medium) and 7 in category 3 (high). No single income level dominates the sample, which
indicates diversity in income levels.
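The figures behind this interpretation can be checked with base R. The sketch below re-enters the Weight, Height, Age, Sex and Income_Status columns of the original 20 records as plain vectors; the vector names are chosen here for illustration.

```r
weight <- c(61, 55, 49, 58, 60, 56, 47, 55, 54, 52,
            59, 56, 62, 48, 51, 50, 64, 55, 56, 65)
height <- c(1.55, 1.70, 1.65, 1.65, 1.74, 1.80, 1.52, 1.56, 1.57, 1.55,
            1.60, 1.59, 1.68, 1.52, 1.54, 1.55, 1.70, 1.62, 1.66, 1.70)
age    <- c(25, 21, 22, 21, 20, 21, 22, 22, 20, 21,
            22, 21, 20, 20, 21, 21, 22, 21, 20, 22)
sex    <- c("Female", "Male", "Female", "Male", "Male",
            "Male", "Female", "Male", "Female", "Male",
            "Female", "Male", "Male", "Female", "Female",
            "Female", "Male", "Female", "Female", "Male")
income <- c(3, 2, 1, 3, 3, 1, 1, 2, 2, 1,
            3, 1, 1, 2, 2, 2, 1, 3, 3, 3)

summary(weight)                                    # min, quartiles, mean, max
c(mean = mean(height), median = median(height), sd = sd(height))
c(mean = mean(age), sd = sd(age))
table(sex)                                         # counts per sex
table(income)                                      # counts per income category
```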
Conclusion
Overall, this dataset represents a group of individuals with a balanced gender distribution, a narrow age
range, and some diversity in weight, height and income. It can be a useful dataset for further analysis,
such as exploring relationships between these variables or deeper demographic studies based on
department or income status.