
Final Exam

CSED105 Introduction to Artificial Intelligence


Instructor: Jungseul Ok
[email protected]
TAs: Yunjoo Lee, Jeongyeon Hwang, Seockbean Song, Youngjae Kim, Junhyuk So

10:00am - 12:00pm, June 5, 2024

Remarks

• You need to provide proper justification for your answers.

• It may be helpful to know that sufficient blank space is left for your answer to each
problem; of course, you can use more if you need it.

• This is a closed-book exam: you are NOT allowed to consult any material other than
this exam paper. You cannot discuss with other students. Any violation may result not
only in a grade of F but also in a report to the disciplinary committee.

• You can write your answers in English or Korean.

No.    Score                       Comment
1
2      (a) (b) (c) (d) (e) (f)
3      (a) (b) (c)
4      (a) (b) (c)
5      (a) (b) (c)
6      (a) (b)
7      (a) (b) (c) (d)
Total

Name:

Student ID:

1 True/False Questions (2 × 12 = 24 pt)
Check whether the following statements are True or False.

(a) (True / False) Examples of unsupervised learning include clustering, generative models and dimensionality reduction.
True

(b) (True / False) K-means clustering can be used for tasks such as image segmentation.
True

(c) (True / False) Diffusion models can be used to generate images.


True

(d) (True / False) The primary difference between a variational autoencoder (VAE) and a
traditional autoencoder (AE) is that a VAE learns a probabilistic latent space, while an AE
learns a deterministic latent space.
True

(e) (True / False) The goal of reinforcement learning is to learn the optimal policy that maxi-
mizes cumulative reward.
True

(f) (True / False) Reinforcement learning requires labeled data similar to supervised learning.
False

(g) (True / False) In computer vision, multilayer perceptrons are generally superior to convo-
lutional neural networks.
False

(h) (True / False) Recurrent neural networks are employed in natural language processing to
manage inputs of variable length.
True

(i) (True / False) Recurrent neural networks have the advantage of being parallelizable, leading
to faster processing speed.
False

(j) (True / False) Self-attention is a mechanism to compute the relationships between elements
within the same input sequence.
True

(k) (True / False) There have been two AI winters in the history of artificial intelligence research.
True

(l) (True / False) The current state of deep learning is no longer constrained by data.
False

2 K-means Algorithm (2 + 3 + 3 + 2 + 4 + 2 = 16 pt)
Given a dataset D = {x(i)}i=1,2,...,m of m data points in R², we want to partition them into K clusters using the K-means algorithm. Let µk ∈ R² denote a representative point of cluster k ∈ {1, 2, ..., K}. Indicating the assignment of data point x(i) by a one-hot vector r(i) = (r_1^(i), ..., r_K^(i)) ∈ {0, 1}^K such that Σ_{k=1}^K r_k^(i) = 1 and r_k^(i) = 1 only if x(i) is assigned to cluster k, the K-means algorithm aims at finding {r(i)}i=1,...,m and {µk}k=1,...,K from the following optimization problem:

$$\min_{\{r^{(i)}\in\{0,1\}^K\},\,\{\mu_k\in\mathbb{R}^2\}}\ \sum_{i=1}^{m}\sum_{k=1}^{K}\frac{1}{2}\,r_k^{(i)}\,\lVert x^{(i)}-\mu_k\rVert_2^2 \quad\text{subject to}\quad \sum_{k=1}^{K} r_k^{(i)}=1\ \ \forall i\in\{1,\dots,m\}. \qquad (1)$$

To be specific, the K-means algorithm iterates (i) the assignment step to optimize the r(i)'s given the µk's; and (ii) the update step to optimize the µk's given the r(i)'s.

(a) Suppose we want to perform the K-means algorithm for K = 2 with the dataset D and the initialization of the µk's given as follows:

D = {(−1, 2), (0, 2), (1, 2), (3, 2), (5, 2), (−1, 0), (0, 0), (1, 0), (3, 0), (5, 0)} ,  (2)
µ1 = (1, −1) and µ2 = (3, 0) .  (3)

On Figure 1, plot the dataset D in (2) with circles (◦).

Figure 1: Grid for Problem 2(a)

(b) Which of the following corresponds to the µk's that the update step finds?

$$\text{(i)}\ \mu_k=\frac{\sum_i r_k^{(i)}}{\sum_i r_k^{(i)}x^{(i)}} \qquad \text{(ii)}\ \mu_k=\frac{\sum_i x^{(i)}}{\sum_i r_k^{(i)}} \qquad \text{(iii)}\ \mu_k=\frac{\sum_i r_k^{(i)}x^{(i)}}{\sum_i r_k^{(i)}} \qquad \text{(iv)}\ \mu_k=\frac{\sum_i r_k^{(i)}x^{(i)}}{\sum_i x^{(i)}}$$

Describe the meaning of the µk that you selected for the update step.
The update step would find µk in (iii) (+2pt), which is the centroid of the points assigned to cluster k (+1pt).

(c) Given the initialization in (3), compute the values of Σi r_1^(i) and Σi r_2^(i) after the first assignment step.
5, 5 (+1.5pt each)

(d) Given the initialization in (3), find the µ1 and µ2 that the K-means algorithm would converge to.
{(0, 1), (4, 1)} (+1pt each)

(e) Compute the values of the loss function in (1) for (i) the µk's in (3); and (ii) those you found in Problem 2(d).
(i) 27.5 (ii) 9 (+2pt each; -1pt for a minor mistake, e.g., square rooting, wrong addition, ...)
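For reference, the numbers in (c)-(e) can be reproduced with a short script. Below is a minimal sketch of the assignment and update steps on the data in (2) with the initialization in (3); it is an illustration, not part of the exam, and the helper names are ours.

import numpy as np

# Dataset D from (2) and initialization of mu_1, mu_2 from (3).
X = np.array([(-1, 2), (0, 2), (1, 2), (3, 2), (5, 2),
              (-1, 0), (0, 0), (1, 0), (3, 0), (5, 0)], dtype=float)
mu = np.array([(1, -1), (3, 0)], dtype=float)

def assign_step(X, mu):
    # Assignment step: each point goes to its nearest centroid.
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def loss(X, mu, a):
    # Objective (1): one half of the sum of squared distances.
    return 0.5 * np.sum((X - mu[a]) ** 2)

a = assign_step(X, mu)
print(np.bincount(a))   # first assignment: [5 5], as in (c)
print(loss(X, mu, a))   # loss with the mu's from (3): 27.5, as in (e)(i)

for _ in range(10):     # iterate the two steps until convergence
    mu = np.array([X[a == k].mean(axis=0) for k in range(2)])  # update step
    a = assign_step(X, mu)

print(mu)               # converged centroids (0, 1) and (4, 1), as in (d)
print(loss(X, mu, a))   # final loss: 9.0, as in (e)(ii)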

(f) Describe two examples where the K-means algorithm can be applied.
Good example with appropriate reason (segmentation, recommendation, ...) (+1pt each)

3 Variational Auto-Encoder (VAE) (3 + 3 + 3 = 9 pt)
Consider the MNIST dataset D = {x(i)}i=1,...,m consisting of gray-scale images of handwritten digits, and a VAE model consisting of an encoder f and a decoder g, which are multilayer perceptrons. To be specific, in training, we encode x(i) using f(x(i)) = (µ(i), (σ(i))²) to generate a 3-dimensional random latent code z(i) ∼ N(µ(i), (σ(i))²), and decode z(i) to reconstruct x(i) using g(z(i)) = x̂(i), where our learning objective can be informally described as follows:

$$\min_{f,g}\ \frac{1}{m}\sum_{i=1}^{m}\underbrace{\lVert x^{(i)}-g(z^{(i)})\rVert^2}_{\text{reconstruction error}}+\lambda\,\underbrace{\mathrm{KL}\big(\mathcal{N}(\mu^{(i)},(\sigma^{(i)})^2)\,\big\Vert\,\mathcal{N}(0,I)\big)}_{\text{regularization}} \qquad (4)$$
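For concreteness, the objective in (4) can be written as a minimal numpy sketch, assuming a diagonal Gaussian encoder so that the KL term takes its standard closed form ½ Σj (µj² + σj² − log σj² − 1); the function below is an illustration, not the exam's prescribed implementation.

import numpy as np

def vae_loss(x, x_hat, mu, sigma, lam=1.0):
    # Reconstruction error: squared distance between input and reconstruction.
    recon = np.sum((x - x_hat) ** 2)
    # KL(N(mu, diag(sigma^2)) || N(0, I)), closed form for diagonal Gaussians.
    kl = 0.5 * np.sum(mu ** 2 + sigma ** 2 - np.log(sigma ** 2) - 1.0)
    return recon + lam * kl

# In training, z is sampled with the reparameterization trick,
# z = mu + sigma * eps with eps ~ N(0, I), so gradients can reach f.
rng = np.random.default_rng(0)
mu, sigma = np.zeros(3), np.ones(3)   # 3-dimensional latent code, as above
z = mu + sigma * rng.standard_normal(3)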

(a) Figure 2 visualizes the latent space when the model is trained with λ values of 0 and 1. Identify the λ value corresponding to each latent space: (left): λ = ___ , (right): λ = ___
(left): λ = 1, (right): λ = 0 (3pt for the correct answer; 0pt if any part is wrong, e.g., (left) = 1, (right) = 1)

Figure 2: Visualizations of latent space

(b) Describe the meanings of the reconstruction error term and the regularization term in (4), respectively.
The reconstruction error term measures the discrepancy between the original x(i) and the reconstructed x̂(i) = g(z(i)), and the regularization term measures the distance between the distribution of z(i) and the zero-mean normal distribution N(0, I).
Two correct descriptions: (3pt)
One correct description: (1.5pt)
Note: A weak or ambiguous description is not accepted.

(c) Figure 3 visualizes two images of 4: x1 and x2. Considering that N(f(x); 0, I) models the likelihood of data x, compare the likelihoods N(f(x1); 0, I) and N(f(x2); 0, I). Justify your answer briefly.

Figure 3: Two images of digit 4: (a) x1, (b) x2

N(f(x1); 0, I) > N(f(x2); 0, I), since x2 is an outlier.


Correct answer and justification: (3pt)
Correct answer and no or wrong justification: (1pt)
Wrong answer regardless of justification: (0pt)

4 Markov Property (3 + 3 + 3 = 9 pt)
Consider a reinforcement learning task, where a robot with a local view is looking for the shortest path in the maze from start to end. At each time t = 0, 1, ..., the robot observes local view ot, and selects an action at ∈ {N(orth), E(ast), S(outh)} to move to one of the neighboring white tiles in Figure 4. The robot can perceive one of the seven observations in Figure 5. To solve this problem of finding the shortest path, we give reward rt+1 = +1 only when the robot reaches the end, and −1 otherwise.

Figure 4: An example sequence of (ot , at , rt+1 , ot+1 ).

Figure 5: All possible observations.


(a) State the definition of Markov property.
The Markov property states that the next state of a stochastic process depends only upon
the present state, and is independent of the sequence of events that preceded it, i.e.,
P (st+1 , rt+1 |ht ) = P (st+1 , rt+1 |st , at ) where ht = (s0 , a0 , r1 , ..., st , at ).

(b) Suppose that we define state st = ot at time t in the agent-environment interface, i.e., (st, at, rt+1) = (ot, at, rt+1). Justify whether this definition of state satisfies the Markov property.
The Markov property is not satisfied in this scenario. For example, when the robot perceives st = (III) at time t, we cannot predict the next state st+1 from st with at = E because there are two possible cases, e.g., st+1 = (III) or st+1 = (VII). However, when we know the history ht, we can predict the next state st+1 given action at, i.e., P(st+1, rt+1 | ht) ≠ P(st+1, rt+1 | st, at).
(+1pt for identifying no Markov property; +2pt for a proper justification)

(c) Suppose that we define state st = ot−L:t at time t in the agent-environment interface, where L ≥ 1, i.e., (ot−L:t, at, rt+1, ot+1−L:t+1) = (st, at, rt+1, st+1), where ot−L:t is the stack of the (L + 1) most recent observations, i.e., ot−L:t := (ot′)t′∈[max{0,t−L},t]. Check whether there exists a finite constant L that makes this definition of state satisfy the Markov property. Justify your answer.
The Markov property is satisfied when L ≥ 1. If we have at least 2 recent observations (L ≥ 1), we can predict the next state st+1 with at based solely on st, independently of the previous history. Consider L = 1. When st = ot−1:t is ((VI), (III)), then st+1 will be ((III), (III)) with at = E. Similarly, when st = ot−1:t is ((III), (III)), then st+1 will be ((III), (VII)) with at = E. This resolves the problem in (b).
Note that we do not have the action W(est).
(+1pt for identifying the Markov property; +2pt for a proper justification)
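As a side note, the stacked state st = ot−L:t in (c) is easy to realize with a fixed-length buffer. Below is a minimal sketch; the observation labels follow Figure 5, and make_state is a hypothetical helper, not part of the exam.

from collections import deque

L = 1                          # keep the (L + 1) most recent observations
window = deque(maxlen=L + 1)   # old observations are dropped automatically

def make_state(o_t):
    # Append the newest observation; the deque discards o_{t-L-1} by itself.
    window.append(o_t)
    return tuple(window)       # s_t = o_{t-L:t} (shorter at the first steps)

# Example from the solution: observations (VI), (III), (III).
for o in ["VI", "III", "III"]:
    print(make_state(o))       # ('VI',) then ('VI', 'III') then ('III', 'III')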

5 Convolution (5 + 5 + 5 = 15 pt)
In the following questions, you will practice convolution. Let X and W be two real-valued discrete functions. The convolution of X and W is denoted by X ∗ W such that

$$(X * W)[n] = \sum_{k=-\infty}^{\infty} X[k]\, W[n-k] .$$

(a) Given the following 1D input signal X[n] and two filters (kernels) which are W1 [n] and
W2 [n]:

X = [1, 2, 3, 10, 11, 12, 13]


W1 = [1, 0, −1], W2 = [0.25, 0.5, 0.25]

Perform the convolution of the input signal X with each filter W1 and W2 for n = 4, 5, ..., 8.
Note 1 : Except for the given values of X[n], W1 [n], and W2 [n], all other values are 0.
Note 2 : The index of X[n], W1 [n], and W2 [n] in the given values starts from 1, such that
X[1] = 1, X[2] = 2, · · · , W1 [1] = 1, W1 [2] = 0, · · ·.

(i) X ∗ W1 :

(ii) X ∗ W2 :

(2.5pt) (i) X ∗ W1 = [2, 8, 8, 2, 2]
(2.5pt) (ii) X ∗ W2 = [2, 4.5, 8.5, 11, 12]
(1.5pt each in case of a calculation mistake)
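These values can be double-checked with numpy's full convolution. Under the 1-based indexing of Note 2, output entries n = 4, ..., 8 correspond to indices 2-6 of the (0-based) array returned by np.convolve; the check below is an illustration only.

import numpy as np

X  = [1, 2, 3, 10, 11, 12, 13]
W1 = [1, 0, -1]
W2 = [0.25, 0.5, 0.25]

# np.convolve computes the full convolution sum_k X[k] W[n - k]; with the
# 1-based indexing above, 0-based output index j corresponds to n = j + 2.
print(np.convolve(X, W1)[2:7])   # [2 8 8 2 2]           -> X * W1, n = 4..8
print(np.convolve(X, W2)[2:7])   # [2. 4.5 8.5 11. 12.]  -> X * W2, n = 4..8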

(b) Explain the purpose and effect of each filter on the input signal. (hint: what kind of feature
or characteristic do W1 and W2 detect in the input signal X?)

(i) W1 :
(2.5pt)(i) W1 detects sudden changes in intensity in the input signal.
(1.5pt) other general explanation of what filter W1 does
(ii) W2 :
(2.5pt) (ii) W2 performs smoothing or blurring of the input signal.
(1.5pt) other general explanation of what filter W2 does

(c) Explain an advantage of using Convolutional Neural Networks (CNNs) over Multi-Layer
Perceptrons (MLPs) for processing image data.
(5pt)(i) Capture hierarchical features in images through convolutional layers, enabling un-
derstanding at different abstraction levels.
(ii) Share parameters across spatial locations, reducing redundancy and improving efficiency.
(iii) Exploit local connectivity and learn translation-invariant features, making them robust
to variations in object position.
(2.5pt) other general explanation of CNN or MLP but lacking reasoning

6 Transformer and attention mechanism (9 + 6 = 15 pt)
The Transformer is a well-known neural network architecture leveraging the advantage of the attention mechanism, whose output given query Q, key K, and value V is computed as follows:

$$\mathrm{AttentionLayer}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q \cdot K^T}{\text{scaling}}\right)\cdot V$$
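A minimal numpy sketch of this layer, assuming the common choice of scaling = √dk where dk is the key dimension (the generic "scaling" above leaves this open):

import numpy as np

def attention_layer(Q, K, V):
    # Scores: dot-product of queries with keys, scaled by sqrt(d_k).
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns scores into attention weights summing to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output: attention-weighted average of the values.
    return weights @ V

Q = np.random.randn(4, 8)    # 4 queries of dimension 8
K = np.random.randn(6, 8)    # 6 keys of dimension 8
V = np.random.randn(6, 16)   # 6 values of dimension 16
print(attention_layer(Q, K, V).shape)   # (4, 16)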

(a) Complete the following illustration of the attention mechanism by filling the six blanks with
terms in {softmax, K, V, dot-product, scaling}. (Note: a term can appear multiple times.)

(i):

(ii):

(iii):

(iv):

(v):

(vi):

(1.5pt) (i) : K
(1.5pt) (ii) : V
(1.5pt) (iii) : dot-product
(1.5pt) (iv) : scaling
(1.5pt) (v) : softmax
(1.5pt) (vi) : dot-product

(b) Describe the purpose of positional encoding in the transformer architecture for natural
language processing.
(5pt) To provide the sequential order information of the input sequence to the model. / To
enable the model to understand contextual information of the input sequence.
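For reference, the sinusoidal encoding from the original Transformer paper is one standard way to inject this order information; a minimal sketch (the encoding is added to the token embeddings before the first layer, and d_model is assumed even):

import numpy as np

def positional_encoding(seq_len, d_model):
    # PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    # PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    pos = np.arange(seq_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

print(positional_encoding(seq_len=10, d_model=16).shape)   # (10, 16)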

7 Limitations of AI (3 + 3 + 3 + 3 = 12 pt)
Although artificial intelligence (AI) has achieved remarkable advancements, the journey has not
always been straightforward.

(a) Explain what the universal approximation theorem is.

It states that a neural network with a single hidden layer (of sufficient width) can approximate any continuous function to arbitrary precision.

(b) Explain the limitations of the universal approximation theorem.

It only guarantees the existence of such a network without giving a way to obtain it.
That is, it does not mean we can actually find such a neural network, and there is no guarantee on its generalization ability, i.e., the neural network may just be a memorizer of the dataset.

(c) Explain a lesson learned from the AI winter periods in terms of understanding the limitations
of AI technologies.

The AI community learned that over-promising results can lead to unrealistic expectations
and subsequent disillusionment. Early AI pioneers were overly optimistic about the poten-
tial for rapid advancements and the timelines for achieving human-level intelligence. This
mismatch between promises and actual achievements led to a loss of credibility and support
from both the public and funding agencies.

(d) Discuss a limitation of current large language models.

Hallucination, lack of reasoning, heavy computation requirements.

