Exam AI 2021 - Dec2021
For a given graph, perform searches from node S to node T using the following algorithms (50 pts):
Breadth-first with extended list. Draw your tree.
Beam Search with a beam width of 2, using an extended list. Draw your tree.
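For reference, breadth-first search with an extended list can be sketched as below. This is only an illustration under stated assumptions: the graph is hypothetical (it is not the exam graph), the function name is invented for this sketch, and the goal test is applied when a path is taken off the queue, which is one common convention among several.

```python
from collections import deque

def bfs_with_extended_list(graph, start, goal):
    """Return (path to goal or None, order in which nodes were extended)."""
    queue = deque([[start]])   # each queue entry is a partial path
    extended = set()           # the "extended list": nodes already expanded
    order = []
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path, order
        if node in extended:
            continue           # already extended via an earlier path, skip it
        extended.add(node)
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in extended:
                queue.append(path + [neighbor])
    return None, order

# Hypothetical graph for illustration only (NOT the graph from the exam).
graph = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S", "C", "T"],
    "C": ["A", "B", "T"],
    "T": ["B", "C"],
}
path, order = bfs_with_extended_list(graph, "S", "T")
print(path)    # path found from S to T
print(order)   # nodes in the order they were extended
```

The extended list is what prevents the same node from being expanded twice when several paths reach it.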
Put ticks in the last three columns where needed, and provide a short explanation (50 pts).
Problem description | Clustering | Classification | Regression
Given a set of documents, you are asked to apply an algorithm to separate them into groups. Which algorithm type(s) are applicable? Why? | | |
Given a set of documents along with the name of the group each document belongs to, select the applicable algorithm type(s). Why? | | |
You are given a historical dataset containing features describing a person (age, education, count of previous non-returned loans, count of returned loans, the loan being requested, the duration of the loan, and a yes/no flag indicating whether the loan was returned). Mark the applicable algorithm types. Why? | | |
You are presented with time-series data of daily closing prices of the EURUSD forex currency pair. You are asked to build a system that predicts whether to buy now (the price will go up) or sell now (the price will go down in the near future). Which algorithm type would you use to approach this problem? Why? | | |
Your regression model has zero training error but a large testing error. What does this mean? (15 pts)
Your trained classifier has a high training error and an even higher testing error. How would you try to fix it? (15 pts)
Your (linear) logistic regression model has an accuracy of only ~50-60% (i.e., it correctly classifies only about half of your training and testing data). Any ideas on how to fix it? (15 pts)
KNN
On the following graph, draw the decision boundaries produced by 1-nearest-neighbors. (35 pts):
The graph below shows two new creatures, marked with question mark symbols and labeled A and B. Show below how these will be classified using 3-nearest-neighbors and 5-nearest-neighbors. (15 pts).
 | Creature A | Creature B
Using 3-NN (k=3) | |
Using 5-NN (k=5) | |
Linear regression. (25 pts)
Draw the regression line for the linear regression f(x)=a0+a1*x1 for each of the following parameter sets:
a0=0.5, a1=0 | a0=+1, a1=-0.5 | a0=-1, a1=2
 | |
For each of the three regression models listed above, calculate the f(x) values for x = {0,1,2,3}:
f(0)= | f(0)= | f(0)=
f(1)= | f(1)= | f(1)=
f(2)= | f(2)= | f(2)=
f(3)= | f(3)= | f(3)=
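Evaluating a line of the form f(x)=a0+a1*x1 at several points is plain arithmetic, as the minimal sketch below shows. The coefficients used here (a0=2, a1=3) are hypothetical and deliberately differ from the three parameter sets in the task.

```python
def f(x, a0, a1):
    """Evaluate the one-feature linear model f(x) = a0 + a1 * x."""
    return a0 + a1 * x

# Hypothetical coefficients for illustration (not from the exam).
values = [f(x, a0=2.0, a1=3.0) for x in (0, 1, 2, 3)]
print(values)
```

Here a0 is the intercept (the value at x=0) and a1 is the slope (the change in f per unit of x).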
Given a neural network with the weights and connections shown. The activation function in all layers except the input layer is the sigmoid ( 1/(1+exp(-x)) ). All bias (threshold) units have weights of 1.
Now calculate the output of each of the neural network's nodes (including the output node) using the given weights (75 pts):
X=0.5
Y = 0
Z = 5
WXA = 2
WYA = 2
WYB = 1
WZC = 0.2
WAD = 1
WBD = 1.5
WCD = -1
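The output of a single sigmoid node can be sketched as follows. The sketch assumes the bias unit emits a constant 1 and feeds the node through a weight of 1, which is one reading of the statement above. The inputs and weights in the example are hypothetical, and the function names are invented for illustration. The network diagram itself is not reproduced here.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def node_output(inputs, weights, bias_weight=1.0):
    """Weighted sum of the inputs plus a bias unit (constant 1), then sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias_weight * 1.0
    return sigmoid(z)

# Hypothetical two-layer chain (values are NOT taken from the exam).
hidden = node_output([1.0, 0.0], [0.5, -1.0])   # z = 0.5 + 0.0 + 1.0 = 1.5
output = node_output([hidden], [2.0])           # z = 2 * hidden + 1.0
print(hidden, output)
```

A full forward pass repeats this step layer by layer, feeding each node's output into the nodes it connects to.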
Please list some classification algorithms and describe, in two sentences, the differences between them (max 200 words) (5pts)
Logistic regression is a classifier - true or false? (5pts)
Will a perceptron be able to classify the following input data correctly (i.e., can you train a perceptron on x1 and x2 so that it always produces the desired output)? (5 pts)
X1 | X2 | OUTPUT
0 | 0 | 0
1 | 0 | 1
0 | 1 | 1
1 | 1 | 0
Yes / No? Why?
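One way to explore a question like this empirically is to run the classic perceptron learning rule on a truth table and check whether it reaches an epoch with no mistakes. The sketch below is a minimal version. The function name, the learning rate, the epoch limit, and the AND table used for the demonstration are all assumptions made for illustration. The same function can be run on the table above.

```python
def train_perceptron(samples, epochs=100, lr=1.0):
    """Train a two-input perceptron with a step activation.

    Returns (converged, (w1, w2, b)); converged is True if some epoch
    finished with zero misclassifications.
    """
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            pred = 1 if w1 * x1 + w2 * x2 + b > 0 else 0
            delta = target - pred
            if delta != 0:
                # Perceptron rule: move the weights toward the target.
                w1 += lr * delta * x1
                w2 += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:
            return True, (w1, w2, b)
    return False, (w1, w2, b)

# Hypothetical demonstration data: logical AND (not the table from the exam).
and_table = [((0, 0), 0), ((1, 0), 0), ((0, 1), 0), ((1, 1), 1)]
converged, weights = train_perceptron(and_table)
print(converged, weights)
```

Failing to converge within the epoch limit is only a hint, not a proof. The theoretical argument rests on whether a single straight line can separate the two output classes.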
What is the minimum number of variables/features required to perform clustering? (5 pts) (give a single number)
How is KNN different from K-means clustering? (5pts)
14. Convex vs Non-Convex optimization problems? (5pts each)
Name/Type of the model | Convex / Non-Convex optimization problem?
Linear Regression |
Logistic Regression |
SVM Linear |
SVM RBF |
Perceptron |
Multilayered Perceptron |
Radial Basis Function Neural Network |
15. Which is preferable to solve: a convex or a non-convex optimization problem? (5pts)
16. Imagine you have a two-dimensional dataset with completely uncorrelated features. Can you use PCA to reduce its dimensionality to 1D? (5pts)
17. In what cases might you get better results with PCA+LinearRegression than by running LinearRegression directly on the original dataset? (5pts)
18. In what cases might you get worse results with PCA+LinearRegression than by running LinearRegression directly on the original dataset? (5pts)
19. In what cases is non-linear dimensionality reduction preferable to linear methods (like PCA)? (5pts)
20. Name some algorithms for dimensionality reduction / feature selection (5pts)
21. What is regularization and why do we need it? (5pts)
22. When would we not want to use regularization? (5pts)
23. What are the main methods shared by all supervised models in scikit-learn? (5pts)
24. You are building a self-driving car. You find that the visual recognition system sometimes fails to recognize traffic signs. You start analyzing the data and find that this happens during morning or evening hours. You find that the errors occur at sunrise or sunset, and that you don't have such images in your dataset. You acquire such images, add them to the training dataset, and retrain your deep learning object detector.
Please describe what here constitutes: a) the AI system? b) data analysis? c) machine learning? (10 pts)
25. You have a dataset and need to build a classification model on it. You don't know which model will work better, so you need to try different preprocessing techniques and model parameters. Questions:
a) If the dataset is huge, what model / parameter selection algorithm (sequence of actions) will you use? Why? (5pts)
b) If the dataset is small, what model / parameter selection algorithm (sequence of actions) will you use? Why? (5pts)
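One building block that often appears in model selection is k-fold cross-validation, where every sample is used for validation exactly once. The sketch below only shows how the folds might be formed. The function name is invented for illustration, no shuffling is done, and this is not meant to be the only valid answer to the question.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print(len(train), validation)
```

Each candidate preprocessing/parameter combination would be trained on the training part of each fold and scored on the validation part, with the scores averaged across folds.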
26. Given a decision tree:
How will the following points be classified (5pts each):
a) Point with coordinates (0.0; 0.0)?
b) Point with coordinates (1.0; 0.0)?
c) Point with coordinates (0.0; 1.0)?
d) Point with coordinates (1.0; 1.0)?
27. A client asks us to build a system that monitors the network behaviour of employees within the organization and signals when strange behaviour is detected. What kind of ML problem is it (5pts)? What algorithm(s) can we use? (5pts)
28. We have a task to build a chess-playing program. What kind of ML problem is it?
29. When you know you might have some outliers in the data, what regression loss will you use? (15 pts)
30. When you know that the classification dataset is imbalanced, what kind of performance measure will you use? (15 pts)
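To see why the choice of measure matters on imbalanced data, the sketch below computes accuracy, precision, recall, and F1 for a classifier that always predicts the majority class. The labels are hypothetical (95 negatives, 5 positives) and the function name is invented for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical imbalanced labels: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100          # a classifier that always predicts the majority class
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

The accuracy looks high even though no positive example is ever found, which the recall and F1 values make visible.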