개발로 하는 개발
[CS231n] Assignment 1 - KNN
- KNN (k-nearest neighbors)
Hyperparameters: k, and the distance metric (L1 or L2).
Basically, you are trying to figure out which class a test point belongs to. You decide this by calculating the distance between the test point and every training point, taking the k nearest training points, and choosing whichever label is the majority among them.
There are two ways to calculate the distance: L1 (the sum of absolute differences) and L2 (the square root of the sum of squared differences).
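To make the two metrics concrete, here is a minimal sketch on a pair of 2-D points. The helper names l1_distance and l2_distance are made up for this illustration and are not part of the assignment code.

```python
import numpy as np

def l1_distance(a, b):
    """L1 (Manhattan) distance: sum of absolute differences."""
    return np.sum(np.abs(a - b))

def l2_distance(a, b):
    """L2 (Euclidean) distance: square root of the sum of squared differences."""
    return np.sqrt(np.sum((a - b) ** 2))

p = np.array([0.0, 0.0])
q = np.array([3.0, 4.0])

print(l1_distance(p, q))  # 7.0
print(l2_distance(p, q))  # 5.0
```

The same pair of points gets a different distance under each metric, which is why the metric counts as a hyperparameter alongside k.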
In the assignment, you implement the functions compute_distances_two_loops and predict_labels.
num_test = X.shape[0]
num_train = self.X_train.shape[0]
dists = np.zeros((num_test, num_train))
1. compute_distances_two_loops
In compute_distances_two_loops you use the L2 distance: subtract X_train[j] from X[i], square the differences element-wise, sum them, and take the square root.
dists[i][j] = np.sqrt(np.sum( (X[i]- self.X_train[j]) **2 ))
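Put together, the two-loop version might look like the sketch below. It is written as a standalone function rather than a class method (so X_train is passed in instead of read from self), and the toy arrays are invented for illustration.

```python
import numpy as np

def compute_distances_two_loops(X, X_train):
    """Standalone sketch: L2 distance between every test and train point."""
    num_test = X.shape[0]
    num_train = X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        for j in range(num_train):
            # distance between the i-th test point and the j-th train point
            dists[i, j] = np.sqrt(np.sum((X[i] - X_train[j]) ** 2))
    return dists

# Hypothetical toy data: 3 training points and 2 test points in 2-D
X_train_toy = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
X_test_toy = np.array([[0.0, 0.0], [3.0, 0.0]])

dists_toy = compute_distances_two_loops(X_test_toy, X_train_toy)
print(dists_toy.shape)  # (2, 3)
print(dists_toy[0])     # distances from the first test point: 0, 5 and 10
```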
Until the predict_labels function is implemented, you get only 11.4% accuracy.
2. predict_labels
closest_y = dists[i].argsort()[:k]
# get the most frequent one in k closest dists.
unique, counts = np.unique(closest_y, return_counts=True)
index = np.argmax(counts)
y_pred[i] = unique[index]
At first, I made the mistake of storing the sorted indices from dists in closest_y instead of the labels they point to. Then, of course, the accuracy was 0. So I tried printing the values in the ipynb file with
print(y_test_pred)
and the result looked like this.
Remember that what you store must be the labels from y_train. That is why you need the indices, and that is why argsort is used.
Also, you have to slice closest_y with the [0:k] range, because only the k nearest neighbors take part in the vote. Then you look for the most frequent y value in closest_y. Done correctly, the code looks like this.
num_test = dists.shape[0]
y_pred = np.zeros(num_test)
for i in range(num_test):
    closest_y = []
    # argsort puts the index of the smallest distance (closest to zero) first
    # use those indices to grab the labels from y_train, then keep the first k
    closest_y = self.y_train[dists[i].argsort()][:k]
    # get the most frequent label among the k nearest neighbors
    unique, counts = np.unique(closest_y, return_counts=True)
    index = np.argmax(counts)
    y_pred[i] = unique[index]
When this code is run from the ipynb file, it looks like this, which is the desired result.
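The voting step can also be tried outside the notebook. The sketch below is a standalone version of the same idea: y_train is passed in as an argument instead of being read from self, and the distance matrix and labels are invented toy values.

```python
import numpy as np

def predict_labels(dists, y_train, k=1):
    """Standalone sketch: majority vote over the k nearest training labels."""
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test, dtype=y_train.dtype)
    for i in range(num_test):
        # indices of the k smallest distances in row i
        nearest_idx = np.argsort(dists[i])[:k]
        # look up the labels at those indices (not the indices themselves)
        closest_y = y_train[nearest_idx]
        labels, counts = np.unique(closest_y, return_counts=True)
        y_pred[i] = labels[np.argmax(counts)]
    return y_pred

# Hypothetical toy data: 2 test points, 4 training points with labels 7 and 2
dists_toy = np.array([[0.1, 0.2, 0.9, 0.3],
                      [0.8, 0.7, 0.1, 0.6]])
y_train_toy = np.array([7, 7, 2, 2])

print(predict_labels(dists_toy, y_train_toy, k=3))  # [7 2]
```

Because np.unique returns the labels sorted and np.argmax returns the first maximum, a tie goes to the smaller label.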
3. Why one loop didn't work (a.k.a. why the matrices are different)
dists[i, :] = np.sqrt(np.sum( (X[i] - self.X_train) ** 2 ))
This code runs, but it is faulty. How do you know? Run the comparison code on Colab and it will tell you that the distance matrices are different when they should be the same. Only the implementation differs, and that shouldn't affect the distance matrix.
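A comparison of this kind can be sketched as follows. This is not the notebook's exact cell: the Frobenius-norm check, the 0.001 threshold, and the toy arrays are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train_toy = rng.normal(size=(6, 4))  # hypothetical training points
X_test_toy = rng.normal(size=(3, 4))   # hypothetical test points

# Reference distance matrix: all pairwise L2 distances, shape (3, 6)
reference = np.linalg.norm(
    X_test_toy[:, None, :] - X_train_toy[None, :, :], axis=2
)

# One-loop attempt that calls np.sum without an axis argument
faulty = np.zeros((3, 6))
for i in range(3):
    faulty[i, :] = np.sqrt(np.sum((X_test_toy[i] - X_train_toy) ** 2))

# Frobenius norm of the element-wise difference between the two matrices
difference = np.linalg.norm(reference - faulty, ord="fro")
print("difference:", difference)
print("same" if difference < 0.001 else "different")
```

Printing faulty shows that every entry within a row is identical, which hints at where the problem lies.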
To figure out what the problem with my code was, I ran some small NumPy experiments to see visually what was going on.
import numpy as np
# List1 = [[1, 2, 3, 4, 5, 6], [3, 4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 7]]
List1 = [[1, 2, 3, 4, 5, 6]]
List2 = [[1, 2, 3, 4, 5, 6], [3, 4, 5, 6, 7, 8], [1,3,5,7,9,11]]
arr1 = np.array(List1)
arr2 = np.array(List2)
print(arr1 - arr2)
print(arr1**2)
print((arr1 - arr2)**2)
print(np.sum((arr1 - arr2)**2))
print(np.sqrt(np.sum((arr1 - arr2)**2)))
The results looked like this. I needed a (500, 5000) distance matrix, but I was effectively getting only one value per test point, i.e. (500, 1).
I needed to sum the elements of each row, but I was summing the entire array.
Assignment arrays | Toy arrays |
---|---|
X[i] : (1, 3072), self.X_train : (5000, 3072) | arr1 : (1, 6), arr2 : (3, 6) |
X[i] - self.X_train : (5000, 3072) | arr1 - arr2 : (3, 6) |
(X[i] - self.X_train) ** 2 : (5000, 3072) | (arr1 - arr2) ** 2 : (3, 6) |
np.sum((X[i] - self.X_train) ** 2) : (1, 1) | np.sum((arr1 - arr2) ** 2) : (1, 1), but should be (1, 3) |
np.sqrt(np.sum((X[i] - self.X_train) ** 2)) : (1, 1) | np.sqrt(np.sum((arr1 - arr2) ** 2)) : (1, 1), but should be (1, 3) |
desired : (1, 5000) | desired : (1, 3) |
So I need to sum the elements within each row and save those sums, so that each row's sum takes up its own entry in the result.
dists[i, :] = np.sqrt(np.sum( (X[i] - self.X_train) ** 2 , axis = 1))
With axis=1, np.sum adds up the elements within each row and returns an array of the row sums.
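The effect of the axis argument is easy to see from the shapes alone. The sketch below uses a standalone one-loop function (X_train is passed in rather than read from self) and random toy arrays that are not from the assignment.

```python
import numpy as np

def compute_distances_one_loop(X, X_train):
    """Standalone sketch: one loop over test points, row sums via axis=1."""
    num_test = X.shape[0]
    num_train = X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        # X[i] broadcasts against every row of X_train;
        # axis=1 keeps one sum per training row
        dists[i, :] = np.sqrt(np.sum((X[i] - X_train) ** 2, axis=1))
    return dists

rng = np.random.default_rng(0)
X_train_toy = rng.normal(size=(5, 4))  # hypothetical training points
X_test_toy = rng.normal(size=(2, 4))   # hypothetical test points

diff_sq = (X_test_toy[0] - X_train_toy) ** 2
print(np.sum(diff_sq).shape)          # () : a single scalar for the whole array
print(np.sum(diff_sq, axis=1).shape)  # (5,) : one sum per training row

dists_toy = compute_distances_one_loop(X_test_toy, X_train_toy)
print(dists_toy.shape)                # (2, 5)
```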
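4. compute_distances_no_loops
The main problem with using no loops is that the arrays no longer broadcast once the shapes are not (1, 3072) and (5000, 3072): the full (500, 3072) test matrix cannot be subtracted directly from the (5000, 3072) training matrix. So we need matrix multiplication and broadcast sums to build the distance matrix, using the expansion ||a - b||^2 = ||a||^2 - 2 a·b + ||b||^2.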
import numpy as np
List1 = [[0,0,0,0,0,0],[1,1,1,1,1,1]]
List2 = [[1, 2, 3, 4, 5, 6], [2,3,4,5,6,7], [3, 4, 5, 6, 7, 8]]
arr1 = np.array(List1)
arr2 = np.array(List2)
print(arr1[0] - arr2)
print(arr1[1] - arr2)
print("\n")
print((arr1[0] - arr2)**2)
print((arr1[1] - arr2)**2)
print("\n")
print(np.sum((arr1[0] - arr2)**2, axis = 1))
print(np.sum((arr1[1] - arr2)**2, axis = 1))
print("\n")
print(np.sqrt(np.sum((arr1[0] - arr2)**2, axis = 1)))
print(np.sqrt(np.sum((arr1[1] - arr2)**2, axis = 1)))
print("\n")
print(arr1 ** 2)
print(arr2 ** 2)
print("\n")
print(np.sum(np.square(arr1), axis = 1))
print(np.sum(np.square(arr1), axis = 1).reshape(arr1.shape[0], 1))
print("\n")
A_2 = np.sum(np.square(arr1), axis = 1)
B_2 = np.sum(np.square(arr2), axis = 1)
print(A_2)
print(B_2)
print("\n")
dot_product = np.dot(arr1, np.transpose(arr2))
print(dot_product)
print("\n")
dists = np.zeros((2, 3))
print(A_2.reshape(2, 1) + B_2 -2*dot_product)
print(dists - 2*dot_product)
print(dists - 2*dot_product + B_2)
print("\n")
print(-dot_product + B_2)
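The experiments above can be collected into one function. This is a sketch under a few assumptions: it is standalone rather than a class method, the names test_sq, train_sq and cross are made up here, and the np.maximum clip is an extra safeguard against tiny negative values from floating-point rounding rather than something the assignment asks for.

```python
import numpy as np

def compute_distances_no_loops(X, X_train):
    """Standalone sketch using ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2."""
    test_sq = np.sum(X ** 2, axis=1).reshape(-1, 1)  # (num_test, 1)
    train_sq = np.sum(X_train ** 2, axis=1)          # (num_train,)
    cross = X @ X_train.T                            # (num_test, num_train)
    # (num_test, 1) and (num_train,) both broadcast to (num_test, num_train)
    sq_dists = test_sq - 2 * cross + train_sq
    # clip rounding noise below zero before the square root (added safeguard)
    return np.sqrt(np.maximum(sq_dists, 0))

# Same toy arrays as in the experiment above
arr1 = np.array([[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]])
arr2 = np.array([[1, 2, 3, 4, 5, 6], [2, 3, 4, 5, 6, 7], [3, 4, 5, 6, 7, 8]])

dists = compute_distances_no_loops(arr1, arr2)
# Reference: explicit pairwise differences, fine for tiny arrays
reference = np.linalg.norm(arr1[:, None, :] - arr2[None, :, :], axis=2)

print(dists.shape)                    # (2, 3)
print(np.allclose(dists, reference))  # True
```

The explicit-difference reference builds a (num_test, num_train, D) array, which is fine for toy data but is exactly what the expansion avoids at the assignment's scale.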
- Cross Validation
For example, in 5-fold cross-validation, we would split the training data into 5 equal folds, use 4 of them for training, and 1 for validation. We would then iterate over which fold is the validation fold, evaluate the performance, and finally average the performance across the different folds.
"https://cs231n.github.io/classification/"
import numpy as np
num_folds = 5
test_arr = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16])
X_train_folds = []
X_train_folds = np.array_split(test_arr, num_folds)
print(X_train_folds)
for i in range(num_folds):
    # fold i is held out as the validation fold
    print(X_train_folds[i])
    # the remaining folds are joined into the training set
    print(np.concatenate(X_train_folds[:i] + X_train_folds[i+1:]))
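The same fold-splitting pattern can be used to compare values of k. The sketch below is only an illustration: the synthetic two-cluster data, the knn_predict helper, and the candidate k values are all made up here, and the real assignment uses the classifier class rather than a free function.

```python
import numpy as np

def knn_predict(X_train, y_train, X_val, k):
    """Hypothetical helper: L2 distances followed by a majority vote."""
    dists = np.linalg.norm(X_val[:, None, :] - X_train[None, :, :], axis=2)
    y_pred = np.empty(X_val.shape[0], dtype=y_train.dtype)
    for i in range(X_val.shape[0]):
        closest_y = y_train[np.argsort(dists[i])[:k]]
        labels, counts = np.unique(closest_y, return_counts=True)
        y_pred[i] = labels[np.argmax(counts)]
    return y_pred

# Synthetic data: two well-separated 2-D clusters, shuffled
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
               rng.normal(5.0, 0.5, size=(20, 2))])
y = np.array([0] * 20 + [1] * 20)
perm = rng.permutation(len(y))
X, y = X[perm], y[perm]

num_folds = 5
X_folds = np.array_split(X, num_folds)
y_folds = np.array_split(y, num_folds)

k_to_accuracies = {}
for k in [1, 3, 5]:
    accuracies = []
    for i in range(num_folds):
        # fold i is the validation fold, the rest form the training set
        X_tr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
        y_tr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
        y_pred = knn_predict(X_tr, y_tr, X_folds[i], k)
        accuracies.append(np.mean(y_pred == y_folds[i]))
    k_to_accuracies[k] = accuracies

for k, accuracies in k_to_accuracies.items():
    print(k, np.mean(accuracies))
```

The k with the best average validation accuracy would then be the one to use on the test set.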