require(knitr)
require(ggplot2)
read_chunk("ex2/ex2_chunks.R")
ex3data1 <- cbind(1, as.matrix(read.csv("data/ex3data1.csv")))
initial_theta <- rep(0, times = ncol(ex3data1) - 1)
The script to visualize the greyscale digits was included in the resources and not actually a part of the assignment. I skipped it for now, but may revisit in the future. I should be able to modify part of exercise 7 to print the handwritten digits.
I already implemented a vectorized version for exercise 2, so I reuse ex2_chunks.R,
which contains all of the functions written for the previous exercise.
sig <- function(x){1 / (1 + exp(-x))}
h <- function(theta, x){
  # for a single example, theta' x is the elementwise product, then summed
  sig(sum(theta * x))
}
costFunction <- function(M, theta, lambda = 0){
  # M: design matrix (intercept in column 1) with the 0/1 label in the last column
  m <- nrow(M)
  X <- M[, 1:(ncol(M) - 1)]
  y <- M[, ncol(M)]
  p <- sig(X %*% theta)   # predicted probabilities, computed once
  reg <- c(0, theta[-1])  # the intercept term is conventionally not regularized
  J <- - (1 / m) * crossprod(c(y, 1 - y), c(log(p), log(1 - p))) +
    (lambda / (2 * m)) * sum(reg ^ 2)
  grad <- (1 / m) * crossprod(X, p - y) + (lambda / m) * reg
  list(J = as.vector(J), grad = as.vector(grad))
}
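Since optim() is handed both the cost and its gradient, it can be worth checking that the two agree. The following is a minimal sketch of a finite-difference gradient check, not part of the original exercise. The names `cost_unreg` and `numeric_grad` and the synthetic data are assumptions made for illustration, and only the unregularized case (lambda = 0, as used in this post) is covered.

```r
sig <- function(x) 1 / (1 + exp(-x))

# Unregularized logistic cost and gradient (the lambda = 0 case).
# M holds the features (intercept first) with the 0/1 label in the last column.
cost_unreg <- function(M, theta) {
  m <- nrow(M)
  X <- M[, -ncol(M), drop = FALSE]
  y <- M[, ncol(M)]
  p <- as.vector(sig(X %*% theta))
  J <- -mean(y * log(p) + (1 - y) * log(1 - p))
  grad <- as.vector(crossprod(X, p - y)) / m
  list(J = J, grad = grad)
}

# Central finite-difference approximation of the gradient of f at theta.
numeric_grad <- function(f, theta, eps = 1e-5) {
  sapply(seq_along(theta), function(j) {
    step <- replace(numeric(length(theta)), j, eps)
    (f(theta + step) - f(theta - step)) / (2 * eps)
  })
}

set.seed(1)
# Synthetic data: intercept, two random features, random 0/1 labels
toy <- cbind(1, matrix(rnorm(40), ncol = 2), rbinom(20, 1, 0.5))
theta0 <- c(0.1, -0.2, 0.3)

analytic_grad <- cost_unreg(toy, theta0)$grad
approx_grad <- numeric_grad(function(t) cost_unreg(toy, t)$J, theta0)
max_diff <- max(abs(analytic_grad - approx_grad))
```

If the analytic gradient is right, `max_diff` should be tiny compared with the size of the gradient entries.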
For one-vs-all classification, we loop through each of the K classes and fit a binary logistic regression (class k versus everything else), just as in the previous exercise.
thetas <- data.frame()
for(i in 1:10){
Mi <- cbind(ex3data1[, 1:401], ex3data1[, 402] == i)
thetai <- optim(par = initial_theta,
fn = function(x){costFunction(Mi, x)$J},
gr = function(x){costFunction(Mi, x)$grad},
method = "BFGS", control = list(maxit = 400))
thetas <- rbind(thetas, thetai$par)
}
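To see the one-vs-all idea in isolation, here is a small sketch on synthetic data with three classes in two dimensions. Everything in it is an assumption for illustration: the cluster centers, the helper names `cost` and `grad`, and the choice of lambda = 1. It also swaps in base R's plogis() with log.p = TRUE for a numerically stable log-sigmoid, which differs from the sig() function used in this post.

```r
set.seed(42)
n <- 50
centers <- rbind(c(0, 0), c(6, 0), c(0, 6))   # hypothetical cluster centers
labels <- rep(1:3, each = n)
# Intercept column plus two noisy features per example
X <- cbind(1, centers[labels, ] + matrix(rnorm(2 * 3 * n), ncol = 2))

# Regularized logistic cost; the intercept (theta[1]) is left unpenalized
cost <- function(theta, X, y, lambda) {
  z <- as.vector(X %*% theta)
  -mean(y * plogis(z, log.p = TRUE) + (1 - y) * plogis(-z, log.p = TRUE)) +
    lambda / (2 * length(y)) * sum(theta[-1] ^ 2)
}

# Matching gradient
grad <- function(theta, X, y, lambda) {
  z <- as.vector(X %*% theta)
  as.vector(crossprod(X, plogis(z) - y)) / length(y) +
    lambda / length(y) * c(0, theta[-1])
}

# One binary classifier per class: class k versus the rest
Theta <- t(sapply(1:3, function(k) {
  optim(par = rep(0, ncol(X)), fn = cost, gr = grad,
        X = X, y = as.numeric(labels == k), lambda = 1,
        method = "BFGS", control = list(maxit = 400))$par
}))

# The sigmoid is monotone, so the largest linear score is also the
# largest predicted probability
pred <- max.col(X %*% t(Theta), ties.method = "first")
accuracy <- mean(pred == labels)
```

Each row of `Theta` holds the parameters of one class-versus-rest classifier, which is the same layout the loop above builds up with rbind().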
The predicted class is simply the class whose classifier assigns the highest probability.
ex3pred1 <- apply(ex3data1, 1, FUN = function(x){
which.max(as.vector(apply(thetas, 1, FUN = function(y){
h(y, x[1:401])
})))
})
sum(ex3data1[, 402] == ex3pred1) / nrow(ex3data1)
## [1] 0.9698
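This is higher than the expected accuracy of 94.9%, although I’m not sure why. One possible factor is that costFunction is called here with its default lambda = 0, so the ten classifiers are fit without regularization and can match the training set (which is also what the accuracy is measured on) more closely.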
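The nested apply() calls above can also be written as a single matrix product. The sketch below uses small synthetic inputs (the matrices `X` and `Theta` here are made up, not the exercise data) to show that both routes pick the same class.

```r
sig <- function(x) 1 / (1 + exp(-x))
h <- function(theta, x) sig(sum(theta * x))

set.seed(7)
X <- cbind(1, matrix(rnorm(5 * 3), nrow = 5))   # 5 synthetic examples, intercept first
Theta <- matrix(rnorm(4 * 4), nrow = 4)          # 4 hypothetical classes, one row each

# Row-by-row version, mirroring the nested apply() calls
pred_loop <- apply(X, 1, function(x) {
  which.max(apply(Theta, 1, function(th) h(th, x)))
})

# Vectorized version: one matrix product gives every class score at once
probs <- sig(X %*% t(Theta))        # rows are examples, columns are classes
pred_vec <- apply(probs, 1, which.max)
```

On the full data set the vectorized form avoids calling h() once per example and class, which is where most of the time in the nested version goes.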
This is just a quick, non-generalized implementation of forward propagation for the one-hidden-layer network supplied with the exercise. A more generalized version is implemented in the next exercise.
Theta1 <- as.matrix(read.csv("data/ex3weights_Theta1.csv"))
Theta2 <- as.matrix(read.csv("data/ex3weights_Theta2.csv"))
z2 <- Theta1 %*% t(ex3data1[, 1:401])
a2 <- sig(z2)
a2 <- rbind(1, a2)
z3 <- Theta2 %*% a2
a3 <- sig(z3)
ex3pred2 <- apply(a3, 2, which.max)
sum(ex3data1[, 402] == ex3pred2) / nrow(ex3data1)
## [1] 0.9752
The expected accuracy is 97.5%.
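For reference, the same forward pass can be written as a loop over layers. This is only a sketch under stated assumptions, not the version from the next exercise: `forward` is a hypothetical helper, the weights are random, and unlike ex3data1 the input matrix here carries no intercept column, since the bias unit is added inside the loop.

```r
sig <- function(x) 1 / (1 + exp(-x))

# Hypothetical helper: `weights` is a list of matrices, one per layer, each with
# a leading bias column. `X` has one example per row and no intercept column.
forward <- function(X, weights) {
  a <- t(X)                       # units in rows, examples in columns
  for (W in weights) {
    a <- sig(W %*% rbind(1, a))   # prepend the bias unit, then activate
  }
  t(a)                            # back to one example per row
}

set.seed(3)
X_toy <- matrix(rnorm(6 * 4), nrow = 6)   # 6 examples, 4 features
W1 <- matrix(rnorm(3 * 5), nrow = 3)      # 4 inputs + bias -> 3 hidden units
W2 <- matrix(rnorm(2 * 4), nrow = 2)      # 3 hidden + bias -> 2 outputs

out <- forward(X_toy, list(W1, W2))
pred <- max.col(out, ties.method = "first")   # index of the most active output unit
```

With all-zero weights every activation is sig(0) = 0.5, which gives a quick sanity check of the shapes.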