[wikipedia] An eigenvector of a square matrix $$A$$ is a non-zero vector $$v$$ that, when multiplied by $$A$$, yields the original vector multiplied by a single number $$\lambda$$; that is, $$A.v = \lambda.v$$. The number $$\lambda$$ is called the eigenvalue of $$A$$ corresponding to $$v$$.

## Eigen-stuff

# boolean function for equality of real values
compare.reals <- function(a,b,tolerance=1e-5) {
abs(a-b) < tolerance
}

# testing eigen values & vectors
A   <- matrix(c(3,1,1,3),2,2)
eA  <- eigen(A)  # compute eigen values and eigen vectors of A
eA
## $values ## [1] 4 2 ## ##$vectors
##           [,1]       [,2]
## [1,] 0.7071068 -0.7071068
## [2,] 0.7071068  0.7071068
ev1 <- as.matrix(eA$vectors[,1]) # extract 1st eigen vector ev1 ## [,1] ## [1,] 0.7071068 ## [2,] 0.7071068 # if we multiply A by its first eigen vector, we get the same vector # if we multiply its first eigen vector by its first eigen value compare.reals(A %*% ev1 , eA$values[1] * ev1)
##      [,1]
## [1,] TRUE
## [2,] TRUE
# same for the 2nd eigen's
ev2 <- as.matrix(eA$vectors[,2]) # extract 2nd eigen vector ev2 ## [,1] ## [1,] -0.7071068 ## [2,] 0.7071068 compare.reals( A %*% ev2 , eA$values[2] * ev2)
##      [,1]
## [1,] TRUE
## [2,] TRUE

## Geometric Transformations

In geometric terms, a matrix is a linear transformation of an object. In 2D a matrix can perform rotations, scalings, shearing, etc.

library(MASS)

draw.segment <- function(p1,p2,color,width=3) {
segments(p1[1],p1[2],p2[1],p2[2], col=color, lwd=width)
}

plot(NULL,xlim=c(-3,3),ylim=c(-3,3),ylab="y",xlab="x")

# draw a segment (initially at 45 degrees)
p.init <- matrix(c(0,0),2,1)
p.end  <- matrix(c(2,2),2,1)
draw.segment(p.init, p.end, "red")

# rotation example
theta <- pi/4  # rotate more 45º degrees
rotate.matrix <- matrix(c( cos(theta),sin(theta),
-sin(theta),cos(theta)),2,2)

p1.init <- rotate.matrix %*% p.init
p1.end  <- rotate.matrix %*% p.end
draw.segment(p1.init, p1.end, "blue")

# scaling example
scalingx <- 0.5  # scale both axis by 50%
scalingy <- 0.5
scale.matrix <- matrix(c(scalingx,    0,
0,     scalingy),2,2)

p2.init <- scale.matrix %*% p.init
p2.end  <- scale.matrix %*% p.end
draw.segment(p2.init, p2.end, "green")

legend("topleft", c("initial", "rotated 45º", "scaled 50%"),
col = c("red","blue","green"), lwd=1)

# we can combine these transformations
plot(NULL,xlim=c(-3,3),ylim=c(-3,3),ylab="y",xlab="x")

xs  <- seq(0,2*pi,0.05)
pts <- matrix(c(cos(xs),sin(xs)),length(xs),2) # points from a unit circle
points(pts, col="red", type="l")               # draw them

scaling <- matrix(c(.5, 0,
0, 2),2,2)
theta <- -pi/4
rotating <- matrix(c( cos(theta),sin(theta),
-sin(theta),cos(theta)),2,2)

pts2 <- t(apply(pts,1,function(p)rotating %*% scaling %*% p))
# The vectorized operation is equivalent to this cyclic version:
# pts2 <- matrix(c(NA,NA),length(xs),2)
# for(i in 1:length(xs)) {
#   pts2[i,] <- rotating %*% scaling %*% pts[i,]  # scale then rotate
# }
points(pts2, col="green", type="l")             # draw transformation

Still the eigen vectors:

A given square matrix $$M$$ represents a geometrical transformations of vectors of the compatible dimension, as we seen above.

An eigenvector $$v$$ of $$M$$ is always transformed into a scaling of itself: $M \times v = \lambda \times v$

A matrix $$M:n\times n$$ might have or not eigenvectors. If it has, then it has $$n$$ eigenvectors, each prependicular to each other.

The scaling is irrelevant, so by convention, the standard eigenvectors have length 1.

The eigenvalue is the amount of scaling the matrix $$M$$ performs over the eigenvector $$v$$ or any scaling of itself: $M(av) = \lambda(av)$

m <- matrix(c(3,-4,-6,0,1,0,1,2,-2),3,3)
m
##      [,1] [,2] [,3]
## [1,]    3    0    1
## [2,]   -4    1    2
## [3,]   -6    0   -2
evs <- eigen(m)
evs
## $values ## [1] 1.000000e+00 1.000000e+00 -2.220446e-16 ## ##$vectors
##               [,1] [,2]        [,3]
## [1,]  5.551115e-17    0 -0.09534626
## [2,] -1.000000e+00    1 -0.95346259
## [3,] -1.110223e-16    0  0.28603878
norm(as.matrix(evs$vectors[,1]), "F") # return eigenvectors have norm = one ## [1] 1 # the dot product is zero between the eigenvectors (they are orthogonal) evs$vectors[,1] %*% evs$vectors[,2] #TODO: hmmm... but it is not zero (?) ## [,1] ## [1,] -1 # testing the eigenvalue property: m %*% (5*evs$vectors[,1])
##               [,1]
## [1,]  2.775558e-16
## [2,] -5.000000e+00
## [3,] -5.551115e-16
evs$values[1] * (5*evs$vectors[,1])
## [1]  2.775558e-16 -5.000000e+00 -5.551115e-16

## SVD

So, given matrix $$M:n\times m$$ that performs some linear transformation, the SVD decomposes that transformation into three steps: $$U:m\times m$$, $$D:m\times n$$, $$V:n\times n$$, such that $$M = U \times D \times V^T$$

• $$V^T$$ is the rotation to a more suitable axis system
• $$D$$ is the scaling done by $$M$$, $$D$$ is a rectangular diagonal, with $$d_{ii} \gt d_{jj}, i \lt j$$
• $$U$$ is the rotation back to the initial axis system

The $$m$$ columns of $$U$$ are called the left singular vectors The $$n$$ columns of $$V$$ are called the right singular vectors

• The right singular vectors are eigenvectors of $$M^T \times M$$
• The left singular vectors are eigenvectors of $$M \times M^T$$
• The non-zero values of $$D$$ are the square root of the eigenvalues of $$M \times M^T$$ and $$M^T \times M$$ are called the singular values

These singular values can be seen as the semiaxes of an ellipsoid where the scaling is done.

$$U$$ and $$V$$ are unitary matrixes, i.e. $$U \times U^T=I \wedge V \times V^T=I$$.

M <- matrix(c(1,0,0,0,
0,0,0,4,
0,3,0,0,
0,0,0,0,
2,0,0,0),4,5)
M
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    0    0    0    2
## [2,]    0    0    3    0    0
## [3,]    0    0    0    0    0
## [4,]    0    4    0    0    0
M.svd <- svd(M) # svd computes the singular-value decomposition of a rectangular matrix.
M.svd$u ## [,1] [,2] [,3] [,4] ## [1,] 0 0 1 0 ## [2,] 0 1 0 0 ## [3,] 0 0 0 -1 ## [4,] 1 0 0 0 M.svd$d
## [1] 4.000000 3.000000 2.236068 0.000000
t(M.svd$v) ## [,1] [,2] [,3] [,4] [,5] ## [1,] 0.0000000 1 0 0 0.0000000 ## [2,] 0.0000000 0 1 0 0.0000000 ## [3,] 0.4472136 0 0 0 0.8944272 ## [4,] 0.0000000 0 0 1 0.0000000 # Rebuilding the original matrix M from its svd M.svd$u %*% (diag(M.svd$d) %*% t(M.svd$v)) 
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    0    0    0    2
## [2,]    0    0    3    0    0
## [3,]    0    0    0    0    0
## [4,]    0    4    0    0    0
# we could cut the last column of u since the last singular value in d is zero

## Making Approximations

SVD can produce approximations of the initial matrix that use less space. The values of matrix $$D$$ state the importance of each scaling. If the last values are low, they can be discarded without much loss. The effect is that we don’t need to keep all columns/rows of the original svd.

set.seed(123)
m <- matrix(sample(1:16,16),4,4)
m
##      [,1] [,2] [,3] [,4]
## [1,]    5   13   16    8
## [2,]   12    1    4    2
## [3,]    6   14   10   11
## [4,]   15    9    3    7
m.svd <- svd(m)
m.svd
## $d ## [1] 35.441552 14.143451 6.201237 1.184828 ## ##$u
##            [,1]       [,2]       [,3]       [,4]
## [1,] -0.6050563  0.4695553  0.5054014  0.3974848
## [2,] -0.2577214 -0.5791245  0.6067401 -0.4796467
## [3,] -0.5849621  0.2654779 -0.4722392 -0.6035984
## [4,] -0.4746640 -0.6112725 -0.3916964  0.4976081
##
## $v ## [,1] [,2] [,3] [,4] ## [1,] -0.4725434 -0.86103020 0.1772238 0.06262165 ## [2,] -0.5808125 0.26445681 -0.4772690 0.60409743 ## [3,] -0.5074665 0.42545116 0.7443517 -0.08607947 ## [4,] -0.4264239 0.08764073 -0.4321419 -0.78976904 ds <- diag(m.svd$d[1:2])  # let's just use the first two values
us <- as.matrix(m.svd$u[, 1:2]) vs <- as.matrix(m.svd$v[, 1:2])
m.approx1 <- us %*% ds %*% t(vs)
m.approx1
##           [,1]      [,2]      [,3]     [,4]
## [1,]  4.415069 14.211315 13.707656 9.726325
## [2,] 11.368776  3.139051  1.150429 3.177126
## [3,]  6.563779 13.034360 12.118249 9.169675
## [4,] 15.393557  7.484549  4.858783 6.415958
ds <- diag(m.svd$d[1:3]) # let's now use the first three values us <- as.matrix(m.svd$u[, 1:3])
vs <- as.matrix(m.svd$v[, 1:3]) m.approx2 <- us %*% ds %*% t(vs) m.approx2 # m.approx2 will never be a worst approximation than m.approx1 ## [,1] [,2] [,3] [,4] ## [1,] 4.970508 12.715500 16.040539 8.371943 ## [2,] 12.035588 1.343308 3.951081 1.551175 ## [3,] 6.044785 14.432026 9.938439 10.435189 ## [4,] 14.963080 8.643836 3.050751 7.465632 # we could compute the sum of squared errors approx.error <- function(m1,m2) { sum((m1-m2)^2) } approx.error(m,m.approx1) ## [1] 39.85916 approx.error(m,m.approx2) ## [1] 1.403817 ### robust SVD SVD is guaranteed to produce the optimum approximation concerning the sum of square differences, ie, according to the $$L_2$$ norm. This means that SVD is sensible to outliers that might move the components more than expected. It’s possible to perform a more robust SVD based on the $$L_1$$ norm: # to install # source("http://bioconductor.org/biocLite.R") # biocLite("pcaMethods") library(pcaMethods) ## Loading required package: Biobase ## Loading required package: BiocGenerics ## Loading required package: parallel ## ## Attaching package: 'BiocGenerics' ## ## The following objects are masked from 'package:parallel': ## ## clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, ## clusterExport, clusterMap, parApply, parCapply, parLapply, ## parLapplyLB, parRapply, parSapply, parSapplyLB ## ## The following object is masked from 'package:stats': ## ## xtabs ## ## The following objects are masked from 'package:base': ## ## anyDuplicated, append, as.data.frame, as.vector, cbind, ## colnames, do.call, duplicated, eval, evalq, Filter, Find, get, ## intersect, is.unsorted, lapply, Map, mapply, match, mget, ## order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, ## rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, ## table, tapply, union, unique, unlist, unsplit ## ## Welcome to Bioconductor ## ## Vignettes contain introductory material; view with ## 'browseVignettes()'. To cite Bioconductor, see ## 'citation("Biobase")', and for packages 'citation("pkgname")'. ## ## Loading required package: Rcpp ## ## Attaching package: 'pcaMethods' ## ## The following object is masked from 'package:stats': ## ## loadings library(matrixStats) ## matrixStats v0.14.0 (2015-02-13) successfully loaded. See ?matrixStats for help. ## ## Attaching package: 'matrixStats' ## ## The following objects are masked from 'package:Biobase': ## ## anyMissing, rowMedians # From the examples: ## Load a complete sample metabolite data set and mean center the data data(metaboliteDataComplete) mdc <- prep(metaboliteDataComplete, center=TRUE, scale="none") ## Now create 5% of outliers. cond <- runif(length(mdc)) < 0.05; mdcOut <- mdc mdcOut[cond] <- 10 ## Now we do a conventional SVD and a robustSvd on both, the original and the ## data with outliers. resSvd <- svd(mdc) resSvdOut <- svd(mdcOut) resRobSvd <- robustSvd(mdc) resRobSvdOut <- robustSvd(mdcOut) ## Now we plot the results for the original data against those with outliers ## We can see that robustSvd is hardly affected by the outliers. plot(resSvd$v[,1], resSvdOut$v[,1]) plot(resRobSvd$v[,1], resRobSvdOutv[,1]) ## The three transformations Next stuff based on this tutorial # Let's make a dot-by-dot picture xs = seq(-10,10,.1) pts = matrix(c(xs,((xs+5)/10)*sin(xs)),length(xs),2) # eg: twisted sine wave # given two new vectors defining a new perpframe, ie, orthogonal axis... # p1 and p2 define the vector for the new x-acis and y-axis respectively theta = pi/4 # rotate 45 degrees p1 = c( cos(theta),sin(theta)) p2 = c(-sin(theta),cos(theta)) # So p1 and p2 define a new axis. We will make two transformations, one to # go from the old to the new axis (hanger) and vice-versa (aligner) # aligner: its job is to align the perpframe into the xy axis aligner <- matrix(rbind(p1,p2),2,2) aligner ## [,1] [,2] ## [1,] 0.7071068 0.7071068 ## [2,] -0.7071068 0.7071068 # hanger: its job is to align the xy axis into the perpframe hanger <- matrix(cbind(p1,p2),2,2) hanger ## [,1] [,2] ## [1,] 0.7071068 -0.7071068 ## [2,] 0.7071068 0.7071068 # Show both transformations # Hanger plot(c(-10,10), c(-10,10), type="n",xlab = "x", ylab = "y", main = "Hanger") mtext("hanging object to the new frame axis",3) scale.arrow <- 3 # just for the plot presentation arrows(0,0,p1[1]*scale.arrow,p1[2]*scale.arrow,col="blue") arrows(0,0,p2[1]*scale.arrow,p2[2]*scale.arrow,col="red") abline(h=0,col="grey"); abline(v=0,col="grey") points(pts,type="l") # draw the sequence hanger.pts <- t(apply(pts,1,function(p) hanger %*% p)) points(hanger.pts,type="l",col="green") legend("topleft", c("new x axis", "new y axis"), col = c("blue","red"), lwd=1) text(0, -9, "green: after rotation into the new axis") # Aligner plot(c(-10,10), c(-10,10), type="n",xlab = "x", ylab = "y", main = "Aligner") mtext("aligning transformed object again into the orignal frame axis",3) scale.arrow <- 3 # just for the plot presentation arrows(0,0,p1[1]*scale.arrow,p1[2]*scale.arrow,col="blue") arrows(0,0,p2[1]*scale.arrow,p2[2]*scale.arrow,col="red") abline(h=0,col="grey"); abline(v=0,col="grey") points(hanger.pts,type="l") # draw the sequence aligned.pts <- t(apply(hanger.pts,1,function(p) aligner %*% p)) points(aligned.pts,type="l",col="green") legend("topleft", c("new x axis", "new y axis"), col = c("blue","red"), lwd=1) text(0, -9, "green: (re)aligning to original axis") Now we have both transformations that map one axis to the other. We just need to include an extra transformation: scaling. #stretcher: its job is to x-scale and y-scale the object stretcher <- matrix(c(1, 0, 0, 4),2,2) # Stretcher plot(c(-10,10), c(-10,10), type="n",xlab = "x", ylab = "y", main = "Stretcher") mtext("scaling object in both axis (herein just 4x in y-axis)",3) abline(h=0,col="grey"); abline(v=0,col="grey") points(pts,type="l") # draw the sequence stretcher.pts <- t(apply(pts,1,function(p) stretcher %*% p)) points(stretcher.pts,type="l",col="red") text(0, -9, "red: scaled object") With these three operations, we can transform the original object in three distict steps: + hang to the new axis + scale it + align to the original axis again This can be done by matrix multiplication (as seen in the next eg). ### # here's an example hanger ## [,1] [,2] ## [1,] 0.7071068 -0.7071068 ## [2,] 0.7071068 0.7071068 stretcher ## [,1] [,2] ## [1,] 1 0 ## [2,] 0 4 aligner ## [,1] [,2] ## [1,] 0.7071068 0.7071068 ## [2,] -0.7071068 0.7071068 product <- aligner %*% stretcher %*% hanger product ## [,1] [,2] ## [1,] 2.5 1.5 ## [2,] 1.5 2.5 # All together now! plot(c(-10,10), c(-10,10), type="n",xlab = "x", ylab = "y", main = "Complete transformation") mtext("making all three transformations in one step",3) abline(h=0,col="grey"); abline(v=0,col="grey") points(pts,type="l") # draw the sequence new.pts <- t(apply(pts,1,function(p) product %*% p)) points(new.pts,type="l",col="red") text(0, -9, "red: transformed object") ## SVD & (Hang,Stretch,Align) product ## [,1] [,2] ## [1,] 2.5 1.5 ## [2,] 1.5 2.5 m1.svd <- svd(product) m1.svd ##d
## [1] 4 1
##
## $u ## [,1] [,2] ## [1,] -0.7071068 -0.7071068 ## [2,] -0.7071068 0.7071068 ## ##$v
##            [,1]       [,2]
## [1,] -0.7071068 -0.7071068
## [2,] -0.7071068  0.7071068
m1.svd$u %*% diag(m1.svd$d) %*% t(m1.svd$v) # should be equal to product ## [,1] [,2] ## [1,] 2.5 1.5 ## [2,] 1.5 2.5 # what this means is that svd$v does the hanging, svd$d does the scaling (after we use # those numbers to make a diagonal matrix), and svd$u does the aligning back!

###########
# Besides, we can also use just the first components of svd$d to # compute approximate transformations. When the first values of svd$d
# are relatively very big (ie, they explain most of the variance) the last
# components are not very important (if they indeed contribute at all)
# and we are able to have condensed transformations (which saves space) that
# are good approximations of the original transformation
###########
ds <- as.matrix(m1.svd$d[1]) # eg, just use the first value us <- as.matrix(m1.svd$u[, 1])
vs <- as.matrix(m1.svdv[, 1]) m1.approx <- us %*% ds %*% t(vs) m1.approx ## [,1] [,2] ## [1,] 2 2 ## [2,] 2 2 ## Some examples We can use the hanger, strecher, aligner 2x2 matrixes to manipulate points in 2D. Some egs follow: # Question: What are the coordinates of point P=(x,y) at p1,p2 perpframe? # Answer: aligner %*% P # Question: A point P has cord (x',y') at the perpframe, what are its # coordinates at the xy axis? # Answer: hanger %*% P ### Eg: reflect a curve over a line xs <- seq(0,2*pi,.025) pts <- matrix(c(1+cos(xs)*(.2-sin(xs))*sin(xs), 1.2+cos(xs)*(.6-sin(xs))),length(xs),2) theta <- pi/4 # axis rotation p1 = c(cos(theta),sin(theta)) p2 = c(-sin(theta),cos(theta)) plot(c(-3,3), c(-3,3), type="n", main="Mirroring figure") abline(h=-3:3, v=-3:3, col="lightgray", lty=3) abline(h=0);abline(v=0) arrows(0,0,p1[1],p1[2],col="black") arrows(0,0,p2[1],p2[2],col="red") abline(0,-1,col="red") # p2 axis (the new y axis) points(pts,type="l",col="black") # draw original figure # apply the transformation (mirror over y axis) stretcher <- matrix(c(-1, 0, 0 ,1),2,2) stretcher.pts <- t(apply(pts,1,function(p) stretcher %*% p)) points(stretcher.pts,type="l",col="green") # apply the transformation (mirror over p2 axis) aligner <- matrix(rbind(p1,p2),2,2) stretcher <- matrix(c(-1, 0, 0 ,1),2,2) hanger <- matrix(cbind(p1,p2),2,2) product.pts <- t(apply(pts,1,function(p) hanger %*% stretcher %*% aligner %*% p)) points(product.pts,type="l",col="blue") legend("bottomleft", c("original", "mirror y-axis", "mirror over p2"), col = c("black","green","blue"), lwd=1) This next eg fits an elipse to sample $$n$$ points $$(x_i,y_i)$$, ie, the ellipse will have the slope of the linear regression, with axis the size of both standard deviations Notice that we do not use linear regression (miniziming distances to yi) but Deming Linear Regression, ie, minimizing distances to data points, since both x and y both the y and x values are observed, and neither is controlled by the person running the experiment. Check http://www.had2know.com/academics/deming-regression-perpendicular-distances.html. library(MethComp) # use: deming regression ## Loading required package: nlme # function to compute the distance between two points distance.points <- function(p1,p2) sqrt((p1[1]-p2[1])^2+(p1[2]-p2[2])^2) # function that draws ellipses (definition is the sides of the polygon used to draw the ellipse) draw.ellipse <- function(x0=0, y0=0, a=1, b=1, angle=0, definition=50, color="black") { theta <- seq(0, 2 * pi, length=definition) x <- x0 + a * cos(theta) * cos(angle) - b * sin(theta) * sin(angle) y <- y0 + a * cos(theta) * sin(angle) + b * sin(theta) * cos(angle) points(x, y, type = "l", col=color) } # returns both foci of a given ellipse compute.foci <- function(x0,y0,a,b,angle) { distance.foci.center <- sqrt(abs(a^2 - b^2)) if (a > b) { # foci lie over slope foci.1 <- c(x0+cos(angle)*distance.foci.center, y0+sin(angle)*distance.foci.center) foci.2 <- c(x0-cos(angle)*distance.foci.center, y0-sin(angle)*distance.foci.center) } else { # foci lie over slope's prependicular foci.1 <- c(x0-sin(angle)*distance.foci.center, y0+cos(angle)*distance.foci.center) foci.2 <- c(x0+sin(angle)*distance.foci.center, y0-cos(angle)*distance.foci.center) } return(c(foci.1,foci.2)) } set.seed(101) ### create sample data set size <- 100 # number of data points pts <- matrix(c(1.1 + rnorm(size,0,1.0), -.5 + rnorm(size,0,2.25)), size, 2) theta <- pi/6 # rotate data points 30 degrees p1 <- c( cos(theta),sin(theta)) # find vectors of the new perspective frame p2 <- c(-sin(theta),cos(theta)) hanger <- matrix(cbind(p1,p2),2,2) # rotation matrix to perpframe (see above) for(i in 1:size) { # rotate all sample points pts[i,] <- hanger %*% pts[i,] } rm("theta","p1","p2","hanger") # erase memory of sample construction # So, *pts* is our initial dataset! How can we find the fitting elipsoid? # First find the appropriate perspective frame (prepframe) using deming regression dlm <- Deming(x=pts[,1], y=pts[,2]) # make deming regression dlm ## Intercept Slope sigma.pts[, 1] sigma.pts[, 2] ## 2.4134327 -1.9684156 0.9322395 0.9322395 plot(pts, xlim=c(-5,5), ylim=c(-5,5), xlab="xs", ylab="ys", pch=19) # plot data... abline(h=0,col="grey"); abline(v=0,col="grey") title("Initial Data Set") abline(dlm[1:2],col="green") # ... and deming line (just for show!) angle.slope <- atan(dlm["Slope"]) # get slope angle in radians # we need to find the standard deviations over that slope. So we will # rotate it to the xy-axis using aligner and compute those sd's p1 <- c( cos(angle.slope),sin(angle.slope)) p2 <- c(-sin(angle.slope),cos(angle.slope)) aligner <- matrix(rbind(p1,p2),2,2) # transformation matrix aligned.pts <- matrix(c(NA,NA),size,2) for(i in 1:size) { aligned.pts[i,] <- aligner %*% pts[i,] # align each data point to xy-axis } plot(aligned.pts, xlim=c(-5,5), ylim=c(-5,5), xlab="xs", ylab="ys", pch=19) # show aligned dataset title("Aligned Data Set") abline(h=0,col="grey"); abline(v=0,col="grey")  sd.xs <- sd(aligned.pts[,1]) # compute sd's sd.ys <- sd(aligned.pts[,2]) # Now we have all the data to draw the ellipse(s) plot(pts, xlim=c(-5,5), ylim=c(-5,5), xlab="xs", ylab="ys", pch=19) # plot data... title("Initial Data Set with Fitting Ellipses") abline(h=0,col="grey"); abline(v=0,col="grey") legend("topleft", c("one sigma", "two sigma", "three sigma"), col = c("red","blue","green"), lwd=2) draw.ellipse(mean(pts[,1]), mean(pts[,2]), sd.xs, sd.ys, angle.slope) # and the ellipse draw.ellipse(mean(pts[,1]), mean(pts[,2]), 2*sd.xs, 2*sd.ys, angle.slope, color="red") # contains points under two standard deviations draw.ellipse(mean(pts[,1]), mean(pts[,2]), 3*sd.xs, 3*sd.ys, angle.slope, color="green")  # we could also have the option to color the points inside each sigma-ellipse # Def: An ellipse is the locus of points such that the sum of the distances # from each point to the two foci equals twice the length of the major axis. foci <- compute.foci(mean(pts[,1]),mean(pts[,2]),sd.xs,sd.ys,angle.slope) foci.1 <- foci[1:2] foci.2 <- foci[3:4] # local function that returns in which 'sigma-elipse' it belongs (up to 6-sigma) sigma.elipse <- function(p,x0,y0,a,b,angle) { for(i in 1:6) { foci <- compute.foci(x0,y0,i*a,i*b,angle) foci.1 <- foci[1:2] foci.2 <- foci[3:4] if(2*i*max(a,b) >= distance.points(p,foci.1) + distance.points(p,foci.2)) return (i) } return (7) } which.sigma <- rep(NA,size) for(i in 1:size) { which.sigma[i] <- sigma.elipse(c(pts[i,1],pts[i,2]), mean(pts[,1]),mean(pts[,2]),sd.xs,sd.ys,angle.slope) # check for each point } plot(pts, xlim=c(-5,5), ylim=c(-5,5), xlab="xs", ylab="ys",type="n") # plot data... title("Show what data points are inside each ellipse") abline(h=0,col="grey"); abline(v=0,col="grey") legend("bottomleft", c("one sigma", "two sigma", "three sigma", "four sigma"), col = 1:4, lwd=2) points(pts,pch=19,col=which.sigma) Comparing previous eg with the respective SVD plot(pts, xlim=c(-5,5), ylim=c(-5,5), xlab="xs", ylab="ys", pch=19) angle.slope*180/pi # the rotation for the deming regression ## Slope ## -63.06839 abline(dlm[1:2],col="black") sd.xs # their sd's ## [1] 2.262085 sd.ys ## [1] 0.9275193 svd.pts <- svd(pts) svd.ptsd
## [1] 23.35590 13.94837
t(svd.ptsv) # this is the aligner rotation for the SVD transform ## [,1] [,2] ## [1,] 0.5976120 -0.8017855 ## [2,] 0.8017855 0.5976120 svd.slope <- as.numeric(acos(t(svd.ptsv[1,1]))) # its rotation (in radians)
abline(a=0,b=svd.slope,col="green")

rotated.pts <- matrix(c(NA,NA),size,2) # rotate the original points
for(i in 1:size) {
rotated.pts[i,] <- t(svd.ptsv) %*% pts[i,] # align each data point to new axis } points(rotated.pts,col="red") sd(rotated.pts[,1]) # check their sd's after the alignment ## [1] 2.234842 sd(rotated.pts[,2]) ## [1] 0.9913653 final.pts <- svd.ptsu %*% diag(svd.pts$d) %*% t(svd.pts$v) # == original pts
points(final.pts,col="green")

## Generalized Inverse

SVD to find a generalized inverse of a non-full-rank matrix.

For a square matrix $$A$$ with a non-zero determinant, there exists an inverse matrix $$B$$ such that $$A \times B = I$$ and $$B \times A = I$$.

For a matrix that is not square, generalized inverse matrices have some (but not all) of the properties of an inverse matrix. SVD can be used to find a generalized inverse matrix.

In the example below, we use SVD to find a generalized inverse $$B$$ to the matrix $$A$$ such that $$A \times B \times A = A$$. We compare our generalized inverse with the one generated by the ginv command (MASS library).

source

a <- matrix(c(1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 1, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1), 9, 4)
a
##       [,1] [,2] [,3] [,4]
##  [1,]    1    1    0    0
##  [2,]    1    1    0    0
##  [3,]    1    1    0    0
##  [4,]    1    0    1    0
##  [5,]    1    0    1    0
##  [6,]    1    0    1    0
##  [7,]    1    0    0    1
##  [8,]    1    0    0    1
##  [9,]    1    0    0    1
a.svd <- svd(a)
a.svd$d ## [1] 3.464102e+00 1.732051e+00 1.732051e+00 1.922963e-16 ds <- diag(1/a.svd$d[1:3]) # take the first 3 values, since the 4th is almost zero
u <- a.svd$u v <- a.svd$v
us <- as.matrix(u[, 1:3])
vs <- as.matrix(v[, 1:3])
a.ginv <- vs %*% ds %*% t(us)
round(a %*% a.ginv %*% a,0) # round just for pretty print
##       [,1] [,2] [,3] [,4]
##  [1,]    1    1    0    0
##  [2,]    1    1    0    0
##  [3,]    1    1    0    0
##  [4,]    1    0    1    0
##  [5,]    1    0    1    0
##  [6,]    1    0    1    0
##  [7,]    1    0    0    1
##  [8,]    1    0    0    1
##  [9,]    1    0    0    1
# Using the library R function to compare:
round(a %*% ginv(a) %*% a,0) # round just for pretty print
##       [,1] [,2] [,3] [,4]
##  [1,]    1    1    0    0
##  [2,]    1    1    0    0
##  [3,]    1    1    0    0
##  [4,]    1    0    1    0
##  [5,]    1    0    1    0
##  [6,]    1    0    1    0
##  [7,]    1    0    0    1
##  [8,]    1    0    0    1
##  [9,]    1    0    0    1

## Principal Components Analysis

# The principal components are equal to the right singular values if
# you first scale (subtract the mean, divide by the standard deviation)
# the variables
m1 <- matrix(sample(1:16,16),4,4)
m1.scale.svd <- svd(scale(m1))
m1.scale.svd
## $d ## [1] 2.882267e+00 1.693190e+00 9.086496e-01 3.772009e-17 ## ##$u
##            [,1]       [,2]       [,3] [,4]
## [1,] -0.2289567  0.7605523  0.3451653  0.5
## [2,]  0.5439093 -0.4082943  0.5361514  0.5
## [3,] -0.7061765 -0.4866231 -0.1204686  0.5
## [4,]  0.3912239  0.1343651 -0.7608481  0.5
##
## $v ## [,1] [,2] [,3] [,4] ## [1,] 0.5992552 0.07637774 0.003260894 -0.79689960 ## [2,] 0.5419748 0.25090503 -0.677801421 0.42883010 ## [3,] 0.5570447 0.00629217 0.715000188 0.42241721 ## [4,] -0.1919835 0.96497332 0.171316467 -0.05118085 m1.pca <- prcomp(m1,scale=T) m1.pca ## Standard deviations: ## [1] 1.664078e+00 9.775637e-01 5.246091e-01 2.177770e-17 ## ## Rotation: ## PC1 PC2 PC3 PC4 ## [1,] 0.5992552 0.07637774 0.003260894 -0.79689960 ## [2,] 0.5419748 0.25090503 -0.677801421 0.42883010 ## [3,] 0.5570447 0.00629217 0.715000188 0.42241721 ## [4,] -0.1919835 0.96497332 0.171316467 -0.05118085 # m1.pca$rotation == m1.scale.svd$v # plot the 1st component/right singular vector just to check they are the same plot(m1.pca$rotation[,1],m1.scale.svd\$v[,1],pch=19,
xlab="Principal Component 1",
ylab="Right Singular Vector 1")
abline(c(0,1))  # 45 degree line

## Image Processing

source It reads in a jpeg (pansy.jpg) and plots it in R, first in color (when the image is stored as three matrices–one red, one green, one blue) and then in grayscale (when the image is stored as one matrix). Then, using SVD, we can essentially compress the image. Note that we can recover the image to varying degrees of detail as we recreate the image from different numbers of dimensions from our SVD matrices. You can see how many dimensions are needed before you have an image that cannot be differentiated from the original.

This code uses package EBImage. intro ref.manual To install: source("http://bioconductor.org/biocLite.R"); biocLite("EBImage")

library("EBImage")
##
## Attaching package: 'EBImage'
##
## The following object is masked from 'package:Biobase':
##
##     channel
pic <- readImage("pansy.jpg")
dims <- dim(pic)
plot(c(0, dims[1]), c(0, dims[2]), type='n', xlab="", ylab="")
rasterImage(pic,0,0,dims[1],dims[2])        # present image