Refs:
library(MASS) # use: mvrnorm
library(psych) # use: pairs.panels
set.seed(100)
The next code creates a sample from the given multivariate normal distribution:
m <- 3
n <- 2000
mu <- rep(0, m)
sigma <- matrix(c(1.0, 0.4, 0.2,
0.4, 1.0, -0.8,
0.2, -0.8, 1.0), nrow=3)
x <- mvrnorm(n, mu=mu, Sigma=sigma, empirical=TRUE)
colnames(x) <- paste0("x",1:m)
cor(x, method='spearman') # check sample correlation
## x1 x2 x3
## x1 1.0000000 0.3812244 0.1937548
## x2 0.3812244 1.0000000 -0.7890814
## x3 0.1937548 -0.7890814 1.0000000
Function mvrnorm
is nice to generate correlated random variables but there is a restriction, the marginal functions are normal distributions:
pairs.panels(x)
Many empirical evidence have skewed or heavy tailed marginals. What if we would like to create a similar correlated random variables but with arbitrary marginals, say, a Gamma(2,1), a Beta(2,2), and a t(\(\nu\)=5) distribution?
Here’s a possible algorithm to do that:
Generate the variables \(x_i\) from a Gaussian multivariate (as we did)
Remembering the probability integral transformation:
\[X \sim F_X \Rightarrow U = F_X(X) \sim \mathcal{U}(0,1)\]
transform \(x_i\) using the Gaussian cdf, \(\Phi\) (in R is called pnorm
), \(u_i = \Phi(x_i)\), where \(u_i\) have marginal uniform distributions but are still correlated as variables \(x_i\).
u <- pnorm(x)
pairs.panels(u)
plot3d(u[,1],u[,2],u[,3],pch=20,col='navyblue')