Probability Theory Mastering

## Random Vectors

Random vectors are collections of random variables organized into a single vector or array. Each element in the vector represents a random variable, and the vector as a whole represents a multi-dimensional random variable. Random vectors are used to model and analyze systems or phenomena that involve multiple random quantities related to each other. In Data Science, the data with which we train models are usually represented as random vectors: each vector is a set of characteristics of a certain object, and each characteristic is a random variable.
The distribution law of a random vector is described using the joint distribution function: it describes the simultaneous distribution of all the components or variables in the vector.

The simplest random vector of dimension n can be constructed as follows:

1. Independently generate n random variables. Assume f1, f2, ..., fn are functions that represent each variable's distribution (PMF for discrete variables and PDF for continuous variables).
2. Use these variables as coordinates of the vector.
3. By the multiplication rule described in the Probability Theory Basics course, the joint distribution of the whole vector is the product of the distributions of its coordinates: f(x1, x2, ..., xn) = f1(x1) * f2(x2) * ... * fn(xn).
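The construction above can be sketched in Python. This is a minimal illustration, assuming SciPy is available and that all coordinates are Gaussian; the parameter values and the helper `joint_pdf` are hypothetical and chosen only for the example.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Hypothetical parameters of three independent Gaussian coordinates
means = np.array([0.0, 1.0, -2.0])
stds = np.array([1.0, 0.5, 2.0])


def joint_pdf(x):
    """Joint PDF of independent coordinates: product of the marginal PDFs."""
    return np.prod(norm.pdf(x, loc=means, scale=stds))


point = np.array([0.3, 1.2, -1.0])
product_value = joint_pdf(point)

# Cross-check: independent Gaussian coordinates form a multivariate Gaussian
# whose covariance matrix is diagonal (variances on the diagonal)
reference_value = multivariate_normal(mean=means, cov=np.diag(stds**2)).pdf(point)

print(product_value, reference_value)
```

The two printed values should agree up to floating-point error, which is the multiplication rule expressed numerically.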

For discrete values, the joint distribution is still specified using the PMF: the function's input is a combination of values of the vector coordinates, and the function returns the probability of such a combination.
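
For independent discrete coordinates, a joint PMF can be stored as a table indexed by the values of each coordinate. The sketch below assumes NumPy and uses two hypothetical marginal PMFs (a biased coin and a three-outcome spinner) that are not part of the course material.

```python
import numpy as np

# Hypothetical marginal PMFs
coin_pmf = np.array([0.7, 0.3])              # P(head), P(tail)
spinner_pmf = np.array([1 / 3, 1 / 3, 1 / 3])  # P(1), P(2), P(3)

# For independent coordinates: joint_pmf[i, j] = coin_pmf[i] * spinner_pmf[j]
joint_pmf = np.outer(coin_pmf, spinner_pmf)

# Probability of the combination (tail, 2)
p_tail_2 = joint_pmf[1, 1]

print(joint_pmf)
print(p_tail_2)
```

The entries of the table sum to 1, as any PMF must.
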

Suppose we toss two coins and write the result into a vector of the form (result of the first coin, result of the second coin). The PMF of such a vector looks like this:

| First coin \ Second coin | head | tail |
| --- | --- | --- |
| head | 0.25 | 0.25 |
| tail | 0.25 | 0.25 |

For continuous variables, a multivariate PDF is used.

Note

The characteristics of random vectors are also given in vector form: each coordinate of such a characteristic is the corresponding statistic of a coordinate of the original vector. For a two-dimensional Gaussian vector, `mu` is the vector of mean values of the coordinates, and `cov` is the covariance matrix.
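
A multivariate PDF with parameters `mu` and `cov` can be evaluated as in the following sketch. It assumes SciPy's `multivariate_normal`; the specific values of `mu` and `cov` are illustrative and not taken from the course.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative parameters of a two-dimensional Gaussian vector
mu = np.array([0.0, 0.0])          # mean of each coordinate
cov = np.array([[1.0, 0.0],
                [0.0, 1.0]])       # covariance matrix (independent coordinates)

gaussian_vector = multivariate_normal(mean=mu, cov=cov)

# The joint PDF takes a whole vector as input and returns a density value
density_at_mean = gaussian_vector.pdf(mu)
density_elsewhere = gaussian_vector.pdf([1.0, -0.5])

print(density_at_mean, density_elsewhere)
```

For this standard two-dimensional Gaussian, the density at the mean equals 1 / (2π), and it decreases as the point moves away from `mu`.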

The probability that a two-dimensional continuous random vector falls into a certain region equals the volume between that region in the X-Y plane and the surface of the joint PDF above it.
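
This volume can be approximated by numerical integration. The sketch below assumes SciPy's `dblquad` and a standard two-dimensional Gaussian vector; the rectangular region is a hypothetical choice made for the example.

```python
import numpy as np
from scipy.integrate import dblquad
from scipy.stats import multivariate_normal, norm

joint = multivariate_normal(mean=np.zeros(2), cov=np.eye(2))

# Region: -1 <= x <= 1 and 0 <= y <= 2.
# dblquad expects func(y, x): the first argument is the inner variable.
volume, abs_error = dblquad(
    lambda y, x: joint.pdf([x, y]),
    -1.0, 1.0,            # limits for x
    lambda x: 0.0,        # lower limit for y
    lambda x: 2.0,        # upper limit for y
)

# Cross-check: for independent coordinates the probability of a rectangle
# is the product of the one-dimensional probabilities
expected = (norm.cdf(1) - norm.cdf(-1)) * (norm.cdf(2) - norm.cdf(0))

print(volume, expected)
```

The cross-check works only because the coordinates here are independent; for dependent coordinates the integral of the joint PDF is still valid, but the product shortcut is not.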

Let's look at the distribution of a Gaussian vector in space, where each point is a vector with independent Gaussian coordinates.

Note

The `.rvs()` method (short for "random variates") generates random samples from the corresponding distribution.
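
A minimal sketch of sampling points with independent Gaussian coordinates, assuming SciPy's `norm.rvs`; the sample size and seed are arbitrary choices for the example.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n_points = 1000

# Each coordinate is generated separately because the coordinates are independent
x = norm.rvs(loc=0.0, scale=1.0, size=n_points, random_state=rng)
y = norm.rvs(loc=0.0, scale=1.0, size=n_points, random_state=rng)

# Stack the coordinates into an array of two-dimensional vectors
points = np.column_stack([x, y])

# For independent coordinates the sample correlation should be close to zero
sample_corr = np.corrcoef(x, y)[0, 1]
print(points.shape, sample_corr)
```

Plotting `points` as a scatter plot would give a round cloud centered at the origin, with no visible tilt.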

There are also vectors whose coordinates depend on each other. In that case, the joint distribution can no longer be defined as the product of the distributions of the coordinates. Instead, it is specified from domain knowledge or from additional information about the dependencies between the coordinates.

Let's look at an example of two-dimensional Gaussian samples with dependent coordinates. The nature of the dependencies is determined by the covariance matrix: its off-diagonal elements describe the dependencies between the corresponding coordinates.

In this case, we can no longer generate each coordinate separately: we need to generate the entire vector from the given joint PDF.
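
Sampling whole vectors from a joint Gaussian distribution with dependent coordinates might look like the sketch below. It assumes SciPy's `multivariate_normal`; the covariance value 0.8 is a hypothetical choice used only to make the dependence visible.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 0.0])
# Off-diagonal elements are non-zero, so the coordinates are dependent
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Each row of `samples` is one two-dimensional vector drawn from the joint PDF
samples = multivariate_normal(mean=mu, cov=cov).rvs(size=5000, random_state=rng)

# The sample covariance matrix should be close to `cov`
sample_cov = np.cov(samples, rowvar=False)
print(sample_cov)
```

A scatter plot of `samples` would show an elongated cloud tilted along the diagonal, in contrast to the round cloud produced by independent coordinates.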

Note

We can also create vectors with mixed coordinates: one coordinate is discrete, and the other is continuous. But this topic is outside the scope of this course.

Gaussian random variables and Gaussian random vectors are most commonly used in real-life tasks. Some useful properties of the Gaussian distribution will be discussed in the next chapter.

Section 1. Chapter 4