## Random Vectors

**Random vectors** represent multiple related random variables grouped together. They're commonly used in Data Science to model systems with interconnected random quantities, such as datasets where each vector corresponds to a data point.

The probability distribution of a random vector is described by its **joint distribution function**, which describes how all coordinates of the vector are distributed together.

To create a simple random vector with `n` dimensions:

- Generate `n` independent random variables, each following its distribution function;
- Use these variables as coordinates for the vector;
- Apply the multiplication rule to determine the joint distribution: `f = f1 * f2 * ... * fn`.
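
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the choice of a standard normal and a uniform coordinate, and the helper names `normal_pdf`, `uniform_pdf`, `sample_vector` and `joint_pdf`, are hypothetical and not part of the lesson.

```python
import math
import random


def normal_pdf(x, mean=0.0, std=1.0):
    """Density of a one-dimensional normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def uniform_pdf(x, low=0.0, high=1.0):
    """Density of a uniform distribution on [low, high] at x."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def sample_vector(rng):
    """Draw one 2-dimensional vector whose coordinates are independent."""
    return (rng.gauss(0.0, 1.0), rng.uniform(0.0, 1.0))


def joint_pdf(point):
    """Multiplication rule for independent coordinates: f = f1 * f2."""
    x1, x2 = point
    return normal_pdf(x1) * uniform_pdf(x2)


rng = random.Random(0)          # fixed seed, only for reproducibility
vector = sample_vector(rng)     # one realization of the random vector
density = joint_pdf(vector)     # joint density evaluated at that point
```

The product form in `joint_pdf` is valid only because the coordinates are generated independently.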

## Discrete random vectors

For discrete random vectors, the joint distribution is described by a probability mass function (PMF): it takes a combination of coordinate values and returns the probability of that combination.

For example, consider **tossing two fair coins** and recording the results in a vector. The PMF for this vector looks like this:

| First coin / Second coin | Head | Tail |
|---|---|---|
| Head | 0.25 | 0.25 |
| Tail | 0.25 | 0.25 |
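
The same table can be built programmatically. The sketch below assumes both coins are fair and tossed independently; the names `coin_pmf` and `joint_pmf` are illustrative.

```python
from itertools import product

# Marginal PMF of a single coin (assumed fair).
coin_pmf = {"Head": 0.5, "Tail": 0.5}

# Joint PMF of the vector (first coin, second coin).
# Independent tosses, so each joint probability is a product of marginals.
joint_pmf = {
    (first, second): coin_pmf[first] * coin_pmf[second]
    for first, second in product(coin_pmf, repeat=2)
}

# The probabilities of all outcomes of the vector must sum to 1.
total = sum(joint_pmf.values())
```

Each key of `joint_pmf` is one cell of the table, for example `joint_pmf[("Head", "Tail")]`.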

## Continuous random vectors

For continuous random vectors, the joint distribution is described by a multivariate probability density function (PDF).
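
As a minimal sketch, assuming a two-dimensional Gaussian vector, the density and a set of samples could be computed with NumPy as follows. The distribution family, the values of `mu` and `cov`, and the helper `gaussian_pdf` are illustrative assumptions rather than the lesson's own example.

```python
import numpy as np

# Illustrative parameters (assumed values).
mu = np.array([0.0, 1.0])        # mean value of each coordinate
cov = np.array([[1.0, 0.0],
                [0.0, 2.0]])     # diagonal covariance: independent coordinates


def gaussian_pdf(x, mu, cov):
    """Density of a multivariate normal distribution at point x."""
    x = np.asarray(x, dtype=float)
    d = mu.shape[0]
    diff = x - mu
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    # solve(cov, diff) computes cov^{-1} @ diff without forming the inverse
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)


rng = np.random.default_rng(0)   # fixed seed, only for reproducibility
samples = rng.multivariate_normal(mu, cov, size=1000)  # array of shape (1000, 2)
peak_density = gaussian_pdf(mu, mu, cov)               # density at the mean
```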

Note

The characteristics of a random vector are also given in vector or matrix form, built from the statistics of its coordinates. For example, for a two-dimensional random vector, the `mu` vector holds the mean values of the coordinates, and `cov` is the covariance matrix of the vector.

## Vectors with dependent coordinates

Some vectors have **coordinates that depend on each other**. In that case, the joint distribution can no longer be defined as the product of the distributions of the coordinates. Instead, it is specified using **domain knowledge** or some **additional information** about the dependencies between the coordinates.

Consider two-dimensional Gaussian samples with dependent coordinates. The dependencies are set by the covariance matrix: its off-diagonal elements describe the dependency between the corresponding pair of coordinates.
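
A hedged sketch of such samples, using NumPy: the covariance value `-0.9`, the sample size, and the variable names are assumptions chosen only to illustrate a negative dependency.

```python
import numpy as np

mu = np.array([0.0, 0.0])            # mean values of the X and Y coordinates
cov = np.array([[1.0, -0.9],
                [-0.9, 1.0]])        # negative off-diagonal: X and Y move in opposite directions

rng = np.random.default_rng(0)       # fixed seed, only for reproducibility
samples = rng.multivariate_normal(mu, cov, size=5000)
x, y = samples[:, 0], samples[:, 1]

# The sample correlation should be close to the -0.9 implied by cov
# (both variances are 1, so covariance and correlation coincide here).
sample_corr = np.corrcoef(x, y)[0, 1]
```

Plotting `x` against `y` as a scatter plot makes the dependency visible as a downward-sloping cloud of points.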

When the off-diagonal element is strongly negative, the coordinates exhibit a strong linear dependency: as the X coordinate increases, the Y coordinate tends to decrease.
