# Data Science Interview Challenge

## Challenge 2: Bayes' Theorem

In probability and statistics, Bayesian thinking offers a framework for updating the probability of an event in light of prior knowledge and new evidence. It contrasts with the frequentist approach, which defines probabilities as the long-run frequencies of events. **Bayes' theorem** is the fundamental tool of the Bayesian framework: it combines a prior probability with observed data to give an updated (posterior) probability.

## Task

Imagine you are a data scientist working for a medical diagnostics company. Your company has developed a new test for a rare disease. The prevalence of this disease in the general population is 1%. The test has a 99% true positive rate (sensitivity) and a 98% true negative rate (specificity).

Your task is to compute the probability that a person who tests positive actually has the disease.

Given:

**P(Disease)** = probability of having the disease = `0.01`

**P(Positive|Disease)** = probability of testing positive given that you have the disease = `0.99`

**P(Negative|No Disease)** = probability of testing negative given that you don't have the disease = `0.98`

Using **Bayes' theorem**:

`P(Disease|Positive) = P(Positive|Disease) * P(Disease) / P(Positive)`

Where **P(Positive)** can be found using the law of total probability:

`P(Positive) = P(Positive|Disease) * P(Disease) + P(Positive|No Disease) * P(No Disease)`

Compute **P(Disease|Positive)**, the probability that a person who tests positive actually has the disease.
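One way to sanity-check the formula is to think in whole-person counts rather than probabilities. The sketch below assumes a hypothetical cohort of 10,000 people (a size chosen only so that every count is an integer) and applies the prevalence, sensitivity, and specificity stated above. The variable names are illustrative and not part of the original challenge.

```python
# Natural-frequency check of the screening problem, using whole-person counts.
# Assumption: a cohort of 10,000 people, picked so that all counts are integers.
population = 10_000

diseased = population * 1 // 100         # 1% prevalence -> 100 people
healthy = population - diseased          # remaining 9,900 people

true_positives = diseased * 99 // 100    # 99% sensitivity -> 99 people
false_positives = healthy * 2 // 100     # 2% false positive rate (1 - 0.98) -> 198 people

all_positives = true_positives + false_positives
share_with_disease = true_positives / all_positives

print(f"{true_positives} of {all_positives} positives have the disease "
      f"({share_with_disease:.4f})")
# prints: 99 of 297 positives have the disease (0.3333)
```

Counting people this way gives the same quantity as Bayes' theorem: the numerator is the number of true positives and the denominator is everyone who tests positive.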

## Code Description

`p_positive_given_no_disease = 1 - p_negative_given_no_disease`

Calculates the probability of testing positive given no disease, which is the complement of testing negative given no disease.

`p_positive = p_positive_given_disease * p_disease + p_positive_given_no_disease * (1 - p_disease)`

Computes the total probability of testing positive using the law of total probability, considering both the probability of testing positive with the disease and without the disease.

`p_disease_given_positive = (p_positive_given_disease * p_disease) / p_positive`

Applies Bayes' theorem to determine the probability that a person who tests positive actually has the disease.
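Put together, the three steps might look like the minimal sketch below. The function name `posterior_disease_given_positive` and its argument order are assumptions made for illustration; the variable names inside the function follow the description above.

```python
def posterior_disease_given_positive(p_disease,
                                     p_positive_given_disease,
                                     p_negative_given_no_disease):
    """Return P(Disease | Positive) via Bayes' theorem (hypothetical helper)."""
    # False positive rate: complement of the specificity.
    p_positive_given_no_disease = 1 - p_negative_given_no_disease

    # Law of total probability: positives from the diseased and healthy groups.
    p_positive = (p_positive_given_disease * p_disease
                  + p_positive_given_no_disease * (1 - p_disease))

    # Bayes' theorem.
    return (p_positive_given_disease * p_disease) / p_positive


# Values from the task: 1% prevalence, 99% sensitivity, 98% specificity.
p_disease_given_positive = posterior_disease_given_positive(0.01, 0.99, 0.98)
print(f"P(Disease|Positive) = {p_disease_given_positive:.4f}")
# prints: P(Disease|Positive) = 0.3333
```

With the task's numbers the result is about 0.33, so only about one in three positive tests corresponds to a real case. The disease is rare, so false positives from the large healthy group outnumber true positives.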
