Learn Randomized Response | Differential Privacy in Machine Learning & Real Systems

Randomized Response

Randomized response is a foundational technique for achieving local differential privacy in survey data collection. When you use randomized response, each participant perturbs their answer to a sensitive binary question (such as "Have you ever committed tax fraud?") according to a probabilistic protocol, so that even the data collector cannot be certain of any individual's true answer. The protocol still allows you to estimate population-level statistics, with a privacy guarantee that holds for every respondent.

The basic randomized response protocol for a binary question works as follows: each respondent privately makes a random draw (for example, a biased coin flip or a random number). With probability $p$, they report their true answer. With probability $1 - p$, they report a random answer, chosen with equal probability between "Yes" and "No". This way, even if a respondent answers "Yes", you cannot be sure whether it reflects their real answer or was randomly chosen. Each individual's answer is therefore deniable, while the aggregate responses can still be used to estimate the true proportion of "Yes" answers in the population, once you correct for the noise introduced by the protocol.
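To make the noise correction concrete, here is a short derivation under the protocol as described above. The symbols $\pi$ (true proportion of "Yes" in the population), $\hat{q}$ (observed proportion of reported "Yes") and $\hat{\pi}$ (the corrected estimate) are notation introduced here for illustration, not taken from the lesson.

$$
\Pr[\text{report Yes} \mid \text{truth Yes}] = p + \frac{1 - p}{2} = \frac{1 + p}{2},
\qquad
\Pr[\text{report Yes} \mid \text{truth No}] = \frac{1 - p}{2}
$$

$$
\mathbb{E}[\hat{q}] = \pi \cdot \frac{1 + p}{2} + (1 - \pi) \cdot \frac{1 - p}{2} = p\,\pi + \frac{1 - p}{2}
\quad\Longrightarrow\quad
\hat{\pi} = \frac{\hat{q} - \frac{1 - p}{2}}{p}
$$

The observed rate is a shrunken, shifted copy of the true rate, so inverting that linear map recovers an unbiased estimate. Its variance grows as $p$ gets smaller, which is the price paid for stronger privacy.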

```python
# Randomized response for a binary ("Yes"/"No") question
import random

def randomized_response(true_answer: bool, p: float = 0.7) -> bool:
    """
    Simulates the randomized response protocol for a binary question.

    Args:
        true_answer (bool): The respondent's actual answer (True for "Yes", False for "No").
        p (float): Probability to report the true answer (0 < p < 1).

    Returns:
        bool: The (possibly randomized) reported answer.
    """
    if random.random() < p:
        return true_answer
    else:
        return random.choice([True, False])

# Example: simulate 10 responses from a respondent whose true answer is "Yes"
responses = [randomized_response(True, p=0.7) for _ in range(10)]
print("Simulated responses:", responses)
```
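The lesson states that aggregate responses can be used to estimate the true "Yes" proportion once the noise is accounted for. The following is a minimal sketch of that correction step, not a definitive implementation. It assumes the protocol exactly as described above, so the expected observed rate is `p * true_rate + (1 - p) / 2`. The helper `estimate_true_proportion`, the seeded `rng`, the population size `n`, and the `true_rate` of 0.2 are all hypothetical choices made for illustration.

```python
import random
from typing import Sequence

def randomized_response(true_answer: bool, p: float, rng: random.Random) -> bool:
    """Report the truth with probability p, otherwise a fair coin flip."""
    if rng.random() < p:
        return true_answer
    return rng.random() < 0.5

def estimate_true_proportion(reports: Sequence[bool], p: float) -> float:
    """Invert E[observed] = p * true_rate + (1 - p) / 2 to debias the observed rate."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if len(reports) == 0:
        raise ValueError("need at least one report")
    observed = sum(reports) / len(reports)
    estimate = (observed - (1 - p) / 2) / p
    # Sampling noise can push the raw estimate slightly outside [0, 1], so clamp it.
    return min(1.0, max(0.0, estimate))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
p = 0.7
true_rate = 0.2         # hypothetical population rate, for illustration only
n = 100_000             # hypothetical number of respondents

truths = [rng.random() < true_rate for _ in range(n)]
reports = [randomized_response(t, p, rng) for t in truths]

observed_rate = sum(reports) / n
estimated_rate = estimate_true_proportion(reports, p)
print(f"observed Yes rate:   {observed_rate:.3f}")
print(f"estimated true rate: {estimated_rate:.3f}")
```

The raw observed rate is biased toward 0.5, while the corrected estimate should land close to the true rate when the number of respondents is large. Clamping to [0, 1] is a practical choice that introduces a small bias near the boundaries; drop it if you need a strictly unbiased estimator.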
Note
Study More

The mathematical privacy guarantee of randomized response can be analyzed using the concept of $\varepsilon$-local differential privacy. By carefully choosing the probability $p$, you can control the privacy parameter $\varepsilon$, which quantifies how much the reported answer reveals about the true answer. For a deeper treatment, see Dwork & Roth's "The Algorithmic Foundations of Differential Privacy" (2014), Section 2.4.
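As a hedged illustration of how the truth probability maps to the privacy parameter: under the protocol described in this lesson, a true "Yes" is reported as "Yes" with probability (1 + p) / 2 and a true "No" is reported as "Yes" with probability (1 - p) / 2. The worst-case ratio of these is (1 + p) / (1 - p), so epsilon = ln((1 + p) / (1 - p)). The function names `epsilon_from_p` and `p_from_epsilon` below are hypothetical helpers introduced for this sketch, and the formula applies only to this specific variant of the protocol.

```python
import math

def epsilon_from_p(p: float) -> float:
    """Privacy parameter for this lesson's protocol: ln((1 + p) / (1 - p))."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1); p = 1 gives no privacy (infinite epsilon)")
    return math.log((1 + p) / (1 - p))

def p_from_epsilon(epsilon: float) -> float:
    """Inverse mapping: p = (e^eps - 1) / (e^eps + 1), which equals tanh(eps / 2)."""
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    return math.tanh(epsilon / 2)

# Larger p means more truthful reports, hence a larger epsilon (weaker privacy).
for p in (0.1, 0.5, 0.7, 0.9):
    print(f"p = {p:.1f} -> epsilon = {epsilon_from_p(p):.3f}")
```

With p = 0.5 this gives epsilon = ln 3, and p = 0 gives epsilon = 0, where reports are pure noise and carry no information about the true answer.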

1. Which of the following best describes how randomized response protects individual privacy in a survey?

2. If you collect many randomized responses to a binary question using the randomized response protocol, what must you do to accurately estimate the true proportion of "Yes" answers in the population?


Section 3. Chapter 3

