Data Privacy and Differential Privacy Fundamentals

Local Differential Privacy and Telemetry

Local Differential Privacy (Local DP) is a powerful approach for protecting user data at its source, ensuring that privacy is preserved before any information ever leaves the user's device. Unlike global differential privacy, which relies on a trusted data collector to add noise to aggregated data, Local DP empowers each individual to randomize their own data, making it private even if the data collector is not trustworthy. This model is especially important in scenarios where users do not want to rely on a central authority to protect their information.

Mathematical Guarantee of Local Differential Privacy

A randomized algorithm M satisfies ε-local differential privacy if, for any pair of possible user inputs x and x′, and for any possible output y, the following holds:

P[M(x) = y] ≤ exp(ε) × P[M(x′) = y]

This means that the probability of any specific output does not change much (by more than a factor of exp(ε)) no matter what the user's true input is. The smaller the value of ε, the stronger the privacy guarantee. This ensures that an observer cannot confidently infer a user's true data from the randomized result.
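To make the bound concrete, here is a small numerical check (a sketch, not part of the lesson's exercise) for the classic randomized-response mechanism, which reports the true answer with probability e^ε / (1 + e^ε):

```python
import math

def output_probability(true_value, reported_value, epsilon):
    """P[M(x) = y] for randomized response: report the truth with
    probability e^eps / (1 + e^eps), otherwise report the opposite."""
    p_truth = math.exp(epsilon) / (1 + math.exp(epsilon))
    return p_truth if reported_value == true_value else 1 - p_truth

epsilon = 1.0
# Probability of observing the output True when the input is x = True
# versus when it is x' = False
p_x = output_probability(True, True, epsilon)
p_x_prime = output_probability(False, True, epsilon)

ratio = p_x / p_x_prime
print(f"ratio = {ratio:.4f}, exp(epsilon) = {math.exp(epsilon):.4f}")
```

For this mechanism the ratio equals exp(ε) exactly, so the ε-LDP inequality holds with equality; any observer sees outputs whose likelihoods differ by at most that factor regardless of the true input.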

In Local DP, each user's data is perturbedβ€”typically by adding noise or randomizing the dataβ€”before it is shared. This means that the data collector only ever receives noisy, privacy-protected information, making it nearly impossible to infer anything specific about any individual user. Real-world applications of Local DP include browser telemetry (such as Chrome or Firefox collecting usage statistics), mobile analytics (apps gathering crash reports or usage data), and smart home devices reporting on user interactions.

For example, when a browser wants to understand which features are most popular, it can use Local DP to collect this information without ever learning the true preferences of any single user. Each browser instance randomizes its own data locally, so even if the telemetry server is compromised, individual privacy is preserved. Similarly, mobile app developers can gather insights about app usage or crashes without risking exposure of sensitive user behavior.

Browsers collecting usage statistics where users do not trust the browser vendor

Local DP ensures that each user's browsing data is randomized on their own device before being sent. This protects privacy even if the browser vendor or telemetry server cannot be fully trusted.

Mobile apps reporting sensitive analytics (like location or app usage) directly from user devices

When collecting sensitive data, such as location or detailed app usage, Local DP randomizes this information on the user's device. This prevents the app developer or analytics provider from learning precise personal details.

Smart home devices sending data about user habits without a trusted central authority

Smart home devices can use Local DP to add noise to user interaction data before sending it out. This allows for useful aggregate insights without exposing the specific behaviors of any individual household.

Voting or survey systems where individual responses are highly sensitive and privacy must be protected at the source

In scenarios where responses are especially private, such as voting or confidential surveys, Local DP randomizes each answer before submission. This ensures that no oneβ€”including the system organizerβ€”can reliably link responses to individuals.
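The binary coin flip generalizes to categorical data, such as a browser reporting which feature a user prefers. The sketch below uses k-ary randomized response, where the true category is kept with probability e^ε / (e^ε + k − 1); the feature names are purely illustrative:

```python
import math
import random

def k_ary_randomized_response(true_value, categories, epsilon):
    """Report true_value with probability e^eps / (e^eps + k - 1);
    otherwise report one of the other k - 1 categories uniformly."""
    k = len(categories)
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p_truth:
        return true_value
    # Pick uniformly among the remaining categories
    others = [c for c in categories if c != true_value]
    return random.choice(others)

features = ["tabs", "bookmarks", "reader_mode", "sync"]  # illustrative names
report = k_ary_randomized_response("bookmarks", features, epsilon=1.0)
print("Reported feature:", report)
```

Each device runs this locally, so the telemetry server only ever sees a category that may or may not be the user's real preference, while aggregate frequencies remain estimable.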

import math
import random

def local_dp_coin_flip(response, epsilon=1.0):
    """
    Simulate Local Differential Privacy using randomized response.
    Flips a biased coin to decide whether to report the true response
    or its opposite.

    Args:
        response (bool): The user's true binary response (True/False).
        epsilon (float): Privacy parameter; higher means less privacy.

    Returns:
        bool: The randomized (private) response.
    """
    # Probability of reporting the true answer: e^epsilon / (1 + e^epsilon).
    # This makes the output-probability ratio exactly exp(epsilon),
    # satisfying epsilon-local differential privacy.
    p = math.exp(epsilon) / (1 + math.exp(epsilon))
    if random.random() < p:
        return response
    else:
        return not response

# Example usage:
real_answer = True  # User's true answer
private_answer = local_dp_coin_flip(real_answer, epsilon=0.5)
print("Reported (private) answer:", private_answer)
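Although each individual report is noisy, the collector can still recover accurate aggregate statistics by inverting the known randomization. This sketch (self-contained, redefining the coin-flip mechanism with the truth probability e^ε / (1 + e^ε)) estimates the true fraction of "True" answers from the noisy reports:

```python
import math
import random

def local_dp_coin_flip(response, epsilon=1.0):
    # Report the truth with probability e^eps / (1 + e^eps)
    p = math.exp(epsilon) / (1 + math.exp(epsilon))
    return response if random.random() < p else not response

def estimate_true_fraction(reports, epsilon):
    """Invert the randomization: if f is the true fraction of True answers,
    E[observed fraction] = f * p + (1 - f) * (1 - p), so solve for f."""
    p = math.exp(epsilon) / (1 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)

random.seed(0)
epsilon = 1.0
true_answers = [True] * 700 + [False] * 300  # in reality, 70% answer True
noisy = [local_dp_coin_flip(a, epsilon) for a in true_answers]
est = estimate_true_fraction(noisy, epsilon)
print("Estimated fraction of True answers:", est)
```

This illustrates the central trade-off of Local DP: no single report reveals a user's answer, yet with enough users the debiased estimate concentrates around the true population value.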

1. Which of the following best describes Local Differential Privacy?

2. What is a key trade-off when using Local Differential Privacy in real-world analytics?



SectionΒ 3. ChapterΒ 2

