Modeling Human Preferences: Distributions and Noise
When you seek to align machine learning systems with human values, you must represent human preferences formally. At the most basic level, a preference relation describes when a human prefers one outcome over another: given two options A and B, the relation A ≻ B means "A is preferred to B." In practice, human choices are rarely deterministic. The same person may choose differently when shown the same pair again, because of uncertainty, ambiguity, or other factors. This motivates stochastic choice models, which assign a probability to each possible choice rather than treating preferences as fixed. For example, you might model the probability that a human prefers A to B as P(A ≻ B) and estimate it from the fraction of observed comparisons in which A was chosen.
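As a minimal sketch of estimating P(A≻B) from observed choices, the snippet below uses the simple relative frequency of "A" among repeated comparisons. The function name `estimate_preference_probability` and the list `observed` are hypothetical, introduced only for illustration; the data are made up, and a real study would likely need more observations and some treatment of estimation uncertainty.

```python
from collections import Counter


def estimate_preference_probability(choices, option="A"):
    """Return the fraction of observed choices in which `option` was picked.

    `choices` is a list of labels, one per comparison, naming the option
    the human selected. This is a plain frequency estimate (an assumption
    of this sketch), not a fitted choice model.
    """
    if not choices:
        raise ValueError("need at least one observed choice")
    counts = Counter(choices)
    return counts[option] / len(choices)


# Hypothetical data: ten repeated comparisons of A versus B.
observed = ["A", "A", "B", "A", "B", "A", "A", "B", "A", "A"]

p_a_over_b = estimate_preference_probability(observed, "A")
p_b_over_a = estimate_preference_probability(observed, "B")

print(p_a_over_b)  # 0.7
print(p_b_over_a)  # 0.3
```

With only two options and no ties, the two estimates sum to 1, so a deterministic preference would show up as an estimate of exactly 0 or 1.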
To capture the full range of possible human behaviors, you introduce a preference distribution: a probability distribution over the possible rankings of, or selections from, a set of options. Such a distribution lets you account for both consistent and inconsistent preferences, whether across different individuals or within the same individual over time.
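One way to make a preference distribution concrete is to tabulate how often each full ranking appears in a set of observations. The sketch below assumes every observation is a complete ranking of the same options, listed from most to least preferred; the helper names `empirical_ranking_distribution` and `pairwise_probability` and the sample data are hypothetical.

```python
from collections import Counter


def empirical_ranking_distribution(rankings):
    """Map each observed ranking (as a tuple) to its relative frequency."""
    counts = Counter(tuple(r) for r in rankings)
    total = sum(counts.values())
    return {ranking: n / total for ranking, n in counts.items()}


def pairwise_probability(distribution, a, b):
    """Total probability of the rankings that place `a` ahead of `b`."""
    return sum(
        p
        for ranking, p in distribution.items()
        if ranking.index(a) < ranking.index(b)
    )


# Hypothetical data: four rankings of options A, B, and C, which could come
# from four individuals or from one individual asked on four occasions.
observed_rankings = [
    ("A", "B", "C"),
    ("A", "B", "C"),
    ("B", "A", "C"),
    ("A", "C", "B"),
]

dist = empirical_ranking_distribution(observed_rankings)
for ranking, p in dist.items():
    print(ranking, p)

print(pairwise_probability(dist, "A", "B"))  # 0.75
```

Summing the distribution over all rankings that place A ahead of B recovers the pairwise probability P(A≻B), which ties the ranking-level view back to the two-option case.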