duelpy.stats
Utilities for implementing PB-MAB algorithms.
Submodules
Package Contents
Classes
PreferenceEstimate | An estimation of a preference matrix based on samples.
-
class PreferenceEstimate(num_arms: int, confidence_radius: duelpy.stats.confidence_radius.ConfidenceRadius = TrivialConfidenceRadius())
An estimation of a preference matrix based on samples.
Consider this example:
>>> import numpy as np
>>> from duelpy.stats import PreferenceEstimate
>>> preference_estimate = PreferenceEstimate(
...     num_arms=3,
...     confidence_radius=lambda num_samples: 1/(num_samples + 1)
... )
We use a simple, ad hoc confidence radius (instead of the default TrivialConfidenceRadius) for easy illustration. Note that the resulting bounds are not statistically meaningful; you probably want to use something like HoeffdingConfidenceRadius in practice.
In the beginning, nothing is known yet.
>>> preference_estimate.get_mean_estimate_matrix()
array([[0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5]])
>>> preference_estimate.get_upper_estimate_matrix()
array([[0.5, 1. , 1. ],
       [1. , 0.5, 1. ],
       [1. , 1. , 0.5]])
>>> preference_estimate.get_lower_estimate_matrix()
array([[0.5, 0. , 0. ],
       [0. , 0.5, 0. ],
       [0. , 0. , 0.5]])
If we enter a sampled win, the estimated probability of that arm increases and the inverse probability decreases accordingly.
>>> preference_estimate.enter_sample(0, 1, first_won=True)
>>> preference_estimate.get_mean_estimate_matrix()
array([[0.5, 1. , 0.5],
       [0. , 0.5, 0.5],
       [0.5, 0.5, 0.5]])
When entering more samples, the probability keeps adjusting. Let’s make it three wins out of four.
>>> preference_estimate.enter_sample(0, 1, first_won=False)
>>> preference_estimate.enter_sample(0, 1, first_won=True)
>>> preference_estimate.enter_sample(0, 1, first_won=True)
>>> preference_estimate.get_mean_estimate_matrix()
array([[0.5 , 0.75, 0.5 ],
       [0.25, 0.5 , 0.5 ],
       [0.5 , 0.5 , 0.5 ]])
Meanwhile the confidence intervals have adjusted as well:
>>> preference_estimate.get_upper_estimate_matrix()
array([[0.5 , 0.95, 1.  ],
       [0.45, 0.5 , 1.  ],
       [1.  , 1.  , 0.5 ]])
>>> preference_estimate.get_lower_estimate_matrix()
array([[0.5 , 0.55, 0.  ],
       [0.05, 0.5 , 0.  ],
       [0.  , 0.  , 0.5 ]])
And if we tighten the confidence radius, they get changed yet again:
>>> preference_estimate.set_confidence_radius(lambda num_samples: 1/(6 * num_samples + 1))
>>> preference_estimate.get_upper_estimate_matrix()
array([[0.5 , 0.79, 1.  ],
       [0.29, 0.5 , 1.  ],
       [1.  , 1.  , 0.5 ]])
>>> preference_estimate.get_lower_estimate_matrix()
array([[0.5 , 0.71, 0.  ],
       [0.21, 0.5 , 0.  ],
       [0.  , 0.  , 0.5 ]])
We can now also sample a complete preference matrix from a beta distribution:
>>> preference_estimate.sample_preference_matrix(
...     random_state=np.random.RandomState(42)
... )
array([[0.5       , 0.72606244, 0.4978376 ],
       [0.27393756, 0.5       , 0.44364733],
       [0.5021624 , 0.55635267, 0.5       ]])
- Parameters
num_arms – The number of arms in the estimated preference matrix.
confidence_radius – The confidence radius to use when computing confidence intervals.
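The confidence_radius argument maps a sample count to the half-width of a confidence interval. As a rough illustration, the sketch below writes a generic two-sided Hoeffding radius as a plain function. The name hoeffding_radius, the failure_probability parameter, and the choice to return 1.0 for zero samples are assumptions made for this example; it is not necessarily how HoeffdingConfidenceRadius is implemented.

```python
import math


def hoeffding_radius(num_samples: int, failure_probability: float = 0.05) -> float:
    """Two-sided Hoeffding radius for the mean of [0, 1]-valued samples.

    Hypothetical helper for illustration; not part of duelpy.
    """
    if num_samples == 0:
        # Nothing is known yet, so the interval should span all of [0, 1].
        return 1.0
    return math.sqrt(math.log(2 / failure_probability) / (2 * num_samples))


# The radius shrinks roughly like 1 / sqrt(num_samples).
radii = [hoeffding_radius(n) for n in (0, 1, 10, 100)]
```

Since the class example passes a plain callable as the confidence radius, a function of this shape could presumably be supplied the same way.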
-
set_confidence_radius(self, confidence_radius: duelpy.stats.confidence_radius.ConfidenceRadius) → None
Set the confidence radius to the given parameter.
- Parameters
confidence_radius – The confidence radius to be set as the new confidence_radius.
-
enter_sample(self, first_arm_index: int, second_arm_index: int, first_won: bool) → None
Enter the result of a sampled duel.
- Parameters
first_arm_index – The index of the first arm of the duel.
second_arm_index – The index of the second arm of the duel.
first_won – Whether the first arm won the duel.
-
get_mean_estimate(self, first_arm_index: int, second_arm_index: int) → float
Get the estimate of the win probability of first_arm_index against second_arm_index.
- Parameters
first_arm_index – The first arm of the duel.
second_arm_index – The second arm of the duel.
- Returns
The estimated probability that first_arm_index wins against second_arm_index.
- Return type
float
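To make the relationship between entered samples and the mean estimate concrete, here is a minimal sketch of pairwise win bookkeeping. The WinCounter class is hypothetical and only mirrors the behaviour shown in the class example (0.5 for unsampled pairs, wins divided by duels otherwise); the actual internals of PreferenceEstimate may differ.

```python
class WinCounter:
    """Minimal pairwise win bookkeeping (hypothetical, not duelpy's class)."""

    def __init__(self, num_arms: int) -> None:
        # wins[i][j] counts how often arm i beat arm j.
        self.wins = [[0] * num_arms for _ in range(num_arms)]

    def enter_sample(self, first: int, second: int, first_won: bool) -> None:
        if first_won:
            self.wins[first][second] += 1
        else:
            self.wins[second][first] += 1

    def get_num_samples(self, first: int, second: int) -> int:
        # Duels are counted regardless of the order of the arms.
        return self.wins[first][second] + self.wins[second][first]

    def get_mean_estimate(self, first: int, second: int) -> float:
        samples = self.get_num_samples(first, second)
        if samples == 0:
            # Without data, both outcomes are treated as equally likely.
            return 0.5
        return self.wins[first][second] / samples


# Replay the four duels from the class example: three wins, one loss.
counter = WinCounter(num_arms=3)
for first_won in (True, False, True, True):
    counter.enter_sample(0, 1, first_won)
```

After these four duels, the estimate for arm 0 against arm 1 is 0.75 and the reverse estimate is 0.25, matching the matrices in the class example.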
-
get_confidence_interval(self, first_arm_index: int, second_arm_index: int) → Tuple[float, float]
Get the bounds of the confidence interval on the win probability.
- Parameters
first_arm_index – The first arm of the duel.
second_arm_index – The second arm of the duel.
- Returns
The lower and upper bound of the confidence estimate for the probability that first_arm_index wins against second_arm_index.
- Return type
Tuple[float, float]
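The numbers in the class example are consistent with an interval of the form mean ± radius, clipped to [0, 1]. The sketch below reproduces that arithmetic with a hypothetical confidence_interval helper; the clipping rule is inferred from the example output and is an assumption, not a statement about the library's code.

```python
from typing import Tuple


def confidence_interval(mean: float, radius: float) -> Tuple[float, float]:
    """Clip mean +/- radius to [0, 1] (hypothetical helper)."""
    lower = max(0.0, mean - radius)
    upper = min(1.0, mean + radius)
    return lower, upper


# Three wins in four duels with the radius 1 / (num_samples + 1) from the
# class example.
lower, upper = confidence_interval(mean=0.75, radius=1 / (4 + 1))
# No samples yet: the radius is 1, so the interval covers all of [0, 1].
initial = confidence_interval(mean=0.5, radius=1 / (0 + 1))
```

Up to floating-point rounding, this gives the 0.55 and 0.95 entries shown in the lower and upper estimate matrices of the class example.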
-
get_upper_estimate(self, first_arm_index: int, second_arm_index: int) → float
Get the upper estimate of the win probability of first_arm_index against second_arm_index.
- Parameters
first_arm_index – The first arm of the duel.
second_arm_index – The second arm of the duel.
- Returns
The upper bound of the confidence estimate for the probability that first_arm_index wins against second_arm_index.
- Return type
float
-
get_lower_estimate(self, first_arm_index: int, second_arm_index: int) → float
Get the lower estimate of the win probability of first_arm_index against second_arm_index.
- Parameters
first_arm_index – The first arm of the duel.
second_arm_index – The second arm of the duel.
- Returns
The lower bound of the confidence estimate for the probability that first_arm_index wins against second_arm_index.
- Return type
float
-
get_num_samples(self, first_arm_index: int, second_arm_index: int) → int
Get the number of times a duel between first_arm_index and second_arm_index was sampled.
- Parameters
first_arm_index – The first arm of the duel.
second_arm_index – The second arm of the duel.
- Returns
The number of times a duel between the two arms was sampled, regardless of the arm order.
- Return type
int
-
get_radius_matrix(self) → numpy.array
Seed the confidence radius cache and return it.
- Returns
A numpy matrix containing the current confidence radius values.
- Return type
np.array
-
get_mean_estimate_matrix(self) → duelpy.stats.preference_matrix.PreferenceMatrix
Get the current mean estimates as a PreferenceMatrix.
- Returns
The current mean estimate.
- Return type
PreferenceMatrix
-
get_upper_estimate_matrix(self) → duelpy.stats.preference_matrix.PreferenceMatrix
Get the current upper estimates as a PreferenceMatrix.
- Returns
The current upper estimate.
- Return type
PreferenceMatrix
-
get_lower_estimate_matrix(self) → duelpy.stats.preference_matrix.PreferenceMatrix
Get the current lower estimates as a PreferenceMatrix.
- Returns
The current lower estimate.
- Return type
PreferenceMatrix
-
get_pessimistic_copeland_score_estimates(self) → numpy.array
Get pessimistic estimates for every arm’s Copeland score.
This only counts wins that have a probability of above 50% in the pessimistic estimate. Those wins are “certain”, assuming the confidence interval is correct.
-
get_optimistic_copeland_score_estimates(self) → numpy.array
Get optimistic estimates for every arm’s Copeland score.
This counts every win that is considered possible within the confidence interval.
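The difference between the pessimistic and optimistic Copeland estimates can be sketched by counting, for each arm, the opponents it beats with probability above 0.5, once using the lower bounds and once using the upper bounds. The copeland_scores helper is hypothetical, and both the strict threshold and the use of unnormalized counts are assumptions; the library may normalize the scores or treat ties differently.

```python
from typing import List

Matrix = List[List[float]]


def copeland_scores(bounds: Matrix) -> List[int]:
    """Count, for each arm, the opponents it beats with probability > 0.5."""
    num_arms = len(bounds)
    return [
        sum(1 for j in range(num_arms) if j != i and bounds[i][j] > 0.5)
        for i in range(num_arms)
    ]


# Bounds taken from the class example after tightening the confidence radius.
lower = [[0.5, 0.71, 0.0], [0.21, 0.5, 0.0], [0.0, 0.0, 0.5]]
upper = [[0.5, 0.79, 1.0], [0.29, 0.5, 1.0], [1.0, 1.0, 0.5]]

pessimistic = copeland_scores(lower)  # wins that are certain
optimistic = copeland_scores(upper)   # wins that are still possible
```

With these bounds, only arm 0's win over arm 1 is certain, while every pairing involving the unsampled arm 2 is still open in both directions.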
-
sample_preference_matrix(self, random_state: numpy.random.RandomState) → duelpy.stats.preference_matrix.PreferenceMatrix
Sample a preference matrix based on a Beta distribution.
The outcome is a PreferenceMatrix object which is initialized from a sampled preference matrix. In this preference matrix, each pairwise preference is drawn from a Beta distribution that is parameterized by the results of prior duels.
- Parameters
random_state – A numpy random state.
- Returns
A PreferenceMatrix object initialized from a preference matrix sampled from a Beta distribution.
- Return type
PreferenceMatrix
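As an illustration of sampling from a Beta distribution parameterized by duel results, the sketch below draws each pairwise preference from Beta(wins + 1, losses + 1) and sets the reverse entry to its complement. The sample_preferences function, the uniform Beta(1, 1) prior, and the use of the standard library's random module instead of a numpy random state are all assumptions made for this example, so the drawn values will not match the library's output.

```python
import random
from typing import List


def sample_preferences(
    wins: List[List[int]], rng: random.Random
) -> List[List[float]]:
    """Draw a preference matrix from per-pair Beta posteriors (a sketch)."""
    num_arms = len(wins)
    matrix = [[0.5] * num_arms for _ in range(num_arms)]
    for i in range(num_arms):
        for j in range(i + 1, num_arms):
            # Assumed uniform Beta(1, 1) prior, updated with the duel results.
            p = rng.betavariate(wins[i][j] + 1, wins[j][i] + 1)
            matrix[i][j] = p
            # The reverse preference is the complement.
            matrix[j][i] = 1.0 - p
    return matrix


# Arm 0 beat arm 1 three times and lost once; other pairs are unsampled.
wins = [[0, 3, 0], [1, 0, 0], [0, 0, 0]]
sampled = sample_preferences(wins, random.Random(42))
```

Each pair of opposite entries sums to one and the diagonal stays at 0.5, which is the same structure visible in the sampled matrix of the class example.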
-
__str__(self) → str
Produce a string representation of the estimate.