TY - JOUR
T1 - Comparing Two Samples Through Stochastic Dominance
T2 - A Graphical Approach
AU - Arza, Etor
AU - Ceberio, Josu
AU - Irurozki, Ekhiñe
AU - Pérez, Aritz
N1 - Publisher Copyright: © 2022 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.
PY - 2023/1/1
Y1 - 2023/1/1
AB - Nondeterministic measurements are common in real-world scenarios: the performance of a stochastic optimization algorithm and the total reward of a reinforcement learning agent in a chaotic environment are just two examples with unpredictable outcomes. These measurements can be modeled as random variables and compared with each other via their expected values or with more sophisticated tools such as null hypothesis statistical tests. In this article, we propose an alternative framework to visually compare two samples according to their estimated cumulative distribution functions. First, we introduce a dominance measure for two random variables that quantifies the proportion in which the cumulative distribution function of one of the random variables stochastically dominates the other one. Then, we present a graphical method that decomposes into quantiles (i) the proposed dominance measure and (ii) the probability that one of the random variables takes lower values than the other. For illustrative purposes, we reevaluate the experiments of a previously published work with the proposed methodology and show that additional conclusions, missed by the other methods, can be inferred. Additionally, the software package RVCompare was created as a convenient way of applying and experimenting with the proposed framework.
KW - Cumulative distribution function
KW - Data visualization
KW - First-order stochastic dominance
KW - Random variables
U2 - 10.1080/10618600.2022.2084405
DO - 10.1080/10618600.2022.2084405
M3 - Article
AN - SCOPUS:85134407299
SN - 1061-8600
VL - 32
SP - 551
EP - 566
JO - Journal of Computational and Graphical Statistics
JF - Journal of Computational and Graphical Statistics
IS - 2
ER -