TY - JOUR
T1 - Tail index estimation for discrete heavy-tailed distributions with application to statistical inference for regular Markov chains
AU - Bertail, Patrice
AU - Clémençon, Stephan
AU - Fernández, Carlos
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/9/1
Y1 - 2025/9/1
AB - This paper investigates the estimation of the regularity index $\beta>0$ of a discrete heavy-tailed random variable $S$, i.e. a random variable valued in $\mathbb{N}^*$ such that $P(S>n)=L(n)\cdot n^{-\beta}$ for all $n\geq 1$, where $L:\mathbb{R}_+^*\to\mathbb{R}_+$ is a slowly varying function. Such discrete probability laws, sometimes referred to as generalized Zipf’s laws, are commonly used to model rank-size distributions after a preliminary range segmentation in a wide variety of areas, such as quantitative linguistics, social sciences or information theory. We first consider the situation where inference is based on independent copies $S_1,\ldots,S_n$ of the generic variable $S$. The proposed estimator $\hat{\beta}$ is derived from a suitable reformulation of the regular variation condition, replacing the survivor function of $S$ by its empirical counterpart. Under mild assumptions, a non-asymptotic bound for the deviation between $\hat{\beta}$ and $\beta$ is established, as well as limit results (consistency and asymptotic normality). Beyond the i.i.d. case, the proposed inference method is extended to the estimation of the regularity index of a regenerative $\beta$-null-recurrent Markov chain. Since the parameter $\beta$ can then be viewed as the tail index of the (regularly varying) distribution of the return time of the chain $X$ to any (pseudo-)regenerative set, the estimator is in this case constructed from the successive regeneration times. Because the durations between consecutive regeneration times are asymptotically independent, we can prove that the consistency of the proposed estimator is preserved. In addition to the theoretical analysis, simulation results provide empirical evidence of the relevance of the proposed inference technique.
KW - Generalized discrete Pareto distribution
KW - Nonparametric estimation
KW - Null-recurrent Markov chain
KW - Regularity index
KW - Zipf’s law
UR - https://www.scopus.com/pages/publications/105007067210
U2 - 10.1007/s11749-025-00975-9
DO - 10.1007/s11749-025-00975-9
M3 - Article
AN - SCOPUS:105007067210
SN - 1133-0686
VL - 34
SP - 691
EP - 713
JO - Test
JF - Test
IS - 3
ER -