Efficient block boundaries estimation in block-wise constant matrices: An application to HiC data

Vincent Brault, Julien Chiquet, Céline Lévy-Leduc

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, we propose a novel model and a new methodology for estimating the location of block boundaries in a random matrix consisting of a block-wise constant matrix corrupted with white noise. Our method recasts this problem as a variable selection problem, which we address with a penalized least-squares criterion with an ℓ1-type penalty. First, we provide theoretical results ensuring the consistency of our block boundary estimators. Second, we explain how to implement our approach very efficiently. This implementation is available in the R package blockseg, which can be found on the Comprehensive R Archive Network. Third, we provide numerical experiments to illustrate the statistical and numerical performance of our package, as well as a thorough comparison with existing methods. Fourth, we propose an empirical procedure for estimating the number of blocks. Finally, we apply our approach to HiC data, which are used in molecular biology to better understand the influence of chromosomal conformation on cell functioning.
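The link between block boundaries and variable selection can be illustrated with a small noise-free sketch. It assumes, which the abstract does not state, that the block-wise constant matrix U is written as U = T B Tᵀ, where T is the lower triangular matrix of ones; B is then the two-dimensional first difference of U and is nonzero only where a block starts. All function names are hypothetical, and the sketch neither calls blockseg nor fits the ℓ1-penalized criterion; it only shows why a sparse B encodes the boundaries.

```python
def blockwise_constant(row_bounds, col_bounds, means, n):
    """Build an n x n matrix that is constant on each block.

    row_bounds / col_bounds hold the 0-based start index of each block
    (the first entry is always 0); means[a][b] is the value of block (a, b).
    """
    def block_index(i, bounds):
        # Index of the last block whose start is <= i.
        return max(k for k, start in enumerate(bounds) if start <= i)

    return [[means[block_index(i, row_bounds)][block_index(j, col_bounds)]
             for j in range(n)] for i in range(n)]


def sparse_coefficients(U):
    """Return B such that U = T B T^T, with T lower triangular of ones.

    Since the inverse of T is the first-difference operator, B is the
    two-dimensional first difference of U (U is treated as 0 outside).
    """
    n = len(U)

    def get(i, j):
        return U[i][j] if i >= 0 and j >= 0 else 0.0

    return [[get(i, j) - get(i - 1, j) - get(i, j - 1) + get(i - 1, j - 1)
             for j in range(n)] for i in range(n)]


def reconstruct(B):
    """Compute T B T^T, i.e. the two-dimensional cumulative sum of B."""
    n = len(B)
    return [[sum(B[k][l] for k in range(i + 1) for l in range(j + 1))
             for j in range(n)] for i in range(n)]


# Illustrative 6 x 6 example: 3 row blocks and 2 column blocks.
row_bounds = [0, 2, 5]
col_bounds = [0, 3]
means = [[1.0, 4.0], [2.0, 0.0], [3.0, 5.0]]

U = blockwise_constant(row_bounds, col_bounds, means, n=6)
B = sparse_coefficients(U)

# Rows and columns of B that carry a nonzero coefficient.
nonzero_rows = sorted({i for i in range(6) for j in range(6) if B[i][j] != 0})
nonzero_cols = sorted({j for i in range(6) for j in range(6) if B[i][j] != 0})
print(nonzero_rows, nonzero_cols)
```

In this toy case the nonzero rows and columns of B coincide with the block start indices, so recovering the support of B from a noisy observation of U is a sparse regression problem, which is the kind of task an ℓ1-type penalty is designed for.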

Original language: English
Pages (from-to): 1570-1599
Number of pages: 30
Journal: Electronic Journal of Statistics
Volume: 11
Issue number: 1
DOIs
Publication status: Published - 1 Jan 2017
Externally published: Yes

Keywords

  • Change-points
  • HiC experiments
  • High-dimensional sparse linear model
