Bump hunting by topological data analysis
2017 (English). In: Stat, E-ISSN 2049-1573, Vol. 6, no 1, p. 462-471. Article in journal (Refereed). Published
Abstract [en]
A topological data analysis approach is taken to the challenging problem of finding and validating the statistical significance of local modes in a data set. As with the SIgnificance of the ZERo (SiZer) approach to this problem, statistical inference is performed in a multi-scale way, that is, across bandwidths. The key contribution is a two-parameter approach to the persistent homology representation. For each kernel bandwidth, a sub-level set filtration of the resulting kernel density estimate is computed. Inference based on the resulting persistence diagram indicates statistical significance of modes. It is seen through a simulated example, and by analysis of the famous Hidalgo stamps data, that the new method has more statistical power for finding bumps than SiZer.
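The two-parameter idea in the abstract (one persistence diagram per kernel bandwidth) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a one-dimensional sample and a Gaussian kernel, and it tracks local maxima by sweeping the density from high to low, which amounts to a sub-level set filtration of the negated estimate. The function names, the toy data, and the fixed `min_persistence` cutoff are all hypothetical; the cutoff is only a placeholder for the statistical inference the abstract refers to (the keywords suggest a bootstrap).

```python
import math


def gaussian_kde(data, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at every grid point."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)
        for x in grid
    ]


def mode_persistence(values):
    """0-dimensional persistence pairs (birth, death) for the local maxima
    of a 1-D sequence, sorted by decreasing persistence (birth - death)."""
    n = len(values)
    parent = [None] * n  # None marks a point not yet added to the sweep

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    # Add points from the highest value down to the lowest.
    for i in sorted(range(n), key=lambda k: -values[k]):
        parent[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                hi, lo = find(i), find(j)
                if hi == lo:
                    continue
                # Elder rule: the component with the lower peak dies here.
                if values[hi] < values[lo]:
                    hi, lo = lo, hi
                if values[lo] > values[i]:
                    pairs.append((values[lo], values[i]))
                parent[lo] = hi  # the root always carries the highest peak
    # The global maximum never merges; pair it with the global minimum.
    pairs.append((max(values), min(values)))
    return sorted(pairs, key=lambda p: p[0] - p[1], reverse=True)


def count_modes(data, grid, bandwidth, min_persistence):
    """Count modes whose persistence exceeds a fixed, hypothetical cutoff."""
    density = gaussian_kde(data, grid, bandwidth)
    return sum(1 for b, d in mode_persistence(density) if b - d > min_persistence)


# Toy bimodal sample (made up for illustration): clusters around -2 and 2.
data = [c + 0.1 * k for c in (-2.0, 2.0) for k in range(-5, 6)]
grid = [-6.0 + 0.05 * k for k in range(241)]
for h in (0.3, 1.0, 3.0):
    print(h, count_modes(data, grid, h, min_persistence=0.02))
```

Scanning several bandwidths, as in the final loop, mirrors the multi-scale view shared with SiZer: a bump that persists over a range of bandwidths is a stronger candidate than one that appears at a single scale.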
Place, publisher, year, edition, pages
John Wiley & Sons, 2017. Vol. 6, no 1, p. 462-471
Keywords [en]
Bootstrap, kernel density estimation, mode hunting, persistent homology, SiZer
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:oru:diva-65213
DOI: 10.1002/sta4.167
ISI: 000441301200019
Scopus ID: 2-s2.0-85051252789
OAI: oai:DiVA.org:oru-65213
DiVA, id: diva2:1185425
Note
Funding Agencies:
Studienstiftung des Deutschen Volkes
Natural Sciences and Engineering Research Council of Canada DG 293180
McIntyre Memorial Fund
US National Science Foundation IIS-1633074
2018-02-24, 2018-02-24, 2018-08-27. Bibliographically approved