Statistics Seminar

Merle Behr, University of California, Berkeley
Finite Alphabet Blind Separation

Wednesday, October 3, 2018 - 4:15pm
Biotech G01

The Statistics Seminar speaker for Wednesday, October 3, 2018, is Merle Behr, a mathematician and statistician currently working as a Neyman Visiting Assistant Professor in the Department of Statistics at the University of California, Berkeley. Before that, she was a postdoctoral researcher in the group of Prof. Axel Munk at the Institute for Mathematical Stochastics, University of Göttingen. In her work, she studies change-point problems using multiscale methods, and blind source separation under finite alphabet constraints.

Title: Finite Alphabet Blind Separation

Abstract: We consider a particular blind source separation problem, where the sources are assumed to only take values in a known finite set, denoted as the alphabet. More precisely, one observes M linear mixtures of m signals (sources) taking values in the known finite alphabet. The aim in this model is to identify the unknown mixing weights and sources, including the number of sources, from noisy observations of the mixture.
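As an illustration only, the observation model described above can be simulated in a few lines. This is a minimal sketch, not the speaker's code: the function name `simulate_fabs`, the row-per-mixture layout of `weights`, the i.i.d. uniform draw of source values, and the Gaussian noise are all assumptions made for the example.

```python
import random


def simulate_fabs(n, alphabet, weights, sigma, seed=0):
    """Simulate n noisy observations of M linear mixtures of m sources.

    alphabet: known finite set of values the sources may take.
    weights:  M rows of m mixing weights each (hypothetical layout).
    sigma:    standard deviation of the additive Gaussian noise (assumed).
    Returns (sources, observations) with shapes n x m and n x M.
    """
    rng = random.Random(seed)
    m = len(weights[0])
    # Each source takes a value from the known finite alphabet at every time point.
    sources = [[rng.choice(alphabet) for _ in range(m)] for _ in range(n)]
    # Each observed mixture is a weighted sum of the sources plus noise.
    observations = [
        [sum(w * s for w, s in zip(row, src)) + rng.gauss(0.0, sigma)
         for row in weights]
        for src in sources
    ]
    return sources, observations


# Example: m = 2 sources over the alphabet {0, 1, 2}, observed through M = 2 mixtures.
sources, obs = simulate_fabs(
    n=5, alphabet=[0, 1, 2], weights=[[0.2, 0.8], [0.6, 0.4]], sigma=0.1
)
```

The separation problem is the reverse direction: given only `obs` and the alphabet, recover `weights`, `sources`, and the number of sources m.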

Finite Alphabet Blind Separation (FABS) occurs in many areas. It arises, for instance, in digital communication with mixtures of multilevel pulse amplitude modulated digital signals, and also in cancer genetics, where one aims to infer copy number aberrations of different clones in a tumor.

First, we provide necessary and sufficient identifiability conditions and obtain exact recovery within a neighborhood of the mixture.

Second, we consider FABS for a single mixture (M = 1) within a change-point regression setting with Gaussian error. We provide uniformly honest lower confidence bounds and estimators with exponential convergence rates for the number of source components. With this at hand, we obtain consistent estimators with optimal convergence rates (up to log-factors) and asymptotically uniformly honest confidence statements for the weights and the sources. We illustrate our procedure with a data example from cancer genetics.

Third, we consider multivariate FABS, where several mixtures (M > 1) are observed. For Gaussian error, we show that the least squares estimator (LSE) attains the minimax rates, both for the prediction error and for the estimation error. As computation of the LSE is not feasible, an efficient algorithm is proposed. Simulations suggest that this algorithm approximates the LSE well.