##### *Note that all contributed and special sessions are poster sessions.*

## Monday, June 27

**Monday, June 27, 09:00 – 10:00:** *Plenary talk: Jose Moura*

**Monday, June 27, 10:00 – 11:30**

**Mon-Ia: Detection and estimation theory I**

*Efficient Distributed Estimation of Inverse Covariance Matrices*-
In distributed systems, communication is a major concern due to issues such as its cost and vulnerability. In this paper, we are interested in estimating sparse inverse covariance matrices when samples are distributed across different machines. We address communication efficiency by proposing a method where, in a single round of communication, each machine transfers a small subset of the entries of the inverse covariance matrix. We show that, with this efficient distributed method, the error rates are comparable with estimation in a non-distributed setting, and correct model selection is still possible. Practical performance is shown through simulations.
*Sampling schemes and parameter estimation for nonlinear Bernoulli-Gaussian sparse models*-
We address the sparse approximation problem in the case where the data are approximated by the linear combination of a small number of elementary signals, each of these signals depending non-linearly on additional parameters. Sparsity is explicitly expressed through a Bernoulli-Gaussian hierarchical model in a Bayesian framework. Posterior mean estimates are computed using Markov chain Monte Carlo algorithms. We generalize the partially marginalized Gibbs sampler proposed in the linear case in [1], and build a hybrid Hastings-within-Gibbs algorithm in order to account for the nonlinear parameters. All model parameters are then estimated in an unsupervised procedure. The resulting method is evaluated on a sparse spectral analysis problem. It is shown to converge more efficiently than the classical joint estimation procedure, with only a slight increase in the computational cost per iteration, consequently reducing the global cost of the estimation procedure.
*Measure transformed quasi likelihood ratio test for Bayesian binary hypothesis testing*-
In this paper, a generalization of the Gaussian quasi likelihood ratio test (GQLRT) for Bayesian binary hypothesis testing is developed. The proposed generalization, called measure-transformed GQLRT (MT-GQLRT), selects a Gaussian probability model that best empirically fits a transformed conditional probability measure of the data. By judicious choice of the transform we show that, unlike the GQLRT, the proposed test is resilient to outliers and involves higher-order statistical moments leading to significant mitigation of the model mismatch effect on the decision performance. Under some mild regularity conditions we show that the test statistic of the proposed MT-GQLRT is asymptotically normal. A data driven procedure for optimal selection of the measure transformation parameters is developed that minimizes an empirical estimate of the asymptotic Bayes risk. The MT-GQLRT is applied to signal classification in a simulation example that establishes significantly improved probability of error performance relative to the standard GQLRT.
*Detecting the dimension of the subspace correlated across multiple data sets in the sample poor regime*-
This paper addresses the problem of detecting the number of signals correlated across multiple data sets with small sample support. While there have been studies involving two data sets, the problem with more than two data sets has been less explored. In this work, a rank-reduced hypothesis test for more than two data sets is presented for scenarios where the number of samples is small compared to the dimensions of the data sets.
*Nonparametric estimation of a shot-noise process*-
We propose an efficient method to estimate, in a nonparametric fashion, the marks’ density of a shot-noise process in the presence of pileup from a sample of low-frequency observations. Based on a functional equation linking the marks’ density to the characteristic function of the observations and its derivative, we propose a new time-efficient method using B-splines to estimate the density of the underlying gamma-ray spectrum, which is able to handle the large datasets used in nuclear physics. A discussion of the numerical computation of the algorithm and its performance on simulated data is provided to support our findings.
*Finite sample performance of least squares estimation in sub-Gaussian noise*-
In this paper we analyze the finite sample performance of the least squares estimator. In contrast to standard performance analysis, which uses bounds on the mean square error together with asymptotic normality, our bounds are based on large deviation and concentration of measure results. This allows for accurate bounds on the tail of the estimator. We show that the number of samples required to ensure accuracy with high probability enjoys fast exponential convergence. We analyze a sub-Gaussian setting with a fixed or random mixing matrix in the least squares problem, and provide probability tail bounds on the $\ell_\infty$ norm of the error of the finite sample approximation of the true parameter. Our method relies on simple analysis for $\ell_\infty$-type bounds on the estimation error. The tightness of the bound is studied through simulations.
*Weighting a resampled particle in Sequential Monte Carlo*-
The Sequential Importance Resampling (SIR) method is the core of Sequential Monte Carlo (SMC) algorithms (a.k.a. particle filters). In this work, we point out a suitable choice for properly weighting a resampled particle. This observation entails several theoretical and practical consequences and also allows the design of novel sampling schemes. Specifically, we describe a theoretical result about the sequential estimation of the marginal likelihood. Moreover, we suggest a novel resampling procedure for SMC algorithms, called partial resampling, which involves only a subset of the current cloud of particles. This scheme attenuates the additional variance in the Monte Carlo estimators generated by the use of resampling.
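As an editorial aside, the resampling step discussed above can be sketched in a few lines of NumPy. This is standard multinomial resampling with the conventional uniform post-resampling weights, not the paper's proposed weighting; the Gaussian particle cloud and pseudo-likelihood are illustrative assumptions.

```python
import numpy as np

def resample(particles, weights, rng):
    """Multinomial resampling: draw N particles with probability ∝ weights.

    Each resampled particle conventionally receives the uniform weight 1/N;
    the abstract above concerns the proper weight to attach instead."""
    n = len(particles)
    idx = rng.choice(n, size=n, p=weights / weights.sum())
    return particles[idx], np.full(n, 1.0 / n)

rng = np.random.default_rng(0)
particles = rng.normal(size=1000)                       # prior particle cloud
weights = np.exp(-0.5 * (particles - 1.0) ** 2)         # likelihood of a pseudo-observation y = 1
new_particles, new_weights = resample(particles, weights, rng)
```

After this step the cloud concentrates near the pseudo-observation while the weights become uniform, which is exactly the point at which the choice of post-resampling weight matters.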
*Structure-Induced Complex Kalman Filter for Decentralized Sequential Bayesian Estimation*-
The letter considers a multi-sensor state estimation problem configured in a decentralized architecture where local complex statistics are communicated to the central processing unit for fusion instead of the raw observations. Naive adaptation of the augmented complex statistics to develop a decentralized state estimation algorithm results in increased local computations and introduces extensive communication overhead, making it practically unattractive. The letter proposes a structure-induced complex Kalman filter framework with reduced communication overhead. In order to further reduce the local computations, the letter proposes a non-circularity criterion which allows each node to examine the non-circularity of its local observations. A local sensor node disregards its extra second-order statistical information when the non-circularity coefficient is small. In cases where the local observations are highly non-circular, an intuitively pleasing circularization approach is proposed to avoid computation and communication of the pseudo-covariance matrices. Simulation results indicate that the proposed structure-induced complex Kalman filter (SCKF) provides significant performance improvements over its traditional counterparts.
*An EM Algorithm for Maximum Likelihood Estimation of Barndorff-Nielsen’s Generalized Hyperbolic Distribution*-
We present an EM algorithm for Maximum Likelihood (ML) estimation of the location, structure matrix, skew or drift, and shape parameters of Barndorff-Nielsen’s Generalized Hyperbolic distribution, which is the Gaussian Location Scale Mixture (GLSM) (or Normal Variance Mean Mixture) with Generalized Inverse Gaussian (GIG) scale mixing distribution. We use the GLSM representation along with the closed-form posterior expectations possible with the GIG distribution to derive an EM algorithm for computing ML parameter estimates.
*Markov-tree Bayesian Group-sparse Modeling with Wavelets*-
In this paper, we propose a new Markov-tree Bayesian modeling of wavelet coefficients. Based on a group-sparse GSM model with 2-layer cascaded Gamma distributions for the variances, the proposed method effectively exploits both intrascale and interscale relationships across wavelet subbands. To determine the posterior distribution, we apply Variational Bayesian inference with a subband adaptive majorization-minimization method to make the method tractable for large problems.

**Mon-Ib: Signal processing over graphs and networks I**

*Inferring Network Properties From Fixed-Choice Design with Strong and Weak Ties*-
Typically, studies of networked systems begin with obtaining information about the network structure. In many settings it is impractical or impossible to directly observe the network, and sampling is used. Sampling the structure of offline social networks is especially costly and time-consuming. Respondents are asked to name close friends and acquaintances (strong and weak ties). However, because a person may have a large number of acquaintances, surveys use a fixed-choice design, where respondents are asked to name a small, fixed number of their weak ties. Surprisingly, studies based on fixed-choice designs then directly use the network derived from the responses without correcting for the bias introduced by fixed-choice sampling. In this paper we demonstrate how to account for fixed-choice sampling when inferring network characteristics. Our approach is based on application of the generalized method of moments. We verify the accuracy of our results via simulation and discuss immediate applications and consequences of our work to existing results.
*Estimating Signals over Graphs via Multi-kernel Learning*-
Estimating functions on graphs finds well-documented applications in machine learning and, more recently, in signal processing. Given signal values on a subset of vertices, the goal is to estimate the signal on the remaining ones. This task amounts to estimating a function (or signal) over a graph. Most existing techniques either rely on parametric signal models or require costly cross-validation. Leveraging the framework of multi-kernel learning, a data-driven non-parametric approach is developed here. Instead of a single kernel, the algorithm relies on a dictionary of candidate kernels and efficiently selects the most suitable ones by minimizing a convex criterion using a group Lasso module. Numerical tests demonstrate the superior estimation performance of the novel approach over competing alternatives.
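As an illustrative aside (not the paper's method), the kernel-dictionary idea can be sketched numerically: build a dictionary of graph diffusion kernels, fit a kernel ridge interpolant per kernel, and select among them. The paper minimizes a convex group-Lasso criterion; this toy stand-in simply picks the kernel with the smallest error on held-out vertices, and the path graph, bandwidths, and regularization value are assumptions for illustration.

```python
import numpy as np

# Path graph on n vertices and its combinatorial Laplacian
n = 40
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(1)) - A
w_eig, V = np.linalg.eigh(L)

# Dictionary of diffusion kernels exp(-t L) with assumed bandwidths t
kernels = [V @ np.diag(np.exp(-t * w_eig)) @ V.T for t in (0.5, 2.0, 8.0)]

rng = np.random.default_rng(1)
f = np.sin(np.linspace(0, np.pi, n))            # smooth ground-truth graph signal
obs = rng.choice(n, size=15, replace=False)     # vertices where the signal is observed
y = f[obs] + 0.01 * rng.normal(size=obs.size)
tr, va = obs[:10], obs[10:]                     # crude split for kernel selection

best_err, fhat = np.inf, None
for K in kernels:
    # Kernel ridge fit on the training vertices, extended to all vertices
    alpha = np.linalg.solve(K[np.ix_(tr, tr)] + 1e-3 * np.eye(tr.size), y[:10])
    pred = K[:, tr] @ alpha
    err = np.mean((pred[va] - y[10:]) ** 2)
    if err < best_err:
        best_err, fhat = err, pred
```

`fhat` then estimates the signal on the unobserved vertices; the convex group-Lasso criterion of the abstract replaces the per-kernel selection loop with a joint, sparsity-inducing combination.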
*Network Topology Identification from Spectral Templates*-
Network topology inference is a cornerstone problem in statistical analyses of complex systems. In this context, the fresh look advocated here combines benefits from convex optimization and graph signal processing to identify the so-termed graph shift operator (encoding the network topology) given only the eigenvectors of the shift. These spectral templates can be obtained, for example, from principal component analysis of a set of graph signals defined on the particular network. The novel idea is to find a graph shift that, while being consistent with the provided spectral information, endows the network structure with certain desired properties such as sparsity. The focus is on developing efficient recovery algorithms along with identifiability conditions for two particular shifts: the adjacency matrix and the normalized graph Laplacian. Application domains include network topology identification from steady-state signals generated by a diffusion process, and design of a graph filter that facilitates the distributed implementation of a prescribed linear network operator. Numerical tests showcase the effectiveness of the proposed algorithms in recovering synthetic and structural brain networks.
*Bayesian Inference of Diffusion Networks with Unknown Infection Times*-
The analysis of diffusion processes in different propagation scenarios often involves estimating variables that are not directly observed in real-world scenarios. These hidden variables include parental relationships, the strength of connections between nodes, and the moment in time at which the infection happens for each node. In this paper, we propose a framework in which all three sets of parameters are assumed to be hidden, and we develop a Bayesian approach to infer them. After justifying the model assumptions, we evaluate the performance of our proposed approach through numerical simulations on datasets from synthetic and real-world diffusion processes.
*Clustering time-varying connectivity networks by Riemannian geometry: The brain-network case*-
In response to the demand on data-analytic tools that monitor time-varying connectivity patterns within brain networks, the present paper introduces a framework for clustering (unsupervised learning) of dynamically evolving connectivity states of networks. This work advocates learning of network dynamics on Riemannian manifolds, capitalizing on the well-known fact that popular features in statistics enjoy that structure: (Partial) correlations or covariances can be mapped to the manifold of positive (semi-)definite symmetric matrices, while low-rank linear subspaces can be considered as points of the Grassmannian. Sequences of such features, collected over time and across a network, are mapped to sequences of points on a Riemannian manifold, and a sequence that corresponds to a specific state of the network forms a cluster or submanifold. Geometry is exploited in a novel way to demonstrate the rich potential of the proposed learning method for monitoring time-varying network patterns by outperforming state-of-the-art techniques on synthetic brain-network data.
*Temporal Network Tracking based on Tensor Factor Analysis of Graph Signal Spectrum*-
A wide variety of networks, ranging from biological to social, evolve, adapt and change over time. Recent methods employed in the assessment of temporal networks include tracking topological graph metrics, evolutionary clustering, tensor-based anomaly methods and, more recently, graph-to-signal transformations. In this paper, we propose to assess the temporal evolution of networks by first transforming networks into signals through Classical Multidimensional Scaling based on the resistance distance and then constructing a tensor based on the spectra of each signal across time. The proposed method is first evaluated on simulated temporal networks with varying structural properties. Next, the method is applied to temporal functional connectivity networks constructed from multichannel electroencephalogram (EEG) data collected during a study of cognitive control. This analysis shows that the proposed method is more sensitive to changes in the network structure and more robust to variations in edge weights.
*Multitask Diffusion LMS with Optimized Inter-Cluster Cooperation*-
We consider a multitask network where nodes are divided into several connected clusters, with each cluster performing a least mean squares estimation of a different random parameter vector. Inspired by the adapt-then-combine strategy, we propose a multitask diffusion strategy whose mean and mean-square stability can be achieved independent of the inter-cluster cooperation weights. We develop a distributed optimization algorithm that allows each node in the network to locally optimize its inter-cluster cooperation weights. Simulation results demonstrate that our approach leads to a lower average steady-state network MSD, compared with the multitask diffusion strategy using an averaging rule for the inter-cluster cooperation.
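The adapt-then-combine strategy mentioned above can be illustrated with a minimal single-cluster sketch: each node runs a local LMS update, then averages estimates with its neighbors. The step size, combination weights, and data model are illustrative assumptions, and the paper's inter-cluster weight optimization is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
w_true = np.array([1.0, -0.5])          # common parameter vector for this cluster
mu = 0.05                               # LMS step size (assumed)
C = np.array([[0.5, 0.5],               # doubly stochastic combination matrix
              [0.5, 0.5]])
w = np.zeros((2, 2))                    # row k = estimate at node k

for _ in range(2000):
    psi = np.empty_like(w)
    for k in range(2):                  # adapt: local LMS step at each node
        u = rng.normal(size=2)                       # regressor
        d = u @ w_true + 0.01 * rng.normal()         # noisy measurement
        psi[k] = w[k] + mu * (d - u @ w[k]) * u
    w = C @ psi                         # combine: weighted neighborhood average
```

Both nodes converge to the shared parameter vector; in the multitask setting of the abstract, the inter-cluster entries of `C` are the quantities being optimized.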

**Mon-Ic SS: Random matrices in signal processing and machine learning**

*Robust Shrinkage M-estimators of Large Covariance Matrices*-
Robust high dimensional covariance estimators are considered, comprising regularized (linear shrinkage) modifications of Maronna’s classical M-estimators. Such estimators aim to provide robustness to outliers, while simultaneously giving well-defined solutions under high dimensional scenarios where the number of samples does not exceed the number of variables. By applying tools from random matrix theory, we characterize the asymptotic performance of such estimators when the number of samples and variables grow large together. In particular, our results show that, when outliers are absent, many estimators of the shrinkage-Maronna type share the same asymptotic performance, and for such estimators we present a data-driven method for choosing the asymptotically optimal shrinkage parameter. Although our results assume an outlier-free scenario, simulations suggest that certain estimators perform substantially better than others when subjected to outlier samples.
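One concrete member of the regularized M-estimator family discussed above is the regularized Tyler estimator, computable by a simple fixed-point iteration. The shrinkage value, iteration count, and trace normalization below are illustrative assumptions, not the paper's data-driven tuning rule.

```python
import numpy as np

def regularized_tyler(X, rho, n_iter=100):
    """Fixed-point iteration for a regularized (shrinkage) Tyler estimator.

    X: (n, p) data matrix; rho in (0, 1) is the shrinkage parameter."""
    n, p = X.shape
    S = np.eye(p)
    for _ in range(n_iter):
        Sinv = np.linalg.inv(S)
        q = np.einsum('ij,jk,ik->i', X, Sinv, X)   # Mahalanobis-type distances x_i^T S^{-1} x_i
        S = (1 - rho) * (p / n) * (X / q[:, None]).T @ X + rho * np.eye(p)
        S *= p / np.trace(S)                        # fix the trace to resolve the scale ambiguity
    return S

rng = np.random.default_rng(3)
p, n = 5, 20                                        # sample-starved regime: n barely exceeds p
C = np.diag([4.0, 2.0, 1.0, 1.0, 0.5])
X = rng.normal(size=(n, p)) @ np.linalg.cholesky(C).T
S = regularized_tyler(X, rho=0.3)
```

The shrinkage toward the identity keeps the fixed point well defined even when `n` is close to (or below) `p`, which is the regime the random-matrix analysis above targets.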
*Training Performance of Echo State Neural Networks*-
This article proposes a first theoretical performance analysis of the training phase of large dimensional linear echo-state networks. This analysis is based on advanced methods of random matrix theory. The results provide some new insights on the core features of such networks, thereby helping the practitioner when using them.
*Optimal adaptive Normalized Matched Filter for Large Antenna Arrays*-
This paper focuses on the problem of detecting a target in the presence of compound Gaussian clutter with unknown statistics. To this end, we focus on the design of the adaptive normalized matched filter (ANMF) detector, which uses the regularized Tyler estimator (RTE) built from $N$-dimensional observations $x_1, \dots, x_n$ in order to estimate the clutter covariance matrix. The choice of the RTE is motivated by two major attributes: first, its resilience to the presence of outliers, and second, its regularization parameter, which makes it more suitable for handling the scarcity of observations. In order to facilitate the design of the ANMF detector, we consider the regime in which $n$ and $N$ are both large. This allows us to derive closed-form expressions for the asymptotic false alarm and detection probabilities. Based on these expressions, we propose an asymptotically optimal setting for the regularization parameter of the RTE that maximizes the asymptotic detection probability while keeping the asymptotic false alarm probability below a certain threshold. Numerical results are provided in order to illustrate the gain of the proposed detector over a recently proposed setting of the regularization parameter.
*Linear receivers for Massive MIMO FBMC/OQAM under strong channel frequency selectivity*-
Filterbank Multicarrier (FBMC) modulations based on OQAM (FBMC/OQAM) have become a promising alternative to conventional OFDM because of their higher spectral efficiency and their improved selectivity in the frequency domain. Unfortunately, the orthogonality of these modulations is lost when the channel presents strong frequency selectivity, meaning that it cannot be approximated as frequency flat within each subcarrier bandwidth. In this paper, this effect is analyzed in the massive MIMO setting, whereby the number of transmit and receive antennas is asymptotically large (but not as large as the number of subcarriers). It is formally shown that, under these asymptotic conditions, the output mean squared error (MSE) at each subcarrier converges to a constant independent of the subcarrier index. This was previously referred to as “self-equalization” principle in the FBMC/OQAM literature. It is demonstrated here that this phenomenon is a direct consequence of channel hardening effect in large scale MIMO configurations.
*On the statistical performance of MUSIC for distributed sources*-
This paper addresses the statistical behaviour of the MUSIC method for DoA estimation, in a scenario where each source signal direct path is disturbed by a clutter spreading in an angular neighborhood around the source DoA. In this scenario, it is well-known that subspace methods performance suffers from an additional clutter subspace, which breaks the orthogonality between the source steering vectors and noise subspace. To perform a statistical analysis of the MUSIC DoA estimates, we consider an asymptotic regime in which both the number of sensors and the sample size tend to infinity at the same rate, and rely on classical random matrix theory results. We establish the consistency of the MUSIC estimates and provide numerical results illustrating their performance in this non standard scenario.
*Optimization of the loading factor of regularized estimated spatial-temporal Wiener filters in large system case*-
In this paper, it is established that the signal to interference plus noise ratio (SINR) produced by a trained regularized Wiener spatio-temporal filter can be estimated consistently in the asymptotic regime where the number of receivers and the number of snapshots converge to infinity at the same rate. The optimal regularization parameter is estimated as the argument of the maximum of the estimated SINR. Numerical simulations show that the proposed optimum regularized Wiener filter outperforms the existing regularized spatio-temporal Wiener filters.
*On the Eigenvalue Distribution of Column Sub-sampled Semi-unitary Matrices*-
Random matrix theory is applied in areas of signal processing, communications, and machine learning. One aspect of random matrix theory involves the study of the eigenvalues of random matrices. For example, in communications, the eigenvalues associated with a channel matrix are used in the analysis of channel capacity and in compressive sensing the eigenvalues of sub-matrices of the sampling matrix can be used to study its restricted isometry property. This paper focuses on a theoretical analysis of the limiting empirical spectral distribution of column sub-sampled semi-unitary matrices. A key contribution of this paper is a closed-form expression for the limiting empirical spectral distribution and the identification of the sufficient conditions for this result.
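The object of study here can be simulated directly: take a semi-unitary block of a unitary DFT matrix, subsample its columns, and look at the eigenvalues of the resulting Gram matrix. The dimensions below are illustrative assumptions; the sum of eigenvalues equals $mk/N$ exactly because every DFT entry has modulus $1/\sqrt{N}$.

```python
import numpy as np

rng = np.random.default_rng(5)
N, m, k = 512, 256, 128
F = np.fft.fft(np.eye(N)) / np.sqrt(N)      # N x N unitary DFT matrix

rows = rng.choice(N, size=m, replace=False) # m orthonormal rows -> semi-unitary matrix
cols = rng.choice(N, size=k, replace=False) # random column subsampling
A = F[np.ix_(rows, cols)]                   # m x k column-subsampled semi-unitary block

esd = np.linalg.eigvalsh(A.conj().T @ A)    # empirical spectrum of the Gram matrix
```

A histogram of `esd` approximates the limiting empirical spectral distribution characterized in the paper; all eigenvalues lie in $[0, 1]$ since `A` is a submatrix of a unitary matrix.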

**Monday, June 27, 12:00 – 13:00:** *Plenary talk: Petar Djuric*

**Monday, June 27, 15:00 – 16:00:** *Plenary talk: Antonio Torralba*

**Monday, June 27, 16:30 – 18:00**

**Mon-IIa: Machine learning and pattern recognition I**

*Multi-Scale Sparse Coding with Anomaly Detection and Classification*-
Here we place a recent joint anomaly detection and classification approach, based on sparse error coding methodology, into a multi-scale wavelet basis framework. The model is extended to incorporate an overcomplete wavelet basis into the dictionary matrix, whereupon anomalies at specified multiple levels of scale are afforded equal importance. This enables, for example, subtle transient anomalies at finer scales to be detected which would otherwise be drowned out by coarser details and missed by standard sparse coding techniques. Anomaly detection in power networks provides a motivating application, and tests on a real-world data set corroborate the efficacy of the proposed model.
*Learning Rank Reduced Mappings using Canonical Correlation Analysis*-
Correspondence relations between different views of the same scene can be learnt in an unsupervised manner. We address autonomous learning of arbitrary fixed spatial (point-to-point) mappings. Since any such transformation can be represented by a permutation matrix, the signal model is linear, whereas the proposed analysis method, based mainly on Canonical Correlation Analysis (CCA), relies on a generalized eigensystem problem, i.e., a nonlinear operation. The learnt transformation is represented implicitly in terms of pairs of learned basis vectors and neither uses nor requires an analytic/parametric expression for the latent mapping. We show how the rank of the signal that is shared among views may be determined from canonical correlations and how the overlapping (i.e., shared) dimensions among the views may be inferred.
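The CCA machinery the abstract relies on can be sketched compactly: whiten each view and take the singular values of the whitened cross-covariance. When one view is a noisy permutation of the other, as in the point-to-point mapping setting above, the canonical correlations are all close to one. The views, noise level, and regularization below are illustrative assumptions.

```python
import numpy as np

def cca(X, Y, reg=1e-6):
    """Canonical correlations of two views via whitening + SVD
    (equivalent to the generalized eigensystem formulation)."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx))   # whitening transforms
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy))
    return np.linalg.svd(Wx @ Cxy @ Wy.T, compute_uv=False)

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 4))
P = np.eye(4)[[2, 0, 3, 1]]                       # latent point-to-point (permutation) mapping
Y = X @ P + 0.01 * rng.normal(size=(500, 4))      # second view: permuted + noisy
corr = cca(X, Y)
```

The number of canonical correlations near one reveals the rank of the signal shared among the views, which is how the abstract proposes to infer the overlapping dimensions.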
*Indian Buffet Process Dictionary Learning for image inpainting*-
Ill-posed inverse problems call for adapted models to define relevant solutions. Dictionary learning for sparse representation is often an efficient approach. In many methods, the size of the dictionary is fixed in advance, and the noise level as well as regularization parameters need some tuning. Indian Buffet Process dictionary learning (IBP-DL) is a Bayesian nonparametric approach which makes it possible to learn a dictionary with an adapted number of atoms. The noise and sparsity levels are also inferred, so that the proposed approach is truly nonparametric: no parameter tuning is needed. This work adapts IBP-DL to the problem of image inpainting by proposing an accelerated collapsed Gibbs sampler. Experimental results illustrate the relevance of this approach.
*Online low-rank subspace learning from incomplete data using rank revealing $\ell_2/\ell_1$ regularization*-
Massive amounts of data (also called *big data*) generated by a wealth of sources such as social networks, satellite sensors, etc., necessitate the deployment of efficient processing tools. In this context, online subspace learning algorithms that aim at retrieving low-rank representations of data constitute a mainstay in many applications. Working with incomplete (partially observed) data has recently become commonplace. Moreover, the true rank of the sought subspace is rarely at our disposal *a priori*. Herein, a novel low-rank subspace learning algorithm from incomplete data is presented. Its main premise is the online processing of incomplete data along with the imposition of low-rankness on the sought subspace via a sophisticated utilization of the group-sparsity-inducing $\ell_2/\ell_1$ norm. As is experimentally shown, the resulting scheme is efficient in accurately learning the subspace as well as in unveiling its true rank.
*Video denoising via online sparse and low-rank matrix decomposition*-
Video denoising refers to the problem of removing “noise” from a video sequence. Here the term “noise” is used in a broad sense to refer to any corruption, outlier, or interference that is not the quantity of interest. In this work, we develop a novel approach to video denoising based on the idea that most noisy or corrupted videos can be split into two parts: an approximately low-rank layer and a sparse layer. We first split the given video into these two layers and then apply an existing state-of-the-art denoising algorithm to each layer. We show, using extensive experiments, that our denoising approach outperforms state-of-the-art denoising algorithms.
*Sparse Multivariate Factor Regression*-
We introduce a sparse multivariate regression algorithm which simultaneously performs dimensionality reduction and parameter estimation. We decompose the coefficient matrix into two sparse matrices: a long matrix mapping the predictors to a set of factors and a wide matrix estimating the responses from the factors. We impose an elastic net penalty on the former and an $\ell_1$ penalty on the latter, and automatically estimate the number of latent factors from the data. Our formulation results in a non-convex optimization problem which, despite its flexibility to impose effective low-dimensional structure, is difficult, or even impossible, to solve exactly in a reasonable time. We specify a greedy optimization algorithm based on alternating minimization to solve this non-convex problem and provide theoretical results on its convergence and optimality. Finally, we demonstrate the effectiveness of our algorithm via experiments on simulated and real data.
*Binary stable embedding via paired comparisons*-
Suppose that we wish to estimate a vector x from a set of binary paired comparisons of the form “x is closer to p than to q” for various choices of vectors p and q. The problem of estimating x from this type of observation arises in a variety of contexts, including nonmetric multidimensional scaling, “unfolding,” and ranking problems, often because it provides a powerful and flexible model of preference. The main contribution of this paper is to show that under a randomized model for p and q, a suitable number of binary paired comparisons yield a stable embedding of the space of target vectors.
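The measurement model above is easy to simulate, and even a crude one-bit-style averaging estimate (an assumption for illustration, not the paper's embedding result) already aligns with the target direction under the randomized model for p and q:

```python
import numpy as np

rng = np.random.default_rng(6)
d, m = 4, 20000
x = np.array([0.8, -0.2, 0.4, 0.1])
x /= np.linalg.norm(x)                                   # unit-norm target vector

P = rng.normal(size=(m, d))                              # random comparison points p_i
Q = rng.normal(size=(m, d))                              # random comparison points q_i

# b_i = +1 iff x is closer to q_i than to p_i, i.e. sign((q_i - p_i)^T (x - (p_i + q_i)/2))
b = np.where(np.sum((x - Q) ** 2, 1) < np.sum((x - P) ** 2, 1), 1.0, -1.0)

# Crude direction estimate: average the signed hyperplane normals (illustrative only)
xhat = (b[:, None] * (Q - P)).mean(0)
xhat /= np.linalg.norm(xhat)
```

By symmetry of the random model, the expectation of each signed normal points along x, so the normalized average recovers the direction of x; the paper's stability result is a much stronger statement about embedding distances.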
*Group-Sparse Subspace Clustering with Missing Data*-
This paper explores algorithms for subspace clustering with missing data. In many high-dimensional data analysis settings, data points lie in or near a union of subspaces. Subspace clustering is the process of estimating these subspaces and assigning each data point to one of them. However, in many modern applications the data are severely corrupted by missing values. This paper describes two novel methods for subspace clustering with missing data: (a) group-sparse subspace clustering (GSSC), which is based on group-sparsity and alternating minimization, and (b) mixture subspace clustering (MSC), which models each data point as a convex combination of its projections onto all subspaces in the union. Both of these algorithms are shown to converge to a local minimum, and experimental results show that they outperform the previous state-of-the-art, with GSSC yielding the highest overall clustering accuracy.
*Joint segmentation of multiple images with shared classes: a Bayesian nonparametrics approach*-
A combination of the hierarchical Dirichlet process (HDP) and the Potts model is proposed for the joint segmentation/classification of a set of images with shared classes. Images are first divided into homogeneous regions that are assumed to belong to the same class when sharing common characteristics. Simultaneously, the Potts model favors configurations defined by neighboring pixels belonging to the same class. This HDP-Potts model is elected as a prior for the images, which allows the best number of classes to be selected automatically. A Gibbs sampler is then designed to approximate the Bayesian estimators, under a maximum a posteriori (MAP) paradigm. Preliminary experimental results are finally reported using a set of synthetic images.
*Fast Convergent Algorithms for Multi-Kernel Regression*-
Kernel ridge regression plays a central role in various signal processing and machine learning applications. Suitable kernels are often chosen as linear combinations of “basis kernels” by optimizing criteria under regularization constraints. Although such approaches offer reliable generalization performance, solving the associated min-max optimization problems face major challenges, especially with big data inputs. After analyzing the key properties of a convex reformulation, the present paper introduces an efficient algorithm based on a generalization of Nesterov’s acceleration method, which achieves order-optimal convergence rate among first-order methods. Closed-form updates are derived for common regularizers. Experiments on real datasets corroborate considerable speedup advantages over competing algorithms.

**Mon-IIb SS: Advanced robust techniques for signal processing applications**

*A robust signal subspace estimator*-
An original estimator of the orthogonal projector onto the signal subspace is proposed. This estimator is derived as the maximum likelihood estimator for a model of sources plus orthogonal outliers, both with varying power (modeled by compound Gaussian processes), embedded in white Gaussian noise. The validity and interest of this estimator, in terms of performance and robustness, are illustrated through simulation results on a low-rank STAP filtering application.
*The impact of unknown extra parameters on scatter matrix estimation and detection performance in complex t-distributed data*-
Scatter matrix estimation and hypothesis testing in Complex Elliptically Symmetric (CES) distributions often relies on the knowledge of additional parameters characterizing the distribution at hand. In this paper, we investigate the performance of optimal estimation and detection algorithms exploiting low-complexity but suboptimal estimates of the extra parameters under the assumption of t-distributed data. Their performance is also compared with that of robust algorithms, which do not rely on such estimates.
*Mean Square Error performance of sample mean and sample median estimators*-
Based on the Ziv-Zakai methodology for bounding estimators, we derived an estimation bound able to predict the mean square error degradation due to model mismatches. In this article, we build upon this result to provide a performance comparison between mean and median estimators in the presence of outliers. The median is well known to be statistically more robust than the mean in this setting. Here we show this superiority by comparing their theoretical error bounds. Analytical results are obtained, which are validated by computer simulations.
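The robustness claim above is easy to check numerically. The following sketch (contamination model and parameters are illustrative, not those of the paper) compares the empirical MSE of the sample mean and sample median under epsilon-contaminated Gaussian noise:

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, eps = 100, 2000, 0.1
mse_mean = mse_median = 0.0
for _ in range(trials):
    x = rng.standard_normal(n)            # clean N(0, 1) samples, true mean 0
    mask = rng.random(n) < eps            # roughly 10% of samples are outliers
    x[mask] += 20.0 * rng.standard_normal(mask.sum())
    mse_mean += np.mean(x) ** 2           # squared error of the sample mean
    mse_median += np.median(x) ** 2       # squared error of the sample median
mse_mean /= trials
mse_median /= trials
```

Under this contamination, the median's MSE is far below the mean's, matching the qualitative comparison in the abstract.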
*A robust estimation approach for fitting a PARMA model to real data*-
This paper proposes an estimation approach based on the Whittle estimator to fit periodic autoregressive moving average (PARMA) models when the process is contaminated with additive outliers and/or has heavy-tailed noise. It is derived by replacing the ordinary Fourier transform with the non-linear $M$-regression estimator in the harmonic regression equation that leads to the classical periodogram. A Monte Carlo experiment is conducted to study the finite-sample performance of the proposed estimator under contaminated and non-contaminated scenarios. The proposed estimation method is applied to fit a PARMA model to the sulfur dioxide (SO$_2$) daily average pollutant concentrations in the city of Vitória (ES), Brazil.
*Recursive Bayesian Tracking in big-data: Analysis of Estimation Accuracy with respect to Sensor Reliability*-
Recursive Bayesian filtering methods provide a well-understood class of techniques for ‘tracking’ the behavior of a dynamic system that is observed via a sequence of noisy measurements. While tracking performance is defined by the estimation accuracy, the latter in turn depends heavily on what is being measured by the sensors and how accurately these measurements are modeled. In particular, with the statistical signal processing community taking an interest in big data and potential applications of recursive Bayesian tracking methods therein, one must clearly understand the ramifications of using ‘non-ideal’ sensors on estimation accuracy. The existing approach of characterizing the system states and observations solely via a measurement model may turn out to be inadequate in these applications, mainly due to the difficulties associated with capturing the highly uncertain, imperfect and subjective nature of these environments. By deriving, from first principles, estimation equations that include explicit sensor reliability terms, we explore the impact of non-ideal sensors on estimation accuracy. The multiple non-ideal sensor case is also explored. A numerical example is utilized to illustrate the results.
*Automatic diagonal loading for Tyler’s robust covariance estimator*-
An approach to regularizing Tyler’s robust M-estimator of the covariance matrix is proposed. We also provide an automatic choice of the regularization parameter in the high-dimensional regime. Simulations show its advantage over the sample covariance estimator and Tyler’s M-estimator when the data are heavy-tailed and the number of samples is small. Compared with previous approaches to regularizing Tyler’s M-estimator, our approach has similar performance and a much simpler way of choosing the regularization parameter automatically.
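For reference, a diagonally loaded Tyler estimator can be computed by a simple fixed-point iteration; the paper's automatic choice of the regularization parameter is not reproduced here, so a fixed value is used purely for illustration:

```python
import numpy as np

def regularized_tyler(X, alpha, n_iter=100):
    # X: (n, p) real data; alpha in (0, 1] weights the diagonal loading term.
    n, p = X.shape
    S = np.eye(p)
    for _ in range(n_iter):
        inv = np.linalg.inv(S)
        q = np.einsum('ij,jk,ik->i', X, inv, X)    # x_i^T S^{-1} x_i
        M = (p / n) * (X / q[:, None]).T @ X       # Tyler fixed-point term
        S = (1 - alpha) * M + alpha * np.eye(p)    # diagonal loading
        S = p * S / np.trace(S)                    # trace normalization
    return S

rng = np.random.default_rng(2)
p = 3
C = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 0.5]])
L = np.linalg.cholesky(C)
# Heavy-tailed compound-Gaussian samples: Gaussian vectors scaled by textures.
tau = rng.gamma(1.0, 1.0, size=500)
X = (rng.standard_normal((500, p)) @ L.T) * np.sqrt(tau)[:, None]
S = regularized_tyler(X, alpha=0.1)
```

Because Tyler's estimator is invariant to the texture scaling, `S` recovers the shape of `C` up to the normalization, despite the heavy tails.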

**Mon-IIc SS: Recent advances in Monte Carlo methods for multi-dimensional signal processing and machine learning**

*NLOS Mitigation in TOA-based Indoor Localization by nonlinear filtering under Skew t-distributed measurement noise*-
Wireless localization by time-of-arrival (TOA) measurements is typically corrupted by non-line-of-sight (NLOS) conditions, causing biased range measurements that can degrade the overall positioning performance of the system. In this article, we propose a localization algorithm that is able to mitigate the impact of NLOS observations by employing a heavy-tailed noise statistical model. Modeling the observation noise by a skew $t$-distribution allows us, on the one hand, to employ a computationally light sigma-point Kalman filtering method and, on the other hand, to effectively characterize the positively skewed non-Gaussian nature of TOA observations under LOS/NLOS conditions. Numerical results show the enhanced performance of such an approach.
*A Partially Collapsed Gibbs Sampler with Accelerated Convergence for EEG Source Localization*-
This paper addresses the problem of designing efficient sampling moves in order to accelerate the convergence of MCMC methods. The Partially collapsed Gibbs sampler (PCGS) takes advantage of variable reordering, marginalization and trimming to accelerate the convergence of the traditional Gibbs sampler. This work studies two specific moves which allow the convergence of the PCGS to be further improved. It considers a Bayesian model where structured sparsity is enforced using a multivariate Bernoulli Laplacian prior. The posterior distribution associated with this model depends on mixed discrete and continuous random vectors. Due to the discrete part of the posterior, the conventional PCGS gets easily stuck around local maxima. Two Metropolis-Hastings moves based on multiple dipole random shifts and inter-chain proposals are proposed to overcome this problem. The resulting PCGS is applied to EEG source localization. Experiments conducted with synthetic data illustrate the effectiveness of this PCGS with accelerated convergence.
*Multiple Importance Sampling with Overlapping Sets of Proposals*-
In this paper, we introduce multiple importance sampling (MIS) approaches with overlapping (i.e., non-disjoint) sets of proposals. We derive a novel weighting scheme, based on the deterministic mixture methodology, that leads to unbiased estimators. The proposed framework can be seen as a generalization of other well-known MIS algorithms available in the literature. Furthermore, it allows any desired trade-off between the variance of the estimators and the computational complexity to be achieved through the definition of the sets of proposals. Preliminary numerical results on a bimodal target density show the good performance of the proposed approach.
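The deterministic-mixture weighting that the abstract builds on can be illustrated in its basic form: each sample is weighted by the target density over the full mixture of proposals, which yields properly weighted self-normalized estimators. Densities and parameters below are illustrative, not those of the paper:

```python
import numpy as np

def npdf(x, mu, sig):
    # Univariate normal density.
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

rng = np.random.default_rng(3)
target = lambda x: npdf(x, 0.0, 1.0)      # target: standard normal
mus, sig, M = [-1.0, 2.0], 2.0, 5000      # two overlapping Gaussian proposals
xs, ws = [], []
for mu in mus:
    x = rng.normal(mu, sig, M)
    # Deterministic mixture weight: target over the full proposal mixture.
    mix = 0.5 * (npdf(x, mus[0], sig) + npdf(x, mus[1], sig))
    xs.append(x)
    ws.append(target(x) / mix)
x = np.concatenate(xs); w = np.concatenate(ws)
est_mean = np.sum(w * x) / np.sum(w)      # self-normalized estimate of E[x] = 0
```

Since each sample is weighted against the whole mixture, the average weight concentrates around 1 and the estimator of the target mean is consistent.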
*An improved SIR-based sequential Monte Carlo algorithm*-
Sequential Monte Carlo (SMC) algorithms are based on importance sampling (IS) techniques. Resampling has been introduced as a tool for fighting the weight degeneracy problem. However, for a fixed sample size N, the resampled particles are dependent, are not drawn exactly from the target distribution, and are not properly weighted. In this paper, we revisit the resampling mechanism and propose a scheme where the resampled particles are (conditionally) independent and properly weighted. We validate our results via simulations.
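For context, the conventional resampling step being revisited is typically implemented along these lines (systematic resampling shown; the paper's proposed properly weighted scheme is not reproduced here):

```python
import numpy as np

def systematic_resample(weights, rng):
    # Systematic resampling: one uniform draw placed on a stratified grid,
    # then inversion of the weight CDF. Returns indices of surviving particles.
    N = len(weights)
    positions = (rng.random() + np.arange(N)) / N
    return np.searchsorted(np.cumsum(weights), positions)

rng = np.random.default_rng(4)
w = np.array([0.5, 0.3, 0.1, 0.05, 0.05])   # normalized importance weights
idx = systematic_resample(w, rng)            # e.g. heavy particles duplicated
```

A particle with weight 0.5 among N = 5 is copied either 2 or 3 times, illustrating the low-variance but dependent nature of the resampled set.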
*Sticky proposal densities for adaptive MCMC methods*-
Monte Carlo (MC) methods are commonly used in Bayesian signal processing to address complex inference problems. The performance of any MC scheme depends on the similarity between the proposal (chosen by the user) and the target (which depends on the problem). In order to address this issue, many adaptive MC approaches have been developed to construct the proposal density iteratively. In this paper, we focus on adaptive Markov chain MC (MCMC) algorithms, introducing a novel class of adaptive proposal functions that progressively “stick” to the target. This proposed class of sticky MCMC methods converges very fast to the target, thus being able to generate virtually independent samples after a few iterations. Numerical simulations illustrate the excellent performance of the sticky proposals when compared to other adaptive and non-adaptive schemes.
*Sequential Monte Carlo methods under model uncertainty*-
We propose a Sequential Monte Carlo (SMC) method for filtering and prediction of time-varying signals under model uncertainty. Instead of resorting to model selection, we fuse the information from the considered models within the proposed SMC method. We achieve our goal by dynamically adjusting the resampling step according to the posterior predictive power of each model, which is updated sequentially as we observe more data. The method allows the models with better predictive powers to explore the state space with more resources than models lacking predictive power. This is done autonomously and dynamically within the SMC method. We show the validity of the presented method by evaluating it on an illustrative application.
*Use of Particle Filtering and MCMC for Inference in Probabilistic Acoustic Tube Model*-
Speech modeling has a wide range of applications in speech processing. The Probabilistic Acoustic Tube (PAT) model is a probabilistic generative model for speech with potential advantages in many speech processing tasks. In this paper, we model the AM-FM effect in the voiced source via an autoregressive process. Based on Auxiliary Particle Filtering (APF) and Taylor-expansion-assisted Markov Chain Monte Carlo (MCMC), we develop an effective inference algorithm for this improved, more complex model, which produces satisfactory performance in our experiments.

## Tuesday, June 28

**Tuesday, June 28, 09:00 – 10:00: ***Plenary talk: Wolfgang Utschick*

*Plenary talk: Wolfgang Utschick*

**Tuesday, June 28, 10:00 – 11:15**

**Tue-Ia: Array processing, radar and sonar**

*Online Angle of Arrival Estimation in the Presence of Mutual Coupling*-
A novel algorithm for estimating the Angles of Arrival (AoA) of multiple sources in the presence of mutual coupling is derived. We first formulate an “Equality Constrained Quadratic Optimisation” problem, then derive a suitable MUSIC-like algorithm to solve the aforementioned problem, and thus obtain good estimates of the AoA parameters. Identifiability conditions of the proposed algorithm are also derived. Finally, simulation results demonstrate the Root-Mean-Square Error (RMSE) performance of the algorithm as a function of Signal-to-Noise Ratio (SNR) and number of snapshots, with comparison to an existing method.
*Robust Adaptive Subspace Detection in Impulsive Noise*-
This paper addresses the design of the Adaptive Subspace Matched Filter (ASMF) detector in the presence of compound-Gaussian clutter and a mismatch in the array steering vector. In particular, we consider the case wherein the ASMF uses the regularized Tyler estimator (RTE) to estimate the clutter covariance matrix. Under this setting, a major question that needs to be addressed concerns the setting of the threshold and the regularization parameter. To answer this question, we consider the regime in which the number of observations used to estimate the RTE and their dimensions grow large together. Recent results from random matrix theory are then used in order to approximate the false alarm and detection probabilities by deterministic quantities. The latter are optimized in order to maximize an upper bound on the asymptotic detection probability while keeping the asymptotic false alarm probability at a fixed rate.
*A sparse approach for DOA estimation with a multiple spatial invariances sensor array*-
In this paper, we introduce a sparse direction-of-arrival (DOA) estimation algorithm for sensor arrays presenting multiple scales of spatial invariance. We exploit the Khatri-Rao structure of the over-complete steering vector dictionary, corresponding to this array geometry, in order to devise a computationally efficient sparse estimation approach. This approach is based on an iterative refinement and pruning strategy for the dictionary. We show, in numerical simulations, that our approach outperforms the state-of-the-art approach based on a Candecomp/Parafac (CP) decomposition, proposed by Miron et al. in 2015.
*Improved RARE Methods for DOA Estimation in Uniform Circular Arrays with Unknown Mutual Coupling*-
Mutual coupling between array elements is known to seriously degrade the direction-of-arrival estimation performance of most super-resolution techniques. The rank-reduction (RARE) methods, which are variations of the MUSIC algorithm, can reduce this performance loss, but their spatial spectra may still be disturbed by spurious peaks caused by unknown mutual coupling. In this work, we try to mitigate the influence of these peaks on the self-calibration performance of RARE methods. According to our analysis, the spurious peaks can be divided into predictable and unpredictable sets, and the former can be identified and located by utilizing the special structure of the mutual coupling matrix. Based on this, three different methods are proposed to improve existing RARE methods. The validity and effectiveness of our analysis are demonstrated via numerical experiments.
*Elevation and azimuth estimation in arbitrary planar mono-static MIMO radar via tensor decomposition*-
Elevation and azimuth estimation in arbitrary planar monostatic multiple-input-multiple-output (MIMO) radar via tensor decomposition is proposed. Transmit beamspace design is used to map a planar array into a Khatri-Rao product of two perpendicular desired uniform linear arrays within a range bin of interest, and to suppress sidelobes outside of the range bin of interest. Each desired array has a rotational invariance property with respect to elevation and azimuth separately. Then the received MIMO radar data are folded into a fourth-order tensor along each spatial and temporal dimension. A computationally efficient tensor-based target localization method is proposed. Our simulation results demonstrate the effectiveness of the proposed method and its superiority over the matrix-based counterpart.
*Comparison of Passive Radar Detectors with Noisy Reference Signal*-
Traditional passive radar systems with a noisy reference signal use the cross-correlation (CC) statistic for detection. However, owing to the composite nature of this hypothesis testing problem, no claims can be made about the optimality of this detector. Therefore, exploiting the low-rank structure of most passive radar illuminators, we recently proposed singular value decomposition (SVD) based detectors that outperform the CC detector. In this paper, we derive the generalized likelihood ratio tests (GLRTs) for this signal model and compare them with our proposed SVD-based detectors. We demonstrate the near-CFAR behavior (highly desirable) of our SVD detectors. We show, on the other hand, that the GLRT detectors have a probability of false alarm that varies with the reference channel characteristics, making them impractical to use in a passive radar system.
*Fast Convolution Formulations for Radar Detection using LASSO*-
Sparse reconstruction has recently been shown to perform better than conventional detection methods in multi-target scenarios. However, algorithms performing sparse reconstruction have computational complexity that far exceeds that of conventional detection techniques. To bridge this computational gap, we present three methods of exploiting fast operations in the convolution transform model for the detection problem using LASSO. We empirically show that these methods do not compromise on the statistical power provided by LASSO.
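One standard way to exploit fast convolutions inside a sparse solver (illustrative only; the paper's three specific formulations are not reproduced) is to run ISTA for the LASSO with the forward operator and its adjoint applied via FFTs, reducing the per-iteration cost from quadratic to O(n log n):

```python
import numpy as np

def ista_conv(h, y, lam, n_iter=500):
    # ISTA for min_x 0.5 * ||h * x - y||^2 + lam * ||x||_1, where * is
    # circular convolution applied via the FFT.
    H = np.fft.rfft(h)
    L = np.max(np.abs(H)) ** 2                 # Lipschitz constant of the gradient
    x = np.zeros_like(y)
    for _ in range(n_iter):
        r = np.fft.irfft(H * np.fft.rfft(x), n=len(y)) - y          # h*x - y
        grad = np.fft.irfft(np.conj(H) * np.fft.rfft(r), n=len(y))  # adjoint
        z = x - grad / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0)  # soft threshold
    return x

n = 128
h = np.zeros(n); h[:5] = [1, 2, 3, 2, 1]       # known pulse shape (circular)
x_true = np.zeros(n); x_true[[20, 70]] = [3.0, -2.0]   # two point targets
y = np.fft.irfft(np.fft.rfft(h) * np.fft.rfft(x_true), n=n)
x_hat = ista_conv(h, y, lam=0.5)
```

The recovered vector concentrates on the two true target locations, while positions far from the support remain essentially zero.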
*SINR Analysis in Persymmetric Adaptive Processing*-
We study the normalized output signal-to-interference-plus-noise ratio (SINR) of a sample matrix inversion (SMI) beamformer that exploits a priori information on persymmetric structures in the received signal. An exact expression for the expectation of the normalized output SINR (i.e., average SINR loss) of the persymmetric SMI beamformer is obtained. Simulation results reveal that exploiting the persymmetric structure is equivalent to doubling the amount of training data.
*Stochastic Resolution Analysis of Co-prime Arrays in Radar*-
Resolution from co-prime arrays and from a full ULA of size equal to the virtual size of the co-prime arrays is investigated. We take into account not only the resulting beam width but also the fact that fewer measurements are acquired by co-prime arrays. This fact is relevant in compressive acquisition, typical of compressive sensing. Our stochastic approach to resolution uses information distances computed from the geometrical structure of the data models, which is characterized by the Fisher information. The probability of resolution is assessed from a likelihood ratio test by using information distances. Based on this information-geometry approach, we compare stochastic resolution from active co-prime arrays and from the full-size ULA. This novel stochastic resolution analysis is applied to one-dimensional angle processing. Results demonstrate its suitability for radar resolution analysis.
*Multiple target tracking in the fully adaptive radar framework*-
The fully adaptive radar framework aims to use known information, or cognition, about the environment in which the system is deployed to improve its performance, typically either reducing the uncertainty of the obtained results or optimizing the use of the available resources. In this paper, an extension of the fully adaptive radar framework to the case of multiple target tracking is introduced. To exemplify the use of the proposed framework, a simulation is carried out in a scenario comprising multiple targets and a sensor network with resource constraints. Results show a remarkable performance improvement when the proposed fully adaptive radar approach is used.
*Direct Data Domain Based Adaptive Beamforming for FDA-MIMO Radar*-
As frequency diverse array (FDA) radar introduces range-angle-dependent beamforming, it is capable of handling range-dependent interference. However, the underlying independent and identically distributed condition on the interference is violated, which induces performance degradation in interference suppression. In this paper, we propose a robust adaptive beamforming approach based on the direct data domain technique for multiple-input multiple-output (MIMO) radar with an FDA as the transmit array, referred to as FDA-MIMO. In this approach, the data is smoothed once in the transmit and receive domains to mitigate the influence of the target, which results in three homogeneous samples. In the sequel, the data collected from different pulses is utilized as secondary data. Basically, the interference can be isolated from the target signal in the joint transmit and receive domains of the FDA-MIMO radar. Simulation results demonstrate the effectiveness of the proposed approach.

**Tue-Ib: Detection and estimation theory II**

*Wald-Kernel: A Method for Learning Sequential Detectors*-
We consider the problem of training a binary sequential classifier under an error rate constraint. It is well known that for known densities, accumulating the likelihood ratio statistics is time optimal under a fixed error rate constraint. For the case of unknown densities, we formulate the learning for sequential detection problem as a constrained density ratio estimation problem. Specifically, we show that the problem can be posed as a convex optimization problem using a Reproducing Kernel Hilbert Space (RKHS) representation for the log-density ratio function. The proposed binary sequential classifier is tested on a synthetic data set and four real world data sets, together with previous approaches for density ratio estimation. Our empirical results show that the classifier trained through the proposed technique achieves smaller average sampling cost than previous classifiers proposed in the literature for the same error rate.
*Signal Reconstruction for Multi-source Variable-rate Samples with Autocorrelated Errors in Variables*-
Aggregating data from multiple sensors has become a critical requirement in cyberphysical systems (CPS) to increase the effective sampling rate for signal reconstruction. Depending on the application, these sensors can be geo-distributed, mobile, or only intermittently functional. These factors cause the aggregated sample set to be nonuniformly spaced with varying amounts of data collected per sensor. Due to the nature of how the timing or location measurements are made from the different sensors (e.g., indexed by GPS location), the samples may have significant errors in variables (EIV), where the location error from the different sensors follows an exponential autocorrelation function. In this work we demonstrate how to reconstruct signals using such noisy multi-source, variable-rate (MSVR) data samples, and show that the proposed approach improves the error over existing EIV signal reconstruction algorithms.
*Inferring High-Dimensional Poisson Autoregressive Models*-
Consider observing a series of events associated with a group of interacting nodes in a network, where the interactions among those nodes govern the likelihood of future events. Such data are common in spike trains recorded from biological neural networks, interactions within a social network, and pricing changes within financial networks. Vector autoregressive point processes accurately model these settings and are widely used in practice. This paper addresses the inference of the network structure and autoregressive parameters from such data. A sparsity-regularized maximum likelihood estimator is proposed for a Poisson autoregressive process. While sparsity-regularization is well-studied in the statistics and machine learning communities, common assumptions from that literature are difficult to verify here because of correlations and heteroscedasticity inherent in the problem. Novel performance guarantees characterize how much data must be collected to ensure reliable inference depending on the size and sparsity of the autoregressive parameters, and these bounds are supported by several simulation studies.
*Generalized Legendre Transform Multifractal Formalism for Nonconcave Spectrum Estimation*-
Despite widespread adoption of multifractal analysis as a signal processing tool, most practical multifractal formalisms suffer from a major drawback: since they are based on Legendre transforms, they can only yield concave estimates of the multifractal spectrum, which are, in most cases, only upper bounds on the (possibly nonconcave) true spectrum. Inspired by ideas borrowed from statistical physics, a procedure is devised for the estimation of spectra that are not a priori concave, while retaining the simple and efficient Legendre transform structure of the formalism. The potential and interest of the proposed procedure are illustrated and assessed on realizations of a synthetic multifractal process with a theoretically known nonconcave multifractal spectrum.
*An Auxiliary Variable Method for Langevin based MCMC algorithms*-
Markov Chain Monte Carlo sampling algorithms are efficient Bayesian tools to explore complicated posterior distributions. However, sampling in large scale problems remains a challenging task since the Markov chain is very sensitive to the dependencies between the signal samples. In this paper, we are mainly interested in Langevin based MCMC sampling algorithms that allow us to speed up the convergence by controlling the direction of sampling and/or exploiting the correlation structure of the target signal. However, these techniques may sometimes fail to explore efficiently the target space because of poor mixing properties of the chain or the high cost of each iteration. By adding some auxiliary variables, we show that the resulting conditional distribution of the target signal is much simpler to explore by using these algorithms. Experiments performed in the context of multicomponent image restoration illustrate that the proposed approach can achieve substantial performance improvement compared with standard algorithms.
*Alternative Effective Sample Size measures for Importance Sampling*-
The Effective Sample Size (ESS) is an important measure of efficiency in the Importance Sampling (IS) technique. A well-known approximation of the theoretical ESS definition, involving the inverse of the sum of the squares of the normalized importance weights, is widely applied in the literature. This expression has become an essential piece within Sequential Monte Carlo (SMC) methods, using adaptive resampling procedures. In this work, first we show that this ESS approximation is related to the Euclidean distance between the probability mass function (pmf) described by the normalized weights and the uniform pmf. Then, we derive other possible ESS functions based on different discrepancy measures. In our study, we also include another ESS measure called perplexity, already proposed in the literature, that is based on the discrete entropy of the normalized weights. We compare all of them by means of numerical simulations.
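The quantities discussed above are easy to state concretely: with normalized weights, the classical approximation is one over the sum of squared weights, which can indeed be rewritten in terms of the squared Euclidean distance to the uniform pmf, and the perplexity is the exponential of the discrete entropy. A small sketch of both measures:

```python
import numpy as np

def ess_standard(w):
    # Classical approximation: inverse sum of squared normalized weights.
    w = np.asarray(w, float) / np.sum(w)
    return 1.0 / np.sum(w ** 2)

def ess_perplexity(w):
    # Perplexity: exponential of the discrete entropy of the normalized weights.
    w = np.asarray(w, float) / np.sum(w)
    w = w[w > 0]
    return np.exp(-np.sum(w * np.log(w)))

N = 100
w_uniform = np.ones(N)                          # all particles equally useful
w_degenerate = np.zeros(N); w_degenerate[0] = 1.0   # a single surviving particle

rng = np.random.default_rng(7)
w_rand = rng.random(N)
wn = w_rand / w_rand.sum()
dist2 = np.sum((wn - 1.0 / N) ** 2)   # squared Euclidean distance to uniform pmf
```

Both measures equal N for uniform weights and 1 for fully degenerate weights, and the identity ESS = 1 / (dist2 + 1/N) holds exactly for the classical approximation.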
*Robust Markov Random Field Outlier Detection and Removal in Subsampled Images*-
Certain imaging technologies, such as fibred optical microscopy, operate with irregularly-spaced sparse sub-samples from their field of view. In this work, we address the problem of data restoration for applications where the observed sub-samples are corrupted by additive observation noise and sparse outliers (such as broken and damaged fibre cores). This problem is formulated as joint outlier detection and de-noising of irregularly sampled data. A fully Bayesian approach is used. A Markov Random Field is considered to capture the intrinsic spatial correlation of the underlying intensity field, and binary labels are used to locate the spatial positions of the outliers. Markov Chain Monte Carlo is then used to perform Bayesian inference based on the posterior distribution associated with the resulting Bayesian model. Simulations conducted on synthetic data show the potential benefits of the proposed method in terms of image reconstruction and outlier identification.
*Translation Invariant DWT based Denoising using Goodness of Fit Test*-
A novel signal denoising method based on the discrete wavelet transform (DWT) and goodness of fit (GOF) statistical tests employing empirical distribution function (EDF) statistics is proposed. We formulate the denoising problem as a hypothesis testing problem with a null hypothesis $\mathcal{H}_0$ corresponding to the presence of noise, and an alternate hypothesis $\mathcal{H}_1$ representing the presence of \emph{only} the desired signal in the samples being tested. The decision process involves GOF tests applied directly on multiple scales obtained from the DWT. A cycle-spinning approach is then employed on the denoised data to render the proposed method translation invariant. We evaluate the performance of the resulting method against standard and modern wavelet shrinkage denoising methods through extensive repeated simulations performed on standard test signals.
*A wavelet based likelihood ratio test for the homogeneity of Poisson processes*-
Estimating the rate (first-order intensity) of a point process is a task of great interest in the understanding of its nature. In this work we first address the estimation of the rate of an orderly point process on the real line using a multiresolution wavelet expansion approach. Implementing Haar wavelets, we find that in the case of a Poisson process the piecewise constant wavelet estimator of the rate has a scaled Poisson distribution. We apply this result in the design of a likelihood ratio test for a multiresolution formulation of the homogeneity of a Poisson process. We demonstrate this method with simulations and provide Type I error and empirical power plots under specific models.
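At the coarsest Haar level, the piecewise-constant rate estimator described above reduces, for a homogeneous process, to bin counts divided by bin width, and the counts themselves are Poisson distributed. A minimal illustration (parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
rate, T, nbins = 50.0, 1.0, 8
# Simulate a homogeneous Poisson process on [0, T).
n_events = rng.poisson(rate * T)
events = np.sort(rng.uniform(0, T, n_events))
# Piecewise-constant (Haar-level) rate estimate: counts per bin / bin width.
counts, _ = np.histogram(events, bins=nbins, range=(0, T))
rate_hat = counts / (T / nbins)
```

Each entry of `counts` is Poisson with mean rate*T/nbins, so `rate_hat` is a scaled Poisson variable, and its average over bins equals the overall empirical rate exactly.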

**Tue-Ic SS: Multivariate statistical signal modeling and analysis**

*Distributed Multivariate Regression with Unknown Noise Covariance in the presence of Outliers: An MDL Approach*-
We consider the problem of estimating the coefficients in a multivariate linear regression model by means of a wireless sensor network which may be affected by anomalous measurements. The noise covariance matrices at the different sensors are assumed unknown. Treating outlying samples, and their support, as additional nuisance parameters, the Maximum Likelihood estimate is investigated, with the number of outliers being estimated according to the Minimum Description Length principle. A distributed implementation based on iterative consensus techniques is then proposed, and it is shown effective for managing outliers in the data.
*Optimal transport vs. Fisher-Rao distance between copulas for clustering multivariate time series*-
We present a methodology for clustering N objects which are described by multivariate time series, i.e. several sequences of real-valued random variables. This clustering methodology leverages copulas, which are distributions encoding the dependence structure between several random variables. To fully account for the dependence information while clustering, we need a distance between copulas. In this work, we compare renowned distances between distributions: the Fisher-Rao geodesic distance, related divergences and optimal transport, and discuss their advantages and disadvantages. Applications of this methodology can be found in the clustering of financial assets. A tutorial, experiments and an implementation for reproducible research can be found at www.datagrapple.com/Tech.
*Combining EEG source connectivity and network similarity: Application to object categorization in the human brain*-
A major challenge in cognitive neuroscience is to evaluate the ability of the human brain to categorize or group visual stimuli based on common features. This categorization process is very fast, occurring on a time scale of a few hundred milliseconds. However, accurate tracking of the spatiotemporal dynamics of large-scale brain networks is still an unsolved issue. Here, we show the combination of a recently developed method called ‘dense-EEG source connectivity’, which identifies functional brain networks with excellent temporal and spatial resolution, with an algorithm, called SimNet, that computes brain network similarity. Two categories of visual stimuli were analysed in this study: immobile and mobile. Network similarity was assessed within each category (intra-condition) and between categories (inter-condition). Results showed high similarity within each category and low similarity between the two categories. A significant difference between the similarities computed in the intra- and inter-conditions was observed in the 120–190 ms period, supposed to be related to visual recognition and memory access. We speculate that these observations will be very helpful toward understanding object categorization in the human brain from a network perspective.
*A Generalized Multivariate Logistic Model and EM Algorithm based on the Normal Variance Mean Mixture Representation*-
We present an EM algorithm for Maximum Likelihood estimation of the location, scale, skew, and shape parameters of the z distribution, also known as the generalized logistic function (type IV). We use the Barndorff-Nielsen, Kent, and Sorensen representation of the z distribution as a Gaussian location-scale mixture to derive the algorithm. We use a variational bound on the likelihood function to determine a monotonically converging closed-form update for the skew (or drift) parameter. The algorithm also extends naturally to multivariate GLSM estimation using the Kolmogorov-Smirnov mixing density in odd dimensions.
*Approximating Bayesian confidence regions in convex inverse problems*-
Solutions to inverse problems that are ill-conditioned or ill-posed have significant intrinsic uncertainty. Unfortunately, analysing and quantifying this uncertainty is very challenging, particularly in high-dimensional settings. As a result, while most modern statistical signal processing methods achieve impressive point estimation results, they are generally unable to quantify the uncertainty in the solutions delivered. This work presents a new general methodology for approximating Bayesian high-posterior-density (confidence or credibility) regions in inverse problems that are convex and potentially very high-dimensional. A remarkable property of the approximations is that they can be computed very efficiently, even in large-scale problems, by using standard convex optimisation techniques. The proposed methodology is demonstrated on a high-dimensional image restoration problem, where the approximation error is assessed by using proximal Markov chain Monte Carlo as benchmark.
*Balanced Least Squares: Estimation in Linear Systems with Noisy Inputs and Multiple Outputs*-
This paper revisits the linear model with noisy inputs, in which the performance of the total least squares (TLS) method is far from acceptable. Under the assumption of Gaussian noises, the maximum likelihood (ML) estimation of the system response is reformulated as a general balanced least squares (BLS) problem. Unlike TLS, which minimizes the trace of the product between the empirical and inverse theoretical covariance matrices, BLS promotes solutions with similar values of both the empirical and theoretical error covariance matrices. The general BLS problem is reformulated as a semidefinite program with a rank constraint, which can be relaxed in order to obtain polynomial time algorithms. Moreover, we provide new theoretical results regarding the scenarios in which the relaxation is tight, as well as additional insights on the performance and interpretation of BLS. Finally, some simulation results illustrate the satisfactory performance of the proposed method.
*Multimodal Metric Learning with Local CCA*-
In this paper, we address the problem of multimodal signal processing from a kernel-based manifold learning standpoint. We propose a data-driven method for extracting the common hidden variables from two multimodal sets of nonlinear high-dimensional observations. To this end, we present a metric based on local canonical correlation analysis (CCA). Our approach can be viewed both as an extension of CCA to a nonlinear setting and as an extension of manifold learning to multiple data sets. We test our method in simulations, where we show that it indeed discovers the common variables hidden in high-dimensional nonlinear observations without relying on rigid prior model assumptions.

**Tuesday, June 28, 11:45 – 13:00**

**Tue-IIa: Compressed sensing**

*Oracle Performance Estimation of Bernoulli-distributed Sparse Vectors*-
Compressed Sensing (CS) is now a well-established research area, and a plethora of applications has emerged in the last decade. In this context, assuming $N$ available noisy measurements, lower bounds on the Bayesian Mean Square Error (BMSE) for the estimated entries of a sparse amplitude vector are derived for (i) a Gaussian overcomplete measurement matrix and (ii) a random support, assuming that each entry is modeled as the product of a continuous random variable and a Bernoulli random variable indicating that the entry is non-zero with probability $P$. A closed-form expression of the Expected CRB (ECRB) is proposed. In the second part, the BMSE of the Linear Minimum MSE (LMMSE) estimator is derived, and it is proved that the LMMSE estimator tends to be statistically efficient in asymptotic conditions, i.e., as the product $(1-P)^2\,\mathrm{SNR}$ grows large. This means that, in the context of the Gaussian CS problem, the LMMSE estimator combines optimality in the low-noise-variance regime with a simple derivation (as opposed to the derivation of the MMSE estimator). This result is original because the LMMSE estimator is generally sub-optimal for CS when the measurement matrix is a single realization of a given random process.
*Bound on the estimation grid size for sparse reconstruction in direction of arrival estimation*-
A bound for sparse reconstruction involving both the signal-to-noise ratio (SNR) and the estimation grid size is presented. The bound is illustrated for the case of a uniform linear array (ULA). By reducing the number of possible sparse vectors present in the feasible set of a constrained $\ell_{1}$-norm minimization problem, ambiguities in the reconstruction of a single source under noise can be reduced. This reduction is achieved by a proper selection of the estimation grid, which is naturally linked to the mutual coherence of the sensing matrix. Numerical simulations show the performance of sparse reconstruction with an estimation grid meeting the provided bound, demonstrating its effectiveness.
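The link between grid refinement and mutual coherence that the bound exploits is easy to observe numerically. The following sketch (a generic illustration of the phenomenon, not the paper's bound; sensor count and grid sizes are arbitrary choices) computes the mutual coherence of a ULA steering dictionary for a coarse and a fine estimation grid:

```python
import numpy as np

def ula_dictionary(n_sensors, grid):
    """Steering-vector dictionary of a half-wavelength ULA for candidate
    directions `grid` (radians), with unit-norm columns."""
    n = np.arange(n_sensors)[:, None]
    A = np.exp(1j * np.pi * n * np.sin(grid)[None, :])
    return A / np.sqrt(n_sensors)

def mutual_coherence(A):
    """Largest absolute inner product between distinct (unit-norm) columns."""
    G = np.abs(A.conj().T @ A)
    np.fill_diagonal(G, 0.0)
    return G.max()

coarse = mutual_coherence(ula_dictionary(10, np.linspace(-np.pi / 3, np.pi / 3, 20)))
fine = mutual_coherence(ula_dictionary(10, np.linspace(-np.pi / 3, np.pi / 3, 200)))
# Refining the grid drives neighbouring steering vectors together, raising coherence.
```

A finer grid reduces the off-grid error of a single source but increases the coherence of the sensing matrix, which is the trade-off the proposed bound quantifies.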
*On Sparse Recovery Using Finite Gaussian Matrices: RIP-Based Analysis*-
We provide a probabilistic framework for the analysis of the restricted isometry constants (RICs) of finite dimensional Gaussian measurement matrices. The proposed method relies on the exact distribution of the extreme eigenvalues of Wishart matrices, or on its approximation based on the gamma distribution. In particular, we derive tight lower bounds on the cumulative distribution functions (CDFs) of the RICs. The presented framework provides the tightest lower bound on the maximum sparsity order, based on sufficient recovery conditions on the RICs, which allows signal reconstruction with a given target probability via different recovery algorithms.
*Bilevel Feature Selection in Nearly-Linear Time*-
Selection of a small subset of informative features from data is a basic technique in signal processing, machine learning, and statistics. Joint selection of entire groups of features is desirable if the data features exhibit shared grouping structures. Bilevel feature selection constitutes a refinement of these ideas, producing a small subset of data features that themselves belong to a small number of feature groups. However, algorithms for bilevel feature selection suffer from a computational cost that can be cubic in the size of the data, impeding their utility.

In this paper, we propose an approach for bilevel feature selection that resolves this computational challenge. The core component of our approach is a novel fast algorithm for bilevel hard thresholding for a specific non-convex, discrete optimization problem. Our algorithm produces an approximate solution to this problem, but only incurs a nearly-linear running time. We extend this algorithm into a two-stage thresholding method that performs statistically as well as the best available methods for bilevel feature selection, but that also scales extremely well to massive dataset sizes.

*Sparse Error Correction with Multiple Measurement Vectors*-
Exploiting the sparse nature of gross errors in sensor system measurements, we study the feasibility of gross error identification and robust estimation of the system state. Under the practical assumption that the potential locations of gross errors stay the same across multiple measurement periods, we propose gross error correction based on multiple measurement vectors. Our feasibility analysis shows that the maximum number of identifiable gross errors can double compared to identification based on a single measurement vector, provided the gross error values are diverse across measurement periods. A convex optimization framework is proposed to identify gross error locations and compute an accurate state estimate. The proposed state estimator is applied to DC state estimation of the IEEE 14-bus power network and shown to outperform benchmark techniques based on a single measurement vector.
*A Refined Analysis for the Sample Complexity of Adaptive Compressive Outlier Sensing*-
The Adaptive Compressive Outlier Sensing (ACOS) method, proposed recently in (Li & Haupt, 2015), is a randomized sequential sampling and inference method designed to locate column outliers in large, otherwise low-rank, matrices. While the original ACOS work established conditions on the sample complexity sufficient to enable accurate outlier localization (with high probability), the guarantees required a minimum sample complexity that grew linearly (albeit slowly) in the number of matrix columns. This work presents a refined analysis of the sample complexity of ACOS that overcomes this limitation; we show that the sample complexity of ACOS is sublinear in both matrix dimensions: on the order of the squared rank of the low-rank component plus the number of outliers, times constant and logarithmic factors.
*Low Rank Matrix Recovery From Column-Wise Phaseless Measurements*-
We study the problem of recovering a low-rank matrix, X, from magnitude-only observations of random linear projections of its columns (phaseless measurements). We develop a new approach, called Phase Retrieval Low Rank (PReLow), that borrows ideas from a very recent non-convex phase retrieval approach, called truncated Wirtinger flow (TWF). We show via extensive numerical experiments that, when the rank of X is small compared to its dimensions, PReLow significantly outperforms TWF, which operates on one column of X at a time and does not use its low-rank structure.

**Tue-IIb: Signal processing over graphs and networks II**

*Evolutionary Spectral Graph Clustering Through Subspace Distance Measure*-
In the era of Big Data, massive amounts of high-dimensional data are increasingly gathered. Much of this is streaming big data that is either not stored or stored only for short periods of time. Examples include cell phone conversations, texts, tweets, network traffic, changing Facebook connections, mobile video chats or video surveillance data. It is important to be able to reduce the dimensionality of this data in a streaming fashion. One common way of reducing the dimensionality of data is through clustering. Evolutionary clustering provides a framework to cluster the data at each time point such that the cluster assignments change smoothly across time. In this paper, an evolutionary spectral clustering approach is proposed for community detection in dynamic networks. The proposed method aims to obtain smooth cluster assignments by minimizing the subspace distance between consecutive time points, where the subspaces are defined through spectral embedding. The algorithm is evaluated on several synthetic and real data sets, and the results show an improvement in performance over traditional spectral clustering and state-of-the-art evolutionary clustering algorithms.
*Diffusion estimation of mixture models with local and global parameters*-
State-of-the-art methods for distributed estimation of mixtures assume the existence of a common mixture model. In many practical situations, this assumption may be too restrictive, as a subset of parameters may be purely local, e.g., if the numbers of observable components differ across the network. To address this issue, we propose a new online Bayesian method for simultaneous estimation of local parameters and diffusion estimation of global parameters. The algorithm consists of two steps. First, the nodes perform local estimation from their own observations by means of factorized prior/posterior distributions. Second, a diffusion optimization step merges the nodes' estimates of the global parameters. A simulation example demonstrates improved performance in the estimation of both parameter sets.
*Translation on Graphs: An Isometric Shift Operator*-
In this letter, we propose a new shift operator for graph signals and enforce that this operator be isometric. In doing so, we ensure that as many properties of the time shift as possible carry over. Finally, we show that our operator behaves reasonably for graph signals.
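For context, the standard adjacency-based graph shift is generally not energy-preserving, unlike the cyclic time shift it generalizes, which is why an isometric operator is desirable. A quick numerical check (a generic illustration; the operator proposed in the letter is not reproduced here):

```python
import numpy as np

C = np.roll(np.eye(5), 1, axis=0)                      # cyclic time shift: a permutation, hence unitary
A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)   # adjacency of a 5-node path graph

x = np.ones(5)
norm_x = np.linalg.norm(x)
norm_time_shift = np.linalg.norm(C @ x)    # equals ||x||: the time shift is isometric
norm_graph_shift = np.linalg.norm(A @ x)   # differs from ||x||: the adjacency shift is not
```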
*Identifying Rumor Sources with Different Start Times*-
We study the problem of identifying multiple rumor or infection sources in a network under the susceptible-infected (SI) model. We do not assume that the sources start infection spreading at the same time. We introduce the notion of a quasi-regular tree as the basic model, and an abstract estimator, which includes several of the single source estimators developed in the literature. We develop a general two source joint estimation algorithm based on any abstract estimator, and show that it converges to a local optimum of the estimation function if the underlying network is a quasi-regular tree. We further extend our algorithm to more than two sources, and heuristically to general graphs.
*Estimation of Time-Varying Mixture Models: An Application to Traffic Estimation*-
Time-varying mixture models can be a useful tool for modelling complex data collections. However, the additional complexity of letting the number of mixture components vary over time makes inference of the distribution parameters even more difficult. We propose the automatic k-means algorithm to infer the parameters of these complex, time-varying mixture models. We demonstrate its performance using simulated and real data in a traffic estimation scenario.
*Synchronization for classical blind source separation algorithms in wireless acoustic sensor networks*-
The use of wireless acoustic sensor networks is becoming very popular since they offer many advantages. However, this type of distributed sensor network has an important drawback for many signal processing algorithms: the synchronization problem. Broadly speaking, in these networks the signals received at the different nodes are not synchronized, due to two main factors: clock offsets and significant differences in the propagation delays between sources and microphones. In this work, we introduce a synchronization solution for mixtures of two and three speech sources in the framework of blind source separation. The proposed approach applies a mixture alignment stage prior to the separation method. The results demonstrate that this synchronization method aligns speech mixtures correctly, since it improves the performance of the classical separation algorithm in terms of both speech quality and speech intelligibility.
*A Hawkes’ eye view of network information flow*-
An important problem that arises in the analysis of many complex networks is to identify the common pathways that enable the flow of information (or other quantities) through the network. This is a particularly challenging problem when the only information observed consists of the timing of events in the network. We develop a framework based on multidimensional Hawkes processes that can be used to determine how events are related. This extends the capability of Hawkes process-based models to infer how network events relate. We then show how a simple dynamic program can exploit this data to recognize chains of events and provide a much deeper insight into the behavior of nodes within the network. Simulations are provided to demonstrate the capabilities and limitations of this framework.
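As a concrete reference point, the conditional intensity of the multidimensional Hawkes processes underlying such models takes the following form (a minimal sketch with an exponential kernel and hypothetical parameter values, not the authors' inference framework):

```python
import numpy as np

def intensity(t, events, mu, alpha, beta):
    """Conditional intensity of every node of a multivariate Hawkes process at time t:
    baseline rate `mu` plus exponentially decaying excitation from past events.
    `events` is a list of (time, node) pairs; alpha[i, j] is how strongly an
    event at node j excites node i."""
    lam = mu.astype(float).copy()
    for tk, k in events:
        if tk < t:
            lam += alpha[:, k] * np.exp(-beta * (t - tk))
    return lam

mu = np.array([0.1, 0.1])
alpha = np.array([[0.0, 0.5],
                  [0.2, 0.0]])   # off-diagonal entries encode cross-node influence
lam = intensity(1.0, [(0.0, 0)], mu, alpha, beta=1.0)
# An event at node 0 at t=0 raises node 1's intensity at t=1 by 0.2 * exp(-1).
```

Fitting the excitation matrix `alpha` from event timestamps is what reveals how network events relate, which the paper's dynamic program then chains into information-flow pathways.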

**Tue-IIc SS: Bayesian detection and estimation techniques for radar applications**

*Weiss-Weinstein bound for an abrupt unknown frequency change*-
In this paper, we derive analytical expressions of the Weiss-Weinstein bound (WWB) in the context of observations whose frequency abruptly changes at an unknown time instant. Since the frequencies before and after the change are assumed to be unknown as well, it is appropriate to consider the multiparameter version of the WWB. Furthermore, numerical simulations are provided to illustrate the tightness of the proposed bound expressions with respect to the estimation errors.
*Adaptive Waveform Design for Target Detection with Sequential Composite Hypothesis Testing*-
This paper addresses the problem of adaptive waveform design for target detection with composite sequential hypothesis testing. We begin with an asymptotic analysis of the generalized sequential probability ratio test (GSPRT). The analysis is based on Bayesian considerations, similar to the ones used for the Bayesian information criterion (BIC) for model order selection. Following the analysis, a novel test, named penalized GSPRT (PGSPRT), is proposed on the basis of restraining the exponential growth of the GSPRT with respect to (w.r.t.) the sequential probability ratio test (SPRT). The performance measures of the PGSPRT in terms of average sample number (ASN) and error probabilities are also investigated. In the proposed waveform design scheme, the transmit spatial waveform is adaptively determined at each step based on observations in the previous steps. The waveform is determined to minimize the ASN of the PGSPRT. Simulations demonstrate the performance measures of the new algorithm for target detection in a multiple input, single output (MISO) channel.
*Estimation and compensation of I/Q imbalance for FMCW radar receivers*-
This paper deals with adaptive compensation of gain and phase errors possibly present in FMCW radars. In particular, we show that previously-proposed compensation procedures can be related to the LMMSE approach. In addition, we assess by Monte Carlo simulation gain and phase estimators and, eventually, the effectiveness of the compensation procedure.
*Improving Separating Function Estimation Tests Using Bayesian Approaches*-
Separating function estimation tests (SFETs) replace a detection problem with an estimation problem. In this paper, we study the relationship between improving the estimation of unknown parameters using Bayesian approaches and the performance of the corresponding SFET. Although the estimation method in the SFET is deterministic, we show that applying Bayesian methods to estimate the remaining unknown parameters, those not involved in the SF, provides improved SFET performance. We illustrate this idea using two important problems. In the first example, we consider a sinusoidal signal with unknown parameters in white noise. We show that a softmax function of the Fourier transform of the signal is a proper probability density function (pdf) for the frequency, improving the performance of the SFET. In the second example, a more accurate estimation of the unknown parameters of the signal is achieved using the minimum mean square error (MMSE) estimate of the random signal corrupted by white noise.
*Velocity ambiguity mitigation of off-grid range migrating targets via Bayesian sparse recovery*-
Within the scope of sparse signal representation, we consider the problem of velocity ambiguity mitigation for wideband radar signals. We present a robust Bayesian algorithm based on a new sparsifying dictionary suited to range-migrating targets possibly straddling range-velocity bins. Numerical results on experimental data demonstrate the ability of the proposed algorithm to mitigate velocity ambiguity.
*Bayesian Framework and Radar: On Misspecified Bounds and Radar-Communication Cooperation*-
The Bayesian framework is versatile as it allows for incorporation of prior knowledge or experience in making inference. The case of no prior knowledge at all is likewise seamlessly supported. The Bayesian framework is naturally suited to many fields of science and engineering including the discipline of radar design, analysis, and function. This paper will highlight recent findings on (i) Bayesian bounds under model misspecification that include the impact of using incorrect prior information, and (ii) parameter estimation for a cooperative radar-communication system.
*Detection in Multiple Channels Having Unequal Noise Power*-
A Bayesian detector is formulated for the problem of detecting a signal of known rank using data collected at multiple sensors. The noise on each sensor channel is white and Gaussian, but its variance is unknown and may differ from channel to channel. A low-SNR assumption enables approximation of one of the marginalization integrals in the likelihood ratio, yielding a tractable approximate Bayesian detector for this regime. The performance of this detector is evaluated and compared to other recently introduced detectors.

**Tue-IId SS: Making sense out of multi-channel physiological data for pervasive health applications**

*Accelerometer-based gait assessment: pragmatic deployment on an international scale*-
Gait is emerging as a powerful tool to detect early disease and monitor progression across a number of pathologies. Typically, quantitative gait assessment has been limited to specialised laboratory facilities. However, measuring gait in home and community settings may provide a more accurate reflection of gait performance because: (1) it is not confounded by attention, which may be heightened during formal testing; and (2) it allows performance to be captured over time. This work addresses the feasibility and challenges of measuring gait characteristics with a single accelerometer-based wearable device during free-living activity. Moreover, it describes the current methodological and statistical processes required to quantify those sensitive surrogate markers for ageing and pathology. A unified framework for large-scale analysis is proposed. We present data and workflows from healthy older adults and those with Parkinson’s disease (PD), while presenting current algorithms and their scope within modern pervasive healthcare. Our findings suggest that free-living conditions heighten between-group differences, showing greater sensitivity to PD, and provide encouraging results to support the use of the suggested framework for large clinical applications.
*Cortical Distribution of N400 Potential in Response to Semantic Priming with Visual Non-Linguistic Stimuli*-
We conducted a study on visual semantic priming using related and unrelated image pairs while simultaneously recording electroencephalography (EEG) from 27 scalp electrodes and electrocorticography (ECoG) from a mixture of deep brain and subdural grid/strip electrodes in the left and right hippocampus, the right temporobasal and temporo-lateral cortices, and the left temporal cortex. The EEG data showed a clear centro-parietal, bi-hemispheric N400 effect in response to unrelated image pairs compared to related ones. Although with ECoG the N400 effect was more widely spread across both hemispheres compared to linguistic stimuli, it was relatively localized within each ECoG grid, as it was present only in some electrodes and, in some cases, even had its polarity reversed. We suggest that this could be due to some grids gauging dipoles at different positions when covering sulci and gyri.
*Reconstruction of Brain Activity in EEG / MEG Using Reduced-Rank Nulling Spatial Filter*-
We consider the problem of reconstruction of brain activity from electroencephalography (EEG) and magnetoencephalography (MEG) using spatial filtering (beamforming). We assume the presence of interfering sources, whose activity may be highly correlated with the activity of the sources to be reconstructed. Such a situation causes the celebrated linearly constrained minimum-variance (LCMV) filter to produce erroneous estimates due to source cancelation. This problem is especially acute if the signal-to-noise ratio (SNR) is low. We propose a robust reduced-rank nulling spatial filter designed to overcome these drawbacks of the LCMV filter. The proposed filter is a natural reduced-rank extension of the recently introduced nulling spatial filter, allowing efficient implementation of nulling constraints in the low-SNR regime. It is equipped with a rank-selection criterion minimizing the mean-square error (MSE) of the resulting estimate. We verify its improved performance in the challenging conditions described above in comparison with established spatial filtering techniques.
*Efficient Low-Rank Spectrotemporal Decomposition using ADMM*-
Classical time-evolving spectral analysis techniques utilize a sliding window approach that fails to exploit overarching spectrotemporal structures that are known to occur in many real-world signals. In particular, many biological signals have the distinct quality of having few defining spectral characteristics. We propose an algorithm for efficiently estimating a low-rank spectrotemporal decomposition. While existing approaches yield such representations by penalizing a group norm of successive differences, our insight is that a penalty that promotes low-rank yields more flexible representations and an efficient distributed implementation. We demonstrate on simulated time series data and human electroencephalogram (EEG) recordings that this low-rank spectrotemporal decomposition can provide a spectral representation of time series that highlights salient features and reduces the effects of noise.
*Dimensionality Reduction of Sample Covariance Matrices by Graph Fourier Transform for Motor Imagery Brain-Machine Interface*-
An efficient method for dimensionality reduction in classification of multi-class electroencephalogram (EEG) during motor imagery (MI), aimed at brain-machine interfacing, is proposed. In this method, the reduction of dimensions is achieved by spectral decomposition of a given graph, which is defined by the geometrical distribution of electrodes on the head surface. The resulting subspace reduces the dimension of the EEG signals, and therefore the size of the sample covariance matrix (SCM) of the EEG can also be reduced. The reduction method is combined with a differential geometry-based approach called tangent space mapping (TSM), which maps an SCM in a Riemannian manifold onto an element of a Euclidean space called the tangent space. Thus, any machine learning algorithm that works in Euclidean space can be applied. Results of two-class and four-class classification of EEG during MI show that the proposed method of dimensionality reduction increases the recognition accuracy even in the case of a training dataset with a small number of elements.
*Estimation of High-Dimensional Connectivity in fMRI Data via Subspace Autoregressive Models*-
We consider the challenge of estimating effective connectivity of brain networks with a large number of nodes from fMRI data. Classical vector autoregressive (VAR) modeling tends to produce unreliable estimates in large dimensions due to the huge number of parameters. We propose a subspace estimator for the large-dimensional VAR model based on a latent variable model. We derive a subspace VAR model with the observational and noise processes driven by a few latent variables, which allows for a lower-dimensional subspace of the dependence structure. We introduce a fitting procedure that first estimates the latent space by principal component analysis (PCA) of the residuals and then reconstructs the subspace estimators from the PCs. Simulation results show the superiority of the subspace VAR estimator over conventional least squares (LS) in high-dimensional settings, with improved accuracy and consistency. Application to estimating large-scale effective connectivity from resting-state fMRI shows the ability of our method to identify interesting modular structure of human brain networks during rest.
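The benefit of restricting a large VAR model to a low-dimensional latent structure can be illustrated with a crude stand-in: fit the full least-squares VAR and truncate its transition matrix to low rank via an SVD. This is a generic sketch (dimensions, noise level, and the SVD-truncation step are illustrative assumptions, not the paper's PCA-based subspace estimator):

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, r = 30, 500, 2                       # nodes, time points, latent dimension

# Ground truth: a VAR(1) whose transition matrix has rank r (low-dimensional dynamics).
U, V = rng.normal(size=(N, r)), rng.normal(size=(N, r))
A_true = 0.5 * U @ V.T / np.linalg.norm(U @ V.T, 2)

Y = np.zeros((N, T))
for t in range(1, T):
    Y[:, t] = A_true @ Y[:, t - 1] + 0.1 * rng.normal(size=N)

# Full least-squares estimate (N^2 parameters) ...
A_ls = Y[:, 1:] @ np.linalg.pinv(Y[:, :-1])
# ... and its best rank-r approximation.
Uh, s, Vh = np.linalg.svd(A_ls)
A_rr = (Uh[:, :r] * s[:r]) @ Vh[:r]

err_ls = np.linalg.norm(A_ls - A_true)   # inflated by noise spread over all N^2 entries
err_rr = np.linalg.norm(A_rr - A_true)   # truncation discards most of that noise
```

With the estimation noise spread over all N² entries, constraining the estimate to the (correct) low-dimensional structure discards most of it, which is the effect the subspace VAR estimator exploits in a more principled way.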
*Hierarchical Online SSVEP Spelling Achieved With Spatiotemporal Beamforming*-
Steady-State Visual Evoked Potentials (SSVEP) are widely adopted in brain-computer interface (BCI) applications. To increase the number of selectable targets, joint frequency- and phase-coding is sometimes used but it has only been tested in offline settings.

In this study, we report on an online, hierarchical SSVEP spelling application that relies on joint frequency/phase-coded targets and, in addition, propose a new decoding scheme based on spatiotemporal beamforming combined with time-domain EEG analysis. Experiments on 17 healthy subjects confirm that, with our new decoding scheme, accurate spelling can be performed in an online setting, even when using short stimulation lengths (1 s) and closely separated stimulation frequencies (1 Hz).

**Tuesday, June 28, 15:00 – 16:00: ***Plenary talk: Zoubin Ghahramani*

*Plenary talk: Zoubin Ghahramani*

**Tuesday, June 28, 16:30 – 18:00**

**Tue-IIIa: Machine learning and pattern recognition II**

*Efficient KLMS and KRLS Algorithms: A Random Fourier Feature Perspective*-
We present a new framework for online least-squares algorithms for nonlinear modeling in reproducing kernel Hilbert spaces (RKHS). Instead of implicitly mapping the data to an RKHS (e.g., via the kernel trick), we map the data to a finite-dimensional Euclidean space using random features of the kernel’s Fourier transform. The advantage is that the inner product of the mapped data approximates the kernel function. The resulting “linear” algorithm does not require any form of sparsification, since, in contrast to all existing algorithms, the solution’s size remains fixed and does not grow with the iteration steps. As a result, the obtained algorithms are computationally significantly more efficient than previously derived variants, while, at the same time, they converge at similar speeds and to similar error floors.
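The random-feature approximation at the heart of such algorithms can be sketched in a few lines (a generic Rahimi-Recht construction for the Gaussian kernel with arbitrary dimensions; not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d, D, sigma = 5, 5000, 1.0   # input dimension, number of random features, kernel width

# Sample frequencies from the Fourier transform of the Gaussian kernel, plus random phases.
W = rng.normal(scale=1.0 / sigma, size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

def z(x):
    """Explicit finite-dimensional feature map: z(x) . z(y) approximates k(x, y)."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma**2))   # Gaussian kernel value
approx = z(x) @ z(y)                                       # random-feature estimate

# Any linear online algorithm (LMS, RLS) run on z(x) then behaves like a kernel
# method whose model size stays fixed at D, independent of the number of iterations.
```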
*Designing Classifier Architectures using Information Theory*-
An architecture for hypothesis testing systems is analyzed using tools from information theory. The classifier architecture consists of a Markov Chain of statistical signal processing blocks. As part of the analysis, a mathematical framework for tradeoff studies among system components is introduced.
*Democratic prior for anti-sparse coding*-
Anti-sparse coding aims at spreading the information uniformly over the representation coefficients and can be naturally expressed through an $\ell_{\infty}$-norm regularization. This paper derives a probabilistic formulation of this problem. To that end, a new probability distribution is introduced. This so-called *democratic* distribution is then used as a prior to promote anti-sparsity in a linear Gaussian inverse problem. A Gibbs sampler is designed to generate samples asymptotically distributed according to the joint posterior distribution of interest. To scale to higher dimensions, a proximal Markov chain Monte Carlo algorithm is proposed as an alternative to Gibbs sampling. Simulations on synthetic data illustrate the performance of the proposed method for anti-sparse coding with a complete dictionary. Results are compared with the recent deterministic variational FITRA algorithm.
*Group invariant subspace learning for outlier detection*-
In this paper, we present a novel method for detecting outliers when the images are misaligned by the action of a finite group. Our approach rests on robust learning of group-invariant subspaces in the presence of outliers. By group-invariant subspaces, we mean subspaces of a vector space that are invariant to the action of a finite (Abelian) group. Such scenarios naturally arise in computer vision problems when one is interested in shift (translation) or rotation invariant image processing. While the proposed methods are general, we focus on misalignment by the group of circular shifts on 2-D images and show that our methods are effective in detecting outliers in real data sets (the YaleB and MNIST databases) and outperform methods that do not take the group invariance into account.
*Human Authentication From Ankle Motion Data Using Convolutional Neural Networks*-
We present a data acquisition and signal processing framework for the authentication of users from their gait signatures (accelerometer and gyroscope data). An ankle-worn inertial measurement unit (IMU) is utilized to acquire the raw motion data, which is pre-processed and used to train a number of signal processing tools, including a convolutional neural network (CNN) for the extraction of features as well as one-class single- and multi-stage classifiers. The CNN is trained (offline and only once) using a representative set of subjects and is then exploited as a universal feature extractor, i.e., to extract relevant features of walking patterns from previously unseen subjects. The one-class classifier is trained on the subject that we intend to authenticate and employed to gauge new motion data. Scores from the one-class classifier are finally fed into a multi-stage decision maker, which performs a sequential decision testing for improved accuracy. The system operates in an online fashion, delivering excellent results, while requiring in the worst case fewer than five walking cycles to reliably authenticate the user.
*Unsupervised segmentation of piecewise constant images from incomplete, distorted and noisy data*-
The paper tackles the problem of piecewise constant image segmentation. A triple degradation model is assumed for the observation system: missing data, non-linear gain and additive noise. The proposed solution follows a Bayesian strategy that yields optimal decisions and estimations. A numerical approach is used to explore the intricate posterior distribution: a Gibbs sampler including a Metropolis-Hastings step. The posterior samples are subsequently used in computing the estimates and the decisions. A first numerical evaluation provided encouraging results despite the triple degradation.
*Multilinear Subspace Clustering*-
In this paper we present a new model and an algorithm for unsupervised clustering of 2-D data such as images. We assume that the data comes from a union of multilinear subspaces (UOMS) model, which is a specific structured case of the much studied union of subspaces (UOS) model. For segmentation under this model, we develop the Multilinear Subspace Clustering (MSC) algorithm and evaluate its performance on the YaleB and Olivetti image data sets. We show that MSC is highly competitive with existing algorithms employing the UOS model in terms of clustering performance while enjoying improved computational complexity.
*Order-based Generalized Multivariate Regression*-
In this paper, we consider a generalized multivariate regression problem where the responses are monotonic functions of linear transformations of predictors. We propose a semi-parametric algorithm based on the ordering of the responses which is invariant to the functional form of the transformation function. We prove that our algorithm, which maximizes the rank correlation of responses and linear transformations of predictors, is a consistent estimator of the true coefficient matrix. We also identify the rate of convergence and show that the squared estimation error decays with a rate of $o(1/\sqrt{n})$. We then propose a greedy algorithm to maximize the highly non-smooth objective function of our model and examine its performance through simulations. Finally, we compare our algorithm with traditional multivariate regression algorithms over synthetic and real data.
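The rank-correlation objective described above can be sketched as follows (a minimal illustration, not the authors' implementation; their greedy maximizer is omitted). Because the objective depends only on the ordering of responses, it is invariant to any monotonic link function:

```python
import numpy as np

def rank_correlation(y, x_proj):
    """Kendall-type rank correlation: normalized count of concordant
    minus discordant pairs between responses and linear projections."""
    n = len(y)
    score = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            score += np.sign(y[i] - y[j]) * np.sign(x_proj[i] - x_proj[j])
    return 2.0 * score / (n * (n - 1))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
beta_true = np.array([1.0, -2.0, 0.5])
# Unknown monotonic transformation: rank correlation is invariant to it.
y = np.exp(X @ beta_true)

# The true coefficient direction scores higher than a random one.
assert rank_correlation(y, X @ beta_true) > rank_correlation(y, X @ rng.normal(size=3))
```

Since `exp` is strictly increasing, every pair is concordant under the true coefficients, so the objective attains its maximum value of 1 there.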
*Hierarchical Bayesian variable selection in the probit model with mixture of nominal and ordinal responses*-
Multi-class classification problems have been studied for pure nominal and pure ordinal responses. However, there are cases where the multi-class responses are a mixture of nominal and ordinal. To address this problem we build a hierarchical multinomial probit model with a mixture of both types of responses using latent variables. The nominal responses are each associated with distinct latent variables, whereas the ordinal responses share a single latent variable. Our approach first treats the ordinal responses as a single nominal category and then separates the ordinal responses within this category. We introduce sparsity into the model using Bayesian variable selection within the regression in order to improve variable selection and classification accuracy. Two indicator vectors (indicating presence of the covariate) are used, one for nominal and one for ordinal responses. We develop efficient posterior sampling. Using simulated data, we compare the classification accuracy of our method to existing ones.
*Jeffreys Prior Regularization for Logistic Regression*-
Logistic regression is a statistical model commonly used for solving classification problems. Maximum likelihood is used to train the model parameters. When the data from two classes are linearly separable, maximum likelihood is ill-posed for logistic regression. To address this problem, as well as to handle over-fitting issues, regularization is considered. A regularization coefficient is used to balance the trade-off between model complexity and data fit. Cross-validation is commonly used to determine this parameter. In this paper, we develop a regularization framework for logistic regression using the Jeffreys prior, which is free of any tuning parameters. Our experiments show that the proposed method achieves promising results in comparison with recent well-known regularization techniques.
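The Jeffreys prior for logistic regression penalizes the likelihood with half the log-determinant of the Fisher information (this coincides with Firth's bias-reduced estimator). A minimal sketch, assuming NumPy/SciPy and a tiny 1-D separable example, shows the penalty keeping the estimate finite where plain maximum likelihood diverges:

```python
import numpy as np
from scipy.optimize import minimize

def penalized_nll(beta, X, y):
    """Negative log-likelihood minus the log Jeffreys prior,
    log|X' W X|^(1/2), where W = diag(p(1-p))."""
    z = np.clip(X @ beta, -30, 30)
    p = 1.0 / (1.0 + np.exp(-z))
    nll = -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    W = p * (1 - p)
    _, logdet = np.linalg.slogdet(X.T @ (W[:, None] * X))
    return nll - 0.5 * logdet

rng = np.random.default_rng(1)
# Linearly separable classes: plain maximum likelihood diverges here,
# but the Jeffreys penalty keeps the estimate finite.
X = np.vstack([rng.normal(-2, 0.5, (20, 1)), rng.normal(2, 0.5, (20, 1))])
X = np.hstack([np.ones((40, 1)), X])       # intercept + one feature
y = np.r_[np.zeros(20), np.ones(20)]

res = minimize(penalized_nll, np.zeros(2), args=(X, y), method="BFGS")
assert np.all(np.isfinite(res.x))
```

No tuning parameter appears anywhere, which is the point of the approach.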

**Tue-IIIb: Signal processing for communications**

*A Finite Moving Average Test for Transient Change Detection in GNSS Signal Strength Monitoring*-
Due to the increasing interest in Global Navigation Satellite Systems (GNSS) for safety-critical applications, one of the major challenges to be solved is the provision of integrity in urban environments. In recent years, it has been noted that, to do so, the integrity of the received signal must be analyzed with the aim of detecting any local effect disturbing the GNSS signal. Moreover, the detection of such disturbing effects must be done with a bounded delay, since the presence of any local effect may cause large position errors. This work addresses the signal integrity problem as a transient change detection problem by proposing a stopping time based on a Finite Moving Average. The statistical performance of this stopping time is investigated and compared, in the context of multipath detection relying on C/N0 monitoring, to different methods available in the literature. Numerical results are presented in order to assess their performance.
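The finite moving average stopping rule can be sketched as follows (a toy Gaussian mean-shift version with made-up window and threshold values standing in for a C/N0 disturbance; the paper's contribution is the statistical design and analysis of these parameters):

```python
import numpy as np

def fma_stopping_time(x, window, threshold, mu0=0.0, mu1=1.5, sigma=1.0):
    """Finite Moving Average rule: raise an alarm the first time the sum
    of the last `window` log-likelihood ratios exceeds `threshold`."""
    llr = ((x - mu0) ** 2 - (x - mu1) ** 2) / (2 * sigma ** 2)
    for n in range(window, len(x) + 1):
        if llr[n - window:n].sum() > threshold:
            return n
    return None  # no alarm raised

rng = np.random.default_rng(2)
# Nominal behaviour up to sample 200, then a transient disturbance.
x = np.r_[rng.normal(0.0, 1.0, 200), rng.normal(1.5, 1.0, 50)]
alarm = fma_stopping_time(x, window=20, threshold=10.0)
assert alarm is not None and alarm > 200
```

Because only the last `window` samples enter the statistic, the detection delay is bounded by construction, which matches the integrity requirement described above.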
*Statistical Analysis and Optimization of FFR/SFR-aided OFDMA-based Multi-cellular Networks*-
Interference coordination techniques are incorporated in OFDMA-based multi-cellular networks, allowing near-universal frequency reuse while preserving reasonably high spectral efficiencies over the whole coverage area. Two very representative strategies are fractional frequency reuse (FFR) and soft frequency reuse (SFR), which are deemed to play a key role in current and next generation networks. This paper presents a statistical characterization of FFR/SFR-aided networks that is subsequently used to optimize various operational parameters. The proposed design is capable of trading off throughput performance and fairness by suitably dimensioning the inner and outer cellular areas, the frequency allocation to each of these regions and the corresponding transmit power.
*Analog joint source channel coding over MIMO fading channels with imperfect CSI*-
In this work, analog Joint Source Channel Coding (JSCC) techniques are considered for the transmission of independent data over fading Multiple Input Multiple Output (MIMO) channels assuming imperfect Channel State Information (CSI). In general, analog JSCC schemes require channel knowledge for both the design of the receive filter and the optimization of the encoder parameters. The inaccuracy of this information leads to a severe performance degradation. Robust strategies are required to mitigate the impact of the imperfect CSI. In this work, a statistical characterization of the channel estimation error is employed to design a robust linear MMSE receiver and to select adequate encoder parameters. Simulation results are presented to evaluate the performance loss due to imperfect CSI and to illustrate the gain of the proposed robust strategy. In addition, the optimal distortion-cost tradeoff with imperfect CSI is determined.
*A new approach for solving anti-jamming games in stochastic scenarios as pursuit-evasion games*-
We solve a communication problem between a UAV and a set of relays, in the presence of a jamming UAV, using differential game theory tools. The standard solution involves a set of coupled Bellman equations which are hard to solve. We propose a new approach in which this kind of games can be approximated as pursuit-evasion games. The problem is posed in terms of optimizing capacity and it is approximated as a zero-sum, pursuit-evasion game. This game is solved using a set of differential equations known as Isaacs equations and simulations are run in order to validate the results.
*Weighted Sum Rate Maximization of MISO Interference Broadcast Channels via Difference of Convex Functions Programming: A Large System Analysis*-
The weighted sum rate (WSR) maximizing linear precoder algorithm is studied in large correlated multiple-input single-output (MISO) interference broadcast channels (IBC). We consider an iterative WSR design via successive convex approximation as in [1], [2] and [3], focusing on the version in [3]. We propose an asymptotic approximation of the signal-to-interference plus noise ratio (SINR) at every iteration. Simulations show that the asymptotic approximations are accurate.
*Recursive End-To-End Distortion Estimation for Error-Resilient Adaptive Predictive Compression Systems*-
Linear prediction is widely used in speech, audio and video coding systems. Predictive coders often operate over unreliable channels or networks prone to packet loss, wherein errors propagate through the prediction loop and may catastrophically degrade the reconstructed signal at the decoder. To mitigate this problem, end-to-end distortion (EED) estimation, accounting for error propagation and concealment at the decoder, has been developed for video coding, and enables optimal rate-distortion (RD) decisions at the encoder. However, this approach was limited to the video coder's simple setting of a single-tap, constant-coefficient temporal predictor. This paper considerably generalizes the framework to account for: i) high-order prediction filters, and ii) filter adaptation to local signal statistics. Specifically, we propose to simultaneously track the decoder statistics of the reconstructed signal and the prediction parameters, which enables effective estimation of the overall EED. We first demonstrate the accuracy of the EED estimate in comparison to extensive simulation of transmission through a lossy network. Finally, experimental results demonstrate how this EED estimate can be leveraged, by an encoder with short- and long-term linear prediction, to improve RD decisions and achieve major performance gains.
*Study of Statistical Robust Closed Set Speaker Identification with Feature and Score-Based Fusion*-
In this paper, the statistical combination of Power Normalization Cepstral Coefficient (PNCC) and Mel Frequency Cepstral Coefficient (MFCC) features in robust closed set speaker identification is studied. Feature normalization and warping together with late score-based fusion are also exploited to improve performance in the presence of channel and noise effects. In addition, combinations of score and feature-based approaches are considered with early and/or late fusion; these systems use different feature dimensions (16, 32). A 4th order G.712 type IIR filter is employed to represent handset degradation in the channel. Simulation studies based on the TIMIT database confirm the improvement in Speaker Identification Accuracy (SIA) through the combination of PNCC and MFCC features in the presence of handset and Additive White Gaussian Noise (AWGN) effects.
*Analog Distributed Coding of Correlated Sources for Fading Multiple Access Channels*-
In this work, we address the analog transmission of correlated information over fading Multiple Access Channels (MACs) using analog Joint Source Channel Coding (JSCC). We consider module-like mappings to encode the source data and the utilization of different orthogonal access schemes. The user information is individually mapped at each transmitter and the receiver exploits the source correlation and the properties of the module mappings to decode the received symbols. We also propose a Maximum-A-Posteriori (MAP) method which achieves similar performance to that of the optimal decoding with significantly lower complexity. The obtained results confirm the potential of analog JSCC techniques to transmit correlated data over fading MACs.
*Generalized Integration techniques for high-sensitivity GNSS receivers affected by oscillator phase noise*-
This paper addresses the use of generalized correlations in the context of High-Sensitivity Global Navigation Satellite System (HS-GNSS) receivers. Generalized correlations are also referred to as post-detection integration (PDI) techniques or simply as non-coherent integration methods. The contributions of this work are twofold. On the one hand, a novel PDI method is presented, which improves the performance of methods found in the literature for small errors of the frequency offset. On the other hand, an exhaustive comparative performance analysis is provided between the proposed technique and the existing ones in the presence of phase noise coming from the local oscillator. To this end, results have been obtained for two different clocks, namely a temperature compensated crystal oscillator (TCXO) and an oven-controlled crystal oscillator (OCXO). In both cases, the proposed technique outperforms the existing ones.
*Measurement Matrix Design For Compressive Sensing With Side Information at the Encoder*-
We study the problem of measurement matrix design for Compressive Sensing (CS) when the encoder has access to side information, a signal analogous to the signal of interest. We propose a novel design scheme to incorporate the side information into the acquisition process in order to reduce the number of encoding measurements, while still allowing perfect signal reconstruction at the decoder. We analyse the reconstruction performance of the resulting CS system assuming the decoder reconstructs the signal via Basis Pursuit. Finally, we leverage Gaussian width related tools to establish a tight theoretical bound for the number of required measurements. Extensive numerical experiments not only validate our approach, but also demonstrate that it requires fewer measurements than alternative designs, such as an i.i.d. Gaussian matrix.

**Tue-IIIc SS: Statistical signal processing and learning in smart grid**

*An Approximation Algorithm for future wind scenarios*-
The success of stochastic optimization methods in helping power systems operations cope with the increasing penetration of wind and solar power rests on the effective construction of scenario trees as approximations of the true probability space of the renewable power stochastic process. In this work, while analyzing the statistical properties of wind power samples from a wind farm in Washington state (WA) for the years 2012-2014, we first identify gaps in traditional modeling approaches and propose possible solutions. The key idea we propose is to view scenario tree generation as a form of compression of the wind power trajectories, sidestepping the model selection approach completely. We argue that, to retain key features of the high-order statistics, one can directly quantize realizations over the optimization horizon or perform the quantization in a finite subspace of Morlet wavelets. In fact, the wind time series tend to be characterized by the same set of localized features and have a sparse representation over the Morlet basis. We analyze our scenario reduction performance compared to time domain methods.
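Viewing scenario generation as compression can be sketched as trajectory quantization (a hedged illustration with a synthetic two-regime ensemble and plain Lloyd's algorithm; the paper additionally considers quantization in a Morlet-wavelet subspace, which this sketch omits):

```python
import numpy as np

def quantize_trajectories(paths, k, iters=50, seed=0):
    """Reduce an ensemble of trajectories to k representative scenarios
    with Lloyd's algorithm (k-means on whole trajectories)."""
    rng = np.random.default_rng(seed)
    centers = paths[rng.choice(len(paths), k, replace=False)]
    for _ in range(iters):
        # assign each trajectory to its nearest scenario
        d = ((paths[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = paths[labels == j].mean(axis=0)
    probs = np.bincount(labels, minlength=k) / len(paths)
    return centers, probs

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 24)
# Synthetic "wind" ensemble over a 24-step horizon: two regimes plus noise.
paths = np.vstack([t + 0.1 * rng.normal(size=(100, 24)),
                   1 - t + 0.1 * rng.normal(size=(100, 24))])
scenarios, probs = quantize_trajectories(paths, k=2)
assert np.isclose(probs.sum(), 1.0)
```

Each scenario carries an empirical probability, which is exactly the form a scenario-tree node needs for stochastic optimization.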
*Online Learning and Pricing for Demand Response in Smart Distribution Networks*-
The problem of online learning of consumer response to retail pricing of electricity in a distribution network is considered. In a two-settlement market, the retailer who sets the retail price is exposed to risks from the stochastic response of its consumers and the real-time price fluctuation in the wholesale market. The optimal price maximizing the expected profit is a function of the consumers' response to prices, and any pricing scheme under an unknown demand model accumulates regret, measured by the difference between the total expected profit of the retailer under known and unknown demand models.

This paper presents an online learning approach to dynamic pricing aimed at minimizing the regret of the retailer for consumers with unknown Markov jump affine demand. It is shown that the regret of the proposed policy has the lowest order of regret growth, characterized by the square root of the learning horizon.

*Decentralized MMSE Attacks in Electricity Grids*-
Decentralized data-injection attack constructions with minimum-mean-square-error state estimation are studied in a game-theoretic setting. Within this framework, the interaction between the network operator and the set of attackers, as well as the interactions among the attackers, are modeled by a game in normal form. A novel utility function that captures the trade-off between the maximum distortion that an attack can introduce and the probability of the attack being detected by the network operator is proposed. Under the assumption that the state variables can be modelled as a multivariate Gaussian random process, it is shown that the resulting game is a potential game. The cardinality of the corresponding set of Nash Equilibria (NEs) of the game is analyzed. It is shown that attackers can agree on a data injection vector construction that achieves the best trade-off between distortion and detection probability by sharing only a limited number of bits offline. Interestingly, this vector construction is also shown to be an NE of the resulting game.
*A Graphical Approach to Quickest Outage Localization in Power Grids*-
Line outage detection and localization play pivotal roles in contingency analysis, power flow optimization, and situational awareness delivery in power grids. Hence, agile detection and localization of line outages enhances the efficiency of operations and their resilience against cascading failures. This paper proposes a stochastic graphical framework for localizing line outages. This framework capitalizes on the correlation among the measurements generated across the grid, where the correlation is induced by the connectivity topology of the grid. By formalizing a proper correlation model, this paper designs data-adaptive coupled data-acquisition and decision-making processes for the quickest localization of the line outages. This leads to efficient outage localization by using only partial measurements and is shown to outperform the existing dimensionality reduction methods.
*Estimating Treatment Effects in Demand Response*-
Demand response is designed to motivate electricity customers to modify their loads at critical time periods. Accurately estimating customers' response to demand response signals is crucial to the success of these programs. In this paper, we consider signals in demand response programs as a treatment to the customers and estimate the average treatment effect. Specifically, we adopt the linear regression model and derive several consistent linear regression estimators. From both synthetic and real data, we show that including more information about the customers does not always improve estimation accuracy: the interaction between the side information and the demand response signal must be carefully modeled. We then apply the so-called modified covariate method to capture these interactions and show it can strike a balance between having more data and model correctness. Our results are validated using data collected by Pecan Street.
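The modified covariate idea can be sketched on synthetic data (a hedged illustration with a linear outcome model and NumPy only; here `w` plays the role of the demand-response signal and `x` of customer side information). The covariates are multiplied by (2w-1)/2, so the coefficient on the modified intercept estimates the average treatment effect even when the effect varies with the side information:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
x = rng.normal(size=(n, 2))                 # customer side information
w = rng.choice([0, 1], size=n)              # demand-response signal (treatment)
# Response: baseline load plus a heterogeneous treatment effect.
tau = 1.0 + 0.5 * x[:, 0]                   # effect depends on side information
y = 2.0 + x @ np.array([0.3, -0.2]) + w * tau + rng.normal(0, 0.5, n)

# Modified covariate method: regress y on (2w-1)/2 * [1, x] plus main effects.
z = ((2 * w - 1) / 2)[:, None] * np.c_[np.ones(n), x]
design = np.c_[np.ones(n), x, z]
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
ate_hat = coef[3]                           # coefficient on the modified intercept
assert abs(ate_hat - 1.0) < 0.1             # true average effect is 1.0
```

Dropping the interaction columns of `z` would leave extra side information unused yet mis-modeled, which is the trade-off between data and model correctness the abstract describes.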
*Dynamic Decentralized Voltage Control for Power Distribution Networks*-
Voltage regulation in power distribution networks has been increasingly challenged by the integration of volatile and intermittent distributed energy resources (DERs). These resources can also provide limited reactive power that can be used to optimize the network-wide voltage. A decentralized voltage control scheme based on the gradient-projection (GP) method is adopted to minimize a voltage mismatch error objective under limited reactive power. Coupled by the power network flow, the local voltage directly provides the instantaneous gradient information. This paper aims to quantify the performance of this decentralized GP-based voltage control under dynamic system operating conditions modeled by an autoregressive process. Our analysis offers the tracking error bound on the instantaneous solution to the transient optimizer. Under stochastic processes that have bounded iterative changes, the results can be extended to general constrained dynamic optimization problems with smooth strongly convex objective functions. Numerical tests have been performed to validate our analytical results using a 21-bus network.
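A static snapshot of the gradient-projection update can be sketched as follows (a hypothetical linearized sensitivity matrix and box limits; the paper's contribution is the tracking analysis under time-varying conditions, which this sketch omits). With a suitably weighted voltage-mismatch objective, each bus's gradient entry is simply its own voltage deviation, which is what makes the scheme decentralized:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
A = rng.normal(size=(n, n))
Xs = A @ A.T / n + np.eye(n)              # hypothetical PD voltage sensitivity
v_open = 1.0 + rng.normal(0.0, 0.05, n)   # voltages with no reactive support
q_max = 0.2                               # reactive power box constraint

def mismatch(q):
    """Weighted voltage mismatch 0.5*(v-1)' Xs^{-1} (v-1); its gradient
    with respect to q is exactly the local voltage deviation v - 1."""
    e = v_open + Xs @ q - 1.0
    return 0.5 * e @ np.linalg.solve(Xs, e)

q = np.zeros(n)
step = 1.0 / np.linalg.eigvalsh(Xs).max()
for _ in range(200):
    v = v_open + Xs @ q
    q = np.clip(q - step * (v - 1.0), -q_max, q_max)   # gradient projection

assert mismatch(q) <= mismatch(np.zeros(n))
```

Each bus only needs its own voltage measurement and its own reactive limit, so no communication is required inside the loop.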
*Learning to Infer: a New Variational Inference Approach for Power Grid Topology Identification*-
Identifying arbitrary topologies of power networks is a computationally hard problem due to the number of hypotheses that grows exponentially with the network size. A new variational inference approach is developed for efficient marginal inference of every line status in the network. Optimizing the variational model is transformed to and solved as a discriminative learning problem. A major advantage of the developed learning-based approach is that the labeled data used for learning can be generated in an arbitrarily large amount at very little cost. As a result, the power of offline training is fully exploited to offer effective real-time topology identification. The proposed methods are evaluated on the IEEE 30-bus system. With relatively simple variational models and only an undercomplete measurement set, the proposed methods already achieve reasonably good performance in identifying arbitrary power network topologies.

## Wednesday, June 29

**Wednesday, June 29, 09:00 – 10:00: ***Plenary talk: Pablo Laguna*

*Plenary talk: Pablo Laguna*

**Wednesday, June 29, 10:30 – 12:00**

**Wed-Ia: Applications (biomedical, energy, security)**

*Sparse Genomic Structural Variant Detection: Exploiting Parent-Child Relatedness for Signal Recovery*-
Structural variants (SVs) – rearrangements of an individual's genome – are an important source of heterogeneity in human and other mammalian species. Typically, SVs are identified by comparing fragments of DNA from a test genome to a known reference genome, but errors in both the sequencing and the noisy mapping process contribute to high false positive rates. When multiple related individuals are studied, their relatedness offers a constraint to improve the signal of true SVs. We develop a computational method to predict SVs given genomic DNA from a child and both parents. We demonstrate that enforcing relatedness between individuals and constraining our solution with a sparsity-promoting $\ell_1$ penalty (since SV instances should be rare) results in improved performance. We present results on both simulated genomes as well as two sequenced parent-child trios from the 1000 Genomes Project.
*Multiscale Time Irreversibility to predict Orthostatic Intolerance in Older People*-
Orthostatic intolerance (OI) is a clinical syndrome characterized by symptoms and loss of consciousness preceding impending syncope, and it has been reported to be caused by orthostatic hypotension (OH). Healthy subjects and people with diseases can often be distinguished by the complexity of their physiological activity. The phenomenon of irreversibility is specific to non-equilibrium systems, and its presence in hemodynamic variables results from the complexity of the cardiovascular control system typical of healthy humans. This study focuses on quantifying the effect of OI on the time irreversibility of the heart rate (HR), cardiac output (CO) and systolic blood pressure (SBP) time series during a six-minute walking distance test in symptomatic older people, who were compared using the following multiscale time irreversibility indexes: Porta's (Pm%), Guzik's (Gm%) and Euclidean distance (Dm). We analyzed 65 older subjects, of whom 42 were women. Results show higher indexes in non-OI groups, especially during the descent phase and in the subsequent passive phase. This study shows that the irreversibility indexes are useful measures for extracting non-linearity properties of hemodynamic parameters in order to find differences during orthostatism.
*Pose estimation of cyclic movement using inertial sensor data*-
We propose a method for estimating the rotation and displacement of a rigid body from inertial sensor data based on the assumption that the movement is cyclic in nature, meaning that the body returns to the same position and orientation at regular time intervals. The method builds on a parameterization of the movement by sums of sinusoids, and the amplitude and phase of the sinusoids are estimated from the data using measurement models with Gaussian noise. The maximum likelihood estimate is then equivalent to a weighted nonlinear least squares estimate. The performance of the method is demonstrated on simulated data and on experimental data.
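With Gaussian noise and a known cycle frequency, the maximum likelihood fit of a sum-of-sinusoids parameterization reduces to linear least squares on a harmonic basis. A simplified sketch (known fundamental frequency, unweighted least squares; the paper handles the weighted nonlinear case):

```python
import numpy as np

def fit_cyclic(t, y, f0, n_harmonics):
    """Least-squares fit of a sum of sinusoids at multiples of a known
    cycle frequency f0 (with Gaussian noise, ML reduces to least squares)."""
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols += [np.cos(2 * np.pi * k * f0 * t), np.sin(2 * np.pi * k * f0 * t)]
    B = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return B @ coef

rng = np.random.default_rng(6)
t = np.linspace(0, 4, 400)                       # four movement cycles
truth = 0.8 * np.sin(2 * np.pi * t) + 0.3 * np.cos(2 * np.pi * 2 * t)
y = truth + 0.1 * rng.normal(size=t.size)
y_hat = fit_cyclic(t, y, f0=1.0, n_harmonics=3)
assert np.mean((y_hat - truth) ** 2) < np.mean((y - truth) ** 2)
```

The estimated amplitudes and phases of each harmonic come directly from the cosine/sine coefficient pairs in `coef`.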
*Genomic Transcription Regulatory Element Location Analysis via Poisson weighted LASSO*-
The distances between DNA Transcription Regulatory Elements (TRE) provide important clues to their dependencies and function within the gene regulation process. However, the locations of those TREs as well as their cross distances between occurrences are stochastic, in part due to the inherent limitations of Next Generation Sequencing methods used to localize them, in part due to biology itself. This paper describes a novel approach to analyzing these locations and their cross distances even at long range via a Poisson random convolution. The resulting deconvolution problem is ill-posed, and sparsity regularization is used to offset this challenge. Unlike previous work on sparse Poisson inverse problems, this paper adopts a weighted LASSO estimator with data-dependent weights calculated using concentration inequalities that account for the Poisson noise. This method exhibits better squared error performance than the classical (unweighted) LASSO both in theoretical performance bounds and in simulation studies, and can easily be computed using off-the-shelf LASSO solvers.
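The weighted-LASSO idea (per-coefficient penalties scaled by a Poisson-aware noise estimate) can be sketched with a plain ISTA solver. This is a hedged toy: a Gaussian quadratic surrogate of the Poisson likelihood and made-up weights in the spirit of the concentration-inequality construction, not the paper's exact estimator:

```python
import numpy as np

def weighted_lasso_ista(A, y, weights, lam, iters=500):
    """ISTA for min_x 0.5*||y - Ax||^2 + lam * sum_i weights_i * |x_i|."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - A.T @ (A @ x - y) / L      # gradient step
        thr = lam * weights / L            # coordinate-wise soft threshold
        x = np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)
    return x

rng = np.random.default_rng(7)
n, p = 200, 50
A = np.abs(rng.normal(size=(n, p))) / np.sqrt(n)   # nonnegative mixing matrix
x_true = np.zeros(p)
x_true[[3, 17, 40]] = 20.0
y = rng.poisson(A @ x_true).astype(float)          # Poisson observations
w = np.sqrt(A.T @ np.maximum(y, 1.0))              # data-dependent weights
x_hat = weighted_lasso_ista(A, y, w, lam=0.5)

def objective(x):
    return 0.5 * np.sum((y - A @ x) ** 2) + 0.5 * np.sum(w * np.abs(x))

assert objective(x_hat) <= objective(np.zeros(p))  # ISTA descends monotonically
```

Since the penalty is separable, any off-the-shelf LASSO solver can handle the weights by rescaling the columns of `A`, which is the computational point made in the abstract.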
*Design of Data-Injection Adversarial Attacks against Spatial Field Detectors*-
Data-injection attacks on spatial field detection corrupt a subset of measurements to cause erroneous decisions. We consider a centralized decision scheme exploiting spatial field smoothness to overcome lack of knowledge on system parameters such as noise variance. We obtain closed-form expressions for system performance and investigate strategies for an intruder injecting false data in a fraction of the sensors in order to reduce the probability of detection. The problem of determining the most vulnerable subset of sensors is also analyzed.
*Security of (n,n)-threshold audio secret sharing schemes encrypting audio secrets*-
Secret sharing is a method of encrypting a secret into multiple pieces called shares so that only qualified sets of shares can be employed to reconstruct the secret. Audio secret sharing (ASS) is an example of secret sharing whose decryption can be performed by human ears. This paper examines the security of (n,n)-threshold ASS schemes encrypting audio secrets by estimating the mutual information between secret and shares.
*Secure Estimation Against Complex-valued Attacks*-
Motivated by recent evolution of cutting-edge sensor technologies with complex-valued measurements, the paper exposes complex-valued (non-circular) false data injection attacks. We propose an attack model where an adversary applies widely-linear transformations on the sensor measurements to introduce correlations between the real and imaginary parts of the reported observations. Existing state estimators and attack detectors assume the measurements to have statistical properties similar to real-valued signals making them highly vulnerable to such complex-valued attacks. As a countermeasure, we propose to transform the attack detection problem into the problem of comparing the statistical distance between the Gaussian representation of the innovation sequence under attack and its counterpart with the optimal profile. Our Monte Carlo simulations illustrate the destructive nature of complex-valued attacks and validate the effectiveness of the proposed detection concept.
*A skewed exponential power distribution to measure value at risk in electricity markets*-
Interest in risk measurement for spot prices has increased since the worldwide deregulation and liberalization of electricity markets started in the early 1990s. This paper focuses on quantifying risk for the Nordic Power Exchange (Nord Pool) system price. Our analysis is based on a GARCH approach with skewed exponential power innovations to model the stochastic component of the system price. Value-at-risk backtesting procedures are conducted and our model performance is compared to commonly used distributions in risk measurement. We show that the skewed exponential power distribution outperforms the competitors for the upside risk (95\%, 97.5\% and 99\% Value-at-risk), which is of high interest as electricity spot prices are positively skewed.
*Accelerometer calibration using sensor fusion with a gyroscope*-
In this paper, a calibration method for a triaxial accelerometer using a triaxial gyroscope is presented. The method uses a sensor fusion approach, combining the information from the accelerometers and gyroscopes to find an optimal calibration using maximum likelihood. The method has been tested using real sensors in smartphones to perform orientation estimation and verified through Monte Carlo simulations. In both cases, the method is shown to provide a proper calibration, reducing the effect of sensor errors and improving orientation estimates.

**Wed-Ib: Detection and estimation theory III**

*Joint range estimation and spectral classification for 3D scene reconstruction using multispectral Lidar waveforms*-
This paper presents a new Bayesian classification method to analyse remote scenes sensed via multispectral Lidar measurements. To a first approximation, each Lidar waveform mainly consists of the temporal signature of the observed target, which depends on the wavelength of the laser source considered and which is corrupted by Poisson noise. By sensing the scene at several wavelengths, we expect a more accurate target range estimation and a more efficient spectral analysis of the scene. Thanks to its spectral classification capability, the proposed hierarchical Bayesian model, coupled with an efficient Markov chain Monte Carlo algorithm, allows the estimation of depth images together with reflectivity-based scene segmentation images. The proposed methodology is illustrated via experiments conducted with real multispectral Lidar data.
*Regularised Estimation of 2D-Locally Stationary Wavelet Processes*-
Locally Stationary Wavelet processes provide a flexible way of describing the time/space evolution of autocovariance structure over an ordered field such as an image or time series. Classically, estimation of such models assumes continuous smoothness of the underlying spectra, which are estimated via local kernel smoothers. We propose a new model that permits spectral jumps, and suggest a regularised estimator and algorithm that can recover such structure from images. We demonstrate the effectiveness of our method in a synthetic experiment, where it shows desirable estimation properties. We conclude with an application to real images that illustrates the qualitative difference between the proposed and previous methods.
*Fast filtering with new sparse transition Markov chains*-
We put forward a novel Markov chain approximation method with regard to the filtering problem. The novelty consists in making use of the sparse grid theory which deals with the curse of dimensionality. Our method imitates the marginal distribution of the latent continuous process with a discrete probability distribution on a sparse grid. The grid points may be seen as the states of a Markov chain which we construct to imitate the whole process. The transition probabilities are then chosen to preserve the joint moments of the underlying continuous process. We provide a simulation study on a multivariate stochastic volatility filtering problem to compare the proposed methodology with a similar technique and the particle filtering.
*On the estimation of many closely spaced complex sinusoids*-
The estimation of sinusoidal signals is a very well researched area, and it is well known that two signals can be resolved for frequency separations below the Fourier resolution at a high enough signal-to-noise ratio. However, in the case of many closely spaced sinusoids, estimation is impaired for separations well above the Fourier resolution, and the dependence on signal-to-noise ratio is involved. The problem is analyzed by considering the Hessian of the log-likelihood function. When there is some direction in the parameter space where the curvature of its deterministic part (i.e. the Fisher information matrix) is less than the curvature of its stochastic part, this is an indication of problems for correct estimation. An expression for the probability of this occurring is presented.
*A Multiscale Approach for Tensor Denoising*-
As higher-order datasets become more common, researchers are primarily focused on how to analyze and compress them. However, the most common challenge encountered in any type of data, including tensor data, is noise. Furthermore, the methods developed for denoising vector or matrix type datasets cannot be applied directly to higher-order datasets. This motivates the development of denoising methods for tensors. In this paper, we propose the use of a multiscale approach for denoising general higher-order datasets. The proposed approach works by decomposing the higher-order data into subtensors, and then denoising the subtensors by recursively exploiting filtered residuals. The method is validated on both hyperspectral image and brain functional connectivity network data.
*Two-Stage Estimation after Parameter Selection*-
In many practical multiparameter estimation problems, no a priori information exists regarding which parameters are more relevant within a group of candidate unknown parameters. This paper considers the estimation of a selected “parameter of interest”, where the selection is conducted according to a data-based selection rule, $\Psi$. The selection process introduces a selection bias and creates coupling between otherwise decoupled parameters. We propose a two-stage data-acquisition approach that can remove the selection bias and improve estimation performance. We derive a two-stage Cramer-Rao-type bound on the post-selection mean squared error (PSMSE) of any $\Psi$-unbiased estimator, where the $\Psi$-unbiasedness is in the Lehmann sense. In addition, we present the two-stage post-selection maximum-likelihood (PSML) estimator. The proposed $\Psi$-Cramer-Rao bound (CRB), PSML estimator and other existing estimators are examined for a linear Gaussian model, which is widely used in clinical research.
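The selection bias and its removal by a second acquisition stage are easy to demonstrate by simulation. The sketch below (illustrative only, not the paper's PSML estimator) selects the population with the largest stage-1 sample mean: reusing the selection data gives a positively biased estimate, while estimating from fresh stage-2 data does not.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([0.0, 0.0])      # equal true means: worst case for bias
sigma, n, trials = 1.0, 5, 10000

naive, two_stage = [], []
for _ in range(trials):
    x1 = rng.normal(theta, sigma, size=(n, 2)).mean(axis=0)  # stage 1
    k = int(np.argmax(x1))        # data-based selection rule (Psi)
    naive.append(x1[k])           # reuses the selection data: biased
    # stage 2: acquire fresh samples of the selected parameter only
    two_stage.append(rng.normal(theta[k], sigma, size=n).mean())

# The naive post-selection mean is positively biased although both true
# means are zero; the two-stage estimate is centered on the true value.
print(round(float(np.mean(naive)), 3), round(float(np.mean(two_stage)), 3))
```

This only illustrates the bias phenomenon; the paper's contribution is the $\Psi$-CRB and the two-stage PSML estimator, which are not reproduced here.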
*An order fitting rule for optimal subspace averaging*-
The problem of estimating a low-dimensional subspace from a collection of experimentally measured subspaces arises in many applications of statistical signal processing. In this paper we address this problem, and give a solution for the average subspace that minimizes an extrinsic mean-squared error, defined by the squared Frobenius norm between projection matrices. The solution automatically returns the dimension of the optimal average subspace, which is the novel result of the paper. The proposed order fitting rule is based on thresholding the eigenvalues of the average projection matrix, and thus it is free of penalty terms or other tuning parameters commonly used by other rank estimation techniques. Several numerical examples demonstrate the usefulness and applicability of the proposed criterion, showing how the dimension of the average subspace captures the variability of the measured subspaces.
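The extrinsic average and its order fitting rule can be sketched compactly: average the projection matrices of the measured subspaces, eigendecompose, and keep the eigenvectors whose eigenvalues exceed a threshold. The threshold of 1/2 below is an illustrative choice, not necessarily the rule derived in the paper.

```python
import numpy as np

def average_subspace(bases, threshold=0.5):
    """Extrinsic average of subspaces given by orthonormal bases.

    Each element of `bases` is an (n, k_i) matrix with orthonormal columns.
    The average projection matrix is eigendecomposed, and the dimension of
    the average subspace is the number of eigenvalues above `threshold`,
    so no penalty term or tuning parameter is needed.
    """
    P_avg = sum(U @ U.T for U in bases) / len(bases)
    eigvals, eigvecs = np.linalg.eigh(P_avg)   # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    r = int(np.sum(eigvals > threshold))       # fitted dimension
    return eigvecs[:, :r], r
```

When all measured subspaces coincide, the rule recovers that subspace and its dimension exactly; as the measured subspaces spread out, eigenvalues fall below the threshold and the fitted dimension shrinks accordingly.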
*Block-Wise MAP Inference for Determinantal Point Processes with Application to Change-Point Detection*-
Studies of change-point detection (CPD) often focus on developing similarity metrics that quantify how likely a time point is to be a change point. The subsequent process of selecting true change points among those high-score candidates is less well studied. This paper proposes a new CPD method that uses determinantal point processes to model the process of change-point selection. In particular, this paper explores the special kernel structure arising in such modelling, i.e. almost block diagonal kernels, and shows that the maximum a posteriori task, which requires at least $O(N^{2.4})$ time in general, can be solved in $O(N)$ time under such structure. The resulting algorithms, BwDPP-MAP and BwDppCpd, are empirically validated through simulation and five real-world data experiments.
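For context, DPP MAP inference selects the subset $S$ maximizing $\log\det(L_S)$, which trades off item quality (diagonal of $L$) against diversity (off-diagonal similarity). The sketch below is the generic greedy baseline, not the paper's BwDPP-MAP: it does not exploit the almost-block-diagonal structure that yields the $O(N)$ cost.

```python
import numpy as np

def greedy_dpp_map(L):
    """Greedy MAP inference for a DPP with kernel matrix L.

    Repeatedly adds the item that most increases log det(L_S), and stops
    when no addition improves the objective. A simple baseline; each
    candidate is scored with a fresh log-determinant, so this is far from
    the linear-cost block-wise algorithm described in the paper.
    """
    n = L.shape[0]
    selected, best = [], 0.0       # log det of the empty set is 0
    while True:
        gains = np.full(n, -np.inf)
        for i in range(n):
            if i in selected:
                continue
            S = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(S, S)])
            if sign > 0:
                gains[i] = logdet
        j = int(np.argmax(gains))
        if gains[j] <= best:       # no improving item left
            break
        selected.append(j)
        best = gains[j]
    return sorted(selected)
```

On a diagonal kernel the rule is transparent: items with diagonal entries above 1 increase the log-determinant and are selected; the rest are rejected.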

**Wed-Ic SS: Optimization and simulation for image processing**

*Spatial regularization for nonlinear unmixing of hyperspectral data with vector-valued functions*-
This communication introduces a new framework for incorporating spatial regularization into a nonlinear unmixing procedure dedicated to hyperspectral data. The proposed model promotes smooth spatial variations of the nonlinear component in the mixing model. The spatial regularizer and the nonlinear contributions are jointly modeled by a vector-valued function that lies in a reproducing kernel Hilbert space (RKHS). The unmixing problem is strictly convex and reduces to a quadratic programming problem. Simulations on synthetic data illustrate the effectiveness of the proposed approach.
*A Regularized Sparse Approximation Method for Hyperspectral Image Classification*-
This paper presents a new technique for hyperspectral image classification based on simultaneous sparse approximation. The proposed approach consists in formulating the problem as a convex multi-objective optimization problem which incorporates a term favoring the simultaneous sparsity of the estimated coefficients and a term enforcing a regularity constraint along the rows of the coefficient matrix. We show that the optimization problem can be solved efficiently using FISTA (Fast Iterative Shrinkage-Thresholding Algorithm). This approach is applied to a wood wastes classification problem using NIR hyperspectral images.
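The simultaneous-sparsity term is typically an $\ell_{2,1}$ penalty, whose proximal operator is row-wise soft thresholding, so FISTA applies directly. The following is a minimal sketch of FISTA for $\min_X \frac{1}{2}\|Y - DX\|_F^2 + \lambda \sum_i \|X_{i,:}\|_2$ only; the paper's full objective also includes the row-regularity term, which is omitted here.

```python
import numpy as np

def fista_l21(D, Y, lam, n_iter=500):
    """FISTA for least squares with a row-sparsity (l2,1) penalty.

    D: dictionary (m, k); Y: data matrix (m, p); lam: penalty weight.
    Returns the coefficient matrix X with simultaneously sparse rows.
    """
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    X = np.zeros((D.shape[1], Y.shape[1]))
    Z, t = X, 1.0
    for _ in range(n_iter):
        G = Z - (D.T @ (D @ Z - Y)) / L    # gradient step on the smooth term
        norms = np.linalg.norm(G, axis=1, keepdims=True)
        shrink = np.maximum(1.0 - lam / (L * np.maximum(norms, 1e-12)), 0.0)
        X_new = shrink * G                 # prox: row-wise soft thresholding
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Z = X_new + ((t - 1.0) / t_new) * (X_new - X)  # momentum extrapolation
        X, t = X_new, t_new
    return X
```

With a small penalty and noiseless data generated from a few dictionary rows, the recovered coefficient matrix is close to the ground truth and shares its row support.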
*Unbiased Injection of Signal-Dependent Noise in Variance-Stabilized Range*-
The design, optimization, and validation of many image processing or image-based analysis systems often require testing the system performance over a dataset of images corrupted by noise at different signal-to-noise ratio (SNR) regimes. A noise-free ground-truth image may not be available, and different SNRs are simulated by injecting extra noise into an already noisy image. However, noise in real-world systems is typically signal-dependent, with variance determined by the noise-free image. Thus, the noise to be injected must also depend on the unknown ground-truth image. To circumvent this issue, we consider the additive injection of noise in the variance-stabilized range, where no prior knowledge of the ground-truth signal is necessary. Specifically, we design a special noise-injection operator that prevents the errors in expectation and variance that would otherwise arise when standard variance-stabilizing transformations are used for this task. The proposed operator is thus suitable for accurately injecting signal-dependent noise even into images acquired at very low counts.
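For Poisson data, the idea can be illustrated with the Anscombe transform, which approximately stabilizes the noise variance to one, so extra noise of any chosen strength can be added without knowing the ground truth. The sketch below is the plain baseline scheme, whose small expectation/variance errors at low counts are precisely what the paper's specially designed injection operator removes.

```python
import numpy as np

def inject_noise_vst(z, extra_std, rng):
    """Naive signal-dependent noise injection in variance-stabilized range.

    z: Poisson-noisy image. The Anscombe transform maps it to roughly
    unit-variance data, white Gaussian noise of std `extra_std` is added
    there, and the algebraic inverse maps back to the count domain.
    """
    f = 2.0 * np.sqrt(np.asarray(z, dtype=float) + 3.0 / 8.0)  # Anscombe VST
    f_noisy = f + rng.normal(0.0, extra_std, size=f.shape)     # injection
    return (f_noisy / 2.0) ** 2 - 3.0 / 8.0                    # algebraic inverse
```

At moderate counts this roughly doubles the noise variance for `extra_std=1` while approximately preserving the mean; at very low counts the algebraic inverse is biased, which motivates the corrected operator proposed in the paper.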
*Bayesian Multifractal Analysis of Multi-temporal Images Using Smooth Priors*-
Texture analysis can be conducted within the mathematical framework of multifractal analysis (MFA) via the study of the regularity fluctuations of image amplitudes. Although successfully used in various applications, MFA remains limited to the independent analysis of single images while, in an increasing number of applications, data are multi-temporal. The present contribution addresses this limitation and introduces a Bayesian framework that enables the joint estimation of multifractal parameters for multi-temporal images. It builds on a recently proposed Gaussian model for wavelet leaders parameterized by the multifractal attributes of interest. A joint Bayesian model is formulated by assigning a Gaussian prior to the second derivatives of the time evolution of the multifractal attributes associated with multi-temporal images. This Gaussian prior ensures that the multifractal parameters have a smooth temporal evolution. The associated Bayesian estimators are then approximated using a Hamiltonian Monte-Carlo algorithm. The benefits of the proposed procedure are illustrated on synthetic data.

*Robust hyperspectral unmixing accounting for residual components*-
This paper presents a new hyperspectral mixture model jointly with a Bayesian algorithm for supervised hyperspectral unmixing. Based on the residual component analysis model, the proposed formulation assumes the linear model to be corrupted by an additive term that accounts for mismodelling effects (ME). The ME formulation takes into account the effect of outliers, the propagated errors in the signal processing chain and copes with some types of endmember variability (EV) or nonlinearity (NL). The known constraints on the model parameters are modeled via suitable priors. The resulting posterior distribution is optimized using a coordinate descent algorithm which allows us to compute the maximum a posteriori estimator of the unknown model parameters. The proposed model and estimation algorithm are validated on both synthetic and real images showing competitive results regarding the quality of the inferences and the computational complexity when compared to the state-of-the-art algorithms.
*Analysis Dictionary Learning for Scene Classification*-
This paper presents a new framework for scene classification based on an analysis dictionary learning approach. Despite their tremendous success in various image processing and sparse coding tasks, synthesis-based and analysis-based sparse models fall short in classification tasks. This is partly due to the linear dependence of the dictionary atoms. In this work, we aim at improving classification performance by compensating for such dependence. The proposed methodology consists in grouping the atoms of the dictionary using clustering methods. This makes it possible to sparsely model images from various scene classes and to use such a model for classification. Experimental evidence shows the benefit of such an approach. Finally, we propose a supervised way to train the baseline representation for each class-specific dictionary, and achieve multi-class classification by finding the minimum distance between the learned baseline representation and the data’s sub-dictionary representation. The results achieved in scene classification are better than the state of the art.
*Weakly-supervised Analysis Dictionary Learning with Cardinality Constraints*-
In synthesis dictionary learning, data is compactly represented as a sparse combination of dictionary atoms. In analysis dictionary learning, a sparsifying analysis dictionary is learned from the data. In this paper, we consider the problem of analysis dictionary learning in the weak supervision setting. We introduce a discriminative probabilistic model and present a novel approach to enforce sparsity using probabilistic cardinality constraints. A detailed derivation of the expectation-maximization procedure for maximum likelihood estimation, with a computationally efficient E-step implementation, is presented. We illustrate the performance of the model on synthetic data.