Network Data on the Statistical Testbench

A New Method for Generating Realistic Null Data Exploiting Underlying Graph Structure with Application to EEG

METHODS

E. Pirondini, A. Vybornova, M. Coscia, and D. Van De Ville, Senior Member

 

The graph Laplacian operator is defined as L = D - W, where W is the weighted adjacency matrix and the degree matrix D is a diagonal matrix whose ith diagonal element, D_i, is the sum of the ith row of W. The basis U formed by the eigenvectors of the graph Laplacian defines the graph Fourier transform (GFT). The GFT is applied to graph signals, and the phases of the Fourier coefficients are randomly permuted or generated. Each realization of the surrogate data is then obtained by a different randomization followed by the inverse Fourier transform. The family of surrogate graph signals can then be employed for non-parametric statistical tests.


Technological and computational advances are making available large amounts of high-dimensional and richly structured biomedical data, including brain images and signals. Acknowledging the network structure in our analyses opens a multitude of avenues for investigating “systems level” properties. For instance, computational neuroscience has boosted the interest in modeling and analyzing large datasets using concepts normally applied in network and graph theory (Dayan and Abbott, 2003).

Mathematically, networks can be represented as weighted graphs defined by a set of nodes and edge weights. Graph signals are values measured at the nodes; in many applications they vary over time.

The signal processing community has recently taken an interest in developing and tailoring classical operations and statistical approaches to graph-structured data. One notable tool for extending conventional signal-processing operations to networks is the graph Fourier transform, which can be obtained from the eigendecomposition of the graph Laplacian (Shuman et al., 2013; Chen et al., 2015).
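As a minimal sketch of this construction, assuming an undirected graph with a symmetric weight matrix W and the combinatorial Laplacian L = D - W, the graph Fourier basis can be computed with NumPy as follows. The function names and the 4-node ring graph are illustrative choices, not part of the original work.

```python
import numpy as np

def graph_fourier_basis(W):
    """Eigendecomposition of the combinatorial Laplacian L = D - W.

    Assumes W is a symmetric (undirected) weight matrix.
    Returns eigenvalues in ascending order and the matching eigenvectors
    as the columns of U.
    """
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))   # degree matrix: row sums of W on the diagonal
    L = D - W
    lam, U = np.linalg.eigh(L)   # eigh: real eigenvalues, orthonormal U for symmetric L
    return lam, U

def gft(U, x):
    """Graph Fourier transform: project the signal onto the eigenvectors."""
    return U.T @ x

def igft(U, x_hat):
    """Inverse graph Fourier transform."""
    return U @ x_hat

# Toy example (hypothetical): unweighted ring graph with 4 nodes.
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
lam, U = graph_fourier_basis(W)
x = np.array([1.0, 2.0, 3.0, 4.0])
x_rec = igft(U, gft(U, x))       # forward then inverse transform recovers x
```

Because U is orthonormal, the inverse transform is simply multiplication by U, and small eigenvalues correspond to components that vary smoothly across connected nodes.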

As one of the workhorses of scientific disciplines, statistical hypothesis testing tries to reject a given null hypothesis, which assumes that the measured effect is not present in the experimental data. Conventional parametric procedures typically test whether the effect can plausibly be explained by noise only. Non-parametric procedures, and surrogate techniques in particular, provide a much stronger null hypothesis: because the surrogates preserve some important features of the data, it becomes harder to reject the null against these more realistic alternatives. In particular, the method of phase randomization is widely adopted to generate surrogate data for time courses under the null hypothesis that the experimental data belong to a class of stationary signals with prescribed autocorrelation structure, or equivalently with the same power spectral density (Theiler et al., 1992). In practice, the experimental data are transformed into the Fourier domain and the phases of the Fourier coefficients are permuted or randomized. Each realization of the surrogate data is then obtained by a different randomization followed by the inverse Fourier transform.
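The classical procedure for a single real-valued time course can be sketched as follows. This is an illustrative implementation under stated assumptions: phases are drawn uniformly from [0, 2π), and the zero-frequency and Nyquist coefficients are left unchanged so that the surrogate stays real and keeps the original mean. The function name and the random-walk test signal are hypothetical.

```python
import numpy as np

def phase_randomized_surrogate(x, rng):
    """Return one surrogate of the real 1-D series x with the same amplitude spectrum."""
    x = np.asarray(x, dtype=float)
    n = x.size
    X = np.fft.rfft(x)                         # one-sided spectrum of a real signal
    phases = rng.uniform(0.0, 2.0 * np.pi, size=X.size)
    X_surr = np.abs(X) * np.exp(1j * phases)   # keep amplitudes, replace phases
    X_surr[0] = X[0]                           # keep the zero-frequency term (mean)
    if n % 2 == 0:
        X_surr[-1] = X[-1]                     # Nyquist term must remain real
    return np.fft.irfft(X_surr, n=n)

rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(256))        # hypothetical autocorrelated signal
surrogate = phase_randomized_surrogate(x, rng)
```

Calling the function repeatedly with the same generator yields a family of surrogates that share the amplitude spectrum of x and therefore its autocorrelation.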

Proper statistical assessment of the properties of graph signals should account for the “connectivity” of the underlying graph; for example, signal values at connected nodes typically show a certain level of smoothness. We have recently proposed a simple and elegant method to generate realistic data under the null hypothesis that the measured signal is stationary on the graph (Pirondini et al., 2016). This method can serve as an essential ingredient of a non-parametric statistical procedure and is thus highly flexible with respect to the type of measure that can be evaluated. Our approach extends the phase-randomization method to network data by applying the graph Fourier transform to graph signals and then randomizing the signs of the graph Fourier coefficients, which is the equivalent of phase randomization. The surrogate graph signal is then obtained by applying the inverse graph Fourier transform. By construction, this method preserves the amplitude of the spectral coefficients and thus imposes the power spectral density of the measured signal on the class of surrogate signals.
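The steps described above can be sketched in a few lines of NumPy. This is an illustration under assumptions rather than the reference implementation: it uses the combinatorial Laplacian L = D - W of a symmetric weight matrix, draws each sign independently as +1 or -1 with equal probability, and randomizes all coefficients, including the one for the constant eigenvector. The function name and the path graph are hypothetical.

```python
import numpy as np

def graph_surrogates(W, x, n_surrogates, rng):
    """Generate surrogate graph signals by sign-randomizing the GFT coefficients of x.

    Returns an array of shape (n_surrogates, n_nodes); each row is one surrogate.
    """
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W             # combinatorial Laplacian L = D - W
    _, U = np.linalg.eigh(L)                   # columns of U: graph Fourier basis
    x_hat = U.T @ np.asarray(x, dtype=float)   # graph Fourier coefficients
    # Assumption: independent, equiprobable sign flips for every coefficient.
    signs = rng.choice([-1.0, 1.0], size=(n_surrogates, x_hat.size))
    return (signs * x_hat) @ U.T               # row r equals U @ (signs[r] * x_hat)

# Toy example (hypothetical): path graph with 5 nodes and a smooth ramp signal.
W = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
rng = np.random.default_rng(0)
surr = graph_surrogates(W, x, n_surrogates=100, rng=rng)
```

Since U is orthonormal and only signs change, every surrogate has the same spectral amplitudes and the same energy as x.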

We demonstrate the feasibility of this new technique for electroencephalography (EEG) data. EEG is a portable, non-invasive, and widespread technology that is increasingly emerging as a diagnostic tool for functional investigations of the human brain. Modern EEG technologies provide high electrode counts and sampling rates, which result in considerably high-dimensional datasets. Although most neuroscientific experiments aim at determining stable and repeatable effects, analysis of trial-to-trial variability might allow identifying changes in the internal state of the subject. However, within-trial analyses require specialized techniques that either complement or extend common statistical approaches (Kass et al., 2005). Trial-to-trial variability is also a common question in EEG, where the analysis of grand-average event-related potential features fails to address the effects of behavioral tasks on the dynamics of cortical activity in single trials.

We showed that a graph can be constructed from the spatial neighborhood relationships of the EEG electrodes and that the instantaneous EEG topographies can be considered as graph signals (Pirondini et al., 2016). The surrogate graph signals have a smoothness similar to that of the experimental signal, since they preserve the correlation structure as captured by the graph Laplacian. Such realistic surrogates constitute a much stronger null hypothesis than, for instance, naïve spatial random permutation. We have demonstrated that this method allows statistically assessing the occurrence of given target topographies and quantifying inter-trial variability.
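A rough sketch of how such a test might be assembled is given below. Every specific here is an assumption made for illustration: the electrode graph is a symmetric k-nearest-neighbor graph built from hypothetical 3-D coordinates, the test statistic is the absolute cosine similarity between a topography and a target map, and the p-value is the usual empirical estimate (1 + number of surrogates at least as extreme) / (1 + number of surrogates). None of these choices is taken from the cited work.

```python
import numpy as np

def knn_adjacency(pos, k):
    """Symmetric binary adjacency linking each electrode to its k nearest neighbors."""
    pos = np.asarray(pos, dtype=float)
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # exclude self-distances
    nearest = np.argsort(d, axis=1)[:, :k]      # indices of the k closest electrodes
    W = np.zeros(d.shape)
    rows = np.repeat(np.arange(pos.shape[0]), k)
    W[rows, nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                   # symmetrize: edge if either end selects it

def surrogate_p_value(W, x, target, n_surrogates, rng):
    """Empirical p-value of |cosine similarity| between x and target against graph surrogates."""
    L = np.diag(W.sum(axis=1)) - W
    _, U = np.linalg.eigh(L)
    x_hat = U.T @ x
    signs = rng.choice([-1.0, 1.0], size=(n_surrogates, x_hat.size))
    surr = (signs * x_hat) @ U.T                # sign-randomized surrogates, one per row

    def stat(s):
        # Hypothetical statistic: absolute cosine similarity with the target map.
        return np.abs(s @ target) / (np.linalg.norm(s, axis=-1) * np.linalg.norm(target))

    observed = stat(x)
    null = stat(surr)
    return (1 + np.sum(null >= observed)) / (1 + n_surrogates)

# Hypothetical data: 32 electrodes on a unit sphere, target equal to the observed map.
rng = np.random.default_rng(0)
pos = rng.standard_normal((32, 3))
pos /= np.linalg.norm(pos, axis=1, keepdims=True)
W = knn_adjacency(pos, k=4)
x = rng.standard_normal(32)
p = surrogate_p_value(W, x, target=x.copy(), n_surrogates=199, rng=rng)
```

In this toy setup the observed map matches the target exactly, so the estimated p-value is small; with real data the target topography and the statistic would be chosen according to the question under study.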

In conclusion, we have proposed a new tool that enables statistical testing of graph signals using surrogate data. The approach can be applied to a broad range of applications, including EEG data. The increasing availability of structural brain graphs, such as those based on magnetic resonance imaging data (Behjat et al., 2015; Patel et al., 2006), will further contribute “backbone graph” structures that can be used to define realistic randomization schemes using anatomical knowledge of connectivity.

References

  • Behjat, H., Leonardi, N., Sörnmo, L. & van de Ville, D. 2015. Anatomically-adapted graph wavelets for improved group-level fMRI activation mapping. NeuroImage, 123, 185-199.
  • Chen, S., Varma, R., Sandryhaila, A. & Kovačević, J. 2015. Discrete signal processing on graphs: Sampling theory. arXiv preprint arXiv:1503.05432.
  • Dayan, P. & Abbott, L. 2003. Theoretical neuroscience: computational and mathematical modeling of neural systems. Journal of Cognitive Neuroscience, 15, 154-155.
  • Kass, R. E., Ventura, V. & Brown, E. N. 2005. Statistical issues in the analysis of neuronal data. Journal of Neurophysiology, 94, 8-25.
  • Patel, R. S., van de Ville, D. & Bowman, F. D. 2006. Determining significant connectivity by 4D spatiotemporal wavelet packet resampling of functional neuroimaging data. NeuroImage, 31, 1142-1155.
  • Pirondini, E., Vybornova, A., Coscia, M. & van de Ville, D. 2016. A Spectral Method for Generating Surrogate Graph Signals. IEEE Signal Processing Letters, 23, 1275-1278.
  • Shuman, D., Narang, S. K., Frossard, P., Ortega, A. & Vandergheynst, P. 2013. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine, 30, 83-98.
  • Theiler, J., Eubank, S., Longtin, A., Galdrikian, B. & Farmer, J. D. 1992. Testing for nonlinearity in time series: the method of surrogate data. Physica D: Nonlinear Phenomena, 58, 77-94.

Authors

Elvira Pirondini received the M.Sc. degree in bioengineering from the Swiss Federal Institute of Technology (EPFL), Lausanne, in 2012. During this time, she was awarded a fellowship from the Bertarelli Program in Translational Neuroscience and Neuroengineering to carry out her M.Sc. thesis at Harvard Medical School. She worked under the supervision of E. Brown and P. Purdon and developed a new algorithm for the electroencephalography inverse problem. Since 2013, she has been a Ph.D. student at the EPFL in the laboratory of S. Micera. Her thesis focuses on robotic rehabilitation and brain imaging.

 

Anna Vybornova received a B.S. degree in life sciences and technologies from Ecole Polytechnique Fédérale de Lausanne (EPFL) in 2015. She is currently completing her Master in Bioengineering with a Minor in Neuroprosthetics at EPFL. Her research interests include signal processing on graphs and its biomedical applications.

 

 

Martina Coscia has been a post-doctoral fellow at the Bertarelli Foundation Chair in Translational Neuroengineering, EPFL, Switzerland, since 2013. She received the Ph.D. degree in Biorobotics from Scuola Superiore Sant’Anna, Italy, in 2013, and the Master degree in Biomedical Engineering from University of Pisa, Italy, in 2009. In 2011-2012, she was a visiting student in the Motion Analysis Lab at the Spaulding Rehabilitation Hospital, Boston, Massachusetts.

Her research interests include neurorehabilitation after stroke, in particular the design of innovative strategies for robot-aided rehabilitation, and the quantitative assessment of motor disabilities, especially using muscle synergies.

 

Dimitri Van De Ville (M’02, SM’12) received the M.S. degree in engineering and computer sciences and the Ph.D. degree from Ghent University, Belgium, in 1998 and 2002, respectively. After a post-doctoral stay (2002-2005) at the lab of Prof. Michael Unser at the Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland, he became responsible for the Signal Processing Unit at the University Hospital of Geneva, Switzerland, as part of the Centre d’Imagerie Biomédicale (CIBM). In 2009, he received a Swiss National Science Foundation professorship, and since 2015 he has been Professor of Bioengineering at the EPFL and the University of Geneva, Switzerland. His research interests include wavelets, sparsity, pattern recognition, and their applications in computational neuroimaging. He was a recipient of the Pfizer Research Award 2012, the NARSAD Independent Investigator Award 2014, and the Leenaards Foundation Award 2016.

Dr. Van De Ville served as an Associate Editor for the IEEE TRANSACTIONS ON IMAGE PROCESSING from 2006 to 2009 and the IEEE SIGNAL PROCESSING LETTERS from 2004 to 2006, as well as Guest Editor for several special issues. He was the Chair of the Bio Imaging and Signal Processing (BISP) TC of the IEEE Signal Processing Society (2012-2013) and is the Founding Chair of the EURASIP Biomedical Image & Signal Analytics SAT. He is Co-Chair of the biennial Wavelets & Sparsity series conferences, together with V. Goyal and M. Papadakis.