The list of accepted papers is below.
Dimensionality Reduction From Several Angles (invited talk)
Tamara Munzner University of British Columbia
I will present several projects that attack the problem of dimensionality reduction (DR) in visualization from different methodological angles, in order to answer different kinds of questions. First, can we design better DR algorithms? Glimmer is a multilevel multidimensional scaling (MDS) algorithm that exploits the GPU. Glint is a new MDS framework that achieves high performance on costly distance functions. Second, can we build a DR system for real people? DimStiller is a toolkit for DR that provides local and global guidance to users who may not be experts in the mathematics of high-dimensional data analysis, in hopes of “DR for the rest of us”. Third, how should we show people DR results? An empirical lab study provides guidance on visual encoding for system developers, showing that points are more effective than spatialized landscapes for visual search tasks with DR data. A data study, where a small number of people make judgements about a large number of datasets rather than vice versa as with a typical user study, produced a taxonomy of visual cluster separation factors. Fourth, when do people need to use DR? Sometimes it is not the right solution, as we found when grappling with the design of the QuestVis system for an environmental sustainability simulation. We provide guidance for researchers and practitioners engaged in this kind of problem-driven visualization work with the nested model of visualization design and evaluation and the nine-stage framework for design study methodology. Much of this work was informed by preliminary results from an ongoing project, a two-year qualitative study of high-dimensional data analysts in many domains, to discover how the use of DR “in the wild” may or may not match up with the assumptions that underlie previous algorithmic work.
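As background for the MDS algorithms mentioned in the abstract above, the following is a minimal sketch of the kind of objective that MDS methods typically minimize: a normalized stress comparing pairwise distances before and after projection. The function names and toy data are hypothetical, and the exact stress formulation used by Glimmer or Glint may differ.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def normalized_stress(high, low):
    """Squared mismatch between original and embedded pairwise distances,
    divided by the sum of squared original distances (0 means a perfect fit)."""
    mismatch = 0.0
    scale = 0.0
    n = len(high)
    for i in range(n):
        for j in range(i + 1, n):
            d_high = euclidean(high[i], high[j])
            d_low = euclidean(low[i], low[j])
            mismatch += (d_high - d_low) ** 2
            scale += d_high ** 2
    return mismatch / scale


# Hypothetical toy data: four 3-D points that all lie in the plane z = 0.
points_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
perfect_2d = [(x, y) for x, y, _ in points_3d]      # keeps every distance
collapsed_2d = [(x, 0.0) for x, _, _ in points_3d]  # discards the y axis

stress_perfect = normalized_stress(points_3d, perfect_2d)
stress_collapsed = normalized_stress(points_3d, collapsed_2d)
```

Here the layout that keeps both informative axes has zero stress, while the collapsed layout has strictly positive stress. The double loop visits every pair of points, which is the quadratic cost that motivates faster MDS algorithms.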
Pierrick Bruneau and Benoit Otjacques CRP Gabriel Lippmann
This work describes a novel interactive visual clustering system. It combines a 2D projection with a clustering algorithm that operates on this projected data. Example-based interactions are supported directly through the 2D representation. Each interaction incrementally updates the 2D projection and the associated clustering.
Sylvain Lespinats and Michael Aupetit CEA LIST and CEA LITEN
Dimensionality reduction algorithms can be of great help in decision support, representing the information as a map that summarizes the data similarities. When data come with an assigned class label, such a map can be used to check the quality of the labeling by detecting class outliers or data near a decision boundary, or to evaluate the relevance of the similarity measure used for the mapping, from which a good classification space can be derived. However, state-of-the-art mapping techniques are either unsupervised, not considering the class labels, or supervised, considering them but putting too much emphasis on the class information. As a result, well-separated classes can be mapped as overlapping by unsupervised techniques, while overlapping classes can be mapped as clearly separated by supervised techniques, so neither kind of map tends to show the truth about the within-class and between-class high-dimensional structure. We designed ClassiMap, a supervised mapping technique that overcomes these limits by exploiting the unavoidable mapping distortions, tears and false neighborhoods, to preserve the class structure as well as possible through the mapping. We compare it to other supervised mapping techniques in visual exploration tasks on labeled data.
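The two distortion types named in the abstract above can be illustrated with a small diagnostic. This is a sketch under stated assumptions, not the ClassiMap algorithm: it assumes a k-nearest-neighbor definition in which a tear is a pair that is neighboring in the original space but not in the map, and a false neighborhood is a pair that is neighboring in the map but not in the original space. All function names and data are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: euclidean(points[i], points[j]))
    return set(others[:k])


def neighborhood_distortions(high, low, k):
    """Return (tears, false_neighbors) as lists of (point, neighbor) pairs."""
    tears, false_neighbors = [], []
    for i in range(len(high)):
        near_high = knn(high, i, k)
        near_low = knn(low, i, k)
        # Neighbor in the original space that the map pulled apart.
        tears += [(i, j) for j in sorted(near_high - near_low)]
        # Neighbor in the map that is not a neighbor in the original space.
        false_neighbors += [(i, j) for j in sorted(near_low - near_high)]
    return tears, false_neighbors


# Hypothetical toy data: point 3 is far away along z but lands next to point 0
# in the 2-D map.
high = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (0.0, 0.0, 5.0)]
low = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (0.0, 0.2)]
labels = ["a", "a", "a", "b"]

tears, false_neighbors = neighborhood_distortions(high, low, k=1)
# False neighborhoods that mix classes make separated classes look overlapping.
mixed_class_false = [(i, j) for i, j in false_neighbors if labels[i] != labels[j]]
```

In this toy map, the false neighborhood between points 0 and 3 joins two different classes, which is the situation the abstract describes for unsupervised techniques.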
Daniel Perez, Leishi Zhang, Matthias Schaefer, Tobias Schreck, Daniel Keim, and Ignacio Diaz University of Oviedo and University of Konstanz
Projecting multidimensional data to a lower-dimensional visual display as a scatter-plot-like visualization is a common approach for analyzing multidimensional data. Many dimension reduction techniques exist for performing such a task, but the quality of projections varies in terms of both preserving the original data structure and avoiding cluttered visual displays. In this paper, we propose an interactive feature transformation approach that allows the analyst to monitor and improve the projection quality by transforming the feature space and assessing and comparing the quality of different projection results. The method integrates feature selection and transformation as well as a variety of projection quality measures to help the analyst generate uncluttered projections that preserve the structural properties of the data. These projections enhance the visual analysis process and provide a better understanding of the data.
Nicolas Heulot, Michael Aupetit, and Jean-Daniel Fekete CEA LIST and INRIA
As dimensionality increases, analysts face difficult problems in making sense of their data. In exploratory data analysis, multidimensional scaling projections can help analysts discover patterns by identifying outliers and enabling visual clustering. However, to exploit these projections, artifacts and interpretation issues must be overcome. We present ProxiLens, a semantic lens that helps explore data interactively. The analyst becomes aware of the artifacts by navigating continuously through the 2D projection in order to cluster and analyze data. We demonstrate the applicability of our technique for visual clustering on synthetic and real data sets.
Stability comparison of dimensionality reduction techniques attending to data and parameter variations
Francisco Garcia-Fernandez, Michel Verleysen, John Lee, and Ignacio Diaz University of Oviedo and Universite Catholique de Louvain
The continuous growth in data volumes requires efficient and robust dimension reduction techniques that represent data in lower-dimensional spaces, easing human understanding. This paper presents a study of the stability, robustness and performance of some of these dimension reduction algorithms with respect to algorithm and data parameters, which usually have a major influence on the resulting embeddings. The analysis covers the performance of a large panel of techniques on both artificial and real datasets, focusing on the geometrical variations observed when different parameters are changed. The results are presented by identifying the visual weaknesses of each technique and suggesting suitable data-processing tasks to enhance stability.
Emilie Renard, Pierre Dupont, and Michel Verleysen Universite Catholique de Louvain
Dealing with high-dimensional data has become very common; visualization is a natural preprocessing step for getting an overview of such data. Many dimensionality reduction methods exist, and many of them require tuning a parameter that implements a trade-off between conflicting objectives. Automatically choosing the appropriate trade-off is usually difficult because in most cases the exact final goal of the visualization is ill-defined. The approach developed here aims to take advantage of the user's capacities and feedback by allowing them to control parameters in real time and see the resulting visualization. To obtain fast transitions between visualizations resulting from different values of the parameter, interpolation on a grid is used as an approximation. The accuracy of this approximation is estimated using Procrustes analysis and can be adjusted through a threshold. Simulations provide an interpretation of this threshold and are validated on a real dataset.
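To make the Procrustes step in the abstract above concrete, here is a minimal sketch of a Procrustes disparity between two 2D layouts of the same points, for example an exactly computed embedding and an interpolated one. It is an illustration under assumptions, not the authors' implementation: the function names, toy data, and the threshold value 0.01 are hypothetical. Each layout is centered and scaled to unit norm, then the best rotation or reflection is found in closed form by treating 2D points as complex numbers.

```python
import math


def standardize(points):
    """Center 2-D points, scale them to unit norm, return complex numbers."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    z = [complex(x - cx, y - cy) for x, y in points]
    norm = math.sqrt(sum(abs(v) ** 2 for v in z))
    return [v / norm for v in z]


def procrustes_disparity(a, b):
    """Residual after optimally translating, scaling, rotating and possibly
    reflecting layout b onto layout a (0 means identical shapes)."""
    z = standardize(a)
    w = standardize(b)
    # Best alignment using a rotation only.
    rotation = abs(sum(p * q.conjugate() for p, q in zip(z, w)))
    # Best alignment when b is mirrored first.
    reflection = abs(sum(p * q for p, q in zip(z, w)))
    return 1.0 - max(rotation, reflection) ** 2


# Hypothetical layouts of the same four points.
exact = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 3.0)]
# Same shape, rotated by 90 degrees and scaled by 2.
rotated = [(-2.0 * y, 2.0 * x) for x, y in exact]
# One point moved, so the shape itself differs.
perturbed = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (3.0, 3.0)]

threshold = 0.01  # hypothetical accuracy threshold
same_shape_ok = procrustes_disparity(exact, rotated) <= threshold
perturbed_ok = procrustes_disparity(exact, perturbed) <= threshold
```

A layout that differs only by rotation and scale passes the threshold, while the perturbed layout does not. In a grid-interpolation setting, a failed check could be a signal that the parameter grid is too coarse at that location.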