Cluster stability in the analysis of mass cytometry data

Melchiotti, R., Gracio, F., Kordasti, S. et al.

Manual gating has been traditionally applied to cytometry data sets to identify cells based on protein expression. The advent of mass cytometry allows for a higher number of proteins to be simultaneously measured on cells, therefore providing a means to define cell clusters in a high dimensional expression space. This enhancement, whilst opening unprecedented opportunities for single cell-level analyses, makes the incremental replacement of manual gating with automated clustering a compelling need. To this aim many methods have been implemented and their successful applications demonstrated in different settings. However, the reproducibility of automatically generated clusters is proving challenging and an analytical framework to distinguish spurious clusters from more stable entities, and presumably more biologically relevant ones, is still missing. One way to estimate cell clusters' stability is the evaluation of their consistent re-occurrence within- and between-algorithms, a metric that is commonly used to evaluate results from gene expression. Herein we report the usage and importance of cluster stability evaluations, when applied to results generated from three popular clustering algorithms - SPADE, FLOCK and PhenoGraph - run on four different data sets. These algorithms were shown to generate clusters with various degrees of statistical stability, many of them being unstable. By comparing the results of automated clustering with manually gated populations, we illustrate how information on cluster stability can assist towards a more rigorous and informed interpretation of clustering results. We also explore the relationships between statistical stability and other properties such as clusters' compactness and isolation, demonstrating that whilst cluster stability is linked to other properties it cannot be reliably predicted by any of them. Our study proposes the introduction of cluster stability as a necessary checkpoint for cluster interpretation and contributes to the construction of a more systematic and standardized analytical framework for the assessment of cytometry clustering results.


Melchiotti, R., Gracio, F., Kordasti, S. et al. "Cluster stability in the analysis of mass cytometry data" Cytometry Part A (2017): 73–84