Generating Quantitative Cell Identity Labels with Marker Enrichment Modeling (MEM)

Kirsten E. Diggins¹, Jocelyn S. Gandelman², Caroline E. Roe¹, Jonathan M. Irish²

¹ Vanderbilt‐Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee
² Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, Tennessee
Publication Name:  Current Protocols in Cytometry
Unit Number:  Unit 10.21
DOI:  10.1002/cpcy.34
Online Posting Date:  January, 2018
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

Multiplexed single‐cell experimental techniques like mass cytometry measure 40 or more features and enable deep characterization of well‐known and novel cell populations. However, traditional data analysis techniques rely extensively on human experts or prior knowledge, and novel machine learning algorithms may generate unexpected population groupings. Marker enrichment modeling (MEM) creates quantitative identity labels based on features enriched in a population relative to a reference. While developed for cell type analysis, MEM labels can be generated for a wide range of multidimensional data types, and MEM works effectively with output from expert analysis and diverse machine learning algorithms. MEM is implemented as an R package and includes three steps: (1) calculation of MEM values that quantify each feature's relative enrichment in the population, (2) reporting of MEM labels as a heatmap or as a text label, and (3) quantification of MEM label similarity between populations. The protocols here show MEM analysis using datasets from immunology and oncology. These MEM implementations provide a way to characterize population identity and novelty in the context of computational and expert analyses. © 2018 by John Wiley & Sons, Inc.
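
Steps (1) and (3) can be illustrated with a minimal base R sketch. It assumes the MEM equation has the form published in Diggins et al. (2017), MEM = |MAGpop − MAGref| + IQRref/IQRpop − 1, with the sign made negative when MAGpop < MAGref, where MAG is the median. The helper names (mem_score, rmsd), the IQR floor of 0.5, and the simulated data are assumptions made for illustration; this is not the MEM package's interface, and scaling of the values for display is left out.

```r
# Hypothetical helpers for illustration; not the MEM package API.
# pop, ref: numeric matrices (cells x features) with matching column names.
mem_score <- function(pop, ref, iqr_floor = 0.5) {
  mag_pop <- apply(pop, 2, median)
  mag_ref <- apply(ref, 2, median)
  # Floor the IQRs so a near-zero spread does not inflate the ratio
  # (the floor value is an assumption chosen for this sketch).
  iqr_pop <- pmax(apply(pop, 2, IQR), iqr_floor)
  iqr_ref <- pmax(apply(ref, 2, IQR), iqr_floor)
  score <- abs(mag_pop - mag_ref) + iqr_ref / iqr_pop - 1
  # Features lower in the population than in the reference get a negative sign.
  ifelse(mag_pop < mag_ref, -score, score)
}

# Root-mean-square deviation between two label vectors over the same features.
rmsd <- function(a, b) sqrt(mean((a - b)^2))

set.seed(1)
# Simulated reference and two populations with opposite marker patterns.
ref   <- cbind(CD3 = rnorm(500, 2, 1),     CD19 = rnorm(500, 2, 1))
pop_a <- cbind(CD3 = rnorm(200, 5, 0.5),   CD19 = rnorm(200, 0.5, 0.5))
pop_b <- cbind(CD3 = rnorm(200, 0.5, 0.5), CD19 = rnorm(200, 5, 0.5))

label_a <- mem_score(pop_a, ref)
label_b <- mem_score(pop_b, ref)

print(round(label_a, 1))
print(round(label_b, 1))
print(rmsd(label_a, label_b))
```

A smaller RMSD indicates more similar labels; identical labels give an RMSD of zero.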

Keywords: bioinformatics; cell identity; cytotype; computational biology; flow cytometry; mass cytometry; machine learning; marker enrichment modeling; MEM; single cell

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: MEM Analysis of Internal Data
  • Basic Protocol 2: Analyzing External Data with MEM
  • Support Protocol 1: viSNE and FlowSOM Clustering of Mass Cytometry Data
  • Basic Protocol 3: Calculate RMSD Similarity on Populations from Separate MEM Analyses
  • Commentary
  • Literature Cited
  • Figures
  • Tables
     
 

Materials

Basic Protocol 1: MEM Analysis of Internal Data

  Necessary Resources
  • See protocol 1

Basic Protocol 2: Analyzing External Data with MEM

  Necessary Resources
  • ViSNE analysis
    • Analyze data using viSNE (Amir et al., 2013). In this example, data were first analyzed by viSNE in Cytobank (http://www.cytobank.org). viSNE is also available from https://www.c2b2.columbia.edu/danapeerlab/html/index.html, and the underlying t‐SNE algorithm can be run in R using the packages “tsne” (https://CRAN.R-project.org/package=tsne) or “Rtsne” (https://CRAN.R-project.org/package=Rtsne).
    • For the viSNE analysis, sample cells equally from the 10 FCS files and select all measured markers for dimensionality reduction. Download files as FCS files containing the t‐SNE channels.
  • FlowSOM R package
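    • FlowSOM is implemented as an R package. It can be downloaded at: https://github.com/SofieVG/FlowSOM (Van Gassen et al., 2015). To install the package:
    • source("https://bioconductor.org/biocLite.R")
    • biocLite("FlowSOM")
    • On Bioconductor 3.8 and later, where biocLite has been replaced by the BiocManager package, the equivalent commands are install.packages("BiocManager") followed by BiocManager::install("FlowSOM").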
  • FlowSOM analysis
    • FlowSOM can be run using a single FCS file or multiple FCS files. In this example, 10 FCS files from the viSNE analysis are analyzed separately. Files can also be concatenated for FlowSOM analysis using the Cytobank concatenation tool (https://support.cytobank.org/hc/en-us/articles/206336147-FCS-file-concatenation-tool).
    • Use FlowSOM to generate clusters across all patient files. In this example, 25 clusters are generated using the t‐SNE channels as input for clustering. The per‐cell cluster ID is added back to the file, which can then be uploaded to Cytobank for visualization and formatting or input directly into MEM within R. Here, the data are gated in Cytobank into 25 clusters and downloaded as 25 files, one for each cluster. These files are available for download from Flow Repository (https://flowrepository.org/experiments/1394).
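
The data handling in the steps above (equal sampling from each file, attaching a per‐cell cluster ID, and splitting into one table per cluster) can be sketched in base R. This is an illustration under stated assumptions, not the protocol's code: the matrices are simulated stand‐ins for the 10 FCS files, the channel and file names are hypothetical, the sampled cells are pooled before clustering, and stats::kmeans is used in place of FlowSOM so that the block runs without additional packages.

```r
set.seed(42)

# Simulated stand-ins for 10 FCS files: one matrix (cells x channels) per file.
files <- lapply(1:10, function(i) {
  n <- sample(300:600, 1)
  cbind(tSNE1 = rnorm(n), tSNE2 = rnorm(n), CD45 = rexp(n))
})
names(files) <- sprintf("patient_%02d", 1:10)

# Equal sampling: draw the same number of cells from every file.
n_per_file <- min(vapply(files, nrow, integer(1)))
sampled <- lapply(files, function(m) {
  m[sample(nrow(m), n_per_file), , drop = FALSE]
})

# Pool the sampled cells and remember each cell's file of origin.
pooled  <- do.call(rbind, sampled)
file_id <- rep(names(sampled), each = n_per_file)

# Cluster on the t-SNE channels only. kmeans is a stand-in for FlowSOM here.
km <- kmeans(pooled[, c("tSNE1", "tSNE2")], centers = 25,
             nstart = 5, iter.max = 50)

# Add the per-cell cluster ID back to the data as an extra column.
pooled <- cbind(pooled, cluster = km$cluster)

# Split into one table per cluster, analogous to exporting 25 files.
by_cluster <- split(as.data.frame(pooled), pooled[, "cluster"])

print(length(by_cluster))
print(head(by_cluster[[1]], 3))
```

Each element of by_cluster then plays the role of one per‐cluster file passed on to MEM.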

Support Protocol 1: viSNE and FlowSOM Clustering of Mass Cytometry Data

  Necessary Resources
  • See protocol 1.

Literature Cited

 
  Aghaeepour, N., Finak, G., The FlowCAP Consortium, The DREAM Consortium, Hoos, H., Mosmann, T. R., … Scheuermann, R. H. (2013). Critical assessment of automated flow cytometry data analysis techniques. Nature Methods, 10(3), 228–238. doi: 10.1038/nmeth.2365.
  Aghaeepour, N., Ganio, E. A., McIlwain, D., Tsai, A. S., Tingle, M., Van Gassen, S., … Gaudilliere, B. (2017). An immune clock of human pregnancy. Science Immunology, 2(15), eaan2946. doi: 10.1126/sciimmunol.aan2946.
  Amir, E. D., Davis, K. L., Tadmor, M. D., Simonds, E. F., Levine, J. H., Bendall, S. C., … Pe'er, D. (2013). viSNE enables visualization of high dimensional single‐cell data and reveals phenotypic heterogeneity of leukemia. Nature Biotechnology, 31(6), 545–552. doi: 10.1038/nbt.2594.
  Bandura, D. R., Baranov, V. I., Ornatsky, O. I., Antonov, A., Kinach, R., Lou, X., … Tanner, S. D. (2009). Mass cytometry: Technique for real time single cell multitarget immunoassay based on inductively coupled plasma time‐of‐flight mass spectrometry. Analytical Chemistry, 81(16), 6813–6822. doi: 10.1021/ac901049w.
  Bruggner, R. V., Bodenmiller, B., Dill, D. L., Tibshirani, R. J., & Nolan, G. P. (2014). Automated identification of stratifying signatures in cellular subpopulations. Proceedings of the National Academy of Sciences of the United States of America, 111(26), E2770–2777. doi: 10.1073/pnas.1408792111.
  Cavrois, M., Banerjee, T., Mukherjee, G., Raman, N., Hussien, R., Rodriguez, B. A., … Roan, N. R. (2017). Mass cytometric analysis of HIV entry, replication, and remodeling in tissue CD4+ T cells. Cell Reports, 20(4), 984–998. doi: 10.1016/j.celrep.2017.06.087.
  Diggins, K. E., Ferrell, P. B., Jr., & Irish, J. M. (2015). Methods for discovery and characterization of cell subsets in high dimensional mass cytometry data. Methods, 82, 55–63. doi: 10.1016/j.ymeth.2015.05.008.
  Diggins, K. E., Greenplate, A. R., Leelatian, N., Wogsland, C. E., & Irish, J. M. (2017). Characterizing cell subsets using marker enrichment modeling. Nature Methods, 14(3), 275–278. doi: 10.1038/nmeth.4149.
  DiGiuseppe, J. A., Cardinali, J. L., Rezuke, W. N., & Pe'er, D. (2017). PhenoGraph and viSNE facilitate the identification of abnormal T‐cell populations in routine clinical flow cytometric data. Cytometry. Part B, Clinical Cytometry, 2017 Sep. 2. doi: 10.1002/cyto.b.21588.
  Finck, R., Simonds, E. F., Jager, A., Krishnaswamy, S., Sachs, K., Fantl, W., … Bendall, S. C. (2013). Normalization of mass cytometry data with bead standards. Cytometry. Part A, 83(5), 483–494. doi: 10.1002/cyto.a.22271.
  Greenplate, A. R., Johnson, D. B., Roussel, M., Savona, M. R., Sosman, J. A., Puzanov, I., … Irish, J. M. (2016a). Myelodysplastic syndrome revealed by systems immunology in a melanoma patient undergoing anti‐PD‐1 therapy. Cancer Immunology Research, 4(6), 474–480. doi: 10.1158/2326-6066.CIR-15-0213.
  Greenplate, A. R., Johnson, D. B., Roussel, M., Savona, M. R., Sosman, J. A., Puzanov, I., … Irish, J. M. (2016b). Myelodysplastic syndrome revealed by systems immunology in a melanoma patient undergoing anti‐PD‐1 therapy. Cancer Immunology Research, 4(6), 474–480. doi: 10.1158/2326-6066.CIR-15-0213.
  Kotecha, N., Krutzik, P. O., & Irish, J. M. (2010). Web‐based analysis and publication of flow cytometry experiments. Current Protocols in Cytometry, 53, 10.17:10.17.1–10.17.24. doi: 10.1002/0471142956.cy1017s53.
  Lakshmikanth, T., Olin, A., Chen, Y., Mikes, J., Fredlund, E., Remberger, M., … Brodin, P. (2017). Mass cytometry and topological data analysis reveal immune parameters associated with complications after allogeneic stem cell transplantation. Cell Reports, 20(9), 2238–2250. doi: 10.1016/j.celrep.2017.08.021.
  Leelatian, N., Diggins, K. E., & Irish, J. M. (2015). Characterizing phenotypes and signaling networks of single human cells by mass cytometry. Methods in Molecular Biology (Clifton, N.J.), 1346, 99–113. doi: 10.1007/978-1-4939-2987-0_8.
  Levine, J. H., Simonds, E. F., Bendall, S. C., Davis, K. L., Amir, E. D., Tadmor, M. D., … Nolan, G. P. (2015). Data‐driven phenotypic dissection of AML reveals progenitor‐like cells that correlate with prognosis. Cell, 162(1), 184–197. doi: 10.1016/j.cell.2015.05.047.
  Lun, A. T. L., Richard, A. C., & Marioni, J. C. (2017). Testing for differential abundance in mass cytometry data. Nature Methods, 14(7), 707–709. doi: 10.1038/nmeth.4295.
  Newell, E. W., & Cheng, Y. (2016). Mass cytometry: Blessed with the curse of dimensionality. Nature Immunology, 17(8), 890–895. doi: 10.1038/ni.3485.
  Saeys, Y., Van Gassen, S., & Lambrecht, B. N. (2016). Computational flow cytometry: Helping to make sense of high‐dimensional immunology data. Nature Reviews Immunology, 16(7), 449–462. doi: 10.1038/nri.2016.56.
  Seshadri, A., Brat, G. A., Yorkgitis, B. K., Keegan, J., Dolan, J., Salim, A., … Lederer, J. A. (2017). Phenotyping the immune response to trauma: A multiparametric systems immunology approach. Critical Care Medicine, 45(9), 1523–1530. doi: 10.1097/CCM.0000000000002577.
  Shekhar, K., Brodin, P., Davis, M. M., & Chakraborty, A. K. (2014). Automatic classification of cellular expression by nonlinear stochastic embedding (ACCENSE). Proceedings of the National Academy of Sciences of the United States of America, 111(1), 202–207. doi: 10.1073/pnas.1321405111.
  Spitzer, M. H., & Nolan, G. P. (2016). Mass Cytometry: Single cells, many features. Cell, 165(4), 780–791. doi: 10.1016/j.cell.2016.04.019.
  Van Gassen, S., Callebaut, B., Van Helden, M. J., Lambrecht, B. N., Demeester, P., Dhaene, T., & Saeys, Y. (2015). FlowSOM: Using self‐organizing maps for visualization and interpretation of cytometry data. Cytometry. Part A, 87(7), 636–645. doi: 10.1002/cyto.a.22625.
  Wang, B., Zhu, J., Pierson, E., Ramazzotti, D., & Batzoglou, S. (2017). Visualization and analysis of single‐cell RNA‐seq data by kernel‐based similarity learning. Nature Methods, 14(4), 414–416. doi: 10.1038/nmeth.4207.
  Weber, L. M., & Robinson, M. D. (2016). Comparison of clustering methods for high‐dimensional single‐cell flow and mass cytometry data. Cytometry. Part A, 89(12), 1084–1096. doi: 10.1002/cyto.a.23030.
  Wei, S. C., Levine, J. H., Cogdill, A. P., Zhao, Y., Anang, N. A. S., Andrews, M. C., … Allison, J. P. (2017). Distinct cellular mechanisms underlie anti‐CTLA‐4 and anti‐PD‐1 checkpoint blockade. Cell, 170(6), 1120–1133. doi: 10.1016/j.cell.2017.07.024.

Internet Resources

  https://rdrr.io/bioc/flowCore/
  flowCore: Basic structures for flow cytometry data. R Package Version, 1.42.2. (2017). B. Ellis, P. Haaland, F. Hahne, N. Le Meur, N. Gopalakrishnan, J. Spidlen, and M. Jiang.