You don't need FlowJo, FCS Express, or any paid software to do serious flow cytometry data analysis. R gives you a complete, reproducible, free workflow — from loading your FCS files to publication-ready figures. This tutorial walks you through every step with real code.

What you'll learn: Install the right packages → load FCS files → QC → compensation → transformation → gating → extract statistics → visualize results. All free, all in R.

Why Analyze Flow Cytometry Data in R?

The standard tools (FlowJo, FCS Express) are powerful but expensive — licenses can cost thousands of dollars per year. R offers a fully functional alternative that is free, open-source, and crucially, reproducible. Every gate, every transformation, every statistical test is written in code you can share, version-control, and re-run on new data.

For research labs with multiple users, R also scales much better: a single script can process hundreds of FCS files overnight with no manual intervention. That's not something you can do clicking through a GUI.

Step 1 — Install the Essential Packages

Most of the packages you need are available through Bioconductor, the standard repository for bioinformatics R packages. CytoExploreR is the exception: it is distributed through GitHub rather than Bioconductor.

# Install Bioconductor manager once
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

# Core flow cytometry packages
BiocManager::install(c(
  "flowCore",      # read/write FCS files, core data structures
  "flowViz",       # base plotting for flow data
  "ggcyto",        # ggplot2-based visualization
  "openCyto",      # automated gating pipelines
  "flowWorkspace"  # gating set management
))

# CytoExploreR (interactive gating interface) is installed from GitHub
if (!requireNamespace("remotes", quietly = TRUE))
  install.packages("remotes")
remotes::install_github("DillonHammill/CytoExploreR")
flowCore: Foundation. Read FCS files, transformations, basic statistics
ggcyto: ggplot2 extension. Beautiful, publication-ready flow plots
openCyto: Automated gating using data-driven algorithms
flowWorkspace: Manage gating hierarchies and populations
CytoExploreR: Interactive gating, the closest thing to FlowJo in R
cytofast: Our package. Fast visualization of FlowSOM results
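
Before moving on, it can help to confirm which of these packages actually installed. The following is a minimal sketch using only base R; the package vector simply repeats the names listed above, and the messages are illustrative.

```r
# Packages used in this tutorial (names taken from the list above)
pkgs <- c("flowCore", "flowViz", "ggcyto", "openCyto",
          "flowWorkspace", "CytoExploreR", "cytofast")

# requireNamespace() returns TRUE/FALSE without attaching the package
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report anything still missing so it can be installed before continuing
missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are available.")
}
```

Running this at the top of an analysis script gives a clear message up front instead of a failure halfway through a long batch job.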

Step 2 — Load Your FCS Files

The flowCore package handles FCS 2.0, 3.0, and 3.1 files. The basic unit is a flowFrame (one FCS file) or a flowSet (multiple files).

library(flowCore)

# Load a single FCS file
ff <- read.FCS("sample_001.fcs", transformation = FALSE)

# See what's inside
summary(ff)
colnames(ff)       # channel names (e.g., "FITC-A", "PE-A")
pData(parameters(ff))  # full channel descriptions

# Load all FCS files from a folder into a flowSet
fs <- read.flowSet(
  path = "./fcs_files/",
  pattern = "\\.fcs$",   # regular expression (not a shell glob): names ending in .fcs
  transformation = FALSE
)

length(fs)          # number of samples
sampleNames(fs)    # file names

Important: always load with transformation = FALSE first. Apply transformations yourself in a controlled way — letting flowCore auto-transform can give unexpected results with some instruments.
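
One detail worth checking before loading a folder: the pattern argument is treated as a regular expression when files are listed, not as a shell-style wildcard. The sketch below shows the difference with base R's list.files() on a throwaway folder; the folder and file names are hypothetical and the files are empty placeholders.

```r
# Create a throwaway folder with some empty files (hypothetical names)
demo_dir <- file.path(tempdir(), "fcs_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  demo_dir,
  c("sample_001.fcs", "sample_002.FCS", "notes.txt", "sample_001.fcs.bak")
)))

# Anchored regex: the file name must end in ".fcs", in any letter case
fcs_files <- list.files(demo_dir, pattern = "\\.fcs$", ignore.case = TRUE)
print(fcs_files)
```

The backup file and the text file are skipped because the expression is anchored to the end of the name. Listing files this way first is a cheap sanity check on what a batch import would pick up.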

Step 3 — Quality Control

Before any analysis, remove debris, dead cells, and time-based anomalies (instrument clogs, pressure drops). The flowAI and PeacoQC packages automate the time-based and signal-stability checks, but a manual look is always useful first.

library(ggcyto)

# Quick look at FSC vs SSC to assess overall quality
autoplot(fs[[1]], x = "FSC-A", y = "SSC-A", bins = 64)

# Time-based QC: check for signal drift
autoplot(fs[[1]], x = "Time", y = "FSC-A")

# Automated QC with flowAI
BiocManager::install("flowAI")
library(flowAI)
fs_clean <- flow_auto_qc(fs, output = 1)
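
The manual time check can also be reduced to numbers. The sketch below bins events by acquisition time, takes the median FSC-A per bin, and flags bins that sit far from the rest. It runs on a simulated matrix standing in for the expression matrix of one sample; the 20 bins and the 3-MAD threshold are arbitrary assumptions, and this is not how flowAI or PeacoQC work internally.

```r
set.seed(1)

# Simulated events standing in for the Time and FSC-A columns of one sample
n <- 10000
events <- cbind(Time = sort(runif(n, 0, 100)),
                `FSC-A` = rnorm(n, mean = 1e5, sd = 1e4))

# Simulate a partial clog: signal drops by half between t = 60 and t = 70
clog <- events[, "Time"] > 60 & events[, "Time"] < 70
events[clog, "FSC-A"] <- events[clog, "FSC-A"] * 0.5

# Split the acquisition into 20 equal-width time bins
bins <- cut(events[, "Time"], breaks = seq(0, 100, length.out = 21),
            include.lowest = TRUE)
bin_median <- tapply(events[, "FSC-A"], bins, median)

# Flag bins whose median sits more than 3 MADs from the overall median
centre <- median(bin_median)
spread <- mad(bin_median)
flagged <- names(bin_median)[abs(bin_median - centre) > 3 * spread]
print(flagged)
```

Events falling in flagged bins are candidates for removal; plotting bin_median against bin order gives the same picture as the Time versus FSC-A plot, in a form that is easy to threshold.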

Step 4 — Compensation

Spectral spillover is unavoidable in multicolor flow cytometry. If your FCS files already contain a compensation matrix embedded by the cytometer software, you can apply it directly. Otherwise, you calculate it from single-stain controls.

# Extract the embedded compensation matrix
# (the keyword name varies by instrument; "$SPILLOVER" and "SPILL" are also common)
comp_matrix <- keyword(fs[[1]])[["$SPILL"]]
comp <- compensation(comp_matrix)

# Apply to the whole flowSet
fs_comp <- compensate(fs_clean, comp)

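# If you have single-stain controls, compute from scratch
# spillover() estimates the matrix from single-color tubes
comp_manual <- spillover(
  x = single_stains_flowset,     # flowSet holding the single-stain tubes and the unstained tube
  unstained = unstained_sample,  # name or index of the unstained tube within x
  patt = "-A$",                  # regular expression selecting channels (not file names)
  fsc = "FSC-A",
  ssc = "SSC-A",
  method = "median"
)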
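
To see what compensation does numerically, here is a small worked example of the underlying linear algebra in base R: the detectors record the true signal multiplied by the spillover matrix, and compensation multiplies by its inverse. The two-color matrix and the three cells are made-up values for illustration only.

```r
# Hypothetical 2-color spillover matrix: rows = fluorochrome, columns = detector
S <- matrix(c(1.00, 0.15,
              0.05, 1.00),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("FITC", "PE"), c("FITC-A", "PE-A")))

# True (unmixed) dye signal for three simulated cells
true_signal <- matrix(c(1000,    0,
                           0, 2000,
                         500,  500),
                      ncol = 2, byrow = TRUE,
                      dimnames = list(NULL, c("FITC", "PE")))

# What the detectors record: each dye leaks into the neighbouring channel
observed <- true_signal %*% S

# Compensation multiplies by the inverse of the spillover matrix
compensated <- observed %*% solve(S)

print(round(observed))
print(round(compensated))
```

The first cell carries only FITC, yet it shows signal in the PE detector before compensation. After multiplying by the inverse matrix, the original values are recovered.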

Step 5 — Transformation

Fluorescence data are rarely analyzed on a linear scale. Intensities span several decades and include negative values after compensation. Scatter channels (FSC, SSC) and Time, by contrast, are usually left linear. The two standard transformations are logicle (biexponential) and arcsinh.

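# Logicle transform: the standard for flow cytometry
# estimateLogicle() estimates parameters per channel from the data
# Transform fluorescence channels only; scatter and Time stay linear
fluor_channels <- grep("FSC|SSC|Time", colnames(fs_comp), value = TRUE, invert = TRUE)
trans <- estimateLogicle(fs_comp[[1]], channels = fluor_channels)
fs_trans <- transform(fs_comp, trans)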

# Alternative: arcsinh, preferred for mass cytometry (CyTOF), cofactor = 5
# Use this instead of the logicle block above, not in addition to it
asinhTrans <- arcsinhTransform(transformationId = "arcsinh", a = 0, b = 1/5, c = 0)
channels <- colnames(fs_comp)[4:30]  # adjust to your channel range
translist <- transformList(channels, asinhTrans)
fs_trans <- transform(fs_comp, translist)
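
The arcsinh transform is simple enough to try by hand. This base R sketch applies asinh(x / 5) to a few made-up intensities, which has the same form as the arcsinhTransform() call above with a = 0, b = 1/5, c = 0, and compares the upper range with a plain logarithm.

```r
# Raw intensities, including negatives of the kind that appear after compensation
x <- c(-100, -5, 0, 5, 100, 1000, 10000)

cofactor <- 5
y <- asinh(x / cofactor)

# For large x, asinh(x / cofactor) approaches log(2 * x / cofactor)
log_approx <- log(2 * x[x >= 100] / cofactor)

print(round(y, 3))
print(round(log_approx, 3))
```

Negative and zero values are handled without error, the function is symmetric around zero, and above roughly 100 the result is almost indistinguishable from a log scale. A larger cofactor widens the near-linear region around zero.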

Step 6 — Gating

Gating is where most of the analytical decisions happen. In R you can gate manually (by defining polygon or rectangle gates), use statistical methods, or use fully automated pipelines via openCyto.

Manual gating with flowWorkspace

library(flowWorkspace)
library(ggcyto)

# Create a GatingSet from your flowSet
gs <- GatingSet(fs_trans)

# Define a lymphocyte gate (FSC/SSC)
lymph_gate <- polygonGate(
  filterId = "Lymphocytes",
  .gate = matrix(c(
    50000, 20000,   # FSC-A, SSC-A corner 1
    200000, 20000,  # corner 2
    200000, 80000,  # corner 3
    50000, 80000    # corner 4
  ), ncol = 2, byrow = TRUE,  # byrow = TRUE so each pair above is one (FSC-A, SSC-A) vertex
  dimnames = list(NULL, c("FSC-A", "SSC-A")))
)

# Add gate to the gating hierarchy
gs_pop_add(gs, lymph_gate, parent = "root")
recompute(gs)

# Gate on CD3+CD4+ T cells from lymphocyte parent
cd4_gate <- rectangleGate(
  filterId = "CD4 T cells",
  "BV421-A" = c(2, Inf),   # CD3
  "PE-A" = c(2, Inf)        # CD4
)
gs_pop_add(gs, cd4_gate, parent = "Lymphocytes")
recompute(gs)
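
Conceptually, a rectangle gate is just a logical filter on two columns. The base R sketch below applies the same cut-off of 2 on both axes to simulated, already transformed intensities and reports the count and percentage inside the gate. The data and column names are invented for illustration, and this is not how flowWorkspace stores or evaluates gates.

```r
set.seed(42)

# Simulated transformed intensities for 1,000 events (stand-in for real data)
events <- data.frame(
  cd3 = c(rnorm(600, mean = 3, sd = 0.3), rnorm(400, mean = 0.5, sd = 0.3)),
  cd4 = c(rnorm(300, mean = 3, sd = 0.3), rnorm(700, mean = 0.5, sd = 0.3))
)

# Rectangle gate: both markers at or above the cut-off of 2
in_gate <- events$cd3 >= 2 & events$cd4 >= 2

gate_count <- sum(in_gate)
gate_percent <- 100 * mean(in_gate)
cat(gate_count, "events in gate,", round(gate_percent, 1), "% of parent\n")
```

Seeing the gate as a logical vector makes it clear why the cut-offs must be expressed on the same scale as the data: a threshold of 2 only makes sense after transformation.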

Automated gating with openCyto

library(openCyto)

# openCyto uses a gating template (CSV) to define the hierarchy,
# then applies data-driven algorithms to each gate.
# Run it on a GatingSet that does not already contain these populations.

gt <- gatingTemplate("gating_template.csv")
gt_gating(gt, gs)

# The template CSV looks like this (two dimensions are comma-separated and quoted;
# live cells are the viability-dye-negative population):
# alias,pop,parent,dims,gating_method,gating_args
# Lymphocytes,+,root,"FSC-A,SSC-A",flowClust.2d,K=1
# Live,-,Lymphocytes,Viability-A,mindensity,
# CD4 T cells,++,Live,"CD3-A,CD4-A",cytokine,
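
To give a feel for what a data-driven one-dimensional gate does, the sketch below places a cut at the lowest-density point between two peaks, which is the idea behind the mindensity method named in the template. It is a simplified illustration in base R on simulated data and assumes exactly two well-separated peaks; it is not openCyto's implementation.

```r
set.seed(7)

# Simulated bimodal marker: negative population near 0.5, positive near 3
x <- c(rnorm(3000, mean = 0.5, sd = 0.3), rnorm(2000, mean = 3, sd = 0.4))

# Kernel density estimate over the data range
d <- density(x, n = 512)

# Locate the two peaks by searching each half of the range separately
# (assumption: one peak on either side of the midpoint)
mid <- mean(range(x))
left_peak  <- d$x[d$x <  mid][which.max(d$y[d$x <  mid])]
right_peak <- d$x[d$x >= mid][which.max(d$y[d$x >= mid])]

# The gate boundary is the lowest-density point between the two peaks
between <- d$x > left_peak & d$x < right_peak
cut_point <- d$x[between][which.min(d$y[between])]

cat("Cut at", round(cut_point, 2), "\n")
cat("Positive fraction:", round(mean(x > cut_point), 3), "\n")
```

Because the boundary is derived from each sample's own distribution, it shifts with staining intensity instead of staying fixed, which is the main appeal of automated gating across many files.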

Step 7 — Extract Cell Population Statistics

Once your gating hierarchy is built, extracting counts and percentages is a single function call.

# Get population statistics for all samples
stats <- gs_pop_get_stats(gs, type = "percent")
head(stats)

# Get absolute counts
counts <- gs_pop_get_stats(gs, type = "count")

# Reshape to wide format for downstream stats
library(tidyr)
library(dplyr)

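stats_wide <- stats |>
  pivot_wider(names_from = pop, values_from = percent) |>
  # stats identifies samples in a "sample" column; sample_metadata is assumed to use "name"
  left_join(sample_metadata, by = c("sample" = "name"))

# Now run your statistics
wilcox.test(`CD4 T cells` ~ group, data = stats_wide)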
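
The reshape-then-test pattern can be tried without any FCS files. The sketch below builds a toy long-format table with columns sample, pop, and percent (assumed to mirror the statistics table), reshapes it with base R, and runs the same two-group test. All values, sample names, and group labels are made up.

```r
# Toy long-format table; columns sample / pop / percent are assumed, values are made up
samples <- paste0("s", 1:6, ".fcs")
stats_long <- data.frame(
  sample  = rep(samples, each = 2),
  pop     = rep(c("Lymphocytes", "CD4 T cells"), times = 6),
  percent = c(60, 20, 62, 22, 58, 21,    # control samples
              61, 35, 59, 38, 63, 36)    # treated samples
)

# Hypothetical sample annotation
sample_metadata <- data.frame(
  name  = samples,
  group = rep(c("control", "treated"), each = 3)
)

# Long -> wide: one row per sample, one column per population
stats_wide <- reshape(stats_long, idvar = "sample", timevar = "pop",
                      direction = "wide")
names(stats_wide) <- sub("^percent\\.", "", names(stats_wide))

# Attach group labels, matching "sample" to "name"
stats_wide <- merge(stats_wide, sample_metadata,
                    by.x = "sample", by.y = "name")

# Two-group comparison with the formula interface
res <- wilcox.test(`CD4 T cells` ~ group, data = stats_wide)
print(res$p.value)
```

With only three samples per group, the exact Wilcoxon test cannot produce a p-value below 0.1 even when the groups are completely separated, so group sizes are worth planning before the experiment.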

Step 8 — Visualization

The ggcyto package extends ggplot2 to understand flow cytometry data structures. You get the full power of the ggplot2 grammar — facets, themes, scales — applied directly to your gating sets.

library(ggcyto)

# Overlay gate on a dot plot
autoplot(gs, gate = "Lymphocytes", x = "FSC-A", y = "SSC-A",
         bins = 64)

# Multi-panel: one panel per sample, CD3 (BV421-A) vs CD4 (PE-A)
# The data were already transformed in Step 5, so no extra logicle axis scale is applied
ggcyto(gs, aes(x = "BV421-A", y = "PE-A"), subset = "Lymphocytes") +
  geom_hex(bins = 64) +
  geom_gate("CD4 T cells") +
  geom_stats() +
  facet_wrap(~name) +
  theme_bw()

# Box plot comparing populations across groups
library(ggplot2)
ggplot(stats_wide, aes(x = group, y = `CD4 T cells`, fill = group)) +
  geom_boxplot(alpha = 0.7) +
  geom_jitter(width = 0.1) +
  labs(y = "CD4+ T cells (%)", x = NULL) +
  theme_bw()

Going Further: CytoFAST for High-Throughput Visualization

When you work with FlowSOM clustering results and need to visualize many clusters across many samples quickly, CytoFAST was built exactly for that. It reads FlowSOM output and produces heatmaps and summary plots in seconds — something that becomes critical when you're analyzing 50+ FCS files simultaneously.

# CytoFAST — rapid visualization of FlowSOM cluster results
BiocManager::install("cytofast")  # cytofast is distributed through Bioconductor
library(cytofast)

# Read FlowSOM cluster labels
cfList <- readCytof(
  ffList = ff_list,
  clust = "SOM_label",
  samples = "sampleID",
  ...
)

# Summary heatmap: clusters × markers
cytoHeat(cfList, key = "clust", legend = TRUE)

# Stacked bar chart: cluster frequencies per sample
cytoBar(cfList)

Recommended Workflow Summary

Step              Package                    Key function
Load FCS files    flowCore                   read.flowSet()
Quality control   flowAI / PeacoQC           flow_auto_qc()
Compensation      flowCore                   compensate()
Transformation    flowCore                   estimateLogicle()
Gating            flowWorkspace / openCyto   gs_pop_add(), gt_gating()
Statistics        flowWorkspace              gs_pop_get_stats()
Visualization     ggcyto                     ggcyto(), autoplot()
Clustering        FlowSOM / cytofast         FlowSOM(), cytoHeat()

Common Pitfalls and How to Avoid Them

Resources and Further Reading
