Current version: public review, v0.3.beta

Current database version: 20_stable_20250821

Welcome to the Atlas of Neutrophil Biology

Specialized omics platform for neutrophil single-cell data analysis

🎨 Try Our Literal Canvas - Sketch Your Ideas!

Since we're called 'Neutrophil Canvas', we thought you'd enjoy an actual drawing canvas! Sketch research ideas or just doodle while exploring neutrophil data.

💡 Works on desktop and mobile devices

🗂️ Curated Neutrophil Data Collection

Hand-curated datasets from published studies with neutrophils carefully identified, subset, and annotated for optimal research quality.

High-quality single-cell RNA-seq data
Expert-defined metadata and annotations
Multiple disease models, species, and tissues
Harmonized data for cross-study comparisons

Focus directly on neutrophil states across various experimental conditions.

🎨 Interactive Analysis Canvas

Powerful interactive interface for visualizing gene expression, applying signatures, and investigating neutrophil diversity.

Real-time gene expression visualization
Custom gene signature scoring
Interactive UMAP with brush selection
Advanced filtering and plotting controls

All computations happen on the fly — no coding required!

🔬 Expert-Curated Quality Data

Access clean, focused datasets tailored specifically for neutrophil research.

Quality-controlled single-cell data
Comprehensive metadata annotations
Reliable foundation for research
Continuously updated database

⚙️ Neutrophil-Specialized Analytics

Analysis tools engineered specifically for neutrophil biology research.

Tailored analytical algorithms
Neutrophil-specific workflows
Reliable and relevant results
Enhanced biological insights

🔍 Flexible Population Exploration

In-depth exploration and comparison of different neutrophil populations.

Cell-to-cell variability analysis
Population dynamics insights
Pattern and trend discovery
Enhanced biological understanding

🆕 What's New in v0.3.beta

Welcome to v0.3.beta! This release focuses on enhanced user experience and improved dataset management.

🎨 User Interface Improvements

Modernized homepage with cleaner, more organized layout
Enhanced dataset selection with visual previews and better filtering
Improved navigation and visual hierarchy throughout the app

📊 Dataset & Analysis Enhancements

Major database expansion with additional high-quality neutrophil datasets
Expanded functional gene signature scoring options
Better marker detection with user-defined thresholds
Improved plot resizing and visualization responsiveness
Enhanced data integration and cross-study compatibility

📋 Documentation & Support

Added comprehensive changelog with version history
Enhanced documentation pages with better organization
Improved user guidance and onboarding experience

📋 See full details: View the Complete Changelog for version history and technical details.

Analysis page front

Documentation

Current Available Datasets

Selection indicators

Select Paper/Dataset

Modify Base Object

=== Recalculation Parameters === Re-running is recommended after selecting a different population

PCA: Randomize SVD

Do not randomize, ultrafast

Randomize with random seed

Randomize with fixed seed

PCA: Seed (1-100000)

Batch correction

Without batch correction

Include batch correction, slow

Batch correct on:

PCA: components

UMAP: neighbours

UMAP: min distance

Cluster: resolution

Download Current Object in RDS format

Master UMAP Plot

=== Graphic Options === Not affecting Base Object (reversible)

Color by

=== Subset by metadata ===

Subset by

Select on

Apply current selection into Base Object

Single Gene on UMAP

Plot by

Type to search

Plot on

Single Gene on Violin

Plot by

Type to search

Plot on

Group by

Colour by

Calculate New Signature

Calculate on

Download full results

The calculation ranks all genes in each group against all other selected groups. However, the heatmap shows only the automatically filtered significant signatures. Because some datasets are complex, we recommend downloading the full results for further filtering.

Whole level detection
Filter by one group

This page filters markers using the same significance thresholds for all groups. Use the following parameters to focus on more dominant markers.

AUC threshold

The AUC threshold restricts how likely a marker is to be differentially expressed in one group compared with another.

Median logFC threshold

The median logFC threshold restricts the fold change.

This page filters significant markers for one group, including both highly and lowly expressed markers, and plots their expression in all other groups.

Filter by

Sorted by (deprecated)

How many markers?

Download selections

Calculate gene set enrichment

Functional Analysis

Upload signature from a plain text file

Choose TXT file

Browse...

Please put all genes on one line, delimited by a single space.
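A minimal sketch of how a signature file in this format could be read and matched against a dataset. The function names `parse_signature` and `match_genes`, the gene symbols, and the dataset gene list are illustrative assumptions, not the app's actual code.

```python
def parse_signature(text: str) -> list[str]:
    """Parse a one-line, space-delimited gene signature into unique symbols."""
    genes = []
    seen = set()
    # split() with no argument tolerates repeated spaces and a trailing newline.
    for symbol in text.split():
        if symbol not in seen:
            seen.add(symbol)
            genes.append(symbol)
    return genes


def match_genes(signature: list[str], dataset_genes: set[str]) -> tuple[list[str], list[str]]:
    """Split a signature into genes found in the dataset and genes that are missing."""
    found = [g for g in signature if g in dataset_genes]
    missing = [g for g in signature if g not in dataset_genes]
    return found, missing


# Hypothetical file content and dataset gene list, for illustration only.
signature = parse_signature("S100a8 S100a9 Mmp9 Cxcr2 S100a8\n")
found, missing = match_genes(signature, {"S100a8", "S100a9", "Cxcr2", "Ly6g"})
print(found)    # ['S100a8', 'S100a9', 'Cxcr2']
print(missing)  # ['Mmp9']
```

Gene symbols are matched exactly here, so capitalization has to follow the convention of the dataset's species.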

We welcome your signature contributions!

=== The following genes were found in this dataset ===

Choose algorithm

Download results

Group by

Colour by

Basic Workflow

1. Load dataset from Atlas Collection

2. Subset Cell Populations

3. Re-calculate UMAP

4. Choose analysis

Neutrophil Canvas Changelog

Track the evolution of Neutrophil Canvas with detailed version history, new features, and improvements.

Version v0.3.beta (Current)

Release Date: 2025-08-11

Database Version: 20_stable_20250821

🆕 New Features

Comprehensive changelog system with detailed version tracking
Modernized homepage with organized, collapsible content sections
Enhanced dataset selection interface with improved visual previews
Expanded functional gene signature scoring options
Significantly expanded neutrophil dataset collection with additional published studies

🔧 Improvements

Cleaner, more readable box layouts throughout the application
Better visual hierarchy and user interface organization
Improved plot resizing and responsiveness across different screen sizes
Enhanced navigation and user guidance throughout the platform

🐛 Bug Fixes

Resolved marker detection threshold configuration issues
Fixed various UI layout and styling inconsistencies
Improved stability of gene signature analysis workflows

📊 Database & Documentation Updates

Upgraded to database version 20_stable_20250821 with significantly expanded dataset collection
Added 12 high-quality published neutrophil studies from recent literature to the Atlas Collection
Enhanced data quality control and metadata standardization across all datasets
Improved cross-study data integration and harmonization protocols
Expanded coverage of neutrophil biology across different disease states and experimental conditions
Enhanced documentation organization with better categorization and improved user onboarding experience

📸 Visual updates and screenshots will be available here in future releases.

Version v0.2.beta

Release Date: 2025-05-05

Database Version: 12_stable_20250505

🆕 New Features

Enhanced user interface with improved navigation
Streamlined dataset selection with visual previews
Advanced gene signature analysis tools
Interactive UMAP visualization with brush selection
Comprehensive documentation system

🔧 Improvements

Optimized performance for large datasets
Enhanced error handling and user feedback
Improved responsive design for different screen sizes
Better integration between analysis modules

🐛 Bug Fixes

Fixed session management issues
Resolved computation conflicts between users
Improved stability of marker calculations

📊 Database Updates

Updated to stable database version 12 with improved data quality and expanded neutrophil datasets.

Version v0.1.alpha

Release Date: 2025-03-15

Database Version: 11_beta_20250315

🎉 Initial Release

Basic neutrophil data visualization
Simple gene expression analysis
Core UMAP plotting functionality
Basic marker detection algorithms

🏗️ Foundation Features

Established data pipeline architecture
Initial user interface framework
Basic single-cell analysis workflows
Prototype gene signature scoring

📸 Legacy screenshots and documentation available upon request.

Contributing & Feedback

📝 How to Report Issues

Found a bug or have a feature request? We'd love to hear from you!

Email: fzhang@uni-muenster.de
Include your browser information and steps to reproduce any issues
Screenshots are always helpful for UI-related problems

🎯 Roadmap

Future development priorities:

Enhanced visualization tools and customization options
Integration with additional single-cell analysis packages
Improved export and sharing capabilities
Mobile-responsive design improvements
Advanced statistical analysis modules

Thank you for using Neutrophil Canvas! Your feedback drives our development.

What is PCA?

PCA (Principal Component Analysis) is a statistical method used to reduce the dimensionality of data while preserving as much variability (information) as possible.

In simpler terms, PCA:

- Takes data with many features (e.g., thousands of genes)

- Finds new summary axes (called principal components, or PCs)

- Orders them so that each PC captures as much of the remaining variation in the data as possible

- Lets you plot or analyze the data using just a few of these PCs (often 2 or 3)

Why use PCA?

- To visualize high-dimensional data (e.g., scRNA-seq) in 2D or 3D

- To denoise data and focus on the main patterns

- As a preprocessing step for clustering or other dimensionality reduction (e.g., UMAP)

Key PCA Concepts

- Principal Components (PCs): New axes formed from linear combinations of original features (e.g., genes). PC1 captures the most variance, PC2 the second most, and so on.

- Scores: The projection of your samples (e.g., cells) on the PCs. Used for visualization.

- SVD: Singular Value Decomposition, a mathematical technique used to decompose a matrix into a product of three matrices: U, Σ, and V. It's a fundamental tool in PCA.

- Seed: A number used to initialize the random number generator when the SVD is randomized. Using a fixed seed makes the results reproducible.
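The concepts above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the app's implementation: it uses power iteration as a simple stand-in for SVD, the toy expression values are made up, and the helper names `center`, `covariance`, and `first_pc` are hypothetical.

```python
import random


def center(rows):
    """Subtract each feature's mean so PCA describes variation around the mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Feature-by-feature sample covariance matrix of centered data."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)] for i in range(p)]


def first_pc(cov, seed, iterations=200):
    """Power iteration: finds the top principal component from a random start."""
    rng = random.Random(seed)  # fixed seed -> same starting vector -> reproducible result
    v = [rng.gauss(0, 1) for _ in cov]
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


# Toy data: 5 cells x 2 genes (made-up values).
cells = [[2.0, 1.9], [0.5, 0.7], [3.1, 3.0], [1.2, 1.0], [2.6, 2.4]]
centered = center(cells)
pc1 = first_pc(covariance(centered), seed=42)
# Scores: projection of each cell onto PC1.
scores = [sum(x * w for x, w in zip(row, pc1)) for row in centered]
print(pc1)
print(scores)
```

Because the two toy genes rise and fall together, PC1 weights both genes with the same sign, and a single score per cell captures most of the variation.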

What is a batch effect?

In single-cell data, batch effects can arise due to technical variation across different samples, experiments, or sequencing runs — not true biological differences. Batch correction is the process of adjusting for these effects to better compare cells across conditions.

Examples of batch effects:

- Different donors or patients

- Time points or replicates

- Library preparation batches

- Species or sequencing platforms

How does this app correct batch effects?

Neutrophil Canvas gives you full control over batch correction.

You decide:

- Whether to apply correction or not

- Which metadata variable(s) to correct out (e.g., patient ID, species, tissue)

This flexibility allows you to:

- Correct for known batch effects

- Explore the data without correction

- Compare the effect of different correction settings

Batch Correction Recommendations

Batch correction removes variation explained by the selected factor — this may include true biological signal if misused.

If your goal is to compare disease vs control, do not correct for disease status.

If your goal is to compare across patients, correcting for patient might make sense.
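The warning above can be illustrated with a toy example. The sketch below uses per-level mean centering as a deliberately simple stand-in for batch correction; the app's actual correction algorithm is not specified here, and the expression values and function names are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def center_by_factor(values, labels):
    """Remove the variation explained by a factor by subtracting each level's mean."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    level_means = {label: mean(vals) for label, vals in groups.items()}
    return [value - level_means[label] for value, label in zip(values, labels)]


def group_difference(values, labels, a, b):
    """Difference in mean expression between two levels of a factor."""
    mean_a = mean(v for v, lab in zip(values, labels) if lab == a)
    mean_b = mean(v for v, lab in zip(values, labels) if lab == b)
    return mean_a - mean_b


# Made-up expression of one gene in six cells.
expression = [5.0, 5.4, 5.2, 2.0, 2.2, 1.8]
status = ["disease", "disease", "disease", "control", "control", "control"]

before = group_difference(expression, status, "disease", "control")
corrected = center_by_factor(expression, status)
after = group_difference(corrected, status, "disease", "control")
print(round(before, 6), round(after, 6))
```

Correcting on disease status removes the disease vs control difference entirely in this toy case, which is why the factor you want to compare should not be the factor you correct on.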

What is UMAP?

UMAP (Uniform Manifold Approximation and Projection) is a nonlinear dimensionality reduction method used to create 2D or 3D plots from high-dimensional data like single-cell gene expression.

It works by finding a low-dimensional embedding (a mathematical representation) of the high-dimensional data that preserves the local structure of the data.

UMAP is particularly useful for visualizing complex, high-dimensional datasets in a more interpretable 2D or 3D space.

What Does UMAP Show?

- Each point = a single cell

- Cells closer together = more similar gene expression profiles

- Clusters = groups of similar cells (often representing a state or type)

- UMAP preserves local structure (neighbors) well and retains some global relationships

This makes UMAP especially useful for:

- Identifying cell subpopulations

- Comparing across conditions or treatments

- Finding activation gradients or rare cell types

What Does the App Do?

In Neutrophil Canvas, UMAP is automatically computed based on PCA.

Besides coloring cells by metadata, gene expression, or pathway scores, you also have access to:

- Interactive selection tools

- Download of the UMAP embedding matrix

Technical Notes

- UMAP uses nearest-neighbor graphs built from PCA space.

- The layout is stochastic — repeated runs may look slightly different unless the seed is fixed.

- UMAP is not quantitative — distances on the plot should not be interpreted as exact differences.

- UMAP is a visualization tool — not a clustering or scoring method. Use it to explore the data visually, but interpret clusters using metadata and gene signatures, not UMAP shape alone.
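To make the first technical note concrete, here is a minimal sketch of a nearest-neighbour graph built from PCA scores. It uses an exact brute-force search with Euclidean distance, which is an assumption for illustration; UMAP implementations typically use faster approximate searches. The scores and the function name `knn_graph` are made up.

```python
import math


def knn_graph(points, k):
    """Return, for each point, the indices of its k nearest neighbours (Euclidean)."""
    graph = []
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with that point's index.
        distances = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        distances.sort()
        graph.append([j for _, j in distances[:k]])
    return graph


# Made-up PCA scores (2 PCs) for six cells forming two loose groups.
pca_scores = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
neighbours = knn_graph(pca_scores, k=2)
print(neighbours)
```

The `k` argument plays the role of the "UMAP: neighbours" setting: a larger value links each cell to more of its surroundings, which tends to emphasize broader structure over fine local detail.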

What is Clustering?

Clustering is the process of grouping cells into clusters based on their similarity in gene expression profiles.

This helps identify distinct cell populations or states that share similar characteristics.

How Does the App Cluster?

In Neutrophil Canvas, clustering is performed automatically based on the PCA-reduced space (with or without batch correction, depending on your setting).

You can:

- Choose which resolution to use (higher = more clusters)

- View clusters on UMAP or directly use them as filters for population comparisons

Important Notes

Clustering is unsupervised and can be sensitive to parameter changes (e.g., resolution, number of PCs).

Not every cluster corresponds to a biologically meaningful group — always validate with known markers or metadata.

What is Being Tested?

The app performs differential signature scoring to highlight genes or pathways that distinguish one cell group from others. This is similar to differential gene expression analysis, but optimized for single-cell resolution.

In bulk RNA-seq, methods like DESeq2, edgeR, or limma are used to account for count distributions and replicates. But in single-cell RNA-seq, the unit of observation is the individual cell, and variance from group size or sparsity becomes more prominent.

To reduce bias from group size imbalance and scaling effects, we provide three metrics that are better suited for single-cell data:

1. Cohen's d on the log-fold change (logFC.cohen): similar to a t-test, but it ignores the unwanted effect of highly variable group sizes.

2. AUC (area under the ROC curve): related to the Wilcoxon rank-sum test, also known as the Mann–Whitney U test. It is scale-invariant and less sensitive to the variance within each group than Cohen's d.

3. Log-fold change in detection (logFC.detected): in scran, this compares the proportion of cells with at least 1 count of the gene between groups and ignores the magnitude of expression. It might perform better when a strictly binary design is given.
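The first two metrics can be sketched for a single gene and a single pair of groups. This is an illustration only: the Cohen's d below scales the difference in means by the square root of the averaged group variances, which is one common form and an assumption here, and the expression values are made up. The function names are hypothetical and do not mirror scran's internals.

```python
from statistics import mean, variance


def cohens_d(x, y):
    """Difference in means scaled by the square root of the averaged group variances."""
    # Averaging the two variances without weighting by group size keeps
    # a large group from dominating the scale.
    pooled_sd = ((variance(x) + variance(y)) / 2) ** 0.5
    return (mean(x) - mean(y)) / pooled_sd


def auc(x, y):
    """Probability that a random cell from x has higher expression than one from y."""
    # Ties count as half a win, as in the Mann-Whitney U statistic.
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    return wins / (len(x) * len(y))


# Made-up log-expression values of one gene in two groups of unequal size.
group_a = [2.1, 2.5, 1.9, 2.8]
group_b = [0.4, 1.0, 0.0, 2.0, 0.7, 0.2]

print(round(cohens_d(group_a, group_b), 3))
print(round(auc(group_a, group_b), 3))
```

An AUC of 0.5 means the gene does not separate the groups, while values near 1 or 0 mean it is consistently higher or lower in the first group. Because the AUC depends only on the ordering of values, multiplying all expression values by a positive constant leaves it unchanged.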

How Are Groups Compared?

The app uses the scoreMarkers function from the Bioconductor scran package.

Internally, it summarizes, for each selected group, every pairwise comparison against the other selected groups.

Suppose the dataset contains the groups A, B, and C.

The default choice calculates: 1) (B+C)/2 vs A; 2) (A+C)/2 vs B; 3) (A+B)/2 vs C.

The centralized plot helps you focus on one group by showing its most positively ranked and most negatively ranked markers.

If you are specifically interested in A vs B, select only A and B in the calculation step.

You may also use your new clustering results (e.g., more biologically meaningful clusters) as the groups for comparison.
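The A, B, C example can be sketched as follows. This is a simplified illustration, not scran's code: it uses the difference in means as the pairwise effect and the mean as the per-group summary (an assumption based on the "(B+C)/2" notation above), and the values and function names are made up.

```python
from itertools import permutations
from statistics import mean


def pairwise_effects(groups, effect):
    """Compute an effect size for every ordered pair of distinct groups."""
    return {(a, b): effect(groups[a], groups[b]) for a, b in permutations(groups, 2)}


def summarize(pairwise, group):
    """Average the effects of one group against each of the other groups."""
    return mean(value for (a, _), value in pairwise.items() if a == group)


def mean_difference(x, y):
    """Toy effect size: difference in mean expression."""
    return mean(x) - mean(y)


# Made-up expression of one gene in three groups.
groups = {"A": [3.0, 3.2, 2.8], "B": [1.0, 1.2, 0.8], "C": [2.0, 2.1, 1.9]}

effects = pairwise_effects(groups, mean_difference)
summary = {name: summarize(effects, name) for name in groups}
print({name: round(value, 6) for name, value in summary.items()})

# Restricting the calculation to A and B gives the direct A vs B comparison.
only_ab = pairwise_effects({k: groups[k] for k in ("A", "B")}, mean_difference)
print(round(summarize(only_ab, "A"), 6))
```

With all three groups, the score for A blends A vs B and A vs C, so a gene that separates A from B but not from C is diluted. Selecting only A and B removes that dilution.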

Q&A

Collecting questions...

Known Issues

No known issues.

About Me

- Developed by Dr. rer. nat. Fengjun Zhang

Email: fzhang@uni-muenster.de

UniMS mattermost: @fzhang

X.com: @fengjun_zhang