Publications

Below is an overview of the discovAIR publications.

SARS-CoV-2 entry factors are highly expressed in nasal epithelial cells together with innate immune genes

Waradon Sungnak1 ✉, Ni Huang1, Christophe Bécavin2, Marijn Berg3,4, Rachel Queen5, Monika Litvinukova1,6, Carlos Talavera-López1, Henrike Maatz6, Daniel Reichart7, Fotios Sampaziotis8,9,10, Kaylee B. Worlock11, Masahiro Yoshida11, Josephine L. Barnes11 and HCA Lung Biological Network* ✉

We investigated SARS-CoV-2 potential tropism by surveying expression of viral entry-associated genes in single-cell RNA-sequencing data from multiple tissues from healthy human donors. We co-detected these transcripts in specific respiratory, corneal and intestinal epithelial cells, potentially explaining the high efficiency of SARS-CoV-2 transmission. These genes are co-expressed in nasal epithelial cells with genes involved in innate immunity, highlighting the cells’ potential role in initial viral infection, spread and clearance. The study offers a useful resource for further lines of inquiry with valuable clinical samples from COVID-19 patients and we provide our data in a comprehensive, open and user-friendly fashion at www.covid19cellatlas.org.

1Wellcome Sanger Institute, Cambridge, UK. 2Université Côte d’Azur, CNRS, IPMC, Sophia-Antipolis, France. 3Department of Pathology and Medical Biology, University Medical Centre Groningen, University of Groningen, Groningen, the Netherlands. 4Groningen Research Institute for Asthma and COPD, University Medical Centre Groningen, University of Groningen, Groningen, the Netherlands. 5Bioinformatics Core Facility, Newcastle University Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle-upon-Tyne, UK. 6Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany. 7Department of Genetics, Harvard Medical School, Boston, MA, USA. 8Wellcome and MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK. 9Department of Medicine, Addenbrookes Hospital, Cambridge, UK. 10Cambridge Liver Unit, Cambridge University Hospitals, Cambridge, UK. 11UCL Respiratory, Division of Medicine, University College London, London, UK. *A list of authors and their affiliations appears at the end of the paper. ✉e-mail: ws4@sanger.ac.uk; lung@humancellatlas.org

Preprints

Benchmarking atlas-level data integration in single-cell genomics

Luecken MD(1), Büttner M(1), Chaichoompu K(1), Danese A(1), Interlandi M(2), Mueller MF(1), Strobl DC(1), Zappia L(1,3), Dugas M(2), Colomé-Tatché M(1,4,5*), Theis FJ(1,3,5*)

Cell atlases often include samples that span locations, labs, and conditions, leading to complex, nested batch effects in data. Thus, joint analysis of atlas datasets requires reliable data integration.

Choosing a data integration method is a challenge due to the difficulty of defining integration success. Here, we benchmark 38 method and preprocessing combinations on 77 batches of gene expression, chromatin accessibility, and simulation data from 23 publications, altogether representing >1.2 million cells distributed in nine atlas-level integration tasks. Our integration tasks span several common sources of variation such as individuals, species, and experimental labs. We evaluate methods according to scalability, usability, and their ability to remove batch effects while retaining biological variation.

Using 14 evaluation metrics, we find that highly variable gene selection improves the performance of data integration methods, whereas scaling pushes methods to prioritize batch removal over conservation of biological variation. Overall, BBKNN, Scanorama, and scVI perform well, particularly on complex integration tasks; Seurat v3 performs well on simpler tasks with distinct biological signals; and methods that prioritize batch removal perform best for ATAC-seq data integration. Our freely available, reproducible Python module can be used to identify optimal data integration methods for new data, benchmark new methods, and improve method development.

(1) Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany. (2) Institute of Medical Informatics, University of Münster, Münster, Germany. (3) Department of Mathematics, Technische Universität München, Garching bei München, Germany. (4) European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, Groningen, The Netherlands. (5) TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany. *Correspondence: maria.colome@helmholtz-muenchen.de; fabian.theis@helmholtz-muenchen.de

Integrated analyses of single-cell atlases reveal age, gender, and smoking status associations with cell type-specific expression of mediators of SARS-CoV-2 viral entry and highlights inflammatory programs in putative target cells

Christoph Muus*, Malte D. Luecken*, Gokcen Eraslan*, Avinash Waghray*, Graham Heimberg*, Lisa Sikkema*, Yoshihiko Kobayashi*, Eeshit Dhaval Vaishnav*, Ayshwarya Subramanian*, Christopher Smillie*, Karthik Jagadeesh*, Elizabeth Thu Duong*, Evgenij Fiskin*, Elena Torlai Triglia*, Meshal Ansari*, Peiwen Cai*, Brian Lin*, Justin Buchanan*, Sijia Chen*, Jian Shu*, Adam L Haber*, Hattie Chung*, Daniel T Montoro*, et al.
* These authors contributed equally

The COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, creates an urgent need for identifying molecular mechanisms that mediate viral entry, propagation, and tissue pathology. Cell membrane-bound angiotensin-converting enzyme 2 (ACE2) and associated proteases, transmembrane protease serine 2 (TMPRSS2) and Cathepsin L (CTSL), were previously identified as mediators of SARS-CoV-2 cellular entry. Here, we assess the cell type-specific RNA expression of ACE2, TMPRSS2, and CTSL through an integrated analysis of 107 single-cell and single-nucleus RNA-Seq studies, including 22 lung and airways datasets (16 unpublished), and 85 datasets from other diverse organs. Joint expression of ACE2 and the accessory proteases identifies specific subsets of respiratory epithelial cells as putative targets of viral infection in the nasal passages, airways, and alveoli. Cells that co-express ACE2 and proteases are also identified in other organs, some of which have been associated with COVID-19 transmission or pathology, including gut enterocytes, corneal epithelial cells, cardiomyocytes, heart pericytes, olfactory sustentacular cells, and renal epithelial cells. Performing the first meta-analyses of scRNA-seq studies, we analyzed 1,176,683 cells from 282 nasal, airway, and lung parenchyma samples from 164 donors spanning fetal, childhood, adult, and elderly age groups, and associated increased levels of ACE2, TMPRSS2, and CTSL in specific cell types with increasing age, male gender, and smoking, all of which are epidemiologically linked to COVID-19 susceptibility and outcomes. Notably, there was a particularly low expression of ACE2 in the few young pediatric samples in the analysis. Further analysis reveals a gene expression program shared by ACE2+TMPRSS2+ cells in nasal, lung and gut tissues, including genes that may mediate viral entry, subtend key immune functions, and mediate epithelial-macrophage cross-talk. Amongst these are IL6, its receptor and co-receptor, IL1R, TNF response pathways, and complement genes. Cell type specificity in the lung and airways and smoking effects were conserved in mice. Our analyses suggest that differences in the cell type-specific expression of mediators of SARS-CoV-2 viral entry may be responsible for aspects of COVID-19 epidemiology and clinical course, and point to putative molecular pathways involved in disease susceptibility and pathogenesis.

This project is funded by
Grant no. 874656
