Giornata Milanese NGS

Milan meeting on “Next Generation Sequencing”

10 February 2023, 9:30

University of Milan-Bicocca, U4 Building
Aula Sironi (Edificio Tellus, U4 – 8) underground floor
Piazza della Scienza 4, 20126 Milan

The NGS Milano Meeting is an informal, day-long forum to discuss experiences, research lines and projects regarding the generation and analysis of Next Generation Sequencing data, both from the wet- and the dry-lab perspectives.
This edition will mainly focus on advances in Data Science and Artificial Intelligence for Cancer and Viral Genomics.


  • Marco Antoniotti (co-chair). Department of Informatics, Systems and Communication, University of Milan-Bicocca.
  • Gianluca Ascolani. Department of Informatics, Systems and Communication, University of Milan-Bicocca.
  • Alessia Donato. Department of Informatics, Systems and Communication, University of Milan-Bicocca.
  • Alex Graudenzi (co-chair). Department of Informatics, Systems and Communication, University of Milan-Bicocca.
  • Giancarlo Mauri. Department of Informatics, Systems and Communication, University of Milan-Bicocca.

Preliminary Speakers List

9:30 – 9:50      Francesco Ferrari, IFOM
9:50 – 10:10     Elena Bianchi, Politecnico di Milano, Milano
10:10 – 10:30    Francesca Buffa, Università Bocconi, Milano
11:00 – 11:20    Giovanni Tonon, IRCCS Ospedale San Raffaele, Milano
11:20 – 11:40    Davide Giacomini, TheProphetAI
11:40 – 12:00    Elvezia Paraboschi, Humanitas University, Milano
12:00 – 12:20    Andrea Tangherloni, Università Bocconi, Milano
12:20 – 12:40    Andrea Sottoriva, Human Technopole, Milano
14:00 – 14:20    Martina Di Trani, Humanitas Clinical and Research Center, Milano
14:20 – 14:40    Silvia Cascianelli, Politecnico di Milano, Milano
14:40 – 15:00    Silvia Spinelli, Università degli studi di Milano-Bicocca, Milano
15:00 – 15:20    Luca Denti, Università degli studi di Milano-Bicocca, Milano
15:20 – 15:40    Mattia Pelizzola, Istituto Italiano di Tecnologia, Milano

Registration and Coffee!!!

Participation is FREE, and so is the coffee. Participants will just be asked to provide contact information.
Please fill in the form to register your participation.

We are collecting your email for legitimate purposes and it will not be redistributed without your consent.


Talk Titles and Abstracts

9:30 – 9:50

SAMMY-seq: Biochemical properties of chromatin domains define their compartmentalization

Francesco Ferrari (IFOM)

We present a new experimental technique to map chromatin compartments starting from their distinct biochemical properties. Our approach is based on the rationale that chromatin domains located in the same 3D nuclear neighborhood are in a similar biochemical environment. From a single experiment we can extract information on the linear segmentation of euchromatic and heterochromatic genomic regions, as well as on their 3D segregation in active and inactive chromatin compartments.

9:50 – 10:10

A microfluidic platform for High-Throughput drug screening on Patient-Derived Organoids of cancer patients

Elena Bianchi (Politecnico di Milano, Milano)

Nowadays, cancer patients undergo several rounds of treatment, often without a concrete indication of efficacy. Many patients do not respond to the first-line therapy or become refractory to the original therapy. Patient-derived organoids (PDOs) have recently emerged as robust preclinical models with the potential to predict clinical outcomes in patients. We present a platform to perform in vitro high-throughput drug screening directly on patient-derived cells. We take advantage of microfluidic technology to obtain hundreds of PDOs cultured in single droplets, aiming to provide continuous flow and to resemble the in vivo pharmacokinetic profiles. The main goal of this tool is to better define the optimal chemotherapy treatment for each patient. In addition, the recovery of single cells and eluates from hundreds of single culture units opens the way to the generation of a large amount of data, available for genomic analyses of drug resistance and cancer evolution.

10:10 – 10:30

Genotype to phenotype: a multi-agent approach

Francesca Buffa (Università Bocconi, Milano)

A cell’s phenotype is the set of observable characteristics resulting from the interaction of the genotype with the surrounding environment, determining the function of an organism. Deciphering genotype-phenotype relationships can be challenging due to their complexity, but it has been crucial to understanding normal and disease biology. This has involved analysis of molecular pathways, from DNA to protein and function. Gene network representations have been an invaluable tool to tackle this complexity; however, they typically do not consider the physical microenvironment, which is a key determinant of phenotype. We present a novel modelling framework to study the link between genotype and cell behaviour in a three-dimensional microenvironment. To achieve this, we bring together multi-agent modelling, a powerful computational technique, and gene networks. This combination lends itself naturally to modelling a heterogeneous population of cells acting and evolving in a dynamic microenvironment. Importantly, this enables studying evolution, cell-cell interactions and the effect of co-occurring perturbations, such as mutations and environmental changes.

11:00 – 11:20

Sketching Open and Closed Chromatin, One Cell at a Time

Giovanni Tonon (IRCCS Ospedale San Raffaele, Milano)

Recent efforts have succeeded in surveying open chromatin at the single-cell level, but high-throughput, single-cell assessment of heterochromatin and its underlying genomic determinants remains challenging. We engineered a hybrid transposase including the chromodomain (CD) of the heterochromatin protein-1α (HP-1α), which is involved in heterochromatin assembly and maintenance through its binding to trimethylation of the lysine 9 on histone 3 (H3K9me3), and developed a single-cell method, single-cell genome and epigenome by transposases sequencing (scGET-seq), that, unlike single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq), comprehensively probes both open and closed chromatin and concomitantly records the underlying genomic sequences. We tested scGET-seq in cancer-derived organoids and human-derived xenograft (PDX) models and identified genetic events and plasticity-driven mechanisms contributing to cancer drug resistance. Next, building upon the differential enrichment of closed and open chromatin, we devised a method, Chromatin Velocity, that identifies the trajectories of epigenetic modifications at the single-cell level. Chromatin Velocity uncovered paths of epigenetic reorganization during stem cell reprogramming and identified key transcription factors driving these developmental processes. scGET-seq reveals the dynamics of genomic and epigenetic landscapes underlying any cellular processes.

11:20 – 11:40

TheProphetAI Innovation for Life Sciences

Davide Giacomini (TheProphetAI, Milan)

Recent advances in the field of machine learning have yielded new
tools that show promise for addressing complex problems in the life
sciences. The purpose of the paper is to discuss the potential
advantages of gene recommendation performed by artificial intelligence
(AI). Gene recommendation engines try to solve the following problem:

If the user is interested in a set of genes, which other genes are
likely to be related to the starting set and should be investigated?

A custom deep learning recommendation engine, DeepProphet2 (DP2), was
developed to complete this task. DP2 is available for use
by researchers globally and can be accessed through its website.
Hereafter, insights behind the algorithm and
its practical applications are illustrated.

The gene recommendation problem can be addressed by mapping the genes
to a metric space where a distance can be defined to represent the
real semantic distance between them. To achieve this objective a
transformer-based model has been trained on a well-curated freely
available paper corpus, PubMed. The paper provides a comprehensive
description of the neural network architecture utilized in the study,
including the training process. Multiple optimization procedures were
implemented to achieve the optimal balance between bias and variance.
The focus was on the impact of factors such as embedding size and
network depth. In the investigation, the performance of the model in
identifying sets of genes related to diseases and pathways was
evaluated using cross-validation. The evaluation was based on the
assumption that the network had no prior knowledge of pathways or
diseases, and that it learned gene similarities and interactions
solely through the training process. Furthermore, to gain a deeper
understanding of the gene representation learned by the neural
network, the dimensionality of the embeddings was reduced and the
results were mapped onto a lower-dimensional space that is easily
interpretable by humans. In conclusion, a set of use cases illustrates
the algorithm’s potential applications in a real-world setting.
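
As a rough illustration of the idea of recommending genes by distance in a metric space, the sketch below ranks candidate genes by their Euclidean distance from the centroid of a query set. It is not DP2: the gene names, the three-dimensional vectors and the function `recommend` are all hypothetical placeholders, whereas the real embeddings are learned by a transformer-based model.

```python
import math

# Hypothetical toy embeddings; real ones would be learned from the literature.
TOY_EMBEDDINGS = {
    "GENE_A": [1.0, 0.0, 0.0],
    "GENE_B": [0.9, 0.1, 0.0],
    "GENE_C": [0.8, 0.2, 0.1],
    "GENE_D": [0.0, 1.0, 0.0],
    "GENE_E": [0.0, 0.1, 1.0],
}


def recommend(embeddings, query, k=2):
    """Return the k genes closest to the centroid of the query set."""
    dim = len(next(iter(embeddings.values())))
    # Centroid of the query genes in the embedding space.
    centroid = [
        sum(embeddings[g][i] for g in query) / len(query) for i in range(dim)
    ]
    # Only genes outside the query set can be recommended.
    candidates = [g for g in embeddings if g not in query]
    # Smaller Euclidean distance means "more related" in this toy setting.
    candidates.sort(key=lambda g: math.dist(embeddings[g], centroid))
    return candidates[:k]


if __name__ == "__main__":
    print(recommend(TOY_EMBEDDINGS, ["GENE_A", "GENE_B"], k=2))
```

Using the centroid of the query set is only one possible choice; ranking by the minimum distance to any query gene would be an equally simple alternative.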

11:40 – 12:00

The circular RNA landscape in multiple sclerosis: disease-associated variants and exon methylation shape circular RNA expression profile

Elvezia Paraboschi (Humanitas University, Milano)

Circular RNAs (circRNAs) are a class of non-coding RNAs increasingly emerging as crucial actors in the pathogenesis of autoimmune and neurodegenerative disorders such as multiple sclerosis (MS). The mechanisms regulating circRNA expression are still largely unknown, and the circRNA profile and regulation in MS-relevant cell models have not been fully investigated. In this context, we aimed at exploring the global landscape of circRNA expression in MS and at evaluating its correlation with the genetic and epigenetic background. Our work defines a map of circRNA expression in immune cells of patients and suggests that disease-associated variants may tune the expression levels of circRNAs, acting as quantitative trait loci (circ-QTLs). Finally, we also propose a role for exon-based DNA methylation in regulating circRNA expression.

12:00 – 12:20

An overview of existing tools for the automated annotation of cell populations on scRNA-seq data

Andrea Tangherloni (Università Bocconi, Milano)

Single-cell sequencing opened a new era for transcriptomic and genomic research, allowing us to better understand cellular heterogeneity and dynamics, as well as the molecular processes governing the development of an organism and the onset of pathologies. In this context, cell-type annotation represents a crucial step in the analysis of sequencing data; however, it is still often performed manually by expert biologists, resulting in a time-consuming and partially subjective task. As an alternative, computational tools have recently been proposed for automatic cell-type identification. Such approaches exploit curated marker gene databases, correlation with reference expression data, and supervised or unsupervised Machine Learning methods. I will present an overview of the existing tools that can be used to automatically annotate cell populations on scRNA-seq data, highlighting the underlying approaches.
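
To make the marker-based family of approaches concrete, here is a minimal sketch that labels each cluster with the cell type whose marker genes have the highest mean expression. It does not reproduce any specific tool from the talk: the function `annotate`, the short marker lists and the expression values are illustrative assumptions, and real tools rely on curated databases and more robust scoring.

```python
# Small illustrative marker lists (commonly used markers, not a curated database).
MARKERS = {
    "T cell": ["CD3D", "CD3E"],
    "B cell": ["MS4A1", "CD79A"],
    "Monocyte": ["CD14", "LYZ"],
}

# Hypothetical mean expression per cluster (made-up values for the example).
CLUSTER_MEANS = {
    "cluster_0": {"CD3D": 2.1, "CD3E": 1.8, "MS4A1": 0.1, "CD14": 0.0},
    "cluster_1": {"MS4A1": 2.5, "CD79A": 2.0, "CD3E": 0.2},
    "cluster_2": {"CD14": 1.9, "LYZ": 3.0, "CD3D": 0.1},
}


def annotate(cluster_means, markers):
    """Assign to each cluster the cell type with the highest mean marker score."""
    labels = {}
    for cluster, expression in cluster_means.items():
        scores = {}
        for cell_type, genes in markers.items():
            # Genes missing from the cluster profile count as not expressed.
            total = sum(expression.get(gene, 0.0) for gene in genes)
            scores[cell_type] = total / len(genes)
        labels[cluster] = max(scores, key=scores.get)
    return labels


if __name__ == "__main__":
    for cluster, label in annotate(CLUSTER_MEANS, MARKERS).items():
        print(cluster, "->", label)
```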

12:20 – 12:40


Andrea Sottoriva (Human Technopole, Milano)


14:00 – 14:20

Next Generation Sequencing Meets Liquid Biopsy in Lymphoma

Martina Di Trani (Humanitas Clinical and Research Center, Milano)

Lymphomas are a heterogeneous group of tumors affecting the cells of the lymphoid system, in particular B lymphocytes. Tissue biopsy is the gold standard for diagnosis, allowing the identification of specific genetic profiles with prognostic and predictive value. In the era of personalized and precision medicine, the identification of specific genetic profiles is important for the correct management of the disease, as is the possibility of monitoring tumor evolution over time. In this context, the analysis of circulating tumor DNA (ctDNA) from liquid biopsies is increasingly becoming a subject of study thanks to the contribution of innovative Next Generation Sequencing and data analysis techniques. In the field of lymphomas, there is substantial scientific evidence that qualitative and quantitative features of ctDNA correlate with the most common clinical parameters, which suggests that ctDNA could become a valid biomarker and a potential surrogate for tissue biopsy. The evidence is encouraging, but in order to achieve this goal, improvements in the understanding of ctDNA biology, bioinformatics and Next Generation Sequencing technologies are necessary.

14:20 – 14:40

Machine Learning for Oncogenomics: a clinically-relevant single-sample multi-label subtyping approach for colorectal cancer patients

Silvia Cascianelli (Politecnico di Milano, Milano)

Tertiary analysis aims to make sense of NGS omics data to address complex biological questions and tasks of medical interest. With the advent of the Omics Data Science field, tertiary analysis can follow the lifecycle of a Data Science investigation, as we demonstrated in this application focused on colorectal cancer (CRC) patient subtyping. Transcriptional classification of CRC has already been used to stratify patients into molecular subtypes with distinct biological and clinical features, such as the CRC Intrinsic Subtypes (CRIS). Yet, CRIS are currently assigned using a dataset-level approach that cannot be applied to a single sample at a time, as clinical practice requires. Furthermore, it was still unclear whether these CRIS are mutually exclusive or potentially overlapping molecular/phenotypic states. Therefore, after showing that associating heterogeneous samples with multiple CRIS provides enhanced clinical/biological information, we developed a machine learning-based multi-label CRIS classifier. This classifier can better characterize a single CRC patient and improve predictions of prognosis and response to treatments by assigning more than one CRIS in case of molecular heterogeneity.
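
The difference between single-label and multi-label assignment for one sample can be sketched as follows. This is not the classifier presented in the talk: the per-subtype scores, the 0.5 threshold and the function `assign_subtypes` are hypothetical, and a simple thresholding rule stands in for the trained machine learning model.

```python
def assign_subtypes(scores, threshold=0.5):
    """Return every subtype whose score reaches the threshold.

    If no subtype reaches it, fall back to the single best-scoring subtype,
    so that each sample always receives at least one label.
    """
    selected = sorted(name for name, score in scores.items() if score >= threshold)
    if not selected:
        selected = [max(scores, key=scores.get)]
    return selected


if __name__ == "__main__":
    # Hypothetical per-subtype scores for one sample (made-up numbers).
    sample_scores = {
        "CRIS-A": 0.70,
        "CRIS-B": 0.10,
        "CRIS-C": 0.60,
        "CRIS-D": 0.05,
        "CRIS-E": 0.20,
    }
    # A single-label rule would keep only the top score; the multi-label rule
    # keeps every subtype above the threshold.
    print(assign_subtypes(sample_scores))
```

Because the rule works on the scores of one sample only, it can be applied to a single patient at a time, without access to the rest of a dataset.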

14:40 – 15:00

Spatial Transcriptomics at single-cell resolution

Silvia Spinelli (Università degli studi di Milano-Bicocca, Milano)

In the last couple of years there has been very intense development of spatial transcriptomics technologies, which could potentially change the way we do research in many areas. At the same time, standardization of spatial transcriptomics techniques is completely lacking, and only a few of them are actually available on the market, at a very high cost. Seq-Scope is a spatial transcriptomics technology published in 2021 by Chun-Seok Cho et al. (doi: 10.1016/j.cell.2021.05.010). It is based on Illumina NGS technology and allows the study of gene expression at cellular and subcellular resolution while preserving tissue and cell morphology information. This technology does not require specific instruments and has a significantly lower overall cost than other commercial spatial transcriptomics technologies. Therefore, given the potential of Seq-Scope, we are currently developing this technology in our laboratory, trying to further improve the original protocol by increasing the tissue deposition area, with the aim of applying it to the analysis of bone marrow biopsies from patients affected by hematological disorders.

15:00 – 15:20

SVDSS (Structural Variation Discovery from sample-specific strings): a new method for discovery of SVs from PacBio HiFi

Luca Denti (Università degli studi di Milano-Bicocca, Milano)

Structural variants (SVs) account for a large amount of sequence variability across genomes and play an important role in human genomics and precision medicine. Despite intense efforts over the years, the discovery of SVs in individuals remains challenging due to the diploid and highly repetitive structure of the human genome, and to the presence of SVs that vastly exceed sequencing read lengths. However, the recent introduction of low-error long-read sequencing technologies such as PacBio HiFi may finally enable these barriers to be overcome. Here we present SV discovery with sample-specific strings (SVDSS), a method for discovery of SVs from long-read sequencing technologies (for example, PacBio HiFi) that combines and effectively leverages mapping-free, mapping-based and assembly-based methodologies for overall superior SV discovery performance. Our experiments on several human samples show that SVDSS outperforms state-of-the-art mapping-based methods for discovery of insertion and deletion SVs in PacBio HiFi reads and achieves notable improvements in calling SVs in repetitive regions of the genome.

15:20 – 15:40

Profiling of RNA modifications through Nanopore direct RNA sequencing

Mattia Pelizzola (Istituto Italiano di Tecnologia, Milano)

More than 160 modifications decorate and markedly impact the fate of coding and non-coding RNA species. The profiling of native RNA through the Nanopore single-molecule sequencing platform is emerging as a powerful approach to jointly profile the transcriptome and the epitranscriptome. The benchmarking of various computational tools currently available for the identification of RNA modifications on Nanopore direct RNA sequencing data will be presented.