Bioinformatics Core Facility

Overview  Head of the Core Facility  Core Facility Members  Publications 

The Bioinformatics Core Facility supports the analysis, interpretation and publication of NGS and other complex datasets.

The Bioinformatics Core Facility supports researchers with computing infrastructure, web services, software training, experimental design, biostatistics and data analysis. Genome-wide assays employing next-generation sequencing (NGS) and other omics methods have become indispensable analytical tools and our facility provides comprehensive support for such projects.

We help you with the computational processing, visualisation, interpretation and publication of high-throughput data generated in the course of your research projects. With an experienced multi-disciplinary team and state-of-the-art high-performance computing resources, our facility is ready to provide comprehensive support for your NGS experiments, biostatistical analysis, and much more.

We offer different levels of assistance depending on your project needs, ranging from basic bioinformatics and biostatistics services to full-scale scientific collaborations. Our facility provides customised solutions and long-term analytical support for data-intensive projects that require expert handling for optimal results.

Full services:

  • data quality assessment, processing, analysis, visualisation and interpretation
  • development of NGS pipelines and customising them for individual projects
  • mining of various omics datasets and correlation and integration of results
  • development of novel software tools and methods for specific analysis tasks
  • statistical analysis of complex data like microscopy images and development of tailored models

Assisted services:

  • individual user consulting on bioinformatics and biostatistics topics
  • assistance with preparing manuscripts, presentations and grant proposals
  • testing, implementing and customising computational tools and resources
  • management and submission of omics data to public repositories like GEO and SRA

Self services:

  • access to state-of-the-art high-performance computing (HPC) resources
  • maintaining a collection of software tools and pipelines for HPC users
  • providing user-friendly interfaces for data analysis (e.g. RStudio, Shiny Apps, JupyterHub)

Support services:

  • user training via lectures and workshops to facilitate data analysis and interpretation
  • assistance with planning and good experimental design of omics projects
  • supporting computational biologists and data analysts in research groups
  • consulting on practical mathematical and statistical topics

To use our services, please schedule an initial user meeting to discuss the details of your project and our user guidelines. 

Key instruments

Our services operate on a Linux compute cluster, which is used for analysing omics data, running machine-learning applications and hosting software services like RStudio, Shiny Apps and JupyterHub deployed as virtual machines or containers. HPC users can install their own software and access common bioinformatics tools and resources maintained by us. 

Our HPC cluster has a dedicated system administrator and consists of

  • 6 compute nodes with 368 CPU cores (736 threads) and 9 TB RAM 
  • 12 GPU cards (including 4 NVIDIA-L40) for running AI applications like AlphaFold
  • high-performance fileserver from NetApp with 710 TB net storage capacity

Further details and usage instructions are available on our Intranet

Lectures and courses 

Courses are run annually. Please click here for upcoming lectures and courses.

  • Design and Analysis of NGS Experiments
  • AI Methods and Novel LLM Tools in Biomedical Research
  • Introduction to Biostatistics
  • Introduction to R
  • Plotting with R
  • Introduction to RNA-seq
  • Introduction to ChIP-seq and Related NGS Assays
  • Data Analysis Using HPC and Nextflow

Recent developments

Our facility maintains repositories with software tools and pipelines for comprehensive omics data analysis, which are also used by bioinformaticians and data analysts in the research groups. An ongoing major modernization of our infrastructure includes a transition from Bpipe to Nextflow workflow management, containerising the entire software stack, enhancing NGS pipeline accessibility and usability, and implementing best practices using GitLab. 

Some of the bioinformatics tools developed at IMB have already been made available to the broader research community:

  • NGSpipe2go is a set of modular pipelines for NGS data processing and analysis
  • dupRadar is a Bioconductor package for RNA-seq PCR duplication rate assessment
  • PingPongPro is a tool for discovery of piRNA ping-pong signatures
  • rrvgo is a Bioconudctor package for reducing and visualising lists of Gene Ontology terms
  • breakinspectoR is an R package for finding CRISPR on- and off-target sites
  • GLOE-pipe is a dedicated analytical pipeline for DNA strand break detection via GLOE-seq

Our staff actively collaborate with researchers, as well as with our Genomics, Flow Cytometry and Microscopy Core Facilities, in implementing novel omics technologies, such as GLOE-seq, sBLISS, BreakTag, STARR-seq, (m)iCLIP, scRNA-seq, multiome and spatial transcriptome profiling. We have also been contributing to the development of data management policies at IMB, a research data exchange database for the CRC 1361, as well as user-friendly interfaces for various software tools, e.g. for running AlphaFold predictions and for local deployment of large language models (LLMs).

Acknowledgements/ Co-authorships

“An acknowledgement of the Core Facilities in your publication is important and more than just a nice way to say thank you. Your acknowledgement matters!”

Acknowledgements are the currency by which Core Facilities are measured: they help us prove our value and our contribution to research results.  Acknowledging Core Facilities shows that we are important partners in the scientific community, and it allows us to maintain funding and make further investment in the Core Facilities. In the case that Core Facility staff members have contributed significantly to your research project, we ask that you treat them the same way you would treat any other collaborator and consider a co-authorship

Specific examples of how to properly acknowledge our Core Facilities can be viewed on our Intranet. In accordance with DFG recommendations, we furthermore ask that you acknowledge any instrumentation that is funded through a DFG major instrumentation grant by including the corresponding project number. 

Communities

We network and exchange ideas with other computational groups and core facilities, and participate in professional communities like ISCB and the BioInfo-Core network. All facility members regularly attend meetings, conferences and workshops in order to keep up with the latest developments in bioinformatics, biostatistics and AI applications in biomedical research. The Facility provides bioinformatics and biostatistics expertise to the Collaborative Research Centre on “Regulation of DNA Repair and Genome Stability” (CRC1361) and the “Science of Healthy Ageing Research Programme” (SHARP).