For Big Data Analytics, efficient tools for storing, processing and analyzing Big Data on powerful supercomputers are necessary. This course is divided into two parts. In the second part, learners will apply skills acquired from the first part to advance their knowledge of Machine and Deep Learning applied to scientific research and related topics. The course will involve the necessary theoretical lectures, as well as hands-on and lab sessions. The course is generally geared towards efficient use of HPC resources for Big Data Analytics. The topics covered include Supervised and Unsupervised Machine Learning, and Convolutional and Recurrent Neural Networks (CNNs and RNNs).
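The flavour of the unsupervised-learning material can be sketched with a minimal k-means clustering implementation in numpy. This is an illustrative toy, not course material; the data, cluster count and iteration count are made up for the example:

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Minimal k-means: assign points to the nearest centroid, then update centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance of every point to every centroid, shape (n, k).
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # skip empty clusters
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated blobs: k-means should recover the grouping.
X = np.vstack([np.zeros((5, 2)), np.ones((5, 2)) * 10])
labels, centroids = kmeans(X, k=2)
```

Each iteration alternates an assignment step (nearest centroid) with an update step (cluster mean), which is the standard Lloyd iteration taught in introductory ML courses.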
CUDA is a widely used programming environment for GPUs. This course introduces hardware and parallelization concepts for GPUs. The CUDA programming environment is described in detail, both for C and Fortran, including the language elements for controlling processor parallelism and for accessing the various levels of memory. The use of GPU-accelerated libraries (cuBLAS, cuFFT) is demonstrated. All topics are explained by means of examples in practical exercises.
NVIDIA and the Max Planck Computing and Data Facility (MPCDF) are hosting a GPU Bootcamp open to all researchers and code developers of the Max Planck Society (MPG). During this online Bootcamp, participants will learn how to apply AI tools, techniques, and algorithms to real-life problems. You’ll study the key concepts of Deep Neural Networks, how to build Deep Learning models, and how to measure and improve the accuracy of your models. You’ll also learn essential data preprocessing techniques to ensure a robust machine learning pipeline. This online Bootcamp is a hands-on learning experience where you’ll be guided by step-by-step instructions with mentors on hand to help throughout the process.
LOCATION: online via Zoom
The most important parallelization constructs of MPI are explained and applied in hands-on exercises. The parallelization of algorithms is demonstrated using simple examples; their implementation as MPI programs will be studied in practical exercises. Topics: Fundamentals of parallel processing (computer architectures and programming models), Introduction to the Message Passing Interface (MPI), The main language constructs of MPI-1 and MPI-2 (Point-to-point communication, Collective communication incl. synchronization, Parallel operations, Data structures, Parallel I/O, Process management), Demonstration and practical exercises with Fortran, C and Python source codes for all topics; Practice parallelizing sample programs; Analysis and optimization of parallel efficiency.
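The data-decomposition pattern at the heart of such exercises can be sketched in Python with mpi4py (one of the three course languages). The helper name and problem size are made up for the illustration; each rank works on its own slice and a collective reduction combines the results:

```python
def partition(n, size, rank):
    """Split n work items across `size` ranks; return the (start, stop) slice for `rank`."""
    base, rest = divmod(n, size)
    start = rank * base + min(rank, rest)
    stop = start + base + (1 if rank < rest else 0)
    return start, stop

def main():
    # Requires mpi4py; launch with e.g.: mpirun -n 4 python demo.py
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    start, stop = partition(1000, size, rank)
    local_sum = sum(range(start, stop))                 # point of the exercise: local work
    total = comm.reduce(local_sum, op=MPI.SUM, root=0)  # collective communication
    if rank == 0:
        print("sum of 0..999 =", total)                 # 499500 for any number of ranks

if __name__ == "__main__":
    try:
        main()
    except ImportError:
        print("mpi4py not available; partition() can still be used serially")
```

The same decomposition idea carries over directly to the Fortran and C variants covered in the course.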
This course is intended to provide a smooth entry into the world of High Performance Computing. We will show the most important Linux commands, how to log onto the cluster, and how to compile and install software, and give examples of how to efficiently use the compute resources.
Software literacy has become a key competence for scientists across all disciplines. Scientists use software daily, and software development is becoming an increasingly important component of scientific productivity. However, the software needed for certain research projects can become highly complex and take up resources otherwise needed for core research. In response to the demand for professionalization of software development in research, the specialized role of the Research Software Engineer has emerged in recent years. With their help, researchers tackle challenges in the areas of software and data, such as reproducibility, correctness, user-friendliness, performance, and maintenance. Our two-day workshop provides new opportunities for learning about best practices in scientific software development, such as
Seeing recent flagship projects in action.
Discussing software licensing and intellectual property issues.
Discovering new ways to make your software known and recognized.
We invite all interested scientists, research software engineers, IT and computing specialists, and individuals involved in creating, using or otherwise dealing with research software in the Max Planck Society. We also welcome participants from other research institutions.
Registration deadline: 25 April 2022.
How to start with the MPCDF systems and services: This introductory course helps new users with their first steps on the MPCDF systems. It covers login, file systems, HPC systems, Slurm, and the MPCDF services remote visualization, Jupyter notebooks and DataShare. Basic knowledge of Linux is required.
DATE: Apr 28, 2022, TIME: 14:00 - 16:30 (online, via Zoom)
Details and registration: https://www.mpcdf.mpg.de/about-mpcdf/news-events/mpcdf-introductory-user-course
As part of the (virtual) fortnightly Göttingen HPC Coffee meeting, a short introduction to Dask is on offer.
Details: 11:00 online, via Big Blue Button
This practical course comprises two parts. The first part is a crash course on the basics of High-Performance Computing, where you’ll get hands-on experience. It covers theoretical knowledge of parallel computing, high-performance computing, supercomputers, and the development and performance analysis of parallel applications using MPI and OpenMP. In the second part, you will team up in groups of two, choose a non-trivial problem, create a sequential solution, parallelize it, and analyze the scalability of the application.
If you are just interested in learning about parallel programming and don’t need credits, you can join only the first part of the course, from April 25th to 29th, and gain a certificate.
For additional information, check the details.
The deadline for registration is April 10th.
The GWDG will give an introduction to container usage in HPC for users. This workshop will be an interesting excursion into the field of containers in HPC. The main part will be the hands-on session, where participants will get course accounts for our SCC cluster and use JupyterHub on HPC. In Jupyter you will run prepared Jupyter notebooks with Singularity, which you can access with your browser. You can also use the materials afterwards as a starting point for your own workflows with Singularity containers.
Second part of the course.
Building on “Python for Computational Science”, an introduction to
compiling and linking code
basic revision control with git (managing your files)
using High Performance Computing installations (Slurm and modules)
A one-week course that does not assume prior Python knowledge - a good starting point to get into programming and computation for scientific research.
Teaching materials: https://www.desy.de/~fangohr/teaching/py4cs2022/
This MPCDF workshop helps HPC users to better manage, debug and profile their code; one day is dedicated to GPUs. MPCDF is again organizing an advanced HPC workshop for users of the MPG from Monday, November 22nd until Wednesday, November 24th, 2021, with an optional hands-on day on Thursday, November 25th. The workshop will be given online. The main topics of the lectures are:
Software engineering for HPC codes (git, gitlab, CI, testing)
Debugging and profiling of CPU and GPU codes
Porting codes to GPU-accelerated systems
NVIDIA and the Max Planck Computing and Data Facility (MPCDF) are hosting a GPU Bootcamp open to all researchers and code developers of the Max Planck Society (MPG). During this two-day online Bootcamp, participants will learn about multiple GPU programming models and can choose the one that best fits their needs to run their scientific codes on GPUs (like those in the HPC system Raven). This Bootcamp will cover an introduction to GPU programming using OpenACC, OpenMP, stdpar and CUDA C, and provides hands-on opportunities to learn how to analyse GPU-enabled applications using NVIDIA Nsight Systems.
For details and registration please visit https://www.mpcdf.mpg.de/about-mpcdf/news-events/gpu-bootcamp
The course provides a basic introduction to the compute and data services available at MPCDF. It includes about two hours of tutorial sessions, an interactive chat option and a concluding Q&A session, and is intended specifically to lower the bar for first-time usage of MPCDF services. Major topics include an overview and practical hints for connecting to the HPC compute and storage facilities and using them via the Slurm batch system. Basic knowledge about working in a Linux environment is a prerequisite. An MPCDF account is not needed. This course is offered twice per year, with one edition in spring and another in autumn.
For details and registration please visit https://www.mpcdf.mpg.de/about-mpcdf/news-events/mpcdf-introductory-user-course
This workshop introduces the use of Python for High Performance Computing. Main topics are HPC packages such as numpy and scipy, writing parallel code with Python, and speeding up Python code with Cython or interfaces to compiled languages. Building Python packages and documentation, as well as good coding style, are further topics of this three-day workshop.
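A classic first lesson in such a workshop is that pushing loops into numpy’s compiled routines pays off. A minimal sketch (function names and array size chosen arbitrarily for the illustration):

```python
import numpy as np

def norm_loop(x):
    """Pure-Python loop: one interpreter round trip per element."""
    s = 0.0
    for v in x:
        s += v * v
    return s ** 0.5

def norm_vectorized(x):
    """Same computation pushed into numpy's compiled inner loop."""
    return float(np.sqrt(np.dot(x, x)))

x = np.linspace(0.0, 1.0, 100_000)
# Both give the same result; the vectorized version is typically one to two
# orders of magnitude faster for arrays of this size.
assert abs(norm_loop(x) - norm_vectorized(x)) < 1e-6
```

Cython or compiled-language interfaces, covered later in the workshop, address the cases where no ready-made numpy routine exists.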
See https://www.mpcdf.mpg.de/events/28729/2825 for details and registration
Presentation (pdf) Hans Fangohr at (Day 2, Research software, 13:30)
See https://nfdi4ing.de/konferenz/ for details.
Research software has become an essential tool in modern science. Despite this, scientists often have only basic training in programming, missing important aspects of modern software development and engineering. In this course we will present key ideas, tools and techniques used by researchers to develop robust, efficient codes in a collaborative environment. This will be done in the context of the Octopus code (https://octopus-code.org). The philosophy of the Octopus code is to be a platform that allows new scientific ideas to be implemented with relative ease. The code has a modular structure and to a large extent hides the numerical details at a lower level, allowing researchers to write new modules without needing to touch those low-level parts. The code is actively developed with strict quality control. In this second part of the course, the participants will learn everything necessary to contribute to the Octopus development and to implement new scientific ideas.
Advanced version control in a collaborative environment (merge requests, code review, etc)
Advanced topics of git (merge, rebase, etc)
Regression testing and continuous integration
Object oriented programming
Parallelization and performance
As an outlook, this course also provides an overview of current state-of-the-art scientific problems addressed with Octopus. The second part will last for one week, with a mixture of lectures and demonstrations, as well as hands-on sessions to try out the concepts learned.
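The regression-testing idea from the topic list above can be illustrated generically in Python (a toy sketch with a made-up observable, not actual Octopus code):

```python
def total_energy(occupations, eigenvalues):
    """Hypothetical observable: band-structure energy as a sum over occupied states."""
    return sum(f * e for f, e in zip(occupations, eigenvalues))

def test_total_energy_regression():
    # The reference value was computed once with a trusted version of the code;
    # the test fails if a later change alters the result beyond the tolerance.
    reference = -2.75
    result = total_energy([2.0, 2.0, 0.0], [-1.0, -0.375, 0.5])
    assert abs(result - reference) < 1e-12

test_total_energy_regression()
```

Run automatically on every merge request by continuous integration, such tests catch unintended changes in physical results before they reach the main branch.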
Details at https://www.mpsd.mpg.de/events/27629/500135
The BiGmax Summer School 2021 “Harnessing big data in materials science from theory to experiment” will take place from September 13 - 17, 2021 (held as an online event only).
The school focuses on combining lectures of renowned experts with hands-on tutorials predominantly targeted towards PhD students and early career researchers.
Density functional theory (DFT) and its time-dependent variant, time-dependent DFT (TD-DFT), are tools of choice to simulate microscopic processes in nature. This, however, requires powerful numerical tools to solve the underlying equations and perform simulations of relevant physical processes. In this course we will give an introduction to how the DFT and TD-DFT equations can be solved numerically on a computer, with plenty of practical examples using the Octopus code (https://octopus-code.org). Octopus is a real-space DFT code, geared mainly towards the real-time propagation of time-dependent systems. Besides introducing the code and highlighting its functionalities, a set of hands-on tutorials will allow the students to learn how to set up a system and run ground-state and time-dependent calculations.
Ground-state calculations and total energy convergence
Throughout the course, it will be emphasized how to check the results for numerical convergence, but also for computational efficiency.
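The real-space-grid idea behind codes like Octopus can be sketched on a toy problem: a finite-difference Hamiltonian for a particle in a 1D box, diagonalized with numpy. Grid size and units are illustrative assumptions and unrelated to Octopus’s actual implementation:

```python
import numpy as np

# Real-space grid discretization of H = -1/2 d^2/dx^2 for a particle in a box
# of length L = 1 (atomic units), with Dirichlet boundary conditions.
N = 200                               # number of interior grid points
h = 1.0 / (N + 1)                     # grid spacing
main = np.full(N, 1.0 / h**2)         # diagonal of -1/2 * finite-difference Laplacian
off = np.full(N - 1, -0.5 / h**2)     # off-diagonals
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

eigenvalues = np.linalg.eigvalsh(H)
ground_state = eigenvalues[0]
exact = np.pi**2 / 2                  # analytic ground-state energy, ~4.9348
```

Refining the grid (larger N) makes the numerical ground-state energy converge towards the analytic value, which is exactly the kind of convergence check the course emphasizes.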
2021-08-04 19:00 Software Engineering Challenges and Best Practices for Multi-Institutional Scientific Software Development
Part of the webinar on HPC best practices (https://ideas-productivity.org/events/hpc-best-practices-webinars/). Registration at link is required.
Presenter is Keith Beattie from Lawrence Berkeley National Laboratory (https://crd.lbl.gov/departments/data-science-and-technology/idf/staff/keith-beattie/). Here is the summary:
Scientific software is increasingly becoming the backbone of obtaining and validating scientific results. This is no longer just the case for traditionally computationally intensive areas but is now true across a wide variety of scientific disciplines. This circumstance elevates how scientific software is developed, independent of the field, to a new level of importance. Further, the multi-institutional nature of many science projects presents unique challenges to how scientific software can be effectively developed and maintained over the long term. In this webinar we present the challenges faced in leading the development of scientific software across a distributed, multi-institutional team of contributors, and we describe a set of best-practices we have found to be effective in producing impactful and trustworthy scientific software.
Wednesday, 2021/07/21, 14:00 - 16:00 CEST, organised by MPCDF training team
This online tutorial gives a basic introduction to using the GPUs on the new HPC system Raven. The following topics will be covered:
GPU software environment for HPC and AI/data analytics: applications, libraries, compilers, tools
batch system: submitting GPU jobs
Zoom link available in Zulip
Brainstorming on use of workflows in Photon Science data analysis
(Event from data analysis group at EuXFEL, presentation Thomas Kluyver / Robert Rosca).
Target group: research software engineers or scientists with significant interest in software / computational methods.
Added after the meeting: notes are at https://codimd.desy.de/SHbiNcAcRGm7q749cgKzRg?view
Zoom link available in Zulip
extra-data, Thursday, June 24th, 2021
Zoom link available in Zulip
Wednesday, June 23, 2021 from 3:00 p.m. to 4:00 p.m.
Presenter: Prof. Dr. Frank Oliver Glöckner, Head of Data at the AWI Computing and Data Center.
Joint Data Science Colloquium on June 24th, 2021: “Geometric Deep Learning: From Euclid to Drug Design”
We are looking forward to the talk by Prof. Michael Bronstein (Professor for Machine Learning and Pattern Recognition, Faculty of Engineering, Department of Computing, Imperial College London, England) on Thursday, June 24th, 2021 at 2 pm.
His talk is entitled “Geometric Deep Learning: From Euclid to Drug Design”. The official announcement of the talk can be found here: https://syncandshare.desy.de/index.php/s/AYAEAeMCCpxsHj4
Further information concerning the speakers and the lectures can also be found here: https://www.dashh.org/events/data_science_colloquium/index_eng.html
Zoom link available in Zulip
Dr. Carsten Fortmann-Grote (Max-Planck-Institut für Evolutionsbiologie) and Prof. Dr. Hans Fangohr (Max-Planck-Institut für Struktur und Dynamik der Materie) at Forschungsdatenmanagement Workshop 2021
The Jupyter notebook format enables seamless coexistence of computer program code, documentation, and execution, as well as interactive visualization and discussion of results in one document and provides a user-friendly work environment. We will give an overview of the Jupyter ecosystem of tools and services and discuss how Jupyter enhances reproducibility in data intensive research.
Theory and experiment have been the two pillars of science that for centuries have underpinned our understanding of the world around us. With the advent of powerful computers, computational methods have emerged as a third pillar of science. Among other techniques, numerical methods, data analysis, and visualization have become indispensable tools for many scientists. This course intends to introduce basic numerical methods that make it possible to perform numerical simulations on modern computing platforms.
Approximation of functions
Root finding and solving nonlinear equations
Numerical differentiation and integration
Solving ordinary and partial differential equations
Solving linear systems of equations
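As a taste of the differential-equations topic, a minimal explicit Euler integrator in pure Python (the test equation and step count are illustrative choices):

```python
import math

def euler(f, y0, t0, t1, n):
    """Explicit Euler: advance dy/dt = f(t, y) in n equal steps from t0 to t1."""
    h = (t1 - t0) / n
    t, y = t0, y0
    for _ in range(n):
        y += h * f(t, y)
        t += h
    return y

# dy/dt = -y with y(0) = 1 has the exact solution y(t) = exp(-t).
approx = euler(lambda t, y: -y, 1.0, 0.0, 1.0, 1000)
exact = math.exp(-1.0)
# Euler is first order: halving the step size roughly halves the error.
```

The course builds from such simple schemes towards higher-order methods and partial differential equations.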
In addition, the course provides hands-on exercises for participants to gain experience with high-performance computing environments. We intend to cover:
Compiling and linking codes
The Slurm queueing system and module environments
Basic revision control with git
Delivered by Heiko Appel, Henning Glawe, Hans Fangohr, Martin Lueders
The one-week course has been designed for scientists and engineers to teach the practical programming skills that are relevant for modern computational science. The module does not assume prior programming knowledge. The module uses hands-on activities for all participants to exercise and experiment with the taught material. The material covers a wide spectrum of skills that are advantageous for scientists who need to handle data, be it from experiment or simulation, and provides a basis for self-learning or directed learning of more specialized topics at a later stage.
More detailed announcement: https://www.mpsd.mpg.de/events/27626/500135?1619346787
Delivered by Hans Fangohr, Henning Glawe, Heiko Appel
Presenter: Hans Fangohr
The Open Science COVID Analysis (OSCOVIDA) project provides an open science portal at https://oscovida.github.io/ to display and investigate COVID-19 infections and deaths as a function of time for the US states, the districts in Germany, and most other countries in the world. For each region, different observables are shown as a function of time: total infections and deaths, daily changes, the reproduction number and growth factor, and doubling times; these can be normalised by the population of the relevant region. Data sources are currently the Johns Hopkins University and the Robert Koch Institute. All the source code that creates the plots is open source and forms a Python library that simplifies downloading current data and population numbers, and computing and displaying the COVID tracking plots available on the website. The system is based on Jupyter notebooks, which users can execute in their browser (using the MyBinder project) to adapt the analysis to their own interests, or to base additional studies on top of the framework. Some tutorials and additional data analysis studies are available, and further contributions are welcome.