HPC Education Practice Platform Based on National High Performance Computing Environment
The level of supercomputer R&D and application is an important indicator of a nation's scientific and technological development and overall competitiveness, and it is also the commanding height of technological innovation in the computer field in this century. China's supercomputer technology is developing rapidly, but its supercomputing application level still lags well behind that of other countries, and a shortage of talent makes it difficult to support the rapid growth of application demand. At the same time, supercomputing systems are huge in scale, complex in structure, and heterogeneous in resources, and the needs of application fields are diverse, all of which further increases the difficulty of training supercomputing personnel. It is therefore urgent to build an education practice platform based on the national high-performance computing environment to improve the parallel computing thinking and supercomputing application R&D ability of undergraduates, graduate students, and practitioners in various industries.
EasyHPC (Easyhpc.net) is the largest supercomputing education practice platform in China. Built on the national high-performance computing environment, the platform provides high-quality high-performance computing education content for undergraduate and graduate students in universities across the country (see Figure 1 for the interface). EasyHPC is jointly developed by leading domestic universities, research institutions, and enterprises, including Sun Yat-Sen University, Tsinghua University, Peking University, University of Science and Technology of China, Hunan University, Northwest University of Technology, Northwest University, the Network Center of the Chinese Academy of Sciences, Inspur, Parallel Technology, and Metacomputing, and it has received key support from the "High Performance Computing" project of the national key research and development plan. Many supercomputing centers provide machine time and technical support for the platform, such as the National Supercomputing Centers in Guangzhou and Changsha, the Supercomputing Center of the Chinese Academy of Sciences, and the Shanghai Supercomputing Center.
Figure 1: Main interface of EasyHPC
Highlights and Features:
The main highlights and features of EasyHPC are as follows:
1. Personalized and progressive learning path of high performance computing
To address the traditional high-performance computing education model, which emphasizes theory over practice, EasyHPC implements a learner-centered, personalized, and progressive practical teaching mode (see Figure 2). It develops a learning path that goes from basic to advanced, integrates multi-level resource environments and educational content, and guides learners in using software and hardware resources, so that they can get started with high-performance computing in a short time, then move on to advanced challenges, and finally cover all levels of practice. In particular, to accommodate the interdisciplinary nature of high-performance computing and to build practice environments for different architectures and domain-specific programming models and languages, EasyHPC uses subject-oriented visual analysis technology to customize online teaching laboratories for different application fields on demand. This gives learners hands-on experiments in a real environment, where they can develop independent learning ability and a spirit of innovation, and it meets the needs of training talent at multiple levels.
Figure 2: Progressive learning path of high performance computing
2. Feedback-based debugging, testing, and analysis of large-scale parallel programs
Debugging and testing technology and application feature analysis play an important role in improving users' practical skills in high-performance computing. Because traditional debugging and testing tools have a high barrier to entry, EasyHPC implements feedback-based parallel program debugging and analysis technology (see Figure 3), which supports feedback-based debugging and testing of parallel programs running on up to 2,400 cores. Through a program checker, an application feature collector, and an application feature analyzer, it monitors and analyzes in real time the processor, memory, network, and storage performance data of cluster management/login nodes, compute nodes, I/O nodes, and other servers. It promptly feeds back to users how the running characteristics of their application change over time in the cluster system, reconstructs the application's execution on the cluster accurately and efficiently, helps users find faults and performance bottlenecks in the running program, and makes parallel program debugging and testing on the platform more interactive.
Figure 3: Feedback-based debugging, testing, and analysis of parallel programs
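The collector and analyzer roles described in this section can be illustrated with a minimal Python sketch. It is not EasyHPC's implementation: the `Sample` record, the node names, the utilization values, and the 0.9 threshold are all assumptions chosen for illustration. The sketch averages per-node processor, memory, network, and storage utilization over a run and reports the metrics that look saturated.

```python
from dataclasses import dataclass
from statistics import mean

# Metric names are hypothetical; each is a utilization fraction in [0, 1].
METRICS = ("cpu", "mem", "net", "io")


@dataclass
class Sample:
    """One reading taken by a (hypothetical) feature collector."""
    node: str   # e.g. "compute-01" or "io-01" (made-up node names)
    t: float    # seconds since job start; kept so a timeline could be rebuilt
    cpu: float
    mem: float
    net: float
    io: float


def summarize(samples):
    """Average every metric per node over the whole run."""
    by_node = {}
    for s in samples:
        by_node.setdefault(s.node, []).append(s)
    return {
        node: {m: mean(getattr(r, m) for r in rows) for m in METRICS}
        for node, rows in by_node.items()
    }


def find_bottlenecks(samples, threshold=0.9):
    """Return (node, metric, average) for metrics at or above the threshold."""
    hits = []
    for node, averages in summarize(samples).items():
        for metric, value in averages.items():
            if value >= threshold:
                hits.append((node, metric, round(value, 2)))
    return hits


# Made-up readings: a CPU-bound compute node and a storage-bound I/O node.
samples = [
    Sample("compute-01", 0.0, 0.95, 0.40, 0.10, 0.05),
    Sample("compute-01", 1.0, 0.97, 0.42, 0.12, 0.05),
    Sample("io-01", 0.0, 0.20, 0.30, 0.50, 0.92),
    Sample("io-01", 1.0, 0.22, 0.30, 0.55, 0.94),
]

for node, metric, value in find_bottlenecks(samples):
    print(f"{node}: {metric} averaged {value:.2f}")
```

A real analyzer would work on streaming data and per-interval windows rather than whole-run averages; the whole-run average is used here only to keep the example short.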
3. Rapid customization and deployment of efficient containerized HPC practice environment
Because educational users' demands for resources and environments are diverse and multi-level, EasyHPC builds container resource clusters dynamically on demand and lets users adaptively assemble the system software stack. EasyHPC can also quickly generate resource containers for student users and supports efficient batch replication and publishing (see Figure 4). To best match resource requirements to the practice environment, EasyHPC supports on-demand dynamic scheduling and configuration of resources and can effectively schedule I/O resources, computing resources, accelerator resources, network resources, data and software library resources, and application software. This dynamic construction and scheduling of resources meets the diverse resource environment requirements of the supercomputing education practice platform.
Figure 4: An efficient and containerized education practice platform architecture
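As a rough illustration of matching resource requests to a practice environment, the following Python sketch places student container requests on nodes with a simple first-fit rule. The source does not describe EasyHPC's actual scheduling algorithm, so first-fit is a stand-in, and the `Node` and `Request` types, the node names, and the capacities are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A host with free capacity (hypothetical model, cores and accelerators only)."""
    name: str
    cores: int
    gpus: int
    placed: list = field(default_factory=list)


@dataclass
class Request:
    """Resources one student container asks for."""
    student: str
    cores: int
    gpus: int = 0


def schedule(requests, nodes):
    """First-fit placement: use the first node with enough free cores and GPUs.

    Mutates the nodes' free capacity and returns the students left unplaced.
    """
    unplaced = []
    for req in requests:
        for node in nodes:
            if node.cores >= req.cores and node.gpus >= req.gpus:
                node.cores -= req.cores
                node.gpus -= req.gpus
                node.placed.append(req.student)
                break
        else:
            # No node could satisfy the request.
            unplaced.append(req.student)
    return unplaced


nodes = [Node("cpu-node", cores=8, gpus=0), Node("gpu-node", cores=4, gpus=1)]
requests = [
    Request("alice", cores=4),
    Request("bob", cores=2, gpus=1),
    Request("carol", cores=6),
    Request("dave", cores=2, gpus=1),
]

waiting = schedule(requests, nodes)
for node in nodes:
    print(node.name, "hosts", node.placed)
print("waiting for resources:", waiting)
```

First-fit is chosen only because it is easy to follow. A production scheduler would also account for the I/O, network, and software library resources listed above, and would typically queue or rebalance unplaced requests.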
Platform Application Effect:
Since its launch, EasyHPC has attracted more than 280,000 visitors and served more than 7,000 students from more than 100 organizations in 29 provinces and autonomous regions across the country (see Figure 5). It supports the practical teaching of more than 30 high-performance computing courses at more than 10 universities. It has completed nearly 8,000 parallel program evaluations for courses and provides more than 20 million core-hours per year. EasyHPC has also actively supported undergraduate and graduate students participating in RDMA17, CPC17, PAC17 Optimization, PAC17 Application, SC17, ASC18, ASC19, and a series of domestic competitions. The curriculum teaching resources provided by EasyHPC are used in competition training: on the one hand, the teaching experience gained in training is fed back into the design of the course content; on the other hand, the completeness and practicality of the curriculum resources are put to the test. EasyHPC has made outstanding achievements in cultivating high-performance computing talent and has gained wide attention. It has been reported by many magazines and media outlets, such as Computer Education, Communications of the China Computer Society, Information Engineering and Technology, Sohu, Sina, and Science Network (see Figure 6).
Figure 5: Growth and distribution of users' visits to EasyHPC
Figure 6: Media reports of EasyHPC and HPC competition awards