Course Outline
Introduction
- What is ROCm?
- What is HIP?
- ROCm vs CUDA vs OpenCL
- Overview of ROCm and HIP features and architecture
- ROCm for Windows vs ROCm for Linux
Installation
- Installing ROCm on Windows
- Verifying the installation and checking device compatibility
- Updating or uninstalling ROCm on Windows
- Troubleshooting common installation issues
Getting Started
- Creating a new ROCm project using Visual Studio Code on Windows
- Exploring the project structure and files
- Compiling and running the program
- Displaying the output using printf and fprintf
ROCm API
- Using the ROCm API in the host program
- Querying device information and capabilities
- Allocating and deallocating device memory
- Copying data between host and device
- Launching kernels and synchronizing threads
- Handling errors and exceptions
HIP Language
- Using the HIP language in the device program
- Writing kernels that execute on the GPU and manipulate data
- Using data types, qualifiers, operators, and expressions
- Using built-in functions, variables, and libraries
ROCm and HIP Memory Model
- Using different memory spaces, such as global, shared, constant, and local
- Using different memory objects, such as pointers, arrays, textures, and surfaces
- Using different memory access modes, such as read-only, write-only, and read-write
- Using the memory consistency model and synchronization mechanisms
ROCm and HIP Execution Model
- Using the execution hierarchy of threads, blocks, and grids
- Using thread-level built-ins, such as hipThreadIdx_x, hipBlockIdx_x, and hipBlockDim_x (equivalent to threadIdx.x, blockIdx.x, and blockDim.x)
- Using block-level functions, such as __syncthreads and __threadfence_block
- Using grid-level features, such as hipGridDim_x and grid-wide synchronization through cooperative groups
Debugging
- Debugging ROCm and HIP programs on Windows
- Using the Visual Studio Code debugger to set breakpoints and inspect variables and the call stack
- Using ROCm Debugger to debug ROCm and HIP programs on AMD devices
- Using ROCm Profiler to analyze ROCm and HIP programs on AMD devices
Optimization
- Optimizing ROCm and HIP programs on Windows
- Using coalescing techniques to improve memory throughput
- Using caching and prefetching techniques to reduce memory latency
- Using shared memory and local memory techniques to optimize memory accesses and bandwidth
- Using profiling tools to measure and improve execution time and resource utilization
Summary and Next Steps
Requirements
- Understanding of C/C++ language and parallel programming concepts
- Foundational knowledge of computer architecture and memory hierarchy
- Experience with command-line tools and code editors
- Familiarity with the Windows operating system and PowerShell
Audience
- Developers seeking to learn how to install and use ROCm on Windows to program AMD GPUs and exploit their parallelism
- Developers aiming to write high-performance, scalable code capable of running across various AMD devices
- Programmers interested in exploring the low-level aspects of GPU programming and optimizing their code's performance
21 Hours
Custom Corporate Training
Training solutions designed exclusively for businesses.
- Customized Content: We adapt the syllabus and practical exercises to the real goals and needs of your project.
- Flexible Schedule: Dates and times adapted to your team's agenda.
- Format: Online (live), In-company (at your offices), or Hybrid.
Price per private group, online live training, starting from 3900 € + VAT*
Contact us for an exact quote and to hear about our latest promotions