Course Outline
Big Data Overview:
- Definition of Big Data
- Reasons behind the growing popularity of Big Data
- Case studies in Big Data
- Key characteristics of Big Data
- Solutions for managing Big Data
Hadoop and Its Components:
- Understanding Hadoop and its constituent parts.
- Hadoop architecture and the types of data it can handle or process.
- A brief history of Hadoop, companies utilizing it, and their motivations.
- Detailed explanation of the Hadoop framework and its components.
- Explanation of HDFS and operations such as reading from and writing to the Hadoop Distributed File System.
- Procedures for setting up a Hadoop cluster in various modes: Stand-alone, Pseudo-distributed, and Multi-node.
- Setting up a Hadoop cluster within VirtualBox, KVM, or VMware, including the necessary network configuration, running the Hadoop daemons, and testing the cluster.
- Understanding the MapReduce framework and its operational mechanics.
- Executing MapReduce jobs on a Hadoop cluster.
- Understanding replication, mirroring, and rack awareness within Hadoop clusters.
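As a rough illustration of the MapReduce model named in this section, the three phases (map, shuffle, reduce) can be sketched in plain Python. This is only a single-process sketch for intuition and is not Hadoop API code; the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are hypothetical and chosen for this example.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    # Shuffle: group all values by key, as a framework would do
    # between the map and reduce stages.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce: collapse the list of counts for one word into a total.
    return key, sum(values)


def word_count(lines):
    # Run the three phases in sequence on an in-memory list of lines.
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

On a real cluster the map and reduce steps would run in parallel on different nodes, with the framework handling the shuffle over the network.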
Hadoop Cluster Planning:
- Strategies for planning your Hadoop cluster.
- Aligning hardware and software requirements for effective cluster planning.
- Analyzing workloads to plan a cluster that prevents failures and ensures optimal performance.
What is MapR and Why Choose MapR:
- Overview of MapR and its architectural design.
- Understanding and utilizing the MapR Control System, MapR Volumes, snapshots, and mirrors.
- Planning a cluster specifically in the context of MapR.
- Comparing MapR with other distributions and Apache Hadoop.
- MapR installation and cluster deployment processes.
Cluster Setup and Administration:
- Managing services, nodes, snapshots, mirrored volumes, and remote clusters.
- Understanding and managing nodes.
- Gaining insight into Hadoop components and installing them alongside MapR Services.
- Accessing data on the cluster, including via NFS, and managing services and nodes.
- Data management using volumes, user and group administration, role assignment to nodes, and the commissioning or decommissioning of nodes.
- Cluster administration, performance monitoring, configuring and analyzing metrics, and administering MapR security.
- Understanding and working with M7 native storage for MapR tables.
- Configuring and tuning the cluster for optimal performance.
Cluster Upgrade and Integration with Other Setups:
- Upgrading the MapR software version, and the types of upgrade available.
- Configuring the MapR cluster to access an HDFS cluster.
- Setting up a MapR cluster on Amazon Elastic MapReduce.
All topics above include demonstrations and practical sessions to provide learners with hands-on experience with the technology.
Requirements
- Fundamental knowledge of the Linux file system
- Basic Java programming skills
- Familiarity with Apache Hadoop (recommended)
28 Hours
Custom Corporate Training
Training solutions designed exclusively for businesses.
- Customized Content: We adapt the syllabus and practical exercises to the real goals and needs of your project.
- Flexible Schedule: Dates and times adapted to your team's agenda.
- Format: Online (live), In-company (at your offices), or Hybrid.
Price per private group, online live training, starting from 5200 € + VAT*
Contact us for an exact quote and to hear about our latest promotions
Testimonials (1)
practical things of doing, also theory was served good by Ajay