Hadoop Cluster setup using Ansible 🎯

👉 Our aim: configure Hadoop and start the cluster services using an Ansible playbook.

🔑 What is Ansible?

Ansible is an open-source software provisioning, configuration management, and application-deployment tool that enables infrastructure as code. It runs on many Unix-like systems and can configure both Unix-like systems and Microsoft Windows.

🔑 What is an Ansible playbook?

An Ansible playbook is a blueprint of automation tasks, which are complex IT actions executed with limited or no human involvement. Playbooks are executed on a set, group, or classification of hosts, which together make up an Ansible inventory.

🔑What is Hadoop ?

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.

Let’s get hands-on with the task now.

We will perform this task on RHEL 8 running in VirtualBox.

🟥Let’s first install Ansible.

Ansible is built on top of Python, so we will use pip to install it.

🟥 Now we will write the Ansible playbook. Excited?

👉 Before starting, jot down what is to be achieved:

1️⃣ Mounting the DVD on RHEL8

2️⃣ Configuring Yum Repo

3️⃣ JDK Installation for Hadoop

4️⃣ Hadoop Installation

5️⃣ NameNode Configuration

6️⃣ DataNode Configuration

7️⃣ Starting the Hadoop daemon services

📀 Let’s see how to write the YAML for mounting the DVD (NameNode as well as DataNode):
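A minimal sketch of this step, not necessarily the playbook's exact task. The device `/dev/cdrom` and the mount point `/dvd1` are assumptions, and `ansible.posix.mount` comes from the `ansible.posix` collection, which ships with the full `ansible` pip package.

```yaml
# Sketch only: /dev/cdrom and /dvd1 are assumed values.
- name: Mount the RHEL 8 DVD
  hosts: all
  tasks:
    - name: Mount the DVD read-only (the mount point is created if missing)
      ansible.posix.mount:
        path: /dvd1
        src: /dev/cdrom
        fstype: iso9660
        opts: ro
        state: mounted   # mounts now and also records the entry in /etc/fstab
```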

🎨 Configuring the Yum repo (NameNode as well as DataNode):
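One way this step could look, assuming the DVD is mounted at `/dvd1`. The repo names `dvd-baseos` and `dvd-appstream` are made up for illustration. The RHEL 8 DVD carries two repositories, BaseOS and AppStream, so the sketch registers both.

```yaml
# Sketch only: the /dvd1 mount point and the repo names are assumptions.
- name: Configure local yum repositories from the DVD
  hosts: all
  tasks:
    - name: Add the BaseOS repository
      ansible.builtin.yum_repository:
        name: dvd-baseos
        description: RHEL 8 DVD BaseOS
        baseurl: file:///dvd1/BaseOS
        gpgcheck: false

    - name: Add the AppStream repository
      ansible.builtin.yum_repository:
        name: dvd-appstream
        description: RHEL 8 DVD AppStream
        baseurl: file:///dvd1/AppStream
        gpgcheck: false
```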

👉 Installing the JDK (Java) (NameNode as well as DataNode):
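A hedged sketch of the JDK step, assuming the JDK is installed from an RPM file that sits next to the playbook on the control node. The file name in `jdk_rpm` is hypothetical, so substitute whichever JDK package you actually downloaded.

```yaml
# Sketch only: the RPM file name and the /root destination are assumptions.
- name: Install the JDK
  hosts: all
  vars:
    jdk_rpm: jdk-8u171-linux-x64.rpm   # hypothetical file name
  tasks:
    - name: Copy the JDK RPM to the managed node
      ansible.builtin.copy:
        src: "{{ jdk_rpm }}"
        dest: "/root/{{ jdk_rpm }}"

    - name: Install the JDK from the local RPM
      ansible.builtin.dnf:
        name: "/root/{{ jdk_rpm }}"
        state: present
        disable_gpg_check: true
```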

👉 Installing Hadoop (NameNode as well as DataNode):
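A possible shape for the Hadoop install, assuming an RPM-based Hadoop package that places its configuration in `/etc/hadoop`. The file name in `hadoop_rpm` is hypothetical. The `--force` flag is also an assumption, meant for RPMs that conflict with files owned by existing packages; drop it if your package installs cleanly.

```yaml
# Sketch only: the RPM file name, /root destination and --force are assumptions.
- name: Install Hadoop
  hosts: all
  vars:
    hadoop_rpm: hadoop-1.2.1-1.x86_64.rpm   # hypothetical file name
  tasks:
    - name: Copy the Hadoop RPM to the managed node
      ansible.builtin.copy:
        src: "{{ hadoop_rpm }}"
        dest: "/root/{{ hadoop_rpm }}"

    - name: Install the RPM (skipped once /etc/hadoop exists)
      ansible.builtin.command:
        cmd: "rpm -i --force /root/{{ hadoop_rpm }}"
        creates: /etc/hadoop
```

The `creates` argument keeps the task from reinstalling the package on every run.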

👉 Stopping the firewall so the nodes can reach each other (NameNode as well as DataNode):
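A minimal sketch of this step, assuming the nodes run `firewalld`, which is the default on RHEL 8.

```yaml
# Sketch only: assumes firewalld is the active firewall service.
- name: Stop the firewall
  hosts: all
  tasks:
    - name: Stop firewalld for the current boot
      ansible.builtin.service:
        name: firewalld
        state: stopped
```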

🛠 NameNode configuration:

👉 Creating the directory on the NameNode:
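Something like the following would create the directory. The path `/nn` and the inventory group `namenode` are assumptions chosen for illustration.

```yaml
# Sketch only: /nn and the "namenode" inventory group are assumed names.
- name: Create the NameNode directory
  hosts: namenode
  tasks:
    - name: Ensure /nn exists
      ansible.builtin.file:
        path: /nn
        state: directory
```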

👉 Configuring hdfs-site.xml in /etc/hadoop/
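A hedged sketch that writes the whole file with the `copy` module. It assumes the Hadoop 1.x property name `dfs.name.dir` (newer releases call it `dfs.namenode.name.dir`) and an assumed directory `/nn`.

```yaml
# Sketch only: the property name targets Hadoop 1.x; /nn is an assumed path.
- name: Configure hdfs-site.xml on the NameNode
  hosts: namenode
  tasks:
    - name: Write hdfs-site.xml
      ansible.builtin.copy:
        dest: /etc/hadoop/hdfs-site.xml
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.name.dir</name>
              <value>/nn</value>
            </property>
          </configuration>
```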

👉Configuring core-site.xml in /etc/hadoop/
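A sketch of the NameNode's `core-site.xml`, assuming the Hadoop 1.x property `fs.default.name` and port `9001`. Both are assumptions, so use whatever your cluster expects. Binding to `0.0.0.0` lets the NameNode listen on all of its interfaces.

```yaml
# Sketch only: the property name targets Hadoop 1.x; port 9001 is assumed.
- name: Configure core-site.xml on the NameNode
  hosts: namenode
  tasks:
    - name: Write core-site.xml
      ansible.builtin.copy:
        dest: /etc/hadoop/core-site.xml
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>hdfs://0.0.0.0:9001</value>
            </property>
          </configuration>
```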

👉Formatting NameNode directory :
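One way to format without hanging on the confirmation prompt is to pipe `Y` into the command. This sketch assumes the Hadoop 1.x command `hadoop namenode -format` and an assumed directory `/nn`. The `creates` guard relies on the formatter writing `current/VERSION` inside that directory.

```yaml
# Sketch only: /nn is an assumed path; the command form targets Hadoop 1.x.
- name: Format the NameNode
  hosts: namenode
  tasks:
    - name: Format once (skipped when the metadata already exists)
      ansible.builtin.shell:
        cmd: echo Y | hadoop namenode -format
        creates: /nn/current/VERSION
```

Guarding the task matters here, because reformatting wipes the existing filesystem metadata.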

👉 Starting the NameNode service with hadoop-daemon:
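A minimal sketch, assuming `hadoop-daemon.sh` is on the `PATH` of the managed node. This task is not idempotent: starting a daemon that is already running makes the script exit with an error.

```yaml
# Sketch only: assumes hadoop-daemon.sh is on PATH for the remote user.
- name: Start the NameNode
  hosts: namenode
  tasks:
    - name: Start the NameNode daemon
      ansible.builtin.command:
        cmd: hadoop-daemon.sh start namenode
```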

🛠 DataNode configuration:

👉 Creating DataNode directory :
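The DataNode side mirrors the NameNode step. The path `/dn` and the inventory group `datanode` are assumptions.

```yaml
# Sketch only: /dn and the "datanode" inventory group are assumed names.
- name: Create the DataNode directory
  hosts: datanode
  tasks:
    - name: Ensure /dn exists
      ansible.builtin.file:
        path: /dn
        state: directory
```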

👉 Configuring hdfs-site.xml :
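A hedged sketch for the DataNode, assuming the Hadoop 1.x property `dfs.data.dir` (newer releases call it `dfs.datanode.data.dir`) and an assumed storage directory `/dn`.

```yaml
# Sketch only: the property name targets Hadoop 1.x; /dn is an assumed path.
- name: Configure hdfs-site.xml on the DataNode
  hosts: datanode
  tasks:
    - name: Write hdfs-site.xml
      ansible.builtin.copy:
        dest: /etc/hadoop/hdfs-site.xml
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.data.dir</name>
              <value>/dn</value>
            </property>
          </configuration>
```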

👉Configuring core-site.xml
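On the DataNode this file has to point at the NameNode's address. In the sketch below, `namenode_ip` is a hypothetical variable with a placeholder value, and the port `9001` and the Hadoop 1.x property `fs.default.name` are assumptions as well.

```yaml
# Sketch only: namenode_ip is a placeholder; port and property name are assumed.
- name: Configure core-site.xml on the DataNode
  hosts: datanode
  vars:
    namenode_ip: 192.168.1.10   # hypothetical address, replace with your NameNode's IP
  tasks:
    - name: Write core-site.xml
      ansible.builtin.copy:
        dest: /etc/hadoop/core-site.xml
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.default.name</name>
              <value>hdfs://{{ namenode_ip }}:9001</value>
            </property>
          </configuration>
```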

👉Starting the DataNode :
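A minimal sketch, again assuming `hadoop-daemon.sh` is on the `PATH` and that the hosts are grouped as `datanode` in the inventory.

```yaml
# Sketch only: assumes hadoop-daemon.sh is on PATH for the remote user.
- name: Start the DataNode
  hosts: datanode
  tasks:
    - name: Start the DataNode daemon
      ansible.builtin.command:
        cmd: hadoop-daemon.sh start datanode
```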

Here’s the GitHub link for the playbook: Hadoop_Ansible

📚 Let’s see how it works now:

Bravo! The NameNode has started and 1 DataNode is connected to it.
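The same check can be run from Ansible. This sketch assumes the Hadoop 1.x command `hadoop dfsadmin -report` and a hypothetical `namenode` inventory group. It prints the report, which lists the DataNodes known to the NameNode.

```yaml
# Sketch only: the command form targets Hadoop 1.x; group name is assumed.
- name: Check the cluster
  hosts: namenode
  tasks:
    - name: Ask HDFS for a cluster report
      ansible.builtin.command:
        cmd: hadoop dfsadmin -report
      register: report
      changed_when: false   # read-only check, never reports a change

    - name: Show the report
      ansible.builtin.debug:
        var: report.stdout_lines
```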

The DataNode is also configured and launched:

Web UI view:

That’s how we succeeded in configuring the Hadoop cluster using Ansible.

Ansible made our task easy. Rather than configuring the cluster by hand, automating it lets us configure as many clusters as we want in far less time.

Keep sharing... 🤗

Happy learning... 🧮