Hadoop

  1. What is Hadoop?

Hadoop is an open-source framework for storing and processing big data, from gigabytes to petabytes, in a distributed environment. Instead of relying on one large computer, Hadoop clusters multiple computers so that massive datasets can be analyzed in parallel, and therefore more quickly. Hadoop has four main modules: Hadoop Distributed File System (HDFS), Yet Another Resource Negotiator (YARN), MapReduce and Hadoop Common.
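To make the MapReduce module more concrete, here is a minimal sketch of its programming model in plain Python. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. Everything runs on one machine and simply mimics the map, shuffle and reduce steps that a cluster would carry out in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group all values by key, as happens between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # On a real cluster, different lines (or blocks of lines) could be
    # mapped on different machines; here they are processed one by one.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data needs big clusters", "hadoop stores big data"]
    print(word_count(sample))
```

The point of the split is that each `map_phase` call needs only its own line and each `reduce_phase` call needs only the values for its own key, which is what lets the work be spread over many computers.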

  1. Why Hadoop?

As an open-source framework, Hadoop is free to use, and it keeps costs low because it stores big data on commodity hardware. The second reason is speed: because storage and processing are spread across many machines, Hadoop can store and process huge amounts of data quickly. That makes it very useful now that social media and the internet keep growing. Hadoop is both flexible and scalable, so it is used by IT giants like Facebook, Yahoo and Amazon.

  1. How to prepare for Hadoop?

A Hadoop course is best suited to IT students and to data management or analytics professionals. Various websites offer online courses, each with its own way of teaching. The basic syllabus of a Hadoop course is as follows:

Syllabus-

  1. What is Big Data? Types and Characteristics
  2. What is Hadoop? Introduction
  3. Architecture, Ecosystem, Components of Hadoop
  4. HDFS and YARN
  5. MapReduce
  6. Apache Sqoop
  7. Apache Flume
  8. Apache Pig
  9. Hadoop & MapReduce-Interview Training

  1. Important Materials:

a. Ebooks-

Big Data Now: http://cdn.oreillystatic.com/oreilly/radarreport/0636920028307/Big_Data_Now_2012_Edition.pdf

Data-Intensive Text Processing with MapReduce: Jimmy Lin and Chris Dyer- https://lintool.github.io/MapReduceAlgorithms/MapReduce-book-final.pdf

Disruptive Possibilities: How Big Data Changes Everything: Jeffrey Needham- https://www.oreilly.com/data/free/files/disruptive-possibilities.pdf

Migrating Big Data Analytics into the Cloud: Mike Barlow- https://www.oreilly.com/data/free/files/migrating-big-data-analytics.pdf

b. Videos-

Big Data & Hadoop Full Course – https://www.youtube.com/watch?v=1vbXmCrkT3Y

What is Hadoop – https://www.youtube.com/watch?v=9s-vSeWej1U

How to Install Hadoop on Windows 10- https://www.youtube.com/watch?v=g7Qpnmi0Q-s

Big Data and Hadoop – https://www.youtube.com/watch?v=Uv96qQ3uC6Y&list=PLWPirh4EWFpENnR0p1JvhJkyTK1M0sOLR

  1. Important Tips-

Before learning Hadoop, you should know that it is not just about writing code. You also need skills such as Java, scripting and reviewing log files, and you may sometimes have to schedule jobs across Hadoop clusters.
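As a small illustration of the scripting and log-reviewing skills mentioned above, the sketch below counts log lines by severity level. It assumes each log entry starts with a `date time,millis LEVEL` prefix; that layout is an assumption, so adjust the pattern to whatever your cluster actually writes. The names `LOG_LINE`, `count_levels` and `lines_with_level` are hypothetical, and the sample lines are invented for the example rather than copied from real Hadoop output.

```python
import re
from collections import Counter

# Assumed entry layout: "YYYY-MM-DD HH:MM:SS,mmm LEVEL message".
# Lines that do not match (for example stack-trace lines) are skipped.
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (?P<level>[A-Z]+) "
)


def count_levels(lines):
    """Return a Counter mapping each log level to its number of entries."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group("level")] += 1
    return counts


def lines_with_level(lines, level):
    """Return only the entries whose level equals the given one."""
    selected = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("level") == level:
            selected.append(line)
    return selected


if __name__ == "__main__":
    # Invented sample entries, used only to exercise the functions.
    sample = [
        "2024-01-01 10:00:00,123 INFO example.Worker: task started",
        "2024-01-01 10:00:01,456 WARN example.Worker: slow response",
        "2024-01-01 10:00:02,789 ERROR example.Worker: task failed",
        "    at example.Worker.run(Worker.java:42)",
        "2024-01-01 10:00:03,000 INFO example.Worker: task retried",
    ]
    print(count_levels(sample))
    print(lines_with_level(sample, "ERROR"))
```

A script like this is usually the first step when reviewing a large log: get the counts per level, then pull out only the error entries for a closer look.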

Some common questions asked during interviews are –

  1. What is “Big Data”, and what are the five V’s of Big Data?
  2. What is Hadoop, and what are its components, such as HDFS and YARN?
  3. What is a checkpoint?
  4. What are the main configuration parameters in a “MapReduce” program?
  5. Briefly explain what “Distributed Cache” is in the “MapReduce Framework”.

 
