Hadoop: Introduction
Interactive

LearnNow Online
Updated Aug 21, 2018

Course description

In this course we look at why big data is necessary in today’s world and how it fits into your organization’s future. We then focus on one big data framework in particular, Hadoop, because it is fully open source and community driven. We examine some of the pieces that make up Hadoop and demonstrate some of its functionality. There are many use cases where big data can sharpen your organization’s competitive edge: analyzing social media, sensor data, clickstream data, geographic data, emails, and more. By the end of the course you should have a better understanding not only of what big data and Hadoop are but, more importantly, of where they fit into your organization’s structure and what they bring to the table.

Each LearnNowOnline training course is made up of Modules (typically an hour in length). Within each module there are Topics (typically 15-30 minutes each) and Subtopics (typically 2-5 minutes each). There is a Post Exam for each Module that must be passed with a score of 70% or higher to successfully and fully complete the course.


Prerequisites

This course assumes that users understand how to work with databases and database systems. Users should also be familiar with Linux command syntax.


Meet the expert

Barry Solomon

Barry Solomon has over 23 years of experience as a consultant. He has developed with Fortran, C, C++, Visual Basic, Java, and Visual C#. His extensive database experience includes working with Microsoft Access, Microsoft SQL Server, MySQL, and Oracle. His expertise now includes big data, Hadoop in particular, and its surrounding ecosystem, as the limits of most modern database systems have been exceeded.

Video Runtime

121 Minutes

Time to complete

282 Minutes

Course Outline

What is Big Data

Purpose of Big Data (40:41)

  • Introduction (00:22)
  • End of the Line (05:16)
  • OLTP and OLAP (03:07)
  • Storage (02:39)
  • Big Data as Supercomputer (05:11)
  • Scalability (02:22)
  • Hard Drives (03:20)
  • Parallelism (02:13)
  • Whose Data is it? (04:16)
  • Being Competitive and Relevant (03:47)
  • What is Big Data (02:06)
  • Variety, velocity and volume (01:31)
  • Leveraging and ROI (01:42)
  • Data Data Everywhere (01:38)
  • Throw it in the Lake of Data (00:52)
  • Summary (00:10)

Use Cases (13:22)

  • Introduction (00:15)
  • Use Cases (02:38)
  • Real Time vs Batch Processing (01:15)
  • What About Databases (02:10)
  • OLTP and OLAP (02:15)
  • Appliances (00:57)
  • Mix and Match (01:03)
  • Schema on Write, on Read (00:50)
  • NoSQL (01:40)
  • Summary (00:12)

Hadoop

Hadoop (37:06)

  • Introduction (00:16)
  • What do I get (01:01)
  • Hadoop (02:49)
  • File System (01:56)
  • MapReduce (01:18)
  • YARN (02:02)
  • Ecosystem (03:03)
  • Pig (03:30)
  • Hive (03:42)
  • Mahout and Oozie (03:05)
  • NoSQL (00:20)
  • Sqoop (01:36)
  • Ambari (01:51)
  • ZooKeeper (01:20)
  • The other pieces (07:16)
  • Tez (01:37)
  • Summary (00:16)

Hadoop Demo (29:50)

  • Introduction (00:20)
  • Where do we go? (05:44)
  • Demo: Download (02:32)
  • Demo: Putty (01:01)
  • Demo: Web Interface (03:56)
  • Demo: Back to Putty (02:57)
  • Demo: PIG (03:00)
  • Demo: HIVE Table (05:40)
  • Demo: Ambari (02:30)
  • Demo: Query (01:56)
  • Summary (00:09)