Cloudera Data Analyst Training: Using Pig, Hive, and Impala with Hadoop

Cloudera University

Starting dates and places

There are no known starting dates for this product.

Description

Course Summary

This hands-on course is for anyone who wants to manage, manipulate, and query large, complex data in real time using SQL and familiar scripting languages on Hadoop. Learn how Apache Pig, Apache Hive, and Cloudera Impala enable data transformations and analyses via filters, joins, and user-defined functions familiar from other technologies.

You Will Learn

  • The fundamentals of Apache Hadoop and data ETL (extract, transform, load), ingestion, and processing with Hadoop tools
  • Joining multiple data sets and analyzing disparate data with Pig
  • Organizing data into tables, performing transformations, and simplifying complex queries with Hive
  • Performing real-time interactive analyses on massive data sets stored in HDFS or HBase using SQL with Impala
  • Choosing the best analysis tool for a given task in Hadoop

Prerequisites

This course is best suited to data analysts, business analysts, developers, and administrators who have experience with SQL and basic UNIX or Linux commands. Prior knowledge of Apache Hadoop is not required.

Outline

  • Introduction
  • Hadoop Fundamentals
  • Introduction to Pig
  • Basic Data Analysis with Pig
  • Processing Complex Data with Pig
  • Multi-Dataset Operations with Pig
  • Extending Pig
  • Pig Troubleshooting and Optimization
  • Introduction to Hive
  • Data Analysis with Hive
  • Hive Data Management
  • Text Processing with Hive
  • Hive Optimization
  • Extending Hive
  • Introduction to Impala
  • Analyzing Data with Impala
  • Choosing the Best Tool for the Job
  • Conclusion
