
End-to-End Hive: HQL, Partitioning, Bucketing, UDFs, Windowing, Optimization, Map Joins, Indexes


Course Description

Hive helps you leverage the power of distributed computing and Hadoop for analytical processing. Its interface is like an old friend – the very SQL-like HiveQL. This course will fill in all the gaps between SQL and what you need to use Hive.
The course is an end-to-end guide for using Hive: whether you are an analyst who wants to process data or an engineer who needs to build custom functionality or optimize performance – everything you’ll need is right here. New to SQL? No need to look elsewhere. The course has a primer on all the basic SQL constructs.
Everything is taught using real-life examples, working queries and code.

Learning Outcomes

  • Write complex analytical queries on data in Hive and uncover insights
  • Use partitioning and bucketing to optimize queries in Hive
  • Customize Hive with user-defined functions in Java and Python
  • Understand what goes on under the hood of Hive with HDFS and MapReduce

Pre-requisites

Hive requires knowledge of SQL. The course includes an SQL primer at the end; please work through it first if you don’t know SQL. You’ll need to know Java if you want to follow the sections on custom functions in Java.

Who is this course intended for?

Analysts who want to write complex analytical queries on large-scale data

Engineers who want to know more about managing Hive as their data warehousing solution

 


Your Instructor

Loonycorn

Loonycorn is us, Janani Ravi and Vitthal Srinivasan. Between us, we have studied at Stanford, been admitted to IIM Ahmedabad and have spent years working in tech, in the Bay Area, New York, Singapore and Bangalore.

Janani: 7 years at Google (New York, Singapore); Studied at Stanford; also worked at Flipkart and Microsoft

Vitthal: Also Google (Singapore) and studied at Stanford; Flipkart, Credit Suisse and INSEAD too

We think we might have hit upon a neat way of teaching complicated tech courses in a funny, practical, engaging way, which is why we are so excited to be here on Learnsector!

We hope you will try our offerings and think you’ll like them 🙂

Course Curriculum

Introduction
Introduction 00:00:00
Introducing Hive
Hive: An Open-Source Data Warehouse 00:00:00
Hive and Hadoop 00:00:00
Hive vs Traditional Relational DBMS 00:00:00
HiveQL and SQL 00:00:00
Hadoop and Hive Install
Hadoop Install Modes 00:00:00
Hadoop Install Step 1 : Standalone Mode 00:00:00
Hadoop Install Step 2 : Pseudo-Distributed Mode 00:00:00
Hive install 00:00:00
Code-Along: Getting started 00:00:00
Hadoop and HDFS Overview
What is Hadoop? 00:00:00
HDFS or the Hadoop Distributed File System 00:00:00
Hive Basics
Primitive Datatypes 00:00:00
Collections: Arrays and Maps 00:00:00
Structs and Unions 00:00:00
Create Table 00:00:00
Insert Into Table 00:00:00
Insert into Table 2 00:00:00
Alter Table 00:00:00
HDFS 00:00:00
HDFS CLI – Interacting with HDFS 00:00:00
Code-Along: Create Table 00:00:00
Code-Along : Hive CLI 00:00:00
Built-in Functions
Three types of Hive functions 00:00:00
The Case-When statement, the Size function, the Cast function 00:00:00
The Explode function 00:00:00
Code-Along : Hive Built-in Functions 00:00:00
Sub-Queries
Quirky Sub-Queries 00:00:00
More on subqueries: Exists and In 00:00:00
Inserting via subqueries 00:00:00
Code-Along : Use Subqueries to work with Collection Datatypes 00:00:00
Views 00:00:00
Partitioning
Indices 00:00:00
Partitioning Introduced 00:00:00
The Rationale for Partitioning 00:00:00
How Tables are Partitioned 00:00:00
Using Partitioned Tables 00:00:00
Dynamic Partitioning: Inserting data into partitioned tables 00:00:00
Code-Along : Partitioning 00:00:00
Bucketing
Introducing Bucketing 00:00:00
The Advantages of Bucketing 00:00:00
How Tables are Bucketed 00:00:00
Using Bucketed Tables 00:00:00
Sampling 00:00:00
Windowing
Windowing Introduced 00:00:00
Windowing – A Simple Example: Cumulative Sum 00:00:00
Windowing – A More Involved Example: Partitioning 00:00:00
Windowing – Special Aggregation Functions 00:00:00
Understanding MapReduce
The basic philosophy underlying MapReduce 00:00:00
MapReduce – Visualized and Explained 00:00:00
MapReduce – Digging a little deeper at every step 00:00:00
MapReduce Logic for Queries: Behind the Scenes
MapReduce Overview: Basic Select-From-Where 00:00:00
MapReduce Overview: Group-By and Having 00:00:00
MapReduce Overview: Joins 00:00:00
Join Optimizations in Hive
Improving Join performance with tables of different sizes 00:00:00
The Where clause in Joins 00:00:00
The Left Semi Join 00:00:00
Map Side Joins: The Inner Join 00:00:00
Map Side Joins: The Left, Right and Full Outer Joins 00:00:00
Map Side Joins: The Bucketed Map Join and the Sorted Merge Join 00:00:00
Custom Functions in Python
Custom functions in Python 00:00:00
Code-Along : Custom Function in Python 00:00:00
Custom functions in Java
Introducing UDFs – you’re not limited by what Hive offers 00:00:00
The Simple UDF: The standard function for primitive types 00:00:00
The Simple UDF: Java implementation for replacetext() 00:00:00
Generic UDFs, the Object Inspector and DeferredObjects 00:00:00
The Generic UDF: Java implementation for containsstring() 00:00:00
The UDAF: Custom aggregate functions can get pretty complex 00:00:00
The UDAF: Java implementation for max() 00:00:00
The UDAF: Java implementation for Standard Deviation 00:00:00
The Generic UDTF: Custom table generating functions 00:00:00
The Generic UDTF: Java implementation for namesplit() 00:00:00
SQL Primer - Select Statements
Select Statements 00:00:00
Select Statements 2 00:00:00
Operator Functions 00:00:00
SQL Primer - Group By, Order By and Having
Aggregation Operators Introduced 00:00:00
The Group By Clause 00:00:00
More Group By Examples 00:00:00
Order By 00:00:00
Having 00:00:00
SQL Primer - Joins
Introduction to SQL Joins 00:00:00
Cross Joins aka Cartesian Joins 00:00:00
Inner Joins 00:00:00
Left Outer Joins 00:00:00
Right and Full Outer Joins, Natural Joins, Self Joins 00:00:00
Appendix
[For Linux/Mac OS Shell Newbies] Path and other Environment Variables 00:00:00
Setting up a Virtual Linux Instance – For Windows Users 00:00:00

