A first-principles guide to working with Workflows, Coordinators and Bundles in Oozie.
Oozie can seem formidable because its job definitions are written entirely in XML, which is hard to debug when things go wrong. However, once you’ve figured out how to work with it, it’s like magic. Complex dependencies, a multitude of jobs on different time schedules, and entire data pipelines all become easy to manage with Oozie.
Oozie allows you to manage Hadoop jobs as well as Java programs, scripts and any other executable with the same basic setup. It manages your dependencies cleanly and logically.
Knowing the right configuration parameters to get a job done is the key to mastering Oozie. In this course you will:
- Install and set up Oozie
- Configure Workflows to run jobs on Hadoop
- Configure time-triggered and data-triggered Workflows
- Configure data pipelines using Bundles
Working with Oozie requires some basic knowledge of the Hadoop ecosystem and of running MapReduce jobs.
Who is this course intended for?
Engineers, analysts and sysadmins who are interested in big data processing on Hadoop.
This course is not recommended for beginners who have no knowledge of the Hadoop ecosystem.
Loonycorn is us, Janani Ravi and Vitthal Srinivasan. Between us, we have studied at Stanford, been admitted to IIM Ahmedabad and have spent years working in tech, in the Bay Area, New York, Singapore and Bangalore.
Janani: 7 years at Google (New York, Singapore); Studied at Stanford; also worked at Flipkart and Microsoft
Vitthal: Also Google (Singapore) and studied at Stanford; Flipkart, Credit Suisse and INSEAD too
We think we might have hit upon a neat way of teaching complicated tech courses in a funny, practical, engaging way, which is why we are so excited to be here on Learnsector!
We hope you will try our offerings and think you’ll like them 🙂
A Brief Overview Of Oozie
- What is Oozie?
- Oozie architectural components
Oozie Install And Set Up
- Installing Oozie on your machine
Workflows: A Directed Acyclic Graph Of Tasks
- Running MapReduce on the command line
- The lifecycle of a Workflow
- Running our first Oozie Workflow MapReduce application
- The job.properties file
- The workflow.xml file
- A Shell action Workflow
- Control nodes, Action nodes and Global configurations within Workflows
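The Workflow lessons above center on two files: job.properties, which supplies configuration values, and workflow.xml, which defines the DAG of control and action nodes. As a rough sketch of the kind of Shell-action Workflow covered here (the app name, start/end node names and echo command are illustrative placeholders, not taken from the course), a minimal workflow.xml might look like:

```xml
<!-- Minimal Shell-action Workflow: start -> shell action -> end, with a kill node on error -->
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="shell-node"/>
  <action name="shell-node">
    <shell xmlns="uri:oozie:shell-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>echo</exec>
      <argument>Hello Oozie</argument>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

A matching job.properties would define the ${nameNode} and ${jobTracker} variables plus oozie.wf.application.path (the HDFS directory holding workflow.xml), and the job is submitted with `oozie job -config job.properties -run`.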
Coordinators: Managing Workflows
- Running our first Coordinator application
- A time-triggered Coordinator definition
- Coordinator control mechanisms
- Data availability triggers
- Running a Coordinator which waits for input data
- Coordinator configuration to use data triggers
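A Coordinator wraps a Workflow with a trigger: a time schedule (the frequency attribute) and, optionally, a data-availability condition (a dataset plus an input event). As an illustrative sketch combining both trigger types discussed above (the names, dates and URI template are placeholders, not from the course), a coordinator.xml might look like:

```xml
<!-- Runs the wrapped Workflow once a day, but only after that day's input directory exists in HDFS -->
<coordinator-app name="demo-coord" frequency="${coord:days(1)}"
                 start="2024-01-01T00:00Z" end="2024-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <dataset name="input" frequency="${coord:days(1)}"
             initial-instance="2024-01-01T00:00Z" timezone="UTC">
      <uri-template>${nameNode}/data/input/${YEAR}${MONTH}${DAY}</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <!-- The coordinator action waits until the current day's dataset instance is available -->
    <data-in name="wait-for-input" dataset="input">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>${nameNode}/user/${user.name}/demo-wf</app-path>
    </workflow>
  </action>
</coordinator-app>
```

Dropping the datasets and input-events elements leaves a purely time-triggered Coordinator; keeping them makes each scheduled run wait for its input data.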
Bundles: A Collection Of Coordinators For Data Pipelines
- Bundles and why we need them
- The Bundle kick-off time
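A Bundle groups several Coordinators into one data pipeline that can be started, suspended and killed as a unit, with a kick-off time controlling when the pipeline begins. As a minimal sketch (the names, date and path are placeholders, not from the course), a bundle.xml might look like:

```xml
<!-- A Bundle bootstrapping one Coordinator; real pipelines would list several coordinator elements -->
<bundle-app name="demo-bundle" xmlns="uri:oozie:bundle:0.2">
  <controls>
    <!-- The Bundle's Coordinators are not submitted before this kick-off time -->
    <kick-off-time>2024-01-01T00:00Z</kick-off-time>
  </controls>
  <coordinator name="demo-coord">
    <app-path>${nameNode}/user/${user.name}/demo-coord</app-path>
  </coordinator>
</bundle-app>
```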
Installing Hadoop in a Local Environment
- Hadoop Install Modes
- Hadoop Install Step 1: Standalone Mode
- Hadoop Install Step 2: Pseudo-Distributed Mode
- [For Linux/Mac OS Shell Newbies] Path and other Environment Variables
- Setting up a Virtual Linux Instance – For Windows Users