Building Batch Data Pipelines on Google Cloud (BBDP) Online

Location: Online

Fast Lane Institute for Knowledge Transfer GmbH
Provider rating: 8.9 (average of 33 reviews)

Starting dates and places

  • Online Training: 2 Apr 2025
  • Online Training: 20 Aug 2025
  • Online Training: 26 Nov 2025

Description

Prerequisites

  • Experience with data modeling and ETL (extract, transform, load) activities.
  • Experience with developing applications by using a common programming language such as Python or Java.

Target audience

This course is intended for developers who are responsible for designing pipelines and architectures for data processing.

Detailed course content

Module 1 - Introduction to Building Batch Data Pipelines

Topics:

  • EL, ELT, ETL
  • Quality considerations
  • How to conduct operations in BigQuery
  • Shortcomings
  • ETL to solve data quality issues

Objectives:

  • Review different methods of loading data into your data lakes and warehouses: EL, ELT, and ETL.

Module 2 - Executing Spark on Dataproc

Topics:

  • The Hadoop ecosystem
  • Run Hadoop on Dataproc
  • Cloud Storage instead of HDFS
  • Optimizing Dataproc

Objectives:

  • Review the Hadoop ecosystem.
  • Discuss how to lift and shift your existing Hadoop workloads to the cloud using Dataproc.
  • Explain when to use Cloud Storage instead of HDFS storage.
  • Explain how to optimize your Dataproc jobs.

Module 3 - Serverless Data Processing with Dataflow

Topics:

  • Introduction to Dataflow
  • Why customers value Dataflow
  • Dataflow pipelines
  • Aggregate with GroupByKey and Combine
  • Side inputs and windows
  • Dataflow templates

Objectives:

  • Identify the features that customers value in Dataflow.
  • Discuss core concepts in Dataflow.
  • Review the use of Dataflow templates and SQL.
  • Write a simple Dataflow pipeline and run it both locally and on the cloud.
  • Identify map and reduce operations, execute the pipeline, and use command line parameters.
  • Read data from BigQuery into Dataflow and use the output of a pipeline as a side input to another pipeline.
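
As a rough illustration of the aggregation topics in this module, the sketch below mimics in plain Python what grouping by key and combining per key do conceptually. It does not use the Apache Beam SDK that Dataflow pipelines are written with; the function names `group_by_key` and `combine_per_key` and the sample data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def group_by_key(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Collect every value that shares a key into one list (the grouping step)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def combine_per_key(
    pairs: Iterable[Tuple[str, int]],
    combine_fn: Callable[[List[int]], int],
) -> Dict[str, int]:
    """Reduce the values of each key to a single result (the combining step)."""
    return {key: combine_fn(values) for key, values in group_by_key(pairs).items()}


# Hypothetical sample data: (category, amount) pairs.
sales = [("books", 12), ("games", 30), ("books", 8)]

grouped = group_by_key(sales)          # values collected per key
totals = combine_per_key(sales, sum)   # one aggregated value per key
```

Grouping keeps every value for a key, while combining reduces them to one result per key. In a distributed runner, a combine step can usually be applied to partial results before data is shuffled, which is one reason to prefer it when only the aggregate is needed.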

Module 4 - Manage Data Pipelines with Cloud Data Fusion and Cloud Composer

Topics:

  • Building batch data pipelines visually with Cloud Data Fusion
    • Components
    • UI overview
    • Building a pipeline
    • Exploring data using Wrangler
  • Orchestrating work between Google Cloud services with Cloud Composer
    • Apache Airflow environment
    • DAGs and operators
    • Workflow scheduling
    • Monitoring and logging

Objectives:

  • Discuss how to manage your data pipelines with Cloud Data Fusion and Cloud Composer.
  • Summarize how Cloud Data Fusion allows data analysts and ETL developers to wrangle data and build pipelines in a visual way.
  • Describe how Cloud Composer can help to orchestrate the work across multiple Google Cloud services.
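
Cloud Composer workflows are expressed as Apache Airflow DAGs, in which a task runs only after the tasks it depends on have finished. The sketch below shows that ordering idea using Python's standard `graphlib` module (Python 3.9+) rather than Airflow itself; the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
dependencies = {
    "transform": {"extract"},
    "load": {"transform"},
}

# static_order() yields tasks so that every dependency comes before its dependents.
order = list(TopologicalSorter(dependencies).static_order())

for task in order:
    print(f"running {task}")
```

An orchestrator adds scheduling, retries, monitoring, and logging on top of this dependency ordering.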