Jellyfish is an award-winning Google Cloud Partner. Our trainers work with Google Cloud on a daily basis, so you'll benefit from the years of industry experience they’ll share with you.
On this one-day course, you'll learn ways to address data engineering challenges. We'll cover the role of a data engineer and show you how to identify data engineering tasks and the core components used on Google Cloud.
You'll also learn how to create and deploy data pipelines of varying patterns on Google Cloud, as well as how to identify and utilise various automation techniques on the platform.
This Introduction to Data Engineering on Google Cloud course is available as a private session that can be delivered virtually or at a location of your choice in South Africa.
Course overview
Who should attend:
This course is ideal for data engineers, database administrators and system administrators, as well as any other individuals interested in learning about data engineering techniques on Google Cloud.
What you'll learn:
By the end of this course, you will be able to:
- Understand the role of a data engineer
- Identify data engineering tasks and core components used on Google Cloud
- Understand how to create and deploy data pipelines of varying patterns on Google Cloud
- Identify and utilise various automation techniques on Google Cloud
Prerequisites
To get the most out of this course, you should have prior Google Cloud experience at the fundamental level, especially with using Cloud Shell and accessing products from the Google Cloud console. Basic proficiency with a common query language such as SQL, experience with data modelling and ETL (extract, transform, load) activities, and experience developing applications in a common programming language such as Python are also recommended.
Course agenda
- The role of a data engineer
- Data sources versus data sinks
- Data formats
- Storage solution options on Google Cloud
- Metadata management options on Google Cloud
- Sharing datasets using Analytics Hub
- Replication and migration architecture
- The gcloud command-line tool
- Moving datasets
- Datastream
- Extract and load architecture
- The bq command-line tool
- BigQuery Data Transfer Service
- BigLake
- Extract, load, and transform (ELT) architecture
- SQL scripting and scheduling with BigQuery
- Dataform
- Extract, transform, and load (ETL) architecture
- Google Cloud GUI tools for ETL data pipelines
- Batch data processing using Dataproc
- Streaming data processing options
- Bigtable and data pipelines
- Automation patterns and options for pipelines
- Cloud Scheduler and Workflows
- Cloud Composer
- Cloud Run functions
- Eventarc