This course is designed to give you fundamental knowledge of data platforms. It covers the key building blocks of data architectures, including extraction, storage, consumption, security, and governance.
During the session, we'll use lectures, hands-on labs, activities, and quizzes to teach you what it takes to design and implement effective data solutions.
By the end of the course, you’ll also understand the goals of architecting data solutions, and will be able to compare and contrast various ingestion models, storage options, and cloud service providers.
This Foundations for Architecting Data Solutions course is available as a private training session that can be delivered via Virtual Classroom or at a location of your choice in South Africa.
Course overview
Who should attend:
This course is ideal for aspiring data architects and data analysts. It also suits data engineers who want to learn how solutions are designed, and BI developers looking for a holistic view of data architectures.
What you'll learn:
By the end of this course, you will be able to:
- Understand what data architecture is, and the high-level building blocks it consists of
- Understand what a data source is, and how to extract data from different sources
- Evaluate the different data storage options available
- Know the difference between batch and streaming data processing, and the applications of each
- Identify the applications of real-time data analytics solutions and the key components required to build a streaming analytics application
- Understand the importance of data governance in designing data solutions
- Understand the importance of data security and privacy
- Build solutions that are compliant with regulatory guidelines
- Understand the importance of non-functional requirements, such as performance, scalability, and reliability
- Build resilient and fault-tolerant data solutions
Prerequisites
To get the most out of this course, you should already work with data in some capacity and be familiar with the concepts of data, databases, data models, and transactional and analytical processing systems.
Course agenda
- Overview of data architecture: Definition, importance, and goals
- Key components of data systems: Data ingestion, storage, processing, analysis, and visualization
- Principles: Benefits and challenges of designing portable data solutions
- Data ingestion methods: Batch vs. streaming data
- Data integration techniques: ETL (Extract, Transform, Load) vs. ELT (Extract, Load, Transform)
- Tools and APIs for data ingestion
- Understanding data lakes vs. data warehouses
- Introduction to lakehouse architecture: Combining the best of data lakes and warehouses
- Data mesh
- Schema design and best practices: Data partitioning, indexing, and cataloging
- Data processing frameworks: Batch and stream processing
- Real-time analytics: Tools and techniques for processing data in real time
- Writing cloud-agnostic data processing jobs: Considerations for portability
- Data consumption: Visualizations and ML introduction
- Data governance: Policies and standards for data management
- Security best practices: Encryption, access control, and audit logs
- Regulatory compliance: GDPR, HIPAA, and other relevant regulations
- Scalability and performance considerations: Vertical vs. horizontal scaling
- Designing for high availability and disaster recovery
- Monitoring and optimization: Tools and strategies for maintaining efficient data solutions