01 Concepts and Best Practices | Databricks for Data Engineering | Databricks Hindi Podcast 2025

Published: 02 June 2025
Channel: Prajesh Jha

Welcome to DE01 – a 1.5-hour deep dive into data engineering concepts and best practices using Databricks. This session covers 14 focused topics that every data engineer should understand. Whether you're working with batch or streaming pipelines, building CDC-based architectures, or optimizing performance, this video has you covered.

Source - https://docs.databricks.com/aws/en/da...
Audio generated using https://notebooklm.google/

🎧 Chapters / Audio Topics Included:

00:00 - 01 Data engineering with Databricks
05:15 - 02 Procedural vs. declarative data processing
10:44 - 03 Batch vs. streaming data processing
16:29 - 04 Tables and views in Databricks
23:16 - 05 What is Change Data Capture (CDC)?
30:13 - 06 Work with joins on Databricks
36:56 - 07 Aggregate data on Databricks
42:39 - 08 Optimize join performance
48:00 - 09 Data modeling – Best Practices
53:42 - 10 Configure RocksDB state store on Databricks
01:00:30 - 11 Asynchronous state checkpointing for stateful queries
01:06:53 - 12 Asynchronous progress tracking
01:13:20 - 13 Production considerations for Structured Streaming
01:19:42 - 14 Clean and validate data with batch or stream processing

🛠️ Tools Mentioned: Databricks, Apache Spark, RocksDB
📘 Ideal for: Data engineers, analytics engineers, and architects building modern data platforms.

💬 Drop your questions or feedback in the comments.

#DataEngineering #Databricks #ApacheSpark #BatchProcessing #StreamingData #CDC #StructuredStreaming #BigData #ETL #ELT #DataPipelines #DataModeling #Joins #PerformanceTuning #PerformanceOptimization #StatefulProcessing #DataValidation