Learn how to build a minimal, functional data lakehouse with Apache Iceberg, Project Nessie and Apache Spark in this video walkthrough.
A data lakehouse is an architectural approach to data management that combines features of data warehouses, such as transactional tables and SQL analytics, with the low-cost, open storage of data lakes. It lets enterprises store, manage and analyze their data at scale on a single platform.
Apache Iceberg is an open-source table format designed for efficient storage and processing of large-scale analytic datasets. It tracks the data files that make up each table in metadata, which lets multiple engines read and write the same tables safely and supports features such as schema evolution and time travel.
Project Nessie is an open-source catalog for managing tables stored in Apache Iceberg. It keeps track of each table's current metadata and adds Git-like version control, so users can work with branches, tags and commits across their tables.
Finally, Apache Spark is an open-source framework for distributed computing that lets developers write applications in Java, Scala, Python, R or SQL. It provides high-performance analytics on large datasets stored in HDFS, cloud object storage or other distributed storage systems.
In this video walkthrough, Dremio Developer Advocate Dipankar Mazumdar shows how to build a minimal functional lakehouse using Apache Iceberg as the table format and Project Nessie as the catalog. He walks through setting up the environment, configuring the necessary settings, creating tables with Iceberg and Nessie, and running queries on them using Apache Spark. Each step comes with a detailed explanation, so you can get started with your own data lakehouse project quickly and easily.
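As a rough illustration of the configuration step, here is a minimal Python sketch of the Spark settings that register a Nessie-backed Iceberg catalog, followed by a few SQL statements that create and query a table. It is not taken from the video. The catalog name `nessie`, the local Nessie URI, the warehouse path and the `demo.sales` table are all assumptions chosen for the example. The Iceberg and Nessie Spark runtime jars matching your Spark version also need to be on the classpath, which is not shown here.

```python
from typing import Dict, List


def nessie_catalog_conf(
    catalog: str = "nessie",                       # assumed catalog name
    uri: str = "http://localhost:19120/api/v1",    # assumed local Nessie server
    ref: str = "main",                             # Nessie branch to read and write
    warehouse: str = "file:///tmp/lakehouse",      # assumed warehouse location
) -> Dict[str, str]:
    """Return Spark settings for an Iceberg catalog backed by Nessie."""
    prefix = f"spark.sql.catalog.{catalog}"
    extensions = [
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        "org.projectnessie.spark.extensions.NessieSparkSessionExtensions",
    ]
    return {
        "spark.sql.extensions": ",".join(extensions),
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "org.apache.iceberg.nessie.NessieCatalog",
        f"{prefix}.uri": uri,
        f"{prefix}.ref": ref,
        f"{prefix}.authentication.type": "NONE",   # no auth, local demo only
        f"{prefix}.warehouse": warehouse,
    }


def demo_statements(
    catalog: str = "nessie", namespace: str = "demo", table: str = "sales"
) -> List[str]:
    """Return SQL that creates, fills and queries a hypothetical Iceberg table."""
    name = f"{catalog}.{namespace}.{table}"
    return [
        f"CREATE NAMESPACE IF NOT EXISTS {catalog}.{namespace}",
        f"CREATE TABLE IF NOT EXISTS {name} (id BIGINT, amount DOUBLE) USING iceberg",
        f"INSERT INTO {name} VALUES (1, 9.99), (2, 24.50)",
        f"SELECT id, amount FROM {name} ORDER BY id",
    ]


def run_demo() -> None:
    """Apply the settings to a SparkSession and run the statements.

    Requires pyspark plus the Iceberg and Nessie runtime jars, and a
    Nessie server reachable at the configured URI.
    """
    from pyspark.sql import SparkSession  # imported lazily; optional dependency

    builder = SparkSession.builder.appName("lakehouse-demo")
    for key, value in nessie_catalog_conf().items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()

    statements = demo_statements()
    for statement in statements[:-1]:
        spark.sql(statement)
    spark.sql(statements[-1]).show()
```

Calling `run_demo()` only works with a Nessie server running and the jars available. The two helper functions have no dependencies, so you can use them on their own to inspect or adapt the settings, for example by turning the dictionary into `--conf key=value` flags for `spark-submit`.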
If you’re looking for a step-by-step guide to using Apache Iceberg, Project Nessie and Apache Spark together, this video walkthrough will help you get your own functional data lakehouse up and running quickly.
Connect with us!
Twitter: https://bit.ly/30pcpE1
LinkedIn: https://bit.ly/2PoqsDq
Facebook: https://bit.ly/2BV881V
Community Forum: https://bit.ly/2ELXT0W
GitHub: https://bit.ly/3go4dcM
Blog: https://bit.ly/2DgyR9B
Questions: https://bit.ly/30oi8tX
Website: https://bit.ly/2XmtEnN