If you’re new to data engineering, data transformations are one more concept to learn. @DataSurfer walks through the main concepts using a relatable use case, with Python examples that apply them in batch using the popular Pandas DataFrame library. We then take that same code and show you how to modify it to use Streaming DataFrames, bringing it into the world of Kafka and stream processing with Quix Streams.
We’ll cover value transformations, schema transformations and stateless/stateful transformations. Check out the GitHub repo with the full source code for the examples and use Docker Compose to try them out on your machine.
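To give a rough feel for the three kinds of transformation before you watch, here is a minimal plain-Python sketch. It is not the code from the video or the repo and uses neither Pandas nor Quix Streams; the event fields (`device_id`, `temperature`, `location`) and the helper names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical example event; the field names are assumptions, not from the video.
event = {
    "device_id": "sensor-1",
    "temperature": 21.456,
    "location": {"lat": 51.5, "lon": -0.1},
}

def value_transform(e):
    """Value transformation: change a value, keep the same fields."""
    return {**e, "temperature": round(e["temperature"], 1)}

def schema_transform(e):
    """Schema transformation: change the shape by flattening 'location'."""
    flat = {k: v for k, v in e.items() if k != "location"}
    flat["lat"] = e["location"]["lat"]
    flat["lon"] = e["location"]["lon"]
    return flat

class RunningCount:
    """Stateful transformation: the output depends on events seen before."""

    def __init__(self):
        self.counts = defaultdict(int)  # state kept between events, per device

    def __call__(self, e):
        self.counts[e["device_id"]] += 1
        return {**e, "events_seen": self.counts[e["device_id"]]}

# The first two functions are stateless: each event is handled on its own.
result = schema_transform(value_transform(event))

counter = RunningCount()
first = counter(result)
second = counter(result)
print(second)
```

The value and schema steps can run on any event in isolation, while `RunningCount` has to remember something between events. In a stream processor that memory would normally live in a managed state store rather than in an in-process dictionary as it does here.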
📥 Source code: https://github.com/quixio/community-h...
🐱 kcat: https://github.com/edenhill/kcat
🐋 Docker Compose: https://docs.docker.com/compose
🏘️ Slack community: https://quix.io/slack-invite
📖 Quix Streams docs: https://quix.io/docs/quix-streams/int...
🌊 Quix Streams on GitHub: https://github.com/quixio/quix-streams
🐍 Quix Streams on PyPI: https://pypi.org/project/quixstreams
0:00 — Intro
0:21 — What is a transform?
1:12 — Use case and example event
1:57 — Value transformation in Pandas
3:49 — Value transformation in Quix Streams
9:42 — Schema transformation in Pandas
12:19 — Schema transformation in Quix Streams
16:10 — Stateless and stateful transformations
17:02 — Stateful transformation in Quix Streams
22:29 — Wrap up