Simple Mage-ai Pipeline from API to Google Cloud Storage

Published: 04 May 2023
Channel: Data Slinger

A quick run-through of using mage-ai to get data from an API, do a simple transformation, and save it as Parquet in a GCP bucket.

Note: In the video I state that the format argument parquet needs to be lowercase, but both upper and lowercase appear to work. Oops. :-)

HUGE thanks to Xiaoyou Wang in the Mage Slack for helping me through my errors when putting this together, and to the entire Mage Slack team/community. You all rock!

Errata:

At the end you see me set the trigger. Trigger times in Mage are in UTC, so I should have taken that into account when setting the trigger time.
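To work out what to type into the trigger, you can convert the local time you want into UTC first. This is only a rough sketch using the standard library: the fixed UTC-5 offset and the helper name local_to_utc are assumptions made up for illustration (they are not part of Mage), and a fixed offset ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local zone: a fixed UTC-5 offset, for illustration only.
# A fixed offset does not account for daylight saving time.
LOCAL_TZ = timezone(timedelta(hours=-5))


def local_to_utc(local_naive: datetime, tz: timezone = LOCAL_TZ) -> datetime:
    """Attach the local offset to a naive datetime and convert it to UTC."""
    return local_naive.replace(tzinfo=tz).astimezone(timezone.utc)


# The local wall-clock time you want the pipeline to run at.
desired_local = datetime(2023, 5, 4, 9, 0)

# The equivalent UTC time to enter when setting the trigger.
trigger_utc = local_to_utc(desired_local)
print(trigger_utc.strftime("%Y-%m-%d %H:%M"))
```

Swap in your own offset (or a proper time zone database) before relying on the result.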

In the video I did not convert the Date column to datetime. A "hacky" way to do this is to use the example fill_in_missing_values transformer block and cut its transform_df function down to just:

import pandas as pd
from pandas import DataFrame

def transform_df(df: DataFrame, *args, **kwargs):
    # Parse the 'Date' column into pandas datetime values
    df['Date'] = pd.to_datetime(df['Date'])

    return df
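If you want to check the conversion outside of Mage first, something like the following should do. It is a minimal sketch: the sample rows and the Value column are made up for illustration, and the function is a plain standalone copy of the one above rather than the full Mage block.

```python
import pandas as pd


def transform_df(df: pd.DataFrame, *args, **kwargs) -> pd.DataFrame:
    # Parse the 'Date' column from strings into pandas datetime values.
    df['Date'] = pd.to_datetime(df['Date'])
    return df


# Made-up sample data standing in for what the API returns.
sample = pd.DataFrame({
    'Date': ['2023-05-01', '2023-05-04'],
    'Value': [1, 2],
})

result = transform_df(sample)

# The 'Date' column should now have a datetime dtype instead of object.
print(result.dtypes)
```

Once the dtype shows up as a datetime type here, the same line should behave the same way inside the transformer block.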