Python Pandas For Your Grandpa - 4.9 Challenge: Session Groups

Published: 01 February 2021
Channel: GormAnalysis

You run an ecommerce site called shoesfordogs.com. You want to do some analysis of your visitors, so you compile a DataFrame called hits that represents each time a visitor hit some page on your site.

You suspect that the undocumented third-party tracking system on your website is buggy and sometimes splits one session into two or more session_ids. You want to correct this behavior by creating a field called session_group_id that stitches broken session_ids together.

Two sessions, A and B, should belong to the same session group if

1. They have the same visitor_id and
2.a. Their hits overlap in time or
2.b. The latest hit from A is within five minutes of the earliest hit from B, or vice versa

Grouping is also transitive: if A is grouped with B, and B is grouped with C, then A should be grouped with C as well.
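
One way to read rules 2.a and 2.b is as a single check on the gap between two sessions' time spans. The sketch below is only an illustration: the function name sessions_linked, the tuple layout, and the choice to treat "within five minutes" as inclusive are all assumptions, and it presumes both sessions already share a visitor_id.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=5)  # assumption: "within five minutes" is inclusive


def sessions_linked(a_first, a_last, b_first, b_last, max_gap=MAX_GAP):
    """Return True if two sessions of the same visitor satisfy rule 2.a or 2.b.

    Each session is described by the times of its earliest and latest hit.
    """
    # Later start minus earlier end: zero or negative when the spans overlap,
    # otherwise the idle time between the two sessions.
    gap = max(a_first, b_first) - min(a_last, b_last)
    return gap <= max_gap


# Hypothetical sessions as (earliest hit, latest hit) pairs.
A = (datetime(2021, 1, 1, 10, 0), datetime(2021, 1, 1, 10, 2))
B = (datetime(2021, 1, 1, 10, 6), datetime(2021, 1, 1, 10, 8))
C = (datetime(2021, 1, 1, 10, 12), datetime(2021, 1, 1, 10, 13))

print(sessions_linked(*A, *B))  # A and B are 4 minutes apart
print(sessions_linked(*B, *C))  # B and C are 4 minutes apart
print(sessions_linked(*A, *C))  # A and C are 10 minutes apart
```

Here A and C fail the pairwise check, yet both are linked to B, so all three belong to one session group. That is why a purely pairwise comparison is not enough on its own.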

Create a column in hits called session_group_id that identifies which hits belong to the same session group.
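
A minimal sketch of one possible approach follows; it is not necessarily the solution shown in the video. It collapses hits to one row per session, sorts each visitor's sessions by start time, and opens a new group whenever a session starts more than five minutes after the latest end seen so far. The sample data and the column name datetime are assumptions (the description only names visitor_id and session_id), and the sketch assumes a session_id is never shared between visitors.

```python
import pandas as pd

# Hypothetical sample data; the "datetime" column name is an assumption.
hits = pd.DataFrame({
    "visitor_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "session_id": ["a", "a", "b", "c", "d", "e", "d", "f"],
    "datetime": pd.to_datetime([
        "2021-01-01 10:00", "2021-01-01 10:10",  # session a
        "2021-01-01 10:13",                      # b: 3 minutes after a ends
        "2021-01-01 11:00",                      # c: far from a and b
        "2021-01-01 10:00",                      # d starts
        "2021-01-01 10:04",                      # e: falls inside d
        "2021-01-01 10:30",                      # d ends
        "2021-01-01 10:33",                      # f: 3 minutes after d ends
    ]),
})

max_gap = pd.Timedelta(minutes=5)  # assumption: the boundary is inclusive

# One row per session with its earliest and latest hit.
sessions = (
    hits.groupby(["visitor_id", "session_id"], as_index=False)
        .agg(start=("datetime", "min"), end=("datetime", "max"))
        .sort_values(["visitor_id", "start"], ignore_index=True)
)

# Latest end time among this visitor's sessions up to and including each row,
# then shifted by one so each row sees only the sessions before it.
running_end = sessions.groupby("visitor_id")["end"].cummax()
prev_end = running_end.groupby(sessions["visitor_id"]).shift()

# A session opens a new group if it is the visitor's first session, or if it
# starts more than max_gap after every earlier session of that visitor ended.
is_new_group = prev_end.isna() | (sessions["start"] > prev_end + max_gap)
sessions["session_group_id"] = is_new_group.cumsum()

# Attach the group id to every hit.
hits = hits.merge(
    sessions[["visitor_id", "session_id", "session_group_id"]],
    on=["visitor_id", "session_id"],
    how="left",
)
print(hits)
```

Using the running maximum of end times, rather than just the previous session's end, is what handles transitivity: session f is grouped with d even though the session sorted immediately before it (e) ended 29 minutes earlier.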

0:00 - intro / setup
1:06 - solution

-- Code -----------------------
https://www.practiceprobs.com/problem...

-- Vids & Playlists ---------------------------------
Google Colab - Introduction to Google Colab
NumPy - Python NumPy For Your Grandma
Pandas - Python Pandas For Your Grandpa
Neural Networks - Neural Networks For Your Dog

-- Subscribe To Mailing List ---------------------------------
https://eepurl.com/hC1Pmj

-- Song ---------------------------------
We Be Chillin by Mikey Geiger

-- Support -----------------------
https://merchonate.com/gormanalysis