Kasia works as a data scientist at Medium. She has a master’s in theoretical physics from University College London and moved from London to San Francisco after graduating. In her spare time, she enjoys volunteering for women-related organizations and diversity causes, scuba diving, and traveling. Kasia is also a San Francisco PyLadies organizer.
For developers and analysts who use SQL, moving calculations to Python may initially seem daunting. In my talk, I will discuss different options for data munging before introducing how to use pandas to ‘translate’ certain SQL functions, starting with simple aggregations and joins and moving on to window functions and rolling averages.
This talk draws on my experience of trying different ways to use SQL and pandas together: the SQL Jupyter extension, a Python SQL module, connecting to a database through Python and, finally, ‘translating’ all calculations into pandas. I will touch on the advantages and disadvantages of each method and then dive deeper into slicing and dicing pandas DataFrames and performing joins, unions, aggregations and more advanced calculations such as window functions and rolling averages. All of the calculations shown will use pandas, one of the most common data science libraries.
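To give a flavor of what such a ‘translation’ can look like, here is a minimal sketch. It is not material from the talk itself: the `posts` and `authors` tables, their columns and the values are hypothetical, and an in-memory SQLite database stands in for whatever database you would actually connect to. Each pandas statement is preceded by a comment with the SQL it roughly corresponds to.

```python
import sqlite3

import pandas as pd

# Hypothetical toy tables; names and values are for illustration only.
posts = pd.DataFrame({
    "post_id": [1, 2, 3, 4],
    "author_id": [10, 10, 20, 20],
    "day": [1, 2, 1, 3],
    "reads": [100, 300, 50, 150],
})
authors = pd.DataFrame({"author_id": [10, 20], "name": ["ana", "ben"]})

# Option 1: connect to a database through Python and run the SQL there.
conn = sqlite3.connect(":memory:")
posts.to_sql("posts", conn, index=False)
sql_totals = pd.read_sql_query(
    "SELECT author_id, SUM(reads) AS total_reads "
    "FROM posts GROUP BY author_id ORDER BY author_id",
    conn,
)
conn.close()

# Option 2: translate the same calculations into pandas.

# SELECT author_id, SUM(reads) AS total_reads FROM posts GROUP BY author_id
pd_totals = posts.groupby("author_id", as_index=False).agg(
    total_reads=("reads", "sum")
)

# SELECT p.*, a.name FROM posts p LEFT JOIN authors a USING (author_id)
joined = posts.merge(authors, on="author_id", how="left")

# SELECT * FROM posts UNION ALL SELECT * FROM posts
unioned = pd.concat([posts, posts], ignore_index=True)

# Window functions: sort first, then compute within each partition.
posts = posts.sort_values(["author_id", "day"])

# SUM(reads) OVER (PARTITION BY author_id ORDER BY day
#                  ROWS UNBOUNDED PRECEDING)
posts["running_reads"] = posts.groupby("author_id")["reads"].cumsum()

# AVG(reads) OVER (PARTITION BY author_id ORDER BY day
#                  ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
posts["rolling_avg"] = (
    posts.groupby("author_id")["reads"]
    .rolling(window=2, min_periods=1)
    .mean()
    .reset_index(level=0, drop=True)  # drop the group level to realign rows
)

print(pd_totals)
print(posts)
```

In this sketch the SQL route and the pandas route produce the same per-author totals. The pandas version keeps everything in one DataFrame, which makes it easy to chain further steps, at the cost of loading the data into memory first.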