Pandas Recipes for New Python Users

Couldn't live without it anymore...

2022-03-21

Eventually I got to the point in data analytics where keeping things in lists, or lists of lists, was no longer cutting it. My processing was starting to grind to a halt, and things were getting way too abstract.

I decided to call up a friend who had worked in the business longer than me, and they suggested “pandas”. I was vaguely familiar with it, as users/clients had used it in the past. After a casual read of the documentation, a “DataFrame” did sound like it would take care of a lot of my problems…

Fast forward a year and pandas is now core to everything I do in Python. I couldn’t live without it anymore, and as such my second talk at SHARCNET was about pandas.

Below is my abstract for the talk as well as the recording:

“Often programmers find themselves in need of an effective way of working with ‘labeled’ data. In the case of Python, Pandas is the most mature and reliable package that interacts effectively with other well-known packages such as NumPy and TensorFlow. As a package, Pandas is said to provide ‘fast, flexible, and expressive data structures’ for what is known as ‘labeled data’. Features include: easy handling of missing data points, grouping functionality, simple indexing, time series support, numerous conversion functions, and more. This webinar will provide a basic introduction on how to install Pandas, a discussion of its strengths and various use cases, and lastly a demonstration of various common operations (recipes) that occur with labeled data. Experience with beginner Python concepts will be expected, while familiarity with Jupyter notebooks will be helpful. Webinar material and code will be made available on GitHub for future reference.”
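
To give a flavour of the features listed in the abstract, here is a minimal sketch that touches each one: missing data, grouping, indexing, time series, and conversion. It is not the code from the webinar. The sample data, the column names (`site`, `temp_c`), and the choice to fill gaps with each site's mean are all assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: daily temperature readings from two sites,
# with two missing values. Names and numbers are illustrative only.
df = pd.DataFrame(
    {
        "site": ["A", "A", "B", "B", "A", "B"],
        "temp_c": [20.5, np.nan, 18.0, 19.5, 22.0, np.nan],
    },
    index=pd.date_range("2022-03-01", periods=6, freq="D", name="date"),
)

# Missing data: count the gaps, then fill each one with the mean of its site.
n_missing = int(df["temp_c"].isna().sum())
site_mean = df.groupby("site")["temp_c"].transform("mean")
filled = df.assign(temp_c=df["temp_c"].fillna(site_mean))

# Grouping: average temperature per site.
mean_by_site = filled.groupby("site")["temp_c"].mean()

# Indexing: label-based selection of a date range (both ends inclusive).
first_three = filled.loc["2022-03-01":"2022-03-03"]

# Time series: downsample the daily readings into 2-day averages.
two_day_mean = filled["temp_c"].resample("2D").mean()

# Conversion: hand the data to NumPy, or back to plain Python records.
as_array = filled["temp_c"].to_numpy()
as_records = filled.reset_index().to_dict(orient="records")

print(mean_by_site)
print(two_day_mean)
```

Filling with a per-site mean is only one of several reasonable strategies; `dropna()` or interpolation may suit other data better. The last line is a nod to where I started: `to_dict(orient="records")` gives you back a plain list of dictionaries if something downstream still expects one.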