[Video 168] Joris Van den Bossche: Introduction to Pandas

Pandas is a Python library for reading and manipulating structured data, and it is quickly becoming a standard tool among data scientists who need to clean and analyze data. Indeed, that’s what many data scientists (and other analysts) spend a great deal of their time doing: they take dirty data sets and clean them. After cleaning, the data has to be manipulated. Finally, the results need to be displayed for others to see and use. Pandas makes all of these tasks both fairly easy and efficient, thanks in no small part to its use of NumPy arrays.
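
To make the clean, manipulate, and display workflow concrete, here is a minimal sketch. It is not taken from the talk: the CSV contents, the column names (city, temperature), and the helper functions load_and_clean and mean_temperature_by_city are all made up for illustration. Only standard Pandas calls are used.

```python
import io

import pandas as pd

# Hypothetical "dirty" data, invented for this example: stray whitespace
# around city names, one missing temperature, and one duplicated row.
RAW_CSV = """city,temperature
 Paris ,21.5
Ghent,
Ghent,18.0
 Paris ,21.5
Antwerp,19.0
"""


def load_and_clean(csv_text: str) -> pd.DataFrame:
    """Read CSV text and apply a few common cleaning steps."""
    df = pd.read_csv(io.StringIO(csv_text))
    # Strip the whitespace that surrounds some city names.
    df["city"] = df["city"].str.strip()
    # Drop rows that have no temperature reading.
    df = df.dropna(subset=["temperature"])
    # Remove rows that are exact duplicates of an earlier row.
    df = df.drop_duplicates()
    return df.reset_index(drop=True)


def mean_temperature_by_city(df: pd.DataFrame) -> pd.Series:
    """Manipulate the cleaned data: average temperature per city."""
    return df.groupby("city")["temperature"].mean()


cleaned = load_and_clean(RAW_CSV)
summary = mean_temperature_by_city(cleaned)

# Display the result for others to read.
print(summary)
```

In a real project the data would more likely come from a file passed to pd.read_csv, and the cleaning steps would depend on what is actually wrong with the data; the three steps here are just common examples.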

In this talk by researcher Joris Van den Bossche, we’re introduced to Pandas, learning not only about its functionality but also about where and how to use it. If you’re a data scientist, or are just experimenting with this kind of data manipulation, then this talk will help you to understand Pandas from the perspective of someone who uses it every day.

Slides for the talk are at http://www.slideshare.net/PoleSystematicParisRegion/track-13-joris-van-den-bossche.
