Data Analysis with Python and Pandas


"Attending the bespoke course Data Munging with Pandas at Textkernel has proven to be an excellent choice. Jeroen’s personal approach and highly interactive way of teaching made this course valuable to a diverse group of developers and analysts, as did the possibility to apply theory on our own data and API during the courses. I’ve since been able to code cleaner and more efficient, and applied the pandas package in several monitoring and analytics scripts." - 2020-11-10 09:37
"Attending the bespoke course Data Munging with Pandas at Textkernel has proven to be an excellent choice. Jeroen’s personal approach and hig… read full review - 2020-11-10 09:37
Starting dates and places
Data Science Workshops B.V. offers its products by default in the following regions: 's-Hertogenbosch, Alkmaar, Almere / Lelystad, Alphen aan den Rijn, Amersfoort, Amsterdam, Antwerpen, Apeldoorn, Arnhem, Assen, Breda, Brugge, Brussel, Delft, Den Haag, Deventer, Dordrecht, Drachten, Ede, Eindhoven, Emmen, Enschede, Gent, Gouda, Groningen, Haarlem, Haarlemmermeer, Heerenveen, Hilversum, Leeuwarden, Leiden, Luik, Maastricht, Middelburg, Nijmegen, Roermond, Rotterdam, Terneuzen, Tilburg, Utrecht, Veenendaal, Venlo, Westland, Zaanstad, Zoetermeer, Zwolle
Description
Introduction
Learn how to accelerate your data analyses using Pandas, a Python library specifically designed for working with medium-sized data sets. Together with JupyterLab it provides a convenient environment for interactive data analysis.
Pandas is part of the so-called PyData ecosystem. In this workshop we'll start with an overview of PyData and explain where Pandas stands and how it interacts with other libraries such as NumPy and Seaborn. Pandas introduces a few new data structures, most importantly the DataFrame, which you need to understand in order to work with tabular data efficiently.
Pandas offers many features, and in one day, through a good balance of presentation and interactive exercises, we're going to cover the most important ones, including: importing, filtering, grouping, joining, exploring, and visualising data. By the end of this workshop, you'll understand the fundamentals of Pandas, be aware of common pitfalls, and be ready to perform your own analyses.
What you'll learn
- Load data from text files, spreadsheets, databases, and APIs
- Use the Split-Apply-Combine paradigm to summarise data
- Perform advanced joins and merges
- Generate insightful pivot tables
- Transform data between wide and long formats
- Work with time series data
- Explore data using a variety of visualisation types
- Avoid common pitfalls in NumPy and Pandas by understanding general concepts and principles
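To give a flavour of a few of these topics, here is a minimal sketch of Split-Apply-Combine, a pivot table, and a wide-to-long transformation. The sales table and its column names are invented purely for illustration and are not taken from the workshop materials.

```python
import pandas as pd

# Hypothetical sales data, made up for this sketch.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [100, 120, 80, 90],
})

# Split-Apply-Combine: split by region, apply sum, combine into one Series.
revenue_per_region = sales.groupby("region")["revenue"].sum()

# Pivot table: one row per region, one column per quarter (wide format).
wide = sales.pivot_table(
    index="region", columns="quarter", values="revenue", aggfunc="sum"
)

# Back from wide to long format with melt.
long = wide.reset_index().melt(
    id_vars="region", var_name="quarter", value_name="revenue"
)

print(revenue_per_region)
print(wide)
print(long)
```

The same three steps scale to real data sets: only the grouping columns and the aggregation function change.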
This workshop is for you because
- You have experience in Excel or R and want to learn about Pandas and the PyData ecosystem
- You have programming experience in Python and want to start analysing data using Pandas
- You want to improve your understanding of Pandas
Schedule
- Overview of the PyData ecosystem
  - NumPy, SciPy, Pandas
  - Matplotlib, Seaborn
  - scikit-learn
- Essential data structures
  - NumPy data types
  - NumPy arrays
  - Pandas Series
  - Pandas DataFrame
  - Pandas Index, MultiIndex
- Importing data
  - From CSV
  - From Excel
  - From Databases
  - From APIs
- Manipulating data
  - Selecting rows and columns
  - Filtering rows
  - Joining and concatenating
  - Missing values, duplicates
  - Converting data types
  - Dates and times
  - Working with categorical data
  - String manipulation
- Exploring data
  - Computing aggregate statistics
  - Pivot tables
  - Correlations
- Visualising data
  - Histogram
  - Density plot
  - Boxplot
  - Bar chart
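As an impression of the importing and manipulating parts of the schedule, the sketch below reads a small CSV, converts a column to dates, handles a missing value, filters rows, and joins two tables. The CSV text, the column names, and the customers table are all hypothetical; an in-memory string stands in for a real file.

```python
import io

import pandas as pd

# Hypothetical CSV content; io.StringIO stands in for a file on disk.
csv_text = """order_id,customer,amount,date
1,ann,10.5,2020-11-04
2,bob,20.0,2020-11-09
3,ann,7.25,2020-11-10
4,,3.0,2020-11-10
"""

# Importing data from CSV.
orders = pd.read_csv(io.StringIO(csv_text))

# Converting data types: text to datetime.
orders["date"] = pd.to_datetime(orders["date"])

# Missing values: drop rows without a customer.
orders = orders.dropna(subset=["customer"])

# Filtering rows with a boolean mask.
large = orders[orders["amount"] > 8]

# Joining: add each customer's city with a left join on the shared column.
customers = pd.DataFrame({"customer": ["ann", "bob"], "city": ["Utrecht", "Delft"]})
joined = orders.merge(customers, on="customer", how="left")

# String manipulation and date parts via the .str and .dt accessors.
joined["customer"] = joined["customer"].str.title()
joined["weekday"] = joined["date"].dt.day_name()

print(large)
print(joined)
```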
Prerequisites
You're expected to have some experience with programming in Python. Our workshop Introduction to Programming in Python is one option that can help you with that. Roughly speaking, if you're familiar with the following Python syntax and concepts, then you'll be fine:
- assignment, arithmetic, boolean expression, tuple unpacking
- bool, int, float, list, tuple, dict, str, type casting
- in operator, indexing, slicing
- if, elif, else, for, while
- range(), len(), zip()
- def, (keyword) arguments, default values
- import, import as, from import ...
- lambda functions, list comprehension
- JupyterLab or Jupyter Notebook
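As a rough self-check, the following sketch uses most of the syntax listed above. It is only an illustration with made-up names and values, not an official entry test; if you can read it without much trouble, you probably meet the prerequisites.

```python
# Tuple unpacking, arithmetic, and a boolean expression.
width, height = 3, 4
area = width * height
is_large = area > 10 and width != height

# Lists, dicts, slicing, and the in operator.
names = ["ann", "bob", "cas"]
ages = {"ann": 31, "bob": 27}
first_two = names[:2]
has_cas = "cas" in ages  # checks the dict keys


# A function with a default value for a keyword argument.
def greet(name, greeting="Hello"):
    return greeting + ", " + name.title() + "!"


# A for loop with range() and len(), plus if, elif, and else.
for i in range(len(names)):
    if i == 0:
        print(greet(names[i]))
    elif i == 1:
        print(greet(names[i], greeting="Hi"))
    else:
        print(names[i])

# zip(), a list comprehension, a lambda function, and type casting.
pairs = list(zip(names, [31, 27, 45]))
lengths = [len(n) for n in names]
shout = lambda s: s.upper()
total = int("3") + float("0.5")
```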
Recommended preparation
We're going to use Python together with JupyterLab and the following packages:
- numpy
- pandas
- seaborn
The recommended way to get everything set up is to download and install the Anaconda Distribution.
Alternatively, if you don't want to use Anaconda, then you can install everything using pip. In any case, if running import pandas, seaborn doesn't produce any errors then you know you've set up everything correctly.
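If you'd like a slightly friendlier check than a bare import, the sketch below reports which of the three packages are missing instead of stopping at the first error. It is one possible way to do this, using only the standard library's importlib; the variable names are arbitrary.

```python
import importlib.util

# Packages named in the preparation list above.
required = ["numpy", "pandas", "seaborn"]

# find_spec returns None when a top-level package cannot be found.
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    import numpy
    import pandas
    import seaborn

    print("numpy", numpy.__version__)
    print("pandas", pandas.__version__)
    print("seaborn", seaborn.__version__)
```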
Clients
I’ve previously delivered this workshop at:
- Brabant Water
- Jheronimus Academy of Data Science
- Textkernel
- Transavia
- Vocalink
Testimonials
"At Brabant Water, most of us were still using spreadsheets to clean, analyse, and model our data. Thanks to Jeroen, who delivered an engaging, hands-on workshop at our office, many of us have switched to Python and Jupyter Notebook, which allows our analyses to be much more advanced and reliable."
--Stijn de Jong, Senior Advisor Water Supply, Brabant Water
"Attending the bespoke course Data Munging with Pandas at Textkernel has proven to be an excellent choice. Jeroen's personal approach and highly interactive way of teaching made this course valuable to a diverse group of developers and analysts, as did the possibility to apply theory on our own data and API during the courses. I've since been able to code cleaner and more efficient, and applied the pandas package in several monitoring and analytics scripts."
--Karlijn Dinnissen, Data Quality Analyst, Textkernel
"Attending the bespoke course Data Munging with Pandas at Textkernel has proven to be an excellent choice. Jeroen’s personal approach and highly interactive way of teaching made this course valuable to a diverse group of developers and analysts, as did the possibility to apply theory on our own data and API during the courses. I’ve since been able to code cleaner and more efficient, and applied the pandas package in several monitoring and analytics scripts." - 2020-11-10 09:37
"Attending the bespoke course Data Munging with Pandas at Textkernel has proven to be an excellent choice. Jeroen’s personal approach and hig… read full review - 2020-11-10 09:37
"A very eager teacher who wants to be challenged in his knowledge. You will learn the material by doing and he managed to make it interesting for advanced, rusty and beginner students. " - 2020-11-09 09:33
"A very eager teacher who wants to be challenged in his knowledge. You will learn the material by doing and he managed to make it interesting… read full review - 2020-11-09 09:33
"One of the best training sessions I've had so far. Jeroen is extremely knowledgable, he was able to balance a class of development and non-development members and still got practical results from everyone.
We covered a wide range of topics from Basic Python, Pandas, Data analysis fundamentals, to introductory Machine Learning.
It was amazing how much impact his short stay in our HQ, Kano Nigeria had on my general outlook to Python, Data Analysis and ML.
I highly recommend Data Science workshops.com to anyone at any level" - 2020-11-04 21:35
"One of the best training sessions I've had so far. Jeroen is extremely knowledgable, he was able to balance a class of development and non-d… read full review - 2020-11-04 21:35
"Jeroen had a total understanding of the subject matter and the training was really tailored to my need in understanding data sciences. It kept pushing me and engineering me to understand more about Pandas and Python, putting me in front of where I used to be. Thanks for your mentorship, Jeroen." - 2020-11-04 08:55
"Jeroen had a total understanding of the subject matter and the training was really tailored to my need in understanding data sciences. It ke… read full review - 2020-11-04 08:55
"He makes sure everything works as interactive as possible. So you really get "hands on" experience.
It was possible any moment to ask questions, though that never slowed down the course program too much." - 2020-11-04 07:26
"He makes sure everything works as interactive as possible. So you really get "hands on" experience. It was possible any moment to ask que… read full review - 2020-11-04 07:26