This lesson is still being designed and assembled (Pre-Alpha version)

Data Pre-Processing using Python


In this workshop, we will look at the steps involved in data pre-processing and visualization, and at the Python libraries that can be used to carry them out.

The dataset used in this workshop is “auto-mpg.csv”. It contains information about various attributes of cars, including fuel consumption in miles per gallon (mpg), and was collected by Carnegie Mellon University. We will perform data pre-processing on this dataset during the workshop. Additionally, as homework, you will be required to visualize this dataset.
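To give a feel for what pre-processing a file like “auto-mpg.csv” can involve, here is a minimal sketch using only the Python standard library. The column names, the sample rows, the use of "?" as a missing-value marker, and the median fill strategy are all assumptions made for illustration; they are not taken from the workshop's file, and the workshop itself may use different libraries and steps.

```python
import csv
import io
import statistics

# Hypothetical sample rows: the column names and values are illustrative only
# and are NOT taken from the workshop's auto-mpg.csv.
SAMPLE_CSV = """mpg,cylinders,horsepower,weight
18.0,8,130,3504
25.0,4,?,2046
24.0,4,95,2372
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one dict per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def to_number(value):
    """Convert a cell to float; treat '?' or an empty cell as missing (None)."""
    value = value.strip()
    if value in ("", "?"):
        return None
    return float(value)


def preprocess(rows):
    """Convert every cell to a number, then fill gaps with the column median."""
    columns = list(rows[0].keys())
    numeric = [{col: to_number(row[col]) for col in columns} for row in rows]
    for col in columns:
        known = [r[col] for r in numeric if r[col] is not None]
        fill = statistics.median(known)  # assumed fill strategy
        for r in numeric:
            if r[col] is None:
                r[col] = fill
    return numeric


clean_rows = preprocess(load_rows(SAMPLE_CSV))
for row in clean_rows:
    print(row)
```

To try this on a real file, the inline sample could be replaced with the text read from the file, for example `open("auto-mpg.csv").read()`, provided every column in that file is numeric or has been dropped beforehand.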

Workshop goals


Pre-requisites

STOP: before starting this workshop, please attend the following Digital Scholarship Lab workshop(s):

Workshop Content

| Time Estimate | Section | Keypoints |
|---|---|---|
| | Pre-Workshop Setup | Install required software and download files required for the lesson |
| 00:00 | 1. Steps In A Data Science Project | What is Data Science? What are the steps in a Data Science Project? |
| 00:00 | 2. Steps in Data Pre-Processing | What is Data Processing? How to create a Variable? |
| 00:00 | 3. Visualizing data in Python | What do you need to visualize data in Python? Types of plots in Python |
| 00:00 | Finish | Please fill out the workshop survey |

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.

Workshop Recording

Survey

Thank you for attending this workshop or reading through the workshop material! If you could take 3-5 minutes to respond to our anonymous survey, it will help us continue to improve this workshop. We appreciate any and all feedback!