PDS (MIDS) (IDS 720)

A one-semester version of Practical Data Science tailored specifically to Masters of Interdisciplinary Data Science (MIDS) students. Because all MIDS students complete a mandatory, four-week, in-person, intensive summer Python programming bootcamp in August before classes start, this class assumes a strong foundational understanding of the Python standard library. And because this is a one-semester class and all MIDS students take a full NLP class, it skips some topics covered in Practical Data Science II, such as NLP and GIS.

Syllabus

Course syllabus can be found here.

Class Schedule

Class will be Tuesdays and Thursdays from 1:25pm-2:40pm (EDT).

Date, Rm | Topic | Do Before Class | In-Class Exercise

Tues, Aug 27

  • Class Introduction

  • Welcome to VS Code

Thurs, Aug 29

  • Command Line Basics

Tues, Sep 3

  • Advanced Command Line

  • Packages

  • Jupyter

Thurs, Sep 5

  • Python Debugger

  • R / Python Differences

  • Packages

Tues, Sep 10

  • Git

Thurs, Sep 12

  • NumPy Basics

Tues, Sep 17

  • NumPy Arrays

More NumPy Concepts:

Matrices:

ND Arrays:

  • Ex 1

  • (Finish other NumPy)

Thurs, Sep 19

  • Pandas: Series

Tues, Sep 24

  • Pandas: DataFrames

Thurs, Sep 26

  • Pandas: Indices & Missing

Tues, Oct 1

  • Pandas: Cleaning

  • Tracebacks

Thurs, Oct 3

  • Plotting

Tues, Oct 8

  • Plotting

Thurs, Oct 10

  • Merging

Tues, Oct 15

FALL BREAK

Thurs, Oct 17

FALL BREAK (Not technically, but take the week)

Tues, Oct 22

  • Statistics with statsmodels

Thurs, Oct 24

  • Big Data: What is it, how do I work with it?

Tues, Oct 29

  • Defensive Programming

  • Workflow

  • Backwards Design

  • Getting Help Online

  • Git and GitHub

Discuss mid-semester project in class - Opioid Project

Thurs, Oct 31

  • Defensive Programming

  • Groupby / Split-Apply-Combine

Tues, Nov 5

  • Pandas: Reshaping

  • Pandas: Categoricals

Thurs, Nov 7

  • Speed and Performance in Python

Tues, Nov 12

  • Machine Learning with scikit-learn

Thurs, Nov 14

  • Defining Your Own Estimators

Tues, Nov 19

  • Parallelism

  • Distributed Computing

(Note: the reading includes a 45-minute video to watch.)

Thurs, Nov 21

LAST CLASS

  • ROUGH DRAFT OF OPIOID PROJECT DUE (extendable to the morning of Mon Nov 26, but feedback may come later)

Wed Dec 12

Final Project Report Due