PDS (MIDS) (IDS 720)

A one-semester version of Practical Data Science tailored specifically to Masters of Interdisciplinary Data Science (MIDS) students. Because all MIDS students complete a mandatory, 4-week, in-person, intensive summer Python programming bootcamp in August before classes start, this class assumes a strong foundational understanding of the Python standard library. And because it is a one-semester class and all MIDS students take a full NLP class, it skips some topics covered in Practical Data Science II, such as NLP and GIS.

Syllabus

Course syllabus can be found here.

Class Schedule

Class will be Tuesdays and Thursdays from 1:25pm-2:40pm (EDT).

Date, Rm | Topic | Do Before Class | In-Class Exercise

Tues, Aug 26

  • Class Introduction

  • Welcome to VS Code

Thurs, Aug 28

  • Command Line Basics

Tues, Sep 2

  • Advanced Command Line

  • Packages

  • Jupyter

Thurs, Sep 4

  • Python Debugger

  • R / Python Differences

  • Packages

Tues, Sep 9

  • Git

Thurs, Sep 11

  • Numpy Basics

Tues, Sep 16

  • Numpy Arrays

More Numpy Concepts:

Matrices:

ND Arrays:

  • Ex 1

  • (Finish other numpy)

Thurs, Sep 18

  • Pandas: Series

Tues, Sep 23

  • Pandas: DataFrames

Thurs, Sep 25

  • Pandas: Indices & Missing

Tues, Sep 30

  • Pandas: Cleaning

  • Tracebacks

Thurs, Oct 2

  • Plotting

Tues, Oct 7

  • Plotting

Thurs, Oct 9

  • Merging

Tues, Oct 14 (Break)

FALL BREAK

Thurs, Oct 16 (Break)

FALL BREAK (Not technically, but take the week)

Tues, Oct 21

  • Statistics with statsmodels

Thurs, Oct 23

  • Big Data: What is it, how do I work with it?

Tues, Oct 28

  • Defensive Programming

  • Workflow

  • Backwards Design

  • Getting Help Online

  • Git and Github

Thurs, Oct 30

  • Defensive Programming

  • Groupby / Split-Apply-Combine

Tues, Nov 4

  • Pandas: Reshaping

  • Pandas: Categoricals

Thurs, Nov 6

  • Speed and Performance in Python

Tues, Nov 11

  • Machine Learning with scikit-learn

Thurs, Nov 13

  • Defining Your Own Estimators

Tues, Nov 18

  • Parallelism

  • Distributed Computing

(Note: the reading includes a 45-minute video to watch)

Thurs, Nov 20

LAST CLASS

  • ROUGH DRAFT OF OPIOID PROJECT DUE (extendable to the morning of Mon Nov 26, but feedback may then arrive later)

Wed Dec 12

Final Project Report Due