PDS I (IDS 590)

Note: this course may appear as “Data Science 4 Social Science” on DukeHub. Sorry about that: it is a new class, and some administrative changes are still pending.

Practical Data Science I is a flipped-classroom, exercise- and project-focused course. It requires no prior programming experience and begins with an introduction to Python, computational thinking, and the principles of good programming using the 7 Steps method. The focus then shifts to data analysis, with an emphasis on the types of analyses of interest to social scientists and public policy students. The course gives students experience manipulating and analyzing real (often messy, error-ridden, and poorly documented) data using the full range of Python data science tools (such as the command line, git, VS Code, numpy, pandas, matplotlib, statsmodels, and more).

Syllabus

A preliminary syllabus can be found here.

Class Schedule

Class will be Tuesdays and Thursdays from 10:05am-11:25am (EDT).

Key: Date, Rm | Topic | Do Before Class | In-Class Exercise

Tues, Aug 27

  • Class Introduction

  • Welcome to VS Code

Thurs, Aug 29

  • Setting Up An Algorithm

  • 7 Steps

  • PPF1, Working an Example of the Seven Steps Process

  • PPF1, Creating an Algorithm

Tues, Sep 3

  • Ideas Into Code

Thurs, Sep 5

  • Testing Your Code

Tues, Sep 10

  • Lists In Depth

  • PPF3, Approaches to Testing

  • PPF4, Looking More into Lists

Thurs, Sep 12

  • Buffer Day

Buffer day to allow extra time for any topics needing additional attention

Tues, Sep 17

  • Object Oriented Programming

  • DSwithNSD1, Basics of Object Oriented Programming

  • Point

  • Circle

Thurs, Sep 19

  • Object Oriented Programming

Tues, Sep 24

  • Command Line Basics

  • Advanced Command Line

  • Packages

  • Jupyter

Thurs, Sep 26

  • Numpy Basics

Tues, Oct 1

  • Numpy Basics

Additional Concepts

  • Ex 1

  • (Finish other numpy)

Thurs, Oct 3

  • Numpy Arrays

Tues, Oct 8

  • Pandas: Series

Thurs, Oct 10

  • Pandas: DataFrames

Tues, Oct 15 (Break)

FALL BREAK

Thurs, Oct 17 (Break)

  • Pandas: Indices & Missing

Tues, Oct 22

  • Pandas: Cleaning

  • Tracebacks

Thurs, Oct 24

  • Plotting

Tues, Oct 29

  • Plotting

Thurs, Oct 31

  • Statistics with statsmodels

Tues, Nov 5

  • Defensive Programming

  • Groupby / Split-Apply-Combine

Thurs, Nov 7

  • Pandas: Reshaping

  • Pandas: Categoricals

Tues, Nov 12

  • Merging

Thurs, Nov 14

  • Defensive Programming

  • Workflow

  • Backwards Design

  • Getting Help Online

  • Git and Github

  • Start git?

Tues, Nov 19

  • Git

Thurs, Nov 21

  • Git

  • Git Day 2

Wed Dec 12

LAST CLASS

  • ROUGH DRAFT OF OPIOID PROJECT DUE (extendable to the morning of Mon Nov 26, but feedback may come later)