PDS I (IDS 590)


Note: this course may appear as “Data Science 4 Social Science” on DukeHub. Sorry! It is a new class, and some administrative changes are still pending.

Practical Data Science I is a flipped-classroom course focused on exercises and projects. It requires no prior programming experience and begins with an introduction to Python, computational thinking, and the principles of good programming using the 7 Steps method. The focus then shifts to data analysis, with an emphasis on the types of analyses of interest to social scientists and public policy students. The course gives students experience manipulating and analyzing real (often messy, error-ridden, and poorly documented) data using the full range of Python data science tools (like the command line, git, VS Code, numpy, pandas, matplotlib, statsmodels, and more).

Office Hours

Adriane: Sign up with this link

Nick (Zoom or in Gross 231): Thursday 8:45 am - 9:45 am. Please email me in advance if you plan to attend. I’ll usually be there regardless, but knowing people are coming is useful if conflicts arise.

Diego (Zoom or Gross Hall 2nd Floor, 230K): Wednesday 2:00 - 3:00 pm

Meron (Zoom or Gross Hall 2nd Floor, 230N): Tuesday 12:15 - 1:15 pm

Syllabus

A preliminary syllabus can be found here.

Class Schedule

Class meets Tuesdays and Thursdays from 10:05 am to 11:25 am (EDT).

Key:

Date, Rm

Topic

Do Before Class

In-Class Exercise

Tues, Aug 26

  • Class Introduction

  • Welcome to VS Code

Thurs, Aug 28

  • Setting Up An Algorithm

  • 7 Steps

  • PPF1, Seven Steps for Algorithm Design

  • PPF1, Working an Example of the Seven Steps Process

  • PPF1, SKIP Creating an Algorithm

Tues, Sep 2

  • Ideas Into Code

Thurs, Sep 4

  • Testing Your Code

Tues, Sep 9

  • Lists In Depth

Thurs, Sep 11

  • Buffer Day

  • PPF3, Approaches to Testing. Be sure to do the Coursera exercises in this section!

Tues, Sep 16

  • List Practice

  • PPF4, Diving Deeper with Lists (both Looking More into Lists and Heart Rate: An Example)

Thurs, Sep 18

  • More Lists

Tues, Sep 23

  • Object Oriented Programming

Thurs, Sep 25

  • Object Oriented Programming

  • More OOP

Tues, Sep 30

  • Object Oriented Programming

  • DSwithNSD1, Sets and Big-O Notation (and Dicts)

Thurs, Oct 2

  • Command Line Basics

  • Advanced Command Line

  • Packages

  • Jupyter

Tues, Oct 7

  • Numpy Basics

Thurs, Oct 9

  • Numpy Basics

Additional Concepts

Tues, Oct 14 (Break)

  • Numpy Arrays

Thurs, Oct 16 (Break)

FALL BREAK

Tues, Oct 21

  • Pandas: Series

Thurs, Oct 23

  • Pandas: DataFrames

Tues, Oct 28

  • Pandas: Indices & Missing

Thurs, Oct 30

  • Pandas: Cleaning

  • Regex

Tues, Nov 4

  • Defensive Programming

  • Groupby / Split-Apply-Combine

Thurs, Nov 6

  • Merging

Tues, Nov 11

  • Statistics with statsmodels

Thurs, Nov 13

  • Machine Learning with scikit-learn

Tues, Nov 18

  • Plotting

Thurs, Nov 20

  • More plotting

Wed Dec 12

Final project due date