PDS II (IDS 591)


Practical Data Science II is a flipped-classroom course focused on exercises and projects. Building on the computational thinking skills developed in Practical Data Science I, it introduces students to a range of computational inquiry methods, including network analysis, geospatial analysis, advanced plotting, and natural language processing (NLP). Throughout, the focus is on gaining hands-on experience implementing these methods with messy real-world data, so that students are prepared to deploy these tools to answer the questions they care about. Prerequisite: Practical Data Science I.

Syllabus

A preliminary syllabus can be found here.

Class Schedule (Preliminary)

Date

Topic

Do Before Class

In-Class Exercise

Th Jan 9

  • Speed and Performance in Python

Tu Jan 14

  • Big Data: What is it, how do I work with it?

Th Jan 16

  • Parallelism

  • Distributed Computing

(Note: the reading includes a 45-minute video to watch.)

Tu Jan 21

  • Defining Your Own Estimators

Th Jan 23

  • GIS: Intro

  • GIS: Geopandas and Vector Data

Tu Jan 28

  • GIS: Projections

Th Jan 30

  • GIS: Vector Spatial Joins and Maps

Tu Feb 4

  • GIS: Raster Data

  • Raster Data

  • Intro to Rasterio

Th Feb 6

  • GIS: Satellite Data

  • Remote Sensing and Satellite Data

Tu Feb 11

  • GIS: Travel Distances

  • Distance APIs

Th Feb 13

  • Network Data

  • What is Network Data?

Tu Feb 18

  • Network Data

  • Intro to graph-tool

Th Feb 20

  • Network Data

  • Community Detection

Tu Feb 25

  • Network Data

  • Shortest Path

Th Feb 27

  • Advanced Modelling

  • Matching

Tu Mar 4

  • Advanced Modelling

  • pyGAM

Th Mar 6

  • NLP

  • Stemming, stop words, etc.

Tu Mar 11

NO CLASS

Th Mar 13

NO CLASS

Tu Mar 18

  • NLP

  • Topic models, bag of words

Th Mar 20

  • NLP

  • Embeddings

Tu Mar 25

Th Mar 27

  • NLP

  • LLMs

Tu Apr 1

  • Project

Th Apr 3

  • Project

Tu Apr 8

LAST CLASS

  • ROUGH DRAFT OF OPIOID PROJECT DUE (extendable to the morning of Mon Nov 26, but feedback may then arrive later)

Th Apr 9

Final Project Report Due

Tu Apr 15

Wed, April 23

Wed, Apr 30