PDS II (IDS 541)

Practical Data Science II is a flipped-classroom, exercise- and project-focused course. Building on the computational thinking skills developed in Practical Data Science I, it introduces students to a range of computational inquiry methods, including network analysis, geospatial analysis, advanced plotting, and natural language processing (NLP). Throughout, the focus is on hands-on experience implementing these methods with messy real-world data, so that students are prepared to deploy these tools to answer the questions they care about. Prerequisite: Practical Data Science I.

Syllabus

A preliminary syllabus can be found here.

Class Schedule (Preliminary)

Date | Topic | Do Before Class | In-Class Exercise

Th Jan 9

  • Pandas: Reshaping

  • Pandas: Categoricals

Tu Jan 14

  • Speed and Performance in Python

Th Jan 16

  • Big Data: What is it, how do I work with it?

Tu Jan 21

  • Parallelism

  • Distributed Computing

(Note: the reading includes a 45-minute video to watch.)

Th Jan 23

  • Working with Text

Tu Jan 28

  • GIS: Vector Data

Th Jan 30

  • GIS: Vector Data

Tu Feb 4

  • GIS: Vector Data

Th Feb 6

  • GIS: Vector Data

Tu Feb 11

  • GIS: Rasters

  • Raster Data

  • Intro to Rasterio and xarray

Th Feb 13

  • GIS: Rasters

  • Plotting raster data

Tu Feb 18

  • GIS: Rasters

  • Remote Sensing and Satellite Data

  • Band Algebra

Th Feb 20

  • GIS: Rasters

  • Reprojection, resampling, and interpolation

Tu Feb 25

  • GIS: Mixing Vector and Raster

  • Zonal aggregation (summary statistics)

  • Rasterization/Geohashing

Th Feb 27

  • Machine Learning

  • Géron, Ch. 1: The Machine Learning Landscape (stop before “Batch and Online Learning,” then read the “Testing and Validating” section)

  • Géron, Ch. 2: End-to-End Machine Learning Project

Tu Mar 4

  • Machine Learning

  • Prediction versus Inference

  • Supervised ML Workflow

Th Mar 6

  • Machine Learning

  • Supervised ML

Tu Mar 11

  • Machine Learning

  • scikit-learn

Th Mar 13

  • Solving Problems with Data

Tu Mar 18

  • Solving Problems with Data

  • Problem Taxonomy

Th Mar 20

NO CLASS

Tu Mar 25

NO CLASS

Th Mar 27

  • Solving Problems with Data

Tu Apr 1

  • Network Data

  • Intro to graph-tool

Th Apr 3

  • Network Data

  • Community Detection

Tu Apr 8

  • Network Data

  • Shortest Path

LAST CLASS