Welcome non-Duke MIDS Students#
In designing this course, I’ve endeavored to make all the materials and resources accessible to students anywhere. However, the materials in this course are tailored to meet the needs of incoming Duke Masters in Data Science (MIDS) students, and so if you are not an incoming Duke MIDS student, there are a couple things you should know about how the course has been designed.
In short:
This course assumes a familiarity with (regular) Python. Guidance for those new to Python below.
MIDS students have also seen a little bit of the two Python data science packages we’ll use a lot in this course: numpy and pandas. As a result, this course is not optimally designed for those who have no exposure to these tools. A person who knows Python but has never seen numpy and pandas can still learn all they need from the materials in this course, but for reasons discussed below, it may be harder than is really necessary. Guidance for students in this position below.
No experience or past exposure is assumed for topic areas other than a very basic sense of numpy and pandas.
No Python Experience?#
Great news! There are now two paths forward.
Take IDS 590!#
The first, starting in Fall 2025, is to enroll in IDS 590 with Adriane Fresh and Nick Eubank (T/Th 10:05-11:20, “Social Science 4 Data Science”). This is a version of the class we’re creating for non-MIDS students who will not have completed the MIDS summer bootcamp.
Don’t be confused by the 590 numbering! The level will be the same in IDS 590 as in IDS 720. The two classes differ primarily in whether the course itself includes the material MIDS students do as a summer bootcamp.
The fact that IDS 720 is a 700-level class and IDS 590 is a 500-level class is not meant to imply one is more serious or rigorous than the other. IDS 720 was created with MIDS in mind, so we gave it a 700-level course number because that makes it “grad only.” But because we want to allow advanced undergraduates to take IDS 590, we gave it a 500-level course number, which is open to both graduate students and advanced undergraduates.
Complete the MIDS Summer Bootcamp Material On Your Own On Coursera#
If there is space in IDS 720, then the second option is to work through, on Coursera, the introductory Python materials that MIDS students cover in a three-week in-person bootcamp when they arrive on campus. These materials assume zero past experience with Python or programming.
Moreover, if you’re a Duke student, then provided you create a Coursera account using your @duke.edu email address, this material will all be free.
To be prepared for the course, you’ll need to complete the following:
Python Programming Fundamentals:
Do entire course.
Data Science with NumPy, Sets and Dictionaries:
Week/Module 1: “Sets and Dictionaries: Storing and Working with Data”
-
Week/Module 1: “Intro to Pandas For Data Science + Strings and I/O”
Designing Larger Python Programs for Data Science:
Do entire course
If you’ve done some Python#
I still strongly recommend working through these materials. A lot of people “know” how to use Python, but really have a limited understanding of the programming principles and concepts that every good data scientist should know.
Do the exercises#
This course is taught (for in-person students) as a flipped classroom: students are required to read instructional materials at home before class, then spend class time doing exercises in pairs.
For non-Duke students looking to use this website to develop their data science skills, there are two consequences of this organization:
Duke students are using the exact same instructional materials you are. You aren’t missing lecture materials.
Duke students are required to do the exercises associated with each topic, and the exercises are integral to the course in two ways. First, there are lessons that come up in the exercises that aren’t covered (or aren’t covered as well as I’d like) in instructional materials. If you don’t do the exercises, you will miss important take-aways. And second, programming is a skill learned by doing. Requiring students to do the exercises in class is a way of making sure that they get appropriate attention. There’s no way to make them “mandatory” online, but I hope you will take to heart my strong encouragement that you complete the exercises that follow each lesson if you want to get the most out of this site.