Welcome non-Duke MIDS Students#

In designing this course, I’ve endeavored to make all the materials and resources accessible to students anywhere. However, the materials are tailored to the needs of incoming Duke Masters in Data Science (MIDS) students, so if you are not an incoming Duke MIDS student, there are a couple of things you should know about how the course has been designed.

In short:

  • This course assumes a familiarity with (regular) Python. Guidance for those new to Python appears below.

  • MIDS students have also seen a little bit of the two Python data science packages we’ll use a lot in this course: numpy and pandas. As a result, this course is not optimally designed for those who have no exposure to these tools. A person who knows Python but has never seen numpy or pandas can still learn all they need from the materials in this course, but for reasons discussed below, it may be harder than is really necessary. Guidance for students in this position appears below.

  • No experience or past exposure is assumed for topic areas other than a very basic sense of numpy and pandas.

If you’ve never used Python before#

Great news! You can now work through the exact same Python introductory materials that MIDS students do when they arrive on campus for a three-week in-person bootcamp. These materials assume zero past experience with Python or programming!

Moreover, if you’re a Duke student, then provided you create a Coursera account using your @duke.edu email address, this material will all be free.

To be prepared for the course, you’ll need to:

If you’ve done some Python#

I still strongly recommend doing these materials. A lot of people “know” how to use Python, but really have only a limited understanding of the programming principles and concepts that every good data scientist should know.

Do the exercises#

This course is taught (for in-person students) as a flipped classroom: students are required to read instructional materials at home before class, then spend class time doing exercises in pairs.

For non-Duke students looking to use this website to develop their data science skills, there are two consequences of this organization:

  • Duke students are using the exact same instructional materials you are. You aren’t missing lecture materials.

  • Duke students are required to do the exercises associated with each topic, and they are integral to the course in two ways. First, there are lessons that come up in the exercises that aren’t covered (or aren’t covered as well as I’d like) in the instructional materials. If you don’t do the exercises, you will miss important take-aways. Second, programming is a skill learned by doing. Requiring students to do the exercises in class is a way of making sure that they get appropriate attention. There’s no way to make them “mandatory” online, but I hope you will take to heart my strong encouragement that you complete the exercises that follow each lesson if you want to get the most out of this site.