Guidelines for Using the Gradescope Autograder

To give you rapid feedback and opportunities to learn from your mistakes, most assignments in this class will be at least partially autograded upon submission to Gradescope.

The autograder expects you to upload a Jupyter Notebook containing Python code. It will execute the notebook and then look for a dictionary called results, in which it expects you to have stored your answers. Each answer must be stored under the string specified in the exercise as its key (e.g., you may be asked to store the youngest age in a dataset with the key "ex5_age_young").
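As a minimal sketch of what the autograder looks for, the cell below builds a results dictionary and stores one answer under the example key from above. The ages list is made-up illustrative data, not part of any real exercise.

```python
# Hypothetical data standing in for a column of ages from an exercise dataset.
ages = [34, 19, 52, 27]

# Create the dictionary once, near the top of the notebook.
results = {}

# Store each answer under the exact key string given in the exercise prompt.
results["ex5_age_young"] = min(ages)
```

The key must match the prompt character for character, since the autograder looks answers up by key.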

However, be aware that TAs will also review your notebooks, both for answers to other questions and to ensure your notebook is well written and formatted. So even though the autograder won’t care whether you organize your notebook well and explain what you’re doing clearly, we will! Please follow best practices for writing good notebooks.

When storing numeric answers, please put the exact computed value into your results dictionary. We will use np.isclose() to evaluate numeric solutions to be sure that very small floating point errors won’t cause problems, but the closer you are to an exact solution the less likely you are to have issues.
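To illustrate why the exact computed value matters, here is a small sketch that compares an exact and a hand-rounded answer against a reference value with np.isclose(). It assumes NumPy's default tolerances; the tolerances the autograder actually uses are not stated here and may differ.

```python
import numpy as np

# Hypothetical reference answer the grader might hold.
reference = 1 / 3

# Storing the value exactly as computed.
exact_value = 1 / 3

# Rounding by hand before storing discards precision.
rounded_value = round(exact_value, 2)

# With NumPy's default tolerances, only the exact value passes.
exact_ok = bool(np.isclose(exact_value, reference))
rounded_ok = bool(np.isclose(rounded_value, reference))
```

Here exact_ok is True and rounded_ok is False, so rounding to two decimals would be marked wrong under these assumed tolerances.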

Specific Guidelines

The only requirements for your submitted notebook are:

Unless otherwise noted in the exercise, you may only import the pandas, numpy, matplotlib, seaborn, statsmodels, patsy, and sklearn libraries.
Please make sure you are using at least Python 3.11 and at least pandas 2.1.
Import data from a URL, since the autograder runs in the cloud and won’t see your file system.
Name your notebook as instructed in the exercise before submission. These will generally follow a format like exercise_[name of topic].ipynb (e.g., exercise_sklearn.ipynb).
Store your solutions in a dictionary called results with answers assigned to the keys provided in the exercise prompts.
Your notebook must run from start to completion without any errors. For questions that (deliberately) invite you to write code that causes errors, comment out that code before submitting so it won’t interrupt notebook execution.
Your notebook must not include any IPython magics. These are commands you can run in a Jupyter notebook that start with % or %%, but which aren’t valid regular Python commands.
- Jupyter Notebooks now allow you to write pip install [package] directly into a code cell. However, this is NOT valid Python code, and will cause problems with the autograder.
- Similarly, do not start cells with ! (IPython syntax for running terminal commands from inside a notebook). That’s also not valid Python code.
Do not overwrite any standard functions. This is good practice in general: in Python, you can reassign any function at any time, but doing so will cause problems for other people using your code who expect certain standard functions, like list(), set(), mean(), etc., to do what they do in the standard library. So if you assign something to a variable called list (e.g. you run list = pd.Series(["Nick", "Trillian"])), then anyone who later tries to create a list using the list function (e.g. my_list = list(x)) will get an error.
Your notebook should be formatted with black prior to submission (directions).
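The point about overwriting standard functions can be seen in a short sketch. The shadowing here is deliberately confined to a hypothetical helper function, demonstrate_shadowing, so that the example leaves the real list() intact; in a notebook, the same assignment at cell level would persist for every later cell.

```python
def demonstrate_shadowing():
    """Shadow the built-in list inside this function and report the error."""
    # Rebinding the name: `list` now refers to a list object, not the built-in.
    list = ["Nick", "Trillian"]
    try:
        # This no longer calls the built-in constructor.
        list("abc")
    except TypeError as err:
        return str(err)
    return None


# The call fails inside the function because `list` was reassigned there.
message = demonstrate_shadowing()

# Outside the function the built-in is untouched and works as expected.
names = list(("Nick", "Trillian"))
```

Choosing a descriptive name such as names or name_series avoids the problem entirely.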

Submission Limits

Be aware that, unlike with many autograders, the number of submissions allowed for each assignment in this class is limited. Usually, this limit will be three submissions. Your last submission (if you submit three or fewer times) or your third submission (if you submit more than three times) will determine your grade. Submissions that error out will not count against this total.
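One way to read this rule is sketched below. The function graded_submission and the (label, errored) tuple format are hypothetical, introduced only for illustration; this is not how Gradescope implements the limit, and the default of three is only the usual value mentioned above.

```python
def graded_submission(submissions, limit=3):
    """Return the label of the submission that determines the grade.

    `submissions` is a chronological list of (label, errored) tuples.
    Returns None if no submission ran without error.
    """
    # Submissions that errored out do not count against the limit.
    counted = [label for label, errored in submissions if not errored]
    if not counted:
        return None
    # The last counted submission, capped at the limit-th one.
    return counted[min(len(counted), limit) - 1]


# Hypothetical history: the second attempt errored out, so it is skipped.
history = [
    ("first", False),
    ("second", True),
    ("third", False),
    ("fourth", False),
    ("fifth", False),
]
graded = graded_submission(history)
```

Under this reading, graded is "fourth": it is the third submission that ran without error, and the later "fifth" is ignored.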

Why?

In software development, we can run our test suite as many times as we want. Indeed, we are encouraged to do so with Continuous Integration tools. But in Data Science, we generally never know if our answer is actually right. Instead, we have to learn to think carefully as we write our code and make sure that our intermediate results make sense because the whole point of generating a new result is that it’s something we didn’t know before!

In this class, however, I would also like to provide the opportunity to learn from your mistakes and iterate, and this “submission limit” model seems like a reasonable compromise.