Cheat Sheets

Being a good data scientist does not mean knowing the syntax for a million tools by heart. Data Science is about knowing strategies for answering questions. It is important to understand how different software libraries work and what they can offer (so you know what to google), but memorizing syntax is not that important. Honestly, after more than a decade of working in Data Science, I still google function syntax nearly every day.

So to make life easier, here is a curated list of cheatsheets I recommend for different pieces of software. They won’t teach you how these packages work, but when you can’t remember the name of the function for something you know a package can do, they are a great resource.

  • Bash

  • numpy

  • pandas

  • ggplot / plotnine

    • Note: this cheatsheet was written for ggplot in R, not plotnine specifically, so the syntax will require small tweaks, like putting column names in quotes and using underscores instead of periods in names such as theme arguments.

  • scikit-learn

  • git

  • dask