Azure from the Command Line

In our past tutorials, we mostly manipulated our Azure resources using the Azure web interface, but once you get comfortable with Azure, you’ll likely find this a little clunky for some things.

Thankfully, Azure also has a Command Line Interface (CLI)! In addition to letting you manage all your resources, it can also help you manage things like data uploads or downloads (so you don’t have to navigate to your Container in your browser).

Installing

Azure CLI is easy to install using the directions for your operating system here.

Then all authorization is managed by just running az login – that’ll open a login in your browser which will then authorize your session for whatever you want to do!

The only catch is that if you want to log in to a different “tenant” within your account (e.g. if you have an account with free student credits AND the ability to use resources billed to your company or school), you have to specify your tenant (e.g. az login --tenant <tenant>). You can find your tenant ID by going to portal.azure.com, selecting “Azure Active Directory” (since renamed “Microsoft Entra ID”), and looking at your Overview tab.
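If you'd rather not dig through the portal, the CLI can also report which tenant your session is currently using. Here's a minimal sketch: the function names login_to_tenant and current_tenant are my own (hypothetical), and the tenant ID is whatever you pass in.

```shell
# Log in to one specific tenant. The tenant ID is a placeholder you supply.
login_to_tenant() {
    local tenant_id="$1"
    az login --tenant "$tenant_id"
}

# Print the tenant ID attached to the currently active subscription.
current_tenant() {
    az account show --query tenantId --output tsv
}
```

After logging in, running current_tenant is a quick way to confirm you ended up in the tenant you meant to.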

Then you can basically do anything you can do through the web interface from the command line! The Azure CLI quickstart guide gives a great overview of how it all works.

Managing Storage with Azure CLI

The one set of tools within the Azure CLI I will make sure to point out are tools for uploading and downloading data. There’s a tutorial here, but the basic syntax is very simple:

Upload Data:

az storage blob upload \
    --account-name <storage-account> \
    --container-name <container> \
    --name helloworld \
    --file helloworld \
    --auth-mode login

Download Data:

az storage blob download \
    --account-name <storage-account> \
    --container-name <container> \
    --name helloworld \
    --file ~/destination/path/for/file \
    --auth-mode login

(Though note that doing so requires adding a “role” to your account to authorize this behavior. This is discussed below, and the CLI will also give you directions if you don’t have the relevant role).
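If you run these commands a lot, it can help to wrap them in small shell functions. The sketch below is just one way to do it: upload_file and list_blobs are hypothetical names, and using the file's base name as the blob name is an assumption of this example, not something Azure requires.

```shell
# Upload one local file, using its base name as the blob name
# (an assumption of this sketch).
upload_file() {
    local account="$1" container="$2" file="$3"
    az storage blob upload \
        --account-name "$account" \
        --container-name "$container" \
        --name "$(basename "$file")" \
        --file "$file" \
        --auth-mode login
}

# List the names of the blobs in a container, one per line,
# as a quick check that an upload arrived.
list_blobs() {
    local account="$1" container="$2"
    az storage blob list \
        --account-name "$account" \
        --container-name "$container" \
        --auth-mode login \
        --query "[].name" \
        --output tsv
}
```

For example, upload_file <storage-account> <container> helloworld followed by list_blobs <storage-account> <container> should show helloworld in the listing, assuming your account has the role mentioned above.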

Moving Lots of Data?

The Azure CLI is a good tool, but if you’re moving data around a lot, there’s an even better tool that may be worth your investment: AzCopy. It’s a little less friendly to set up than the Azure CLI, but it’s much more powerful.

For example, we often have folders of data we want to mirror on Azure and use for computations. Then, when we’re done running some calculations, we might want to bring the updated version of the folder back to our computer. Rather than moving data file by file with Azure CLI, or even just copying the whole folder again, we can use azcopy to sync the two folders. You just point at the folder you want to sync, and it will synchronize the contents across platforms, transferring only the data that’s actually different between the two folders (like rsync, if you know what that is!).

So here’s a setup guide for AzCopy, sorry it’s kinda annoying.

Installing azcopy

To install azcopy, download the relevant version from here and unzip the download. Then follow these directions:

Mac

  1. Open a terminal session and type echo $PATH. Confirm that /usr/local/bin is one of the folders listed.

  2. Type open /usr/local/bin.

  3. Drag azcopy into that folder.

  4. Apple doesn’t initially trust this program, so before you close the folder, right-click on azcopy and select “Open”. You’ll get a warning; choose “Open Anyway”.

  5. Now open a new terminal session and type azcopy -h to make sure the install worked.

Linux

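Follow the Mac steps above, but skip step 4. (The open command in step 2 is macOS-specific, so on Linux use your file manager or copy the file from the terminal instead.)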
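On Mac or Linux, steps 1 to 3 can also be done entirely from the terminal. Here's a rough sketch, assuming you've already unzipped the download: path_contains and install_azcopy are hypothetical helper names, and the default target of /usr/local/bin just mirrors the steps above.

```shell
# Succeeds if the given folder is one of the entries in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Copy the unzipped azcopy binary into a folder on PATH and make it executable.
# The target defaults to /usr/local/bin, as in the steps above.
install_azcopy() {
    local binary="$1" target_dir="${2:-/usr/local/bin}"
    if ! path_contains "$target_dir"; then
        echo "$target_dir is not on your PATH" >&2
        return 1
    fi
    # sudo is usually needed to write to /usr/local/bin.
    sudo cp "$binary" "$target_dir/azcopy" &&
        sudo chmod +x "$target_dir/azcopy"
}
```

You would call this as install_azcopy /path/to/unzipped/azcopy, then open a new terminal and run azcopy -h as in step 5. On a Mac you will likely still need step 4 to get past the trust warning.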

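Windows

Similarly, you want to copy the downloaded azcopy.exe file somewhere on your PATH variable. Run echo %PATH% in Command Prompt (or $env:Path in PowerShell; echo $PATH only works in Unix-style shells like Git Bash), then put the azcopy.exe file in any folder in that list of folders.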

Authorizing azcopy

The next step is a little annoying, but here we go: you have to visit the webpage for the Storage Account you want to use and add a “role” to your Azure account.

To do so, go to Azure Portal, click on Storage Accounts, then select the account you want to work with. Once you’re inside:

  1. Click on “Access Control (IAM)”

  2. Click the “+ Add” button in the top left and select “Add Role Assignment”

  3. Under “Role”, select “Storage Blob Data Contributor”

  4. Under “Assign role to” select “User, group, or service principal”

  5. Under “select” look up your Azure account

  6. Save.

For me, this looks like:

[Screenshot: adding the “Storage Blob Data Contributor” role assignment]
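If you'd rather skip the clicking, the same role assignment can usually be made with the Azure CLI. This is a sketch under a few assumptions: grant_blob_contributor is a hypothetical helper name, you know the resource group your Storage Account lives in, and your account is allowed to assign roles (typically Owner or User Access Administrator on the Storage Account).

```shell
# Give a user the "Storage Blob Data Contributor" role on one Storage Account.
# Arguments: the user's sign-in name, the storage account, its resource group.
grant_blob_contributor() {
    local assignee="$1" account="$2" resource_group="$3"
    local scope
    # Look up the full resource ID of the Storage Account to use as the scope.
    scope="$(az storage account show \
        --name "$account" \
        --resource-group "$resource_group" \
        --query id --output tsv)" || return 1
    az role assignment create \
        --assignee "$assignee" \
        --role "Storage Blob Data Contributor" \
        --scope "$scope"
}
```

Either way, a new role assignment can take a few minutes to take effect, so don't panic if azcopy still complains about permissions right after you add it.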

Using azcopy

We’ll demonstrate azcopy by uploading the climate data from the exercise we did in the Big Data section, where we loaded global temperature data and measured global warming at a number of locations. (You can get the data we’re using for this exercise here.) Note that I’m decompressing the ghcnd_daily_30gb.tar.gz file before upload.

cd /users/nick/dropbox/MIDS_Data_Prep/ClimateData/processed_for_students/global_climate_data
ls
ghcnd-countries.txt     ghcnd-version.txt       ghcnd_daily_30gb.dat
ghcnd-states.txt        ghcnd_daily.csv         ghcnd_daily_30gb.tar.gz
ghcnd-stations.txt      ghcnd_daily.tar.gz      readme.txt

Now we need to authenticate. To do so, you’ll need your Tenant ID. To get this, go to the Azure Portal, search “Tenant” in the search bar, and select “Tenant Properties”. There you’ll find a Tenant ID, which you insert below:

# This launches a web browser login
azcopy login --tenant-id "XXXXXX-XXXX-XXXX-XXXX-XXXXXXXX"

This will result in a message like:

To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code EA64FHRE5 to authenticate.

So do what it says, and when you come back the message should have changed to:

INFO: Login succeeded.

Now we’ll create a new container into which we can put our data:

azcopy make 'https://nce8sa.blob.core.windows.net/globaltemps'

This creates a blob container in my nce8sa Storage Account called globaltemps.

Now we’ll upload! Note that azcopy requires quotes around both the upload files and the destination address, even if you don’t have any spaces. So this will upload all files within the current directory:

azcopy copy "*" "https://nce8sa.blob.core.windows.net/globaltemps/"

As you can see, the syntax is pretty simple. The URL structure for Azure Blob storage is always:

https://[account].blob.core.windows.net/[container]/[path/to/blob]

And the syntax for azcopy is just

azcopy copy [source] [destination] [flags]

Note that you can also add a --recursive flag and point to a directory instead of using wildcards.
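To make the URL pattern and the --recursive option concrete, here's a small sketch. The helper names blob_url and upload_dir are hypothetical, not part of azcopy. One thing to be aware of: pointing azcopy at a directory with --recursive uploads the directory itself (as a folder inside the container), whereas the "*" wildcard form uploads only its contents.

```shell
# Build a Blob storage URL from its parts. The blob path is optional;
# leaving it off gives the URL of the container itself.
blob_url() {
    local account="$1" container="$2" blob_path="${3:-}"
    printf 'https://%s.blob.core.windows.net/%s/%s\n' \
        "$account" "$container" "$blob_path"
}

# Upload a whole local directory (and everything under it) to a container.
upload_dir() {
    local local_dir="$1" account="$2" container="$3"
    azcopy copy "$local_dir" "$(blob_url "$account" "$container")" --recursive
}
```

For example, blob_url nce8sa globaltemps readme.txt prints https://nce8sa.blob.core.windows.net/globaltemps/readme.txt.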

Downloading Files with azcopy

To download with azcopy, just flip the source and destination!

azcopy copy "https://nce8sa.blob.core.windows.net/globaltemps/readme.txt" "/users/nick/desktop/readme.txt"
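The folder-mirroring workflow I described earlier uses azcopy sync instead of azcopy copy, with the same [source] [destination] ordering. Here's a minimal sketch: sync_up and sync_down are hypothetical wrapper names, and the URL is whatever container (or folder inside a container) you want to mirror.

```shell
# Push local changes up: only files that differ from the remote copy are sent.
sync_up() {
    local local_dir="$1" remote_url="$2"
    azcopy sync "$local_dir" "$remote_url" --recursive
}

# Pull remote changes down: just flip the source and destination.
sync_down() {
    local remote_url="$1" local_dir="$2"
    azcopy sync "$remote_url" "$local_dir" --recursive
}
```

By default sync doesn't delete anything at the destination, so files you removed from the source will stick around on the other side unless you add --delete-destination=true. Be careful with that flag, since it really does delete things.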

Here’s the full azcopy documentation.