Solid Serialization of Béyoncé and her Spotify Friends

This assignment is part of: :doc`/syllabus/assignments/homework/solid-serialization-skills`

Using a JSON data response from Spotify’s API, sort/filter/analyze the artists most related to Béyoncé, according to Spotify’s algorithm.

The data file is mirrored here:

You can read Spotify’s documentation of its related-artists endpoint here, including a description of each field:

While messing around with the Spotify API may not be the most high-minded civic use of our time, it’s certainly a really fun API with broad appeal to just about anyone. The fact that it’s one of the best documented and easily accessible commercial APIs makes it a no-brainer to play with.


You are expected to deliver a script named:

This script has 4 prompts, which means at the very least, your script will contain 4 separate function definitions, from foo_1 to foo_4.

When I run your script from the command line, I expect to see something like this:

$ python
Done running assertions!

And I expect your script to have this code at the bottom:

def foo_assertions():
    assert type(foo_1()) is list
    assert len(foo_1())  is 20
    assert foo_1()[0] == "Destiny's Child"

    assert type(foo_2()) is list
    assert foo_2()[0]  == ['name', 'followers']
    assert foo_2()[-1][0]  == "Kelis"

    assert type(foo_3()) is list
    assert foo_3()[0]['name']  == 'Rihanna'
    assert foo_3()[-1]['popularity'] == 62

    assert type(foo_4()) is list
    assert foo_4()[0][1] == 20

if __name__ == '__main__':
    print("Done running assertions!")

That script should have a if __name__ == '__main__' conditional block, in which a function named foo_assertions() is executed.


The mirror of the related-artists.json result for Béyoncé is here:

Download (and cache) that file, then parse/deserialize the text data, then write the foo_ functions to satisfy the following prompts:

2. List artists’ names with number of followers, rank by followers

Expected output:

[['name', 'followers'],
 ['Rihanna', 8792360],
 ['Chris Brown', 3832426],
 ['Justin Timberlake', 2853318],
 ['Jennifer Lopez', 1493872],
 ['Alicia Keys', 1464533],
 ['Christina Aguilera', 1168887],
 ['Mariah Carey', 1106762],
 ['Whitney Houston', 903522],
 ['Ciara', 803715],
 ["Destiny's Child", 634049],
 ['Fergie', 539632],
 ['Mary J. Blige', 536947],
 ['Kelly Rowland', 479842],
 ['The Pussycat Dolls', 414452],
 ['TLC', 387450],
 ['Ashanti', 245022],
 ['Keri Hilson', 221302],
 ['Jennifer Hudson', 195102],
 ['Cassie', 140344],
 ['Kelis', 115659]]


For a more thorough exploration of the Spotify API (albeit from the Command Line), check out my guide from a couple years back:

The actual URL for the live endpoint of related-artists for Béyoncé can be found here:

Answers (Partial)

I like to start off by writing a function that does the downloading and deserializing:

import json
from os import makedirs
from os.path import exists, join, basename
import requests

SRC_URL  = ''
DATA_DIR = 'data-files'
DATA_FNAME = join(DATA_DIR, basename(SRC_URL))

def bootstrap_data():
    """ returns serialized object"""
    if exists(DATA_FNAME):
        rawjson = open(DATA_FNAME).read()
        r = requests.get(SRC_URL)
        makedirs(DATA_DIR, exist_ok=True)
        with open(DATA_FNAME, 'w') as f:
        rawjson = r.text
    return json.loads(rawjson)

The above script is a result of my desire for some organization. I only want to download the source data once. And I want that data file to be saved in a subdirectory:


However, you may not care about that. In any case, if you’re going to use my code above, make sure you know what each line does, e.g. basename(SRC_URL).

After you’ve created a bootstrap_data() function that does all the data grabbing/parsing work, each subsequent function that needs the data can just call bootstrap_data():

def foo_1():
    Return a list of artist names, from the list of artists most related to Béyoncé, according to the Spotify API, sorted by popularity

    data = bootstrap_data()
    names = []
    for a in data['artists']:

    return names

Or, if you prefer brevity:

def foo_1():
  return [a['name'] for a in bootstrap_data()['artists']]