Automating Spotify Playlist Music Download [ Spotify Free Version ]
Last Updated : `02 April 2023` v1.1.3
Youtube Video : Here
Okay, hello there everyone. I’m back with a neat idea I had while trying to figure out how to download my favourite playlists from Spotify. Since I don’t have a credit card and Spotify isn’t available in my country, I am using the Spotify free version. With this version, sadly, you cannot download your favourite songs. :(
I did some random googling and came across an amazing website that lets you download Spotify songs.
See here for the site : Spotify Downloader
If you want to skip the whole article and dig right into the code, here’s the link to the code :
You can run the code in a Jupyter notebook or even in Google Colab :
GitHub : https://github.com/surenjanath/Spotify-Playlist-Download
Youtube Video on how to use : https://youtu.be/1wGU652jF9g
With this site, you can basically download any playlist, but there’s a catch!
You can only do one song at a time, and I don’t like repetitive tasks. Whenever there are repetitive tasks involved, that opens up ideas for automation! Never do a repetitive task by hand, not even once. There’s always a way to automate it, and if there isn’t, make one. So, I will walk you through the investigation of how to do this with this website.
Getting Spotify Playlist
Music has always been a passion of mine, and I love discovering new artists and songs to add to my collection. Over the years, I’ve curated a playlist on Spotify to keep track of all my favourite tracks. This playlist is a mix of genres, from indie rock to electronic to hip-hop, and it’s constantly evolving. For now, it’s at 602 songs, and there was no way I could download all of these songs… until now. We will download them using the website mentioned above, automating it to download all 602 songs.
Here’s my playlist link : https://open.spotify.com/playlist/78vvv1LWPLKwKdgIa60VbR?si=dfeb05550ced4f53
We will open up the inspect tool on the website, paste the playlist link, and click on download to see which request methods and data it sends to the backend, under the network tab. As you can see, requests are sent to the server and data is returned. This is amazing.
Click on each list item, and you will see the data that we are interested in. Most likely it will be the second fetch.
The data should look like this :
You can always further expand each list entry in tracklist.
Now go to Headers and let’s look at the requests header and method used.
The Request URL is the link that we will send a request to using requests in Python. Here’s a tip: to verify that the API needs no authentication, you can simply open the request URL in an incognito tab and see whether any data comes back, which in this case it does. You can test any website this way. Some websites use a bearer token, but that will be discussed in another article.
Take a close look at the Request URL and you’ll see something that resembles the link that we pasted into the input section of the website. The Spotify playlist URL has an ID, and the Request URL takes that ID and sends it to the backend. This is how it is set up :
Spotify Link : https://open.spotify.com/playlist/78vvv1LWPLKwKdgIa60VbR?si=bc21dc3d990a4b45
Request URL : https://api.spotifydown.com/trackList/playlist/78vvv1LWPLKwKdgIa60VbR
We could use regex to get the ID out of the Spotify playlist link, but a simple split and slice can do the trick. Like this :
Code :
SPOTIFY_PLAYLIST_LINK = 'https://open.spotify.com/playlist/78vvv1LWPLKwKdgIa60VbR?si=bc21dc3d990a4b45'
ID = SPOTIFY_PLAYLIST_LINK.split('/')[-1].split('?si')[0]
print('[*] SPOTIFY PLAYLIST ID : ',ID)
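The split above works for links shaped exactly like this one. As a slightly more defensive alternative, here is a minimal sketch that uses urllib.parse from the standard library. The helper name playlist_id_from_link is my own and not part of the original code, and it assumes the ID is the path segment right after "playlist".

```python
from urllib.parse import urlparse

def playlist_id_from_link(link):
    """Return the playlist ID from a Spotify playlist URL, or None if absent."""
    # urlparse separates the path from the query string (?si=...)
    parts = [p for p in urlparse(link).path.split('/') if p]
    # Assumption: the ID is the segment that follows 'playlist'
    if 'playlist' in parts:
        idx = parts.index('playlist')
        if idx + 1 < len(parts):
            return parts[idx + 1]
    return None

SPOTIFY_PLAYLIST_LINK = 'https://open.spotify.com/playlist/78vvv1LWPLKwKdgIa60VbR?si=bc21dc3d990a4b45'
print('[*] SPOTIFY PLAYLIST ID : ', playlist_id_from_link(SPOTIFY_PLAYLIST_LINK))
```

This version also copes with links that have no ?si= part or that use a different query parameter, and it returns None for links that are not playlists.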
Now let's set up our python program for this project. I’m using Jupyter notebook in VS Code.
Coding Project
Processing Download Button on Playlist
Some Libraries we would use :
import requests # Used to send GET and POST requests
import os # Used to create folder and path for mp3
If you get an error for the requests library just pip install requests
This piece of code is for file management. It gets the current working directory and creates a folder called MUSIC inside it if that folder does not already exist :
# variables
CWD = os.getcwd()
LOCATION = os.path.join(CWD,'MUSIC')
# Create the MUSIC folder only when it is missing
if not os.path.isdir(LOCATION):
    os.mkdir(LOCATION)
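If you prefer a single call over the check-then-create pattern, os.makedirs with exist_ok=True does the same job. Here is a small sketch; the helper name ensure_folder is hypothetical, and the demo runs in a temporary directory so it does not touch your working folder.

```python
import os
import tempfile

def ensure_folder(base_dir, name='MUSIC'):
    """Create base_dir/name if needed and return its path."""
    location = os.path.join(base_dir, name)
    # exist_ok=True makes this a no-op when the folder already exists
    os.makedirs(location, exist_ok=True)
    return location

# Demo in a throwaway directory; in this project you would pass os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    first = ensure_folder(tmp)
    second = ensure_folder(tmp)  # calling it twice does not raise
    print(os.path.isdir(first), first == second)
```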
Now that we have discussed getting the ID from the Spotify URL, we would use it to create a link for our API calls.
This is our link to our API https://api.spotifydown.com/trackList/playlist/78vvv1LWPLKwKdgIa60VbR
To make this link dynamic, we would use f strings like this : Playlist_Link = f'https://api.spotifydown.com/trackList/playlist/{ID}'
where ID is our Spotify Playlist ID.
For this project it is good practice to use a session, because we need to preserve the cookies and other data to make the other API calls, which we will discuss later on.
Let’s instantiate our session call and call our API :
session = requests.Session()
headers = {
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'
}
response = session.get(url = Playlist_Link, headers=headers)
As you can see, we have instantiated our session using requests.Session() and then called session.get, passing it the headers variable and the URL Playlist_Link.
Please note that it is also good practice to send browser-like headers, because requests sends its own default headers, which some websites detect as a bot.
Ha! Caught you there. If you’ve tried passing the header, you’ll find the API takes a long time and then fails with a max retries error. This is because the API does not like our user agent, so just delete that part :
response = session.get(url = Playlist_Link)
We check the status code to make sure that we’ve got a 200 and then print response.json(). We use .json() because the response is JSON data, according to the response header in our inspect tool : content-type: application/json; charset=utf-8
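The two checks just described (status code and content type) can be folded into one small helper. This is only a sketch: json_or_none is a name I made up, and FakeResponse is a stand-in so the example runs offline. A real requests response exposes the same status_code, headers and json() members.

```python
def json_or_none(response):
    """Return parsed JSON from a requests-style response, else None."""
    if response.status_code != 200:
        return None
    # Only parse when the server says the body is JSON
    content_type = response.headers.get('content-type', '')
    if 'application/json' not in content_type.lower():
        return None
    return response.json()

class FakeResponse:
    """Stand-in for requests.Response so the sketch runs without a network."""
    def __init__(self, status_code, content_type, payload):
        self.status_code = status_code
        self.headers = {'content-type': content_type}
        self._payload = payload

    def json(self):
        return self._payload

ok = FakeResponse(200, 'application/json; charset=utf-8', {'trackList': []})
print(json_or_none(ok))
```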
Our data will look like
A neat way of printing this would be like :
if response.status_code == 200 :
    payload = response.json()  # parse the body once and reuse it
    Tdata = payload['trackList']
    page = payload['nextOffset']
    for count,song in enumerate(Tdata):
        print('*'*25, str(count+1) + '/' + str(len(Tdata)), '*'*25)
        print('[*] Name of Song : ', song['title'])
        print('[*] Spotify ID of Song : ',song['id'])
Now you may be wondering what nextOffset is. We will discuss that later on. Now that we’ve got our response data, we have the following structure :
For now, we will only need the title, artists and ID. No need for the album and cover for now. For the next step, we would be clicking on download on a specific song on the website to see what requests it sends to the server side.
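To make that concrete, here is a short sketch that keeps only the three fields we care about. The payload below is made up to mimic the shape described above (the values are placeholders, not real API output), and summarize_tracks is a hypothetical helper name.

```python
def summarize_tracks(payload):
    """Keep only title, artists and id for each entry in trackList."""
    wanted = ('title', 'artists', 'id')
    return [{key: song.get(key) for key in wanted}
            for song in payload.get('trackList', [])]

# Made-up payload in the same shape as the API response (placeholder values)
sample = {
    'trackList': [
        {'id': 'ID_1', 'title': 'Song One', 'artists': 'Artist A',
         'album': 'Album X', 'cover': 'https://example.com/cover.jpg'},
    ],
    'nextOffset': None,
}

for track in summarize_tracks(sample):
    print(track)
```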
Processing Download Button on a Song
To see the requests and methods that are sent to the server, we select our song of choice, click download, and look at the requests in our network tab.
You would come across something like :
Here the ID is the ID of our song of choice from the JSON data that we received earlier. This is the data that the URL above sends back.
Why do we need this ?
To answer that question, we would look at the other requests.
We would see another request like this :
And this request takes a payload, also this request is a POST request, so something fishy happens here. Let’s take a look at the payload, and then we will look at the response.
This payload looks normal, but why did we need the first request?
If you look closely, you will notice that k_query contains the ID we got from the previous request. Hmm ?
So this looks like a typical request; we just need to change that one variable to the ID that was retrieved in the previous request.
Ahh! This has some very useful data, I would say. We can also download video with this, hmm, but we’re after the mp3. Let’s expand that.
Right here we have a k variable holding a code. The mp3128 entry is sufficient for our needs, so we will take note of it as well. I know this is getting a bit long, but we need to understand what we are getting back before we go straight into coding it. So we have all this data from this API. Let’s move on to see what else we have.
We have another POST request, let’s see what that one is about:
This also has a payload, and this link is convertV2 whilst the previous link was analyzeV2… Hmm, it seems we need to analyse first before we can convert successfully.
Yikes, we took two pieces of data from the analysis request and posted them to convertV2 to get some data.
Finally, we can see the download link of the song and its title. Nice, we can directly download this link, or we can do another post to get the data.
And that’s it for investigating the requests and methods. Now for the real part. Let’s code these into functions.
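Before writing the real functions, it may help to see how data flows between the steps: getId gives a YouTube ID, analyzeV2 turns it into a vid and a k code, and convertV2 turns those into a download link. The sketch below only builds the two POST payloads and makes no network calls. The helper names are mine, and fake_analyze holds placeholder values rather than real API output.

```python
def build_analyze_payload(yt_id):
    """Form data for the analyzeV2 POST, as seen in the network tab."""
    return {
        'k_query': f'https://www.youtube.com/watch?v={yt_id}',
        'k_page': 'home',
        'hl': 'en',
        'q_auto': 0,
    }

def build_convert_payload(analyze_json, quality='mp3128'):
    """Form data for the convertV2 POST, taken from the analyze response."""
    return {
        'vid': analyze_json['vid'],
        'k': analyze_json['links']['mp3'][quality]['k'],
    }

# Placeholder values only, to show which fields feed the next request
fake_analyze = {'vid': 'YT_ID', 'links': {'mp3': {'mp3128': {'k': 'K_CODE'}}}}
print(build_analyze_payload('YT_ID'))
print(build_convert_payload(fake_analyze))
```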
Coding Functions for the External API calls
Our first function fetches the ID from the getId endpoint of the API, so we can pass this ID to our analyze function request. Our function would look like this :
def get_ID(session, id):
    # Look up the YouTube ID for a Spotify song ID
    LINK = f'https://api.spotifydown.com/getId/{id}'
    headers = {
        'authority': 'api.spotifydown.com',
        'method': 'GET',
        'path': f'/getId/{id}',
        'origin': 'https://spotifydown.com',
        'referer': 'https://spotifydown.com/',
        'sec-ch-ua': '"Not_A Brand";v="99", "Google Chrome";v="109", "Chromium";v="109"',
        'sec-fetch-mode': 'cors',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'
    }
    response = session.get(url = LINK, headers=headers)
    if response.status_code == 200 :
        data = response.json()
        return data
    # Anything other than a 200 is treated as a failure
    return None
We now test this function :
data = get_ID(session=session, id='5QUbOFOkbDlt4ZY7bdL4Am')
if data is not None :
    print('[*] Results : ', data)
and our results would be :
Now let’s code our next function ( Analyze Function ) :
def generate_Analyze_id(session, yt_id):
    # Ask the analyzeV2 endpoint which download formats exist for this video
    DL = 'https://corsproxy.io/?https://www.y2mate.com/mates/analyzeV2/ajax'
    data = {
        'k_query': f'https://www.youtube.com/watch?v={yt_id}',
        'k_page': 'home',
        'hl': 'en',
        'q_auto': 0,
    }
    headers = {
        'authority': 'corsproxy.io',
        'method': 'POST',
        'path': '/?https://www.y2mate.com/mates/analyzeV2/ajax',
        'origin': 'https://spotifydown.com',
        'referer': 'https://spotifydown.com/',
        'sec-ch-ua': '"Not_A Brand";v="99", "Google Chrome";v="109", "Chromium";v="109"',
        'sec-fetch-mode': 'cors',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'
    }
    RES = session.post(url=DL, data=data, headers=headers)
    if RES.status_code ==200:
        return RES.json()
    return None
With generate_Analyze_id we use session.post instead of session.get, as this is a POST request and takes a payload. To send the payload, we pass it to session.post as data.
Using f-strings, we build k_query with yt_id in place of the ID; this is the ID returned by the getId request, which we pass into the function. Some header details are taken from the POST header that we saw in the network tab for this specific link. We call generate_Analyze_id as follows :
getId_id = data['id']
print('[*] getId : ', getId_id)
data_analyze = generate_Analyze_id(session=session, yt_id=getId_id)
if data_analyze is not None :
    print('[*] Results : ', data_analyze)
So far our functions are working amazingly. Let’s complete the generate_Conversion_id function :
def generate_Conversion_id(session, analyze_yt_id, analyze_id):
    # Exchange the vid and k code from the analyze step for a download link
    DL = 'https://corsproxy.io/?https://www.y2mate.com/mates/convertV2/index'
    data = {
        'vid' : analyze_yt_id,
        'k' : analyze_id,
    }
    headers = {
        'authority': 'corsproxy.io',
        'method': 'POST',
        'path': '/?https://www.y2mate.com/mates/convertV2/index',
        'origin': 'https://spotifydown.com',
        'referer': 'https://spotifydown.com/',
        'sec-ch-ua': '"Not_A Brand";v="99", "Google Chrome";v="109", "Chromium";v="109"',
        'sec-fetch-mode': 'cors',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'
    }
    RES = session.post(url=DL, data=data, headers=headers)
    if RES.status_code ==200:
        return RES.json()
    return None
Here is how we call our conversion function :
data_analyze_vid = data_analyze['vid']
data_analyze_id = data_analyze['links']['mp3']['mp3128']['k']
print('[*] Vid : ', data_analyze_vid)
print('[*] ID : ', data_analyze_id)
data_conversion = generate_Conversion_id(session, data_analyze_vid, data_analyze_id)
if data_conversion is not None :
    print('[*] Results : ', data_conversion)
Our results:
As you can see, we have our download link. If you click that link, the song will download directly in your browser.
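One thing to keep in mind: lookups such as data_analyze['links']['mp3']['mp3128']['k'] assume every key is present, and they raise an error if a response is missing one of them. A small helper can make those lookups safer. This is only a sketch; dig is a hypothetical name and the dictionary below contains placeholder values.

```python
def dig(data, *keys, default=None):
    """Walk nested dicts key by key; return default if any step is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

# Placeholder structure shaped like the analyze response
fake = {'links': {'mp3': {'mp3128': {'k': 'K_CODE'}}}}
print(dig(fake, 'links', 'mp3', 'mp3128', 'k'))
print(dig(fake, 'links', 'mp3', 'missing', 'k'))
```

With this, a song without an mp3128 entry gives None instead of crashing the whole run, and it also copes with a function that returned None.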
Finally, to download that, we just do the following :
## DOWNLOAD
DL_LINK = data_conversion['dlink']  # download link returned by convertV2
link = session.get(DL_LINK)
## Save
with open(os.path.join(LOCATION, filename), 'wb') as f:
    f.write(link.content)
Here os.path.join(LOCATION, filename) is the full path where we want the song saved, including the filename. To create a filename, we can do the following (this needs import string at the top of the program) :
filename = song['title'].translate(str.maketrans('', '', string.punctuation)) + ' - ' + song['artists'].translate(str.maketrans('', '', string.punctuation)) + '.mp3'
We do it like this because song titles sometimes contain characters that Windows does not allow in file names, e.g. “:”.
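The same idea can be wrapped in a small function so it is easy to reuse and test. This is a sketch rather than the exact code from the project: safe_filename is a hypothetical name, and collapsing the leftover double spaces is an extra step I added.

```python
import string

def safe_filename(title, artists, ext='.mp3'):
    """Build 'title - artists.mp3' with ASCII punctuation removed."""
    table = str.maketrans('', '', string.punctuation)
    name = f'{title.translate(table)} - {artists.translate(table)}'
    # Collapse the double spaces left behind where punctuation was removed
    name = ' '.join(name.split())
    return name + ext

print(safe_filename('Song: Part 2', 'Artist A/B'))  # Song Part 2 - Artist AB.mp3
```

Note that string.punctuation only covers ASCII punctuation, so other unusual characters in a title would pass through unchanged.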
So now to wrap up everything into a functional code:
import string  # needed for string.punctuation in the filename

Playlist_Link = f'https://api.spotifydown.com/trackList/playlist/{ID}'
session = requests.Session()
offset_data = {}
response = session.get(url = Playlist_Link)
offset = 0
while offset is not None :
    if response.status_code != 200 :
        break  # stop instead of looping forever on a failed request
    payload = response.json()
    Tdata = payload['trackList']
    page = payload['nextOffset']
    for count,song in enumerate(Tdata):
        yt_id = get_ID(session=session, id=song['id'])
        if yt_id is None :
            continue  # skip songs the API could not resolve
        filename = song['title'].translate(str.maketrans('', '', string.punctuation)) + ' - ' + song['artists'].translate(str.maketrans('', '', string.punctuation)) + '.mp3'
        print('*'*25, str(count+1) + '/' + str(len(Tdata)), '*'*25)
        print('[*] Name of Song : ', song['title'])
        print('[*] Spotify ID of Song : ',song['id'])
        print('[*] Youtube ID of Song : ',yt_id['id'])
        data = generate_Analyze_id(session = session, yt_id = yt_id['id'])
        DL_ID = data['links']['mp3']['mp3128']['k']
        DL_DATA = generate_Conversion_id(session= session, analyze_yt_id = data['vid'], analyze_id = DL_ID )
        DL_LINK = DL_DATA['dlink']
        ## DOWNLOAD MP3
        link = session.get(DL_LINK)
        ## Save MP3
        with open(os.path.join(LOCATION, filename), 'wb') as f:
            f.write(link.content)
    # Fetch the next page of up to 100 songs, or stop on the last page
    if page is not None:
        offset_data['offset'] = page
        response = session.get(url = Playlist_Link, params=offset_data)
    else:
        break
In the above code, you will notice that everything is wrapped inside a while loop; this is where the offset comes into play. The API returns at most 100 songs per request, but a playlist can have 135 songs or even 600 plus. After each page, the loop stores nextOffset from the response in offset_data['offset'] and requests the next page with it, and it keeps going until nextOffset comes back as None, as seen in this piece of code :
if page is not None:
    offset_data['offset'] = page
    response = session.get(url = Playlist_Link, params=offset_data)
else:
    break
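If you want to keep the paging logic separate from the downloading, one option is a generator that follows nextOffset for you. The sketch below is an assumption-laden illustration, not the project's code: iter_tracks and fake_fetch are names I made up, and fake_fetch serves made-up tracks two per page so the example runs offline.

```python
def iter_tracks(fetch_page):
    """Yield every track, following nextOffset until it is None.

    fetch_page(offset) must return a dict with 'trackList' and 'nextOffset',
    or None when the request failed.
    """
    offset = None
    while True:
        payload = fetch_page(offset)
        if payload is None:
            break  # failed request: stop instead of retrying forever
        yield from payload['trackList']
        offset = payload.get('nextOffset')
        if offset is None:
            break  # last page reached

# Offline stand-in for the API: 5 fake tracks served 2 per page
FAKE_TRACKS = [{'id': f'ID_{n}'} for n in range(5)]

def fake_fetch(offset):
    start = offset or 0
    end = start + 2
    return {'trackList': FAKE_TRACKS[start:end],
            'nextOffset': end if end < len(FAKE_TRACKS) else None}

print([t['id'] for t in iter_tracks(fake_fetch)])
```

With the real API, fetch_page could wrap the session.get call on Playlist_Link, passing the offset as a parameter and returning response.json() on a 200.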
And basically that’s it.
Here’s the link to the full code:
Github : https://github.com/surenjanath/Spotify-Playlist-Download
Follow me for more web scraping and other python programming articles if you like this.