24 Commits
0.0.1 ... 0.1.2

Author SHA1 Message Date
Tim Rae
a16f764bee Bump version to 0.1.2 2024-06-03 23:38:21 +02:00
Tim Rae
c1956d19cc Fix bug where occasionally wrong track is inserted 2024-06-03 23:38:21 +02:00
Tim Rae
faaf103d23 Bump version to 0.1.1 2024-06-03 22:32:28 +02:00
Tim Rae
a2e62ea20d Fix error due to missing type 2024-06-03 22:32:28 +02:00
Tim Rae
87ae9acbd3 Bump version to 0.1.0 2024-06-03 09:14:10 +02:00
Tim Rae
1e8366a0e8 Performance optimization / refactoring (#43)
This replaces #36 and adds some other fixes!

Execution speed should be much faster now, especially when there are not
many changes to synchronize.

* Maintain track cache between different playlists (thanks to @joshrmcdaniel for amazing work on that!)
* Fix incorrect tidal_playlist_is_dirty() implementation
* Remove more redundant API calls
* Avoid unnecessarily spinning up tasks for tracks that were in match failure cache
* Introduce new rate_limit configuration parameter implemented with leaky bucket rate-limiting algorithm
* Where possible, add new tracks to existing playlist instead of erasing the old ones
* Use asyncio multithreading instead of multiprocessing
* When user has large number of spotify playlists, fetch them in parallel instead of one by one
* More typing hints / typing fixes
2024-06-03 09:11:56 +02:00
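The rate limiting described in the list above can be sketched as follows. This is a minimal illustration in the spirit of the commit, not the project's code: `run_rate_limited`, its parameters, and the `fake_search` job are hypothetical names, and the defaults are chosen only to keep the demo fast.

```python
import asyncio
import time


async def run_rate_limited(jobs, max_concurrency=2, rate_limit=50.0):
    """Run zero-argument coroutine functions under a leaky-bucket style limit.

    Up to `max_concurrency` jobs start immediately; after that, one more job
    may start every 1/rate_limit seconds. All names here are illustrative.
    """
    bucket = asyncio.Semaphore(max_concurrency)

    async def leak():
        # Periodically free one slot, independent of when jobs finish.
        while True:
            await asyncio.sleep(1 / rate_limit)
            bucket.release()

    async def guarded(job):
        await bucket.acquire()  # wait for a free slot in the bucket
        return await job()

    leak_task = asyncio.create_task(leak())
    try:
        # gather() returns results in the same order as the submitted jobs.
        return await asyncio.gather(*(guarded(job) for job in jobs))
    finally:
        leak_task.cancel()


async def fake_search():
    # Stand-in for a real API call (hypothetical).
    await asyncio.sleep(0)
    return "ok"


if __name__ == "__main__":
    start = time.monotonic()
    results = asyncio.run(run_rate_limited([fake_search] * 6))
    print(len(results), "jobs finished in", round(time.monotonic() - start, 2), "s")
```

Because slots are released on a timer rather than when a job completes, the sustained start rate is bounded by `rate_limit` even if individual requests return quickly.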
Tim Rae
689637510d Fix AttributeError: 'UserPlaylist' object has no attribute 'requests' (#49)
Fixes #48
2024-05-27 16:36:19 +02:00
lokopeto
009db68283 import math (#46)
missing include from #40
2024-05-26 09:26:06 +02:00
Robin Hirst
8a1d0df6dc Add contributors section to readme.md (#44) 2024-05-25 23:03:37 +02:00
Tim Rae
9ad8f9e498 Remove requirements.txt and update readme 2024-05-25 12:13:02 +02:00
Tim Rae
fc20f7b577 Parallelize querying Spotify playlist tracks
This can take quite a long time on large playlists, so makes sense to do
in parallel
2024-05-25 12:13:02 +02:00
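A rough sketch of the parallel paging idea: fetch the first page to learn the total, compute the remaining offsets, and fetch those pages in worker threads. The names `remaining_offsets`, `fetch_all`, and `fake_fetch_page` are hypothetical, and the fake page function stands in for a blocking API request.

```python
import asyncio
import math


def remaining_offsets(total, limit):
    """Offsets of every page after the first, given the total item count."""
    return [limit * n for n in range(1, math.ceil(total / limit))]


async def fetch_all(fetch_page, limit=100):
    """Fetch page 0 to learn the total, then fetch the other pages in threads.

    `fetch_page(offset)` is a hypothetical blocking callable returning a dict
    with 'items' and 'total' keys.
    """
    first = fetch_page(0)
    items = list(first["items"])
    offsets = remaining_offsets(first["total"], limit)
    # gather() preserves the order of the offsets, so items stay in order.
    pages = await asyncio.gather(*(asyncio.to_thread(fetch_page, o) for o in offsets))
    for page in pages:
        items.extend(page["items"])
    return items


def fake_fetch_page(offset, total=250, limit=100):
    # Stand-in for a blocking HTTP request (hypothetical).
    return {"items": list(range(offset, min(offset + limit, total))), "total": total}


if __name__ == "__main__":
    tracks = asyncio.run(fetch_all(fake_fetch_page))
    print(len(tracks))
```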
Tim Rae
bc75fbf779 Reduce default num subprocesses
The recent caching changes seem to have worsened the situation for me
2024-05-25 12:13:02 +02:00
Tim Rae
42ddaff7b9 Show more progress updates when starting up script 2024-05-25 12:13:02 +02:00
Tim Rae
1a2aedf7a5 Set 2s timeout when connecting to Spotify
I've been having an issue where the script hangs for a long time
when starting up. It seems to be caused by the connection request
to Spotify failing on the first request (which uses IPv6) and
having to wait for the retry attempt using IPv4

2s should be plenty for just getting the refresh token and is
much more tolerable in case of IPv6 failure
2024-05-25 12:13:02 +02:00
Tim Rae
8884ec8c8f Merge pull request #41 from spotify2tidal/feature/miss_cache
Add cache of match failures
2024-05-25 09:41:12 +02:00
Josh Mcdaniel
6294638613 Merge pull request #42 from spotify2tidal/bug/missing_file
Add missing config file
2024-05-24 17:10:37 -05:00
joshrmcdaniel
b25e1c3b36 rm playlist import 2024-05-24 16:59:26 -05:00
joshrmcdaniel
df3e406570 playlist 2024-05-24 16:59:01 -05:00
Tim Rae
311822ecdc Add cache of match failures
This change introduces an sqlite database that contains the track_id,
db insertion time, and ttl in the cache. The ttl starts with one week,
and increases exponentially by a factor of 2 each time the same track_id
is added to the database.

This significantly reduces the execution time of the script when a lot
of match failures have accumulated, since they do not need to be
rechecked on every run.
2024-05-22 14:54:17 +02:00
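The doubling TTL described in this commit message could look roughly like the sketch below. It uses the standard-library `sqlite3` module instead of the project's own database layer, and the class name, schema, and method names are assumptions made for illustration only.

```python
import sqlite3
from datetime import datetime, timedelta

INITIAL_TTL = timedelta(days=7)


class FailureCache:
    """Illustrative match-failure cache; the schema and names are assumptions."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS match_failures ("
            "track_id TEXT PRIMARY KEY, insert_time TEXT, ttl_seconds REAL)"
        )

    def record_failure(self, track_id, now):
        """Store a failure; the TTL doubles each time the same id is re-added."""
        row = self.db.execute(
            "SELECT ttl_seconds FROM match_failures WHERE track_id = ?", (track_id,)
        ).fetchone()
        ttl = INITIAL_TTL.total_seconds() if row is None else row[0] * 2
        with self.db:  # commits the write on success
            self.db.execute(
                "INSERT OR REPLACE INTO match_failures VALUES (?, ?, ?)",
                (track_id, now.isoformat(), ttl),
            )
        return timedelta(seconds=ttl)

    def is_cached(self, track_id, now):
        """True while the stored failure has not yet expired."""
        row = self.db.execute(
            "SELECT insert_time, ttl_seconds FROM match_failures WHERE track_id = ?",
            (track_id,),
        ).fetchone()
        if row is None:
            return False
        return now < datetime.fromisoformat(row[0]) + timedelta(seconds=row[1])
```

Passing `now` explicitly keeps the sketch deterministic; a real cache would more likely read the clock itself.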
Josh Mcdaniel
e2236e429e Merge pull request #34 from spotify2tidal/dev/package
Move code into package
2024-05-12 11:10:44 -05:00
joshrmcdaniel
9e3285686e toml 2024-05-12 10:46:57 -05:00
joshrmcdaniel
76f502f2bc types 2024-05-12 10:46:39 -05:00
joshrmcdaniel
6aaf72bdd1 type hint, move 2024-05-12 10:27:23 -05:00
joshrmcdaniel
4e0c81071b move to src 2024-05-12 10:20:52 -05:00
14 changed files with 624 additions and 333 deletions

2
.gitignore vendored

@@ -1,7 +1,7 @@
# Config and cache files
config.yml
config.yaml
.cache-*
.cache*
.session.yml
# Byte-compiled / optimized / DLL files


@@ -13,3 +13,7 @@ spotify:
# uncomment this block if you want to sync all playlists in the account with some exceptions
#excluded_playlists:
# - spotify:playlist:1ABCDEqsABCD6EaABCDa0a
# increasing these parameters should increase the search speed, while decreasing reduces likelihood of 429 errors
max_concurrency: 10 # max concurrent connections at any given time
rate_limit: 12 # max sustained connections per second

23
pyproject.toml Normal file

@@ -0,0 +1,23 @@
[build-system]
requires = ["setuptools >= 61.0", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "spotify_to_tidal"
version = "0.1.2"
requires-python = ">= 3.10"
dependencies = [
"spotipy~=2.21.0",
"tidalapi==0.7.6",
"pyyaml~=6.0",
"tqdm~=4.64",
"sqlalchemy~=2.0"
]
[tool.setuptools.packages.find]
where = ["src"]
include = ["spotify_to_tidal*"]
[project.scripts]
spotify_to_tidal = "spotify_to_tidal.__main__:main"


@@ -5,7 +5,7 @@ Installation
Clone this git repository and then run:
```bash
python3 -m pip install -r requirements.txt
python3 -m pip install -e .
```
Setup
@@ -18,18 +18,24 @@ Setup
Usage
----
To synchronize all of your Spotify playlists with your Tidal account run the following
To synchronize all of your Spotify playlists with your Tidal account run the following from the project root directory
```bash
python3 sync.py
spotify_to_tidal
```
This will take a long time because the Tidal API is really slow.
You can also just synchronize a specific playlist by doing the following:
```bash
python3 sync.py --uri 1ABCDEqsABCD6EaABCDa0a
spotify_to_tidal --uri 1ABCDEqsABCD6EaABCDa0a # accepts playlist id or full playlist uri
```
See example_config.yml for more configuration options, and `sync.py --help` for more options.
See example_config.yml for more configuration options, and `spotify_to_tidal --help` for command-line options.
---
#### Join our amazing community as a code contributor
<br><br>
<a href="https://github.com/spotify2tidal/spotify_to_tidal/graphs/contributors">
<img class="dark-light" src="https://contrib.rocks/image?repo=spotify2tidal/spotify_to_tidal&anon=0&columns=25&max=100&r=true" />
</a>


@@ -1,5 +0,0 @@
spotipy==2.21.0
requests>=2.28.1 # for tidalapi
tidalapi==0.7.2
pyyaml==6.0
tqdm==4.64.1


@@ -0,0 +1,37 @@
import yaml
import argparse
import sys
from . import sync as _sync
from . import auth as _auth
def main():
parser = argparse.ArgumentParser()
parser.add_argument('--config', default='config.yml', help='location of the config file')
parser.add_argument('--uri', help='synchronize a specific URI instead of the one in the config')
args = parser.parse_args()
with open(args.config, 'r') as f:
config = yaml.safe_load(f)
print("Opening Spotify session")
spotify_session = _auth.open_spotify_session(config['spotify'])
print("Opening Tidal session")
tidal_session = _auth.open_tidal_session()
if not tidal_session.check_login():
sys.exit("Could not connect to Tidal")
if args.uri:
# if a playlist ID is explicitly provided as a command line argument then use that
spotify_playlist = spotify_session.playlist(args.uri)
tidal_playlists = _sync.get_tidal_playlists_dict(tidal_session)
tidal_playlist = _sync.pick_tidal_playlist_for_spotify_playlist(spotify_playlist, tidal_playlists)
_sync.sync_list(spotify_session, tidal_session, [tidal_playlist], config)
elif config.get('sync_playlists', None):
# if the config contains a sync_playlists list of mappings then use that
_sync.sync_list(spotify_session, tidal_session, _sync.get_playlists_from_config(spotify_session, tidal_session, config), config)
else:
# otherwise just use the user playlists in the Spotify account
_sync.sync_list(spotify_session, tidal_session, _sync.get_user_playlist_mappings(spotify_session, tidal_session, config), config)
if __name__ == '__main__':
main()
sys.exit(0)


@@ -6,12 +6,18 @@ import tidalapi
import webbrowser
import yaml
def open_spotify_session(config):
__all__ = [
'open_spotify_session',
'open_tidal_session'
]
def open_spotify_session(config) -> spotipy.Spotify:
credentials_manager = spotipy.SpotifyOAuth(username=config['username'],
scope='playlist-read-private',
client_id=config['client_id'],
client_secret=config['client_secret'],
redirect_uri=config['redirect_uri'])
redirect_uri=config['redirect_uri'],
requests_timeout=2)
try:
credentials_manager.get_access_token(as_dict=False)
except spotipy.SpotifyOauthError:
@@ -19,7 +25,7 @@ def open_spotify_session(config):
return spotipy.Spotify(oauth_manager=credentials_manager)
def open_tidal_session(config = None):
def open_tidal_session(config = None) -> tidalapi.Session:
try:
with open('.session.yml', 'r') as session_file:
previous_session = yaml.safe_load(session_file)


@@ -0,0 +1,84 @@
import datetime
import sqlalchemy
from sqlalchemy import Table, Column, String, DateTime, MetaData, insert, select, update, delete
from typing import Dict, List, Sequence, Set, Mapping
class MatchFailureDatabase:
"""
sqlite database of match failures which persists between runs
this can be used concurrently between multiple processes
"""
def __init__(self, filename='.cache.db'):
self.engine = sqlalchemy.create_engine(f"sqlite:///{filename}")
meta = MetaData()
self.match_failures = Table('match_failures', meta,
Column('track_id', String,
primary_key=True),
Column('insert_time', DateTime),
Column('next_retry', DateTime),
sqlite_autoincrement=False)
meta.create_all(self.engine)
def _get_next_retry_time(self, insert_time: datetime.datetime | None = None) -> datetime.datetime:
if insert_time:
# double interval on each retry
interval = 2 * (datetime.datetime.now() - insert_time)
else:
interval = datetime.timedelta(days=7)
return datetime.datetime.now() + interval
def cache_match_failure(self, track_id: str):
""" notifies that matching failed for the given track_id """
fetch_statement = select(self.match_failures).where(
self.match_failures.c.track_id == track_id)
with self.engine.connect() as connection:
with connection.begin():
# Either update the next_retry time if track_id already exists, otherwise create a new entry
existing_failure = connection.execute(
fetch_statement).fetchone()
if existing_failure:
update_statement = update(self.match_failures).where(
self.match_failures.c.track_id == track_id).values(next_retry=self._get_next_retry_time(existing_failure.insert_time))
connection.execute(update_statement)
else:
connection.execute(insert(self.match_failures), {
"track_id": track_id, "insert_time": datetime.datetime.now(), "next_retry": self._get_next_retry_time()})
def has_match_failure(self, track_id: str) -> bool:
""" checks if there was a recent search for which matching failed with the given track_id """
statement = select(self.match_failures.c.next_retry).where(
self.match_failures.c.track_id == track_id)
with self.engine.connect() as connection:
match_failure = connection.execute(statement).fetchone()
if match_failure:
return match_failure.next_retry > datetime.datetime.now()
return False
def remove_match_failure(self, track_id: str):
""" removes match failure from the database """
statement = delete(self.match_failures).where(
self.match_failures.c.track_id == track_id)
with self.engine.connect() as connection:
with connection.begin():
connection.execute(statement)
class TrackMatchCache:
"""
Non-persistent mapping of spotify ids -> tidal_ids
This should NOT be accessed concurrently from multiple processes
"""
data: Dict[str, int] = {}
def get(self, track_id: str) -> int | None:
return self.data.get(track_id, None)
def insert(self, mapping: tuple[str, int]):
self.data[mapping[0]] = mapping[1]
# Main singleton instance
failure_cache = MatchFailureDatabase()
track_match_cache = TrackMatchCache()

326
src/spotify_to_tidal/sync.py Executable file

@@ -0,0 +1,326 @@
#!/usr/bin/env python3
import asyncio
from .cache import failure_cache, track_match_cache
from functools import partial
from typing import List, Sequence, Set, Mapping
import math
import requests
import sys
import spotipy
import tidalapi
from .tidalapi_patch import add_multiple_tracks_to_playlist, set_tidal_playlist
import time
from tqdm.asyncio import tqdm as atqdm
import traceback
import unicodedata
from .type import spotify as t_spotify
def normalize(s) -> str:
return unicodedata.normalize('NFD', s).encode('ascii', 'ignore').decode('ascii')
def simple(input_string: str) -> str:
# only take the first part of a string before any hyphens or brackets to account for different versions
return input_string.split('-')[0].strip().split('(')[0].strip().split('[')[0].strip()
def isrc_match(tidal_track: tidalapi.Track, spotify_track) -> bool:
if "isrc" in spotify_track["external_ids"]:
return tidal_track.isrc == spotify_track["external_ids"]["isrc"]
return False
def duration_match(tidal_track: tidalapi.Track, spotify_track, tolerance=2) -> bool:
# the duration of the two tracks must be the same to within 2 seconds
return abs(tidal_track.duration - spotify_track['duration_ms']/1000) < tolerance
def name_match(tidal_track, spotify_track) -> bool:
def exclusion_rule(pattern: str, tidal_track: tidalapi.Track, spotify_track: t_spotify.SpotifyTrack):
spotify_has_pattern = pattern in spotify_track['name'].lower()
tidal_has_pattern = pattern in tidal_track.name.lower() or (not tidal_track.version is None and (pattern in tidal_track.version.lower()))
return spotify_has_pattern != tidal_has_pattern
# handle some edge cases
if exclusion_rule("instrumental", tidal_track, spotify_track): return False
if exclusion_rule("acapella", tidal_track, spotify_track): return False
if exclusion_rule("remix", tidal_track, spotify_track): return False
# the simplified version of the Spotify track name must be a substring of the Tidal track name
# Try with both un-normalized and then normalized
simple_spotify_track = simple(spotify_track['name'].lower()).split('feat.')[0].strip()
return simple_spotify_track in tidal_track.name.lower() or normalize(simple_spotify_track) in normalize(tidal_track.name.lower())
def artist_match(tidal_track: tidalapi.Track, spotify_track) -> bool:
def split_artist_name(artist: str) -> Sequence[str]:
if '&' in artist:
return artist.split('&')
elif ',' in artist:
return artist.split(',')
else:
return [artist]
def get_tidal_artists(tidal_track: tidalapi.Track, do_normalize=False) -> Set[str]:
result: list[str] = []
for artist in tidal_track.artists:
if do_normalize:
artist_name = normalize(artist.name)
else:
artist_name = artist.name
result.extend(split_artist_name(artist_name))
return set([simple(x.strip().lower()) for x in result])
def get_spotify_artists(spotify_track: t_spotify.SpotifyTrack, do_normalize=False) -> Set[str]:
result: list[str] = []
for artist in spotify_track['artists']:
if do_normalize:
artist_name = normalize(artist['name'])
else:
artist_name = artist['name']
result.extend(split_artist_name(artist_name))
return set([simple(x.strip().lower()) for x in result])
# There must be at least one overlapping artist between the Tidal and Spotify track
# Try with both un-normalized and then normalized
if get_tidal_artists(tidal_track).intersection(get_spotify_artists(spotify_track)) != set():
return True
return get_tidal_artists(tidal_track, True).intersection(get_spotify_artists(spotify_track, True)) != set()
def match(tidal_track, spotify_track) -> bool:
if not spotify_track['id']: return False
return isrc_match(tidal_track, spotify_track) or (
duration_match(tidal_track, spotify_track)
and name_match(tidal_track, spotify_track)
and artist_match(tidal_track, spotify_track)
)
async def tidal_search(spotify_track, rate_limiter, tidal_session: tidalapi.Session) -> tidalapi.Track | None:
def _search_for_track_in_album():
# search for album name and first album artist
if 'album' in spotify_track and 'artists' in spotify_track['album'] and len(spotify_track['album']['artists']):
album_result = tidal_session.search(simple(spotify_track['album']['name']) + " " + simple(spotify_track['album']['artists'][0]['name']), models=[tidalapi.album.Album])
for album in album_result['albums']:
album_tracks = album.tracks()
if len(album_tracks) >= spotify_track['track_number']:
track = album_tracks[spotify_track['track_number'] - 1]
if match(track, spotify_track):
failure_cache.remove_match_failure(spotify_track['id'])
return track
def _search_for_standalone_track():
# if album search fails then search for track name and first artist
for track in tidal_session.search(simple(spotify_track['name']) + ' ' + simple(spotify_track['artists'][0]['name']), models=[tidalapi.media.Track])['tracks']:
if match(track, spotify_track):
failure_cache.remove_match_failure(spotify_track['id'])
return track
await rate_limiter.acquire()
album_search = await asyncio.to_thread( _search_for_track_in_album )
if album_search:
return album_search
await rate_limiter.acquire()
track_search = await asyncio.to_thread( _search_for_standalone_track )
if track_search:
return track_search
# if none of the search modes succeeded then store the track id to the failure cache
failure_cache.cache_match_failure(spotify_track['id'])
return None
def get_tidal_playlists_dict(tidal_session: tidalapi.Session) -> Mapping[str, tidalapi.Playlist]:
# a dictionary of name --> playlist
print("Loading Tidal playlists... This may take some time.")
tidal_playlists = tidal_session.user.playlists()
output = {}
for playlist in tidal_playlists:
output[playlist.name] = playlist
return output
async def repeat_on_request_error(function, *args, remaining=5, **kwargs):
# utility to repeat calling the function up to 5 times if an exception is thrown
try:
return await function(*args, **kwargs)
except (tidalapi.exceptions.TooManyRequests, requests.exceptions.RequestException) as e:
if remaining:
print(f"{str(e)} occurred, retrying {remaining} times")
else:
print(f"{str(e)} could not be recovered")
if isinstance(e, requests.exceptions.RequestException) and not e.response is None:
print(f"Response message: {e.response.text}")
print(f"Response headers: {e.response.headers}")
if not remaining:
print("Aborting sync")
print(f"The following arguments were provided:\n\n {str(args)}")
print(traceback.format_exc())
sys.exit(1)
sleep_schedule = {5: 1, 4:10, 3:60, 2:5*60, 1:10*60} # sleep variable length of time depending on retry number
time.sleep(sleep_schedule.get(remaining, 1))
return await repeat_on_request_error(function, *args, remaining=remaining-1, **kwargs)
async def get_tracks_from_spotify_playlist(spotify_session: spotipy.Spotify, spotify_playlist):
def _get_tracks_from_spotify_playlist(offset: int, spotify_session: spotipy.Spotify, playlist_id: str):
fields="next,total,limit,items(track(name,album(name,artists),artists,track_number,duration_ms,id,external_ids(isrc)))"
return spotify_session.playlist_tracks(playlist_id, fields, offset=offset)
output = []
print(f"Loading tracks from Spotify playlist '{spotify_playlist['name']}'")
results = _get_tracks_from_spotify_playlist( 0, spotify_session, spotify_playlist["id"] )
output.extend([r['track'] for r in results['items'] if r['track'] is not None])
# get all the remaining tracks in parallel
if results['next']:
offsets = [ results['limit'] * n for n in range(1, math.ceil(results['total']/results['limit'])) ]
extra_results = await atqdm.gather( *[asyncio.to_thread(_get_tracks_from_spotify_playlist, offset, spotify_session=spotify_session, playlist_id=spotify_playlist["id"]) for offset in offsets ] )
for extra_result in extra_results:
output.extend([r['track'] for r in extra_result['items'] if r['track'] is not None])
return output
def populate_track_match_cache(spotify_tracks_: Sequence[t_spotify.SpotifyTrack], tidal_tracks_: Sequence[tidalapi.Track]):
""" Populate the track match cache with all the existing tracks in Tidal playlist corresponding to Spotify playlist """
def _populate_one_track_from_spotify(spotify_track: t_spotify.SpotifyTrack):
for idx, tidal_track in list(enumerate(tidal_tracks)):
if match(tidal_track, spotify_track):
track_match_cache.insert((spotify_track['id'], tidal_track.id))
tidal_tracks.pop(idx)
return
def _populate_one_track_from_tidal(tidal_track: tidalapi.Track):
for idx, spotify_track in list(enumerate(spotify_tracks)):
if match(tidal_track, spotify_track):
track_match_cache.insert((spotify_track['id'], tidal_track.id))
spotify_tracks.pop(idx)
return
# make a copy of the tracks to avoid modifying original arrays
spotify_tracks = [t for t in spotify_tracks_]
tidal_tracks = [t for t in tidal_tracks_]
# first populate from the tidal tracks
for track in tidal_tracks:
_populate_one_track_from_tidal(track)
# then populate from the subset of Spotify tracks that didn't match (to account for many-to-one style mappings)
for track in spotify_tracks:
_populate_one_track_from_spotify(track)
def get_new_tracks_from_spotify_playlist(spotify_tracks: Sequence[t_spotify.SpotifyTrack], old_tidal_tracks: Sequence[tidalapi.Track]) -> list[t_spotify.SpotifyTrack]:
''' Extracts only the new tracks in the Spotify playlist that are not already on Tidal or known match failures '''
populate_track_match_cache(spotify_tracks, old_tidal_tracks)
results = []
for spotify_track in spotify_tracks:
if not spotify_track['id']: continue
if not track_match_cache.get(spotify_track['id']) and not failure_cache.has_match_failure(spotify_track['id']):
results.append(spotify_track)
return results
def get_tracks_for_new_tidal_playlist(spotify_tracks: Sequence[t_spotify.SpotifyTrack]) -> Sequence[int]:
''' gets list of corresponding tidal track ids for each spotify track, ignoring duplicates '''
output = []
seen_tracks = set()
for spotify_track in spotify_tracks:
if not spotify_track['id']: continue
tidal_id = track_match_cache.get(spotify_track['id'])
if tidal_id and not tidal_id in seen_tracks:
output.append(tidal_id)
seen_tracks.add(tidal_id)
return output
async def sync_playlist(spotify_session: spotipy.Spotify, tidal_session: tidalapi.Session, spotify_playlist, tidal_playlist: tidalapi.Playlist | None, config):
async def _run_rate_limiter(semaphore):
''' Leaky bucket algorithm for rate limiting. Periodically releases an item from semaphore at rate_limit'''
while True:
await asyncio.sleep(1/config.get('rate_limit', 12)) # sleep for min time between new function executions
semaphore.release() # leak one item from the 'bucket'
# Create a new Tidal playlist if required
if not tidal_playlist:
print(f"No playlist found on Tidal corresponding to Spotify playlist: '{spotify_playlist['name']}', creating new playlist")
tidal_playlist = tidal_session.user.create_playlist(spotify_playlist['name'], spotify_playlist['description'])
# Extract the new tracks from the playlist that we haven't already seen before
spotify_tracks = await get_tracks_from_spotify_playlist(spotify_session, spotify_playlist)
old_tidal_tracks = tidal_playlist.tracks()
tracks_to_search = get_new_tracks_from_spotify_playlist(spotify_tracks, old_tidal_tracks)
if not tracks_to_search:
print("No new tracks to search in Spotify playlist '{}'".format(spotify_playlist['name']))
return
# Search for each of the tracks on Tidal concurrently
task_description = "Searching Tidal for {}/{} tracks in Spotify playlist '{}'".format(len(tracks_to_search), len(spotify_tracks), spotify_playlist['name'])
semaphore = asyncio.Semaphore(config.get('max_concurrency', 10))
rate_limiter_task = asyncio.create_task(_run_rate_limiter(semaphore))
search_results = await atqdm.gather( *[ repeat_on_request_error(tidal_search, t, semaphore, tidal_session) for t in tracks_to_search ], desc=task_description )
rate_limiter_task.cancel()
# Add the search results to the cache
for idx, spotify_track in enumerate(tracks_to_search):
if search_results[idx]:
track_match_cache.insert( (spotify_track['id'], search_results[idx].id) )
else:
color = ('\033[91m', '\033[0m')
print(color[0] + "Could not find track {}: {} - {}".format(spotify_track['id'], ",".join([a['name'] for a in spotify_track['artists']]), spotify_track['name']) + color[1])
# Update the Tidal playlist if there are changes
old_tidal_track_ids = [t.id for t in old_tidal_tracks]
new_tidal_track_ids = get_tracks_for_new_tidal_playlist(spotify_tracks)
if new_tidal_track_ids == old_tidal_track_ids:
print("No changes to write to Tidal playlist")
elif new_tidal_track_ids[:len(old_tidal_track_ids)] == old_tidal_track_ids:
# Append new tracks to the existing playlist if possible
add_multiple_tracks_to_playlist(tidal_playlist, new_tidal_track_ids[len(old_tidal_track_ids):])
else:
# Erase old playlist and add new tracks from scratch if any reordering occurred
set_tidal_playlist(tidal_playlist, new_tidal_track_ids)
def sync_list(spotify_session: spotipy.Spotify, tidal_session: tidalapi.Session, playlists, config):
for spotify_playlist, tidal_playlist in playlists:
# sync the spotify playlist to tidal
asyncio.run(sync_playlist(spotify_session, tidal_session, spotify_playlist, tidal_playlist, config) )
def pick_tidal_playlist_for_spotify_playlist(spotify_playlist, tidal_playlists: Mapping[str, tidalapi.Playlist]):
if spotify_playlist['name'] in tidal_playlists:
# if there's an existing tidal playlist with the name of the current playlist then use that
tidal_playlist = tidal_playlists[spotify_playlist['name']]
return (spotify_playlist, tidal_playlist)
else:
return (spotify_playlist, None)
def get_user_playlist_mappings(spotify_session: spotipy.Spotify, tidal_session: tidalapi.Session, config):
results = []
spotify_playlists = asyncio.run(get_playlists_from_spotify(spotify_session, config))
tidal_playlists = get_tidal_playlists_dict(tidal_session)
for spotify_playlist in spotify_playlists:
results.append( pick_tidal_playlist_for_spotify_playlist(spotify_playlist, tidal_playlists) )
return results
async def get_playlists_from_spotify(spotify_session: spotipy.Spotify, config):
# get all the user playlists from the Spotify account
playlists = []
print("Loading Spotify playlists")
results = spotify_session.user_playlists(config['spotify']['username'])
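exclude_list = set([x.split(':')[-1] for x in config.get('excluded_playlists', [])])
playlists.extend([p for p in results['items'] if p['owner']['id'] == config['spotify']['username'] and not p['id'] in exclude_list])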
# get all the remaining playlists in parallel
if results['next']:
offsets = [ results['limit'] * n for n in range(1, math.ceil(results['total']/results['limit'])) ]
extra_results = await atqdm.gather( *[asyncio.to_thread(spotify_session.user_playlists, config['spotify']['username'], offset=offset) for offset in offsets ] )
for extra_result in extra_results:
playlists.extend([p for p in extra_result['items'] if p['owner']['id'] == config['spotify']['username'] and not p['id'] in exclude_list])
return playlists
def get_playlists_from_config(spotify_session: spotipy.Spotify, tidal_session: tidalapi.Session, config):
# get the list of playlist sync mappings from the configuration file
def get_playlist_ids(config):
return [(item['spotify_id'], item['tidal_id']) for item in config['sync_playlists']]
output = []
for spotify_id, tidal_id in get_playlist_ids(config):
try:
spotify_playlist = spotify_session.playlist(spotify_id)
except spotipy.SpotifyException as e:
print(f"Error getting Spotify playlist {spotify_id}")
raise e
try:
tidal_playlist = tidal_session.playlist(tidal_id)
except Exception as e:
print(f"Error getting Tidal playlist {tidal_id}")
raise e
output.append((spotify_playlist, tidal_playlist))
return output
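As a small worked example of the string helpers used by the matching functions above, the snippet below copies `normalize` and `simple` as they appear in this diff and runs them on a few made-up titles. The input strings are illustrative only; the last case shows that hyphenated names are truncated too, which is a side effect of splitting on hyphens.

```python
import unicodedata


def normalize(s: str) -> str:
    # Decompose accented characters, then drop anything outside ASCII.
    return unicodedata.normalize("NFD", s).encode("ascii", "ignore").decode("ascii")


def simple(input_string: str) -> str:
    # Keep only the text before the first hyphen, "(" or "[".
    return input_string.split("-")[0].strip().split("(")[0].strip().split("[")[0].strip()


if __name__ == "__main__":
    print(normalize("Beyoncé"))                   # Beyonce
    print(simple("Heroes - 2017 Remaster"))       # Heroes
    print(simple("Stay (feat. Someone) [Live]"))  # Stay
    print(simple("Jay-Z"))                        # Jay
```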


@@ -1,19 +1,21 @@
from typing import List
import tidalapi
from tqdm import tqdm
def _remove_indices_from_playlist(playlist, indices):
def _remove_indices_from_playlist(playlist: tidalapi.UserPlaylist, indices: List[int]):
headers = {'If-None-Match': playlist._etag}
index_string = ",".join(map(str, indices))
playlist.requests.request('DELETE', (playlist._base_url + '/items/%s') % (playlist.id, index_string), headers=headers)
playlist.request.request('DELETE', (playlist._base_url + '/items/%s') % (playlist.id, index_string), headers=headers)
playlist._reparse()
def clear_tidal_playlist(playlist, chunk_size=20):
def clear_tidal_playlist(playlist: tidalapi.UserPlaylist, chunk_size: int=20):
with tqdm(desc="Erasing existing tracks from Tidal playlist", total=playlist.num_tracks) as progress:
while playlist.num_tracks:
indices = range(min(playlist.num_tracks, chunk_size))
_remove_indices_from_playlist(playlist, indices)
progress.update(len(indices))
def add_multiple_tracks_to_playlist(playlist, track_ids, chunk_size=20):
def add_multiple_tracks_to_playlist(playlist: tidalapi.UserPlaylist, track_ids: List[int], chunk_size: int=20):
offset = 0
with tqdm(desc="Adding new tracks to Tidal playlist", total=len(track_ids)) as progress:
while offset < len(track_ids):
@@ -22,6 +24,6 @@ def add_multiple_tracks_to_playlist(playlist, track_ids, chunk_size=20):
offset += count
progress.update(count)
def set_tidal_playlist(playlist, track_ids):
def set_tidal_playlist(playlist: tidalapi.Playlist, track_ids: List[int]):
clear_tidal_playlist(playlist)
add_multiple_tracks_to_playlist(playlist, track_ids)


@@ -0,0 +1,25 @@
from .config import SpotifyConfig, TidalConfig, PlaylistConfig, SyncConfig
from .spotify import SpotifyTrack
from spotipy import Spotify
from tidalapi import Playlist, Session, Track
TidalID = str
SpotifyID = str
TidalSession = Session
TidalTrack = Track
TidalPlaylist = Playlist
SpotifySession = Spotify
__all__ = [
"SpotifyConfig",
"TidalConfig",
"PlaylistConfig",
"SyncConfig",
"TidalPlaylist",
"TidalID",
"SpotifyID",
"SpotifySession",
"TidalSession",
"TidalTrack",
"SpotifyTrack",
]


@@ -0,0 +1,26 @@
from typing import TypedDict, Literal, List, Optional
class SpotifyConfig(TypedDict):
client_id: str
client_secret: str
username: str
redirect_uri: str
class TidalConfig(TypedDict):
access_token: str
refresh_token: str
session_id: str
token_type: Literal["Bearer"]
class PlaylistConfig(TypedDict):
spotify_id: str
tidal_id: str
class SyncConfig(TypedDict):
spotify: SpotifyConfig
sync_playlists: Optional[List[PlaylistConfig]]
excluded_playlists: Optional[List[str]]


@@ -0,0 +1,69 @@
from spotipy import Spotify
from typing import TypedDict, List, Dict, Mapping, Literal, Optional
class SpotifyImage(TypedDict):
url: str
height: int
width: int
class SpotifyFollower(TypedDict):
href: str
total: int
SpotifyID = str
SpotifySession = Spotify
class SpotifyArtist(TypedDict):
external_urls: Mapping[str, str]
followers: SpotifyFollower
genres: List[str]
href: str
id: str
images: List[SpotifyImage]
name: str
popularity: int
type: str
uri: str
class SpotifyAlbum(TypedDict):
album_type: Literal["album", "single", "compilation"]
total_tracks: int
available_markets: List[str]
external_urls: Dict[str, str]
href: str
id: str
images: List[SpotifyImage]
name: str
release_date: str
release_date_precision: Literal["year", "month", "day"]
restrictions: Optional[Dict[Literal["reason"], str]]
type: Literal["album"]
uri: str
artists: List[SpotifyArtist]
class SpotifyTrack(TypedDict):
album: SpotifyAlbum
artists: List[SpotifyArtist]
available_markets: List[str]
disc_number: int
duration_ms: int
explicit: bool
external_ids: Dict[str, str]
external_urls: Dict[str, str]
href: str
id: str
is_playable: bool
linked_from: Dict
restrictions: Optional[Dict[Literal["reason"], str]]
name: str
popularity: int
preview_url: str
track_number: int
type: Literal["track"]
uri: str

312
sync.py

@@ -1,312 +0,0 @@
#!/usr/bin/env python3
import argparse
from auth import open_tidal_session, open_spotify_session
from functools import partial
from multiprocessing import Pool
import requests
import sys
import spotipy
import tidalapi
from tidalapi_patch import set_tidal_playlist
import time
from tqdm import tqdm
import traceback
import unicodedata
import yaml

def normalize(s):
    return unicodedata.normalize('NFD', s).encode('ascii', 'ignore').decode('ascii')

def simple(input_string):
    # only take the first part of a string before any hyphens or brackets to account for different versions
    return input_string.split('-')[0].strip().split('(')[0].strip().split('[')[0].strip()

def isrc_match(tidal_track, spotify_track):
    if "isrc" in spotify_track["external_ids"]:
        return tidal_track.isrc == spotify_track["external_ids"]["isrc"]
    return False

def duration_match(tidal_track, spotify_track, tolerance=2):
    # the duration of the two tracks must be the same to within 2 seconds
    return abs(tidal_track.duration - spotify_track['duration_ms']/1000) < tolerance

def name_match(tidal_track, spotify_track):
    def exclusion_rule(pattern, tidal_track, spotify_track):
        spotify_has_pattern = pattern in spotify_track['name'].lower()
        tidal_has_pattern = pattern in tidal_track.name.lower() or (not tidal_track.version is None and (pattern in tidal_track.version.lower()))
        return spotify_has_pattern != tidal_has_pattern

    # handle some edge cases
    if exclusion_rule("instrumental", tidal_track, spotify_track): return False
    if exclusion_rule("acapella", tidal_track, spotify_track): return False
    if exclusion_rule("remix", tidal_track, spotify_track): return False

    # the simplified version of the Spotify track name must be a substring of the Tidal track name
    # Try with both un-normalized and then normalized
    simple_spotify_track = simple(spotify_track['name'].lower()).split('feat.')[0].strip()
    return simple_spotify_track in tidal_track.name.lower() or normalize(simple_spotify_track) in normalize(tidal_track.name.lower())

def artist_match(tidal_track, spotify_track):
    def split_artist_name(artist):
        if '&' in artist:
            return artist.split('&')
        elif ',' in artist:
            return artist.split(',')
        else:
            return [artist]

    def get_tidal_artists(tidal_track, do_normalize=False):
        result = []
        for artist in tidal_track.artists:
            if do_normalize:
                artist_name = normalize(artist.name)
            else:
                artist_name = artist.name
            result.extend(split_artist_name(artist_name))
        return set([simple(x.strip().lower()) for x in result])

    def get_spotify_artists(spotify_track, do_normalize=False):
        result = []
        for artist in spotify_track['artists']:
            if do_normalize:
                artist_name = normalize(artist['name'])
            else:
                artist_name = artist['name']
            result.extend(split_artist_name(artist_name))
        return set([simple(x.strip().lower()) for x in result])

    # There must be at least one overlapping artist between the Tidal and Spotify track
    # Try with both un-normalized and then normalized
    if get_tidal_artists(tidal_track).intersection(get_spotify_artists(spotify_track)) != set():
        return True
    return get_tidal_artists(tidal_track, True).intersection(get_spotify_artists(spotify_track, True)) != set()

def match(tidal_track, spotify_track):
    return isrc_match(tidal_track, spotify_track) or (
        duration_match(tidal_track, spotify_track)
        and name_match(tidal_track, spotify_track)
        and artist_match(tidal_track, spotify_track)
    )

def tidal_search(spotify_track_and_cache, tidal_session):
    spotify_track, cached_tidal_track = spotify_track_and_cache
    if cached_tidal_track: return cached_tidal_track
    # search for album name and first album artist
    if 'album' in spotify_track and 'artists' in spotify_track['album'] and len(spotify_track['album']['artists']):
        album_result = tidal_session.search(simple(spotify_track['album']['name']) + " " + simple(spotify_track['album']['artists'][0]['name']), models=[tidalapi.album.Album])
        for album in album_result['albums']:
            album_tracks = album.tracks()
            if len(album_tracks) >= spotify_track['track_number']:
                track = album_tracks[spotify_track['track_number'] - 1]
                if match(track, spotify_track):
                    return track
    # if that fails then search for track name and first artist
    for track in tidal_session.search(simple(spotify_track['name']) + ' ' + simple(spotify_track['artists'][0]['name']), models=[tidalapi.media.Track])['tracks']:
        if match(track, spotify_track):
            return track

def get_tidal_playlists_dict(tidal_session):
    # a dictionary of name --> playlist
    tidal_playlists = tidal_session.user.playlists()
    output = {}
    for playlist in tidal_playlists:
        output[playlist.name] = playlist
    return output

def repeat_on_request_error(function, *args, remaining=5, **kwargs):
    # utility to repeat calling the function up to 5 times if an exception is thrown
    try:
        return function(*args, **kwargs)
    except requests.exceptions.RequestException as e:
        if remaining:
            print(f"{str(e)} occurred, retrying {remaining} times")
        else:
            print(f"{str(e)} could not be recovered")

        if not e.response is None:
            print(f"Response message: {e.response.text}")
            print(f"Response headers: {e.response.headers}")
        if not remaining:
            print("Aborting sync")
            print(f"The following arguments were provided:\n\n {str(args)}")
            print(traceback.format_exc())
            sys.exit(1)
        sleep_schedule = {5: 1, 4:10, 3:60, 2:5*60, 1:10*60} # sleep variable length of time depending on retry number
        time.sleep(sleep_schedule.get(remaining, 1))
        return repeat_on_request_error(function, *args, remaining=remaining-1, **kwargs)

def _enumerate_wrapper(value_tuple, function, **kwargs):
    # just a wrapper which accepts a tuple from enumerate and returns the index back as the first argument
    index, value = value_tuple
    return (index, repeat_on_request_error(function, value, **kwargs))

def call_async_with_progress(function, values, description, num_processes, **kwargs):
    results = len(values)*[None]
    with Pool(processes=num_processes) as process_pool:
        for index, result in tqdm(process_pool.imap_unordered(partial(_enumerate_wrapper, function=function, **kwargs),
                                  enumerate(values)), total=len(values), desc=description):
            results[index] = result
    return results

def get_tracks_from_spotify_playlist(spotify_session, spotify_playlist):
    output = []
    results = spotify_session.playlist_tracks(
        spotify_playlist["id"],
        fields="next,items(track(name,album(name,artists),artists,track_number,duration_ms,id,external_ids(isrc)))",
    )
    while True:
        output.extend([r['track'] for r in results['items'] if r['track'] is not None])
        # move to the next page of results if there are still tracks remaining in the playlist
        if results['next']:
            results = spotify_session.next(results)
        else:
            return output

class TidalPlaylistCache:
    def __init__(self, playlist):
        self._data = playlist.tracks()

    def _search(self, spotify_track):
        ''' check if the given spotify track was already in the tidal playlist.'''
        results = []
        for tidal_track in self._data:
            if match(tidal_track, spotify_track):
                return tidal_track
        return None

    def search(self, spotify_session, spotify_playlist):
        ''' Add the cached tidal track where applicable to a list of spotify tracks '''
        results = []
        cache_hits = 0
        work_to_do = False
        spotify_tracks = get_tracks_from_spotify_playlist(spotify_session, spotify_playlist)
        for track in spotify_tracks:
            cached_track = self._search(track)
            if cached_track:
                results.append( (track, cached_track) )
                cache_hits += 1
            else:
                results.append( (track, None) )
        return (results, cache_hits)

def tidal_playlist_is_dirty(playlist, new_track_ids):
    old_tracks = playlist.tracks()
    if len(old_tracks) != len(new_track_ids):
        return True
    for i in range(len(old_tracks)):
        if old_tracks[i].id != new_track_ids[i]:
            return True
    return False

def sync_playlist(spotify_session, tidal_session, spotify_id, tidal_id, config):
    try:
        spotify_playlist = spotify_session.playlist(spotify_id)
    except spotipy.SpotifyException as e:
        print("Error getting Spotify playlist " + spotify_id)
        print(e)
        return
    if tidal_id:
        # if a Tidal playlist was specified then look it up
        try:
            tidal_playlist = tidal_session.playlist(tidal_id)
        except Exception as e:
            print("Error getting Tidal playlist " + tidal_id)
            print(e)
            return
    else:
        # create a new Tidal playlist if required
        print(f"No playlist found on Tidal corresponding to Spotify playlist: '{spotify_playlist['name']}', creating new playlist")
        tidal_playlist = tidal_session.user.create_playlist(spotify_playlist['name'], spotify_playlist['description'])
    tidal_track_ids = []
    spotify_tracks, cache_hits = TidalPlaylistCache(tidal_playlist).search(spotify_session, spotify_playlist)
    if cache_hits == len(spotify_tracks):
        print("No new tracks to search in Spotify playlist '{}'".format(spotify_playlist['name']))
        return

    task_description = "Searching Tidal for {}/{} tracks in Spotify playlist '{}'".format(len(spotify_tracks) - cache_hits, len(spotify_tracks), spotify_playlist['name'])
    tidal_tracks = call_async_with_progress(tidal_search, spotify_tracks, task_description, config.get('subprocesses', 50), tidal_session=tidal_session)
    for index, tidal_track in enumerate(tidal_tracks):
        spotify_track = spotify_tracks[index][0]
        if tidal_track:
            tidal_track_ids.append(tidal_track.id)
        else:
            color = ('\033[91m', '\033[0m')
            print(color[0] + "Could not find track {}: {} - {}".format(spotify_track['id'], ",".join([a['name'] for a in spotify_track['artists']]), spotify_track['name']) + color[1])

    if tidal_playlist_is_dirty(tidal_playlist, tidal_track_ids):
        set_tidal_playlist(tidal_playlist, tidal_track_ids)
    else:
        print("No changes to write to Tidal playlist")

def sync_list(spotify_session, tidal_session, playlists, config):
    results = []
    for spotify_id, tidal_id in playlists:
        # sync the spotify playlist to tidal
        repeat_on_request_error(sync_playlist, spotify_session, tidal_session, spotify_id, tidal_id, config)
        results.append(tidal_id)
    return results

def pick_tidal_playlist_for_spotify_playlist(spotify_playlist, tidal_playlists):
    if spotify_playlist['name'] in tidal_playlists:
        # if there's an existing tidal playlist with the name of the current playlist then use that
        tidal_playlist = tidal_playlists[spotify_playlist['name']]
        return (spotify_playlist['id'], tidal_playlist.id)
    else:
        return (spotify_playlist['id'], None)

def get_user_playlist_mappings(spotify_session, tidal_session, config):
    results = []
    spotify_playlists = get_playlists_from_spotify(spotify_session, config)
    tidal_playlists = get_tidal_playlists_dict(tidal_session)
    for spotify_playlist in spotify_playlists:
        results.append( pick_tidal_playlist_for_spotify_playlist(spotify_playlist, tidal_playlists) )
    return results

def get_playlists_from_spotify(spotify_session, config):
    # get all the user playlists from the Spotify account
    playlists = []
    spotify_results = spotify_session.user_playlists(config['spotify']['username'])
    exclude_list = set([x.split(':')[-1] for x in config.get('excluded_playlists', [])])
    while True:
        for spotify_playlist in spotify_results['items']:
            if spotify_playlist['owner']['id'] == config['spotify']['username'] and not spotify_playlist['id'] in exclude_list:
                playlists.append(spotify_playlist)
        # move to the next page of results if there are still playlists remaining
        if spotify_results['next']:
            spotify_results = spotify_session.next(spotify_results)
        else:
            break
    return playlists

def get_playlists_from_config(config):
    # get the list of playlist sync mappings from the configuration file
    return [(item['spotify_id'], item['tidal_id']) for item in config['sync_playlists']]

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--config', default='config.yml', help='location of the config file')
    parser.add_argument('--uri', help='synchronize a specific URI instead of the one in the config')
    args = parser.parse_args()

    with open(args.config, 'r') as f:
        config = yaml.safe_load(f)
    spotify_session = open_spotify_session(config['spotify'])
    tidal_session = open_tidal_session()
    if not tidal_session.check_login():
        sys.exit("Could not connect to Tidal")
    if args.uri:
        # if a playlist ID is explicitly provided as a command line argument then use that
        spotify_playlist = spotify_session.playlist(args.uri)
        tidal_playlists = get_tidal_playlists_dict(tidal_session)
        tidal_playlist = pick_tidal_playlist_for_spotify_playlist(spotify_playlist, tidal_playlists)
        sync_list(spotify_session, tidal_session, [tidal_playlist], config)
    elif config.get('sync_playlists', None):
        # if the config contains a sync_playlists list of mappings then use that
        sync_list(spotify_session, tidal_session, get_playlists_from_config(config), config)
    else:
        # otherwise just use the user playlists in the Spotify account
        sync_list(spotify_session, tidal_session, get_user_playlist_mappings(spotify_session, tidal_session, config), config)
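
Note on the matching helpers in the removed sync.py: a minimal sketch of what `normalize` and `simple` return, since `name_match` and `artist_match` both depend on them. The two functions are re-stated from the listing so the snippet runs on its own; the sample titles and artist names are hypothetical, chosen only for illustration and not taken from the repository.

```python
import unicodedata

# Re-stated from the sync.py listing so this snippet is self-contained.
def normalize(s):
    # decompose accented characters, then drop anything that is not ASCII
    return unicodedata.normalize('NFD', s).encode('ascii', 'ignore').decode('ascii')

def simple(input_string):
    # keep only the text before the first hyphen, '(' or '['
    return input_string.split('-')[0].strip().split('(')[0].strip().split('[')[0].strip()

# Hypothetical inputs, for illustration only.
print(simple("Song Title - Remastered 2011"))  # Song Title
print(simple("Halo (Live) [Deluxe]"))          # Halo
print(simple("Jay-Z"))                         # Jay
print(normalize("Sigur Rós"))                  # Sigur Ros
```

The third call shows a side effect of splitting on hyphens: a hyphenated name is cut at the hyphen as well, so two such names are compared only on their first part.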