Exporting watch list from TMDb

2020/04/26

I keep list of movies I’ve watched on TheMovieDB. The service is great and I’ve used it for quite a long time (>700 movies in my list). I was using another service before, which unfortunately had closed without giving me a chance to export the data and I had to scrape my watchlist from the google cache.

To prevent that from happening again, I’ve exported my watchlist from TMDb and will use the service only as a secondary source with a nice UI.

TMDb has a decent API and there are several python wrappers for it. The script for export is pretty straightforward. Retrieve request_token, open a session (login) for your account and retrieve the list of rated movies from a paginated endpoint.

import tmdbsimple as tmdb
import json
import subprocess

tmdb.API_KEY = "bafca4e813ca8fc43b3c9f1e32d1932b"
res = subprocess.run(
    ["pass", "show", "Sites/themoviedb.org"], capture_output=True, check=True
)
password = res.stdout.splitlines()[0]

auth = tmdb.Authentication()
token = auth.token_new()["request_token"]

auth.token_validate_with_login(request_token=token, username="sarg", password=password)

session_id = auth.session_new(request_token=token)["session_id"]
acc = tmdb.Account(session_id)
acc.info()

movies = []
idx = 1
while True:
    res = acc.rated_movies(page=idx)
    movies = movies + res["results"]
    idx = idx + 1
    if idx > res["total_pages"]:
        break

idx = 1
while True:
    res = acc.watchlist_movies(page=idx)
    movies = movies + res["results"]
    idx = idx + 1
    if idx > res["total_pages"]:
        break

print(json.dumps(movies, indent=2))

Unfortunately TMDb does not expose Rated Date via API. It could be extracted from the CSV export though. Request the CSV export on the “My Ratings” page (under three dots icon). It would be sent by email. Download the file and use the following script to merge the json and csv.

import csv
import json

rates = { row["TMDb ID"]: row for row in csv.DictReader(open("ratings.csv")) }
films = json.load(open("movies.json"))

for film in films:
    if str(film['id']) in rates:
        film['rated_date'] = rates[str(film['id'])]['Date Rated']

print(json.dumps(films, indent=2))