My getFlix is out of date! Maarten van Egmond has a revamped version.
Many people have rated hundreds or thousands of movies on Netflix, but the web site does not provide any simple way of downloading those ratings for your own use. getFlix is a package of scripts that allow you to download and process your Netflix ratings.
You need the following:
- Perl v5 (later versions should work; earlier ones are untested). Get Perl at perl.org.
- Python v2.3 (again, later versions should work; earlier ones are untested). Get Python at their web site.
- For Perl, install the modules WWW::Mechanize, Crypt::SSLeay and Data::Dumper. Help installing Perl modules. Mac OS X, most Linux distributions and BSD variants come with both Perl and Python pre-installed.
- For Python, install the module ‘mechanize’. Help installing mechanize.
- Find out your path to Perl (type ‘whereis perl’ at the command line) and your path to Python (‘whereis python’). You will also need your Netflix user name (email address) and password.
- And finally, the actual scripts: Download v0.1 of getFlix.
I was originally looking for an existing script to do this and found Net::Netflix by John Resig. Very cool script: it logged in to the Netflix web site with your username and password and fetched your ratings. The problem was that it only got the film title and the rating, nothing else. Close, but no cigar.
So I had to write everything else myself. Here is what I have:
- getflix.pl – A script that calls the main Perl module, Netflix.pm (included, described next).
- Netflix.pm – The heart of my effort. This module is partially borrowed from John Resig’s script but modified to get the following:
- The film’s Netflix ID
- Film title
- Film Year
- Film MPAA Rating
- Film Genre
- Your Film Rating
For example, it gets a line such as “60000161~Wonder Boys~2000~R~Drama~3” for every film you have rated and writes it to a file called ‘nflicks.txt’. Not bad for a little Perl script.
- Now for the accessories, starting with nflixHisto.py, a nifty script that takes all the data in nflicks.txt and generates distributions from it: for example, the average rating for a particular year or decade, or for a particular MPAA rating or genre. Great stuff!
- getdirectors.py: This script gets the names of the directors for each of the films you have rated (from Netflix.com) and tabulates them with their ratings in a file called directors.txt.
- dirHisto.py: This script generates meaningful data about the directors you have rated highly, i.e. the average rating for a particular director. The output goes to directors2.txt.
- getstars.py: This script fetches the stars (actors and actresses) for each film you have rated and, again, tabulates them (similar to the directors). No histogram script exists yet to make sense of this data.
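The nflicks.txt record format described above is easy to read back in. Here is a minimal sketch in current Python 3 (not the code shipped with getFlix), assuming one tilde-delimited record per line, fields in the order of the example line, integer years and ratings, and no ‘~’ inside titles. The names Film and parse_record are made up for this illustration and are not part of the package.

```python
from collections import namedtuple

# Hypothetical record type; field order follows the example line in the post.
Film = namedtuple("Film", "netflix_id title year mpaa genre rating")

def parse_record(line):
    """Split one 'id~title~year~mpaa~genre~rating' line into a Film."""
    # Assumes exactly six fields and no '~' inside the title.
    netflix_id, title, year, mpaa, genre, rating = line.rstrip("\n").split("~")
    return Film(netflix_id, title, int(year), mpaa, genre, int(rating))

film = parse_record("60000161~Wonder Boys~2000~R~Drama~3")
print(film.title, film.year, film.rating)
```

Real data may contain blank years or other oddities, so a more defensive version would probably want to skip or log lines that do not split into six fields.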
What kind of information will I get out of it?
For a brief overview, you can read my own discoveries, but suffice it to say, you can get all the information it is possible to fetch from Netflix.com regarding each of the films you have rated. That is,
- MPAA Rating
You also get a lot of analysis of that information, such as averages by year, decade, MPAA rating and more. If you find something more meaningful, go ahead and add it or suggest it in the comments below.
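As a rough illustration of that kind of analysis, the sketch below averages ratings per decade and per genre. It is not nflixHisto.py itself: the function name average_by is made up for the example, two of the three sample records are invented (only the Wonder Boys line comes from this post), and it assumes the tilde-delimited layout of nflicks.txt with an integer year and rating.

```python
from collections import defaultdict

def average_by(lines, key):
    """Average the rating (last field) of 'id~title~year~mpaa~genre~rating'
    lines, grouped by whatever key(fields) returns."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of ratings, count]
    for line in lines:
        fields = line.strip().split("~")
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        group = key(fields)
        totals[group][0] += int(fields[5])
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}

sample = [
    "60000161~Wonder Boys~2000~R~Drama~3",  # example line from the post
    "1~Example Film A~2004~PG~Drama~5",     # invented for illustration
    "2~Example Film B~1995~R~Comedy~4",     # invented for illustration
]

# Group by decade (2004 -> 2000) and by genre.
by_decade = average_by(sample, key=lambda f: int(f[2]) // 10 * 10)
by_genre = average_by(sample, key=lambda f: f[4])
print(by_decade)
print(by_genre)
```

Because average_by accepts any iterable of lines, an open file handle on nflicks.txt could be passed in place of the sample list.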