Making a movie collection catalog in Excel

Talk about just about anything else that is non-gaming here, but keep it clean
User avatar
Ziggy
Moderator
Posts: 14528
Joined: Mon Jun 09, 2008 5:12 pm
Location: NY

Making a movie collection catalog in Excel

Post by Ziggy »

We seem to have quite a few people on this forum that enjoy movies, so I thought I'd share this.

I have a somewhat large movie collection. Maybe not so large in comparison to some people, but definitely large compared to the average person. As with anything, the more you have of something the harder it is to keep track of or organize it all. I've been thinking about doing some sort of catalog system for a while now. Something like Discogs would be perfect. Discogs actually had a few sister sites for a while, one of them being filmogs, but they seem to have disappeared. I know there's a bunch of apps available, free and paid for, but I'm hesitant to use one. The initial setup will take a huge investment in time, and I'm just not willing to do that if I'm not absolutely sure of it. An app might be good now, but an update might cripple it. Or it might not be supported in the future and I might have to abandon it. Discogs was easy for me to set up because I was able to scan the barcodes to enter almost everything, and the site is popular enough that I'm not too concerned with it disappearing any time soon.

I've toyed with the idea of just making a simple spreadsheet for a while, but I really don't want to manually input all the data. I could just do movie title and format (VHS, DVD, BD) to get a simple list going, but I wouldn't mind having more than that. However, I'm certainly not about to manually get all that data for my entire collection. But I just found out about something awesome, and I can easily make a spreadsheet that'll auto populate most of the data for me!

IMDb has datasets that you can download in .tsv format! If you know Excel, then you'll know this means setting things up will be mostly trivial. Basically, I'm going to make a table where I input the movie title and it'll auto populate all the fields for me. I'll still have to manual input which format it is, but big deal. And I think I'll also have a comments field that I can note things in, like "part of box set" or something. I might also make a column to note if I've ever seen the movie or not. While not as bad as my Backloggery, I still own plenty of movies that I've never seen.

As for the fields that'll auto populate, I'm going to have release year, runtime and genre(s). I'm always curious to know the release year and runtime (and I hate when it's impossible to find on the back cover). As for genres, I really want them included so I can sort by genre to help find something to watch. Like, "I'm in the mood for a comedy, let me see what I have."

There's more than enough horizontal space to fit other columns, so I'm thinking about doing columns for director and top billed actors. It might be neat to sort by director or actor. Like, "I feel like watching something with Tom Hanks in it." Or, "I feel like watching a John Hughes movie." I'm not sure yet if I want to include this or not, sometimes having too much data makes it hard to look at.

If I don't include directors and actors, it wont matter much. I thought of a really nice touch. I'm going to have a Wikipedia link for every entry, and these will also auto populate simply by entering the movie title. Basically, if you add "(film)" to the end of the movie title and search that on Wikipedia, it almost always lands right on the correct article. I've tested a couple dozen so far and some of them do not go straight to the correct page, but instead it'll go to the search results page and the correct article is always the first or second link. You could easily do the same thing for a search with IMDb instead, but I don't know how to make IMDb jump straight to the correct result, just return the standard search query. Anyway, it's real easy, you just modify the search URL.

Here's the formula I'm using for it. Just replace "A2" with whatever column your movie title is in, if it isn't A. You can make "Search Wikipedia" say whatever you want, that's just what will display in the cell. You can even omit that last part of the formula and the hyperlink will just be the URL.

Code: Select all

=HYPERLINK("https://en.wikipedia.org/w/index.php?search=" &A2 &" (film)", "Search Wikipedia")


You can use the same formula to search IMDb simply by changing the first part of the URL, like this...

Code: Select all

=HYPERLINK("https://www.imdb.com/find?q="&A2&"&exact=true","IMDb")


edit: I found it better to add "exact=true" to the IMDb search URL.

IMDb has other datasets, like if you wanted to include ratings. That could be cool, but I usually don't care what ratings a movie gets. Here's the link for where you can download the datasets from IMDb.

https://www.imdb.com/interfaces/

Be warned though, they're huge! The basics one is 129 MB zipped, and 621 MB uncompressed. It literally has a million lines. Excel couldn't even load it all. I trimmed it up for my needs. I removed things like shorts, TV episodes (there also TV Shows, episodes is literally each individual episode from every show), video games, et cetera. After I took out those entries and converted it to an .xlsx it's now only 13 MB. I also made sure the movie titles are column A so vlookup works.

Which, by the way, is how you auto populate the data. Just use the VLOOKUP formula in Excel. It's very easy to do. One downside to this approach is for more complicated titles you really have to nail it. I'm using only exact matches because I've found that when I do partial matches for vlookups it's never accurate. So when you type in the movie title, it has to be exact. It's not so hard for most movies, but some get tricky. You just have to learn how IMDb formats things and then you can usually guess correctly. Some are still annoying, like Alien 3 because of the way it stylizes the 3. And then there's something like "Dracula" which returned the 1931 film, which is good because that's the only one I own. But there's TONS of movies that are titles simply "Dracula" so that's where it gets tricky. Still, it seems to work 99% of the time. I'm not afraid to have to do a manual entry here and there. Or just search the dataset to see how it was entered and then adjust my input to match it.

There's a few things I still have to work on, like converting the runtime format. Right now it just displays the runtime in minutes, but I'd prefer if it were hours and minutes. I know this can be done with formulas. I've converted time formats in Excel before, but not to/from the same formatting. It's the kind of thing that you figure out how to do with Google and then forget LOL. Does anyone know the easiest way to accomplish this? Right now the data is in just minutes, such as "134" for example. I would like to display 134 as "2 hrs 14 mins." Right now I just tacked on a "& "mins" so at least it'll display "134" as "134 mins".

I was thinking about figuring out a way to auto add movie posters, possibly using VBA. But it might be too much of a headache. And to get the image sized large enough to appreciate, it might be too large to have the table look nice. Besides, the Wiki or IMDb page always has a movie poster. So I might just forget about it.

After that, I'll just have to work on the formatting to make it look pretty. Right now I'm just messing around with the functionality of it first. So it's just a boring looking table. I was entering a few random movie titles to mess around with it, but I choose a lot of them specific to try and see how well it works. So things that are common words, like Eraser. Or tricky subtitles, things that are an ampersand instead of the word "and", and stuff like that.

Movie Collection Catalog.JPG
Movie Collection Catalog.JPG (195.77 KiB) Viewed 10938 times
User avatar
Ack
Moderator
Posts: 22286
Joined: Tue Mar 18, 2008 4:26 pm
Location: Atlanta, GA

Re: Making a movie collection catalog in Excel

Post by Ack »

Hey Ziggy,

Sounds like a good system so far. I would definitely choose to leave out images. For one, spacing and formatting will be a pain, but also you can't guarantee you'll end up with the image you want.

I would recommend director, though keep in mind some films have multiple. For actors, the standard shorthand is the first three names. This isn't always accurate (for instance, some films list by order of appearance or alphabetical), but you'll get most of what you want. It's not going to find deep cuts in folks' careers, like Schwarzenegger won't appear in the credits list for The Long Goodbye, but that's ok.

You might also consider some kind of rating or recommendations button for those films you do watch, so you can also use your setup to inform others of movies you end up liking.
Image
User avatar
Ziggy
Moderator
Posts: 14528
Joined: Mon Jun 09, 2008 5:12 pm
Location: NY

Re: Making a movie collection catalog in Excel

Post by Ziggy »

Ack wrote:Sounds like a good system so far. I would definitely choose to leave out images. For one, spacing and formatting will be a pain, but also you can't guarantee you'll end up with the image you want.


Yep, exactly. Yeah, I think I'll leave them out. It would be a nice touch, but I think I'd end up having to do too much manual tweaking. Unless there's a database of images already in existence that has a consistent size, but the formatting would still be annoying to deal with. I have the Wiki and IMDb links there, so if I really want to view the movie poster it's just a click away.

Ack wrote:I would recommend director, though keep in mind some films have multiple. For actors, the standard shorthand is the first three names. This isn't always accurate (for instance, some films list by order of appearance or alphabetical), but you'll get most of what you want. It's not going to find deep cuts in folks' careers, like Schwarzenegger won't appear in the credits list for The Long Goodbye, but that's ok.


I think I will include director at the very least, and probably actors too. I know some films have multiple directors, but it wont matter. Excel has a nifty search box when you go to sort a column in a table, so if there's multiple entries with that name they'll all come up. As far as Schwarzenegger not coming up for The Long Goodbye, that's OK. If I'm searching for a Schwarzenegger film, I want results like Commando and Predator, where he's one of the principle actors. I haven't looked at IMDB's dataset for this yet, but I think it'll work nicely.

Ack wrote:You might also consider some kind of rating or recommendations button for those films you do watch, so you can also use your setup to inform others of movies you end up liking.


Hmm, good point. Maybe I'll do a simple 5 star rating system. It would make it easy to share a lists of favorites, but it would also make it easy to just sort my favorites if I'm looking for something to put on. Thanks for the suggestion.

Which gives me another idea... I could make a fortune cookie button! I could even pick random movies based on categories, like if I've seen it or not or a certain genre.
User avatar
Ack
Moderator
Posts: 22286
Joined: Tue Mar 18, 2008 4:26 pm
Location: Atlanta, GA

Re: Making a movie collection catalog in Excel

Post by Ack »

Happy to offer my two cents! I look forward to the finished product.
Image
User avatar
Anapan
Next-Gen
Posts: 3903
Joined: Mon Nov 17, 2008 11:15 am
Location: BC, Canada

Re: Making a movie collection catalog in Excel

Post by Anapan »

I used to maintain a CSV and had properly named JPGs. Later a friend built an internet explorer compatible script based database that would autogenerate links with auto-downloaded screenshots of the synopsis for each movie. This was mostly automated and worked well for a small collection. I switched to PLEX after that so it was only error-checking - early on it had a better search that I couldn't beat. The scripts are so refined now that I rarely have a problem. Local cataloging is free and subscription/lifetime-pass-pay is only required for automated off-lan access. Even locally it will always make all local LAN-based devices no matter how old - even with the old vista media-pc samba protocol.. I haven't yet found an easier auto-tagging with metadata database that you customize. It's so widely adopted that customization based on it's well-known format is only a google or plex-forum search away - and plugins can support any online database - anidb has a well-supported 3rd party database plugin for anime for instance. You can extract all data on a whim and abandon it without penalty after it autogenerates its database.
ImageImageImageImage
ImageImageImageImage
User avatar
Ziggy
Moderator
Posts: 14528
Joined: Mon Jun 09, 2008 5:12 pm
Location: NY

Re: Making a movie collection catalog in Excel

Post by Ziggy »

So I just took a look at the datasets for directors, writers and actors. It looks like they're stored as alphanumeric identifiers. So I would need to have the name of the movie look up the unique movie identifier, then have that look up the name identifier, then have that look up the actual name. Doable, just kind of annoying.
User avatar
marurun
Moderator
Posts: 11963
Joined: Sat May 06, 2006 8:51 am
Location: Cleveland, OH
Contact:

Re: Making a movie collection catalog in Excel

Post by marurun »

Did you expect any different? That’s pretty standard database design.
Dope Pope on a Rope
B/S/T thread
My Classic Games Collection
My Steam Profile
The PC Engine Software Bible Forum, with Shoutbox chat - the new Internet home for PC Engine fandom.
User avatar
BoneSnapDeez
Next-Gen
Posts: 20116
Joined: Mon May 02, 2011 1:08 pm
Location: Maine

Re: Making a movie collection catalog in Excel

Post by BoneSnapDeez »

How's the RateYourMusic movie database looking these days? I'm not a big movie guy so I've only used the site to catalog music.
User avatar
Ziggy
Moderator
Posts: 14528
Joined: Mon Jun 09, 2008 5:12 pm
Location: NY

Re: Making a movie collection catalog in Excel

Post by Ziggy »

marurun wrote:Did you expect any different? That’s pretty standard database design.


No, I suppose not. It's just annoying because I wanted the setup to be really simple using only Excel (which means anyone could make this themselves using an Excel or Excel clone). And while Excel isn't a substitute for an actual database system, it's no issue to pull data from multiple spreadsheets. The issue is that Excel doesn't like huge files like these.

Like I said, I was able to shrink down the one dataset that contains the movie titles, release year, and runtime by removing the categories that I wont be using. Then saving it as an .xlsx, it can easily be used for vlookups. But since the other datasets are only the identifiers, I cannot shrink them in the same way. I can convert them to .xlsx which shrinks the file size considerably, but it's still a huge spreadsheet that makes Excel choke (I think it's mainly the total number of lines that's really giving Excel a hard time). In fact, Excel gives an error message when loading these that the file was not loaded completely. If I make my spreadsheet vlookup these three datasets as is, it'll either make Excel glitch out really hard or it wont work at all.

My first thought was to consolidate everything I need and try and make it as small as possible. For example, I can start with the first TSV that I'm using now, the one with the titles, year and runtime. Then I could do a series of vlookups to the other two files to get all the names I need in there, then copy those columns and paste them as data so they're no longer vlookups. This would consolidate all of the data I need into one spreadsheet, with no extraneous data. But that might still be too large for Excel to handle, at the very least it would cause my spreadsheet to freeze here and there. I suppose as I enter my movie catalog I could periodically copy and paste everything as data, that way it's just stored text and no longer doing vlookups. That's assuming the consolidated spreadsheet isn't too large for Excel to handle in the first place.

My second thought was to try using Access for this instead of Excel. From the start, I was thinking about using Access but I decided on Excel because really all I'm after is a simple spreadsheet. I'm not interesting in building or maintaining a database, I just want to pull info from one. But perhaps I should use Access to more properly use the IMDb datasets? I don't have a lot of experience with Access. The only thing is that Access doesn't appear to be able to import TSV files, which is very annoying. I can't convert them to CSV. I could convert them to Excel sheets first and import them to Access that way, but like I said when I try to load them into Excel I get an error message that the entire file wasn't loaded.

edit: Hmm, maybe I could do this for Access... I could load the TSVs with Notepad, which although is a little slow doesn't seem to have any errors with them like Excel does. I could chop them in half and create two TSVs for each dataset. Then I could load each one into Excel and convert them to .xlsx. Then I could import those into Access. I don't see why that wouldn't work.

Another issue I realized are titles that start with The. The IMDb datasets have them listed as "The Man with the Golden Gun" and not "Man with the Golden Gun, The". Which means my list wont be alphabetized correctly, all of the titles that start with The will be together under T. Hmm, I suppose after I enter the title and let the data populate, I could copy/paste as data like I said and then just correct the title manually. I suppose there aren't that many titles that start with "The," at least I don't own that many.
User avatar
marurun
Moderator
Posts: 11963
Joined: Sat May 06, 2006 8:51 am
Location: Cleveland, OH
Contact:

Re: Making a movie collection catalog in Excel

Post by marurun »

Have you thought of using an existing solution like Eric’s Movie Database?

https://www.emdb.eu/index.html
Dope Pope on a Rope
B/S/T thread
My Classic Games Collection
My Steam Profile
The PC Engine Software Bible Forum, with Shoutbox chat - the new Internet home for PC Engine fandom.
Post Reply