Are your movies being incorrectly identified?

I know that a few people are wondering why the automatic update sometimes fails to recognize a movie in their library correctly, even though the right movie is the first result when they identify it manually. I thought I’d write a blog post to explain why. This explanation mainly concerns the beta version.

When Mizuu updates your movie library, it makes a bunch of requests to a service called TMDb (The Movie Database). This service handles the search query – an altered version of the file name of the movie in question. Mizuu alters the file name by recognizing a lot of known patterns and stripping away unnecessary information, like known video tags. TMDb then returns a list of results based on quite a few different factors.
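To make the clean-up step concrete, here is a rough Python sketch of that kind of filename processing. It is not Mizuu’s actual code: the tag list, the extension list and the function name are all made up for illustration, and the real pattern matching covers far more cases.

```python
import re

# Hypothetical examples only; the real list of known tags is not given in this post.
KNOWN_TAGS = {"1080p", "720p", "bluray", "brrip", "dvdrip", "webrip",
              "hdtv", "x264", "h264", "xvid", "aac", "ac3"}
VIDEO_EXTENSIONS = (".mkv", ".mp4", ".avi", ".m4v")


def clean_filename(filename: str) -> str:
    """Turn a movie filename into a plain-text search query (illustrative)."""
    stem = filename
    # Drop a known video extension, if there is one.
    for ext in VIDEO_EXTENSIONS:
        if filename.lower().endswith(ext):
            stem = filename[: -len(ext)]
            break
    # Treat dots, underscores and brackets as word separators.
    stem = re.sub(r"[._\[\]()]+", " ", stem)
    words = []
    for word in stem.split():
        if word.lower() in KNOWN_TAGS:
            break  # assume everything after the first tag is release noise
        words.append(word)
    return " ".join(words)


print(clean_filename("The.Avengers.2012.1080p.BluRay.x264.mkv"))
```

With these made-up lists, the example filename is reduced to the title and year, which is the kind of query that gets sent to the search.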

Most of the time, the results are excellent and Mizuu picks the first result, as this is likely the correct one. Sometimes, however, Mizuu will pick something else from the list. The results include a measure of popularity for each movie, and I have found that a more popular movie further down the list is sometimes the correct match even though it isn’t the first result. That’s why Mizuu – in some cases – picks something other than the first result.
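The idea can be sketched in a few lines of Python. This is only an illustration of a popularity-aware pick, not the rule Mizuu really uses: the function name, the `ratio` threshold and the `window` size are assumptions, and the `title` and `popularity` keys are simply modeled on the kind of fields a search result carries.

```python
def pick_result(results, ratio=3.0, window=5):
    """Pick a search result, preferring the first unless another one
    near the top is much more popular (hypothetical heuristic)."""
    if not results:
        return None
    first = results[0]
    # Only look at the top few results; lower ones are rarely relevant.
    most_popular = max(results[:window], key=lambda r: r["popularity"])
    # Override the first result only when the gap is large.
    if most_popular["popularity"] > ratio * first["popularity"]:
        return most_popular
    return first


results = [
    {"title": "Obscure Movie", "popularity": 1.0},
    {"title": "Popular Movie", "popularity": 10.0},
]
print(pick_result(results)["title"])
```

A rule like this also shows the trade-off: a big gap in popularity can push the right movie to the top, but it can just as easily override a correct, less popular first result.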

So, why isn’t this happening when manually identifying movies? If you’re manually identifying a movie, Mizuu will simply show the results as they were returned from the TMDb service. Mizuu selects movies like that, and so should you.

What can you do if this happens a lot and you want to make sure your movies aren’t incorrectly identified? Well, the easiest fix is to change the filename of the file in question. Change it to match the listing on TMDb exactly, and make sure to include the release year in the filename as well. The release year can help a lot, but it can also lead to bad results if it isn’t the same as the year in the TMDb listing, so make sure it all matches.

The recommended file naming convention is the title followed by the year, and that’s it. Simple and easy to maintain. An example of this would be: The Avengers (2012) or Ted (2012).
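If you want to check your own files against this convention, a small script along these lines could help. It is just a sketch: the pattern and the function name are my own for this example and have nothing to do with how Mizuu parses names internally.

```python
import re

# Matches "Title (Year)", e.g. "The Avengers (2012)".
NAME_PATTERN = re.compile(r"^(?P<title>.+?)\s*\((?P<year>\d{4})\)$")


def split_title_and_year(stem: str):
    """Return (title, year) if the name follows "Title (Year)", else None.

    `stem` is the filename without its extension.
    """
    match = NAME_PATTERN.match(stem.strip())
    if match is None:
        return None
    return match.group("title"), int(match.group("year"))


for name in ["The Avengers (2012)", "Ted (2012)", "Ted.2012.1080p"]:
    print(name, "->", split_title_and_year(name))
```

Names that come back as None don’t follow the convention and are good candidates for renaming.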

In the end, no solution is perfect, and it’s always going to be a trade-off regardless of which solution Mizuu uses. I just think the method used by the automatic update is a slightly better trade-off. It identifies a lot of movies correctly, including some that would have been incorrectly identified if it just selected the first result – especially if your filenames are correct. That is honestly the best thing you can do – make sure your filenames are as good as they can possibly be.

PS. TMDb is a free service available at www.themoviedb.org, and I’d highly recommend that everyone using Mizuu help out on the site, either by adding new movies, translated details or even something as simple as ratings. The main developer behind the service is just awesome, and the more content on the site, the better the experience gets for everyone.

2 Comments

  1. Hi Michell,

    First, I should say I think you’ve done amazing work! I simply love the app.

    Second, that’s a great explanation, and your algorithm (including popularity in your internal ranking) makes a lot of sense.

    I have been following the filename advice whenever I have things incorrectly identified and simply validating the exact title and adding the year to the filename has worked in nearly every case (I have a fairly extensive collection of hundreds of movies). The only instance where this has not been sufficient, and the only workaround was to manually identify the movie, was with an obscure movie with a 2-word title starting with The (The Guard) which was made in the same year (2011) as a popular movie which also starts with The (The Adjustment Bureau). Even having my filename be an exact text match to TMDB including the year did not override The Guard being identified as The Adjustment Bureau. This is likely a combination of the unpopularity of the movie, and it having a 2-word title with the first word being the throwaway “The”, meaning every “The” movie is a 50% match already. Perhaps a slight alteration of the identification algorithm to always pick the exact text match if one exists would be an option. Or perhaps I’m wrong and that’s what you already do, and the text which TMDB shows me online isn’t the exact text of this movie returned in the lookup API.

    Either way, it’s not a big deal at all — correcting one entry is a one-time thing and this is the only one of my ~400 movies that requires any intervention by me.

    Keep up the great work on Mizuu!

    • Thanks a lot, Grant – much appreciated 🙂 Glad to hear that it’s working as it should, especially with such a large library 🙂
