
I'm trying to build a personal movie database, and I want the data to be fetched from IMDb. Yes, I know there are plenty of APIs and grabbers out there, but none of them does what I need.

So far I haven't been able to come up with a way to parse the list and get my data from it.

I tried to do it with a cURL script, but no luck!

For example:

I want to know whether The Godfather: Part II is in the Top 250 and, if so, what its rank is.




I would first check whether IMDb has an API available. If it does, this will likely be as simple as querying a URL and parsing the returned data with json_decode.
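To illustrate the json_decode step, here is a minimal sketch. The JSON payload and its field names ("items", "rank", "title") are made up for the example and are not the response format of any real IMDb endpoint.

```php
<?php
// Hypothetical JSON payload; the structure is an assumption for illustration only.
$json = '{"items":[{"rank":1,"title":"First Film"},{"rank":2,"title":"Second Film"}]}';

// true => decode objects into associative arrays
$data = json_decode($json, true);
if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

// Build a RANK => TITLE array from the decoded data
$ranks = [];
foreach ($data['items'] as $item) {
    $ranks[(int) $item['rank']] = $item['title'];
}
```

With a real API you would replace the string literal with the downloaded response body.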

No API available?

Get the webpage

There's no need for cURL; a simple file_get_contents will do the trick.
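A rough sketch of the download step with basic error handling follows. fetchPage() is a hypothetical helper name, and the timeout and User-Agent values are arbitrary choices rather than anything the site is known to require. Note that file_get_contents can only open URLs when allow_url_fopen is enabled in php.ini.

```php
<?php
// Hypothetical helper: downloads a page and throws if the request fails.
function fetchPage(string $url): string
{
    // Optional extras (assumed values): a timeout and an identifying User-Agent.
    $context = stream_context_create([
        'http' => ['timeout' => 10, 'user_agent' => 'PersonalMovieDb/0.1'],
    ]);

    // file_get_contents returns false on failure; @ silences the PHP warning
    // so the failure can be reported as an exception instead.
    $page = @file_get_contents($url, false, $context);
    if ($page === false) {
        throw new RuntimeException("Could not download $url");
    }
    return $page;
}
```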

Extract the list

Now that you have the web page, you have two options:

  1. Parse the web page with a DOM parser (long-winded, not necessary)
  2. Use a regex to extract the info you're after (simple, short)
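
For comparison, option 1 might look roughly like the sketch below, using PHP's built-in DOMDocument and DOMXPath. The sample HTML is a made-up fragment that follows the titleColumn markup assumed in this answer, not a copy of the real page.

```php
<?php
// Made-up fragment; the rank and title are placeholders.
$html = '<table><tr><td class="titleColumn">42. '
      . '<a href="/link/to/film" title="Director/Leads" >Some Film</a></td></tr></table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; collect warnings quietly
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$top250 = [];
foreach ($xpath->query('//td[@class="titleColumn"]') as $cell) {
    $link = $xpath->query('.//a', $cell)->item(0);
    if ($link === null) {
        continue; // cell without a link: skip it
    }
    // The cell text starts with the rank, e.g. "42. Some Film"
    if (preg_match('/^\s*(\d+)\./', $cell->textContent, $m)) {
        $top250[(int) $m[1]] = trim($link->textContent);
    }
}
```

This is more code than the regex, but it tolerates changes in attribute order or whitespace inside the tags.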


A quick look at the page source shows that each entry in the list has the format:

<td class="titleColumn">RANK. <a href="/link/to/film" title="Director/Leads" >FILM TITLE</a>

The parts in CAPS are the required information.

Converting this into a regex is simple: remove the noise and replace it with (non-greedy) wildcards.

<td class="titleColumn">RANK. <a.*?>FILM TITLE</a>

Add your capture groups:

<td class="titleColumn">(RANK). <a.*?>(FILM TITLE)</a>

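Replace the placeholders with real patterns (\d+ for the rank, a non-greedy .*? for the title), escape the literal dot, wrap the whole thing in # delimiters, and that's it: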

#<td class="titleColumn">(\d+)\. <a.*?>(.*?)</a>#
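
As a quick check, here is the pattern applied to a single made-up line in the format shown earlier (the rank and title are placeholders):

```php
<?php
// Made-up sample line following the assumed markup.
$line = '<td class="titleColumn">42. <a href="/link/to/film" title="Director/Leads" >Some Film</a>';

$pattern = '#<td class="titleColumn">(\d+)\. <a.*?>(.*?)</a>#';

$rank = null;
$title = null;
if (preg_match($pattern, $line, $m)) {
    $rank  = (int) $m[1]; // first capture group: the rank digits
    $title = $m[2];       // second capture group: the link text
}
```

Because both wildcards are non-greedy, `<a.*?>` stops at the first `>` and `(.*?)` stops at the first `</a>`, so each match stays within a single list entry.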


Using this in practice:

$page = file_get_contents(""); //Download the page

preg_match_all('#<td class="titleColumn">(\d+)\. <a.*?>(.*?)</a>#', $page, $matches); //Match ranks and titles

$top250 = array_combine($matches[1], $matches[2]);          //Final array in format RANK=>TITLE

Then you can do something like:

echo $top250[1];


The Shawshank Redemption


echo array_search("The Godfather", $top250);




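You can then use standard PHP array functions to do things like search for films. array_search returns the key of the matching value, which here is the film's rank, or false if the title is not in the list.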
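
Since array_search only finds exact matches, a small case-insensitive lookup can be more forgiving. The sketch below uses a hypothetical findRank() helper and placeholder data in the RANK=>TITLE shape built above.

```php
<?php
// Placeholder data; ranks and titles are not real list entries.
$top250 = [1 => 'First Film', 2 => 'Second Film: Part II', 3 => 'Third Film'];

// Hypothetical helper: returns the rank of $title, or null if it is not in the list.
function findRank(array $top250, string $title): ?int
{
    foreach ($top250 as $rank => $candidate) {
        // Compare case-insensitively and ignore surrounding whitespace
        if (strcasecmp(trim($candidate), trim($title)) === 0) {
            return (int) $rank;
        }
    }
    return null;
}

$rank = findRank($top250, 'second film: part ii');
echo $rank === null ? "Not in the list\n" : "Rank: $rank\n";
```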

Side note

Especially if you use the no-API method above, you might want to store the results locally and only update them every X hours/days/weeks to save on load times. I assume you are already planning to do this (as you said you wanted a personal movie database), but I thought I'd mention it anyway!
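
One simple way to do this is a file cache keyed on the file's modification time. This is only a sketch: cachedFetch() is a hypothetical helper, and the cache path and maximum age are up to you.

```php
<?php
// Hypothetical helper: returns the cached copy if it is fresh enough,
// otherwise calls $fetch and stores the result in $cacheFile.
function cachedFetch(string $cacheFile, int $maxAgeSeconds, callable $fetch): string
{
    clearstatcache(true, $cacheFile); // make sure filemtime() is not a stale cached value
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAgeSeconds) {
        return file_get_contents($cacheFile);
    }

    $fresh = $fetch();
    file_put_contents($cacheFile, $fresh, LOCK_EX); // lock to avoid concurrent partial writes
    return $fresh;
}
```

You would pass a closure that downloads the page as $fetch, so the download only runs when the cached copy is missing or older than $maxAgeSeconds.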

Licensed under: CC-BY-SA with attribution