Question

I am using an API service from a provider. The API usage looks like this:

https://api.thesite.com/getTable1Records?offset=0 
https://api.thesite.com/getTable2Records?offset=0 
https://api.thesite.com/getTable3Records?offset=0 

(These are NOT the real addresses.) Each API call returns JSON containing 1000 records.

The first time, I retrieved all the records and saved them to my database server. User searching and processing run against my database server. This is the approach the API service provider recommended.

The API service provider updates their database whenever the data change. I have no way of knowing when they change it or what they change. They might add new records, update existing ones, or delete some. I need to update my database periodically (weekly, e.g. every Monday, or twice a week is OK).

Here is my PHP code that updates one of the tables:

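// Update Table1
echo "STARTED@" . time() . "<br />\n"; // just for log
$offset   = 0;
$username = "username";
$password = "password";
$url      = "https://api.thesite.com/getTable1Records";

$c = curl_init();
// These options do not change between requests, so set them once.
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_SSL_VERIFYHOST, 0); // certificate checks are off; enable them outside testing
curl_setopt($c, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($c, CURLOPT_USERPWD,        "$username:$password");
do {
    curl_setopt($c, CURLOPT_URL, "$url?offset=$offset");
    $json   = curl_exec($c);
    // curl_exec() returns false on failure and json_decode() returns null on bad JSON;
    // treat both as an empty page so count() never receives a non-array.
    $phpobj = is_string($json) ? json_decode($json) : null;
    if (!is_array($phpobj)) {
        $phpobj = [];
    }
    $offset += 1000;
    update($phpobj);
    echo "1000UPDATED@" . time() . "<br />\n"; // just for log
} while (count($phpobj) > 0);
curl_close($c);
echo "ENDED@" . time() . "<br />\n"; // just for log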

function update($phpobj){
    $host     = "localhost";
    $user     = "root";
    $pass     = "";
    $db       = "theapitest";
    // Note: a new connection is opened for every page of 1000 records.
    $link     = mysqli_connect($host, $user, $pass, $db);
    if (!$link) {
        die("DB connection failed: " . mysqli_connect_error());
    }

    // One INSERT ... ON DUPLICATE KEY UPDATE query is sent per record.
    foreach ($phpobj as $row) {
        $id      = mysqli_real_escape_string($link, $row->id);
        $name    = mysqli_real_escape_string($link, $row->name);
        $query   = "INSERT INTO `tablename` VALUES('$id', '$name') 
                    ON DUPLICATE KEY UPDATE `name`='$name'";
        mysqli_query($link, $query);
    }
    mysqli_close($link);
}//end function

The problems are:

  1. It is too slow. Some tables have millions of records. (Is there a better way?)
  2. Some tables have no primary key, so I cannot use INSERT INTO ... ON DUPLICATE KEY UPDATE for them.
  3. I don't know how to handle deleted records. I think deleting all records and adding them all again is not the best idea.
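
A rough sketch of one direction for the three problems above, not a tested solution. All function names here (buildBatchUpsert, rowFingerprint, staleKeys) are hypothetical, and it assumes PHP 7.4 or later, rows shaped like the json_decode() objects in the code above, and a table that has (or is given) a unique key. The idea is: send one multi-row statement per API page instead of 1000 single-row queries, derive a synthetic key by hashing the whole row when a table has no primary key, and find deleted records by comparing the keys stored locally with the keys seen during a complete run.

```php
<?php
// Hypothetical helpers; table and column names are assumptions for illustration.

// Problem 1: build ONE multi-row upsert for a whole page of records.
// Returns [SQL with ? placeholders, flat parameter list] for a prepared statement.
function buildBatchUpsert(string $table, array $columns, array $rows): array
{
    $group = '(' . implode(',', array_fill(0, count($columns), '?')) . ')';
    $sql = "INSERT INTO `$table` (`" . implode('`,`', $columns) . "`) VALUES "
         . implode(',', array_fill(0, count($rows), $group))
         . " ON DUPLICATE KEY UPDATE "
         . implode(',', array_map(fn($c) => "`$c`=VALUES(`$c`)", $columns));

    $params = [];
    foreach ($rows as $row) {
        foreach ($columns as $c) {
            $params[] = $row->$c; // values are bound later, never concatenated into SQL
        }
    }
    return [$sql, $params];
}

// Problem 2: for a table without a primary key, hash the whole row.
// Fields are sorted by name so the hash does not depend on JSON key order.
function rowFingerprint(object $row): string
{
    $fields = get_object_vars($row);
    ksort($fields);
    return sha1(json_encode($fields));
}

// Problem 3: keys that exist locally but were not seen in a complete API run
// are the candidates for deletion.
function staleKeys(array $localKeys, array $seenKeys): array
{
    return array_values(array_diff($localKeys, $seenKeys));
}
```

The SQL and parameters returned by buildBatchUpsert() could be passed to mysqli_prepare() and mysqli_stmt_bind_param(). For a table without a primary key, the fingerprint would need to be stored in an extra column with a UNIQUE index so that ON DUPLICATE KEY UPDATE has something to match; note that two genuinely identical rows would then collapse into one. staleKeys() holds both key lists in memory, which may be too much for tables with millions of rows. Recent MySQL 8.0 releases deprecate VALUES() in this position in favour of a row alias, so the generated SQL may need adjusting for the server version in use.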


Licensed under: CC-BY-SA with attribution