Question

I'm parsing a json feed routinely and need to insert only the newest users from the feed and ignore existing users.

I think what I need is ON DUPLICATE KEY UPDATE or INSERT IGNORE based on some searching but I'm not quite sure which is why I'm asking - so for example:

users
1     John
2     Bob

Partial JSON:

{ userid:1, name:'John' },
{ userid:2, name:'Bob' },
{ userid:3, name:'Jeff' }

From this feed I only want to insert Jeff. I could do a simple loop through all users and do a simple SELECT query and see if the user id is already in the table, if not I do an INSERT, however I suspect it won't be an efficient and practical method.

By the way, I'm using Zend_Db for the database interaction if anyone would like to cater a specific answer :) I don't mind a generic strategic solution though.

Was it helpful?

Solution

The ON DUPLICATE KEY UPDATE alternative allows you to refer the update vs. insert decision to the database:

INSERT INTO table (userid, name) VALUES (2, 'Bobby');
  ON DUPLICATE KEY UPDATE name = 'Bobby';

would update the name field to 'Bobby', if an entry with userid 2 already exists.

You can use it as an alternative to the INSERT IGNORE if you supply a noneffective operation to the UPDATE:

INSERT INTO table (userid, name) VALUES (2, 'Bobby');
  ON DUPLICATE KEY UPDATE name = name;

This would do nothing if userid 2 already exists, thus avoiding the warning and swallowing of other errors you'd get when using INSERT IGNORE.


Another alternative would be REPLACE:

REPLACE INTO table (userid, name) VALUES (2, 'Bobby');

This would do a normal insert if the userid 2 does not exist yet. If it does exist, it will delete the old entry first and then insert a new one.


Be aware that both versions are MySQL specific extensions to SQL.

OTHER TIPS

For Zend Framework, what I do is a try/catch of the insert statement and take further action based on the exception code:

class Application_Model_DbTable_MyModel extends Zend_Db_Table_Abstract

    public function insert($aData, $bIgnore = false) 
    {

        try 
        {
            $id =  parent::insert($aData);
        } 
        catch(Zend_Db_Statement_Mysqli_Exception $e) 
        {
            // code 1062: Mysqli statement execute error : Duplicate entry
            if($bIgnore && $e->getCode() == 1062) 
            {
                // continue;
            } 
            else 
            {
                throw $e;
            }
        }
        return !empty($id) ? $id : false;
    }
}

You need to define userid as a PRIMARY or UNIQUE key and use something like:

INSERT IGNORE INTO table (userid, name) VALUES (2, 'Bob');

If the userid 2 already exists it will ignore and move on to the next insert.

You can also use ON DUPLICATE KEY UPDATE on your table schema. One other alternative might be to use the REPLACE INTO syntax.

REPLACE INTO table (userid, name) VALUES (2, 'Bob');

This will try to INSERT, if the record already exists it will DELETE it before INSERTing it again.

My solution :

/**
 * Perform an insert with the IGNORE word
 *
 * @param $data array
 * @return number the number of rows affected
 */
public function insertIgnore(array $data)
{
    // Start of query
    $sql = sprintf("INSERT IGNORE INTO %s (", $this->getAdapter()->quoteIdentifier($this->info('name')));
    // Retrieve column identifiers
    $identifiers = array_keys($data);
    foreach ($identifiers as $key => $value) {
        // Quote identifier
        $identifiers[$key] = $this->getAdapter()->quoteIdentifier($value);
    }
    // Concat column identifiers
    $sql .= implode(', ', $identifiers);
    $sql .= ") VALUES (";
    foreach ($data as $key => $value) {
        // Quote values
        $data[$key] = $this->getAdapter()->quote($value);
    }
    // Concat values identifiers
    $sql .= implode(', ', $data);
    $sql .= ")";
    // Process the query
    return $this->getAdapter()->query($sql)->rowCount();
}

In the loop, you can do a update first, if no row affected, means it's a new entry. Keep track of those 'failed updates' then do an insert of them as new entries.

I'm not familiar with Zend_Db but I assume it can return whether a update affected how many row(s).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top