Problem

I have a legacy PHP web app that performs CSV-to-database imports into a 'master' table that stores entity data, and an entity-attribute-value (EAV) table that stores dynamic data for each entity.

The import process is a line-by-line iteration through the CSV file, with an INSERT into the master table and multiple INSERTs into the EAV table for each line.
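The loop looks roughly like this (a simplified sketch; table and column names are invented):

    <?php
    // Simplified sketch of the current import. One master INSERT plus
    // several EAV INSERTs per CSV line means millions of statements
    // (and round trips) for a large file.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
    $mapped = ['color' => 'color', 'size' => 'size']; // CSV column => attribute

    $insertMaster = $pdo->prepare('INSERT INTO master (name) VALUES (?)');
    $insertEav    = $pdo->prepare(
        'INSERT INTO eav (master_id, attribute, value) VALUES (?, ?, ?)');

    $fh = fopen('import.csv', 'r');
    $header = fgetcsv($fh);
    while (($row = fgetcsv($fh)) !== false) {
        $fields = array_combine($header, $row);
        $insertMaster->execute([$fields['name']]);
        $masterId = $pdo->lastInsertId();
        foreach ($mapped as $column => $attribute) {
            $insertEav->execute([$masterId, $attribute, $fields[$column]]);
        }
    }
    fclose($fh);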

This process is SLOOW, and what little I know about MySQL tuning tells me that a LOAD DATA statement is generally far faster than a series of INSERTs. However, because of the EAV mapping, the per-row iteration would still have to happen, just driven by the results of a database query rather than by the CSV file.

  • Is it worth it to make the modification?

  • Does it make a difference when there are tens of millions of records in each file, with generally less than two-thirds of the file's fields actually mapped to attributes?

Solution

Sounds like a useful modification. What I would do is pre-process the CSV into two files, one per table (master and EAV). The tricky part is establishing some sort of linkage between the two files so you can insert into the EAV table with the correct foreign key.
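A minimal sketch of that pre-processing pass (the file names, table layout, and attribute mapping are assumptions; where $nextId comes from is exactly the linkage question below):

    <?php
    // Split the source CSV into one flat file per table, ready for
    // LOAD DATA INFILE. Only the mapped columns are written out, so
    // unmapped fields never reach the database at all.
    $mapped = ['color' => 'color', 'size' => 'size']; // CSV column => attribute
    $nextId = 1; // placeholder -- see the next sketch for deriving it safely

    $in     = fopen('import.csv', 'r');
    $master = fopen('/tmp/master.csv', 'w');
    $eav    = fopen('/tmp/eav.csv', 'w');

    $header = fgetcsv($in);
    $id = $nextId;
    while (($row = fgetcsv($in)) !== false) {
        $fields = array_combine($header, $row);
        fputcsv($master, [$id, $fields['name']]);
        foreach ($mapped as $column => $attribute) {
            fputcsv($eav, [$id, $attribute, $fields[$column]]);
        }
        $id++;
    }
    fclose($in);
    fclose($master);
    fclose($eav);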

The problem is simplified if:

  1. you can lock out any other write access to the system while you execute the load
  2. the master table primary key is an incrementing integer

In that case, you can easily "know" the EAV foreign key value ahead of time and set it appropriately before loading the data for either table.
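Under those two conditions, the whole load might look like this (a sketch; assumes an AUTO_INCREMENT `id` column on master, LOCAL INFILE enabled on both client and server, and the /tmp files written above):

    <?php
    // "Predictable key" path: lock out other writers, predict the first
    // id the load will assign, then bulk-load both files.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret',
                   [PDO::MYSQL_ATTR_LOCAL_INFILE => true]);

    $pdo->exec('LOCK TABLES master WRITE, eav WRITE');

    // If master rows are ever deleted, read the table's AUTO_INCREMENT
    // counter from information_schema instead of using MAX(id).
    $nextId = (int) $pdo->query('SELECT COALESCE(MAX(id), 0) + 1 FROM master')
                        ->fetchColumn();

    // ... run the pre-processing pass above, numbering rows from $nextId ...

    $pdo->exec("LOAD DATA LOCAL INFILE '/tmp/master.csv' INTO TABLE master
                FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
                (id, name)");
    $pdo->exec("LOAD DATA LOCAL INFILE '/tmp/eav.csv' INTO TABLE eav
                FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
                (master_id, attribute, value)");

    $pdo->exec('UNLOCK TABLES');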

If not, you'll need to figure out how to get the primary key values for the master table records after the LOAD DATA, and link up the EAV records accordingly.
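One way to do that, sketched with hypothetical staging columns: tag every row with a batch-unique (batch_id, line_no) pair during pre-processing, load the EAV rows into a staging table, and resolve the real keys with a single set-based INSERT ... SELECT:

    <?php
    // Fallback path: master.csv carries (batch_id, line_no, name) and
    // eav.csv carries (batch_id, line_no, attribute, value); the join
    // below replaces the per-row lastInsertId() lookups entirely.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret',
                   [PDO::MYSQL_ATTR_LOCAL_INFILE => true]);

    $pdo->exec("LOAD DATA LOCAL INFILE '/tmp/master.csv' INTO TABLE master
                FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
                (batch_id, line_no, name)");
    $pdo->exec("LOAD DATA LOCAL INFILE '/tmp/eav.csv' INTO TABLE eav_staging
                FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
                (batch_id, line_no, attribute, value)");

    // Resolve the real foreign keys in one statement.
    $pdo->exec('INSERT INTO eav (master_id, attribute, value)
                SELECT m.id, s.attribute, s.value
                FROM eav_staging s
                JOIN master m USING (batch_id, line_no)');

Either way, the per-row PHP loop only writes flat files; all database work collapses into a handful of bulk statements.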

License: CC BY-SA with attribution
Not affiliated with StackOverflow