Question

I am trying to take a rather large CSV file and insert it into a MySQL database for referencing in a project. I would like to use the first line of the file to create the table using proper data types and not varchar for each column. The ultimate goal is to automate this process as I have several similar files but the each has different data and a different amount of "columns" in CSV files. The problem that I am having is gettype() is returning 'string' for each column instead of int, float and string as I would like it to.

Platform is PHP 5, OS is ubuntu 8.04

here is my code so far:

<?php

// GENERATE TABLE FROM FIRST LINE OF CSV FILE

$inputFile = 'file.csv';
$tableName = 'file_csv';

$fh = fopen($inputFile, 'r');
    $contents = fread($fh, 5120); // 5KB
fclose($fh);

$fileLines = explode("\n", $contents); // explode to make sure we are only using the first line.

$fieldList = explode(',', $fileLines[0]); // separate columns, put into array
echo 'CREATE TABLE IF NOT EXISTS `'.$tableName.'` ('."<br/>\n";
for($i = 0; $i <= count($fieldList); $i++)
{
    switch(gettype($fieldList[$i])) {
        case 'integer':
            $typeInfo = 'int(11)';
            break;
        case 'float':
            $typeInfo = 'float';
            break;
        case 'string':
            $typeInfo = 'varchar(80)';
            break;
        default:
            $typeInfo = 'varchar(80)';
            break;
    }
if(gettype($fieldList[$i]) != NULL) echo "\t".'`'.$i.'` '.$typeInfo.' NOT NULL, --'.gettype($fieldList[$i]).' '.$fieldList[$i]."<br/>\n";
}
echo '  PRIMARY KEY  (`0`)'."<br/>\n";
echo ') ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1 ;';

Example First line: 1,0,0,0,0,0,0,0,0,0,0,0,0.000000,0.000000,0,0,0,,0,0,1,0,50,'Word of Recall (OLD)',

Was it helpful?

Solution

Building on Nouveau's code you could do this

for($i = 0; $i <= count($fieldList); $i++)
{
    if (is_numeric($fieldList[$i]))
    {
        if (strpos($fieldList[$i],'.') !== false){
            $fieldList[$i] = (int)$fieldList[$i];
        }else{
            $fieldList[$i] = (float)$fieldList[$i];
        }
    }

    switch(gettype($fieldList[$i])) {
        case 'integer':
            $typeInfo = 'int(11)';
            break;
        case 'float':
        case 'double':
            $typeInfo = 'float';
            break;

        case 'string':
            $typeInfo = 'varchar(80)';
            break;
        default:
            $typeInfo = 'varchar(80)';
            break;
    }
if(gettype($fieldList[$i]) != NULL) echo "\t".'`'.$i.'` '.$typeInfo.' NOT NULL, --'.gettype($fieldList[$i]).' '.$fieldList[$i]."<br/>\n";

}

That works, note the addition of "case 'double':" in the switch, but there may be a better way to do the int/float check as they would only work with standard uk/us numbers.

OTHER TIPS

Try casting the value and comparing it with the original one:

define('DECIMAL_SEPARATOR', '.');

switch ($fieldList[$i])
{
    case (string)(int)$fieldList[$i]:
        $typeInfo = (strpos($fieldList[$i], DECIMAL_SEPARATOR) === false) ? 'int(11)' : 'float';
        break;
    case (string)(float)$fieldList[$i]:
        $typeInfo = 'float';
        break;
    default:
        $typeInfo = 'varchar(80)';
        break;
}

Additionaly, check for the presence of decimal separator in the first case for numbers that are round, yet they have the decimal fraction part.

<?php

// GENERATE TABLE FROM FIRST LINE OF CSV FILE

$inputFile = 'file.csv';
$tableName = 'file_csv';

$fh = fopen($inputFile, 'r');
    $contents = fread($fh, 5120); // 5KB
fclose($fh);

$fileLines = explode("\n", $contents);

$fieldList = explode(',', $fileLines[0]);
echo 'CREATE TABLE IF NOT EXISTS `'.$tableName.'` ('."<br/>\n";
for($i = 0; $i <= count($fieldList); $i++)
{

    if(strlen($fieldList[$i]) == 0) $typeInfo = 'varchar(80)';
    if(preg_match('/[0-9]/', $fieldList[$i])) $typeInfo = 'int(11)';
    if(preg_match('/[\.]/', $fieldList[$i])) $typeInfo = 'float';
    if(preg_match('/[a-z\\\']/i', $fieldList[$i])) $typeInfo = 'varchar(80)';

    echo "\t".'`'.$i.'` '.$typeInfo.' NOT NULL, -- '.gettype($fieldList[$i]).' '.$fieldList[$i]."<br/>\n";
}
echo '  PRIMARY KEY  (`0`)'."<br/>\n";
echo ') ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1 ;';

Using regular expressions will give you a flexible solution. If you need to detect date fields then this is the way to go.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top