Question

I have two large XML files containing the product details of a webshop. The first contains the product codes, names, and stock availability; the second also contains the product codes, along with the names, prices, and other product details. I have to create a list of the products available in stock, with all their details, output as an HTML table.

My problem is the following: the XML files contain about 13,000 products. The first step (outputting the codes of the available products) works without problems, but when I also try to output the data from the second XML file, it doesn't work; the browser always shows "no data received". That makes sense: about 2,000-3,000 products are available in stock, which means the second XML file has to be read through 2,000-3,000 times.

How can I solve this problem? I can edit only the second XML file; the first is loaded from an external source that I don't have access to. Should I import the second XML file into an SQL table, or is that not a good idea either? If not, what should I do?

Thanks (and sorry for my imperfect English)!

My PHP code:

<?php

$zasoby_xml = file_get_contents('zasoby.xml');

$sxe0 = new SimpleXMLElement($zasoby_xml);
$sxe0->registerXPathNamespace('lStk', 'http://www.stormware.cz/schema/version_2/list_stock.xsd');
$lStkStock = $sxe0->xpath('//lStk:stock');
$cnt = count($lStkStock);

$sxe = new SimpleXMLElement($zasoby_xml);
$sxe->registerXPathNamespace('stk', 'http://www.stormware.cz/schema/version_2/stock.xsd');
$stkCode = $sxe->xpath('//stk:code'); //product code
$stkName = $sxe->xpath('//stk:name'); //product name
$stkCount = $sxe->xpath('//stk:count'); //count in the stock

$db_xml = simplexml_load_file('db.xml');

for ($i = 0;$i < $cnt;$i++) {
    if ($stkCount[$i] > 0) {
        echo $stkCode[$i]."&nbsp;&nbsp;";
        $j = 0;
        while($stkCode[$i] != $db_xml->record[$j]->product_id) {
            $j++;
        }
        echo $db_xml->record[$j]->category_path."<br>";
    }
}
?>

First XML file example:

<?xml version="1.0" encoding="Windows-1250"?>
<rsp:responsePack version="2.0" id="Usr01" state="ok" note="46895680" programVersion="10608.3 E1 (13.3.2014)" xmlns:rsp="http://www.stormware.cz/schema/version_2/response.xsd" xmlns:lStk="http://www.stormware.cz/schema/version_2/list_stock.xsd" xmlns:stk="http://www.stormware.cz/schema/version_2/stock.xsd">
<rsp:responsePackItem version="2.0" id="Usr01" state="ok">
<lStk:listStock version="2.0" dateTimeStamp="2014-04-08T14:18:14" dateValidFrom="2014-04-08" state="ok">
<lStk:stock version="2.0">
    <stk:code>90000000</stk:code>
    <stk:count>975.0</stk:count>
    <stk:name>Product name</stk:name>
</lStk:stock>
</lStk:listStock></rsp:responsePackItem></rsp:responsePack>

Second XML file example:

<?xml version="1.0" encoding="utf-8" ?>
<data>
<record>
    <product_id><![CDATA[77778888]]></product_id>
    <name><![CDATA[productname]]></name>
    <Deeplink><![CDATA[product url]]></Deeplink>
    <Img_url><![CDATA[product img_url]]></Img_url>
    <category_path><![CDATA[product category]]></category_path>
    <Price><![CDATA[product price]]></Price>
</record>
</data>

Solution

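Using a while loop to scan the entire $db_xml document every time you need to look up a product is inefficient: with N products in stock and M records in db.xml, that is on the order of N × M comparisons. As far as I can tell, the loop also has no upper bound on $j, so a stock code that is missing from db.xml would keep it running until the script times out. Importing the second XML file into an SQL table is not a bad idea, but it is more work than necessary when you can use a PHP array indexed by product_id.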

I've prepared some code to illustrate my point:

<?php

$zasoby_xml = file_get_contents('zasoby.xml');

$sxe0 = new SimpleXMLElement($zasoby_xml);
$sxe0->registerXPathNamespace('lStk', 'http://www.stormware.cz/schema/version_2/list_stock.xsd');
$lStkStock = $sxe0->xpath('//lStk:stock');
$cnt = count($lStkStock);

$sxe = new SimpleXMLElement($zasoby_xml);
$sxe->registerXPathNamespace('stk', 'http://www.stormware.cz/schema/version_2/stock.xsd');
$stkCode = $sxe->xpath('//stk:code'); // product code
$stkName = $sxe->xpath('//stk:name'); // product name
$stkCount = $sxe->xpath('//stk:count'); // count in the stock

$db_xml = simplexml_load_file('db.xml');

// Loop through the record elements in db.xml to build an array keyed by product_id

$records = array();

foreach ($db_xml->record as $record) {
    $records[(string)$record->product_id] = $record;
}

// Loop through all products to display their information

for ($i = 0; $i < $cnt; $i++) {

    // Display only products in stock

    if ((float)$stkCount[$i] > 0) {

        // Access this record directly by product_id (code) instead of looping through all records in db.xml

        if (isset($records[(string)$stkCode[$i]])) {
            echo sprintf(
                "<b>Code</b> %s <b>Category</b> %s", 
                $stkCode[$i], $records[(string)$stkCode[$i]]->category_path
            );
        }
    }
}

?>

zasoby.xml

<?xml version="1.0" encoding="Windows-1250"?>
<rsp:responsePack version="2.0" id="Usr01" state="ok" note="46895680" programVersion="10608.3 E1 (13.3.2014)" xmlns:rsp="http://www.stormware.cz/schema/version_2/response.xsd" xmlns:lStk="http://www.stormware.cz/schema/version_2/list_stock.xsd" xmlns:stk="http://www.stormware.cz/schema/version_2/stock.xsd">
<rsp:responsePackItem version="2.0" id="Usr01" state="ok">
<lStk:listStock version="2.0" dateTimeStamp="2014-04-08T14:18:14" dateValidFrom="2014-04-08" state="ok">
<lStk:stock version="2.0">
    <stk:code>90000000</stk:code>
    <stk:count>975.0</stk:count>
    <stk:name>Product name</stk:name>
</lStk:stock>
</lStk:listStock></rsp:responsePackItem></rsp:responsePack>

db.xml

<?xml version="1.0" encoding="utf-8" ?>
<data>
<record>
    <product_id><![CDATA[90000000]]></product_id>
    <name><![CDATA[productname]]></name>
    <Deeplink><![CDATA[product url]]></Deeplink>
    <Img_url><![CDATA[product img_url]]></Img_url>
    <category_path><![CDATA[product category]]></category_path>
    <Price><![CDATA[product price]]></Price>
</record>
</data>

With these XML files I'm getting the following output:

Code 90000000 Category product category

A drawback of this implementation is the memory consumption of the $records array. If the second XML file grows very large, you end up with an array holding thousands of SimpleXMLElement objects. If that becomes a problem, you could build an SQLite database file on disk instead of an array, or store only the fields you need in the array under each product_id key rather than the full SimpleXMLElement $record object.
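The second option (keeping only the needed fields) could look roughly like the sketch below. It is not part of the script above: the helper name buildRecordIndex() and the choice of which fields to keep are my own assumptions, the type declarations assume PHP 7 or later, and the XML is inlined only so that the example runs on its own. In the real script you would pass in the result of simplexml_load_file('db.xml') instead.

```php
<?php

// Hypothetical helper (not in the script above): index the db.xml records by
// product_id, copying only plain strings instead of SimpleXMLElement objects.
function buildRecordIndex(SimpleXMLElement $dbXml): array
{
    $index = array();
    foreach ($dbXml->record as $record) {
        // The (string) casts copy the text out of the XML tree.
        $index[(string)$record->product_id] = array(
            'name'          => (string)$record->name,
            'category_path' => (string)$record->category_path,
            'price'         => (string)$record->Price,
        );
    }
    return $index;
}

// Inline sample data standing in for db.xml (an assumption for this sketch).
$dbXml = simplexml_load_string(
    '<data><record>'
    . '<product_id><![CDATA[90000000]]></product_id>'
    . '<name><![CDATA[productname]]></name>'
    . '<category_path><![CDATA[product category]]></category_path>'
    . '<Price><![CDATA[product price]]></Price>'
    . '</record></data>'
);

$records = buildRecordIndex($dbXml);
unset($dbXml); // drop our reference to the parsed document; the index holds only strings

// Look a product up directly by its code.
$code = '90000000';
if (isset($records[$code])) {
    echo htmlspecialchars($records[$code]['category_path']), "\n";
}
```

Because the array then holds only strings, the parsed document does not have to stay in memory for the rest of the script. How much memory this saves in practice depends on the size of db.xml and on how many fields you keep, so it is worth measuring with memory_get_peak_usage() before deciding whether SQLite is needed.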

EDIT: Fixed an error in line 23 of the script.

Licensed under: CC-BY-SA with attribution