Exporting Excel to XML Spreadsheet with blank cells
08-07-2019
Question
I am exporting an Excel workbook to an XML Spreadsheet. The workbook has, let's say, 10 columns and 10 rows, and some of the cells are empty (i.e. they have no value).
When I save the file as an XML Spreadsheet and look at a row that contains a blank cell, the blank cell is not there: the XML shows the cell before the blank and the cell after it one right after the other (the empty cell simply doesn't exist).
Here is a sample of the xml:
<Cell ss:StyleID="s36"><Data ss:Type="Number">cell1</Data><NamedCell ss:Name="Print_Area"/></Cell>
<Cell><Data ss:Type="String">cell2</Data><NamedCell ss:Name="Print_Area"/></Cell>
<Cell><Data ss:Type="String">cell4</Data><NamedCell ss:Name="Print_Area"/></Cell>
The missing cell is cell3.
Is there a way to ask Excel not to save space? Recreating the original layout with XSLT is not as easy as it seems.
Solution
If a cell is empty, omitting it seems a reasonable optimization to save space. Why shouldn't it be missing?
You have enough information to recreate the original spreadsheet.
OTHER TIPS
Exactly where is the information stored that lets him recreate the spreadsheet? If these rows:
- Data, empty, Data, empty, Data
- Data, Data, Data, empty, empty
- Data, empty, empty, Data, Data
all give
<Row>
<Cell><Data></Data></Cell>
<Cell><Data></Data></Cell>
<Cell><Data></Data></Cell>
</Row>
then the three layouts cannot be told apart.
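You can build your own VBA macro, like the one below. It needs a reference to the Microsoft XML library (Tools > References in the VBA editor). Row 1 is treated as the header row, and every data cell under a header gets its own element, even when the cell is empty.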
Sub makeXml()
    Dim lastRow As Long, lastCol As Long
    Dim iRow As Long, iCol As Long
    Dim lastCell As Range
    Dim xDoc As New DOMDocument
    Dim rootNode As IXMLDOMNode
    Dim rowNode As IXMLDOMNode
    Dim colNode As IXMLDOMNode
    Dim fileSaveName As Variant

    ' Find the bottom-right corner of the used area
    Set lastCell = ActiveSheet.Cells.SpecialCells(xlLastCell)
    lastRow = lastCell.Row
    lastCol = lastCell.Column

    Set rootNode = xDoc.createElement("Root")
    ' Loop over the data rows; row 1 holds the column headers
    For iRow = 2 To lastRow
        Set rowNode = xDoc.createElement("Row")
        ' Loop over the columns, skipping columns without a header
        For iCol = 1 To lastCol
            If Len(ActiveSheet.Cells(1, iCol).Text) > 0 Then
                Set colNode = xDoc.createElement(GetXmlSafeColumnName(ActiveSheet.Cells(1, iCol).Text))
                ' An empty cell still produces an (empty) element
                colNode.Text = ActiveSheet.Cells(iRow, iCol).Text
                rowNode.appendChild colNode
            End If
        Next iCol
        rootNode.appendChild rowNode
    Next iRow
    xDoc.appendChild rootNode

    fileSaveName = Application.GetSaveAsFilename( _
        fileFilter:="XML Files (*.xml), *.xml")
    ' GetSaveAsFilename returns False when the dialog is cancelled
    If fileSaveName <> False Then xDoc.Save fileSaveName
    Set xDoc = Nothing
End Sub
Function GetXmlSafeColumnName(name As String) As String
    ' Turn a header into an element name: spaces become underscores,
    ' and the punctuation characters listed below are removed.
    Dim ret As String
    Dim ch As Variant
    ret = Replace(name, " ", "_")
    For Each ch In Array(".", ",", "&", "!", "@", "$", "#", "%", "^", "*", "(", ")", "-", "+")
        ret = Replace(ret, ch, "")
    Next ch
    GetXmlSafeColumnName = ret
End Function
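For readers working outside Excel, here is a rough Python sketch of the same idea as the macro above: one element per header, written even when the cell is empty, so column positions are never lost. It assumes the rows are already in memory as lists; `safe_name`, `rows_to_xml`, and the sample data are made-up names for illustration. Unlike the VBA function, `safe_name` drops every character that is not a letter, digit, or underscore, not just a fixed list.

```python
import re
import xml.etree.ElementTree as ET


def safe_name(header):
    """Spaces become underscores; anything else that is not a letter,
    digit, or underscore is dropped. Headers that start with a digit
    would still be invalid element names; that case is not handled."""
    name = re.sub(r"\s", "_", header.strip())
    return re.sub(r"[^A-Za-z0-9_]", "", name)


def rows_to_xml(headers, rows):
    """Build <Root><Row>...</Row></Root> with one child per header,
    including an empty element for each empty cell."""
    root = ET.Element("Root")
    for row in rows:
        row_el = ET.SubElement(root, "Row")
        for i, header in enumerate(headers):
            if not header:
                continue  # columns without a header are skipped, as in the macro
            value = row[i] if i < len(row) else None
            cell = ET.SubElement(row_el, safe_name(header))
            cell.text = "" if value is None else str(value)
    return root


# Hypothetical sample data
headers = ["First Name", "Qty.", "Note"]
rows = [["Ann", 3, None], ["Bob", None, "late"]]
print(ET.tostring(rows_to_xml(headers, rows), encoding="unicode"))
```

Because every row has the same children in the same order, an XSLT or any other consumer can address a column by element name instead of by position.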
I had the same issue until I wrote some code to deal with the omitted empty cells. You just need to use the ss:Index attribute of the Cell element if it exists (see the XML Spreadsheet Reference for details) and store the Cell contents at the proper array index to recreate the original cell order.
<?php
$doc = new DOMDocument('1.0', 'utf-8');
if (!$doc->load('sample.xml'))
    die('Could not load sample.xml');
// Strip the default namespace so the unprefixed XPath queries below match
$root = $doc->documentElement;
$root->removeAttributeNS($root->getAttributeNode('xmlns')->nodeValue, '');
$xpath = new DOMXPath($doc);
foreach ($xpath->query('/Workbook/Worksheet/Table/Row') as $row)
{
    $cells = array();
    $cell_index = 0;
    foreach ($xpath->query('./Cell', $row) as $cell)
    {
        // ss:Index, when present, gives the 1-based column of this cell
        if ($cell->hasAttribute('ss:Index'))
            $cell_index = (int) $cell->getAttribute('ss:Index');
        else
            ++$cell_index;
        $cells[$cell_index - 1] = $cell->nodeValue;
    }
    // now process data
    print_r($cells);
}
Note that empty cells are not added to the array, while everything else ends up in its place. If you need a fixed row width, you can compute the maximum cell index (the number of table columns) across all rows.
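The same ss:Index bookkeeping, plus the padding to a common width mentioned above, could look roughly like this in Python. This is a sketch under assumptions, not a full reader: `read_rows` and `SAMPLE` are made-up names, the sample is a hand-written fragment rather than real Excel output, and only the ss:Index attribute on Cell is handled (ss:Index on Row and merged cells are ignored).

```python
import xml.etree.ElementTree as ET

# Namespace used by the XML Spreadsheet format for both elements and ss: attributes
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}

# Hand-written sample (assumption): cell3 is omitted in the first row
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Sheet1">
  <Table>
   <Row>
    <Cell><Data ss:Type="String">cell1</Data></Cell>
    <Cell><Data ss:Type="String">cell2</Data></Cell>
    <Cell ss:Index="4"><Data ss:Type="String">cell4</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="String">b</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>"""


def read_rows(xml_text):
    """Return each row as a list padded with None where cells were omitted."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.iterfind(".//ss:Row", NS):
        cells = {}
        position = 0
        for cell in row.iterfind("ss:Cell", NS):
            # ss:Index, when present, is the 1-based column of this cell;
            # otherwise the cell directly follows the previous one.
            index = cell.get(f"{{{SS}}}Index")
            position = int(index) if index is not None else position + 1
            data = cell.find("ss:Data", NS)
            cells[position] = data.text if data is not None else None
        rows.append(cells)
    # Widest row decides the number of columns
    width = max((max(c) for c in rows if c), default=0)
    return [[c.get(i) for i in range(1, width + 1)] for c in rows]


print(read_rows(SAMPLE))
```

Keeping each row as a position-to-value mapping first and padding afterwards means the table width does not have to be known before the rows are read.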