Question

I need to import some CSVs into Excel spreadsheet, the row/column numbers of the CSVs are different for each of them. The problem is that some values are long digit strings like
341235387313289173719237217391,

Excel will treat those values as (double) numbers, and then results in lost of data.

My way to solve it is to use the following function to do the job:

Sub readCSV(f As TextStream, sh As Worksheet)
    i = 1
    Do
        l = Trim(f.ReadLine)
        If l = "" Then Exit Sub 'skip the last empty line(s)
        l = Mid(l, 2, Len(l) - 1)
        ss = Split(l, """,""")
        For j = LBound(ss) To UBound(ss) 'j starts from 0
            Dim a As Range
            With sh.Cells(i, j + 1)
                .NumberFormat = "@" 'Force to text format
                .Value = ss(j)
            End With
            DoEvents 'Avoid blocking the GUI
        Next j
        i = i + 1
    Loop Until f.AtEndOfStream
End Sub

The problem is the performance. It is much slower than importing the data through Data->From Text or just open the CSVs directly.

Are there any way to do it more efficiently?

Was it helpful?

Solution

You can format/write each line in one shot:

Sub readCSV(f As TextStream, sh As Worksheet)

     Dim i As Long
     Dim ss, l
     i = 1

    With Application
        .ScreenUpdating = False
        .Calculation = xlCalculationManual
    End With

    Do
        l = Trim(f.ReadLine)
        If l = "" Then Exit Sub 'skip the last empty line(s)
        l = Mid(l, 2, Len(l) - 1)
        ss = Split(l, """,""")

        With sh.Cells(i, 1).Resize(1, (UBound(ss) - LBound(ss)) + 1)
            If (i-1) Mod 100 = 0 Then .Resize(100).NumberFormat = "@"
            .Value = ss
        End With

        i = i + 1
    Loop Until f.AtEndOfStream

    With Application
        .ScreenUpdating = True
        .Calculation = xlCalculationAutomatic
    End With


End Sub

EDIT: after testing, the real performance killer is setting the cell format to text- revised code to set this in blocks of 100 lines instead of each line.

OTHER TIPS

Rather than work in Excel (by cell or by line) you could use a Regexp to quikcly create a second version of your CSV file with everystring of alpanumerics longer than 16 characters updated with a preceding '

Then simply import or open the entire new csv in Excel

Sample code that runs on a CSV file StrIn with this path for the example, "c:\Temp\test.csv"

Sub Main()
Dim objFSO As Object
Dim objRegex As Object
Dim objTF As Object
Dim objTF2 As Object
Dim tf As Object
Dim strIn As String
Dim strOut As String
Dim strFile As String

strIn = "c:\Temp\test.csv"
strOut = "c:\Temp\test2.csv"

Set objFSO = CreateObject("scripting.filesystemobject")
Set objTF = objFSO.getfile(strIn)
Set objRegex = CreateObject("vbscript.regexp")
Set tf = objTF.OpenAsTextStream(ForReading)
strFile = tf.ReadAll

With objRegex
.Pattern = "(\w{16,})"
.Global = True
strFile = .Replace(strFile, "'" & "$1")
End With

Set objTF2 = objFSO.OpenTextFile(strOut, ForWriting, True)
objTF2.Write strFile
objTF2.Close
tf.Close
End Sub

Try .Value = "'" & ss(j)

The ' forces the value to display as a text string in Excel.

Also, try declaring your ss array in a string so it doesnt store the numbers as longs after splitting. Something like:

Dim ss() as String = Split(l, """,""")
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top