Parse and import unstructured text files into Microsoft Access (file has potential delimiters)

StackOverflow https://stackoverflow.com/questions/19252925

Domanda

I have a bunch of text files that I need to import into MS Access (thousands) - can use 2007 or 2010. The text files have categories that are identified in square brackets and have relevant data between the categories - for example:

[Location]Tenessee[Location][Model]042200[Model][PartNo]113342A69447B6[PartNo].

I need to capture both the categories and the data between them and import them into Access - the categories to one table, the data to another. There are hundreds of these categories in a single file and the text file has no structure - they are all run together as in the example above. The categories in the brackets are the only clear delimiters.

Through research on the web I have come up with a script for VBS (I am not locked into VBS, willing to use VBA or another method), but when I run it, I am getting a VBS info window with nothing displaying in it. Any advice or guidance would be most gratefully appreciated (I do not tend to use VBS and VBA) and I thank you.

The Script:

Const ForReading = 1

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile = objFSO.OpenTextFile("C:\Users\testGuy\Documents\dmc_db_test\DMC-TEST-A-00-00-00-00A-022A-D_000 - Copy01.txt", ForReading)

    strContents = objFile.ReadAll
    objFile.Close

    Set objRegEx = CreateObject("VBScript.RegExp")

    objRegEx.Global = True   
    objRegEx.Pattern = "\[.{0,}\]"

    Set colMatches = objRegEx.Execute(strContents)  

    If colMatches.Count > 0 Then
       For Each strMatch in colMatches   
           strMatches = strMatches & strMatch.Value 
       Next
    End If

    strMatches = Replace(strMatches, "]", vbCrlf)
    strMatches = Replace(strMatches, "[", "")

    Wscript.Echo strMatches
È stato utile?

Soluzione

Regular expressions are wonderful things, but in your case it looks like they might be overkill. The following code uses plain old InStr() to find the [Tags] and parses the file(s) out to a single CSV file. That is, for input files

testfile1.txt:

[Location]Tennessee[Location][Model]042200[Model][PartNo]113342A69447B6[PartNo]
[Location]Mississippi[Location][Model]042200[Model][SerialNo]3212333222355[SerialNo]

and testfile2.txt:

[Location]Missouri[Location][Model]042200[Model][PartNo]AAABBBCCC111222333[PartNo]

...the code will write the following output file...

"FileName","LineNumber","ItemNumber","FieldName","FieldValue"
"testfile1.txt",1,1,"Location","Tennessee"
"testfile1.txt",1,2,"Model","042200"
"testfile1.txt",1,3,"PartNo","113342A69447B6"
"testfile1.txt",2,1,"Location","Mississippi"
"testfile1.txt",2,2,"Model","042200"
"testfile1.txt",2,3,"SerialNo","3212333222355"
"testfile2.txt",1,1,"Location","Missouri"
"testfile2.txt",1,2,"Model","042200"
"testfile2.txt",1,3,"PartNo","AAABBBCCC111222333"

...which you can then import into Access (or whatever) and proceed from there. This is VBA code, but it could easily be tweaked to run as a VBScript.

Sub ParseSomeFiles()
Const InFolder = "C:\__tmp\parse\in\"
Const OutFile = "C:\__tmp\parse\out.csv"
Dim fso As FileSystemObject, f As File, tsIn As TextStream, tsOut As TextStream
Dim s As String, Lines As Long, Items As Long, i As Long
Set fso = New FileSystemObject
Set tsOut = fso.CreateTextFile(OutFile, True)
tsOut.WriteLine """FileName"",""LineNumber"",""ItemNumber"",""FieldName"",""FieldValue"""
For Each f In fso.GetFolder(InFolder).Files
    Debug.Print "Parsing """ & f.Name & """..."
    Set tsIn = f.OpenAsTextStream(ForReading)
    Lines = 0
    Do While Not tsIn.AtEndOfStream
        s = Trim(tsIn.ReadLine)
        Lines = Lines + 1
        Items = 0
        Do While Len(s) > 0
            Items = Items + 1
            tsOut.Write """" & f.Name & """," & Lines & "," & Items
            i = InStr(1, s, "]", vbBinaryCompare)
            ' write out FieldName
            tsOut.Write ",""" & Replace(Mid(s, 2, i - 2), """", """""", 1, -1, vbBinaryCompare) & """"
            s = Mid(s, i + 1)
            i = InStr(1, s, "[", vbBinaryCompare)
            ' write out FieldValue
            tsOut.Write ",""" & Replace(Mid(s, 1, i - 1), """", """""", 1, -1, vbBinaryCompare) & """"
            s = Mid(s, i)
            i = InStr(1, s, "]", vbBinaryCompare)
            ' (no need to write out ending FieldName tag)
            s = Mid(s, i + 1)
            tsOut.WriteLine
        Loop
    Loop
    tsIn.Close
    Set tsIn = Nothing
Next
Set f = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Debug.Print "Done."
End Sub
Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top