Parse and import unstructured text files into Microsoft Access (file has potential delimiters)

StackOverflow https://stackoverflow.com/questions/19252925

문제

I have a bunch of text files that I need to import into MS Access (thousands) - can use 2007 or 2010. The text files have categories that are identified in square brackets and have relevant data between the categories - for example:

[Location]Tenessee[Location][Model]042200[Model][PartNo]113342A69447B6[PartNo].

I need to capture both the categories and the data between them and import them into Access - the categories to one table, the data to another. There are hundreds of these categories in a single file and the text file has no structure - they are all run together as in the example above. The categories in the brackets are the only clear delimiters.

Through research on the web I have come up with a script for VBS (I am not locked into VBS, willing to use VBA or another method), but when I run it, I am getting a VBS info window with nothing displaying in it. Any advice or guidance would be most gratefully appreciated (I do not tend to use VBS and VBA) and I thank you.

The Script:

Const ForReading = 1

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile = objFSO.OpenTextFile("C:\Users\testGuy\Documents\dmc_db_test\DMC-TEST-A-00-00-00-00A-022A-D_000 - Copy01.txt", ForReading)

    strContents = objFile.ReadAll
    objFile.Close

    Set objRegEx = CreateObject("VBScript.RegExp")

    objRegEx.Global = True   
    objRegEx.Pattern = "\[.{0,}\]"

    Set colMatches = objRegEx.Execute(strContents)  

    If colMatches.Count > 0 Then
       For Each strMatch in colMatches   
           strMatches = strMatches & strMatch.Value 
       Next
    End If

    strMatches = Replace(strMatches, "]", vbCrlf)
    strMatches = Replace(strMatches, "[", "")

    Wscript.Echo strMatches
도움이 되었습니까?

해결책

Regular expressions are wonderful things, but in your case it looks like they might be overkill. The following code uses plain old InStr() to find the [Tags] and parses the file(s) out to a single CSV file. That is, for input files

testfile1.txt:

[Location]Tennessee[Location][Model]042200[Model][PartNo]113342A69447B6[PartNo]
[Location]Mississippi[Location][Model]042200[Model][SerialNo]3212333222355[SerialNo]

and testfile2.txt:

[Location]Missouri[Location][Model]042200[Model][PartNo]AAABBBCCC111222333[PartNo]

...the code will write the following output file...

"FileName","LineNumber","ItemNumber","FieldName","FieldValue"
"testfile1.txt",1,1,"Location","Tennessee"
"testfile1.txt",1,2,"Model","042200"
"testfile1.txt",1,3,"PartNo","113342A69447B6"
"testfile1.txt",2,1,"Location","Mississippi"
"testfile1.txt",2,2,"Model","042200"
"testfile1.txt",2,3,"SerialNo","3212333222355"
"testfile2.txt",1,1,"Location","Missouri"
"testfile2.txt",1,2,"Model","042200"
"testfile2.txt",1,3,"PartNo","AAABBBCCC111222333"

...which you can then import into Access (or whatever) and proceed from there. This is VBA code, but it could easily be tweaked to run as a VBScript.

Sub ParseSomeFiles()
Const InFolder = "C:\__tmp\parse\in\"
Const OutFile = "C:\__tmp\parse\out.csv"
Dim fso As FileSystemObject, f As File, tsIn As TextStream, tsOut As TextStream
Dim s As String, Lines As Long, Items As Long, i As Long
Set fso = New FileSystemObject
Set tsOut = fso.CreateTextFile(OutFile, True)
tsOut.WriteLine """FileName"",""LineNumber"",""ItemNumber"",""FieldName"",""FieldValue"""
For Each f In fso.GetFolder(InFolder).Files
    Debug.Print "Parsing """ & f.Name & """..."
    Set tsIn = f.OpenAsTextStream(ForReading)
    Lines = 0
    Do While Not tsIn.AtEndOfStream
        s = Trim(tsIn.ReadLine)
        Lines = Lines + 1
        Items = 0
        Do While Len(s) > 0
            Items = Items + 1
            tsOut.Write """" & f.Name & """," & Lines & "," & Items
            i = InStr(1, s, "]", vbBinaryCompare)
            ' write out FieldName
            tsOut.Write ",""" & Replace(Mid(s, 2, i - 2), """", """""", 1, -1, vbBinaryCompare) & """"
            s = Mid(s, i + 1)
            i = InStr(1, s, "[", vbBinaryCompare)
            ' write out FieldValue
            tsOut.Write ",""" & Replace(Mid(s, 1, i - 1), """", """""", 1, -1, vbBinaryCompare) & """"
            s = Mid(s, i)
            i = InStr(1, s, "]", vbBinaryCompare)
            ' (no need to write out ending FieldName tag)
            s = Mid(s, i + 1)
            tsOut.WriteLine
        Loop
    Loop
    tsIn.Close
    Set tsIn = Nothing
Next
Set f = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Debug.Print "Done."
End Sub
라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top