Question

I'm trying to extract one line of text from each of ~1500 identically formatted txt files, then save all the values in these lines--along with a corresponding date (the txt file name)--to a csv file.

The lines in said txt files are thus formatted (there are rows of data above and below):

DAILY AVG:       14.64          9.49          9.46          0.16       243.71

I am hoping to ultimately generate an xls file that looks like this:

Date      AVG1   AVG2  AVG3  AVG4  AVG5
12-13-06  14.64  9.49  9.46  0.16  243.71

I have considered using grep or awk, but frankly have no idea where to start. Is a command-line batch procedure the best line of attack? The desired end result is to have all of these daily averages, along with their corresponding dates, imported into an Excel spreadsheet. Excel's text import option would work on a file-by-file basis, but manually importing 1500 individual txt files would be infeasible unless I had an army of people.

Any insight or direction would be greatly appreciated.


Solution

Assumptions:

  • All text files are located in the same folder
  • The text files are tab delimited

Use this Excel VBA code:

Sub tgr()

    Dim oShell As Object
    Dim oFSO As Object
    Dim arrData(1 To 65000) As String
    Dim strFolderPath As String
    Dim strFileName As String
    Dim strText As String
    Dim DataIndex As Long
    Dim lAvgLoc As Long

    Set oShell = CreateObject("Shell.Application")
    On Error Resume Next
    strFolderPath = oShell.BrowseForFolder(0, "Select a Folder", 0).Self.Path & Application.PathSeparator
    Set oShell = Nothing
    On Error GoTo 0
    If Len(strFolderPath) = 0 Then Exit Sub 'Pressed cancel

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    strFileName = Dir(strFolderPath & "*.txt*")
    Do While Len(strFileName) > 0
        strText = oFSO.OpenTextFile(strFolderPath & strFileName).ReadAll
        lAvgLoc = InStr(1, strText, "Daily Avg", vbTextCompare)
        If lAvgLoc > 0 Then
            strText = Mid(strText, lAvgLoc)
            strText = Trim(Mid(Replace(strText, vbCrLf, String(255, " ")), Evaluate("MIN(FIND({1,2,3,4,5,6,7,8,9,0},""" & strText & """&1234567890))"), 240))
            DataIndex = DataIndex + 1
            arrData(DataIndex) = DateValue(Replace(strFileName, ".txt", vbNullString)) & vbTab & strText
        End If
        strFileName = Dir
    Loop

    If DataIndex > 0 Then
        With Sheets.Add
            .Range("A1:F1").Value = Array("DATE", "AVG1", "AVG2", "AVG3", "AVG4", "AVG5")
            With .Range("A2").Resize(DataIndex)
                .Value = Application.Transpose(arrData)
                .TextToColumns .Cells, xlDelimited, xlTextQualifierDoubleQuote, Tab:=True
                .NumberFormat = "mm-dd-yy"
            End With
            Application.DisplayAlerts = False
            .SaveAs strFolderPath & "Daily Averages.csv", xlCSV
            Application.DisplayAlerts = True
        End With
    End If

    Set oFSO = Nothing
    Erase arrData

End Sub

How to use a macro:

  1. Make a copy of the workbook the macro will be run on
    • Always run new code on a workbook copy, just in case the code doesn't run smoothly
    • This is especially true of any code that deletes anything
  2. In the copied workbook, press ALT+F11 to open the Visual Basic Editor
  3. Choose Insert | Module, then copy the provided code and paste it into the module
  4. Close the Visual Basic Editor
  5. In Excel, press ALT+F8 to bring up the list of available macros to run
  6. Double-click the desired macro (I named this one tgr)

OTHER TIPS

You can use this shell script.

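#!/bin/sh

echo Date AVG1 AVG2 AVG3 AVG4 AVG5 > output.txt
for i in *.txt
do
    # Skip the output file itself, which also matches *.txt
    [ "$i" = output.txt ] && continue
    STRING=${i%.txt}
    # Keep only the DAILY AVG line, squeeze repeated spaces, drop the label
    DATA=$(grep 'DAILY AVG:' "$i" | tr -s ' ' | cut -d ' ' -f 3-)
    echo $STRING $DATA >> output.txt
done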

Assuming all data files are in the current directory, this will put your desired output in output.txt. You can then load this into Excel.
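Since the question mentions awk and asks for a csv file, here is a rough sketch of an awk-based variant that writes comma-separated rows directly. It is only an illustration under a few assumptions: the function name extract_daily_avgs is made up for this example, file names are assumed to be the date (for example 12-13-06.txt), and each file is assumed to contain at most one line with "DAILY AVG:" whose values are separated by spaces or tabs.

```shell
#!/bin/sh
# Sketch only: extract_daily_avgs is a hypothetical helper name.
# Assumes each file is named <date>.txt and holds one "DAILY AVG:" line.

extract_daily_avgs() {
    dir=${1:-.}
    echo "Date,AVG1,AVG2,AVG3,AVG4,AVG5"
    for f in "$dir"/*.txt
    do
        [ -f "$f" ] || continue            # the glob matched nothing
        stamp=$(basename "$f" .txt)        # file name without .txt is the date
        # awk splits on runs of spaces or tabs: fields 1-2 are "DAILY" and
        # "AVG:", so the values start at field 3.
        awk -v d="$stamp" '
            /DAILY AVG:/ {
                sub(/\r$/, "")             # tolerate Windows line endings
                line = d
                for (i = 3; i <= NF; i++) line = line "," $i
                print line
                exit                       # only the first match per file
            }' "$f"
    done
}

# When run as a script with a folder argument, print the CSV to stdout.
if [ "$#" -gt 0 ]; then
    extract_daily_avgs "$1"
fi
```

Saved as, say, extract.sh (a placeholder name), it could be run as `sh extract.sh /path/to/files > daily_averages.csv`. Redirecting to a .csv name keeps the result out of the *.txt glob, and Excel can open the file directly.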

Licensed under: CC-BY-SA with attribution