Question

I have a file with the following sample text (the actual file is huge).

2014/05/08-19:15:44.544824-<String1>
2014/05/08-19:21:54.544824-<String2>
2014/05/08-19:34:59.564461-<String3>

I have to extract the data between two timestamps, e.g. 19:15:00 to 19:20:00, and so on until EOF. I have tried sed and awk, but at certain points they print everything (e.g. everything from 19:15:00 to EOF). The commands I used were

awk '/19:15:00/,/19:20:00/' InputFile

and

sed '/19:15:00/,/19:20:00/p' InputFile

Any ideas on how to do this in Perl, sed, or awk in a way that actually works? Does the wrong output have something to do with the format of the timestamps? (Just a thought.)

PS: I am using the following code to generate the timestamps. (Could the localtime function be causing this?)

$curr = timelocal(0, 0, 0, (split /\//, $ARGV[0])[1], (split /\//, $ARGV[0])[0]-1, (split /\//, $ARGV[0])[-1]);
$currentTime = strftime "%H:%M:%S", localtime($curr);

Solution

Use a Perl one-liner: capture the time and then compare it as a string.

perl -ne '$t = /(\d+:[\d:.]+)/ ? $1 : undef; 
    print if $t ge "19:15:00" && $t le "19:20:00";' file.txt

OTHER TIPS

The awk and sed commands you show won't work because they do pattern matching; they don't compare timestamps chronologically. The range ends only at a line containing the literal string 19:20:00, so if no such line exists, the output continues to the end of the file, even if a 19:21:00 line appears along the way. (The sed command also needs -n; without it, sed prints every line regardless of the range.)
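A small demonstration of that behaviour, using the three sample lines from the question. Note one assumption: the start pattern is shortened to 19:15 here so that the range actually opens on this sample; the file name is a temporary file created just for the sketch.

```shell
# Sample lines from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2014/05/08-19:15:44.544824-<String1>
2014/05/08-19:21:54.544824-<String2>
2014/05/08-19:34:59.564461-<String3>
EOF

# The range opens at the first line containing "19:15" and closes only at a
# line containing the literal text "19:20:00". No such line exists here, so
# the range never closes and every remaining line is printed.
range_output=$(awk '/19:15/,/19:20:00/' "$sample")
printf '%s\n' "$range_output"

rm -f "$sample"
```

All three lines come out, including the 19:21 and 19:34 ones, which is the "everything until EOF" symptom described in the question.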

You could probably do it in Perl with something like the two lines at the end of your question, but in reverse: parse each timestamp, convert it to a time value, and compare the values.
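The same parse-then-compare idea can be sketched in awk instead of Perl, which is a swap on my part rather than what the answer above describes. The helper names in_window and to_seconds are made up for this example, and it assumes every line is from the same day, since only the time of day is converted (to seconds since midnight).

```shell
# Hypothetical helper: print lines whose time of day lies in [first, last].
# $1 = first HH:MM:SS, $2 = last HH:MM:SS, $3 = input file
in_window() {
  awk -F'[-.]' -v first="$1" -v last="$2" '
    # Convert HH:MM:SS to seconds since midnight.
    function to_seconds(hms,   p) {
      split(hms, p, ":")
      return p[1] * 3600 + p[2] * 60 + p[3]
    }
    BEGIN { lo = to_seconds(first); hi = to_seconds(last) }
    # With FS set to "-" or ".", field 2 is the HH:MM:SS part.
    { t = to_seconds($2) }
    t >= lo && t <= hi
  ' "$3"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
2014/05/08-19:15:44.544824-<String1>
2014/05/08-19:21:54.544824-<String2>
2014/05/08-19:34:59.564461-<String3>
EOF

result=$(in_window 19:15:00 19:20:00 "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Because no field is modified, the matching lines are printed exactly as they appear in the input.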

It may not be obvious, but a date/time representation with fixed-width fields in decreasing order of magnitude (like ISO 8601, %Y-%m-%dT%H:%M:%S) can simply be compared as strings, so '19:21:54.544824' gt '19:20' is true, while '19:15:44.544824' lt '19:15' is false.
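A quick way to check those comparisons, and to see why the fixed width matters, is a small awk wrapper. The function name sorts_at_or_after is invented for this sketch; the empty-string concatenation forces awk to compare as strings rather than numbers.

```shell
# Hypothetical helper: prints "true" if string $1 sorts at or after string $2.
sorts_at_or_after() {
  LC_ALL=C awk -v a="$1" -v b="$2" \
    'BEGIN { print ((a "") >= (b "") ? "true" : "false") }'
}

r1=$(sorts_at_or_after '19:21:54.544824' '19:20')  # later time, sorts after
r2=$(sorts_at_or_after '19:15:44.544824' '19:15')  # same prefix, longer string sorts after
r3=$(sorts_at_or_after '9:05:00' '19:15')          # not zero-padded: "9" sorts after "1"
r4=$(sorts_at_or_after '09:05:00' '19:15')         # zero-padded: sorts before, as expected

printf '%s %s %s %s\n' "$r1" "$r2" "$r3" "$r4"
```

The third case is the one to watch for: without zero padding, 9:05 sorts after 19:15 as a string even though it is earlier in the day, so string comparison is only safe when every field has a fixed width, as in the question's data.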

That means you can just use split to extract the time field and do literal string comparisons, like this:

use strict;
use warnings;

while (<DATA>) {
  my $time = (split /-/)[1];
  print if $time ge '19:15' and $time le '19:20';
}

__DATA__
2014/05/08-19:15:44.544824-<String1>
2014/05/08-19:21:54.544824-<String2>
2014/05/08-19:34:59.564461-<String3>

output

2014/05/08-19:15:44.544824-<String1>

Why all the complexity?

$ awk -F'[-.]' '"19:15:00"<=$2 && $2<="19:20:00"' file
2014/05/08-19:15:44.544824-<String1>

or, less readably but more efficiently, if the file is sorted by time:

$ awk -F'[-.]' '$2>"19:20:00"{exit} $2>="19:15:00"' file
2014/05/08-19:15:44.544824-<String1>
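The one-liners above take one fixed window. Since the question asks for consecutive windows until the end of the file, one possible extension is to label each line with the window it falls into in a single pass. This is only a sketch: the function name bucket_lines and the header format are made up, it assumes the file is sorted by time and covers a single day, and windows are aligned to multiples of the width counted from midnight.

```shell
# Hypothetical helper: print a header whenever a new time window starts.
# $1 = window length in seconds, $2 = input file (assumed sorted by time)
bucket_lines() {
  awk -F'[-.]' -v width="$1" '
    {
      # Field 2 is HH:MM:SS; convert it to seconds since midnight.
      split($2, p, ":")
      secs = p[1] * 3600 + p[2] * 60 + p[3]
      # Round down to the start of the window containing this line.
      start = secs - (secs % width)
      label = sprintf("%02d:%02d:%02d", int(start / 3600), int((start % 3600) / 60), start % 60)
      if (label != prev) { print "== window starting " label " =="; prev = label }
      print
    }
  ' "$2"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
2014/05/08-19:15:44.544824-<String1>
2014/05/08-19:21:54.544824-<String2>
2014/05/08-19:34:59.564461-<String3>
EOF

buckets=$(bucket_lines 300 "$sample")
printf '%s\n' "$buckets"
rm -f "$sample"
```

With a width of 300 seconds, the three sample lines land in the windows starting at 19:15:00, 19:20:00 and 19:30:00 respectively.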

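Using awk and comparing the time as a number. The time is copied into a variable first, because changing $2 directly would make awk rebuild the line with spaces as separators before printing it:

awk -F'[-.]' '{t=$2; gsub(/:/,"",t)} t+0>=191500 && t+0<=192000' file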
Licensed under: CC-BY-SA with attribution