Question

I am trying to write a bash script that picks particular data files out of a folder so I can make plots from them. At this point I am piping the output of ls -1 into grep with a regular expression to generate a list of file names. The following are the file names I am sorting through; the pattern continues. The names I would like to match are the ones with no extra extension after the number (ifrontThermal64.00490 and ifrontThermal64.00500), using the regex: ifrontThermal.\d+

ifrontThermal64.00490
ifrontThermal64.00490.HeI
ifrontThermal64.00490.HeII
ifrontThermal64.00490.HI
ifrontThermal64.00490.radFlux
ifrontThermal64.00490.radTens
ifrontThermal64.00490.u
ifrontThermal64.00490.uNoncool
ifrontThermal64.00500
ifrontThermal64.00500.HeI
ifrontThermal64.00500.HeII
ifrontThermal64.00500.HI
ifrontThermal64.00500.radFlux
ifrontThermal64.00500.radTens
ifrontThermal64.00500.u
ifrontThermal64.00500.uNoncool

These commands return nothing

$ (ls -1)|(grep ifrontThermal64.\d+)
$ (ls -1)|(grep ifrontThermal64\.\d+)
$ (ls -1)|(grep ifrontThermal64.[0-9]+)

These commands return what I expect, but not what I want.

 $ (ls -1)|(grep ifrontThermal64.)
 $ (ls -1)|(grep ifrontThermal64.[0-9])

When I test the three that don't work at http://regexpal.com/, they seem to be fine.

Thanks in advance for any help!


Solution

If the number suffixes of interest are of fixed length and all you care about is filtering out the files that have an additional extension, the following glob (NOT a regex, but a wildcard expression) will do:

ifrontThermal64.[0-9][0-9][0-9][0-9][0-9]

E.g.:

printf "%s\n" ifrontThermal64.[0-9][0-9][0-9][0-9][0-9]

Note that globs always match against the entire filename, whereas grep performs substring matching by default.
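
The difference between whole-name glob matching and grep's substring matching can be seen in a small experiment. This is only a sketch: it assumes a scratch directory created with mktemp and uses a handful of the file names from the question.

```shell
#!/usr/bin/env bash
# Scratch directory holding a few of the file names from the question.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch ifrontThermal64.00490 ifrontThermal64.00490.HeI ifrontThermal64.00490.u \
      ifrontThermal64.00500 ifrontThermal64.00500.radFlux

# The glob must match the whole name, so names with an extra extension are excluded.
glob_matches=$(printf '%s\n' ifrontThermal64.[0-9][0-9][0-9][0-9][0-9])

# An unanchored grep matches substrings, so every name gets through.
grep_matches=$(printf '%s\n' ifrontThermal64.* | grep 'ifrontThermal64\.[0-9]')

printf 'glob:\n%s\n\ngrep:\n%s\n' "$glob_matches" "$grep_matches"
```

The glob output should contain only the two extension-less names, while the grep output lists all five files.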

As for why your approach didn't work:

  • Your regex isn't quoted, so the shell's parsing 'eats' the \, thereby altering it.
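  • Also, whether grep recognizes \d is platform-dependent: it is not part of POSIX basic or extended regular expressions, and GNU grep, for example, supports it only with -P. To rule out such issues, use [0-9] instead.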
  • If you use grep without -E, it uses so-called basic regular expressions, in which an unescaped + is a literal character. GNU grep accepts \+ as the quantifier there, but that is an extension; the generally better option is to use grep -E (or egrep, an older spelling that recent GNU grep versions flag as obsolescent) in order to use extended regexes, which mostly behave like regular expressions in other languages.
  • ., when intended to be a literal, should be \-escaped in a regex (which you did in one of your attempts).
  • The -1 option of ls is implied when ls is not outputting to a terminal.
  • grep uses substring matching by default, so use -x to match against the entire input line (alternatively, use the anchors ^ and $) so as to rule out filenames that match the expression but have an additional extension.
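
The points above can be checked one at a time. The following is a minimal sketch that feeds three of the file names from the question to grep via printf instead of ls; the variable names are just for illustration.

```shell
#!/usr/bin/env bash
# Sample input: three of the file names from the question, one per line.
names=$(printf '%s\n' ifrontThermal64.00490 ifrontThermal64.00490.HeI ifrontThermal64.00500)

# Unquoted, the shell removes the backslashes before the command sees the pattern.
unquoted=$(printf '%s' ifrontThermal64\.\d+)   # becomes: ifrontThermal64.d+

# Basic regex (no -E): "+" is a literal plus sign, so nothing matches.
bre=$(printf '%s\n' "$names" | grep 'ifrontThermal64\.[0-9]+' || true)

# Extended regex, unanchored: substring matching also lets the .HeI name through.
ere=$(printf '%s\n' "$names" | grep -E 'ifrontThermal64\.[0-9]+')

# Extended regex with -x: the whole line must match.
ere_x=$(printf '%s\n' "$names" | grep -Ex 'ifrontThermal64\.[0-9]+')

printf 'unquoted pattern: %s\n' "$unquoted"
printf 'BRE matches: [%s]\n' "$bre"
printf 'ERE matches:\n%s\n' "$ere"
printf 'ERE -x matches:\n%s\n' "$ere_x"
```

Only the last variant keeps the two extension-less names and drops the rest.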

Thus, a corrected version of the original command is:

 ls | grep -Ex 'ifrontThermal64\.[0-9]+'
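
To go from the file list to the plots, the corrected pipeline can feed a loop. This is a sketch under assumptions: plot is a hypothetical stand-in for whatever plotting command the script really calls, plot_snapshots is an illustrative name, and grep -Ex is used as the equivalent of egrep -x.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real plotting command; replace with your own.
plot() { printf 'plotting %s\n' "$1"; }

# Plot every file in a directory whose whole name is ifrontThermal64.<digits>.
plot_snapshots() {
    local dir=${1:-.}
    ls "$dir" | grep -Ex 'ifrontThermal64\.[0-9]+' |
        while IFS= read -r name; do
            plot "$dir/$name"
        done
}

plot_snapshots .
```

Reading names line by line works for file names like these; names containing newlines would need a different approach, such as looping over the glob directly.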

As an aside: there's no point in enclosing your commands in parentheses; you'll needlessly create subshells (unless they're optimized away, but the point is that they're not needed).

Other tips

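You can use the following regex with egrep (or grep -E), which uses extended regular expressions. Note that \d is not part of extended regular expressions, so [0-9] is used instead, and the literal dot is escaped:

ifrontThermal64\.[0-9]+$

The $ is an anchor identifying the end of a line.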

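In your case, there is no need for the ls -> egrep -> plot chain. You can use find directly. Note that -name takes a glob, not a regex, and matches against the whole file name; the pattern below assumes five-digit suffixes as in your listing:

find . -name 'ifrontThermal64.[0-9][0-9][0-9][0-9][0-9]' -exec plot {} \;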
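
To try the find approach before wiring in the real plotting command, a sketch like the following may help. Here echo stands in for plot, find_snapshots is a hypothetical helper name, and the five-digit glob assumes suffixes like those in the listing (find's -name test takes a glob, not a regex).

```shell
#!/usr/bin/env bash
# List extension-less snapshot files under a directory (searched recursively).
# -name takes a glob that must match the whole base name of each file.
find_snapshots() {
    find "${1:-.}" -type f \
        -name 'ifrontThermal64.[0-9][0-9][0-9][0-9][0-9]' \
        -exec echo {} \;   # echo is a stand-in for the real plot command
}

find_snapshots .
```

Unlike ls, find does not sort its output, so pipe it through sort if the plotting order matters.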
Licensed under: CC-BY-SA with attribution