Question

I have lots of strings in a text file, like this:

"/home/mossen/Desktop/jeff's project/Results/FCCY.png"
"/tmp/accept/FLWS14UU.png"
"/home/tten/Desktop/.wordi/STSMLC.png"

I want to get only the file names from the string as I read the text file line by line, using a bash shell script. The file name will always end in .png and will always have the "/" in front of it. I can get each string into a var, but what is the best way to extract the filenames (FCCY.png, FLWS14UU.png, etc.) into vars? I can't count on the user having Perl, Python, etc, just the standard Unix utils such as awk and sed.

Thanks,
mossen

Solution

You want basename:

$ basename /tmp/accept/FLWS14UU.png
FLWS14UU.png
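
In a script the result is usually captured with command substitution. A minimal sketch, where the variable names path, name and stem are illustrative and not from the question; quoting the argument keeps the space in "jeff's project" intact, and the optional second argument to basename strips a trailing suffix:

```shell
# Hypothetical variable names; the path is one of the samples from the question.
path="/home/mossen/Desktop/jeff's project/Results/FCCY.png"

# Quote the argument so the space in the directory name is not word-split.
name=$(basename "$path")

# A second argument removes that suffix from the result.
stem=$(basename "$path" .png)

printf '%s\n' "$name" "$stem"
```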

OTHER TIPS

basename works on one file or string at a time. If you have many strings, you will be iterating over the file and calling an external command once per line.

Use awk:

$ awk -F'[/"]' '{print $(NF-1)}' file
FCCY.png
FLWS14UU.png
STSMLC.png
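
sed, which the question also mentions, can do the same in a single process. This is only a sketch: it assumes every line ends with a closing double quote, as in the sample, and the temporary file and the variable names sample and names are made up for illustration:

```shell
# Throwaway sample file in the question's format (the temp file is illustrative).
sample=$(mktemp)
cat > "$sample" <<'EOF'
"/home/mossen/Desktop/jeff's project/Results/FCCY.png"
"/tmp/accept/FLWS14UU.png"
"/home/tten/Desktop/.wordi/STSMLC.png"
EOF

# First expression drops the closing quote;
# the second deletes everything up to and including the last "/".
names=$(sed -e 's/"$//' -e 's|.*/||' "$sample")

printf '%s\n' "$names"
rm -f "$sample"
```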

Or use the shell:

while read -r line
do
    line=${line##*/}
    echo "${line%\"}"
done <"file"

# Note: ${list} is word-split, so this only works for paths without whitespace.
newlist=$(for file in ${list}; do basename "${file}"; done)

$ var="/home/mossen/Desktop/jeff's project/Results/FCCY.png"
$ file="${var##*/}"
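
For reference, ## removes the longest matching prefix and % removes the shortest matching suffix, so the same family of expansions can also give the directory or the name without its extension. A small sketch; the names dir and stem are illustrative additions, not part of the original answer:

```shell
var="/home/mossen/Desktop/jeff's project/Results/FCCY.png"

file="${var##*/}"    # longest prefix matching "*/" removed: the file name
dir="${var%/*}"      # shortest suffix matching "/*" removed: the directory
stem="${file%.png}"  # trailing ".png" removed: the name without extension

printf '%s\n' "$file" "$dir" "$stem"
```

These expansions are handled by the shell itself, so no extra process is started.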

Calling basename in a loop carries a large performance cost. It is unnoticeable for a file or two but adds up over hundreds of them. Here are some timing tests to show why calling basename (or any external utility) is a poor choice when a feature built into the shell can do the job. Dennis and ghostdog74 gave you the more experienced Bash answers.

Sample input, files.txt (a list of my pictures with full paths): 3749 entries

external.sh

while read -r line
do
  line=$(basename "${line}")
  echo "${line%\"}"
done < "files.txt"

internal.sh

while read -r line
do
  line=${line##*/}
  echo "${line%\"}"
done < "files.txt"

Timed results, with output redirected to /dev/null so that terminal rendering does not skew the numbers:

$ time sh external.sh 1>/dev/null 

real   0m4.135s
user   0m1.142s
sys    0m2.308s

$ time sh internal.sh 1>/dev/null 

real   0m0.413s
user   0m0.357s
sys    0m0.021s

The output of both is identical, so diff prints nothing:

$ sh external.sh | sort > result1.txt
$ sh internal.sh | sort > result2.txt
$ diff -uN result1.txt result2.txt

As the timings show, the version that calls basename took about ten times longer (4.135s versus 0.413s of real time). Avoid calling external utilities when the shell's own features, such as parameter expansion, can do the same job, especially inside a loop that runs many times.
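
To try the comparison without a real list of pictures, both loops can be wrapped in functions and run against generated input. This is a rough sketch under stated assumptions: the 200 generated paths, the temp file, and the function names external and internal are invented here, and it only checks that the two approaches agree rather than reproducing the timings above:

```shell
# Generate a hypothetical 200-line input in the question's format.
list=$(mktemp)
i=0
while [ "$i" -lt 200 ]; do
  printf '"/tmp/some dir/IMG%s.png"\n' "$i"
  i=$((i + 1))
done > "$list"

external() {  # starts one basename process per line
  while read -r line; do
    line=$(basename "${line}")
    echo "${line%\"}"
  done < "$list"
}

internal() {  # parameter expansion only, no extra processes
  while read -r line; do
    line=${line##*/}
    echo "${line%\"}"
  done < "$list"
}

out_external=$(external)
out_internal=$(internal)
[ "$out_external" = "$out_internal" ] && echo "outputs match"
rm -f "$list"
```

In bash, prefixing each call with the time keyword (for example, time external >/dev/null, run before the temp file is removed) should show the gap, though the exact numbers will depend on the machine.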

Licensed under: CC-BY-SA with attribution