Question

I am trying to understand the differences between these two similar commands.

aa=$(foo | bar | head -1)
read aa < <(foo | bar | head -1)
  • I know that <() requires #!/bin/bash, but does that make it slower?
  • Do they create the same number of subshells?
  • Do they require the same number of bash or sh processes?

I am looking to use the command with the best performance.


Solution

  1. No. Bash is faster in non-POSIX mode.
  2. It depends. This is both an implementation-specific and platform-specific detail.
    • In Bash, both require the same number of processes. If lastpipe is not enabled, there is a process for each pipeline element plus a subshell for either substitution plus the parent process.
    • In Bash, if lastpipe is enabled, the last element of the pipeline will exec without forking in both cases, still requiring the same number of processes.
    • In ksh93, both should require the same number of processes in this case, but if the last pipeline element were a builtin, it would run in the process of the parent for the command substitution, making it even faster.
    • In both Bash and ksh93, if the shell is compiled on a system that does not support /dev/fd/*, the shell will create named pipes for process substitutions instead. This likely affects performance.
  3. The bullets under the previous item arguably belong here. Note that "subshell" doesn't necessarily imply a separate process, though in almost all shells it does (the exception being $(<...), which avoids a separate process in every shell that supports it except Bash). In mksh and ksh93, there's also the ${ ;} style command substitution, but each shell implements this differently. In ksh93, it may or may not give a speedup. In mksh, probably not. mksh doesn't support process substitutions, and zsh doesn't support (and has no way of simulating) BASHPID, so I haven't looked into it.
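The process counts described above can be checked with BASHPID. The following is a minimal sketch, assuming Bash 4.2 or later run as a script (so job control is off, which lastpipe needs); the variable names are made up for illustration.

```bash
#!/usr/bin/env bash
# BASHPID reports the PID of the process actually running the code,
# whereas $$ keeps the parent's PID even inside subshells.
parent=$BASHPID

# Command substitution: the body runs in a forked subshell.
in_cmdsub=$(echo "$BASHPID")

# Process substitution: the body also runs in a separate process,
# but read itself runs in the parent shell.
read -r in_procsub < <(echo "$BASHPID")

# With lastpipe, the last pipeline element runs in the current shell,
# so a variable set by read survives the pipeline.
shopt -s lastpipe
printf 'value\n' | read -r last

echo "parent=$parent cmdsub=$in_cmdsub procsub=$in_procsub last=$last"
```

Both substitutions should report a PID different from the parent's, which matches the claim that neither form saves a process in Bash.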

There is nothing intrinsically faster about a command substitution than a process substitution in Bash, but the head is redundant in the read case, since read consumes only a single line anyway. As an aside, always use head -n ...; the -1 form is not portable. Also, don't use read without -r unless you want the shell to mangle the input (without -r, backslashes are treated as escape characters).
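A small sketch of those three points, assuming Bash. The producer function is a hypothetical stand-in for the question's foo | bar pipeline, and its first line contains a literal backslash to show what read does without -r.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for `foo | bar`: prints two lines, the first
# containing a literal backslash followed by "t" (not a tab).
producer() { printf '%s\n' 'first\tline' 'second line'; }

# Command substitution: head -n 1 keeps only the first line.
aa=$(producer | head -n 1)

# Process substitution: read stops at the first newline, so head is
# unnecessary. IFS= preserves surrounding whitespace; -r keeps
# backslashes literal.
IFS= read -r bb < <(producer)

# Without -r, read treats the backslash as an escape character and
# removes it from the stored value.
IFS= read cc < <(producer)

printf 'aa=%s\nbb=%s\ncc=%s\n' "$aa" "$bb" "$cc"
```

Here aa and bb hold the first line unchanged, while cc loses its backslash.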

OTHER TIPS

The best way to improve performance here is to get rid of forks and pipes as much as you can.

For all intents and purposes, you should not worry about performance issues as stated. 99% of execution time is likely to be determined by the particular commands, not the difference in process-substitution versus command-substitution. Do you know the first law of optimization? Don't. Especially if you're sacrificing portability. Use $(whatever) and forget everything else. If you really do worry about performance, it is the commands/pipes/forks you need to address. Otherwise you're trying to lose weight by squeezing tears from your eyes.

Benchmarking with Bash's built-in time, the first form is slower than the second.

You can test it yourself with:

bash -c 'time PIPELINE...'
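A slightly fuller version of that test might look like the sketch below. It assumes Bash; the pipeline function is a hypothetical placeholder for the real commands, and the loop count is arbitrary. Timings vary by system and load, so no expected numbers are given.

```bash
#!/usr/bin/env bash
# Hypothetical placeholder for the real `foo | bar` pipeline.
pipeline() { printf 'one\ntwo\n' | cat; }

n=100   # arbitrary iteration count; raise it for steadier numbers

# `time` is a Bash keyword, so it can time a whole loop.
echo "command substitution:"
time for ((i = 0; i < n; i++)); do
    aa=$(pipeline | head -n 1)
done

echo "process substitution:"
time for ((i = 0; i < n; i++)); do
    read -r bb < <(pipeline | head -n 1)
done
```

Repeating each form many times helps average out noise; a single run of either command is usually too short to compare meaningfully.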

Both create subshells -- with the shell reading and expanding the output of a subshell in the first case, and the shell's read builtin reading from a background process in the second.

See:

Process substitution bypasses subshells created by pipelines/command substitution. The substitution syntax is replaced by the name of a FIFO or FD, and the command inside it is run in the background. The substitution is performed at the same time as parameter expansion and command substitution.
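To illustrate the quoted description, here is a minimal sketch assuming Bash with lastpipe left at its default (off). The variable names are made up for illustration.

```bash
#!/usr/bin/env bash
# The substitution expands to a file name, typically under /dev/fd
# on systems that support it, or a named pipe otherwise.
echo <(true)

# Pipeline: the while loop runs in a subshell, so the parent's
# count is unchanged after the loop.
count=0
printf 'a\nb\n' | while read -r line; do count=$((count + 1)); done
piped=$count

# Process substitution: the loop runs in the current shell and only
# the producer runs in the background, so count keeps its value.
count=0
while read -r line; do count=$((count + 1)); done < <(printf 'a\nb\n')
redirected=$count

echo "piped=$piped redirected=$redirected"
```

The loop fed by a pipe leaves the counter at 0, while the loop fed by a process substitution leaves it at 2.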

Check out information on process substitution used with "tee" as well.

http://mywiki.wooledge.org/ProcessSubstitution

Licensed under: CC-BY-SA with attribution