I'm curious about the functional differences between two while loops in Bash that I've come across. Here are the two snippets:
1. `find /path/ -type f -name "file.pdf" | while read -r file; do echo "$file"; done`
2. `while read -r file; do echo "$file"; done < <(find /path/ -type f -name "file.pdf")`
Are there any key distinctions between how they operate, especially concerning their environments or contexts?
3 Answers
I had an epiphany while using these loops!
If you want to count lines across several files:
(A) If you're okay with just printing the line count for each file, then piping works great.
(B) But if you need to total all those counts in a variable, a pipe will cause trouble: the `while` loop on the receiving end of the pipe runs in a subshell, so anything it adds to the variable is lost when the loop ends. For that, you should use the redirection method. So the choice depends on whether you need to access variables set inside the loop from the main shell afterwards.
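Here is a minimal sketch of case (B) that runs both forms side by side. The temporary directory, the file names `one.txt` and `two.txt`, and the variable names `piped_total` and `total` are made up for the demo.

```bash
#!/usr/bin/env bash
# Hypothetical demo data: two small files in a throwaway directory.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"      # 2 lines
printf 'c\nd\ne\n' > "$tmpdir/two.txt"   # 3 lines

# Pipe: the loop runs in a subshell, so the running sum is lost afterwards.
piped_total=0
find "$tmpdir" -type f -name '*.txt' | while IFS= read -r file; do
    piped_total=$(( piped_total + $(wc -l < "$file") ))
done

# Redirection: the loop runs in the current shell, so the sum survives.
total=0
while IFS= read -r file; do
    total=$(( total + $(wc -l < "$file") ))
done < <(find "$tmpdir" -type f -name '*.txt')

echo "piped: $piped_total, redirected: $total"
rm -rf "$tmpdir"
```

With default Bash settings this should print `piped: 0, redirected: 5`.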
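Just a heads up: the second snippet uses process substitution, which is a Bash feature and not part of POSIX `sh`. On systems that provide `/dev/fd` (Linux included), `<(command)` expands to a path like `/dev/fd/63` that refers to an ordinary pipe, so nothing is written to disk. Only on systems without `/dev/fd` does Bash fall back to a named pipe (FIFO) in a temporary directory, and that directory has to be writable. So you're good as long as you're running Bash on a Linux system.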
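If you want to check what `<(command)` turns into on your own system, you can print the path Bash substitutes for it. This is only a quick sketch, and the variable name `path` is made up for the demo.

```bash
#!/usr/bin/env bash
# Bash replaces <(...) with a file name; echo shows that name instead of reading it.
path=$(echo <(true))
echo "$path"
```

On Linux this typically shows a `/dev/fd/N` path. A path under a temporary directory would suggest the named-pipe fallback is in use.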
The big difference is in how these loops run. In the first snippet, each command in the pipeline, including the `while` loop, runs in its own subshell. This means that if the loop changes a variable (like setting `a`), the change is not visible in the main script once the loop finishes. For example:
```bash
a=nil
find /path/ -type f -name "file.pdf" | while read -r file; do a=$file; done
echo "$a"
```
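With Bash's default settings this prints `nil`, because `a` is being set in a subshell, not the main script. (The exception is a script that enables `shopt -s lastpipe`, which makes Bash run the last pipeline command in the current shell.)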
In the second snippet, the while loop runs in the main script's context, so it can properly set `a`, and `a` ends up holding the last file found by `find`. Note that `find` still runs in a separate process either way, so any speed difference is usually negligible. The real benefit is that variables set in the loop survive. There's more about this in [BashFAQ/024](https://mywiki.wooledge.org/BashFAQ/024).
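For comparison, here is a sketch of the same experiment using the redirection form. It creates a throwaway directory containing a single `file.pdf` so it can be run anywhere; that directory is an assumption for the demo, standing in for `/path/`.

```bash
#!/usr/bin/env bash
# Hypothetical demo data: one file.pdf in a throwaway directory.
tmpdir=$(mktemp -d)
touch "$tmpdir/file.pdf"

a=nil
# The loop reads from the process substitution but runs in the current shell.
while IFS= read -r file; do
    a=$file
done < <(find "$tmpdir" -type f -name "file.pdf")

echo "$a"   # the path to file.pdf, not "nil"
rm -rf "$tmpdir"
```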
I totally get this! It's so frustrating when you forget about that subshell thing. The second loop feels way more straightforward too.
Definitely caught me off guard a few times! Also, the efficiency of the second loop is a big plus.