
How can I optimize my Bash script that uses ‘find’ for processing large file lists?

January 10, 2026

Asked By CuriousCoder123 On January 10, 2026

I'm working on a Bash script that processes a huge number of files, specifically those named "output*.txt". The script currently removes an old result file and then uses the 'find' command to run a series of commands on each file. My goal is to extract the sixth-to-last and third-to-last lines from each file and grab the second column from those lines (which is always an integer). However, the current approach starts a new shell for each file, which seems to slow things down. I'm curious whether I could improve performance by loading the file list into a variable, or by using a loop instead of 'find -exec'. It only takes about a minute to run, but I'm looking for ways to streamline the process. Here's the relevant script:

#!/usr/bin/bash
rm -fv plot.dat
find . -iname "output*.txt" -exec sh -c '
BASE=$(tail -6 < {} | head -n 1 | cut -d " " -f 2)
FAKE=$(tail -3 < {} | head -n 1 | cut -d " " -f 2)
echo "$BASE $FAKE" >> plot.dat
' {} \;
sort -k1 -n < plot.dat
echo "All done"

3 Answers

Answered By TailGuru88 On January 11, 2026

I suggest letting tail read the files directly instead of piping the input from stdin. Also, if the order of processing doesn't matter, you could explore `xargs` or `GNU parallel` to speed things up. They can both handle multiple files at once and potentially cut down your runtime.
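
As a rough sketch of the `xargs` route, assuming a `find`/`xargs` pair that supports `-print0`, `-0`, and `-P` (GNU and BSD versions do), something like the following starts one shell per batch of files rather than one per file, and lets `tail` open each file itself. The scratch directory, the sample files, the batch size of 100, and the worker count of 4 are illustrative choices, not part of the original script.

```shell
# Scratch directory with two hypothetical sample files (8 lines each).
demo=$(mktemp -d)
for i in 1 2; do
  for n in 1 2 3 4 5 6 7 8; do
    echo "row $((i * 100 + n))"
  done > "$demo/output$i.txt"
done

# find emits NUL-terminated names; xargs runs up to 4 workers (-P 4),
# each a single sh that loops over a batch of up to 100 files (-n 100).
# tail reads each file by name instead of from stdin.
find "$demo" -iname 'output*.txt' -print0 |
  xargs -0 -n 100 -P 4 sh -c '
    for f in "$@"; do
      tail -n 6 "$f" | awk "NR == 1 { base = \$2 } NR == 4 { print base, \$2 }"
    done
  ' sh |
  sort -k1,1n > "$demo/plot.dat"

cat "$demo/plot.dat"
```

Because the workers finish in no particular order, the final `sort` is what makes the result deterministic. The sketch also assumes every file has at least six lines.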

Answered By LoopLover42 On January 10, 2026

Why not just store the output of the find command in a variable? You could do something like this:

`files=$(find . -name "output*.txt")`

And then process each file with a for loop:

`for file in $files; do ... done`

This avoids starting a new shell for each file. Note that the unquoted `$files` is split on whitespace, so it only works if no file name contains spaces.
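
A loop can also read the names one line at a time instead of going through a variable, which tolerates spaces in file names. The following is only a sketch under a few assumptions: the scratch directory and the two sample files are made up for the demo, and no file name contains a newline.

```shell
# Hypothetical sample files; one name contains a space on purpose.
demo=$(mktemp -d)
i=0
for name in "output a.txt" "output_b.txt"; do
  i=$((i + 1))
  for n in 1 2 3 4 5 6 7; do
    echo "row $((i * 10 + n))"
  done > "$demo/$name"
done

# Read one path per line; IFS= and -r keep spaces and backslashes intact.
# The loop body runs in the same shell for every file.
find "$demo" -name 'output*.txt' |
  while IFS= read -r file; do
    # Line 1 of the last six is the sixth-to-last; line 4 is the third-to-last.
    tail -n 6 "$file" | awk 'NR == 1 { base = $2 } NR == 4 { print base, $2 }'
  done > "$demo/plot.dat"

sort -k1,1n "$demo/plot.dat"
```

The loop itself no longer spawns a shell per file, but each iteration still starts `tail` and `awk`, so the gain over `find -exec sh -c` may be modest on very large file sets.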

Answered By ScriptingNerd79 On January 10, 2026

You can streamline the process by combining 'find' and 'awk'. Instead of spawning a shell for every file, try this approach:

`find . -name "output*.txt" -exec tail -q -n 6 {} + |
awk 'NR % 6 == 1 { base_val=$2 } NR % 6 == 4 { print base_val, $2 }' >> plot.dat`
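
With `{} +`, find passes many files to each `tail` invocation, so no shell is started per file. The `-q` option suppresses the per-file headers, so `awk` sees a continuous stream of six lines per file: line 1 of each group is the sixth-to-last line and line 4 is the third-to-last. This relies on every file having at least six lines.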
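
To see how the `NR % 6` grouping lines up, here is a small worked example on two made-up seven-line files. The scratch directory and the file contents are assumptions for the demo; the `tail` and `awk` commands are the ones from the answer above, with the files passed in a fixed order so the output is predictable.

```shell
# Two hypothetical 7-line files; column 2 is the integer of interest.
demo=$(mktemp -d)
for i in 1 2; do
  for n in 1 2 3 4 5 6 7; do
    echo "row $((i * 10 + n))"
  done > "$demo/output$i.txt"
done

# tail -q emits the last six lines of each file back to back, so
# stream lines 1-6 belong to the first file and 7-12 to the second.
# NR % 6 == 1 is the sixth-to-last line of a file,
# NR % 6 == 4 is the third-to-last line of the same file.
tail -q -n 6 "$demo/output1.txt" "$demo/output2.txt" |
  awk 'NR % 6 == 1 { base_val = $2 }
       NR % 6 == 4 { print base_val, $2 }' > "$demo/plot.dat"

cat "$demo/plot.dat"
# 12 15
# 22 25
```

A file with fewer than six lines would shift every later group, so it may be worth filtering such files out first if they can occur.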
