How can I run a command in Bash for 25,000 files?

Asked By BlueSky4u2

I'm trying to use an executable called `process_data` to handle some data files. For a few files, I can run it directly: `process_data --foo "bar" /path/to/file1 /path/to/file2 /path/to/file3 &> process.log`. This works well, but I have about 25,000 files to process at once, and the command line exceeds the system's maximum argument-list length. I tried `find ..../data/ -path '*subfolder_*' -name '*.dat' -print0 | xargs -0 process_data --foo "bar" &> process.log`, but that didn't work either because of how `process_data` is designed: I need to provide all file locations in a single invocation. I suspect `xargs` is splitting the argument list across several separate runs of `process_data`, which would explain why files from `subfolder_z` seem to be missing. Any ideas on how to run `process_data` with this many files?
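For reference, the limit I think I'm hitting can be inspected with a standard POSIX utility; this is a generic check and has nothing to do with `process_data` itself:

```bash
# Print the maximum combined size (in bytes) of the argument list
# plus environment that exec() accepts; exceeding it produces
# "Argument list too long" (E2BIG).
getconf ARG_MAX
```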

2 Answers

Answered By TechieGal99

If `process_data` can process files independently, a workaround is to write a small script that divides the 25,000 files into batches of 100 and runs `process_data` once per batch. That way you can handle all the files without any single invocation exceeding the command-line length limit.
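A minimal sketch of that approach, assuming `process_data` gives correct results when the file set is split across invocations (`/path/to/data` stands in for the real data directory from the question):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Feed process_data at most 100 files per invocation; the single
# redirection on xargs collects the output of every batch.
find /path/to/data -path '*subfolder_*' -name '*.dat' -print0 \
  | xargs -0 -n 100 process_data --foo "bar" > process.log 2>&1
```

The `-n 100` flag caps each invocation at 100 file arguments, so no single command line comes anywhere near `ARG_MAX`; `-0` pairs with `find -print0` so file names containing spaces or newlines survive intact.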

Answered By WittyCoder79

Have you considered modifying `process_data` to accept its input files via standard input instead of command-line arguments? That would sidestep the argument-length limit entirely, since the file names would travel over a pipe rather than through `argv`.
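If the tool can be changed that way, the shell side becomes trivial; a hypothetical sketch, assuming the modified `process_data` reads NUL-delimited file names from standard input when none are passed as arguments:

```bash
# File names travel over the pipe instead of through argv, so
# ARG_MAX never comes into play, no matter how many files match.
find /path/to/data -path '*subfolder_*' -name '*.dat' -print0 \
  | process_data --foo "bar" > process.log 2>&1
```

This also satisfies the "all files at once" requirement, since a single `process_data` process sees the entire list.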
