I'm looking to create a program that recursively copies files while showing progress, including bytes copied and bytes remaining. I plan to first determine the total size by walking the entire directory tree. To avoid repeating those system calls during the copy, I want to keep the tree structure in memory for size calculations. However, I'm concerned that with a massive number of files the memory usage could be significant: if each node in the tree takes about 200 bytes, then 10 million files would use around 2 GB of memory, which seems excessive. Are there more efficient approaches? What's typically done in such cases?
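For context, here's a rough sketch (Go) of the sizing pass I have in mind; the function name is mine and error handling is minimal. It accumulates only running totals, and my memory question is about what happens if I additionally retain a node per file:

```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
)

// totals walks the tree once, accumulating a file count and total
// byte size. Keeping a node per file (path + size) on top of this
// is what would drive the ~200 bytes/file memory cost.
func totals(root string) (files, bytes int64, err error) {
	err = filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		info, err := d.Info() // one stat-like call per entry
		if err != nil {
			return err
		}
		files++
		bytes += info.Size()
		return nil
	})
	return files, bytes, err
}

func main() {
	files, bytes, err := totals(".")
	if err != nil {
		fmt.Println("walk error:", err)
		return
	}
	fmt.Printf("%d files, %d bytes total\n", files, bytes)
}
```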
1 Answer
Consider using a queue of a reasonable, bounded size to decouple directory enumeration from the actual file copying. That way you can update progress in the user interface without holding up the copy itself. While enumeration is still running you only know a lower bound on the remaining work, so you can display 'more than X files remaining,' which still improves the user experience.
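Here's a minimal sketch of that decoupling in Go, using a buffered channel as the bounded queue. The capacity, the `src`/`dst` paths, and the helper names are illustrative, and the copy step is deliberately bare-bones (it flattens the tree for brevity):

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// job describes one file the enumerator has found.
type job struct {
	path string
	size int64
}

func main() {
	// Bounded queue: the enumerator blocks once 1024 jobs are
	// waiting, so it never runs unboundedly ahead of the copier.
	queue := make(chan job, 1024)

	// Producer: enumerate the tree and feed the queue.
	go func() {
		defer close(queue)
		err := filepath.WalkDir("src", func(path string, d fs.DirEntry, err error) error {
			if err != nil || d.IsDir() {
				return err
			}
			info, ierr := d.Info()
			if ierr != nil {
				return ierr
			}
			queue <- job{path: path, size: info.Size()}
			return nil
		})
		if err != nil {
			fmt.Println("walk error:", err)
		}
	}()

	// Consumer: copy files as they arrive and report progress.
	// len(queue) is only what's buffered right now, i.e. a lower
	// bound on remaining work while enumeration is still running.
	var copied int64
	for j := range queue {
		if err := copyFile(j.path, filepath.Join("dst", filepath.Base(j.path))); err != nil {
			fmt.Println("copy error:", err)
			continue
		}
		copied += j.size
		fmt.Printf("copied %d bytes; more than %d files still queued\n", copied, len(queue))
	}
}

// copyFile is a bare-bones copy helper for the sketch.
func copyFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()
	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer out.Close()
	_, err = io.Copy(out, in)
	return err
}
```

The channel capacity is the main knob: a larger buffer lets enumeration run further ahead and smooths out bursts of small files, while a smaller one keeps the two stages in closer step.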

Decoupling sounds smart! But does that mean copying has to start while the traversal is still running? I suppose the queue would limit how many files are pending copy at once.