
Make MC faster at copying files within one HDD: add a large buffer option #2193

Closed
mc-butler opened this issue May 13, 2010 · 21 comments
Assignees
Labels
area: core Issues not related to a specific subsystem prio: medium Has the potential to affect progress
Milestone

Comments

Important

This issue was migrated from Trac:

Origin https://midnight-commander.org/ticket/2193
Reporter birdie (aros@….com)
Mentions gotar@….pl, powerman-asdf@….ru

Currently MC uses the same small buffer (64K) for all copy operations, regardless of their source and destination.

This causes the following problem: when you copy a small file within one physical HDD, the HDD spends a good chunk of time repositioning its heads to read each tiny portion of data.

I propose to implement a new Copy File(s) dialog option:

[X] Use large buffers

where copy_large_buffer can be defined as an option of the mc.ini file, with a default value of 64MB (it's quite sane for modern PCs).
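To make the proposal concrete, here is a minimal sketch of such a copy loop (Python, for illustration only — mc itself is C, and both the option name and the 64MB value are the reporter's suggestion, not existing mc settings):

```python
# Illustrative sketch of the proposed option: the same read/write loop mc
# already uses, but with a user-selectable buffer size. Constants mirror
# the values discussed in this ticket; nothing here is actual mc code.
DEFAULT_BUFFER = 64 * 1024          # mc's current 64K buffer
LARGE_BUFFER = 64 * 1024 * 1024     # the proposed 64MB "large buffer"

def copy_file(src, dst, use_large_buffers=False):
    bufsize = LARGE_BUFFER if use_large_buffers else DEFAULT_BUFFER
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(bufsize)
            if not chunk:
                break
            fout.write(chunk)
```

With a large buffer, the drive reads a long contiguous run from the source before seeking to the destination, instead of seeking back and forth every 64K.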


Changed by birdie (aros@….com) on May 13, 2010 at 10:51 UTC (comment 1)

PS This option also applies to the "Move file(s)" operation when the destination is the same HDD but a different partition.


Changed by ossi (@ossilator) on May 13, 2010 at 12:08 UTC (comment 2)

this should not be a visible option, as it bothers the user with internal stuff.

mc should feel free to allocate as much buffer memory as it wants as long as it is not an excessive amount of the system's total physical memory (exact formula to be determined). sane allocators will actually return such big allocations to the system when they are freed, so the huge peak memory usage is of no concern.

one concern of huge buffers is abortability and accurate progress information. therefore the algorithm should start with some conservative chunk size (default determined by the media type) and adaptively adjust it so that processing each chunk takes about 200ms or so. for media with wildly differing bandwidths (e.g., hdd vs. ftp over dsl), the chunk sizes for reading and writing could also differ significantly. note that the finer chunking does not imply that each read is followed by one write - to achieve the optimization suggested by birdie, one would simply accumulate, say, sixteen 4mb chunks before writing. determining whether the source and destination live on the same physical media, and thus whether a higher interleaving factor should be used, is a bit of a challenge, though.
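The adaptive sizing described above can be sketched as follows (illustration only, assuming invented names and limits; the 200ms target is the one from the comment):

```python
# Sketch of adaptive chunk sizing: scale the chunk so the next read/write
# should take roughly TARGET_SECONDS, clamped to sane bounds. All names,
# bounds, and the scaling rule are illustrative, not mc code.
MIN_CHUNK = 64 * 1024            # conservative starting point
MAX_CHUNK = 64 * 1024 * 1024     # upper bound to keep memory use bounded
TARGET_SECONDS = 0.2             # ~200ms per chunk, per the comment above

def next_chunk_size(current, elapsed):
    """Given how long the last chunk took, pick the next chunk size."""
    if elapsed <= 0:
        return min(current * 2, MAX_CHUNK)
    scaled = int(current * (TARGET_SECONDS / elapsed))
    return max(MIN_CHUNK, min(scaled, MAX_CHUNK))
```

Keeping each chunk near the time target preserves abortability and smooth progress reporting even when the underlying media speed varies wildly.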


Changed by birdie (aros@….com) on May 13, 2010 at 13:11 UTC (comment 2.3)

Replying to ossi:

determining whether the source and destination live on the same physical media and thus whether a higher interleaving factor should be used is a bit of a challenge, though.

That's why this time a user-selectable option seems like a good way to go :) I'm not sure whether the POSIX API allows determining if the source and destination reside on the same media.


Changed by ossi (@ossilator) on May 14, 2010 at 12:42 UTC (comment 3.4)

Replying to birdie:

That's why this time a user selectable option seems like a good way to go :)

"oh, it could be hard. let's do some user-unfriendly crap instead."

I'm not sure whether the POSIX API allows determining if the source and destination reside on the same media.

first off, let's assume we already have a real file system path (i.e., mcvfs needs to give us one).
then it gets tricky. posix as such will indeed not be enough. i think the most promising approach is querying the mount table (just calling mount and parsing the output) and recursively resolving the mount points to obtain the volumes the files live on. next, one would stat() the two devices and compare the major device ids returned in st_dev. caveats: a) the device id stuff is system-specific, i.e., it means googling for lots of man pages. b) even on linux, FUSE may mess up the detection of the real device. that's a minor problem, though: it's unlikely that huge files which need the above optimization live on a FUSE mount.


Changed by birdie (aros@….com) on May 14, 2010 at 14:20 UTC (comment 5)

  • OK, first of all, Total Commander has this option. :)
  • Secondly, the long-forgotten Dos Navigator had this option (buried very deeply, but that's not what really matters).
  • Third of all,
# cd /tmp; mkdir loop; mount -o loop,ro geexbox-1.2.4-en.i386.glibc.iso loop;

# mount | grep loop
/dev/loop0 on /tmp/loop type iso9660 (ro)

Now try to work out which physical device the files under /tmp/loop belong to.

  • Fourthly, using a precalculated RAM size might be extremely dangerous. Say we want to use 10% of free RAM; it may turn out that there is no real free RAM, because what looks free is in fact cache (e.g. for shared libraries, which Linux usually reports as cached/free RAM), so eating that much RAM may lead to swapping or even an OOM situation.
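One conservative way around the free-vs-cached ambiguity raised in the last point is to cap the buffer by a fraction of *total* physical RAM rather than guessing at "free" RAM. A hedged sketch (the 1/16 fraction and function name are arbitrary examples, not a proposed mc default):

```python
import os

# Illustrative guard against over-allocation: clamp the requested copy
# buffer to a fixed fraction of total physical memory. Using total RAM
# sidesteps the "is cached RAM really free?" question entirely.
def clamp_buffer(requested, fraction=16):
    page = os.sysconf("SC_PAGE_SIZE")
    pages = os.sysconf("SC_PHYS_PAGES")
    limit = max(64 * 1024, (page * pages) // fraction)
    return min(requested, limit)
```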


Changed by slyfox (@trofi) on May 14, 2010 at 18:11 UTC (comment 6)

I propose to implement a new Copy File(s) dialog option:

[X] Use large buffers

where copy_large_buffer can be defined as an option of the mc.ini file, with a default value of 64MB (it's quite sane for modern PCs).

My experiments didn't show any timing changes with buffers larger than 64KB on most workloads. Just copying 64K bytes is already relatively significant CPU work — probably more than the syscall overhead. Where did you get the '64MB' figure? Do you use the 'noop' scheduler on an IDE/SATA disk?

If the device/filesystem operates on larger data chunks (256KB SSD blocks, ~1MB flash blocks), it caches the data in the block layer, so firing one more syscall to fetch cached data wouldn't matter.

I suggest you write a benchmark that disproves my expectations :]
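A minimal harness for such a benchmark might look like this (Python sketch, illustration only; a fair run would also drop the page cache between passes, e.g. via /proc/sys/vm/drop_caches, which this sketch does not do):

```python
import os
import time

# Time one copy of src -> dst with a given buffer size, fsync'ing at the
# end so the write actually reaches the device before the clock stops.
def time_copy(src, dst, bufsize):
    start = time.monotonic()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while chunk := fin.read(bufsize):
            fout.write(chunk)
        fout.flush()
        os.fsync(fout.fileno())
    return time.monotonic() - start
```

Running this over a large file with bufsize of 64K, 1M, and 64M on both HDD and SSD would settle the question either way.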


Changed by gotar (gotar@….pl) on May 26, 2010 at 20:15 UTC (comment 6.7)

  • Cc set to gotar@….pl

Replying to slyfox:

I propose you to write benchmark, which disproves my expectations :]

time cp linux-2.6.33.2.tar.bz2 /mnt/
0.00s user 0.31s system 2% cpu 11.619 total

echo 1 > /proc/sys/vm/drop_caches
time cat linux-2.6.33.2.tar.bz2 > /dev/null
0.01s user 0.11s system 1% cpu 10.512 total
time cp linux-2.6.33.2.tar.bz2 /mnt/
0.01s user 0.24s system 69% cpu 0.365 total

Similar results with dd - reading the entire file first gives about a 5% performance improvement. IMHO it's worth doing for 500M+ files regardless of I/O schedulers and the rest.


Changed by angel_il (@ilia-maslakov) on Jul 5, 2010 at 20:29 UTC (comment 8)

  • Milestone changed from 4.7.3 to 4.7


Changed by birdie (aros@….com) on Mar 19, 2011 at 8:35 UTC (comment 9)

  • Milestone changed from 4.7 to 4.8

I've now copied a large file (4.6GB) from one partition to another, using a

64K buffer:
real    2m20.418s
user    0m0.087s
sys     0m9.309s

and using 64M buffer:
real    1m54.316s
user    0m0.040s
sys     0m10.503s

So, using a larger buffer for copying files within one physical HDD makes sense (it doesn't apply to SSDs because their seek time is close to zero).


Changed by krokous (krokous@….cz) on Apr 11, 2012 at 12:56 UTC (comment 9.10)

  • Branch state set to no branch

64K buffer:
real 2m20.418s

and using 64M buffer:
real 1m54.316s

64K may be small, but isn't 64M overkill? What about, for example, a 1M buffer?
It could be large enough to have negligible overhead compared to a 64M buffer, but it will eat much less memory.

Perhaps the buffer size could be specified somewhere in the advanced config, with some reasonable, though still rather small (512K?), default.

I guess more benchmarking (on both SSD and HDD) should be done before changing the default.
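Reading such a setting from an ini-style config with a safe fallback is straightforward; a hedged sketch ("copy_buffer_size" and the section name are invented for illustration, not actual mc keys):

```python
import configparser

# Fall back to a modest default when the option is unset or malformed.
# 512K is krokous's suggested default from the comment above.
DEFAULT_COPY_BUFFER = 512 * 1024

def read_copy_buffer(ini_text):
    cp = configparser.ConfigParser()
    cp.read_string(ini_text)
    try:
        return cp.getint("Midnight-Commander", "copy_buffer_size")
    except (configparser.Error, ValueError):
        return DEFAULT_COPY_BUFFER
```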


Changed by powerman (powerman-asdf@….ru) on May 28, 2012 at 19:44 UTC (comment 11)

Another use case for this is the 'sync' mount option for USB flash drives (to make it possible to eject the drive right after the copy dialog closes, without needing to umount first).

While 'sync' is too slow (and thus unusable) on most filesystems, it works really well on ext4. On my Corsair, the usual cp speed without 'sync' is 11MB/sec; with 'sync', cp speed is 4.5MB/sec, but mc's speed is only 1.5MB/sec. At the same time, dd bs=2M gets 11.5MB/sec (i.e., even faster than cp without 'sync'!).

So, large buffers (1-64MB) for copying files in mc are a must-have feature!

And keeping in mind that this bug has already been open for 2 years, I'd really prefer to see this feature implemented with an [X] large buffer checkbox in the UI soon, rather than wait 3 more years until someone finally figures out a formula that grows the buffer size without checkboxes. :-)


Changed by powerman (powerman-asdf@….ru) on May 28, 2012 at 19:45 UTC (comment 12)

  • Cc changed from gotar@….pl to gotar@….pl, powerman-asdf@….ru


Changed by powerman (powerman-asdf@….ru) on May 28, 2012 at 19:47 UTC (comment 13)

Actually, I can even live with a patch that simply uses a constant, larger buffer size, if someone provides it.


Changed by andrew_b (@aborodin) on Jun 18, 2015 at 18:28 UTC (comment 14)

  • Milestone changed from 4.8 to Future Releases


Changed by andrew_b (@aborodin) on Mar 25, 2016 at 7:53 UTC (comment 15)

Ticket #3624 has been marked as a duplicate of this ticket.


Changed by andrew_b (@aborodin) on Apr 6, 2016 at 11:40 UTC (comment 16)

  • Branch state changed from no branch to on review
  • Owner set to andrew_b
  • Status changed from new to accepted

Branch: 2193_copy_buffer_size
Initial [d63f6da04d315703e3ffced79431d6dcde2019bd]

The Coreutils way is used: the buffer size is based on the block size of the destination file system.
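The idea behind that approach can be sketched as follows (Python, for illustration; the 128K floor mirrors the minimum coreutils has historically used for cp, and the function name is invented — the actual mc change is in the branch above):

```python
import os

# Derive the copy buffer from the destination filesystem's reported
# block size, with a floor so a tiny f_bsize doesn't cripple throughput.
def pick_buffer_size(dst_dir, floor=128 * 1024):
    bsize = os.statvfs(dst_dir).f_bsize
    return max(floor, bsize)
```

This adapts per destination without any user-visible option, which matches the objection raised earlier in the thread.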


Changed by zaytsev (@zyv) on Apr 6, 2016 at 20:24 UTC (comment 17)

Oh wow, very cool, I'll try to have a look!


Changed by birdie (aros@….com) on Apr 7, 2016 at 7:51 UTC (comment 18)

In a perfect world, MC should use at least three threads for copying/moving files:

One thread to read the source into a ring buffer;
One thread to write from the ring buffer to the destination;
One thread to show progress every X seconds (for instance, every 0.3 seconds).

Right now MC can be slow at copying for a different reason: it spends too much time updating the screen.
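The reader/writer half of that pipeline can be sketched with a bounded queue standing in for the ring buffer (Python sketch, illustration only; mc would do this in C, and the progress thread is omitted):

```python
import queue
import threading

# One thread fills a bounded queue (the "ring buffer") from the source;
# the main thread drains it to the destination. Names are illustrative.
def threaded_copy(src, dst, bufsize=1024 * 1024, depth=16):
    ring = queue.Queue(maxsize=depth)

    def reader():
        with open(src, "rb") as f:
            while chunk := f.read(bufsize):
                ring.put(chunk)   # blocks when the ring is full
        ring.put(None)            # end-of-stream marker

    t = threading.Thread(target=reader)
    t.start()
    with open(dst, "wb") as f:
        while (chunk := ring.get()) is not None:
            f.write(chunk)
    t.join()
```

Decoupling reads from writes this way lets the source keep streaming while the destination flushes, which is exactly the interleaving discussed earlier in the thread.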


Changed by andrew_b (@aborodin) on Apr 25, 2016 at 10:32 UTC (comment 19)

  • Branch state changed from on review to approved
  • Milestone changed from Future Releases to 4.8.17
  • Votes set to andrew_b


Changed by andrew_b (@aborodin) on Apr 25, 2016 at 10:34 UTC (comment 20)

  • Votes changed from andrew_b to committed-master
  • Branch state changed from approved to merged
  • Status changed from accepted to testing
  • Resolution set to fixed

Merged to master: [7b928e6].

git log --pretty=oneline 5ba9789..7b928e6


Changed by andrew_b (@aborodin) on Apr 25, 2016 at 10:35 UTC (comment 21)

  • Status changed from testing to closed

@mc-butler mc-butler marked this as a duplicate of #3624 Feb 28, 2025