this post was submitted on 26 Jan 2024
46 points (96.0% liked)


I'm currently watching the progress of a 4TB rsync file transfer, and I'm curious why the speeds are less than the theoretical maximum read/write speeds of the drives involved. I know there's a lot that can affect transfer speeds, so I guess I'm not asking why my transfer itself isn't going faster. I'm more just curious what the typical bottlenecks could be.

Assuming a file transfer between 2 physical drives, and:

  • Both drives are internal SATA III drives with ~~5.0GB/s~~ ~~5.0Gb/s read/write~~ 210MB/s (this was the mistake: I was reading the SATA III protocol speed as the disk speed)
  • files are being transferred using a simple rsync command
  • there are no other processes running

What would be the likely bottlenecks? Could the motherboard/processor likely limit the speed? The available memory? Or the file structure of the files themselves (whether they are fragmented on the volumes or not)?

[–] Max_P@lemmy.max-p.me 18 points 9 months ago (1 children)

SATA III speeds are quoted in gigabits, not gigabytes: the link is 6Gb/s, so the max speed is actually about 600MB/s.
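
For anyone wondering why 6Gb/s works out to 600MB/s rather than 750MB/s: SATA uses 8b/10b encoding on the wire, so only 8 of every 10 transmitted bits are payload. A quick sketch of the arithmetic (the function name is just made up for illustration):

```python
def sata_payload_mb_per_s(line_rate_gbit: float) -> float:
    """Usable payload rate in MB/s for a SATA link with 8b/10b encoding."""
    line_bits_per_s = line_rate_gbit * 1e9
    payload_bits_per_s = line_bits_per_s * 8 / 10  # 8 data bits per 10 line bits
    return payload_bits_per_s / 8 / 1e6            # bits -> bytes -> MB


print(sata_payload_mb_per_s(6.0))  # SATA III link rate
```

That is the ceiling for the interface only; it says nothing about what the disk behind it can sustain.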

What filesystem? For example, on my ZFS pool I had to let ZFS use a good chunk of my RAM for it to be able to cache things enough that rsync would max out the throughput.

Rsync doesn't process files in parallel, so at such speeds the per-file cycle of opening files, reading chunks, writing chunks, closing files and repeating can add up. So you want the kernel to buffer as much of it as possible.
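
To get a feel for how that per-file cost adds up, here's a rough back-of-the-envelope model, not a measurement. It assumes a fixed overhead per file on top of pure streaming time; the 5ms overhead and the file counts are made-up numbers for illustration only.

```python
def estimated_copy_seconds(total_bytes: float, file_count: int,
                           throughput_bytes_per_s: float,
                           per_file_overhead_s: float) -> float:
    """Streaming time plus a fixed (hypothetical) cost for every file handled."""
    streaming = total_bytes / throughput_bytes_per_s
    overhead = file_count * per_file_overhead_s
    return streaming + overhead


TOTAL = 4e12   # 4TB, decimal
RATE = 210e6   # 210MB/s, the drive's quoted speed

# Same amount of data, very different file counts (both counts are assumptions).
for files in (1_000, 4_000_000):
    hours = estimated_copy_seconds(TOTAL, files, RATE, 0.005) / 3600
    print(f"{files} files: about {hours:.1f} h")
```

The point is only that with millions of small files the overhead term can rival the streaming term, which is why buffering matters more for that kind of workload.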

If you look at the disk graphs of both disks, you probably see a read spike on the source, followed by a write spike on the target, instead of a smooth maxed-out curve. If so, the solution is increasing buffers and caching. Depending on the distro, there's a sysctl that may be on by default that limits the size of caches to prevent the "I wrote a 4GB file to my USB stick and now there's 4GB of RAM used for it and it takes hours after finishing the transfer before it's flushed to the stick" problem.
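
If you want to see what your system is currently set to, here's a small sketch that reads the kernel's writeback limits. I'm assuming the sysctl in question is one of the `vm.dirty_*` knobs, which control how much unwritten data the page cache may hold; the comment above doesn't name a specific one, so treat that as a guess. The helper name is made up.

```python
from pathlib import Path

# Writeback-related knobs exposed under /proc/sys/vm on Linux.
DIRTY_KNOBS = (
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_bytes",
    "dirty_background_bytes",
)


def read_dirty_settings(vm_dir: str = "/proc/sys/vm") -> dict:
    """Return the current value of each knob, or None if it can't be read."""
    settings = {}
    for name in DIRTY_KNOBS:
        try:
            settings[name] = int((Path(vm_dir) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_dirty_settings().items():
        print(f"vm.{name} = {value}")
```

This only reads the values. Changing them is a separate decision and depends on how much RAM you're willing to let fill up with unflushed writes.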

[–] archomrade@midwest.social 4 points 9 months ago* (last edited 9 months ago) (1 children)

SATA III speeds are quoted in gigabits, not gigabytes: the link is 6Gb/s, so the max speed is actually about 600MB/s.

~~My mistake, though still, a 4TB transfer should take less than 2hr at 5Gb/s (IN THEORY)~~ Thank you @Max_P@lemmy.max-p.me for pointing this out a second time elsewhere: 6Gb/s is what the SATA III interface is capable of, NOT what the DRIVE is capable of. The marketing material for this drive has clearly psyched me out; the actual transfer speed is 210MB/s.
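
For the record, here's the arithmetic I should have done in the first place, as a quick sketch. It assumes decimal units (4TB = 4e12 bytes) and a perfectly sustained rate, which a real transfer won't manage.

```python
def transfer_hours(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Best-case duration in hours at a constant transfer rate."""
    return size_bytes / rate_bytes_per_s / 3600


SIZE = 4e12  # 4TB, decimal

for label, rate in [("SATA III link, 600MB/s", 600e6),
                    ("drive spec, 210MB/s", 210e6)]:
    print(f"{label}: {transfer_hours(SIZE, rate):.1f} h")
```

So even in the ideal case the drive itself puts this at a bit over 5 hours, not under 2.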

The filesystem is EXT4, shared over SMB... OMV has a fair amount of RAM allocated to it, like 16GB or something gratuitous. I'm guessing the way rsync does its transfers is the culprit, and I honestly can't complain because the integrity of the transfer is crucial.

[–] d3Xt3r@lemmy.nz 2 points 9 months ago* (last edited 9 months ago) (1 children)
[–] archomrade@midwest.social 2 points 9 months ago (2 children)

Thanks, corrected my comment above.

I'm interested in ksmbd... I chose SMB simply because I was using it across Linux/Windows/Mac devices and I was using OMV for managing it, but that doesn't mean I couldn't switch to something better.

Honestly though, I don't need faster transfers typically, I just happen to be switching out a drive right now. SMB through OMV has been perfectly sufficient otherwise.

[–] d3Xt3r@lemmy.nz 5 points 9 months ago

ksmbd is still SMB, except it's implemented within the Linux kernel. As a result, file transfer speeds are improved greatly compared to pure Samba, which runs only in userspace.

The second thing is, you need to check which SMB protocol version you're using. Ideally you'd want at least SMB 3; anything older than that will be painfully slow.

Finally, I read in your other comment that you're using spinning disks and a USB dock. That adds significant overhead.

The IronWolf drive benchmarks at 250MB/s at the start and slows down to 100MB/s as it reaches the end of the drive. (Spinning disks get slower the fuller they become, since the data ends up on the slower inner tracks.) Now add file fragmentation + filesystem overheads (buffers, cluster size allocation etc.) and the speeds could go down considerably.
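
To put a single number on that slowdown, here's a rough sketch of the effective average rate across the whole drive. It assumes the speed falls off linearly with position on the disk, which is a simplification I'm making for the math, not something the benchmark states. Under that assumption, integrating the time per byte over the drive gives (start - end) / ln(start / end).

```python
import math


def effective_rate(start_mb_s: float, end_mb_s: float) -> float:
    """Average MB/s over a full pass, assuming a linear drop from start to end."""
    if start_mb_s == end_mb_s:
        return float(start_mb_s)
    return (start_mb_s - end_mb_s) / math.log(start_mb_s / end_mb_s)


print(round(effective_rate(250, 100), 1))
```

That works out to roughly 164MB/s for a full-drive pass, lower than the simple midpoint of 175MB/s because the drive spends more time in the slow region.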

Then there's your SATA > USB dock. No dock would ever reach 5Gbps; that's just false advertising, since it only mentions the theoretical protocol speed. In reality, you'd be seeing write speeds somewhere below 100MB/s for 128k sequential writes, and if your block size is smaller, expect far slower writes.

Combine all of the above and you can imagine just how much slower this whole thing can be.
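
As a rough illustration of how these combine: a transfer can't go faster than its slowest stage. The sketch below just picks the minimum. The stage names and figures are ballpark values taken from this thread (with the dock's "below 100MB/s" rounded to 100), not measurements of your setup.

```python
def bottleneck(stages: dict) -> tuple:
    """Return (name, rate) of the slowest stage in the transfer path."""
    name = min(stages, key=stages.get)
    return name, stages[name]


# Ballpark MB/s per stage; all values are illustrative assumptions.
stages = {
    "source disk": 210,
    "SATA III link": 600,
    "USB dock": 100,
}

name, rate = bottleneck(stages)
print(f"slowest stage: {name} at about {rate}MB/s")
```

This ignores fragmentation and per-file overhead, which only push the real number lower.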

For reference, see this benchmark as an example, to see what's "normal" for a simple file transfer to a blank drive with no fragmentation: https://www.anandtech.com/show/6014/startechcom-usb-30-to-sata-ide-hdd-docking-station-review/3