this post was submitted on 09 Apr 2024
How do programs that measure available space, like 'lsblk', 'df', 'zfs list' etc., see hardlinks and estimate disk space?
If I am trying to manage disk space, does the file system correctly display disk space (for example a zfs list)? Or does it think that I have duplicate files/directories because it can't tell what is a hardlink?
Also, during move operations, zfs dataset migrations, etc... does the hardlinked file continue tracking where the original is? I know it is almost impossible at a system level to discern which is the original.
I'm not super familiar with ZFS so I can't elaborate much on those bits, but hardlinks are just directory entries that point to the same inode (the filesystem's internal record for a file, identified by an inode number). A hardlink is basically a file-level concept. Commands like `lsblk` and `df` don't work at that level: `lsblk` lists block devices, and `df` reports the usage totals that the filesystem itself keeps. They don't know or care about individual files or links, so hardlinks or not, it makes no difference to them.
Now this is contrary to how tools like `du`, `ncdu` etc. work: they traverse the directories and add up the sizes of the files they find. `du` in particular is clever about it: if it comes across several hardlinks to the same file during a single run, it counts that file only once. Other file-level programs may or may not take this into account, so you'll have to verify their behavior.
As for move operations, it depends largely on whether the move is within the same filesystem or across filesystems, and on the tools or commands used to perform the move.
When a file or directory is moved within the same filesystem, it generally doesn't affect hardlinks in a significant way. The inode remains the same, as do the data blocks on the disk. Only the directory entries pointing to the inode are updated. This means if you move a file that has hardlinks pointing to it within the same filesystem, all the links still point to the same inode, and hence, to the same content. The move operation does not affect the integrity or the accessibility of the hardlinks.
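
If you want to see the same-filesystem case for yourself, here is a minimal Python sketch. The file names are made up for illustration, and it assumes the temporary directory lives on a filesystem that supports hardlinks. It creates a file with one extra hardlink, renames the file into a subdirectory, and reports the inode numbers before and after.

```python
import os
import tempfile


def rename_keeps_inode():
    """Rename a hardlinked file within one filesystem and compare inodes."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical names, all inside one throwaway directory tree.
        original = os.path.join(root, "original.txt")
        link = os.path.join(root, "link.txt")
        subdir = os.path.join(root, "sub")
        os.mkdir(subdir)

        with open(original, "w") as f:
            f.write("hello")
        os.link(original, link)  # second name for the same inode

        inode_before = os.stat(original).st_ino
        moved = os.path.join(subdir, "moved.txt")
        os.rename(original, moved)  # same filesystem: only the directory entry changes

        inode_after = os.stat(moved).st_ino
        link_stat = os.stat(link)
        return inode_before, inode_after, link_stat.st_ino, link_stat.st_nlink


if __name__ == "__main__":
    before, after, link_inode, link_count = rename_keeps_inode()
    print("inode before move:", before)
    print("inode after move: ", after)
    print("inode of hardlink:", link_inode)
    print("link count:       ", link_count)
```

All three inode numbers should match, and the link count should stay at 2, since the rename only rewrote a directory entry.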
Moving files or directories across different filesystems (including external storage) behaves differently, because each filesystem has its own set of inodes.
The move operation in this scenario is effectively a copy followed by a delete. The file is copied to the target filesystem, which assigns it a new inode, and then the original file is deleted from the source filesystem.
If the file had hardlinks within the original filesystem, these links are not carried over to the new filesystem. The delete step only removes the one name that was moved; the other links still point to the original inode, so the original content stays on the source filesystem until the last of those links is removed. After the move, there is no connection between those remaining hardlinks and the newly copied file, which is an independent copy with its own inode.
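
The copy-then-delete behavior can be imitated without a second disk. The sketch below is only a simulation: the two directories are stand-ins for two filesystems (they actually sit on the same one), and the explicit `shutil.copy2` plus `os.unlink` stands in for what a cross-filesystem move does. All names are hypothetical.

```python
import os
import shutil
import tempfile


def simulate_cross_fs_move():
    """Imitate a cross-filesystem move as copy + delete and inspect the leftovers."""
    with tempfile.TemporaryDirectory() as root:
        src_dir = os.path.join(root, "source_fs")  # stand-in for the source filesystem
        dst_dir = os.path.join(root, "target_fs")  # stand-in for the target filesystem
        os.mkdir(src_dir)
        os.mkdir(dst_dir)

        original = os.path.join(src_dir, "data.txt")
        link = os.path.join(src_dir, "data_link.txt")
        with open(original, "w") as f:
            f.write("payload")
        os.link(original, link)  # hardlink that stays behind on the source side

        copied = os.path.join(dst_dir, "data.txt")
        shutil.copy2(original, copied)  # step 1: copy, which allocates a new inode
        os.unlink(original)             # step 2: remove the name that was "moved"

        link_stat = os.stat(link)
        copy_stat = os.stat(copied)
        with open(link) as f:
            link_content = f.read()

        return {
            "shares_inode": link_stat.st_ino == copy_stat.st_ino,
            "link_count": link_stat.st_nlink,
            "copy_link_count": copy_stat.st_nlink,
            "link_content": link_content,
        }


if __name__ == "__main__":
    for key, value in simulate_cross_fs_move().items():
        print(f"{key}: {value}")
```

The leftover hardlink keeps its content and its own inode, while the copy is a separate file. That is also why moving a hardlinked file to another disk does not free any space on the source until the remaining links are removed too.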
I believe hardlinks shouldn't cause problems for zfs migrations either, since as I understand it the migration should preserve the inode and object ID information.
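
Going back to the file-level versus filesystem-level distinction from earlier in this comment, here is a rough Python sketch of the two approaches. It is not how `du` or `df` are actually implemented. The function names are made up, and it sums apparent file sizes (`st_size`) rather than allocated blocks, which is a simplification. The filesystem-level number comes from `os.statvfs`, which asks the filesystem for its own totals and never looks at individual files.

```python
import os
import tempfile


def naive_size(root):
    """Sum file sizes by walking the tree, counting every name separately."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total


def deduplicated_size(root):
    """Sum file sizes, counting each (device, inode) pair only once."""
    seen = set()
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # another hardlink to a file already counted
            seen.add(key)
            total += st.st_size
    return total


def free_bytes(path):
    """Filesystem-level view: free space as reported by the filesystem."""
    vfs = os.statvfs(path)
    return vfs.f_bavail * vfs.f_frsize


def demo():
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "a"))
        os.mkdir(os.path.join(root, "b"))
        original = os.path.join(root, "a", "big.bin")
        with open(original, "wb") as f:
            f.write(b"x" * 4096)
        # Hardlink in a different directory, pointing at the same inode.
        os.link(original, os.path.join(root, "b", "big_link.bin"))
        return naive_size(root), deduplicated_size(root)


if __name__ == "__main__":
    naive, deduplicated = demo()
    print("naive walk:       ", naive)
    print("inode-aware walk: ", deduplicated)
    print("free bytes on /:  ", free_bytes("/"))
```

The naive walk reports 8192 bytes for the two names, while the inode-aware walk reports 4096, which is the kind of difference you may see between file-level tools that do and don't account for hardlinks.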
This really clears things up for me, thanks! I guess I am not so "new" (been using Linux for 8 years now), but every article I read on hardlinks just confused me. This is much more of a "layman's" explanation for me!
I believe that zfs has its own disk usage utilities