View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0018370 | CentOS-8 | kernel | public | 2021-12-03 22:58 | 2022-01-08 18:55 |
Reporter | stanhu | Assigned To | |||
Priority | urgent | Severity | major | Reproducibility | always |
Status | acknowledged | Resolution | open | ||
Product Version | 8.4.2105 | ||||
Summary | 0018370: copy_file_range() incorrectly copies files with 0 bytes in overlay filesystem | ||||
Description | This bug was introduced in Linux kernel 5.6 and fixed in 5.11. It seems that CentOS 8.4.2105 also has this bug. As reported in https://lore.kernel.org/stable/CAMBWrQ=1MKxnMT_6Jnqp_xxr7psVywPBJc6p1qCy9ENY8RF2Qw@mail.gmail.com/T/#m88f58a0040f7788da92664bbc5aa26e37e263ced: A number of users have reported that under certain conditions using the overlay filesystem, copy_file_range() can unexpectedly create a 0-byte file. [0] This bug can cause significant problems because applications that copy files expect the target file to match the source immediately after the copy. After upgrading from Linux 5.4 to Linux 5.10, our Docker-based CI tests started failing due to this bug, since Ruby's IO.copy_stream uses this system call. We have worked around the problem by touching the target file before using it, but this shouldn't be necessary. Other projects, such as Rust, have added similar workarounds. [1] As discussed in the linux-fsdevel mailing list [2], the bug appears to be present in Linux 5.6 to 5.10, but not in Linux 5.11. We should be able to cherry-pick the following upstream patches to fix this. Could you cherry-pick them to 5.10.x stable? I've confirmed that these patches, applied from top to bottom to that branch, pass the reproduction test [3]: 82a763e61e2b601309d696d4fa514c77d64ee1be 9b91b6b019fda817eb52f728eb9c79b3579760bc The diffstat: fs/overlayfs/file.c | 59 +++++++++++++++++++++++++++++++---------------------------- 1 file changed, 31 insertions(+), 28 deletions(-) Note that these patches do not pick cleanly into 5.6.x - 5.9.x stable. 
[0] https://github.com/docker/for-linux/issues/1015 [1] https://github.com/rust-lang/rust/blob/342db70ae4ecc3cd17e4fa6497f0a8d9534ccfeb/library/std/src/sys/unix/kernel_copy.rs#L565-L569 [2] https://marc.info/?l=linux-fsdevel&m=163847383311699&w=2 [3] https://github.com/docker/for-linux/issues/1015#issuecomment-841915668 Per [the latest update on the kernel stable mailing list](https://lore.kernel.org/stable/Yanx6KobwiQoBQfU@kroah.com), the kernel backport fix for 5.10 has been queued for review and should land in the [`linux-5.10.y` branch](https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=linux-5.10.y) soon. To avoid this bug, I'd suggest avoiding Linux v5.6.0 - v5.10.83 for now, unless you backport the two patches below: 1. https://github.com/torvalds/linux/commit/82a763e61e2b601309d696d4fa514c77d64ee1be 2. https://github.com/torvalds/linux/commit/9b91b6b019fda817eb52f728eb9c79b3579760bc I've confirmed that 5.5.19 does NOT have this bug, and that it was introduced in 5.6.0 via https://github.com/torvalds/linux/commit/1a980b8cbf0059a5308eea61522f232fd03002e2. | ||||
Steps To Reproduce | See https://github.com/docker/for-linux/issues/1015#issuecomment-841915668. As root: 1. Install Docker (https://docs.docker.com/engine/install/centos/) 2. `systemctl start docker.service` 3. `yum install gcc strace` 4. Download `test.c` and `test.sh`. 5. Create some dummy `Gemfile` (e.g. `cp test.c Gemfile`). 6. As root, run `bash test.sh`. ``` [root@stanhu-centos8-test tmp]# cat /etc/centos-release CentOS Linux release 8.4.2105 [root@stanhu-centos8-test tmp]# bash test.sh Sending build context to Docker daemon 45.06kB Step 1/2 : FROM debian:10.8-slim ---> 115566c891d1 Step 2/2 : RUN apt update && apt install -y gcc strace ---> Using cache ---> 369f30fe4781 Successfully built 369f30fe4781 Successfully tagged strace:latest Local: OK Docker - mounted: Copy failed Docker - copied: OK ``` | ||||
Additional Information | I've verified https://github.com/torvalds/linux/commit/1a980b8cbf0059a5308eea61522f232fd03002e2 has been pulled into CentOS 8. From the source package in https://vault.centos.org/8.4.2105/BaseOS/Source/SPackages/kernel-4.18.0-305.25.1.el8_4.src.rpm: ``` [root@stanhu-centos8-test linux-4.18.0-305.25.1.el8_4]# grep ovl_splice fs/overlayfs/file.c static ssize_t ovl_splice_read(struct file *in, loff_t *ppos, ovl_splice_write(struct pipe_inode_info *pipe, struct file *out, .splice_read = ovl_splice_read, .splice_write = ovl_splice_write, ``` `file.c` is not even present in the Linux stable tree! https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/overlayfs?h=linux-4.18.y | ||||
Tags | No tags attached. | ||||
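For applications that cannot move off an affected kernel yet, the description above suggests a defensive pattern: attempt the fast copy, then verify the target and fall back to a plain read/write copy if it does not match. The following is only a minimal sketch of that idea, not the workaround used by Ruby, Rust, or Docker. The helper names (`copy_verified`, `_fast_copy`, `_plain_copy`) are hypothetical, and comparing file sizes after the copy is an assumption about what is sufficient to notice the 0-byte result.

```python
import os
import shutil


def _fast_copy(src_path, dst_path, chunk_size):
    """Copy with copy_file_range(); returns bytes copied, or -1 if unavailable."""
    fast = getattr(os, "copy_file_range", None)  # Linux only, Python 3.8+
    if fast is None:
        return -1
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        copied = 0
        while remaining > 0:
            n = fast(src.fileno(), dst.fileno(), min(chunk_size, remaining))
            if n == 0:
                # Nothing copied although data remains: stop and let the
                # caller decide whether the result is acceptable.
                break
            copied += n
            remaining -= n
    return copied


def _plain_copy(src_path, dst_path, chunk_size):
    """Copy with ordinary read()/write() calls, bypassing copy_file_range()."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)


def copy_verified(src_path, dst_path, chunk_size=1 << 20):
    """Hypothetical helper: fast copy first, plain copy if the sizes differ.

    Returns the name of the method that produced the final target file.
    """
    expected = os.stat(src_path).st_size
    try:
        _fast_copy(src_path, dst_path, chunk_size)
    except OSError:
        pass  # e.g. ENOSYS or EXDEV; handled by the size check below

    # Assumption: a size mismatch is enough to detect the truncated target.
    if not os.path.exists(dst_path) or os.stat(dst_path).st_size != expected:
        _plain_copy(src_path, dst_path, chunk_size)
        return "fallback"
    return "copy_file_range"
```

Calling `copy_verified("Gemfile", "/tmp/Gemfile.copy")` returns either `"copy_file_range"` or `"fallback"`, which makes it easy to log how often the fallback path is taken. The extra `stat` calls cost little compared with silently shipping an empty file.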
Also filed against Red Hat Enterprise Linux: https://bugzilla.redhat.com/show_bug.cgi?id=2028998 | |
Thanks for the detailed report. However, with the C8 EOL only about 3 weeks away, the chances of getting the fix into the C8 kernel are virtually nil. I hope this gets fixed in RHEL soon, now that you've filed the issue in RHBZ. | |
CentOS Linux 8 ended its life on December 31, 2021 and, therefore, is no longer supported. | |