View Issue Details

ID: 0016851
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2019-12-20 19:36
Reporter: kp
Priority: normal
Severity: minor
Reproducibility: random
Status: new
Resolution: open
Product Version: 7.4.1708
Target Version:
Fixed in Version:
Summary: 0016851: Multiple random occurrences of the root file system entering a read-only state, followed by a system lockup.
Description: Starting with CentOS 7.1, we began to observe write-related system lockups at product sites.

The symptoms observed so far are listed below. We do not yet know how to reproduce the problem.
This report fulfils our requirement to inform CentOS of the issue while we investigate further.
The problem has eluded us for almost a year, and we still have no concrete reproduction method.

- The system enters a state in which the root file system becomes read-only.
- The system either freezes or does not wake up (we assume because nothing can be written), leaving the local display unusable.
- Network access remains available: we can execute remote commands via SSH as users whose environment files (bashrc, cshrc, etc.) do not attempt to read from or write to disk.
- A forced reboot is the only way to recover.
- After the reboot, random files have been found truncated to 0 bytes. The link between the inode and its data blocks is gone from the on-disk structure, as if the data had never existed.
- Processes remain active and, when killed (via the SSH command method), become 'defunct' permanently and never exit. It seems that the /proc file system cannot be written to, so the application appears hung indefinitely because the PIDs are never removed.
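The first symptom above can be checked over SSH without writing to disk, because reading /proc/mounts is a read-only operation. The following is a minimal sketch, not part of our tooling: the function names are hypothetical, and it assumes the usual /proc/mounts field order (device, mount point, file system type, options). It uses the last entry for `/`, since an early `rootfs` line may precede the real root mount.

```python
def root_mount_options(mounts_text):
    """Return the option list of the last '/' entry in /proc/mounts-style text, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumed field order: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == "/":
            options = fields[3].split(",")  # later entries override earlier ones
    return options


def root_is_read_only(mounts_text):
    """True only if the effective root mount carries the exact 'ro' option."""
    options = root_mount_options(mounts_text)
    return options is not None and "ro" in options


if __name__ == "__main__":
    # Reading /proc/mounts does not require any write to disk.
    with open("/proc/mounts") as handle:
        print("root read-only:", root_is_read_only(handle.read()))
```

Matching on the exact option `ro` avoids a false positive on options such as `errors=remount-ro`, which only describe what happens after an error.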

While the system was in this state, I was able to examine files with very simple commands such as 'ls', 'cat', and 'ps', run as 'ssh <user@ip> <command>'. Any I/O action that attempted to write to disk hung indefinitely, and more complex commands failed as well. For example, echo "HELLO" > /var/tmp/output hung forever.
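Because a write such as the echo example hangs rather than returning an error, any probe for this state has to run the write in a separate process and give up after a deadline. The sketch below illustrates that idea under stated assumptions: `write_probe`, its return values, and the 5-second default are hypothetical choices, and it assumes a Python interpreter can still be started on the affected machine. It polls the child instead of waiting on it, since a process blocked in uninterruptible I/O may not exit even after being killed.

```python
import os
import subprocess
import sys
import tempfile
import time

# Child program: write a few bytes and force them to disk with fsync.
_CHILD_CODE = (
    "import os, sys\n"
    "with open(sys.argv[1], 'w') as handle:\n"
    "    handle.write('probe')\n"
    "    handle.flush()\n"
    "    os.fsync(handle.fileno())\n"
)


def write_probe(path, timeout_s=5.0):
    """Attempt a small write to `path` in a child process.

    Returns 'ok' if the write completed, 'failed' if the child exited with
    an error (for example on a read-only file system), or 'hung' if it did
    not finish before the deadline.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", _CHILD_CODE, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        return_code = child.poll()
        if return_code is not None:
            return "ok" if return_code == 0 else "failed"
        time.sleep(0.05)
    # Best effort only: do not wait() here, the child may never exit.
    child.kill()
    return "hung"


if __name__ == "__main__":
    scratch_dir = tempfile.mkdtemp()
    print(write_probe(os.path.join(scratch_dir, "probe.tmp")))
```

Run periodically from a remote monitoring host, a probe like this could record the time at which writes first stop completing, which local logging cannot capture once the problem has started.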

Steps To Reproduce: Unknown at this time. We have observed this issue five times across our 30+ sites over the last year. It has occurred twice locally during testing in the last year.
Additional Information: I suspect that our use of RSYNC (3.0.9) to copy files from network mounts and to submit information to our 'gateway' data storage system is contributing to this.

The system uses RSYNC as follows:

- CentOS is running on our main 'hub' computer.
- This hub machine mounts various folders from external computers to a folder in /mnt.
- RSYNC is used to copy files from the mount folder to a daily log folder.
- RSYNC is used to push files from this log folder to our data storage computer, called the 'gateway'.
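The two transfer stages in the list above can be sketched as follows, for anyone trying to build a similar load for reproduction. This is an illustration only: the actual rsync options we run are not given in this report, so `-a` and `--timeout` are assumptions, and the paths and the `gateway` host string are hypothetical placeholders.

```python
def rsync_command(source_dir, destination, io_timeout_s=60):
    """Build an rsync argument list that copies the contents of source_dir.

    The trailing slash on the source makes rsync copy the directory's
    contents rather than the directory itself. The options are assumed,
    not taken from the production setup.
    """
    return [
        "rsync",
        "-a",                          # archive mode (assumed)
        f"--timeout={io_timeout_s}",   # give up on stalled I/O (assumed)
        source_dir.rstrip("/") + "/",
        destination.rstrip("/") + "/",
    ]


if __name__ == "__main__":
    # Stage 1: mounted external folder -> daily log folder (hypothetical paths).
    pull = rsync_command("/mnt/station1", "/var/log/daily/2019-12-20")
    # Stage 2: daily log folder -> gateway storage (hypothetical host and path).
    push = rsync_command("/var/log/daily/2019-12-20", "gateway:/data/logs")
    print(" ".join(pull))
    print(" ".join(push))
```

The point of the sketch is that the daily log folder is the destination of the first stage and the source of the second, so it sees concurrent read and write traffic.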

Interestingly, the only folder that was entirely unreadable (no listing access at all) was the daily log folder, which is the folder with both incoming and outgoing RSYNC transfers.
This is especially tricky to diagnose because, when it occurs, no system logging is possible and the machine must be forcefully rebooted.

I am interested in any reports of similar behavior, and will continue looking into concrete methods to reproduce this problem.
Tags: No tags attached.
abrt_hash
URL

Activities

There are no notes attached to this issue.

Issue History

Date Modified Username Field Change
2019-12-20 19:36 kp New Issue