View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0016851||CentOS-7||kernel||public||2019-12-20 19:36||2019-12-20 19:36|
|Target Version||Fixed in Version|
|Summary||0016851: Observed multiple random occurrences of the root file system entering a read-only state. System locked up.|
|Description||Starting with CentOS 7.1, we began to observe write-related system lockups at product sites.
The observed behavior is described by the points below.
We are filing this report to inform CentOS of the issue while we investigate further.
The problem has been elusive for almost a year, and we still have no concrete method to reproduce it.
- The system enters a state where the root file system becomes read-only.
- The system either freezes or does not wake up (presumably because it cannot write), leaving the display unusable.
- Network access remains available: we are able to execute remote commands via SSH as users whose environment files (bashrc/cshrc/etc.) do not attempt to read from or write to disk.
- Forceful reboot is the only way to recover.
- Once rebooted, random files have been observed to be truncated to 0 bytes. The inode/datablock tie is completely removed from the disk structure, as if it never existed.
- Processes remain active and, when killed (via the ssh command method), become 'defunct' forever and do not quit. It seems that the /proc file system cannot be written to, causing the application to appear hung forever, since the PIDs are never removed.
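One way to confirm the 'defunct' observation without writing to disk is to read each process state from /proc/<pid>/stat. The sketch below is only an illustration and not something we ran on the affected systems. The function names are hypothetical, and the inclusion of state D (uninterruptible sleep, usually seen on tasks blocked on I/O) alongside Z (zombie/defunct) is an assumption about what might be worth looking for.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')' instead of splitting on spaces.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren].strip())
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def list_processes_in_state(wanted=("Z", "D"), proc_root="/proc"):
    """Read-only scan of proc_root for processes whose state is in wanted."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_state(handle.read())
        except (OSError, ValueError):
            continue  # process exited mid-scan, or the line was malformed
        if state in wanted:
            found.append((pid, comm, state))
    return found
```

Calling list_processes_in_state() only reads from /proc, so it should be usable over the same ssh one-liner approach described here, provided the interpreter itself can still be loaded from disk.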
When the system was in this state, I was able to examine files using very simple commands of the form 'ssh <user@ip> <command>', such as 'ls', 'cat', and 'ps'. Any I/O action that attempted to write to disk hung indefinitely, and any more complex command would fail. For example, echo "HELLO" > /var/tmp/output hung forever.
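Since reads still worked in this state, the read-only condition could in principle be confirmed from the mount table rather than by attempting a write that hangs. The following is a minimal sketch under that assumption. It parses text in the /proc/mounts format (device, mount point, file system type, options, dump, pass), and the function names are hypothetical.

```python
def mount_options(mounts_text, mount_point="/"):
    """Return the option list of the last entry for mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep overwriting so that a later entry for the same mount
            # point (which shadows earlier ones) is the one reported.
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point="/"):
    """True if mount_point is listed with the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options
```

On a live system the input would be the contents of /proc/mounts, for example is_read_only(open("/proc/mounts").read()). This only reads, so it avoids the hang seen with write commands.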
|Steps To Reproduce||Unknown at this time. We have observed this issue five times at our 30+ sites over the last year. It has also occurred twice locally during testing in the last year.|
|Additional Information||I suspect that our use of RSYNC (3.0.9) to copy files from network mounts and to submit information to our 'gateway' data storage system is contributing to this.
The system uses RSYNC as follows:
- CentOS is running on our main 'hub' computer.
- This hub machine mounts various folders from external computers to a folder in /mnt.
- RSYNC is used to copy files from the mount folder to a daily log folder.
- RSYNC is used to push files from this log folder to our data storage computer, called the 'gateway'.
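The two-hop flow above can be sketched as a pair of rsync invocations. This is only an illustration of the shape of the workflow, not our actual scripts. Every path, the gateway target, the per-day folder naming, and the choice of the -a (archive) option are assumptions made for the example.

```python
import datetime


def build_rsync_commands(mount_dir, log_root, gateway_target, day=None):
    """Build the pull and push rsync argument lists for one day.

    All arguments are hypothetical placeholders; -a (archive mode) is an
    assumed option, not necessarily what the real system passes.
    """
    day = day or datetime.date.today()
    daily_dir = f"{log_root}/{day.isoformat()}/"
    # Hop 1: copy from the network mount under /mnt into the daily log folder.
    pull = ["rsync", "-a", f"{mount_dir.rstrip('/')}/", daily_dir]
    # Hop 2: push the daily log folder on to the gateway storage machine.
    push = ["rsync", "-a", daily_dir, gateway_target]
    return pull, push
```

The point of the sketch is that the daily log folder is both the destination of the first transfer and the source of the second, so it sees write and read traffic from RSYNC at the same time.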
Interestingly, the only folder that was entirely unreadable (it could not even be listed) was the daily log folder, which is the folder with both incoming and outgoing RSYNC transfers.
This is especially tricky, since when it occurs, no system logging is possible and the machine must be forcefully rebooted.
I am interested in any reports of similar behavior, and will continue looking into concrete methods to reproduce this problem.
|Tags||No tags attached.|