View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0012808||CentOS-7||gnome-shell||public||2017-02-13 10:26||2018-08-16 16:26|
|Target Version||Fixed in Version|
|Summary||0012808: Gnome-Shell freezes after 50 days running|
We are in the process of moving to CentOS as the base platform for our products. After an uptime of around 50 days, we noticed on many of these systems that the user interface freezes and is no longer usable. On most systems one can still click on the taskbar, but the shell won't switch to the selected application. Sometimes one can still use (parts of) the applications; sometimes they won't react to any input at all. Often it is still possible to open the app menu and start new programs.
Most of the time it is possible to restart gnome-shell with Alt+F2 => r, after which everything works again (probably for the next 50 days). But sometimes this won't open the command menu, and then everything is frozen.
These systems were all installed with a custom kickstart installation, but I could reproduce this bug with a standard installation.
I don't know whether the bug appears on 7.3, but I expect it does; I will start a test to check.
|Steps To Reproduce||Make a standard installation of 7.2.1511 (all settings on default). |
Open a terminal and start a loop that prints the time every minute (I don't know how important this step is).
Wait 50 days.
Note: In my test that system was without network connection all the time.
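The terminal loop from the steps above could look like the following minimal sketch. Python is used only for illustration; the report does not say which shell or language was used, and the function name `print_time_loop` and its parameters are hypothetical. Whether the loop matters for reproducing the freeze is unknown.

```python
import time
from datetime import datetime


def print_time_loop(iterations=None, interval_seconds=60, sleep=time.sleep):
    """Print the current local time, then wait `interval_seconds` seconds.

    iterations=None loops until interrupted; a number limits the loop,
    which is handy for a quick dry run. Returns the number of lines printed.
    """
    count = 0
    while iterations is None or count < iterations:
        # flush=True so each timestamp shows up in the terminal immediately
        print(datetime.now().strftime("%Y-%m-%d %H:%M:%S"), flush=True)
        count += 1
        sleep(interval_seconds)
    return count
```

Calling `print_time_loop()` with no arguments prints one line per minute until it is interrupted with Ctrl+C.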
|Additional Information||A 32-bit unsigned integer timestamp (maximum 4294967295) with millisecond precision overflows after 49.71 days.|
I have taken a look at the gnome-shell source code and found some timestamps that could lead to these problems, but I wasn't able to find out exactly where the problem is.
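The 49.71-day figure can be checked with a few lines of arithmetic. This is only a sketch of the calculation; the helper names `days_until_wrap` and `wrapped_timestamp` are made up for illustration and do not come from gnome-shell or mutter.

```python
UINT32_MAX = 4294967295            # largest value of a 32-bit unsigned integer
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86,400,000 milliseconds per day


def days_until_wrap(counter_max=UINT32_MAX, ticks_per_day=MS_PER_DAY):
    """Days until a counter that ticks once per millisecond wraps to 0."""
    # The counter takes counter_max + 1 distinct values before wrapping.
    return (counter_max + 1) / ticks_per_day


def wrapped_timestamp(uptime_ms):
    """Value a 32-bit millisecond counter shows after `uptime_ms` of uptime."""
    return uptime_ms % (UINT32_MAX + 1)


print(round(days_until_wrap(), 2))          # 49.71
print(wrapped_timestamp(50 * MS_PER_DAY))   # 25032704, a small value again
```

After 50 days of uptime the counter has wrapped once and reads a small number again, so any code that compares it naively against a value stored before the wrap sees time going backwards.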
|Interesting - I also noted that gnome-shell in CentOS 7 became painfully slow after running for several weeks. The problem was there in point releases before 7.2.1511 as well, but IIRC 7.2.1511 was the last time I tested it - I simply switched to Mate from EPEL instead.|
This is still an issue in the latest 7.4. (Or at least a 50-day-old 7.4, since it takes a long time to reproduce.)
Here are some potentially interesting log messages from /var/log/messages:
org.gnome.Shell.desktop: Window manager warning: last_user_time (20691848) is greater than comparison timestamp (4262798381). This most likely represents a buggy client sending inaccurate timestamps in messages such as _NET_ACTIVE_WINDOW. Trying to work around...
org.gnome.Shell.desktop: Window manager warning: 0x1800076 (TITLE_OF_WINDOW) appears to be one of the offending windows with a timestamp of 20691848. Working around...
journal: Couldn't lock screen: Timeout was reached
org.gnome.Shell.desktop: Window manager warning: last_focus_time (20699824) is greater than comparison timestamp (4262798381). This most likely represents a buggy client sending inaccurate timestamps in messages such as _NET_ACTIVE_WINDOW. Trying to work around...
org.gnome.Shell.desktop: Window manager warning: last_user_time (20699655) is greater than comparison timestamp (4262798381). This most likely represents a buggy client sending inaccurate timestamps in messages such as _NET_ACTIVE_WINDOW. Trying to work around...
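The timestamps in these warnings are what one would expect around a 32-bit wraparound: the comparison timestamp is close to 4294967295, while the "greater" values are small. The sketch below shows a generic wraparound-aware comparison of 32-bit millisecond timestamps, applied to the first pair from the log. It is an illustration under stated assumptions, not mutter's actual code; the function name `is_before` is hypothetical, and it assumes the two timestamps are less than 2**31 ms (about 24.9 days) apart.

```python
UINT32_MODULUS = 2 ** 32


def is_before(time1, time2):
    """True if time1 is earlier than time2 in modulo-2**32 arithmetic.

    Assumes the real distance between the two is below 2**31 ms.
    """
    delta = (time2 - time1) % UINT32_MODULUS
    return 0 < delta < UINT32_MODULUS // 2


# Values copied from the first warning in the log above.
comparison_timestamp = 4262798381
last_user_time = 20691848

# A naive integer comparison puts last_user_time far in the past ...
print(last_user_time > comparison_timestamp)              # False
# ... while the wraparound-aware one puts it after the comparison timestamp.
print(is_before(comparison_timestamp, last_user_time))    # True
# Modular distance between the two, in milliseconds (roughly 14.7 hours).
print((last_user_time - comparison_timestamp) % UINT32_MODULUS)  # 52860763
```

Under this reading, the small value is the newer one, taken after the counter wrapped, and the large one is a stale value from before the wrap. That is consistent with the "is greater than" wording of the warning, but how mutter actually arrives at it is an assumption here.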
Finally! I'm not the only one facing this issue!
After frustratingly blaming gnome-system-monitor, desktop animations, VMware Workstation, heavy I/O and bad RAM for this issue, I can confirm that the gnome-shell hang happens on several of my CentOS 7 installations. The hang is simply time-based, occurring somewhere between 30 and 90 days of uptime; I will try to narrow down the time.
Switching to a TTY and killing gnome-shell is currently the only fix. It has been noticed on CentOS 7.4, gnome-shell version 3.22.3.
Note that similar symptoms have been noticed on an Ubuntu 14.04 machine running GNOME 3, but since I have only one machine with that distribution I cannot confirm this for sure.
I will be testing on Red Hat 7.5 and will report back with the results in 50 days.
|CentOS 7.5 was released about 2 months ago and contains yet another rebase of GNOME from 3.22 to 3.26, so it's definitely worth trying this on the latest release. See you in 49.1 days...|
Last week I came back to this thread and, motivated by your answers, spent some time on this problem. I found an interesting bug entry with a fix here:
and some background info:
It seems that this fix was committed to master on 1 Feb 2018 and has not made it into the latest CentOS version, so one has to patch mutter by hand.
I also found a nice tutorial on how to patch mutter yourself:
I am going to test it and will report back how it is going.
|That's not a great tutorial for rebuilding. You should grab the SRPM for it, install that, edit the spec file to include your new patch file, rebuild the SRPM from that then feed the SRPM into mock to rebuild it.|
|It's also probably worth raising this on bugzilla.redhat.com and trying to get the distro version patched by them|
I submitted it to bugzilla.redhat.com here:
There were already bug entries here:
|According to upstream, https://access.redhat.com/errata/RHBA-2018:2461, this bug has been fixed in mutter-3.26.2-15.el7_5|
|2017-02-13 10:26||d.winter||New Issue|
|2017-02-13 10:26||d.winter||Tag Attached: gnome-shell|
|2017-02-14 12:51||Marcus Sundberg||Note Added: 0028570|
|2018-01-23 15:firstname.lastname@example.org||Note Added: 0031000|
|2018-07-21 16:46||amin.sushant||Note Added: 0032330|
|2018-07-21 17:20||TrevorH||Note Added: 0032331|
|2018-08-06 10:57||d.winter||Note Added: 0032438|
|2018-08-06 11:18||TrevorH||Note Added: 0032439|
|2018-08-06 11:20||TrevorH||Note Added: 0032440|
|2018-08-07 09:43||d.winter||Note Added: 0032451|
|2018-08-16 16:26||pgreco||Status||new => resolved|
|2018-08-16 16:26||pgreco||Resolution||open => fixed|
|2018-08-16 16:26||pgreco||Note Added: 0032511|