View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0017482 | CentOS-8 | kernel | public | 2020-06-18 08:04 | 2020-11-06 12:18 |
Reporter | ParkaMark | Assigned To | |||
Priority | normal | Severity | minor | Reproducibility | always |
Status | new | Resolution | open | ||
OS | CentOS | OS Version | 8.2.2004 | ||
Summary | 0017482: kworker/0:1+events using 33% CPU usage on a single core | ||||
Description | After upgrading to 8.2.2004 of CentOS 8, I have noted there is now a single kworker process that is continuously utilising around 33% of a single CPU core: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 52 root 20 0 0 0 0 R 32.9 0.0 17:35.57 kworker/0:1+events Googling has revealed this as a known problem for a couple of years now, and from what I can see, users have identified the problem as being excessive ACPI GPE's being generated, which you can view from running a command such as: grep . -r /sys/firmware/acpi/interrupts However, when running the above, all the counters on each GPE entry are 0 on my system, which suggests that there is actually no ACPI GPE's being generated at all. Further more, running "iotop -b -o -t | grep -i kworker" also shows no unusual activity - in fact, the exact kworker process in question ("0:1" in this case) doesn't even show up on the iotop output. I have also added "acpi_enforce_resources=lax" to the Kernel boot params in an attempt to resolve, but not affect. I added this because I noticed that the "f71882fg" temp module was no longer loading after the upgrade, failing with this error: lm_sensors-modprobe-wrapper[2805]: modprobe: ERROR: could not insert 'f71882fg': Device or resource busy and adding "acpi_enforce_resources=lax" has now resolved this. If any further diagnostics are required, please let me know required commands to run and I will update with results when I can. Thank you. | ||||
Tags | kernel bug, kworker | ||||
Using perf, I have managed to take a 10 second snap shot of kernel events - see below. It looks like fwtable_read32 is the culprit. Note this is still happening on the latest kernel upgrade to 4.18.0-193.14.2.el8_2.x86_64 I will continue posting if I discover anything else. Samples: 31K of event 'cycles', Event count (approx.): 9252703616 Children Self Command Shared Object Symbol + 69.11% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] ret_from_fork + 69.11% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] kthread + 69.11% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] worker_thread + 69.11% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] process_one_work + 69.11% 0.01% kworker/1:2+eve [kernel.kallsyms] [k] i915_hpd_poll_init_work + 69.11% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] drm_helper_hpd_irq_event + 69.10% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] drm_helper_probe_detect_ctx + 68.54% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] intel_hdmi_detect + 68.48% 0.01% kworker/1:2+eve [kernel.kallsyms] [k] intel_hdmi_set_edid + 68.46% 0.01% kworker/1:2+eve [kernel.kallsyms] [k] i2c_transfer + 68.46% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] __i2c_transfer + 68.46% 0.01% kworker/1:2+eve [kernel.kallsyms] [k] gmbus_xfer + 66.76% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] bit_xfer + 65.21% 0.59% kworker/1:2+eve [kernel.kallsyms] [k] sclhi + 63.75% 0.84% kworker/1:2+eve [kernel.kallsyms] [k] get_clock + 55.55% 52.60% kworker/1:2+eve [kernel.kallsyms] [k] fwtable_read32 + 49.68% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] try_address + 46.26% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] drm_get_edid + 46.26% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] drm_do_probe_ddc_edid + 33.96% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] i2c_stop + 32.32% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] i2c_outb + 25.11% 0.00% swapper [kernel.kallsyms] [k] secondary_startup_64 + 25.11% 0.03% swapper [kernel.kallsyms] [k] cpu_startup_entry + 25.09% 0.12% swapper [kernel.kallsyms] [k] do_idle + 23.15% 0.01% swapper [kernel.kallsyms] [k] cpuidle_enter + 23.13% 0.18% swapper [kernel.kallsyms] [k] cpuidle_enter_state + 23.07% 0.00% swapper [kernel.kallsyms] [k] start_secondary + 22.21% 0.00% kworker/1:2+eve [kernel.kallsyms] [k] drm_dp_dual_mode_detect + 22.21% 0.01% kworker/1:2+eve [kernel.kallsyms] [k] drm_dp_dual_mode_read + 16.15% 16.13% swapper [kernel.kallsyms] [k] intel_idle + 7.97% 4.63% kworker/1:2+eve [kernel.kallsyms] [k] _raw_spin_lock_irqsave + 6.38% 5.18% kworker/1:2+eve [kernel.kallsyms] [k] fwtable_write32 + 4.45% 0.05% swapper [kernel.kallsyms] [k] irq_exit + 4.34% 0.04% swapper [kernel.kallsyms] [k] __softirqentry_text_start + 3.92% 0.01% swapper [kernel.kallsyms] [k] ret_from_intr + 3.91% 0.02% swapper [kernel.kallsyms] [k] do_IRQ + 3.83% 0.05% swapper [kernel.kallsyms] [k] net_rx_action + 3.65% 3.31% kworker/1:2+eve [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore + 3.48% 0.09% swapper [kernel.kallsyms] [k] igb_poll + 3.32% 0.04% swapper [kernel.kallsyms] [k] __netif_receive_skb_core + 3.16% 0.02% swapper [kernel.kallsyms] [k] ip_rcv + 3.08% 0.01% swapper [kernel.kallsyms] [k] netif_receive_skb_internal ... |
|
Ok, I think this is something to do with the onboard Intel graphics, relating to HDMI. This machine does not get used for graphics at all, and does not even have any HDMI display attached to the system, as it is only a server. - 55.55% 52.60% kworker/1:2+eve [kernel.kallsyms] [k] fwtable_read32 ◆ - 52.60% ret_from_fork ▒ kthread ▒ worker_thread ▒ process_one_work ▒ i915_hpd_poll_init_work ▒ drm_helper_hpd_irq_event ▒ - drm_helper_probe_detect_ctx ▒ - 52.54% intel_hdmi_detect ▒ - 52.54% intel_hdmi_set_edid ▒ - 35.60% drm_get_edid ▒ drm_do_probe_ddc_edid ▒ i2c_transfer ▒ __i2c_transfer ▒ gmbus_xfer ▒ - bit_xfer ▒ - 26.46% try_address ▒ - 17.33% i2c_outb ▒ - 17.08% sclhi ▒ - 16.94% get_clock ▒ fwtable_read32 ▒ - 9.11% i2c_stop ▒ - 9.06% sclhi ▒ - 9.03% get_clock ▒ fwtable_read32 ▒ - 9.11% i2c_stop ▒ - 9.09% sclhi ▒ - 9.08% get_clock ▒ fwtable_read32 ▒ - 16.93% drm_dp_dual_mode_detect ▒ drm_dp_dual_mode_read ▒ i2c_transfer ▒ __i2c_transfer ▒ gmbus_xfer ▒ - bit_xfer ▒ - 12.39% try_address ▒ - 7.77% i2c_outb ▒ - 7.77% sclhi ▒ - 7.76% get_clock ▒ fwtable_read32 ▒ - 4.62% i2c_stop ▒ - 4.59% sclhi ▒ get_clock ▒ fwtable_read32 ▒ - 4.53% i2c_stop ▒ - 4.52% sclhi ▒ - get_clock ▒ fwtable_read32 ▒ - 2.95% fwtable_read32 ▒ 1.68% _raw_spin_unlock_irqrestore ▒ 0.98% _raw_spin_lock_irqsave |
|
I don't know for sure, but I now suspect this issue is the result of a back-ported bug in the i915 driver: https://bugs.freedesktop.org/show_bug.cgi?id=112125 I'm not absolutely certain this is the same bug, as the thread is talking about kworker using 100% rather than my observed 33%, but I'm now confident this a problem with the i915 driver. It would be great if this fix (or the latest i915 version) could be brought into the CentOS 8 4.* kernel as deviating away to a 5.* kernel (instructions here: https://computingforgeeks.com/how-to-install-latest-kernel-on-centos-linux/) may introduce other problems I do not wish to entertain right now. My only other option would be to blacklist/disable the i915 module at boot time: https://access.redhat.com/solutions/41278 |
|
I think I got a same issue. But what I running is Debian "Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64". - 72.80% 68.71% kworker/1:0+eve [kernel.kallsyms] [k] fwtable_read32 ▒ - 68.71% ret_from_fork ▒ kthread ▒ worker_thread ▒ process_one_work ▒ - i915_hotplug_work_func ▒ - 68.69% intel_encoder_hotplug ▒ drm_helper_probe_detect_ctx ▒ - intel_hdmi_detect ▒ - 68.68% intel_hdmi_set_edid ▒ - 50.70% drm_get_edid ▒ drm_do_probe_ddc_edid ▒ i2c_transfer ▒ __i2c_transfer ▒ gmbus_xfer ▒ + bit_xfer ▒ - 17.99% drm_dp_dual_mode_detect ▒ drm_dp_dual_mode_read ▒ i2c_transfer ▒ __i2c_transfer ▒ gmbus_xfer ▒ + bit_xfer ▒ + 4.09% fwtable_read32 ▒ + 62.85% 0.00% kworker/1:0+eve [kernel.kallsyms] [k] drm_get_edid ▒ + 62.85% 0.01% kworker/1:0+eve [kernel.kallsyms] [k] drm_do_probe_ddc_edid ▒ + 58.37% 0.00% kworker/1:0+eve [kernel.kallsyms] [k] try_address ▒ + 46.82% 0.00% kworker/1:0+eve [kernel.kallsyms] [k] i2c_stop ▒ + 38.79% 0.00% kworker/1:0+eve [kernel.kallsyms] [k] i2c_outb ▒ + 23.31% 0.00% kworker/1:0+eve [kernel.kallsyms] [k] drm_dp_dual_mode_detect ▒ + 23.31% 0.00% kworker/1:0+eve [kernel.kallsyms] [k] drm_dp_dual_mode_read ▒ + 10.61% 6.18% kworker/1:0+eve [kernel.kallsyms] [k] _raw_spin_lock_irqsave ▒ + 8.49% 6.89% kworker/1:0+eve [kernel.kallsyms] [k] fwtable_write32 |
|
As a work around, I followed the instructions here: https://access.redhat.com/solutions/41278 for blacklisting the i915 driver. This has now reduced the load average back to zero and the problem is now gone. | |
Date Modified | Username | Field | Change |
---|---|---|---|
2020-06-18 08:04 | ParkaMark | New Issue | |
2020-06-18 08:04 | ParkaMark | Tag Attached: kernel bug | |
2020-06-18 08:04 | ParkaMark | Tag Attached: kworker | |
2020-08-04 14:48 | ParkaMark | Note Added: 0037486 | |
2020-08-04 15:00 | ParkaMark | Note Added: 0037487 | |
2020-08-05 07:55 |
|
File Added: logo.png | |
2020-08-05 16:44 | toracat | File Deleted: logo.png | |
2020-08-05 20:08 | ParkaMark | Note Added: 0037502 | |
2020-09-11 01:52 | fghlw | Note Added: 0037690 | |
2020-11-06 12:18 | ParkaMark | Note Added: 0037852 |