View Issue Details

ID: 0015923
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2019-03-14 15:16
Reporter: s.kachkin
Priority: high
Severity: major
Reproducibility: sometimes
Status: new
Resolution: open
Platform: ppc64le
OS: Centos 7
OS Version: 3.10.0-957.5.1.e
Product Version: 7.6.1810
Target Version:
Fixed in Version:
Summary: 0015923: Broadcast hrtimer becomes inactive, performance / power usage issue
Description: The broadcast hrtimer intermittently becomes inactive and is never restarted. As a result, the CPUs never enter a deep idle state.

Impact:
1) Performance overhead in HT/SMT environments: idle sibling LCPUs spin in a polling loop, which slows down the LCPUs doing productive work.
2) Increased server power usage, because the CPUs never enter a deep idle state.

Dump review notes:

On an idle machine, all processors except one are looping with the following stack (confirmed with perf):

 #0 [c000007ffe0ebe90] cpu_idle_poll at c000000000a87148 <<< --- all processors except one are polling.
 #1 [c000007ffe0ebec0] cpu_startup_entry at c000000000180a28
 #2 [c000007ffe0ebf20] start_secondary at c000000000054c30
 #3 [c000007ffe0ebf90] start_secondary_prolog at c000000000009b6c

All polling processors have their bit set in tick_broadcast_force_mask:
crash> p tick_broadcast_force_mask
tick_broadcast_force_mask = $1 =
 {{
    bits = {18446735277616529407, 18446744073709551615, 18446744073709551615, 18446744073709551615, 18446744073709551615, 18446744073709551615, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
  }}
crash> eval 18446735277616529407
hexadecimal: fffff7ffffffffff
    decimal: 18446735277616529407 (-8796093022209)
      octal: 1777777577777777777777
     binary: 1111111111111111111101111111111111111111111111111111111111111111
                                 ^ - cpu 43, broadcast owner.
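
The mask can also be decoded mechanically rather than by eye. The following is a minimal sketch, not part of the original dump review: `cleared_cpus` is a hypothetical helper, and only the six non-zero words are passed in, on the assumption that the trailing zero words lie beyond the CPUs present on this machine.

```python
def cleared_cpus(words, bits_per_word=64):
    """Return the CPU numbers whose bit is 0 in a cpumask.

    `words` is the `bits` array as crash prints it: unsigned 64-bit
    values, with the first word covering CPUs 0-63.
    """
    cleared = []
    for word_index, word in enumerate(words):
        for bit in range(bits_per_word):
            if not (word >> bit) & 1:
                cleared.append(word_index * bits_per_word + bit)
    return cleared


# The six non-zero words of tick_broadcast_force_mask from the dump.
ALL_ONES = 18446744073709551615
force_mask_words = [18446735277616529407] + [ALL_ONES] * 5

print(hex(force_mask_words[0]))        # 0xfffff7ffffffffff
print(cleared_cpus(force_mask_words))  # [43]
```

The only cleared bit in those words is bit 43, which matches the CPU the broadcast hrtimer is bound to.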

Checking clock event device for broadcast timer:

crash> struct clock_event_device ce_broadcast_hrtimer
struct clock_event_device {
  event_handler = 0xc0000000001966b0 <tick_handle_oneshot_broadcast>,
  set_next_event = 0x0,
  set_next_ktime = 0xc000000000198750 <bc_set_next>,
  next_event = {
    tv64 = 30539490000000 <------- the timer stopped at '30539490000000'
 ...........
  bound_on = 43, <---- it is bound to CPU 43
.............
}

Checking the clock event device of CPU 43:

struct clock_event_device {
  event_handler = 0xc000000000137450 <hrtimer_interrupt>,
  set_next_event = 0xc000000000024960 <decrementer_set_next_event>,
  set_next_ktime = 0x0,
  next_event = {
    tv64 = 88246684369798 <--- this is where we should be.
...............
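
To put the two next_event values side by side, the gap can be computed directly. This is a rough sketch that assumes tv64 holds nanoseconds (as ktime_t does in this kernel series) and that CPU 43's next_event is close to the time the dump was taken; the variable names are illustrative only.

```python
NSEC_PER_SEC = 1_000_000_000

# Values copied from the two clock_event_device dumps.
broadcast_next_event = 30539490000000  # ce_broadcast_hrtimer next_event.tv64
cpu43_next_event = 88246684369798      # CPU 43 device next_event.tv64

gap_ns = cpu43_next_event - broadcast_next_event
gap_seconds = gap_ns / NSEC_PER_SEC
gap_hours = gap_seconds / 3600

print(f"gap: {gap_ns} ns = {gap_seconds:.1f} s = {gap_hours:.2f} h")
# gap: 57707194369798 ns = 57707.2 s = 16.03 h
```

Under those assumptions the broadcast timer has been stalled for roughly 16 hours.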

The bc hrtimer is missing from the CPU's timer queue, but I found it, and it is inactive:

crash> struct hrtimer c0000000016d3df8
struct hrtimer {
  node = {
    node = {
      __rb_parent_color = 13835058055306100216,
      rb_right = 0xc000000123acd178,
      rb_left = 0x0
    },
    expires = {
      tv64 = 30539490000000 <-- far in the past
    }
  },
  _softexpires = {
    tv64 = 30539490000000
  },
  function = 0xc000000000198670 <bc_handler>,
  base = 0xc000000123acc9c8,
  state = 0, <------ the hrtimer is 'inactive'
  start_pid = 0,
  start_site = 0xc0000000001987b4 <bc_set_next+100>,
  start_comm = "swapper/253\000\000\000\000" <--- the bc timer is bound to CPU 43, but start_comm still holds the stale 'swapper/253'. I guess it was last restarted there. The problem could happen during an hrtimer restart or during migration from CPU 253 to CPU 43.
}
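
Two fields of this dump can be cross-checked against each other. The sketch below is an illustration, not output from the crash session. It assumes the hrtimer state flag values from the 3.10 kernel's include/linux/hrtimer.h, and that the rb_node is the first member of struct hrtimer, so it shares the hrtimer's address. The kernel's RB_EMPTY_NODE() test treats a node whose __rb_parent_color equals its own address as not linked into any tree; `decode_state` and `rb_node_is_empty` are hypothetical helper names.

```python
# hrtimer state flags (assumed to match the 3.10 kernel headers).
HRTIMER_STATE_FLAGS = {
    0x01: "ENQUEUED",
    0x02: "CALLBACK",
    0x04: "MIGRATE",
}


def decode_state(state):
    """Translate hrtimer.state into flag names; 0 means INACTIVE."""
    if state == 0:
        return ["INACTIVE"]
    return [name for bit, name in HRTIMER_STATE_FLAGS.items() if state & bit]


def rb_node_is_empty(node_addr, rb_parent_color):
    """Mirror RB_EMPTY_NODE(): a node pointing at itself is unlinked."""
    return rb_parent_color == node_addr


# Values from the struct hrtimer dump.
hrtimer_addr = 0xc0000000016d3df8
rb_parent_color = 13835058055306100216
state = 0

print(decode_state(state))                              # ['INACTIVE']
print(hex(rb_parent_color))                             # 0xc0000000016d3df8
print(rb_node_is_empty(hrtimer_addr, rb_parent_color))  # True
```

Both checks agree: the timer is marked inactive and its rb node is not linked into any timer queue, which is consistent with it being missing from CPU 43's queue.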

Steps To Reproduce: The problem happens intermittently under a heavy postgres workload.
Tags: kernel bug
abrt_hash:
URL:

Activities

There are no notes attached to this issue.

Issue History

Date Modified Username Field Change
2019-03-14 15:16 s.kachkin New Issue
2019-03-14 15:16 s.kachkin Tag Attached: kernel bug