View Issue Details

ID: 0005792
Project: CentOS-6
Category: kernel
View Status: public
Last Update: 2012-06-21 22:36
Reporter: rickdic
Priority: normal
Severity: crash
Reproducibility: always
Status: new
Resolution: open
Platform: Xeon Quad-core
OS: CentOS 6.2 (x86_64)
OS Version: 2.6.32-220.el6
Product Version: 6.2
Target Version:
Fixed in Version:
Summary: 0005792: call to 'mlock()' hangs system in presence of multiple RT threads
Description: When multiple RealTime threads are executing on their own, individual CPUs, a call to 'mlock()' from a non-realtime thread on a different CPU will hang. If some of the RealTime threads lower their priority, the call to 'mlock()' will complete.

I set the severity of this bug report to 'crash' because this failure results in a hung machine.

See additional information section...
Steps To Reproduce: I have created a test case (mlock_test) which demonstrates the problem. It requires a minimum of 4 CPUs. It must also be run in text mode (i.e. runlevel 3), because the RealTime threads occupy most of the CPUs.

1. compile mlock_test using:
    gcc -Wall -Werror -o mlock_test mlock_test.c -lpthread
2. open a console window in text mode (i.e. sudo init 3), then execute mlock_test using:
    sudo ./mlock_test
3. wait for the text "aux_main: starting in ..."; this will count down from 5s
4. wait for the text "about to lock..."

- AT THIS POINT, THE THREAD IS HUNG IN 'mlock()'

After a few more seconds, each of the RealTime threads will lower its priority, displaying the text "... adjusted scheduling to SCHED_OTHER."
5. Once all the RealTime threads are running at normal priority, the 'mlock()' call will complete, displaying the message "memory locked (0x...)"
6. Several allocation/lock messages will be displayed, after which the <ENTER> key can be pressed to exit the application.
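
The priority change described in the steps above (RealTime first, then back to SCHED_OTHER) can be sketched roughly as follows. This is not taken from mlock_test.c: the function names are hypothetical, and the choice of SCHED_FIFO at its minimum priority is an assumption, since the report does not say which RealTime policy or priority the test threads use.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Apply a scheduling policy and priority to the calling thread.
 * Returns 0 on success or an errno value (e.g. EPERM when the caller
 * lacks the privilege to use a RealTime policy). */
int set_self_policy(int policy, int priority)
{
    struct sched_param param = { .sched_priority = priority };

    assert(policy == SCHED_OTHER || policy == SCHED_FIFO || policy == SCHED_RR);
    return pthread_setschedparam(pthread_self(), policy, &param);
}

/* Assumption: SCHED_FIFO at the lowest RealTime priority. */
int raise_to_realtime(void)
{
    return set_self_policy(SCHED_FIFO, sched_get_priority_min(SCHED_FIFO));
}

/* Return to the default time-sharing policy; SCHED_OTHER requires priority 0. */
int drop_to_normal(void)
{
    return set_self_policy(SCHED_OTHER, 0);
}
```

A test thread would call raise_to_realtime() before it starts spinning and drop_to_normal() when it is time to let the 'mlock()' call proceed. Run without root privileges, raise_to_realtime() is expected to fail with EPERM, which is why the test case is started with sudo.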

See additional information section...
Additional Information:
Notes:

1. The test case executes properly on CentOS 5.5 & 5.7; that is, the call to 'mlock()' does NOT hang.

2. The test case fails identically on:
  - Quad-core Xeon (4 CPUs)
  - Dual Quad-core Xeon (8 CPUs)
  - Dual Quad-core Xeon w/Hyperthreads (16 CPUs)

3. Theory of operation:
  - the 'main()' thread creates an 'aux_main()' thread, affinitizes both itself and that thread to CPU 0, then blocks on a barrier until the 'aux_main()' thread completes
  - the 'aux_main()' thread creates several RealTime threads (1 fewer than the CPU count) and affinitizes each one to its own CPU (starting w/highest CPU # and proceeding down to #1). Each of these threads spins, waiting for an 'exit_flag' to be set (via a memory location)
  - after creating the RealTime threads, the 'aux_main()' thread allocates a large block of memory and attempts to 'mlock()' it (where it hangs). It does this once for each RealTime thread.

4. If the code is changed such that the RealTime threads are instead created as normally-scheduled threads, then the call to 'mlock()' DOES NOT HANG. This can be demonstrated by changing line 227 to read:
  "if( 0 )"
This prevents the test threads from being elevated to RealTime threads.

5. The amount of memory allocated/locked is large, but I have seen identical results using a memory size of 4 KB.
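
For readers without the attachment, the building blocks from the theory of operation in note 3 might look roughly like the sketch below. It is not the code from mlock_test.c: all names are hypothetical, it uses C11 atomics for the 'exit_flag', and it deliberately leaves out the elevation to a RealTime policy, so running it as written should not reproduce the hang.

```c
#define _GNU_SOURCE          /* for CPU_SET() and pthread_setaffinity_np() */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Set by the controlling thread to release the spinning threads. */
static atomic_int exit_flag;

/* Pin the calling thread to a single CPU. Returns 0 or an errno value. */
int pin_self_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

/* Test thread: pin to the CPU number passed via arg, then spin until
 * exit_flag is set. In the reported scenario the thread would also be
 * raised to a RealTime policy before spinning; that is omitted here. */
void *spinner(void *arg)
{
    (void)pin_self_to_cpu(*(int *)arg);   /* best effort in this sketch */
    while (!atomic_load(&exit_flag))
        ;                                 /* busy-wait, never sleeps */
    return NULL;
}

/* Allocate a block and try to lock it into RAM.
 * Returns 0 on success, -1 on failure (errno set by malloc() or mlock()). */
int alloc_and_lock(size_t bytes)
{
    void *block;
    int rc;

    assert(bytes > 0);
    block = malloc(bytes);
    if (block == NULL)
        return -1;
    rc = mlock(block, bytes);             /* the call that hangs in this report */
    if (rc == 0)
        munlock(block, bytes);
    free(block);
    return rc;
}
```

In the arrangement the report describes, one spinner() would be started per CPU from the highest CPU number down to #1, and alloc_and_lock() would then be called from a thread pinned to CPU 0.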
Tags: No tags attached.

Activities

rickdic

2012-06-21 22:36

reporter  

mlock_test.c (10,925 bytes)

Issue History

Date Modified    | Username | Field                    | Change
2012-06-21 22:36 | rickdic  | New Issue                |
2012-06-21 22:36 | rickdic  | File Added: mlock_test.c |