    sched/topology: Assert non-NUMA topology masks don't (partially) overlap · ccf74128
    Valentin Schneider authored
    topology.c::get_group() relies on the assumption that non-NUMA domains do
    not partially overlap. Zeng Tao pointed out in [1] that such topology
    descriptions, while completely bogus, can end up being exposed to the
    scheduler.
    
    In his example (8 CPUs, 2-node system), we end up with:
      MC span for CPU3 == 3-7
      MC span for CPU4 == 4-7
    
    The first pass through get_group(3, sdd@MC) will result in the following
    sched_group list:
    
      3 -> 4 -> 5 -> 6 -> 7
      ^                  /
       `----------------'
    
    And a later pass through get_group(4, sdd@MC) will "corrupt" that to:
    
      3 -> 4 -> 5 -> 6 -> 7
           ^             /
                `-----------'
    
    which will completely break things like 'while (sg != sd->groups)' when
    using CPU3's base sched_domain.
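
    The corruption described above can be reproduced outside the kernel with
    a small simulation. This is only a sketch: next_cpu[], link_span() and
    ring_returns_to() are hypothetical names, spans are plain bitmasks rather
    than struct cpumask, and the linking loop merely imitates how groups get
    chained into a circular list. It is not the code in topology.c.

```c
#define NR_CPUS 8

/* next_cpu[i] stands in for the ->next pointer of the group owned by CPU i. */
static int next_cpu[NR_CPUS];

/*
 * Chain every CPU set in 'span' (bit i == CPU i) into a circular list:
 * first -> ... -> last -> first. Existing links of CPUs outside 'span'
 * are left untouched, which is what lets a second call with a partially
 * overlapping span break an earlier ring.
 */
static void link_span(unsigned int span)
{
	int first = -1, last = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(span & (1u << cpu)))
			continue;
		if (first < 0)
			first = cpu;
		if (last >= 0)
			next_cpu[last] = cpu;
		last = cpu;
	}
	if (last >= 0)
		next_cpu[last] = first;
}

/*
 * Walk the list from 'start' and report whether we get back to it.
 * The walk is bounded by NR_CPUS steps so that a broken ring is
 * reported instead of looping forever.
 */
static int ring_returns_to(int start)
{
	int cpu = next_cpu[start];

	for (int steps = 0; steps < NR_CPUS; steps++) {
		if (cpu == start)
			return 1;
		cpu = next_cpu[cpu];
	}
	return 0;
}
```

    Calling link_span(0xF8) (CPUs 3-7) leaves a ring that ring_returns_to(3)
    can close. A following link_span(0xF0) (CPUs 4-7) redirects CPU7 back to
    CPU4, so a walk starting at CPU3 never sees CPU3 again, which is the
    unbounded 'while (sg != sd->groups)' case.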
    
    There are already some architecture-specific checks in place, such as
    x86/kernel/smpboot.c::topology_sane(), but this is something we can detect
    in the core scheduler, so it seems worthwhile to do so.
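
    As an illustration of what detecting this could look like, the sketch
    below treats two spans at the same non-NUMA level as acceptable only if
    they are identical or completely disjoint. The names spans_compatible()
    and span_sane() and the bitmask representation are assumptions made for
    the example, and this is not presented as the check the patch adds.

```c
#include <stdbool.h>

#define NR_CPUS 8

/* Two spans are fine if they are the same set of CPUs or share no CPU. */
static bool spans_compatible(unsigned int a, unsigned int b)
{
	return a == b || (a & b) == 0;
}

/*
 * span[i] is the span (bit j == CPU j) reported for CPU i at one
 * topology level. Return false if the span of 'cpu' partially overlaps
 * the span of any other CPU at that level.
 */
static bool span_sane(const unsigned int span[NR_CPUS], int cpu)
{
	for (int i = 0; i < NR_CPUS; i++) {
		if (i == cpu)
			continue;
		if (!spans_compatible(span[cpu], span[i]))
			return false;
	}
	return true;
}
```

    With CPU3 reporting 0xF8 (3-7) and CPU4 reporting 0xF0 (4-7), the two
    spans intersect without being equal, so span_sane() returns false for
    CPU3. A layout where CPUs 0-3 all report 0x0F and CPUs 4-7 all report
    0xF0 passes for every CPU.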
    
    Warn and a...
topology.c 57.9 KB