1. 04 Oct, 2022 2 commits
    • Ian Rogers's avatar
      perf topology: Add core_wide · cc2c4e26
      Ian Rogers authored
      
      It is possible to optimize metrics when all SMT threads (CPUs) on a
      core are measuring events in system wide mode. For example, TMA
      metrics defines CORE_CLKS for Sandybrdige as:
      
      if SMT is disabled:
        CPU_CLK_UNHALTED.THREAD
      if SMT is enabled and recording on all SMT threads:
        CPU_CLK_UNHALTED.THREAD_ANY / 2
      if SMT is enabled and not recording on all SMT threads:
        (CPU_CLK_UNHALTED.THREAD/2)*
        (1+CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE/CPU_CLK_UNHALTED.REF_XCLK )
      
      That is two more events are necessary when not gathering counts on all
      SMT threads. To distinguish all SMT threads on a core vs system wide
      (all CPUs) call the new property core wide.  Add a core wide test that
      determines the property from user requested CPUs, the topology and
      system wide. System wide is required as other processes running on a
      SMT thread will change the counts.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Ahmad Yasin <ahmad.yasin@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220831174926.579643-5-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      cc2c4e26
    • Ian Rogers's avatar
      perf smt: Compute SMT from topology · 09b73fe9
      Ian Rogers authored
      
      The topology records sibling threads. Rather than computing SMT using
      siblings in sysfs, reuse the values in topology. This only applies
      when the file smt/active isn't available.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Ahmad Yasin <ahmad.yasin@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220831174926.579643-4-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      09b73fe9
  2. 22 Jan, 2022 1 commit
    • Ian Rogers's avatar
      perf cpumap: Migrate to libperf cpumap api · 44028699
      Ian Rogers authored
      
      Switch from directly accessing the perf_cpu_map to using the appropriate
      libperf API when possible. Using the API simplifies the job of
      refactoring use of perf_cpu_map.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: http://lore.kernel.org/lkml/20220122045811.3402706-3-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      44028699
  3. 14 Jan, 2022 1 commit
    • Thomas Richter's avatar
      perf cputopo: Fix CPU topology reading on s/390 · a6e62743
      Thomas Richter authored
      Commit fdf1e29b ("perf expr: Add metric literals for topology.")
      fails on s390:
      
       # ./perf test -Fv 7
         ...
       # FAILED tests/expr.c:173 #num_dies >= #num_packages
         ---- end ----
         Simple expression parser: FAILED!
       #
      
      Investigating this issue leads to these functions:
       build_cpu_topology()
         +--> has_die_topology(void)
              {
                 struct utsname uts;
      
                 if (uname(&uts) < 0)
                        return false;
                 if (strncmp(uts.machine, "x86_64", 6))
                        return false;
                 ....
              }
      
      which always returns false on s390. The caller build_cpu_topology()
      checks has_die_topology() return value. On false the
      the struct cpu_topology::die_cpu_list is not contructed and has zero
      entries. This leads to the failing comparison: #num_dies >= #num_packages.
      s390 of course has a positive number of packages.
      
      Fix this by adding s390 architecture to support CPU die list.
      
      Output after:
       # ./perf test -Fv 7
        7: Simple expression parser                                        :
        --- start ---
        division by zero
        syntax error
        ---- end ----
        Simple expression parser: Ok
       #
      
      Fixes: fdf1e29b
      
       ("perf expr: Add metric literals for topology.")
      Reviewed-by: default avatarIan Rogers <irogers@google.com>
      Signed-off-by: default avatarThomas Richter <tmricht@linux.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: https://lore.kernel.org/r/20211124090343.9436-1-tmricht@linux.ibm.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a6e62743
  4. 12 Jan, 2022 2 commits
    • Ian Rogers's avatar
      perf cpumap: Give CPUs their own type · 6d18804b
      Ian Rogers authored
      
      A common problem is confusing CPU map indices with the CPU, by wrapping
      the CPU with a struct then this is avoided. This approach is similar to
      atomic_t.
      
      Committer notes:
      
      To make it build with BUILD_BPF_SKEL=1 these files needed the
      conversions to 'struct perf_cpu' usage:
      
        tools/perf/util/bpf_counter.c
        tools/perf/util/bpf_counter_cgroup.c
        tools/perf/util/bpf_ftrace.c
      
      Also perf_env__get_cpu() was removed back in "perf cpumap: Switch
      cpu_map__build_map to cpu function".
      
      Additionally these needed to be fixed for the ARM builds to complete:
      
        tools/perf/arch/arm/util/cs-etm.c
        tools/perf/arch/arm64/util/pmu.c
      Suggested-by: default avatarJohn Garry <john.garry@huawei.com>
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-49-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6d18804b
    • Ian Rogers's avatar
      perf cpumap: Move 'has' function to libperf · dfc66bef
      Ian Rogers authored
      
      Make the cpu map argument const for consistency with the rest of the
      API. Modify cpu_map__idx accordingly.
      Reviewed-by: default avatarJames Clark <james.clark@arm.com>
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-21-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      dfc66bef
  5. 13 Nov, 2021 3 commits
  6. 17 May, 2021 1 commit
    • Jin Yao's avatar
      perf header: Support HYBRID_TOPOLOGY feature · f7d74ce3
      Jin Yao authored
      
      It is useful to let the user know about the hybrid topology.
      
      Add the HYBRID_TOPOLOGY feature in header to indicate the core CPUs
      and the atom CPUs.
      
      With this patch a perf.data generated on a hybrid platform reports
      the hybrid CPU list:
      
        root@otcpl-adl-s-2:~# perf report --header-only -I
        ...
        # hybrid cpu system:
        # cpu_core cpu list : 0-15
        # cpu_atom cpu list : 16-23
      
      For a perf.data generated on a non-hybrid platform, reports a message
      that HYBRID_TOPOLOGY is missing:
      
        root@kbl-ppc:~# perf report --header-only -I
        ...
        # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210514122948.9472-2-yao.jin@linux.intel.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f7d74ce3
  7. 22 Aug, 2019 1 commit
  8. 29 Jul, 2019 3 commits
  9. 09 Jul, 2019 3 commits
  10. 10 Jun, 2019 2 commits
  11. 19 Feb, 2019 3 commits