1. 16 Nov, 2022 5 commits
    • Ian Rogers's avatar
      perf build: Install libsymbol locally when building · 84bec6f0
      Ian Rogers authored
      
      The perf build currently has a '-Itools/lib' on the CC command
      line. This causes issues as the libapi, libsubcmd, libtraceevent,
      libbpf and libsymbol headers are all found via this path, making it
      impossible to override include behavior.
      
      Change the libsymbol build mirroring the libbpf, libsubcmd, libapi,
      libperf and libtraceevent build, so that it is installed in a directory
      along with its headers.
      
      A later change will modify the include behavior.  Don't build kallsyms.o
      as part of util as this will lead to duplicate definitions. Add
      kallsym's directory to the MANIFEST rather than individual files, so
      that the Build and Makefile are added to a source tar ball.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20221109184914.1357295-11-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      84bec6f0
    • Ian Rogers's avatar
      perf build: Install libtraceevent locally when building · ef019df0
      Ian Rogers authored
      
      The perf build currently has a '-Itools/lib' on the CC command
      line. This causes issues as the libapi, libsubcmd, libtraceevent,
      libbpf headers are all found via this path, making it impossible to
      override include behavior.
      
      Change the libtraceevent build mirroring the libbpf, libsubcmd, libapi
      and libperf build, so that it is installed in a directory along with its
      headers. A later change will modify the include behavior.
      
      Similarly, the plugins are now installed into libtraceevent_plugins
      except they have no header files.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20221109184914.1357295-7-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ef019df0
    • Ian Rogers's avatar
      perf build: Install libperf locally when building · 91009a3a
      Ian Rogers authored
      
      The perf build currently has a '-Itools/lib' on the CC command
      line. This causes issues as the libapi, libsubcmd, libtraceevent,
      libbpf headers are all found via this path, making it impossible to
      override include behavior.
      
      Change the libperf build mirroring the libbpf, libsubcmd and libapi
      build, so that it is installed in a directory along with its headers. A
      later change will modify the include behavior.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20221109184914.1357295-6-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      91009a3a
    • Ian Rogers's avatar
      perf build: Install libapi locally when building · 00314c9b
      Ian Rogers authored
      
      The perf build currently has a '-Itools/lib' on the CC command line.
      This causes issues as the libapi, libsubcmd, libtraceevent, libbpf
      headers are all found via this path, making it impossible to override
      include behavior.
      
      Change the libapi build mirroring the libbpf and libsubcmd build, so
      that it is installed in a directory along with its headers. A later
      change will modify the include behavior.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20221109184914.1357295-5-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      00314c9b
    • Ian Rogers's avatar
      perf build: Install libsubcmd locally when building · 911920b0
      Ian Rogers authored
      
      The perf build currently has a '-Itools/lib' on the CC command
      line. This causes issues as the libapi, libsubcmd, libtraceevent,
      libbpf headers are all found via this path, making it impossible to
      override include behavior. Change the libsubcmd build mirroring the
      libbpf build, so that it is installed in a directory along with its
      headers. A later change will modify the include behavior.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20221109184914.1357295-4-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      911920b0
  2. 04 Nov, 2022 1 commit
  3. 06 Oct, 2022 2 commits
  4. 08 Sep, 2022 1 commit
    • Jiri Slaby's avatar
      perf tools: Don't install data files with x permissions · 0a9eaf61
      Jiri Slaby authored
      install(1), by default, installs with rwxr-xr-x permissions. Modify
      perf's Makefile to pass '-m 644' when installing:
      
        * Documentation/tips.txt
        * examples/bpf/*
        * perf-completion.sh
        * perf_dlfilter.h header
        * scripts/perl/Perf-Trace-Util/lib/Perf/Trace/*
        * scripts/perl/*.pl
        * tests/attr/*
        * tests/attr.py
        * tests/shell/lib/*.sh
        * trace/strace/groups/*
      
      All those are supposed to be non-executable. Either they are not scripts
      at all, or they don't have shebang.
      
      Signed-off-by: <jslaby@suse.cz>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220908060426.9619-1-jslaby@suse.cz
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0a9eaf61
  5. 10 Aug, 2022 1 commit
    • Claire Jensen's avatar
      perf test: JSON format checking · 0c343af2
      Claire Jensen authored
      
      Add field checking tests for perf stat JSON output.
      
      Sanity checks the expected number of fields are present, that the
      expected keys are present and they have the correct values.
      
      Committer notes:
      
      Had to fix this:
      
        -               $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib' \
        +               $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
      
      Committer testing:
      
        [root@quaco ~]# perf test json
         90: perf stat JSON output linter                                    : Ok
        [root@quaco ~]# set -o vi
        [root@quaco ~]# perf test -v json
         90: perf stat JSON output linter                                    :
        --- start ---
        test child forked, pid 560794
        Checking json output: no args [Success]
        Checking json output: system wide [Success]
        Checking json output: system wide Checking json output: system wide no aggregation [Success]
        Checking json output: interval [Success]
        Checking json output: event [Success]
        Checking json output: per core [Success]
        Checking json output: per thread [Success]
        Checking json output: per die [Success]
        Checking json output: per node [Success]
        Checking json output: per socket [Success]
        test child finished with 0
        ---- end ----
        perf stat JSON output linter: Ok
        [root@quaco ~]#
      Signed-off-by: default avatarClaire Jensen <cjense@google.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alyssa Ross <hi@alyssa.is>
      Cc: Claire Jensen <clairej735@gmail.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Like Xu <likexu@tencent.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220805200105.2020995-3-irogers@google.com
      
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0c343af2
  6. 01 Aug, 2022 1 commit
    • Namhyung Kim's avatar
      perf lock: Use BPF for lock contention analysis · 407b36f6
      Namhyung Kim authored
      
      Add -b/--use-bpf option to use BPF to collect lock contention stats.
      For simplicity it now runs system-wide and requires C-c to stop.
      Upcoming changes will add the usual filtering.
      
        $ sudo perf lock con -b
        ^C
         contended   total wait     max wait     avg wait         type   caller
      
                42    192.67 us     13.64 us      4.59 us     spinlock   queue_work_on+0x20
                23     85.54 us     10.28 us      3.72 us     spinlock   worker_thread+0x14a
                 6     13.92 us      6.51 us      2.32 us        mutex   kernfs_iop_permission+0x30
                 3     11.59 us     10.04 us      3.86 us        mutex   kernfs_dop_revalidate+0x3c
                 1      7.52 us      7.52 us      7.52 us     spinlock   kthread+0x115
                 1      7.24 us      7.24 us      7.24 us     rwlock:W   sys_epoll_wait+0x148
                 2      7.08 us      3.99 us      3.54 us     spinlock   delayed_work_timer_fn+0x1b
                 1      6.41 us      6.41 us      6.41 us     spinlock   idle_balance+0xa06
                 2      2.50 us      1.83 us      1.25 us        mutex   kernfs_iop_lookup+0x2f
                 1      1.71 us      1.71 us      1.71 us        mutex   kernfs_iop_getattr+0x2c
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20220729200756.666106-3-namhyung@kernel.org
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      407b36f6
  7. 26 Jul, 2022 1 commit
    • Yang Jihong's avatar
      perf kwork: Implement BPF trace · daf07d22
      Yang Jihong authored
      
      'perf record' generates perf.data, which generates extra interrupts
      for hard disk, amount of data to be collected increases with time.
      
      Using eBPF trace can process the data in kernel, which solves the
      preceding two problems.
      
      Add -b/--use-bpf option for latency and report to support
      tracing kwork events using eBPF:
      
      1. Create bpf prog and attach to tracepoints,
      2. Start tracing after command is entered,
      3. After user hit "ctrl+c", stop tracing and report,
      4. Support CPU and name filtering.
      
      This commit implements the framework code and
      does not add specific event support.
      
      Test cases:
      
        # perf kwork rep -h
      
         Usage: perf kwork report [<options>]
      
            -b, --use-bpf         Use BPF to measure kwork runtime
            -C, --cpu <cpu>       list of cpus to profile
            -i, --input <file>    input file name
            -n, --name <name>     event name to profile
            -s, --sort <key[,key2...]>
                                  sort by key(s): runtime, max, count
            -S, --with-summary    Show summary with statistics
                --time <str>      Time span for analysis (start,stop)
      
        # perf kwork lat -h
      
         Usage: perf kwork latency [<options>]
      
            -b, --use-bpf         Use BPF to measure kwork latency
            -C, --cpu <cpu>       list of cpus to profile
            -i, --input <file>    input file name
            -n, --name <name>     event name to profile
            -s, --sort <key[,key2...]>
                                  sort by key(s): avg, max, count
                --time <str>      Time span for analysis (start,stop)
      
        # perf kwork lat -b
        Unsupported bpf trace class irq
      
        # perf kwork rep -b
        Unsupported bpf trace class irq
      Signed-off-by: default avatarYang Jihong <yangjihong1@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com
      
      
      [ Simplify work_findnew() ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      daf07d22
  8. 29 Jun, 2022 3 commits
    • Ian Rogers's avatar
      perf jevents: Remove jevents.c · 5a059790
      Ian Rogers authored
      
      Remove files and build rules.
      
      Remove test for comparing with jevents.py as there is no longer a binary
      to compare with.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Tested-by: default avatarJohn Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Kilroy <andrew.kilroy@arm.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
      Cc: Like Xu <likexu@tencent.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Forrington <nick.forrington@arm.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Liu <liuqi115@huawei.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220629182505.406269-5-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5a059790
    • Ian Rogers's avatar
      perf jevents: Switch build to use jevents.py · 00facc76
      Ian Rogers authored
      
      Generate pmu-events.c using jevents.py rather than the binary built from
      jevents.c.
      
      Add a new config variable NO_JEVENTS that is set when there is no
      architecture json or an appropriate python interpreter isn't present.
      
      When NO_JEVENTS is defined the file pmu-events/empty-pmu-events.c is
      copied and used as the pmu-events.c file.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Tested-by: default avatarJohn Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Kilroy <andrew.kilroy@arm.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: Ian Rogers <rogers.email@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
      Cc: Like Xu <likexu@tencent.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Forrington <nick.forrington@arm.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Liu <liuqi115@huawei.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220629182505.406269-4-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      00facc76
    • Ian Rogers's avatar
      perf jevents: Add python converter script · ffc606ad
      Ian Rogers authored
      
      jevents.c is large, has a dependency on an old forked version of jsmn,
      and is challenging to work upon. A lot of jevents.c's complexity comes
      from needing to write json and csv parsing from first principles. In
      contrast python has this functionality in standard libraries and is
      already a build pre-requisite for tools like asciidoc (that builds all
      of the perf man pages).
      
      Introduce jevents.py that produces identical output to jevents.c. Add a
      test that runs both converter tools and validates there are no output
      differences. The test can be invoked with a phony build target like:
      
        $ make -C tools/perf jevents-py-test
      
      The python code deliberately tries to replicate the behavior of
      jevents.c so that the output matches and transitioning tools shouldn't
      introduce regressions. In some cases the code isn't as elegant as hoped,
      but fixing this can be done as follow up.
      
      Committer testing:
      
        $ make -C tools/perf jevents-py-test
        make: Entering directory '/var/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j32' parallel build
          HOSTCC  fixdep.o
          HOSTLD  fixdep-in.o
          LINK    fixdep
      
        Auto-detecting system features:
        ...                         dwarf: [ on  ]
        ...            dwarf_getlocations: [ on  ]
        ...                         glibc: [ on  ]
        ...                        libbfd: [ on  ]
        ...                libbfd-buildid: [ on  ]
        ...                        libcap: [ on  ]
        ...                        libelf: [ on  ]
        ...                       libnuma: [ on  ]
        ...        numa_num_possible_cpus: [ on  ]
        ...                       libperl: [ on  ]
        ...                     libpython: [ on  ]
        ...                     libcrypto: [ OFF ]
        ...                     libunwind: [ on  ]
        ...            libdw-dwarf-unwind: [ on  ]
        ...                          zlib: [ on  ]
        ...                          lzma: [ on  ]
        ...                     get_cpuid: [ on  ]
        ...                           bpf: [ on  ]
        ...                        libaio: [ on  ]
        ...                       libzstd: [ on  ]
        ...        disassembler-four-args: [ on  ]
      
          HOSTCC  pmu-events/json.o
          HOSTCC  pmu-events/jsmn.o
          HOSTCC  pmu-events/jevents.o
          HOSTLD  pmu-events/jevents-in.o
          LINK    pmu-events/jevents
        Checking architecture: arm64
        Generating using jevents.c
        Generating using jevents.py
        Diffing
        Checking architecture: nds32
        Generating using jevents.c
        Generating using jevents.py
        Diffing
        Checking architecture: powerpc
        Generating using jevents.c
        Generating using jevents.py
        Diffing
        Checking architecture: s390
        Generating using jevents.c
        Generating using jevents.py
        Diffing
        Checking architecture: x86
        Generating using jevents.c
        Generating using jevents.py
        Diffing
        make: Leaving directory '/var/home/acme/git/perf/tools/perf'
        $
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Tested-by: default avatarJohn Garry <john.garry@huawei.com>
      Tested-by: default avatarThomas Richter <tmricht@linux.ibm.com>
      Tested-by: default avatarXing Zhengjun <zhengjun.xing@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Kilroy <andrew.kilroy@arm.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
      Cc: Like Xu <likexu@tencent.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Forrington <nick.forrington@arm.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Liu <liuqi115@huawei.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20220629182505.406269-3-irogers@google.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ffc606ad
  9. 26 May, 2022 1 commit
    • Namhyung Kim's avatar
      perf record: Enable off-cpu analysis with BPF · edc41a10
      Namhyung Kim authored
      
      Add --off-cpu option to enable the off-cpu profiling with BPF.  It'd
      use a bpf_output event and rename it to "offcpu-time".  Samples will
      be synthesized at the end of the record session using data from a BPF
      map which contains the aggregated off-cpu time at context switches.
      So it needs root privilege to get the off-cpu profiling.
      
      Each sample will have a separate user stacktrace so it will skip
      kernel threads.  The sample ip will be set from the stacktrace and
      other sample data will be updated accordingly.  Currently it only
      handles some basic sample types.
      
      The sample timestamp is set to a dummy value just not to bother with
      other events during the sorting.  So it has a very big initial value
      and increase it on processing each samples.
      
      Good thing is that it can be used together with regular profiling like
      cpu cycles.  If you don't want to that, you can use a dummy event to
      enable off-cpu profiling only.
      
      Example output:
        $ sudo perf record --off-cpu perf bench sched messaging -l 1000
      
        $ sudo perf report --stdio --call-graph=no
        # Total Lost Samples: 0
        #
        # Samples: 41K of event 'cycles'
        # Event count (approx.): 42137343851
        ...
      
        # Samples: 1K of event 'offcpu-time'
        # Event count (approx.): 587990831640
        #
        # Children      Self  Command          Shared Object       Symbol
        # ........  ........  ...............  ..................  .........................
        #
            81.66%     0.00%  sched-messaging  libc-2.33.so        [.] __libc_start_main
            81.66%     0.00%  sched-messaging  perf                [.] cmd_bench
            81.66%     0.00%  sched-messaging  perf                [.] main
            81.66%     0.00%  sched-messaging  perf                [.] run_builtin
            81.43%     0.00%  sched-messaging  perf                [.] bench_sched_messaging
            40.86%    40.86%  sched-messaging  libpthread-2.33.so  [.] __read
            37.66%    37.66%  sched-messaging  libpthread-2.33.so  [.] __write
             2.91%     2.91%  sched-messaging  libc-2.33.so        [.] __poll
        ...
      
      As you can see it spent most of off-cpu time in read and write in
      bench_sched_messaging().  The --call-graph=no was added just to make
      the output concise here.
      
      It uses perf hooks facility to control BPF program during the record
      session rather than adding new BPF/off-cpu specific calls.
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Acked-by: default avatarIan Rogers <irogers@google.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20220518224725.742882-3-namhyung@kernel.org
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      edc41a10
  10. 27 Apr, 2022 1 commit
  11. 01 Apr, 2022 1 commit
    • John Garry's avatar
      perf tools: Stop depending on .git files for building PERF-VERSION-FILE · d4ff9265
      John Garry authored
      This essentially reverts commit c72e3f04 ("tools/perf/build:
      Speed up git-version test on re-make") and commit 4e666cdb
      ("perf tools: Fix dependency for version file creation")
      
      In commit c72e3f04 ("tools/perf/build: Speed up git-version test
      on re-make"), a makefile dependency on .git/HEAD was added. The
      background is that running PERF-VERSION-FILE is relatively slow, and
      commands like "git describe" are particularly slow.
      
      In commit 4e666cdb ("perf tools: Fix dependency for version file
      creation"), an additional dependency on .git/ORIG_HEAD was added, as
      .git/HEAD may not change for "git reset --hard HEAD^" command. However,
      depending on whether we're on a branch or not, a "git cherry-pick" may
      not lead to the version being updated.
      
      As discussed with the git community in [0], using git internal files for
      dependencies is not reliable. Commit 4e666cdb also breaks some build
      scenarios [1].
      
      As mentioned, c72e3f04 ("tools/perf/build: Speed up git-version
      test on re-make") was added to speed up the build. However in commit
      7572733b ("perf tools: Fix version kernel tag") we removed the
      call to "git describe", so just revert Makefile.perf back to same as pre
      c72e3f04 ("tools/perf/build: Speed up git-version test on
      re-make") and the build should not be so slow, as below:
      
      Pre 7572733b:
      
        $> time util/PERF-VERSION-GEN
          PERF_VERSION = 5.17.rc8.g4e666cdb
      
        real    0m0.110s
        user    0m0.091s
        sys     0m0.019s
      
      Post 7572733b:
      
        $> time util/PERF-VERSION-GEN
          PERF_VERSION = 5.17.rc8.g7572733b
      
        real    0m0.039s
        user    0m0.036s
        sys     0m0.007s
      
      [0] https://lore.kernel.org/git/87wngkpddp.fsf@igel.home/T/#m4a4dd6de52fdbe21179306cd57b3761eb07f45f8
      [1] https://lore.kernel.org/linux-perf-users/20220329093120.4173283-1-matthieu.baerts@tessares.net/T/#u
      
      Committer testing:
      
      After a fresh rebuild using 'make -C tools/perf O=/tmp/build/perf install-bin':
      
        $ perf -v
        perf version 5.17.g162f9db407b6
        $ git log --oneline -1
        162f9db407b6a6e5 (HEAD -> perf/core) perf tools: Stop depending on .git files for building PERF-VERSION-FILE
        $
      
      Now using a detached tarball, i.e. outside the kernel source tree:
      
        $ ls -la perf*tar
        ls: cannot access 'perf*tar': No such file or directory
        $ make perf-tar-src-pkg
          TAR
          PERF_VERSION = 5.17.g31d10b3ef133
        $ ls -la perf*tar
        -rw-r--r--. 1 acme acme 22241280 Mar 30 13:26 perf-5.17.0.tar
        $ mv perf-5.17.0.tar /tmp
        $ cd /tmp
        $ tar xf perf-5.17.0.tar
        $ cd perf-5.17.0/
        $ make -C tools/perf |& tail
          CC      util/pmu.o
          CC      util/pmu-flex.o
          CC      util/expr-flex.o
          CC      util/expr.o
          LD      util/scripting-engines/perf-in.o
          LD      util/intel-pt-decoder/perf-in.o
          LD      util/perf-in.o
          LD      perf-in.o
          LINK    perf
        make: Leaving directory '/tmp/perf-5.17.0/tools/perf'
        $ tools/perf/perf -v
        perf version 5.17.g31d10b3ef133
        $ pwd
        /tmp/perf-5.17.0
        $ cat PERF-VERSION-FILE
        #define PERF_VERSION "5.17.g31d10b3ef133"
        $
      
      Fixes: 4e666cdb
      
       ("perf tools: Fix dependency for version file creation")
      Reported-by: default avatarMatthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: default avatarJohn Garry <john.garry@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: default avatarMatthieu Baerts <matthieu.baerts@tessares.net>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/1648635774-14581-1-git-send-email-john.garry@huawei.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d4ff9265
  12. 22 Mar, 2022 1 commit
    • John Garry's avatar
      perf tools: Fix dependency for version file creation · 4e666cdb
      John Garry authored
      
      The version generated by perf may not be correct by just changing the
      head commit, like this:
      
        $ git log --pretty=format:"%H" -n 1
        b5d9d4708a24ac1889a30e9aedf8af8d73102139
        $ perf -v
        perf version 5.16.gb5d9d4708a24
        $ git reset --hard HEAD^
        HEAD is now at 629f520b265f
        $ make
        ...
        $ ./perf -v
        perf version 5.16.gb5d9d4708a24
      
      The dependency to building PERF-VERSION-FILE should also include ORIG_HEAD,
      as this changes when changing the head commit (while HEAD does not).
      Signed-off-by: default avatarJohn Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Link: https://lore.kernel.org/r/1645449409-158238-2-git-send-email-john.garry@huawei.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      4e666cdb
  13. 15 Feb, 2022 1 commit
    • Masahiro Yamada's avatar
      kbuild: replace $(if A,A,B) with $(or A,B) · 5c816641
      Masahiro Yamada authored
      
      $(or ...) is available since GNU Make 3.81, and useful to shorten the
      code in some places.
      
      Covert as follows:
      
        $(if A,A,B)  -->  $(or A,B)
      
      This patch also converts:
      
        $(if A, A, B) --> $(or A, B)
      
      Strictly speaking, the latter is not an equivalent conversion because
      GNU Make keeps spaces after commas; if A is not empty, $(if A, A, B)
      expands to " A", while $(or A, B) expands to "A".
      
      Anyway, preceding spaces are not significant in the code hunks I touched.
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: default avatarNicolas Schier <nicolas@fjasle.eu>
      5c816641
  14. 16 Dec, 2021 1 commit
    • Namhyung Kim's avatar
      perf ftrace: Add -b/--use-bpf option for latency subcommand · 177f4eac
      Namhyung Kim authored
      
      The -b/--use-bpf option is to use BPF to get latency info of kernel
      functions.  It'd have better performance impact and I observed that
      latency of same function is smaller than before when using BPF.
      
      Committer testing:
      
        # strace -e bpf perf ftrace latency -b -T __handle_mm_fault -a sleep 1
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7fff51914e00, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\08\0\0\08\0\0\0\t\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=89, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\f\0\0\0\f\0\0\0\7\0\0\0\1\0\0\0\0\0\0\20"..., btf_log_buf=NULL, btf_size=43, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\5\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=77, btf_log_size=0, btf_log_level=0}, 128) = -1 EINVAL (Invalid argument)
        bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\350\2\0\0\350\2\0\0\353\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1515, btf_log_size=0, btf_log_level=0}, 128) = 3
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=32, max_entries=1, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=5, insns=0x7fff51914c30, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 5
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7fff51914a80, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="test", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 4
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=8, value_size=8, max_entries=10000, map_flags=0, inner_map_fd=0, map_name="functime", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="cpu_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 5
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="task_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 7
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=8, max_entries=22, map_flags=0, inner_map_fd=0, map_name="latency", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 8
        bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="func_lat.bss", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=30, btf_vmlinux_value_type_id=0}, 128) = 9
        bpf(BPF_MAP_UPDATE_ELEM, {map_fd=9, key=0x7fff51914c40, value=0x7f6e99be2000, flags=BPF_ANY}, 128) = 0
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=18, insns=0x11e4160, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_begin", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x11dfc50, func_info_cnt=1, line_info_rec_size=16, line_info=0x11e04c0, line_info_cnt=9, attach_btf_id=0, attach_prog_fd=0}, 128) = 10
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=99, insns=0x11ded70, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_end", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x11dfc70, func_info_cnt=1, line_info_rec_size=16, line_info=0x11f6e10, line_info_cnt=20, attach_btf_id=0, attach_prog_fd=0}, 128) = 11
        bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=2, insns=0x7fff51914a80, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 13
        bpf(BPF_LINK_CREATE, {link_create={prog_fd=13, target_fd=-1, attach_type=0x29 /* BPF_??? */, flags=0}}, 128) = -1 EINVAL (Invalid argument)
        --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1699992, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
        #   DURATION     |      COUNT | GRAPH                                          |
             0 - 1    us |         52 | ###################                            |
             1 - 2    us |         36 | #############                                  |
             2 - 4    us |         24 | #########                                      |
             4 - 8    us |          7 | ##                                             |
             8 - 16   us |          1 |                                                |
            16 - 32   us |          0 |                                                |
            32 - 64   us |          0 |                                                |
            64 - 128  us |          0 |                                                |
           128 - 256  us |          0 |                                                |
           256 - 512  us |          0 |                                                |
           512 - 1024 us |          0 |                                                |
             1 - 2    ms |          0 |                                                |
             2 - 4    ms |          0 |                                                |
             4 - 8    ms |          0 |                                                |
             8 - 16   ms |          0 |                                                |
            16 - 32   ms |          0 |                                                |
            32 - 64   ms |          0 |                                                |
            64 - 128  ms |          0 |                                                |
           128 - 256  ms |          0 |                                                |
           256 - 512  ms |          0 |                                                |
           512 - 1024 ms |          0 |                                                |
             1 - ...   s |          0 |                                                |
        +++ exited with 0 +++
        #
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211215185154.360314-5-namhyung@kernel.org
      
      
      [ Add missing util/cpumap.h include and removed unused 'fd' variable ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      177f4eac
  15. 08 Dec, 2021 1 commit
  16. 12 Nov, 2021 3 commits
  17. 07 Nov, 2021 1 commit
  18. 31 Oct, 2021 1 commit
  19. 27 Oct, 2021 1 commit
    • Adrian Hunter's avatar
      perf dlfilter: Add dlfilter-show-cycles · c3afd6e5
      Adrian Hunter authored
      
      Add a new dlfilter to show cycles.
      
      Cycle counts are accumulated per CPU (or per thread if CPU is not recorded)
      from IPC information, and printed together with the change since the last
      print, at the start of each line. Separate counts are kept for branches,
      instructions or other events.
      
      Note also, the itrace A option can be useful to provide higher granularity
      cycle information.
      
      Example:
      
        $ perf record -e intel_pt/cyc/u uname
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.044 MB perf.data ]
        $ perf script --itrace=A --call-trace --dlfilter dlfilter-show-cycles.so --deltatime | head
               0                   perf-exec  8509 [001]     0.000000000:  psb offs: 0
               0                   perf-exec  8509 [001]     0.000000000:  cbr: 42 freq: 4219 MHz (156%)
             833        833            uname  8509 [001]     0.000047689: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )        _start
             833                       uname  8509 [001]     0.000003261: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            2015       1182            uname  8509 [001]     0.000000282: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            2676        661            uname  8509 [001]     0.000002629: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            3612        936            uname  8509 [001]     0.000001232: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            4579        967            uname  8509 [001]     0.000002519: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            6145       1566            uname  8509 [001]     0.000001050: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )                _dl_setup_hash
            6239         94            uname  8509 [001]     0.000000023: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )                _dl_sysdep_start
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20211027080334.365596-5-adrian.hunter@intel.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c3afd6e5
  20. 25 Oct, 2021 1 commit
  21. 05 Oct, 2021 1 commit
  22. 28 Sep, 2021 1 commit
  23. 31 Aug, 2021 1 commit
  24. 11 Aug, 2021 2 commits
    • Adrian Hunter's avatar
      perf tests: Add dlfilter test · 9f9c9a8d
      Adrian Hunter authored
      
      Add a perf test to test the dlfilter C API.
      
      A perf.data file is synthesized and then processed by perf script with a
      dlfilter named dlfilter-test-api-v0.so. Also a C file is compiled to
      provide a dso to match the synthesized perf.data file.
      
      Committer testing:
      
        [root@five ~]# perf test dlfilter
        72: dlfilter C API                                                  : Ok
        [root@five ~]# perf test -v dlfilter
        72: dlfilter C API                                                  :
        --- start ---
        test child forked, pid 3387712
        Checking for gcc
        Command: gcc --version
        gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3)
        Copyright (C) 2021 Free Software Foundation, Inc.
        This is free software; see the source for copying conditions.  There is NO
        warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
      
        dlfilters path: /var/home/acme/libexec/perf-core/dlfilters
        Command: gcc -g -o /tmp/dlfilter-test-3387712-prog /tmp/dlfilter-test-3387712-prog.c
        Creating new host machine structure
        Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 0 --dlarg last
        start API
        filter_event_early API
        filter_event API
        stop API
        Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 1 --dlarg last
        start API
        filter_event_early API
        filter_event API
        stop API
        Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 2 --dlarg last
        start API
        filter_event_early API
        stop API
        test child finished with 0
        ---- end ----
        dlfilter C API: Ok
        [root@five ~]#
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https //lore.kernel.org/r/20210811101036.17986-7-adrian.hunter@intel.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9f9c9a8d
    • Adrian Hunter's avatar
      perf build: Move perf_dlfilters.h in the source tree · 3af1dfdd
      Adrian Hunter authored
      
      Move perf_dlfilters.h in the source tree so that it will be found when
      building dlfilters as part of the perf build.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https //lore.kernel.org/r/20210811101036.17986-6-adrian.hunter@intel.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3af1dfdd
  25. 07 Jul, 2021 1 commit
  26. 05 Jul, 2021 1 commit
  27. 01 Jul, 2021 1 commit
  28. 29 Apr, 2021 1 commit
    • Michael Petlan's avatar
      perf tools: Enable libtraceevent dynamic linking · 56d32d4c
      Michael Petlan authored
      
      Currently we support only static linking with kernel's libtraceevent
      (tools/lib/traceevent). This patch adds libtraceevent package detection
      and support to link perf with it dynamically.
      
        The libtraceevent package status is displayed with:
        $ make VF=1 LIBTRACEEVENT_DYNAMIC=1
        ...
        ...                 libtraceevent: [ on  ]
      
      Default behavior remains the same (static linking).
      
      Committer testing:
      
        $ make LIBTRACEEVENT_DYNAMIC=1 VF=1 O=/tmp/build/perf -C tools/perf install-bin |& grep traceevent
        Makefile.config:1090: *** Error: No libtraceevent devel library found, please install libtraceevent-devel.  Stop.
        $
      Signed-off-by: default avatarMichael Petlan <mpetlan@redhat.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      LPU-Reference: 20210428092023.4009-1-mpetlan@redhat.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      56d32d4c
  29. 20 Apr, 2021 1 commit
    • Alexander Antonov's avatar
      perf stat: Enable iostat mode for x86 platforms · f9ed693e
      Alexander Antonov authored
      This functionality is based on recently introduced sysfs attributes for
      Intel® Xeon® Scalable processor family (code name Skylake-SP):
      
      Commit bb42b3d3
      
       ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
      
      Mode is intended to provide four I/O performance metrics in MB per each
      PCIe root port:
      
       - Inbound Read: I/O devices below root port read from the host memory
       - Inbound Write: I/O devices below root port write to the host memory
       - Outbound Read: CPU reads from I/O devices below root port
       - Outbound Write: CPU writes to I/O devices below root port
      
      Each metric requiries only one uncore event which increments at every 4B
      transfer in corresponding direction. The formulas to compute metrics
      are generic:
          #EventCount * 4B / (1024 * 1024)
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarAlexander Antonov <alexander.antonov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210419094147.15909-4-alexander.antonov@linux.intel.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f9ed693e
  30. 23 Mar, 2021 1 commit
    • Song Liu's avatar
      perf stat: Introduce 'bperf' to share hardware PMCs with BPF · 7fac83aa
      Song Liu authored
      The perf tool uses performance monitoring counters (PMCs) to monitor
      system performance. The PMCs are limited hardware resources. For
      example, Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
      
      Modern data center systems use these PMCs in many different ways: system
      level monitoring, (maybe nested) container level monitoring, per process
      monitoring, profiling (in sample mode), etc. In some cases, there are
      more active perf_events than available hardware PMCs. To allow all
      perf_events to have a chance to run, it is necessary to do expensive
      time multiplexing of events.
      
      On the other hand, many monitoring tools count the common metrics
      (cycles, instructions). It is a waste to have multiple tools create
      multiple perf_events of "cycles" and occupy multiple PMCs.
      
      bperf tries to reduce such wastes by allowing multiple perf_events of
      "cycles" or "instructions" (at different scopes) to share PMUs. Instead
      of having each perf-stat session to read its own perf_events, bperf uses
      BPF programs to read the perf_events and aggregate readings to BPF maps.
      Then, the perf-stat session(s) reads the values from these BPF maps.
      
      Please refer to the comment before the definition of bperf_ops for the
      description of bperf architecture.
      
      bperf is off by default. To enable it, pass --bpf-counters option to
      perf-stat. bperf uses a BPF hashmap to share information about BPF
      programs and maps used by bperf. This map is pinned to bpffs. The
      default path is /sys/fs/bpf/perf_attr_map. The user could change the
      path with option --bpf-attr-map.
      
      Committer testing:
      
        # dmesg|grep "Performance Events" -A5
        [    0.225277] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
        [    0.225280] ... version:                0
        [    0.225280] ... bit width:              48
        [    0.225281] ... generic registers:      6
        [    0.225281] ... value mask:             0000ffffffffffff
        [    0.225281] ... max period:             00007fffffffffff
        #
        #  for a in $(seq 6) ; do perf stat -a -e cycles,instructions sleep 100000 & done
        [1] 2436231
        [2] 2436232
        [3] 2436233
        [4] 2436234
        [5] 2436235
        [6] 2436236e
      
      
        # perf stat -a -e cycles,instructions sleep 0.1
      
         Performance counter stats for 'system wide':
      
               310,326,987      cycles                                                        (41.87%)
               236,143,290      instructions              #    0.76  insn per cycle           (41.87%)
      
               0.100800885 seconds time elapsed
      
        #
      
      We can see that the counters were enabled for this workload 41.87% of
      the time.
      
      Now with --bpf-counters:
      
        #  for a in $(seq 32) ; do perf stat --bpf-counters -a -e cycles,instructions sleep 100000 & done
        [1] 2436514
        [2] 2436515
        [3] 2436516
        [4] 2436517
        [5] 2436518
        [6] 2436519
        [7] 2436520
        [8] 2436521
        [9] 2436522
        [10] 2436523
        [11] 2436524
        [12] 2436525
        [13] 2436526
        [14] 2436527
        [15] 2436528
        [16] 2436529
        [17] 2436530
        [18] 2436531
        [19] 2436532
        [20] 2436533
        [21] 2436534
        [22] 2436535
        [23] 2436536
        [24] 2436537
        [25] 2436538
        [26] 2436539
        [27] 2436540
        [28] 2436541
        [29] 2436542
        [30] 2436543
        [31] 2436544
        [32] 2436545
        #
        # ls -la /sys/fs/bpf/perf_attr_map
        -rw-------. 1 root root 0 Mar 23 14:53 /sys/fs/bpf/perf_attr_map
        # bpftool map | grep bperf | wc -l
        64
        #
      
        # bpftool map | tail
        1265: percpu_array  name accum_readings  flags 0x0
        	key 4B  value 24B  max_entries 1  memlock 4096B
        1266: hash  name filter  flags 0x0
        	key 4B  value 4B  max_entries 1  memlock 4096B
        1267: array  name bperf_fo.bss  flags 0x400
        	key 4B  value 8B  max_entries 1  memlock 4096B
        	btf_id 996
        	pids perf(2436545)
        1268: percpu_array  name accum_readings  flags 0x0
        	key 4B  value 24B  max_entries 1  memlock 4096B
        1269: hash  name filter  flags 0x0
        	key 4B  value 4B  max_entries 1  memlock 4096B
        1270: array  name bperf_fo.bss  flags 0x400
        	key 4B  value 8B  max_entries 1  memlock 4096B
        	btf_id 997
        	pids perf(2436541)
        1285: array  name pid_iter.rodata  flags 0x480
        	key 4B  value 4B  max_entries 1  memlock 4096B
        	btf_id 1017  frozen
        	pids bpftool(2437504)
        1286: array  flags 0x0
        	key 4B  value 32B  max_entries 1  memlock 4096B
        #
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        8f f3 bc ca 00 00 00 00  80 fd 2a d1 4d 00 00 00
        80 fd 2a d1 4d 00 00 00
        value (CPU 22):
        7e d5 64 4d 00 00 00 00  a4 8a 2e ee 4d 00 00 00
        a4 8a 2e ee 4d 00 00 00
        value (CPU 23):
        a7 78 3e 06 01 00 00 00  b2 34 94 f6 4d 00 00 00
        b2 34 94 f6 4d 00 00 00
        Found 1 element
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        c6 8b d9 ca 00 00 00 00  20 c6 fc 83 4e 00 00 00
        20 c6 fc 83 4e 00 00 00
        value (CPU 22):
        9c b4 d2 4d 00 00 00 00  3e 0c df 89 4e 00 00 00
        3e 0c df 89 4e 00 00 00
        value (CPU 23):
        18 43 66 06 01 00 00 00  5b 69 ed 83 4e 00 00 00
        5b 69 ed 83 4e 00 00 00
        Found 1 element
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        f2 6e db ca 00 00 00 00  92 67 4c ba 4e 00 00 00
        92 67 4c ba 4e 00 00 00
        value (CPU 22):
        dc 8e e1 4d 00 00 00 00  d9 32 7a c5 4e 00 00 00
        d9 32 7a c5 4e 00 00 00
        value (CPU 23):
        bd 2b 73 06 01 00 00 00  7c 73 87 bf 4e 00 00 00
        7c 73 87 bf 4e 00 00 00
        Found 1 element
        #
      
        # perf stat --bpf-counters -a -e cycles,instructions sleep 0.1
      
         Performance counter stats for 'system wide':
      
             119,410,122      cycles
             152,105,479      instructions              #    1.27  insn per cycle
      
             0.101395093 seconds time elapsed
      
        #
      
      See? We had the counters enabled all the time.
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: kernel-team@fb.com
      Link: http://lore.kernel.org/lkml/20210316211837.910506-2-songliubraving@fb.com
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7fac83aa