1. 07 Nov, 2021 1 commit
  2. 31 Oct, 2021 1 commit
  3. 27 Oct, 2021 1 commit
    • perf dlfilter: Add dlfilter-show-cycles · c3afd6e5
      Adrian Hunter authored
      
      Add a new dlfilter to show cycles.
      
      Cycle counts are accumulated per CPU (or per thread if CPU is not recorded)
      from IPC information, and printed together with the change since the last
      print, at the start of each line. Separate counts are kept for branches,
      instructions or other events.
      
      Note also, the itrace A option can be useful to provide higher granularity
      cycle information.
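
The bookkeeping described above can be sketched as follows. This is a toy Python model, not the dlfilter C code: the class and method names are hypothetical, and leaving the delta column blank when nothing changed is an assumption.

```python
from collections import defaultdict


class CycleTracker:
    """Hypothetical model of per-CPU cycle accounting; not the real dlfilter code."""

    def __init__(self):
        self.total = defaultdict(int)         # cycles accumulated per CPU
        self.last_printed = defaultdict(int)  # total at the previous print, per CPU

    def add(self, cpu, cycles):
        """Accumulate a cycle count reported for this CPU."""
        self.total[cpu] += cycles

    def format_prefix(self, cpu):
        """Return the 'total delta' columns; delta is blank when nothing changed."""
        total = self.total[cpu]
        delta = total - self.last_printed[cpu]
        self.last_printed[cpu] = total
        delta_text = str(delta) if delta else ""
        return f"{total:>12} {delta_text:>10}"


tracker = CycleTracker()
tracker.add(1, 833)
print(tracker.format_prefix(1))  # total and delta both shown
print(tracker.format_prefix(1))  # same total, blank delta
tracker.add(1, 1182)
print(tracker.format_prefix(1))  # new total with the change since last print
```

Keeping `last_printed` separate from `total` is what lets each line show both the running count and the change since the previous line.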
      
      Example:
      
        $ perf record -e intel_pt/cyc/u uname
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.044 MB perf.data ]
        $ perf script --itrace=A --call-trace --dlfilter dlfilter-show-cycles.so --deltatime | head
               0                   perf-exec  8509 [001]     0.000000000:  psb offs: 0
               0                   perf-exec  8509 [001]     0.000000000:  cbr: 42 freq: 4219 MHz (156%)
             833        833            uname  8509 [001]     0.000047689: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )        _start
             833                       uname  8509 [001]     0.000003261: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            2015       1182            uname  8509 [001]     0.000000282: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            2676        661            uname  8509 [001]     0.000002629: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            3612        936            uname  8509 [001]     0.000001232: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            4579        967            uname  8509 [001]     0.000002519: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )            _dl_start
            6145       1566            uname  8509 [001]     0.000001050: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )                _dl_setup_hash
            6239         94            uname  8509 [001]     0.000000023: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )                _dl_sysdep_start
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20211027080334.365596-5-adrian.hunter@intel.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 25 Oct, 2021 1 commit
  5. 05 Oct, 2021 1 commit
  6. 28 Sep, 2021 1 commit
  7. 31 Aug, 2021 1 commit
  8. 11 Aug, 2021 2 commits
    • perf tests: Add dlfilter test · 9f9c9a8d
      Adrian Hunter authored
      
      Add a perf test to test the dlfilter C API.
      
      A perf.data file is synthesized and then processed by perf script with a
      dlfilter named dlfilter-test-api-v0.so. Also a C file is compiled to
      provide a dso to match the synthesized perf.data file.
      
      Committer testing:
      
        [root@five ~]# perf test dlfilter
        72: dlfilter C API                                                  : Ok
        [root@five ~]# perf test -v dlfilter
        72: dlfilter C API                                                  :
        --- start ---
        test child forked, pid 3387712
        Checking for gcc
        Command: gcc --version
        gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3)
        Copyright (C) 2021 Free Software Foundation, Inc.
        This is free software; see the source for copying conditions.  There is NO
        warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
      
        dlfilters path: /var/home/acme/libexec/perf-core/dlfilters
        Command: gcc -g -o /tmp/dlfilter-test-3387712-prog /tmp/dlfilter-test-3387712-prog.c
        Creating new host machine structure
        Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 0 --dlarg last
        start API
        filter_event_early API
        filter_event API
        stop API
        Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 1 --dlarg last
        start API
        filter_event_early API
        filter_event API
        stop API
        Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 2 --dlarg last
        start API
        filter_event_early API
        stop API
        test child finished with 0
        ---- end ----
        dlfilter C API: Ok
        [root@five ~]#
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20210811101036.17986-7-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build: Move perf_dlfilters.h in the source tree · 3af1dfdd
      Adrian Hunter authored
      
      Move perf_dlfilters.h in the source tree so that it will be found when
      building dlfilters as part of the perf build.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20210811101036.17986-6-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 07 Jul, 2021 1 commit
  10. 05 Jul, 2021 1 commit
  11. 01 Jul, 2021 1 commit
  12. 29 Apr, 2021 1 commit
    • perf tools: Enable libtraceevent dynamic linking · 56d32d4c
      Michael Petlan authored
      
      Currently we support only static linking with kernel's libtraceevent
      (tools/lib/traceevent). This patch adds libtraceevent package detection
      and support to link perf with it dynamically.
      
      The libtraceevent package status is displayed with:
      
        $ make VF=1 LIBTRACEEVENT_DYNAMIC=1
        ...
        ...                 libtraceevent: [ on  ]
      
      Default behavior remains the same (static linking).
      
      Committer testing:
      
        $ make LIBTRACEEVENT_DYNAMIC=1 VF=1 O=/tmp/build/perf -C tools/perf install-bin |& grep traceevent
        Makefile.config:1090: *** Error: No libtraceevent devel library found, please install libtraceevent-devel.  Stop.
        $
      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      LPU-Reference: 20210428092023.4009-1-mpetlan@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 20 Apr, 2021 1 commit
    • perf stat: Enable iostat mode for x86 platforms · f9ed693e
      Alexander Antonov authored
      This functionality is based on recently introduced sysfs attributes for
      Intel® Xeon® Scalable processor family (code name Skylake-SP):
      
      Commit bb42b3d3 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
      
      This mode provides four I/O performance metrics, in MB, for each
      PCIe root port:
      
       - Inbound Read: I/O devices below root port read from the host memory
       - Inbound Write: I/O devices below root port write to the host memory
       - Outbound Read: CPU reads from I/O devices below root port
       - Outbound Write: CPU writes to I/O devices below root port
      
      Each metric requires only one uncore event, which increments on every 4B
      transfer in the corresponding direction. The formula to compute each
      metric is generic:
          #EventCount * 4B / (1024 * 1024)
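
As a worked instance of the formula above, here is a short Python sketch. The function name and the sample event count are illustrative only, not taken from perf.

```python
def iostat_mb(event_count, bytes_per_event=4):
    """Convert an uncore event count to MB: count * 4B / (1024 * 1024)."""
    return event_count * bytes_per_event / (1024 * 1024)


# 262,144 events of 4 bytes each add up to exactly 1 MB.
print(iostat_mb(262_144))  # 1.0
```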
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210419094147.15909-4-alexander.antonov@linux.intel.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  14. 23 Mar, 2021 1 commit
    • perf stat: Introduce 'bperf' to share hardware PMCs with BPF · 7fac83aa
      Song Liu authored
      The perf tool uses performance monitoring counters (PMCs) to monitor
      system performance. The PMCs are limited hardware resources. For
      example, Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
      
      Modern data center systems use these PMCs in many different ways: system
      level monitoring, (maybe nested) container level monitoring, per process
      monitoring, profiling (in sample mode), etc. In some cases, there are
      more active perf_events than available hardware PMCs. To allow all
      perf_events to have a chance to run, it is necessary to do expensive
      time multiplexing of events.
      
      On the other hand, many monitoring tools count the common metrics
      (cycles, instructions). It is a waste to have multiple tools create
      multiple perf_events of "cycles" and occupy multiple PMCs.
      
      bperf tries to reduce such waste by allowing multiple perf_events of
      "cycles" or "instructions" (at different scopes) to share PMCs. Instead
      of having each perf-stat session read its own perf_events, bperf uses
      BPF programs to read the perf_events and aggregate readings to BPF maps.
      Then, the perf-stat session(s) read the values from these BPF maps.
      
      Please refer to the comment before the definition of bperf_ops for the
      description of bperf architecture.
      
      bperf is off by default. To enable it, pass --bpf-counters option to
      perf-stat. bperf uses a BPF hashmap to share information about BPF
      programs and maps used by bperf. This map is pinned to bpffs. The
      default path is /sys/fs/bpf/perf_attr_map. The user could change the
      path with option --bpf-attr-map.
      
      Committer testing:
      
        # dmesg|grep "Performance Events" -A5
        [    0.225277] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
        [    0.225280] ... version:                0
        [    0.225280] ... bit width:              48
        [    0.225281] ... generic registers:      6
        [    0.225281] ... value mask:             0000ffffffffffff
        [    0.225281] ... max period:             00007fffffffffff
        #
        #  for a in $(seq 6) ; do perf stat -a -e cycles,instructions sleep 100000 & done
        [1] 2436231
        [2] 2436232
        [3] 2436233
        [4] 2436234
        [5] 2436235
        [6] 2436236
      
      
        # perf stat -a -e cycles,instructions sleep 0.1
      
         Performance counter stats for 'system wide':
      
               310,326,987      cycles                                                        (41.87%)
               236,143,290      instructions              #    0.76  insn per cycle           (41.87%)
      
               0.100800885 seconds time elapsed
      
        #
      
      We can see that the counters were enabled for this workload 41.87% of
      the time.
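
The percentage in parentheses is the share of the enabled time during which the event was actually scheduled on a counter, and perf scales the raw count by the inverse of that share. A minimal Python sketch of that arithmetic follows; the function names and the time values are illustrative, not taken from the run above.

```python
def running_percent(time_enabled, time_running):
    """Share of the enabled time the event actually spent on a counter."""
    return 100.0 * time_running / time_enabled


def scale_count(raw, time_enabled, time_running):
    """Estimate the full-period count from a multiplexed counter."""
    if time_running == 0:
        return 0  # the event was never scheduled, so there is nothing to scale
    return round(raw * time_enabled / time_running)


# Illustrative values: the event ran for 41,870 of 100,000 time units.
print(running_percent(100_000, 41_870))
print(scale_count(1_000, 100_000, 41_870))
```

The scaled value is an estimate, which is why avoiding multiplexing altogether gives more trustworthy counts.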
      
      Now with --bpf-counters:
      
        #  for a in $(seq 32) ; do perf stat --bpf-counters -a -e cycles,instructions sleep 100000 & done
        [1] 2436514
        [2] 2436515
        [3] 2436516
        [4] 2436517
        [5] 2436518
        [6] 2436519
        [7] 2436520
        [8] 2436521
        [9] 2436522
        [10] 2436523
        [11] 2436524
        [12] 2436525
        [13] 2436526
        [14] 2436527
        [15] 2436528
        [16] 2436529
        [17] 2436530
        [18] 2436531
        [19] 2436532
        [20] 2436533
        [21] 2436534
        [22] 2436535
        [23] 2436536
        [24] 2436537
        [25] 2436538
        [26] 2436539
        [27] 2436540
        [28] 2436541
        [29] 2436542
        [30] 2436543
        [31] 2436544
        [32] 2436545
        #
        # ls -la /sys/fs/bpf/perf_attr_map
        -rw-------. 1 root root 0 Mar 23 14:53 /sys/fs/bpf/perf_attr_map
        # bpftool map | grep bperf | wc -l
        64
        #
      
        # bpftool map | tail
        1265: percpu_array  name accum_readings  flags 0x0
        	key 4B  value 24B  max_entries 1  memlock 4096B
        1266: hash  name filter  flags 0x0
        	key 4B  value 4B  max_entries 1  memlock 4096B
        1267: array  name bperf_fo.bss  flags 0x400
        	key 4B  value 8B  max_entries 1  memlock 4096B
        	btf_id 996
        	pids perf(2436545)
        1268: percpu_array  name accum_readings  flags 0x0
        	key 4B  value 24B  max_entries 1  memlock 4096B
        1269: hash  name filter  flags 0x0
        	key 4B  value 4B  max_entries 1  memlock 4096B
        1270: array  name bperf_fo.bss  flags 0x400
        	key 4B  value 8B  max_entries 1  memlock 4096B
        	btf_id 997
        	pids perf(2436541)
        1285: array  name pid_iter.rodata  flags 0x480
        	key 4B  value 4B  max_entries 1  memlock 4096B
        	btf_id 1017  frozen
        	pids bpftool(2437504)
        1286: array  flags 0x0
        	key 4B  value 32B  max_entries 1  memlock 4096B
        #
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        8f f3 bc ca 00 00 00 00  80 fd 2a d1 4d 00 00 00
        80 fd 2a d1 4d 00 00 00
        value (CPU 22):
        7e d5 64 4d 00 00 00 00  a4 8a 2e ee 4d 00 00 00
        a4 8a 2e ee 4d 00 00 00
        value (CPU 23):
        a7 78 3e 06 01 00 00 00  b2 34 94 f6 4d 00 00 00
        b2 34 94 f6 4d 00 00 00
        Found 1 element
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        c6 8b d9 ca 00 00 00 00  20 c6 fc 83 4e 00 00 00
        20 c6 fc 83 4e 00 00 00
        value (CPU 22):
        9c b4 d2 4d 00 00 00 00  3e 0c df 89 4e 00 00 00
        3e 0c df 89 4e 00 00 00
        value (CPU 23):
        18 43 66 06 01 00 00 00  5b 69 ed 83 4e 00 00 00
        5b 69 ed 83 4e 00 00 00
        Found 1 element
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        f2 6e db ca 00 00 00 00  92 67 4c ba 4e 00 00 00
        92 67 4c ba 4e 00 00 00
        value (CPU 22):
        dc 8e e1 4d 00 00 00 00  d9 32 7a c5 4e 00 00 00
        d9 32 7a c5 4e 00 00 00
        value (CPU 23):
        bd 2b 73 06 01 00 00 00  7c 73 87 bf 4e 00 00 00
        7c 73 87 bf 4e 00 00 00
        Found 1 element
        #
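
The dumps above can be decoded with a few lines of Python. This sketch assumes each 24-byte value holds three little-endian 64-bit fields (counter, enabled time, running time), as in the kernel's struct bpf_perf_event_value; the commit message itself does not state the layout, so treat the field names as an assumption.

```python
import struct


def decode_reading(hex_bytes):
    """Decode a 24-byte map value as three little-endian u64 fields."""
    raw = bytes.fromhex(hex_bytes)  # whitespace between bytes is ignored
    counter, enabled, running = struct.unpack("<QQQ", raw)
    return {"counter": counter, "enabled": enabled, "running": running}


# Bytes copied from the first and second "value (CPU 21)" dumps.
first = decode_reading(
    "8f f3 bc ca 00 00 00 00  80 fd 2a d1 4d 00 00 00 "
    "80 fd 2a d1 4d 00 00 00"
)
second = decode_reading(
    "c6 8b d9 ca 00 00 00 00  20 c6 fc 83 4e 00 00 00 "
    "20 c6 fc 83 4e 00 00 00"
)
print(first)
print(second["counter"] - first["counter"])  # growth between the two dumps
```

Under that assumption, the second and third fields being equal in every dump would mean the event was running the whole time it was enabled.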
      
        # perf stat --bpf-counters -a -e cycles,instructions sleep 0.1
      
         Performance counter stats for 'system wide':
      
             119,410,122      cycles
             152,105,479      instructions              #    1.27  insn per cycle
      
             0.101395093 seconds time elapsed
      
        #
      
      See? We had the counters enabled all the time.
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: kernel-team@fb.com
      Link: http://lore.kernel.org/lkml/20210316211837.910506-2-songliubraving@fb.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  15. 06 Mar, 2021 3 commits
  16. 29 Jan, 2021 1 commit
    • tools: Factor Clang, LLC and LLVM utils definitions · 211a741c
      Sedat Dilek authored
      When dealing with BPF/BTF/pahole and DWARF v5 I wanted to build bpftool.
      
      While looking into the source code I found duplicate assignments in misc tools
      for the LLVM ecosystem, e.g. clang and llvm-objcopy.
      
      Move the Clang, LLC and/or LLVM utils definitions to the tools/scripts/Makefile.include
      file and add missing includes where needed. Honestly, I was inspired by the commit
      c8a950d0 ("tools: Factor HOSTCC, HOSTLD, HOSTAR definitions").
      
      I tested with bpftool and perf on Debian/testing AMD64 and LLVM/Clang v11.1.0-rc1.
      
      Build instructions:
      
      [ make and make-options ]
      MAKE="make V=1"
      MAKE_OPTS="HOSTCC=clang HOSTCXX=clang++ HOSTLD=ld.lld CC=clang LD=ld.lld LLVM=1 LLVM_IAS=1"
      MAKE_OPTS="$MAKE_OPTS PAHOLE=/opt/pahole/bin/pahole"
      
      [ clean-up ]
      $MAKE $MAKE_OPTS -C tools/ clean
      
      [ bpftool ]
      $MAKE $MAKE_OPTS -C tools/bpf/bpftool/
      
      [ perf ]
      PYTHON=python3 $MAKE $MAKE_OPTS -C tools/perf/
      
      I was careful with respecting the user's wish to override custom compiler, linker,
      GNU/binutils and/or LLVM utils settings.
      Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com> # tools/build and tools/perf
      Link: https://lore.kernel.org/bpf/20210128015117.20515-1-sedat.dilek@gmail.com
  17. 20 Jan, 2021 1 commit
    • perf stat: Enable counting events for BPF programs · fa853c4b
      Song Liu authored
      
      Introduce 'perf stat -b' option, which counts events for BPF programs, like:
      
        [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
           1.487903822            115,200      ref-cycles
           1.487903822             86,012      cycles
           2.489147029             80,560      ref-cycles
           2.489147029             73,784      cycles
           3.490341825             60,720      ref-cycles
           3.490341825             37,797      cycles
           4.491540887             37,120      ref-cycles
           4.491540887             31,963      cycles
      
      The example above counts 'cycles' and 'ref-cycles' of the BPF program
      with id 254.  This is similar to the bpftool-prog-profile command, but
      more flexible.
      
      'perf stat -b' creates per-CPU perf_events and attaches fentry/fexit BPF
      programs (monitor-progs) to the target BPF program (target-prog). The
      monitor-progs read the perf_event before and after the target-prog runs,
      and aggregate the difference in a BPF map. User space then reads the data
      from these maps.
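
The before/after scheme can be modelled in a few lines. This is a toy Python simulation with hypothetical names, not the BPF programs themselves: it only shows how per-CPU deltas accumulate and how user space would sum them.

```python
class MonitorProgs:
    """Toy model of the fentry/fexit pair; all names here are hypothetical."""

    def __init__(self):
        self.entry_reading = {}  # per-CPU counter value seen at fentry
        self.accumulated = {}    # per-CPU sum of (exit - entry) differences

    def fentry(self, cpu, counter_value):
        """Record the counter just before the target program runs."""
        self.entry_reading[cpu] = counter_value

    def fexit(self, cpu, counter_value):
        """Add the counter growth during the target program's run."""
        start = self.entry_reading.pop(cpu, None)
        if start is None:
            return  # no matching fentry was seen on this CPU
        self.accumulated[cpu] = self.accumulated.get(cpu, 0) + counter_value - start

    def read_total(self):
        """What user space would get by summing the per-CPU entries."""
        return sum(self.accumulated.values())


progs = MonitorProgs()
progs.fentry(0, 100)
progs.fexit(0, 150)   # 50 events while the target ran on CPU 0
progs.fentry(1, 10)
progs.fexit(1, 40)    # 30 events while the target ran on CPU 1
print(progs.read_total())  # 80
```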
      
      A new 'struct bpf_counter' is introduced to provide a common interface
      that uses BPF programs/maps to count perf events.
      
      Committer notes:
      
      Removed all but bpf_counter.h includes from evsel.h, not needed at all.
      
      Also, BPF map lookups for PERCPU_ARRAYs need the value receive buffer
      passed to the kernel to have libbpf_num_possible_cpus() entries, not
      evsel__nr_cpus(evsel), as the former uses
      /sys/devices/system/cpu/possible while the latter uses
      /sys/devices/system/cpu/online, which may be less than the 'possible'
      number, making the bpf map lookup overwrite memory and cause hard to
      debug memory corruption.
      
      We need to continue using evsel__nr_cpus(evsel) when accessing the
      perf_counts array though, so as not to overwrite another area of memory :-)
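
The sizing rule in the committer notes can be illustrated with a small Python sketch. The CPU list strings are made up, the helper name is hypothetical, and the parsing only approximates what libbpf and perf do internally.

```python
def count_cpus(cpu_list):
    """Count the CPUs named by a sysfs-style list such as '0-3,8-11'."""
    total = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue  # empty string: no CPUs listed
        first, _, last = part.partition("-")
        total += int(last or first) - int(first) + 1
    return total


# Illustrative file contents, not read from a real system.
possible = "0-7"   # as in /sys/devices/system/cpu/possible
online = "0-3"     # as in /sys/devices/system/cpu/online

lookup_buffer = [0] * count_cpus(possible)  # the map lookup fills one slot per possible CPU
counts_to_use = count_cpus(online)          # entries consumed afterwards
print(len(lookup_buffer), counts_to_use)
```

Sizing the buffer by the smaller "online" figure is the mistake the notes describe: the lookup would write past the end of it.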
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: https://lore.kernel.org/lkml/20210120163031.GU12699@kernel.org/
      
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kernel-team@fb.com
      Link: http://lore.kernel.org/lkml/20201229214214.3413833-4-songliubraving@fb.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  18. 15 Jan, 2021 1 commit
  19. 11 Nov, 2020 1 commit
  20. 01 Oct, 2020 1 commit
    • perf trace: Use the autogenerated mmap 'prot' string/id table · 388968d8
      Arnaldo Carvalho de Melo authored
      
      No change in behaviour:
      
        # perf trace -e mmap sleep 1
             0.000 ( 0.009 ms): sleep/751870 mmap(len: 143317, prot: READ, flags: PRIVATE, fd: 3)                  = 0x7fa96d0f7000
             0.028 ( 0.004 ms): sleep/751870 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS)           = 0x7fa96d0f5000
             0.037 ( 0.005 ms): sleep/751870 mmap(len: 1872744, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3)       = 0x7fa96cf2b000
             0.044 ( 0.011 ms): sleep/751870 mmap(addr: 0x7fa96cf50000, len: 1376256, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x25000) = 0x7fa96cf50000
             0.056 ( 0.007 ms): sleep/751870 mmap(addr: 0x7fa96d0a0000, len: 307200, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x175000) = 0x7fa96d0a0000
             0.064 ( 0.007 ms): sleep/751870 mmap(addr: 0x7fa96d0eb000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bf000) = 0x7fa96d0eb000
             0.075 ( 0.005 ms): sleep/751870 mmap(addr: 0x7fa96d0f1000, len: 13160, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7fa96d0f1000
             0.253 ( 0.005 ms): sleep/751870 mmap(len: 218049136, prot: READ, flags: PRIVATE, fd: 3)               = 0x7fa95ff38000
        #
        #
        # set -o vi
        # strace -e mmap sleep 1
        mmap(NULL, 143317, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f333bd83000
        mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f333bd81000
        mmap(NULL, 1872744, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f333bbb7000
        mmap(0x7f333bbdc000, 1376256, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f333bbdc000
        mmap(0x7f333bd2c000, 307200, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x175000) = 0x7f333bd2c000
        mmap(0x7f333bd77000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f333bd77000
        mmap(0x7f333bd7d000, 13160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f333bd7d000
        mmap(NULL, 218049136, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f332ebc4000
        +++ exited with 0 +++
        #
      
      And you can as well tweak 'perf trace's output to more closely match
      strace's:
      
        # perf config trace.show_arg_names=no
        # perf config trace.show_duration=no
        # perf config trace.show_prefix=yes
        # perf config trace.show_timestamp=no
        # perf config trace.show_zeros=yes
        # perf config trace.no_inherit=yes
        # perf trace -e mmap sleep 1
        mmap(NULL, 143317, PROT_READ, MAP_PRIVATE, 3, 0)                      = 0x7f0d287ca000
        mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS)     = 0x7f0d287c8000
        mmap(NULL, 1872744, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0)       = 0x7f0d285fe000
        mmap(0x7f0d28623000, 1376256, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f0d28623000
        mmap(0x7f0d28773000, 307200, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x175000) = 0x7f0d28773000
        mmap(0x7f0d287be000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f0d287be000
        mmap(0x7f0d287c4000, 13160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS) = 0x7f0d287c4000
        mmap(NULL, 218049136, PROT_READ, MAP_PRIVATE, 3, 0)                   = 0x7f0d1b60b000
        #
      
        # perf config | grep ^trace
        trace.show_arg_names=no
        trace.show_duration=no
        trace.show_prefix=yes
        trace.show_timestamp=no
        trace.show_zeros=yes
        trace.no_inherit=yes
        #
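
How a 'prot' bitmask turns into the strings shown above can be sketched in Python. The bit values are the standard Linux PROT_* constants; the function and its `show_prefix` parameter are hypothetical stand-ins for what perf trace does with its generated table and the trace.show_prefix setting.

```python
PROT_NAMES = {0x1: "READ", 0x2: "WRITE", 0x4: "EXEC"}  # Linux PROT_* bit values


def format_prot(prot, show_prefix=False):
    """Render a prot bitmask as 'READ|WRITE' or 'PROT_READ|PROT_WRITE'."""
    prefix = "PROT_" if show_prefix else ""
    if prot == 0:
        return prefix + "NONE"
    names = [prefix + name for bit, name in PROT_NAMES.items() if prot & bit]
    leftover = prot & ~sum(PROT_NAMES)  # bits the table does not know about
    if leftover:
        names.append(hex(leftover))
    return "|".join(names)


print(format_prot(0x3))                    # READ|WRITE
print(format_prot(0x5, show_prefix=True))  # PROT_READ|PROT_EXEC
```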
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  21. 29 Sep, 2020 1 commit
    • perf trace beauty: Add script to autogenerate mremap's flags args string/id table · 9012e3dd
      Arnaldo Carvalho de Melo authored
      
      It'll also conditionally generate the defines, so that if we don't have
      them when building a new tool tarball on an older system, we get
      them. We sometimes need them in the actual scnprintf routine, such
      as when checking if a flag means we have an extra arg, like with
      MREMAP_FIXED.
      
        $ tools/perf/trace/beauty/mremap_flags.sh
        static const char *mremap_flags[] = {
        	[ilog2(1) + 1] = "MAYMOVE",
        #ifndef MREMAP_MAYMOVE
        #define MREMAP_MAYMOVE 1
        #endif
        	[ilog2(2) + 1] = "FIXED",
        #ifndef MREMAP_FIXED
        #define MREMAP_FIXED 2
        #endif
        	[ilog2(4) + 1] = "DONTUNMAP",
        #ifndef MREMAP_DONTUNMAP
        #define MREMAP_DONTUNMAP 4
        #endif
        };
        $
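
The generated table is indexed by `ilog2(bit) + 1`, which for a single set bit equals the bit length of the value. The Python sketch below mirrors that indexing to format a flags word and to check for MREMAP_FIXED, which is the flag that makes mremap take an extra new-address argument. The function names are hypothetical, not perf's.

```python
MREMAP_FLAGS = [None, "MAYMOVE", "FIXED", "DONTUNMAP"]  # index = ilog2(bit) + 1
MREMAP_FIXED = 2


def format_mremap_flags(flags):
    """Render a flags word as 'MAYMOVE|FIXED', using hex for unknown bits."""
    names = []
    bit = 1
    while bit <= flags:
        if flags & bit:
            index = bit.bit_length()  # equals ilog2(bit) + 1 for a single bit
            if index < len(MREMAP_FLAGS) and MREMAP_FLAGS[index]:
                names.append(MREMAP_FLAGS[index])
            else:
                names.append(hex(bit))
        bit <<= 1
    return "|".join(names)


def has_new_address_arg(flags):
    """MREMAP_FIXED means the call carries an extra new-address argument."""
    return bool(flags & MREMAP_FIXED)


print(format_mremap_flags(3))   # MAYMOVE|FIXED
print(has_new_address_arg(3))   # True
```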
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  22. 04 Sep, 2020 2 commits
  23. 14 Aug, 2020 1 commit
    • perf build-ids: Fall back to debuginfod query if debuginfo not found · c7a14fdc
      Frank Ch. Eigler authored
      During a perf-record, use the -ldebuginfod API to query a debuginfod
      server, should the debug data not be found in the usual system
      locations.  If successful, the usual $HOME/.debug dir is populated.
      
      Tested with:
      
        $ find .
        .
        ./ctags-debuginfo-5.8-26.fc31.x86_64.rpm
        ./usr
        ./usr/lib
        ./usr/lib/debug
        ./usr/lib/debug/.build-id
        ./usr/lib/debug/.build-id/ca
        ./usr/lib/debug/.build-id/ca/46f6ae6a0cee57d85abc1d461c49074248908d
        ./usr/lib/debug/.build-id/ca/46f6ae6a0cee57d85abc1d461c49074248908d.debug
        ./usr/lib/debug/usr
        ./usr/lib/debug/usr/bin
        ./usr/lib/debug/usr/bin/ctags-5.8-26.fc31.x86_64.debug
      
        $ debuginfod  -F .
        ...
      
        $ rm -rf ~/.debug/ ; mkdir ~/.debug
      
        $ perf record make tags
          BUILD:   Doing 'make -j8' parallel build
          GEN      tags
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.107 MB perf.data (1483 samples) ]
      
        $ find ~/.debug | grep ctags
        /home/jolsa/.debug/usr/bin/ctags
        /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d
        /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/elf
        /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/probes
      
        $ rm -rf ~/.debug/ ; mkdir ~/.debug
      
        $ DEBUGINFOD_URLS=http://localhost:8002 perf record make tags
          BUILD:   Doing 'make -j8' parallel build
          GEN      tags
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.108 MB perf.data (1531 samples) ]
      
        $ find ~/.debug | grep ctag
        /home/jolsa/.debug/usr/bin/ctags
        /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d
        /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/debug
        /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/elf
        /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/probes
      
      Note the 'debug' file is created in the last run.
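
The cache layout visible in the `find` output can be written down as a small path helper. This is a Python sketch inferred only from the listing above; the function name is hypothetical and perf's actual cache code may handle more cases.

```python
from pathlib import PurePosixPath


def cache_entry(debug_dir, binary_path, build_id, kind):
    """Build <debug_dir>/<binary path>/<build-id>/<kind>, e.g. kind='debug'."""
    return PurePosixPath(debug_dir) / binary_path.lstrip("/") / build_id / kind


path = cache_entry(
    "/home/jolsa/.debug",
    "/usr/bin/ctags",
    "ca46f6ae6a0cee57d85abc1d461c49074248908d",
    "debug",
)
print(path)
```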
      
      Note that currently the debuginfo data are downloaded only on the record path;
      we still need to add this support to perf build-id/report.. and test ;-)
      Tested-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Frank Ch. Eigler <fche@redhat.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  24. 12 Aug, 2020 1 commit
    • perf trace beauty: Use the autogenerated protocol family table · f3cf7fa9
      Arnaldo Carvalho de Melo authored
      
      That helps us not to lose new protocol families when they are
      introduced, replacing that hardcoded, dated family->string table.
      
      To recap what this allows us to do:
      
        # perf trace -e syscalls:sys_enter_socket/max-stack=10/ --filter=family==INET --max-events=1
           0.000 fetchmail/41097 syscalls:sys_enter_socket(family: INET, type: DGRAM|CLOEXEC|NONBLOCK, protocol: IP)
                                             __GI___socket (inlined)
                                             reopen (/usr/lib64/libresolv-2.31.so)
                                             send_dg (/usr/lib64/libresolv-2.31.so)
                                             __res_context_send (/usr/lib64/libresolv-2.31.so)
                                             __GI___res_context_query (inlined)
                                             __GI___res_context_search (inlined)
                                             _nss_dns_gethostbyname4_r (/usr/lib64/libnss_dns-2.31.so)
                                             gaih_inet.constprop.0 (/usr/lib64/libc-2.31.so)
                                             __GI_getaddrinfo (inlined)
                                             [0x15cb2] (/usr/bin/fetchmail)
        #
      
      More work is still needed to allow for the more natural strace-like
      syscall name usage instead of the trace event name:
      
        # perf trace -e socket/max-stack=10,family==INET/ --max-events=1
      
      I.e. to allow for modifiers to follow the syscall name and for logical
      expressions to be accepted as filters to use with that syscall, be it as
      trace event filters or BPF based ones.
      
      Using -v we can see how the trace event filter is built:
      
        # perf trace -v -e syscalls:sys_enter_socket/call-graph=dwarf/ --filter=family==INET --max-events=2
        <SNIP>
        New filter for syscalls:sys_enter_socket: (family==0x2) && (common_pid != 41384 && common_pid != 2836)
        <SNIP>
      
        $ tools/perf/trace/beauty/socket.sh | grep -w 2
      	[2] = "INET",
        $
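
The filter shown in the -v output can be reproduced with a short Python sketch: the symbolic value is replaced by its number from the string table, and a clause excluding some pids is appended. The function name is hypothetical, the table holds only the one entry shown by socket.sh, and the pid list is simply copied from the output above.

```python
SOCKET_FAMILIES = {"INET": 2}  # only the entry shown by socket.sh above


def build_filter(expr, table, excluded_pids):
    """Turn 'family==INET' into the numeric filter string seen with -v."""
    field, sep, value = expr.partition("==")
    if sep and value in table:
        expr = f"{field}=={table[value]:#x}"  # symbolic name -> hex number
    pid_clause = " && ".join(f"common_pid != {pid}" for pid in excluded_pids)
    return f"({expr}) && ({pid_clause})" if pid_clause else f"({expr})"


print(build_filter("family==INET", SOCKET_FAMILIES, [41384, 2836]))
```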
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f3cf7fa9
  25. 29 May, 2020 2 commits
    • Stephane Eranian's avatar
      perf tools: Add optional support for libpfm4 · 70943490
      Stephane Eranian authored
      This patch links perf with the libpfm4 library if it is available and
      LIBPFM4 is passed to the build. The libpfm4 library contains hardware
      event tables for all processors supported by perf_events. It is a helper
      library that helps convert from a symbolic event name to the event
      encoding required by the underlying kernel interface. This library is
      open-source and available from: http://perfmon2.sf.net.
      
      With this patch, it is possible to specify full hardware events by name.
      Hardware filters are also supported. Events must be specified via the
      --pfm-events and not -e option. Both options are active at the same time
      and it is possible to mix and match:
      
        $ perf stat --pfm-events inst_retired:any_p:c=1:i -e cycles ....
      
      One needs to explicitly ask for its inclusion by using the LIBPFM4 make
      command line option, i.e. it is opt-in rather than opt-out of feature
      detection and build support.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Igor Lubashev <ilubashe@akamai.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jiwei Sun <jiwei.sun@windriver.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: yuzhoujian <yuzhoujian@didichuxing.com>
      Link: http://lore.kernel.org/lkml/20200505182943.218248-2-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      70943490
    • Arnaldo Carvalho de Melo's avatar
      perf build: Allow explicitly disabling the NO_SYSCALL_TABLE variable · 43de3869
      Arnaldo Carvalho de Melo authored
      
      This is useful to see if, on x86, the legacy libaudit still works, as it
      is used in architectures that don't have the SYSCALL_TABLE logic and we
      want to have it tested in 'make -C tools/perf/ build-test'.
      
      E.g.:
      
      Without having audit-libs-devel installed:
      
        $ make NO_SYSCALL_TABLE=1 O=/tmp/build/perf -C tools/perf install-bin
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j12' parallel build
        <SNIP>
        Auto-detecting system features:
        <SNIP>
        ...                      libaudit: [ OFF ]
        ...                        libbfd: [ on  ]
        ...                        libcap: [ on  ]
        <SNIP>
        Makefile.config:664: No libaudit.h found, disables 'trace' tool, please install audit-libs-devel or libaudit-dev
        <SNIP>
      
      After installing it:
      
        $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
        $ time make NO_SYSCALL_TABLE=1 O=/tmp/build/perf  -C tools/perf install-bin ; perf test python
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j12' parallel build
          HOSTCC   /tmp/build/perf/fixdep.o
          HOSTLD   /tmp/build/perf/fixdep-in.o
          LINK     /tmp/build/perf/fixdep
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
        diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
        Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h'
        diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h
        Warning: Kernel ABI header at 'tools/perf/util/hashmap.c' differs from latest version at 'tools/lib/bpf/hashmap.c'
        diff -u tools/perf/util/hashmap.c tools/lib/bpf/hashmap.c
      
        Auto-detecting system features:
        <SNIP>
        ...                      libaudit: [ on  ]
        ...                        libbfd: [ on  ]
        ...                        libcap: [ on  ]
        <SNIP>
        $ ldd ~/bin/perf | grep audit
        	libaudit.so.1 => /lib64/libaudit.so.1 (0x00007fc18978e000)
        $
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20200529155552.463-3-acme@kernel.org
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      43de3869
  26. 28 May, 2020 1 commit
  27. 05 May, 2020 1 commit
    • Ian Rogers's avatar
      perf doc: Pass ASCIIDOC_EXTRA as an argument · 4b198449
      Ian Rogers authored
      commit e9cfa47e ("perf doc: allow ASCIIDOC_EXTRA to be an argument")
      allowed ASCIIDOC_EXTRA to be passed as an option to the Documentation
      Makefile. This change passes ASCIIDOC_EXTRA, set by detected features or
      command line options, prior to doing a Documentation build. This is
      necessary to allow conditional compilation, based on configuration
      variables, in asciidoc code.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Igor Lubashev <ilubashe@akamai.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiwei Sun <jiwei.sun@windriver.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: yuzhoujian <yuzhoujian@didichuxing.com>
      Link: http://lore.kernel.org/lkml/20200429231443.207201-2-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4b198449
  28. 26 Mar, 2020 1 commit
  29. 24 Mar, 2020 1 commit
  30. 06 Jan, 2020 1 commit
  31. 26 Nov, 2019 1 commit
    • Jiri Olsa's avatar
      perf tools: Allow to link with libbpf dynamically · 7b65e203
      Jiri Olsa authored
      
      Currently we support only static linking with kernel's libbpf
      (tools/lib/bpf). This patch adds libbpf package detection and support to
      link perf with it dynamically.
      
      The libbpf package status is displayed with:
      
        $ make VF=1
        Auto-detecting system features:
        ...
        ...                        libbpf: [ on  ]
      
      It's not checked by default, because it's quite new.  Once it's on most
      distros we can switch it on.
      
      For the same reason it's not added to the test-all check.
      
      Perf does not need an advanced version of libbpf, so we can check just
      for the base bpf_object__open function.
      
      Add a new compile variable to detect the libbpf package and link bpf
      dynamically:
      
        $ make LIBBPF_DYNAMIC=1
          ...
          LINK     perf
        $ ldd perf | grep bpf
          libbpf.so.0 => /lib64/libbpf.so.0 (0x00007f46818bc000)
      
      If libbpf is not installed, build stops with:
      
        Makefile.config:486: *** Error: No libbpf devel library found,\
        please install libbpf-devel.  Stop.
      
      Committer testing:
      
        $ make LIBBPF_DYNAMIC=1 -C tools/perf O=/tmp/build/perf
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j8' parallel build
        Makefile.config:493: *** Error: No libbpf devel library found, please install libbpf-devel.  Stop.
        make[1]: *** [Makefile.perf:225: sub-make] Error 2
        make: *** [Makefile:70: all] Error 2
        make: Leaving directory '/home/acme/git/perf/tools/perf'
        $
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Toke Høiland-Jørgensen <toke@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191126121253.28253-1-jolsa@kernel.org
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7b65e203
  32. 15 Oct, 2019 2 commits
    • Arnaldo Carvalho de Melo's avatar
      libbeauty: Hook up the x86 irq_vectors table generator · f19a85c6
      Arnaldo Carvalho de Melo authored
      I.e. after running:
      
        $ make -C tools/perf O=/tmp/build/perf
      
      We end up with:
      
        $ cat /tmp/build/perf/trace/beauty/generated/x86_arch_irq_vectors_array.c
        static const char *x86_irq_vectors[] = {
        	[0x02] = "NMI",
        	[0x12] = "MCE",
        	[0x20] = "IRQ_MOVE_CLEANUP",
        	[0x80] = "IA32_SYSCALL",
        	[0xec] = "LOCAL_TIMER",
        	[0xed] = "HYPERV_STIMER0",
        	[0xee] = "HYPERV_REENLIGHTENMENT",
        	[0xef] = "MANAGED_IRQ_SHUTDOWN",
        	[0xf0] = "POSTED_INTR_NESTED",
        	[0xf1] = "POSTED_INTR_WAKEUP",
        	[0xf2] = "POSTED_INTR",
        	[0xf3] = "HYPERVISOR_CALLBACK",
        	[0xf4] = "DEFERRED_ERROR",
        	[0xf6] = "IRQ_WORK",
        	[0xf7] = "X86_PLATFORM_IPI",
        	[0xf8] = "REBOOT",
        	[0xf9] = "THRESHOLD_APIC",
        	[0xfa] = "THERMAL_APIC",
        	[0xfb] = "CALL_FUNCTION_SINGLE",
        	[0xfc] = "CALL_FUNCTION",
        	[0xfd] = "RESCHEDULE",
        	[0xfe] = "ERROR_APIC",
        	[0xff] = "SPURIOUS_APIC",
        };
        $
      
      Now it's just a matter of using it, associating it with tracepoint arguments
      named 'vector', all of which can be correctly used with this table, for int args.
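
      As a rough sketch of what such a use could look like, the C below
      formats an int 'vector' argument through a sparse table of this shape.
      It is an assumption-laden illustration, not the perf code: the table is
      a shortened excerpt of the generated one, and format_irq_vector() is a
      hypothetical helper name. Values with no entry fall back to hex.

```c
#include <stddef.h>
#include <stdio.h>

/* Shortened excerpt of the generated table shown above; gaps are NULL. */
static const char *x86_irq_vectors[] = {
	[0x02] = "NMI",
	[0x12] = "MCE",
	[0x80] = "IA32_SYSCALL",
	[0xec] = "LOCAL_TIMER",
	[0xff] = "SPURIOUS_APIC",
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical helper: write the symbolic name of 'vector' into buf, or its
 * hexadecimal value when the table has no entry for it. Returns what
 * snprintf() returns, i.e. the length the full string needs.
 */
static int format_irq_vector(int vector, char *buf, size_t len)
{
	const char *name = NULL;

	/* Bounds check first: the array only covers indexes 0x00..0xff. */
	if (vector >= 0 && (size_t)vector < ARRAY_SIZE(x86_irq_vectors))
		name = x86_irq_vectors[vector];

	if (name)
		return snprintf(buf, len, "%s", name);

	return snprintf(buf, len, "%#x", (unsigned int)vector);
}
```

      The bounds and NULL checks matter because a designated-initializer
      array is sized by its highest index and leaves every unlisted slot
      NULL.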
      
      At some point we should move tools/perf/trace/beauty to tools/beauty/,
      so that it can be used more generally and even made available externally
      like libbpf, libperf, libtraceevent, etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-0p2df4kq1afrxbck4e4ct34r@git.kernel.org
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f19a85c6
    • Jiri Olsa's avatar
      perf tools: Allow to build with -ltcmalloc · bb91a073
      Jiri Olsa authored
      
      By using "make TCMALLOC=1" you can enable perf to be built for usage
      with libtcmalloc.so (gperftools).
      
      Get heap profile (tools/perf directory):
      
        $ <install gperftools>
        $ make TCMALLOC=1 DEBUG=1
        $ HEAPPROFILE=/tmp/heapprof ./perf ...
        $ pprof ./perf /tmp/heapprof.000*
        (pprof) top
        Total: 2335.5 MB
          1735.1  74.3%  74.3%   1735.1  74.3% memdup
           402.0  17.2%  91.5%    402.0  17.2% zalloc
           140.2   6.0%  97.5%    145.8   6.2% map__new
            33.6   1.4%  98.9%     33.6   1.4% symbol__new
            12.4   0.5%  99.5%     12.4   0.5% alloc_event
             6.2   0.3%  99.7%      6.2   0.3% nsinfo__new
             5.5   0.2% 100.0%      5.5   0.2% nsinfo__copy
             0.3   0.0% 100.0%      0.3   0.0% dso__new
             0.1   0.0% 100.0%      0.1   0.0% do_read_string
             0.0   0.0% 100.0%      0.0   0.0% __GI__IO_file_doallocate
      
      See callstack:
        $ pprof --pdf ./perf /tmp/heapprof.00* > callstack.pdf
        $ pprof --web ./perf /tmp/heapprof.00*
      
      Committer testing:
      
      Install gperftools, on fedora:
      
        # dnf install gperftools-devel
      
      Then build:
      
       $ make TCMALLOC=1 DEBUG=1 -C tools/perf O=/tmp/build/perf install-bin
      
      Verify that it linked against the right library:
      
        $ ldd ~/bin/perf | grep tcma
      	libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007fb2953a7000)
        $
      
      Run 'perf trace' system wide for 1 minute:
      
        # HEAPPROFILE=/tmp/heapprof perf trace -a sleep 1m
        <SNIP>
         59985.524 ( 0.006 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafb0) = -1 EAGAIN (Resource temporarily unavailable)
         59985.536 ( 0.005 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafc0) = -1 EAGAIN (Resource temporarily unavailable)
         59981.956 (10.143 ms): SCTP timer/21716  ... [continued]: select())                            = 0 (Timeout)
         59985.549 (         ): Web Content/20354 poll(ufds: 0x7f1df38af180, nfds: 3, timeout_msecs: 4294967295) ...
             0.926 (59999.481 ms): sleep/29764  ... [continued]: nanosleep())                           = 0
         59992.133 (         ): SCTP timer/21716 select(tvp: 0x7ff5bf7fee80)                            ...
         60000.477 ( 0.009 ms): sleep/29764 close(fd: 1)                                                = 0
         60000.493 ( 0.005 ms): sleep/29764 close(fd: 2)                                                = 0
         60000.514 (         ): sleep/29764 exit_group()                                                = ?
        Dumping heap profile to /tmp/heapprof.0001.heap (Exiting, 3 MB in use)
      [root@quaco ~]#
      
      Install pprof:
      
        # dnf install pprof
      
      And run it:
      
        # pprof ~/bin/perf /tmp/heapprof.0001.heap
        Using local file /root/bin/perf.
        Using local file /tmp/heapprof.0001.heap.
        Welcome to pprof!  For help, type 'help'.
        (pprof) top
        Total: 4.0 MB
             1.7  42.0%  42.0%      2.2  54.1% map__new
             0.9  23.3%  65.3%      0.9  23.3% zalloc
             0.5  11.4%  76.7%      0.5  11.4% dso__new
             0.2   5.6%  82.3%      0.3   8.5% trace__sys_enter
             0.2   4.9%  87.2%      0.2   4.9% __GI___strdup
             0.2   3.8%  91.0%      0.2   3.8% new_term
             0.1   2.2%  93.2%      0.4  10.1% __perf_pmu__new_alias
             0.0   1.0%  94.3%      0.0   1.2% event_read_fields
             0.0   0.8%  95.1%      0.0   0.8% nsinfo__new
             0.0   0.7%  95.8%      0.1   3.2% trace__read_syscall_info
        (pprof)
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191013151427.11941-2-jolsa@kernel.org
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bb91a073
  33. 11 Oct, 2019 1 commit
  34. 09 Oct, 2019 1 commit
    • Arnaldo Carvalho de Melo's avatar
      perf beauty: Hook up the x86 MSR table generator · fd218347
      Arnaldo Carvalho de Melo authored
      This way we generate the source with the table for later use by plugins,
      etc.
      
      I.e. after running:
      
        $ make -C tools/perf O=/tmp/build/perf
      
      We end up with:
      
        $ head /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
        static const char *x86_MSRs[] = {
        	[0x00000000] = "IA32_P5_MC_ADDR",
        	[0x00000001] = "IA32_P5_MC_TYPE",
        	[0x00000010] = "IA32_TSC",
        	[0x00000017] = "IA32_PLATFORM_ID",
        	[0x0000001b] = "IA32_APICBASE",
        	[0x00000020] = "KNC_PERFCTR0",
        	[0x00000021] = "KNC_PERFCTR1",
        	[0x00000028] = "KNC_EVNTSEL0",
        	[0x00000029] = "KNC_EVNTSEL1",
        $
      
      Now it's just a matter of using it, first in a libtraceevent plugin.
      
      At some point we should move tools/perf/trace/beauty to tools/beauty/,
      so that it can be used more generally and even made available externally
      like libbpf, libperf, libtraceevent, etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-b3rmutg4igcohx6kpo67qh4j@git.kernel.org
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fd218347