
I want to profile the benchmarks in a test binary built with go test -c, but go tool pprof needs a profile file, which is usually generated inside the main function like this:

func main() {
    if *cpuprofile != "" { // cpuprofile is a string flag defined elsewhere
        f, err := os.Create(*cpuprofile)
        if err != nil {
            log.Fatal(err)
        }
        pprof.StartCPUProfile(f)
        defer pprof.StopCPUProfile()
    }
}

How can I create a profile file within my benchmarks?


As described in the go test flag documentation, you can specify the profile file using the -cpuprofile flag.

For example:

go test -cpuprofile cpu.out


Use the -cpuprofile flag to go test, as documented in the testing flags section of the go command help (go help testflag).

This post explains how to profile benchmarks with an example: Benchmark Profiling with pprof.

The following benchmark simulates some CPU work.

package main

import (
    "math/rand"
    "testing"
)

func BenchmarkRand(b *testing.B) {
    for n := 0; n < b.N; n++ {
        rand.Int63() // the call that dominates the profile below
    }
}

To generate a CPU profile for the benchmark test, run:

go test -bench=BenchmarkRand -benchmem -cpuprofile profile.out

The -memprofile and -blockprofile flags can be used to generate memory allocation and blocking call profiles.

To analyze the profile, use go tool pprof:

go tool pprof profile.out
(pprof) top
Showing nodes accounting for 1.16s, 100% of 1.16s total
Showing top 10 nodes out of 22
      flat  flat%   sum%        cum   cum%
     0.41s 35.34% 35.34%      0.41s 35.34%  sync.(*Mutex).Unlock
     0.37s 31.90% 67.24%      0.37s 31.90%  sync.(*Mutex).Lock
     0.12s 10.34% 77.59%      1.03s 88.79%  math/rand.(*lockedSource).Int63
     0.08s  6.90% 84.48%      0.08s  6.90%  math/rand.(*rngSource).Uint64 (inline)
     0.06s  5.17% 89.66%      1.11s 95.69%  math/rand.Int63
     0.05s  4.31% 93.97%      0.13s 11.21%  math/rand.(*rngSource).Int63
     0.04s  3.45% 97.41%      1.15s 99.14%  benchtest.BenchmarkRand
     0.02s  1.72% 99.14%      1.05s 90.52%  math/rand.(*Rand).Int63
     0.01s  0.86%   100%      0.01s  0.86%  runtime.futex
         0     0%   100%      0.01s  0.86%  runtime.allocm

The bottleneck in this case is the mutex, caused by the default source in math/rand being synchronized.
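To see the effect of the synchronized default source, the following minimal sketch compares the package-level rand.Int63 with a private generator created by rand.New(rand.NewSource(...)). The function names benchGlobal and benchLocal and the seed value are assumptions made for illustration; they do not come from the original post. It uses testing.Benchmark so that it runs as an ordinary program. The size of the gap, if any, depends on the Go version and the machine.

```go
package main

import (
	"fmt"
	"math/rand"
	"testing"
)

// benchGlobal calls the package-level function, which goes through the
// shared default source of math/rand.
func benchGlobal(b *testing.B) {
	for n := 0; n < b.N; n++ {
		rand.Int63()
	}
}

// benchLocal uses a private generator. A *rand.Rand created this way is
// not safe for concurrent use, so each goroutine needs its own instance.
func benchLocal(b *testing.B) {
	r := rand.New(rand.NewSource(1)) // the seed 1 is an arbitrary choice
	for n := 0; n < b.N; n++ {
		r.Int63()
	}
}

func main() {
	// testing.Benchmark runs a benchmark function without "go test".
	fmt.Println("global:", testing.Benchmark(benchGlobal))
	fmt.Println("local: ", testing.Benchmark(benchLocal))
}
```

The same two functions can be placed in a _test.go file as BenchmarkXxx functions and profiled with the -cpuprofile flag shown above.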

Other profile presentations and output formats are also possible, e.g. the tree command. Type help at the (pprof) prompt for more options.

Note that any initialization code that runs before the benchmark loop will also be profiled.

Licensed under: CC-BY-SA with attribution