Simple Gaudi algorithm to test Intel Profiler.
A video guide to installing the profiler package.
The CpuHungryAlg algorithm defines four functions:
double mysin();
double mycos();
double mytan();
double myatan();
We would like to profile them. Which of these functions is executed depends on the name of the algorithm instance:
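StatusCode CpuHungryAlg::execute() {
  double result = 0;
  // The instance name selects which CPU-hungry function runs.
  if (name() == "Alg1") {
    result = mysin();
  } else if (name() == "Alg2") {
    result = mycos();
  } else {
    result = mytan();
  }
  return StatusCode::SUCCESS;
}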
Run
$> intelprofiler -o ~/profiler GaudiTestSuite/options/IntelProfiler.py
At the end you should see something like this:
...
IntelProfilerAu... INFO Start profiling (event #1)
IntelProfilerAu... DEBUG Skip component TopSequence
IntelProfilerAu... DEBUG Start profiling component TopSequence SubSequence
IntelProfilerAu... DEBUG Start event type 2 for TopSequence SubSequence
IntelProfilerAu... DEBUG Pause event 2
IntelProfilerAu... DEBUG Start profiling component TopSequence SubSequence Alg1
IntelProfilerAu... DEBUG Start event type 3 for TopSequence SubSequence Alg1
IntelProfilerAu... DEBUG End event for Alg1
IntelProfilerAu... DEBUG Resume event for 2
IntelProfilerAu... DEBUG Skip component Alg2
IntelProfilerAu... DEBUG Resume
IntelProfilerAu... DEBUG Pause event 2
IntelProfilerAu... DEBUG Start profiling component TopSequence SubSequence Alg3
IntelProfilerAu... DEBUG Start event type 4 for TopSequence SubSequence Alg3
IntelProfilerAu... DEBUG End event for Alg3
IntelProfilerAu... DEBUG Resume event for 2
IntelProfilerAu... DEBUG End event for SubSequence
IntelProfilerAu... DEBUG Pause
IntelProfilerAu... DEBUG Skip component Alg4
IntelProfilerAu... DEBUG Pause
IntelProfilerAu... INFO Stop profiling (event #2)
ApplicationMgr INFO Application Manager Stopped successfully
EventLoopMgr INFO Histograms converted successfully according to request.
ApplicationMgr INFO Application Manager Finalized successfully
ApplicationMgr INFO Application Manager Terminated successfully
Using result path `/data/amazurov/Amplifier/IntelProfilingExample/r050hs`
Here, `/data/amazurov/Amplifier/IntelProfilingExample/r050hs` is our profiling database.
Analyze
GUI
$> amplxe-gui ~/profiler/r000hs
Command line
Hot spots report:
$> amplxe-cl -report hotspots -r ~/profiler/r000hs
If the options file looks like this:
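from Gaudi.Configuration import *
from Configurables import IntelProfilerAuditor, CpuHungryAlg

# Four instances of the same algorithm; the instance name selects the function that runs.
alg1 = CpuHungryAlg("Alg1")
alg2 = CpuHungryAlg("Alg2")
alg3 = CpuHungryAlg("Alg3")
alg4 = CpuHungryAlg("Alg4")
alg1.Loops = alg2.Loops = alg3.Loops = alg4.Loops = 5000000

subtop = Sequencer('SubSequence', Members = [alg1, alg2, alg3], StopOverride = True )
top = Sequencer('TopSequence', Members = [subtop, alg4], StopOverride = True )

profiler = IntelProfilerAuditor()
profiler.OutputLevel = DEBUG
# Profile only events 1 to 2.
profiler.StartFromEventN = 1
profiler.StopAtEventN = 2
profiler.ComponentsForTaskTypes = []
# Profile SubSequence and its members, except Alg2.
profiler.IncludeAlgorithms = ["SubSequence"]
profiler.ExcludeAlgorithms = ["Alg2"]
AuditorSvc().Auditors += [profiler]

ApplicationMgr(
    EvtSel = 'NONE',
    HistogramPersistency = 'NONE',
    TopAlg = [top],
    AuditAlgorithms=True)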
then the hot spots report shows the following:
Function              Module                      CPU Time
CpuHungryAlg::mysin   libIntelProfilerExample.so  0.410
CpuHungryAlg::myatan  libIntelProfilerExample.so  0.380
CpuHungryAlg::mytan   libIntelProfilerExample.so  0.370
CpuHungryAlg::mycos   libIntelProfilerExample.so  0.250
[Import thunk tan]    libIntelProfilerExample.so  0.010
Report by algorithm chain:
$> amplxe-cl -report hotspots -r ~/profiler/r000hs --group-by task
Result:
Task Type                     CPU Time
TopSequence SubSequence Alg3  0.759
TopSequence SubSequence Alg1  0.410
TopSequence SubSequence       0.250
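The task types in this report can be related to the IncludeAlgorithms and ExcludeAlgorithms options with a small toy model. The sketch below is inferred only from the DEBUG log and the reports on this page, not from the auditor's source code. It assumes that a component gets its own task once it is inside an included sequence, that an excluded component is billed to the enclosing task, and that everything outside an included sequence is not profiled. The function `task_labels` and the tuple-based tree are hypothetical names used for illustration, and the case of an empty IncludeAlgorithms list is not modelled.

```python
def task_labels(node, include, exclude, chain=(), active=None):
    """Map each leaf algorithm to the task label it is billed under.

    node    -- (name, [children]); a leaf has an empty child list
    include -- names that switch profiling on for their whole subtree
    exclude -- names that never get a task of their own
    active  -- label of the enclosing task, or None when profiling is paused
    A label of None means the algorithm is not profiled at all.
    """
    name, children = node
    chain = chain + (name,)
    if name in exclude:
        label = active            # assumption: time stays with the enclosing task
    elif active is not None or name in include:
        label = " ".join(chain)   # own task, named after the full chain
    else:
        label = None              # outside any included sequence
    if not children:
        return {name: label}
    result = {}
    for child in children:
        result.update(task_labels(child, include, exclude, chain, label))
    return result


# The sequence layout from the options file on this page.
tree = ("TopSequence", [
    ("SubSequence", [("Alg1", []), ("Alg2", []), ("Alg3", [])]),
    ("Alg4", []),
])

labels = task_labels(tree, include={"SubSequence"}, exclude={"Alg2"})
for alg, label in labels.items():
    print(alg, "->", label)
```

Under these assumptions Alg1 and Alg3 get their own task types, Alg2 falls under "TopSequence SubSequence" (which matches the 0.250 row above), and Alg4 is not profiled.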
Report by algorithm chain with function names:
$> amplxe-cl -report hotspots -r ~/profiler/r000hs --group-by task-function
Result:
Function              Task Type                     CPU Time
CpuHungryAlg::mysin   TopSequence SubSequence Alg1  0.410
CpuHungryAlg::myatan  TopSequence SubSequence Alg3  0.380
CpuHungryAlg::mytan   TopSequence SubSequence Alg3  0.370
CpuHungryAlg::mycos   TopSequence SubSequence       0.250
[Import thunk tan]    TopSequence SubSequence Alg3  0.010
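As a quick consistency check, the per-function rows can be summed per task type and compared with the task report. The sketch below is a minimal illustration: the rows are copied by hand from the table above, and the helper `cpu_time_per_task` is a hypothetical name, not part of the profiler tooling.

```python
from collections import defaultdict

# Rows copied from the task-function report: (function, task type, CPU time).
rows = [
    ("CpuHungryAlg::mysin", "TopSequence SubSequence Alg1", 0.410),
    ("CpuHungryAlg::myatan", "TopSequence SubSequence Alg3", 0.380),
    ("CpuHungryAlg::mytan", "TopSequence SubSequence Alg3", 0.370),
    ("CpuHungryAlg::mycos", "TopSequence SubSequence", 0.250),
    ("[Import thunk tan]", "TopSequence SubSequence Alg3", 0.010),
]


def cpu_time_per_task(rows):
    """Sum CPU time per task type, rounded to the report's three decimals."""
    totals = defaultdict(float)
    for _function, task, cpu_time in rows:
        totals[task] += cpu_time
    return {task: round(total, 3) for task, total in totals.items()}


totals = cpu_time_per_task(rows)
# Print the tasks from most to least expensive, like the task report.
for task, total in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{task:<30}{total:.3f}")
```

The sums for Alg1 and for SubSequence match the task report exactly. The sum for Alg3 comes out 0.001 above the reported 0.759, which is presumably a rounding effect in the per-function rows.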