NAME
pmc.core - measurement events for Intel Core Solo and Core Duo family CPUs
LIBRARY
library “libpmc”
SYNOPSIS
#include <pmc.h>
DESCRIPTION
Intel Core Solo and Core Duo CPUs contain PMCs conforming to version 1 of the Intel performance measurement architecture. These PMCs are documented in Volume 3: System Programming Guide, IA-32 Intel® Architecture Software Developer's Manual, Order Number 253669-027US, Intel Corporation, July 2008.
PMC Features
CPUs conforming to version 1 of the Intel performance measurement architecture contain two programmable PMCs of class PMC_CLASS_IAP. The PMCs are 40 bits wide and offer the following capabilities:
Capability | Support |
PMC_CAP_CASCADE | No |
PMC_CAP_EDGE | Yes |
PMC_CAP_INTERRUPT | Yes |
PMC_CAP_INVERT | Yes |
PMC_CAP_READ | Yes |
PMC_CAP_PRECISE | No |
PMC_CAP_SYSTEM | Yes |
PMC_CAP_TAGGING | No |
PMC_CAP_THRESHOLD | Yes |
PMC_CAP_USER | Yes |
PMC_CAP_WRITE | Yes |
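Because the counters are 40 bits wide, a raw hardware value wraps after 2^40 events. The following is a minimal sketch of the arithmetic for differencing two raw readings. The helper name pmc_delta40 is hypothetical and is not part of libpmc, and the sketch assumes the counter wrapped at most once between the two readings. Whether raw 40-bit values are ever visible to a caller depends on the driver, so treat this as an illustration of the width only.

```c
#include <stdint.h>

/* Width stated in the text above: the programmable PMCs are 40 bits wide. */
#define PMC_WIDTH_BITS	40
#define PMC_VALUE_MASK	((UINT64_C(1) << PMC_WIDTH_BITS) - 1)

/*
 * Hypothetical helper (not a libpmc function): number of events between
 * two raw 40-bit readings, assuming at most one wrap in between.
 * Unsigned subtraction followed by masking handles the wrap case.
 */
static uint64_t
pmc_delta40(uint64_t before, uint64_t after)
{
	return ((after - before) & PMC_VALUE_MASK);
}
```

A reading that goes from the maximum 40-bit value to 4 therefore yields a delta of 5 rather than a huge or negative number.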
Event Qualifiers
Event specifiers for these PMCs support the following common qualifiers:
cmask=value
- Configure the PMC to increment only if the number of configured events measured in a cycle is greater than or equal to value.
edge
- Configure the PMC to count the number of de-asserted to asserted transitions of the conditions expressed by the other qualifiers. If specified, the counter will increment only once whenever a condition becomes true, irrespective of the number of clocks during which the condition remains true.
inv
- Invert the sense of comparison when the “cmask” qualifier is present, making the counter increment when the number of events per cycle is less than the value specified by the “cmask” qualifier.
os
- Configure the PMC to count events happening at processor privilege level 0.
usr
- Configure the PMC to count events occurring at privilege levels 1, 2 or 3.
If neither the “os” nor the “usr” qualifier is specified, the default is to enable both.
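The interaction of the cmask, inv and edge qualifiers can be sketched as a small software model that replays per-cycle event counts. This is an illustration only: the struct and function names are hypothetical, the treatment of edge without cmask (condition true when at least one event occurs in the cycle) is an assumption, and the condition is assumed to start out false before the first cycle.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the qualifiers; not a libpmc type. */
struct qual {
	unsigned cmask;	/* 0 means no threshold was configured */
	bool	 inv;	/* only meaningful when cmask is non-zero */
	bool	 edge;
};

/*
 * Replay per-cycle event counts and return the value the modelled
 * counter would reach.
 */
static uint64_t
model_count(const struct qual *q, const unsigned *events, size_t ncycles)
{
	uint64_t total = 0;
	bool prev = false;	/* assumed initial state of the condition */

	for (size_t i = 0; i < ncycles; i++) {
		if (q->cmask == 0 && !q->edge) {
			/* Plain counting: add every event in the cycle. */
			total += events[i];
			continue;
		}
		bool cond;
		if (q->cmask == 0)
			cond = events[i] > 0;		/* assumption */
		else if (q->inv)
			cond = events[i] < q->cmask;
		else
			cond = events[i] >= q->cmask;
		if (q->edge) {
			/* Count only de-asserted to asserted transitions. */
			if (cond && !prev)
				total++;
		} else if (cond) {
			/* Count each cycle in which the condition holds. */
			total++;
		}
		prev = cond;
	}
	return (total);
}
```

For the per-cycle counts 0, 2, 3, 3, 1, 0, 4 the model gives 13 with no qualifiers, 4 with cmask=2, 3 with cmask=2,inv and 2 with cmask=2,edge.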
Events that require core-specificity to be specified use an additional qualifier “core=value”, where argument value is one of:
all
- Measure event conditions on all cores.
this
- Measure event conditions on this core.
The default is “this”.
Events that require an agent qualifier to be specified use an additional qualifier “agent=value”, where argument value is one of:
this
- Measure events associated with this bus agent.
any
- Measure events caused by any bus agent.
The default is “this”.
Events that require a hardware prefetch qualifier to be specified use an additional qualifier “prefetch=value”, where argument value is one of:
both
- Include all prefetches.
only
- Only count hardware prefetches.
exclude
- Exclude hardware prefetches.
The default is “both”.
Events that require a cache coherence qualifier to be specified use an additional qualifier “cachestate=value”, where argument value contains one or more of the following letters:
e
- Count cache lines in the exclusive state.
i
- Count cache lines in the invalid state.
m
- Count cache lines in the modified state.
s
- Count cache lines in the shared state.
The default is “eims”.
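A cachestate value is a set of letters, so it maps naturally onto a bit mask. The sketch below parses such a value into unit-mask bits. The function name is hypothetical, and the bit positions (I=01H, S=02H, E=04H, M=08H) are an assumption recalled from the Intel manual's MESI qualification encoding; they should be checked against the manual before use. Treating a missing value as all four states is likewise an assumption based on the “eims” default.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed unit-mask bit positions; verify against the Intel manual. */
#define UMASK_MESI_I	0x01
#define UMASK_MESI_S	0x02
#define UMASK_MESI_E	0x04
#define UMASK_MESI_M	0x08

/*
 * Hypothetical parser (not a libpmc function): convert a cachestate
 * value such as "ms" into unit-mask bits.  A NULL value selects all
 * four states.  Returns 0 on success, or -1 for an empty value or an
 * unknown letter, in which case *umask is left untouched.
 */
static int
parse_cachestate(const char *value, uint8_t *umask)
{
	uint8_t bits = 0;

	if (value == NULL)
		value = "eims";
	for (const char *p = value; *p != '\0'; p++) {
		switch (*p) {
		case 'e': bits |= UMASK_MESI_E; break;
		case 'i': bits |= UMASK_MESI_I; break;
		case 'm': bits |= UMASK_MESI_M; break;
		case 's': bits |= UMASK_MESI_S; break;
		default:
			return (-1);
		}
	}
	if (bits == 0)
		return (-1);
	*umask = bits;
	return (0);
}
```

Under these assumptions, “cachestate=ms” selects bits 0AH and the default selects 0FH.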
Event Specifiers
The following event names are case insensitive. Whitespace, hyphens and underscore characters in these names are ignored.
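The matching rule above can be sketched as a comparison that skips whitespace, hyphens and underscores and folds case. The function names are hypothetical and this is not the code libpmc uses; it only illustrates which spellings the rule treats as the same name.

```c
#include <ctype.h>
#include <stdbool.h>

/* Skip characters that are ignored in event names. */
static const char *
skip_ignored(const char *s)
{
	while (*s == '-' || *s == '_' || isspace((unsigned char)*s))
		s++;
	return (s);
}

/*
 * Hypothetical comparison (not a libpmc function): true if the two
 * names are equal once case and ignored characters are disregarded.
 */
static bool
event_name_equal(const char *a, const char *b)
{
	for (;;) {
		a = skip_ignored(a);
		b = skip_ignored(b);
		if (*a == '\0' || *b == '\0')
			return (*a == *b);	/* both must end together */
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return (false);
		a++;
		b++;
	}
}
```

With this rule, “Br_Instr_Ret”, “brinstrret” and “BR-INSTR RET” all name the same event.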
Core PMCs support the following events:
BAClears
- (Event E6H, Umask 00H) The number of BAClear conditions asserted.
BTB_Misses
- (Event E2H, Umask 00H) The number of branches for which the branch table buffer did not produce a prediction.
Br_BAC_Missp_Exec
- (Event 8AH, Umask 00H) The number of branch instructions executed that were mispredicted at the front end.
Br_Bogus
- (Event E4H, Umask 00H) The number of bogus branches.
Br_Call_Exec
- (Event 92H, Umask 00H) The number of CALL instructions executed.
Br_Call_Missp_Exec
- (Event 93H, Umask 00H) The number of CALL instructions executed that were mispredicted.
Br_Cnd_Exec
- (Event 8BH, Umask 00H) The number of conditional branch instructions executed.
Br_Cnd_Missp_Exec
- (Event 8CH, Umask 00H) The number of conditional branch instructions executed that were mispredicted.
Br_Ind_Call_Exec
- (Event 94H, Umask 00H) The number of indirect CALL instructions executed.
Br_Ind_Exec
- (Event 8DH, Umask 00H) The number of indirect branches executed.
Br_Ind_Missp_Exec
- (Event 8EH, Umask 00H) The number of indirect branch instructions executed that were mispredicted.
Br_Inst_Exec
- (Event 88H, Umask 00H) The number of branch instructions executed including speculative branches.
Br_Instr_Decoded
- (Event E0H, Umask 00H) The number of branch instructions decoded.
Br_Instr_Ret
- (Event C4H, Umask 00H) (Alias "Branch Instruction Retired") The number of branch instructions retired. This is an architectural performance event.
Br_MisPred_Ret
- (Event C5H, Umask 00H) (Alias "Branch Misses Retired") The number of mispredicted branch instructions retired. This is an architectural performance event.
Br_MisPred_Taken_Ret
- (Event CAH, Umask 00H) The number of taken and mispredicted branches retired.
Br_Missp_Exec
- (Event 89H, Umask 00H) The number of branch instructions executed and mispredicted at execution including branches that were not predicted.
Br_Ret_BAC_Missp_Exec
- (Event 91H, Umask 00H) The number of return branch instructions that were mispredicted at the front end.
Br_Ret_Exec
- (Event 8FH, Umask 00H) The number of return branch instructions executed.
Br_Ret_Missp_Exec
- (Event 90H, Umask 00H) The number of return branch instructions executed that were mispredicted.
Br_Taken_Ret
- (Event C9H, Umask 00H) The number of taken branches retired.
Bus_BNR_Clocks
- (Event 61H, Umask 00H) The number of external bus cycles while BNR (bus not ready) was asserted.
Bus_DRDY_Clocks [,agent=agent]
- (Event 62H, Umask 00H) The number of external bus cycles while DRDY was asserted.
Bus_Data_Rcv
- (Event 64H, Umask 40H) The number of cycles during which the processor is busy receiving data.
Bus_Locks_Clocks [,core=core]
- (Event 63H) The number of external bus cycles while the bus lock signal was asserted.
Bus_Not_In_Use [,core=core]
- (Event 7DH) The number of cycles when there is no transaction from the core.
Bus_Req_Outstanding [,agent=agent] [,core=core]
- (Event 60H) The weighted cycles of cacheable bus data read requests from the data cache unit or hardware prefetcher.
Bus_Snoop_Stall
- (Event 7EH, Umask 00H) The number of bus cycles while a bus snoop is stalled.
Bus_Snoops [,agent=agent] [,cachestate=mesi]
- (Event 77H) The number of snoop responses to bus transactions.
Bus_Trans_Any [,agent=agent]
- (Event 70H) The number of completed bus transactions.
Bus_Trans_Brd [,core=core]
- (Event 65H) The number of read bus transactions.
Bus_Trans_Burst [,agent=agent]
- (Event 6EH) The number of completed burst transactions. Retried transactions may be counted more than once.
Bus_Trans_Def [,core=core]
- (Event 6DH) The number of completed deferred transactions.
Bus_Trans_IO [,agent=agent] [,core=core]
- (Event 6CH) The number of completed I/O transactions counting both reads and writes.
Bus_Trans_Ifetch [,agent=agent] [,core=core]
- (Event 68H) Completed instruction fetch transactions.
Bus_Trans_Inval [,agent=agent] [,core=core]
- (Event 69H) The number of completed invalidate transactions.
Bus_Trans_Mem [,agent=agent]
- (Event 6FH) The number of completed memory transactions.
Bus_Trans_P [,agent=agent] [,core=core]
- (Event 6BH) The number of completed partial transactions.
Bus_Trans_Pwr [,agent=agent] [,core=core]
- (Event 6AH) The number of completed partial write transactions.
Bus_Trans_RFO [,agent=agent] [,core=core]
- (Event 66H) The number of completed read-for-ownership transactions.
Bus_Trans_WB [,agent=agent]
- (Event 67H) The number of completed write-back transactions from the data cache unit, excluding L2 write-backs.
Cycles_Div_Busy
- (Event 14H, Umask 00H) The number of cycles the divider is busy. The event is only available on PMC0.
Cycles_Int_Masked
- (Event C6H, Umask 00H) The number of cycles while interrupts were disabled.
Cycles_Int_Pending_Masked
- (Event C7H, Umask 00H) The number of cycles while interrupts were disabled and interrupts were pending.
- (Event 78H) The number of data cache unit snoops to L1 cache lines in the shared state.
DCache_Cache_Lock [,cachestate=mesi]
- (Event 42H) The number of cacheable locked read operations to invalid state.
DCache_Cache_LD [,cachestate=mesi]
- (Event 40H) The number of cacheable L1 data read operations.
DCache_Cache_ST [,cachestate=mesi]
- (Event 41H) The number of cacheable L1 data write operations.
DCache_M_Evict
- (Event 47H, Umask 00H) The number of M state data cache lines that were evicted.
DCache_M_Repl
- (Event 46H, Umask 00H) The number of M state data cache lines that were allocated.
DCache_Pend_Miss
- (Event 48H, Umask 00H) The weighted cycles an L1 miss was outstanding.
DCache_Repl
- (Event 45H, Umask 0FH) The number of data cache line replacements.
Data_Mem_Cache_Ref
- (Event 44H, Umask 02H) The number of cacheable read and write operations to L1 data cache.
Data_Mem_Ref
- (Event 43H, Umask 01H) The number of L1 data reads and writes, both cacheable and un-cacheable.
Dbus_Busy [,core=core]
- (Event 22H) The number of core cycles during which the data bus was busy.
Dbus_Busy_Rd [,core=core]
- (Event 23H) The number of cycles during which the data bus was busy transferring data to a core.
Div
- (Event 13H, Umask 00H) The number of divide operations including speculative operations for integer and floating point divides. This event can only be counted on PMC1.
Dtlb_Miss
- (Event 49H, Umask 00H) The number of data references that missed the TLB.
ESP_Uops
- (Event D7H, Umask 00H) The number of ESP folding instructions decoded.
EST_Trans [,trans=transition]
- (Event 3AH) Count the number of Intel Enhanced SpeedStep transitions. The argument transition can be one of the following values:
any
- (Umask 00H) Count all transitions.
frequency
- (Umask 01H) Count frequency transitions.
The default is “any”.
FP_Assist
- (Event 11H, Umask 00H) The number of floating point operations that required microcode assists. The event is only available on PMC1.
FP_Comp_Instr_Ret
- (Event C1H, Umask 00H) The number of X87 floating point compute instructions retired. The event is only available on PMC0.
FP_Comps_Op_Exe
- (Event 10H, Umask 00H) The number of floating point computational instructions executed.
FP_MMX_Trans
- (Event CCH, Umask 01H) The number of transitions from X87 to MMX.
Fused_Ld_Uops_Ret
- (Event DAH, Umask 01H) The number of fused load uops retired.
Fused_St_Uops_Ret
- (Event DAH, Umask 02H) The number of fused store uops retired.
Fused_Uops_Ret
- (Event DAH, Umask 00H) The number of fused uops retired.
HW_Int_Rx
- (Event C8H, Umask 00H) The number of hardware interrupts received.
ICache_Misses
- (Event 81H, Umask 00H) The number of instruction fetch misses in the instruction cache and streaming buffers.
ICache_Reads
- (Event 80H, Umask 00H) The number of instruction fetches from the instruction cache and streaming buffers counting both cacheable and un-cacheable fetches.
IFU_Mem_Stall
- (Event 86H, Umask 00H) The number of cycles the instruction fetch unit was stalled while waiting for data from memory.
ILD_Stall
- (Event 87H, Umask 00H) The number of instruction length decoder stalls.
ITLB_Misses
- (Event 85H, Umask 00H) The number of instruction TLB misses.
Instr_Decoded
- (Event D0H, Umask 00H) The number of instructions decoded.
Instr_Ret
- (Event C0H, Umask 00H) (Alias "Instruction Retired") The number of instructions retired. This is an architectural performance event.
L1_Pref_Req
- (Event 4FH, Umask 00H) The number of L1 prefetch requests due to data cache misses.
L2_ADS [,core=core]
- (Event 21H) The number of L2 address strobes.
L2_IFetch [,cachestate=mesi] [,core=core]
- (Event 28H) The number of instruction fetches by the instruction fetch unit from L2 cache including speculative fetches.
L2_LD [,cachestate=mesi] [,core=core]
- (Event 29H) The number of L2 cache reads.
L2_Lines_In [,core=core] [,prefetch=prefetch]
- (Event 24H) The number of L2 cache lines allocated.
L2_Lines_Out [,core=core] [,prefetch=prefetch]
- (Event 26H) The number of L2 cache lines evicted.
L2_M_Lines_In [,core=core]
- (Event 25H) The number of L2 M state cache lines allocated.
L2_M_Lines_Out [,core=core] [,prefetch=prefetch]
- (Event 27H) The number of L2 M state cache lines evicted.
L2_No_Request_Cycles [,cachestate=mesi] [,core=core] [,prefetch=prefetch]
- (Event 32H) The number of cycles there was no request to access L2 cache.
L2_Reject_Cycles [,cachestate=mesi] [,core=core] [,prefetch=prefetch]
- (Event 30H) The number of cycles the L2 cache was busy and rejecting new requests.
L2_Rqsts [,cachestate=mesi] [,core=core] [,prefetch=prefetch]
- (Event 2EH) The number of L2 cache requests.
L2_ST [,cachestate=mesi] [,core=core]
- (Event 2AH) The number of L2 cache writes including speculative writes.
LD_Blocks
- (Event 03H, Umask 00H) The number of load operations delayed due to store buffer blocks.
LLC_Misses
- (Event 2EH, Umask 41H) The number of cache misses for references to the last level cache, excluding misses due to hardware prefetches. This is an architectural performance event.
LLC_Reference
- (Event 2EH, Umask 4FH) The number of references to the last level cache, excluding those due to hardware prefetches. This is an architectural performance event.
MMX_Assist
- (Event CDH, Umask 00H) The number of EMMX instructions executed.
MMX_FP_Trans
- (Event CCH, Umask 00H) The number of transitions from MMX to X87.
MMX_Instr_Exec
- (Event B0H, Umask 00H) The number of MMX instructions executed excluding MOVQ and MOVD stores.
MMX_Instr_Ret
- (Event CEH, Umask 00H) The number of MMX instructions retired.
Misalign_Mem_Ref
- (Event 05H, Umask 00H) The number of misaligned data memory references, counting loads and stores.
Mul
- (Event 12H, Umask 00H) The number of multiply operations including speculative floating point and integer multiplies. This event is available on PMC1 only.
NonHlt_Ref_Cycles
- (Event 3CH, Umask 01H) (Alias "Unhalted Reference Cycles") The number of non-halted bus cycles. This is an architectural performance event.
Pref_Rqsts_Dn
- (Event F8H, Umask 00H) The number of hardware prefetch requests issued in backward streams.
Pref_Rqsts_Up
- (Event F0H, Umask 00H) The number of hardware prefetch requests issued in forward streams.
Resource_Stall
- (Event A2H, Umask 00H) The number of cycles where there is a resource related stall.
SD_Drains
- (Event 04H, Umask 00H) The number of cycles while draining store buffers.
SIMD_FP_DP_P_Ret
- (Event D8H, Umask 02H) The number of SSE/SSE2 packed double precision instructions retired.
SIMD_FP_DP_P_Comp_Ret
- (Event D9H, Umask 02H) The number of SSE/SSE2 packed double precision compute instructions retired.
SIMD_FP_DP_S_Ret
- (Event D8H, Umask 03H) The number of SSE/SSE2 scalar double precision instructions retired.
SIMD_FP_DP_S_Comp_Ret
- (Event D9H, Umask 03H) The number of SSE/SSE2 scalar double precision compute instructions retired.
SIMD_FP_SP_P_Comp_Ret
- (Event D9H, Umask 00H) The number of SSE/SSE2 packed single precision compute instructions retired.
SIMD_FP_SP_Ret
- (Event D8H, Umask 00H) The number of SSE/SSE2 single precision instructions retired, both packed and scalar.
SIMD_FP_SP_S_Ret
- (Event D8H, Umask 01H) The number of SSE/SSE2 scalar single precision instructions retired.
SIMD_FP_SP_S_Comp_Ret
- (Event D9H, Umask 01H) The number of SSE/SSE2 scalar single precision compute instructions retired.
SIMD_Int_128_Ret
- (Event D8H, Umask 04H) The number of SSE2 128-bit integer instructions retired.
SIMD_Int_Pari_Exec
- (Event B3H, Umask 20H) The number of SIMD integer packed arithmetic instructions executed.
SIMD_Int_Pck_Exec
- (Event B3H, Umask 04H) The number of SIMD integer pack operations instructions executed.
SIMD_Int_Plog_Exec
- (Event B3H, Umask 10H) The number of SIMD integer packed logical instructions executed.
SIMD_Int_Pmul_Exec
- (Event B3H, Umask 01H) The number of SIMD integer packed multiply instructions executed.
SIMD_Int_Psft_Exec
- (Event B3H, Umask 02H) The number of SIMD integer packed shift instructions executed.
SIMD_Int_Sat_Exec
- (Event B1H, Umask 00H) The number of SIMD integer saturating instructions executed.
SIMD_Int_Upck_Exec
- (Event B3H, Umask 08H) The number of SIMD integer unpack instructions executed.
SMC_Detected
- (Event C3H, Umask 00H) The number of times self-modifying code was detected.
SSE_NTStores_Miss
- (Event 4BH, Umask 03H) The number of times an SSE streaming store instruction missed all caches.
SSE_NTStores_Ret
- (Event 07H, Umask 03H) The number of SSE streaming store instructions executed.
SSE_PrefNta_Miss
- (Event 4BH, Umask 00H) The number of times PREFETCHNTA missed all caches.
SSE_PrefNta_Ret
- (Event 07H, Umask 00H) The number of PREFETCHNTA instructions retired.
SSE_PrefT1_Miss
- (Event 4BH, Umask 01H) The number of times PREFETCHT1 missed all caches.
SSE_PrefT1_Ret
- (Event 07H, Umask 01H) The number of PREFETCHT1 instructions retired.
SSE_PrefT2_Miss
- (Event 4BH, Umask 02H) The number of times PREFETCHT2 missed all caches.
SSE_PrefT2_Ret
- (Event 07H, Umask 02H) The number of PREFETCHT2 instructions retired.
Seg_Reg_Loads
- (Event 06H, Umask 00H) The number of segment register loads.
Serial_Execution_Cycles
- (Event 3CH, Umask 02H) The number of non-halted bus cycles of this core while the other core was halted.
Thermal_Trip
- (Event 3BH, Umask C0H) The duration in a thermal trip based on the current core clock.
Unfusion
- (Event DBH, Umask 00H) The number of unfusion events.
Unhalted_Core_Cycles
- (Event 3CH, Umask 00H) The number of core clock cycles when the clock signal on a specific core is not halted. This is an architectural performance event.
Uops_Ret
- (Event C2H, Umask 00H) The number of micro-ops retired.
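Each entry above gives an event code and a unit mask. As a rough illustration of how such a pair combines with the common qualifiers, the sketch below packs them into a 32-bit event-select value using the architectural IA32_PERFEVTSELx layout (event in bits 0-7, unit mask in bits 8-15, USR bit 16, OS bit 17, edge bit 18, enable bit 22, invert bit 23, counter mask in bits 24-31). The function and macro names are hypothetical, and whether the driver builds the register value in exactly this way is an assumption.

```c
#include <stdint.h>

/* Flag bits of the architectural event-select register layout. */
#define EVTSEL_USR	(UINT32_C(1) << 16)
#define EVTSEL_OS	(UINT32_C(1) << 17)
#define EVTSEL_EDGE	(UINT32_C(1) << 18)
#define EVTSEL_EN	(UINT32_C(1) << 22)
#define EVTSEL_INV	(UINT32_C(1) << 23)

/*
 * Hypothetical encoder (not a libpmc function).  If neither privilege
 * flag is given, both are enabled, mirroring the documented default
 * for the "os" and "usr" qualifiers.
 */
static uint32_t
evtsel_encode(uint8_t event, uint8_t umask, uint32_t flags, uint8_t cmask)
{
	if ((flags & (EVTSEL_USR | EVTSEL_OS)) == 0)
		flags |= EVTSEL_USR | EVTSEL_OS;
	return ((uint32_t)event | ((uint32_t)umask << 8) | flags |
	    ((uint32_t)cmask << 24));
}
```

Under this layout, Instr_Ret (Event C0H, Umask 00H) counted in user mode only with the counter enabled encodes as 004100C0H.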
Event Name Aliases
The following table shows the mapping between the PMC-independent aliases supported by library “libpmc” and the underlying hardware events used.
Alias | Event |
branches |
Br_Instr_Ret |
branch-mispredicts |
Br_MisPred_Ret |
dc-misses |
(unsupported) |
ic-misses |
ICache_Misses |
instructions |
Instr_Ret |
interrupts |
HW_Int_Rx |
unhalted-cycles |
(unsupported) |
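The alias table above can be expressed as a small lookup array. This is a sketch for illustration, with hypothetical type and function names; it is not the table libpmc uses internally. Unsupported aliases are represented by a NULL event, so a caller cannot distinguish them from unknown aliases without a separate check.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping type; NULL marks an unsupported alias. */
struct alias_map {
	const char *alias;
	const char *event;
};

static const struct alias_map core_aliases[] = {
	{ "branches",		"Br_Instr_Ret" },
	{ "branch-mispredicts",	"Br_MisPred_Ret" },
	{ "dc-misses",		NULL },
	{ "ic-misses",		"ICache_Misses" },
	{ "instructions",	"Instr_Ret" },
	{ "interrupts",		"HW_Int_Rx" },
	{ "unhalted-cycles",	NULL },
};

/*
 * Return the hardware event name for an alias, or NULL if the alias is
 * unsupported on these CPUs or not listed in the table.
 */
static const char *
alias_to_event(const char *alias)
{
	size_t n = sizeof(core_aliases) / sizeof(core_aliases[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(core_aliases[i].alias, alias) == 0)
			return (core_aliases[i].event);
	}
	return (NULL);
}
```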
PROCESSOR ERRATA
The following errata affect performance measurement on these processors. These errata are documented in Intel® Core™ Duo Processor and Intel® Core™ Solo Processor on 65 nm Process, Specification Update, Order Number 309222-017, Intel Corporation, July 2008.
- AE19
- Data prefetch performance monitoring events can only be enabled on a single core.
- AE25
- Performance monitoring counters that count external bus events may report incorrect values after processor power state transitions.
- AE28
- Performance monitoring events for retired floating point operations (C1H) may not be accurate.
- AE29
- DR3 address match on MOVD/MOVQ/MOVNTQ memory store instruction may incorrectly increment performance monitoring count for saturating SIMD instructions retired (Event CFH).
- AE33
- Hardware prefetch performance monitoring events may be counted inaccurately.
- AE36
- The CPU_CLK_UNHALTED performance monitoring event (Event 3CH) counts clocks when the processor is in the C1/C2 processor power states.
- AE39
- Certain performance monitoring counters related to bus, L2 cache and power management are inaccurate.
- AE51
- Performance monitoring events for retired instructions (Event C0H) may not be accurate.
- AE67
- Performance monitoring event FP_ASSIST may not be accurate.
- AE78
- Performance monitoring event for hardware prefetch requests (Event 4EH) and hardware prefetch request cache misses (Event 4FH) may not be accurate.
- AE82
- Performance monitoring event FP_MMX_TRANS_TO_MMX may not count some transitions.
SEE ALSO
pmc(3), pmc.atom(3), pmc.core2(3), pmc.iaf(3), pmc.k7(3), pmc.k8(3), pmc.p4(3), pmc.p5(3), pmc.p6(3), pmc.soft(3), pmc.tsc(3), pmclog(3), hwpmc(4)
HISTORY
The pmc library first appeared in FreeBSD 6.0.
AUTHORS
The library “libpmc” was written by Joseph Koshy <jkoshy@FreeBSD.org>.