NAME
acpicpu - ACPI CPU
SYNOPSIS
acpicpu* at cpu?
DESCRIPTION
The acpicpu device driver supports certain processor features that are either
only available via ACPI or that require ACPI to function properly. Typically
the ACPI processor functionality is grouped into so-called C-, P-, and
T-states.
C-states
The processor power states, or C-states, are low-power modes that can be used
when the CPU is idle. The idea is not new: as early as the 80486 processor, a
specific instruction (HLT) was used for this purpose. It was later
complemented by a pair of other instructions (MONITOR, MWAIT). By default,
NetBSD may use either mechanism; see the machdep.idle-mechanism sysctl(8)
variable. ACPI provides the most recent addition to these mechanisms.
The following C-states are typically available. Additional processor or vendor
specific states (C4, ..., Cn) are handled internally by acpicpu.
- C0: This is the normal state of a processor; the CPU is busy executing
  instructions.
- C1: This is the state that is typically reached via the x86 instructions
  mentioned above. On a typical processor, C1 turns off the main internal CPU
  clock, leaving the APIC running at full speed. The CPU is free to temporarily
  leave the state to deal with important requests.
- C2: The main difference between C1 and C2 lies in the internal hardware
  entry method of the processor. While less power is expected to be consumed
  than in C1, the bus interface unit is still running. Depending on the
  processor, however, the local APIC timer may be stopped. As with C1, entering
  and exiting the state are expected to be fast operations.
- C3: This is the deepest conventional state. Parts of the CPU are actively
  powered down. The internal CPU clock is stopped. The local APIC timer is
  stopped. Depending on the processor, additional timers such as x86/tsc(9)
  may be stopped. Processor caches may be flushed. Entry and exit latencies
  are expected to be high; the CPU can no longer “quickly” respond to bus
  activity or other interruptions.
Each state has a latency associated with entry and exit. The higher the state
number, the lower the power consumption, and the higher the potential
performance costs.
The acpicpu driver tries to balance the latency constraints when choosing the
appropriate state. One of the checks involves bus master activity; if such
activity is detected, a lower state is used. In particular, usb(4) is known to
cause high bus master activity even when not in use. If maximum power savings
are desirable, it may be necessary to use a custom kernel without USB support.
More generally, to save power with C-states, one should avoid polling, both in
userland and in the kernel.
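The trade-off described above can be illustrated with a small model. The
following is only a sketch and not the algorithm used by acpicpu: the latency
figures, the CState type, and the pick_cstate() function are hypothetical and
exist only for this example. It picks the deepest state whose latency fits the
expected idle period, and stays out of C3 when bus master activity was seen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CState:
    name: str
    latency_us: int  # illustrative combined entry and exit latency


# Hypothetical table, ordered from the shallowest to the deepest state.
# On a real system the values are supplied by the firmware.
CSTATES = [
    CState("C1", 1),
    CState("C2", 50),
    CState("C3", 500),
]


def pick_cstate(expected_idle_us, bus_master_active, states=CSTATES):
    """Return the deepest state whose latency fits the expected idle time.

    When bus master activity was detected, C3 is avoided, because a CPU
    in C3 can no longer respond quickly to bus activity.
    """
    best = states[0]  # the shallowest state is always acceptable here
    for state in states[1:]:
        if state.latency_us > expected_idle_us:
            break
        if bus_master_active and state.name == "C3":
            break
        best = state
    return best
```

For instance, pick_cstate(1000, False) selects C3 in this model, whereas the
same idle period with bus master activity falls back to C2.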
P-states
The processor performance states, or P-states, are used to control the clock
frequencies and voltages of a CPU. Underneath the abstractions of ACPI,
P-states are associated with such technologies as “SpeedStep”
(Intel), “PowerNow!” (AMD), and “PowerSaver” (VIA).
The P0-state is always the highest operating frequency supported by the
processor. The number of additional P-states may vary across processors and
vendors. Each higher-numbered P-state represents a lower clock frequency and
hence lower power consumption. Note that while acpicpu always uses the exact
frequencies internally, the user-visible values reported by ACPI may be
rounded or approximated by the vendor.
Unlike conventional CPU frequency management, ACPI provides support for Dynamic
Voltage and Frequency Scaling (DVFS). Among other things, this means that the
firmware may request that the implementation dynamically scale the presently
supported maximum or minimum clock frequency. For example, if the AC adapter
(see acpiacad(4)) is disconnected, the maximum available frequency may be
lowered. By default, the NetBSD implementation may manipulate the frequencies
according to the notifications from the firmware.
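The numbering of P-states and the effect of a firmware-imposed limit can be
sketched as follows. This is an illustration only: the frequencies are made
up, and pstates_from_frequencies() and usable_pstates() are hypothetical
helpers, not interfaces provided by acpicpu or by ACPI.

```python
def pstates_from_frequencies(freqs_mhz):
    """Number the P-states so that P0 is the highest frequency."""
    ordered = sorted(set(freqs_mhz), reverse=True)
    return {f"P{i}": freq for i, freq in enumerate(ordered)}


def usable_pstates(pstates, max_mhz=None, min_mhz=None):
    """Keep only the states inside the limits requested by the firmware."""
    return {
        name: freq
        for name, freq in pstates.items()
        if (max_mhz is None or freq <= max_mhz)
        and (min_mhz is None or freq >= min_mhz)
    }


# Made-up frequencies for a processor with four P-states.
table = pstates_from_frequencies([800, 2400, 1600, 2000])

# The firmware lowers the maximum, for example after AC power is removed.
on_battery = usable_pstates(table, max_mhz=1600)
```

In this model the states keep their numbers when the limit changes; only the
set of states that may be selected shrinks.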
T-states
Processor T-states, or “throttling states”, can be used to actively
modulate the time a processor is allowed to execute. Outside the ACPI
nomenclature, throttling and T-states may be known as “on-demand clock
modulation” (ODCM).
The concept of “duty cycle” is relevant to T-states. It is generally
defined as the fraction of time that a system is in an “active”
state. The T0-state always has a duty cycle of 100 %, and thus, comparable to
the C0-state, the processor is fully active. Each additional higher-numbered
T-state indicates a lower duty cycle. At most eight T-states may be available,
although T-states are also subject to DVFS.
The duty cycle does not refer to the actual clock signal, but to the time period
in which the clock signal is allowed to drive the processor chip. For
instance, if a T-state has a duty cycle of 75 %, the CPU runs at the same
clock frequency and uses the same voltage, but 25 % of the time the CPU is
forced to idle. Because of this, the use of T-states may severely affect
system performance.
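The arithmetic behind the duty cycle can be written out as a short sketch. It
assumes eight T-states with evenly spaced duty cycles, which is an assumption
made for this example; the actual values are reported by the firmware and may
differ. The function names are hypothetical.

```python
def duty_cycle(t_state, n_states=8):
    """Duty cycle of a T-state, assuming evenly spaced steps from T0."""
    if not 0 <= t_state < n_states:
        raise ValueError("no such T-state")
    return 1.0 - t_state / n_states


def forced_idle_fraction(t_state, n_states=8):
    """Fraction of time during which the CPU is forced to idle."""
    return 1.0 - duty_cycle(t_state, n_states)


def effective_mhz(clock_mhz, t_state, n_states=8):
    """Average rate of useful cycles; the clock frequency is unchanged."""
    return clock_mhz * duty_cycle(t_state, n_states)
```

Under these assumptions T2 has a duty cycle of 75 %: a processor clocked at
2000 MHz still runs at 2000 MHz, but is forced to idle 25 % of the time, so
it performs roughly the work of a 1500 MHz part.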
There are two typical situations for throttling: power management and thermal
control. As a technique to save power, T-states are largely an artifact from
the past. There was a short period in the x86 lineage when P-states were not
yet available and throttling was considered as an option to modulate the
processor power consumption. The approach was, however, quickly abandoned. In
modern x86 systems P-states should be preferred in all circumstances. It is
also more beneficial to move from the C0-state to deeper C-states than it is
to actively force down the duty cycle of a processor.
But T-states have retained their use as a last line of defense against critical
thermal conditions. Many x86 processors include a catastrophic shutdown
detector. When the processor core temperature reaches this factory-defined
trip-point, processor execution is halted without any software control.
Before this fatal condition, it is possible to use throttling for a short
period of time in order to force the temperatures to lower levels. The thermal
control modulation is typically started only when the system is in the
highest-power P-state and a high temperature situation exists. After the
temperatures have returned to non-critical levels, the modulation ceases.
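The policy described in the preceding paragraph can be modelled as a single
decision step. This is a hedged sketch rather than the behavior of any actual
driver: next_t_state() and its parameters are hypothetical, and hot_c stands
for a software threshold chosen below the shutdown trip-point of the
processor.

```python
def next_t_state(current_t, temp_c, in_highest_power_pstate, hot_c,
                 deepest_t=7):
    """Return the T-state to use after one evaluation of the policy.

    Throttling is started, or deepened by one step, only when the
    temperature is high and the system is in the highest-power P-state.
    Once the temperature has dropped below the threshold, the
    modulation ceases and the processor returns to T0.
    """
    if temp_c < hot_c:
        return 0
    if in_highest_power_pstate:
        return min(current_t + 1, deepest_t)
    return current_t
```

A real implementation would also need hysteresis and a time limit, because
throttling is meant to be used only for a short period of time.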
System Control Variables
The acpicpu driver uses the same sysctl(8) controls for P-states as the ones
provided by est(4) and powernow(4). Depending on the processor, the
second-level node is either machdep.est or machdep.powernow. Please note that
future versions of acpicpu may remove these system control variables without
further notice.
In addition, the following two variables are available.
- hw.acpi.cpu.dynamic: A boolean that controls whether the states are allowed
  to change dynamically. When enabled, C-, P-, and T-states may all change at
  runtime, and acpicpu may also take actions based on requests from the
  firmware.
- hw.acpi.cpu.passive: A boolean that enables or disables automatic processor
  thermal management via acpitz(4).
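The two variables can be inspected and changed with sysctl(8). The following
sketch wraps the command from a script; it assumes that the utility prints a
boolean as 0 or 1 when invoked with the -n flag, and the helper names are
hypothetical. Writing a variable requires appropriate privileges.

```python
import subprocess


def sysctl_read_argv(name):
    """Command line that prints only the value of a variable."""
    return ["sysctl", "-n", name]


def sysctl_write_argv(name, value):
    """Command line that sets a variable to the given value."""
    return ["sysctl", "-w", f"{name}={value}"]


def read_bool(name):
    """Run sysctl(8) and interpret the printed value as a boolean.

    Assumes the value is printed as an integer (0 or 1).
    """
    result = subprocess.run(sysctl_read_argv(name), check=True,
                            capture_output=True, text=True)
    return int(result.stdout.strip()) != 0
```

For example, sysctl_write_argv("hw.acpi.cpu.dynamic", 0) yields the command
line that disables dynamic state changes.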
Statistics
The acpicpu driver uses event counters to track the number of times a
processor has entered a given state. The statistics can be viewed with
vmstat(1) (with the -e flag).
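A script may collect the counters by filtering the output of vmstat(1). The
sketch below makes no assumption about the layout of the output or about the
names of the counters, other than that the relevant lines mention acpicpu;
the function names are hypothetical.

```python
import subprocess


def acpicpu_counters(vmstat_output):
    """Return the lines of event counter output that mention acpicpu."""
    return [line for line in vmstat_output.splitlines()
            if "acpicpu" in line]


def read_counters():
    """Run vmstat(1) with the -e flag and keep the acpicpu lines."""
    result = subprocess.run(["vmstat", "-e"], check=True,
                            capture_output=True, text=True)
    return acpicpu_counters(result.stdout)
```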
SEE ALSO
acpi(4),
acpitz(4),
est(4),
odcm(4),
powernow(4),
cpu_idle(9)
Etienne Le Sueur and Gernot Heiser, Dynamic Voltage and Frequency Scaling: The
Laws of Diminishing Returns, Proceedings of the 2010 Workshop on Power Aware
Computing and Systems (HotPower'10), October, 2010,
http://www.ertos.nicta.com.au/publications/papers/LeSueur_Heiser_10.pdf.

David C. Snowdon, Operating System Directed Power Management, PhD Thesis,
School of Computer Science and Engineering, University of New South Wales,
March, 2010, http://ertos.nicta.com.au/publications/papers/Snowdon:phd.pdf.

Microsoft Corporation, Windows Native Processor Performance Control, Version
1.1a, November, 2002,
http://msdn.microsoft.com/en-us/windows/hardware/gg463343.

Venkatesh Pallipadi and Alexey Starikovskiy, The Ondemand Governor. Past,
Present, and Future, Intel Open Source Technology Center, Proceedings of the
Linux Symposium, July, 2006,
http://www.kernel.org/doc/ols/2006/ols2006v2-pages-223-238.pdf.
HISTORY
The acpicpu device driver appeared in NetBSD 6.0.
AUTHORS
Jukka Ruohonen ⟨jruohonen@iki.fi⟩
CAVEATS
At least the following caveats can be mentioned.
- It is currently only safe to use C1 on NetBSD. All other C-states are
  disabled by default.
- Processor thermal control (see acpitz(4)) is not yet supported.
- Depending on the processor, changes in C-, P-, and T-states may all skew
  timers and counters such as x86/tsc(9). This is handled neither by acpicpu
  nor by est(4) or powernow(4).
- There is currently neither a well-defined, machine-independent API for
  processor performance management nor a “governor” for different policies.
  It is only possible to control the CPU frequencies from userland.