Measuring power consumption on Linux

...with PCP and pmda-denki

Christian Horn

@Tokyo Linux User Group, November, 2024

We’ll have Q&A time at the end, but feel free to interrupt me.

Slides also hosted at:

https://denki.fluxcoil.net

Agenda

  • Theory
    • What is pmda-denki, why was it written?
    • Prior art / related software
    • A typical power measurement setup
    • Implemented sources for power metrics
  • Practical usage
    • How much power do WLAN, NIC, etc. use?
    • How much energy do GPU and memory consume when playing a video?
    • For a given task, which system is most energy efficient?
  • The future
  • Links

What is pmda-denki?

  • A piece of the Performance Co-Pilot (PCP) suite
  • A Performance Metrics Domain Agent (PMDA) is a specialist for measurements in one area:
    • pmda-postgresql: PostgreSQL metrics
    • pmda-linux: Linux memory, network, etc.
    • pmda-denki: power consumption
  • denki (Japanese 電気): “electricity”

Why was pmda-denki written?

  • For insights like “Your commit makes the workload 5% faster, but uses 20% more power” or “In one day, this system uses the energy equivalent of XY liters of kerosene”
  • To get greener
  • Because I was curious about some things, and no proper tool for exploring them was around (see the ‘Practical usage’ section later)

Prior art / related software

  • 2007: powertop, the default tool on x86/ARM/RISC-V for viewing live power metrics and optimizing consumption. Not abstracted into a library, no historical metrics.
  • 2021: pmda-denki, power consumption metrics for PCP. The name first appeared at pcp.conf 2019. Via PCP we get historical recording, live monitoring, warnings (PMIE), visualization via Redis/Grafana, and anomaly detection. RHEL 8 and later ship pmda-denki.
  • 2022 (?): Kepler, heavy duty, seeded and sponsored by Red Hat/IBM. Kepler reads consumption metrics, makes estimates, and attributes consumption to individual containers. On OpenShift it helps with decisions like “shift this container to a different node which has solar-generated power right now”. Kepler is also becoming available as a source for PCP.

Test setup overview

Sources for power consumption metrics

  • RAPL readings
    • On x86, RAPL offers metrics on how much power is consumed by CPU, RAM and onboard GPU
  • Battery readings
    • For systems with battery, you can run on battery while running workloads. By observing the battery discharge level, we can calculate the consumption.
  • Smart plug readings (via pmda-openmetrics)
    • Smart plugs are inserted between power outlet and our ‘system under test’. Smart plugs hook into the network via WLAN or RJ-45 and report the electrical consumption of the connected consumer.

Further sources can be implemented via pmda-denki, or ad hoc via pmda-openmetrics.
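
As an illustration of what a RAPL reading involves, here is a minimal Python sketch that turns two readings of the kernel's cumulative energy counter into average watts. The sysfs path `/sys/class/powercap/intel-rapl:0` is an assumption about the machine (first package domain of the Intel powercap driver), the function names are made up for this example, and this is not taken from the pmda-denki source.

```python
from pathlib import Path

# Assumption: first RAPL package domain as exposed by the powercap driver.
RAPL_DOMAIN = Path("/sys/class/powercap/intel-rapl:0")


def read_counter(name):
    """Read one integer file (e.g. energy_uj) from the assumed RAPL domain."""
    return int((RAPL_DOMAIN / name).read_text())


def rapl_power_watts(start_uj, end_uj, seconds, max_range_uj):
    """Average power in watts between two energy counter readings (microjoules)."""
    delta_uj = end_uj - start_uj
    if delta_uj < 0:
        # The cumulative counter wrapped around between the two readings.
        delta_uj += max_range_uj
    return delta_uj / 1_000_000 / seconds
```

On a real system one would read `energy_uj` twice, a known number of seconds apart, and pass `max_energy_range_uj` as the wrap-around limit. Reading `energy_uj` usually requires root.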

The sources in detail

Computing consumption from battery charge level
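
The arithmetic behind this can be sketched in a few lines of Python. The sketch assumes the battery reports its remaining energy in microwatt-hours, as the `energy_now` file under `/sys/class/power_supply/BAT0/` does on many laptops. Some batteries expose `charge_now` and `voltage_now` instead, which this sketch does not cover. The function name is hypothetical and not taken from pmda-denki.

```python
def battery_power_watts(energy_start_uwh, energy_end_uwh, seconds):
    """Average discharge power in watts between two energy readings.

    Readings are in microwatt-hours (assumed unit of energy_now);
    the system must be running on battery for the result to be meaningful.
    """
    discharged_wh = (energy_start_uwh - energy_end_uwh) / 1_000_000
    # 1 Wh equals 3600 J, and watts are joules per second.
    return discharged_wh * 3600 / seconds
```

For example, a drop from 40,000,000 µWh to 39,000,000 µWh over 360 seconds is 1 Wh in a tenth of an hour, so an average draw of 10 W. Battery readings update slowly, so longer intervals give more stable results.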

Basic installation

The control system runs Fedora 40 or RHEL 8/9:

$ sudo dnf -y install pcp-zeroconf pcp-pmda-denki pcp-pmda-openmetrics
$ cd /var/lib/pcp/pmdas/denki/ && sudo ./Install
$ cd /var/lib/pcp/pmdas/openmetrics/ && sudo ./Install

The system under test (SUT) runs RHEL 9; pmcd with pmda-denki offers the RAPL metrics:

$ sudo dnf -y install pcp pcp-pmda-denki pcp-system-tools
$ cd /var/lib/pcp/pmdas/denki/ && sudo ./Install
$ echo 'PMCD_LOCAL=0' | sudo tee -a /etc/sysconfig/pmcd
$ sudo systemctl restart pmcd
$ sudo systemctl enable pmcd

Run this command to see the power metrics:

$ pmrep -h <IP-of-SUT> denki

Reference use cases, 1

What can we learn about consumption of single hardware components?

Given task, various software

Where do we spend power?

Second test series

Let’s take a single workload, run it on multiple systems, and compare energy efficiency:

  • Setup of ‘control system’ and ‘System Under Test’ (SUT)
  • Ansible playbooks to prepare SUT: install packages, etc.
  • Python code to run the workload(s) and compute power consumption
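
The "compute power consumption" step could look roughly like the following sketch. It is not the code used for the measurements. It assumes the power readings are available as (timestamp in seconds, watts) pairs, and the function names are hypothetical.

```python
def energy_joules(samples):
    """Integrate (timestamp_s, power_w) samples with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        # Average power of two neighbouring samples times the time between them.
        total += (p0 + p1) / 2 * (t1 - t0)
    return total


def joules_per_job(samples, jobs_done):
    """Energy spent per completed job over the sampled interval."""
    return energy_joules(samples) / jobs_done
```

With samples `[(0, 10.0), (1, 10.0), (2, 20.0)]` the integral is 10 J + 15 J = 25 J, so five completed jobs cost 5 J each. Dividing energy by completed jobs makes systems of very different speed comparable.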

Job execution

Our contenders

  • ThinkPad L480: x86_64, model released 2018, an 8th gen Intel i5-8250U CPU (14nm), 4 cores without hyperthreading. On this system, all three sources for measuring power consumption are usable.
  • MacBook Pro with Fedora Asahi Remix: 10-core Apple Silicon M2 CPU (5nm), aarch64. Model from 2023. Up to 10 threads can run on separate cores.
  • Steam Deck: AMD CPU with 4 cores/8 threads (7nm), released 2022
  • Raspberry Pi 4: aarch64, a 4-core (16nm) system from 2019
  • Star64: RISC-V board with 4 cores, introduced 2023

Total power draw

Total job extractions per second

Energy consumption per single job

Our contenders, extended

  • The five systems above, plus:
  • Sun Ultra5: sparc64, 1-core UltraSPARC IIi (270 MHz, 0.35 μm (350nm)), released 1998, running Linu^WNetBSD

Energy consumption per single job incl. Sun Ultra5

Energy consumption per single job incl. Sun Ultra5 (logarithmic)

More use cases are in the pmda-denki handbook, available as HTML or PDF.

Future

  • pmda-denki can help users
    • find the best system for their workload
    • understand whether new versions of their software use more power
  • QE departments could use this to ensure power consumption does not spike. If an in-vehicle OS uses 2x the power after an erratum, batteries will be drained.
  • Worth looking into:
    • Emulating various architectures: how much overhead?
    • A community project, open for everybody to contribute consumption measurements?
    • Instead of a CPU-bound load, how about measuring consumption per I/O?
    • Measure power consumption while running Geekbench AI?
    • Plenty of ideas, even more