Installing Furiosa Metrics Exporter

The Furiosa metrics exporter exposes a collection of metrics related to FuriosaAI NPU devices in Prometheus format.

Furiosa Metrics Exporter

In a Kubernetes cluster, you can scrape the metrics provided by furiosa-metrics-exporter with Prometheus and visualize them in a Grafana dashboard. This can be set up with the Prometheus and Grafana Helm charts together with the furiosa-metrics-exporter Helm chart.

Deploying Furiosa Metrics Exporter with Helm

The Furiosa metrics exporter Helm chart is available at https://github.com/furiosa-ai/helm-charts. To adjust the deployment to your needs, modify charts/furiosa-metrics-exporter/values.yaml. For example, the chart creates a Service object with Prometheus annotations so that metrics can be scraped automatically; you can change the port or disable the Prometheus annotations in values.yaml if needed. You can deploy the Furiosa Metrics Exporter by running the following commands:

sh
helm repo add furiosa https://furiosa-ai.github.io/helm-charts
helm repo update
helm install furiosa-metrics-exporter furiosa/furiosa-metrics-exporter -n furiosa-system
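
One possible way to customize the deployment without editing the chart sources in place is to dump the chart's default values into a local file, edit that copy, and pass it to Helm with -f. The sketch below assumes the furiosa repository has already been added as shown above. The function name and the my-values.yaml file name are hypothetical, and the available keys depend on the chart version, so inspect the dumped file rather than guessing key names.

```shell
# Sketch: deploy with a locally edited copy of the chart's default values.
# deploy_with_custom_values and my-values.yaml are illustrative names only.
deploy_with_custom_values() {
  values_file="${1:-my-values.yaml}"

  # Write the chart's default values to a local file on first use.
  if [ ! -f "$values_file" ]; then
    helm show values furiosa/furiosa-metrics-exporter > "$values_file"
  fi

  # Install the release, or upgrade it if it already exists,
  # using whatever is currently in the values file.
  helm upgrade --install furiosa-metrics-exporter furiosa/furiosa-metrics-exporter \
    -n furiosa-system -f "$values_file"
}
```

The first call writes the defaults and deploys with them. After editing the file (for example, the port or the Prometheus annotations), calling the function again upgrades the release with the new values.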

Alternative Installation: furiosa-metrics-exporter via apt

NOTE

For special use cases or non-Kubernetes environments, you can alternatively install the exporter via an apt package.

The minimum requirements for installing the furiosa-metrics-exporter package are as follows:

Then, please install the furiosa-metrics-exporter package as follows:

sh
sudo apt update
sudo apt install -y furiosa-metrics-exporter

This command installs the furiosa-libsmi and furiosa-metrics-exporter packages.

Configuration

The Furiosa Metrics Exporter has various command-line (CLI) options. The following table summarizes the available CLI options:

| CLI Option | Required | Description |
| --- | --- | --- |
| --interval <int> | Yes | Metrics collection interval in seconds. e.g., --interval 10 for collecting metrics every 10 seconds. |
| --port <int> | Yes | Port number for the metrics HTTP server. e.g., --port 6254. |
| --node-name <string> | No | If set, hostname label will be added to the metrics. Default is empty (""). |
| --kube-resources-label <bool> | No | If set, Kubernetes resource labels (such as namespace, pod, and container) will be added to the metrics. Default is false. Recommended when running in Kubernetes for richer metrics context. |
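
As an illustration of how these options combine, the following sketch starts the exporter in the foreground with both required options and the optional hostname label. It assumes that the executable installed by the package is named furiosa-metrics-exporter and is on PATH, which this page does not state explicitly. The run_exporter wrapper and the EXPORTER_BIN variable are hypothetical names.

```shell
# Sketch: start the exporter with the options described in the table above.
# Assumption: the binary is named furiosa-metrics-exporter; set EXPORTER_BIN if it is not.
run_exporter() {
  interval="${1:-10}"   # --interval: collection interval in seconds
  port="${2:-6254}"     # --port: port of the metrics HTTP server

  "${EXPORTER_BIN:-furiosa-metrics-exporter}" \
    --interval "$interval" \
    --port "$port" \
    --node-name "$(hostname)"
}
```

For example, run_exporter 10 6254 collects metrics every 10 seconds and serves them on port 6254. Prometheus exporters conventionally serve metrics under the /metrics path, so curl -s http://localhost:6254/metrics is a reasonable first check, although that path is an assumption here.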

Metrics

The exporter is composed of a chain of collectors, each of which is responsible for collecting specific metrics from the Furiosa NPU devices. The following table shows the available collectors and metrics:

| Collector Name | Metric | Type | Metric Labels | Description |
| --- | --- | --- | --- | --- |
| Liveness | furiosa_npu_alive | gauge | arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, container | The liveness of the Furiosa NPU device. |
| Temperature | furiosa_npu_hw_temperature | gauge | arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, container, label | The temperature of the Furiosa NPU device. |
| Power | furiosa_npu_hw_power | gauge | arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, container, label | The power consumption of the Furiosa NPU device. |
| Core Utilization | furiosa_npu_core_utilization | gauge | arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, container | The core utilization of the Furiosa NPU device. |
| Core Frequency | furiosa_npu_core_frequency | gauge | arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, container | The core frequency of the Furiosa NPU device. |
| Cycle Count | furiosa_npu_total_cycle_count | counter | arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, container | The total cycle count of the Furiosa NPU device. |
| Task Execution Cycle | furiosa_npu_task_execution_cycle | counter | arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, container | The task execution cycle of the NPU Task. |
| Total DRAM Size | furiosa_npu_dram_total | gauge | arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, container | The total DRAM size of the Furiosa NPU device. |
| DRAM Used Size | furiosa_npu_dram_usage | gauge | arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, container | The currently used DRAM size of the Furiosa NPU device. |
| Throttling Events Count | furiosa_npu_throttling_events_count | gauge | arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, container | The number of throttling events that occurred on the Furiosa NPU device within a fixed time window. |
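
The Type column affects how a metric is typically consumed: a gauge can be read directly, while a counter such as furiosa_npu_total_cycle_count is cumulative, so the useful quantity is usually its growth between two scrapes. In Prometheus this is normally done with query functions; the sketch below only illustrates the idea offline with awk, using two saved scrapes. The function name and file arguments are hypothetical, and it assumes label values contain no spaces, as in the examples on this page.

```shell
# Sketch: print the increase of a counter metric between two saved scrapes.
# Usage: counter_delta <metric_name> <before_file> <after_file>
counter_delta() {
  metric="$1"; before="$2"; after="$3"
  awk -v m="$metric" '
    # Skip comment lines and samples that belong to other metrics.
    index($0, m "{") != 1 { next }
    {
      value = $NF
      series = $0
      sub(/ [^ ]+$/, "", series)   # drop the trailing sample value
      # Lines from the first file are remembered, keyed by metric name plus labels.
      if (FNR == NR) { first[series] = value; next }
      # Lines from the second file are compared with the remembered value.
      if (series in first) printf "%s %.0f\n", series, value - first[series]
    }
  ' "$before" "$after"
}
```

A negative result would indicate that the counter was reset between the two scrapes, for example after the exporter restarted.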

All metrics share common metric labels such as arch, core, device, uuid, pci_bus_id, firmware_version, driver_version, hostname, namespace, pod, and container. The following table describes the common metric labels:

| Attribute Name | Description |
| --- | --- |
| arch | The architecture of the Furiosa NPU device. e.g. warboy, rngd |
| core | The core number of the Furiosa NPU device. e.g. 0, 1, 2, 3, 4, 5, 6, 7, 0-7 |
| device | The device name of the Furiosa NPU device. e.g. npu0 |
| uuid | The UUID of the Furiosa NPU device. |
| pci_bus_id | The PCI bus ID of the Furiosa NPU device. e.g. 0000:c7:00.0 |
| firmware_version | The firmware version of the Furiosa NPU device. e.g. 2025.1.0+696efad |
| driver_version | The driver version of the Furiosa NPU device. e.g. 2025.1.0+f09a8d8 |

The following metric labels are optional and enabled only when the corresponding command-line options (--node-name and --kube-resources-label) are set. Additionally, the namespace, pod, and container labels require an environment where the Kubernetes PodResource API is supported.

| Attribute Name | Description |
| --- | --- |
| hostname | The hostname of the machine where the exporter is running. |
| namespace | The namespace of the Kubernetes pod to which the NPU is allocated. |
| pod | The name of the Kubernetes pod to which the NPU is allocated. |
| container | The container name within the Kubernetes pod to which the NPU is allocated. |

The metric label “label” is used to describe additional attributes specific to each metric. This approach helps avoid having too many metric definitions and effectively aggregates metrics that share common characteristics.

| Metric Type | Label Name | Description |
| --- | --- | --- |
| Temperature | peak | The highest temperature observed from SoC sensors |
| Temperature | ambient | The temperature observed from sensors attached to the board |
| Power | rms | Root Mean Square (RMS) value of the power consumed by the device, providing an average power consumption metric over a period of time. |
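
Because peak and ambient temperatures are exposed under the same metric name, a consumer has to select on the label value rather than on the metric name. The following sketch shows one way to do that on raw exporter output with awk. The function name is hypothetical, and the input is assumed to be the Prometheus text output of the exporter arriving on standard input.

```shell
# Sketch: print the values of furiosa_npu_hw_temperature samples
# whose "label" label equals the first argument (e.g. peak or ambient).
temperature_by_label() {
  wanted="$1"
  awk -v l="$wanted" '
    # Match only temperature samples carrying label="<wanted>" as a whole label.
    index($0, "furiosa_npu_hw_temperature{") == 1 &&
    $0 ~ ("[{,]label=\"" l "\"[,}]") { print $NF }
  '
}
```

With the exporter output piped in, temperature_by_label peak prints one value per device, since each device reports a single peak temperature sample.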

Examples

The following is a real-world example of the exported metrics:

sh
#liveness
furiosa_npu_alive{arch="rngd",container="furiosa",core="0-7",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1

#temperature
furiosa_npu_hw_temperature{arch="rngd",container="furiosa",core="0-7",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",label="ambient",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 52
furiosa_npu_hw_temperature{arch="rngd",container="furiosa",core="0-7",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",label="peak",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 67.756

#power
furiosa_npu_hw_power{arch="rngd",container="furiosa",core="0-7",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",label="rms",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 50

#core utilization
furiosa_npu_core_utilization{arch="rngd",container="furiosa",core="0",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 99.68363645361265
furiosa_npu_core_utilization{arch="rngd",container="furiosa",core="1",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 99.68363645361265
furiosa_npu_core_utilization{arch="rngd",container="furiosa",core="2",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 99.68363645361265
furiosa_npu_core_utilization{arch="rngd",container="furiosa",core="3",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 99.68363645361265
furiosa_npu_core_utilization{arch="rngd",container="furiosa",core="4",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 99.6826341187199
furiosa_npu_core_utilization{arch="rngd",container="furiosa",core="5",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 99.6826341187199
furiosa_npu_core_utilization{arch="rngd",container="furiosa",core="6",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 99.6826341187199
furiosa_npu_core_utilization{arch="rngd",container="furiosa",core="7",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 99.6826341187199

#core frequency
furiosa_npu_core_frequency{arch="rngd",container="furiosa",core="0",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1750
furiosa_npu_core_frequency{arch="rngd",container="furiosa",core="1",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1750
furiosa_npu_core_frequency{arch="rngd",container="furiosa",core="2",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1750
furiosa_npu_core_frequency{arch="rngd",container="furiosa",core="3",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1750
furiosa_npu_core_frequency{arch="rngd",container="furiosa",core="4",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1750
furiosa_npu_core_frequency{arch="rngd",container="furiosa",core="5",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1750
furiosa_npu_core_frequency{arch="rngd",container="furiosa",core="6",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1750
furiosa_npu_core_frequency{arch="rngd",container="furiosa",core="7",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1750

#total cycle count
furiosa_npu_total_cycle_count{arch="rngd",container="furiosa",core="0",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1.7242541456e+10
furiosa_npu_total_cycle_count{arch="rngd",container="furiosa",core="1",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1.7242541456e+10
furiosa_npu_total_cycle_count{arch="rngd",container="furiosa",core="2",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1.7242541456e+10
furiosa_npu_total_cycle_count{arch="rngd",container="furiosa",core="3",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1.7242541456e+10
furiosa_npu_total_cycle_count{arch="rngd",container="furiosa",core="4",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1.7175902913e+10
furiosa_npu_total_cycle_count{arch="rngd",container="furiosa",core="5",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1.7175902913e+10
furiosa_npu_total_cycle_count{arch="rngd",container="furiosa",core="6",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1.7175902913e+10
furiosa_npu_total_cycle_count{arch="rngd",container="furiosa",core="7",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 1.7175902913e+10

#task execution cycle
furiosa_npu_task_execution_cycle{arch="rngd",container="furiosa",core="0",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 5.686392711e+09
furiosa_npu_task_execution_cycle{arch="rngd",container="furiosa",core="1",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 5.686392711e+09
furiosa_npu_task_execution_cycle{arch="rngd",container="furiosa",core="2",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 5.686392711e+09
furiosa_npu_task_execution_cycle{arch="rngd",container="furiosa",core="3",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 5.686392711e+09
furiosa_npu_task_execution_cycle{arch="rngd",container="furiosa",core="4",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 5.685170235e+09
furiosa_npu_task_execution_cycle{arch="rngd",container="furiosa",core="5",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 5.685170235e+09
furiosa_npu_task_execution_cycle{arch="rngd",container="furiosa",core="6",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 5.685170235e+09
furiosa_npu_task_execution_cycle{arch="rngd",container="furiosa",core="7",device="npu0",driver_version="2025.1.0+f09a8d8",firmware_version="2025.1.0+696efad",hostname="cntk002",namespace="default",pci_bus_id="0000:c7:00.0",pod="furiosa",uuid="09512C86-0702-4303-8F40-474746474A40"} 5.685170235e+09
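
Output in this form is easy to post-process with standard text tools. As a small illustration, the sketch below averages furiosa_npu_core_utilization over all cores of each device. The function name is hypothetical, and the input is assumed to be the exporter's text output on standard input, with the device label present on every sample as in the example above.

```shell
# Sketch: print "<device> <mean core utilization>" for each device on stdin.
mean_core_utilization() {
  awk '
    index($0, "furiosa_npu_core_utilization{") == 1 {
      # Extract the value of the device label, e.g. npu0.
      if (match($0, /device="[^"]*"/)) {
        dev = substr($0, RSTART + 8, RLENGTH - 9)
        sum[dev] += $NF
        n[dev]++
      }
    }
    # The order of devices in the output is not defined; pipe through sort if needed.
    END { for (d in sum) printf "%s %.2f\n", d, sum[d] / n[d] }
  '
}
```

For the eight core utilization samples shown above, this prints npu0 99.68.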
