Metrics

A "metric" is a measured value for an element in the infrastructure. AppFormix Agent collects and calculates metrics for hosts and instances. Metrics are organized into hierarchical categories based on the type of metric.
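
The dotted metric names listed on this page reflect that hierarchy. The following is a minimal Python sketch, assuming each dot-separated component of a name is one level of the category hierarchy. The helpers split_metric and group_by_category are hypothetical names used for illustration only; they are not part of AppFormix.

```python
def split_metric(name):
    """Split a dotted metric name into its hierarchy levels."""
    return name.split(".")


def group_by_category(names, depth=2):
    """Group metric names by their first `depth` hierarchy levels."""
    groups = {}
    for name in names:
        key = ".".join(split_metric(name)[:depth])
        groups.setdefault(key, []).append(name)
    return groups


# Metric names taken from the host tables on this page.
metrics = ["host.cpu.usage", "host.cpu.io_wait", "host.disk.io.read"]
print(group_by_category(metrics))
# {'host.cpu': ['host.cpu.usage', 'host.cpu.io_wait'], 'host.disk': ['host.disk.io.read']}
```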

Some metrics are percentages of total capacity. In such cases, the category of the metric determines the total capacity against which the percentage is computed. For instance, host.cpu.usage is the percentage of CPU consumed relative to the total CPU available on a host, whereas instance.cpu.usage is the percentage of CPU consumed relative to the total CPU available to an instance. As an example, consider an instance that is using 50% of one core on a host with 20 cores. That instance contributes 2.5% to host.cpu.usage (0.5 of 20 cores). If the instance has been allocated 2 cores, then its instance.cpu.usage is 25% (0.5 of 2 cores).
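
The arithmetic in the example above can be checked with a short Python sketch. The function name cpu_usage_percent is hypothetical and only restates the percentage-of-capacity calculation; it does not represent how AppFormix Agent computes the metric internally.

```python
def cpu_usage_percent(cores_consumed, total_cores):
    """Return CPU consumed as a percentage of the given total capacity."""
    return 100.0 * cores_consumed / total_cores


# The instance uses 50% of one core, that is, 0.5 cores.
cores_consumed = 0.5

# Relative to the host's 20 cores (the host.cpu.usage category).
host_share = cpu_usage_percent(cores_consumed, 20)

# Relative to the 2 cores allocated to the instance (instance.cpu.usage).
instance_usage = cpu_usage_percent(cores_consumed, 2)

print(host_share, instance_usage)  # 2.5 25.0
```

The same consumption yields two different percentages because only the denominator, the total capacity implied by the metric's category, changes.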

An Alarm may be configured for any metric. Many metrics may also be displayed in charts. When an Alarm triggers for a metric, the Alarm is plotted on charts at the time of the event. In this way, metrics that cannot be plotted directly as a chart are still visually correlated in time with other metrics.

AppFormix Agent collects both raw metrics and calculated metrics. Raw metrics are values read directly from the underlying infrastructure. Calculated metrics are metrics that AppFormix Agent derives from raw metrics.

Host

The following raw metrics are available for hosts.

Metric Chart Alarm
host.cpu.io_wait x x
host.cpu.ipc ** x x
host.cpu.l3_cache.miss ** x x
host.cpu.l3_cache.usage ** x x
host.cpu.mem_bw.local ** x x
host.cpu.mem_bw.remote ** x x
host.cpu.mem_bw.total ** x x
host.cpu.usage x x
host.disk.io.read x x
host.disk.io.write x x
host.disk.response_time x x
host.disk.read_response_time x x
host.disk.write_response_time x x
host.disk.smart.hdd.command_timeout - x
host.disk.smart.hdd.current_pending_sector_count - x
host.disk.smart.hdd.offline_uncorrectable - x
host.disk.smart.hdd.reallocated_sector_count - x
host.disk.smart.hdd.reported_uncorrectable_errors - x
host.disk.smart.ssd.available_reserved_space - x
host.disk.smart.ssd.media_wearout_indicator - x
host.disk.smart.ssd.reallocated_sector_count - x
host.disk.smart.ssd.wear_leveling_count - x
host.disk.usage.bytes x x
host.disk.usage.percent x x
host.memory.usage x x
host.memory.swap.usage x x
host.memory.dirty.rate x x
host.memory.page_fault.rate x x
host.memory.page_in_out.rate x x
host.network.egress.bit_rate x x
host.network.egress.drops x x
host.network.egress.errors x x
host.network.egress.packet_rate x x
host.network.ingress.bit_rate x x
host.network.ingress.drops x x
host.network.ingress.errors x x
host.network.ingress.packet_rate x x
host.network.ipv4tables.rule_count x x
host.network.ipv6tables.rule_count x x
openstack.host.disk_allocated x x
openstack.host.memory_allocated x x
openstack.host.vcpus_allocated x x

The following calculated metrics are available for hosts.

Metric Chart Alarm
host.cpu.normalized_load_1m x x
host.cpu.normalized_load_5m x x
host.cpu.normalized_load_15m x x
host.cpu.temperature - x
host.disk.smart.predict_failure - x
host.heartbeat - x
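
As an illustration of a calculated metric, the sketch below derives a normalized load from a raw load average. This page does not define how host.cpu.normalized_load_1m is computed, so the formula (load average divided by the number of CPU cores) is an assumption, and the function name normalized_load is hypothetical.

```python
def normalized_load(load_average, core_count):
    """Assumed formula: raw load average divided by the number of CPU cores."""
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    return load_average / core_count


# Hypothetical raw values: a 1-minute load average of 10.0 on a 20-core host.
print(normalized_load(10.0, 20))  # 0.5
```

Under this assumption, a value near 1.0 would mean roughly one runnable thread per core, which makes hosts with different core counts comparable.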

Instance

The following raw metrics are available for instances.

Metric Chart Alarm
instance.cpu.usage x x
instance.cpu.ipc ** x x
instance.cpu.l3_cache.miss ** x x
instance.cpu.l3_cache.usage ** x x
instance.cpu.mem_bw.local ** x x
instance.cpu.mem_bw.remote ** x x
instance.cpu.mem_bw.total ** x x
instance.disk.io.read_bandwidth x x
instance.disk.io.read_iops x x
instance.disk.io.read_iosize x x
instance.disk.io.read_response_time x x
instance.disk.io.write_bandwidth x x
instance.disk.io.write_iops x x
instance.disk.io.write_iosize x x
instance.disk.io.write_response_time x x
instance.disk.usage.bytes x x
instance.disk.usage.percentage x x
instance.memory.usage x x
instance.network.egress.bit_rate x x
instance.network.egress.drops x x
instance.network.egress.errors x x
instance.network.egress.packet_rate x x
instance.network.egress.total_bytes x x
instance.network.egress.total_packets x x
instance.network.ingress.bit_rate x x
instance.network.ingress.drops x x
instance.network.ingress.errors x x
instance.network.ingress.packet_rate x x
instance.network.ingress.total_bytes x x
instance.network.ingress.total_packets x x

The following calculated metrics are available for instances.

Metric Chart Alarm
instance.heartbeat - x

OpenContrail vRouter

The following raw metrics are available for an OpenContrail vRouter on a host.

Metric Chart Alarm
plugin.contrail.vrouter.aged_flows x x
plugin.contrail.vrouter.total_flows x x
plugin.contrail.vrouter.exception_packets x x
plugin.contrail.vrouter.drop_stats_flow_queue_limit_exceeded x x
plugin.contrail.vrouter.drop_stats_flow_table_full x x
plugin.contrail.vrouter.drop_stats_vlan_fwd_enq x x
plugin.contrail.vrouter.drop_stats_vlan_fwd_tx x x
plugin.contrail.vrouter.flow_export_drops x x
plugin.contrail.vrouter.flow_export_sampling_drops x x
plugin.contrail.vrouter.flow_rate_active_flows x x
plugin.contrail.vrouter.flow_rate_added_flows x x
plugin.contrail.vrouter.flow_rate_deleted_flows x x

OpenStack Project

The following raw metrics are available in the OpenStack Project Chart View.

Metric Chart Alarm
openstack.project.active_instances x x
openstack.project.vcpus_allocated x x
openstack.project.volume_storage_allocated x x
openstack.project.memory_allocated x x
openstack.project.floating_ip_count x x
openstack.project.security_group_count x x
openstack.project.volume_count x x

Kubernetes Pod

The following raw metrics are available in the Kubernetes Pod Chart View.

Metric Chart Alarm
pod.memory_allocated x x
pod.vcpus_allocated x x

** CPU cache and memory bandwidth metrics are available for the Intel® Xeon® processor family with Intel® Resource Director Technology. AppFormix automatically detects the processor family and makes the additional metrics available for display and analysis.