These are the docs for 13.7, an old version of SpatialOS. The docs for this version are frozen: we do not correct, update or republish them. 13.8 is the newest →

Metrics reference

When you run a deployment, SpatialOS collects metrics on it, which you can use to monitor the deployment’s health and status. This page lists the metrics that are collected and explains what you can use them for.

This page also contains information on how to query the metrics using the Prometheus query syntax.

You can query the metrics through code, or on your own analytics platform.

Metric detail levels

Metric values are retrieved roughly every 15 seconds, and stored at two levels of granularity:

  • Aggregated metrics: These are kept for 9 days for deployments with the alpha, beta or prod tags. Otherwise they are kept for 1 day.

  • Detailed metrics (alpha): These carry extra labels that allow much finer-grained querying of the data, which makes them particularly useful for debugging. Because of their storage impact, they are kept for only 30 minutes after they've been collected.

For general monitoring of a deployment, use the aggregated metrics. For detailed investigation, use the detailed metrics (alpha).
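The retention rules above can be summarised in a short Python sketch. The function name `retention` and its parameters are hypothetical and exist only for illustration; the durations and tag names are the ones stated on this page.

```python
from datetime import timedelta

# Deployment tags that extend aggregated-metric retention, as listed on this page.
LONG_RETENTION_TAGS = {"alpha", "beta", "prod"}


def retention(deployment_tags, detailed=False):
    """Return how long a metric sample is kept (hypothetical helper)."""
    if detailed:
        # Detailed metrics (alpha) are kept for 30 minutes after collection.
        return timedelta(minutes=30)
    if LONG_RETENTION_TAGS & set(deployment_tags):
        # Aggregated metrics for alpha, beta or prod deployments.
        return timedelta(days=9)
    # Aggregated metrics for any other deployment.
    return timedelta(days=1)
```

For example, `retention(["prod"])` gives nine days, while `retention(["prod"], detailed=True)` gives 30 minutes.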

Worker metrics

Login outcome

A counter for the outcome of connection attempts to the SpatialOS Runtime using the connect methods provided by the Worker SDK (C#/C++).

Useful for:

  • Checking how many connection attempts have been made.
  • Checking how many attempts have been rejected because of rate or capacity limits.
Aggregated metric

Metric name: spatialos_login_outcome::sum

Label Description
project The name of the project (for example test_project). This label is mandatory.
dpl The name of the deployment (for example test_deployment).
worker_type The name of the worker type (for example MyCSharpWorker).
outcome The outcome of the login request. Possible values: SUCCESS, JOIN_RATE_EXCEEDED, CAPACITY_EXCEEDED.

Example query

spatialos_login_outcome::sum{project="test_project", dpl="test_deployment", worker_type="MyCSharpWorker"}
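If you assemble queries in code, a small helper can keep the label filters consistent. The following is a minimal sketch, assuming label values contain no double quotes or backslashes; `build_query` is a hypothetical name, not part of any SpatialOS SDK.

```python
def build_query(metric_name, labels):
    """Format a metric name and label filters as a Prometheus selector string."""
    if "project" not in labels:
        # This page marks the project label as mandatory.
        raise ValueError("the project label is mandatory")
    # Assumption: values need no escaping (no double quotes or backslashes).
    matchers = ", ".join(f'{key}="{value}"' for key, value in labels.items())
    return f"{metric_name}{{{matchers}}}"


query = build_query(
    "spatialos_login_outcome::sum",
    {
        "project": "test_project",
        "dpl": "test_deployment",
        "worker_type": "MyCSharpWorker",
    },
)
print(query)
```

The printed string matches the example query above. Adding an `outcome` entry such as `"outcome": "CAPACITY_EXCEEDED"` would narrow the result to attempts rejected because of capacity limits.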

Worker connected

A gauge for the number of worker instances connected to the SpatialOS Runtime.

Useful for:

  • Checking how many players are logged in.
  • Checking if the correct number of managed workers are running.
Aggregated metric

Metric name: spatialos_worker_connected::sum

Label Description
project The name of the project (for example test_project). This label is mandatory.
cluster The name of the cluster which the deployment is running in (for example eu1-prod).
dpl The name of the deployment (for example test_deployment).
dpl_tag The deployment stage of the deployment, as set in the SpatialOS Console. Possible values: beta, alpha, prod.
worker_type The name of the worker type (for example MyCSharpWorker).

Example query

spatialos_worker_connected::sum{project="test_project", dpl="test_deployment", dpl_tag="prod", worker_type="MyCSharpWorker"}

Worker update

The worker operation update rate in the last minute for each worker platform.

  • Use “update_size_bytes” metrics for bandwidth
  • Use “update_messages” metrics for messages sent

Use the detailed metric (alpha) (spatialos_worker_update_size_bytes:rate1m) to check updates per component type.

Useful for:

  • Optimising for performance and cost.
Aggregated metric

Metric names: spatialos_worker_update_size_bytes::rate1m, spatialos_worker_update_messages::rate1m

Label Description
project The name of the project (for example test_project). This label is mandatory.
cluster The name of the cluster which the deployment is running in (for example eu1-prod).
dpl The name of the deployment (for example test_deployment).
dpl_tag The deployment stage of the deployment, as set in the SpatialOS Console. Possible values: beta, alpha, prod.
worker_type The name of the worker type (for example MyCSharpWorker).
direction The direction of the message (egress or ingress). Possible values: from_worker, to_worker.

Example query

spatialos_worker_update_size_bytes::rate1m{project="test_project", dpl="test_deployment", dpl_tag="prod", worker_type="MyCSharpWorker", direction="from_worker"}
Detailed metric (alpha)
  • Metric names: spatialos_worker_update_size_bytes:rate1m, spatialos_worker_update_messages:rate1m
Label Description
project The name of the project (for example test_project). This label is mandatory.
cluster The name of the cluster which the deployment is running in (for example eu1-prod).
dpl The name of the deployment (for example test_deployment).
dpl_tag The deployment stage of the deployment, as set in the SpatialOS Console. Possible values: beta, alpha, prod.
worker_type The name of the worker type (for example MyCSharpWorker).
direction The direction of the message (egress or ingress). Possible values: from_worker, to_worker.
component_type The fully-qualified name of a component as defined in the schema (for example player.Health).

Example query:

spatialos_worker_update_size_bytes:rate1m{project="test_project", dpl="test_deployment", dpl_tag="prod", worker_type="MyCSharpWorker", direction="from_worker", component_type="player.Health"}
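Most metric names on this page use a double colon (`::`) for the aggregated form and a single colon (`:`) for the detailed (alpha) form. The sketch below classifies a name on that basis. It is an observation drawn from the names listed here, not a documented rule, and `detail_level` is a hypothetical helper.

```python
def detail_level(metric_name):
    """Guess the detail level of a metric from its name (heuristic only)."""
    if "::" in metric_name:
        return "aggregated"
    if ":" in metric_name:
        return "detailed"
    # Names without a colon cannot be classified this way.
    return "unknown"
```

Names without any colon, such as `spatialos_runtime_view_lateness_50th_percentile_ms`, are reported as `"unknown"` even though this page lists that metric as aggregated.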

Node metrics

Node up

A gauge for the number of nodes that are exporting metrics. Use the detailed metric (alpha) to break down the value by node category (the node_cat label).

Useful for:

  • Setting up alerts if nodes are not all up.
Aggregated metric
  • Name: spatialos_node_up::sum

  • Labels: project, cluster, dpl, dpl_tag

Example query: spatialos_node_up::sum{project="test_project", dpl="test_deployment", dpl_tag="prod"}

Detailed metric (alpha)
  • Name: spatialos_node_up:sum

  • Labels: project, dpl, dpl_tag, node_cat

Example query: spatialos_node_up:sum{project="test_project", dpl="test_deployment", dpl_tag="prod", node_cat="master"}
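An alert on this gauge needs an expected node count to compare against. The sketch below assumes you already know that count for your deployment and have retrieved the current gauge value; both function names are hypothetical.

```python
def missing_nodes(nodes_up, expected_nodes):
    """Return how many expected nodes are not exporting metrics (0 if all are up)."""
    return max(0, expected_nodes - int(nodes_up))


def should_alert(nodes_up, expected_nodes):
    """True when at least one expected node is not exporting metrics."""
    return missing_nodes(nodes_up, expected_nodes) > 0
```

For instance, a gauge value of 4 against an expected count of 5 reports one missing node.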

Node CPU usage ratio

Useful for:

  • Optimising for performance and cost.
Aggregated metric
  • Name: spatialos_node_cpu_used::max_ratio
    A gauge for the highest ratio of CPU cores used to total available CPU cores (that is, the CPU cores available for user code) across nodes.

  • Labels: project, cluster, dpl, dpl_tag, node_cat

Example query: spatialos_node_cpu_used::max_ratio{project="test_project", dpl="test_deployment", node_cat="gsimbridge"}

Detailed metric (alpha)
  • Name: spatialos_node_cpu_used:ratio
    A gauge for the ratio of CPU cores used to total available CPU cores (that is, the CPU cores available for user code).

  • Labels: project, dpl, dpl_tag, node, node_cat

Example query: spatialos_node_cpu_used:ratio{project="test_project", dpl="test_deployment", node="gsimbridge02"}

Memory usage ratio

Useful for:

  • Optimising for performance and cost.

  • Detecting memory leaks.

Aggregated metric
  • Name: spatialos_node_memory_used::max_ratio
    A gauge for the highest ratio of memory used to total available memory across nodes.

  • Labels: project, cluster, dpl, dpl_tag, node_cat

Example query: spatialos_node_memory_used::max_ratio{project="test_project", dpl="test_deployment", node_cat="fsim"}

Detailed metric (alpha)
  • Name: spatialos_node_memory_used:ratio
    A gauge for the ratio of memory used to total available memory.

  • Labels: project, dpl, dpl_tag, node, node_cat

Example query: spatialos_node_memory_used:ratio{project="test_project", dpl="test_deployment", node="fsim_01"}
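One simple way to look for a memory leak is to check whether the memory ratio climbs steadily over a window of samples. The sketch below fits a least-squares line to evenly spaced samples, assuming the roughly 15-second collection interval mentioned on this page. The threshold is an arbitrary illustrative value and both function names are hypothetical.

```python
def ratio_slope_per_minute(samples, interval_seconds=15.0):
    """Least-squares slope of evenly spaced ratio samples, in ratio units per minute."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # Sample times in minutes, assuming a fixed collection interval.
    xs = [i * interval_seconds / 60.0 for i in range(n)]
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def looks_like_leak(samples, min_slope_per_minute=0.001):
    """Flag a window whose memory ratio rises faster than the chosen threshold."""
    # Assumption: the threshold is illustrative and needs tuning per deployment.
    return ratio_slope_per_minute(samples) > min_slope_per_minute
```

A steady rise is only a hint: a worker that is legitimately loading more entities produces the same pattern, so treat a positive result as a prompt to investigate rather than proof of a leak.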

Disk space available

A gauge for the number of bytes available on the filesystem root (/).

  • Name: spatialos_node_filesystem_available_bytes::sum
  • Labels: project, dpl, dpl_tag, node, node_cat

Useful for:

  • Setting up alerts for disk space usage

Example query: spatialos_node_filesystem_available_bytes::sum{project="test_project", dpl="test_deployment", node="workers"}

Logging metrics

Log rate

A rate for the number of error or warning logs.

Useful for:

  • Triggering alerts if the error rate is too high
Aggregated metric
  • Name: spatialos_logging_logs::rate1m

  • Labels: project, cluster, dpl, dpl_tag, level={"ERROR"|"WARN"}

Example query: spatialos_logging_logs::rate1m{project="test_project", dpl="test_deployment", level="ERROR"}

Entity metrics

Entity count

A gauge for the number of entities.

Useful for:

  • Debugging peaks or drops of entity counts in your deployment.
  • Designing your game and tweaking its mechanics, for example: “Are there too many or too few entities of a given kind?”
Aggregated metric
  • Name: spatialos_entity_count::sum
  • Labels: project, cluster, dpl, dpl_tag

Example query: spatialos_entity_count::sum{project="test_project", dpl="test_deployment"}

Detailed metric (alpha)
  • Name: spatialos_entity_count:sum
  • Labels: project, dpl, dpl_tag, entity_type={"Player"|…}

Example query: spatialos_entity_count:sum{project="test_project", dpl="test_deployment", entity_type="Player"}

Entities created

A rate of entities created per minute.

Useful for:

  • Debugging spikes of entities created.
Aggregated metric
  • Name: spatialos_entity_created::rate1m
  • Labels: project, cluster, dpl, dpl_tag

Example query: spatialos_entity_created::rate1m{project="test_project", dpl="test_deployment"}

Detailed metric (alpha)
  • Name: spatialos_entity_created:rate1m
  • Labels: project, dpl, dpl_tag, entity_type={"Player"|…}

Example query: spatialos_entity_created:rate1m{project="test_project", dpl="test_deployment", entity_type="Player"}

Entities deleted

The rate of entities deleted per minute.

Useful for:

  • Debugging spikes of entities deleted.
Aggregated metric
  • Name: spatialos_entity_deleted::rate1m
  • Labels: project, cluster, dpl, dpl_tag

Example query: spatialos_entity_deleted::rate1m{project="test_project", dpl="test_deployment"}

Detailed metric (alpha)
  • Name: spatialos_entity_deleted:rate1m
  • Labels: project, dpl, dpl_tag, entity_type={"Player"|…}

Example query: spatialos_entity_deleted:rate1m{project="test_project", dpl="test_deployment", entity_type="Player"}

Entity authority changes

The rate at which the authority of entities changes.

Useful for:

  • Debugging spikes in the frequency of entities crossing worker boundaries.
Aggregated metric
  • Name: spatialos_authority_changes::rate1m
  • Labels: project, dpl, dpl_tag, outcome

Example query: spatialos_authority_changes::rate1m{project="test_project", dpl="test_deployment", outcome="failure"}

Command metrics

Command count

The rate of commands sent per minute. The status label values are defined on the API reference pages: C# and C++.

Useful for:

  • Alerting and debugging spikes or drops in commands sent.
  • Optimising for performance and cost.
Aggregated metric
  • Name: spatialos_command_count::rate1m
  • Labels: project, cluster, dpl, dpl_tag

Example query: spatialos_command_count::rate1m{project="test_project", dpl="test_deployment"}

Detailed metric (alpha)
  • Name: spatialos_command_count:rate1m

  • Labels: project, dpl, dpl_tag, component_type={"player.test"|"SYSTEM"}, command_type={"USER_DEFINED"|"CREATE_ENTITY_REQUEST"|"REMOVE_ENTITY_REQUEST"|…}, status

Example query: spatialos_command_count:rate1m{project="test_project", dpl="test_deployment", component_type="player.Health", command_type="USER_DEFINED"}

Command latency

The latency of commands over the last five minutes, reported at the 99th, 90th and 50th percentiles. Latency is measured from the SpatialOS Runtime receiving the command request to the Runtime receiving the command response. The status label values are defined on the API reference pages: C# and C++. Latency is capped at 1 second, so any command that takes longer is reported as taking 1s.

Useful for:

  • Alerting on abnormal latency in a deployment.
  • Debugging latency for certain components.
  • Optimising for performance and cost.
Aggregated metric
  • Name: spatialos_command_latency_seconds::summary5m
  • Labels: project, cluster, dpl, dpl_tag, quantile

Example query: spatialos_command_latency_seconds::summary5m{project="test_project", dpl_tag="prod", quantile="0.95"}

Detailed metric (alpha)
  • Name: spatialos_command_latency_seconds:summary5m
  • Labels: project, dpl, dpl_tag, quantile, component_type={"player.test"|"SYSTEM"}, command_type={"USER_DEFINED"|"CREATE_ENTITY_REQUEST"|"REMOVE_ENTITY_REQUEST"|…}, status

Example query: spatialos_command_latency_seconds:summary5m{project="test_project", dpl="test_deployment", quantile="0.95", component_type="player.Health", command_type="USER_DEFINED"}
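Because command latency is capped at 1 second, a reported value of exactly 1s means "at least 1s", not "1s". The sketch below makes that distinction explicit when formatting a value for display. The function names are hypothetical; only the 1-second cap comes from this page.

```python
# Cap stated on this page for command latency.
COMMAND_LATENCY_CAP_SECONDS = 1.0


def is_saturated(latency_seconds, cap_seconds=COMMAND_LATENCY_CAP_SECONDS):
    """True if a reported latency sits at the cap, so the real value may be higher."""
    return latency_seconds >= cap_seconds


def describe_latency(latency_seconds, cap_seconds=COMMAND_LATENCY_CAP_SECONDS):
    """Format a latency sample, marking capped values as lower bounds."""
    if is_saturated(latency_seconds, cap_seconds):
        return f">= {cap_seconds:g}s (capped)"
    return f"{latency_seconds * 1000:.0f}ms"
```

The same helper works for other capped metrics by passing a different `cap_seconds`, for example `10.0` for the worker to Runtime latency metric, which this page says is capped at 10 seconds.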

Network metrics

Network egress rate

The rate of total network egress (traffic going out of the cloud) bytes per minute.

Useful for:

  • Optimising for performance and cost.
Aggregated metric
  • Name: spatialos_network_egress_bytes::rate1m
  • Labels: project, cluster, dpl, dpl_tag

Example query: spatialos_network_egress_bytes::rate1m{project="test_project", dpl="test_deployment"}

Detailed metric (alpha)
  • Name: spatialos_network_egress_bytes:rate1m
  • Labels: project, dpl, dpl_tag, node

Example query: spatialos_network_egress_bytes:rate1m{project="test_project", dpl="test_deployment", node="worker_01"}

Runtime metrics

Operations

Operations (ops) are messages carrying information between client-worker and server-worker instances and SpatialOS. The size of an op is measured in operation units.

Useful for:

  • Optimising for performance and cost.
Aggregated metric
  • Name: spatialos_worker_ops::rate1m
Label Description
project The name of the project (for example test_project). This label is mandatory.
dpl The name of the deployment (for example test_deployment).
dpl_tag The deployment stage of the deployment, as set in the SpatialOS Console. Possible values: beta, alpha, prod.
worker_type The name of the worker type (for example MyCSharpWorker).
direction Whether the operation was sent to or from a worker instance. Possible values: to_worker, from_worker.

Example query: spatialos_worker_ops::rate1m{project="test_project", dpl="test_deployment", dpl_tag="prod", worker_type="PhysicsWorker", direction="from_worker"}

Operation units

Every operation counts towards one or more operation units (op units) based on its payload size. The total op units per second across the whole deployment is the total amount of information SpatialOS is synchronizing. Your choice of game template specifies an upper limit on how many op units your deployment can handle.

Aggregated metric
  • Name: spatialos_worker_op_units::rate1m
Label Description
project The name of the project (for example test_project). This label is mandatory.
dpl The name of the deployment (for example test_deployment).
dpl_tag The deployment stage of the deployment, as set in the SpatialOS Console. Possible values: beta, alpha, prod.
worker_type The name of the worker type (for example MyCSharpWorker).
direction Whether the operation was sent to or from a worker instance. Possible values: to_worker, from_worker.

Example query: spatialos_worker_op_units::rate1m{project="test_project", dpl="test_deployment", dpl_tag="prod", worker_type="PhysicsWorker", direction="from_worker"}
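To see how op units are split across the deployment, you can sum the per-worker-type samples by direction. The sketch below assumes the query results have already been fetched and converted into `(labels, value)` pairs. That shape is an assumption made for illustration, not the format of any SpatialOS or Prometheus response.

```python
from collections import defaultdict


def op_units_by_direction(samples):
    """Sum op-unit rate samples per direction label.

    `samples` is an iterable of (labels, value) pairs, where `labels` is a
    dict containing at least a "direction" key (assumed shape).
    """
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels["direction"]] += value
    return dict(totals)


example_samples = [
    ({"worker_type": "PhysicsWorker", "direction": "from_worker"}, 120.0),
    ({"worker_type": "MyCSharpClient", "direction": "from_worker"}, 30.0),
    ({"worker_type": "PhysicsWorker", "direction": "to_worker"}, 50.0),
]
totals = op_units_by_direction(example_samples)
```

The values in `example_samples` are made up. Compare the resulting totals against the op unit limit of your game template.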

Worker to Runtime latency

The round-trip time (RTT) from workers to the SpatialOS Runtime over the last five minutes, reported at the 99th, 90th and 50th percentiles. Latency is capped at 10 seconds, so any round trip that takes longer is reported as taking 10s.

Useful for:

  • Optimising for performance and cost.
Aggregated metric
  • Name: spatialos_runtime_worker_latency_seconds::summary5m

  • Labels: project, cluster, dpl, dpl_tag, worker_type, quantile

Example query: spatialos_runtime_worker_latency_seconds::summary5m{project="test_project", dpl="test_deployment", worker_type="MyCSharpClient", quantile="0.90"}

View lateness

The 50th-percentile latency, in milliseconds, for an update anywhere in the system to be reflected in a view.

Useful for:

  • Optimising for performance and cost.
Aggregated metric
  • Name: spatialos_runtime_view_lateness_50th_percentile_ms
  • Labels: project, cluster, dpl, dpl_tag

Example query: spatialos_runtime_view_lateness_50th_percentile_ms{project="test_project", dpl="test_deployment"}

Snapshot metrics

Snapshot count

A counter for the number of snapshots.

Useful for:

  • Alerting when there is a snapshot failure.
Aggregated metric
  • Name: spatialos_snapshot_count::sum

  • Labels: project, cluster, dpl, dpl_tag, outcome={"success"|"failure"}

Example query: spatialos_snapshot_count::sum{project="test_project", dpl="test_deployment", outcome="failure"}
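Since this metric is a counter, an alert on snapshot failures is usually based on how much the `outcome="failure"` value has grown between two samples, not on its absolute value. The sketch below computes that increase and treats a drop as a counter reset, as Prometheus-style counters are commonly handled. Whether this particular counter ever resets is an assumption, and `counter_increase` is a hypothetical helper.

```python
def counter_increase(previous, current):
    """Increase of a counter between two samples, treating a drop as a reset."""
    if current < previous:
        # Assumption: the counter restarted from zero, so everything
        # counted since the restart is new.
        return current
    return current - previous


def snapshot_failed_since(previous, current):
    """True if at least one new snapshot failure was counted between samples."""
    return counter_increase(previous, current) > 0
```

For example, failure samples of 3 and then 5 indicate two new failures in that interval.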
