This section describes how to view monitoring overview information for multiple clusters.

Prerequisites

  • You need to have the platform-admin role in the KubeSphere platform. For more information, see Users and Platform Roles.

  • The host cluster and the member clusters to be monitored need to have WizTelemetry Global Monitoring enabled.

    Note

    If a member cluster does not have WizTelemetry Global Monitoring enabled, no monitoring data can be obtained from that member cluster.

Steps

  1. Log in to the KubeSphere web console as a user who has the platform-admin role.

  2. In the upper right corner of the page, click the grid icon and select WizTelemetry Observability Platform.

  3. Click Global Monitoring > Global Overview in the left navigation pane.

    By default, the Overview page displays monitoring information for all clusters that have WizTelemetry Global Monitoring enabled.

    Functional Area Description

    Number of Created Resources

    Displays the number of all monitored clusters, nodes, projects, pods, Deployments, StatefulSets, DaemonSets, Jobs, CronJobs, PersistentVolumeClaims, Services, and Ingresses.

    Resource Usage

    Displays the CPU, memory, and disk usage of all nodes across all monitored clusters, and the number of created pods as a percentage of the maximum number of pods that can be created. By default, up to 110 pods can be created on each node.

    For CPU and memory, hover over the eye icon to view the amount of resources reserved and limited for containers and projects.
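
The pod percentage described above is simple arithmetic. The following is a minimal sketch, assuming every node uses the default limit of 110 pods. The function name and the sample numbers are hypothetical and are not taken from WizTelemetry.

```python
DEFAULT_MAX_PODS_PER_NODE = 110  # default stated above; a cluster may configure another value


def pod_usage_percent(created_pods: int, node_count: int,
                      max_pods_per_node: int = DEFAULT_MAX_PODS_PER_NODE) -> float:
    """Return the created pods as a percentage of the total pod capacity."""
    capacity = node_count * max_pods_per_node
    if capacity <= 0:
        raise ValueError("pod capacity must be positive")
    return 100.0 * created_pods / capacity


# Hypothetical example: 165 pods on 3 nodes, that is 165 of 330 possible pods.
print(pod_usage_percent(165, 3))  # 50.0
```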

    Global Alerts

    Displays the number of alerts generated by global alert rule groups and the most recent alert messages. Alerts displayed here do not include those generated by cluster and project alert rule groups. Global alert rule groups are managed by platform administrators in Global Alerts.

    Alert severity types include Info, Warning, Important, and Critical.

    Alert status types include:

    • Verifying: The monitoring metrics meet the preset conditions but have not yet satisfied the preset duration.

    • Triggered: The monitoring metrics meet the preset conditions and have satisfied the preset duration.
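
The difference between the two statuses can be sketched as follows. This is an illustration only, not the platform's implementation: the function, its parameters, and the "Inactive" label for a condition that is not met are assumptions, as is treating a condition that has held for exactly the preset duration as Triggered.

```python
from typing import Optional


def alert_status(condition_held_seconds: Optional[float],
                 preset_duration_seconds: float) -> str:
    """Classify a rule evaluation as described in the list above.

    condition_held_seconds is how long the preset conditions have been met
    continuously, or None when they are not currently met.
    """
    if condition_held_seconds is None:
        return "Inactive"  # hypothetical label; not one of the listed statuses
    if condition_held_seconds < preset_duration_seconds:
        return "Verifying"  # conditions met, duration not yet satisfied
    return "Triggered"      # conditions met for the full preset duration


print(alert_status(30, 60))  # Verifying
print(alert_status(90, 60))  # Triggered
```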

    Pods

    Displays the number of various types of pods across all monitored clusters.

    Pod status types include:

    • Pending: The pod has been accepted by the system, but at least one container has not been created or is not running. In this state, the pod may be waiting to be scheduled or waiting for the container image to be downloaded.

    • Running: The pod has been assigned to a node, all containers in the pod have been created, and at least one container is running, starting, or restarting.

    • Succeeded: All containers in the pod have terminated successfully (terminated with exit code 0) and will not be restarted.

    • Failed: All containers in the pod have terminated, and at least one container terminated with a non-zero exit code.

    • Unknown: The system is unable to retrieve the pod’s status. This state typically occurs due to communication failure between the system and the host where the pod is located.

    Pod QoS (Quality of Service) types include:

    • Guaranteed: Every container in the pod has memory limits, memory requests, CPU limits, and CPU requests set, with the memory limit equal to the memory request and the CPU limit equal to the CPU request.

    • Burstable: The pod does not meet the requirements for the Guaranteed type, and at least one container in the pod has a memory or CPU request or limit set.

    • BestEffort: The containers in the pod are not configured with any memory limits, memory requests, CPU limits, or CPU requests.

    The QoS type of a pod determines its runtime priority. When system resources are insufficient to run all pods, the system first keeps Guaranteed pods running, then Burstable pods, and lastly BestEffort pods.
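
The QoS rules above can be expressed as a short classification function. This is a simplified sketch that follows the wording of this section rather than the Kubernetes source code. The data layout and function name are hypothetical, quantities are compared as plain strings (so "1" and "1000m" would count as different), and the defaulting of requests to limits that Kubernetes applies is not modeled.

```python
from typing import Dict, List

# Hypothetical layout: one dict per container, for example
# {"requests": {"cpu": "500m"}, "limits": {"cpu": "500m"}}
ContainerResources = Dict[str, Dict[str, str]]


def qos_class(containers: List[ContainerResources]) -> str:
    """Return Guaranteed, Burstable, or BestEffort for a pod's containers."""
    any_value_set = False
    all_guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            request = requests.get(resource)
            limit = limits.get(resource)
            if request is not None or limit is not None:
                any_value_set = True
            # Guaranteed needs both values set and equal, for every container.
            if request is None or limit is None or request != limit:
                all_guaranteed = False
    if not any_value_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


pod = [{"requests": {"cpu": "500m", "memory": "1Gi"},
        "limits": {"cpu": "500m", "memory": "1Gi"}}]
print(qos_class(pod))  # Guaranteed
```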

    Number of Pods Terminated and Restarted Due to OOM: The number of pods that were forcibly terminated and automatically restarted by the system due to insufficient memory (Out Of Memory).

    Number of Pending Pods: The number of pods that have been created but cannot start due to insufficient resources or scheduling issues.

    Number of Restarted Pods: The number of pods that have been automatically restarted due to failures or configuration changes.

  4. In the upper right corner of the page, click Select Cluster and choose the clusters you want to monitor. The Overview page then displays monitoring information for the selected clusters only.

  5. Click Collapse Cluster List/Expand Cluster List to hide or show the cluster list on the right.

    • Click a cluster name in the cluster list to enter the monitoring overview page for that cluster.

    • Click the row icon or the list_view icon to view cluster information in list or thumbnail format.

    • Click the cogwheel icon to open the Component Settings page. For more information, see Set Components.