Skip to content

Ensure that Audit Log Errors Emit Alerts

An XCCDF Rule

Description

OpenShift audit works at the API server level, logging all requests coming to the server. However, if API server instance is unable to write errors, an alert must be issued in order for the organization to take a relevant action. e.g. shutting down that instance.

Kubernetes by default has metrics that enable one to write such alerts:

  • apiserver_audit_event_total
  • apiserver_audit_error_total
Such an example is shipped in OCP 4.9+
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: audit-errors
  namespace: openshift-kube-apiserver
spec:
  groups:
  - name: apiserver-audit
    rules:
    - alert: AuditLogError
      annotations:
        summary: |-
          An API Server instance was unable to write audit logs. This could be
          triggered by the node running out of space, or a malicious actor
          tampering with the audit logs.
        description: An API Server had an error writing to an audit log.
      expr: |
        sum by (apiserver,instance)(rate(apiserver_audit_error_total{apiserver=~".+-apiserver"}[5m])) / sum by (apiserver,instance) (rate(apiserver_audit_event_total{apiserver=~".+-apiserver"}[5m])) > 0
      for: 1m
      labels:
        severity: warning

For more information, consult the official Kubernetes documentation.

warning alert: Warning

This rule's check operates on the cluster configuration dump. Therefore, you need to use a tool that can query the OCP API, retrieve the following:
  • /apis/monitoring.coreos.com/v1/prometheusrules API endpoint, filter with with the jq utility using the following filter [.items[].spec.groups[].rules[].expr] and persist it to the local /apis/monitoring.coreos.com/v1/prometheusrules#5fd5244e3dcae63319f7e86b918cb8ea6ce1b4124670ccb43750d7a75ca03cb7 file.

Rationale

When there are errors writing audit logs, security events will not be logged by that specific API Server instance. Security Incident Response teams use these audit logs, amongst other artifacts, to determine the impact of security breaches or events. Without these logs, it becomes very difficult to assess a situation and do appropriate root cause analysis in such incidents.

ID
xccdf_org.ssgproject.content_rule_audit_error_alert_exists
Severity
High
References
Updated



Remediation - Kubernetes Patch

---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: audit-errors
  namespace: openshift-kube-apiserver