CTC US - Degraded messaging, API and UI performance
Incident Report for CalAmp
Postmortem

CTC US – Message Processing Delayed; Datapump, CTC Admin, CalAmp App, CTC API Access Degraded - 10/1/2024

Incident Started: 10/1/2024 07:00 am PT
Service impacted: 10/1/2024 07:13 am PT (Message processing delayed. CTC Admin, CalAmp App, CTC APIs)
Corrective action: 10/1/2024 07:50 am PT (Message processing restored. Backlog of messages being processed. CTC Admin and CalAmp App still experiencing intermittent issues.)
Backlog Cleared: 10/1/2024 08:20 am PT (All data current)
UI Cleared: 10/1/2024 08:20 am PT (CTC Admin and CalAmp App fully functional)
Event declared over: 10/1/2024 09:00 am PT
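
For readers who want the durations implied by the timeline, the arithmetic can be sketched as below. The timestamps are copied from the entries above. The dictionary keys and the helper name `minutes_between` are illustrative and are not part of any CalAmp tooling.

```python
from datetime import datetime

# Timestamps copied from the timeline above (all on 10/1/2024, Pacific Time).
TIMELINE = {
    "incident_started": "07:00",
    "service_impacted": "07:13",
    "corrective_action": "07:50",
    "backlog_cleared": "08:20",
    "event_declared_over": "09:00",
}


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day HH:MM timestamps."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Time from first customer impact until message processing was restored.
impact_to_restore = minutes_between(
    TIMELINE["service_impacted"], TIMELINE["corrective_action"]
)
# Time from first customer impact until all data was current again.
impact_to_current = minutes_between(
    TIMELINE["service_impacted"], TIMELINE["backlog_cleared"]
)
# Total length of the event, from start to "declared over".
start_to_over = minutes_between(
    TIMELINE["incident_started"], TIMELINE["event_declared_over"]
)

print(impact_to_restore, impact_to_current, start_to_over)
```

By this reading, message processing was impaired for 37 minutes, data was behind for 67 minutes, and the event as a whole lasted 120 minutes.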

Problem Statement

CTC message processing was delayed. Users experienced access failures with Datapump, CTC Admin, and the CalAmp App.

Root Cause Analysis

The CTC US messaging pipeline failed to process messages. No messages were lost, but processing was delayed and a backlog of messages accumulated. The failure affected users retrieving messages via Datapump or the API.

The issue was caused by degraded hardware in our cloud provider's storage system. Migrating to different hardware required a CTC server restart. Once the restart was performed, message processing was restored and the accumulated backlog of messages was processed. As part of the recovery, other services also had to be restarted to fully restore CTC Admin and the CalAmp App.

Once the services were restarted, all systems and applications were fully functional and returned to normal activity.

Corrective Action and Follow-Up

  1. CalAmp reviewed the storage system infrastructure with the cloud provider to understand the failure and to identify ways to improve monitoring and alerting for such issues.
  2. CalAmp will implement additional monitoring and alarms specific to storage systems to improve detection and recovery times.
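
As an illustration only, one common form of storage alarm fires when several consecutive latency samples exceed a threshold. The sketch below is not CalAmp's or the cloud provider's actual monitoring. The class name `StorageLatencyAlarm`, the 50 ms threshold, the three-sample window, and the sample values are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class StorageLatencyAlarm:
    """Fires when `consecutive` samples in a row exceed `threshold_ms`.

    Both defaults are hypothetical values chosen for illustration.
    """

    threshold_ms: float = 50.0
    consecutive: int = 3
    _recent: deque = field(default_factory=deque)

    def observe(self, latency_ms: float) -> bool:
        """Record one latency sample and return True if the alarm fires."""
        self._recent.append(latency_ms > self.threshold_ms)
        # Keep only the most recent `consecutive` breach flags.
        if len(self._recent) > self.consecutive:
            self._recent.popleft()
        return len(self._recent) == self.consecutive and all(self._recent)


# Hypothetical I/O latency samples in milliseconds.
alarm = StorageLatencyAlarm(threshold_ms=50.0, consecutive=3)
results = [alarm.observe(ms) for ms in [10, 80, 90, 95, 20]]
print(results)  # [False, False, False, True, False]
```

Requiring consecutive breaches rather than a single one is a trade-off: it reduces false alarms from brief spikes at the cost of a slightly later alert.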
Posted Oct 02, 2024 - 18:25 PDT

Resolved
This incident has been resolved.
Posted Oct 01, 2024 - 09:00 PDT
Update
All systems have recovered, but we continue to monitor.
Posted Oct 01, 2024 - 08:20 PDT
Monitoring
Messages are processing.
Posted Oct 01, 2024 - 07:50 PDT
Update
We are continuing to investigate this issue.
Posted Oct 01, 2024 - 07:27 PDT
Investigating
We are currently investigating this issue.
Posted Oct 01, 2024 - 07:05 PDT
This incident affected: US CalAmp Telematics Cloud (US CTC Core Services) and US CalAmp App.