Early this morning one of VividCortex's proxies experienced increased backpressure due to an overloaded EC2 instance that exhausted its CPU credits. While working to resolve the issue, we performed a proxy failover, and encountered a latent misconfiguration from a past infrastructure change. This increased the time to resolution and caused additional unavailability of data for some customers.
To prevent this from recurring and increase the reliability of our systems, we have taken immediate actions as well as identifying additional steps for future action. We have extended our monitoring to include not only CPU credits but other similar items, and we have reviewed all proxy configuration settings and ensured it is included in our infrastructure-as-code automation. As a proactive measure we plan to perform more failover testing that includes the testing of our proxies.
The incident created a delay only in data being available to customers. Our recovery routines are working as expected and any gap in data should soon be resolved.