Incidents
An incident is any event that disrupts, or could disrupt, your organization's workflow. Managing such events calls for a methodical procedure, and the incident management feature helps your organization identify and resolve incidents efficiently.
Incidents are integrated into the Flanksource system alongside configs and health checks, so you can view them in context and filter for the incidents relevant to your situation.
Involved People
Role | Description |
---|---|
Incident Commander | An individual responsible for overseeing the entire response to an incident. |
Communicator | An individual responsible for managing all communications related to an incident. |
Creator | The person who created the incident, or the system if the incident was created automatically. |
Severity Levels
Incidents can be classified into different severity levels based on their potential impact on your operations. The severity levels are as follows:
Severity | Description |
---|---|
Critical | This is the highest level of severity, indicating an incident that has caused or threatens to cause major disruptions. |
High | This level indicates a significant incident that has a substantial impact but doesn't qualify as critical. |
Medium | This level is for moderate incidents that cause some disruption but can be managed without significant diversion of resources. |
Low | This is for minor incidents that have minimal impact on normal operations. |
Info | This level is used for incidents that don't impact operations but still need to be recorded for informational purposes. |
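
Because the severity levels form an ordered scale, they are convenient to model as an ordered enumeration. The following is a minimal sketch, not the Flanksource API: the `Severity` class, its numeric ranks, the `at_least` helper, and the sample incidents are all hypothetical. Only the relative order of the levels comes from the table above.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Numeric ranks are illustrative assumptions; only the relative
    # order (Info lowest, Critical highest) comes from the table.
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def at_least(incidents, threshold):
    """Return incidents at or above the threshold, most severe first."""
    selected = [i for i in incidents if i["severity"] >= threshold]
    return sorted(selected, key=lambda i: i["severity"], reverse=True)


# Hypothetical sample data for illustration only.
incidents = [
    {"title": "Disk usage report", "severity": Severity.INFO},
    {"title": "Checkout API down", "severity": Severity.CRITICAL},
    {"title": "Slow dashboard", "severity": Severity.MEDIUM},
    {"title": "Certificate expiring", "severity": Severity.HIGH},
]

urgent = at_least(incidents, Severity.HIGH)
print([i["title"] for i in urgent])
```

Using an integer-backed enumeration keeps comparisons such as `Severity.HIGH >= Severity.MEDIUM` readable while still letting the levels be sorted directly.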
Types
Incidents can also be classified based on their nature. The incident types are as follows:
Type | Description |
---|---|
Availability | Incidents that affect the availability of services or resources. |
Cost | Incidents that cause unexpected increases in costs or resource usage. |
Performance | Incidents that impact the performance of services or resources. |
Security | Incidents involving security breaches or vulnerabilities. |
Technical Debt | Incidents caused by accumulated technical issues that haven't been addressed. |
Compliance | Incidents related to non-compliance with regulatory requirements or standards. |
Integration | Incidents that involve issues with integrated systems or services. |
Reliability | Incidents that affect the consistent and dependable performance of services or resources over time. |
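
Classifying incidents by type makes it straightforward to filter or summarize them. The sketch below assumes a hypothetical `Incident` record and `IncidentType` enumeration whose values mirror the table above. It illustrates the idea only and does not reflect how Flanksource stores or queries incidents.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class IncidentType(Enum):
    # Values mirror the type names in the table; the class is hypothetical.
    AVAILABILITY = "Availability"
    COST = "Cost"
    PERFORMANCE = "Performance"
    SECURITY = "Security"
    TECHNICAL_DEBT = "Technical Debt"
    COMPLIANCE = "Compliance"
    INTEGRATION = "Integration"
    RELIABILITY = "Reliability"


@dataclass
class Incident:
    title: str
    type: IncidentType


def filter_by_type(incidents, wanted):
    """Return only the incidents of the given type, in their original order."""
    return [i for i in incidents if i.type is wanted]


def count_by_type(incidents):
    """Count how many incidents fall under each type."""
    return Counter(i.type for i in incidents)


# Hypothetical sample data for illustration only.
sample = [
    Incident("Leaked API token", IncidentType.SECURITY),
    Incident("Unexpected storage bill", IncidentType.COST),
    Incident("Open admin port", IncidentType.SECURITY),
]

security = filter_by_type(sample, IncidentType.SECURITY)
counts = count_by_type(sample)
print([i.title for i in security], counts[IncidentType.SECURITY])
```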
Status
Status describes the current state of the incident. The incident status labels are as follows:
Status | Description |
---|---|
Open | The incident has been reported and is awaiting investigation. |
Closed | The incident has been fully resolved and no further action is required. |
Mitigated | Temporary measures have been taken to manage the incident, but further investigation or action is needed. |
Resolved | The root cause of the incident has been addressed, resolving the issue. |
Cancelled | The incident was closed without a resolution, typically because it was a false alarm or duplicate. |
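
One way to read the status table is to separate statuses that still call for work from those that do not. The sketch below is an interpretation of the descriptions above, not documented Flanksource behavior: the `Status` enumeration and the `needs_attention` helper are hypothetical, and grouping Open and Mitigated together is an assumption based on their descriptions mentioning pending investigation or action.

```python
from enum import Enum


class Status(Enum):
    # Values mirror the status labels in the table; the class is hypothetical.
    OPEN = "Open"
    CLOSED = "Closed"
    MITIGATED = "Mitigated"
    RESOLVED = "Resolved"
    CANCELLED = "Cancelled"


# Assumption: only Open (awaiting investigation) and Mitigated
# (further investigation or action needed) still require work.
NEEDS_ATTENTION = frozenset({Status.OPEN, Status.MITIGATED})


def needs_attention(status):
    """Return True if the status description implies outstanding work."""
    return status in NEEDS_ATTENTION


pending = [s.value for s in Status if needs_attention(s)]
print(pending)
```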