KPIs are optional and, when defined, will require threshold configurations. Dependent services are also optional and are simply references to other already-configured ITSI services on which this service depends.

Let's first clarify the difference between thresholds and alerts; in ITSI, these are related but separate concepts. Thresholds apply only to KPIs: they dictate when a KPI severity (or status, as it is sometimes called) changes from normal to critical, high, low, etc. KPI severities are viewable in the service analyzer dashboard, deep dives, and other UI locations, but in and of themselves they don't generate alerts. Alerts are generated from additional configurations, driven by KPI severity and service health score changes. We'll dig into these configurations later, but for now, we just want to acknowledge the difference between the two concepts.

At the risk of being pedantic, what is an alert anyway? Is it an email? A text message? A ticket to a ticketing system? A flashing red light in the NOC? Something else? Within ITSI, we take a two-tiered approach to generating alerts. Notable events are created first, which then lead to one or more traditional alerts. We configure ITSI to continuously monitor KPI statuses and service health scores; when we detect problems or concerns, we can then create notable events. Notable events are not alerts, at least in the traditional sense; they are simply events of interest viewable from the Notable Events Review dashboard. A separate and final configuration, using notable event aggregation policies, turns one or more notable events into your traditional alerts like emails, tickets, etc. Put the process all together and it looks like this:

[flow diagram: KPI thresholds → KPI severities and health scores → notable events → aggregation policies → traditional alerts]

Correct Entity Groupings Are Paramount

The entities selected for your service directly impact the aggregate and per-entity results for each KPI. Therefore, grouping the right entities together in your service is important to ensure success with thresholds and alerts. Typically, you'll want to ensure that each entity in your service behaves about the same as every other entity. Predictably different entities should be broken out into their own subservice. Let's use a handful of examples to clarify:

- Entities that span two different data centers should typically be broken out into DC-specific subservices.
- Batch servers or dedicated-purpose servers should typically be broken out from their general-purpose counterparts.
- Entities spanning different architectural tiers (DB, web, application, etc.) should be broken out.

You may be asking, why is this the case? Let's use the batch servers example above and assume I have a farm of 20 app servers associated with a critical business app that I'm monitoring in Splunk ITSI. Let's then assume that 3 of those 20 servers are solely responsible for processing nightly batch operations jobs. If all 20 servers are defined as entities in one service, KPIs like average CPU are nearly meaningless in aggregate (and therefore difficult to accurately threshold) because the batch servers are expected to exhibit different behavior. It also makes leveraging per-entity thresholds and anomaly detection nearly impossible down the road. So instead, break the 3 batch servers out into their own subservice and keep the other 17 grouped together. (The sketches at the end of this section make these points concrete.)

KISS (Keep It Simple)

Before we get into the meat and potatoes of how to configure thresholds and alerts, please remember to keep it simple.
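To ground the thresholds-versus-alerts distinction, here is a minimal sketch of what a threshold does: map a KPI's aggregate value to a severity. This is purely illustrative Python, not ITSI's implementation; the band boundaries and the set of severity names are assumptions for the example (real thresholds are configured per KPI in the service definition).

```python
# Illustrative only -- NOT ITSI's actual threshold engine.
# Hypothetical severity bands for an "average CPU %" KPI,
# ordered from most to least severe.
SEVERITY_BANDS = [
    (90.0, "critical"),
    (80.0, "high"),
    (70.0, "medium"),
    (60.0, "low"),
]

def kpi_severity(value: float) -> str:
    """Return the severity whose band floor the KPI value crosses."""
    for floor, severity in SEVERITY_BANDS:
        if value >= floor:
            return severity
    return "normal"

assert kpi_severity(95.0) == "critical"
assert kpi_severity(72.0) == "medium"
assert kpi_severity(42.0) == "normal"
```

Note that nothing here sends an email or opens a ticket; a threshold crossing only changes a severity, which is exactly why alerting needs its own configuration.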
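The two-tiered flow can be sketched the same way. Below, a toy "aggregation policy" groups hypothetical notable events by service and raises a single traditional alert per group. The field names, event contents, and grouping rule are all invented for illustration and are far simpler than ITSI's real notable event aggregation policies.

```python
from collections import defaultdict

# Toy notable events -- in ITSI these would be created from KPI
# severity and health score changes; the fields here are made up.
notables = [
    {"service": "web_checkout", "severity": "high"},
    {"service": "web_checkout", "severity": "critical"},
    {"service": "payments",     "severity": "low"},
]

# Tier 2: group notable events (here, by service) and emit one
# traditional alert per group only when the group is severe enough.
groups = defaultdict(list)
for event in notables:
    groups[event["service"]].append(event)

for service, events in groups.items():
    if any(e["severity"] == "critical" for e in events):
        print(f"ALERT: open ticket for {service} "
              f"({len(events)} notable events)")
```

The point of the sketch: many notable events can collapse into one actionable alert, which is what keeps the NOC from drowning in noise.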
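Finally, to see why mixing batch and general-purpose servers makes the aggregate so hard to threshold, here is the 20-server example with made-up CPU numbers (host names and values are hypothetical):

```python
# Hypothetical CPU readings (%) during the nightly batch window:
# 17 app servers idling, 3 batch servers running hot.
app_servers   = {f"app{i:02d}": 20.0 for i in range(1, 18)}  # ~20% CPU
batch_servers = {f"batch{i}":   95.0 for i in range(1, 4)}   # ~95% CPU

def avg(readings: dict[str, float]) -> float:
    return sum(readings.values()) / len(readings)

# One mixed service: the aggregate describes neither population.
mixed = avg({**app_servers, **batch_servers})

print(f"mixed service avg CPU: {mixed:.1f}%")              # 31.3%
print(f"app subservice avg:    {avg(app_servers):.1f}%")   # 20.0%
print(f"batch subservice avg:  {avg(batch_servers):.1f}%") # 95.0%
```

A 31% average looks healthy even while every batch server is pegged, and no single threshold fits both groups; split the service and each subservice's aggregate becomes meaningful again.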