Monitoring & Automation Improvements

The document outlines the current monitoring setup and maturity model for B&C, identifying obstacles and opportunities for improvement. Proposed enhancements include additional monitoring for critical components, automation opportunities across various processes, and the implementation of AIOPS capabilities for better incident management. The goal is to optimize performance, improve visibility, and enhance collaboration within the monitoring framework.

Uploaded by

mrvinodk

B&C Monitoring Improvement & Automation Opportunities

Aug 29, 2024

©LTIMindtree | Privileged and Confidential 2024


Agenda

Current Monitoring setup

Monitoring Maturity Model

Obstacles and Opportunities

Proposed Improvements

Automation Opportunities



B&C Current Monitoring Setup

Azure Monitor

Recovery Service Vault Alert
• Only Backup Job Status is monitored.
• Diagram components: Service Bus, Recovery Service Vault, Storage account, Key Vault, Private endpoints, Cosmos DB, Metrics

Application & API Alert
• App Availability is monitored.
• Diagram components: Activity Logs, Application Metrics, Application Insights SMART Detection, App Service plan, App Service, Application Insights

VM Alert
• VM CPU Usage [90%, 85%, 80%]
• VM Memory Usage
• Disk Space Usage
• VM Availability
• Diagram components: Log Analytics workspace, VM Insights, Diagnostics Metrics & Logs, VM Data Collection Rule, Virtual Machine, SQL Server, Azure Cloud Defender
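
The VM CPU Usage alert lists three thresholds (90%, 85%, 80%). A minimal sketch of how such tiered thresholds could map a reading to a severity is shown below. The severity names and the use of "greater than or equal" are assumptions for illustration; the slide does not state them.

```python
from typing import Optional

# Threshold values come from the slide; the severity labels are assumed.
CPU_THRESHOLDS = [(90.0, "critical"), (85.0, "error"), (80.0, "warning")]


def cpu_severity(cpu_percent: float) -> Optional[str]:
    """Return the highest severity whose threshold the reading meets, else None."""
    for threshold, severity in CPU_THRESHOLDS:  # ordered from highest to lowest
        if cpu_percent >= threshold:
            return severity
    return None


if __name__ == "__main__":
    for reading in (92.0, 85.0, 80.5, 50.0):
        print(reading, cpu_severity(reading))
```

Keeping the thresholds in one ordered list means that a fourth tier can be added without touching the function.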



Monitoring Maturity Model

Initial: Basic Monitoring with Limited Visibility
• Basic Metrics Collection
• Manual Log Analysis
• Limited Alerting
• No ITSM Integration
• No Event Correlation

Reactive: Monitoring for Troubleshooting
• Enhanced Metrics Collection
• Basic Log Management
• Incident-Based Alerting
• ITSM Integration
• Manual Event Correlation

Proactive: Advanced Monitoring and Predictive Capabilities
• Comprehensive Metrics and Logs
• Automated Log Analysis
• Predictive Alerting
• Advanced ITSM Integration
• Automated Event Correlation

Optimized: Comprehensive Observability and Continuous Improvement
• Full Observability
• Machine Learning Insights
• Automated Responses
• Continuous ITSM Integration
• Advanced Event Correlation
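
The step from manual to automated event correlation can be illustrated with a small sketch. It groups alerts that hit the same resource within a time window, which is one simple correlation rule among many. The `Alert` fields, the `correlate` function and the 300-second window are hypothetical and are not taken from the deck.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Alert:
    resource: str     # hypothetical resource name, e.g. "vm1"
    timestamp: float  # seconds since epoch
    message: str


def correlate(alerts: List[Alert], window_seconds: float = 300.0) -> List[List[Alert]]:
    """Group alerts on the same resource when each follows the previous within the window."""
    groups: List[List[Alert]] = []
    open_group: Dict[str, List[Alert]] = {}  # most recent group per resource
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        group = open_group.get(alert.resource)
        if group is not None and alert.timestamp - group[-1].timestamp <= window_seconds:
            group.append(alert)
        else:
            group = [alert]
            groups.append(group)
            open_group[alert.resource] = group
    return groups


if __name__ == "__main__":
    sample = [
        Alert("vm1", 0, "cpu"),
        Alert("vm1", 100, "memory"),
        Alert("vm2", 50, "disk"),
        Alert("vm1", 1000, "cpu"),
    ]
    for g in correlate(sample):
        print([(a.resource, a.message) for a in g])
```

Each group could then be raised as a single incident instead of one ticket per alert.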



Obstacles & Opportunities in current monitoring setup

• Insufficient oversight of Key Vault, Storage Account (SA), Service Bus, and Cosmos DB
• Enhanced monitoring of App & key Infra services
• Integration between ticketing and monitoring tool
• Alert isolation
• Awareness of the impact of critical components
• Dependencies on Application Teams



Proposed Improvements

Insufficient Oversight
• Enable additional monitoring for Key Vault, Service Bus, Cosmos DB & Storage Account to alert on Availability, Latency, Saturation & Error metrics.
• Benefits: Monitor and respond to potential security incidents, log activity on SA, optimize performance & track usage of Service Bus.

Enhanced monitoring
• Enable additional monitoring of Application Service.
• Build a comprehensive dashboard to provide real-time visibility.
• Trend analysis on usage, request & response time, HTTP server errors & exceptions.
• Benefits: Optimized usage of App Service and auto scaling. Proactively identify issues related to SKU capacity and take action.

Integration & alert Isolation
• Benefits:
• Automated Incident Management
• Improved Visibility and Monitoring
• Faster Issue Resolution
• Enhanced Collaboration
• Proactive Problem Management
• Compliance and Reporting
• Improved response time
• Granular monitoring of system processes, reducing potential system outages.

Enhanced Collaboration
• Lack of awareness of critical components, and thereby improper categorization of alerts.
• Dependency on the Application team to address critical issues.
• Regular Reviews
• Training

AI Ops
Use AIOPS capability for:
• Dynamic Threshold
• Anomaly Detection
• KQL query-based trend analysis on capacity usage.
• Use Application Insights to gauge user impact during outages.
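
The idea behind a dynamic threshold can be sketched in a few lines. The example below is a simple statistical stand-in (rolling mean plus `k` standard deviations), not the algorithm used by any Azure service; the function name, the window size and `k` are assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence


def dynamic_threshold_anomalies(
    values: Sequence[float], window: int = 5, k: float = 3.0
) -> List[int]:
    """Return indices whose value exceeds mean + k * stdev of the preceding window."""
    if window < 2:
        raise ValueError("window must be at least 2 to compute a standard deviation")
    anomalies: List[int] = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        threshold = mean(history) + k * stdev(history)  # recomputed for every point
        if values[i] > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    cpu = [50, 52, 51, 49, 50, 51, 95, 50]
    print(dynamic_threshold_anomalies(cpu))
```

Unlike a fixed 90% rule, the threshold here follows the recent behaviour of the metric, so a workload that normally runs hot does not alert constantly while a sudden spike on a quiet one still does.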



Automation Opportunities

Cloud Monitoring
• Capacity Monitoring
• Service Availability Monitoring
• Batch Job Monitoring
• Reporting
• Performance Monitoring

Cloud Maintenance
• Patch Management
• Backup Management
• Cost Management
• Security Management
• Password Management

Request Management
• User Access Management
• Environment Provisioning & Updates
• Configuration updates
• Website Management

Service Operations Management
• Release Management
• Service Continuity Management
• Change Management
• Configuration Management



Thank You



Automation Opportunities (continued)

| Automation Area | Process | Current state | Proposed Automation approach |
|---|---|---|---|
| Request Management | User Access Management | RBAC sheets are defined; an RBAC-based pipeline is in place for granting access | 1. Implement RBAC-based pipelines across the project |
| Request Management | Environment Provisioning/Update | Environment provisioning is done using IaC (ARM Template, Bicep, Terraform) | 1. Convert ARM Template-based scripts to Bicep |
| Request Management | Configuration Update | Resource configurations are updated using Ansible scripts | For new requirements, build Ansible scripts to fulfil the need. |
| Service Operations Management | Release Management | Releases are deployed via the Release Management Pipeline | 1. Completion of Data Pipelines |
| Service Operations Management | Change Management | Application changes are script driven, with a change request to track them | |
| Service Operations Management | Configuration Management | | |
| Service Operations Management | Incident Management | ServiceNow-based Incident Management in place | |
| Service Operations Management | Service Continuity Management | RTO & RPO are defined for each application; IaC scripts exist for provisioning and configuring the environment in case of DR | 1. Design and implement DR setup for all applications |
| Service Operations Management | Service Level Management | ServiceNow-based ticket management | |



Automation Opportunities

| Automation Area | Process | Current state | Proposed Automation approach |
|---|---|---|---|
| Cloud Monitoring | Capacity Monitoring | Alert-based notification on monitoring thresholds | Use AIOPS capability for: Dynamic Threshold; Anomaly Detection; KQL query-based trend analysis on capacity usage; use Application Insights to gauge user impact during outages. |
| Cloud Monitoring | Service Availability | App Insights and Logic App based monitoring in place | |
| Cloud Monitoring | Reporting | No existing report mechanism | Custom dashboards for overall view |
| Cloud Monitoring | Batch Job Monitoring | RedGate-based alerting on SQL jobs in place | |
| Cloud Maintenance | Patch Management | Azure Update Management is used for applying patches; SOP-driven steps for post-verification checks | Automation of post-verification steps on application availability |
| Cloud Maintenance | Backup Management | Azure Recovery Service Vault based backup is used for resource-level backup and database backup. Recovery Service Vault has an in-built capability to notify on job failure | |
| Cloud Maintenance | Cost Management | Make use of Munichre's Centralized Team on Cost Optimization Recommendation, all the recommended steps were | 1. Project-wise dashboards on cost trends. 2. Automation scripts to detect cost optimization opportunities in areas like: reduced logging; unused resources; reduction in size based on usage; using reserved instances for cost saving. |
| Cloud Maintenance | Security Management | Security recommendations by the Central Team are implemented using standalone PowerShell scripts | 1. Current scripts on security recommendations are standalone; we can update the IaC scripts (Bicep) to include the security recommendations during the build phase. |
| Cloud Maintenance | Password Management | Automation in place to notify on password expiry | 1. Extend the current password management automation to all applications and tune it as per application challenges. |
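
The trend analysis on capacity usage proposed for Capacity Monitoring can be illustrated with a least-squares fit over daily usage samples. The deck proposes doing this with KQL queries; the Python below is only a stand-in that shows the arithmetic on data such a query might return. The function name and the 100% capacity default are assumptions.

```python
from typing import Optional, Sequence


def days_until_full(
    daily_usage_percent: Sequence[float], capacity_percent: float = 100.0
) -> Optional[float]:
    """Fit a straight line to daily usage and estimate days until capacity is reached.

    Returns None when there are too few samples or usage is not growing.
    """
    n = len(daily_usage_percent)
    if n < 2:
        return None
    xs = range(n)  # day index 0..n-1
    x_mean = sum(xs) / n
    y_mean = sum(daily_usage_percent) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, daily_usage_percent))
    slope = sxy / sxx  # percentage points per day
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    day_full = (capacity_percent - intercept) / slope
    # Express the result relative to the most recent sample.
    return max(0.0, day_full - (n - 1))


if __name__ == "__main__":
    print(days_until_full([60, 62, 64, 66, 68]))
```

A forecast like this lets an alert fire on "disk full within N days" rather than only when a static usage threshold is crossed.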
