Commit Graph

110 Commits

Author SHA1 Message Date
Vijay Joshi
62d65862b7 INFRA-3611 : Add more logs for better debugging of events (#449)
* INFRA-3611 : Add more logs for better debugging of events

* INFRA-3611 : Log tag fix
2024-08-09 17:32:04 +05:30
Vijay Joshi
2df1603fd2 INFRA-3661 : Make Data fetch calls concurrent to avoid timeouts (#448) 2024-08-09 12:36:58 +05:30
Vijay Joshi
3cfebd20a6 INFRA-3467 : Filter out failures on Non-Houston channel events (#447) 2024-08-08 23:44:47 +05:30
Vijay Joshi
804be01c2f INFRA-3467 : Private Houston Incidents (#445)
* INFRA-3467 : Private Houston Incidents

* INFRA-3627 : Minor self review

* INFRA-3627 : PR Review changes

INFRA-3627 : Minor changes

INFRA-3627 : UT fix

INFRA-3637 : Message changes

INFRA-3627 : Minor changes

INFRA-3627 : Constant fix

INFRA-3627 : Do not post SLA breach in public channels for private incidents
2024-08-08 19:20:04 +05:30
Vijay Joshi
55da2b4791 INFRA-3570 : Do not show the current severity and status in update incident in slack UI (#439)
* INFRA-3570 : Do not show same severity and status in update incident in slack UI

* INFRA-3570 : Cyclic dependency fix

* INFRA-3570 : Minor changes

* INFRA-3570 : Add UT'S

* INFRA-3570 : Major refactor

* INFRA-3570 : Move all incident status repo functions to new service

* INFRA-3570 : Add UT's
2024-07-18 13:17:28 +05:30
Vijay Joshi
293220ded8 INFRA-3565 : Remove retrospective and service owner in incident roles (#434)
* INFRA-3565 : Remove retrospective and service owner in incident roles

* INFRA-3565 : Build failure fix
2024-07-11 14:53:39 +05:30
Vijay Joshi
9448e8ec83 INFRA-3529 - Truncated title in slack channel name (#433)
* INFRA-3529 - Truncated title in slack channel name

* INFRA-3529 : Minor fix

* INFRA-3529 : Dashboard inconsistency fix

* INFRA-3529 : Remove role from daily reminder message
2024-07-09 15:07:43 +05:30
Amit Jambotkar
3a3a0a7c26 INFRA-3437|Amit|Severity reduction (#427)
* INFRA-3437|Amit|Severity reduction
2024-06-18 17:54:10 +05:30
Vijay Joshi
29f0c7bacc INFRA-3012 : Houston topic changes according to new construct (#415)
* INFRA-3012 : Houston title changes according to new construct

* INFRA-3012 : add title change for resolve and duplicate case

* INFRA-3012 : Failing tests fix

* INFRA-3012 : Added migration script for backfilling
2024-04-26 14:52:12 +05:30
Vijay Joshi
1f1679b272 INFRA-3126 : Cleanup of deprecated API's and dead code (#414)
* INFRA-3126 : Cleanup of deprecated API's and dead code

* INFRA-3126 : More cleanup
2024-04-15 17:28:39 +05:30
Vijay Joshi
0ed941abc3 INFRA-3151 : Add unarchival listener to add houston bot to incident slack channel (#416)
* INFRA-3151 : Add unarchival listener to add houston bot to incident slack channel

* INFRA-3151 : review comments

* INFRA-3151 : format fix
2024-04-12 18:53:06 +05:30
Shashank Shekhar
233c632d38 INFRA-2866 | Create and update incident with assigner and responder from slack (#394)
* INFRA-2866 | create incident modal with product

* INFRA-2866 | Update product flow

* INFRA-2866 | Resolving review comments

* INFRA-2866 | Adding default values for product, assigner and responder

* INFRA-2866 | bug fix in getting assigner and responder team

* INFRA-2866 | bug-fix: users in no team are not getting products

* INFRA-2866 | adding log lines

* INFRA-2866 | adding assigner team members into incident

* INFRA-2866 | updated help command response text

* INFRA-2866 | adding assigner team members by severity

* INFRA-2866 | updating product list for users with no product

* INFRA-2866 | assigner teams = (teamsOfUser ++ teamsOfSelectedProducts)

* INFRA-2866 | renamed assigner to reporting team

* INFRA-2866 | query to seed product as others for current open incidents without any product
2024-03-19 16:26:30 +05:30
Vijay Joshi
2eba625b0d INFRA-2888 : Added custom metrics on all major flows (#393)
* INFRA-2888 : Added alerts on all major flows

* INFRA-2888 : Remove unnecessary space

* INFRA-2888 : Metric handler

* INFRA-2888 : Review changes

* INFRA-2888 : Build fix

* INFRA-2888 : Code cleanup

* INFRA-2888 : Review comments round 1

* INFRA-2888 : Err msg changes

* INFRA-2888 : task to job in name
2024-03-12 19:56:52 +05:30
Vijay Joshi
793c9183ec INFRA-2931 : Auto archival of all severity incidents (#387)
* INFRA-2931 : Basic setup for auto archival

* INFRA-2931 : Cron implemented for auto archival

* INFRA-2931 : query based on config map

* INFRA-2931 : change query to is false

* INFRA-2931 : Minor changes in names and duplicate archival conditions
2024-03-05 19:25:18 +05:30
Shashank Shekhar
de7639d6fe INFRA-2866 | moving update team modal ack before async workflow (#389) 2024-03-05 16:40:17 +05:30
Shashank Shekhar
d4d7da3328 INFRA-2866 | Create and update incident api changes (#386)
* INFRA-2866 | added APIs to get product for user and to get assigner and responder teams

* INFRA-2866 | added create-incident-v3 API

* INFRA-2866 | migration script to fill team_severity, team_user and team_user_severity tables

* INFRA-2866 | adding team severity users upon team and severity update

* INFRA-2866 | using update team v2 in slack action

* INFRA-2866 | update product flow

* INFRA-2866 | fixed user not invited issue

* INFRA-2866 | updated API paths

* INFRA-2866 | using constant for header fetching

* INFRA-2866 | PR review changes
2024-03-05 15:26:00 +05:30
Shashank Shekhar
36d590221c INFRA-2829 | Implemented transaction in create incident flow (#371)
* INFRA-2829 | Implemented transaction in create incident flow

* INFRA-2829 | created util func for rollback

* INFRA-2829 | removed redundant code to create slack channel
2024-02-16 14:47:22 +05:30
Vijay Joshi
358799442e INFRA-2828 : Drop unused tables, update migrations and removed unused code modules (#367) 2024-02-13 15:43:24 +05:30
Vijay Joshi
f8b286adb1 INFRA-2856 : Added alerts for incident creation, resolve and zenduty failures (#366) 2024-02-12 19:11:02 +05:30
Gullipalli Chetan Kumar
a4c648649b TP-54496 | Created justification message prompt for de-escalation (#360)
* TP-54496| created justification message prompt for de-escalation

* TP-54496| added migration file to update new column in log table

* TP-54496| added feature flag

* TP-54496| created util functions and constants

* TP-54496| updated design changes

* TP-54496| made the changes requested in PR comments

* TP-54496| fixed bugs in merge conflicts

* TP-54496| acknowledging slack beforehand so as not to time out

* TP-54496| modified log entity field justification

---------

Co-authored-by: Shashank Shekhar <shashank.shekhar@navi.com>
2024-02-07 18:11:50 +05:30
Vijay Joshi
ad96361d68 TP-49979 , TP-52174 : API to get resolution tags + resolve incident API + incident resolve entire flow refactor (#347)
* TP-49979 : Added API to get tags for resolving incident

* TP-49979 : Set up basic structure for resolve incident from UI

* TP-49979 : Complete till post rca flow

* TP-49979 : Complete till rca gen flow

* TP-52174 : rebase changes

* TP-52174 : Integrate with slack

* TP-52174 : fix error in flows

* TP-52174 : Segregate interface and impl

* TP-52174 : Fix ut failures

* TP-52174 : Fix resolve tag api error

* TP-52174 : Fix jira link bug

* TP-52174 : Remove nil

* TP-52174 : Rebase changes

* TP-52174 : Jira links table fix

* TP-52174 : Line length fix

* TP-52174 : Makefile changes

* TP-52174 : Basic bug fixes

* TP-52174 : Minor fixes

* TP-52174 : Add UT's for initial flows

* TP-52174 : Added all UT's

* TP-52174 : More PR review changes

* TP-52174 : Add UT's for incident jira and tag service

* TP-52174 : Fix jira link bug and batched create incident tags db call

* TP-52174 : Make auto archival severities configurable

* TP-52174 : Fix jira link in incident table issue
2024-02-01 15:23:15 +05:30
Shashank Shekhar
e0f78dc190 TP-54880 | Posting SLA breach heads-up message when an incident is reopened and TAT is breached or breaching within next 24 hours (#358)
* TP-54880 | Posting SLA breach heads-up message when an incident is reopened and TAT is breached or breaching within next 24 hours
2024-01-23 19:07:05 +05:30
Ajay Devarakonda
392f434e46 TP-53262 | Fixed incident status from slash command processor and incident update status popup (#353)
* TP-38709 | Merging the changes to master on the logfix

* TP-53262 | Fixed set status to investigating from incident action drop down and from slash command resolver

* TP-53262 | Resolved pr comment

* TP-53262 | Resolved pr comment

* TP-53262 | Resolved pr comment
2024-01-17 17:42:51 +05:30
Gullipalli Chetan Kumar
8e7619f972 TP-52454 : Created Zenduty integration (#348)
* TP-52454| created zenduty integration

* TP-52454| added migration script for external team table

* TP-52454| added extra logs

* TP-52454| modified logs

* TP-52454|added extra logs

* TP-52454| changed post url for zenduty

* TP-52454| fixed bugs in zenduty client

* TP-52454| created constants for environment variables

* TP-52454| enabled zenduty if severity is less than or equal to the defined config
2024-01-12 14:24:19 +05:30
Ajay Devarakonda
72b6271947 TP-52077 | Implemented retry option for rca generation failure (#344)
* TP-38709 | Merging the changes to master on the logfix

* TP-52077 | Added on demand rca generation entry point and its implementation

* TP-52077 | slack client ack request issue fix

* TP-52077 | implemented retry button for rca generation failure

* TP-52077 | Added rca generation unit tests

* TP-52077 | Addressed review comments

* TP-52077 | Added retry button on rca generation failure incoming webhook

* TP-52077 | Fixed unit test conflicts
2024-01-05 17:02:53 +05:30
Ajay Devarakonda
adb2795b12 TP-51651 | Fixed error logging for delete event skipping case (#340)
* TP-38709 | Merging the changes to master on the logfix

* TP-51651 | Fixed merge conflicts issue along with error optimisation

* TP-51651 | Fixed error handling for empty record not found use case while fetching rca link
2023-12-27 12:05:08 +05:30
Gullipalli Chetan Kumar
aeb572f47e TP-45807 : Sending google transcripts to gen ai (#315)
* created service for sending google transcripts to gen ai

* TP-45807| resolved bugs in drive service tests

* TP-45807| unit tests for getting conversation data function

* creating driveservice in app context and passing to rca service

* modified the unit tests to accommodate driveservicemock

* resolved merge conflicts

* resolved merge conflicts
2023-12-26 14:28:27 +05:30
Gullipalli Chetan Kumar
74c1b88b3d TP-51709 : Enabled Marking an Incident as Duplicate through update Incident API (#336)
* TP-51709| created mark-duplicate-incident-status function

* TP-51709| made the duplicate status code modular
2023-12-22 14:18:43 +05:30
Ajay Devarakonda
c9785af64b TP-51651 | Implemented delete event (#335)
* TP-38709 | Merging the changes to master on the logfix

* TP-51651 | Added implementation for deleting event on incident resolve and duplicate status updates

* TP-51651 | Added delete event as go routine

* TP-51651 | Added go routine in resolve action as well

* TP-51651 | Fixed naming conventions

* TP-51651 | Fixed naming conventions
2023-12-22 12:05:47 +05:30
Shashank Shekhar
5758e603e8 Jira link table (#331)
* TP-51013 | incident_jira entity, repo and service

* TP-51013 | get jira status api

* TP-51013 | added db migration file

* TP-51013 | added migration query to migrate existing jira links into new table

* TP-51013 | removing linked_jira_issues column from incident table

* TP-51013 | removing empty jira fields if no response found for a jira key in jira api response

* TP-51013 | handled jira api failure cases, will return empty jira fields

* TP-51013 | removed linked_jira_issues field from incident entity

* TP-51013 | handled jira link addition and removal in slack action

* TP-51013 | resolving PR comments

* TP-51013 | adding jira link max length check
2023-12-21 16:52:35 +05:30
Ajay Devarakonda
1750ac3c18 TP-51655 | Implementation for creating conference event when severity is updated to other than sev3 (#334)
* TP-38709 | Merging the changes to master on the logfix

* TP-51652 | Added condition to not create conference events for sev3 severity incidents

* TP-51653 | Added condition to skip krakatoa workflow for sev 3 incidents

* TP-51655 | Added implementation to create google meet when severity is changed from sev3

* TP-51655 | Updated wait group count

* TP-51655 | Removed service as a param
2023-12-19 18:17:39 +05:30
Ajay Devarakonda
4c0fbceb33 TP-51197 | Fixed meeting link display in blaze channel (#326)
* TP-38709 | Merging the changes to master on the logfix

* TP-51197 | Fixed meeting link display on blaze channel

* TP-51197 | Fixed label for meeting not available
2023-12-15 14:46:21 +05:30
Ajay Devarakonda
29fbf519e5 TP-51020 | Fixed houston commands alignment issue (#324)
* TP-38709 | Merging the changes to master on the logfix

* TP-51020 | Added help commands button in incident channel which displays the list of supported commands

* TP-51020 | Fixed houston commands alignment issue
2023-12-14 16:43:18 +05:30
Ajay Devarakonda
51e249cef6 TP-51020 | Added help commands button in incident channel section (#323)
* TP-38709 | Merging the changes to master on the logfix

* TP-51020 | Added help commands button in incident channel which displays the list of supported commands
2023-12-14 14:39:52 +05:30
Ajay Devarakonda
0f46d506e3 TP-49333 | Fixed auto escalation message while creating incident (#306)
* TP-38709 | Merging the changes to master on the logfix

* TP-49333 | Fixed SLA message while creating incident

* TP-49333 | Fixed PR review comments

* TP-49333 | Fixed merge conflicts
2023-12-13 18:39:15 +05:30
Vijay Joshi
a40785213b TP-50685 : Krakatoa Integration Phase 2 - Fetch CSV files from monitoring service and post CSV files to Slack (#321)
Krakatoa Integration Phase 2
2023-12-13 16:54:34 +05:30
Gullipalli Chetan Kumar
14b360d0e5 TP-47107| added restriction to not update incident to same status (#318) 2023-12-12 13:41:24 +05:30
Ajay Devarakonda
a62ecbe0a5 TP-48512 | Implementation of RCA and tag migration (#296)
* TP-38709 | Merging the changes to master on the logfix

* TP-48512 | Added button element for RCA section and implemented fill rca details

* TP-48512 | Small fixes

* TP-48512 | adding unit tests

* TP-48512 | added unit tests

* TP-48512 | updated color code for rca card

* TP-48512 | Removed duplicate interface

* TP-48512 | Added one more unit test

* TP-48512 | added comments for jira link validation and update

* TP-48512 | Merging the changes to master on the logfix

# Conflicts:
#	cmd/app/handler/slack_handler.go

* TP-48512 | Added button element for RCA section and implemented fill rca details

# Conflicts:
#	common/util/common_util.go
#	common/util/constant.go
#	internal/processor/action/incident_resolve_action.go
#	internal/processor/action/incident_update_jira-links_action.go
#	internal/processor/action/incident_update_resolution_text_action.go
#	internal/processor/action/view/incident_resolution_text.go
#	internal/processor/action/view/incident_section.go
#	service/slack/slack_service.go

* TP-48512 | Small fixes

* TP-48512 | adding unit tests

* TP-48512 | added unit tests

# Conflicts:
#	Makefile
#	service/incident/incident_service_v2_interface.go

* TP-48512 | updated color code for rca card

* TP-48512 | Removed duplicate interface

* TP-48512 | Added one more unit test

* TP-48512 | added comments for jira link validation and update

* TP-48512 | Fixed merge conflicts

* TP-48512 | Fixed merge conflicts

* TP-48512 | Fixed merge conflicts

* TP-48512 | Added sql migration script for adding tags

* TP-48512 | Updated sql migration script for adding tags

* TP-48512 | Fixed merge conflicts and updated tags in sql migration script
2023-12-07 14:13:12 +05:30
Vijay Joshi
120d508a05 Incident Service Integration with monitoring service client for Houston-Krakatoa integration (#302)
2023-12-05 15:49:13 +05:30
Ajay Devarakonda
1cec0657db TP-49982 | Modified incidents to tag already resolved incidents as duplicates (#308)
* TP-38709 | Merging the changes to master on the logfix

* TP-49982 | Modified to accept resolved incidents for duplicating the incidents
2023-12-05 12:16:27 +05:30
Gullipalli Chetan Kumar
c393b81bbc TP-47335 : Update get teams api to reduce latency by getting user data from database instead of slack (#284)
* TP-47335| created teamservice version 2 for get teams api

* TP-47335| modified the getusers info function to handle nil error

* refactored the structure of team service and created interfaces

* TP-47335| created unit tests

* TP-47335| added unit tests for get teams api

* resolved PR comments

* created custom error types

* made some changes in unit tests

* added unit tests for team handler

* solved merge conflicts

* solved invalid users bug

* resolved merge conflicts

* restricting incident title length to 100 characters

* removed unnecessary comments
2023-12-04 15:16:21 +05:30
Vijay Joshi
527ba2c04f TP-44155, TP-47355 : Update incident web refactor + Update severity slack refactor with unit tests (#262)
* TP-44155 : Update incident web refactor

* Resolution of v1 and v2 service calls

* PR review changes

* Rebase fixes

* TP-47355 : Add slack update severity refactor

* Cors fix

* Rebase fix

* Second PR review changes

* More review changes

* Add concurrency to slack calls

* rebase

* Setup interfaces

* Added unit tests for update incident refactor

* Add more test cases

* Rebase changes

* Fix responder added by

* Fix build error

* Fix concurrent slack calls

* Revert rebase bug

* Shorten function length: added slack workflows

* Made function size smaller
2023-11-30 14:24:29 +05:30
Shashank Shekhar
805d45bb34 TP-49403 | restricting set status to Resolved (#299) 2023-11-30 13:39:06 +05:30
Shashank Shekhar
88459577f4 TP-49403 | parameterized slash command (#297)
* TP-49403 | parameterized slash command

* TP-49403 | handling resolve and rca params; also implemented Help-Commands button

* TP-49403 | using command pattern for command resolution and execution

* TP-49403 | made find team by name and find severity by name queries case insensitive

* TP-49403 | updating help message keys
2023-11-30 11:56:32 +05:30
Gullipalli Chetan Kumar
4abb12f71f TP-47360 : Removed deleted_at column, changed text messages (#294)
* removed the deleted_at column and removed one extra space in the resolved, duplicated messages

* changed rca input entity structure

* changed text in tests of rca service
2023-11-24 16:20:47 +05:30
Gullipalli Chetan Kumar
2dd4d710e5 TP-47360 : Created service for uploading slack conversations to s3 and send request to maverick for generating RCA (#290)
* TP-47360| created services to upload slack conversation to cloud and send urls to generate RCA

* created rca input repository

* TP-47360| enabled generating rca service on resolution

* resolved merge conflicts

* TP-47360| added migration script for creating rca input table

* changed json response structure according to contract

* added unit tests

* removed api to make gen ai call

* made changes in message format posted in slack

* changed entity struct and adding flag to enable rca generation

* attaching title, description and replies related to incident status block

* made design changes in message format
2023-11-24 14:39:34 +05:30
Ajay Devarakonda
1125f573b2 TP-44162 | Google Meet integration to create calendar invite with meeting link creation on incident creation (#277)
* TP-44158 | Adding service to get transcript files from Google Drive (#234)

Adding service to get transcript files from Google Drive

* TP-45120 (#275)

TP-45120 | merging Google auth implementation and calendar event fix

* TP-44162 | added service implementation for calendar actions

* TP-44162 | Updated label in slack message

* TP-44162 | Fixed build failures

* TP-44162 | Updated sql migration file name

* TP-44162 | added unit tests for google calendar service

* TP-48200 | updated response messages in link and unlink jira apis (#278)

* TP-44162 | resolved review comments

* TP-44162 | updated few naming conventions

* TP-44162 | Adding timeouts to google drive api calls and related UTs

* TP-44162 | Adding drive api timeout to viper for unit test

---------

Co-authored-by: Sriram Bhargav <sriram.bhargav@navi.com>
Co-authored-by: Shashank Shekhar <shashank.shekhar@navi.com>
2023-11-09 16:25:20 +05:30
Gullipalli Chetan Kumar
f199f68ae6 removed immediately archiving feature in update incident api and changed message format (#274) 2023-11-06 13:02:58 +05:30
Shashank Shekhar
5ce7d38064 TP-46247 | API to add jira links to an incident (#257)
* TP-46247 | API to add jira links to an incident

* TP-464408 | Add Jira link modal

* TP-45730 | renaming log entity name back to log from logger
2023-11-03 15:30:07 +05:30
Gullipalli Chetan Kumar
7454d3561e TP-42838: Auto archive incident channels after specified delay (#272)
* TP-42838| created auto archival scheduler

* TP-42838| created utility to post archival messages and updated the archiving scheduler

* TP-42838| added messages to be posted in incident channel for archiving

* TP-42838| made utility functions for posting messages

* added environmental variables for cron

* changed posting time to ist from utc

* archiving channels based on end time in incident table

* changed time from 24 to 12 hour format

* updated the query to retrieve channels to be archived

* resolved merge conflicts

* made the requested changes in PR
2023-11-03 11:42:43 +05:30