Patching Machines (or Shut-up Alerts!)

Over the years, I’ve worked for various organizations, and most have had a fairly routine process for applying patches.  The best of these used a defined Change Management window during which the patches were deployed and the computers rebooted (if necessary).  This post is not a recommendation on when or how to do this.  It is a recommendation on how to get SolarWinds Orion to shut up while you do this work.

Years ago, the SolarWinds Orion platform introduced the idea of Unmanaging a node.  In plain terms, this meant that you would pause all monitoring on a specific element.  Because the monitoring was paused, you got no alerts, but you also got no metrics.  This worked, but it could be better.

Enter the new concept of Muting alerts.  When you mute an element, the polling doesn’t stop; only the alerting does.  This means that CPU, memory, interface utilization, and everything else is still being recorded, but you won’t get any alerts.

Then I got a question at a recent SWUG about integrating this with Active Directory.  That got my mind racing.

So here’s my theory:

You are currently using Active Directory Sites & Services to properly segment your network.  This is also how you handle your patching – location by location.

Active Directory Sites

Now, the interesting thing about SolarWinds Orion is that when we discover a machine and realize that it’s a Windows server, we also ask for the AD Site information.

Node Details

So, if you are using Sites to determine when you do patching, you can just ask Orion what specific nodes are in this location.

Then you can use the Orion API to say “please mute these alerts on [said] day for x time.”  In my example, I’m saying to start the muting right now and to have it suppress the alerts for 1 hour.  (Your mileage may and will vary).
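The timing in that request boils down to two UTC timestamps.  Here is a minimal sketch of computing them for a window that starts now.  The variable names $SuppressFrom and $SuppressUntil are my own choice for illustration (borrowed from the column names in Orion.AlertSuppression), and the commented-out date is just a placeholder for a scheduled window.

```powershell
# How long to mute, in minutes
$MuteMinutes = 60

# Convert to UTC first, then add the duration, so a daylight-saving
# change during the window cannot stretch or shrink it
$SuppressFrom  = ( Get-Date ).ToUniversalTime()
$SuppressUntil = $SuppressFrom.AddMinutes( $MuteMinutes )

# For a scheduled window, start from a fixed local time instead, e.g.:
# $SuppressFrom = ( Get-Date '2030-01-15 22:00' ).ToUniversalTime()

Write-Host "Mute from $( $SuppressFrom.ToString( 'u' ) ) until $( $SuppressUntil.ToString( 'u' ) )"
```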

<###########################################################
#
#  Variables
#
###########################################################>

# Since the AD Site Name is stored as the "Location" for WMI and Agent Managed Windows Servers in Orion, we can just do a filter on that.
$AdSiteName = "EAST"

# Suppress alerts for how much time?
$MuteMinutes = 60

<###########################################################
#
#  Orion Stuff!
#
###########################################################>

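# Connect-Swis, Get-SwisData, and Invoke-SwisVerb come from the SwisPowerShell module
# (assumes it is installed, e.g. via Install-Module SwisPowerShell)
Import-Module SwisPowerShell

if ( -not ( $SwisCreds ) )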
{
    $SwisCreds = Get-Credential -Message "Enter your Orion Credentials:"
}

$SwisHost = "10.196.3.50"

$SwisConnection = Connect-Swis -Hostname $SwisHost -Credential $SwisCreds

# Query Details:
# 1) Location matches above variable
# 2) Vendor is Windows
# 3) Monitoring Type is Agent or WMI (Haven't tested it for SNMP, but pretty sure it uses the 'sysLocation' field)
$Nodes = Get-SwisData -SwisConnection $SwisConnection -Query @"
SELECT Caption, IP_Address, Uri AS [EntityUri]
FROM Orion.Nodes
WHERE Location = '$AdSiteName'
  AND Vendor = 'Windows'
  AND ObjectSubType IN ( 'Agent', 'WMI' )
ORDER BY Caption
"@

ForEach ( $Node in $Nodes )
{
    # Let's cycle through each and see if it's already set to be muted
    $IsMuted = Get-SwisData -SwisConnection $SwisConnection -Query @"
SELECT EntityUri, SuppressFrom, SuppressUntil
FROM Orion.AlertSuppression
WHERE EntityUri = '$( $Node.EntityUri )'
"@
    if ( $IsMuted )
    {
        Write-Host "$( $Node.Caption ) ($( $Node.IP_Address )) is already muted!" -ForegroundColor Yellow
    }
    else
    {
        Write-Host "$( $Node.Caption ) ($( $Node.IP_Address )) is NOT already muted!" -ForegroundColor Green
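        try
        {
            Write-Host "`tTrying to mute $( $Node.Caption )" -ForegroundColor Green
            # Take one UTC timestamp so the start and end come from the same instant
            $MuteStart = ( Get-Date ).ToUniversalTime()
            $MuteEnd   = $MuteStart.AddMinutes( $MuteMinutes )
            # -ErrorAction Stop makes sure a failed call is a terminating error, so the catch block runs
            $Results = Invoke-SwisVerb -SwisConnection $SwisConnection -EntityName Orion.AlertSuppression -Verb SuppressAlerts -Arguments @( @( $Node.EntityUri ), $MuteStart, $MuteEnd ) -ErrorAction Stop
        }
        catch
        {
            Write-Host "`tSomething went wrong: $( $_.Exception.Message )" -ForegroundColor Red
        }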
    }
}
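One caveat with the IsMuted check: it treats any row in Orion.AlertSuppression as "already muted", even one whose SuppressUntil time has passed.  If such rows can linger (a reader reports this in the comments), a client-side filter is one way to ignore them.  This is only a sketch: Test-SuppressionUnexpired is a hypothetical helper of mine, it assumes the returned times are UTC, and I have not verified whether SuppressAlerts succeeds while an expired row still exists (you may need ResumeAlerts first).

```powershell
function Test-SuppressionUnexpired
{
    param(
        # A row with a SuppressUntil property, e.g. from Orion.AlertSuppression
        [Parameter(Mandatory)] $Row,
        # The moment to compare against (assumed to be UTC, like the stored times)
        [datetime] $AsOfUtc = [datetime]::UtcNow
    )

    $Until = $Row.SuppressUntil

    # No end time means "muted until someone resumes it"
    if ( $null -eq $Until -or $Until -is [System.DBNull] ) { return $true }

    # Otherwise the row only counts if its end time is still in the future
    return ( [datetime]$Until -gt $AsOfUtc )
}

# Stand-in rows for illustration (not real Orion data)
$Now     = [datetime]::UtcNow
$Expired = [pscustomobject]@{ SuppressFrom = $Now.AddHours( -2 ); SuppressUntil = $Now.AddHours( -1 ) }
$Current = [pscustomobject]@{ SuppressFrom = $Now.AddHours( -1 ); SuppressUntil = $Now.AddHours( 1 ) }
$Forever = [pscustomobject]@{ SuppressFrom = $Now.AddHours( -1 ); SuppressUntil = $null }

# In the loop above, the check could then become something like:
# $IsMuted = @( $Rows | Where-Object { Test-SuppressionUnexpired -Row $_ } ).Count -gt 0
```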

Honestly, the hardest part for me was deciding whether to do this en masse or one node at a time.  I decided to do it one at a time so that I could get a line-by-line log of the event.  You could do it another way if you like.

The important part was that the SuppressAlerts verb takes three parameters:

  • An array of Entity URIs.  If you only send one, it still needs to be wrapped in @( ) so that it is passed as an array
  • The date/time to start the muting (this can be $null, meaning start right now)
  • The date/time to end the muting (this can be $null, meaning mute indefinitely)
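To make the shape of those three parameters concrete, here is a small sketch that builds the argument array for one node or for many.  New-SuppressAlertsArguments is a hypothetical helper name, and the swis:// URIs are made-up placeholders; real ones come from the Uri column of Orion.Nodes.

```powershell
function New-SuppressAlertsArguments
{
    param(
        [Parameter(Mandatory)] [string[]] $EntityUri,
        [Nullable[datetime]] $SuppressFrom  = $null,   # $null = start right now
        [Nullable[datetime]] $SuppressUntil = $null    # $null = no end time
    )

    # The script in this post passes UTC times, so convert anything supplied
    if ( $null -ne $SuppressFrom )  { $SuppressFrom  = $SuppressFrom.ToUniversalTime() }
    if ( $null -ne $SuppressUntil ) { $SuppressUntil = $SuppressUntil.ToUniversalTime() }

    # First element is always an array of URIs, even when there is only one.
    # The leading comma returns the three-element array as a single object.
    return , @( @( $EntityUri ), $SuppressFrom, $SuppressUntil )
}

# One node, starting now, for one hour (placeholder URI)
$OneNode = New-SuppressAlertsArguments -EntityUri 'swis://orion.example/Orion/Orion.Nodes/NodeID=1' -SuppressUntil ( ( Get-Date ).AddHours( 1 ) )

# Several nodes in a single call, muted until resumed (placeholder URIs)
$ManyNodes = New-SuppressAlertsArguments -EntityUri @(
    'swis://orion.example/Orion/Orion.Nodes/NodeID=1',
    'swis://orion.example/Orion/Orion.Nodes/NodeID=2'
)

# Either result could then be handed to the verb:
# Invoke-SwisVerb -SwisConnection $SwisConnection -EntityName Orion.AlertSuppression -Verb SuppressAlerts -Arguments $ManyNodes
```

Passing every URI in one call is the en masse route; the trade-off is that you lose the per-node log lines.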

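When run, you’ll get a green “is NOT already muted!” line for each node, followed by a “Trying to mute” line.

If you try to run it again against the same nodes, you’ll get a yellow “is already muted!” line for each one instead.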

You can wait for these to unmute themselves, you can manually unmute them via the Manage Entities screen, or unmute them using the ResumeAlerts verb.
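For the ResumeAlerts route, a minimal sketch follows.  It assumes the verb takes a single argument, the array of entity URIs, which matches the call a reader shared in the comments.  Resume-EntityAlerts is a hypothetical wrapper name, not something that ships with Orion.

```powershell
function Resume-EntityAlerts
{
    param(
        # Connection object returned by Connect-Swis
        [Parameter(Mandatory)] $SwisConnection,
        # One or more entity URIs to unmute
        [Parameter(Mandatory)] [string[]] $EntityUri
    )

    # The leading comma keeps the URI array as ONE element of -Arguments,
    # instead of spreading each URI out as its own argument
    Invoke-SwisVerb -SwisConnection $SwisConnection -EntityName Orion.AlertSuppression -Verb ResumeAlerts -Arguments @( , @( $EntityUri ) ) | Out-Null
}

# Usage, with $SwisConnection and $Nodes from the script above:
# Resume-EntityAlerts -SwisConnection $SwisConnection -EntityUri $Nodes.EntityUri
```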

I just wanted to drop this by after I thought about it.  Let me know if I need to make updates!

Keep on rambling!

1 thought on “Patching Machines (or Shut-up Alerts!)”

  1. Great script. We have clients that schedule reboots on a weekly basis so I modified the script to read a csv file, loop through each row in the csv and put the server in maintenance mode.

    What I noticed is that when the end time (the date/time calculated from the MuteMinutes variable) passes, the SolarWinds web console shows the server out of maintenance mode, but the Orion.AlertSuppression table still lists the server when queried with Get-SwisData $SwisConnection "SELECT ID, EntityUri, SuppressFrom, SuppressUntil FROM Orion.AlertSuppression" | Format-Table.

    When a subsequent run of the mute script occurs, it fails because of the row in the Orion.AlertSuppression table. If I first execute a script with Invoke-SwisVerb -SwisConnection $SwisConnection -EntityName Orion.AlertSuppression -Verb ResumeAlerts @( ,@( $Node.EntityUri ) ), the mute script executes successfully (putting the servers in maintenance mode again). I was expecting the row in the Orion.AlertSuppression table to clear once the SuppressUntil date/time passes.

    Thoughts on this?

    Thanks in advance and thank you for a great script!
