Integrating vROps with vRO using webhook shim

Introduction

In an ideal world, running custom vRO workflows as a result of a triggered alert in vROps would be either an out of the box feature or (easily) possible via a vROps management pack.

There have been two incarnations of a vRO management pack for vROps and neither has provided this, the first incarnation did, with a lot of manual changes to the vROps adapter file allow this, and from an initial inspection of the latest vRO management pack this would also be possible – perhaps the subject of another blog post!

So, in the absence of such a (perfect) solution the good news is that there is a solution to the challenge, and there have already been a number of blog posts written on this subject, both from VMware staff and externally. Whilst working with a customer I found that none of these resources captured all of the steps in a single succinct post, that didn’t assume any previous knowledge.

The solution

A couple of years back, John Dias and Steve Flanders wrote a web hook shim that enabled vROps and vRLI to interact not just with vRO but with a number of different endpoints. The web hook shim is a small python application that listens for requests from vROps or vRLI and translates those requests into something that the intended endpoint can understand.

Install Photon OS

As VMware has their  own Linux distribution, this is what I have used for this blog post, especially as it works really well as a container host. There are lots of articles on how to get an instance of Photon OS up and running quickly:

https://vmware.github.io/photon/

The easiest method is to use the OVA image. Although you should note that there has been an issue with the hardware version 13 image, as outlined on William Lam’s blog:

https://www.virtuallyghetto.com/2017/11/workarounds-for-deploying-photonos-2-0-on-vsphere-fusion-workstation.html

Configure Network Settings

Whilst you could leave the VM to use DHCP for network addressing, you’ll most likely want to configure a static IP address. Details on how to do this can be found in the photon OS administration guide:

https://github.com/vmware/photon/blob/master/docs/photon-admin-guide.md#managing-the-network-configuration

I ended up with a network configuration file as shown below:

Screen Shot 2018-07-28 at 13.02.23

Once the networking configuration has been set, run the following command to restart the network daemon:

systemctl restart systemd-networkd

Run the following command to validate that the IP address information has been set correctly

ip a

The result should be something similar to that shown below

Screen Shot 2018-07-28 at 13.07.13

Configure Hostname

To set the hostname of the appliance run the following command:

hostnamectl set-hostname abc

To check that is has been set correctly run the following command:

hostnamectl

The result should be something similar to that shown below:

Screen Shot 2018-07-28 at 13.13.01

Configure Docker

We will be running the web hook shim as a Docker container. Docker is already installed within the Photon OS image, but it is does not configured to run by default. Docker also needs to be configured to start automatically

Start Docker

Start Docker by running the following command:

systemctl start docker

Set Docker to start automatically

Set Docker to start automatically during system start by running the following command:

systemctl enable docker

Start the web hook shim

The web hook shim is available on Docker Hub, so as long as the appliance can access the internet then getting it running is a simple process. Before we can get the link between vROps and vRO working we need to do some configuration of the web hook shim. Run the following command to start the Docker container

docker run -it -p 5001:5001 --name {yourchoice} vmware/webhook-shims bash

This command runs the container interactively, sets the ports that the container will respond on, gives it a memorable name, pulls the container image from Docker hub and provides a bash shell to allow shell access within the container

To check the status of the the container run the following command:

docker ps -a

This will have an output similar to the following:

Screen Shot 2018-07-28 at 13.23.04

Configure the web hook shim

Attach to the container by running the following command:

docker attach vmwareshim

The various translation files for the different endpoints are in a directory called ‘loginsightwebhookdemo’:

cd loginsightwebhookdemo
vi vrealizeowchestrator.py

The first element to change is the hostname of the vRO endpoint to which alert notifications will be sent:

VROHOSTNAME = 'vrohostname'

If running an instance of vRO from a vRA appliance there is no need to include port information. If running the standalone version of vRO then you will need to include port 8281 in this section for example:

VROHOSTNAME = 'myvroserver.mydomain.local:8281'

Information around the various means of authentication can be found within the vrealizeowchestrator.py file. Amend the various components as necessary to achieve the desired authentication mechanism:

USENETRC = True
VROUSER = ''
VROPASS = ''
VROTOKEN = ''
VROHOK = ''

If running in an environment without trusted certificates, the following line needs to be set so that certificate checks are disabled:

VERIFY = False

The shim currently has an issue whereby two instances of the workflow are executed for each alert that is triggered. To fix this issue edit the vrealizeorchestrator.py file and alter the following line:

@app.route("/endpoint/vro/", methods=['PUT', 'POST'])

To:

@app.route("/endpoint/vro/", methods=['POST'])

Save the file and then exit vi.

Using .netrc credentials

Prior to configuring the web hook shim, if .netrc credentials are going to be used it is easier to configure these now, instead of having to exit the shim, creating the credentials and then re-attaching

To use netrc credentials, run the following commands:

cd ~
echo 'machine {vro-host-or-ip} login {vro-login-name} password {vro-password}' > .netrc
chmod 600 .netrc

Start the web hook shim

Once all of the configuration has been done it is time to start the web hook shim:

cd ..
./runserver.py

A couple of information messages will be displayed. To validate that the web hook shim is operating, open a web browser and enter the address as:

webhooksshimiporhostname:5001

You should see the following returned

Screen Shot 2018-07-28 at 13.47.11

Useful commands

To stop the web hook shim enter the following command:

CTRL-C

You can restart the web hook shim later via this command:

./runserver.py

To detach from the Docker container, whilst leaving the web hook shim running, use the following command:

CTRL-P, CTRL-Q

Setup the master workflow in vRO

Before vROps can be configured we need a workflow that will receive the alerts generated. In John Dias’s series of posts he discusses a vRO workflow package that can be imported and used to test the functionality. That package can be found at the following location:

https://github.com/vmw-loginsight/webhook-shims/tree/master/samples

When configuring vROps we are going to specify the ID of a workflow that will be run. This workflow needs to have a single input called ‘alertId’ with a type of ‘string’. For the purposes of this blog post I’m going to keep things really simple and merely prove the ability to run a workflow based on a vROps alert. As a result my workflow is really basic, it has a single input and just outputs that alert to the system log.

The ID of the workflow can be found on its general tab, under ID. Copy this and save it for later.

Screen Shot 2018-07-28 at 14.05.03

vROps configuration

At a high level the configuration steps within vROps consist of the following:

  1. Setup an outbound instance
  2. Configure notification settings
  3. Setup a test alert

Setup an outbound instance

The first step is to create the outbound instance using the OOTB Rest Notification Plugin. You get to the notification settings via the following sequence:

Administration > Management > Notification Settings

Configure the outbound plugin with the following parameters:

Parameter Value
Plugin Type Rest Notification Plugin
Instance Name vmwareshim
Endpoint http://webhookshimip:5001/endpoint/vro/workflowId
Username none
Password none (but you must put something in this box)
Content Type application/json
Certificate Thumbprint none
Connection Timeout 20

To validate the settings, press the test button. The test will fail within vROps with the following error displayed:

Screen Shot 2018-07-28 at 14.19.50

However, you will be able to see that integration is working by looking within the photon OS appliance. If you have an SSH session open and you’re attached to the web hook shim container then you should see the following log output:

screen-shot-2018-07-28-at-14-22-00.png

You can also check within vRO, where there should be a an execution of the workflow specified

Screen Shot 2018-07-28 at 14.40.53

Configure notification settings

vROps can be configured to send notifications to multiple targets and use advanced filtering to determine where different alerts should be sent. For the purposes of this blog post I am going to configure a notification rule to send all alerts to vRO via the web hook shim. Configure the rule by navigating to the following page:

Alerts > Alert Settings > Notification Settings

The rule should look like the following image:

Screen Shot 2018-07-28 at 14.50.29

Setup a test alert

The final step is to have an alert triggered. If you’re performing this in an environment with active ESXi hosts / Virtual Machines, then the likelihood is that you will get alerts triggered anyway. If you are just testing this in a lab, configure an alert with symptoms that have such low thresholds that they will inevitably get triggered.

Screen Shot 2018-07-28 at 14.57.10

Once the alert has been triggered you should be able to see successful executions of your workflow in vRO. In my case I’m just logging the alertId but that is all that you need to be able to extract a lot of information – although there will be multiple REST calls required to vROps in order to get that information.

John Dias included some initial steps that could be used to retrieve information about the affected object. The basic flow is as follows:

  • alertID – retrieving the alert from vROps will also return the alertDefinitionId and the resourceId of the affected object
  • resourceId – can be used to retrieve the name of the object, properties of the object, related object, triggered symptoms
  • alertDefinitionId – can be used to find out more prescriptive information about the alert definition itself and can be used to match the symptoms of the alert definition to the triggered symptoms on the resource

Additional Resources

There are a large number of blog posts around this subject, which may provide additional information and/or context:

https://blogs.vmware.com/management/2017/03/webhook-shims-now-available-on-docker-hub.html

https://blogs.vmware.com/management/2017/02/self-healing-datacenter.html

https://blogs.vmware.com/management/2017/01/vrealize-webhooks-infinite-integrations.html?src=so_5703fb3d92c20&cid=70134000001M5td

vROps Supermetric Conditionals and resource entry aliasing

Introduction

Whilst vROps supermetrics are awesome, there are two features that make them even better:

  1. Resource entry aliasing
  2. Creating condition based supermetrics

The first feature can make complex supermetrics easier to create by allowing a metric to be assigned an alias which can be reused later in the supermetric

The second feature allows logic to be applied to supermetric calculation resulting in the ability to create significantly more complex supermetrics.

The example below shows a supermetric using both of these features

${this, metric=cpu|corecount_provisioned} as cpucount>1?${this, metric=cpu|actual.capacity}/cpucount:${this, metric=cpu|actual.capacity}

The cpu|corecount_provisioned metric is assigned an alias of cpucount, and this alias is reused later in the expression. The supermetric also makes use a conditional element to evaluate the number of cores provisioned to the virtual machine and calculating a value based on whether the result is true of false.

The supermetric is determining whether the virtual machine has more than one vCPUs. If it does then a calculation is made for the Mhz per core by dividing the current Mhz size of the virtual machine by the number of cores. If the VM doesn’t have more than one core then the Mhz per core is simply the current Mhz value. This super metric is only given as an example to show the use of aliases and conditional evaluation as the same result could be derived simply by dividing the current Mhz value by the number of cores.

The format of the conditional supermetric is as follows:

Expression to evaluate ? If true : If false

Transformation of the result can also be performed once the expression has been evaluated:

Transformation(Expression to evaluate ? If true : If false)

To better understand the use of conditionals consider the following examples:

1+1 > 1 ? 10 : 20 - This would return 10 as 1 + 1 is greater than 1

A super metric involving a transformative function is as follows:

avg(2+2>3?[10,15,20]:[20,25,30]) - This would return 15 as this is the average of 10, 15 & 20

Supermetrics support a number of different means of evaluation which are similar to those found in most programming languages:

comparator Description
> Greater than
>= Greater than or equal to
<= Less than or equal to
< Less than
== Equal to
!= Not equal to

Examples of the different comparators are as follows:

sum(1+1>=3?[1,2,3]:[4,5,6]) - This would return 15 as 1+1 is not greater than or equal to 3 and the result is the sum of 4+5+6

sum(1+1<=3?[1,2,3]:[4,5,6]) - This would return 6 as 1+1 is less than or equal to 3 and the result is the sum of 1+2+3

count(1+1==3?[3,5,7]:[1,2,3,4,5,6]) - This would return 6 as 1+1 does not equal 3 and the result is the count of the objects

max(1+1!=3?[3,5,7]:[1,2,3,4,5,6]) - This would return 7 as 1+1 does not equal 3 and the result is the maximum value of 3,5 & 7

The conditional supermetrics can be further enhanced  to include AND & OR statements:

Logic Description
&& AND
|| OR
1+1==2&&2+2==4?10:20 - This would return 10 as 1+1 does equal 2 and 2+2 does equal 4

1+1==2||2+2==5?10:20 - This would also return 10 as the expression is true if either 1+1  equals 2 OR 2+2 equals 5.

These can also be combined into ever more complex supermetrics:

1+1==2&&(2+2==4||3+3==7)?25:50 - To return true (and a value of 25), 1+1 must equal 2 AND either 2+2 must equal 4 OR 3+3 must equal 7 - This expression will return true

The examples shown above all use simple integer based expressions to explain them more easily, but the real power is when these are used with actual metrics:

${this, metric=guestfilesystem|capacity_total} as totalcapacity > 200 &&${this, metric=guestfilesystem|percentage_total} as percentused<95?totalcapacity:((totalcapacity/100)*percentused)+20

In this example, the total capacity of the guest file system is evaluated to determine whether it is greater than 200GB. Also the disk usage is evaluated to determine whether it is less than 95%. If both conditions are true then the result is the total capacity but if either is false i.e the disk is less than 200GB in size or the usage is greater than 95% then the returned value is equal to the current usage plus 20GB. The result is that a value for all guests is calculated that gives at least 20GB of free space.

Setting scoreboard widget to not use colours via metricconfig.xml files

Introduction

There are often occasions when it is not desirable to use colour coding for certain metrics on scoreboard dashboards. These are often metrics for which there is no obvious change in state such as the amount of memory configured or amount of provisioned disk space. These metrics rarely change value, but more importantly a change in value does not necessitate a change in health state. For these metrics it can be more useful to not display colours on a scoreboard widget as they can distract end users from important state change data. Additionally, these metrics may have values that have valid reasons for varying between object types, thereby making assigning colour changes very challenging.

Dashboard Overview

The image below shows a dashboard with three widgets:

  1. Object picker
  2. Self providing scoreboard
  3. Scoreboard with the object being derived from the object picker and the metrics from a metricconfig.xml file

Screen Shot 2018-01-30 at 22.26.30

Each scoreboard widget has four metrics with one (co-stop) deriving its state from symptoms, two (ready time and swap wait) from custom colours and the fourth (total capacity) utilising no colours as it makes little sense for this metric to display colours as a) it could vary between hosts and b) is unlikely to ever change.

Defining state and colour changes via the widget

As a reminder, the colours for metrics that are defined directly on the scoreboard widget are configured by selecting the appropriate setting under ‘Color Method’ when editing the widget:

Screen Shot 2018-01-30 at 22.27.06

Defining state and colour changes via a metriconfig.xml file

To achieve the same result via a metricconfig.xml file the following code is used:

Screen Shot 2018-01-30 at 22.27.57

Specifically the following is added to the end of each Metric entry:

showState="false"

 

Using PowervROps cmdlets (part 3) – Adding metrics and properties to objects

Introduction

In the third part in this series of blog posts we are going to use the cmdlets to add metrics and properties to various objects, these metrics and properties can be used in exactly the same way as those from management packs, although you should be aware that metrics age and are not available on dashboards or for use in super metric calculations after a period of time. For dashboards this appears to be roughly four hours and for super metrics this is after 48 hours.

The functions

There are three functions available within PowervROps for adding properties and metrics:

  1. addProperties
  2. addStats
  3. addStatsforResources

The first and second functions enable properties and stats (metrics) to be added to a single vROps object (multiple properties and metrics can be added in a single command). The third command allows metrics to be added to multiple objects in a single command, which is much more efficient in large environments, although there is a limit of 1000 objects in a single execution. There is no comparable API method to add properties to multiple objects via a single command.

How to use

As discussed in part 1 of this blog series, before any cmdlets can be run a means of authentication needs to be configured. As with previous blog posts the examples shown will use token based authentication, although credentials based is perfectly fine. In all examples shown a variable of $token will be used.

All three functions require body content to be submitted along with the command. The functions will accept both JSON and XML (as the vROps API accepts both). For the purposes of this blog JSON will be used throughout. More information can be found about different request examples via the vROps API document (https://vropsfqdn/suite-api/docs/rest/index.html). The functions within PowervROps match the name shown in the documentation so should be easy to locate.

The addProperties and addStats functions also require an objectid to be submitted along with the request. Examples are shown in part 1 of this blog series

Examples

addProperties

In this example we are going to add a property called ‘CloudKindergarten|Demo|DevicePurpose’ with a value of ‘Active Directory Domain Controller’ to the virtual machine ‘vm-a’

import-module c:\users\taguser\documents\github\powervrops\powervrops.psm1
$resthost = 'vrops-01a.cloudkindergarten.local'
$token = acquiretoken -resthost $resthost -username admin -password VMware1! -authsource local
$object = getresources -resthost $resthost -token $token -name 'vm-a' -resourceKind 'VirtualMachine'
$body = @{
 'property-content' = @( @{
 'statKey' = 'CloudKindergarten|Demo|DevicePurpose'
 'timestamps' = @(getTimeSinceEpoch)
 'values' = @('Active Directory Domain Controller')
 'others' = @()
 'otherAttributes' = @{}
 }
 )
}
addProperties -resthost $resthost -token $token -objectid $object.resourcelist.identifier -body ($body | convertto-json -depth 5) -contenttype 'json'

This example shows a JSON body content. If the body was in XML format then the addProperties command would be altered as below:

addProperties -resthost $resthost -token $token -objectid $object.resourcelist.identifier -body ($body | convertto-json -depth 5) -contenttype 'xml'

We can see below that the property is displayed in the All Metrics page and can be selected in the same way as any other property:

Screen Shot 2018-01-30 at 13.22.34

We can also retrieve this property using the getResourceProperties function:

import-module c:\users\taguser\documents\github\powervrops\powervrops.psm1
$resthost = 'vrops-01a.cloudkindergarten.local'
$token = acquiretoken -resthost $resthost -username admin -password VMware1! -authsource local
$object = getresources -resthost $resthost -token $token -name 'vm-a' -resourceKind 'VirtualMachine'
$resourceproperties = (getresourceproperties -resthost $resthost -token $token -objectid $object.resourcelist.identifier).property | where { $_.name -eq 'CloudKindergarten|Demo|DevicePurpose'}
$resourceproperties

This returns the following:

Screen Shot 2018-01-30 at 13.40.48

addStats

The second example shows adding a single metric to a virtual machine. The metric name is ‘CloudKindergarten|Demo|SingleMetricOne and a random value of between 1 and 100  will be added. The code includes an additional unnecessary line that generates the random number, and this variable is used within the JSON body.

import-module c:\users\taguser\documents\github\powervrops\powervrops.psm1
$resthost = 'vrops-01a.cloudkindergarten.local'
$token = acquiretoken -resthost $resthost -username admin -password VMware1! -authsource local
$object = getresources -resthost $resthost -token $token -name 'vm-b' -resourceKind 'VirtualMachine'
$metricvalue = get-random -Minimum 1 -Maximum 100
$body = @{
'stat-content' = @( @{
'statKey' = 'CloudKindergarten|Demo|SingleMetricOne'
'timestamps' = @(getTimeSinceEpoch)
'data' = @($metricvalue)
'others' = @()
'otherAttributes' = @{}
}
)
}
addStats -token $token -resthost $resthost -objectid $object.resourcelist.identifier -body ($body | convertto-json -depth 5) -contenttype json

In the image below the results of running the command many times in a short space are shown.

Screen Shot 2018-01-30 at 15.56.27

When running this command, it is not limited to single metric, nor a single data point. In the example below values for two different metrics are being added for datapoints at five minute intervals for the last hour.

import-module c:\users\taguser\documents\github\powervrops\powervrops.psm1import-module c:\users\taguser\documents\github\powervrops\powervrops.psm1
$resthost = 'vrops-01a.cloudkindergarten.local'
$token = acquiretoken -resthost $resthost -username admin -password VMware1! -authsource local
$object = getresources -resthost $resthost -token $token -name 'vm-c' -resourceKind 'VirtualMachine'
$metricvaluesone = @()
$metricvaluestwo = @()
$timestamps = @()
for ($i = 0;$i -lt 12;$i++) {
$metricvaluesone += (get-random -Minimum 1 -Maximum 100)
$metricvaluestwo += (get-random -Minimum 1 -Maximum 100)
$timestamps += gettimesinceepoch -date (get-date).AddMinutes(($i*-5))}
$body = @{
'stat-content' = @( @{
'statKey' = 'CloudKindergarten|Demo|SingleMetricOne'
'timestamps' = $timestamps
'data' = $metricvaluesone
'others' = @()
'otherAttributes' = @{}
}
@{
'statKey' = 'CloudKindergarten|Demo|SingleMetricTwo'
'timestamps' = $timestamps
'data' = $metricvaluestwo
'others' = @()
'otherAttributes' = @{}
}
)
}
addStats -token $token -resthost $resthost -objectid $object.resourcelist.identifier -body ($body | convertto-json -depth 5) -contenttype json

The results can be seen in the image below

Screen Shot 2018-01-30 at 16.33.39.png

addStatsforResources

In the last example multiple metrics are going to be added to multiple VMs. Specifically the following metrics are going to be added:

  • CloudKindergarten|Demo|MultiMetricOne – with a random value between 1 and 100
  • CloudKindergarten|Demo|MultiMetricTwo – with a random value between 1 and 100
  • CloudKindergarten|Demo|MultiMetricThree – with a random value between 1 and 100

These metrics are going to be added to the following VMs:

  • vm-a
  • vm-b
  • vm-c
  • vm-d
  • vm-e
    import-module c:\users\taguser\documents\github\powervrops\powervrops.psm1
    $resthost = 'vrops-01a.cloudkindergarten.local'
    $token = acquiretoken -resthost $resthost -username admin -password VMware1! -authsource local
    $metrictime = getTimeSinceEpoch
    $allvms = @('vm-a','vm-b','vm-c','vm-d','vm-e')
    $allvmstatcontent = @()
    foreach ($vm in $allvms) {
    $object = getresources -resthost $resthost -token $token -name $vm -resourceKind 'VirtualMachine'
    $statcontent = @()
    $statcontent += (@{statKey='CloudKindergarten|Demo|MultiMetricOne';timestamps=@($metrictime);data=@((get-random -Minimum 1 -Maximum 100));others=@();otherAttributes=@{};}) 
    $statcontent += (@{statKey='CloudKindergarten|Demo|MultiMetricTwo';timestamps=@($metrictime);data=@((get-random -Minimum 1 -Maximum 100));others=@();otherAttributes=@{};})
    $statcontent += (@{statKey='CloudKindergarten|Demo|MultiMetricThree';timestamps=@($metrictime);data=@((get-random -Minimum 1 -Maximum 100));others=@();otherAttributes=@{};})
    $vmstatdetail = @{
     'id' = $object.resourcelist.identifier
     'stat-contents' = $statcontent
    }
    
    $allvmstatcontent += $vmstatdetail
    
    }
    $vmstatcontent = @{
     'resource-stat-content' = $allvmstatcontent
     }
    
    addstatsforresources -resthost $resthost -token $token -body ($vmstatcontent | convertto-json -depth 10)

The outcome of this can be seen via a view in vROps

Screen Shot 2018-01-30 at 17.00.20

As with the addStats function, multiple data points can be added with the addstatsforresources function:

import-module c:\users\taguser\documents\github\powervrops\powervrops.psm1
$resthost = 'vrops-01a.cloudkindergarten.local'
$token = acquiretoken -resthost $resthost -username admin -password VMware1! -authsource local
$allvms = @('vm-a','vm-b','vm-c','vm-d','vm-e')
$allvmstatcontent = @()
foreach ($vm in $allvms) {
$object = getresources -resthost $resthost -token $token -name $vm -resourceKind 'VirtualMachine'
$metricvaluesone = @()
$metricvaluestwo = @()
$metricvaluesthree = @()
$timestamps = @()
for ($i = 0;$i -lt 12;$i++) {
$metricvaluesone += (get-random -Minimum 1 -Maximum 100)
$metricvaluestwo += (get-random -Minimum 1 -Maximum 100)
$metricvaluesthree += (get-random -Minimum 1 -Maximum 100)
$timestamps += gettimesinceepoch -date (get-date).AddMinutes(($i*-5))}
$statcontent = @()
$statcontent += (@{statKey='CloudKindergarten|Demo|MultiMetricOne';timestamps=$timestamps;data=$metricvaluesone;others=@();otherAttributes=@{};})
$statcontent += (@{statKey='CloudKindergarten|Demo|MultiMetricTwo';timestamps=$timestamps;data=$metricvaluestwo;others=@();otherAttributes=@{};})
$statcontent += (@{statKey='CloudKindergarten|Demo|MultiMetricThree';timestamps=$timestamps;data=$metricvaluesthree;others=@();otherAttributes=@{};})
$vmstatdetail = @{ 'id' = $object.resourcelist.identifier 'stat-contents' = $statcontent}
$allvmstatcontent += $vmstatdetail}
$vmstatcontent = @{ 'resource-stat-content' = $allvmstatcontent }
addstatsforresources -resthost $resthost -token $token -body ($vmstatcontent | convertto-json -depth 10)

The results can be seen for the three metrics on three different virtual machines:

Screen Shot 2018-01-30 at 19.39.19

All posts in this series:

Using PowervROps cmdlets (part 1) – Introduction

Using PowervROps cmdlets (part 2) – Retrieving metrics from objects

Using PowervROps cmdlets (part 3) – Adding metrics and properties to objects

Using PowervROps cmdlets (part 4) – Working with objects

Using PowervROps cmdlets (part 5) – Working with relationships

Using PowervROps cmdlets (part 6) – Working with custom groups

Using PowervROps cmdlets (part 7) – Working with supermetrics

Using PowervROps cmdlets (part 8) – Working with reports

Using PowervROps cmdlets (part 9) – Working with alerts

 

Using PowervROps cmdlets (part 2) – Retrieving metrics from objects

Introduction

This is the second in a series of posts about PowervROps and the available functions. This post will focus on the ability to query metric data from objects in vROps. Two functions are available to query the stats held for resources:

  • getLatestStatsofResources
  • getStatsForResources

The former returns just the latest value for the given resource IDs and metrics, whilst the latter allows a large degree of customisation over the time frame of metrics and rollup of data e.g. averages or maximums within specified time frames

getLatestStatsofResources

This is the simpler of the two functions available, but as a consequence only returns the latest data point for each metric. The example below shows the value for the metric ‘mem|guest_demand’ for the VM ‘esxi-04a’

import-module c:\users\taguser\documents\github\powervrops\powervrops.psm1
$resthost = 'vrops-01a.cloudkindergarten.local'
$token = acquiretoken -resthost $resthost -username admin -password VMware1! -authsource local
$object = getresources -resthost $resthost -token $token -resourceKind 'VirtualMachine' -name 'esxi-04a'
$stats = getLatestStatsofResources -resthost $resthost -token $token -objectid $object.resourceList.identifier -statkey 'mem|guest_demand'
$stats.values.'stat-list'.stat

The output of this script is as follows:

Screen Shot 2018-01-30 at 21.38.28

getStatsForResources

The getStatsForResources function can return a vast amount of data and perform transformation of that data in a similar manner available within vROps views. The function requires a body element that contains the query parameters.

The example below queries all virtual machines for a single metric; mem|guest_demand. The data returned in this example returns all data at five minute intervals for the last hour

import-module c:\users\taguser\documents\github\powervrops\powervrops.psm1
$resthost = 'vrops-01a.cloudkindergarten.local'
$token = acquiretoken -resthost $resthost -username admin -password VMware1! -authsource local
$allvms = getresources -resthost $resthost -token $token -resourceKind 'VirtualMachine'
$resourcelist = @()
$metricstoquery = @('mem|guest_demand')
foreach ($vm in $allvms) {
$resourcelist += $vm.resourcelist.identifier
}
$body = @{
'resourceId' = $resourcelist
'statKey' = $metricstoquery
'begin' = (getTimeSinceEpoch -date ((get-date).AddHours(-1)))
'end' = (getTimeSinceEpoch)
'rollUpType' = 'LATEST'
'intervalType' = 'MINUTES'
'intervalQuantifier' = 5
'dt' = $false
'latestMaxSmaples' = 1
} | convertto-json -depth 10
$statsofresources = getStatsForResources -resthost $resthost -token $token -body $body
foreach ($statentry in $statsofresources.values) {
write-host ''
write-host ("Virtual Machine: " + ($allvms.resourceList | where { $_.identifier -eq $statentry.resourceid }).resourcekey.name)
foreach ($stat in $statentry.'stat-list') {
write-host ('Metric: ' + $stat.stat.statkey.key)
for ($i=0;$i -lt $stat.stat.timestamps.count;$i++) {
write-host (((Get-Date '1/1/1970').AddMilliseconds($stat.stat.timestamps[$i])).datetime + ' | ' + $stat.stat.data[$i])
}
}
}

The example above outputs the data to the powershell session, but in practice once the data has been returned then it can be used in any way.

Screen Shot 2018-01-30 at 20.49.20

The data held in vROps can be queried and returned in a number of different ways. The relevant elements to modify are as follows:

'resourceId' = $resourcelist

An array of vROps identifiers

'statKey' = $metricstoquery

An array of metrics to query

'begin' = (getTimeSinceEpoch -date ((get-date).AddDays(-7)))
'end' = (getTimeSinceEpoch)

The start and end times for the queried data. In the example above all data from the current time and for the last seven days will be queried.

'rollUpType' = 'LATEST'

How the data will be transformed. Available options are as follows:

SUM – The sum of all datapoints in each interval period
AVG – The average of the datapoints in each interval period

Screen Shot 2018-01-30 at 21.19.13
MIN – The minimum value of the datapoints in each interval period
MAX – The maximum value of the datapoints in each interval period

Screen Shot 2018-01-30 at 21.20.17
LATEST – The most recent sample of the datapoints in each interval period (see the example for interval type below to better describe the data that will be returned)
COUNT – The number of samples within each interval period

Screen Shot 2018-01-30 at 21.18.24

Take care when using a large time frame with a small interval type and a large number of metrics and resources as it can take a long time for the data to be returned

'intervalType' = 'MINUTES'
'intervalQuantifier' = 5

The time interval between data points. Note that if using any interval period beyond hours then the values returned will have time stamps of the last second of each hour. They will not be for hour points directly preceding the time at which the query is executed. An example of this is shown below:

Screen Shot 2018-01-30 at 21.15.09

In this example, the data point for 21:59:59 is the last data point in the hour between 21:00:00 and 21:59:59 which in this case is actually 9:09:13 – the query was run at approximately 21:14:00. The data point for 20:59:59 is the last value between 20:00:00 and 21:59:59 which was actually at 21:59:13

Using PowervROps cmdlets (part 1) – Introduction

Using PowervROps cmdlets (part 2) – Retrieving metrics from objects

Using PowervROps cmdlets (part 3) – Adding metrics and properties to objects

Using PowervROps cmdlets (part 4) – Working with objects

Using PowervROps cmdlets (part 5) – Working with relationships

Using PowervROps cmdlets (part 6) – Working with custom groups

Using PowervROps cmdlets (part 7) – Working with supermetrics

Using PowervROps cmdlets (part 8) – Working with reports

Using PowervROps cmdlets (part 9) – Working with alerts

Using the PowervROps Cmdlets (Part 1) – Introduction

This is the first in a multi part series on how the cmdlets within the PowervROps module (https://github.com/andydvmware/PowervROps) can be used, and gives specific examples for use.

Loading the module

The first thing that needs to be done prior to using the module is to load it into the current PowerShell session. This can be accomplished by running the following command:

import-module <path-to-module>\powervrops.psm1

To list all of the available commands within the module issue the get-command cmdlet:

get-command -module powervrops

Screen Shot 2017-08-21 at 14.32.07

As with all PowerShell cmdlets the get-help cmdlet can be run to return information about a specific one:

get-help addStats -detailed

Authentication

Before any meaningful commands can be executed against vROps we must have a mechanism to authenticate to it. With PowervROps there are two options; a credentials based method and a token based method. Both authentication methods work with all of the cmdlets and indeed they could be mixed and matched as necessary.

Credentials based authentication

Credentials based authentication allows identical functionality when running in interactive mode in that a credentials object can be generated and then used in the same way as using token based authentication:

$credentials = get-credential -username <username>

The advantage of using credentials based authentication is that (although a security risk), the password used for authentication can be saved to a file and then read in during credentials creation. There is lots of information on this subject and below is a link to a blog post I’ve referenced a number of times:

https://blog.kloud.com.au/2016/04/21/using-saved-credentials-securely-in-powershell-scripts/

With this method scripts can be written and then used with task scheduler in a non-interactive fashion.

Token based authentication

Token based authentication requires that a user authenticates to vROps and in turn receives a token that lasts for 24 hours and can then be used on all subsequent requests. To make the process simpler there is a cmdlet called acquireToken which handles this for you. In order to be able to use the token during the session save it to a variable:

$token = acquiretoken -resthost <fqdn-of-the-vrops-node-or-cluster> -username <username> -password <password> -authsource <authsource>

If we execute this and then query the $token variable we should return something like below:

cac9cdc1-c2b3-487c-a51f-4ccb45e2b246::571da88c-a643-4706-a4d5-2e67d9ab1254

A few things to note about this command:

  • The password if using this command directly will be shown in plain text on screen, although there are methods around this when using the command in a script and prompting the user for their password an example is as follows:
  • $token = acquireToken -resthost $resthost -username $username -password ([Runtime.InteropServices.Marshal]::PtrToStringAuto([Runtime.InteropServices.Marshal]::SecureStringToBSTR((read-host 'Password: ' -assecurestring)))) -authSource $authsource)))
  • The authsource argument can either be local for local vROps accounts or the name of the domain name as configured via the authentication sources section in vROps.

Running our first command

Now that we have an authentication method we can execute our first command. For the purposes of the rest of this post (and the others in this series) I will use variables for things like the authentication token and the rest host that I will be connecting to. There is a cmdlet called getNodeStatus which returns information about the status of your vROps nodes:

getNodeStatus -resthost $resthost -token $token

This will return information similar to the following:

Screen Shot 2017-09-12 at 10.38.19

This command could be used along with a credentials object in a scheduled task to perform daily checks against vROps in a script format. If the variable is set by the command such as $notestatus then $nodestatus.status would quickly and easily allow you to retrieve the status of your vROps nodes without needing to open up a browser

Getting specific property information about an object

As interesting as retrieving the status of your vROps cluster is, that is far from the most interesting thing that can be achieved with PowervROPs. The scenario is to retrieve whether vSphere HA is enabled on a specific cluster. Now this information can be retrieved in many different ways and this is by far the most efficient (that’s what PowerCLI is for….). However, it shows what can be achieved via PowervROps.

There are a number of stages necessary in order to retrieve this information

  1. We need to obtain the ID of the object we want to query (the GUID that you see in the browser window when viewing the object in the vROps UI)
  2. We then need to get the specific property we are interested in from that object

Obtain the resources vROps ID

To get a resource we use the getResources cmdlet. Along with the standard arguments used in all of the cmdlets this has additional arguments for ‘name’, ‘resourceKind’ and ‘objectID’ which can be used to get a specific resource. This cmdlet can also be used to get all cluster resources, or all resources that have a common name element. For now we are going to get the cluster object called ‘Cluster A’ and save it to the variable $cluster

$cluster = getResources -token $token -resthost $resthost -name 'Cluster A' -resourceKind 'ClusterComputeResource'

If we query the cluster object and get its members we can see what a vROps resource query returns:

Screen Shot 2017-09-12 at 10.58.02

To get at the actual cluster object we need to step through the resourceList element:

Screen Shot 2017-09-12 at 10.58.43

We’ll go through the resourceList element in further detail in a future post but for now what we are interested in is the identifier element which is the object ID within vROps. We can get at this ID directly via:

$cluster.resourceList.identifier

Get all properties

Whilst we are specifically interesting in the HA enabled property it is worth knowing how to retrieve all properties as it has the benefit of showing you how the property is named when it is returned via such a query:

$clusterproperties =getResourceProperties -resthost $resthost -token $token -objectid $cluster.resourceList.identifier

If we look at the members of this variable we can see there is an element called property and if we query that then we can see all of the properties for this object

Screen Shot 2017-09-12 at 11.09.42

Get the HA setting property

To get at the specific property via a single command then we need to alter it slightly

$clusterhaproperty = getResourceProperties -resthost $resthost -token $token -objectid $cluster.resourceList.identifier).property | where { $_.name -eq 'configuration|dasConfig|enabled' }

Screen Shot 2017-09-12 at 11.12.38

As you can see knowing what the name of property is comes in handy when querying it directly.

This is the end of the first part in this series, check back soon for additional instalments where we will look at (amongst other things), generating reports, adding properties to an object, creating new objects, creating custom groups and investigating supermetrics.

All posts in this series:

Using PowervROps cmdlets (part 1) – Introduction

Using PowervROps cmdlets (part 2) – Retrieving metrics from objects

Using PowervROps cmdlets (part 3) – Adding metrics and properties to objects

Using PowervROps cmdlets (part 4) – Working with objects

Using PowervROps cmdlets (part 5) – Working with relationships

Using PowervROps cmdlets (part 6) – Working with custom groups

Using PowervROps cmdlets (part 7) – Working with supermetrics

Using PowervROps cmdlets (part 8) – Working with reports

Using PowervROps cmdlets (part 9) – Working with alerts

 

 

Custom Groups vs Custom Datacenters

Recently I have been working with a global organisation to assist with capacity planning and wastage of their virtual estate. This organisation has a significant number of business units and network zones that need to have their capacity planning performed independently, but the clusters within those boundaries are generally managed by the same vCenter server with geographically focussed datacenters. The challenge therefore was how do we manage capacity of our groups of clusters (this customer has hundreds of clusters)? One thing to note is that this customer has a standard (which is then reflected in a vROps policy) to only use an allocation based model of capacity utilisation and they were unable, for a variety of reasons, to move to a demand based model.

In vROps we have two means of grouping objects; custom groups and custom datacenters. Unfortunately neither of these is ideal for our purposes, due to the reasons shown in the table below:

PROS CONS
Custom Group
  • Membership can be dynamically configured based on many metrics, relationships or properties
  • Capacity remaining can be calculated via super metrics that sum the capacity remaining of child objects
  • Time remaining is not available as a metric on the object
  • Custom groups are just that groups as opposed to objects in their own right
Custom Datacenter
  • Allows large datacentres to be configured into logical groupings that more readily reflect the breakdown of an organisation
  • Treated as a full object in its own right meaning that capacity remaining and time remaining are calculated for the custom datacenter based on the child objects
  • Different policies can be configured between custom datacenters and their child objects
  • Membership cannot be natively configured
  • Different policies can be configured between custom datacenters and their child objects
  • Consistency between the values computed for a custom datacenter and its child objects can sometimes be challenging
  • Based entirely on infrastructure so can only be broken down into clusters or hosts. It is not possible to create a custom datacenter based upon virtual machines directly

When discussing these with the customer they asked the obvious question:

Is there a way to determine the time remaining of a group of clusters based on the time remaining of the individual clusters via a super metric?

The simple answer is no, and the more complex answer is still no

Take the following simple example where we have two identically sized clusters and every VM that we deploy is the same size and we deploy a set number of VMs per day (to make this example easier to understand)

  • Cluster A has capacity for 10 VMS and we deploy 2 VMs per day
  • Cluster B has capacity for 20 VMs and we deploy 1 VM per day

Based on the above the time remaining for each of these would be as follows:

  • Cluster A – 5 days
  • Cluster B – 20 days

So what is the time remaining for the clusters as a group? 20 days? 25 days? Something else? The table below shows us how the group of clusters will be filled and therefore how much time remaining there is:

        Day
 0 1 2 3 4 5 6 7 8 9 10
  Cluster A  Capacity Remaining  10 8 6 4 2 0
 Time Remaining  5 4 3 2 1 0
  Cluster B  Capacity Remaining  20 19 18 17 16 15 12 9 6 3 0
 Time Remaining  20 19 18 17 16 15 4 3 2 1 0

This is a a simple example and yet the maths is pretty complex, imagine this with 10 clusters, differing virtual machine sizes, different and changeable deployment patterns and the ability to define a single value via the super metric mechanism is not suitable.

The answer therefore seems to be custom datacenters, as by using these we gain access to vROps inbuilt capacity engine to calculate the time remaining figure for groups of clusters. Custom Datacenters are top-level objects within vROps and every benefit that goes with that.

There are still some challenges that I’ve experienced specifically around consistency of numbers whereby the number of virtual machines remaining in the clusters that make up a custom datacenter don’t necessarily add up to the number of VMs remaining for the custom datacenter itself. This can however be masked by the use of super metrics.

The second and much larger challenge however is that there is no dynamic membership mechanism for custom datacenters. In smaller environment this may not be so much of a problem whereby the membership can be managed manually. However, in environments with hundreds of clusters, then managing this manually via the UI is both impractical and prone to mistakes.

With my current customer we’ve used the vROps API both to create the custom datacenters but also to manage the relationships between the custom datacenters and the underlying clusters. More information on how to use the vROps API can be found at the following links:

Straight up flying with the vrealize operations rest api

vRealize Operations Manager API Guide

In doing so PowervROps was born, a module that allows the use of the vROps API directly from within PowerShell, but that is worthy of a post all to itself:

PowervOps – PowerShell cmdlets for the vROps API

 

PowervROPs – Powershell cmdlets for the vROps API

What started out as the functions that were written for a customer I was working with has become something of an obsession that has resulted in a full-scale PowerShell module that exposes various parts of the vROps API.

The module should not be seen as a replacement for the vROps cmdlets available within PowerCLI, more as a complementary set that were born out of the specific requirements of a customer, but have since evolved as additional functions have been written.

The module is available on GitHub at the following URL:

https://github.com/andydvmware/PowervROps

There are currently 41 functions within the module, and for the most part they accept all parameters as defined by the API.

Screen Shot 2017-08-21 at 14.32.07

Over the coming days/weeks I’m going to write a series of posts detailed some of the ways in which the module has been used, however some of the highlights of the current release include:

  • Ability to generate and then download reports
  • Ability to add metrics and properties to objects
  • Add, set and delete relationships
  • Create new resources (for example custom datacenter objects)
  • Creation and deletion of custom groups
  • Ability to start/stop monitoring of resources
  • Ability to mark/unmark resources as being maintained

Configuring the time frame that vROps uses when calculating capacity and time remaining values

The challenge: Configure vROps so that it will perform capacity based calculations in a way that is less susceptible to historical events.

By default, vROps will use all available data when calculating the amount of capacity and time remaining for an object. In certain situations and scenarios it may be preferable to only use a certain time frame when calculating these values. Examples of such situations could be:

  • Upgrading the hardware in a cluster which increases the available capacity
  • Large scale provisioning or disposal of virtual machines that cause a significant impact on the amount of used capacity

The capacity and time remaining figures are inextricably linked. Each time capacity is calculated vROps will first work out the capacity remaining and from that the time remaining figures. When calculating these values vROps will use all available data and then try to smooth a curve across the full extent of the data range.

Take the following two images of utilisation of CPU. The two charts both represent real world data. In the first the full data range is shown, which shows a number of events that occurred within this cluster; hardware upgrade and a large scale decommissioning of test VMs. The second is the same cluster but the image has been cropped such that only 90 days worth of data is shown.

Full data rangeFull-range-metric
90 day data rangeMetric-90day-Window

The diagrams show quite clearly how it is easier to work out the underlying trend line of the data in the second image compared to the data in the first image.

It should also be noted that the underlying algorithms used are particularly susceptible to inaccuracy when the allocation model of capacity management is used, and the issues seen here do not necessarily apply when using either the demand based model or when the demand and allocation models are used in combination.

vROps also has a policy element called ‘Time Range’ which is set by default to 30 days and as per the description within the policy this time range is only used for non trend analytics and has no bearing on the analytical side of vROps which both capacity and time remaining fall into.Screen Shot 2017-07-24 at 20.00.17

As mentioned earlier, vROps will use all available data when calculating capacity and time remaining but there are a couple of additional gotchas that should be considered:

  1. The data retained may or may not be as per that configured under ‘Time Series Data’ on the ‘Global Settings’ page. vROps will purge data beyond the value configured here but it will not be instantaneous as the removal of data is a relatively expensive operation and so the amount of data retained may be a few days or weeks beyond this value
  2. vROps also rolls up data into coarser grained values but still uses these values when calculating capacity.
  3. When changing the time frame value this will affect time remaining, trend views and projects.

 

To alter the behaviour of vROps so that it uses a specific time frame for capacity calculation a number of steps need to be performed; firstly to alter the time frame and secondly to alter how the rolled up data is used.

DATA TIME FRAME TO USE

Care should be taken when altering files on a vROps node. The file should be backed up prior to making changes.

SSH into every vROps node as the root user

Modify the following file: $ALIVE_BASE/user/conf/analytics/capacity.properties

Change:

historicDataRangeForTrend=-1

To:

historicDataRangeForTrend=90

The number shown here is for 90 days, enter the appropriate number of days that you wish to use

Save the file and exit the editor

Restart all vROps services by issuing the following command as root:

service vmware-vcops restart
Once vROps analytics has finished starting up, trend views and projects would be immediately effected. The time remaining values will only be affected after the next capacity calculation.

ROLLED UP DATA

Depending on the version of vROps that is being used determines how to alter this behaviour. In vROps 6.6 the configuration is now directly exposed in the UI, but in prior versions a configuration change needs to be made on each vROps node

6.6 Modification Change

Alter the global setting ‘Additional Time Series’ from its default of 36 months to 0 months.

Prior to 6.6 Modification Change

Care should be taken when altering files on a vROps node. The file should be backed up prior to making changes.

SSH into every vROps node as the root user

Modify the following file: $ALIVE_BASE/user/conf/fsdb/rollup.properties

Change:

rollupReadEnabled=true

To:

rollupReadEnabled=false

Save the file and exit the editor

Restart all vROps services by issuing the following command as root:

service vmware-vcops restart
I’d like to credit my colleague James Ang who spent many hours working with me on this and providing a lot of very helpful information and insight into the inner workings of vROps capacity analysis engine.