Pulling performance data from XtremIO and pushing it to InfluxDB

If you followed my previous blog posts, you have seen how cool Grafana is and what you can start to do once you have your data in a time series database like InfluxDB. I’ve written a Ruby script that pulls performance data from XtremIO via its REST API and feeds it into InfluxDB. The script only pulls data from the cluster object, but it would be fairly easy to modify it to pull from other objects as well. XtremIO returns the data as JSON, which is easily reshaped and sent on to InfluxDB. You will need to configure your specific settings in the settings.local.yml file.

settings.local.yml

xtremio_user: "user"
xtremio_pass: "password"

influx_url: "http://<hostname>:8086/db/<dbname>/series"
influx_user: "user"
influx_pass: "password"

hosts: [xio01, xio02]
metrics: [compression-factor-text, dedup-ratio-text, rd-bw, wr-bw, rd-latency, wr-latency, avg-latency, rd-iops, wr-iops]

xtrempull.rb

require 'rest-client'
require 'json'
require 'pry-byebug'
require 'yaml'
require 'ostruct'

#pull in the settings file
Settings = OpenStruct.new YAML::load_file(File.join(__dir__, 'settings.local.yml'))

#method for "get" operations against a REST api
def get_url(url, username, password)
  result = RestClient::Request.execute(user: username,
                                       password: password,
                                       url: url,
                                       method: :get,
                                       headers: {:accept => :json},
                                       verify_ssl: false,
                                       timeout: 30,
                                       open_timeout: 30)
  return result
end

#method for "post" operations against a REST api
def puts_influx(in_url, in_user, in_pass, myjson)
  result = RestClient::Request.execute(user: in_user,
                                       password: in_pass,
                                       url: in_url,
                                       method: :post,
                                       headers: {:accept => :json},
                                       verify_ssl: false,
                                       timeout: 30,
                                       payload: myjson,
                                       open_timeout: 30)
  return result
end

#user/pass for XtremIO
username = Settings.xtremio_user
password = Settings.xtremio_pass

#URL,user, and pass for InfluxDB
in_url = Settings.influx_url
in_user = Settings.influx_user
in_pass = Settings.influx_pass

#for each host - determine the cluster name
Settings.hosts.each do |xbrick|

  #### build the URL for the API
  baseurl = "https://#{xbrick}"
  clusterapi = '/api/json/types/clusters'
  xiourl = baseurl + clusterapi

  #### pull list of clusters from each xio brick
  getclusters = get_url(xiourl, username, password)
  clusternames = JSON.parse(getclusters)

  myxio = clusternames["clusters"].reject { |cluster| cluster["name"].nil? }.map { |cluster| cluster["name"] }

  #### pull XtremIO data from each cluster
  myxio.each do |name|
    clusterurl = xiourl + "?name=#{name}"
    getperf = get_url(clusterurl, username, password)
    perfdata = JSON.parse(getperf)

    #### pull the metrics from the JSON
    data = []
    Settings.metrics.each do |metric|
      var = perfdata['content'][metric]
      hash = {}
      hash[:name] = xbrick + "_" + metric
      hash[:columns] = ['value']
      hash[:points] = [[var]]
      data << hash
    end

    #### convert back to JSON and send to influx
    payload = data.to_json
    puts_influx(in_url, in_user, in_pass, payload)

  end
end
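For reference, the script ends up POSTing an array of series entries in InfluxDB 0.8’s write format. Here is a minimal sketch of a single entry, using a made-up host name and metric value:

```ruby
require 'json'

# Build one series entry the way the script does, with placeholder values.
# The series name is "<host>_<metric>".
entry = {
  name:    "xio01_rd-iops",
  columns: ["value"],
  points:  [[12345]]
}

# The script POSTs an array of these entries to
# http://<hostname>:8086/db/<dbname>/series
payload = [entry].to_json
puts payload
```

Each metric becomes its own series in InfluxDB, which is what makes the per-counter graphs in Grafana straightforward later on.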

Creating Grafana dashboards with vCenter performance data–Part Three

If you followed part one and part two of this blog series you should have a functional StatsFeeder and InfluxDB implementation. This post will go into detail on how to create a functional Grafana dashboard. Grafana 2.0 was just launched, so download and install the latest version (2.0.2 at the time of this post). Chris Wahl just published a great blog post on this same subject as well, except he is using PowerCLI to pull the data from vCenter. Be sure to check it out.


Once you start Grafana it will listen on port 3000 by default. Using a browser, pull up the login page. The default credentials are admin / admin.


I found Grafana’s UI not very intuitive at first. Once I got used to it, though, I came to appreciate it. It does a great job of getting out of the way of the data you are trying to present.

The first thing we need to do is configure the data sources. You do this by clicking the Grafana icon in the top left to open the side menu and then click “Data Sources”.


To add a data source, click “Add new” and enter the appropriate information: a friendly name and the appropriate version of InfluxDB. You will want to check the Default box, or you will be changing the data source on every graph you build. The URL will be the hostname of the InfluxDB system on port 8086. The last pieces are the database name in InfluxDB and the user/password for the database.


Now, let’s create a dashboard. Click “Dashboards”, “Home”, and then “New”.


You will now be on a dashboard called “New Dashboard”. Click the settings icon on the right and select “Save As…” to save the dashboard with a better name.


Let’s add a graph. You should see a little green bar on the left side of your dashboard. Click it and select “Add Panel” and “Graph”.


By default it will add a graph with no data. You need to select the title, and then click “edit” to start editing the graph.


You will now be on the Metrics screen of the graph. On the left-hand side it should say “series”. If you recall, each counter that we are pushing to InfluxDB is its own series in the database. I found that the best way to locate the counter I am searching for is to enter a vm name and scroll through the list.


Once you have selected your counter you should have some data showing up in your graph. Depending on which counter you selected you may need to make some adjustments on the select line to get the data presented accurately.


Now it’s time to build something a bit fancier. My original goal when I started playing with Grafana was to build a vm performance dashboard. I have thousands of vms though so it doesn’t make sense to build out a dashboard per vm. Grafana has a feature called Templating that makes life a lot easier. We will need to create some variables and get comfortable with some regex. Click the settings icon, select “Templating”, and then “Add”.


I want to create a variable that allows me to use any vm name. You will need to create a query against InfluxDB and apply some regex to pull up the proper data. Enter the variable name, set the type to query, and ensure the Datasource is correct. In the “Variable values query” field type in “list series /esxprefix.vm.*/”. This is querying all series in InfluxDB and finding anything that starts with “esxprefix.vm”. If you want host data you would use “/esxprefix.esx.*/”. We now need to add some additional regex to extract the vm name from the series name. Using “\..*?\.(.*?)\.” should extract the vm name.
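You can sanity-check that extraction regex outside of Grafana before pasting it in. A quick Ruby sketch, using a made-up series name that follows the esxprefix.vm.&lt;vmname&gt;.&lt;counter&gt; pattern:

```ruby
# A made-up series name in the esxprefix.vm.<vmname>.<counter> format.
series = "esxprefix.vm.web01.cpu.usage.average"

# The regex \..*?\.(.*?)\. captures the text between the second and
# third dots, which is the vm name in this naming scheme.
vm = series[/\..*?\.(.*?)\./, 1]
puts vm   # => web01
```

If your series names have a different number of leading segments, you will need to adjust the regex accordingly.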


You will now have a dropdown on the dashboard with every vm as a selectable option. Edit your graph and add the variable into the query to make sure the graph pulls up the right data for the vm that you select.


Templating is a very useful feature. Keep in mind that you can use variables that you create as part of other variables.


With a little bit of work, Grafana can be a very powerful tool for presenting your data.

Creating Grafana dashboards with vCenter performance data – Part Two

If you followed part one of this blog series you should have StatsFeeder with the GraphiteReceiver plugin installed. Now we need to install InfluxDB and Grafana. Before we begin with the InfluxDB install we need to make a couple more modifications to the StatsFeeder config file that you will be using.


In the <receiver> section of the xml file, locate the graphite receiver and change the property “host”. Set it to the hostname of the system that will be running InfluxDB. I’ll use “localhost” since I am installing everything on a lab system.


In the same section, find and change the property “only_one_sample_x_period” and set it to “false”.


In the <entities> section, locate the graphite receiver and add “VirtualMachine” to the list of child types. This will ensure that we grab data from both the hosts and the virtual machines.
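Put together, the three changes above look roughly like the fragment below. This is an illustrative sketch only; the exact element layout varies by StatsFeeder version, so follow the structure of your actual config file. Only the host value, the only_one_sample_x_period property, and the VirtualMachine child type come from the steps above.

```xml
<!-- Illustrative fragment; match it against your real sampleConfig.xml -->
<receiver>
  <name>graphite</name>
  <properties>
    <property>
      <name>host</name>
      <value>localhost</value>  <!-- hostname of your InfluxDB system -->
    </property>
    <property>
      <name>only_one_sample_x_period</name>
      <value>false</value>
    </property>
  </properties>
</receiver>

<entities>
  <entity>
    <childType>HostSystem</childType>
    <childType>VirtualMachine</childType>  <!-- added so vm stats are collected too -->
  </entity>
</entities>
```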


StatsFeeder should be ready to go now so let’s move on to InfluxDB. Grab and install the latest version of InfluxDB. At the time of this blog post it was the 0.8 version. The 0.9 release is imminent though and I haven’t tested against it yet.


InfluxDB will install to /opt/influxdb. The first thing we will need to do is modify the InfluxDB configuration file to accept graphite input. The config file is /opt/influxdb/shared/config.toml. Find the [input_plugins] section of the config file and enable the plugin, specify the port, and enter the name of the database that you will use for the data. It doesn’t matter that the database doesn’t exist yet.
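On InfluxDB 0.8 the graphite input lives in the [input_plugins] section. A sketch of what the enabled plugin looks like, assuming port 2003 and a database named vcenter_perf (both placeholders; use the port and database name you actually plan to create):

```toml
# /opt/influxdb/shared/config.toml (fragment)
[input_plugins]

  [input_plugins.graphite]
  enabled = true
  port = 2003                  # port StatsFeeder's graphite receiver sends to
  database = "vcenter_perf"    # created in the admin UI in the next step
```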


Start InfluxDB and using a browser open the InfluxDB administration page. InfluxDB listens on port 8083 by default. Don’t forget to modify or disable iptables if needed.


The default credentials are root / root. Enter those and hit “Connect”.


Create the database that you entered into the configuration file. You just need to enter the name and click “Create Database”.


InfluxDB should now be ready. Time to feed some data into it. Change directory to where you installed StatsFeeder and run the StatsFeeder.sh script. The script takes four parameters: -h "vcenter host", -u "user", -p "password", and -c "config file". For example: ./StatsFeeder.sh -h vcenter01 -u admin -p password -c config/graphiteConfig.xml (the hostname and config file name here are placeholders). You should see output after a few seconds.


Let’s verify that the data is being fed to InfluxDB correctly. Using a browser go back to the InfluxDB administration page. Click “Explore Data” to the right of the database you created.


You should now be at a query interface for your database. Type in “list series” and click “Execute Query”. You should see a fairly long list of results. In a really large environment it can take a long time to show the results. Each of these series in InfluxDB is a specific performance counter for either your hosts or virtual machines.
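The admin UI issues these queries against InfluxDB 0.8’s HTTP API, so you can also build the same request yourself and fetch it from a script or with curl. A small Ruby sketch of the URL construction (host, database, and credentials are placeholders):

```ruby
require 'uri'

# InfluxDB 0.8 exposes queries at /db/<name>/series with u/p/q parameters.
url = URI("http://localhost:8086/db/vcenter_perf/series")
url.query = URI.encode_www_form(u: "root", p: "root", q: "list series")

puts url
# Fetch this URL with curl or Net::HTTP to get the series list back as JSON.
```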


All done for now. In part three I will go over installing Grafana and making some dashboards.

Creating Grafana dashboards with vCenter performance data – Part One

Lately, I’ve been playing around with pulling data from different parts of my infrastructure and dumping the data into a time series database. This post will start to go into details on how you can pull data from vCenter, dump it into InfluxDB, and use Grafana to make some pretty cool dashboards. Here is an example of a dashboard I have built to pique your interest. At the end of this blog series you will be able to build this and other dashboards.


The instructions are written with a CentOS 6.6 system in mind, so your mileage may vary if you are using another Linux distribution. The first thing we will need to install is StatsFeeder. StatsFeeder is a VMware Fling from a couple of years back. Its main purpose is to collect performance metrics in a scalable manner. It’s a bit dated, but it seems to work for most things in vSphere 5.5. You can find more information at https://labs.vmware.com/flings/statsfeeder.

StatsFeeder uses pluggable modules or “receivers” as a means to send data to different repositories. It supports CSV and Perfmon file formats and it also has receivers for OpenTSDB and KairosDB. For my purposes though I really wanted to use InfluxDB for reasons that I won’t go into in this post. Luckily someone has created a receiver for Graphite and InfluxDB can be configured to accept data intended for Graphite.

Let’s get to installing the required packages. Download StatsFeeder and unzip in your preferred directory. I will be using /opt/statsfeeder.


StatsFeeder requires Java. I’m installing the full JDK because we will need it later.


Next, we need to grab and install the GraphiteReceiver for StatsFeeder. The best way to do this is to follow the directions listed at the GitHub repository https://github.com/SYNAXON/GraphiteReceiver. The installation will also require that you have Git and Maven installed.


Using Git, clone the GraphiteReceiver repository.


Maven needs to be installed next, but it isn’t in the default CentOS repositories. You will need to add another repository and then use yum to install Maven.


Now we need to prepare for building the GraphiteReceiver jar file. Assuming you also used /opt/statsfeeder the commands below should work for you.

cd GraphiteReceiver/
mkdir -p ~/.m2/repository/com/vmware/tools/statsfeeder-common/4.1
mkdir -p ~/.m2/repository/com/vmware/tools/statsfeeder-core/4.1
cp /opt/statsfeeder/lib/statsfeeder-common-4.1.jar ~/.m2/repository/com/vmware/tools/statsfeeder-common/4.1
cp /opt/statsfeeder/lib/statsfeeder-core-4.1.jar ~/.m2/repository/com/vmware/tools/statsfeeder-core/4.1

Use Maven to build the GraphiteReceiver jar file (typically mvn package from the GraphiteReceiver directory; check the repository’s README for the exact command).


Copy the new jar file to the StatsFeeder lib directory.


Copy the sample configuration file to the StatsFeeder config directory. Don’t overwrite the existing sampleConfig.xml file though.


Now let’s validate StatsFeeder functionality with the default configuration file first, since we haven’t set up InfluxDB yet. The default configuration should pull performance data and dump it into a CSV file. Change directory to the location where you installed StatsFeeder and run the StatsFeeder.sh script. The script requires you to pass the vCenter name, user, password, and config file as arguments.

Let it run for a minute and you should see an output.csv file in the directory.


All done for Part One…

NSX Issue With Service Objects

We recently discovered an issue with how NSX currently manages service objects. When we add a new application behind the NSX distributed firewall, we typically create a service. That service gets added to a service group, which then allows us to apply firewall rules against the service. The problem occurs when you define a service that combines port ranges and comma-separated ports: the service isn’t parsed correctly. Take a look at the three test services I have created below.


Now I will apply a firewall rule against a single VM that blocks the Test1 service.

Next we need to SSH into the ESXi host that this VM is running on to verify that the rule was applied correctly. First we need to get the name of the NIC that is attached to the VM. You can do this with the summarize-dvfilter command.


Now that we have the name of the NIC, we can check out the rules. You can do this with the vsipioctl command (vsipioctl getrules -f &lt;filter name&gt;).


Looking at the top line you can see it is correctly blocking ports 1,2,3, and 4. Let’s try this with the Test2 service.


As you can see the rule looks good and is correctly blocking ports 1-4. Let’s see what the Test3 service does.


Yikes! It completely ignored the dash and did not apply the rules as expected! If you are using service objects and specifying multiple ports, be sure to use only commas or only dashes. Never mix both on the same line or you will get very unpredictable results. This is expected to be fixed in an upcoming release.
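To make it concrete what “parsed correctly” should mean, here is a hypothetical Ruby sketch of the intended expansion. This is purely an illustration of the expected behavior, not NSX’s actual parsing code:

```ruby
# Expand a port specification such as "1-2,3-4" into individual ports.
# Hypothetical illustration only; NSX's real parser is not public.
def expand_ports(spec)
  spec.split(",").flat_map do |part|
    if part.include?("-")
      low, high = part.split("-").map(&:to_i)
      (low..high).to_a
    else
      [part.to_i]
    end
  end
end

expand_ports("1,2,3,4")   # => [1, 2, 3, 4]
expand_ports("1-4")       # => [1, 2, 3, 4]
expand_ports("1-2,3-4")   # => [1, 2, 3, 4]  (the mixed case NSX mishandled)
```

All three specs should expand to the same port list, which is exactly what the Test3 rule failed to do.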

VCNS to NSX Upgrade Issue

Last week we ran into an unexpected issue after upgrading from VCNS to NSX. In my environment we utilize the distributed firewall heavily. We create security groups for our vms, and those security groups are also nested inside other security groups to apply bundles of firewall rules that we want applied to all vms. Those bundles include things like DNS, AD, NTP, etc.

When creating security groups and other objects in VCNS, they are created with a scope of the Datacenter object in vCenter. However, when creating an object in NSX, the default scope is Global. You currently cannot change this scope in the UI, but you can still create new objects via the API with a scope of Datacenter.

With NSX the scopes are hierarchical: Global is the top level, and you cannot nest a security group with a scope of Global inside a security group with a lesser scope such as Datacenter. This became a large problem for us, since we use nesting to apply bundles of firewall rules. There are instances where we still need to create objects manually via the UI, and we couldn’t nest them into the objects already built.

VMware assisted us by doing a direct database update to reset all of our objects to Global. I’m sure they will have other options in a future version, but for now you will need to engage support to work through this issue if it impacts your environment. If you still need to create security groups with a scope other than Global, see my previous post https://virtuallygone.wordpress.com/2014/07/11/scripting-nsx-security-groups/.

NSX Manager CLI Privileged Mode Password Reset

As part of the gotcha that I mentioned in my last post, I needed to get into the NSX Manager CLI. Normally this isn’t a problem. Unfortunately, when the password was changed on this specific manager, it appears a typo was entered and I couldn’t gain access. Below is a step-by-step recovery procedure. You will need a separate Linux VM to perform these steps.

Warning: I’m pretty sure this isn’t supported. So use at your own risk.

Shut down your NSX Manager appliance and attach the VMDK of the NSX Manager to your linux VM as an extra disk. If for some reason you have a snapshot of your NSX Manager make certain that you select the latest snapshot VMDK.


You will need to reboot your linux VM or force it to rescan the SCSI devices. Google will be your friend for the latter option.

You need to mount the sixth partition of the NSX Manager disk. First you need to determine the correct disk device. On my system it is /dev/sdb.

Create a directory for the mount point. #mkdir /mnt/nsx

Mount the filesystem. #mount /dev/sdb6 /mnt/nsx

Change directory. #cd /mnt/nsx/configs/cli/etc

Make a backup of the password file. I’m not sure what good this will do you since this is the reason you are here…but it’s still a good practice. #cp passwd passwd.orig

Using your favorite editor (vi), modify the password file. #vi passwd

Replace the admin line with the following text. admin:$1$eulbenal$WYxc8kepSLzXFP0KjuZRp1:0:0:admin:/common/configs/cli/:
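If you would rather not reset the password to “default”, you can generate your own MD5-crypt hash and substitute it into that line. One way to do this, assuming openssl is available on your Linux VM (the salt value is arbitrary):

```shell
# Produces a $1$ (MD5-crypt) hash suitable for the passwd file.
openssl passwd -1 -salt eulbenal yournewpassword
```

Take the resulting $1$… string and use it as the second field of the admin line in place of the hash shown above.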


Save the file and shutdown your linux VM. Detach the NSX Manager from the linux VM and power on the NSX Manager.

If you followed these procedures properly, the privileged mode password should now be reset to “default”. This will not affect the admin password in the web UI. Don’t forget to change the privileged mode password again.