VMware VSA Deepdive – Part 2 – WSCLI

Previously in the series, we got the homelab version of VMware VSA up and running.

You may not know it yet, but it’s pretty important to become BEST FRIENDS with the WSCLI tool set – it’s the only way to interact with the VSA systems from the command line and see what is actually happening behind the scenes. The VSA Manager plugin mostly gives you a heads-up; the rest happens with this toolset.

There are occasional mutters about WSCLI on the internet if you search. I’d like this to be the definitive guide to it from a practical perspective.

WHAT IT DOES

WSCLI is a Java .jar file that comes installed with the VSA Manager plugin. You’ll find it in the install directory under the \TOOLS subfolder. It comes with a simple .BAT file that finds the JRE’s current location and calls it with the -jar parameter. A bonus I have found: this being Java, the tool is essentially cross-platform, so you can copy the JAR to other systems and run it from wherever. This is critical to know about later on, in my opinion.
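For example, once you’ve copied it over to a Linux box, the invocation looks something like this (I’m assuming the jar is named wscli.jar here – check what’s actually sitting in \TOOLS, since the name may vary by build):

java -jar wscli.jar <vsa node IP> help

The .BAT wrapper does the same thing on Windows; it just resolves the JRE path for you first.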

WSCLI gives you the ability to do everything available in the VSA Manager vSphere plugin, except upgrade from 5.1 to 5.5.
WSCLI output is far more verbose and can actually be useful in ways that are not obvious. When VSA Manager fails to do what you want it to (and it will), there is almost always a KB article with a how-to guide on fixing it with WSCLI. Examples such as replacing a failed cluster node or pulling specific replication data or status are must-haves for a VSA administrator.

SEND HELP

The first command argument for WSCLI you need to know is help. While it sounds obvious as to why, this provides a wealth of information that becomes really interesting later. Keep the items emphasized below in the back of your mind; you’ll see why later.
Let’s look at the help info for shutdownSvaServer as an example of the documentation.
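Pulling this up yourself follows the same wscli <IP> <command> pattern you’ll see throughout this post:

wscli <vsa node IP> help shutdownSvaServer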

Command usage:
shutdownSvaServer [<boolean:maintenance-mode=false>]
Parameters:
maintenance - whether the node should enter maintenance mode on restart, defaults to false
Prepares the node for a shutdown.
This command does not actually perform a shutdown of this node's VSA service or of its virtual machine. Instead, any test coverage data being collected is saved, all open files are synced, whether the node should enter maintenance mode on restart is persisted, and the ZooKeeper configuration is updated if the node will be entering maintenance mode, but the actual shutdown is not performed. The caller should arrange to power off the node on receiving the SvaShutdownEvent. Calls to this method are not counted when determining whether this node should be placed in maintenance mode because it appears to be in a continuous cycle of reboots.
This operation sends an SvaEnterMaintenanceModeEvent management event when the node is marked for entering maintenance mode, and a SvaShutdownEvent when the VSA service is ready to shut down.

Pretty verbose and unusually direct on the process, if you ask me! But notice the catch – despite the name, it just stops processes. You have to power off the thing yourself.
Not a problem, right? Why wouldn’t I be able to power off a VM?

Wait, what? Y U NO POWER OFF?!!1

Once the VSA VM is created, its object in the vCenter database is modified extensively, and most methods are disabled entirely. So even good old Administrator@vsphere.local can’t do anything about this. This is the biggest obstacle to automation and management of the system, since the VM has to be powered off before you can put the host into Maintenance Mode.
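If you want to see the lockout for yourself, the vSphere API exposes the list of disabled operations on each VM object. A quick PowerCLI check – VSA-0 is just a hypothetical VM name, so substitute your own:

# List the methods vCenter has disabled on a VSA node VM
(Get-VM "VSA-0").ExtensionData.DisabledMethod

You should find PowerOffVM_Task somewhere in that list, which is exactly why the Power Off option is grayed out no matter who you log in as.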

And before you ask – VUM’s option to power off the VSA VM doesn’t fly here either!

There are a number of ways to tackle this which will be discussed later.

YOU MIGHT WANT TO JUST LISTEN

The second command argument for WSCLI you need to know is startListener. This launches a daemon that shows events as they happen, such as a cluster takeover, replication progress (updated every minute or so), and entering/exiting maintenance mode. VSA Manager, assuming it shows anything useful at all, only updates every 30-40 seconds in the vSphere Client.

So, to build on the shutdownSvaServer command above, you could open a second command window and run wscli <cluster IP> startListener to see what’s happening within the cluster in real time. Then run wscli <vsa node IP> shutdownSvaServer and watch the SvaShutdownEvent fire in the first window within a second or two. You can reboot the node and watch the carnage in your listener! You’ll see everything from the failover to the reconnection, the sync, and completion.
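Concretely, with made-up addresses (10.0.0.100 for the cluster management IP, 10.0.0.11 for the node):

REM window 1 – watch the cluster
wscli 10.0.0.100 startListener

REM window 2 – prep the node, entering maintenance mode on restart
wscli 10.0.0.11 shutdownSvaServer true

The true argument is the optional maintenance-mode boolean from the help text above; leave it off and it defaults to false, per the documentation.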

THE COMMAND STRUCTURE

The commands are divided into 3 categories: SVA (Storage Virtual Appliance), SAS (Shared Access Storage), and VCS (VSA Cluster Service).

SVA commands are specific to the individual storage node and not the cluster in general.  Pings, version checks, and entering/exiting maintenance mode are the main uses in this context.

SAS commands are specific to the cluster and are the most commonly used set.  Getting all of the UUID values, parameters, and querying sync state are the main uses in this context.

VCS commands are specific to the standalone cluster service.  In a two-node cluster, quorum is handled by this service, which is installed on the vCenter Server when VSA Manager is installed. You can download the cluster service for Windows or even as a Linux package, and put it on a separate machine entirely if you want.  It is fairly rare to use any commands here except to query status and maybe restart the service if the replicas aren’t coming up in a failover scenario.
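I won’t pretend to have the full command list memorized – running the help argument with no command name should dump the whole catalog, and from there the SVA/SAS/VCS groupings are easy to pick out:

wscli <vsa node IP> help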

WHAT IT ISN’T

The HELP documentation of the WSCLI package is pretty good, even if you have to do a good bit of copy/pasting from the command window to get anything useful.

But the verbosity of it, such as showing absolutely everything as UUIDs, is infuriating at times.  Expect to be filtering all data based on the management IPs.  This isn’t PowerCLI-grade goodness, unfortunately.
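My usual crutch is piping whatever comes back through findstr with a node’s management IP, so I only see the lines I care about. Something like this, where <query command> is a stand-in for whichever SAS query you’re running:

wscli <cluster IP> <query command> | findstr "<node management IP>"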

(MORE) GENERAL GRIPES

No real monitoring capability. In my experience, there are no proactive warnings that something bad has happened, other than a few stock alarms in vCenter that fire and then don’t clear themselves when the condition changes. It’s the boy who cried wolf, and you tend to ignore most of the issues as time goes on. But when things break, they break real good. And then you have to hope you don’t screw up typing the specific WSCLI commands needed to save the day.

Protip to VMware – let the admins decide who can/cannot power off their VSA VMs. It’s what roles were designed for!
(As an aside, I secretly love to give my TAM grief about this product.)

Because this is a software RAID over Ethernet solution, one thing to keep in mind is that the health of the underlying hardware is far more critical. Every major problem I have had with the product has come down to bad hardware – primarily the disks themselves, or the controller and its stripes. Make sure you have that under control and address issues right away, or you’ll pay for it.

Upgrades require complete cluster outages. This is the really unacceptable one from my perspective – but I suppose the product was geared more toward SMB shops that close at some point and can take the downtime. I’m in one of the biggest shops out there, and it simply doesn’t work that way.

But in quandaries like this, you have to get creative and dig deep. So that’s what I did.

In the next few posts, less theory and complaints – more workflows and scripts, surprising finds and interesting workarounds to manage VSA at hundreds of sites.


Introducing: My SDDC 2015

First things first. When you’re in this line of work, especially at the engineer level or higher, you have to have a lab. It’s pretty easy to get one going, especially if you have a garage where you can put your baremetal boxes, your own Catalyst switches, and all that fun stuff.

Living in a 500 square foot Seattle condo poses some interesting challenges when it comes to a lab. I have a spouse, 3 cats, and like many people, a gaming habit. Given my space constraints, I had to go small and practical.

So, here’s what I roll–what I call SDDC 2015. I used this for preparing for my VCAP5-DCA exam and it worked just fine.


The Case

The case.

I don’t think I could be much happier with a case than this one, the Lian-Li PC-V359WX. For $170 at the time of purchase, it seemed steep, but I wouldn’t change anything. It’s hard to find now, but I can’t really find anything to fault on it. It’s a two-story box, with Plexiglass on top letting you peek inside to see your motherboard and all the goods. You can even go SLI/CrossFire if you have adequate power, but I haven’t needed that.  Yet.

This case fit the dimensions of my current TV stand perfectly (although a bit tight!) and the reviews were glowing. I have been a fan of Lian Li cases for a while now, but never bothered to really look at them because of cost. This time, it was spare no expense!


The Build

Intel Xeon E3-1230v3

The brains of the outfit – an Intel Xeon E3-1230v3.

I wanted a real CPU to handle this lab – again, I had to pack as much punch as I could in one chassis without blowing my budget. This CPU is fantastic from a price/performance perspective. Four cores and eight threads, VT-d, all the bells and whistles, and it runs quiet and cool.

The ASRock Z97M.

The motherboard was also important in terms of feature set. I didn’t want a piece-of-junk Realtek NIC running my VMs, much less on my network. But I’ve got a size constraint, and that narrowed the list down to just a few candidates; I decided to go with the ASRock Z97M. This one is great – it has HCL-supported Intel NICs, so for nested virtualization purposes there’s less drama with building custom ISOs. Add in 32GB of DDR3-1600 memory from Crucial, and that’s plenty of room to run things!



The Drives

SSD – Crucial MX100 512GB, another coming soon!
HDD – 2x Hitachi 2TB 64MB Cache – RAID1

The drives are really the biggest item on the list in terms of quality of life. SSDs make a lab so much more bearable to work with, it’s crazy – don’t skimp here is my recommendation. You just thin provision on the SSD for speed and you’re set. The Hitachi HDDs are just there because I had them from my old build; the other 5 went into my home-brew NAS, which performs quite a bit of extra duty.


The Software

When I call this “SDDC 2015” it’s really more of a joke – this lab is Windows 7 Professional running Workstation 9, which I got for free after passing my VCP5. A few edits to the Virtual Networks later, and I now have nested ESXi 5.1, 5.5, and soon 6.0 hosts, plus some nested VSAN instances, VCO, and the usual AD and DNS servers.

And when I’m done, my wife knows how to start up Netflix and Plex. This is more important than you realize!


The NAS

Such compact, wow

It’s really not a lab until you have something that can serve up blocks!

Once again, I went with a Lian Li case in Mini-ITX. This guy is the Lian-Li PC-Q25B, and it is amazing. The motherboard is an ASUS H87 with an Intel G3220 CPU. The setup is really overkill for what it does, but I have put it to work as a media server, and my wife makes sure it’s always working!

The OS is FreeNAS 9.1.2 installed to a USB stick, and I have 5 Hitachi 2TB 64MB-cache disks in there as a RAID-Z2 (essentially RAID-6 – two drives’ worth of parity leaves roughly 6TB usable). So far, it works pretty well and reliably.



I’m not sure I will refresh this lab any time soon, so it’s got to last. Maybe if I get a garage I can get myself a used ASA and a stack of 3750X switches.

But, until then – I’ll fire up Crypt of the NecroDancer. I love this lab!