Tag Archives: VMware

vSAN 7.0U1 – A New Way to Add Capacity

As we all know, there are a number of ways of scaling capacity in a vSAN environment. You can add disks to existing hosts and scale the storage independently of compute, or you can add nodes to the cluster and scale both storage and compute together. But what if you have no free disk slots available and/or you are unable to add more nodes to the existing cluster? Well, vSAN 7.0U1 comes with a new feature called vSAN HCI Mesh. So what does this mean, and how does it work?

Let’s take the scenario below. We have two vSAN clusters in the same vCenter. Cluster A is nearing capacity from a storage perspective but its compute is relatively underutilised, and there are no available disk slots to expand the storage. Cluster B, on the other hand, has a lot of free storage capacity but is more heavily utilised on the compute side of things:

vSAN HCI Mesh allows you to consume storage on a remote vSAN cluster, provided it exists within the same vCenter inventory. There are no special hardware or software requirements (apart from 7.0U1), and the traffic leverages the existing vSAN network configuration.

This cool feature adds an elastic capability to vSAN clusters, especially if you need some additional temporary capacity for application refactoring or a service upgrade, where you want to deploy the new services but keep the old ones operational until the transition is made.

VMware has not left out monitoring for this either: in the UI you can monitor “Remote VM” usage from a capacity perspective as well as within the performance service.

So this clearly allows disaggregation of storage and compute in a vSAN environment and offers flexibility and elasticity of storage consumption. Are there any limitations?

  • A vSAN cluster can only mount up to 5 remote vSAN Datastores
  • The vSAN Cluster must be able to access the other vSAN cluster(s) via the vSAN Network
  • vSphere and vCenter must be running 7.0U1 or later
  • Enterprise or Enterprise Plus edition of vSAN is required
  • Enough hosts / configuration to support the storage policy. For example, if your remote cluster has only four hosts, you cannot use a policy which requires RAID6
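
To make the limits above a little more concrete, here is a minimal sketch of how you might sanity check a planned remote mount on paper. It is not a VMware API: the `Cluster` class, the `mount_problems` helper and the policy names are hypothetical, the minimum host counts are the usual vSAN minimums for those layouts, and network reachability and licensing are not checked at all.

```python
from dataclasses import dataclass

# Minimum host counts for common vSAN policy layouts (illustrative subset).
MIN_HOSTS = {"RAID1_FTT1": 3, "RAID5": 4, "RAID6": 6}

MAX_REMOTE_DATASTORES = 5   # limit quoted in the list above
MIN_VERSION = (7, 0, 1)     # 7.0U1 written as a comparable tuple


@dataclass
class Cluster:
    """Hypothetical description of a vSAN cluster, for illustration only."""
    name: str
    host_count: int
    version: tuple           # e.g. (7, 0, 1) for 7.0U1
    remote_mounts: int = 0   # remote datastores already mounted


def mount_problems(client: Cluster, server: Cluster, policy: str) -> list:
    """Return reasons the mount breaks a listed limit; an empty list means none found."""
    problems = []
    if client.remote_mounts >= MAX_REMOTE_DATASTORES:
        problems.append(f"{client.name} already mounts {client.remote_mounts} remote datastores")
    for cluster in (client, server):
        if cluster.version < MIN_VERSION:
            problems.append(f"{cluster.name} is older than 7.0U1")
    if server.host_count < MIN_HOSTS[policy]:
        problems.append(
            f"{server.name} has {server.host_count} hosts, {policy} needs {MIN_HOSTS[policy]}"
        )
    return problems


cluster_a = Cluster("Cluster A", host_count=8, version=(7, 0, 1))
cluster_b = Cluster("Cluster B", host_count=4, version=(7, 0, 1))
print(mount_problems(cluster_a, cluster_b, "RAID6"))
```

With a four host remote cluster the RAID6 request is flagged, while a RAID5 policy on the same cluster passes these particular checks.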

So this is a pretty cool feature, and it sort of eliminates the need for storage-only vSAN nodes, which were discussed in the past at many VMworlds.

How Intel Optane and NVMe help with node consolidation

As core density increases on CPUs, it opens up the opportunity to reduce the number of nodes required in any given cluster. In a vSAN cluster, however, node consolidation has a negative effect on available IOPS. Each node provides a specific amount of IOPS, so lowering the number of hosts in the cluster removes the IOPS capability of the nodes you consolidate away. Take the following for example:

Number of VMs : 200
vCPU Per VM : 4
Virtual Memory per VM : 32GB
Storage per VM : 600GB
vCPU to Core Ratio : 4 to 1
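
As a rough back-of-the-envelope check, the workload figures above can be turned into raw totals like this. This is only a sketch of the basic arithmetic: the variable names are mine, and the real sizing tool also accounts for things like vSAN overhead, storage policy and slack space, so its node counts will not fall straight out of these numbers.

```python
import math

# Workload inputs taken from the list above.
vm_count = 200
vcpu_per_vm = 4
memory_gb_per_vm = 32
storage_gb_per_vm = 600
vcpu_per_core = 4  # 4 to 1 vCPU to core ratio

# Raw totals before any overheads or storage policy multipliers.
total_vcpu = vm_count * vcpu_per_vm                          # 800 vCPU
cores_needed = math.ceil(total_vcpu / vcpu_per_core)         # 200 physical cores
total_memory_gb = vm_count * memory_gb_per_vm                # 6400 GB of RAM
total_storage_tb = vm_count * storage_gb_per_vm / 1000       # 120.0 TB usable

print(f"vCPU: {total_vcpu}, cores: {cores_needed}")
print(f"memory: {total_memory_gb} GB, storage: {total_storage_tb} TB")
```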

Now, for the purpose of this sizing exercise, I am going to use the vSAN Sizing tool and apply some cluster settings as per below:

So in the above scenario the number of cores per CPU is 18, and I want to ensure that this is a two disk group configuration. If we then input the workload details as per below:

You will see when we click on recommendation that it shows a required node count of 8 (not taking into account any N+1 capability, as we left that as 0 for the purpose of this sizing).

And we can see the disk config below:

However, if we increase the number of CPU cores to 20 by clicking on the “+” in the sizing output, we can see that it changes the number of nodes:

And if we increase the number of cores again to 22, we get a further reduction in the number of nodes, to 6:

The sizing tool will dynamically increase or decrease the number of disks required per host as well as the RAM per node that is required as you can see here:

But one thing we have not factored in here is the decrease in IOPS capability that comes from removing two nodes. If, say, each node was capable of 80K IOPS, reducing the node count by two means you have just lost 160K IOPS of capability. So what can we do to mitigate that?
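
The IOPS arithmetic here is simple enough to sketch. The 80K per node figure is just the example number from the paragraph above, and `cluster_iops` is a hypothetical helper that assumes IOPS scale linearly with node count, which is a simplification.

```python
import math

IOPS_PER_NODE = 80_000  # example figure from the text, not a measured value


def cluster_iops(nodes: int, iops_per_node: int = IOPS_PER_NODE) -> int:
    """Aggregate IOPS, assuming every node contributes the same amount."""
    return nodes * iops_per_node


before = cluster_iops(8)   # original 8 node sizing
after = cluster_iops(6)    # consolidated 6 node sizing
lost = before - after      # capability removed with the two nodes

# What each of the 6 remaining nodes would have to deliver to match the 8 node cluster.
needed_per_node = math.ceil(before / 6)

print(f"before: {before}, after: {after}, lost: {lost}")
print(f"each remaining node needs about {needed_per_node} IOPS")
```

In other words, each of the six remaining nodes has to be roughly a third faster than before to keep the cluster where it was.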

Well, instead of using SAS/SATA SSDs in your vSAN design, you could opt to use Intel Optane for cache and NAND-based NVMe drives for capacity.
For write operations, Intel Optane greatly improves performance, as I have written about before, and read performance is also greatly accelerated because the capacity devices are NVMe. So reducing your node count by two in this case and utilising this kind of technology means you still get similar levels of performance. The best part is that the overall solution will cost you less too, so your TCO comes down, which is good for your finance department, right?

One question I get asked frequently is what size Optane device is sufficient?

Well, in all of my testing I very rarely saturated the write buffer, even with 375GB Optane drives as cache devices. The reason is that vSAN starts de-staging from the cache tier to the capacity tier when the write buffer becomes around 30% full, and because the capacity tier is NVMe based, the de-staging happens a lot quicker, especially since vSAN 6.7 U3, where the de-stage limits have been removed.
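
To put a rough number on that threshold, here is a tiny sketch. It treats the whole cache device as write buffer and takes "around 30%" literally, both of which are simplifications on my part, and `destage_start_gb` is a hypothetical helper rather than anything vSAN exposes.

```python
DESTAGE_THRESHOLD_PERCENT = 30  # "around 30% full", per the paragraph above


def destage_start_gb(cache_device_gb: int) -> float:
    """Approximate amount of buffered writes at which de-staging begins."""
    return cache_device_gb * DESTAGE_THRESHOLD_PERCENT / 100


for size_gb in (375, 750):
    print(f"{size_gb}GB cache device: de-staging starts at roughly {destage_start_gb(size_gb)} GB")
```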

So when would a 750GB Optane be useful?

Highly write-intensive workloads such as video surveillance and databases, or when your capacity disks are much slower. Optane can still be used in vSAN configurations where the capacity tier is SAS/SATA, which of course is not as fast as most NVMe devices, so the write buffer can get fuller.

So just to recap: you can save money on your vSAN deployments by consolidating hosts with higher core count CPUs and by leveraging newer technology such as Intel Optane in the cache tier and NVMe in the capacity tier, whilst maintaining the same level of performance or better. What’s not to like?

Why vSAN is the storage of choice for VMware Cloud Foundation

Recently VMware announced that Cloud Foundation would support external storage connectivity. When I first heard this I thought to myself, are VMware out of their minds? I meet and talk to customers more or less on a daily basis, and I also spoke to a lot of customers at VMworld about this, but the stance or direction from VMware has not changed: if you want a truly Software Defined Data Center then vSAN is the storage of choice.

So why have VMware decided to allow external storage in Cloud Foundation?

Many customers who are either about to start or have already started their journey to a Software Defined Data Center and/or Hybrid Cloud are still using existing assets that they wish to continue to sweat. It may well be that their traditional storage is only a couple of years old, and VMware are not expecting customers to simply throw out existing infrastructure that was either recently purchased or is still being sweated, as that would be a tall ask, right?

Customers also still have workloads that can’t simply be migrated to the full SDDC. The migration takes time to plan, maybe the workload needs to be re-architected, or maybe there’s a specific need where a traditional storage array has to be used until any obstacles have been overcome. VMware recognises this, hence the support for external storage in Cloud Foundation.

There are also specific use cases where Hyper Converged powered by vSAN isn’t an ideal fit. Cold storage or archive storage is one of these, so supporting an array that can provide a supplemental storage architecture to meet these requirements is also a plus point.

Will I lose any functionality by leveraging traditional storage with Cloud Foundation? The simple answer is yes, for the following reasons:

  • No lifecycle management for external storage through SDDC Manager, which means patching and updating of the storage is still a manual task.
  • No single pane of glass management out of the box for external storage without installing third party plugins, which in my experience have a tendency to break other things.
  • No automation of stretched cluster deployment on external storage; all the replication etc. has to be configured manually.
  • Day-2 operations such as Capacity Reporting, Health Check and Performance Monitoring are lost unless you install third party software for the external storage.
  • No automatic deployment of the storage during a Workload Domain deployment; all the external storage has to be prepared beforehand.
  • You lose the true “Software Defined Storage” aspect and granular object control, and VVOLS on external storage is not currently supported either.

Also remember that with Cloud Foundation the Management Workload Domain has to run on vSAN; you cannot use external storage for this.

So if you want a truly Software Defined Data Center with all the automation of deployment and all the nice built-in features for day-2 operations, then vSAN is the first choice of storage. If you have existing storage you wish to sweat whilst you migrate to a full SDDC, then that’s supported too. And yes, you can have a vSAN cluster co-exist with external storage, which makes migrations so much easier.

The other way to look at this is from a Hybrid Cloud or Multi Cloud strategy. Cloud Foundation is all about providing a consistent infrastructure between your private and public clouds, so if your public cloud is using Cloud Foundation with vSAN, then the logical choice is to have vSAN as the storage for your Cloud Foundation private cloud, giving you that consistent infrastructure for your workloads no matter where they run.