Get on the optimisation bus
Tuesday, 13 April, 2010
You probably cannot help but buy more storage, writes Simon Sharwood, but why not make sure you use every byte of the storage you’ve already got before you sign another cheque? Read on for the latest optimisation techniques.
If you have ever ridden a bus to work on congested city streets and wondered why you are crawling along in the company of innumerable single-occupant cars, you know a little about how storage managers feel when they nestle into their chairs at the start of a day.
Storage managers feel commuters’ frustrations because the storage arrays they tend generally have a lot of capacity they either cannot access or must give away to applications that claim space without putting it to practical use. Just as a bus must contend with cars that fill the road without using their full passenger-carrying potential, storage managers are often forced to leave parts of their arrays empty or to allocate space that is never used.
When congestion in the form of extra data arrives (as it inevitably does given that businesses hardly ever stop making information), storage managers usually respond by adding more raw storage capacity.
This solves the problem for a while, but is the equivalent of the trick performed by harried urban planners who add an extra lane to an already-overburdened road: a little more capacity helps in the short term but leaves the root problem untreated.
A collection of practices that does treat the problem has come to be known as ‘storage optimisation’, and it allows storage managers to use just about every byte they can get their hands on. Optimisation cannot clear the roads, but it at least ensures they are used to full capacity, so the investment in storage delivers its maximum potential.
Optimisation’s big four
The main optimisation techniques are thin provisioning, data deduplication, dynamic tiering and virtualisation.
The first, thin provisioning, is quite simple. In the past, when storage managers have configured arrays it has been necessary to define in advance how much capacity is allocated to an application. Setting up an array and getting it running is a non-trivial chore, as is expanding the amount of storage allocated to an application. Mindful of the likely expense and disruption that comes with an upgrade, cautious application owners have therefore requested much more storage than they need, and that has been provisioned and locked away so that no other application can touch it.
“Database administrators are reluctant to optimise or archive data for safety reasons,” says Phil Morrissey, General Manager of Services at XSI Data Solutions. Most therefore make very optimistic requests for storage capacity, leaving over-allocated storage in the clutches of an application from which it cannot escape.
Thin provisioning turns this around: an application receives its desired initial allocation of storage, but that capacity is not locked away forever. The application believes it has all the storage it requested, yet it only consumes physical capacity as it actually writes data, leaving the unused remainder available to other applications instead of hogging it all for its own use. The result is that you are less likely to lock storage capacity away, so more of your storage investment is put to work.
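To make the mechanics concrete, here is a minimal sketch in Python of the idea: capacity is promised up front, but physical space is only drawn down as data is written. The pool, volume names and numbers are illustrative assumptions, not any vendor’s API.

class ThinPool:
    """Toy model of thin provisioning: capacity is promised up front,
    but physical space is only consumed as data is actually written."""

    def __init__(self, physical_tb):
        self.physical_tb = physical_tb   # real disk behind the pool
        self.volumes = {}                # name -> {"promised": x, "written": y}

    def provision(self, name, promised_tb):
        # The application sees the full allocation it asked for...
        self.volumes[name] = {"promised": promised_tb, "written": 0.0}

    def write(self, name, tb):
        # ...but the pool only draws down physical capacity on writes.
        if self.physical_used() + tb > self.physical_tb:
            raise RuntimeError("pool exhausted - time to buy real disk")
        self.volumes[name]["written"] += tb

    def physical_used(self):
        return sum(v["written"] for v in self.volumes.values())

pool = ThinPool(physical_tb=20)
pool.provision("erp_db", promised_tb=10)   # cautious owner asks for 10 TB
pool.provision("mail", promised_tb=10)
pool.write("erp_db", 2.5)                  # only 2.5 TB actually written
print(pool.physical_used())                # 2.5 - the rest stays usable by others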
Data deduplication does what it says on the can; namely, removing duplicated data. A close cousin to data compression, deduplication can take place at the source - on the system where you store data as it is being created or used by an application - or at the ‘target’, usually the piece of storage hardware used to store backups. Either way, you end up with less data, freeing capacity to store more new information rather than losing capacity to duplicates.
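As a rough illustration of the principle (not any product’s implementation), a block-level deduplicator can store each unique block once, keyed by a content hash, and keep only references for the repeats:

import hashlib

def deduplicate(blocks):
    """Toy block-level dedup: keep each unique block once, keyed by its
    content hash, plus an ordered list of references to rebuild the data."""
    store = {}   # hash -> block contents, stored only once
    refs = []    # sequence of hashes that reconstructs the original stream
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = block   # new data: keep it
        refs.append(digest)         # duplicates cost only a reference
    return store, refs

# Three logical blocks, two of them identical, shrink to two stored blocks.
store, refs = deduplicate([b"A" * 4096, b"A" * 4096, b"B" * 4096])
print(len(store), "unique blocks for", len(refs), "logical blocks")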
Tiering is the storage industry’s current darling, and the subject of much hot competition.
The concept of tiering will be familiar to anyone who lived through storage’s last two big buzz-phrases, ‘information lifecycle management’ and ‘hierarchical storage management’. Both of those credos advocated taking information you don’t use very often and putting it onto the cheapest storage possible, so that you save capacity on the fast storage that powers transactional applications and instead shuffle old data onto less-expensive media.
Tiering cleaves to that concept, but adds policy-based, automated movement of data between tiers. An email you sent yesterday and are likely to refer to again would therefore stay on an expensive tier of storage, while emails from 2008 would be shuffled off to a slow, inexpensive tier where they would remain available, but not instantly so.
Automated tiering means that the fast storage that delivers the best possible user experience, either to people or input/output hungry applications, always has lots of capacity to store important data. Older information gets put in the slow lane where it belongs.
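A toy sketch of what such a policy can look like; the 90-day threshold and tier names below are assumptions chosen for illustration, not taken from any product.

from datetime import datetime, timedelta

ARCHIVE_AFTER = timedelta(days=90)   # assumed policy: 90 days untouched -> cheap tier

def choose_tier(last_accessed, now=None):
    """Keep recently used data on fast storage; demote the rest."""
    now = now or datetime.now()
    return "tier1_fast" if now - last_accessed < ARCHIVE_AFTER else "tier3_cheap"

print(choose_tier(datetime.now() - timedelta(days=1)))   # yesterday's email: tier1_fast
print(choose_tier(datetime(2008, 6, 30)))                # 2008 mail: tier3_cheap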
The last optimisation technique currently in vogue is virtualisation. Individual storage arrays often contain untapped capacity, so storage virtualisation lets users create a single, logical pool of storage that spans several arrays. Users can then carve volumes from that pool, with a single logical volume potentially spanning multiple physical arrays.
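A minimal sketch of the pooling idea, with made-up array names: free capacity from several physical arrays is presented as one logical pool, and a single volume may draw extents from more than one of them.

class PhysicalArray:
    def __init__(self, name, free_tb):
        self.name, self.free_tb = name, free_tb

class VirtualPool:
    """Toy storage virtualisation: stitch free capacity from several
    arrays into one logical pool and carve volumes out of it."""

    def __init__(self, arrays):
        self.arrays = arrays

    def free_tb(self):
        return sum(a.free_tb for a in self.arrays)

    def create_volume(self, size_tb):
        if size_tb > self.free_tb():
            raise RuntimeError("not enough capacity across the pool")
        extents = []
        for array in self.arrays:          # one volume may span several arrays
            take = min(array.free_tb, size_tb)
            if take > 0:
                array.free_tb -= take
                size_tb -= take
                extents.append((array.name, take))
        return extents

pool = VirtualPool([PhysicalArray("array_a", 3), PhysicalArray("array_b", 5)])
print(pool.create_volume(6))   # [('array_a', 3), ('array_b', 3)]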
How to optimise
Storage optimisation does not happen automatically. While the techniques mentioned above are all powerful, they need to be directed at trouble spots to work their magic.
IBRS Analyst Kevin McIsaac therefore believes that an essential step before adopting any optimisation technique is an audit of storage systems.
“If you have duplicated data [because of] bad processes, deduplication is just a bandaid over something that is broken,” he says.
Auditing therefore helps you understand how and where to apply storage optimisation techniques.
“Most people don’t conduct an audit because they are too busy and it is easy to buy more storage,” he says. “Sometimes that is not a bad option, but at other times you want to take stock. My recommendation is to bring in a professional services group to do the audit.”
IBRS recently conducted such an audit for a client using software from a company called Aptare and the results were impressive.
“There was a reasonable amount you could call dark storage - it had never been allocated,” McIsaac recalls, with recoverable storage around 25 to 33% of the array’s capacity. “That’s a fair amount on a 50 TB array.”
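The arithmetic behind such an audit is straightforward, as the back-of-envelope sketch below shows; the figures are illustrative only, not IBRS’s actual data.

# Illustrative audit arithmetic only - not IBRS's method or real figures.
raw_tb = 50.0         # total array capacity
allocated_tb = 44.0   # capacity carved into LUNs and handed to hosts
written_tb = 34.0     # capacity those hosts have actually written

dark_tb = raw_tb - allocated_tb                 # never assigned to anything
over_allocated_tb = allocated_tb - written_tb   # assigned but sitting empty
recoverable_tb = dark_tb + over_allocated_tb

print(f"recoverable: {recoverable_tb:.0f} TB "
      f"({recoverable_tb / raw_tb:.0%} of the array)")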
Greg Cetinich, the head of Hitachi Data Systems’ consulting practice, also believes taking stock is an important pre-optimisation consideration.
“Our recommendation is an ‘information intervention’,” he says, with the intervention assessing an organisation’s policies and processes, examining its people and organisation structure and then considering the underlying storage technologies.
Once these factors have been considered, he says an optimisation plan can consider “standardisation, simplification and consolidation” as desired outcomes.
The first, standardisation, establishes service management practices that create “a consistent approach to delivering IT services to the business”. Simplification sees management tools rationalised so that administrators have fewer tools to work with.
Consolidation, as HDS has practised at the Australian National University (ANU), has seen under-used arrays located in various departments or faculties redeployed to a single, central, data centre.
“ANU virtualised all legacy storage assets and consolidated multiple environments into one data centre,” Cetinich explained. “The university can now squeeze as much as possible from infrastructure and has reduced annual storage capacity growth rate from 50 to 30% or 35%.”
Sadly, few urban planners can get anywhere near that kind of improvement to road traffic, leaving storage managers with a nasty trip to work but, thanks to storage optimisation, a far happier passage through their working hours.
Where storage disappears
Over-provisioning is not the only reason for storage ‘disappearing.’
“Capacity is not visible due to the way it has been provisioned,” says IBRS’s Kevin McIsaac. “Sometimes disk is never assigned to a logical unit number [LUN, a code used to identify storage resources] and is therefore lost.” McIsaac says disks can also be assigned to RAID groups or file systems, and administrators ‘lose’ them by forgetting about the allocation.
“You can find terabytes of disk that are never assigned to a LUN, or are assigned to a file system that has never been used.”
Either way, the disk spins merrily inside the array, but is never called upon in anger and is therefore inaccessible and wasted.
John Martin of NetApp points out another reason storage gets lost.
“Windows NTFS degrades when disk reaches 80% capacity,” he says, so administrators don’t fill disks past that point.
A virtualised Windows server therefore leaves disk space empty to preserve NTFS performance, while also leaving further space empty to meet VMware’s needs. The result is better performance, but more hidden and unusable disk capacity.
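A rough worked example of how that headroom compounds; the 80 per cent NTFS ceiling comes from Martin, while the datastore buffer figure is purely an assumption for illustration.

# Rough arithmetic: the 80% NTFS ceiling is Martin's figure; the
# datastore headroom below is an assumed value for illustration only.
lun_tb = 10.0
ntfs_ceiling = 0.80          # don't fill NTFS volumes past 80 per cent
datastore_headroom = 0.20    # assumed free space reserved for VMware

usable_tb = lun_tb * (1 - datastore_headroom) * ntfs_ceiling
print(f"{usable_tb:.1f} TB of a {lun_tb:.0f} TB LUN actually holds data")
print(f"{lun_tb - usable_tb:.1f} TB sits idle to protect performance")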
Martin recommends thin provisioning as the best way to reduce the impact of this need for unused disk, as giving Windows and VMware the overhead they need without locking that capacity away from other applications is a desirable outcome.
But he also warns that optimising storage systems can have unintended consequences, because it often sees disks filled much closer to capacity, forcing individual drives to work very hard.
“If you put more data onto fewer drives you get input/output issues,” he says.