Why do we use SAN storage? Is it essential for any infrastructure? What types of problems will it solve? Do we need it only to solve capacity problems, or are there other, more pivotal issues? Will it have an impact on performance, scalability and upgradeability?
Let’s go back and see how infrastructures were managed before, and which problems triggered the innovation and development of SAN storage solutions.
Initially, clients were connected directly to servers, which made it very difficult to transfer data between clients or to share resources. As the number of clients and servers grew, management became more and more difficult.
This situation triggered the innovation of the LAN, which solved the communication and management problems between workstations and servers.
Data, however, still resided on internal disks in each server, which caused two major problems:
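Internal SCSI disks allow only one operation at a time (i.e., only one disk on the bus reads or writes while the rest of the disks wait). Moreover, the bandwidth of the SCSI bus (80 MBps for Ultra2 Wide SCSI, up to 320 MBps for Ultra320 SCSI) is shared by all disks on that bus.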
Capacity is distributed inefficiently across the servers. Moreover, each and every server loses several disks of its own to parity and hot spares.
Simply put, the two major problems are low performance and capacity loss.
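The capacity loss described above can be illustrated with a little arithmetic. The following is only a sketch: the number of servers, the disks per server, the disk size and the RAID group layout are all assumed values chosen for illustration, not figures from any real configuration.

```python
# Illustrative sketch: every number below is an assumption, not a measured value.

def usable_disks(total_disks: int, parity_disks: int, spare_disks: int) -> int:
    """Disks left for data after parity and hot-spare disks are set aside."""
    return total_disks - parity_disks - spare_disks

SERVERS = 8            # assumed number of servers
DISKS_PER_SERVER = 8   # assumed internal disks per server
DISK_GB = 146          # assumed capacity of one disk in GB

# Internal disks: every server gives up one parity disk and one spare disk.
internal_usable = SERVERS * usable_disks(DISKS_PER_SERVER, parity_disks=1, spare_disks=1)

# Consolidated array: the same 64 disks arranged as 6 groups of 10 disks
# (one parity disk per group) plus 4 shared hot spares.
total_disks = SERVERS * DISKS_PER_SERVER
consolidated_usable = usable_disks(total_disks, parity_disks=6, spare_disks=4)

print("Internal disks usable capacity (GB):", internal_usable * DISK_GB)
print("Consolidated usable capacity (GB):  ", consolidated_usable * DISK_GB)
```

With these assumed numbers the consolidated layout keeps 54 of the 64 disks for data, against 48 when each server holds its own parity and spare disks.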
The industry’s approach was to find a “Consolidation Solution” that lets all servers access a central storage array subsystem and share its capacity, with all disk drives reading and writing at the same time. Such a solution greatly increases system performance and adds reliability, since data is always consolidated in the storage array.
Based on the Fibre Channel protocol, IBM invented a new network for storage access only: the SAN (Storage Area Network), or simply the switches that connect servers to storage arrays. It consolidates all storage devices in a Fibre Channel network and delivers high performance, because all disk drives in the disk array subsystem read and write at the same time; moreover, a 4 Gbps Fibre Channel link carries about 400 MBps, compared with the 80 MBps to 320 MBps of a SCSI bus that all of its disks must share. Consolidation also uses disk capacity efficiently and allows it to be shared and distributed flexibly across servers.
In conclusion, leaving data on internal disks inside servers gives much lower system performance than consolidating it in external Fibre Channel storage, as the figures discussed above show.
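The comparison can be sketched with a deliberately simplified model. It assumes that a shared bus lets one disk transfer at a time while an array lets all disks transfer in parallel until the link saturates; the per-disk rate and the disk count are assumed values, and only the 80 MBps and 400 MBps figures come from the text.

```python
# Simplified throughput model; per-disk rate and disk count are assumptions.

def shared_bus_throughput(bus_mb_per_s: float, disk_mb_per_s: float) -> float:
    """One disk transfers at a time, so one disk (or the bus) is the limit."""
    return min(bus_mb_per_s, disk_mb_per_s)

def array_throughput(link_mb_per_s: float, disk_mb_per_s: float, disks: int) -> float:
    """All disks transfer in parallel until the link to the server saturates."""
    return min(link_mb_per_s, disk_mb_per_s * disks)

SCSI_BUS_MB_PER_S = 80   # bus figure quoted in the text
FC_LINK_MB_PER_S = 400   # 4 Gbps Fibre Channel figure quoted in the text
DISK_MB_PER_S = 50       # assumed sustained rate of a single disk
DISKS = 12               # assumed number of disks behind the bus or link

internal = shared_bus_throughput(SCSI_BUS_MB_PER_S, DISK_MB_PER_S)
external = array_throughput(FC_LINK_MB_PER_S, DISK_MB_PER_S, DISKS)

print("Internal SCSI disks (MBps):", internal)
print("FC storage array (MBps):   ", external)
print("Ratio:", external / internal)
```

Real arrays add caching, multiple links and controller limits, so this only shows the direction of the difference, not a sizing result.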
Other benefits include the ability to boot servers from the SAN itself. This allows quick and easy replacement of faulty servers, since the SAN can be reconfigured so that a replacement server uses the data of the faulty server, which is located in the storage array subsystem.
The question is: which storage would be the best fit for your infrastructure? Is it the most expensive? Or the highest in performance? Is this related to your actual workload?
The most critical point when choosing the proper storage solution is to have an actual sizing for your current workload that takes upgradeability into account, because storage is designed to live in your infrastructure much longer than your servers. That is why sizing has to be done precisely: first determine the performance you actually need, then select among the storage arrays in the market based on their performance differences. But how can customers compare the storage solutions of different vendors with each other? Simply, there are three ways:
Vendor Benchmark: All vendors publish performance numbers for their boxes expressed in IOPS, a measure that is essential for DB and OLTP environments.
Standard Benchmark: The Storage Performance Council is a standards organization responsible for benchmarking the storage boxes of all vendors, so that customers can compare the storage solutions of different vendors. OLTP customers in particular can refer to the SPC-1 benchmark (www.storageperformance.org).
Large file processing customers whose business needs huge system bandwidth, such as seismic analysis, data mining and warehousing, or video streaming, can refer to the SPC-2 benchmark (www.storageperformance.org).
Why does Storage Performance have such importance? Why will Storage Performance be the key driving factor when adding a Storage Solution to your infrastructure?
Simply, infrastructures need different servers to handle variable workloads, depending on the applications used and the number of users. Each and every server goes through a proper sizing procedure to determine its number of processors and cores as well as its memory.
The most essential part is the performance this server needs from the Fibre Channel storage array subsystem (i.e., Fibre Channel storage is obliged to deliver the performance that all servers need; otherwise all servers will suffer from performance bottlenecks).
In addition, storage solutions are required to last for at least 6 years in your infrastructure (i.e., Fibre Channel storage must have adequate performance to cover the performance growth rate of your infrastructure; otherwise you would face data migration problems and waste your investment by changing your storage solution).
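A rough way to see what covering the growth rate means is to compound the current workload over the lifetime of the array. This is a minimal sketch: the current IOPS figure and the 20% yearly growth rate are assumed values, and only the 6-year lifetime comes from the text.

```python
# Compound-growth sketch; current workload and growth rate are assumptions.

def required_iops(current_iops: float, annual_growth: float, years: int) -> float:
    """Workload after compounding a fixed yearly growth rate."""
    return current_iops * (1 + annual_growth) ** years

CURRENT_IOPS = 10_000   # assumed workload of all servers today
ANNUAL_GROWTH = 0.20    # assumed yearly performance growth rate (20%)
LIFETIME_YEARS = 6      # minimum storage lifetime stated in the text

for year in range(LIFETIME_YEARS + 1):
    print(f"Year {year}: {required_iops(CURRENT_IOPS, ANNUAL_GROWTH, year):,.0f} IOPS")

target = required_iops(CURRENT_IOPS, ANNUAL_GROWTH, LIFETIME_YEARS)
```

Under these assumptions, an array sized only for today's workload would have to deliver roughly three times as much by the end of its sixth year.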
Moreover, if you consider using the Snapshot and Cloning copy services to take online backups or to create a reporting or development environment beside the running production system, this has to be included in the performance calculations to avoid any performance degradation in your production system.
Finally, if you later move to a full disaster recovery solution with two SAN storage solutions (one at each site, replicating data to each other), replication would consume about one third to half of your production system’s performance. This needs to be calculated and considered carefully when acquiring the first storage solution in your infrastructure; otherwise various problems would occur.
Implementing a disaster recovery solution relies on SAN-to-SAN replication (i.e., using two SAN storage systems, one at each site, with replication going over the WAN).
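The effect of replication on sizing can be sketched as follows. The sketch assumes that the "one third to half" is a share of what the array can deliver in total, so production only receives the remainder; the production IOPS figure is an assumed value as well.

```python
from fractions import Fraction

# Replication headroom sketch. Assumption: replication takes a fixed share
# of the array's total capability and production gets the rest.

def array_iops_needed(production_iops: int, replication_share: Fraction) -> Fraction:
    """Array capability needed so production keeps its IOPS during replication."""
    if not 0 <= replication_share < 1:
        raise ValueError("replication_share must be at least 0 and below 1")
    return production_iops / (1 - replication_share)

PRODUCTION_IOPS = 30_000  # assumed production workload

low = array_iops_needed(PRODUCTION_IOPS, Fraction(1, 3))   # one third consumed
high = array_iops_needed(PRODUCTION_IOPS, Fraction(1, 2))  # half consumed

print("Array IOPS needed if replication takes one third:", int(low))
print("Array IOPS needed if replication takes half:     ", int(high))
```

Under this reading, the first array would need between 1.5 and 2 times the production workload to leave room for a later disaster recovery setup.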
Mohamed El Mofty
Storage Networking Solutions Expert
IBM Systems and Technology Group