STORAGE
This article explores the major storage architectures: DAS (Direct Attached Storage), NAS (Network Attached Storage), and SAN (Storage Area Network).
The main difference among them is that NAS servers use file-level transfers, while DAS and SAN solutions use block-level transfers, which are generally more efficient.
NAS devices can share files over different protocols, such as NFS for UNIX clients and CIFS for Windows clients. Many NAS models can also present their storage arrays as iSCSI targets that are shared across the network. Dedicated iSCSI networks can also be configured to maximize network throughput.
Considerations
When configuring disk drives for a SAN-based Windows cluster, several requirements apply. Only basic disks are allowed; dynamic volumes are not supported. The disks must be formatted with the NTFS file system, using either MBR or GPT partitioning. Finally, the shared disks must be made accessible to the cluster members through LUN masking on the storage controllers, which establishes which storage LUNs can be accessed by which servers.
In addition to LUN masking, it is a recommended best practice to establish a unique SAN zone for each Windows cluster. Zoning is configured on the SAN switches so that only certain servers and storage controllers can communicate within a logical zone. Isolating the SAN I/O traffic to a particular zone cuts out traffic from other zones, which reduces overall I/O latency.
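
As a rough illustration of how zoning and LUN masking combine, here is a minimal Python sketch. The server, controller, and zone names and the LUN numbers are all hypothetical; real zoning is configured on the SAN switches and masking on the storage controllers, not in a script.

```python
# Hypothetical zone membership, as it might be defined on the SAN switches.
ZONES = {
    "zone_cluster_a": {"server1", "server2", "ctrl_a"},
    "zone_cluster_b": {"server3", "server4", "ctrl_b"},
}

# Hypothetical LUN masks, as they might be defined on each storage controller:
# controller -> LUN number -> servers allowed to see that LUN.
LUN_MASKS = {
    "ctrl_a": {0: {"server1", "server2"}, 1: {"server1", "server2"}},
    "ctrl_b": {0: {"server3", "server4"}},
}


def same_zone(server, controller):
    """True if the server and controller share at least one zone."""
    return any(server in members and controller in members
               for members in ZONES.values())


def can_access(server, controller, lun):
    """A server reaches a LUN only if zoning AND masking both allow it."""
    if not same_zone(server, controller):
        return False
    return server in LUN_MASKS.get(controller, {}).get(lun, set())


print(can_access("server1", "ctrl_a", 0))
print(can_access("server3", "ctrl_a", 0))
```

The two checks are independent on purpose: zoning decides who can talk to whom on the fabric, and masking decides which LUNs a permitted server actually sees.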
Synchronizing Disk Access
As you can imagine, multiple servers accessing the same shared storage must do so in an orderly fashion. If multiple servers try to access the same data at the same time in an uncoordinated fashion, the result is disk corruption. There are two schools of thought on how to coordinate shared disk access: one is the “shared everything” model, the other is the “shared nothing” model.
The “shared everything” model allows all servers to access all of the shared disk drives simultaneously. This is accomplished with “distributed lock manager” software, which coordinates the locking of files and records on disk. Only one server at a time can hold an exclusive write-mode lock on a file, which prevents other nodes from writing to it at the same time. While there is overhead associated with the distributed lock manager, it scales well as the number of servers in the cluster grows.
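
To make the locking idea concrete, here is a minimal single-process sketch of a lock table with shared and exclusive modes. The `LockManager` class and its method names are invented for illustration; a real distributed lock manager also has to deal with messaging between nodes, queuing, and node failure, none of which is modeled here.

```python
class LockManager:
    """Toy lock table: many shared holders OR one exclusive holder per resource."""

    def __init__(self):
        self._locks = {}  # resource -> (mode, set of holder nodes)

    def acquire(self, node, resource, mode):
        """Try to lock a resource in 'shared' or 'exclusive' mode."""
        current = self._locks.get(resource)
        if current is None:
            self._locks[resource] = (mode, {node})
            return True
        held_mode, holders = current
        # Shared locks are compatible with each other; nothing else is.
        if mode == "shared" and held_mode == "shared":
            holders.add(node)
            return True
        return False

    def release(self, node, resource):
        """Drop a node's lock; free the entry when no holders remain."""
        current = self._locks.get(resource)
        if current is None:
            return
        _, holders = current
        holders.discard(node)
        if not holders:
            del self._locks[resource]


dlm = LockManager()
print(dlm.acquire("node1", "orders.dat", "exclusive"))
print(dlm.acquire("node2", "orders.dat", "exclusive"))
```

In this sketch a refused request simply returns False; a real lock manager would typically queue the request until the lock becomes available.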
In contrast, Windows clusters use the “shared nothing” model when synchronizing access to storage. This means that only one server can own a particular shared disk drive at a time, which prevents other nodes from writing to the disk while the owner node manipulates the data. The other servers can own their own disk drives, but two nodes can never own the same drive at once, which is why this is called the shared nothing model.
[Diagram: the shared nothing model, with Server1 owning Disk1 and Server2 owning Disk2]
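
The ownership rule can be sketched in a few lines of Python. The `SharedNothingCluster` class is purely illustrative and is not how the Windows cluster service is implemented; it only models the rule that each disk has exactly one owner at a time.

```python
class SharedNothingCluster:
    """Toy model: every disk has at most one owning server at any moment."""

    def __init__(self):
        self.owner = {}  # disk name -> owning server name

    def bring_online(self, server, disk):
        """Claim an unowned disk; refuse if another server already owns it."""
        current = self.owner.get(disk)
        if current is not None and current != server:
            raise PermissionError(f"{disk} is already owned by {current}")
        self.owner[disk] = server

    def write(self, server, disk, data):
        """Only the current owner may write to a disk."""
        if self.owner.get(disk) != server:
            raise PermissionError(f"{server} does not own {disk}")
        return len(data)

    def failover(self, disk, new_owner):
        """Move ownership of a disk to another server."""
        self.owner[disk] = new_owner


cluster = SharedNothingCluster()
cluster.bring_online("Server1", "Disk1")
cluster.bring_online("Server2", "Disk2")
cluster.write("Server1", "Disk1", b"payload")   # allowed: Server1 owns Disk1
cluster.failover("Disk1", "Server2")            # ownership moves as a unit
```

After the failover, Server2 owns both disks and any write from Server1 to Disk1 is refused.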
Under the hood, Windows synchronizes disk access in one of two ways, depending on the version of the operating system. For Windows 2003 and prior, a Challenge/Defense strategy is used to ensure that only one node owns the disk drive at a time. This is accomplished by issuing SCSI commands to reserve, release, or reset the LUN or bus.
For Windows 2008 and later, the cluster software relies on persistent reservations maintained by the storage controllers. Persistent reservations are less disruptive on the SAN because they avoid the SCSI bus resets experienced with the Challenge/Defense mechanism. However, not all storage controllers support SCSI-3 persistent reservations, so be sure to check whether yours does.
[Diagram: the Windows 2008 disk storage architecture used with clusters]
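
The following is a deliberately simplified toy model of the persistent reservation idea: the storage controller remembers which nodes have registered and which one holds the reservation, so ownership can change hands without a bus reset. The class and method names are hypothetical, and the real SCSI-3 commands have more reservation types and rules than this sketch models.

```python
class PersistentReservationLun:
    """Toy LUN whose controller tracks registrations and a single reservation."""

    def __init__(self):
        self.registered = set()  # keys of nodes that have registered
        self.holder = None       # key of the node holding the reservation

    def register(self, key):
        self.registered.add(key)

    def reserve(self, key):
        """A registered node may reserve the LUN if nobody else holds it."""
        if key not in self.registered:
            return False
        if self.holder in (None, key):
            self.holder = key
            return True
        return False

    def preempt(self, key, victim_key):
        """A registered node removes another node and takes over its reservation."""
        if key not in self.registered:
            return False
        self.registered.discard(victim_key)
        if self.holder == victim_key:
            self.holder = key
        return True

    def can_write(self, key):
        # Simplification: only the reservation holder may write.
        return key == self.holder


lun = PersistentReservationLun()
lun.register("node1")
lun.register("node2")
lun.reserve("node1")            # node1 becomes the owner
lun.preempt("node2", "node1")   # node2 takes over, e.g. after node1 fails
```

Because the reservation state lives in the storage controller, the takeover in the last line is a targeted operation on one LUN rather than a reset that disturbs other devices.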
iSCSI
As you may know, iSCSI stands for Internet Small Computer System Interface. It is a transport protocol that carries block-oriented storage data over TCP/IP networks.
Disk Architectures
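- MBR (Master Boot Record)
- GPT (GUID Partition Table)
When an application reads or writes data, the I/O request is sent to the file system driver (ntfs.sys) and then passed down the storage stack toward the disk controller.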
Once the filter drivers process the I/O request, the
partition manager (partmgr.sys driver) locates the corresponding partition
utilizing the disk class driver (disk.sys).
Next, any multi-path drivers are used to establish the I/O path to the
storage device if multiple paths are implemented. Microsoft provides a universal multi-path
driver (mpio.sys) that other vendors leverage to provide redundant paths.
Next, the I/O request is handed off to the port driver
(typically storport.sys or scsiport.sys).
The role of the port driver is to process the I/O request by interfacing
with the lower level vendor specific miniport drivers. The miniport drivers implement additional
vendor specific functionality for the storage controllers. At this point, the data is read from the disk
by the storage controller and is passed back up the storage stack to the user
application.
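
The path described above can be sketched as an ordered list of layers. This is only a mental model written in Python: the layer order follows the description in this article, and "filter drivers" and "vendor miniport" are labels chosen for this sketch, not actual driver file names.

```python
# Layers in the order the I/O request travels down, per the description above.
STORAGE_STACK = [
    "ntfs.sys",
    "filter drivers",
    "partmgr.sys",
    "disk.sys",
    "mpio.sys",
    "storport.sys",
    "vendor miniport",
]


def io_path(multipath=True):
    """Layers a request passes through on the way down to the disk."""
    # The multi-path layer is only involved when multiple paths are implemented.
    return [layer for layer in STORAGE_STACK if multipath or layer != "mpio.sys"]


def round_trip(multipath=True):
    """Full journey: down to the storage controller and back up to the application."""
    down = io_path(multipath)
    return ["application"] + down + ["storage controller"] + down[::-1] + ["application"]


for hop in round_trip():
    print(hop)
```

Listing the hops this way makes it easier to reason about where latency can be added, since every layer is traversed twice per request.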
Class drivers
- disk.sys
MPIO drivers
- mpio.sys
- mpdev.sys
- mpspfltr.sys
Port/miniport drivers
- scsiport.sys, storport.sys, and the vendor miniport drivers
TOOLS
One useful free tool is PAL (Performance Analysis of Logs). It allows you to quickly focus on the time period when the majority of bottlenecks occurred.
The requirements are thoroughly documented on the CodePlex web site (http://www.codeplex.com/PAL). For earlier versions of PAL (prior to v2.0), the Microsoft Log Parser 2.2 software, Microsoft Office Web Components 2003, and .NET Framework 2.0 must be installed. The current versions of PAL (v2.0 or later) require PowerShell v2.0 and Microsoft .NET Framework 3.5 SP1. The PAL installation is an MSI-based kit with just one prompt for the destination folder.
Watch for future troubleshooting articles on another free tool from Microsoft called Xperf, which allows you to dig even deeper into Windows storage performance issues.
Task Manager
To get a cursory view of the problem, use the Task Manager (Ctrl+Shift+Esc).
Perfmon
If you need to shed some more light on your storage
performance problems, it may be necessary to gather some performance metrics in
a log file. This will allow you to collect
the data while the problem is occurring, and then analyze the data with charts
and graphs to identify any problem areas.
Windows provides a built-in system tool known as the Performance
Monitor, Perfmon for short. Perfmon
allows you to graph various performance counters illustrating the minimum,
maximum and average values across a time range.
There is also a handy utility from Microsoft called PerfWiz that provides a menu-driven interface to automate the collection of Perfmon data.
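
If the collected counters are exported to a CSV file, the same minimum, maximum, and average figures can be computed outside Perfmon. The sketch below assumes a simplified CSV layout with a `Time` column and one column per counter; the sample values are made up, and real Perfmon CSV headers carry the full counter path rather than the short names used here.

```python
import csv
import io

# Hypothetical sample data; a real log would be read from a file instead.
SAMPLE_LOG = """Time,Avg. Disk sec/Read,Disk Transfers/sec
10:00:00,0.004,120
10:00:15,0.031,480
10:00:30,0.012,300
"""


def summarize(csv_text):
    """Return min, max, and average for every counter column in the log."""
    columns = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name, value in row.items():
            if name == "Time":
                continue  # timestamps are not a counter
            columns.setdefault(name, []).append(float(value))
    return {
        name: {"min": min(vals), "max": max(vals), "avg": sum(vals) / len(vals)}
        for name, vals in columns.items()
    }


for counter, stats in summarize(SAMPLE_LOG).items():
    print(counter, stats)
```

Summaries like this are handy for comparing a baseline log against a log captured while the problem is occurring.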
RAID 0
RAID 0, also called striping, is a scheme where data is divided into blocks and distributed across the drives in the array. This level provides no redundancy, so it carries no mirroring or parity overhead and offers the best overall performance. Because the failure of any one drive loses the whole array, it is not suitable for mission-critical situations; it is best used where improved performance is the primary driver.
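
A minimal sketch of how striping places blocks, assuming a simple round-robin layout with one block per stripe unit (real controllers use configurable stripe sizes):

```python
def raid0_locate(logical_block, num_disks):
    """Map a logical block number to (disk index, block index on that disk)."""
    disk = logical_block % num_disks      # blocks rotate across the disks
    stripe = logical_block // num_disks   # each full rotation is one stripe
    return disk, stripe


# With three disks, consecutive blocks land on disks 0, 1, 2, 0, 1, 2, ...
for block in range(6):
    print(block, raid0_locate(block, 3))
```

Because consecutive blocks sit on different drives, a large sequential transfer keeps all drives busy at once, which is where the performance gain comes from.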
RAID 1
There are two implementations of RAID 1: mirroring and
duplexing. With both schemes, data is duplicated on a second disk. Mirroring
uses a single drive controller, while duplexing uses two controllers.
RAID 5
RAID 5 is called striping with distributed parity. Data and parity (error detection and correction code) are striped across three or more drives. Parity consumes the equivalent of one drive's capacity, which results in less available storage space. If a drive fails, its data can be rebuilt from the remaining data blocks and the parity information. This level also features improved read performance because data can be read simultaneously across multiple drives; writes are spread across the drives as well, but each write carries the extra cost of updating parity.
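
The recovery step relies on XOR parity, which can be demonstrated in a few lines. This is a sketch of the arithmetic only; it assumes equal-sized blocks in one stripe and ignores how parity is rotated across drives.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second drive, then rebuild its block from the
# surviving data blocks and the parity block.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```

The same property explains the write cost: any change to a data block means the parity block for that stripe has to be recomputed and rewritten too.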
RAID 0+1 (01)
This is the first of the hybrids, which simply combine other RAID levels. RAID 0+1 (also called 01) stripes and mirrors data simultaneously. Combining striping and mirroring marries high performance with fault tolerance, making this one of the most popular levels. Under this scheme, two striped arrays are created, and one acts as the mirror of the other. This requires a minimum of four drives.
RAID 1+0 (10)
With this RAID level, data is mirrored and striped simultaneously. Most often, this is implemented with four drives: two mirrored pairs, with data striped across the pairs. This provides even higher fault tolerance than RAID 01, with comparable performance. RAID 10 differs from 01 in that the order is reversed: where 01 is a mirror of stripes, 10 is a stripe of mirrors.
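
The difference in fault tolerance between the two layouts can be checked by brute force. This sketch assumes four drives numbered 0 to 3, grouped as (0, 1) and (2, 3); in RAID 10 each group is a mirrored pair, while in RAID 01 each group is a stripe set.

```python
from itertools import combinations

GROUPS = [{0, 1}, {2, 3}]


def raid10_survives(failed):
    """Stripe of mirrors: every mirrored pair needs at least one working drive."""
    return all(group - failed for group in GROUPS)


def raid01_survives(failed):
    """Mirror of stripes: at least one stripe set must have no failed drives."""
    return any(not (group & failed) for group in GROUPS)


def count_survivable(survives, num_failures=2):
    """Count how many combinations of failed drives the array survives."""
    return sum(survives(set(combo))
               for combo in combinations(range(4), num_failures))


print("RAID 10 survives", count_survivable(raid10_survives), "of 6 two-drive failures")
print("RAID 01 survives", count_survivable(raid01_survives), "of 6 two-drive failures")
```

Both layouts survive any single drive failure; the gap only shows up once a second drive fails before the first has been replaced.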
RAID 5+0 (50)
The last hybrid model involves striping across multiple RAID 5 arrays. Even if one drive from each array fails, operations can continue. This requires a minimum of six drives and provides better write performance than RAID 5 alone. Because of the number of drives, this implementation is expensive, but it is useful when fault tolerance and performance are critical.
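
To compare the cost of these levels, a small calculator for usable capacity can help. This is a sketch that assumes identical drives, a two-drive mirror for RAID 1, and equal-sized RAID 5 groups for RAID 50; the function name and the `raid5_groups` parameter are made up for this example.

```python
# Minimum drive counts per level, as described in the sections above
# (two drives for RAID 0 and RAID 1 is assumed).
MIN_DRIVES = {"0": 2, "1": 2, "5": 3, "01": 4, "10": 4, "50": 6}


def usable_capacity(level, drives, drive_size, raid5_groups=2):
    """Usable capacity in the same unit as drive_size."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unknown RAID level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "0":
        return drives * drive_size            # no redundancy overhead
    if level == "1":
        return drive_size                     # second drive is a full copy
    if level == "5":
        return (drives - 1) * drive_size      # one drive's worth of parity
    if level in ("01", "10"):
        return (drives // 2) * drive_size     # half the drives hold copies
    return (drives - raid5_groups) * drive_size  # RAID 50: parity per group


for level, count in [("0", 4), ("5", 4), ("10", 4), ("50", 6)]:
    print(f"RAID {level} with {count} x 500 GB:", usable_capacity(level, count, 500), "GB")
```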
Microsoft Windows XP, Windows 2000 and Windows Server
2003 offer two types of disk storage: basic and dynamic.
Basic Disk Storage
Basic storage uses normal partition tables supported
by MS-DOS, Microsoft Windows 95, Microsoft Windows 98, Microsoft Windows
Millennium Edition (Me), Microsoft Windows NT, Microsoft Windows 2000, Windows
Server 2003 and Windows XP. A disk initialized for basic storage is called a
basic disk. A basic disk contains basic volumes, such as primary partitions,
extended partitions, and logical drives. Additionally, basic volumes include
multidisk volumes that are created by using Windows NT 4.0 or earlier, such as
volume sets, stripe sets, mirror sets, and stripe sets with parity. Windows XP
does not support these multidisk basic volumes. Any volume sets, stripe sets,
mirror sets, or stripe sets with parity must be backed up and deleted or converted
to dynamic disks before you install Windows XP Professional.
Dynamic Disk Storage
Dynamic storage is supported in Windows XP
Professional, Windows 2000 and Windows Server 2003. A disk initialized for
dynamic storage is called a dynamic disk. A dynamic disk contains dynamic
volumes, such as simple volumes, spanned volumes, striped volumes, mirrored
volumes, and RAID-5 volumes. With dynamic storage, you can perform disk and
volume management without the need to restart Windows.
Note: Dynamic disks are not supported on portable
computers or on Windows XP Home Edition-based computers.
You cannot create mirrored volumes or RAID-5 volumes
on Windows XP Home Edition, Windows XP Professional, or Windows XP 64-Bit
Edition-based computers. However, you can use a Windows XP Professional-based
computer to create a mirrored or RAID-5 volume on remote computers that are
running Windows 2000 Server, Windows 2000 Advanced Server, or Windows 2000
Datacenter Server, or the Standard, Enterprise, and Datacenter editions of
Windows Server 2003.
Storage types are separate from the file system type.
A basic or dynamic disk can contain any combination of FAT16, FAT32, or NTFS
partitions or volumes.
A disk system can contain any combination of storage
types. However, all volumes on the same disk must use the same storage type.
To convert a Basic Disk to a Dynamic Disk:
Use the Disk Management snap-in in Windows XP/2000/2003 to convert a basic disk to a dynamic disk. To do this, follow these steps:
1. Log on as Administrator or as a member of the Administrators group.
2. Click Start, and then click Control Panel.
3. Click Performance and Maintenance, click Administrative Tools, and then double-click Computer Management. You can also right-click My Computer and choose Manage if you have My Computer displayed on your desktop.
4. In the left pane, click Disk Management.
5. In the lower-right pane, right-click the basic disk that you want to convert, and then click Convert to Dynamic Disk. You must right-click the gray area that contains the disk title on the left side of the Details pane.
Dynamic Storage Terms
A volume is a storage unit made from free space on one
or more disks. It can be formatted with a file system and assigned a drive
letter. Volumes on dynamic disks can have any of the following layouts: simple,
spanned, mirrored, striped, or RAID-5.
A simple volume uses free space from a single disk. It
can be a single region on a disk or consist of multiple, concatenated regions.
A simple volume can be extended within the same disk or onto additional disks.
If a simple volume is extended across multiple disks, it becomes a spanned
volume.
A spanned volume is created from free disk space that
is linked together from multiple disks. You can extend a spanned volume onto a
maximum of 32 disks. A spanned volume cannot be mirrored and is not
fault-tolerant.
A striped volume is a volume whose data is interleaved
across two or more physical disks. The data on this type of volume is allocated
alternately and evenly to each of the physical disks. A striped volume cannot
be mirrored or extended and is not fault-tolerant. Striping is also known as
RAID-0.
A mirrored volume is a fault-tolerant volume whose
data is duplicated on two physical disks. All of the data on one volume is
copied to another disk to provide data redundancy. If one of the disks fails,
the data can still be accessed from the remaining disk. A mirrored volume
cannot be extended. Mirroring is also known as RAID-1.
A RAID-5 volume is a fault-tolerant volume whose data
is striped across an array of three or more disks. Parity (a calculated value
that can be used to reconstruct data after a failure) is also striped across
the disk array. If a physical disk fails, the portion of the RAID-5 volume that
was on that failed disk can be re-created from the remaining data and the
parity. A RAID-5 volume cannot be mirrored or extended.
The system volume contains the hardware-specific files
that are needed to load Windows (for example, Ntldr, Boot.ini, and
Ntdetect.com). The system volume can be, but does not have to be, the same as
the boot volume.
The boot volume contains the Windows operating system files that are located in the %Systemroot% and %Systemroot%\System32 folders. The boot volume can be, but does not have to be, the same as the system volume.
VOLUME RECOVERY
When you delete a dynamic volume, the OS erases the
volume's file-system boot sector (sector 0) and removes the volume entry from
the Microsoft Management Console (MMC) Disk Management snap-in private region
database. However, as part of this process, the OS leaves the rest of the drive
intact, including the data. Both FAT32 and NTFS store a backup copy of the boot
sector. You can copy this boot sector back to sector 0 and restore the volume
as long as you know the original volume size.
To recover an NTFS volume, perform the following steps:
1. Open the Disk Management snap-in (go to Start, Programs, Administrative Tools, Computer Management, and select Storage).
2. Recreate the original volume by right-clicking the unpartitioned space and selecting New Partition from the context menu; specify the exact size of the original volume in the process, but don't format the volume (you must know the original volume size to recreate the volume because the Disk Management snap-in rounds partition sizes).
3. Use dskprobe.exe to recover the backup boot sector for the NTFS volume from the end of the deleted dynamic volume (because you're restoring a dynamic volume, you might need to use dmdiag.exe to find the backup boot sector). See the Microsoft article "Recovering NTFS boot sector on NTFS partitions" for an explanation of how to copy the boot sector.
4. After you rewrite the NTFS boot sector, quit Dskprobe.
5. Go to the MMC Computer Management console Action menu and click Rescan Disks to mount the volume for immediate use.
To recover a FAT32 volume, perform the following steps:
1. Open the Disk Management snap-in (go to Start, Programs, Administrative Tools, Computer Management, and select Storage).
2. Recreate the original volume by right-clicking the unpartitioned space and selecting New Partition from the context menu; specify the exact size of the original volume in the process, but don't format the volume (you must know the original volume size to recreate the volume because the Disk Management snap-in rounds partition sizes).
3. Use dskprobe.exe to recover the backup boot sector for the deleted dynamic FAT32 volume from sector 6 of the logical volume and write it to sector 0 of the logical volume. See the Microsoft article "Chkdsk Does Not Use Backup Boot Sector to Fix Corrupted FAT32 Boot Sector" for an explanation of how to copy the boot sector.
4. After you rewrite the FAT32 boot sector, quit Dskprobe.
5. Go to the Computer Management console Action menu and click Rescan Disks to mount the volume for immediate use.
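
The sector copy at the heart of both procedures can be illustrated on an in-memory volume image. This is a sketch only: it assumes 512-byte sectors, takes the NTFS backup to be the final sector of the image and the FAT32 backup to be sector 6, and the function name is made up. It is not a replacement for dskprobe.exe and should not be pointed at a real disk.

```python
SECTOR_SIZE = 512  # assumption: 512-byte sectors


def restore_boot_sector(volume, filesystem):
    """Copy the backup boot sector over sector 0 of a volume image.

    volume is a bytearray holding the whole volume; returns the sector
    number the backup was read from.
    """
    total_sectors = len(volume) // SECTOR_SIZE
    if filesystem == "NTFS":
        backup_sector = total_sectors - 1   # backup kept at the end of the volume
    elif filesystem == "FAT32":
        backup_sector = 6                   # backup kept in sector 6
    else:
        raise ValueError(f"unsupported file system: {filesystem}")
    start = backup_sector * SECTOR_SIZE
    volume[0:SECTOR_SIZE] = volume[start:start + SECTOR_SIZE]
    return backup_sector


# Build a tiny fake FAT32 image whose sector 0 has been wiped.
image = bytearray(SECTOR_SIZE * 16)
image[6 * SECTOR_SIZE:7 * SECTOR_SIZE] = b"B" * SECTOR_SIZE
restore_boot_sector(image, "FAT32")
print(image[:4])
```

This also shows why the recreated volume must be exactly the original size: for NTFS, the location of the backup sector is derived from where the volume ends.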