OPTIMIZING RAID PERFORMANCE FOR ORACLE RDBMS ON WINDOWS NT 4.0/2000

ABSTRACT:
How Redudant Arrays of Inexpensive Disks (RAIDs) affect Oracle RDBMS
performance when running on Windows NT 4.0/2000, and how to optimize for
different database processes running with RAIDs.

INTRODUCTION:
Redudant Arrays of Inexpensive Disks (RAIDs) are primarily
implemented first for Data Redundancy and protection against downtime.
However, this article will deal only with the performance implications
of different RAID levels as they pertain to the Oracle RDBMS running
on the Windows NT 4.0/2000 platform. Issues of array availability, impact
of disk failure and relative dollar cost of each RAID scheme will be
given only cursory mention.

TOPICS:
1. CHOOSING A RAID LEVEL FOR YOUR RDBMS
2. RAID LEVEL DEFINITIONS
3. DISK SUBSYSTEM PERFORMANCE ATTRIBUTES AS THEY PERTAIN TO ORACLE
4. HOW RAID LEVELS STACK UP ON PERFORMANCE ATTRIBUTES
5. RAID LEVEL RECOMMENDATIONS BY ORACLE FILETYPE & PROCESS
6. RAID CACHE CONFIGURATION RECOMMENDATIONS
7. STRIPE SIZE AND ITS EFFECT ON PERFORMANCE
8. OTHER VARIABLES THAT CAN AFFECT RAID PERFORMNACE ON NT
9. WINDOWS 2000 SPECIFIC RAID INFORMATION
10. DETERMINING PERFORMANCE OF A RAID ON NT


CONCLUSION


1. CHOOSING A RAID LEVEL FOR YOUR RDBMS:

Windows NT 4.0/2000 currently supports RAID Levels 0, 1 and 5 natively through
the Disk Administrator utility. These choices allow the DBA/sysop to
properly configure the NT server hardware to match the needs of the
particular database application.


2. RAID LEVEL DEFINITIONS

RAID 0: Striping with No Parity
RAID 0 offers striping only. During an I/O, a single drive gets
<stripesize> bytes of I/O before the I/O continues onto the next
drive in the set. For I/Os that fit in a single stripe segment,
performance is the same as a single disk drive. For I/O's that span
more than stripe segment, there may be a slight performance
improvement since disks are able to do a little work in parallel.

RAID 1: Shadowing/Mirroring
RAID 1 provides redundancy by duplicating an entire disk drive
onto another. During a read, any of the drives in the shadow set
can be used. During a write, all drives will eventually be updated
with the new data. When all drives are functioning, reads complete
slightly faster than a single disk read since the controller will
route the read to a free (not busy) disk. Writes take slightly longer
than a single disk write. In the worst case, performance is equivalent
to a single disk.

RAID 5: Striping with Rotating Parity
RAID 5 improves on RAID 0 by striping data across 3 or more drives,
while also including parity information will be used to rebuild failed
drives in the array. Reads are striped. Writes require two I/O's
however; one for the data drive, and one for the parity.
RAID vendors typically include cache on-board the RAID controller to
increase performance (see below for a description of caching). RAID 5
is the most popular solution for large disk arrays, since it requires
a lower drive capacity overhead to achieve a measure of redundancy than
RAID Level 1.


3. DISK SUBSYSTEM PERFORMANCE ATTRIBUTES AS THEY PERTAIN TO ORACLE:

RANDOM READ PERFORMANCE
Heavily used in OLTP, this attribute is important for indexed or
hash-based queries and rollback segment reads

RANDOM WRITE PERFORMANCE
Important for DBWR process writes, this attribute is heavily used in
OLTP, lightly used in a DSS

SEQUENTIAL READ PERFORMANCE
Backups, Full-table scans, index creations, paralell queries, temporary
segment reads, recovery from archived redo log files

SEQUENTIAL WRITE PERFORMANCE
Important for LGWR process writes, temporary segment writes, loader
writes, tablespace creations

IMPACT OF CONCURRENCY
Concurrent writes or reads affect different RAID levels differently,
in turn affecting all of the above attributes. A high concurrent system
example would be DSS, whereas OLTP would be low.


4. HOW THE RAID LEVELS STACK UP ON PERFORMANCE:


RAID LEVEL 0
Peformance Attribute Attribute Rating Concurrency Performance (Hi/Low)
Random Read Excellent Excellent/Excellent (1)
Random Write Excellent Excellent/Excellent (1)
Sequential Read Excellent Poor/Excellent (2)
Sequential Write Excellent Poor/Excellent (2)

RAID LEVEL 1
Peformance Attribute Attribute Rating Concurrency Performance (Hi/Low)
Random Read Good Excellent/Excellent (1)
Random Write Good Excellent/Excellent (1)
Sequential Read Excellent Poor/Excellent (2)
Sequential Write Fair Poor/Excellent (2)

RAID LEVEL 5
Peformance Attribute Attribute Rating Concurrency Performance (Hi/Low)
Random Read Good Excellent/Excellent (1)
Random Write Good Poor/Good (1, 3)
Sequential Read Excellent Excellent/Excellent (1)
Sequential Write Fair Poor/Fair

Notes:
1) If each I/O request fits inside a single striping segement
2) If stripes are correctly sized
3) If asynchronous writes are used by the disk controller this can
be acceptable - but caching without battery backup poses serious
data integrity risks.


5. RAID LEVEL RECOMMENDATIONS BY ORACLE FILETYPE & PROCESS

RAID Level 0
RAID 0 is useful with Oracle to reduce disk hot spots for Oracle
data files. It is generally not recommended for other Oracle files.

RAID Level 1
RAID 1 can be used for any Oracle file. It is especially useful for
Oracle redo log files and control files; Oracle only has to issue one
redo log I/O, saving code path and context switching. However, the
DBA/system administrator must use the RAID controller utilities to
keep up with failed disks since the shadowing of the file is hidden
from Oracle.

RAID Level 5
RAID 5 is useful for Oracle data files, but not for redo log files
or rollback segements, which are sequentially accessed. Datafiles
belonging to temporary tablespaces, are not suitable for placement
on a RAID 5 device. More benefit will be seen in situations where
reads predominate over writes.


Recommendations by Oracle RDBMS FileTypes:
RAID / Type of RAID Control File Database File Redo Log Archive Log
0 / Striping avoid OK avoid avoid
1 / Shadowing recommended OK recommended recommended
5 / Striping w/ Round-Robin Parity OK Recommended avoid avoid

More Recommendations by Oracle RDBMS FileTypes:
(all scores are approximate, and can vary on different hardware)

Scoring: 1=Best Performance, 5=Worst Performance
File Types No RAID RAID 0 RAID 1 RAID 5
Control File 2 1 2 3
Redo Log 4 1 2 3
System Tablespace 2 1 2 3
Sort Segment 4 1 5 3
Rollback Segment 2 1 2 5
Indexed Read-only Datafiles 2 1 2 1
Sequential Read-only Datafiles 4 1 5 3
DBWR intensive Datafiles 1 1 2 5
Direct-Load intensive Datafiles 4 1 1 2


6. RAID CACHE CONFIGURATION RECOMMENDATIONS:

Oracle's recommendations for the size of RAID caches on NT are
configuration dependent.
- What kind of Cache? Read, write, or both?
- What is the write policy: eg write through, or write back?
- Are the server and disk array on a UPS?
Write-Through Write-Back
Completes When? data gets to disk data gets to cache
Write Speed slow: must wait for disk fast: memory to memory copy
Vulnerability none: writes are persistent high: writes to memory aren't persistent


Is the disk controller backed up by a battery? If not, the
amount of disk controller write cache should be set to zero.
Anything greater then zero puts the data at risk of corruption
in the event of a disk controller failure, since Oracle is not
aware of the caching software running beneath it.
If the environment is primarily OLTP, then you might want to
have enough write cache to accomodate the rate of data updates
(e.g., redo logs). If the environment is primarily DSS, then
you don't need write cache, and read cache is worthless as
Oracle7 does its own read caching.

Write Caching and its Effect on Oracle Processes:
Type of Caching Control File Database File Redo Log File Archive Log File
Write-Through OK OK avoid avoid
Write-Back (Unprotected) never never never never
Write-Back (Protected) OK OK avoid avoid


7. STRIPE SIZE AND ITS EFFECT ON RAID PERFORMANCE

What is Striping? This is a method of mapping data across the physical
drives in an array to create a large virtual drive. Each stripe
has a defined size or depth, defined in blocks. Data is mapped and
subdivided on the physical hard-disk drives based on the stripe size.
For Windows NT, RAID levels 0 and 5 utilize striping.

How to set the correct stripe size, or block depth?

This determination can be performed by answering the following
questions about the environment.

First, the stripe depth must be at least a multiple of the OS
block size, which is in turn divisible by the Oracle block
size.

For example, a 8K NTFS partition block size is hosting
some datafiles from an Oracle RDBMS, whose block size
is 4K. The stripe size/depth in this case should
be at least 8K, if not more.

Why is this important? Primarily, sizing is important to
avoid Stripe Crossing. This is a situation where an RDBMS
read or write operation, as interpreted by the OS, exceeds
the stripe depth/size. This causes unnecessary additional I/Os
and can slow performance.

Once the minimum stripe depth/size has been determined, then an
assessment of the type of system files that will be implemented
by the RDBMS needs to be made: OLTP or DSS? Generally OLTP
applications require smaller stripe depth/sizes, and DSS larger.
How large or small? This needs to be tuned by the sysop using
tools supplied by the hardware provider.



8. OTHER VARIABLES THAT CAN AFFECT RAID PERFORMANCE ON NT

The major ones are: access speed of constituent disks;
capacity of internal and external buses; number of buses;
size of caches; number of caches; and nature of the algorithms
used for determining how reads and writes are done. Physical
disk I/O rate capacities should not be exceeded. Most of these
variables are vendor-specific. Contact your hardware vendor for
more information about your disk subsystem to understand what
capacities the RAID can provide.



9. WINDOWS 2000 SPECIFIC RAID INFORMATION


Windows 2000 (Win2k) allows RAID 0, 1, and 5. These are referred to as
Striped, Mirrored and 'RAID5' in the documentation. Do not confuse
the 'Spanned' dynamic volume with a RAID level - it is merely a logical
drive made up of multiple physical disks and is not really RAID. RAID
levels 1 and 5 are only supported on the Win2k Server edition (not Professional).

All normal Oracle RAID directives apply equally to the Win2k RAID, but here
are some specifc performance recommendations when using Win2k's RAID features:

Set cache memory: as large as possible.
Set cache write policy: write back.
Set cache read policy: read ahead.

Set stripe size: 8 KB/16 KB for RAID 5 and 64 KB/128 KB or the maximum supported
for RAID 0; the optimal size may vary by controller.

Use multiple logical NTFS partitions. This can be set up using the RAID configuration
software or logical disk manager in Windows2000. Use 16 KB allocation size for
formatting the NTFS volumes (format <drive>: /fs:ntfs /A:16K). Increase the NTFS log file
size to 64 MB for large volumes (chkdsk /L:65536).

Note that data block=allocation unit in MS Windows documentation. For further
information about the Win2k RAID features, consult Win2k documentation.



10. DETERMINING PERFORMANCE OF A RAID ON NT

The Performance Monitor allows I/O statistics to be monitored
easily, in a graphical manner. It is essential to obtain baseline
figures for normal system operation, so you will know when a
performance problem develops, and when your corrective action
has restored (or improved upon) the performance normally
expected.


CONCLUSION

Storage subsystem performance has a great impact on overall Oracle
RDBMS performance on Windows NT 4.0/2000. In a well-tuned Oracle system,
I/O is not usually the performance bottleneck. To ensure that your
RAID is performing well, benchmark the array(s) on a regular basis
to determine their effect on overall server performance - and make
adjustments accordingly. .