Exadata Performance Improvement Tips

Several considerations need to be taken care of while dealing with performance tuning in Exadata:

1. Cell.smart_scan_capable
2. Write Back Flash Cache mode
3. Exadata Flash Cache Compression
4. Cell Server Calibration
5. Disk Scrubbing schedule
6. SQL quarantine
7. SQL level performance improvement
8. Huge Pages
9. Exafusion for OLTP

1. CELL.SMART_SCAN_CAPABLE=TRUE
To take advantage of Exadata features we need to set CELL.SMART_SCAN_CAPABLE=TRUE on the ASM disk groups in which the tablespaces are created.
In Exadata, grid disks are presented to ASM. With CELL.SMART_SCAN_CAPABLE=TRUE set on the ASM disk groups, smart scan operations such as predicate filtering, column projection and Bloom filtering are performed on the cell storage servers.
When 'cell smart table scan' and 'cell smart index scans' are used, you can check the proportion of reads that actually use smart scan with the statistic 'cell physical IO bytes eligible for predicate offload'. This is the amount of reads (in bytes) that is handed to the smart scan code.
The total amount of reads is 'physical read total bytes', so you can compare the two to see which part of your reads is subject to smart scan.
If 'cell physical IO interconnect bytes returned by smart scan' is small compared to the eligible bytes, that is good: smart scan is happening and most of the filtering is done on the cells.
If 'cell physical IO bytes eligible for predicate offload' / 'physical read total bytes' is quite small, that is a problem and the workload needs tuning.
You want direct-path reads, and you want to see 'TABLE ACCESS STORAGE' operations in the execution plan.
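As an illustration (the disk group name DATA is only a placeholder), the attribute can be set on an ASM disk group from an ASM instance, and the statistics above can be compared with a query like this:

ALTER DISKGROUP data SET ATTRIBUTE 'cell.smart_scan_capable' = 'TRUE';

SELECT name, value
  FROM v$sysstat
 WHERE name IN ('physical read total bytes',
                'cell physical IO bytes eligible for predicate offload',
                'cell physical IO interconnect bytes returned by smart scan');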


2.  Write Back Flash Cache mode
Databases whose performance problems are caused by write I/O benefit from Write-Back; the most likely symptom is a high number of 'free buffer waits' wait events. Note that Smart Flash Logging is done in both Write-Through and Write-Back modes.
Exadata Database Machine operates in WriteThrough mode by default, in which only read IOs are cached in the flash cache. Write-Back Flash Cache (WBFC) provides the ability to cache write IOs directly in the flash cache in addition to the read IOs.
CellCLI> list cell attributes flashcachemode
         WriteThrough
If the flash cache mode is WriteThrough, Write Back Flash Cache can be enabled when the storage cell software version is 11.2.3.3.1 or higher. With 11.2.3.3.1 or higher it is not required to stop the cellsrv process or to inactivate the grid disks.
To reduce the performance impact on the application, execute the change during a period of reduced workload.
Steps to change flashCacheMode from WriteThrough to WriteBack:
a. Validate all the physical disks are in NORMAL state before modifying the flash cache. The following command should return no rows.
# dcli -l root -g cell_group cellcli -e "list physicaldisk attributes name,status" | grep -v NORMAL
b. Drop the flash cache
# dcli -l root -g cell_group cellcli -e drop flashcache
c. Set flashCacheMode attribute to writeback
# dcli -l root -g cell_group cellcli -e "alter cell flashCacheMode=writeback"
d. Re-create the flash cache
# dcli -l root -g cell_group cellcli -e create flashcache all
e. Check attribute flashCacheMode is WriteBack:
# dcli -l root -g cell_group cellcli -e list cell detail | grep flashCacheMode
f. Validate griddisk attributes cachingPolicy and cachedby
# cellcli -e list griddisk attributes name,cachingpolicy,cachedby
  
CellCLI> list cell attributes flashcachemode
         WriteBack
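As a rough pre-check before enabling Write-Back (a sketch only; what counts as "high" depends on the workload), the wait event mentioned above can be inspected from the database:

SELECT event, total_waits, time_waited
  FROM v$system_event
 WHERE event = 'free buffer waits';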

3.  Exadata Flash Cache Compression
Note : Exadata Flash Cache Compression is discontinued from Exadata Storage Server X5 onwards.
It is supported on Exadata Storage Servers X3 and X4 only, with Exadata Storage Server Software version 11.2.3.3.0 and above, and it requires the Oracle Advanced Compression option to enable flash cache compression.
Advantages
Improved performance for large datasets.
No performance overhead
Compression and decompression operations are transparent to the application and databases.
Higher compression ratio for uncompressed tables & indexes
Stores up to 2 times more data in flash cache.
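A minimal sketch of enabling it on a cell (verify the exact attribute names for your hardware generation and software release; the flash cache normally has to be dropped and re-created around this change, as in the Write-Back steps above):

CellCLI> ALTER CELL flashCacheCompX3Support=true
CellCLI> ALTER CELL flashCacheCompress=true

The first command is needed on X3 cells only; X4 cells need only the second.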

4. Cell Server Calibration:
Note : This is not recommended to run during normal operation.
This command is mainly used during initial setup of Exadata, but it can also be run at any off-peak time to check physical disk performance. The CALIBRATE command runs parallel stress tests on the cell. If low-performing disks are detected, calibrate tests them individually and displays the output on the command prompt.
The calibrate command is only available to the root user. There are other users such as celladmin and cellmonitor on a cell storage server, and if you log in with these users you will not be able to run the calibrate command.
If there is no user workload, CALIBRATE FORCE can be run while CELLSRV is still up.
It is therefore recommended not to run it during normal operations. Without FORCE, CELLSRV must be shut down.
CellCLI> help calibrate
   Usage: CALIBRATE [HARDDISK | FLASHDISK | lun1 [, lun2]] [FORCE]
  Purpose: Runs raw performance tests on disks.
           CELLSRV must be shutdown.  User id root is required.
   Options:
    [FLASHDISK]: option to specify that only flash LUNs be calibrated.
    [HARDDISK] : option to specify that only hard disk LUNs be calibrated.
    lun1 [, lun2] : option to specify list of LUNs by name
    [FORCE]    : runs test even while CELLSRV is running.
   Examples:
    CALIBRATE
    CALIBRATE '0_8', '0_10' force

CellCLI>
CellCLI> calibrate force

5. Disk Scrubbing Schedule: Adaptive Scrubbing Schedule
Oracle Exadata Storage Server Software automatically inspects and repairs hard disks periodically when the hard disks are idle. The default schedule of scrubbing is every two weeks.
However, once a hard disk starts to develop bad sectors, it is better to scrub that disk more frequently because it is likely to develop more bad sectors. In release 12.1.2.3.0, if a bad sector is found on a hard disk in a current scrubbing job, Oracle Exadata Storage Server Software will schedule a follow-up scrubbing job for that disk in one week. When no bad sectors are found in a scrubbing job for that disk, the schedule will fall back to the scrubbing schedule specified by the hardDiskScrubInterval attribute.
If the user has changed the hardDiskScrubInterval to less than or equal to weekly, Oracle Exadata Storage Server Software will use the user-configured frequency instead of the weekly follow-up schedule even if bad sectors are found.
We can schedule this scrubbing for the weekend instead of the biweekly weekday default. Keeping the scrubbing at an off-peak time on the weekend takes the scrubbing load off the cell I/O during business hours.
Check the current setting
CellCLI> list cell attributes hardDiskScrubInterval
         biweekly
CellCLI> list cell attributes hardDiskScrubStartTime
         1969-12-31T16:00:00-08:00
The schedule can be changed to weekly, with the start time set to a weekend:
CellCLI> ALTER CELL hardDiskScrubStartTime='2017-07-16T16:00:00-08:00'
CellCLI> ALTER CELL hardDiskScrubInterval=weekly
Scrubbing a 4 TB disk may take 6 to 7 hours.
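To apply the same change on all cells in one go, a dcli sketch along these lines can be used (cell_group is the usual file listing the cell host names):

# dcli -l root -g cell_group cellcli -e "ALTER CELL hardDiskScrubStartTime='2017-07-16T16:00:00-08:00'"
# dcli -l root -g cell_group cellcli -e "ALTER CELL hardDiskScrubInterval=weekly"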

6. Make sure SQLs are not getting quarantined
Quarantines: The option of calling a crash cart is not available here, so starting in early 11.2.0.3 a quarantine system was introduced: after a crash, the exception handler remembers both the sql_id and the disk region being processed and creates a persistent quarantine for both. When a sql_id or a disk region is quarantined, any smart scan operations on them are executed in passthru mode.
Currently it is hard to see when this is happening, but an upcoming release has a new statistic to make it easier. If an operation has been quarantined, you can see its effects by monitoring: cell num bytes in passthru due to quarantine
Listing Quarantines 
When a quarantine has been created you can look at it in detail using CellCLI:
CellCLI> list quarantine
         20      "SQL PLAN"      Crash

CellCLI> list quarantine detail
         name:                   20
         cellsrvChecksum:        150f7acd0b05d50095223fb9399e36ee
         clientPID:              104403
         comment:                None
         crashReason:            Signal:11_addr:0xfffffffffffffff
         creationTime:           2016-09-13T11:22:00-07:00
         dbUniqueID:             1881175214
         dbUniqueName:           xx_xx
         incidentID:             1
         planLineID:             6
         quarantineMode:         "FULL Quarantine"
         quarantinePlan:         SYSTEM
         quarantineReason:       Crash
         quarantineType:         "SQL PLAN"
         remoteHostName:         xxxx.xxxx.xx
         rpmVersion:             cellofl-12.1.2.3.2_LINUX.X64_160721
         sqlID:                  5xx202y5yrwv4p
         sqlPlanHashValue:       89138300
CellCLI>

The 'list detail' version of the command gives us everything we would need to know about exactly what has been quarantined and why it was quarantined. CellCLI also supports manually creating a quarantine using the attributes shown by 'list detail'.
Dropping Quarantines 
Quarantines can be removed manually using CellCLI (see the example after this list). Quarantines are also dropped automatically by certain operations:
Offload quarantines - dropped on rpm upgrades
Plan step quarantines - dropped on rpm upgrades
Database quarantines - dropped on rpm upgrades
Disk region quarantines - dropped on rpm upgrades, or on successful writes to the quarantined disk region
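For example, the quarantine listed earlier could be removed manually by its name (20 in the output above), or all quarantines on a cell can be dropped at once:

CellCLI> drop quarantine 20
CellCLI> drop quarantine all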

7. SQL level performance improvement
SQL level performance improvement is an iterative process.
a. Full hint: The FULL hint is an optimizer directive used to force a full table scan of the specified table. The optimizer should choose a full table scan whenever it is the "best" execution plan.
select /*+ full(test) */ id from test;
select /*+ FULL(test) PARALLEL(test, 8) */ . . . 
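To confirm whether a given statement was actually offloaded, the I/O columns of V$SQL can be compared (a sketch; substitute the sql_id of the statement being checked):

SELECT sql_id,
       io_cell_offload_eligible_bytes,
       io_interconnect_bytes
  FROM v$sql
 WHERE sql_id = '&sql_id';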

b. Direct Path Reads: The SELECT part of UPDATE/DELETE statements (the part that finds the rows to update or delete) never automatically gets direct path reads/smart scan chosen. Even when a standalone SELECT statement uses smart scan and is really fast, the same select operation in an INSERT SELECT (or UPDATE/DELETE) context may not end up using smart scan by default.
To work around these problems and force a direct path read/smart scan:
Run the query in parallel, as parallel full segment scans will use direct path reads, unless your parallel_degree_policy = AUTO.
You may run the query in serial, but force serial direct path reads by setting _serial_direct_read = TRUE.
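A minimal sketch of both approaches on the test table used above (underscore parameters such as _serial_direct_read should normally only be set after confirming with Oracle Support):

select /*+ full(test) parallel(test, 8) */ count(*) from test;

alter session set "_serial_direct_read" = TRUE;
select /*+ full(test) */ count(*) from test;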

c. Invisible indexes: Consider making indexes invisible, because in Exadata queries most of the time need to be offloaded for smart scan, and an existing index can steer the optimizer away from the full scan and cause issues.
If performance degradation is observed, the index can be made visible again without having to rebuild the index.  This can minimize the period of performance degradation, while also preventing an expensive operation to rebuild or recreate an index.
We can verify index visibility from the data dictionary view dba_indexes to determine the status of an index.
To make an index invisible: ALTER INDEX index_name INVISIBLE;
To make an index visible: ALTER INDEX index_name VISIBLE;
Invisible indexes are maintained like any other index, but they are ignored by the optimizer unless the OPTIMIZER_USE_INVISIBLE_INDEXES parameter is set to TRUE at the instance or session level.
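For example, visibility can be checked from the dictionary (the schema name SCOTT is only a placeholder):

SELECT owner, index_name, table_name, visibility
  FROM dba_indexes
 WHERE owner = 'SCOTT';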

8. Huge Pages:
To optimize performance it is highly recommended to configure HugePages on each Exadata database node.
Step 1.
a. Make sure the databases are not using Oracle 11g AMM (Automatic Memory Management).
b. If you want to use HugePages, make sure that both the MEMORY_TARGET and MEMORY_MAX_TARGET initialization parameters are unset (i.e. using "ALTER SYSTEM RESET") for the database instance.

Step 2.
a. vm.nr_hugepages should be large enough to cover the sum of the SGA sizes of all databases hosted on the particular node (see the worked illustration below).
  vm.nr_hugepages = xxxx
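As a hedged illustration only (the SGA sizes here are made up): with the default 2 MB huge page size, two instances with 20 GB and 10 GB SGAs on one node need at least (20 + 10) * 1024 / 2 = 15360 huge pages, plus a little headroom. The hugepages_settings.sh script from MOS note 401749.1 (referenced at the end of this section) computes the value for the instances that are currently running.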
b.  Set the kernel parameter.  This can be done on all nodes as root via the following dcli command:
[root@abcxyz1 ~]# dcli -l root -g dbs_group "sysctl -w vm.nr_hugepages=xxxx"
abcxyz1: vm.nr_hugepages = xxxx
abcxyz2: vm.nr_hugepages = xxxx
abcxyz3: vm.nr_hugepages = xxxx
abcxyz4: vm.nr_hugepages = xxxx
c. Ensure that the kernel setting is persistent across reboots.
Edit the /etc/sysctl.conf configuration file on all nodes and add the following line:
vm.nr_hugepages = xxxx
d. Verify Huge Page reservations:
# dcli -l root -g dbs_group "grep Huge /proc/meminfo"

e. Set the database init.ora parameter for using huge pages in both databases (run the statement in each one):
abc> alter system set use_large_pages=ONLY scope=spfile sid='*';
f. Restart the databases
$ srvctl stop database -d test
$ srvctl start database -d test

More details:
MOS Doc ID 401749.1  and  749851.1

9.  Exafusion for OLTP
Exafusion is a new feature in Exadata, available starting from 12.1.2.1.1, for speeding up OLTP; it is labelled "Hardware Optimized OLTP Messaging".
It is a re-implementation of cache fusion.
Messages bypass the network stack, leading to a performance improvement.
Exafusion is disabled by default.
To enable Exafusion, set the EXAFUSION_ENABLED initialization parameter to 1.
To disable Exafusion, set the EXAFUSION_ENABLED initialization parameter to 0.
This parameter cannot be set dynamically. It must be set before instance startup.
Either all of the instances in an Oracle RAC cluster must enable this parameter, or all of them must disable it.
The minimum requirements for Exafusion are Oracle Exadata Storage Server Software release 12.1.2.1.1 or later and Oracle Database software release 12.1.0.2.0 BP6 or later.
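A minimal sketch of enabling it (since the parameter is static, it goes into the spfile and takes effect at the next instance startup; set it the same way on every instance, for example with sid='*', and then restart the databases):

abc> alter system set exafusion_enabled=1 scope=spfile sid='*';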


