Tuesday, April 30, 2013

How do I know if I need Oracle ACO / ASO / RAC / RAT ?

   The database technology market is full of products, even when we look at just Oracle's own line - Advanced Security, Advanced Compression, Real Application Clusters (RAC), Real Application Testing (RAT), Exadata & so on ! However, we don't just go out and buy everything; we need a genuine requirement before adding helper software / features to make our lives easier. How do we assess whether we need one or another ? Here's my humble attempt at a checklist for each of these 4 products (to print out or view the sheets fully, use the direct links) ... Exadata coming soon :-) :

(1) Assessing the need for Oracle Real Application Clusters (RAC) :

RAC checklist

(2) Assessing the need for Oracle Advanced Security Option (ASO) :

ASO checklist sheet1
ASO checklist sheet2

(3) Assessing the need for Oracle Advanced Compression Option (ACO)

ACO checklist

(4) Assessing the need for Oracle Real Application Testing (RAT)

RAT checklist sheet1
RAT checklist sheet2

   If you have any suggestions for improvement, please let me know. I appreciate the time you've spent here & hope you got some useful information out of this. Have fun architecting :-) !

Wednesday, July 08, 2009

HWM (high watermark) of a tablespace ?

How many times have you, as a DBA, thought of calculating the HWM (high watermark) of a tablespace ? Have you ever found that a tablespace is over-allocated compared to the data actually used in it ? In such a case, you'd want to reclaim the wasted space & resize the tablespace to a smaller size, though still greater than the current usage.
The HWM is the maximum number of blocks ever occupied by a SEGMENT (unless we truncate it). Segments can span datafiles but remain within one tablespace. All segments are physically allocated as extents, which are contiguous blocks of data. Since we are talking about contiguous blocks, 2 datafiles (even from the same tablespace) are highly unlikely to hold contiguous data blocks. Hence, strictly speaking, the high watermark can be calculated per datafile, not per tablespace, as the latter is only a logical unit. We can still resize a tablespace, but the resize value needs to be based on exact calculations; better yet, following the SEGMENT reasoning above, we should resize the individual datafiles within a tablespace for more accuracy.
For the sake of simplicity & ease of understanding, we may still see the datafile usage as an overall tablespace usage :

set verify off
column file_name format a50 word_wrapped
column smallest format 999,990 heading "Smallest|Size|Poss."
column currsize format 999,990 heading "Current|Size"
column savings format 999,990 heading "Poss.|Savings"
break on report
compute sum of savings on report

column value new_val blksize
select value from v$parameter where name = 'db_block_size';

--Tablespace-wise Used / unused blocks for HWM calculation

select tablespace_name, file_name,
ceil( (nvl(hwm,1)*&&blksize)/1024/1024 ) smallest,
ceil( blocks*&&blksize/1024/1024) currsize,
ceil( blocks*&&blksize/1024/1024) -
ceil( (nvl(hwm,1)*&&blksize)/1024/1024 ) savings
from dba_data_files a,
( select file_id, max(block_id+blocks-1) hwm
from dba_extents
group by file_id ) b
where a.file_id = b.file_id(+)
order by 1;
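To sanity-check the arithmetic the query performs, here's a small worked example outside the database. The 8 KB block size and the block counts are hypothetical, standing in for `db_block_size`, `dba_data_files.blocks` and the `max(block_id+blocks-1)` HWM from `dba_extents`:

```python
import math

# Hypothetical inputs for one datafile
blksize = 8192            # db_block_size in bytes
allocated_blocks = 128000 # dba_data_files.blocks
hwm_blocks = 51200        # max(block_id + blocks - 1) from dba_extents

smallest_mb = math.ceil(hwm_blocks * blksize / 1024 / 1024)       # smallest safe size
current_mb  = math.ceil(allocated_blocks * blksize / 1024 / 1024) # current size
savings_mb  = current_mb - smallest_mb                            # possible savings

print(smallest_mb, current_mb, savings_mb)  # → 400 1000 600
```

So a 1000 MB datafile whose segments only ever reached block 51200 could safely be shrunk to 400 MB.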

You may save the above output as a snapshot of your starting point. Now, let's create a script to resize the individual datafiles (you may want to exclude some important ones, like SYSTEM, UNDO, TEMP). I've used 2 types of scripts: one resizes exactly to the HWM & the other builds in some future space allocation while still reducing wasted space :

--Exact resizing

select 'alter database datafile ''' || file_name || ''' resize ' ||
ceil( (nvl(hwm,1)*&&blksize)/1024/1024 ) || 'm;' cmd
from dba_data_files a,
( select file_id, max(block_id+blocks-1) hwm
from dba_extents
group by file_id ) b
where a.file_id = b.file_id(+)
and ceil( blocks*&&blksize/1024/1024) -
ceil( (nvl(hwm,1)*&&blksize)/1024/1024 ) > 0;

-- Future-space-planned resizing : change the 0.1 (a 10% estimate) to any other value less than 1 to alter how much extra space you wish to keep in each datafile / tablespace

select 'alter database datafile ''' || file_name || ''' resize ' ||
ceil( (nvl(hwm,1)*&&blksize)/1024/1024 + 0.1*(nvl(hwm,1)*&&blksize)/1024/1024) || 'm;' cmd
from dba_data_files a,
( select file_id, max(block_id+blocks-1) hwm
from dba_extents
group by file_id ) b
where a.file_id = b.file_id(+)
and ceil( blocks*&&blksize/1024/1024) -
ceil( (nvl(hwm,1)*&&blksize)/1024/1024 ) > 0;
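The headroom arithmetic in this second script is easy to check by hand. With the same hypothetical datafile as before (HWM of 51200 blocks at an 8 KB block size), the 10% factor works out like this:

```python
import math

blksize = 8192       # db_block_size in bytes (hypothetical)
hwm_blocks = 51200   # high watermark of the datafile, in blocks
headroom = 0.10      # the 0.1 factor in the script above

hwm_mb = hwm_blocks * blksize / 1024 / 1024        # 400.0 MB at the HWM
resize_mb = math.ceil(hwm_mb + headroom * hwm_mb)  # HWM plus 10% growth room

print(f"alter database datafile '...' resize {resize_mb}m;")  # → resize 440m;
```

So the datafile is shrunk, but to 440 MB rather than the bare-minimum 400 MB.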

Dear reader, I've been able to make use of this procedure successfully. Please let me know your results & learnings.

Tuesday, June 09, 2009

How to speed up data transfer between servers

This article contains values as an example of a specific case. Your servers may have different configuration. Please use accordingly.

Over a network, although the network bandwidth could be 10 Gigabits, if the servers have
1 GigE Ethernet network cards (1 Gigabit), the overall file-transfer
throughput is 1 Gigabit / sec, which means (1,000,000,000 / 8) bytes
per sec = 125 MB/sec. This is the maximum throughput available on the network.

  • For convenience, we'll call the 2 servers we need to improve data transfer speeds between the source & the target. Now, if we ping from the source to the target server, we get about 42 ms of round-trip time.

  • This gives us the maximum amount of data that can be in flight on this network at any point in time: (125 MB/s) * (0.042 s) = 5.25 MB. However, the default network-level settings on the servers don't let this much bandwidth be used. We may verify it with the “ndd” & “route” commands (only the SA can run these):
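This quantity is the bandwidth-delay product, and the arithmetic is worth seeing spelled out (the 125 MB/s and 42 ms figures are the example values from this post):

```python
# Bandwidth-delay product: how much data can be "in flight" on the link
# before the first acknowledgement comes back.
link_rate_mb_per_s = 125  # 1 Gbit/s = 1,000,000,000 / 8 bytes ≈ 125 MB/s
rtt_s = 0.042             # 42 ms round-trip time, as measured by ping

bdp_mb = link_rate_mb_per_s * rtt_s
print(round(bdp_mb, 2))  # → 5.25
```

A TCP window smaller than this leaves the link idle while the sender waits for ACKs, which is exactly why the default 48 KB buffers cap the throughput.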

  • Note that “ndd” is a Solaris utility; if you have root access to a Solaris box, you may try the “ndd -get ...” & “route get ...” commands yourself, otherwise ask the SA to provide you the outputs, to find out the current settings.

  • The recv & xmit (receive & transmit) values are the defaults used by the server; the maximum values are defined by cwnd & max_buf. E.g. a value of 49152 indicates that this server (or any RMAN job) would, by default, send / receive only 48 KB of traffic at a time. A manually written program, however, may use up to the maximum of 1048576 bytes = 1 MB, by using the max variables.

  • Now, our purpose is to be able to use the whole 5.25 MB available on the current connection. However, the allowed values for these 4 variables should be a power of 2. Hence, we pick the next higher power of 2, which is 8 MB (2^23 = 8388608 bytes). We would need to set the max limit variables equal to 8 MB as well.
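The rounding step above can be sketched as a tiny helper (the 5.25 MB input is the bandwidth-delay product from this example):

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p

required = int(5.25 * 1024 * 1024)  # 5.25 MB bandwidth-delay product, in bytes
buf = next_power_of_two(required)

print(buf)  # → 8388608, i.e. the 8 MB we pass to tcp_recv_hiwat etc.
```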

  • The DBA should ask the SA to run these commands with the “set” option, instead of “get”, to apply the new values. E.g. “sudo ndd -set /dev/tcp tcp_recv_hiwat 8388608”

  • The “ndd” commands get / set values for ALL connections. We may not want that, because if a client with a lower capacity for handling traffic (say, a Windows client that can handle only 1 MB) connects to this server, the larger packets would overwhelm that connection. Hence, to avoid this scenario, we should use the “route” commands, which, unlike “ndd”, can set connection settings between specific IPs.

  • It’s always safer to use “route” settings instead of “ndd”, except for “tcp_max_buf”, which can only be set through “ndd” (please search the internet for the detailed syntax of the route / ndd commands) :

  • The “sendpipe” option is the same as ndd’s “tcp_xmit_hiwat” & “recvpipe” is “tcp_recv_hiwat”.
    “ssthresh” is the same as “tcp_cwnd_max”. However, “tcp_max_buf” still needs to be set through “ndd”.

  • With "route", all these settings will be wiped out & revert to defaults whenever a server is rebooted, so the changes are temporary until the next boot. With "ndd", they may be made permanent, which may be unnecessary unless it's between a DB & an RMAN backup server (viz. Netbackup) where there's a regular transfer of data.

  • "ndd" uses get / set options to show or edit values, whereas "route" uses print / add / delete options.

  • Now, having understood all the above concepts, when a DBA / SA is ready to start working on their specific environment of servers, they may follow this :

    1) First, we need to get the current values of the 4 parameters from the SA (to be executed on the source / target server) :
    a. ndd –get /dev/tcp tcp_recv_hiwat
    b. ndd –get /dev/tcp tcp_xmit_hiwat
    c. ndd –get /dev/tcp tcp_max_buf
    d. ndd –get /dev/tcp tcp_cwnd_max
    e. route get

    2) In the “route get” output, these parameters are named sendpipe, recvpipe & ssthresh (there’s no equivalent of tcp_max_buf here - it can be set only through the ndd command). “ndd” sets the parameters universally, for all IPs, whereas “route” can be used to set them between specific IPs & temporarily.

    3) Along with this, we need to know the server Ethernet card types (on both source & target servers), viz. 1 GigE, 10 GigE ; AND the network bandwidth, viz. a 10 Gbps line, 24 Mbps line, etc.

    4) Then we need to make calculations based on the above network bandwidth values, taking into the consideration, an example calculation on my Blog - http://databaseoracle.blogspot.com/2009/06/how-to-speed-up-data-transfer-between.html .

    5) This way, we arrive at a number, which is the total bandwidth that CAN BE available between the 2 servers under consideration. We’ll call this number (if it is a power of 2), or else the next power of 2, MAX-BANDWIDTH.

    6) Then we request the SA to set the above parameters to new values (use route command, instead of ndd , to set these values temporarily) :
    a. route add –host … -sendpipe MAX-BANDWIDTH -recvpipe MAX-BANDWIDTH -ssthresh MAX-BANDWIDTH
    b. ndd –set /dev/tcp tcp_cwnd_max MAX-BANDWIDTH (this parameter is available only through ndd & SA may decide whether to set it or not, based on implications) .

    7) If the SA responds within minutes, this whole exercise would take about 10-20 minutes.

    8) Please let me know in case of any doubts & your experience when you try these modifications.

    9) Happy transferring lightning-fast data !

    Sunday, January 27, 2008

    Issue with invalid objects while upgrading the database to

    Sometimes we face a problem with invalid objects while upgrading a 9i database to … When the objects in CTXSYS are invalid, the upgrade will not proceed. In this case we need to manually uninstall & install Oracle Text. This is what we did with one of our databases on a Solaris 9 Unix server.

    Please refer to the following notes for uninstalling & installing Oracle Text :

    Note 275689.1 : "Manual installation, deinstallation of Oracle Text 9.2.0.x"
    Note 280713.1 : "Manual installation, deinstallation of Oracle Text 10gR1 and 10gR2"

    An additional point to remember ... we faced this problem while doing it in one database :

    While installing Oracle Text, especially when we run the script “start ?/ctx/admin/defaults/drdefus.sql” (refer to the aforementioned notes), it may throw the following error :

    SQL> create index dropme_idx on dropme(name) indextype is ctxsys.context;
    create index dropme_idx on dropme(name) indextype is ctxsys.context
    ERROR at line 1:
    ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
    ORA-04068: existing state of packages has been discarded
    ORA-04063: stored procedure "CTXSYS.TEXTINDEXMETHODS" has errors

    In this case, please do the following :

    SQL> connect ctxsys/****
    SQL> connect sys as sysdba

    DECLARE
      ver varchar2(80);
    BEGIN
      SELECT ver_dict INTO ver FROM ctxsys.ctx_version;
      dbms_registry.loaded('CONTEXT', ver,
        'Oracle Text Release ' || ver || ' - Production');
    END;
    /

    -- to do: a validation procedure

    Thursday, November 29, 2007

    How to optimize UNDO tablespace ?

    --contributed by Manjunath Ulthi,

    It becomes a challenge to figure out the right amount of UNDO to be allocated to a database. Try this method :

    UndoSpace = [UR * (UPS * DBS)] + (DBS * 24)

    (UR) UNDO_RETENTION in seconds - an init parameter
    (UPS) Number of undo data blocks generated per second
    (DBS) DB_BLOCK_SIZE in bytes; the DBS * 24 term is overhead, which varies based on extent and file size

    Two of these (UR and DBS) can be obtained from the initialization parameters UNDO_RETENTION and DB_BLOCK_SIZE.
    The third piece of the formula (UPS) requires a query against the database: the number of undo blocks generated per second can be acquired from V$UNDOSTAT.

    The following formula calculates the total number of blocks generated and divides
    it by the amount of time monitored, in seconds:

    SQL>SELECT (SUM(undoblks))/ SUM ((end_time - begin_time) * 86400)
    FROM v$undostat;

    The following query calculates the number of bytes needed:

    SQL> SELECT (UR * (UPS * DBS)) + (DBS * 24) AS "Bytes"
    FROM (SELECT value AS UR FROM v$parameter WHERE name = 'undo_retention'),
    (SELECT (SUM(undoblks)/SUM(((end_time - begin_time)*86400))) AS UPS FROM v$undostat),
    (select block_size as DBS from dba_tablespaces where tablespace_name=
    (select value from v$parameter where name = 'undo_tablespace'));
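To get a feel for the numbers this formula produces, here is a worked example with hypothetical inputs (a 3-hour retention, 8 undo blocks generated per second, and an 8 KB block size):

```python
# UndoSpace = UR * (UPS * DBS) + DBS * 24, with hypothetical values
UR  = 10800  # undo_retention of 3 hours, in seconds
UPS = 8      # undo blocks generated per second, from v$undostat
DBS = 8192   # db_block_size in bytes

undo_bytes = UR * (UPS * DBS) + DBS * 24
print(round(undo_bytes / 1024 / 1024))  # → 675 (MB)
```

So this hypothetical database would need roughly a 675 MB undo tablespace to honour its retention target.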

    Simple, ain't it ? Please provide your inputs & actual case studies through the comments link ! We love feedback.

    Thursday, October 18, 2007

    Interesting --- the 249-day bug for Oracle on Linux servers

    --contributed by Manjunath Ulthi, Senior DBA at Infosys Technologies limited

    1 Introduction

    Problem: The OCI client spins when machine uptime >= 249 days on Linux machines. Once the uptime crosses 249 days, the database client hangs, leading to high CPU utilization and a hung server state. Users will not be able to connect to the databases. This is known Oracle bug 4612267.

    Solution: This can be handled by applying the patch 4612267.

    The patch has been applied on the servers server126, server041 & server120 and tested. After the patch was applied, client connections remained normal past 249 days and all database / application operations were normal; we didn’t encounter any problems.

    For further information, refer to the Metalink notes :

    Note 338461.1 : SQL*Plus With Instant Client Hangs, When System Uptime Is More Than 248 Days

    Note 4612267.8 : Bug 4612267 - OCI client spins when machine uptime >= 249 days
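Why roughly 249 days? A plausible mechanism (my assumption, not stated in the notes above) is a signed 32-bit tick counter running at 100 ticks per second, which wraps after 2^31 ticks:

```python
# Assumed mechanism: a signed 32-bit counter of 10 ms clock ticks
# (100 ticks/second) overflows at 2**31 ticks.
ticks_per_second = 100
overflow_days = 2**31 / ticks_per_second / 86400  # seconds per day

print(round(overflow_days, 1))  # → 248.6
```

That ~248.6-day figure lines up with the "more than 248 days" / "249 days" wording of the two notes.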

    2 General Instructions

    Procedure to apply the patch

    1. Download the patch from Oracle Metalink.

    The downloaded patch (p4612267_10201_LINUX.zip) is available at server120 at location /app/oracle

    2. Copy the zip file to $ORACLE_HOME
    3. Unzip the file. It will create a directory named after the patch number, 4612267
    4. Test the opatch utility and check the patches which have already been applied on this database server.

    bash-2.05b$ $ORACLE_HOME/OPatch/opatch lsinventory

    Invoking OPatch

    Oracle interim Patch Installer version

    Copyright (c) 2005, Oracle Corporation. All rights reserved..

    Oracle Home : /app/oracle/product/10.2.0

    Central Inventory : /app/oracle/oraInventory

    from : /app/oracle/product/10.2.0/oraInst.loc

    OPatch version :

    OUI version :

    OUI location : /app/oracle/product/10.2.0/oui

    Log file location : /app/oracle/product/10.2.0/cfgtoollogs/opatch/opatch-2007_Oct_13_10-11-13-EDT_Sat.log

    Lsinventory Output file location : /app/oracle/product/10.2.0/cfgtoollogs/opatch/lsinv/lsinventory-2007_Oct_13_10-11-13-EDT_Sat.txt


    Installed Top-level Products (1):

    Oracle Database 10g

    There are 1 products installed in this Oracle Home.

    Interim patches (2) :

    Patch 5865568 : applied on Thu Mar 01 21:50:15 EST 2007

    Created on 15 Feb 2007, 20:01:44 hrs PST8PDT

    Bugs fixed:


    Patch 4516865 : applied on Wed Sep 20 04:47:39 EDT 2006

    Created on 14 Feb 2006, 00:22:00 hrs US/Pacific

    Bugs fixed:

    5039292, 4747264, 4908101, 4516865


    OPatch succeeded.

    5. Take a backup of the lib directory in $ORACLE_HOME

    bash-2.05b$ pwd


    bash-2.05b$ cp -R * ../bkp_lib/

    6. Go to $ORACLE_HOME/4612267
    7. Bring down the databases, listeners and DB consoles. Ensure that you have good backup pieces before proceeding.
    8. Apply the patch :

    $ORACLE_HOME/4612267>>>$ $ORACLE_HOME/OPatch/opatch apply

    Invoking OPatch

    Oracle interim Patch Installer version

    Copyright (c) 2005, Oracle Corporation. All rights reserved..

    Oracle Home : /app/oracle/product/10.2.0

    Central Inventory : /app/oracle/oraInventory

    from : /app/oracle/product/10.2.0/oraInst.loc

    OPatch version :

    OUI version :

    OUI location : /app/oracle/product/10.2.0/oui

    Log file location : /app/oracle/product/10.2.0/cfgtoollogs/opatch/opatch-2007_Oct_13_10-22-20-EDT_Sat.log

    ApplySession applying interim patch '4612267' to OH '/app/oracle/product/10.2.0'

    Invoking fuser to check for active processes.

    Invoking fuser on "/app/oracle/product/10.2.0/bin/oracle"

    OPatch detected non-cluster Oracle Home from the inventory and will patch the local system only.

    Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.

    (Oracle Home = '/app/oracle/product/10.2.0')

    Is the local system ready for patching?

    Do you want to proceed? [y|n]


    User Responded with: Y

    Backing up files and inventory (not for auto-rollback) for the Oracle Home

    Backing up files affected by the patch '4612267' for restore. This might take a while...

    Backing up files affected by the patch '4612267' for rollback. This might take a while...

    Patching component oracle.oracore.rsf,

    Updating archive file "/app/oracle/product/10.2.0/lib/libcore10.a" with "lib/libcore10.a/sltrg.o"

    Patching component oracle.rdbms,

    Updating archive file "/app/oracle/product/10.2.0/lib/libcore10.a" with "lib/libcore10.a/sltrg.o"

    Running make for target client_sharedlib

    Running make for target client_sharedlib

    Running make for target ioracle

    ApplySession adding interim patch '4612267' to inventory

    The local system has been patched and can be restarted.

    OPatch succeeded.

    9. Verify the patch in the inventory. Check for patch number 4612267 in the results :

    bash-2.05b$ $ORACLE_HOME/OPatch/opatch lsinventory

    Invoking OPatch

    Oracle interim Patch Installer version

    Copyright (c) 2005, Oracle Corporation. All rights reserved..

    Oracle Home : /app/oracle/product/10.2.0

    Central Inventory : /app/oracle/oraInventory

    from : /app/oracle/product/10.2.0/oraInst.loc

    OPatch version :

    OUI version :

    OUI location : /app/oracle/product/10.2.0/oui

    Log file location : /app/oracle/product/10.2.0/cfgtoollogs/opatch/opatch-2007_Oct_13_10-24-32-EDT_Sat.log

    Interim patches (3) :

    Patch 4612267 : applied on Sat Oct 13 10:23:06 EDT 2007

    Created on 5 Oct 2005, 13:48:00 hrs US/Pacific

    Bugs fixed:


    Patch 5865568 : applied on Thu Mar 01 21:50:15 EST 2007

    Created on 15 Feb 2007, 20:01:44 hrs PST8PDT

    Bugs fixed:


    Patch 4516865 : applied on Wed Sep 20 04:47:39 EDT 2006

    Created on 14 Feb 2006, 00:22:00 hrs US/Pacific

    Bugs fixed:

    5039292, 4747264, 4908101, 4516865


    OPatch succeeded.

    10. Start the databases, listeners & DB Consoles and verify the databases & applications.

    Monday, October 01, 2007

    The art and science of Oracle DB performance tuning

    Back to square one, where I started this blog ? ...

    Nooooo ... I am not providing a step-by-step procedure for Oracle performance tuning here, but rather a framework to help a DBA tackle Oracle performance issues. This could be seen as a reference for where to start, what information to collect, how to proceed, how to conclude, etc.

    Without much ado, let's go ahead.

    What's the problem ?

    Some of us (DBAs) get a bad-performance notice from users & might tend to jump directly into finding a solution. A better way is to first find out what the problem is & how the user sees it. Start by asking the requestor these questions :

    1. Is the whole application slow, or only some modules ?
    2. When did you first notice this problem ?
    3. Was the problem noticed all of a sudden, or is it gradual ?
    4. Is it recurring, as you mentioned ?
    5. Is there any recent upgrade to the application ?
    6. Is there any substantial increase in data ?
    7. Is there any increase in the number of concurrent users ?
    8. Is there any increase in the total number of users ?
    9. Define the problem in your own words.
    10. Suppose we find that a certain module of the application is causing the problem; is it possible to change that module ?
    11. Any other comments about the problem ?

    PS : I recently got a request from a user to bounce (restart) a PRODuction database, stating that the DB was down. When I spoke to him, I discovered that just because he was not able to connect to his application & see anything, he assumed the Oracle database behind it was down & needed a bounce. Since I already knew the database was up & fine, I asked him to speak to the application team. It turned out to be an application issue, never to be handled by a DBA :-)

    Tackle the problem now

    Once the problem is clear & the DBA knows there is a real issue, one may follow these sure-shot steps. You are welcome to follow your own strategy to resolve the issue. If you have a strategy of your own for handling performance issues, please share it through the comments on this post & I will incorporate it here, as your contribution.

    1. Talk to the user who is facing the problem, to clarify any of the above inputs
    2. Install & run STATSPACK on the concerned Oracle database. By default, spauto.sql schedules the snapshots hourly.
    3. STATSPACK would take some time to provide some useful report. Meanwhile, look at the other aspects below.
    4. If the user knows some areas / schema / SQL / PL/SQL code areas of the database, which might be causing trouble, look at them once. Understand from the user, the database logic used in their "problem operation".
    5. Analyze the database objects involved in the above key areas, for appropriate indexing, fresh statistics.
    6. Look at the explain plan if a SQL statement has already been pointed out; otherwise perform this step after you get the STATSPACK report.
    7. Start tracing for the schema(s) which the users use to connect to the database.
    8. Have the user run their operation once, to make sure it is captured in STATSPACK & the trace.
    9. Generate the STATSPACK report & use my "STATSPACK post" , the "Survival guide" or a statspack analysis tool to see it in a more understandable format with recommendations too (this tool is not necessary, however, it is helpful).
    10. Format the recent trace (with tkprof utility), generated after the trial of the concerned operation by the user.
    11. Analyze the above 2 reports & find the issues / areas to focus on. The STATSPACK report can list the top SQL / PL/SQL areas of the database. You can start your real, concentrated analysis here. You may repeat steps 4, 5, 6, 7 again if required (most of the time, they will be).
    12. If the problem area is identified as PL/SQL code, it's best to use DBMS_PROFILER to find out the problem line(s) of code. You may use "Oracle official area" to know how to use DBMS_PROFILER or try the database journal (you may just enter DBMS_PROFILER in a search engine & get 100's of results on how to use it). DBMS_PROFILER needs to be installed first as SYS user (profload.sql) & then some objects are created in the concerned schema through schema login (proftab.sql).

    Let me know your case-studies - I'll post them here too, along with mine. More on tuning later ...