Thursday, November 29, 2007

How to optimize UNDO tablespace ?

--contributed by Manjunath Ulthi,

It becomes a challenge to figure out the right amount of UNDO to be allocated to a database. Try this method :

UndoSpace = [UR * (UPS * DBS)] + (DBS * 24)

where,
(UR) UNDO_RETENTION in seconds - an init parameter
(UPS) Number of undo data blocks generated per second
(DBS) Overhead varies based on extent and file size (db_block_size)


Two (UPS) can be obtained from the initialization file: UNDO_RETENTION and DB_BLOCK_SIZE.
The third piece of the formula (DBS) requires a query against the database. The number of undo blocks generated per second can be acquired from V$UNDOSTAT.


The following formula calculates the total number of blocks generated and divides
it by the amount of time monitored, in seconds:

SQL>SELECT (SUM(undoblks))/ SUM ((end_time - begin_time) * 86400)
FROM v$undostat;



The following query calculates the number of bytes needed:

SQL> SELECT (UR * (UPS * DBS)) + (DBS * 24) AS "Bytes"
FROM (SELECT value AS UR FROM v$parameter WHERE name = 'undo_retention'),
(SELECT (SUM(undoblks)/SUM(((end_time - begin_time)*86400))) AS UPS FROM v$undostat),
(select block_size as DBS from dba_tablespaces where tablespace_name=
(select value from v$parameter where name = 'undo_tablespace'));


Simple, ain't it ? Please provide your inputs & actual case studies through the comments link ! We love feedback.

Thursday, October 18, 2007

Intereseting --- 249 days bug for Oracle on Linux servers

--contributed by Manjunath Ulthi, Senior DBA at Infosys Technologies limited

1 Introduction


Problem: OCI client spins when machine uptime >= 249 days in Linux machines. Once the uptime crosses 249 days the database client hangs and leads to high CPU utilization and server hung state. The user will not be able to connect to the databases This is oracle known bug 4612267.


Solution: This can be handled by applying the patch 4612267.


The patch has been applied in the servers server126, server041 & server120 and tested. After the patch has been applied the client connections were normal after 249 days and all the operations of databases/applications were normal and didn’t encounter any problems.

For further information refer the Metalink Notes

Note: 338461.1: SQL*Plus With Instant Client 10.2.0.1 Hangs, When System Uptime Is More Than 248 Days

Note:4612267.8 : Bug 4612267 - OCI client spins when machine uptime >= 249 days

2 General Instructions

Procedure to apply the patch

1. Download the patch from oracle metalink.

The downloaded patch (p4612267_10201_LINUX.zip) is available at server120 at location /app/oracle

2. Copy the Zip file at $ORACLE_HOME
3. Unzip the file .It will create a directory with patch number 4612267
4. Test the opatch utility and check the patches which have been applied to the database server.

bash-2.05b$ $ORACLE_HOME/OPatch/opatch lsinventory

bash-2.05b$ $ORACLE_HOME/OPatch/opatch lsinventory

Invoking OPatch 10.2.0.1.0

Oracle interim Patch Installer version 10.2.0.1.0

Copyright (c) 2005, Oracle Corporation. All rights reserved..

Oracle Home : /app/oracle/product/10.2.0

Central Inventory : /app/oracle/oraInventory

from : /app/oracle/product/10.2.0/oraInst.loc

OPatch version : 10.2.0.1.0

OUI version : 10.2.0.1.0

OUI location : /app/oracle/product/10.2.0/oui

Log file location : /app/oracle/product/10.2.0/cfgtoollogs/opatch/opatch-2007_Oct_13_10-11-13-EDT_Sat.log

Lsinventory Output file location : /app/oracle/product/10.2.0/cfgtoollogs/opatch/lsinv/lsinventory-2007_Oct_13_10-11-13-EDT_Sat.txt

--------------------------------------------------------------------------------

Installed Top-level Products (1):

Oracle Database 10g 10.2.0.1.0

There are 1 products installed in this Oracle Home.

Interim patches (2) :

Patch 5865568 : applied on Thu Mar 01 21:50:15 EST 2007

Created on 15 Feb 2007, 20:01:44 hrs PST8PDT

Bugs fixed:

5865568

Patch 4516865 : applied on Wed Sep 20 04:47:39 EDT 2006

Created on 14 Feb 2006, 00:22:00 hrs US/Pacific

Bugs fixed:

5039292, 4747264, 4908101, 4516865

--------------------------------------------------------------------------------

OPatch succeeded.

5. Take the backup of lib directory in $ORACLE_HOME

bash-2.05b$ pwd

/app/oracle/product/10.2.0/lib

bash-2.05b$ cp -R * ../bkp_lib/

6. Go to $ORACLE_HOME/4612267
7. Bring down the databases, listeners and DB consoles. Ensure that you have good backup pieces before proceeding
8. Apply the patch

$ORACLE_HOME/4612267>>>$ $ORACLE_HOME/OPatch/opatch apply

Invoking OPatch 10.2.0.1.0

Oracle interim Patch Installer version 10.2.0.1.0

Copyright (c) 2005, Oracle Corporation. All rights reserved..

Oracle Home : /app/oracle/product/10.2.0

Central Inventory : /app/oracle/oraInventory

from : /app/oracle/product/10.2.0/oraInst.loc

OPatch version : 10.2.0.1.0

OUI version : 10.2.0.1.0

OUI location : /app/oracle/product/10.2.0/oui

Log file location : /app/oracle/product/10.2.0/cfgtoollogs/opatch/opatch-2007_Oct_13_10-22-20-EDT_Sat.log

ApplySession applying interim patch '4612267' to OH '/app/oracle/product/10.2.0'

Invoking fuser to check for active processes.

Invoking fuser on "/app/oracle/product/10.2.0/bin/oracle"

OPatch detected non-cluster Oracle Home from the inventory and will patch the local system only.

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.

(Oracle Home = '/app/oracle/product/10.2.0')

Is the local system ready for patching?

Do you want to proceed? [y|n]

y

User Responded with: Y

Backing up files and inventory (not for auto-rollback) for the Oracle Home

Backing up files affected by the patch '4612267' for restore. This might take a while...

Backing up files affected by the patch '4612267' for rollback. This might take a while...

Patching component oracle.oracore.rsf, 10.2.0.1.0...

Updating archive file "/app/oracle/product/10.2.0/lib/libcore10.a" with "lib/libcore10.a/sltrg.o"

Patching component oracle.rdbms, 10.2.0.1.0...

Updating archive file "/app/oracle/product/10.2.0/lib/libcore10.a" with "lib/libcore10.a/sltrg.o"

Running make for target client_sharedlib

Running make for target client_sharedlib

Running make for target ioracle

ApplySession adding interim patch '4612267' to inventory

The local system has been patched and can be restarted.

OPatch succeeded.

9. Verify the patch in inventory……. Check the patch number 4612267 in the results

bash-2.05b$ $ORACLE_HOME/OPatch/opatch lsinventory

Invoking OPatch 10.2.0.1.0

Oracle interim Patch Installer version 10.2.0.1.0

Copyright (c) 2005, Oracle Corporation. All rights reserved..

Oracle Home : /app/oracle/product/10.2.0

Central Inventory : /app/oracle/oraInventory

from : /app/oracle/product/10.2.0/oraInst.loc

OPatch version : 10.2.0.1.0

OUI version : 10.2.0.1.0

OUI location : /app/oracle/product/10.2.0/oui

Log file location : /app/oracle/product/10.2.0/cfgtoollogs/opatch/opatch-2007_Oct_13_10-24-32-EDT_Sat.log

Interim patches (3) :

Patch 4612267 : applied on Sat Oct 13 10:23:06 EDT 2007

Created on 5 Oct 2005, 13:48:00 hrs US/Pacific

Bugs fixed:

4612267

Patch 5865568 : applied on Thu Mar 01 21:50:15 EST 2007

Created on 15 Feb 2007, 20:01:44 hrs PST8PDT

Bugs fixed:

5865568

Patch 4516865 : applied on Wed Sep 20 04:47:39 EDT 2006

Created on 14 Feb 2006, 00:22:00 hrs US/Pacific

Bugs fixed:

5039292, 4747264, 4908101, 4516865

--------------------------------------------------------------------------------

OPatch succeeded.

10. Start the databases, listeners & DB Consoles and verify the databases & applications ::::

Monday, October 01, 2007

The art and science of Oracle DB performance tuning

Back to square one, where I started this blog ? ...

Nooooo ... Here I am not providing a step by step procedure for Oracle performance tuning, however a framework to help a DBA with tackling Oracle performance issues. This could be seen as a reference to understand where to start, what information to collect, where to proceed, how to conclude, etc. .

Without much ado, let's go ahead.

What's the problem ?

Some of us (DBA's) get bad performance notice from users & we might tend to directly jump into trying to find a solution. Instead of trying that, a better way to do that is to first find out what that problem is & how is it seen by the user. First, start with these questions asking the requestor :

1 Is the whole application slow or only some modules are slow ?
2. When did you first notice this problem ?
3. Is the problem noticed all of sudden? Is it gradual ?
4. Is it recurring as you mentioned ?
5. Is there any recent upgrade to application ?
6. Is there any substantial increase in data ?
7. Is there any increase in number of concurrent users ?
8. Is there any increase in total number of users ?
8. Define the problem in your own words.
9. Suppose if we find the certain module of the application is causing the problem, Is it possible to change that module ?
10. Any other comments about the problem ?

PS : I recently got a request to bounce (/ restart) a PRODuction database from a user, stating that the DB was down. When I spoke to him, I discovered that just because he was not able to connect to his application & see anything, he thought the Oracle database behind his application, was down & needed a bounce. I already knew that the database was up & fine, I asked him to speak to the Application team. It was discovered as an application issue, never to be handled by a DBA :-)

Tackle the problem now

Once the problem is clear & the DBA knows that there is a real issue, one may follow these sure-shot steps. You are welcome to follow your own strategy & resolve the issue. If anyone has some strategy to go about performance issues, please share your way of doing things, through comments on this post. I will incorporate them here, as your contribution.

  1. Talk to the user who is facing the problem, to clarify any of the above inputs
  2. Install & run STATSPACK on the concerned Oracle database. By default, spauto.sql schedules the snapshots hourly.
  3. STATSPACK would take some time to provide some useful report. Meanwhile, look at the other aspects below.
  4. If the user knows some areas / schema / SQL / PL/SQL code areas of the database, which might be causing trouble, look at them once. Understand from the user, the database logic used in their "problem operation".
  5. Analyze the database objects involved in the above key areas, for appropriate indexing, fresh statistics.
  6. Look at the explain plan if there's already a SQL code pointed out , otherwise perform this step after you get STATSPACK report.
  7. Start tracing for the schema(s) which the users use to connect to the database.
  8. Have the user run their operation once, to make sure it is captured in STATSPACK & the trace.
  9. Generate the STATSPACK report & use my "STATSPACK post" , the "Survival guide" or a statspack analysis tool to see it in a more understandable format with recommendations too (this tool is not necessary, however, it is helpful).
  10. Format the recent trace (with tkprof utility), generated after the trial of the concerned operation by the user.
  11. Analyze the above 2 reports & find out the issues / areas to be focussed on. STATSPACK report can list out top SQL's / PL/SQL areas of the database. We can start your real, concentrated analysis here. You may repeat steps 4, 5, 6, 7 again if required (most of the times, they will be).
  12. If the problem area is identified as PL/SQL code, it's best to use DBMS_PROFILER to find out the problem line(s) of code. You may use "Oracle official area" to know how to use DBMS_PROFILER or try the database journal (you may just enter DBMS_PROFILER in a search engine & get 100's of results on how to use it). DBMS_PROFILER needs to be installed first as SYS user (profload.sql) & then some objects are created in the concerned schema through schema login (proftab.sql).

Let me know your case-studies - I'll post them here too, along with mine. More on tuning later ...

Wednesday, September 26, 2007

Tracing Errors


For most of us, writing procedures, functions or say just a PL/SQL Code Block based on a given logic is wasy. The challenge lies in designing and placing the "Exception Handling" Block with the intent of capturing the correct line of Code that raised the exception so that one can them Zoom into the erroneous line and do make the necessary fixes. Now many of you would agree that this comes with experience 'n' ofcourse proactive thinking.
Now conder the following Code Snippet,

WHEN NO_DATA_FOUND
THEN dbms_output.put_line (sqlerrm);
END;

Looks familiar...? But here's a catch - this code block will keep rescuing us as far as Error String is less than 255 Characters. (The limitation with put_line() is that it can capture a max of 255 Characters).
Oracle has revolved this by its offering -DBMS_UTILITY.FORMAT_ERROR_STACK

BEGIN
error();
EXCEPTION WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE( SQLERRM );
DBMS_OUTPUT.PUT_LINE( DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
END;
/
ORA-00900: invalid SQL statement
ORA-06512: at line 2

Now, as evident from output above, the information provided in error stack is sufficent for application error logs and can be worked upon accordingly. And the best part is there's no limitation of number of characters we can capture in error stack. Makes life easy indeed :)

Wednesday, September 19, 2007

Statspack report analysis


Have you ever faced this ? ... You have your statspack report ready & are trying to work with your Statspack analyzer tool, which performs statspack report analysis & generates an html file, which is more understandable to non-DBA's too (with some charts, top-queries, recommendations, etc.) . At this moment, either your analysis tool gives up, or the website which you used for analysis, is unavailable (recently happened with oraperf, which we used) !
A solution to this, for a DBA, is to be able to read through the statspack report & identify the key areas, issues of importance . This place offers a simple statspack survival guide, to help understand the report. Hope this helps a DBA in speedy statspack analysis, instead of depending on a tool that is not working, or giving erratic results.