Document ID: 33838.1
Subject: Determining the execution plan for a distributed query
Last Revision Date: 06 September 1996
Author: rbeard
Introduction
------------
This bulletin discusses how to determine the execution plan for a
distributed query. Information in this bulletin is only relevant to Oracle7,
version 7.2 or earlier. The functionality documented may change with
later versions of Oracle.
How Oracle Processes Queries
----------------------------
Before processing any query, Oracle uses the optimizer to determine the best
execution plan. Oracle7 posesses two optimizers, the cost based optimizer and
the older rule based optimizer. It is recommended that distributed queries are
run using the cost based optimizer as this contains added functionality
which means that it processes distributed queries more efficiently.
A distributed query is a sql statement where one or more objects are
accessed remotely using a database link. These remote objects can be any
valid object eg. tables or views.
For further information about some of the commands and utilities referenced
in this article, see the following manuals:
Oracle7 Server Application Developers Guide
Appendix B Performance Diagnostic Tools
Oracle 7.2 Tuning Guide
Appendix A Performance Diagnostic Tools
How To Determine A Query's Execution Plan
-----------------------------------------
The EXPLAIN PLAN or TKPROF command can be used to determine a query's
execution plan. In both cases, an intermediate table is populated with
the query's execution plan.
Using EXPLAIN PLAN, this table is called PLAN_TABLE and must be created
in your schema before running the command. 'Explaining' a statement
loads rows into the plan table, which can be queried later. Once you
have finished examining the plan a query has used, the rows
must be deleted manually. To create the PLAN_TABLE in your schema, run
the sql script UTLXPLAN.SQL.
Using TKPROF, the table is called PROF$PLAN_TABLE, and is created as a
temporary table, populated, the output produced and then the table dropped.
The format of the output created using TKPROF cannot be changed by the
user, so for the purposes of determining the sql statements being sent
to remote database(s), the EXPLAIN PLAN command should be used.
How To Determine The SQL Statement(s) Sent To The Remote Database
-----------------------------------------------------------------
The PLAN_TABLE has a column OTHER, which for a specific execution step
contains additional information that a user may find useful. In the case,
where the OPERATION is REMOTE, the OTHER column contains the text of the SQL
statement sent to the remote database.
The following is an example script that could be used to format the
output from the PLAN_TABLE:
REM remote_plan.sql
set long 2000
set arraysize 1
col operation format a22
col options format a8
col object_name format a10
col object_node format a5
col other format a20
col position format 99999
col optimizer format a10
select lpad(' ',2*(level-1))||operation operation,options,object_name,
optimizer,object_node,other
from plan_table
start with id=0 and statement_id='A'
connect by prior id=parent_id and statement_id='A';
set echo on
Note: Be careful because sometimes the output from the OTHER column is
truncated.
How to Determine the Execution Plan on the Remote Database
----------------------------------------------------------
As stated above, the two methods of determining the execution plan for
a sql statement are TKPROF and EXPLAIN PLAN. In the case of the remote
query, we need to trace the access path of the sql sent to the remote
database by producing a trace file and using TKPROF to format it. The
method of producing the trace file is dependant on the Oracle7 release.
Prior to Oracle 7.1, to produce a trace file on the remote database, sql
trace must be enabled at the instance level by setting SQL_TRACE=TRUE. The
resulting trace file(s) formatted using TKPROF with the EXPLAIN option.
From Oracle 7.1, a new package is available which allows a user to enable
sql trace on the remote database from the local database. The package
is called:
dbms_session.set_sql_trace(sql_trace boolean)
To enable sql trace on the remote database from within a local session,
a remote procedure should be created which calls dbms_session.set_sql_trace.
From the local database, the remote procedure is executed, which opens the
database link and enables sql trace. To enable sql trace on the remote
database from the local database, the following commands can be followed:
On the local database:
1. Create a database link to remote database:
SQLPLUS <localuser>/<localpassword>
CREATE DATABASE LINK <mydblink>
CONNECT TO <remoteuser>
IDENTIFIED BY <remotepassword>
USING '<connect string>';
On the remote database:
1. Grant 'alter session' privilege to the <remoteuser>. From any
DBA user account:
GRANT ALTER SESSION TO <remoteuser>;
2. Create the procedure to execute dbms_session under the schema
defined by <remoteuser>:
CONNECT <remoteuser>/<remotepassword>
CREATE OR REPLACE PROCEDURE set_trace
as
begin
dbms_session.set_sql_trace(TRUE);
end;
/
3. Validate that the procedure executes correctly:
set serveroutput on
EXECUTE set_trace
On the local database:
1. Execute the procedure on the remote database:
SQLPLUS <localuser>/<localpassword>
EXECUTE set_trace@<mydblink>
2. Execute distributed query
SELECT e.empno, e.ename, d.dname
FROM emp e,dept@<mydblink> d
WHERE d.deptno = e.deptno
AND e.empno = 7934;
On the remote database:
1. Determine location of trace files (normally identified by the
initialization parameter user_dump_dest). Trace files will normally
be named <sid>_ora_<pid>.trc where <sid> is the value of ORACLE_SID and
<pid> is the process id of the open dblink.
2. Use TKPROF with the EXPLAIN option to format the trace file:
TKPROF <sid>_ora_<pid>.trc explain.out
explain=<remoteuser>/<remotepassword>
This will produce a formatted trace file called explain.out
Alternatively sql trace can be enabled on the remote database by setting
the parameter SQL_TRACE=TRUE in the initialisation parameter file. Restart
the database with this parameter, execute the distributed query. A trace
file will be produced in the normal trace file directory. Format using
TKPROF with the EXPLAIN option. The resulting output file will contain
the execution path taken by the remote query. But remember that setting
SQL_TRACE=TRUE at the instance level means that every session will produce
a trace file.
From Oracle 7.2, a second package is available which can be used to
enable sql trace on the remote database. This package must be issued on
the remote database having first opened a session from the local database.
The package is called:
dbms_system.set_sql_trace_in_session(sid number, serial# number,
sql_trace boolean)
The default installation of the package DBMS_SYSTEM only allows the user SYS
to execute the package. To enable another user to execute the package, SYS
must grant execute permission on the package to the other user.
On the remote database:
1. From sqlplus as the user SYS:
GRANT EXECUTE ON dbms_system TO <remoteuser>;
To initiate sql trace on the remote database, the following commands can
be used:
On the local database:
1. Create a database link to remote database:
SQLPLUS <localuser>/<localpassword>
CREATE DATABASE LINK <mydblink>
CONNECT TO <remoteuser>
IDENTIFIED BY <remotepassword>
USING '<connect string>';
2. Open the dblink by using it within a query:
SELECT * FROM <remote table>@<mydblink>;
On the remote database:
1. Determine the sid, serial# of the open dblink. Issue the
following query from within sqlplus as a dba user:
SQLPLUS <dbauser>/<dbapassword>
COLUMN machine FORMAT a15
SELECT sid, serial#, username, machine
FROM v$session
WHERE machine='<node name of local machine>';
The USERNAME returned should be the name of the <localuser>.
2. Using the sid, serial# from the above query:
SQLPLUS <dba user>/<dba user password>
EXECUTE sys.dbms_system.set_sql_trace_in_session(<sid>,<serial#>,TRUE)
On the local database:
1. Execute distributed query
On the remote database:
1. Determine location of trace files (normally identified by the
initialization parameter user_dump_dest). Trace files will normally
be named ora_<sid>.trc.
2. Use TKPROF with the EXPLAIN option to format the trace file:
tkprof ora_<sid>.trc explain.out explain=<remoteuser>/<remotepassword>
This will produce a formatted trace file called explain.out
How Oracle Processes Distributed Queries
----------------------------------------
A distributed query is a query that references one or more objects on
one or more remote databases.
If, within a query, all the objects are remote and use the same database
link, then oracle will attempt to send the complete query to the remote
database for processing.
If, within a query, some of the objects are remote and some are local, then
oracle will break up the query into individual sql statements and pass sql to
the remote database for each remote onject. Relevant predicates are sent where
possible. Remote joins are not performed by default.
If you want to join the remote tables, the best way to achieve this is by
building a view of these tables on the remote site. This view can then be
referenced by your query. Note that the predicates do not have to be created
within the view as they should be passed within the sql statement sent from the
local to remote node.
Examples
--------
The following examples use tables which can be created by running the
supplied script DEMOBLD.SQL. In addition a 3rd table, LOCATION is used.
To create this table, run the following from sqlplus:
CREATE TABLE location
(loc)
AS
SELECT loc FROM dept;
Example 1 - select using 2 remote tables and 1 local table
---------
EXPLAIN PLAN
SET statement_id='A'
FOR
SELECT e.empno, e.ename, d.dname, l.loc, d.deptno
FROM emp@<mydblink> e, dept@<mydblink> d, location l
WHERE d.deptno = e.deptno
AND d.loc = l.loc
AND e.empno = 7934
/
After running remote_plan.sql, the formatted explain plan output is as follows:
OPERATION OPTIONS OBJECT_NAM OPTIMIZER OBJEC OTHER
---------------------- -------- ---------- ---------- ----- --------------------
SELECT STATEMENT CHOOSE
NESTED LOOPS
NESTED LOOPS
REMOTE ROSE. SELECT "EMPNO","ENAM
WORLD E","DEPTNO" FROM
"EMP" E WHERE
"EMPNO"=7934
REMOTE ROSE. SELECT "DEPTNO","DNA
WORLD ME","LOC" FROM "DEPT
" D WHERE "DEPTNO"=:
1
TABLE ACCESS FULL LOCATION ANALYZED
With remote tables, the row with OPERATION='REMOTE' stores the sql sent the
remote node in the OTHER column.
Example 2 - select using 1 remote view and 1 local table
---------
Where the remote view is:
CREATE OR REPLACE VIEW remote_view
(empno, ename, dname, loc)
AS
SELECT e.empno, e.ename, d.dname, d.loc
FROM emp e, dept d
WHERE e.deptno = d.deptno;
EXPLAIN PLAN
SET statement_id='A'
FOR
SELECT rv.empno, rv.ename, rv.dname, l.loc
FROM remote_view@<mydblink> rv, location l
WHERE rv.loc = l.loc
AND rv.empno = 7934
/
The explain plan output for this query is as follows:
OPERATION OPTIONS OBJECT_NAM OPTIMIZER OBJEC OTHER
---------------------- -------- ---------- ---------- ----- --------------------
SELECT STATEMENT CHOOSE
NESTED LOOPS
REMOTE ROSE. SELECT "EMPNO","ENAM
WORLD E","DNAME","LOC" FRO
M "REMOTE_VIEW" RV W
HERE "EMPNO"=7934
TABLE ACCESS FULL LOCATION ANALYZED
From this you can see that for the OPERATION='REMOTE' row, the OTHER column
shows the additional where predicate (ie. 'WHERE empno=7934') that has been
sent to the remote site.
Example 3 - All remote tables
---------
EXPLAIN PLAN
SET statement_id='A'
FOR
SELECT e.empno, e.ename, d.dname, l.loc, d.deptno
FROM emp@<mydblink> e, dept@<mydblink> d, location@<mydblink> l
WHERE d.deptno = e.deptno
AND d.loc = l.loc
AND e.empno = 7934
/
produced the following explain plan output:
OPERATION OPTIONS OBJECT_NAM OPTIMIZER OBJEC OTHER
---------------------- -------- ---------- ---------- ----- --------------------
SELECT STATEMENT REMOTE CHOOSE
NESTED LOOPS
NESTED LOOPS
TABLE ACCESS FULL EMP ANALYZED V722.
WORLD
TABLE ACCESS FULL DEPT ANALYZED V722.
WORLD
TABLE ACCESS FULL LOCATION V722.
With all tables located on the remote site, the OPTIONS column is set to REMOTE
rather
then the OPERATION column. In this case the complete query was sent to the
remote
node and processed remotely.
On the remote node, from the tkprof output:
SELECT A3."EMPNO",A3."ENAME",A2."DNAME",A1."LOC",A2."DEPTNO"
FROM "EMP" A3,"DEPT" A2,"LOCATION" A1
WHERE A2."DEPTNO"=A3."DEPTNO"
AND A2."LOC"= A1."LOC"
AND A3."EMPNO"=7934
and the execution plan from tkprof:
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT HINT: CHOOSE
0 NESTED LOOPS
0 NESTED LOOPS
0 TABLE ACCESS HINT: ANALYZED (FULL) OF 'EMP'
0 TABLE ACCESS HINT: ANALYZED (FULL) OF 'DEPT'
0 TABLE ACCESS (FULL) OF 'LOCATION'