About Me

Dhaka, Dhaka, Bangladesh
✔4x Salesforce Certified ✔Web Application Developer ✔Database Developer with DWH/ETL/BI • Solution engineer and developer with 16+ years of experience across multiple technologies and in different countries. Proficient in translating business requirements into technical solutions. Experienced in handling all phases of the project lifecycle, from discovery through deployment. Worked as technical lead for a team of junior developers, making sure code stayed in line with requirements and standard best practices. Skilled at integrating disparate systems with Salesforce. Experience implementing Salesforce Community Cloud at two previous companies.

Monday, October 22, 2012

Big Data performance delivered: Oracle NoSQL Database on Cisco UCS

Wednesday, October 17, 2012

tonymacx86 Blog: iBoot + MultiBeast: Install Mac OS X on any Intel-based PC


Monday, July 16, 2012

Fast refreshable materialized view

Fast refreshable join materialized view:
sql>  create table myemp as select * from emp
 2  /

Table created.

sql>  create table mydept as select * from dept
 2  /

Table created.

sql>  alter table myemp add primary key (empno)
 2  /

Table altered.

sql>  alter table mydept add primary key (deptno)
 2  /

Table altered.

sql>  alter table myemp add foreign key (deptno) references mydept(deptno)
 2  /

Table altered.

Join MV's require materialized view logs with the rowid:


sql>  create materialized view log on myemp
 2    with rowid
 3  /

Materialized view log created.

sql>  create materialized view log on mydept
 2    with rowid
 3  /

Materialized view log created.

sql>  create materialized view empdept_mv
 2    refresh fast on commit
 3  as
 4  select e.rowid e_rowid
 5       , d.rowid d_rowid
 6       , e.empno
 7       , e.ename
 8       , d.deptno
 9       , d.dname
10    from myemp e
11       , mydept d
12   where e.deptno = d.deptno
13     and d.deptno = 10
14  /

Materialized view created.

The join MV selects some attributes from both myemp and mydept, but only from department 10. This setup works:


sql>  select * from empdept_mv
 2  /

E_ROWID            D_ROWID            EMPNO ENAME      DEPTNO DNAME
------------------ ------------------ ----- ---------- ------ --------------
AAAS3MAAEAAACJTAAG AAAS3NAAEAAACJbAAA  7782 CLARK          10 ACCOUNTING
AAAS3MAAEAAACJTAAI AAAS3NAAEAAACJbAAA  7839 KING           10 ACCOUNTING
AAAS3MAAEAAACJTAAN AAAS3NAAEAAACJbAAA  7934 MILLER         10 ACCOUNTING

3 rows selected.

sql>  insert into myemp (empno,ename,deptno)
 2    values (7777,'VAN WIJK',10)
 3  /

1 row created.

sql>  commit
 2  /

Commit complete.

sql>  select * from empdept_mv
 2  /

E_ROWID            D_ROWID            EMPNO ENAME      DEPTNO DNAME
------------------ ------------------ ----- ---------- ------ --------------
AAAS3MAAEAAACJTAAG AAAS3NAAEAAACJbAAA  7782 CLARK          10 ACCOUNTING
AAAS3MAAEAAACJTAAI AAAS3NAAEAAACJbAAA  7839 KING           10 ACCOUNTING
AAAS3MAAEAAACJTAAN AAAS3NAAEAAACJbAAA  7934 MILLER         10 ACCOUNTING
AAAS3MAAEAAACJVAAA AAAS3NAAEAAACJbAAA  7777 VAN WIJK       10 ACCOUNTING

4 rows selected.


According to the documentation, the following restrictions apply for fast refreshable join MV's:


  1. All restrictions from "General Restrictions on Fast Refresh".
  2. Rowids of all the tables in the FROM list must appear in the SELECT list of the query.
  3. They cannot have GROUP BY clauses or aggregates.
  4. Materialized view logs must exist with rowids for all the base tables in the FROM list of the query.
  5. You cannot create a fast refreshable materialized view from multiple tables with simple joins that include an object type column in the SELECT statement.

The first one is discussed in the previous post (see link at the top of this post).


2) Rowids of all the tables in the FROM list must appear in the SELECT list of the query.

Here I'll omit the rowid of mydept:


sql>  delete myemp where empno = 7777
  2  /

1 row deleted.

sql>  drop materialized view empdept_mv
  2  /

Materialized view dropped.

sql>  create materialized view empdept_mv
  2    refresh fast on commit
  3  as
  4  select e.rowid e_rowid
  5       , e.empno
  6       , e.ename
  7       , d.deptno
  8       , d.dname
  9    from myemp e
 10       , mydept d
 11   where e.deptno = d.deptno
 12     and d.deptno = 10
 13  /
   and d.deptno = 10
                  *
ERROR at line 12:
ORA-12052: cannot fast refresh materialized view RWIJK.EMPDEPT_MV

We have seen the ORA-12052 a few times in the previous post, but this time it's because d.rowid is missing from the select list of the materialized view. For some reason, I usually forget to add the rowid when adding a table to the from-clause, so I am very familiar with this particular situation.


3) They cannot have GROUP BY clauses or aggregates.

This situation overlaps a bit with the next post's topic: aggregate MV's. Let's just add a dummy group by clause:


sql>  create materialized view empdept_mv
  2    refresh fast on commit
  3  as
  4  select e.rowid e_rowid
  5       , d.rowid d_rowid
  6       , e.empno
  7       , e.ename
  8       , d.deptno
  9       , d.dname
 10    from myemp e
 11       , mydept d
 12   where e.deptno = d.deptno
 13     and d.deptno = 10
 14   group by e.rowid
 15       , d.rowid
 16       , e.empno
 17       , e.ename
 18       , d.deptno
 19       , d.dname
 20  /
   and d.deptno = 10
                  *
ERROR at line 13:
ORA-32401: materialized view log on "RWIJK"."MYDEPT" does not have new values

As we'll see in the next post as well, the materialized view logs for aggregate MV's need all columns and the "including new values" clause:


sql>  drop materialized view log on myemp
  2  /

Materialized view log dropped.

sql>  drop materialized view log on mydept
  2  /

Materialized view log dropped.

sql>  create materialized view log on myemp
  2    with rowid (empno,ename,deptno) including new values
  3  /

Materialized view log created.

sql>  create materialized view log on mydept
  2    with rowid (deptno,dname) including new values
  3  /

Materialized view log created.

sql>  create materialized view empdept_mv
  2    refresh fast on commit
  3  as
  4  select e.rowid e_rowid
  5       , d.rowid d_rowid
  6       , e.empno
  7       , e.ename
  8       , d.deptno
  9       , d.dname
 10    from myemp e
 11       , mydept d
 12   where e.deptno = d.deptno
 13     and d.deptno = 10
 14   group by e.rowid
 15       , d.rowid
 16       , e.empno
 17       , e.ename
 18       , d.deptno
 19       , d.dname
 20  /

Materialized view created.

With the new materialized view logs in place, the MV is created successfully. And even better, the MV gets refreshed:


sql>  select * from empdept_mv
  2  /

E_ROWID            D_ROWID            EMPNO ENAME      DEPTNO DNAME
------------------ ------------------ ----- ---------- ------ --------------
AAAS3MAAEAAACJTAAI AAAS3NAAEAAACJbAAA  7839 KING           10 ACCOUNTING
AAAS3MAAEAAACJTAAN AAAS3NAAEAAACJbAAA  7934 MILLER         10 ACCOUNTING
AAAS3MAAEAAACJTAAG AAAS3NAAEAAACJbAAA  7782 CLARK          10 ACCOUNTING

3 rows selected.

sql>  insert into myemp (empno,ename,deptno)
  2    values (7777,'VAN WIJK',10)
  3  /

1 row created.

sql>  commit
  2  /

Commit complete.

sql>  select * from empdept_mv
  2  /

E_ROWID            D_ROWID            EMPNO ENAME      DEPTNO DNAME
------------------ ------------------ ----- ---------- ------ --------------
AAAS3MAAEAAACJTAAI AAAS3NAAEAAACJbAAA  7839 KING           10 ACCOUNTING
AAAS3MAAEAAACJTAAN AAAS3NAAEAAACJbAAA  7934 MILLER         10 ACCOUNTING
AAAS3MAAEAAACJTAAG AAAS3NAAEAAACJbAAA  7782 CLARK          10 ACCOUNTING
AAAS3MAAEAAACJVAAB AAAS3NAAEAAACJbAAA  7777 VAN WIJK       10 ACCOUNTING

4 rows selected.


But then why is it mentioned as a restriction? Let's try again, but with an extra aggregate function in the select list. Maybe this will break it?


sql>  delete myemp where empno = 7777
  2  /

1 row deleted.

sql>  drop materialized view empdept_mv
  2  /

Materialized view dropped.

sql>  create materialized view empdept_mv
  2    refresh fast on commit
  3  as
  4  select e.rowid e_rowid
  5       , d.rowid d_rowid
  6       , e.empno
  7       , e.ename
  8       , d.deptno
  9       , d.dname
 10       , count(*) cnt
 11    from myemp e
 12       , mydept d
 13   where e.deptno = d.deptno
 14     and d.deptno = 10
 15   group by e.rowid
 16       , d.rowid
 17       , e.empno
 18       , e.ename
 19       , d.deptno
 20       , d.dname
 21  /

Materialized view created.

sql>  select * from empdept_mv
  2  /

E_ROWID            D_ROWID            EMPNO ENAME      DEPTNO DNAME          CNT
------------------ ------------------ ----- ---------- ------ -------------- ---
AAAS3MAAEAAACJTAAI AAAS3NAAEAAACJbAAA  7839 KING           10 ACCOUNTING       1
AAAS3MAAEAAACJTAAN AAAS3NAAEAAACJbAAA  7934 MILLER         10 ACCOUNTING       1
AAAS3MAAEAAACJTAAG AAAS3NAAEAAACJbAAA  7782 CLARK          10 ACCOUNTING       1

3 rows selected.

sql>  insert into myemp (empno,ename,deptno)
  2    values (7777,'VAN WIJK',10)
  3  /

1 row created.

sql>  commit
  2  /

Commit complete.

sql>  select * from empdept_mv
  2  /

E_ROWID            D_ROWID            EMPNO ENAME      DEPTNO DNAME          CNT
------------------ ------------------ ----- ---------- ------ -------------- ---
AAAS3MAAEAAACJTAAI AAAS3NAAEAAACJbAAA  7839 KING           10 ACCOUNTING       1
AAAS3MAAEAAACJTAAN AAAS3NAAEAAACJbAAA  7934 MILLER         10 ACCOUNTING       1
AAAS3MAAEAAACJTAAG AAAS3NAAEAAACJbAAA  7782 CLARK          10 ACCOUNTING       1
AAAS3MAAEAAACJVAAA AAAS3NAAEAAACJbAAA  7777 VAN WIJK       10 ACCOUNTING       1

4 rows selected.


And again it works. This restriction doesn't seem to be a restriction. But maybe it breaks when adding some more complexity. I certainly wouldn't base a production system on it, after having been warned by the documentation.


4) Materialized view logs must exist with rowids for all the base tables in the FROM list of the query.

First restore the myemp materialized view log and then drop the mydept view log:


sql>  drop materialized view log on myemp
  2  /

Materialized view log dropped.

sql>  create materialized view log on myemp
  2    with rowid
  3  /

Materialized view log created.

sql>  drop materialized view log on mydept
  2  /

Materialized view log dropped.

sql>  drop materialized view empdept_mv
  2  /

Materialized view dropped.

sql>  create materialized view empdept_mv
  2    refresh fast on commit
  3  as
  4  select e.rowid e_rowid
  5       , d.rowid d_rowid
  6       , e.empno
  7       , e.ename
  8       , d.deptno
  9       , d.dname
 10    from myemp e
 11       , mydept d
 12   where e.deptno = d.deptno
 13     and d.deptno = 10
 14  /
   and d.deptno = 10
                  *
ERROR at line 13:
ORA-23413: table "RWIJK"."MYDEPT" does not have a materialized view log


This error message couldn't be better!


5) You cannot create a fast refreshable materialized view from multiple tables with simple joins that include an object type column in the SELECT statement.


sql>  create materialized view log on mydept
  2    with rowid
  3  /

Materialized view log created.

sql>  create type to_emp is object
  2  ( empno number(4)
  3  , ename varchar2(10)
  4  );
  5  /

Type created.

sql>  create materialized view empdept_mv
  2    refresh fast on commit
  3  as
  4  select e.rowid e_rowid
  5       , d.rowid d_rowid
  6       , to_emp
  7         ( e.empno
  8         , e.ename
  9         ) emp_object
 10       , d.deptno
 11       , d.dname
 12    from myemp e
 13       , mydept d
 14   where e.deptno = d.deptno
 15     and d.deptno = 10
 16  /
  from myemp e
       *
ERROR at line 12:
ORA-12054: cannot set the ON COMMIT refresh attribute for the materialized view


Hmm, we have seen this one before.

Next up are the fast refreshable aggregate MV's.

6) An extra undocumented restriction (see first comment by Timur Akhmadeev): the materialized view cannot contain an ANSI join.


sql>  drop materialized view empdept_mv
  2  /
drop materialized view empdept_mv
*
ERROR at line 1:
ORA-12003: materialized view "RWIJK"."EMPDEPT_MV" does not exist


sql>  create materialized view empdept_mv
  2    refresh fast on commit
  3  as
  4  select e.rowid e_rowid
  5       , d.rowid d_rowid
  6       , e.empno
  7       , e.ename
  8       , d.deptno
  9       , d.dname
 10    from myemp e
 11         inner join mydept d on (e.deptno = d.deptno)
 12   where d.deptno = 10
 13  /
 where d.deptno = 10
                  *
ERROR at line 12:
ORA-12054: cannot set the ON COMMIT refresh attribute for the materialized view

And this situation also leads to ORA-12054. 


Thursday, June 21, 2012

Using Oracle's Parallel Execution Features



Parallel Features
Oracle's parallel execution features include parallel query, parallel DML, parallel DDL, parallel data loading via SQL*Loader, parallel recovery, and parallel replication propagation; each is described below.
Operations That Can Be Parallelized
Oracle can parallelize operations that involve processing an entire table or an entire partition. These operations include:
  • SQL queries requiring at least one full table scan, or queries involving an index range scan spanning multiple partitions.
  • Operations such as creating or rebuilding an index, or rebuilding one or more partitions of an index.
  • Partition operations such as moving or splitting partitions.
  • CREATE TABLE AS SELECT operations, if the SELECT involves a full table or partition scan.
  • INSERT INTO . . . SELECT operations, if the SELECT involves a full table or partition scan.
  • UPDATE and DELETE operations on partitioned tables.
Parallel query is the most commonly used of Oracle's parallel execution features. It was the first parallel execution feature to be developed by Oracle and was introduced in Oracle Release 7.1 as the Oracle Parallel Query Option (PQO). Parallel execution can significantly reduce the elapsed time for large queries, but it doesn't apply to every query.
To parallelize a SELECT statement, the following conditions must be met:
  • At least one of the tables is accessed through a full table scan, or an index is accessed through a range scan involving multiple partitions.
  • If the execution involves a full table scan, the statement must contain a PARALLEL hint specifying the corresponding table, or the corresponding table must have a parallel declaration in its definition.
  • If the execution involves an index range scan spanning multiple partitions, the statement must contain a PARALLEL_INDEX hint specifying the corresponding index, or the corresponding index must have a parallel declaration in its definition.
The following two sections explain how the degree of parallelism is chosen for a SELECT statement and discuss restrictions on the use of the parallel query feature.
Setting the Degree of Parallelism
Once Oracle decides to execute a SELECT statement in parallel, the degree of parallelism is determined by following precedence rules:
  • Oracle retrieves the DEGREE and INSTANCES specifications from the definition of all tables and indexes involved in the query and chooses the highest values found for those settings.
  • Oracle checks the statement for a parallel hint. If such a hint is found, the hint overrides the degree of parallelism obtained as a result of the previous step.
You can use the PARALLEL and PARALLEL_INDEX hints to specify the degree of parallelism for a SELECT statement. You can use the NOPARALLEL and NOPARALLEL_INDEX hints to ensure that parallel execution is not performed.
Example
alter table emp parallel (degree 4);
select degree from user_tables where table_name = 'EMP';
select count(*) from emp;
alter table emp noparallel;
SELECT /*+ PARALLEL(emp,4) */ COUNT(*)
FROM emp;
Data Manipulation Language (DML) operations such as INSERT, UPDATE, and DELETE can be parallelized by Oracle. Parallel execution can speed up large DML operations and is particularly advantageous in data warehousing environments where it's necessary to maintain large summary or historical tables. In OLTP systems, parallel DML sometimes can be used to improve the performance of long-running batch jobs.
Deciding to Parallelize a DML Statement
When you issue a DML statement such as an INSERT, UPDATE, or DELETE, Oracle applies a set of rules to determine whether that statement can be parallelized. For UPDATE and DELETE statements, the rules are identical. INSERT statements, however, have their own set of rules.
Rules for UPDATE and DELETE statements
  • Oracle can parallelize UPDATE and DELETE statements on partitioned tables, but only when multiple partitions are involved.
  • You cannot parallelize UPDATE or DELETE operations on a nonpartitioned table or when such operations affect only a single partition.
Rules for INSERT statements
  • Standard INSERT statements using a VALUES clause cannot be parallelized.
  • Oracle can parallelize only INSERT . . . SELECT . . . FROM statements.
Examples
alter session enable parallel dml;
insert /*+ parallel (emp_big,4,1) */
  into emp_big
  select * from emp;
commit;
alter session disable parallel dml;
Parallel DDL works for both tables and indexes, whether partitioned or nonpartitioned.
For nonpartitioned tables and indexes, only the following types of DDL statements can be parallelized:
CREATE TABLE...AS SELECT
CREATE INDEX
ALTER INDEX...REBUILD
If you're working with partitioned tables and indexes, the scope of Oracle's parallel DDL support broadens. The following statements can be parallelized for partitioned tables and indexes:
CREATE TABLE...AS SELECT
ALTER TABLE...MOVE PARTITION
ALTER TABLE...SPLIT PARTITION
CREATE INDEX
ALTER INDEX...REBUILD PARTITION
ALTER INDEX...SPLIT PARTITION
Not all tables allow these operations to be executed in parallel. Tables with object columns or LOB columns don't allow parallel DDL.
Example
create table big_emp parallel (degree 4)
  as select * from emp;
CREATE INDEX emp_ix ON emp (emp_id)
   TABLESPACE ind
   STORAGE (INITIAL 1M NEXT 1M PCTINCREASE 0 MAXEXTENTS 20)
   PARALLEL (DEGREE 4);
Oracle's SQL*Loader utility loads data into Oracle tables from external files. With some restrictions, SQL*Loader supports the loading of data in parallel. If you have a large amount of data to load, SQL*Loader's parallel support can dramatically reduce the elapsed time needed to perform that load.
Initiating Parallel Data Loading
SQL*Loader supports parallel loading by allowing you to initiate multiple concurrent direct path load sessions that all load data into the same table or into the same partition of a partitioned table. Unlike the case when you execute a SQL statement in parallel, the task of dividing up the work falls on your shoulders. Follow these steps to use parallel data loading:
  • Create multiple input datafiles.
  • Create a SQL*Loader control file for each input datafile.
  • Initiate multiple SQL*Loader sessions, one for each control file and datafile pair.
Example
sqlldr scott/tiger CONTROL=con1.ctl DIRECT=TRUE PARALLEL=TRUE
sqlldr scott/tiger CONTROL=con2.ctl DIRECT=TRUE PARALLEL=TRUE
sqlldr scott/tiger CONTROL=con3.ctl DIRECT=TRUE PARALLEL=TRUE
sqlldr scott/tiger CONTROL=con4.ctl DIRECT=TRUE PARALLEL=TRUE
Note that the commands here should be executed from four different operating system sessions. The intent is to get four SQL*Loader sessions going at once, not to run four sessions one at a time. For example, if you are using the Unix operating system, you might open four command-prompt windows and execute one SQL*Loader command in each window.
Another important thing to note here is that you need to use the direct path in order to perform a load in parallel, as explained in the next section. This is achieved by the command-line argument DIRECT=TRUE. Parallel loads are not possible using the conventional path option.
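For completeness, here is a minimal control file sketch for the first of the four sessions (the file names, table name, and column list are illustrative assumptions; note that a parallel direct path load must use APPEND):

-- con1.ctl: one control file per input datafile
LOAD DATA
INFILE 'emp1.dat'
APPEND
INTO TABLE emp_big
FIELDS TERMINATED BY ','
(empno, ename, deptno)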
Parallel recovery can speed up both instance recovery and media recovery. In parallel recovery, multiple parallel slave processes are used to perform recovery operations. The SMON background process reads the redo log files, and the parallel slave processes apply the changes to the datafiles. 
In a serial recovery scenario, the SMON background process both reads the redo log files and applies the changes to the datafiles. This may take a considerably long time when multiple datafiles need to be recovered. However, when parallel recovery is being used, the SMON process is responsible only for reading the redo log files. The changes are applied to the datafiles by multiple parallel slave processes, thereby reducing the recovery time.
Recovery requires that the changes be applied to the datafiles in exactly the same order in which they occurred. This is achieved by single-threading the read phase of the recovery process by the SMON process. SMON reads the redo log files and serializes the changes before dispatching them to the parallel slave processes. The parallel slave processes then apply those changes to the datafiles in the proper order. Therefore, the reading of the redo log files is performed serially even during a parallel recovery operation.
Specifying the RECOVERY_PARALLELISM Parameter
The RECOVERY_PARALLELISM initialization parameter controls the degree of parallelism to use for a recovery. You can override that setting for a specific situation by using the RECOVER command's PARALLEL clause.
A value of 0 or 1 indicates serial recovery; no parallelism will be used. The RECOVERY_PARALLELISM parameter setting cannot exceed the PARALLEL_MAX_SERVERS setting.
Example
RECOVER TABLESPACE tab PARALLEL (DEGREE 4);
RECOVER DATABASE PARALLEL (DEGREE DEFAULT);
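On the parameter-file side, the setting might look like this sketch (the values are illustrative assumptions, not recommendations):

# init.ora fragment -- illustrative values
RECOVERY_PARALLELISM = 4     # degree of parallelism used for recovery
PARALLEL_MAX_SERVERS = 16    # must be at least RECOVERY_PARALLELISM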
Oracle provides replication mechanisms allowing you to maintain copies of database objects in multiple databases. Changes are propagated among these databases over database links. The SNP (snapshot) background processes perform the replication process. For large volumes of replicated data, parallel propagation can be used to enhance throughput.
With parallel propagation, Oracle enlists multiple parallel slave processes to propagate replicated transactions using multiple parallel streams. Oracle orders the dependent transactions properly based on the System Change Number (SCN). During parallel propagation, you can see multiple connections to the destination database.
You enable parallel replication propagation at the database link level. A database link is created for a particular destination database. When you enable parallel propagation for a database link, Oracle uses multiple parallel slave processes to replicate to the corresponding destination.
Enable Parallel Replication Propagation
To enable parallel replication propagation from the SQL*Plus command line, you need to use the Oracle built-in package DBMS_DEFER_SYS. Execute the DBMS_DEFER_SYS.SCHEDULE_PUSH procedure for the destination database link, and pass the desired degree of parallelism as the value for the parallelism argument. 
Example for SQL*Plus
EXECUTE DBMS_DEFER_SYS.SCHEDULE_PUSH (-
DESTINATION => 'por1.world', -
INTERVAL => 'SYSDATE+1/24', -
NEXT_DATE => 'SYSDATE+1/24', -
PARALLELISM => 6);
This example sets the degree of parallelism to 6 for propagating to the "por1.world" destination database.
Oracle divides the task of executing a SQL statement into multiple smaller units, each of which is executed by a separate process. When parallel execution is used, the user's shadow process takes on the role of the parallel coordinator. The parallel coordinator is also referred to as parallel execution coordinator or query coordinator.
The parallel coordinator does the following:
  1. Dynamically divides the work into smaller units that can be parallelized.
  2. Acquires a sufficient number of parallel processes to execute the individual smaller units. These parallel processes are called parallel slave processes. They also are sometimes referred to as parallel execution server processes, parallel server processes, parallel query slaves, or simply slave processes. The most common of the terms, parallel slave processes and slave processes, are used throughout this book.
  3. Assigns each unit of work to a slave process.
  4. Collects and combines the results from the slave processes, and returns those results to the user process.
  5. Releases the slave processes after the work is done.
The Pool of Parallel Slave Processes
Oracle maintains a pool of parallel slave processes for each instance. The parallel coordinator for a SQL statement assigns parallel tasks to slave processes from this pool. These parallel slave processes remain assigned to a task until its execution is complete. After that, these processes return to the pool and can be assigned tasks from some other parallel operation. A parallel slave process serves only one SQL statement at a time.
The following parameters control the number of parallel slave processes in the pool:
  • PARALLEL_MIN_SERVERS 
Specifies the minimum number of parallel slave processes for an instance. When an instance starts up, it creates the specified number of parallel slave processes. The default value for this parameter is 0, meaning that no slave processes would be created at startup.
  • PARALLEL_MAX_SERVERS
Specifies the maximum number of parallel slave processes that an instance is allowed to have at one time. The default value for PARALLEL_MAX_SERVERS is platform-specific.
It takes time and resources to create parallel slave processes. Since parallel slave processes can serve only one statement at a time, you should set PARALLEL_MIN_SERVERS to a relatively high value if you need to run lots of parallel statements concurrently. That way, performance won't suffer from the need to constantly create slave processes.
You also need to consider how to set PARALLEL_MAX_SERVERS. Each parallel slave process consumes memory. Setting PARALLEL_MAX_SERVERS too high may lead to memory shortages during peak usage times. On the other hand, if PARALLEL_MAX_SERVERS is set too low, some operations may not get a sufficient number of parallel slave processes.
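As a quick sketch (the values are illustrative assumptions, not recommendations), the pool can be sized in the parameter file, and the slave processes can then be observed through the V$PX_PROCESS view, which returns one row per parallel slave process:

# init.ora fragment -- illustrative values
PARALLEL_MIN_SERVERS = 8     # slave processes created at instance startup
PARALLEL_MAX_SERVERS = 32    # upper limit on slave processes for this instance

select server_name, status
from v$px_process;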
The Degree of Parallelism
The number of parallel slave processes associated with an operation is called its degree of parallelism. Don't confuse this term with the DEGREE keyword; they aren't exactly the same thing. In Oracle, the degree of parallelism consists of two components: the number of instances to use and the number of slave processes to use on each instance. In Oracle's SQL syntax, the keywords INSTANCES and DEGREE are always used to specify values for these two components as follows:
  • INSTANCES: Specifies the number of instances to use
  • DEGREE: Specifies the number of slave processes to use on each instance
INSTANCES applies only to the Oracle Parallel Server configuration. Unless you are using OPS, the value of INSTANCES should be set to 1; any other value is meaningless. 
Level of parallel execution
The degree of parallelism used for a SQL statement can be specified at three different levels:
  • Statement level: using hints or the PARALLEL clause.
  • Object level: found in the definition of the table, index, or other object.
  • Instance level: using default values for the instance.
Oracle determines the degree of parallelism to use for a SQL statement by checking each item in this list in the order shown. Oracle first checks for a degree of parallelism specification at the statement level. If it can't find one, it then checks the table or index definition. If the table or index definition does not explicitly specify values for DEGREE and INSTANCES, Oracle uses the default values established for the instance.
Specifying the degree of parallelism at the statement level
You can specify the degree of parallelism at the statement level by using hints or by using a PARALLEL clause. PARALLEL and PARALLEL_INDEX hints are used to specify the degree of parallelism used for queries and DML statements. However, DDL statements that support parallel execution provide an explicit PARALLEL clause in their syntax.
SELECT /*+ PARALLEL(orders,4,1) */ COUNT(*)
FROM orders;
Specifying the degree of parallelism at the object definition level
You can specify the degree of parallelism to use for a table or an index when you create it. You do that by using the PARALLEL clause of the CREATE TABLE and CREATE INDEX statements.
You also can specify a PARALLEL clause when you alter a table or an index.
ALTER TABLE order_items PARALLEL (DEGREE 4);
When you specify DEGREE and INSTANCES values at the table or index level, those values are used for all SQL statements involving the table or index unless overridden by a hint.
Specifying the degree of parallelism at the instance level
Each instance has associated with it a set of default values for DEGREE and INSTANCES. The default DEGREE value is either the number of CPUs available or the number of disks upon which a table or index is stored, whichever is less.
Oracle will use the instance-level defaults whenever the keyword DEFAULT is used in a hint or in a table or index definition. Oracle also will use the instance-level defaults when there are no hints and when no degree of parallelism has been specified at the table or index level.
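For example, here is a sketch of deferring to the instance-level defaults explicitly, reusing the order_items table from the earlier examples:

ALTER TABLE order_items PARALLEL (DEGREE DEFAULT INSTANCES DEFAULT);
SELECT /*+ PARALLEL(order_items, DEFAULT, DEFAULT) */ COUNT(*)
FROM order_items;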


Ref: http://www.akadia.com/services/ora_parallel_processing.html

Saturday, June 16, 2012

Oracle Database Tuning (and Other Performance Issues)



Objectives

  • Identify how poor system performance results from poor database system design.
  • Become familiar with how database tuning focuses on identifying and fixing underlying flaws.
  • Become familiar with typical objects that can be tuned.

TUNING APPLICATION DESIGN

Effective Table Design

  • Poor table design always leads to poor performance.
  • Rigid adherence to fully normalized relational table guidelines can also result in poor physical performance.  These inefficiencies result from:
    • requiring too many joins.
    • failure to reflect the normal access paths for data.
  • Queries with large numbers of columns that come from multiple tables can cause performance to suffer because several tables must be joined.
  • Design options include:
    • Denormalizing 1NF, 2NF and 3NF solutions.
    • Creating small summary tables from large, static tables that store data in the format in which users ask for it - this avoids joins where data are requested often and change rarely (see the sketch after this list).
    • Separating individual tables into several tables by either vertical partitioning and/or horizontal partitioning.
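As a sketch of the summary-table idea (dept_headcount is a hypothetical name; emp and dept are the familiar demo tables):

CREATE TABLE dept_headcount AS
SELECT d.deptno, d.dname, COUNT(*) emp_count
FROM dept d, emp e
WHERE e.deptno = d.deptno
GROUP BY d.deptno, d.dname;

-- Queries read the pre-joined summary instead of joining emp to dept each time:
SELECT dname, emp_count FROM dept_headcount;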

Distribution of CPU Requirements

  • An Oracle database that is CPU-Bound (limited by CPU resources) as opposed to Wait-Bound (waiting on disk writes of some type) is one that's effectively designed.  This means that other resources are not limiting the database.
  • Schedule long-running batch query/update programs for off-peak hours; then run them at normal priority.
  • Store data in its most appropriate place in a distributed computing environment - this distributes computing CPU requirements from one server to another.
  • Use the Parallel Query option to distribute processing requirements of selected SQL statements among multiple CPUs if they are available.

Effective Application Design

  • There are two principles to follow here:
    1. Limit the number of times that users access the database.
    2. Coordinate the requests of users for data.  This requires you to know how users tend to access data.
  • Try to use the same queries to handle similar application data requirements; this will increase the likelihood that a data requirement can be resolved by data already in the SGA.
  • Use snapshots, which are non-updatable views of part of a database that can be distributed to support typical managerial querying.
  • Create stored procedures, functions, and packages and compile them to eliminate run-time compilation.  The parsed version may exist in the Shared SQL Pool.
    1. The SQL text for all procedures, functions, and packages can be viewed in the TEXT column of the DBA_SOURCE view.
    2. These objects (procedural code) are stored in the SYSTEM tablespace, so you need to allocate more space to it -- usually double its size.


TUNING SQL

  • Most SQL tuning requires the DBA to work with an application developer.
  • Most improvement in database processing will come from tuning SQL.
  • The key to SQL tuning is to minimize the search path that a database uses to find data.  For the most part this requires the creation of appropriate indexes that the database engine will use to find data.

Indexes

  • An index enables Oracle to locate a row according to its physical location in a datafile by going to the correct file, finding the correct block, then finding the correct row within the block.  Taken together, these values are like relative record numbers in a relative file.
  • The File ID portion of ROWID can be compared to the FILE_ID column of the DBA_DATA_FILES view to determine the file to which an index belongs.
  • A query with no WHERE clause normally results in a full table scan, reading every block in a table.
  • A query for specific rows may cause use of an index.
    • The index maps logical values in a table (key columns) to their ROWIDs which enables location of the rows directly by their physical location.
  • You may index several columns together to form a concatenated index.
    • A concatenated index is only used if its leading column is used in the query's WHERE clause.
    • Consider the following example:
CREATE INDEX City_state_zip_ndx
ON Employee (City, State, Zip)
TABLESPACE Indexes;
SELECT *
FROM Employee
WHERE State = 'NJ';
  • The index is NOT used because the WHERE clause value of State does not match the leading column (City) of the index.
  • This example would only add overhead because the index would be maintained, even if it's not used.
  • The index should be recreated with proper ordering of the component fields in the index if the query is executed often.
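A sketch of that fix (the new index name is illustrative): drop the index and rebuild it with State as the leading column, so the WHERE clause above can use it.

DROP INDEX City_state_zip_ndx;
CREATE INDEX State_city_zip_ndx
ON Employee (State, City, Zip)
TABLESPACE Indexes;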

Ordering Data

  • While row ordering does not matter in relational theory, it is important to order rows as much as possible when tables are initially created, e.g. when you are porting a system into Oracle from another platform or DBMS.
  • Ordered rows may enable Oracle to find needed rows while minimizing the data blocks that are retrieved where users execute queries that specify ranges of values (recall the BETWEEN operator in SQL for range of value queries).
  • Consider the following example which will require fewer data blocks to be read if the records are physically ordered on the Empno field.
SELECT *
FROM Employee
WHERE Empno BETWEEN 1 and 100;

  • You can physically sort table rows by SELECTing them into another table with use of the ORDER BY clause, then truncating the original table and loading the rows back into the original table.
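A minimal sketch of that reorganization using a staging table (employee_sorted is a hypothetical name; constraints and triggers on the real table would need attention first):

CREATE TABLE employee_sorted AS
SELECT * FROM Employee ORDER BY Empno;

TRUNCATE TABLE Employee;

INSERT INTO Employee
SELECT * FROM employee_sorted ORDER BY Empno;

DROP TABLE employee_sorted;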

Clusters

  • We covered indexed clusters in an earlier module. 
  • Another type of cluster, the hash cluster, stores rows in a specific location based on its value in the cluster key column.
    • Every time a row is inserted, its cluster key value is used to determine which block to store the row in.
    • This enables hashing directly to data blocks without use of an index.
    • The hash cluster is only used with equivalence queries - where the exact value stored in a column is to be found.
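A minimal hash-cluster sketch (the names and sizing values are illustrative assumptions):

CREATE CLUSTER emp_hash (empno NUMBER(4))
  SIZE 512
  HASHKEYS 500;

CREATE TABLE emp_hashed
( empno NUMBER(4) PRIMARY KEY
, ename VARCHAR2(10)
) CLUSTER emp_hash (empno);

-- An equivalence query that can hash straight to the data block:
SELECT ename FROM emp_hashed WHERE empno = 7782;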

Explain Plan

  • The EXPLAIN PLAN command shows the execution path for a query and stores this information to a table (PLAN_TABLE) in the database.  You can then query the table.  Example:

EXPLAIN PLAN
SET Statement_id = 'TEST'
FOR
SELECT *
FROM Employee
WHERE city > 'Y%';


  • The query above is not actually executed; rather the plan for execution is stored to the PLAN_TABLE.
  • Your account must have a PLAN_TABLE in your schema.  The script to create this table is UTLXPLAN.SQL and is located in the $ORACLE_HOME/rdbms/admin subdirectory.
  • Query the table to produce the output that shows the execution path.

SELECT LPAD(' ',2*LEVEL) || operation
       || ' ' || options ||
       ' ' || object_name Path_Plan
FROM Plan_Table
WHERE Statement_id = 'TEST'
CONNECT BY PRIOR Id = Parent_id
    AND Statement_id = 'TEST'
    START WITH Id=1;
Path_Plan
-----------------------------------
TABLE ACCESS BY ROWID EMPLOYEE
  INDEX RANGE SCAN CITY_ST_ZIP_NDX

  • The output shows that data will be accessed by ROWID through an index range scan of the named index.
  • Alternatively, you can also use the SET AUTOTRACE ON command in SQL*Plus to generate the explain plan output and trace information for every query that you run.
  • Evaluate the output by ensuring that the most selective (most nearly unique) indexes are used by a query.
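A minimal AUTOTRACE session might look like this sketch (assuming the PLAN_TABLE exists and the PLUSTRACE role has been granted):

SET AUTOTRACE ON
SELECT Ename FROM Employee WHERE Empno = 7782;
SET AUTOTRACE OFF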


TUNING MEMORY USAGE

You can use the Oracle Enterprise Manager software to analyze usage of memory by Oracle’s various memory caches.
  • The dictionary cache in memory is not directly sized or tuned as it is part of the Shared SQL Pool.
  • These memory areas are managed by the LRU (least recently used) algorithm.  You set the Shared SQL Pool size with the SHARED_POOL_SIZE parameter.
  • If your Shared SQL Pool is too large, then you are wasting memory. 
  • The Hit Ratio measures how well the data buffer cache handles requests.

Hit Ratio = (Logical Reads - Physical Reads) / Logical Reads
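As a sketch, the inputs to this formula can be read from V$SYSSTAT; logical reads are the sum of the 'db block gets' and 'consistent gets' statistics:

SELECT name, value
FROM v$sysstat
WHERE name IN ('db block gets', 'consistent gets', 'physical reads');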

  • A perfect ratio is 1.00 - all reads are logical reads; of course, this is generally impossible to obtain since it would mean that all the data a system user will ever need to access is stored in the SGA.
  • On-line transaction processing applications should have Hit Ratios in excess of 0.90.
  • If the Hit Ratio is within tolerance, you need to check to see if you can reduce the number of data buffers and still maintain a good Hit Ratio.
  • Add the following to the INIT.ORA file.

 DB_BLOCK_LRU_STATISTICS = TRUE

  • Shutdown the database and restart it.
  • The system dictionary table SYS.X$KCBCBH maintains memory statistics.  One row is maintained for each buffer in the buffer cache.  You can query this information to determine how many buffers are not being used.
  • Use the following query to determine how many cache hits (the COUNT column) would be lost if you reduced the number of buffers (the INDX column).

SELECT Sum(Count) Lost_Hits
FROM Sys.X$Kcbcbh
WHERE indx >= New_Number_Of_Buffers;
(NOTE: You supply the value in the WHERE clause)

  • If you have lost hits, the system will require additional physical reads - the Hit Ratio for this new number of data buffers is:

Hit Ratio =
  (Logical Reads - Physical Reads - Lost Hits)
              / Logical Reads

  • Since running the database in a statistics gathering mode will slow it down due to the additional overhead, you should comment out the DB_BLOCK_LRU_STATISTICS parameter after you have finished tuning and restart the database.


TUNING DATA STORAGE

Defragmentation of Segments

Fragmented tables with multiple extents will slow down query processing.  This can also slow down the storage of new records because the database may have to dynamically combine free extents to create a new extent large enough to meet the storage parameters of the object where data are being stored.
  • We know that a segment is created to hold data associated with a new object (index or table) when an object is created.
  • The space allocated is used unless the segment is released (dropped) or truncated (tables only).
  • It would be best if each segment was composed of a single large INITIAL extent - compute the size for the initial extent such that it is large enough to handle all of a segment's data.
  • Use the DBA_SEGMENTS data dictionary view to determine which segments are comprised of ten or more extents.

SELECT Tablespace_name TSName, Owner,
    Segment_Name SNName,
    Segment_type SNType, Extents,
    Blocks, Bytes
FROM Sys.DBA_Segments;
 
TSNAME OWNER SNNAME         SNTYPE  EXTENTS BLOCKS BYTES
DATA   DBOCK LONGTIME       TABLE   1       15     61440
DATA   DBOCK MAGAZINE       TABLE   1       15     61440
DATA   DBOCK MATH           TABLE   1       15     61440
DATA   DBOCK WORKERANDSKILL CLUSTER 2       30     122880
 
  • To see the size of the individual extents for a segment, query the DBA_EXTENTS view.  Supply the type of segment you desire (TABLE, INDEX, CLUSTER, ROLLBACK, TEMPORARY, etc.).

SELECT Tablespace_name TSNAME, Owner,
    Segment_Name SNNAME,
    Segment_Type SNTYPE, Extent_id EID,
    File_id FID, Block_id BID, Bytes, Blocks
FROM Sys.DBA_Extents
WHERE Segment_type = 'INDEX'
ORDER BY Extent_id;
TSNAME OWNER SNNAME     SNTYPE EID FID BID Bytes Blocks
DATA   DBOCK SYS_C00890 INDEX  0   4   137 61440 15
DATA   DBOCK SYS_C00891 INDEX  0   4   152 61440 15
 
  • If a segment is fragmented, you can rebuild the object into a single segment by using the proper size for the storage parameters.  Export the data for the segment, recreate the object, then import the data into the INITIAL extent.
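On Oracle 8i and later, a sketch of the same rebuild without export/import (the storage values are illustrative assumptions):

ALTER TABLE Employee
  MOVE STORAGE (INITIAL 10M NEXT 1M PCTINCREASE 0);

-- Moving the table changes its ROWIDs, so every index on it must be rebuilt:
ALTER INDEX State_city_zip_ndx REBUILD;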

Defragmentation of Free Extents

A free extent is a collection of contiguous free blocks in a tablespace that are currently unused.
  • If a segment is dropped, its extents are deallocated and become free, but these extents are not recombined with neighboring free extents.
  • SMON periodically coalesces neighboring free extents only if the default PCTINCREASE for a tablespace is non-zero.
  • The ALTER TABLESPACE tablespace_name COALESCE command can be used to force the combining of free extents.
  • Your readings will list a number of scripts available in Oracle to test whether or not free space needs to be coalesced.

Identifying chained Rows

  • Chained rows occur when a row is updated and will no longer fit into a single data block.
  • If you store rows that are larger than the Oracle block size, then you will cause chaining.
  • Chaining affects performance because of the need for Oracle to search multiple blocks for a logical row.
  • The ANALYZE command can be used to determine if chaining is occurring.

ANALYZE TABLE Table_Name
LIST CHAINED ROWS INTO Chained_Rows;

  • The output is stored to the CHAINED_ROWS table in your schema.  The CHAINED_ROWS table needs to first be created by executing the UTLCHAIN.SQL script in the $ORACLE_HOME/rdbms/admin directory.
  • If chaining is prevalent (all chained rows are listed), then rebuild the table with a higher PCTFREE parameter.
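A sketch of that rebuild (assuming ALTER TABLE ... MOVE is available; the PCTFREE value is an illustrative assumption):

ALTER TABLE Employee MOVE PCTFREE 20;
-- Re-check for chaining after the rebuild:
DELETE FROM Chained_Rows;
ANALYZE TABLE Employee LIST CHAINED ROWS INTO Chained_Rows;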


Increasing the Oracle Block Size

  • Oracle supports different block sizes, but the most common block sizes used are 4K and 8K.
  • Installation routines default to different block size values.  For example, with our version of Oracle in a LINUX environment the system defaults to 8K if you do not specify the block size when you create the database.
    • Using the next higher block size value may improve performance of query-intensive operations by up to 50 percent.
    • Problem: you must rebuild the entire database to increase the block size.
    • Improvement comes because the block header does not increase significantly, leaving more space in a block for transaction information and for data rows.

Bulk Deletes: The TRUNCATE Command

  • Deleting all the rows in a table will not save any space because the segment for the table still retains all of the extents beyond the first one allocated by the INITIAL parameter.
  • Deleting all rows can also result in UNDO errors because the bulk delete causes a very large transaction – it can lead to undo overwrites and the Snapshot Too Old error message.
  • The TRUNCATE command resolves both problems, but you need to realize that this is a DDL command, not a DML command, so it cannot be rolled back.
  • The TRUNCATE command is the fastest way to delete large volumes of data.

TRUNCATE TABLE Employee DROP STORAGE;
TRUNCATE CLUSTER Emp_Dept REUSE STORAGE;

  • The above command deletes all Employee table rows and the DROP STORAGE clause de-allocates the non-INITIAL extents.  The second example command is for clusters.