Thursday, February 24, 2011

Direct NFS Clonedb

Direct NFS Clonedb is a feature in 11.2 that you can use to clone your databases. Kevin Closson explains what it is in this post. In his demo videos he uses a perl script to automate the generation of the necessary scripts. That script is not publicly available as of today, but MOS note 1210656.1 explains how to do the clone manually without it.

Tim Hall also has a step-by-step guide on how to do the cloning in this post. He uses the perl script mentioned above as well.

We have been testing backups and clones on Exadata connected to a 7410 ZFS Storage Appliance, and I wanted to share our test of Direct NFS Clonedb. This test was done on a quarter rack X2-2 connected to a 7410 via InfiniBand. A RAC database will be cloned as a single-instance database, and the clone database will be opened on one database node.

Enable Direct NFS on Exadata


For security, the default Exadata installation (as of today) disables some services needed by NFS. To use NFS on the Exadata database nodes, we enabled those services first.


service portmap start
service nfslock start
chkconfig --level 345 portmap on
chkconfig --level 345 nfslock on

Recent Exadata installations come with Direct NFS (dNFS) enabled. You can check whether it is enabled by looking at the database alert log; if dNFS is enabled, you will see this line when the database starts.

Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 3.0
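
Another way to confirm dNFS is actually in use, once the database has done some I/O over the NFS mounts, is to query the dNFS views; a minimal sketch:

-- These views are only populated after the instance performs I/O through dNFS
select svrname, dirname from v$dnfs_servers;
select filename, filesize from v$dnfs_files;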

If it is not enabled, you can enable it with this command after stopping the database.

make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on

Mount the NFS share

I am assuming the 7410 is configured and its NFS share is available at this point. To mount the NFS share on the Exadata database nodes, you can use a mount command like this.

mount 192.168.27.105:/export/backup1 /backup -o rw,bg,hard,intr,rsize=131072,wsize=131072,nodev,nosuid,actimeo=0

Back up the source database

You can use OS copies or RMAN image copies to back up the database for use in the cloning process. Here are the RMAN commands we used; do not forget to create the target directory first.

sql 'alter database begin backup';
backup as copy database format '/backup/clone_backup/%U';
sql 'alter database end backup';
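
As a quick sanity check (not part of the clone procedure itself), you can confirm that no datafile was left in backup mode after the copy:

-- Should return no rows once 'alter database end backup' has completed
select file#, status from v$backup where status = 'ACTIVE';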

Prepare the clone db

To start the clone database, we need an init.ora file and a create controlfile script. You can back up the source database's control file to a trace file and use that. Run this in the source database to get the script; it will produce a trace file under the trace directory (/u01/app/oracle/diag/rdbms/dbm/dbm1/trace on this Exadata system).

SQL> alter database backup controlfile to trace;

Database altered.

After editing, this is the script we can use for the clone database.

CREATE CONTROLFILE REUSE SET DATABASE "clone" RESETLOGS  NOARCHIVELOG
    MAXLOGFILES 1024
    MAXLOGMEMBERS 5
    MAXDATAFILES 32767
    MAXINSTANCES 32
    MAXLOGHISTORY 33012
LOGFILE
  GROUP 1 (
    '/u01/app/oradata/clone/redo01.log'
  ) SIZE 4096M BLOCKSIZE 512,
  GROUP 2 (
    '/u01/app/oradata/clone/redo02.log'
  ) SIZE 4096M BLOCKSIZE 512,
  GROUP 3 (
    '/u01/app/oradata/clone/redo03.log'
  ) SIZE 4096M BLOCKSIZE 512
DATAFILE
'/backup/clone_backup/data_D-DBM_I-1222199824_TS-UNDOTBS1_FNO-3_bnm5ajrp',
'/backup/clone_backup/data_D-DBM_I-1222199824_TS-SYSTEM_FNO-1_blm5ajro',
'/backup/clone_backup/data_D-DBM_I-1222199824_TS-SYSAUX_FNO-2_bmm5ajro',
'/backup/clone_backup/data_D-DBM_I-1222199824_TS-UNDOTBS2_FNO-4_bom5ajrp',
'/backup/clone_backup/data_D-DBM_I-1222199824_TS-GNS_DATA01_FNO-7_bqm5ajrp',
'/backup/clone_backup/data_D-DBM_I-1222199824_TS-USERS_FNO-5_bpm5ajrp',
'/backup/clone_backup/data_D-DBM_I-1222199824_TS-USERS_FNO-9_bsm5ajrp',
'/backup/clone_backup/data_D-DBM_I-1222199824_TS-USERS_FNO-10_btm5ajrp',
'/backup/clone_backup/data_D-DBM_I-1222199824_TS-USERS_FNO-8_brm5ajrp'
CHARACTER SET AL32UTF8
;

/u01/app/oradata/clone is a directory on the local disks; you can also use NFS for the redo logs if you want to. The DATAFILE section lists the image copies we just produced with RMAN. You can get this list using the following SQL; be careful about the completion time, because you may have previous image copies in the same directory.

select name,completion_time from V$BACKUP_COPY_DETAILS;

Now we need an init.ora file; we can just copy the source database's file and edit it.

SQL> create pfile='/backup/clone.ora' from spfile;

File created.

Since the source database is a RAC database, you need to remove the RAC-related parameters (such as cluster_database). You also need to change the paths to reflect the new clone database, as in the control_files parameter. Here is the control_files parameter in this test.

*.control_files='/u01/app/oradata/clone/control.ctl'

I also use a local directory, not NFS, for the control file.
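
Just as an illustration, the edited clone.ora ends up looking roughly like this; the exact parameter list depends on your source spfile, and the paths here are only examples.

*.db_name='clone'
*.control_files='/u01/app/oradata/clone/control.ctl'
*.audit_file_dest='/u01/app/oracle/admin/clone/adump'
# RAC-only parameters such as cluster_database, cluster_interconnects and the
# instance-prefixed entries (dbm1.*, dbm2.*) are removed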

There is one parameter you need to add when cloning a RAC database to a single-instance database.

_no_recovery_through_resetlogs=TRUE

If you do not set this parameter, you will get an error when you try to open the clone database with RESETLOGS. MOS note 334899.1 explains why it is needed. This is the error you will get when opening the database without it.

RMAN> sql 'alter database open resetlogs';

sql statement: alter database open resetlogs
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of sql command on default channel at 02/22/2011 16:13:07
RMAN-11003: failure during parse/execution of SQL statement: alter database open resetlogs
ORA-38856: cannot mark instance UNNAMED_INSTANCE_2 (redo thread 2) as enabled

Now we are ready to create the clone database.

Create the clone db

After preparing the init.ora file and the create controlfile script we can create the database.

export ORACLE_SID=clone
SQL> startup nomount pfile='/backup/clone.ora';
ORACLE instance started.

Total System Global Area 3991842816 bytes
Fixed Size                  2232648 bytes
Variable Size             754978488 bytes
Database Buffers         3087007744 bytes
Redo Buffers              147623936 bytes

SQL>  @cr_control

Control file created.

Now we need to rename the datafiles so that they point to the NFS location we want; dbms_dnfs is the package used for this.

begin
dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-UNDOTBS1_FNO-3_bnm5ajrp','/backup/oradata/undotbs1.263.740679581');
dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-SYSTEM_FNO-1_blm5ajro','/backup/oradata/system.261.740679559');
dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-SYSAUX_FNO-2_bmm5ajro','/backup/oradata/sysaux.262.740679571');
dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-UNDOTBS2_FNO-4_bom5ajrp','/backup/oradata/undotbs2.265.740679601');
dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-GNS_DATA01_FNO-7_bqm5ajrp','/backup/oradata/gns_data01.264.741356977');
dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-USERS_FNO-5_bpm5ajrp','/backup/oradata/users.266.740679611');
dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-USERS_FNO-9_bsm5ajrp','/backup/oradata/users.274.741357097');
dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-USERS_FNO-10_btm5ajrp','/backup/oradata/users.275.741357121');
dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-USERS_FNO-8_brm5ajrp','/backup/oradata/users.273.741357075');
end;
/

The first parameter to clonedb_renamefile is the backup image copy name we used in the create controlfile script; the second parameter is the target filename, which must reside on NFS. You can generate this script using the following SQL on the source database.

select 'dbms_dnfs.clonedb_renamefile('''||b.name||''',''/backup/oradata/'||
       substr(d.file_name,instr(d.file_name,'/',-1)+1)||''');'
from v$backup_copy_details b,dba_data_files d
where b.file#=d.file_id
and b.completion_time>sysdate-1/24;

If you have multiple image copies, be careful with the completion_time predicate. In this example I am looking at the image copies from the last hour.

If the target location in the dbms_dnfs call is not on NFS, here is what you will get:

SQL> exec dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-USERS_FNO-10_b1m59uem','/u01/app/oradata/clone/users.275.741357121');
BEGIN dbms_dnfs.clonedb_renamefile('/backup/clone_backup/data_D-DBM_I-1222199824_TS-USERS_FNO-10_b1m59uem','/u01/app/oradata/clone/users.275.741357121'); END;

*
ERROR at line 1:
ORA-01511: error in renaming log/data files
ORA-17513: dNFS package call failed
ORA-06512: at "SYS.X$DBMS_DNFS", line 10
ORA-06512: at line 1

I got this error on my first try and searched MOS and Google for some time with no results. Then I realized that my target location was not on NFS but on local disk.

Now that the datafiles are renamed and pointing at the NFS location, we can open the clone database.

SQL> alter database open resetlogs;

Database altered.

At this point we have a clone database ready to use.
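
To double-check that the clone is really using the NFS copies, a quick look at the datafile names the clone instance sees is enough; this is just a verification query, not part of the procedure:

-- All datafiles should now point at the /backup/oradata NFS location
select file#, name from v$datafile order by file#;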

The target directory on NFS now shows our files.

[oracle@dm01db01 oradata]$ ls -l
total 98523
-rw-r-----+ 1 oracle dba 10737426432 Feb 22 16:44 gns_data01.264.741356977
-rw-r-----+ 1 oracle dba 17179877376 Feb 22 16:44 sysaux.262.740679571
-rw-r-----+ 1 oracle dba 17179877376 Feb 22 16:44 system.261.740679559
-rw-r-----+ 1 oracle dba 17179877376 Feb 22 16:44 undotbs1.263.740679581
-rw-r-----+ 1 oracle dba 17179877376 Feb 22 16:44 undotbs2.265.740679601
-rw-r-----+ 1 oracle dba 10737426432 Feb 22 16:44 users.266.740679611
-rw-r-----+ 1 oracle dba  2147491840 Feb 22 16:44 users.273.741357075
-rw-r-----+ 1 oracle dba  2147491840 Feb 22 16:44 users.274.741357097
-rw-r-----+ 1 oracle dba  2147491840 Feb 22 16:44 users.275.741357121

Even though ls shows their sizes as equal to the source database's file sizes, du shows us the truth.

[oracle@dm01db01 oradata]$ du -hs *
129K    gns_data01.264.741356977
74M     sysaux.262.740679571
2.8M    system.261.740679559
15M     undotbs1.263.740679581
5.4M    undotbs2.265.740679601
129K    users.266.740679611
129K    users.273.741357075
129K    users.274.741357097
129K    users.275.741357121

The datafile sizes are minimal since we have not done any write activity on the clone database yet. These sizes will grow after some activity.
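
If your patch set exposes the v$clonedfile view, it shows the mapping between the backup image copies and the sparse clone files, which is another way to see what dbms_dnfs set up; treat this as a sketch:

-- Shows the backup copy each clone file is based on, plus read/write counters
select * from v$clonedfile;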

Tuesday, January 04, 2011

Database smart flash cache wait events

When you enable the database smart flash cache and start using it, you will see new wait events related to it. These events help you determine whether a problem is related to the flash cache or not.

The ones I have faced so far are "db flash cache single block physical read", "db flash cache multiblock physical read" and "write complete waits: flash cache". These are from an 11.2.0.1 database using the F5100 flash array as the database smart flash cache.
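
If you want to look at these events outside of AWR, a query like the one below against v$system_event gives the instance-wide totals; the derived columns are just my own arithmetic:

-- Instance-wide totals for the flash cache related wait events
select event, total_waits,
       round(time_waited_micro/1000000) as time_waited_s,
       round(time_waited_micro/1000/nullif(total_waits, 0), 2) as avg_wait_ms
from v$system_event
where event like '%flash cache%'
order by time_waited_micro desc;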

db flash cache single block physical read

"db flash cache single block physical read" is the flash cache equivalent of "db file sequential read". Read waits from the flash cache are not accounted for in the "db file sequential read" event and have their own wait event. The following is from an AWR report of 30 mins from a database using the database smart flash cache.


Top 5 Timed Foreground Events

Event                                        Waits        Time(s)  Avg wait (ms)  % DB time  Wait Class
db file sequential read                      649,576      15,469   24             47.33      User I/O
DB CPU                                                    10,451                  31.98
db flash cache single block physical read    11,217,303   5,044    0              15.43      User I/O
buffer busy waits                            71,323       1,845    26             5.65       Concurrency
log file switch (checkpoint incomplete)      2,305        251      109            0.77       Configuration


There are over 11 million "db flash cache single block physical read" waits, which took about 0.44ms on average (AWR reports this as 0ms). "db file sequential read" waits number over 600,000. This means we had a high flash cache hit ratio; most of the reads were coming from the flash cache, not the disks.

This is the wait event histogram from the same AWR report.

Wait Event Histogram (% of waits in each bucket)

Event                                        Total Waits   <1ms   <2ms   <4ms   <8ms   <16ms   <32ms   <=1s   >1s
db flash cache single block physical read    11.2M         99.0   .9     .1     .0     .0      .0      .0


99% of all flash cache single block reads were under 1ms, and none of them were over 4ms.
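
You can get the same distribution outside of AWR from v$event_histogram; a minimal sketch for the single block event:

-- Bucket counts since instance startup for the single block flash cache reads
select wait_time_milli, wait_count
from v$event_histogram
where event = 'db flash cache single block physical read'
order by wait_time_milli;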

db flash cache multiblock physical read

"db flash cache multiblock physical read" is the flash cache equivalent of "db file scattered read". It is the event we see when we are reading multiple blocks from the flash cache. The AWR report I am using in this post does not contain many multiblock operations, but here is the event from the foreground wait events section.


Event                                        Waits    %Time-outs   Total Wait Time (s)   Avg wait (ms)   Waits/txn   % DB time
db flash cache multiblock physical read      1,048    0            1                     1               0.00        0.00

We had 1,048 waits with an average wait time of 1ms.

Wait Event Histogram (% of waits in each bucket)

Event                                        Total Waits   <1ms   <2ms   <4ms   <8ms   <16ms   <32ms   <=1s   >1s
db flash cache multiblock physical read      1171          83.2   12.4   3.2    .9     .3



The wait event histogram shows that, again, most of the waits are below 1ms, with some going up to 16ms.

write complete waits: flash cache

This is the wait event we see when DBWR is trying to write a block from the buffer cache to the flash cache and a session wants to modify that block. The session waits on this event and goes on with its update after DBWR finishes its job. You can find a good explanation and a good example of this in Guy Harrison's post. Here are the event and its histogram from the same AWR report.


Event                                        Waits    %Time-outs   Total Wait Time (s)   Avg wait (ms)   Waits/txn   % DB time
write complete waits: flash cache            345      0            1                     4               0.00        0.00

Wait Event Histogram (% of waits in each bucket)

Event                                        Total Waits   <1ms   <2ms   <4ms   <8ms   <16ms   <32ms   <=1s   >1s
write complete waits: flash cache            345           22.9   23.2   27.2   19.7   5.5     .3      1.2

I had 345 waits in 30 minutes with an average wait time of 4ms. But the important thing is that the contribution of this event to DB time was only 1 second in a 30-minute workload.

This event can start climbing in some situations, especially if there is a problem with your flash cache device and writes to it start taking longer.

Here is one case where the poor DBWR processes were struggling to write to the flash cache. Database Control was showing high waits in the "Configuration" class, and those waits were "write complete waits", "free buffer waits" and "write complete waits: flash cache". This was all because my flash cache device was a single conventional HDD, not even an SSD. After changing the db_flash_cache_file parameter to use the F5100 array, this picture was not seen anymore.
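
For reference, the database smart flash cache itself is configured with just two initialization parameters; the device path and size below are only placeholders for your own flash device:

-- Placeholder path and size; set in the spfile and restart the instance for the change to take effect
alter system set db_flash_cache_file = '/dev/flash_device' scope = spfile;
alter system set db_flash_cache_size = 100G scope = spfile;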