Wednesday, December 29, 2010

Using an ASM diskgroup as flash cache

We have been testing the F5100 flash array in our humble lab (borrowed that term from a colleague, he knows who he is). There are two ways to use it, one is to place your datafiles on it, the other is to use it as the database flash cache.

The database flash cache feature came with 11.2 and is a way to extend the SGA. It is not the same thing as the flash cache in Exadata, read Kevin Closson's this post to find out what the difference is. F5100 is one of the products you can use as the flash cache, the other is the F20 card.

It is possible and (may be the best option) to use ASM to configure the flash cache. The documentation states that you can use a filename or an ASM diskgroup name for db_flash_cache_file parameter which is the parameter to enable the flash cache feature.

So, how do we do this?

The first step is creating an ASM diskgroup on the flash devices. In my test, for simplicity I use one flash device which is /dev/rdsk/c1t9d0s0. This is just one flash module (of size 24GB) from F5100. The process for creating the diskgroup is no different than creating a diskgroup on conventional disks, just make sure your asm_diskstring includes the flash device path. By using asmca I created a diskgroup named FLASH using external redundancy on this single flash device.

Then following the documentation I set the parameters to enable the flash cache.


SQL> alter system set db_flash_cache_size=20G scope=spfile;


System altered.


SQL> alter system set db_flash_cache_file='+FLASH' scope=spfile;

System altered.

Now time to restart to make the new parameters effective.

SQL> startup force;
ORACLE instance started.

Total System Global Area 2606465024 bytes
Fixed Size                  2150840 bytes
Variable Size            2113932872 bytes
Database Buffers          469762048 bytes
Redo Buffers               20619264 bytes
Database mounted.
ORA-03113: end-of-file on communication channel
Process ID: 17560
Session ID: 2471 Serial number: 3

The instance fails to start. Looking at the alert log file we see this.

Errors in file /u01/app/oracle/diag/rdbms/flash/flash/trace/flash_dbw0_18053.trc:
ORA-01565: error in identifying file '+FLASH'
ORA-17503: ksfdopn:2 Failed to open file +FLASH
ORA-15045: ASM file name '+FLASH' is not in reference form
DBW0 (ospid: 18053): terminating the instance due to error 1565

So it seems that we need to specify the actual file name, not the name of the diskgroup. Let's see what we have in the diskgroup FLASH using amscmd.

ASMCMD> pwd
+FLASH/FLASH/FLASHFILE
ASMCMD> ls
bufpool#2147472020.256.739041793

Even if we got an error when starting the instance there is a file created under +FLASH/FLASH/FLASHFILE. The first FLASH in the path is the diskgroup name, the second is my database name.

What we need to do is set this file as the db_flash_cache_file parameter to be able to start the instance with flash cache enabled. I edit my text init.ora file and change the parameter.

db_flash_cache_file='+FLASH/FLASH/FLASHFILE/bufpool#2147472020.256.739041793'

Now we can start the instance.

SQL> create spfile from pfile='/tmp/x.ora';

File created.

SQL> startup
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area 2606465024 bytes
Fixed Size                  2150840 bytes
Variable Size            2113932872 bytes
Database Buffers          469762048 bytes
Redo Buffers               20619264 bytes
Database mounted.
Database opened.
SQL> show parameter flash_cache

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_flash_cache_file                  string      +FLASH/flash/flashfile/bufpool
                                                 #2147472020.256.739041793
db_flash_cache_size                  big integer 20G

The instance is up and I can see that the flash cache is enabled and using the correct ASM diskgroup.

The documentation states that we can use the diskgroup name but as we saw it needs some correction.

A problem and a lesson

The path I followed to this point is a little embarrassing and teaches a lesson about setting parameters.

At first I used db_flash_cache_file='FLASH' assuming it would use the ASM diskgroup. After restarting the instance I immediately started a workload on the database to see the effects of the flash cache. Here is what I saw in the Enterprise Manager performance page.


The waits of class "Configuration" were holding back the database. When I clicked on "Configuration" link I saw this.



The system was mostly waiting on "free buffer waits", "write complete waits" and "write complete waits: flash cache". This is because of DBWR. Someone has to write the blocks from the buffer cache to the flash when the buffer cache is full. This process is DBWR. Since I love using the DTrace toolkit I used the iosnoop script in that toolkit to find out what DBWR was doing. iosnoop can show you what file a process reading from or writing to most. So I ran iosnoop for one of the DBWR processes.

bash-3.00# ./iosnoop -p 16631
  UID   PID D     BLOCK   SIZE       COMM PATHNAME
  101 16631 W    175311   8704     oracle
  101 16631 W 246883104   8192     oracle /u01/app/oracle/product/11.2.0/dbhome_1/dbs/FLASH
  101 16631 W 246883104   8192     oracle /u01/app/oracle/product/11.2.0/dbhome_1/dbs/FLASH
  101 16631 W    175328   8704     oracle
  101 16631 W 246883120   8192     oracle /u01/app/oracle/product/11.2.0/dbhome_1/dbs/FLASH
  101 16631 W 246883120   8192     oracle /u01/app/oracle/product/11.2.0/dbhome_1/dbs/FLASH
  101 16631 W    175345   8704     oracle
  101 16631 W 246883136   8192     oracle /u01/app/oracle/product/11.2.0/dbhome_1/dbs/FLASH
  101 16631 W 246883136   8192     oracle /u01/app/oracle/product/11.2.0/dbhome_1/dbs/FLASH

The file it is trying to write to is $ORACLE_HOME/dbs/FLASH and since that directory is on a slow local disk it is taking a long time and causing the waits.

If I had looked at the db_flash_cache_file parameter after restarting the database to see if it was alright I would have seen this before starting the workload. So, once more it teaches to check if the parameter was set the way you want it before taking further action.

0 comments: