XMLDB Performance: The Side Effects of a Simple “count(*)”

While setting up a baseline for my XMLDB performance tests, I noticed that my “count(*)” on a Binary XML table (using Securefile LOB storage) called “WIKI_STAGE” took an awful long time. So long, that I even had to kill the SQL*Plus session, that was executing the “count(*)”. I started wondering. Why did it take so long to come up with the result?

In the end, due to good advice from Jonathan Lewis, I came up with a solution (although probably unsupported) and a better understanding off the mechanics involved. Also, as a side effect, it triggered a really good discussion on the “Oracle-L” freelist, regarding “counting”.

But lets start from the beginning…

The Setup

As you may still know, after reading “Setting Up an XMLDB Performance “Baseline” Environment“, I am busy trying to figure out the best baseline environment to do some serious XML data loading and testing on different setup’s. Its “hard” work, every load session (within my current setup) takes roundabout 20 hours to complete. The original data, an XML dumpfile from Mediawiki (English) is 17 Gb in size (on Windows NTFS) and contains approximately 7 million records.

I have a Oracle schema called “wiki”, granted the dba and xdbadmin role, with a default tablespace called MEDIAWIKI_STAGE. The rest of the details and / or RDA output will follow when I post testing results.

This tablespace MEDIAWIKI_STAGE has the following properties in a Oracle 11.1.0.6.0 Enterprise Edition database, situated on Oracle Enterprise Linux (32bit) Version 5 (base release):


SQL> select * from v$version;

BANNER
----------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE    11.1.0.6.0      Production
TNS for Linux: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production

5 rows selected.

SQL> set pages 5000
SQL> set long 10000000
SQL> col "COL" for a200

SQL> select dbms_metadata.get_ddl('TABLESPACE','MEDIAWIKI_STAGE')        
  2  as "COL" from dual;

COL
----------------------------------------------------------------------
  CREATE BIGFILE TABLESPACE "MEDIAWIKI_STAGE" DATAFILE
  '/u02/oracle/oradata/TEST/mediawiki_stage.dbf' SIZE 26843545600
  AUTOEXTEND ON NEXT 1073741824 MAXSIZE 46080M
  LOGGING ONLINE PERMANENT BLOCKSIZE 8192
  EXTENT MANAGEMENT LOCAL AUTOALLOCATE SEGMENT SPACE MANAGEMENT AUTO
   ALTER DATABASE DATAFILE
  '/u02/oracle/oradata/TEST/mediawiki_stage.dbf' RESIZE 28991029248

1 row selected.

The table “WIKI_STAGE” which is build on this tablespace, has the following characteristics, after it had been filled with the XML data and the Securefile lob was compressed with “COMPRESS HIGH”:

SQL> select dbms_metadata.get_ddl('TABLE','WIKI_STAGE')
  2  as "COL"
  3  from dual;

COL
----------------------------------------------------------------------
  CREATE TABLE "WIKI"."WIKI_STAGE" OF "SYS"."XMLTYPE"
 OIDINDEX  ( PCTFREE 10 INITRANS 2 MAXTRANS 255
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "MEDIAWIKI_STAGE" )
 PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "MEDIAWIKI_STAGE"
 XMLTYPE COLUMN "SYS_NC_ROWINFO$" STORE AS SECUREFILE BINARY XML (
  TABLESPACE "MEDIAWIKI_STAGE" ENABLE STORAGE IN ROW CHUNK 8192
  NOCACHE LOGGING  DECRYPT  COMPRESS HIGH  KEEP_DUPLICATES
  STORAGE(INITIAL 106496 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 BUFFER_POOL DEFAULT)) ALLOW NONSCHEMA DISALLOW ANYSCHEMA

1 row selected.

After the loading process had been successfully executed, I gathered the following information:

SQL> @segment

-- an old script off mine, based on dba_segments

Segments:

Enter value for segment_name:
Enter value for segment_type:
Enter value for owner:
Enter value for extents:
Enter value for tablespace_name: MEDIAWIKI_STAGE

Segment                        Type                   Bytes  Tot.Ext      IniSize     NextSize  %Inc  MinE    MaxE Owner           TableSpace
------------------------------ --------------- ------------ -------- ------------ ------------ ----- ----- ------- --------------- --------------------
SYS_LOB0000070946C00003$$      LOBSEGMENT       18909364224      456       106496                        1 ####### WIKI            MEDIAWIKI_STAGE
WIKI_STAGE                     TABLE             8319401984      309        65536                        1 ####### WIKI            MEDIAWIKI_STAGE
SYS_C009650                    INDEX              352321536      113        65536                        1 ####### WIKI            MEDIAWIKI_STAGE
SYS_IL0000070946C00003$$       LOBINDEX               65536        1        65536                        1 ####### WIKI            MEDIAWIKI_STAGE
                                               ------------
sum                                             27581153280

SQL> select dbms_metadata.get_ddl('INDEX','SYS_C009650','WIKI') as "COL"
  2  from dual;

COL
----------------------------------------------------------------------
  CREATE UNIQUE INDEX "WIKI"."SYS_C009650" ON "WIKI"."WIKI_STAGE" ("SYS_NC_OID$")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "MEDIAWIKI_STAGE"

1 row selected.

.

Count(*) – A Special Case?

I am “an Oldtimer” regarding Oracle database usage and maybe one off the myths I remember and wondered about while assessing the problem was that I believed that a “count(*)” is special. Special in the way that the Oracle database does smart things with it, performance wise. Mind you I am not talking about “count(1)” or “count(column)” but only about the special case: “count(*)”.

While doing a simple “select count(*) from wiki_stage”, I noticed that it took far to long as I would have expected, and killed the session after a few hours.

12:26:24 SQL> select count(*) from wiki_stage;
select count(*) from wiki_stage
                     *
ERROR at line 1:
ORA-00600: internal error code, arguments: [%s], [%s], [%s], [%s], [%s], [%s],
[%s], [%s]
.

I tried to figure out what was going on so I did an explain plan:

15:47:55 SQL> explain plan for
15:48:24   2  select count(*) from wiki_stage;

Explained.

Elapsed: 00:00:00.02

15:48:31 SQL> SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE',NULL,'ALL'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
Plan hash value: 3207986117

-------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |   275K  (1)| 00:55:03 |
|   1 |  SORT AGGREGATE    |            |     1 |            |          |
|   2 |   TABLE ACCESS FULL| WIKI_STAGE |  6921K|   275K  (1)| 00:55:03 |
-------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
   1 - SEL$1
   2 - SEL$1 / WIKI_STAGE@SEL$1

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=0) COUNT(*)[22]

Note

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
   - dynamic sampling used for this statement

24 rows selected.

Elapsed: 00:00:00.02
.

Hmmm, full table access.

That explains why it took so long. But why is is doing a full table access in the first place, I thought?

There is a smarter way to do the count, so why doesn’t use the Cost Based Optimizer (CBO) the index available…

The “WIKI_STAGE” table has a hidden OID column called SYS_NC_OID$ and on this column is a UNIQUE index called “SYS_C009650”. If the CBO would have used it the following would have happened:

15:14:43 SQL> select count(SYS_NC_OID$) from wiki_stage;

COUNT(SYS_NC_OID$)
------------------
           6996002

Elapsed: 00:03:05.32

-- to get more info I did a "set autotrace on"


15:41:25 SQL>  select count(SYS_NC_OID$) from wiki_stage;

COUNT(SYS_NC_OID$)
------------------
           6996002

Elapsed: 00:00:00.36

Execution Plan
----------------------------------------------------------
Plan hash value: 1000365655

-------------------------------------------------------------------------------------
| Id  | Operation             | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |             |     1 |    10 | 11596   (1)| 00:02:20 |
|   1 |  SORT AGGREGATE       |             |     1 |    10 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C009650 |  6921K|    66M| 11596   (1)| 00:02:20 |
-------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
       1233  recursive calls
          0  db block gets
      42748  consistent gets
          0  physical reads
          0  redo size
        431  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         24  sorts (memory)
          0  sorts (disk)
          1  rows processed


-- and also tried an "explain plan for"


15:47:52 SQL> SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE',NULL,'ALL'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
Plan hash value: 1000365655

-------------------------------------------------------------------------------------
| Id  | Operation             | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |             |     1 |    10 | 11596   (1)| 00:02:20 |
|   1 |  SORT AGGREGATE       |             |     1 |    10 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C009650 |  6921K|    66M| 11596   (1)| 00:02:20 |
-------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------

   1 - SEL$1
   2 - SEL$1 / WIKI_STAGE@SEL$1

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=0) COUNT("SYS_NC_OID$")[22]
   2 - "SYS_NC_OID$"[RAW,16]


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

25 rows selected.
.

I saw being mentioned “dynamic sampling used for this statement” in the output and thought, OK, probably that is the issue, I forgot to sample for statistics. So I retried it but now generating statistics first.

21:48:29 SQL>  exec dbms_stats.gather_schema_stats('WIKI', cascade=>TRUE);
BEGIN dbms_stats.gather_schema_stats('WIKI', cascade=>TRUE); END;

PL/SQL procedure successfully completed.

Elapsed: 01:08:50.54


09:09:57 SQL> explain plan for
09:10:00   2  select count(*) from wiki_stage;

Explained.

Elapsed: 00:00:00.05
09:11:51 SQL> SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE',NULL,'ALL'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
Plan hash value: 3207986117

-------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |   275K  (1)| 00:55:03 |
|   1 |  SORT AGGREGATE    |            |     1 |            |          |
|   2 |   TABLE ACCESS FULL| WIKI_STAGE |  6996K|   275K  (1)| 00:55:03 |
-------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):

PLAN_TABLE_OUTPUT
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$1 / WIKI_STAGE@SEL$1

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=0) COUNT(*)[22]

20 rows selected.

Elapsed: 00:00:00.14

09:11:59 SQL> explain plan for
09:12:31   2  select count(SYS_NC_OID$) from wiki_stage;

Explained.

Elapsed: 00:00:00.02

09:12:33 SQL> SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE',NULL,'ALL'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
Plan hash value: 1000365655

-------------------------------------------------------------------------------------
| Id  | Operation             | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |             |     1 |    17 | 11564   (1)| 00:02:19 |
|   1 |  SORT AGGREGATE       |             |     1 |    17 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C009650 |  6996K|   113M| 11564   (1)| 00:02:19 |
-------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):

PLAN_TABLE_OUTPUT
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$1 / WIKI_STAGE@SEL$1

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=0) COUNT("SYS_NC_OID$")[22]
   2 - "SYS_NC_OID$"[RAW,16]

21 rows selected.

Elapsed: 00:00:00.04

09:12:35 SQL> set autotrace on
09:15:54 SQL>  select count(SYS_NC_OID$) from wiki_stage;

COUNT(SYS_NC_OID$)
------------------
           6996002

Elapsed: 00:00:21.34

Execution Plan
----------------------------------------------------------
Plan hash value: 1000365655

-------------------------------------------------------------------------------------
| Id  | Operation             | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |             |     1 |    17 | 11564   (1)| 00:02:19 |
|   1 |  SORT AGGREGATE       |             |     1 |    17 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C009650 |  6996K|   113M| 11564   (1)| 00:02:19 |
-------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      42373  consistent gets
       5207  physical reads
          0  redo size
        431  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
.

Nope, same effect. That is, the “select count(*) on wiki_stage” still used a full table access to do the count on the table and ignores the fact that there is an index that could be used to achieve the same.

Finding Answers via Oracle-L

While being stuck for answers, I sought help via my peers on the “Oracle-L” Freelist. As said, the discussion progressed far beyond the context of my initial discussion, but is definitely a good read regarding the internal mechanics involved. I have to read up on a Masterclass from Steve Adams I once followed, to bring me up to speed again on the context off dumping block segments, but most of it could make sense regarding an opportunity for the CBO to maximize a smart mechanism for doing a “count(*)”.

Anyway, the following applies to this situation:

http://www.freelists.org/archives/oracle-l/07-2008/msg00378.html

Re: Performance off “count(*)”

* From: “Jonathan Lewis” [jonathan @xxxxxxxxxxxxxxxxxx]
* To: [oracle -l@xxxxxxxxxxxxx]
* Date: Fri, 18 Jul 2008 15:20:11 +0100

In principle the optimizer knows that count(*) means:

“how many rows in the table” and will choose the smallest object that could return the result to do the scan. This means it may choose to do a fast-full scan of an index if at least one column of the index is declared as not null.

Depending on version, the choice between indexes of identical cost may simply be based on alphabetical ordering of the names – although number of distinct keys is supposed to have an effect as well in recent versions

Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com

Author: Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

The Co-operative Oracle Users’ FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

OK, that sounded reasonable (and who am I to doubt Jonathan). 😉

A little bit later in in the discussion thread, Jonathan suggested the following:

http://www.freelists.org/archives/oracle-l/07-2008/msg00394.html

Re: Performance off “count(*)”

* From: “Jonathan Lewis” [jonathan @xxxxxxxxxxxxxxxxxx]
* To: [oracle -l@xxxxxxxxxxxxx]
* Date: Sat, 19 Jul 2008 09:33:44 +0100

Marco,

XMLType table are a version of object tables, which means they have a hidden OID column called sys_nc_oid$.

The problem seems to be that the sys_nc_oid$ column (the objects unique object id) does not get declared with a not null constraint. So the unique index on this column cannot be used for the count.

I can’t think of a mechanism that could (legally) ever allow the OID to be null, so it seems that you should be safe issuing

alter table XXX modify sys_nc_oid$ not null;

If you do, then Oracle will automatically use the index in the count(*).

I’d raise an SR to get confirmation that this is safe before doing it on live data though.

Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com

Author: Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

The Co-operative Oracle Users’ FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Unsupported Solutions to a “Count(*)”

For this database version and for this special case, being an XMLType table based on Binary XML, I can not see (yet) a reason why the CBO should not use a SYS_NC_OID$ column with the extra restriction “not null”.

While doing the unsupported (on my test database!), I got the following result:

22:57:16 SQL> alter table WIKI_STAGE
11:38:42   2  modify sys_nc_oid$ not null;

Table altered.

Elapsed: 01:07:28.80

12:46:11 SQL> exec dbms_stats.gather_schema_stats('WIKI',cascade=>TRUE);

PL/SQL procedure successfully completed.

Elapsed: 01:06:37.50

21:27:44 SQL> explain plan for
21:27:50   2  select count(*) from wiki_stage;

Explained.

Elapsed: 00:00:00.18

21:27:56 SQL> SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE',NULL,'ALL'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------
Plan hash value: 1000365655

-----------------------------------------------------------------------------
| Id  | Operation             | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |             |     1 | 11108   (1)| 00:02:14 |
|   1 |  SORT AGGREGATE       |             |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C009650 |  6996K| 11108   (1)| 00:02:14 |
-----------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------
   1 - SEL$1
   2 - SEL$1 / WIKI_STAGE@SEL$1

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=0) COUNT(*)[22]

20 rows selected.

Elapsed: 00:00:00.32

21:30:09 SQL> select count(*) from wiki_stage;

  COUNT(*)
----------
   6996002

Elapsed: 00:00:48.35

Execution Plan
----------------------------------------------------------
Plan hash value: 1000365655

-----------------------------------------------------------------------------
| Id  | Operation             | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |             |     1 | 11108   (1)| 00:02:14 |
|   1 |  SORT AGGREGATE       |             |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C009650 |  6996K| 11108   (1)| 00:02:14 |
-----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      42373  consistent gets
      14470  physical reads
          0  redo size
        421  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
.

Worked like clockwork, as it should do anyway following the theory, but then again, I have no knowledge of the internal mechanisms involved regarding Binary XML. I have also no idea if the “not null” restriction can always be applied in this (special) case. I will make a case with the XMLDB Development team, via a Service Request, and see if this could be an enhancement to the database and/or the Cost Based Optimizer.

HTH

Marco

Related Posts

XMLDB Performance: The Side Effects of a Simple “count(*)”

The Setup

Count(*) – A Special Case?

Finding Answers via Oracle-L

Unsupported Solutions to a “Count(*)”

2 Comments