• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
Techgoeasy

Techgoeasy

Learn Oracle, PHP, HTML,CSS,Perl,UNIX shell scripts

  • Home
  • Oracle
    • Oracle database
    • Oracle Ebusiness Suite
    • Oracle weblogic
    • Oracle Performance Tuning
    • Oracle Hyperion
    • Oracle Cloud Tutorials
  • SQL
  • interview questions
  • Linux
  • PHP and HTML
  • Downloads
  • General
Home » Oracle » approx_count_distinct : Oracle Database 12c

approx_count_distinct : Oracle Database 12c

November 8, 2017 by techgoeasy Leave a Comment

Table of Contents

  • Introduction
  • approx_count_distinct Definition
  • How it works
  • Optimizer Changes
  • Brief History
  • How to use it

Introduction

approx_count_distinct is the new function included in 12c to estimate the distinct values in the column in a faster manner without deviating much from actual values.
It returns approximate number of rows that contain distinct values of expression.

approx_count_distinct Definition

We know that it take considerable time to project desired output using traditional COUNT(DISTINCT) appraoch.

With Oracle 12c (12.1.0.2), we have a function “APPROX_COUNT_DISTINCT” , which is claim to faster then tradition COUNT(DISTINCT <>) approach to get an idea on NDV.

It’s alternative to the COUNT (DISTINCT expr) function, which returns the exact number of rows that contain distinct values of expr.

For processing large amounts of data it’s significantly faster than COUNT, with negligible deviation of values from the exact result.

Statistically, the approx_count_distinct approximations provide a statistically insignificant difference from count distinct, so the approximation is statistically valid.

The APPROX_COUNT_DISTINCT() function ignores records that contain a null value for the expression. Plus is performs less work on the sorting and aggregations

How it works

In a traditional count distinct, Oracle’s read consistency mechanism is invoked, causing a large time lag when counting the number of distinct values in a very large table. Also, as the number of distinct values increase, the elapsed time and memory usage of the count distinct increases drastically.

In contrast, the approx_count_distinct bypasses the read consistency mechanism and give a fast and relatively accurate approximation of the number of distinct values in a table column.

Optimizer Changes

With APPROX_COUNT_DISTINCT we got an new Aggregation operation in execution plan “SORT AGGREGATE APPROX”.
As compare to tradition approach, we would need to perform GROUP BY and then AGGREGATION operation to serve COUNT and DISTINCT result.

If for any analysis, we just need an estimate of NDV with acceptable variation from actual values.
APPROX_COUNT_DISTINCT is the function to be used

 

Brief History

The APPROX_COUNT_DISTINCT function was added, but not documented, in Oracle 11g to improve the speed of calculating the number of distinct values when gathering statistics using the DBMS_STATS package. Oracle database 12c (12.1.0.2) now includes the function in the documentation, so we are free to use it in our applications as a supported SQL function.

How to use it

Traditional way

SELECT
COUNT(DISTINCT co11),
COUNT(DISTINCT col2)
FROM EXP_TABLE;

Using this function

SELECT
APPROX_COUNT_DISTINCT(co11),
APPROX_COUNT_DISTINCT(col2),
FROM EXP_TABLE;

You can compare the approximation also and see the difference in performance

Related Articles

New Optimizer Feature with 11g and 12c

Online move of active datafile in 12c

5 Simple (But Important) Things To Remember About Oracle Database 12c views ,parameters and packages

Top Oracle 12c New Features for developers

Very useful 10 new things in 12c database

Filed Under: Oracle, Oracle Sql

Reader Interactions

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Primary Sidebar



Subscribe to our mailing list

Enter your email address to subscribe to this blog and receive notifications of new posts by email

Recent Posts

  • Oracle tkprof utility
  • What is Skew in Database?
  • how to find file based on size in linux
  • SQL trace, 10046 event in Oracle
  • How to use sed replace string in file

Copyright © 2023 : TechGoEasy

  • Hire me
  • Privacy Policy
  • Contact Us
  • New? Start Here
  • About Us