Introduction
Even though Oracle has a wide variety of aggregates, not everything is implemented. For example, a basic product is lacking from the list. Oracle Data Cartridge helps to extend the functionality in the database and allows to create for example custom aggregates using PL/SQL. This article shows few simple examples of custom aggregates.
The First Aggregate, Product
Schema for Aggregates (Optional)
Before creating the aggregates, I chose to create a new schema. This is not mandatory but I felt that it would simplify administration if the types and functions that reside in a separate schema:
CREATE USER CustomAggregates
IDENTIFIED BY Calculations
DEFAULT TABLESPACE Users
TEMPORARY TABLESPACE TEMP;
GRANT CONNECT TO CustomAggregates;
GRANT RESOURCE TO CustomAggregates;
When running the user creation, change the default and temporary tablespaces suitable to your environment.
PL/SQL Implementation
Creating an aggregate requires to create a type including the header and the body. The header includes the variables needed to store intermediate values in the context and the functions which Oracle will call at runtime. The mandatory functions are:
ODCIAggregateInitialize
: This function is called to create a new context for a set of values. The purpose of this static
method is to return a new, initialized instance for the type. ODCIAggregateIterate
: Each value is passed to this method which is responsible for making the necessary calculations, etc. ODCIAggregateMerge
: This method is used when two contexts are merged into a single one. Merging happens if the original set of values are iterated using several different contexts. ODCIAggregateTerminate
: When all values have been handled and contexts are merged, this function returns the result.
Merging is best explained with a small (simplified) diagram:
The query processor may divide a set of values to a smaller subsets. When the aggregation is done, each group has its own instance of the type to handle the subset. Each instance is first initialized and the iteration is done for each value in the set. After iterations have been done, the context is merged to another context. Finally, when all the contexts have been merged, the aggregation is terminated and the result is returned to the consumer. For this reason, the aggregate must be designed to support operating on partial sets at iteration time.
The type definition for calculating the product looks like the following:
CREATE OR REPLACE TYPE CustomAggregates.ProductType AS OBJECT (
mProduct NUMBER,
mElements NUMBER,
STATIC FUNCTION ODCIAggregateInitialize
(sctx IN OUT CustomAggregates.ProductType)
RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateIterate
(self IN OUT CustomAggregates.ProductType,
value IN NUMBER)
RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateMerge
(self IN OUT CustomAggregates.ProductType,
ctx2 IN CustomAggregates.ProductType)
RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateTerminate
(self IN CustomAggregates.ProductType,
returnValue OUT NUMBER,
flags IN NUMBER)
RETURN NUMBER
);
The header defines two variables used for product calculation and all the mandatory functions. The body implements the logic:
CREATE OR REPLACE TYPE BODY CustomAggregates.ProductType IS
STATIC FUNCTION ODCIAggregateInitialize
( sctx IN OUT CustomAggregates.ProductType)
RETURN NUMBER IS
BEGIN
sctx := ProductType( TO_NUMBER(NULL), 0);
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateIterate
( self IN OUT CustomAggregates.ProductType,
value IN NUMBER)
RETURN NUMBER IS
BEGIN
IF (self.mElements = 0) THEN
self.mProduct := value;
ELSE
self.mProduct := self.mProduct * value;
END IF;
self.mElements := self.mElements + 1;
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateMerge
( self IN OUT CustomAggregates.ProductType,
ctx2 IN CustomAggregates.ProductType)
RETURN NUMBER IS
BEGIN
IF (self.mElements = 0) THEN
self.mProduct := ctx2.mProduct;
ELSIF (ctx2.mElements = 0) THEN
NULL;
ELSE
self.mProduct := self.mProduct * ctx2.mProduct;
END IF;
self.mElements := self.mElements + ctx2.mElements;
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateTerminate
( self IN CustomAggregates.ProductType,
returnValue OUT NUMBER,
flags IN NUMBER) RETURN NUMBER IS
BEGIN
returnValue := self.mProduct;
return ODCIConst.Success;
END;
END;
In the ODCIAggregateInitialize
, a new instance of ProductType
is created. The mProduct
variable is initialized with a null
and the mElements
with 0
.
All the methods return either ODCIConst.Success
or ODCIConst.Error
depending on whether the operation has succeeded. The ODCIAggregateIterate
method multiplies each new value with the existing product and increases the number of elements by 1
. ODCIAggregateMerge
multiplies products from both contexts if they have had values, otherwise it chooses the product from the context having values. And the ODCIAggregateTerminate
function simply sets the return value.
The last step is to create the aggregate itself with CREATE FUNCTION
statement:
CREATE OR REPLACE FUNCTION CustomAggregates.Product (value NUMBER) RETURN NUMBER
PARALLEL_ENABLE AGGREGATE USING CustomAggregates.ProductType;
The function is defined with PARALLEL_ENABLE
to let the optimizer know that calculation can be done in parallel. The AGGREGATE USING
clause defines the type which implements the aggregate.
Let’s test this:
SELECT CustomAggregates.Product(a.Value) AS Result
FROM ( SELECT 4 AS Value FROM DUAL UNION ALL
SELECT 2 AS Value FROM DUAL UNION ALL
SELECT 5 AS Value FROM DUAL) a;
The result is:
RESULT
----------
40
Well, that was somewhat expectable. What happens if there’s a null
in the values.
SELECT CustomAggregates.Product(a.Value) AS Result
FROM ( SELECT 4 AS Value FROM DUAL UNION ALL
SELECT 2 AS Value FROM DUAL UNION ALL
SELECT NULL AS Value FROM DUAL UNION ALL
SELECT 5 AS Value FROM DUAL) a;
The result is the same:
RESULT
----------
40
As the ANSI standard defines that null
s are ignored in aggregates such as SUM
, MIN
etc., they are not sent to the ODCIAggregateIterate
at all. This is why null
s were not handled differently in the function. Note that this behaviour is different in Oracle 11g where null
s are actually passed to custom aggregates.
Let's have a final test with an empty set:
SELECT CustomAggregates.Product(a.Value) AS Result
FROM ( SELECT 1 AS Value FROM DUAL WHERE 1=0) a;
The result is null
:
RESULT
----------
Few More Aggregates, Geometric Mean and Harmonic Mean
These aggregates are very similar as the product. Geometric mean is defined:
Now, since this calculation cannot be done in whole in iteration (in this form), we calculate the product in iteration and merge steps and the final result is calculated in the terminate function. So the difference is in the ODCIAggregateTerminate
function:
MEMBER FUNCTION ODCIAggregateTerminate
( self IN CustomAggregates.GeometricMeanType,
returnValue OUT NUMBER,
flags IN NUMBER)
RETURN NUMBER IS
BEGIN
IF ( self.mElements = 0) THEN
returnValue := NULL;
ELSE
returnValue := POWER(self.mProduct, 1 / self.mElements);
END IF;
return ODCIConst.Success;
END;
Harmonic mean is a bit different:
So, now instead of calculating the product, we summarize (1/value):
self.mSum := self.mSum + (1 / value);
Obviously, the merge only summarizes both contexts and the terminate returns:
returnValue := self.mElements / self.mSum;
Run the types and functions for both means from the script and then let’s test both of these:
SELECT CustomAggregates.GeometricMean(a.Value) AS Result
FROM ( SELECT 34 AS Value FROM DUAL UNION ALL
SELECT 27 AS Value FROM DUAL UNION ALL
SELECT 45 AS Value FROM DUAL UNION ALL
SELECT 55 AS Value FROM DUAL UNION ALL
SELECT 22 AS Value FROM DUAL UNION ALL
SELECT 34 AS Value FROM DUAL) a;
Results:
RESULT
----------
34,54511
And then the harmonic mean:
SELECT CustomAggregates.HarmonicMean(a.Value) AS Result
FROM ( SELECT 34 AS Value FROM DUAL UNION ALL
SELECT 27 AS Value FROM DUAL UNION ALL
SELECT 45 AS Value FROM DUAL UNION ALL
SELECT 55 AS Value FROM DUAL UNION ALL
SELECT 22 AS Value FROM DUAL UNION ALL
SELECT 34 AS Value FROM DUAL) a;
Is giving:
RESULT
----------
33,0179837
Analytic Functions
Custom aggregates can be used with analytic clauses. For example, if we divide the previous data to two different categories and we want to have the geometric mean for each category, the query looks like:
SELECT DISTINCT
CustomAggregates.GeometricMean(a.Value)
OVER (PARTITION BY a.Cat) AS Result
FROM ( SELECT 1 AS Cat, 34 AS Value FROM DUAL UNION ALL
SELECT 1 AS Cat, 27 AS Value FROM DUAL UNION ALL
SELECT 1 AS Cat, 45 AS Value FROM DUAL UNION ALL
SELECT 2 AS Cat, 55 AS Value FROM DUAL UNION ALL
SELECT 2 AS Cat, 22 AS Value FROM DUAL UNION ALL
SELECT 2 AS Cat, 34 AS Value FROM DUAL) a;
When this is run, the result is:
RESULT
----------
34,5213758
34,5688606
Using Other Datatypes as Parameters, Words-aggregate
Number is not the only datatype that can be used when creating aggregates. The last example is an aggregate which lists all the distinct words in varchar2
fields.
The implementation is:
CREATE OR REPLACE TYPE BODY CustomAggregates.WordsType IS
STATIC FUNCTION ODCIAggregateInitialize
( sctx IN OUT CustomAggregates.WordsType)
RETURN NUMBER IS
BEGIN
sctx := WordsType( ';', ';');
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateIterate
( self IN OUT CustomAggregates.WordsType,
value IN VARCHAR2) RETURN NUMBER IS
nStart NUMBER := 1;
nPosition NUMBER := 1;
sPortion VARCHAR2(32767);
BEGIN
WHILE nPosition <= LENGTH(value) LOOP
IF SUBSTR(value, nPosition, 1)
IN (' ', '.', ';', '/', ':', ',', '!', '?', '(', ')') THEN
sPortion := SUBSTR(value, nStart, nPosition - nStart + 1);
sPortion := LTRIM(
RTRIM( sPortion, ' ,.-;:_?=)(/&%¤#"!'), ' ,.-;:_?=)(/&%¤#"!');
IF LENGTH(sPortion) > 0 THEN
IF INSTR(mWords, self.mListDelimiter
|| sPortion || self.mListDelimiter)= 0 THEN
mWords := mWords || sPortion || self.mListDelimiter;
END IF;
END IF;
nStart := nPosition + 1;
END IF;
nPosition := nPosition + 1;
END LOOP;
sPortion := SUBSTR(value, nStart, nPosition - nStart + 1);
sPortion := LTRIM( RTRIM( sPortion, ' ,.-;:_?=)(/&%¤#"!'), ' ,.-;:_?=)(/&%¤#"!');
IF LENGTH(sPortion) > 0 THEN
IF INSTR(mWords, self.mListDelimiter || sPortion || _
self.mListDelimiter) = 0 THEN
mWords := mWords || sPortion || self.mListDelimiter;
END IF;
END IF;
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateMerge
( self IN OUT CustomAggregates.WordsType,
ctx2 IN CustomAggregates.WordsType)
RETURN NUMBER IS
BEGIN
RETURN self.ODCIAggregateIterate(ctx2.mWords);
END;
MEMBER FUNCTION ODCIAggregateTerminate
( self IN CustomAggregates.WordsType,
returnValue OUT VARCHAR2,
flags IN NUMBER)
RETURN NUMBER IS
BEGIN
returnValue := RTRIM(LTRIM_
(self.mWords, self.mListDelimiter), self.mListDelimiter);
return ODCIConst.Success;
END;
END;
Basically, the iterate
function goes char
by char
and searches for delimiters. If a word is found, it’s checked against the existing word list and if it’s not present, it will be added. The result of this aggregate is a list of words delimited by semicolon (;).
To test this, let’s take a few simple character string
s:
SELECT CustomAggregates.Words(a.Value) AS Result
FROM ( SELECT 'This is the first string' AS Value FROM DUAL UNION ALL
SELECT 'And this is the second string' AS Value FROM DUAL) a;
The result is:
RESULT
--------------------------------------------------------------------------------
This;is;the;first;string;And;this;second
So the result is case sensitive since the word This
is listed twice. If we want to get the list case insensitively, execute:
SELECT CustomAggregates.Words(LOWER(a.Value)) AS Result
FROM ( SELECT 'This is the first string' AS Value FROM DUAL UNION ALL
SELECT 'And this is the second string' AS Value FROM DUAL) a;
And the result is:
RESULT
--------------------------------------------------------------------------------
this;is;the;first;string;and;second
And finally with a little more complex input:
SELECT CustomAggregates.Words(LOWER(a.Value)) AS Result
FROM ( SELECT 'This is the first sentence. And the second: Is this' AS Value
FROM DUAL UNION ALL
SELECT '"quote" from somewhere; And the second sentence again!' AS Value
FROM DUAL UNION ALL
SELECT 'Cursing #!#%# not allowed :)' AS Value
FROM DUAL) a;
Now the result is:
RESULT
--------------------------------------------------------------------------------
this;is;the;first;sentence;and;second;quote;from;somewhere;again;cursing;not;allowed
Final Words
That’s it for now and I hope you found something new. I’d be grateful if you would have the extra time for comments and votes. Thank you.
History
- 19th March, 2011: Initial post