This is an interesting problem that has come up a few times in discussions (and I think it has been mentioned on the SQL forums as well). When using LISTAGG on very large data sets you can sometimes create a list that is too long and consequently get an:
ORA-01489: result of string concatenation is too long
error. Wouldn’t it be great if there was a simple yet elegant way to resolve this issue?
Actually there is and we can use a few of the most recent analytical SQL functions. If you are using Database 12c you can make use of the MATCH_RECOGNIZE function to effectively create chunks of strings that do not exceed the VARCHAR2 limit.
For example, let’s assume that we have the following statement (to keep things relatively simple let’s use the EMP table in the schema SCOTT)
SELECT
deptno,
LISTAGG(ename, ';') WITHIN GROUP (ORDER BY empno) AS namelist
FROM scott.emp
GROUP BY deptno;
generates a nicely concatenated list of employee names within each department.
Now let’s assume that the above statement does not run and that we have a limit of 15 characters that can be returned by each row in our LISTAGG function.
We can use the Database 12c SQL pattern matching function, MATCH_RECOGNIZE, to return a list of values that does not exceed 15 characters. First step is to wrap the processing in a view so that we can then get data from this view to feed our LISTAGG function. Here is the view that contains the MATCH_RECOGNIZE clause:
CREATE OR REPLACE VIEW emp_mr AS
SELECT * FROM scott.emp MATCH_RECOGNIZE(
PARTITION BY deptno ORDER BY empno
MEASURES
match_number() AS mno,
classifier() as pattern_vrb
ALL ROWS PER MATCH
AFTER MATCH SKIP PAST LAST ROW
PATTERN (S B+)
DEFINE B AS LENGTHB(S.ename) + SUM(LENGTHB(CONCAT(b.ename, ';'))) + LENGTHB(';') < = 15
);
Let's view the data returned by MATCH_RECOGNIZE and using the analytic function version of SUM() to calculate a running total to show the overall length of the concatenated string of names within each department:
SELECT
deptno,
empno,
ename,
mno,
pattern_vrb,
sum(LENGTH(ename)) OVER (PARTITION BY deptno, mno ORDER BY empno) AS str_length
FROM emp_mr;
You might well ask: why don’t we put the LISTAGG function inside the measure clause? At the moment it is not possible to include analytical functions such as LISTAGG in the measure clause. Therefore, we have put the LISTAGG function in a separate SQL statement:
SELECT
deptno,
LISTAGG(ename, ';') WITHIN GROUP (ORDER BY empno) AS namelist
FROM emp_mr
GROUP BY deptno, mno;
The above code now returns groups of strings for each department where the total length of each group is less than 15 characters:
SELECT
deptno,
LISTAGG(ename, ';') WITHIN GROUP (ORDER BY empno) AS namelist,
length(LISTAGG(ename, ';') WITHIN GROUP (ORDER BY empno)) AS how_long
FROM emp_mr
GROUP BY deptno, mno;
You can change the cut-off point for the string truncation process by changing the value shown in bold on the last line of the code...in this version I have changed the truncation point to 25 characters:
CREATE OR REPLACE VIEW emp_mr AS
SELECT * FROM scott.emp MATCH_RECOGNIZE(
PARTITION BY deptno ORDER BY empno
MEASURES
match_number() AS mno,
classifier() as pattern_vrb
ALL ROWS PER MATCH
AFTER MATCH SKIP PAST LAST ROW
PATTERN (S B*)
DEFINE B AS LENGTHB(S.ename) + SUM(LENGTHB(CONCAT(b.ename, ';'))) + LENGTHB(';') < = 25
);
Let's view the data returned by MATCH_RECOGNIZE:
SELECT
deptno,
empno,
ename,
mno,
pattern_vrb,
sum(LENGTH(ename)) OVER (PARTITION BY deptno, mno ORDER BY empno) AS str_length
FROM emp_mr;
The above code now returns groups of strings for each department where the total length of each group is less than 25 characters:
SELECT
deptno,
LISTAGG(ename, ';') WITHIN GROUP (ORDER BY empno) AS namelist,
LENGTH(LISTAGG(ename, ';') WITHIN GROUP (ORDER BY empno)) AS how_long
FROM emp_mr
GROUP BY deptno, mno;