How Complicated is Delimited String Parsing?
The more commonly known function: LISTAGG converts a single column output of many rows into a string with a custom delimiter. There originally was no real alternative to do the reverse except through a procedural path in our code:
for i in 1 .. l_count
loop
select regexp_substr(string_to_parse,'[^,]+',1,i)
into l_value from dual;
dbms_output.put_line(l_value);
end loop; ...
[Sourced from an example on: Oradev.com]
This example loops through the input string "string_to_parse" and breaks the string into pieces based on a self-declared delimiter. A fancier approach is to use the CONNECT BY syntax (a hierarchical SQL syntax concept) and make an artificial cartesian product to force an iteration over multiple delimited elements:
[Ask Tom has an example] of this:
select trim( substr (txt, instr (txt, ',', 1, level ) + 1,
instr (txt, ',', 1, level+1) - instr (txt, ',', 1, level) -1 ) )
from dual
connect by level <= length(txt) - length(replace(txt,',','')) -1)
this one is clever and elegant but it takes a while to interpret. The use of "connect by...level" is not readily obvious and looks like something is missing from the clause when I look at it.
Hidden Constraints: The Trouble with Misapplied PL/SQL Syntax
The summary of the comma_to_table function is that it was created with a specific purpose in mind and does not have a fully generic application for all cases. With a single command, this function allows coders to convert a comma-separated series of string values such as:
'1,2,3,four,fifteen,orange,shack'
to a row-wise output:
1
2
3
four
fifteen
orange
shack
Ask-Tom also warns that the original intent of this function derives from the expression: NAME TOKENIZE. [1] This function was made to assist with internal database ETL processes where the task was to parse valid object names (such as tables) from comma delimited string inputs. This limits the types of values that can be contained within the string:
- Each delimited element must be no longer than about 30 characters.
- The string values must represent valid database object names.
- Objects are defined as:
"[schema].[object_name].[procedure|function]@[database link]" - Certain reserved words that do not make valid object names will cause an error when applied.
Source Citations:
- DBMS_UTILITY.comma_to_table Limitations from "Ask Tom" at: https://asktom.oracle.com/pls/asktom ...
No comments:
Post a Comment