Last week I had a talk at Oracle’s OpenWorld 2011 titled Modern PL/SQL Code Checking and Dependency Analysis.
The problem I described in chapter 4 was to find all view columns using the column UNIT_COST of the table COSTS in the SH schema. Other usages of this column (e.g. in where or order by clauses) have to be ignored. To solve this problem within the Oracle Database Server 11.2 a parser is necessary (at least I’m not aware of another solution). Even a DBA_DEPENDENCY_COLUMNS view as described in Rob van Wijk’s post is not enough to solve this problem.
However, in this particular case no custom or 3rd party parser is necessary. Oracle provides a procedure named PARSEQUERY in the PL/SQL package UTL_XML which is in fact well suited to solve this problem as I will show later. First, I’d like explain which columns should be found by a dependency analysis procedure based on some sample views.
Oracle’s sales history demo schema SH provides view named PROFITS, which is defined as follows:
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
CREATE OR REPLACE VIEW PROFITS AS
SELECT s.channel_id,
s.cust_id,
s.prod_id,
s.promo_id,
s.time_id,
c.unit_cost,
c.unit_price,
s.amount_sold,
s.quantity_sold,
c.unit_cost * s.quantity_sold TOTAL_COST
FROM costs c, sales s
WHERE c.prod_id = s.prod_id
AND c.time_id = s.time_id
AND c.channel_id = s.channel_id
AND c.promo_id = s.promo_id; |
The columns using COSTS.UNIT_COSTS are highlighted.
The following view uses the column TOTAL_COSTS in GROSS_MARGIN (line 14) and GROSS_MARGIN_PERCENT (lines 14 and 15). The usage is not evident at the first glance since it is based on the column GROSS_MARGIN (line 4) of the named query GM and the column COST (line 8) in GM’s subquery. This kind of dependencies need to be identified.
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
CREATE OR REPLACE VIEW GROSS_MARGINS AS
WITH
gm AS (
SELECT time_id, revenue, revenue - cost AS gross_margin
FROM (
SELECT time_id,
unit_price * quantity_sold AS revenue,
total_cost AS cost
FROM profits
)
)
SELECT t.fiscal_year,
SUM(revenue) AS revenue,
SUM(gross_margin) AS gross_margin,
round(100 * SUM(gross_margin) / SUM(revenue), 2)
AS gross_margin_percent
FROM gm
INNER JOIN times t ON t.time_id = gm.time_id
GROUP BY t.fiscal_year
ORDER BY t.fiscal_year; |
The next view does not present the data of COSTS.UNIT_COST as a column, even if the view depends on the table COSTS
|
1 2 3 |
CREATE OR REPLACE VIEW REVENUES AS
SELECT fiscal_year, revenue
FROM gross_margins; |
The last view uses COSTS.UNIT_COST but not as part of a column expression and therefore has not to be reported. The usage in the order by clause is considered save.
|
1 2 3 4 5 6 7 8 9 10 |
CREATE OR REPLACE VIEW SALES_ORDERED_BY_GM AS
SELECT channel_id,
cust_id,
prod_id,
promo_id,
time_id,
amount_sold,
quantity_sold
FROM profits
ORDER BY (unit_price - unit_cost) DESC; |
So, the following result of the dependency analysis is expected:
| Schema | View | Column |
|---|---|---|
| SH | PROFITS | UNIT_COST |
| SH | PROFITS | TOTAL_COST |
| SH | GROSS_MARGINS | GROSS_MARGIN |
| SH | GROSS_MARGINS | GROSS_MARGIN_PERCENT |
Exactly this result is created by the following query
|
1 2 |
SELECT *
FROM TABLE(coldep_pkg.get_dep('sh', 'costs', 'unit_cost')); |
Now I just list all the code snippets I’ve written to create this result. Please note, that this is considered just a proof-of-concept code to show how UTL_XML.PARSEQUERY could be used for SQL dependency analysis in conjunction with Oracle dictionary views. This means that this is not a complete implementation. For example wild cards (*) is not handled which may lead to missing dependencies. Additionally table/view sources are not checked which may lead to false positives (in case a column is used in multiple view/table sources). – Please feel free to complete the code. However, an update is highly appreciated ;-)
|
1 |
GRANT EXECUTE ON SYS.UTL_XML TO SH; |
|
1 2 3 4 5 6 7 |
CREATE OR REPLACE TYPE "SH"."COLDEP_TYP" AS
OBJECT (schema_name VARCHAR2(30),
view_name varchar2(30),
column_name VARCHAR2(30))
/
CREATE OR REPLACE TYPE "SH"."COLDEP_L" IS TABLE OF coldep_typ
/ |
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 |
CREATE OR REPLACE PACKAGE "SH"."COLDEP_PKG" IS
FUNCTION parse_query(p_query IN VARCHAR2) RETURN xmltype;
FUNCTION get_dep(p_schema_name IN VARCHAR2,
p_object_name IN VARCHAR2,
p_column_name IN VARCHAR2) RETURN coldep_l
PIPELINED;
FUNCTION process_view(p_schema_name IN VARCHAR2,
p_view_name IN VARCHAR2,
p_column_name IN VARCHAR2,
p_query IN CLOB) RETURN coldep_l;
END coldep_pkg;
/
CREATE OR REPLACE PACKAGE BODY "SH"."COLDEP_PKG" IS
FUNCTION parse_query(p_query IN VARCHAR2) RETURN xmltype IS
v_clob CLOB;
v_xml xmltype;
BEGIN
dbms_lob.createtemporary(v_clob, TRUE);
-- parse query and get XML as CLOB
sys.utl_xml.parsequery(USER, p_query, v_clob);
-- create XMLTYPE from CLOB
v_xml := xmltype.createxml(v_clob);
dbms_lob.freetemporary(v_clob);
RETURN v_xml;
END parse_query;
FUNCTION get_dep(p_schema_name IN VARCHAR2,
p_object_name IN VARCHAR2,
p_column_name IN VARCHAR2) RETURN coldep_l
PIPELINED IS
BEGIN
-- query dictionary dependencies
FOR v_dep IN (SELECT d.owner AS schema_name,
d.name AS view_name,
v.text AS query_text
FROM all_dependencies d
INNER JOIN all_views v
ON v.owner = d.owner
AND v.view_name = d.name
WHERE d.referenced_owner = upper(p_schema_name)
AND d.referenced_name = upper(p_object_name)
AND d.type = 'VIEW')
LOOP
-- process every fetched view
FOR v_views IN (
SELECT VALUE(pv) coldep
FROM TABLE(process_view(v_dep.schema_name,
v_dep.view_name,
p_column_name,
v_dep.query_text)) pv)
LOOP
-- return column usages in v_dep.view_name
PIPE ROW(v_views.coldep);
-- get column usages of views using v_dep.view_name (recursive calls)
FOR v_recursive IN (
SELECT VALUE(dep) coldep
FROM TABLE(get_dep(v_views.coldep.schema_name,
v_views.coldep.view_name,
v_views.coldep.column_name)) dep)
LOOP
-- return column usages of recursive call
PIPE ROW(v_recursive.coldep);
END LOOP;
END LOOP;
END LOOP;
END get_dep;
FUNCTION process_view(p_schema_name IN VARCHAR2,
p_view_name IN VARCHAR2,
p_column_name IN VARCHAR2,
p_query IN CLOB) RETURN coldep_l IS
v_search_l coldep_l := coldep_l(coldep_typ(NULL,
NULL,
p_column_name));
v_xml xmltype;
v_previous_count INTEGER := 0;
v_coldep_l coldep_l := coldep_l();
BEGIN
-- parse view query
v_xml := parse_query(p_query);
-- get inline dependencies from secondary select lists
-- TODO: handle table/view source and wildcard properly
WHILE v_previous_count < v_search_l.count
LOOP
v_previous_count := v_search_l.count;
FOR v_secondary IN (
SELECT nvl(x.alias_name, x.column_reference) AS alias_name
FROM (SELECT t.select_list_item,
t.alias_name,
extractvalue(VALUE(c), 'COLUMN') AS column_reference
FROM xmltable('//SELECT_LIST_ITEM[ancestor::FROM or ancestor::WITH]'
passing v_xml
columns select_list_item xmltype path '//SELECT_LIST_ITEM',
alias_name VARCHAR2(30) path '//COLUMN_ALIAS') t,
TABLE(xmlsequence(extract(select_list_item, '//COLUMN'))) c) x
WHERE upper(x.column_reference) IN (SELECT upper(column_name)
FROM TABLE(v_search_l))
AND upper(alias_name) NOT IN (SELECT upper(column_name)
FROM TABLE(v_search_l)))
LOOP
-- add internal column usage
v_search_l.extend;
v_search_l(v_search_l.count) := coldep_typ(NULL,
NULL,
v_secondary.alias_name);
END LOOP;
END LOOP;
-- analyze primary select list
-- TODO: handle table/view source and wildcard properly
FOR v_primary IN (
SELECT x.column_id, atc.column_name
FROM (SELECT t.select_list_item,
t.column_id,
extractvalue(VALUE(c), 'COLUMN') AS column_reference
FROM xmltable('//SELECT_LIST_ITEM[not (ancestor::FROM) and not (ancestor::WITH)]'
passing v_xml
columns column_id FOR ordinality,
select_list_item xmltype path '//SELECT_LIST_ITEM') t,
TABLE(xmlsequence(extract(select_list_item, '//COLUMN'))) c) x
INNER JOIN all_tab_columns atc
ON atc.owner = p_schema_name
AND atc.table_name = p_view_name
AND atc.column_id = x.column_id
WHERE upper(x.column_reference) IN (SELECT upper(column_name)
FROM TABLE(v_search_l))
ORDER BY x.column_id)
LOOP
-- add external column usage
v_coldep_l.extend;
v_coldep_l(v_coldep_l.count) := coldep_typ(p_schema_name,
p_view_name,
v_primary.column_name);
END LOOP;
-- return column dependencies
RETURN v_coldep_l;
END process_view;
END coldep_pkg;
/ |
Below you find the XML parser output of the query defined in the view GROSS_MARGINS. The model becomes quite clear, even if I could not find a schema description.
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 |
<QUERY>
<WITH>
<WITH_ITEM>
<QUERY_ALIAS>GM</QUERY_ALIAS>
<QUERY>
<SELECT>
<SELECT_LIST>
<SELECT_LIST_ITEM>
<COLUMN_REF>
<COLUMN>TIME_ID</COLUMN>
</COLUMN_REF>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<COLUMN_REF>
<COLUMN>REVENUE</COLUMN>
</COLUMN_REF>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<SUB>
<COLUMN_REF>
<COLUMN>REVENUE</COLUMN>
</COLUMN_REF>
<COLUMN_REF>
<COLUMN>COST</COLUMN>
</COLUMN_REF>
</SUB>
<COLUMN_ALIAS>GROSS_MARGIN</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
</SELECT_LIST>
</SELECT>
<FROM>
<FROM_ITEM>
<QUERY>
<SELECT>
<SELECT_LIST>
<SELECT_LIST_ITEM>
<COLUMN_REF>
<TABLE>PROFITS</TABLE>
<COLUMN>TIME_ID</COLUMN>
</COLUMN_REF>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<MUL>
<COLUMN_REF>
<TABLE>PROFITS</TABLE>
<COLUMN>UNIT_PRICE</COLUMN>
</COLUMN_REF>
<COLUMN_REF>
<TABLE>PROFITS</TABLE>
<COLUMN>QUANTITY_SOLD</COLUMN>
</COLUMN_REF>
</MUL>
<COLUMN_ALIAS>REVENUE</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<COLUMN_REF>
<TABLE>PROFITS</TABLE>
<COLUMN>TOTAL_COST</COLUMN>
</COLUMN_REF>
<COLUMN_ALIAS>COST</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
</SELECT_LIST>
</SELECT>
<FROM>
<FROM_ITEM>
<TABLE>PROFITS</TABLE>
</FROM_ITEM>
</FROM>
</QUERY>
</FROM_ITEM>
</FROM>
</QUERY>
</WITH_ITEM>
</WITH>
<SELECT>
<SELECT_LIST>
<SELECT_LIST_ITEM>
<COLUMN_REF>
<TABLE_ALIAS>T</TABLE_ALIAS>
<COLUMN>FISCAL_YEAR</COLUMN>
</COLUMN_REF>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<SUM>
<COLUMN_REF>
<COLUMN>REVENUE</COLUMN>
</COLUMN_REF>
</SUM>
<COLUMN_ALIAS>REVENUE</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<SUM>
<COLUMN_REF>
<COLUMN>GROSS_MARGIN</COLUMN>
</COLUMN_REF>
</SUM>
<COLUMN_ALIAS>GROSS_MARGIN</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
<SELECT_LIST_ITEM>
<ROUND>
<DIV>
<MUL>
<LITERAL>100</LITERAL>
<SUM>
<COLUMN_REF>
<COLUMN>GROSS_MARGIN</COLUMN>
</COLUMN_REF>
</SUM>
</MUL>
<SUM>
<COLUMN_REF>
<COLUMN>REVENUE</COLUMN>
</COLUMN_REF>
</SUM>
</DIV>
<LITERAL>2</LITERAL>
</ROUND>
<COLUMN_ALIAS>GROSS_MARGIN_PERCENT</COLUMN_ALIAS>
</SELECT_LIST_ITEM>
</SELECT_LIST>
</SELECT>
<FROM>
<FROM_ITEM>
<JOIN>
<INNER/>
<JOIN_TABLE_1>
<QUERY_ALIAS>GM</QUERY_ALIAS>
</JOIN_TABLE_1>
<JOIN_TABLE_2>
<TABLE>TIMES</TABLE>
<TABLE_ALIAS>T</TABLE_ALIAS>
</JOIN_TABLE_2>
<ON>
<EQ>
<COLUMN_REF>
<TABLE>TIMES</TABLE>
<TABLE_ALIAS>T</TABLE_ALIAS>
<COLUMN>TIME_ID</COLUMN>
</COLUMN_REF>
<COLUMN_REF>
<TABLE_ALIAS>GM</TABLE_ALIAS>
<COLUMN>TIME_ID</COLUMN>
</COLUMN_REF>
</EQ>
</ON>
</JOIN>
</FROM_ITEM>
</FROM>
<GROUP_BY>
<EXPRESSION_LIST>
<EXPRESSION_LIST_ITEM>
<COLUMN_REF>
<TABLE_ALIAS>T</TABLE_ALIAS>
<COLUMN>FISCAL_YEAR</COLUMN>
</COLUMN_REF>
</EXPRESSION_LIST_ITEM>
</EXPRESSION_LIST>
</GROUP_BY>
<ORDER_BY>
<ORDER_BY_LIST>
<ORDER_BY_LIST_ITEM>
<COLUMN_REF>
<TABLE_ALIAS>T</TABLE_ALIAS>
<COLUMN>FISCAL_YEAR</COLUMN>
</COLUMN_REF>
</ORDER_BY_LIST_ITEM>
</ORDER_BY_LIST>
</ORDER_BY>
</QUERY> |
Please note that UTL_XML.PARSEQUERY is suited for extended query dependency analysis only. DML may be parsed, but the resulting model is incomplete with 11.2.0.2 (e.g. clauses missing in the select statement are not included in the model, like the SET clause in an update statement). If you need to analyze PL/SQL beyond PL/Scope you still may need a 3rd party parser.