Using UTL_XML.PARSEQUERY for SQL Dependency Analysis

Last week I had a talk at Oracle’s OpenWorld 2011 titled Modern PL/SQL Code Checking and Dependency Analysis.

The problem I described in chapter 4 was to find all view columns using the column UNIT_COST of the table COSTS in the SH schema. Other usages of this column (e.g. in where or order by clauses) have to be ignored. To solve this problem within the Oracle Database Server 11.2 a parser is necessary (at least I’m not aware of another solution). Even a DBA_DEPENDENCY_COLUMNS view as described in Rob van Wijk’s post is not enough to solve this problem.

However, in this particular case no custom or 3rd party parser is necessary. Oracle provides a procedure named PARSEQUERY in the PL/SQL package UTL_XML which is in fact well suited to solve this problem as I will show later. First, I’d like explain which columns should be found by a dependency analysis procedure based on some sample views.

Oracle’s sales history demo schema SH provides view named PROFITS, which is defined as follows:

The columns using COSTS.UNIT_COSTS are highlighted.

The following view uses the column TOTAL_COSTS in GROSS_MARGIN (line 14) and GROSS_MARGIN_PERCENT (lines 14 and 15). The usage is not evident at the first glance since it is based on the column GROSS_MARGIN (line 4) of the named query GM and the column COST (line 8) in GM’s subquery. This kind of dependencies need to be identified.

The next view does not present the data of COSTS.UNIT_COST as a column, even if the view depends on the table COSTS

The last view uses COSTS.UNIT_COST but not as part of a column expression and therefore has not to be reported. The usage in the order by clause is considered save.

So, the following result of the dependency analysis is expected:

SCHEMAVIEWCOLUMN
SHPROFITSUNIT_COST
SHPROFITSTOTAL_COST
SHGROSS_MARGINSGROSS_MARGIN
SHGROSS_MARGINSGROSS_MARGIN_PERCENT

Exactly this result is created by the following query

Now I just list all the code snippets I’ve written to create this result. Please note, that this is considered just a proof-of-concept code to show how UTL_XML.PARSEQUERY could be used for SQL dependency analysis in conjunction with Oracle dictionary views. This means that this is not a complete implementation. For example wild cards (*) is not handled which may lead to missing dependencies. Additionally table/view sources are not checked which may lead to false positives (in case a column is used in multiple view/table sources). – Please feel free to complete the code. However, an update is highly appreciated ;-)

Below you find the XML parser output of the query defined in the view GROSS_MARGINS. The model becomes quite clear, even if I could not find a schema description.

Please note that UTL_XML.PARSEQUERY is suited for extended query dependency analysis only. DML may be parsed, but the resulting model is incomplete with 11.2.0.2 (e.g. clauses missing in the select statement are not included in the model, like the SET clause in an update statement). If you need to analyze PL/SQL beyond PL/Scope you still may need a 3rd party parser.

7 Comments

  1. blank Nayan says:

    Hi Phillip, thanks for the explanation its been really helplful….

    Is there a way i could a SQL analysis on 2 different queries, I had this post on the oracle forum as well , please see this link : link

    It would be a great help if you could guide me on how i should proceed about the same.

    1)What are the initial development steps i need to do?
    2)What would be the modules i would have to plan out so as to accomplish the same

    It would be great, if you could help out………..

    Thanks!!

    • blank Philipp says:

      Hi Nayan,

      Yes, you may analyze or compare two SQL statements. The statements and their results. Based on your comparison requirements this can become a quite complex and time consuming task. Honestly I do not understand your requirements good enough to recommend how to proceed exactly to solve your problem. But I recommend to define some simple test cases with expected results (SQL Statement 1, SQL statement 2, exact comparison result) to allow others to provide an appropriate solution approach.

  2. blank Branko Radovanovic says:

    Brilliant stuff.

    One way of solving the wildcard problem might be running the view sources through DBMS_UTILITY.EXPAND_SQL_TEXT() first – this will expand the wildcard columns and also recursively expand the underlying views. (The latter might not be desirable, though.)

    EXPAND_SQL_TEXT() is a 12c feature, but is available on 11g too as DBMS_SQL2.EXPAND_SQL_TEXT() – undocumented, yet seems to work just fine.

    Unfortunately, what coldep_pkg.get_dep() does is impact, while I discovered this blog post looking for a solution for lineage, which is the opposite problem: given the SQL query, find all table columns it depends on. Here, I’d definitely use EXPAND_SQL_TEXT() as the first step, and this post provides the techniques to do the rest – still not easy, but frankly until today I thought it was virtually impossible.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.