The post IslandSQL Episode 6: DML Statements in Oracle Database 23c appeared first on Philipp Salvisberg's Blog.
The IslandSQL grammar now covers all DML statements. This means call, delete, explain plan, insert, lock table, merge, select and update.
In this episode, we will focus on new features in the Oracle Database 23c that can be used in insert, update, delete and merge statements. For the select statement see the last episode.
The new table value constructor allows you to create rows on the fly. This simplifies statements. Furthermore, it allows you to write a single statement instead of a series of statements, which makes the execution in scripts faster. It can be used in the select, insert and merge statements.
drop table if exists d;
create table d (deptno number(2,0), dname varchar2(14), loc varchar2(13));
insert into d (deptno, dname, loc)
values (10, 'ACCOUNTING', 'NEW YORK'),
(20, 'RESEARCH', 'DALLAS'),
(30, 'SALES', 'CHICAGO'),
(40, 'OPERATIONS', 'BOSTON');
Table D dropped.
Table D created.
4 rows inserted.
merge into d t
using (values
(10, 'ACCOUNTING', 'NEW YORK'),
(20, 'RESEARCH', 'DALLAS'),
(30, 'SALES', 'CHICAGO'),
(40, 'OPERATIONS', 'BOSTON')
) s (deptno, dname, loc)
on (t.deptno = s.deptno)
when matched then
update
set t.dname = s.dname,
t.loc = s.loc
when not matched then
insert (t.deptno, t.dname, t.loc)
values (s.deptno, s.dname, s.loc);
4 rows merged.
The new from_using_clause can be used in delete and update statements.
With this new clause, you can avoid a self-join and, as a result, the optimizer can produce a more efficient execution plan.
The next example is based on the HR schema. We delete all countries that are not used by any department. See line 3 for the from_using_clause. The join conditions and the filter criteria are part of the where_clause.
You cannot define the join condition for the table in the from_clause in the from_using_clause. This is a documented limitation. Furthermore, we cannot mix ANSI-92 join syntax with Oracle-style outer join syntax (see ORA-25156). As a result, we have to use the Oracle-style join syntax for all tables.
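To illustrate the last point, here is a hypothetical variant (not used in the examples of this post) that mixes both syntaxes and should therefore be rejected by the database:

```sql
-- Hypothetical statement mixing ANSI-92 and Oracle-style join syntax.
-- Expected to fail with ORA-25156:
-- old style outer join (+) cannot be used with ANSI joins.
delete countries c
using locations l
      left join departments d
        on d.location_id = l.location_id
where l.country_id (+) = c.country_id
  and l.location_id is null
  and d.department_id is null;
```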
delete
from countries c
from locations l, departments d
where l.country_id (+) = c.country_id
and d.location_id (+) = l.location_id
and l.location_id is null
and d.department_id is null;
11 rows deleted.
--------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------
| 0 | DELETE STATEMENT | |
| 1 | DELETE | COUNTRIES |
| 2 | FILTER | |
| 3 | HASH JOIN OUTER | |
| 4 | FILTER | |
| 5 | HASH JOIN OUTER | |
| 6 | INDEX FULL SCAN | COUNTRY_C_ID_PK |
| 7 | TABLE ACCESS FULL| LOCATIONS |
| 8 | TABLE ACCESS FULL | DEPARTMENTS |
--------------------------------------------------
Having two from keywords in the delete statement is funny, but it does not make the statement easier to read. I therefore recommend rewriting the statement like this:
delete countries c
using locations l, departments d
where l.country_id (+) = c.country_id
and d.location_id (+) = l.location_id
and l.location_id is null
and d.department_id is null;
11 rows deleted.
--------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------
| 0 | DELETE STATEMENT | |
| 1 | DELETE | COUNTRIES |
| 2 | FILTER | |
| 3 | HASH JOIN OUTER | |
| 4 | FILTER | |
| 5 | HASH JOIN OUTER | |
| 6 | INDEX FULL SCAN | COUNTRY_C_ID_PK |
| 7 | TABLE ACCESS FULL| LOCATIONS |
| 8 | TABLE ACCESS FULL | DEPARTMENTS |
--------------------------------------------------
Here’s an alternative, pre-23c-style delete statement without the from_using_clause. It accesses the countries table twice, which might lead to a less efficient execution plan.
delete
from countries c1
where c1.country_id in (
select c2.country_id
from countries c2
left join locations l
on l.country_id = c2.country_id
left join departments d
on d.location_id = l.location_id
where l.location_id is null
and d.department_id is null
);
11 rows deleted.
----------------------------------------------------------------------
| Id | Operation | Name |
----------------------------------------------------------------------
| 0 | DELETE STATEMENT | |
| 1 | DELETE | COUNTRIES |
| 2 | INDEX FULL SCAN | COUNTRY_C_ID_PK |
| 3 | FILTER | |
| 4 | NESTED LOOPS OUTER | |
| 5 | FILTER | |
| 6 | NESTED LOOPS OUTER | |
| 7 | INDEX UNIQUE SCAN | COUNTRY_C_ID_PK |
| 8 | TABLE ACCESS BY INDEX ROWID BATCHED| LOCATIONS |
| 9 | INDEX RANGE SCAN | LOC_COUNTRY_IX |
| 10 | TABLE ACCESS BY INDEX ROWID BATCHED | DEPARTMENTS |
| 11 | INDEX RANGE SCAN | DEPT_LOCATION_IX |
----------------------------------------------------------------------
In this example, we increase the salaries of all employees in Germany and Canada by 20%. See lines 3 to 7 for the from_using_clause, where we use ANSI-92 join syntax.
update employees e
set e.salary = e.salary * 1.2
using departments d
join locations l
on l.location_id = d.location_id
join countries c
on c.country_id = l.country_id
where d.department_id = e.department_id
and c.country_name in ('Germany', 'Canada');
3 rows updated.
----------------------------------------------------------------------
| Id | Operation | Name |
----------------------------------------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | UPDATE | EMPLOYEES |
| 2 | NESTED LOOPS | |
| 3 | NESTED LOOPS | |
| 4 | NESTED LOOPS | |
| 5 | NESTED LOOPS | |
| 6 | INDEX FULL SCAN | COUNTRY_C_ID_PK |
| 7 | TABLE ACCESS BY INDEX ROWID BATCHED| LOCATIONS |
| 8 | INDEX RANGE SCAN | LOC_COUNTRY_IX |
| 9 | TABLE ACCESS BY INDEX ROWID BATCHED | DEPARTMENTS |
| 10 | INDEX RANGE SCAN | DEPT_LOCATION_IX |
| 11 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX |
| 12 | TABLE ACCESS BY INDEX ROWID | EMPLOYEES |
----------------------------------------------------------------------
And here’s an alternative, pre-23c-style update statement without the from_using_clause. It accesses the employees table twice, which might lead to a less efficient execution plan.
update employees e1
set e1.salary = e1.salary * 1.2
where e1.employee_id in (
select e2.employee_id
from employees e2
join departments d
on d.department_id = e2.department_id
join locations l
on l.location_id = d.location_id
join countries c
on c.country_id = l.country_id
where c.country_name in ('Germany', 'Canada')
);
3 rows updated.
-----------------------------------------------------------------------
| Id | Operation | Name |
-----------------------------------------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | UPDATE | EMPLOYEES |
| 2 | HASH JOIN SEMI | |
| 3 | TABLE ACCESS FULL | EMPLOYEES |
| 4 | VIEW | VW_NSO_1 |
| 5 | NESTED LOOPS | |
| 6 | NESTED LOOPS | |
| 7 | NESTED LOOPS | |
| 8 | NESTED LOOPS SEMI | |
| 9 | VIEW | index$_join$_005 |
| 10 | HASH JOIN | |
| 11 | INDEX FAST FULL SCAN | LOC_COUNTRY_IX |
| 12 | INDEX FAST FULL SCAN | LOC_ID_PK |
| 13 | INDEX UNIQUE SCAN | COUNTRY_C_ID_PK |
| 14 | TABLE ACCESS BY INDEX ROWID BATCHED| DEPARTMENTS |
| 15 | INDEX RANGE SCAN | DEPT_LOCATION_IX |
| 16 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX |
| 17 | TABLE ACCESS BY INDEX ROWID | EMPLOYEES |
-----------------------------------------------------------------------
However, we can update an inline view. The Oracle Database has supported this for a very long time (without a BYPASS_UJVC hint). There are some limitations, but otherwise, it works quite well. Here’s an example:
update (
select e.*
from employees e
join departments d
on d.department_id = e.department_id
join locations l
on l.location_id = d.location_id
join countries c
on c.country_id = l.country_id
where c.country_name in ('Germany', 'Canada')
)
set salary = salary * 1.2;
3 rows updated.
---------------------------------------------------------------------
| Id | Operation | Name |
---------------------------------------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | UPDATE | EMPLOYEES |
| 2 | NESTED LOOPS | |
| 3 | NESTED LOOPS | |
| 4 | NESTED LOOPS | |
| 5 | INDEX FULL SCAN | COUNTRY_C_ID_PK |
| 6 | TABLE ACCESS BY INDEX ROWID BATCHED| LOCATIONS |
| 7 | INDEX RANGE SCAN | LOC_COUNTRY_IX |
| 8 | TABLE ACCESS BY INDEX ROWID BATCHED | DEPARTMENTS |
| 9 | INDEX RANGE SCAN | DEPT_LOCATION_IX |
| 10 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX |
---------------------------------------------------------------------
The execution plan is similar to the variant with the from_using_clause. So, from a performance point of view, this is a good option. However, I like the from_using_clause variant better because it’s clearer which table is updated and which tables are just used for query purposes.
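One of the documented limitations of updatable inline views is worth a short sketch: the column you modify must map to a key-preserved table. The following hypothetical statement tries to update the non-key-preserved departments side of the join and is therefore expected to fail:

```sql
-- Hypothetical statement, expected to fail with ORA-01779:
-- cannot modify a column which maps to a non key-preserved table.
update (
          select d.department_name
            from employees e
            join departments d
              on d.department_id = e.department_id
       )
   set department_name = 'Consulting';
```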
The returning_clause has been extended. It’s now possible to explicitly return old and new values. The default depends on the operation: new in insert/update and old in delete statements. I do not see a lot of value for delete and insert statements besides maybe making the statements more explicit and therefore easier to read. However, for the update statement, this new feature can be useful.
Here’s a small SQL script showing the new returning clause in action for insert, update and delete.
set serveroutput on
drop table if exists t;
create table t (id integer, value integer);
declare
l_old_value t.value%type;
l_new_value t.value%type;
begin
dbms_random.seed(16);
insert into t (id, value)
values (1, dbms_random.value(low => 1, high => 100))
return new value into l_new_value;
dbms_output.put_line('Insert: new value ' || l_new_value);
update t
set value = value * 2
where id = 1
return old value, new value into l_old_value, l_new_value;
dbms_output.put_line('Update: old value ' || l_old_value || ', new value ' || l_new_value);
delete t
where id = 1
return old value into l_old_value;
dbms_output.put_line('Delete: old value ' || l_old_value);
end;
/
Table T dropped.
Table T created.
Insert: new value 21
Update: old value 21, new value 42
Delete: old value 42
PL/SQL procedure successfully completed.
The column_definition clause in the create table statement has been extended. Finally, it’s possible to enforce the default on null expression also for update and merge statements.
The next SQL script demonstrates this.
drop table if exists t;
create table t (
id integer not null primary key,
value varchar2(10 char) default on null for insert and update 'my default'
);
insert into t(id, value)
values (1, 'value1'),
(2, null);
select * from t order by id;
update t set value = case id
when 1 then
null
when 2 then
'value2'
end;
select * from t order by id;
merge into t
using (values
(1, 'value3'),
(2, null),
(3, null)
) s (id, value)
on (t.id = s.id)
when matched then
update
set t.value = s.value
when not matched then
insert (t.id, t.value)
values (s.id, s.value);
select * from t order by id;
Table T dropped.
Table T created.
2 rows inserted.
ID VALUE
---------- ----------
1 value1
2 my default
2 rows updated.
ID VALUE
---------- ----------
1 my default
2 value2
3 rows merged.
ID VALUE
---------- ----------
1 value3
2 my default
3 my default
The new datatype_domain clause comes with a reservable keyword. You can update reservable columns without locking a row. As a result, updating such a column is possible from multiple sessions in a transactional way. However, only numeric columns can be declared as reservable.
Let’s make an example.
drop table if exists e;
create table e (
empno number(4,0) not null primary key,
ename varchar2(10) not null,
sal number(7,2) reservable not null
);
insert into e(empno, ename, sal)
values (7788, 'SCOTT', 3000),
(7739, 'KING', 5000);
commit;
Table E dropped.
Table E created.
2 rows inserted.
Commit complete.
After the setup, we run two database sessions in parallel.
update e
set sal = sal + 100
where empno = 7788;
select * from e;
1 row updated.
EMPNO ENAME SAL
---------- ---------- ----------
7788 SCOTT 3000
7739 KING 5000
update e
set sal = sal + 500
where empno = 7788;
select * from e;
1 row updated.
EMPNO ENAME SAL
---------- ---------- ----------
7788 SCOTT 3000
7739 KING 5000
We’ve updated the same record in two sessions. The transactions are pending and the changes are not yet visible in the target table. Let’s complete the pending transactions.
commit;
select * from e;
Commit complete.
EMPNO ENAME SAL
---------- ---------- ----------
7788 SCOTT 3100
7739 KING 5000
commit;
select * from e;
Commit complete.
EMPNO ENAME SAL
---------- ---------- ----------
7788 SCOTT 3600
7739 KING 5000
After committing, the changes are visible in the target table. The changes from both sessions have been applied. Concurrent updates of the same row without locking. Pure magic.
How is that possible? Quite simple. Behind the scenes, the Oracle Database creates a reservation journal table named SYS_RESERVJRNL_<object_id_of_table> for every table with a reservable column. This table stores the pending changes per session and applies them on commit. You can query this table to better understand the process.
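Here is a sketch of how you could inspect the journal. The object ID, and therefore the journal table name, will differ in your environment, and the journal’s column layout is version-specific:

```sql
-- Determine the name of the reservation journal table for table E.
select 'SYS_RESERVJRNL_' || object_id as journal_table_name
  from user_objects
 where object_name = 'E'
   and object_type = 'TABLE';

-- Query the journal (replace 75000 with the object_id from above)
-- in a session with a pending update on the reservable column.
select * from sys_reservjrnl_75000;
```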
See the Database Development Guide for more information about lock-free reservations.
More features are applicable in DML statements. For example, using sys_row_etag for optimistic locking or when working with JSON-relational duality views. The JSON-Relational Duality Developer’s Guide explains this new feature in detail.
For a complete list see Oracle Database New Features.
For the next episode, the IslandSQL grammar will be extended to cover the PostgreSQL 16 grammar for the current statements in scope. This means all DML statements. I’m sure I will be able to show some interesting differences between the Oracle Database and PostgreSQL. Stay tuned.
The post IslandSQL Episode 5: Select in Oracle Database 23c appeared first on Philipp Salvisberg's Blog.
In the last episode, we extended the expressions in the IslandSQL grammar to complete the lock table statement. The grammar now fully covers expressions, conditions and the select statement. In this episode, we will focus on optimizer hints and new features in the Oracle Database 23c that can be used in the select statement.
The full source code is available on GitHub, the binaries are on Maven Central and this VS Code extension uses the IslandSQL library to find text in DML statements and report syntax errors.
ANTLR uses the concept of channels which is based on the idea of radio frequencies. The lexer is responsible for identifying tokens and putting them on the right channel.
For most lexers, these two channels are enough:
DEFAULT_CHANNEL – all tokens that are relevant to the parser
HIDDEN_CHANNEL – all other tokens
Here’s an example:
select█/*+ full(emp) */█*█from█emp█where█empno█=█7788█;
█/*+ full(emp) */█ █ █ █ █ █ █ █
select * from emp where empno = 7788 ;
The first line contains the complete statement where a space token is represented as █. The syntax highlighting helps to identify the 19 tokens. In the second line, you find all 10 hidden tokens – comments and whitespace. The noise, so to speak. And in the third line are the 9 visible tokens on the default channel.
This is similar to a noise-cancelling system. The parser only gets the tokens that are necessary to do its job.
In this blog post, I explained how you can distinguish hints from ordinary comments and highlight them in SQL Developer. Solving this problem was a bit more complicated because SQL Developer’s parse tree does not contain hints, since hints are just special comments.
However, in the IslandSQL grammar, we want to define hints as part of a query_block. In other words, we want to make them visible.
Identifying hints in the lexer and putting them on the DEFAULT_CHANNEL sounds like a good solution. However, we do not want to handle comment tokens that look like a hint in every position in the parser. This would be a nightmare. To avoid that we could add a semantic predicate to consider only hint-style comments following the select keyword. Of course, we need to ignore whitespace and ordinary comments. Furthermore, we have to ensure that the select keyword is the start of a query_block and not used in another context such as a grant statement.
At that point, it becomes obvious that the lexer would be doing the job of the parser.
We better use the lexer only to identify hint tokens and put them on the HIDDEN_CHANNEL:
ML_HINT: '/*+' .*? '*/' -> channel(HIDDEN);
ML_COMMENT: '/*' .*? '*/' -> channel(HIDDEN);
SL_HINT: '--+' ~[\r\n]* -> channel(HIDDEN);
SL_COMMENT: '--' ~[\r\n]* -> channel(HIDDEN);
And then we define a semantic predicate in the parser:
queryBlock:
{unhideFirstHint();} K_SELECT hint?
queryBlockSetOperator?
selectList
(intoClause | bulkCollectIntoClause)? // in PL/SQL only
fromClause? // starting with Oracle Database 23c the from clause is optional
whereClause?
hierarchicalQueryClause?
groupByClause?
modelClause?
windowClause?
;
That’s the call of the function unhideFirstHint() on line 161. At that point, the parser is at the position of the token K_SELECT. Here’s the implementation in the base class of the generated parser:
public void unhideFirstHint() {
CommonTokenStream input = ((CommonTokenStream) this.getTokenStream());
List<Token> tokens = input.getHiddenTokensToRight(input.index());
if (tokens != null) {
for (Token token : tokens) {
if (token.getType() == IslandSqlLexer.ML_HINT || token.getType() == IslandSqlLexer.SL_HINT) {
((CommonToken) token).setChannel(Token.DEFAULT_CHANNEL);
return; // stop after first hint style comment
}
}
}
}
We scan all hidden tokens to the right of the keyword select and set the first hint token to the DEFAULT_CHANNEL to make it visible to the parser.
Let’s visualise the parse tree of the following query:
select -- A query_block can have only one comment
/* containing hints, and that comment must
follow the SELECT keyword. */
/*+ full(emp) */
--+ index(emp)
ename, sal -- select_list
from emp -- from_clause
where empno = 7788; -- where_clause
We use ParseTreeUtil.dotParseTree to produce an output in DOT format and paste the result into the web UI of Edotor or any other Graphviz viewer to produce this result:
The leaf nodes are sand-coloured rectangles. They represent the visible lexer tokens, the ones on the DEFAULT_CHANNEL. All other nodes are sky blue and elliptical. They represent a rule in the parser grammar.
I have changed the colour of the hint node to red so that you can spot it more easily. You see that it contains the /*+ full(emp) */ hint-style comment. All other comments are not visible in the parse tree. That’s what we wanted.
Here’s an alternative textual representation of the parse tree using ParseTreeUtil.printParseTree. It is better suited to represent larger parse trees. Furthermore, it also contains the symbol names of lexer tokens, for example K_SELECT or ML_HINT, as you see in lines 7 and 9.
file
dmlStatement
selectStatement
select
subquery:subqueryQueryBlock
queryBlock
K_SELECT:select
hint
ML_HINT:/*+ full(emp) */
selectList
selectItem
expression:simpleExpressionName
sqlName
unquotedId
ID:ename
COMMA:,
selectItem
expression:simpleExpressionName
sqlName
unquotedId
ID:sal
fromClause
K_FROM:from
fromItem:tableReferenceFromItem
tableReference
queryTableExpression
sqlName
unquotedId
ID:emp
whereClause
K_WHERE:where
condition
expression:simpleComparisionCondition
expression:simpleExpressionName
sqlName
unquotedId
ID:empno
simpleComparisionOperator:eq
EQUALS:=
expression:simpleExpressionNumberLiteral
NUMBER:7788
sqlEnd
SEMI:;
<EOF>
The Oracle Database 23c comes with a lot of new features. See the new features guide for a complete list.
In the next chapters, we look at a few examples that are relevant when querying data. In other words, at some of the new features that are applicable in the select statement.
You can use the new graph_table operator to query property graphs in the Oracle Database. It’s a table function similar to xml_table or json_table. A powerful addition to the converged database.
The SQL Language Reference 23 provides some good examples including a setup script.
The setup script is provided here for convenience. It’s a 1:1 copy from the SQL Language Reference with some minor additions and modifications.
The most important change is that business keys are used in the insert statements to retrieve the associated surrogate keys. As a result, it’s easier to add test data.
-- drop existing property graph including data
drop property graph if exists students_graph;
drop table if exists friendships;
drop table if exists students;
drop table if exists persons;
drop table if exists university;
-- create tables, insert data and create property graph
create table university (
id number generated always as identity (start with 1 increment by 1) not null,
name varchar2(10) not null,
constraint u_pk primary key (id),
constraint u_uk unique (name)
);
insert into university (name) values ('ABC'), ('XYZ');
create table persons (
person_id number generated always as identity (start with 1 increment by 1) not null,
name varchar2(10) not null,
birthdate date not null,
height float not null,
person_data json not null,
constraint person_pk primary key (person_id),
constraint person_uk unique (name)
);
insert into persons (name, height, birthdate, person_data)
values ('John', 1.80, date '1963-06-13', '{"department":"IT","role":"Software Developer"}'),
('Mary', 1.65, date '1982-09-25', '{"department":"HR","role":"HR Manager"}'),
('Bob', 1.75, date '1966-03-11', '{"department":"IT","role":"Technical Consultant"}'),
('Alice', 1.70, date '1987-02-01', '{"department":"HR","role":"HR Assistant"}');
create table students (
s_id number generated always as identity (start with 1 increment by 1) not null,
s_univ_id number not null,
s_person_id number not null,
subject varchar2(10) not null,
constraint stud_pk primary key (s_id),
constraint stud_uk unique (s_univ_id, s_person_id),
constraint stud_fk_person foreign key (s_person_id) references persons(person_id),
constraint stud_fk_univ foreign key (s_univ_id) references university(id)
);
insert into students(s_univ_id, s_person_id, subject)
select u.id, p.person_id, d.subject
from (values
(1, 'ABC', 'John', 'Arts'),
(2, 'ABC', 'Bob', 'Music'),
(3, 'XYZ', 'Mary', 'Math'),
(4, 'XYZ', 'Alice', 'Science')
) as d (seq, uni_name, pers_name, subject)
join university u
on u.name = d.uni_name
join persons p
on p.name = d.pers_name
order by d.seq;
create table friendships (
friendship_id number generated always as identity (start with 1 increment by 1) not null,
person_a number not null,
person_b number not null,
meeting_date date not null,
constraint fk_person_a_id foreign key (person_a) references persons(person_id),
constraint fk_person_b_id foreign key (person_b) references persons(person_id),
constraint fs_pk primary key (friendship_id),
constraint fs_uk unique (person_a, person_b)
);
insert into friendships (person_a, person_b, meeting_date)
select a.person_id, b.person_id, d.meeting_date
from (values
(1, 'John', 'Bob', date '2000-09-01'),
(2, 'Mary', 'Alice', date '2000-09-19'),
(3, 'Mary', 'John', date '2000-09-19'),
(4, 'Bob', 'Mary', date '2001-07-10')
) as d (seq, name_a, name_b, meeting_date)
join persons a
on a.name = d.name_a
join persons b
on b.name = d.name_b
order by d.seq;
create property graph students_graph
vertex tables (
persons key (person_id)
label person
properties (person_id, name, birthdate as dob)
label person_ht
properties (height),
university key (id)
)
edge tables (
friendships as friends
key (friendship_id)
source key (person_a) references persons(person_id)
destination key (person_b) references persons(person_id)
properties (friendship_id, meeting_date),
students as student_of
source key (s_person_id) references persons(person_id)
destination key (s_univ_id) references university(id)
properties (subject)
);
The example property graph looks like this:
select a_name, b_name, c_name
from graph_table (
students_graph
match
(a is person)
-[is friends]-> -- a is friend of b
(b is person)
-[is friends]-> -- b is friend of c
(c is person)
-[is friends]-> -- c is friend of a (cyclic path)
(a)
where
a.name = 'Mary' -- start of cyclic path with 3 nodes
columns (
a.name as a_name,
b.name as b_name,
c.name as c_name
)
) g;
A_NAME B_NAME C_NAME
---------- ---------- ----------
Mary John Bob
An edge has a source and a destination vertex. According to the model, Mary is a friend of John and this means that John is also a friend of Mary. When we change the direction of the edges in the query from -[is friends]-> to <-[is friends]- the query result changes to:
A_NAME B_NAME C_NAME
---------- ---------- ----------
Mary Bob John
We’ve now got the clockwise result of the cyclic path starting with Mary (see the highlighted person vertices in the STUDENTS_GRAPH figure above).
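For completeness, here is the full query with the reversed edge direction. The only change compared to the original query is the arrow syntax:

```sql
select a_name, b_name, c_name
  from graph_table (
          students_graph
          match
             (a is person)
                <-[is friends]- -- b is friend of a
             (b is person)
                <-[is friends]- -- c is friend of b
             (c is person)
                <-[is friends]- -- a is friend of c (cyclic path)
             (a)
          where
             a.name = 'Mary' -- start of cyclic path with 3 nodes
          columns (
             a.name as a_name,
             b.name as b_name,
             c.name as c_name
          )
       ) g;
```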
Since there is only one type of edge between the vertices of the type persons, we get the same result by using just <-[]- or even <-.
To ignore the direction of a friendship we can use <-[is friends]-> or -[is friends]- or <-[]-> or -[]- or <-> or just - to produce this result:
A_NAME B_NAME C_NAME
---------- ---------- ----------
Mary Bob John
Mary John Bob
IMO this arrow-like syntax is intuitive and makes a graph_table query relatively easy to read and write.
file
dmlStatement
selectStatement
select
subquery:subqueryQueryBlock
queryBlock
K_SELECT:select
selectList
selectItem
expression:simpleExpressionName
sqlName
unquotedId
ID:a_name
COMMA:,
selectItem
expression:simpleExpressionName
sqlName
unquotedId
ID:b_name
COMMA:,
selectItem
expression:simpleExpressionName
sqlName
unquotedId
ID:c_name
fromClause
K_FROM:from
fromItem:tableReferenceFromItem
tableReference
queryTableExpression
expression:specialFunctionExpressionParent
specialFunctionExpression
graphTable
K_GRAPH_TABLE:graph_table
LPAR:(
sqlName
unquotedId
ID:students_graph
K_MATCH:match
pathTerm
pathTerm
pathTerm
pathTerm
pathTerm
pathTerm
pathTerm
pathFactor
pathPrimary
elementPattern
vertexPattern
LPAR:(
elementPatternFiller
sqlName
unquotedId
keywordAsId
K_A:a
K_IS:is
labelExpression
sqlName
unquotedId
ID:person
RPAR:)
pathFactor
pathPrimary
elementPattern
edgePattern
fullEdgePattern
fullEdgePointingRight
MINUS:-
LSQB:[
elementPatternFiller
K_IS:is
labelExpression
sqlName
unquotedId
ID:friends
RSQB:]
MINUS:-
GT:>
pathFactor
pathPrimary
elementPattern
vertexPattern
LPAR:(
elementPatternFiller
sqlName
unquotedId
ID:b
K_IS:is
labelExpression
sqlName
unquotedId
ID:person
RPAR:)
pathFactor
pathPrimary
elementPattern
edgePattern
fullEdgePattern
fullEdgePointingRight
MINUS:-
LSQB:[
elementPatternFiller
K_IS:is
labelExpression
sqlName
unquotedId
ID:friends
RSQB:]
MINUS:-
GT:>
pathFactor
pathPrimary
elementPattern
vertexPattern
LPAR:(
elementPatternFiller
sqlName
unquotedId
ID:c
K_IS:is
labelExpression
sqlName
unquotedId
ID:person
RPAR:)
pathFactor
pathPrimary
elementPattern
edgePattern
fullEdgePattern
fullEdgePointingRight
MINUS:-
LSQB:[
elementPatternFiller
K_IS:is
labelExpression
sqlName
unquotedId
ID:friends
RSQB:]
MINUS:-
GT:>
pathFactor
pathPrimary
elementPattern
vertexPattern
LPAR:(
elementPatternFiller
sqlName
unquotedId
keywordAsId
K_A:a
RPAR:)
K_WHERE:where
condition
expression:simpleComparisionCondition
expression:binaryExpression
expression:simpleExpressionName
sqlName
unquotedId
keywordAsId
K_A:a
PERIOD:.
expression:simpleExpressionName
sqlName
unquotedId
keywordAsId
K_NAME:name
simpleComparisionOperator:eq
EQUALS:=
expression:simpleExpressionStringLiteral
STRING:'Mary'
K_COLUMNS:columns
LPAR:(
graphTableColumnDefinition
expression:binaryExpression
expression:simpleExpressionName
sqlName
unquotedId
keywordAsId
K_A:a
PERIOD:.
expression:simpleExpressionName
sqlName
unquotedId
keywordAsId
K_NAME:name
K_AS:as
sqlName
unquotedId
ID:a_name
COMMA:,
graphTableColumnDefinition
expression:binaryExpression
expression:simpleExpressionName
sqlName
unquotedId
ID:b
PERIOD:.
expression:simpleExpressionName
sqlName
unquotedId
keywordAsId
K_NAME:name
K_AS:as
sqlName
unquotedId
ID:b_name
COMMA:,
graphTableColumnDefinition
expression:binaryExpression
expression:simpleExpressionName
sqlName
unquotedId
ID:c
PERIOD:.
expression:simpleExpressionName
sqlName
unquotedId
keywordAsId
K_NAME:name
K_AS:as
sqlName
unquotedId
ID:c_name
RPAR:)
RPAR:)
sqlName
unquotedId
ID:g
sqlEnd
SEMI:;
<EOF>
Instead of reading rows from a table/view, you can produce rows on the fly using the new values_clause. This makes it possible to produce rows without writing a query_block for each row and using union all as a kind of row separator.
column english format a7
column german format a7
with
eng (digit, english) as (values
(1, 'one'),
(2, 'two')
)
select digit, english, german
from eng e
natural full join (values
(2, 'zwei'),
(3, 'drei')
) as g (digit, german)
order by digit;
DIGIT ENGLISH GERMAN
---------- ------- -------
1 one
2 two zwei
3 drei
file
dmlStatement
selectStatement
select
subquery:subqueryQueryBlock
withClause
K_WITH:with
factoringClause
subqueryFactoringClause
sqlName
unquotedId
ID:eng
LPAR:(
sqlName
unquotedId
ID:digit
COMMA:,
sqlName
unquotedId
ID:english
RPAR:)
K_AS:as
valuesClause
LPAR:(
K_VALUES:values
valuesRow
LPAR:(
expression:simpleExpressionNumberLiteral
NUMBER:1
COMMA:,
expression:simpleExpressionStringLiteral
STRING:'one'
RPAR:)
COMMA:,
valuesRow
LPAR:(
expression:simpleExpressionNumberLiteral
NUMBER:2
COMMA:,
expression:simpleExpressionStringLiteral
STRING:'two'
RPAR:)
RPAR:)
queryBlock
K_SELECT:select
selectList
selectItem
expression:simpleExpressionName
sqlName
unquotedId
ID:digit
COMMA:,
selectItem
expression:simpleExpressionName
sqlName
unquotedId
ID:english
COMMA:,
selectItem
expression:simpleExpressionName
sqlName
unquotedId
ID:german
fromClause
K_FROM:from
fromItem:joinClause
fromItem:tableReferenceFromItem
tableReference
queryTableExpression
sqlName
unquotedId
ID:eng
sqlName
unquotedId
ID:e
joinVariant
outerJoinClause
K_NATURAL:natural
outerJoinType
K_FULL:full
K_JOIN:join
fromItem:tableReferenceFromItem
tableReference
queryTableExpression
valuesClause
LPAR:(
K_VALUES:values
valuesRow
LPAR:(
expression:simpleExpressionNumberLiteral
NUMBER:2
COMMA:,
expression:simpleExpressionStringLiteral
STRING:'zwei'
RPAR:)
COMMA:,
valuesRow
LPAR:(
expression:simpleExpressionNumberLiteral
NUMBER:3
COMMA:,
expression:simpleExpressionStringLiteral
STRING:'drei'
RPAR:)
RPAR:)
K_AS:as
sqlName
unquotedId
ID:g
LPAR:(
sqlName
unquotedId
ID:digit
COMMA:,
sqlName
unquotedId
ID:german
RPAR:)
orderByClause
K_ORDER:order
K_BY:by
orderByItem
expression:simpleExpressionName
sqlName
unquotedId
ID:digit
sqlEnd
SEMI:;
<EOF>
The function json_array has a new JSON_ARRAY_query_content clause. This clause simplifies the creation of JSON documents, similar to SQL/XML. If you use the abbreviated syntax for json_array and json_object it feels like writing JSON documents with embedded SQL.
column result format a90
select json [
select json {
'ename': ename,
'sal': sal,
'comm': comm absent on null
}
from emp
where sal >= 3000
returning json
] as result;
RESULT
------------------------------------------------------------------------------------------
[{"ename":"SCOTT","sal":3000},{"ename":"KING","sal":5000},{"ename":"FORD","sal":3000}]
file
dmlStatement
selectStatement
select
subquery:subqueryQueryBlock
queryBlock
K_SELECT:select
selectList
selectItem
expression:specialFunctionExpressionParent
specialFunctionExpression
jsonArray
K_JSON:json
LSQB:[
jsonArrayContent
jsonArrayQueryContent
subquery:subqueryQueryBlock
queryBlock
K_SELECT:select
selectList
selectItem
expression:specialFunctionExpressionParent
specialFunctionExpression
jsonObject
K_JSON:json
LCUB:{
jsonObjectContent
entry
regularEntry
expression:simpleExpressionStringLiteral
STRING:'ename'
COLON::
expression:simpleExpressionName
sqlName
unquotedId
ID:ename
COMMA:,
entry
regularEntry
expression:simpleExpressionStringLiteral
STRING:'sal'
COLON::
expression:simpleExpressionName
sqlName
unquotedId
ID:sal
COMMA:,
entry
regularEntry
expression:simpleExpressionStringLiteral
STRING:'comm'
COLON::
expression:simpleExpressionName
sqlName
unquotedId
ID:comm
jsonOnNullClause
K_ABSENT:absent
K_ON:on
K_NULL:null
RCUB:}
fromClause
K_FROM:from
fromItem:tableReferenceFromItem
tableReference
queryTableExpression
sqlName
unquotedId
ID:emp
whereClause
K_WHERE:where
condition
expression:simpleComparisionCondition
expression:simpleExpressionName
sqlName
unquotedId
ID:sal
simpleComparisionOperator:ge
GT:>
EQUALS:=
expression:simpleExpressionNumberLiteral
NUMBER:3000
jsonReturningClause
K_RETURNING:returning
K_JSON:json
RSQB:]
K_AS:as
sqlName
unquotedId
ID:result
sqlEnd
SEMI:;
<EOF>
Where can the new Boolean data type be used in the select
statement? In conversion functions, for example.
column dump_yes_value format a20
select cast('yes' as boolean) as yes_value,
xmlcast(xmltype('<x>no</x>') as boolean) as no_value,
validate_conversion('maybe' as boolean) as is_maybe_boolean,
dump(cast('yes' as boolean)) as dump_yes_value;
YES_VALUE NO_VALUE IS_MAYBE_BOOLEAN DUMP_YES_VALUE
----------- ----------- ---------------- --------------------
TRUE FALSE 0 Typ=252 Len=1: 1
file
dmlStatement
selectStatement
select
subquery:subqueryQueryBlock
queryBlock
K_SELECT:select
selectList
selectItem
expression:specialFunctionExpressionParent
specialFunctionExpression
cast
K_CAST:cast
LPAR:(
expression:simpleExpressionStringLiteral
STRING:'yes'
K_AS:as
dataType
oracleBuiltInDatatype
booleanDatatype
K_BOOLEAN:boolean
RPAR:)
K_AS:as
sqlName
unquotedId
ID:yes_value
COMMA:,
selectItem
expression:specialFunctionExpressionParent
specialFunctionExpression
xmlcast
K_XMLCAST:xmlcast
LPAR:(
expression:functionExpressionParent
functionExpression
sqlName
unquotedId
keywordAsId
K_XMLTYPE:xmltype
LPAR:(
functionParameter
condition
expression:simpleExpressionStringLiteral
STRING:'<x>no</x>'
RPAR:)
K_AS:as
dataType
oracleBuiltInDatatype
booleanDatatype
K_BOOLEAN:boolean
RPAR:)
K_AS:as
sqlName
unquotedId
ID:no_value
COMMA:,
selectItem
expression:specialFunctionExpressionParent
specialFunctionExpression
validateConversion
K_VALIDATE_CONVERSION:validate_conversion
LPAR:(
expression:simpleExpressionStringLiteral
STRING:'maybe'
K_AS:as
dataType
oracleBuiltInDatatype
booleanDatatype
K_BOOLEAN:boolean
RPAR:)
K_AS:as
sqlName
unquotedId
ID:is_maybe_boolean
COMMA:,
selectItem
expression:functionExpressionParent
functionExpression
sqlName
unquotedId
ID:dump
LPAR:(
functionParameter
condition
expression:specialFunctionExpressionParent
specialFunctionExpression
cast
K_CAST:cast
LPAR:(
expression:simpleExpressionStringLiteral
STRING:'yes'
K_AS:as
dataType
oracleBuiltInDatatype
booleanDatatype
K_BOOLEAN:boolean
RPAR:)
RPAR:)
K_AS:as
sqlName
unquotedId
ID:dump_yes_value
sqlEnd
SEMI:;
<EOF>
The impact of Boolean expressions is huge. A condition becomes an expression that returns a Boolean value. Consequently, conditions can be used wherever expressions are permitted.
with
function f(p in boolean) return boolean is
begin
return p;
end;
select (select count(*) from emp) = 14 and (select count(*) from dept) = 4 as is_complete,
f(1>0) is true as is_true,
cast(null as boolean) is not null as is_not_null;
/
IS_COMPLETE IS_TRUE IS_NOT_NULL
----------- ----------- -----------
TRUE TRUE FALSE
file
dmlStatement
selectStatement
select
subquery:subqueryQueryBlock
withClause
K_WITH:with
plsqlDeclarations
functionDeclaration
K_FUNCTION:function
plsqlCode
ID:f
LPAR:(
ID:p
K_IN:in
K_BOOLEAN:boolean
RPAR:)
K_RETURN:return
K_BOOLEAN:boolean
K_IS:is
ID:begin
K_RETURN:return
ID:p
SEMI:;
K_END:end
SEMI:;
queryBlock
K_SELECT:select
selectList
selectItem
expression:simpleComparisionCondition
expression:simpleComparisionCondition
expression:scalarSubqueryExpression
LPAR:(
subquery:subqueryQueryBlock
queryBlock
K_SELECT:select
selectList
selectItem
expression:functionExpressionParent
functionExpression
sqlName
unquotedId
keywordAsId
K_COUNT:count
LPAR:(
functionParameter
condition
expression:allColumnWildcardExpression
AST:*
RPAR:)
fromClause
K_FROM:from
fromItem:tableReferenceFromItem
tableReference
queryTableExpression
sqlName
unquotedId
ID:emp
RPAR:)
simpleComparisionOperator:eq
EQUALS:=
expression:logicalCondition
expression:simpleExpressionNumberLiteral
NUMBER:14
K_AND:and
expression:scalarSubqueryExpression
LPAR:(
subquery:subqueryQueryBlock
queryBlock
K_SELECT:select
selectList
selectItem
expression:functionExpressionParent
functionExpression
sqlName
unquotedId
keywordAsId
K_COUNT:count
LPAR:(
functionParameter
condition
expression:allColumnWildcardExpression
AST:*
RPAR:)
fromClause
K_FROM:from
fromItem:tableReferenceFromItem
tableReference
queryTableExpression
sqlName
unquotedId
ID:dept
RPAR:)
simpleComparisionOperator:eq
EQUALS:=
expression:simpleExpressionNumberLiteral
NUMBER:4
K_AS:as
sqlName
unquotedId
ID:is_complete
COMMA:,
selectItem
expression:isTrueCondition
expression:functionExpressionParent
functionExpression
sqlName
unquotedId
ID:f
LPAR:(
functionParameter
condition
expression:simpleComparisionCondition
expression:simpleExpressionNumberLiteral
NUMBER:1
simpleComparisionOperator:gt
GT:>
expression:simpleExpressionNumberLiteral
NUMBER:0
RPAR:)
K_IS:is
K_TRUE:true
K_AS:as
sqlName
unquotedId
ID:is_true
COMMA:,
selectItem
expression:isNullCondition
expression:specialFunctionExpressionParent
specialFunctionExpression
cast
K_CAST:cast
LPAR:(
expression:simpleExpressionName
sqlName
unquotedId
keywordAsId
K_NULL:null
K_AS:as
dataType
oracleBuiltInDatatype
booleanDatatype
K_BOOLEAN:boolean
RPAR:)
K_IS:is
K_NOT:not
K_NULL:null
K_AS:as
sqlName
unquotedId
ID:is_not_null
sqlEnd
SEMI:;
SOL:/
<EOF>
There is an extended is_JSON_condition that makes it possible to validate a JSON document against a JSON schema.
column j format a20
with
t (j) as (values
(json('["a", "b"]')), -- JSON array
(json('{"a": "a", "b": "b"}')), -- JSON object without id property
(json('{"id": 42}')), -- JSON object with numeric id property
(json('{"id": "42"}')) -- JSON object with string id property
)
select j,
j is json validate '
{
"type": "object",
"properties": {
"id": { "type": "number" }
}
}' as is_valid
from t;
J IS_VALID
-------------------- -----------
["a","b"] FALSE
{"a":"a","b":"b"} TRUE
{"id":42} TRUE
{"id":"42"} FALSE
file
dmlStatement
selectStatement
select
subquery:subqueryQueryBlock
withClause
K_WITH:with
factoringClause
subqueryFactoringClause
sqlName
unquotedId
ID:t
LPAR:(
sqlName
unquotedId
ID:j
RPAR:)
K_AS:as
valuesClause
LPAR:(
K_VALUES:values
valuesRow
LPAR:(
expression:functionExpressionParent
functionExpression
sqlName
unquotedId
keywordAsId
K_JSON:json
LPAR:(
functionParameter
condition
expression:simpleExpressionStringLiteral
STRING:'["a", "b"]'
RPAR:)
RPAR:)
COMMA:,
valuesRow
LPAR:(
expression:functionExpressionParent
functionExpression
sqlName
unquotedId
keywordAsId
K_JSON:json
LPAR:(
functionParameter
condition
expression:simpleExpressionStringLiteral
STRING:'{"a": "a", "b": "b"}'
RPAR:)
RPAR:)
COMMA:,
valuesRow
LPAR:(
expression:functionExpressionParent
functionExpression
sqlName
unquotedId
keywordAsId
K_JSON:json
LPAR:(
functionParameter
condition
expression:simpleExpressionStringLiteral
STRING:'{"id": 42}'
RPAR:)
RPAR:)
COMMA:,
valuesRow
LPAR:(
expression:functionExpressionParent
functionExpression
sqlName
unquotedId
keywordAsId
K_JSON:json
LPAR:(
functionParameter
condition
expression:simpleExpressionStringLiteral
STRING:'{"id": "42"}'
RPAR:)
RPAR:)
RPAR:)
queryBlock
K_SELECT:select
selectList
selectItem
expression:simpleExpressionName
sqlName
unquotedId
ID:j
COMMA:,
selectItem
expression:isJsonCondition
expression:simpleExpressionName
sqlName
unquotedId
ID:j
K_IS:is
K_JSON:json
jsonConditionOption:jsonConditionOptionValidate
K_VALIDATE:validate
expression:simpleExpressionStringLiteral
STRING:'\n {\n "type": "object",\n "properties": {\n "id": { "type": "number" }\n }\n }'
K_AS:as
sqlName
unquotedId
ID:is_valid
fromClause
K_FROM:from
fromItem:tableReferenceFromItem
tableReference
queryTableExpression
sqlName
unquotedId
ID:t
sqlEnd
SEMI:;
<EOF>
There are more new features in the Oracle Database 23c that you can use in the select statement, such as:

- fuzzy_match and phonic_encode
- plsql_declarations
- boolean_and_agg, every, boolean_or_agg and to_boolean
- domain_check, domain_check_type, domain_display, domain_name and domain_order
- JSON_passing_clause and type clause in json_query and json_value
- on null and on error clauses in json_scalar
- ordered clause in json_serialize
- type clause in json_table and json_transform
- sort, nested_path, case, copy, intersect, merge, minus, prepend and union in json_transform
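To illustrate one item from this list, here is a minimal sketch of the new Boolean aggregate functions boolean_and_agg and boolean_or_agg, run against the classic EMP table of the SCOTT schema (the column aliases are my own; results depend on your data):

```sql
-- boolean_and_agg returns true when the condition holds for all rows,
-- boolean_or_agg when it holds for at least one row
select boolean_and_agg(sal >= 800)      as all_at_least_800,
       boolean_or_agg(comm is not null) as some_have_comm
  from emp;
```

With the classic 14-row EMP data set both columns return TRUE, since the lowest salary is 800 and the salesmen have a commission.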
We can also assume that more features will be added with future release updates. The AI vector search, for example, should be available with 23.4 later this year.
The plan for IslandSQL is still the same as outlined in the previous episode. So we should cover the remaining DML statements (call, delete, explain plan, insert, merge and update) in the next episode.
The post IslandSQL Episode 5: Select in Oracle Database 23c appeared first on Philipp Salvisberg's Blog.
The post Autonomous Transactions appeared first on Philipp Salvisberg's Blog.
Autonomous transactions became available in Oracle Database 8i Release 1 (8.1.5), 25 years ago. Before that, the feature was used only internally, for example, when requesting a new value from a sequence. I mean, if Oracle uses autonomous transactions internally and has made them public, then using them can hardly be bad, right? – Wrong.
“The legitimate real-world use of autonomous transactions is exceedingly rare. If you find them to be a feature you are using constantly, you’ll want to take a long, hard look at why.”
— Tom Kyte
In this blog post, I’d like to discuss some of the side effects of autonomous transactions.
Here’s the screenshot of a poll result on X (Twitter).
You can run this SQL script using a common SQL client connected to an Oracle Database 12c instance or higher.
I’ve run this script against the following Oracle database versions with the same result: 23.3, 21.8, 21.3, 19.22, 19.21, 19.19, 19.17, 18.4, 12.2 and 12.1. It should therefore also work in your environment using SQL*Plus, SQLcl or SQL Developer. Simply drop the table t beforehand if it already exists.
create table t (c1 number);
insert into t values(1);
commit;
with
function f return number deterministic is
pragma autonomous_transaction;
begin
delete from t;
commit;
return 1;
end;
select * from t where f() = 1;
/
C1
----------
1
The query result shows the row inserted on line 2. So the majority of respondents were right.
Just to be clear: This result is expected. It is not a bug. The reason why the query returns one row is the statement-level read consistency of the Oracle database. We see the data as it was at the start of the query. The autonomous transaction that deletes all rows is completed (committed) after the query is started. As a result, changes made by the autonomous transaction are not visible in the main transaction.
I like the features introduced in 23c, such as the IF [NOT] EXISTS syntax support. That’s why I’m using the 23c syntax in this blog post from now on. However, it should not be too difficult to adapt the code for older versions.
The next script looks very similar to the first one. The difference is that the function f no longer contains the pragma autonomous_transaction. The default, so to speak. And this leads to a different result.
drop table if exists t;
create table t (c1 number);
insert into t values(1);
commit;
with
function f return number deterministic is
begin
delete from t;
commit;
return 1;
end;
select * from t where f()=1;
/
Error starting at line : 6 in command -
with
function f return number deterministic is
begin
delete from t;
commit;
return 1;
end;
select * from t where f()=1
Error at Command Line : 13 Column : 15
Error report -
SQL Error: ORA-14551: cannot perform a DML operation inside a query
ORA-06512: at line 3
ORA-06512: at line 7
14551. 00000 - "cannot perform a DML operation inside a query "
*Cause: DML operation like insert, update, delete or select-for-update
cannot be performed inside a query or under a PDML slave.
*Action: Ensure that the offending DML operation is not performed or
use an autonomous transaction to perform the DML operation within
the query or PDML slave.
More Details :
https://docs.oracle.com/error-help/db/ora-14551/
https://docs.oracle.com/error-help/db/ora-06512/
The error message, the cause and the action are good. You even get the information that you can work around the problem by using an autonomous transaction.
What is missing, however, is the information on why performing a DML operation within a query is a bad thing. Maybe it’s too obvious. Who would expect a query to change data? – Probably nobody. It therefore makes sense to prohibit DML in queries by default.
Let’s add some logging information to the query to better understand what is being executed and when.
drop table if exists t;
create table t (c1 number);
insert into t values (1), (2);
commit;
drop table if exists l;
create table l (
id integer generated always as identity primary key,
text varchar2(50 char)
);
column logit format a20
with
procedure logit (in_text in varchar2) is
pragma autonomous_transaction;
begin
insert into l(text) values(in_text);
commit;
end;
function logit (in_text in varchar2) return varchar2 is
begin
logit(in_text);
return in_text;
end;
function f return number deterministic is
pragma autonomous_transaction;
begin
logit('in function f');
delete from t;
commit;
return 1;
end;
select c1, logit('in select_list') as logit
from t
where f() = 1 and logit('in where_clause') is not null;
/
select * from l order by id;
C1 LOGIT
---------- --------------------
1 in select_list
2 in select_list
ID TEXT
---------- --------------------------------------------------
1 in where_clause
2 in function f
3 in select_list
4 in select_list
We have inserted an additional row into the table t to better understand how often a function is called. A logit call produces a row in the new table l.
The query on the table t now returns two rows. That’s expected due to statement-level read consistency.
The query result on the table l
reveals the order in which the logit
calls were evaluated by the Oracle Database.
So far so good.
The Oracle Database can restart any DML statement. Automatically or intentionally via our code, for example in exception handlers. A restart also implies a rollback. The scope of a rollback is the current transaction. Changes that are made outside the current transaction, such as writing to a file, calling a REST service or executing code in an autonomous transaction, remain unaffected by a rollback. This means that the application is responsible for reversing changes made outside the current transaction.
Let’s force a DML restart. Franck Pachot provided the simplest solution in this post on X that works in a single database session. The script looks the same as before; we only added a for update clause on line 35.
drop table if exists t purge;
create table t (c1 number);
insert into t values (1), (2);
commit;
drop table if exists l purge;
create table l (
id integer generated always as identity primary key,
text varchar2(50 char)
);
column logit format a20
with
procedure logit (in_text in varchar2) is
pragma autonomous_transaction;
begin
insert into l(text) values(in_text);
commit;
end;
function logit (in_text in varchar2) return varchar2 is
begin
logit(in_text);
return in_text;
end;
function f return number deterministic is
pragma autonomous_transaction;
begin
logit('in function f');
delete from t;
commit;
return 1;
end;
select c1, logit('in select_list') as logit
from t
where f() = 1 and logit('in where_clause') is not null
for update;
/
select * from l order by id;
no rows selected
ID TEXT
---------- --------------------------------------------------
1 in where_clause
2 in function f
3 in where_clause
The query on the table t now returns no rows. The log contains two rows with the text “in where_clause”. The second one is a clear indication of a DML restart.
Again, this is not a bug in the Oracle Database. It’s a bug in the application code.
But why is the function f called just once? – Because the function is declared as deterministic (a false claim, BTW, but a necessary evil to avoid an ORA-600 due to recursive restarts). Since no parameters are passed, the Oracle Database already knows the result of the function call. As a result, there is no need to re-evaluate it.
And why was the statement restarted? – Because the Oracle Database detected that the rows to be locked have been changed. Locking outdated versions of a row is not possible and it would not make any sense. So the database is left with two options. Either throw an error or restart the statement. “Let’s try again” is a solution approach we often use when something doesn’t work on the first attempt. This also works for the Oracle Database.
Although what I wrote before sounds reasonable, I’d still like to verify it.
So, let’s change the previous script slightly and replace the delete with an update statement on line 28.
drop table if exists t;
create table t (c1 number);
insert into t values (1), (2);
commit;
drop table if exists l;
create table l (
id integer generated always as identity primary key,
text varchar2(50 char)
);
column logit format a20
with
procedure logit (in_text in varchar2) is
pragma autonomous_transaction;
begin
insert into l(text) values(in_text);
commit;
end;
function logit (in_text in varchar2) return varchar2 is
begin
logit(in_text);
return in_text;
end;
function f return number deterministic is
pragma autonomous_transaction;
begin
logit('in function f');
update t set c1 = c1 + 1;
commit;
return 1;
end;
select c1, logit('in select_list') as logit
from t
where f() = 1 and logit('in where_clause') is not null
for update;
/
select * from l order by id;
C1 LOGIT
---------- --------------------
2 in select_list
3 in select_list
ID TEXT
---------- --------------------------------------------------
1 in where_clause
2 in function f
3 in where_clause
4 in select_list
5 in select_list
See, the query on the table t now returns the two updated rows – the effect of the restarted statement.
This blog post was inspired by a real-life use case. The original question was how to call a package procedure at the end of a process from a security and observability tool when this tool can run queries only.
Here’s the screenshot of an X post by my colleague Stefan Oehrli that demonstrates a possible solution approach. Just replace count(*) with a couple of columns from the table unified_audit_trail to make it realistic.
While this technically “solves” the original requirement, it comes with a couple of issues.
The problem with this approach is that this easy implementation comes at the price of possible data loss. That’s fine as long as the stakeholders know this in advance and accept the risk. However, I can imagine that this is not good enough. Especially when dealing with security-relevant data, we should strive for the best possible approach and not be satisfied with the easiest solution to implement.
Years ago I was a fan of autonomous transactions. They allow me to persist data in logging tables even if the main transaction is not committed. That’s a great feature. However, over the years I’ve seen some abuse of autonomous transactions that changed my mind. I still like to use autonomous transactions for debugging purposes. But that’s it. Using them for something else is most probably a bug in the application code.
So what are the alternatives?
Take a step back and think about who should be in charge of a certain process. The “real use case” mentioned above for example. I believe that the security and observability tool is responsible for the data it reads, processes and stores in its data store. This means that this tool should have mechanisms in place to remember the last processed log entry per source with the corresponding restart points and procedures. Moving this responsibility to the source for a part of it like “remembering the last processed log entry” is just plain wrong and leads to the issues outlined above.
Shifting process responsibility to the right place makes the use of autonomous transactions most probably superfluous.
Advanced Queues are transactional (enqueue and dequeue). They are an excellent way to defer actions that you would like to be part of the transaction, but which are not transactional by nature, for example, sending an e-mail, calling a REST service or calling functionality that contains TCL statements. You can configure how to deal with failures such as the number of retries, delay time, etc. This leads to a more transaction-like behaviour than using autonomous transactions.
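A minimal sketch of this approach, using a JMS text message payload. All object names (mail_qt, mail_q, the message text) are my own assumptions, not part of the original post:

```sql
-- one-time setup (requires the AQ administrator role)
begin
   dbms_aqadm.create_queue_table(
      queue_table        => 'mail_qt',
      queue_payload_type => 'sys.aq$_jms_text_message'
   );
   dbms_aqadm.create_queue(queue_name => 'mail_q', queue_table => 'mail_qt');
   dbms_aqadm.start_queue(queue_name => 'mail_q');
end;
/

-- within the main transaction: enqueue instead of sending the e-mail directly
declare
   l_options dbms_aq.enqueue_options_t;
   l_props   dbms_aq.message_properties_t;
   l_msgid   raw(16);
   l_message sys.aq$_jms_text_message := sys.aq$_jms_text_message.construct;
begin
   l_message.set_text('send mail to someone@example.com');
   dbms_aq.enqueue(
      queue_name         => 'mail_q',
      enqueue_options    => l_options,
      message_properties => l_props,
      payload            => l_message,
      msgid              => l_msgid
   );
   -- no commit here: the message is part of the main transaction,
   -- so a rollback removes it together with all other changes
end;
/
```

A dequeue process then performs the actual e-mail delivery after the main transaction has committed.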
Jobs are similar to queue messages. You create them as part of the transaction. They might be the easier solution because the dequeue process is the job system itself.
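A sketch using the legacy dbms_job API, which – unlike dbms_scheduler.create_job – is transactional (send_mail_pkg is a hypothetical package, not part of the original post):

```sql
declare
   l_job binary_integer;
begin
   -- the job becomes visible to the job coordinator only after commit;
   -- a rollback of the main transaction removes it as well
   dbms_job.submit(
      job  => l_job,
      what => 'send_mail_pkg.send(''someone@example.com'');'
   );
end;
/
```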
Your application should contain at most one PL/SQL unit with a pragma autonomous_transaction: the one for storing logging/debugging messages as part of your instrumentation strategy.
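Such a unit could look like this minimal sketch (the procedure and table names are assumptions for illustration):

```sql
create or replace procedure log_msg (in_text in varchar2) is
   pragma autonomous_transaction;
begin
   insert into log_table (text) values (in_text);
   commit; -- mandatory: an autonomous transaction must end with commit or rollback
end;
/
```

Because the insert is committed independently, the log entry survives even if the calling transaction is rolled back – which is exactly what you want for debugging, and exactly what you usually do not want anywhere else.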
Autonomous transactions may seem appealing for quickly solving certain problems or requirements. However, they come with the risk of data inconsistencies.
Even when analyzing logging/debugging data generated by autonomous transactions, we need to be aware that an entry does not mean that an action has taken place, as it could have been undone via a rollback.