Formatting SQL Scripts in a Directory Tree with SQLcl

blank

Introduction

Oracle’s SQL Developer can format code in any worksheet and PL/SQL editor. The formatter is highly configurable and the default formatting results are becoming better with every version. Oracle’s SQLcl is a command-line tool. It’s a stripped-down version of SQL Developer and is known as a user-friendly alternative for SQL*Plus.

But SQLcl is more. It can execute JavaScript and access any Java class distributed with SQLcl. Through JavaScript, you can access local and remote resources easily. In this blog post, I show how you can format all your SQL scripts with a few lines of JavaScript.

Demo Setup

I re-formatted the following three SQL scripts by hand. The first two are ugly. In the end, I want to show that the formatter is an improvement, even if you do not agree with the applied style guideline. I think it is important to know how the formatter deals with syntax errors. That’s why I’ve added one to the last script.

query.sql
Select d.department_name,v.  employee_id 
,v 
. last_name frOm departments d CROSS APPLY(select*from employees e
  wHERE e.department_id=d.department_id) v WHeRE 
d.department_name in ('Marketing'
,'Operations',
'Public Relations') Order By d.
department_name,v.employee_id;
package_body.sql
create or replace package body the_api.math as function to_int_table(in_integers
in varchar2,in_pattern in varchar2 default '[0-9]+')return sys.ora_mining_number_nt deterministic accessible
by(package the_api.math,package the_api.test_math)is l_result sys
.ora_mining_number_nt:=sys.ora_mining_number_nt();l_pos integer:= 1;l_int integer;
begin<<integer_tokens>>loop l_int:=to_number(regexp_substr(in_integers,in_pattern,1,l_pos));
exit integer_tokens when l_int is null;l_result.extend;l_result(l_pos):= l_int;l_pos:=l_pos+1;
end loop integer_tokens;return l_result;end to_int_table;end math;
/
syntax_error.sql
declare
   l_var1  integer;
   l_var2  varchar2(20);
begin
   for r in /*(*/ select x.* from x join y on y.a = x.a)
   loop
      p(r.a, r.b, r.c);
   end loop;
end;
/

I committed these files to my sandbox GitHub repository. This way I can compare the formatting results with the committed version and I can easily revert the changes.

Running the Formatter with Default Settings

The following JavaScript queries all .sql files in a directory tree, applies the default formatter settings and replaces the original content with the formatted version.

JavaScript
var getFiles = function (rootPath) {
    var Collectors = Java.type("java.util.stream.Collectors");
    var Files = Java.type("java.nio.file.Files");
    var Paths = Java.type("java.nio.file.Paths");
    var files = Files.walk(Paths.get(rootPath))
        .filter(function (f) Files.isRegularFile(f) && f.toString().endsWith(".sql"))
        .collect(Collectors.toList()); 
    return files;
}

if (args[1] == null) {
    ctx.write("\nplease provide the root path to a directory with .sql files.\n\n");
} else {
    ctx.write("\n");
    var Files = Java.type("java.nio.file.Files");
    var files = getFiles(args[1]);
    var Format = Java.type("oracle.dbtools.app.Format");
    var formatter = new Format();
    for (var i in files) {
        ctx.write("Formatting file " + (i+1) + " of " + files.length + ": " + files[i].toString() + "... ");
        ctx.getOutputStream().flush();
        var original = Files.readString(files[i]);
        var result = formatter.format(original);
        Files.writeString(files[i], result);
        ctx.write("done.\n");
        ctx.getOutputStream().flush();
    }
}

SQLcl 20.2 uses the Nashorn JavaScript engine. This works also with Java 11. If you are interested in writing JavaScript scripts for SQLcl I recommend having a look at Menno Hoogendijk’s GitHub repo and the examples in Oracle’s GitHub repo.

I’d like to focus in this blog post on the formatter. The formatter is instantiated with default settings on line 18. On line 23 the original file content is passed to the formatter and the formatted result is returned. The `ctx.getOutputStream().flush();` is a trick to force SQLcl to flush output on the console. This improves the user experience when processing a lot of files (see the video at the end of this blog post).

You can store this JavaScript file along with the three example files in a directory of your choice. Then change to this directory and start SQLcl and execute the highlighted commands below (use host dir when you are using Windows):

run default_format.js
sql /nolog

SQLcl: Release 20.2 Production on Sun Aug 09 16:16:19 2020

Copyright (c) 1982, 2020, Oracle.  All rights reserved.


SQL> host ls
default_format.js	package_body.sql	query.sql		syntax_error.sql

SQL> script default_format.js

please provide the root path to a directory with .sql files.

SQL> script default_format.js .

Formatting file 1 of 3: ./query.sql... done.
Formatting file 2 of 3: ./syntax_error.sql... done.
Formatting file 3 of 3: ./package_body.sql... done.
SQL> 

Here are the original and formatted versions side-by-side:

blank
blank
blank

The first two files are certainly easier to read now. However, the syntax_error.sql looks strange. The reason is, that the formatter is designed for interactive use and the SQL Developer team decided to format with a best effort approach, even if syntax errors are found. It’s important to note that a detected syntax error does not necessarily mean that the code is incorrect. It just means that the parser does not understand the code. This may happen due to bugs or because grammar changes are not (yet) supported by the parser.

Shortcomings to Address

You’ve seen that applying the formatter is quite easy. However, there are some shortcomings:

  • Files with syntax errors are formatted
    This may lead to a bad result and is typically unwanted when processing files in batch mode.
  • Only files with the file extension .sql are processed
    What about files with the extensions .pks, .pkb, .vw, etc.? They are not processed. A better default setting would be nice, along with an option to overwrite the file extensions to be processed.
  • Default Advanced Format settings only
    SQL Developer allows you to configure 26 formatter settings for typical coding styles. It would be nice, if the default setting could be changed in a similar way as in the SQL Developer’s preferences dialog.
  • Default Custom Format only
    If Advanced Format is not enough, you can configure the formatter further by writing your own Arbori program. However, it is not that easy and it is time-consuming to write and maintain an Arbori program. But if you happen to have such an Arbori program (as I do) then you’d like to use it as input for the formatter as well to get the very same result as in the SQL Developer IDE.

You can use SQLcl’s FORMAT FILE command to address bullet points 1 and 3. However, it’s not possible to set Custom Format or to limit file extensions to be processed with FORMAT FILE in SQLcl 20.2. But you can pass a directory as INPUT  and OUTPUT parameter (instead of file names). I tried that because it’s documented for sdcli (Thanks Torsten). So, if you do not need to limit file extensions or define a custom Arbori program, then the built-in FORMAT FILE is most probably good enough. [added on 2020-08-10]

More Complete Formatter CLI

I’ve provided a format.js as part of the Trivadis PL/SQL & SQL Formatter Settings. I recommend downloading, cloning or forking this repository when you plan to use this script. It’s easier because the default Arbori program is referenced via a relative path and when you’re fine with it, you do not need to pass it as a command line argument. However, the format.js works also as a standalone script.

In my environment, I start the script as follows:

run format.js
SQL> script ../../Trivadis/plsql-formatter-settings/sqlcl/format.js

format.js for SQLcl 20.2
Copyright 2020 by Philipp Salvisberg (philipp.salvisberg@trivadis.com)

missing mandatory <rootPath> argument.

usage: script format.js <rootPath> [options]

mandatory arguments:
  <rootPath>     path to directory containing files to format (content will be replaced!)

options:
  ext=<ext>      comma separated list of file extensions to process, e.g. ext=sql,pks,pkb
  arbori=<file>  path to the file containing the Arbori program for custom format settings

SQL>

As in the simplified version an error is shown with a short help on how to use this CLI. So, I need to pass a path, e.g. . for the current directory, to make it work.

run format.js with path
SQL> script ../../Trivadis/plsql-formatter-settings/sqlcl/format.js .

format.js for SQLcl 20.2
Copyright 2020 by Philipp Salvisberg (philipp.salvisberg@trivadis.com)

Formatting file 1 of 3: ./query.sql... done.
Formatting file 2 of 3: ./syntax_error.sql... Syntax Error at line 4, column 12


   for r in /*(*/ select x.* from x join y on y.a = x.a)
            ^^^                                          

Expected: name_wo_function_call,identifier,term,factor,name,. skipped.
Formatting file 3 of 3: ./package_body.sql... done.
SQL> 

As you see in the console output, there was an error when processing the second file syntax_error.sql. The syntax error was detected, the error reported and the file was left unchanged. Behind the scenes, different formatter settings have been applied. See the source code for details. It should be quite self-explanatory.

These are the formatting results:

query.sql – custom format result
SELECT d.department_name,
       v.employee_id,
       v.last_name
  FROM departments d CROSS APPLY (
          SELECT *
            FROM employees e
           WHERE e.department_id = d.department_id
       ) v
 WHERE d.department_name IN (
          'Marketing',
          'Operations',
          'Public Relations'
       )
 ORDER BY d.department_name,
          v.employee_id;
package_body.sql – custom format result
CREATE OR REPLACE PACKAGE BODY the_api.math AS
   FUNCTION to_int_table (
      in_integers  IN  VARCHAR2,
      in_pattern   IN  VARCHAR2 DEFAULT '[0-9]+'
   ) RETURN sys.ora_mining_number_nt
      DETERMINISTIC
      ACCESSIBLE BY ( PACKAGE the_api.math, PACKAGE the_api.test_math )
   IS
      l_result  sys.ora_mining_number_nt := sys.ora_mining_number_nt();
      l_pos     INTEGER := 1;
      l_int     INTEGER;
   BEGIN
      <<integer_tokens>>
      LOOP
         l_int               := to_number(regexp_substr(in_integers, in_pattern, 1, l_pos));
         EXIT integer_tokens WHEN l_int IS NULL;
         l_result.extend;
         l_result(l_pos)     := l_int;
         l_pos               := l_pos + 1;
      END LOOP integer_tokens;
      RETURN l_result;
   END to_int_table;
END math;
/

And here’s a short audio-less video, showing how format.js is used to format utPLSQL packages and types.

Summary

Formatting SQL scripts with SQLcl is quite easy when you’re okay with the default formatter settings. It’s more work when you want to apply advanced and custom format settings with some sanity checks. Nonetheless parsing the SQL script and reporting error messages was only 14 lines of code. Formatting is possible without an active connection to the database. As long as the grammar is correct, the formatting result should be good. You can imagine what you could do when accessing the database as well (e.g. to process the source stored in the database). This clearly shows the power of JavaScript within SQLcl.

When you have questions regarding SQL Developer’s default formatting behaviour then I suggest asking them in the SQL Developer’s forum. When you find strange formatting results for the Trivadis PL/SQL & SQL formatter settings or the format.js script then please open an issue in this GitHub repo. Thank you.

Updated on 2020-08-10, added a section in the “Shortcomings to Address” chapter regarding the FORMAT FILE command in SQLcl. HT to Torsten Kleiber.

4 Comments

  1. Hi!

    Nice work – but formatting is part of the SQL Developer CLI for long time?
    Is there are reason, why do you not use this instead?


    \sqldeveloper\bin>sdcli format input=

    Kind regards
    Torsten

    • Hi Torsten,

      😲 I completely forgot about the sdcli and did not know that you can format files with it.

      A good opportunity to run it from my sandbox source directory.

      bash /Applications/SQLDeveloper20.2.0.app/Contents/Resources/sqldeveloper/sqldeveloper/bin/sdcli format input=. output=.
      
       Oracle SQL Developer
       Copyright (c) 2005, 2020, Oracle and/or its affiliates. All rights reserved.
      
      Warning: Nashorn engine is planned to be removed from a future JDK release
      Command Completed.

      Then I checked the result. All files were formatted based on my settings within SQL Developer. All .sql files and the .js file too. So I probably have to call the cli per file, which might be a bit slow.

      Now Let’s run it for the utPLSQL code (as in the video):

      bash /Applications/SQLDeveloper20.2.0.app/Contents/Resources/sqldeveloper/sqldeveloper/bin/sdcli format input=utplsql/source output=utplsql/source
      
       Oracle SQL Developer
       Copyright (c) 2005, 2020, Oracle and/or its affiliates. All rights reserved.
      
      Warning: Nashorn engine is planned to be removed from a future JDK release
      Command Completed.

      The console output is a bit sparse, but behind the scenes all 297 files in the directory tree have been formatted with my SQL Developer settings. 

      So, after trying it, I’d say you may use the sdcli to format your code, when you are sure that your directories contain only files to be formatted. If you want to have more control (file extensions to consider, handling files with syntax errors, console output) then a JavaScript for SQLcl might be a good alternative. Furthermore the SQLcl is more compact and might be better suited to run on servers.

      Thank you very much for your comment Torsten. I’ve learnt something new. 

  2. blank Micke Vrolijk says:

    Hi Philipp,

    Excellent post on formatting! Do you have any experience with the combination of triggering these kindes of directory formats from a Git hook? (eg. pre-commit?).

    I am looking to a way to enforce a standardized formatting of the (pl)sql files in the team in a way that we don’t need to rely on a specific tool with specific settings.

    • Hi Micke,

      Thanks.

      Do you have any experience with the combination of triggering these kindes of directory formats from a Git hook? (eg. pre-commit?).

      We are discussing this with customers (early stages). I have not tried it, but I do not see a problem since the pre-commit hook is just a shell script. This GitHub repo has some examples for other languages. This solution relies on other tools (for PL/SQL this would be SQLcl, the JavaScript plus settings) and everything is included in the repo (which has pros and cons).

      So, technically this is for sure doable. Of course you need SQLcl and a JavaScript. However, I guess that agreeing on a code style and living with the imperfection of a formatter is another story.

      Please let me know, when managed to integrate it. Thank you.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.