Saturday, March 2, 2024

CORRESPONDING – Java, SQL and jOOQ.


I lately stumbled upon a regular SQL function that was carried out, to my shock, in HSQLDB. The key phrase is CORRESPONDING, and it may be used with all set operations, together with UNION, INTERSECT, and EXCEPT.

Let’s take a look at the sakila database. It has 3 tables with individuals in it:

CREATE TABLE actor (
    actor_id integer NOT NULL PRIMARY KEY,
    first_name varchar(45) NOT NULL,
    last_name varchar(45) NOT NULL,
    last_update timestamp
);

CREATE TABLE buyer (
    customer_id integer NOT NULL PRIMARY KEY,
    store_id smallint NOT NULL,
    first_name varchar(45) NOT NULL,
    last_name varchar(45) NOT NULL,
    e-mail varchar(50),
    address_id smallint NOT NULL,
    create_date date NOT NULL,
    last_update timestamp,
    lively boolean
);

CREATE TABLE employees (
    staff_id integer NOT NULL PRIMARY KEY,
    first_name varchar(45) NOT NULL,
    last_name varchar(45) NOT NULL,
    address_id smallint NOT NULL,
    e-mail varchar(50),
    store_id smallint NOT NULL,
    lively boolean NOT NULL,
    username varchar(16) NOT NULL,
    password varchar(40),
    last_update timestamp,
    image blob
);

Related, however not the identical. What if we needed to get all of the “individuals” from our database? A technique to try this in any strange database product is:

SELECT first_name, last_name
FROM actor
UNION ALL
SELECT first_name, last_name
FROM buyer
UNION ALL
SELECT first_name, last_name
FROM employees
ORDER BY first_name, last_name

The end result may seem like this:

|first_name|last_name|
|----------|---------|
|AARON     |SELBY    |
|ADAM      |GOOCH    |
|ADAM      |GRANT    |
|ADAM      |HOPPER   |
|ADRIAN    |CLARY    |
|AGNES     |BISHOP   |
|AL        |GARLAND  |
|ALAN      |DREYFUSS |
|...       |...      |

Utilizing CORRESPONDING

Now, in HSQLDB, and in commonplace SQL, you need to use CORRESPONDING for this sort of job. For instance:

SELECT *
FROM actor
UNION ALL CORRESPONDING
SELECT *
FROM buyer
UNION ALL CORRESPONDING
SELECT *
FROM employees
ORDER BY first_name, last_name

The result’s this:

|first_name|last_name|last_update            |
|----------|---------|-----------------------|
|AARON     |SELBY    |2006-02-15 04:57:20.000|
|ADAM      |GOOCH    |2006-02-15 04:57:20.000|
|ADAM      |GRANT    |2006-02-15 04:34:33.000|
|ADAM      |HOPPER   |2006-02-15 04:34:33.000|
|ADRIAN    |CLARY    |2006-02-15 04:57:20.000|
|AGNES     |BISHOP   |2006-02-15 04:57:20.000|
|AL        |GARLAND  |2006-02-15 04:34:33.000|
|ALAN      |DREYFUSS |2006-02-15 04:34:33.000|
|...       |...      |...                    |

So, what has occurred? The columns FIRST_NAME, LAST_NAME, and LAST_UPDATE are widespread to all three tables. In different phrases, in the event you run this question in opposition to the INFORMATION_SCHEMA in HSQLDB:

SELECT column_name
FROM information_schema.columns
WHERE table_name="ACTOR"
INTERSECT
SELECT column_name
FROM information_schema.columns
WHERE table_name="CUSTOMER"
INTERSECT
SELECT column_name
FROM information_schema.columns
WHERE table_name="STAFF"

You get precisely these 3 columns:

|COLUMN_NAME|
|-----------|
|FIRST_NAME |
|LAST_NAME  |
|LAST_UPDATE|

In different phrases, CORRESPONDING creates the intersection of columns among the many subqueries of a set operation (i.e. the “shared columns”), tasks these, and applies the set operation that projection. In a method, that is just like a NATURAL JOIN, which additionally tries to search out that intersection of columns to provide a be a part of predicate. Nonetheless, NATURAL JOIN then tasks all the columns (or the union of the columns), not simply the shared ones.

Utilizing CORRESPONDING BY

Similar to NATURAL JOIN, this can be a dangerous operation. As quickly as one subquery adjustments its projection (e.g. due to a desk column rename), the results of all such queries will change as effectively, and it may not even produce a syntax error, only a totally different end result.

In actual fact, within the above instance, we in all probability didn’t even care about that LAST_UPDATE column. It was included within the UNION ALL set operation by chance, identical to NATURAL JOIN would be a part of utilizing LAST_UPDATE by chance.

With joins, we will use JOIN .. USING (first_name, last_name) to at the least specify by which shared column names we need to be a part of the 2 tables. With CORRESPONDING, we will provide the elective BY clause for a similar goal:

SELECT *
FROM actor
UNION ALL CORRESPONDING BY (first_name, last_name)
SELECT *
FROM buyer
UNION ALL CORRESPONDING BY (first_name, last_name)
SELECT *
FROM employees
ORDER BY first_name, last_name;

This now produces solely the 2 desired columns:

|first_name|last_name|
|----------|---------|
|AARON     |SELBY    |
|ADAM      |GOOCH    |
|ADAM      |GRANT    |
|ADAM      |HOPPER   |
|ADRIAN    |CLARY    |
|AGNES     |BISHOP   |
|AL        |GARLAND  |
|ALAN      |DREYFUSS |
|...       |...      |

In actual fact, this manner, we might even use the syntax meaningfully for INTERSECT and EXCEPT, e.g. to search out clients who share their names with an actor:

SELECT *
FROM actor
INTERSECT CORRESPONDING BY (first_name, last_name)
SELECT *
FROM buyer
ORDER BY first_name, last_name;

Producing:

|first_name|last_name|
|----------|---------|
|JENNIFER  |DAVIS    |

Different dialects

I haven’t encountered this syntax many occasions in different dialects earlier than. Maybe, it should ship to PostgreSQL sooner or later. A department has been labored on by Vik Fearing:

jOOQ may quickly assist it within the API / parser / translator:

https://github.com/jOOQ/jOOQ/points/5285



Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest Articles