Thursday, February 22, 2024

A Fast and Soiled Approach to Concatenate Two Vaguely Associated Tables in SQL – Java, SQL and jOOQ.


Every so often I run throughout a use case for the arcane NATURAL JOIN SQL operator, and I’m much more delighted once I could make {that a} NATURAL FULL JOIN. Just a few previous weblog posts on the topic embody:

Just lately, I stumbled upon a query on Reddit:

Is there such a factor that may be part of two fully completely different tables with no relation however acts like a union?

At first I although of the UNION CORRESPONDING syntax, which doesn’t actually exist in most SQL dialects, even when it’s a typical characteristic. However then, I remembered that that is once more an ideal use case for NATURAL FULL JOIN, this time barely otherwise from the above instance the place two tables are in contrast for contents. This time, we wish to be sure the 2 joined tables by no means have matching rows, with a purpose to get the UNION like behaviour.

Contemplate the Sakila database. In that database, we’ve 3 tables containing individuals, together with:

The tables are outlined as follows:

CREATE TABLE actor (
  actor_id integer NOT NULL PRIMARY KEY,
  first_name varchar(45) NOT NULL,
  last_name varchar(45) NOT NULL,
  last_update timestamp NOT NULL
);

CREATE TABLE buyer (
  customer_id integer NOT NULL PRIMARY KEY,
  store_id integer NOT NULL,
  first_name varchar(45) NOT NULL,
  last_name varchar(45) NOT NULL,
  e-mail varchar(50),
  address_id integer NOT NULL,
  lively boolean NOT NULL,
  create_date date NOT NULL,
  last_update timestamp
);

CREATE TABLE employees (
  staff_id integer NOT NULL,
  first_name varchar(45) NOT NULL,
  last_name varchar(45) NOT NULL,
  address_id integer NOT NULL,
  e-mail varchar(50),
  store_id integer NOT NULL,
  lively boolean NOT NULL,
  username varchar(16) NOT NULL,
  password varchar(40),
  last_update timestamp NOT NULL,
  image bytea
);

As will be seen, the frequent columns are actually solely (FIRST_NAME, LAST_NAME, LAST_UPDATE), all the opposite columns are table-specific. Utilizing the next question, we are able to concatenate all the info:

SELECT *
FROM (SELECT 'actor' AS supply, * FROM actor) AS a
NATURAL FULL JOIN (SELECT 'buyer' AS supply, * FROM buyer) AS c
NATURAL FULL JOIN (SELECT 'employees' AS supply, * FROM employees) AS s;

The outcome appears one thing like this:

|supply  |first_name|last_name|last_update            |actor_id|...|customer_id|...|staff_id|...|
|--------|----------|---------|-----------------------|--------|---|-----------|---|--------|---|
|actor   |PENELOPE  |GUINESS  |2006-02-15 04:34:33.000|1       |...|           |   |        |   |
|actor   |NICK      |WAHLBERG |2006-02-15 04:34:33.000|2       |...|           |   |        |   |
|actor   |ED        |CHASE    |2006-02-15 04:34:33.000|3       |...|           |   |        |   |
|buyer|MARY      |SMITH    |2006-02-15 04:57:20.000|        |   |1          |...|        |   |
|buyer|PATRICIA  |JOHNSON  |2006-02-15 04:57:20.000|        |   |2          |...|        |   |
|buyer|LINDA     |WILLIAMS |2006-02-15 04:57:20.000|        |   |3          |...|        |   |
|employees   |Mike      |Hillyer  |2006-02-15 04:57:16.000|        |   |           |   |1       |...|
|employees   |Jon       |Stephens |2006-02-15 04:57:16.000|        |   |           |   |2       |...|

Some observations:

  • The matched columns (i.e. columns by the identical title) of the NATURAL JOIN are firstly. They embody the artificial SOURCE column, which is completely different for every be part of supply, so we by no means have a match, which was desired. We wish UNION semantics (i.e. concatenate the three tables), not match them.
  • The columns which can be distinctive to every tables are listed afterwards. They comprise knowledge provided that they belong to the related SOURCE

This method is clearly not for day by day, however it may be sometimes helpful. So don’t underestimate the powers of NATURAL FULL JOIN

Caveats

I cheated just a little bit. The precise Sakila database schema has a battle between CUSTOMER and STAFF tables:

CREATE TABLE buyer (
  customer_id integer NOT NULL PRIMARY KEY,
  store_id integer NOT NULL,
  first_name varchar(45) NOT NULL,
  last_name varchar(45) NOT NULL,
  e-mail varchar(50),
  address_id integer NOT NULL,
  activebool boolean NOT NULL,
  create_date date NOT NULL,
  last_update timestamp
  lively integer NOT NULL, -- That is an integer
);

With this desk definition, I acquired this error for my question:

SQL Error [42804]: ERROR: JOIN/USING sorts integer and boolean can’t be matched

So, to repair this, I patched the CUSTOMER desk definition:

-- Patch
WITH buyer AS (
  SELECT 
    customer_id, 
    store_id, 
    first_name, 
    last_name, 
    e-mail, 
    address_id, 
    activebool as lively, 
    create_date, 
    last_update 
  FROM buyer
)

-- Unique question
SELECT *
FROM (SELECT 'actor' AS supply, * FROM actor) AS a
NATURAL FULL JOIN (SELECT 'buyer' AS supply, * FROM buyer) AS c
NATURAL FULL JOIN (SELECT 'employees' AS supply, * FROM employees) AS s;

Wishing that BigQuery’s helpful * REPLACE (...) syntax was extra extensively accessible.

Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest Articles