What does OPTIONAL (NULL) Mean?
Embedded SQL and OPTIONAL Fields
I/O
statements and HOST Variables
Using OPTIONAL Fields in Language
Context
With other record types (VSAM etc) OPTIONAL/REQUIRED is mainly used to give default editing (ACCEPT) rules. With SQL OPTIONAL/REQUIRED also affects the way the data is saved, retrieved, and updated.
Property equivalence: -
SQL Property |
Jazz Property |
Allow NULL value (nullable) |
OPTIONAL |
NOT NULL (Don’t allow NULL) |
REQUIRED |
Fields are nullable (OPTIONAL) by
default. A Jazz definition may swap the
default by adding the REQUIRED property to the DEFINE:
-
DEFINE EMPLOYEE SQL PREFIX EMP REQUIRED DATA(
CREATE
TABLE EMPLOYEE
(EMPNO
CHAR(6) NOT NULL,
FIRSTNME VARCHAR(12) NOT NULL,
…
BIRTHDATE DATE,
SALARY
DECIMAL(9,2) /*MONEY*/,
..
PRIMARY KEY(EMPNO))
A Jazz definition of EMPLOYEE *1 was created by importing this table from SQL *2, and then editing it.
*1 In
this Help section, EMPLOYEE is a table in IBM’s Sample Database, and EMPLOYUP
is a copy of this table that was created for testing.
*2 To import a table from SQL, from the Workbench, click [JazzGen], choose function Data/Import from SQL, and select
EMPLOYEE from the database
It looks like this. Edits are highlighted: DECIMAL was changed to MONEY, and the highlighted properties added to make the definition more powerful: -
DEFINE EMPLOYEE SQL PREFIX EMP DATA(
EMPNO CHAR(6) REQUIRED KEY,
FIRSTNME VARCHAR(12) REQUIRED,
…
BIRTHDATE DATE DPIC 'dd mmm yyyy',
SALARY MONEY(9,2),
Thus REQUIRED is the same as
SQL’s NOT NULL, while OPTIONAL is the equivalent of fields defined without NOT
NULL or explicitly with NULL. You can
swap the default by specifying REQUIRED with
the DEFINE
statement: -
DEFINE
EMPLOYEE SQL PREFIX EMP REQUIRED DATA(
in which case all fields without OPTIONAL
are REQUIRED.
When you read an EMPLOYEE record OPTIONAL
fields like SALARY may be absent. If so
their value is undefined, i.e. NULL.
Null is neither zero nor not zero: it is simply not possible to compare
a NULL value to zero or any other number.
Thus the SQL statement
>DB2
SELECT EMPNO, FIRSTNME, LASTNAME FROM EMPLOYEE
WHERE SALARY = 0
won’t return any rows with NULL salary.
But neither will
>DB2
SELECT EMPNO, FIRSTNME, LASTNAME FROM EMPLOYEE
WHERE SALARY != 0
If Salary is NULL it is neither equal to zero, nor is it not equal to zero. If you want to retrieve records with NULL SALARY, you write
>DB2
SELECT EMPNO, FIRSTNME, LASTNAME FROM EMPLOYEE
WHERE SALARY IS NULL
Note that you write “IS NULL”, not “= NULL”. SQL says NULL is not a value, so it makes no sense to ask whether anything is equal-to or not-equal-to NULL.
In our opinion this distinction is
subtle and confusing. In Jazz we’ve tried to make it simple, ensuring that
there is always a defined value for NULL fields, and providing a build-in
function $Null
for the NULL value, which you use as a normal function. Thus in Jazz you can write “WHERE SALARY = $Null” and “WHERE SALARY <> $Null”
as you would in a normal comparison. In
PROCESS
employee WHERE
EMPLOYEE.SALARY =
JAZZ.$Null …
Jazz has qualified
the names, and sorted out the distinction between = and IS.
We also use $Null to explicitly set
a field to NULL with an assignment statement: -
EMPLOYEE.SALARY
= JAZZ.$Null;
We can also refer to EMPLOYEE.SALARY
in normal statements having nothing to do with SQL: -
IF EMPLOYEE.SALARY >
0;
But here language rules apply! Unlike SQL neither Jazz nor COBOL has a third option for “undecided”: this condition is true or false. You can test for NULL with
IF EMPLOYEE.SALARY = JAZZ.$Null;
but whether this is true or not, the test EMPLOYEE.SALARY >
0 has a meaning even with a NULL value.
Jazz, like COBOL, PL/I, Java and other programming languages that can work with SQL databases process the databases using “Embedded SQL”, i.e. SQL statements embedded into the Jazz/COBOL program. Compare this with the SQL statements above: -
002050 EXEC SQL
002060 SELECT EMPNO, LASTNAME INTO
:EUP-EMPNO, :EUP-LASTNAME
002070 FROM EMPLOYUP WHERE (EMPNO =
:EUP-EMPNO)
002080 END-EXEC.
Apart from the surrounding EXEC SQL and END-EXEC to separate it from the normal COBOL of the rest of the program, the difference is that embedded SQL bridges the world of SQL and the host language, referring both to SQL columns (EMPNO, LASTNAME) and host variables, i.e. fields defined in COBOL’s Data division (EUP-EMPNO, EUP-LASTNAME). A leading colon indicates a host variable.
Sometimes language rules apply, sometimes SQL rules apply. You need to understand the difference, and when SQL rules are applied.
Most of the time Jazz is working in
the language (Jazz/COBOL) context. To understand what this means, consider a
Jazz definition like this: -
DEFINE EMPLOYUP SQL PREFIX EMP DATA(
EMPNO CHAR(6) REQUIRED KEY,
MIDINIT CHAR(1),
SALARY MONEY(9,2));
This corresponds to a COBOL
definition like this: -
000180
01 EMPLOYUP. EMPLOYEE
000190 03 EMP-EMPNO PIC X(6) VALUE SPACES. EMPLOYEE
000200 03 EMP-MIDINIT PIC X VALUE SPACES. EMPLOYEE
000210 03 JZB-EMP-MIDINIT PIC S9(4) COMP VALUE
-1. EMPLOYEE
000220 03 EMP-SALARY PIC S9(7)V9(2) COMP-3 VALUE
ZERO. EMPLOYEE
000230 03
JZB-EMP-SALARY PIC S9(4) COMP VALUE -1. EMPLOYEE
REQUIRED
fields like EMPNO are just like any other field, but the OPTIONAL
fields are associated with a status field indicating whether the field has a
NULL value.
03
EMP-SALARY PIC S9(7)V9(2) COMP-3
holds the numeric value of EMPLOYUP.SALARY. An indicator field follows: -
03
JZB-EMP-SALARY PIC S9(4) COMP
For SQL
data this will have value -1 if the field is NULL.
Thus
every OPTIONAL
field within a SQL record has two fields, a value field and an indicator
field. There is always a value in the
value field even when the indicator field shows that the field is NULL. As shown above Jazz generates the indicator
field after the value field, but this is not a rule imposed by COBOL or SQL,
only a Jazz convention.
An aside:
the same concept is used in web service messages and also 3270 screen data,
which can also omit absent fields. The
format and NULL/NOT NULL values are different in each format. In other definition types like VSAM and WORK the
Jazz property OPTIONAL/REQUIRED can
influence editing rules but there is no indicator field and the field is always
actually present in the record.
In normal Jazz statements the
program refers to the value fields, so that
EMPLOYUP.SALARY
= EMPLOYEE.SALARY
* 1.05;
is equivalent to the COBOL statement
003830
COMPUTE EMP-SALARY OF EMPLOYUP = EMP-SALARY OF EMPLOYUP *
SQLNull
003840 1.05.
Here normal
language rules apply, where there is no concept of NULL. This is a normal multiplication of whatever
happens to be in the field EMP-SALARY OF EMPLOYUP. You
don’t have to check whether EMP-SALARY is NULL to prevent program failure.
In SQL different rules apply. As in COBOL you can use expressions to
calculate data but the rules, particularly around the treatment of NULL values,
is different. EMPLOYUP has been updated
to add employees 000015 and 000025, both of which have NULL salaries, and then
we’ve used the command window to execute the query
>DB2
SELECT EMPNO, LASTNAME, SALARY*1.05 FROM EMPLOYUP
Here are the results
EMPNO LASTNAME 3
------
--------------- --------------
200330
WONG 37138.5000
200340
ALONZO 33432.0000
000015 -
000025 -
You might
think that, having no salary information the result would be zero but it’s not. Instead DB2 prints “-” to indicate a NULL
value. LASTNAME however is not “-” but
spaces, because LASTNAME was defined as NOT NULL and so always has a value.
NULL is a difficult
concept, often introducing complexity and errors into programs, and no less a
person than C.J.Date has lamented its invention in SQL. When SQL is embedded in a program there are
many opportunities to make mistakes that you need to guard against. Jazz does its best to insulate you from the
worst problems, but you still need to understand how NULL is handled, and the
difference between Language (Jazz/COBOL) Context, and SQL Context.
Pure Jazz
statements like assignments and IF statements are Language Context. Pure SQL
statements like the query above typed into the command window are SQL Context. Embedded SQL statements are SQL, but with
host variable references.
This
section explains in detail the way in which Jazz handles NULL in various
situations. In summary, Jazz ensures
that the COBOL Value field has its initial value (zero or spaces) whenever the
field is NULL. In the absence of any VALUE property OPTIONAL
fields are initialised to NULL with zero/space values.
Here is a code fragment to add $100
to Salary: -
GET EMPLOYUP WHERE (EMPLOYUP.EMPNO = W.EMPNO) UPDATE;
EMPLOYUP.SALARY += 100;
END GET Employup UPDATE;
The GET statement generates
002020* GET EMPLOYUP WHERE (EMPLOYUP.EMPNO =
W.EMPNO) UPDATE; SQLGet2
002030 MOVE 'Y' TO EMPLOYUP-Get4Update-FLAG. SQLGet2
002040 EXEC SQL SQLGet2
002050 SELECT EMPNO, SALARY INTO
:EUP-EMPNO, SQLGet2
002060 :EUP-SALARY:JZB-EUP-SALARY FROM
EMPLOYUP WHERE (EMPNOSQLGet2
002070 = :EUP-EMPNO) SQLGet2
002080
END-EXEC.
SQLGet2
INTO names both host variables for
an optional field like SALARY: -
002050 SELECT EMPNO, SALARY INTO :EUP-EMPNO, SQLGet2
002060 :EUP-SALARY:JZB-EUP-SALARY FROM EMPLOYUP WHERE (EMPNOSQLGet2
002070
= :EUP-EMPNO) SQLGet2
The same principle is followed for
other SQL statements, when a table is read with PROCESS, or a record is updated,
or added. Whenever an OPTIONAL
field is referenced the corresponding host variable reference will name both
the value and indicator fields.
Remember that the COBOL table
definition looked like this: -
000180
01 EMPLOYUP. EMPLOYEE
000190 03 EMP-EMPNO PIC X(6) VALUE SPACES. EMPLOYEE
000200 03 EMP-MIDINIT PIC X VALUE SPACES. EMPLOYEE
000210 03 JZB-EMP-MIDINIT PIC S9(4) COMP VALUE
-1. EMPLOYEE
000220 03 EMP-SALARY PIC S9(7)V9(2) COMP-3 VALUE ZERO. EMPLOYEE
000230 03 JZB-EMP-SALARY PIC S9(4)
COMP VALUE -1. EMPLOYEE
Thus if SALARY
for this row was NULL then JZB-EMP-SALARY will have a negative value, while if
SALARY is not null JZB-EMP-SALARY will be 0 and EMP-SALARY will contain its
value (which might be 0).
Although SQL doesn’t define the value that will
end up in EMP-SALARY with NULL values, Jazz does. Jazz
will generate COBOL that ensures that EMP-SALARY is set to the default value
(zero) if a NULL value is read by
SELECT
SALARY INTO :EMP-SALARY.
Outside SQL
our programming language (Jazz, COBOL, PL/I, or Java) has no particular concept
of NULL, merely an indicator field that is a SMALLINT field like any other,
with meaning that if it is negative the related value field will be interpreted
by SQL as NULL. Jazz tries to look after
this relationship for you so you rarely have to think about it. It sets the indicator field
automatically and ensures that the value field always has a defined value (zero
or space) so that you can PRINT and assign it to normal fields whether it is
NULL or not.
You can set
an optional field to NULL with EMPLOYEE.SALARY =
JAZZ.$Null;
and test whether or not
it is NULL with =
and <>: -
IF EMPLOYEE.SALARY
= JAZZ.$Null;
Otherwise
you can use optional fields as normal.
IF EMPLOYEE.SALARY
= 0;
is true for both
NULL values, and perhaps some NOT NULL values.
If you do
need to understand what’s going on here are the rules in more detail.
You can
explicitly set an OPTIONAL field to NULL with $Null
and $Init.
EMPLOYUP.SALARY =
JAZZ.$Null;
(which I wrote as Salary =
$Null;) sets the null indicator to show that the field is NULL, and sets
its value to 0 so that if this Null value is used in a PRINT or other statement it will behave reasonably.
If you set EMPLOYUP.SALARY from a constant, whether zero or any other value, it becomes a real (NOT NULL) value.
If
you set Salary from another OPTIONAL SQL field it is
set to NULL or NOT NULL depending the source field. Thus
EMPLOYUP.SALARY =
EMPLOYUP.BONUS;
will be NULL or NOT NULL depending on Bonus. With a more complex expression, the target will
be set NULL if ANY of the fields in the expression are NULL. Thus with
EMPLOYUP.SALARY
= EMPLOYUP.BONUS
+ EMPLOYUP.COMM;
EMPLOYUP.SALARY will
be set NULL if EMPLOYUP.BONUS is NULL, or if EMPLOYUP.COMM is NULL, unless you’ve added NOTNULL to the assignment statement: -
EMPLOYUP.SALARY
= EMPLOYUP.BONUS
+ EMPLOYUP.COMM
NOTNULL;
Now the result is NOT NULL whatever the NULL status of the operands.
Note that the operators +=, -=, *= & /= are shorthand for expressions including the
target. Thus
EMPLOYUP.SALARY +=
100;
is equivalent to
EMPLOYUP.SALARY
= EMPLOYUP.SALARY
+ 100;
Thus if EMPLOYUP.SALARY is NULL, it will still be NULL even though its value will
now be 100 greater. NOTNULL might be convenient to avoid clunky code like this: -
IF EMPLOYUP.SALARY
= JAZZ.$Null;
EMPLOYUP.SALARY
= 0 [set indicator to NOT NULL;
END IF;
EMPLOYUP.SALARY
+= 100;
With NOTNULL this single
statement does the same job more efficiently: -
EMPLOYUP.SALARY +=
100 NOTNULL;
In IF, FOR, CASE etc a
SQL field is like any other: NULL fields have values which can be tested like
any other. You can also test for value
$Null, for example
IF
EMPLOYEE.SALARY =
JAZZ.$Null;
(which you’ll have written as IF Salary = $Null).
Indicator fields are not passed to the parameters of a ROUTINE or SUBPROGRAM, only the value field.