SQL* Loader Express Mode

We have used SQL*Loader and External Tables to load the data from flat files into database tables, however for SQL*Loader we need to write a control file
using which data is loaded and for External tables we need to write a script and syntax for the same is not quite straight forward. With Oracle 12c, there
is new mode of SQL*Loader called as Express Mode.

Using Express Mode is quite straight forward with a simple command line interface, and this can quite useful when you have to load data from plain CSV files.

The main benefit of SQL*Loader Express Mode is savings of time and effort that are saved from not needing to write a control file. Instead we just specify
a SQL*Loader command with few required parameters and the load starts.

Another benefit of SQL*Loader express mode is that it will try use the fastest mechanism for loading data files
1. External tables using parallel inserts with append hint
2. If External tables can not be used for a particular load operation, then it will use direct path load in SQL*Loader

In Oracle 12c we can turn on the express mode of SQL*Loader by speifying TABLE parameter in the command. The value of the TABLE parameter is name of the
table in which data will be populated. If we only specify the TABLE parameter then SQL*Loader will do the following:
1. It will look for a file with same name as TABLE value ending with “.dat” in the current directory.
2. It also assumes the order of values in data files matches as that of order of columns in table.
3. The fields are terminated by commas.

For our example, we create the following table which will store information about routes:

CDB$ROOT@ORCL> create table route(
routeid number(2)
, routename varchar2(30)
, startpt varchar2(30)
, endpt varchar2(30)
);  2    3    4    5    6  

Table created.

Now we have a CSV file “route.dat” with following contents:

10, Route1, Huda City Centre, Badshahpur
20, Route2, Bus Stand, Iffco Chowk
30, Route3, Bus Stand, Subhash Chowk

Now to load it its contents into our table we just simply issue the following command

[oracle@localhost jagdeep]$ sqlldr c##jagdeep/jagdeep#1234@orcl TABLE=route

SQL*Loader: Release 12.1.0.1.0 - Production on Mon Aug 29 02:43:02 2016

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

Express Mode Load, Table: ROUTE
Path used:      External Table, DEGREE_OF_PARALLELISM=AUTO

Table ROUTE:
  3 Rows successfully loaded.

Check the log files:
  route.log
  route_%p.log_xt
for more information about the load.

As from the above, we can look into route.log for information

SQL*Loader: Release 12.1.0.1.0 - Production on Mon Aug 29 02:43:02 2016

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

Express Mode Load, Table: ROUTE
Data File:      route.dat
  Bad File:     route_%p.bad
  Discard File:  none specified

 (Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation:    none specified
Path used:      External Table

Table ROUTE, loaded from every logical record.
Insert option in effect for this table: APPEND

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ROUTEID                             FIRST     *   ,       CHARACTER
ROUTENAME                            NEXT     *   ,       CHARACTER
STARTPT                              NEXT     *   ,       CHARACTER
ENDPT                                NEXT     *   ,       CHARACTER

Generated control file for possible reuse:
OPTIONS(EXTERNAL_TABLE=EXECUTE, TRIM=LRTRIM)
LOAD DATA
INFILE 'route'
APPEND
INTO TABLE ROUTE
FIELDS TERMINATED BY ","
(
  ROUTEID,
  ROUTENAME,
  STARTPT,
  ENDPT
)
End of generated control file for possible reuse.

created temporary directory object SYS_SQLLDR_XT_TMPDIR_00000 for path /u01/oracle/jagdeep

enable parallel DML: ALTER SESSION ENABLE PARALLEL DML

creating external table "SYS_SQLLDR_X_EXT_ROUTE"

CREATE TABLE "SYS_SQLLDR_X_EXT_ROUTE"
(
  "ROUTEID" NUMBER(2),
  "ROUTENAME" VARCHAR2(30),
  "STARTPT" VARCHAR2(30),
  "ENDPT" VARCHAR2(30)
)
ORGANIZATION external
(
  TYPE oracle_loader
  DEFAULT DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
    BADFILE 'SYS_SQLLDR_XT_TMPDIR_00000':'route_%p.bad'
    LOGFILE 'route_%p.log_xt'
    READSIZE 1048576
    FIELDS TERMINATED BY "," LRTRIM
    REJECT ROWS WITH ALL NULL FIELDS
    (
      "ROUTEID" CHAR(255),
      "ROUTENAME" CHAR(255),
      "STARTPT" CHAR(255),
      "ENDPT" CHAR(255)
    )
  )
  location
  (
    'route.dat'
  )
)REJECT LIMIT UNLIMITED

executing INSERT statement to load database table ROUTE

INSERT /*+ append parallel(auto) */ INTO ROUTE
(
  ROUTEID,
  ROUTENAME,
  STARTPT,
  ENDPT
)
SELECT
  "ROUTEID",
  "ROUTENAME",
  "STARTPT",
  "ENDPT"
FROM "SYS_SQLLDR_X_EXT_ROUTE"

dropping external table "SYS_SQLLDR_X_EXT_ROUTE"

Table ROUTE:
  3 Rows successfully loaded.

Run began on Mon Aug 29 02:43:02 2016
Run ended on Mon Aug 29 02:43:13 2016

Elapsed time was:     00:00:10.13
CPU time was:         00:00:00.03

So now our table is populated with these three records from the CSV file. We can confirm the same as following:

CDB$ROOT@ORCL> select * from route;

   ROUTEID ROUTENAME                      STARTPT                        ENDPT                        
---------- ------------------------------ ------------------------------ ------------------------------
        10 Route1                         Huda City Centre               Badshahpur                     
        20 Route2                         Bus Stand                      Iffco Chowk                    
        30 Route3                         Bus Stand                      Subhash Chowk                  

Changing Field Terminator:
We can also change the field terminator from the default “,” to any other value, for example lets change the terminator to “:” as following in the file

40: Route1: Huda City Centre: Badshahpur
50: Route2: Bus Stand: Iffco Chowk

And now run our load with the following command:

[oracle@localhost jagdeep]$ sqlldr c##jagdeep/jagdeep#1234@orcl TABLE=route terminated_by=':'

SQL*Loader: Release 12.1.0.1.0 - Production on Mon Aug 29 03:00:22 2016

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

Express Mode Load, Table: ROUTE
Path used:      External Table, DEGREE_OF_PARALLELISM=AUTO

Table ROUTE:
  2 Rows successfully loaded.

Check the log files:
  route.log
  route_%p.log_xt
for more information about the load.

And check from the table about these values

CDB$ROOT@ORCL> select * from route;

   ROUTEID ROUTENAME                      STARTPT                        ENDPT                        
---------- ------------------------------ ------------------------------ ------------------------------
        10 Route1                         Huda City Centre               Badshahpur                     
        20 Route2                         Bus Stand                      Iffco Chowk                    
        30 Route3                         Bus Stand                      Subhash Chowk                  
        40 Route1                         Huda City Centre               Badshahpur                     
        50 Route2                         Bus Stand                      Iffco Chowk                    

Loading from multiple files:
We can also load data from multiple files in single command. For example we having following two files “GgnCity.dat” and “DelCity.dat”:

"GgnCity.dat"
10, Route1, Huda City Centre, Badshahpur
20, Route2, Bus Stand, Iffco Chowk
30, Route3, Bus Stand, Subhash Chowk
"DelCity.dat"
40, Route1, Rajiv Chowk, ISBT
50, Route2, Dhaula Kuan, Rohini

And we specify the following command for loading data

[oracle@localhost jagdeep]$ sqlldr c##jagdeep/jagdeep#1234@orcl TABLE=route data=GgnCity.dat,DelCity.dat

SQL*Loader: Release 12.1.0.1.0 - Production on Mon Aug 29 03:10:53 2016

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

Express Mode Load, Table: ROUTE
Path used:      External Table, DEGREE_OF_PARALLELISM=AUTO

Table ROUTE:
  5 Rows successfully loaded.

Check the log files:
  route.log
  route_%p.log_xt
for more information about the load.

And check from the database for our current values:

CDB$ROOT@ORCL> select * from route;

   ROUTEID ROUTENAME                      STARTPT                        ENDPT                        
---------- ------------------------------ ------------------------------ ------------------------------
        10 Route1                         Huda City Centre               Badshahpur                     
        20 Route2                         Bus Stand                      Iffco Chowk                    
        30 Route3                         Bus Stand                      Subhash Chowk                  
        40 Route1                         Rajiv Chowk                    ISBT                           
        50 Route2                         Dhaula Kuan                    Rohini                         

We can also use wildcard characters while specifying the name of the files as following

[oracle@localhost jagdeep]$ sqlldr c##jagdeep/jagdeep#1234@orcl TABLE=route data=*City.dat

Specifying the fields names in the data file:

We can also specify the field names in the CSV file itself and they need not to be in the same order as of the table fields.

startpt, endpt, routeid, routename
Huda City Centre, Badshahpur, 10, Route1
Bus Stand, Iffco Chowk, 20, Route2
Bus Stand, Subhash Chowk, 30, Route3

And use the parameter FIELD_NAMES as following to load the data:

[oracle@localhost jagdeep]$ sqlldr c##jagdeep/jagdeep#1234@orcl TABLE=route FIELD_NAMES=first

SQL*Loader: Release 12.1.0.1.0 - Production on Mon Aug 29 04:02:38 2016

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

Express Mode Load, Table: ROUTE
Path used:      External Table, DEGREE_OF_PARALLELISM=AUTO

Table ROUTE:
  3 Rows successfully loaded.

Check the log files:
  route.log
  route_%p.log_xt
for more information about the load.

Summary
SQL*Loader Express Mode greatly simpifies the work needed to load simple CSV data files. Express mode tries to use the fastest mechanism
to load the data and provides additional parameters which can be used to customized our load.

Hope this helps:
Cheers!!!
Jagdeep Sangwan

 

to_date explicit conversion

Few days ago one of colleague was facing some issue while using months_between function while finding
months between two dates and he asked me why Oracle is giving wrong result.

So below just a set up simulating his problem:

drop table t1 purge;
create table t1(
d1	date
, d2	date
);

insert into t1
select
  to_date('01/09/2014','dd/mm/rrrr') d1
  , to_date('28/08/1948','dd/mm/rrrr') d2
 from dual
union all
select
  to_date('01/09/2014','dd/mm/rrrr') d1
  , to_date('28/08/1984','dd/mm/rrrr') d2
from dual
;

select 
	floor(
		months_between(
			to_date(d1,'dd/mm/rrrr')
			,to_date(d2,'dd/mm/rrrr')
		)/12
	) totyear 
from t1
; 

Output of the above script is:

Table dropped.


Table created.


2 rows created.


   TOTYEAR
----------
       -34
	30

Now from the above we can see that for the first row result is not correct, so what went wrong. After looking further into the query and table definition, we can see we are using to_date function on the column which is already defined as date data type, and that was the main cause of returning wrong results.
Without using this explicit conversion we can get the desired results:

select
        floor(
                months_between(d1, d2)
        )/12 totyear
from
        t1
;

Output

   TOTYEAR
----------
	66
	30

So from now on look carefully datatype while using explicit conversions.

HTH
Cheers!!!
Jagdeep Sangwan

Truncate and Delete

Today while i was browsing the web for difference between Truncate and Delete operations in SQL Server, i learnt a new thing. So here below are the basic differences between Truncate and Delete Operations:
1. Truncate operation can not be rolled back but delete can be
2. Truncate operation holds to lock at table level however delete holds the lock at row level
3. Before/After triggers are not fired for the Truncate operation however for delete they are
4. Truncate operation reset the value of Identity Column

I knew about first three differences but i have not heard about the forth point as i mainly have worked on Oracle databases till now and there was no Identity columns in Oracle prior to version 12c came. However SQL Server has been having identity column since Version 2000. So i thought to give a test for forth point in SQL Server as well as Oracle and see if it true for Oracle also.

So here below is the demo to create a table and insert rows in and check. First SQL Server

drop table t1;

create table t1(
i1 int identity(1,1)
v1 varchar(10)
)

insert into t1 values ('a');
insert into t1 values ('b');
insert into t1 values ('c');
insert into t1 values ('d');
insert into t1 values ('e');

And below is the output from SSMS:-

i1	v1
1	a
2	b
3	c
4	d
5	e

Now lets delete all the rows from this table first and then re-insert these rows and see the data in the table

delete from t1;
(5 row(s) affected)
insert into t1 values ('a');
insert into t1 values ('b');
insert into t1 values ('c');
insert into t1 values ('d');
insert into t1 values ('e');
(1 row(s) affected)

(1 row(s) affected)

(1 row(s) affected)

(1 row(s) affected)

(1 row(s) affected)
select * from t1

Output

i1	v1
6	a
7	b
8	c
9	d
10	e

So here we can see that our identity column has picked the next values for insertion. Now lets truncate table

truncate table t1

Output

Command(s) completed successfully.

Insert the rows again

insert into t1 values ('a');
insert into t1 values ('b');
insert into t1 values ('c');
insert into t1 values ('d');
insert into t1 values ('e');

Output

(1 row(s) affected)

(1 row(s) affected)

(1 row(s) affected)

(1 row(s) affected)

(1 row(s) affected)

Now check the data in table

select * from t1

Output

i1	v1
1	a
2	b
3	c
4	d
5	e

So from this we can see that in SQL Server IDENTITY column values are reset after a truncate operation on the table.

Now lets check whether the same is true in Oracle also. I am testing this on 12c as IDENTITY columns feature has been added since 12c only. Following is my version details:

BANNER										     CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production		  0
PL/SQL Release 12.1.0.1.0 - Production							  0
CORE	12.1.0.1.0	Production								  0
TNS for Linux: Version 12.1.0.1.0 - Production						  0
NLSRTL Version 12.1.0.1.0 - Production							  0

Now below is the test in Oracle

drop table t1 purge;

create table t1(
i1 number generated always as identity
, v1 varchar2(10)
);

insert into t1(v1) values ('aa');
insert into t1(v1) values ('bb');
insert into t1(v1) values ('cc');
insert into t1(v1) values ('dd');
insert into t1(v1) values ('ee');

prompt Before Delete
select * from t1;

Output

Table dropped.


Table created.


1 row created.


1 row created.


1 row created.


1 row created.


1 row created.

Before Delete

	I1 V1
---------- ----------
	 1 aa
	 2 bb
	 3 cc
	 4 dd
	 5 ee

Now lets delete the rows from table

delete from t1;

commit;

prompt After Delete
select * from t1;

Output

5 rows deleted.


Commit complete.

After Delete

no rows selected

Now insert rows again and check

insert into t1(v1) values ('aa');
insert into t1(v1) values ('bb');
insert into t1(v1) values ('cc');
insert into t1(v1) values ('dd');
insert into t1(v1) values ('ee');

prompt Inserted after deletion
select * from t1;

Output

1 row created.


1 row created.


1 row created.


1 row created.


1 row created.

Inserted after deletion

	I1 V1
---------- ----------
	 6 aa
	 7 bb
	 8 cc
	 9 dd
	10 ee

Now lets truncate the table

truncate table t1;

prompt After truncate
select * from t1;

Output

Table truncated.

After truncate

no rows selected

Now lets insert the rows again and check

insert into t1(v1) values ('aa');
insert into t1(v1) values ('bb');
insert into t1(v1) values ('cc');
insert into t1(v1) values ('dd');
insert into t1(v1) values ('ee');

prompt Inserted after truncate
select * from t1;

Output

1 row created.


1 row created.


1 row created.


1 row created.


1 row created.

Inserted after truncate

	I1 V1
---------- ----------
	11 aa
	12 bb
	13 cc
	14 dd
	15 ee

So here we see that Oracle has not reset the IDENTITY column values.

Summary:
This is a major difference in Truncate and Delete operation in SQL Server and Oracle. So the below point is true only for SQL Server and not for Oracle.
4. Truncate operation reset the value of Identity Column

HTH
Cheers!!!
Jagdeep Sangwan

Rank dense_rank

In this article i will looking into rank and dense_rank analytic functions. I will be using the same setup as last article except that i will not be inserting any extra values having sal as null😉

drop table emp purge;

create table emp 
(empno number(4) not null, 
 ename varchar2(10), 
 job varchar2(9), 
 mgr number(4), 
 hiredate date, 
 sal number(7, 2), 
 comm number(7, 2), 
 deptno number(2) 
); 

insert into emp values (7369, 'smith', 'clerk', 7902, 
to_date('17-dec-1980', 'dd-mon-yyyy'), 800, null, 20); 
insert into emp values (7499, 'allen', 'salesman', 7698, 
to_date('20-feb-1981', 'dd-mon-yyyy'), 1600, 300, 30); 
insert into emp values (7521, 'ward', 'salesman', 7698, 
to_date('22-feb-1981', 'dd-mon-yyyy'), 1250, 500, 30); 
insert into emp values (7566, 'jones', 'manager', 7839, 
to_date('2-apr-1981', 'dd-mon-yyyy'), 2975, null, 20); 
insert into emp values (7654, 'martin', 'salesman', 7698, 
to_date('28-sep-1981', 'dd-mon-yyyy'), 1250, 1400, 30); 
insert into emp values (7698, 'blake', 'manager', 7839, 
to_date('1-may-1981', 'dd-mon-yyyy'), 2850, null, 30); 
insert into emp values (7782, 'clark', 'manager', 7839, 
to_date('9-jun-1981', 'dd-mon-yyyy'), 2450, null, 10); 
insert into emp values (7788, 'scott', 'analyst', 7566, 
to_date('09-dec-1982', 'dd-mon-yyyy'), 3000, null, 20); 
insert into emp values (7839, 'king', 'president', null, 
to_date('17-nov-1981', 'dd-mon-yyyy'), 5000, null, 10); 
insert into emp values (7844, 'turner', 'salesman', 7698, 
to_date('8-sep-1981', 'dd-mon-yyyy'), 1500, 0, 30); 
insert into emp values (7876, 'adams', 'clerk', 7788, 
to_date('12-jan-1983', 'dd-mon-yyyy'), 1100, null, 20); 
insert into emp values (7900, 'james', 'clerk', 7698, 
to_date('3-dec-1981', 'dd-mon-yyyy'), 950, null, 30); 
insert into emp values (7902, 'ford', 'analyst', 7566, 
to_date('3-dec-1981', 'dd-mon-yyyy'), 3000, null, 20); 
insert into emp values (7934, 'miller', 'clerk', 7782, 
to_date('23-jan-1982', 'dd-mon-yyyy'), 1300, null, 10);

commit;

Now that we have our set up up first lets try the rank function

select empno
	, ename
	, deptno
	, sal
	, rank() over (partition by deptno order by sal) "rank"
from   emp
where
	deptno  = 30
;

     EMPNO ENAME          DEPTNO        SAL       rank
---------- ---------- ---------- ---------- ----------
      7900 JAMES              30        950          1
      7521 WARD               30       1250          2
      7654 MARTIN             30       1250          2
      7844 TURNER             30       1500          4
      7499 ALLEN              30       1600          5
      7698 BLAKE              30       2850          6

Now here in this output we can see that WARD and MARTIN are having the same sal and hence given the rank 2 and the next employee TURNER has been given the rank 4. Now we also have the option to give the TURNER a rank as 3 as using following

select empno
	, ename
	, deptno
	, sal
	, dense_rank() over (partition by deptno order by sal) "rank"
from   emp
where
	deptno  = 30
;

     EMPNO ENAME          DEPTNO        SAL       rank
---------- ---------- ---------- ---------- ----------
      7900 JAMES              30        950          1
      7521 WARD               30       1250          2
      7654 MARTIN             30       1250          2
      7844 TURNER             30       1500          3
      7499 ALLEN              30       1600          4
      7698 BLAKE              30       2850          5

Cheers!!!
Jagdeep Sangwan

first_value last_value

In this post i will be looking at two analytic functions first_value and last_value which are useful for finding smallest value and highest value in a partitioned window in Oracle Analytic functions.

First lets create the setup, this is same emp table as in SCOTT schema found i have added three more rows having the salary null in deptno 10

drop table emp purge;

create table emp 
(empno number(4) not null, 
 ename varchar2(10), 
 job varchar2(9), 
 mgr number(4), 
 hiredate date, 
 sal number(7, 2), 
 comm number(7, 2), 
 deptno number(2) 
); 

insert into emp values (7369, 'smith', 'clerk', 7902, 
to_date('17-dec-1980', 'dd-mon-yyyy'), 800, null, 20); 
insert into emp values (7499, 'allen', 'salesman', 7698, 
to_date('20-feb-1981', 'dd-mon-yyyy'), 1600, 300, 30); 
insert into emp values (7521, 'ward', 'salesman', 7698, 
to_date('22-feb-1981', 'dd-mon-yyyy'), 1250, 500, 30); 
insert into emp values (7566, 'jones', 'manager', 7839, 
to_date('2-apr-1981', 'dd-mon-yyyy'), 2975, null, 20); 
insert into emp values (7654, 'martin', 'salesman', 7698, 
to_date('28-sep-1981', 'dd-mon-yyyy'), 1250, 1400, 30); 
insert into emp values (7698, 'blake', 'manager', 7839, 
to_date('1-may-1981', 'dd-mon-yyyy'), 2850, null, 30); 
insert into emp values (7782, 'clark', 'manager', 7839, 
to_date('9-jun-1981', 'dd-mon-yyyy'), 2450, null, 10); 
insert into emp values (7788, 'scott', 'analyst', 7566, 
to_date('09-dec-1982', 'dd-mon-yyyy'), 3000, null, 20); 
insert into emp values (7839, 'king', 'president', null, 
to_date('17-nov-1981', 'dd-mon-yyyy'), 5000, null, 10); 
insert into emp values (7844, 'turner', 'salesman', 7698, 
to_date('8-sep-1981', 'dd-mon-yyyy'), 1500, 0, 30); 
insert into emp values (7876, 'adams', 'clerk', 7788, 
to_date('12-jan-1983', 'dd-mon-yyyy'), 1100, null, 20); 
insert into emp values (7900, 'james', 'clerk', 7698, 
to_date('3-dec-1981', 'dd-mon-yyyy'), 950, null, 30); 
insert into emp values (7902, 'ford', 'analyst', 7566, 
to_date('3-dec-1981', 'dd-mon-yyyy'), 3000, null, 20); 
insert into emp values (7934, 'miller', 'clerk', 7782, 
to_date('23-jan-1982', 'dd-mon-yyyy'), 1300, null, 10);
insert into emp values (9001, 'aaaa', null, 7782, 
to_date('23-jan-1983', 'dd-mon-yyyy'), null, null, 10);
insert into emp values (9002, 'bbbb', null, 7782, 
to_date('23-jan-1984', 'dd-mon-yyyy'), null, null, 10);
insert into emp values (9003, 'bbbb', null, 7782, 
to_date('23-jan-1985', 'dd-mon-yyyy'), null, null, 10);

commit;

Now lets get the lowest salary for each department

select
  ename
  , deptno
  , first_value(sal) over (partition by deptno order by sal) first_sal
from
  emp
;

ENAME          DEPTNO  FIRST_SAL
---------- ---------- ----------
MILLER             10       1300
CLARK              10       1300
KING               10       1300
BBBB               10       1300
BBBB               10       1300
AAAA               10       1300
SMITH              20        800
ADAMS              20        800
JONES              20        800
SCOTT              20        800
FORD               20        800
JAMES              30        950
WARD               30        950
MARTIN             30        950
TURNER             30        950
ALLEN              30        950
BLAKE              30        950

Here Oracle has given us the lowest salary for each department although there are some null values for deptno 10.
Now lets say i want to retrieve the highest salary for each department then

select
  ename
  , deptno
  , last_value(sal) over (partition by deptno order by sal) first_sal
from
  emp
;

ENAME          DEPTNO  FIRST_SAL
---------- ---------- ----------
MILLER             10       1300
CLARK              10       2450
KING               10       5000
BBBB               10
BBBB               10
AAAA               10
SMITH              20        800
ADAMS              20       1100
JONES              20       2975
SCOTT              20       3000
FORD               20       3000
JAMES              30        950
WARD               30       1250
MARTIN             30       1250
TURNER             30       1500
ALLEN              30       1600
BLAKE              30       2850

So this time we didn’t get the output what we wanted as the default windowing clause is “RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW”, which means the current row will always be the last value. Lets alter the window clause to “RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING”

select
  ename
  , deptno
  , last_value(sal) over (partition by deptno order by sal
	range between unbounded preceding and unbounded following) highest_sal
from
  emp
;

ENAME          DEPTNO HIGHEST_SAL
---------- ---------- -----------
MILLER             10
CLARK              10
KING               10
BBBB               10
BBBB               10
AAAA               10
SMITH              20        3000
ADAMS              20        3000
JONES              20        3000
SCOTT              20        3000
FORD               20        3000
JAMES              30        2850
WARD               30        2850
MARTIN             30        2850
TURNER             30        2850
ALLEN              30        2850
BLAKE              30        2850

So now we have the required results for deptno where sal is not null now for the deptno having sal as null we can use

select
  ename
  , deptno
  , last_value(sal ignore nulls) over (partition by deptno order by sal
	range between unbounded preceding and unbounded following) highest_sal
from
  emp
;

ENAME          DEPTNO HIGHEST_SAL
---------- ---------- -----------
MILLER             10        5000
CLARK              10        5000
KING               10        5000
BBBB               10        5000
BBBB               10        5000
AAAA               10        5000
SMITH              20        3000
ADAMS              20        3000
JONES              20        3000
SCOTT              20        3000
FORD               20        3000
JAMES              30        2850
WARD               30        2850
MARTIN             30        2850
TURNER             30        2850
ALLEN              30        2850
BLAKE              30        2850

So now we have the required results. We can also use the first_value function instead of all these workarounds of last_value functions as following to retrieve the highest salary in each deptno

select
  ename
  , deptno
  , first_value(sal) over (partition by deptno order by sal desc nulls last) highest_sal
from
  emp
;

ENAME          DEPTNO HIGHEST_SAL
---------- ---------- -----------
KING               10        5000
CLARK              10        5000
MILLER             10        5000
BBBB               10        5000
BBBB               10        5000
AAAA               10        5000
FORD               20        3000
SCOTT              20        3000
JONES              20        3000
ADAMS              20        3000
SMITH              20        3000
BLAKE              30        2850
ALLEN              30        2850
TURNER             30        2850
MARTIN             30        2850
WARD               30        2850
JAMES              30        2850

HTH

Cheers!!!
Jagdeep Sangwan

Virtual Columns

From 11g onwards we can create a virtual column. A virtual column is not stored on disk.
Rather, oracle derives the values in a virtual column demand by computing the function
or expressions defined on the respective virtual column.

Syntax for virtual column definition is:

column_name [data_type] [generated always] as (expression) [virtual]

If the datatype is omitted, it is determined on the result of the expression. The
GENERATED ALWAYS and VIRTUAL keywords are provided for clarity only and optional

In the below example i am creating a virtual which is an expression as “n1*n2”

drop table t1 purge;
create table t1(
n1		number
, n2	number
, v1	varchar2(30)
, d1	date
, vc1	number generated always as (n1*n2) virtual
)
;

Now lets try to insert some data into this table using following:

insert into t1
select
	rownum		n1
	, mod(rownum, 100)	n2
	, lpad(rownum, '30', '*')	v1
	, sysdate + rownum /24/60 d1
	, rownum * mod(rownum, 100)	vc1
from
	dual
where
	rownum <= 1e5
;

insert into t1
            *
ERROR at line 1:
ORA-54013: INSERT operation disallowed on virtual columns

Here we get the error as we can not insert a value explicitly into a virtual column. So
for inserting data into a table having a virtual we have to give the column names explicitly
excluding the virtual columns.

insert into t1 (
	n1
	, n2
	, v1
	, d1
)
select
	rownum		n1
	, mod(rownum, 100)	n2
	, lpad(rownum, '30', '*')	v1
	, sysdate + rownum /24/60
from
	all_objects
where
	rownum <= 1e5
;

commit;

Querying the data from this table we can see the value of our virtual column:

select
	*
from
	t1
where
	rownum <= 2
;
        N1         N2 V1                             D1               VC1
---------- ---------- ------------------------------ --------- ----------
         1          1 *****************************1 02-JUL-14          1
         2          2 *****************************2 02-JUL-14          4

We can also check the expression used for generating our virtual column in DATA_DEFAULT column
of [dba|all|user]_tab_columns


col data_default format a40
select
	column_name
	, data_default
from
	user_tab_columns
where
	table_name = 'T1'
;

COLUMN_NAME                    DATA_DEFAULT
------------------------------ ----------------------------------------
N1
N2
V1
D1
VC1                            "N1"*"N2"

Now to see that actually oracle is storing the data for our virtual column, i will be
creating two tables as following:

pro *****************************************************
pro creating and populating table without virtual columns
pro *****************************************************
drop table t1 purge;
create table t1(
n1		number
, n2	number
, v1	varchar2(30)
, d1	date
)
;

insert into t1 (
	n1
	, n2
	, v1
	, d1
)
with d as(
	select
		rownum n1
	from
		all_objects
	where	
		rownum <= 1e3
)
select
	rownum		n1
	, mod(rownum, 100)	n2
	, lpad(rownum, '30', '*')	v1
	, sysdate + rownum /24/60
from
	d d1
	, d d2
where
	rownum <= 1e5
;

pro *****************************************************
pro creating and populating table with virtual columns
pro *****************************************************
drop table t2 purge;
create table t2(
n1		number
, n2	number
, v1	varchar2(30)
, d1	date
, vc1	number generated always as (n1*n2) virtual
, vc2	number as (n1+n2)
, vc3	number as (n1-n2)
)
;

insert into t2 (
	n1
	, n2
	, v1
	, d1
)
with d as(
	select
		rownum n1
	from
		all_objects
	where	
		rownum <= 1e3
)
select
	rownum		n1
	, mod(rownum, 100)	n2
	, lpad(rownum, '30', '*')	v1
	, sysdate + rownum /24/60
from
	d d1
	, d d2
where
	rownum <= 1e5
;

commit;

From below we can check which are the data defaults for our virtual columns

select
	table_name
	, column_name
	, data_default
from
	user_tab_columns
where
	table_name in ('T1', 'T2')
;

TABLE_NAME                     COLUMN_NAME                    DATA_DEFAULT
------------------------------ ------------------------------ ------------
T1                             N1
T1                             N2
T1                             V1
T1                             D1
T2                             N1
T2                             N2
T2                             V1
T2                             D1
T2                             VC1                            "N1"*"N2"
T2                             VC2                            "N1"+"N2"
T2                             VC3                            "N1"-"N2"

We can also use the view user_tab_cols to see the virtual column information as

col column_name format a10
col table_name format a11

select
	table_name
	, column_name
	, hidden_column
	, virtual_column
	, segment_column_id
	, internal_column_id
from
	user_tab_cols
where
	table_name in ('T1', 'T2')
;

TABLE_NAME  COLUMN_NAM HID VIR SEGMENT_COLUMN_ID INTERNAL_COLUMN_ID
----------- ---------- --- --- ----------------- ------------------
T1          N1         NO  NO                  1                  1
T1          N2         NO  NO                  2                  2
T1          V1         NO  NO                  3                  3
T1          D1         NO  NO                  4                  4
T2          N1         NO  NO                  1                  1
T2          N2         NO  NO                  2                  2
T2          V1         NO  NO                  3                  3
T2          D1         NO  NO                  4                  4
T2          VC1        NO  YES                                    5
T2          VC2        NO  YES                                    6
T2          VC3        NO  YES                                    7

Now lets check the size of the physical storage of these two tables

col segment_name format a12
select
	segment_name
	, bytes
	, blocks
	, extents
from
	user_segments
where
	segment_name in ('T1', 'T2')
;

SEGMENT_NAME      BYTES     BLOCKS    EXTENTS
------------ ---------- ---------- ----------
T1              6291456        768         21
T2              6291456        768         21

Here oracle has used the same physical storage for both tables although T2 has three
more columns than T1 which are defined as VIRTUAL.

We can create indexes on a virtual column which are similar to function based indexes and
they will be using the disk space to store its data.

drop index t2_vc1;
create index t2_vc1 on t2(vc1);

begin
	dbms_stats.gather_table_stats(
		user
		, 't2'
	);
end;
/

select
	segment_name
	, bytes
	, blocks
	, extents
from
	user_segments
where
	segment_name = 'T2_VC1'
;

SEGMENT_NAME      BYTES     BLOCKS    EXTENTS
------------ ---------- ---------- ----------
T2_VC1          2097152        256         17

Now lets query the data from this table and check whether the index is used or not

select
	*
from
	t2
where
	vc1 = 4
;

select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------

SQL_ID  g8kc5x3j8wgzg, child number 1
-------------------------------------
select  * from  t2 where  vc1 = 4

Plan hash value: 1966060445

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2     |     1 |    62 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T2_VC1 |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VC1"=4)

From above we can see that oracle optimizer has used our index to retrieve the data. Now
lets try one another having the same expression as defined in the virtual and see whether it
can use the index defined on virtual column

select
	*
from
	t2
where
	n1*n2 = 4
;

select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------

SQL_ID  fxc0fdrmp7ugy, child number 0
-------------------------------------
select  * from  t2 where  n1*n2 = 4

Plan hash value: 1966060445

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2     |     1 |    62 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T2_VC1 |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T2"."VC1"=4)

Here also oracle optimizer has choosen the index internally for retrieving our data as the
indexes on virtual columns are similar to function based indexes.

Invisible Indexes

There is this feature “Invisible indexes” from 11g onwards which is quite useful. These indexes are maintained like any normal index but these are not visible to the optimizer and using this we can for tuning a query, i.e. checking the plans and costs when optimizer can use this and when not.

So first lets create a setup

drop table t1 purge;

create table t1
as
select
	rownum	n1
	, dbms_random.string('l', 20) v1
	, dbms_random.string('l', 100) v2
	, sysdate + rownum/24/60 d1
from
	all_objects
where
	rownum <= 2e4
;

alter table t1 add constraint t1_pk primary key (n1);

begin
	dbms_stats.gather_table_stats(
		user
		, 't1'
	);
end;
/

so now just run a query against this table and see the actual execution plan

select
	/*+
		gather_plan_statistics
	*/
	n1
	, v1
from
	t1
where
	v1 = 'abc'
;

select * from table(dbms_xplan.display_cursor(null, null, 'allstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------

SQL_ID  bmxarkqfcbctd, child number 0
-------------------------------------
select  /*+   gather_plan_statistics  */  n1  , v1 from  t1 where  v1 =
'abc'

Plan hash value: 3617692013

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |      0 |00:00:00.01 |     354 |
|*  1 |  TABLE ACCESS FULL| T1   |      1 |      1 |      0 |00:00:00.01 |     354 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("V1"='abc')

Here optimizer is going for a full table scan.
Now lets create an invisible index on v1 column of table t1

create index t1_v1 on t1(v1) invisible;

And we run the above query again, optimizer will not be using the above index

select
	/*+
		gather_plan_statistics
	*/
	n1
	, v1
from
	t1
where
	v1 = 'abc'
;

select * from table(dbms_xplan.display_cursor(null, null, 'allstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------

SQL_ID  bmxarkqfcbctd, child number 0
-------------------------------------
select  /*+   gather_plan_statistics  */  n1  , v1 from  t1 where  v1 =
'abc'

Plan hash value: 3617692013

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |      0 |00:00:00.01 |     354 |
|*  1 |  TABLE ACCESS FULL| T1   |      1 |      1 |      0 |00:00:00.01 |     354 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("V1"='abc')

Here although the index is there but its invisible and optimizer is not using it for our query.
Now to use this invisible index, we can either use the hint “use_invisible_indexes” as:

select
	/*+
		gather_plan_statistics
		use_invisible_indexes
	*/
	n1
	, v1
from
	t1
where
	v1 = 'abc'
;

select * from table(dbms_xplan.display_cursor(null, null, 'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------

SQL_ID  aqu1thxzkmypf, child number 0
-------------------------------------
select  /*+   gather_plan_statistics   use_invisible_indexes  */  n1  ,
v1 from  t1 where  v1 = 'abc'

Plan hash value: 2645666169

--------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |      1 |        |      0 |00:00:00.02 |       2 |      1 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |      1 |      1 |      0 |00:00:00.02 |       2 |      1 |
|*  2 |   INDEX RANGE SCAN          | T1_V1 |      1 |      1 |      0 |00:00:00.02 |       2 |      1 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("V1"='abc')

Or we can set the parameter at session or system level as

alter session set optimizer_use_invisible_indexes=true;
select
	/*+
		gather_plan_statistics
	*/
	n1
	, v1
from
	t1
where
	v1 = 'abc'
;

select * from table(dbms_xplan.display_cursor(null, null, 'allstats last'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------

SQL_ID  bmxarkqfcbctd, child number 1
-------------------------------------
select  /*+   gather_plan_statistics  */  n1  , v1 from  t1 where  v1 =
'abc'

Plan hash value: 2645666169

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |      1 |        |      0 |00:00:00.01 |       2 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |      1 |      1 |      0 |00:00:00.01 |       2 |
|*  2 |   INDEX RANGE SCAN          | T1_V1 |      1 |      1 |      0 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("V1"='abc')

or we can give the opt_param(optimizer_user_invisible_indexes true) as

select
	/*+
		gather_plan_statistics
		opt_param(optimizer_use_invisible_indexes true)
	*/
	n1
	, v1
from
	t1
where
	v1 = 'abc'
;

select * from table(dbms_xplan.display_cursor(null, null, 'allstats last'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------

SQL_ID  8u01q9zy0c0u6, child number 0
-------------------------------------
select  /*+   gather_plan_statistics
opt_param(optimizer_use_invisible_indexes true)  */  n1  , v1 from  t1
where  v1 = 'abc'

Plan hash value: 2645666169

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |      1 |        |      0 |00:00:00.01 |       2 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |      1 |      1 |      0 |00:00:00.01 |       2 |
|*  2 |   INDEX RANGE SCAN          | T1_V1 |      1 |      1 |      0 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("V1"='abc')

And after we know that this index can enhance the performance of our query we can make this index as visible

alter index t1_v1 visible;

Now without any hint our query will to use the index

select
	/*+
		gather_plan_statistics
	*/
	n1
	, v1
from
	t1
where
	v1 = 'abc'
;

select * from table(dbms_xplan.display_cursor(null, null, 'allstats last'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------

SQL_ID  bmxarkqfcbctd, child number 1
-------------------------------------
select  /*+   gather_plan_statistics  */  n1  , v1 from  t1 where  v1 =
'abc'

Plan hash value: 2645666169

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |      1 |        |      0 |00:00:00.01 |       2 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |      1 |      1 |      0 |00:00:00.01 |       2 |
|*  2 |   INDEX RANGE SCAN          | T1_V1 |      1 |      1 |      0 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("V1"='abc')