CrazyEngineers
  • Hey all,

    I am going to discuss one of the most asked interview questions on Oracle SQl and a basic one which many freshers are unaware of. The question is:

    What is the difference between Union and Union All?

    The best way to answer this question is going from the basic definitions to a practical example. It impresses the interviewer a lot. To start with, Union operator is just used to perform a union operation (similar to Mathematics) on columns of two or more tables provided the the columns in each section of the union operation is same. In other words, a union operation clubs the common values as one and also returns all the unique values.

    On the other hand, Union All just clubs all the values one after another without worrying about the common values i.e. it just piles up the data one after the another. Its quite understood now that union all will be faster than union as union all has no intelligence of finding the common values and overlapping them.

    As a common example:

    select count(1) from
    (
    select job_id from employees
    union
    select job_id from jobs
    );

    count(1)
    ---------
    19

    select count(1) from
    (
    select job_id from employees
    union all
    select job_id from jobs
    );

    count(1)
    ---------
    126

    Now we can see quite a difference between the two outputs as union all involves no intelligence. Then, what is the use of union all being fast if it can't solve my requirement? Here is how we use it to replace union.

    Union All as a substitute to Union

    We can tweak union all to work like union and in a much faster way. This can be done by using a group by with union all. Modifying the same query given above as follows:

    select count(1) from
    (
    select job_id from(select job_id from employees
    union all
    select job_id from jobs)
    group by job_id
    );

    count(1)
    ---------
    19

    returns the same result as is the case with a union operation.

    How is Union All + Group By faster than Union?

    Here is a performance analysis Union when compared to Union All + Group by

    For Union operation to perform the intelligence of finding and overlapping the common values, two intermediate steps of finding the unique values among the lot and sorting the data is required which is not the case with union all as it just heaps the data and the final group by clubs the common data.

    Following example will make my statement crystal clear. Consider the following queries

    select job_id from employees
    union
    select job_id from jobs;

    The explain plan for the above query is

    1

    As seen from the explain plan, we have an intermediate SORT UNIQUE operation just as I stated above. Also have a look at the number of Bytes, cost and CPU % utilization.

    When I try to get the same functionality by using Union All,

    select job_id from(select job_id from employees
    union all
    select job_id from jobs)
    group by job_id;

    the explain plan is

    1

    When you look at this plan you can see that we have got rid of the intermediate sort and unique operation. Also, have a look at the Rows processed, Bytes processed and Cost of CPU and CPU% Utilization. It is enough to prove that Union All with group by works much better than Union operation.

    One, can argue that the Time column shows same for both plans. Well, I am running these queries on tables having very less rows so the performance improvement will be of some milliseconds but when you query on real world production systems where you store terabytes of data, this trick proves to be very very handy.

    Do let me know in case you need clarifications with any of the points stated above or you want to add someone to this.

    Hope it helps you understand the difference. This is the kind of answer that will prompt the interviewer to get up and hug you. 😛
    Replies
Howdy guest!
Dear guest, you must be logged-in to participate on CrazyEngineers. We would love to have you as a member of our community. Consider creating an account or login.
Replies
  • Neeraj Sharma

    MemberAug 2, 2013

    As a side note:

    You can run these queries yourself and check the explain plans in SQL developer. These tables are present in HR schema. For getting the explain plan, just type this line before executing any query

    set autotrace on

    This will enable you to look at the explain plan and you can easily sport the better explain plan.

    Happy Learning 😀
    Are you sure? This action cannot be undone.
    Cancel
  • Nikhil Lokhande

    MemberAug 4, 2013

    Great information.
    thanks😉
    Are you sure? This action cannot be undone.
    Cancel
Home Channels Search Login Register