Monday, May 31, 2010

Oracle Analytics function

- the other side of SQL summary functions

Aggregate functions are the norm in SQL worlds, where SUM, AVG, COUNT are all the rage. We are here to see another part of summary functions where it is less frequently used because of its obscure syntax that is a wee bit hard to use. Nevertheless, these function will do well in your bag of tricks. Analytics function can shorten the time you took to write long queries with these keywords rather than old-style multiple sub-queries.



TABLENAME: TRANSACT

The above data shows a transaction data for 2 products. Now lets say we need to get the second last scrap activity rows for every LOCATION and every PRODUCT (not the SCRAP rows with MAX(TIME)-1). We can do this by:

select *,
count(*) over (partition by productid, location) as count,
dense_rank() over (partition by productid, location order by time, rownum) as rank
from transact
where activity='SCRAP'

you would get:


from here it is simply filtering out the rows where COUNT= RANK +1 and you would be getting the result you wanted. It will be more difficult to pull this off using aggregate functions and sub-queries. As a comparison, here's the old style query:

select *
from (
select max(rownum) as row_num
from transact
where activity='SCRAP' and rownum not in (select max(rownum) as row_num from transact where activity='SCRAP' group by productid, location)
group by productid, location
) max_rows
inner join
(
select *, rownum
from transact
where activity='SCRAP'
) main on max_rows.row_num=main.rownum

and it will have a restriction: the rows must be inserted with the column TIME = insertion time to ensure the correctness of ROWNUM. This will not always happen in production environment where data editing always happens.

I have yet to find any Microsoft, MySQL equivalent of analytic functions, if there are any gurus out there which have seen it all, please comment!


Regards,
Popo

No comments:

Post a Comment