Oracle SQL Tuning Free Tutorial

There are often several ways to get the same answer from your Oracle Database.  But which version of the equivalent SQL query is faster? We need to learn how to tune SQL statements that produce equivalent results.

Listen and watch as Oracle Master John Watson demonstrates why not all equivalent SQL statements are created equal.
John demonstrates how SQL statements that produce equivalent result sets can have radically different response times, why this happens and common solutions.

In this free video tutorial, listen and watch as Oracle Master John Watson demonstrates why equivalent SQL statements are not all created equal when it comes to performance.

John demonstrates how SQL statements that produce equivalent result sets can have radically different response times, why this happens and common tuning solutions.

This free training is segmented into several separate lessons:

  1. Introduction (including First Demonstration) (7:11)
  2. SubQuery Factoring (14:38) (click on video below)
  3. Star Transformations (15:55)
  4. Aggregations (8:16)
  5. Summary (3:31)

Date: Sep 12, 2012

NOTE: Some corporate firewalls will not allow videos hosted by YouTube.

Transcript

SubQuery Factoring

Oracle SQL Tuning

2. SubQuery Factoring

[music]

>> John:  Subquery factoring, what this is about is rewriting your SQL to pull part of the sequel out into a separate subquery. The typical example of this, the worst case is undoubtedly the correlated subquery.

Correlated subquery are often seriously bad news. We should all know already what a correlated subquery is. It’s a query where you join two tables in such a way that for every row of the outer table you have to query the inner table. For example

[pause]

Taking a classic case again, I’ll just continue working in the emp schema. If we were to run a query say that attempts to work out every department where the total of the salaries is say more than a certain amount, what would I do?

I would run select department name (dname) from dept where – say I have a cut off for 10,000 – select dname from dept d where 10,000 is greater than my correlated subquery which will be select sum of salaries from emp e where – let me do the join – that e.deptno = d.deptno.

Pretty straightforward what’s going on. For every row in the department’s table I’m going to have to sum up the appropriate number of salaries and do the test. That does the total come to more than 10,000.

And we find that I get two rows back, sales and accounting. So two rows were greater than – sorry, that’s less actually. I’ve retrieved the two departments where the total of salaries is less than 10,000 pounds or even dollars. We see the statement is run.

To get the full detail of that, I’ll set the auto trace off and get the full information about how that thing was run. Just taking the step further, I’ll run it through the rule-based optimizer. So alter session set, optimizing mode = rule, and see how Oracle is going to run that statement.

The next setting we need to do is get some statistics. Set statistics level = all, statistics alter session off. Alter session, set statistics level = all. Then run the statement with the correlated subquery and see how it actually ran that query. So to get the execution plan out with execution statistics we select * from table and the table will be our friend dbms x plan and display the cursor just with the usual value this will do to get the last to run the statement.

[pause]

Much better. I’m going to select dname from dept. Got it right.

[pause]

Now here we see the bad news about how that statement was run. What happened is it did a full table scan of dept. Full table scan of dept, and it retrieved four rows. Then we see we did a full table scan of emp is the next stage and we did it four times. So for every row in the outer table and now to query we did a full table scan of the inner table and that’s pretty disastrous if we join to a table and the outer table has millions of rows. It’s actually a bloody disaster.

But that’s what rule-based optimizer has done. It’s done exactly what I told it to do. It’s run that inner query, that second query once for every row of the outer query.

Let us now see how the cost-based optimizer does. We would hope that it does a rather better job. The cost-based optimizer has got exactly the same result set and it’s done it much more efficiently. You can see what it’s done. It’s in effect reversed the logic. It’s gone to emp table first and has scanned it once, sorted it, done the aggregation and it’s even done the check. You see the filter coming at line 4, it’s worked out the as the sum total of the salaries over 10,000, then it’s joined to dept.

But this still isn’t perfect. It’s not bad. It has to do with the index you need to scan a dept twice and then do the look up on the dept tables twice. Why? Because there were two rows returned, sales and accounting. But that’s an awful lot better. So the cost-based optimizer has rewritten the query.

[pause]

Now take another example though, another equivalent SQL because that’s just running that same statement through different versions of the optimizer. Let’s see your program that tries to be clever. Your program has been told that correlated subqueries are bad news and indeed correlated subqueries usually are bad news.

Perhaps you decide to write it like this. We select say dname from dept. Then we can go to natural join emp, so join dept to emp and give it a group by. So group by deptno – we’ll group by dname better. Group by dname having some sal less than 10,000. So you can see what I’m doing. This isn’t all the [6:44 inaudible]. This will be an equivalent SQL and it is because exactly the same result set back. A different order but never mind that. I didn’t specify an order by clause.

So my program is being clever. He’s been told that correlated subquery’s a bad news, so it’s rewritten the query to use a join and a group by. What actually happened within the execution plan?

[pause]

Not so good, is it? What’s it have to do is go back to pkdept, go to that, go to the department’s table, do a merge join – what it’s joining to is it’s had to do four sorts of the emp table. Only one scan, it’s not too bad. But we’ve had to sort those 14 rows four times.

Quite simply, Oracle could not identify the fact that that statement is an equivalent SQL. It didn’t realize that that is an equivalent SQL to that. It did a fairly good job. It is by no means perfect. So what I’ll now do is show what your program should’ve done. What your program should’ve done is tell the optimizer what it wants it to do and that is by manually factoring out the subquery.

So with say salary total (saltot) ask – what do you want to do? We want to pass through emp totaling up the salaries per department. Select deptno sum sal ask sum sal from emp and do the group by. Group by deptno. I’m manually factoring out the subquery. It was causing the problems and now we can do a join. We can select dname from dept, natural join saltot where sum sal less than 10,000. And we get the same rows back.

That’s a third equivalent SQL. Now I’ve got three versions of the same statement. Totally different syntax that give identical results. What is this one like? Let me look at the execution plan.

[pause]

It’s perfect. We’re back to the one where the optimizer generated in the first place. We’re processing the emp table just once to construct the view so I factored out that and then we have to probe the index on dept. We have to do that twice because we got two rows back. Manually, I told the optimizer to do what it was doing previously without any danger of your programmers and that’s what your programmers should be doing. Rather than forcing the optimizer do the rewrite, they should be rewriting it themselves.

>> Dave:  I’m sorry. Can I interject a couple of things. Really good stuff. Just to interject – the recapture on what you’ve just seen is, is three equivalent SQLs. You saw the correlated subquery, then you saw natural join, and then finally the subquery factoring with the WITH clause. As John has demonstrated to us, the WITH clause is the most solid coding of it because you’re telling the optimizer what you want to do.

John, we’ve got a bunch of things in from the students here.

“One thing, your audio is a little funky so if you can speak a little up and a little slower that will be great. If you can increase your font size, that will also be great. Not an emergency on that though.”

And a couple of questions. In the last two explain plans, we had what seemed to be two nested loops and we were calling it a nested loop join. Folks, that’s what the new version of Oracle 11g single nested loop. We charged in much more detail in the SQL tuning class. We compared to 9i version or 10g version versus the 11g version, but that is actually a single nested loop between two tables, emp and dept. It’s just how Oracle is externalizing. It’s really great in a subset so it loops twice. It’s more efficient than the old way.

The other question in the queue was related to statistics level. You had altered session set statistics level = all. That’s initialization parameters, statistics level, and by default and usually it should be set to typical which would collect things like time statistics, AWR, and ADDM statistics, and all of the good things that we want. If you set it to ALL, it collects extra statistics for your SQLs. It had a little overhead so you’d probably want to set it to ALL all the time.

It can also be done with gather plans statistics hint. If time is permitting you might see an example of the gather plan statistics hint instead of altering session with statistics level.

[pause]

Finally, there is a question on dbms xplan display cursor. What that does is display the execution plan of any cursor that is filling the cache, plus any statistics if you’ve done the previous tactic of statistics level ALL or in the gather plan statistics hint. Usually dbms xplan display cursor you want to set server output off unless you get an error.

There are a couple of parameters. First, your SQL ID and child number, if you leave them null, null as John did, that will acquire the execution plan at the last query executed in that session. The third parameter is FORMAT and all stats lasts, it basically says giving all the IO and memory statistics available and display them only for the last execution of this query even if there had been multiple executions of this query.

You can go to the Oracle PL/SQL types and packages reference for dbms xplan and look for the display cursor function and you can see all of the parameters for format. There are some quite interesting ones.

[pause]

So with that little diversion, John can you show the WITH statement.

>> John:  Yes, I will do. Thank you for explaining that, by the way. I’m sorry, people. I probably was assuming a bit too much knowledge. What I’m going to do now – we’ll go back to the WITH statement. I just wanted to do one more experiment. I probably have the optimizer mode back to rule. I go back to my equivalent SQL here with the WITH clause with that.

With that are the subquery then find the appropriate department and run it. Even with the rule-based optimizer, I sincerely hope that I’m now getting the optimal plan, which I do. What this really does bring home to you is that wherever you see a join or subquery especially when it’s a correlated subquery, think about factoring this out. Back to you, Dave.

Copyright SkillBuilders.com 2017

×
Free Online Registration Required

The tutorial session you want to view requires your registering with us.

It’s fast and easy, and totally FREE.

And best of all, once you are registered, you’ll also have access to all the other 100’s of FREE Video Tutorials we offer!

 

×