Lesson 10
Join Methods
A practical introduction to the join methods available to the Oracle9i cost-based optimizer.
Choosing Join Plan
- CBO decides the following for all joins:
- Access path
- e.g. full scan, index range scan, etc.
- Join method
- e.g. nested loops, hash, sort-merge, etc.
- Join order
- What table will drive the join process
- Each one can have an impact on performance
CBO and Missing Stats
- When optimizer goal = CHOOSE
- And statistics exists for 1 or more tables in join
- And table has no statistics
- CBO uses default statistics
- Poor execution plan selection is very likely
Join Methods
- Oracle9i CBO supports the following join methods:
- Nested Loops Join
- Outer Join
- Sort-Merge Join
- Hash Join
- Anti-Join
- Semi-Join
Nested Loops Join
- Common join technique for small-ish sets
- Starts returning rows quickly
- No pre-processing of data
“New” Nested Loops
- Incorporates “batching” for performance
- Fetch “n” rows from outer table
- Sort rows
- Access inner table index
- Keep ROWID’s for matches
- Use ROWID’s to access inner table
Nested Loops Example
- CBO chose nested loops for “small” join
Optimizer Mode
- A nested loops table join was chosen
ORDERED Hint
USE_NL Hint
Continuation of Notes
- This is a full page of notes.
- The slide is hidden.
Adding Selective Filters
- Adding additional filters can change the join order
Outer Nested Loops
- Outer join forces choice of driving table
- Table being preserved must by driver
- Can hurt performance
Hash Joins…
- Good technique for large sets
- When sufficient PGA available
- Became available with Oracle 7.3
- CBO only
- RULE optimizer will never use hash join
- Pre-processing step delays return of 1st row
- Really fast once rows start returning
…Hash Joins
- Use when intend to return the entire set
- Only available for equi-joins
- Why CBO might not select hash join?
- Join condition is not equality
- Shortage of memory
- See PGA_AGGREGATE_TARGET or HASH_AREA_SIZE parameter
- Hash joins disabled
- See HASH_JOIN_ENABLED parameter
Hash Join Process…
- Use driving table to create a “hash table”
- Join key value hashed – pointer into table
- Hash value and row stored in private memory
- Uses Temp space if too big
- Read rows from second table
- Hash join key value
- Probes the hash table to get match
- Return joined row if match found
…Hash Join Process
Hash Join Example
Hash versus Nested Loops
USE_HASH Hint
Hash Join Parameters
- HASH_JOIN_ENABLED
- Database and Session level
- Dynamic
- TRUE allows hash joins
- PGA_AGGREGATE_TARGET
- Session level only
- Dynamic
- Total aggregate size for all PGA’s
- Deprecates HASH_AREA_SIZE
Sort-Merge Join…
- Usually, tables are sorted before join of rows
- Sort delays return of 1st row
- No driving table
…Sort-Merge Join
- Useful for non-equijoins
- Also useful if sort step can be skipped
- If index full scan can return data sorted by join key
Sort-Merge Example
USE_MERGE Hints
- USE_MERGE hint forces a sort-merge join:
NOT IN Subquery: Anti-Joins
- Return rows from with no match in NOT IN subquery
- 3 kinds
- Hash anti-join
- Nested loops anti-join
- Merge anti-join
- Can be very fast method of handling NOT IN
- Affected by ALWAYS_ANTI_JOIN parameter
Nested Loops Anti-Join
Anti-Join Hints
EXISTS Subquery: Semi-Joins
- Return rows that match EXISTS subquery
- Only has to find one match in inner table, then stop search
Semi-Join Hints
- HASH_SJ, MERGE_SJ and NL_SJ hints
Cartesian Join
- Often result of a missing predicate
- But, not always…
Lesson Summary
- Be familiar with the common join methods
- Nested Loops Join
- Outer Join
- Sort-Merge Join
- Hash Join
- Anti-Join
- Semi-Join
- Know why the optimizer chooses each
- Know how join order can affect performance
Join Methods Workshop