


8.     Optimise Queries

First Published 5 Mar 2019                               Last Updated 24 July 2022                                       Difficulty level :   Moderate

Section Links:         Introduction         Tests & Results         Query SQL         Query Execution Plans         ISAM Stats         Downloads         Access Europe


This is the eighth in a series of articles discussing various tests done to compare the efficiency of different approaches to coding.

Example databases are provided in each case so the same tests can be done on your own workstations

UPDATED 24 July 2022:
a)   added information on the ISAM Stats feature in older versions of Access.
b)   added information about my forthcoming presentation on this topic to the online Access Europe user group.




Introduction                                                                                                                                             Return To Top

There is an excellent page on Allen Browne's website devoted to various methods of improving query performance: http://allenbrowne.com/QueryPerfIssue.html

I thought it would be helpful to others to illustrate his suggestions by performing a series of speed tests showing the effect of each suggested change.

To reduce file size, the queries are based on cut-down versions of three tables from the DEMO version of my School Data Analyser application.
All data is for fictitious students in an imaginary school.

The aim of the query is to get the count of each type of pastoral incident recorded for every student in the year 2018.
The query is also filtered to those students whose date of birth was in the year 2005.

OptQ-QueryResults


Tests & Results                                                                                                                                             Return To Top

There are 11 versions of the query with varying amounts of optimisation starting with a (deliberately) very badly designed query and ending with the most optimised.

OptQ-MainForm
All queries return the same records (total = 882), but the times should get progressively faster (except possibly for the final stacked queries test).

Each test was run several times to reduce natural variations caused by other processes that may be running in the background.
The total time recorded is for the set number of loops. By default, the number of loops = 3 but this can be changed.

The fields used in each table to filter and sort the data are indexed to speed up searches:
The indexed fields are Surname, Forename, DateOfBirth and DateOfIncident.
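
As a sketch only: indexes like these are normally added in table design view, but they could equally be created with Access DDL statements along the lines below (the index names are arbitrary, and each statement would be executed separately, for example with CurrentDb.Execute):

    CREATE INDEX idxSurname ON PupilData (Surname);
    CREATE INDEX idxForename ON PupilData (Forename);
    CREATE INDEX idxDateOfBirth ON PupilData (DateOfBirth);
    CREATE INDEX idxDateOfIncident ON PRecords (DateOfIncident);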

The average times recorded after running each set of tests 20 times were as follows:

OptQ-AvgResultsNEW
Here is the results summary as a report:

OptQ-AvgResultsChart
The times taken to loop through each query 3 times improved significantly, from over 27 s originally down to about 0.49 s.
This means it is running over 50 times faster after optimisation.



Query SQL                                                                                                                                                 Return To Top

The first query uses an outer join between 2 tables (PupilData / PRecords) and a DLookup value from the third table (PRCodes).
It took over 27 s to do 3 loops – VERY SLOW

This is the query SQL code used:

OptQ-TestA
Running domain functions such as DLookup in a query is VERY SLOW as the operation must be performed in turn on each record in the query.
It also wastes resources as additional connections have to be made to the data file.

The query execution plan involves a huge number of steps as each record is checked in turn.

In this case, the domain function is totally unnecessary as the same result can be obtained using a second join.
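
As a purely illustrative sketch (the actual SQL is shown in the screenshots; the linking field names used here, PRCodeID and IncidentType, are assumptions), a DLookup column forces a separate lookup against PRCodes for every row:

    DLookup("IncidentType", "PRCodes", "PRCodeID = " & [PRCodeID]) AS Incident

whereas joining PRCodes into the query lets the database engine retrieve the same value as part of the query plan:

    LEFT JOIN PRCodes ON PRecords.PRCodeID = PRCodes.PRCodeID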

OptQ-TestB
In this second query, the outer join goes from the many side of the main PRecords table: PRCodes -> PRecords -> PupilData
Although the join direction is not the best choice, the time taken is dramatically reduced to about 1.55 s.

In the third query, the direction of the joins is reversed (one to many): PupilData -> PRecords -> PRCodes.

OptQ-TestC
This is a more efficient process for Access to manage and the time drops again to about 1.27 s.

This example has been deliberately designed so that using inner joins will get exactly the same records.
It always makes sense to use inner joins wherever possible as the constraints limit the searching required.

OptQ-TestD
Doing so further reduces the work required from the database engine and the time drops to just under 0.9 s.
All the remaining queries are based on inner joins.
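
In outline, and with assumed linking field names (PupilID joining PupilData to PRecords, and PRCodeID joining PRecords to PRCodes), the change is from outer joins in the one-to-many direction:

    FROM (PupilData LEFT JOIN PRecords ON PupilData.PupilID = PRecords.PupilID)
    LEFT JOIN PRCodes ON PRecords.PRCodeID = PRCodes.PRCodeID

to the equivalent inner joins:

    FROM (PupilData INNER JOIN PRecords ON PupilData.PupilID = PRecords.PupilID)
    INNER JOIN PRCodes ON PRecords.PRCodeID = PRCodes.PRCodeID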

Until now, all the aggregate totals have been based on the VBA Nz function: Count(Nz([PastoralRecordID],0)).
The Nz() function replaces Null with another value (usually a zero for numbers, or a zero-length string for text).
The new value is a Variant data type, and VBA tags it with a subtype: String, Long, Double, Date, or whatever.
This will affect the sort order and can lead to incorrect results in some situations.

The fifth query replaces the VBA Nz function with the use of the JET IIf function: IIf(Count([PastoralRecordID]) Is Null,0,Count([PastoralRecordID]))

OptQ-TestE
This has several advantages including avoiding an unnecessary VBA function call.
In addition, the correct data type is retained (in this case, integer) so the column sorts correctly.
This shaves another 0.03 s off the time, which is now about 0.87 s.
However, by using inner joins as in this example, a simple count will achieve the same results.

OptQ-TestF
Although the expression is simpler, the overall time is only slightly less than before – approximately 0.86 s.
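
For comparison, the three counting expressions used so far reduce to the following (the simple count in Test F is inferred from the description above):

    Count(Nz([PastoralRecordID],0))                                        Tests A-D
    IIf(Count([PastoralRecordID]) Is Null,0,Count([PastoralRecordID]))     Test E
    Count([PastoralRecordID])                                              Test F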

All the above queries were sorted by a concatenated expression: [Surname] & ", " & [Forename]
Doing so prevents the database engine from making use of the indexes to perform the sort.

The next query fixes that, sorting by the two indexed fields: Surname and Forename.
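
In other words, in SQL terms the sort changes from an expression that the indexes cannot help with:

    ORDER BY [Surname] & ", " & [Forename]

to the two underlying indexed fields:

    ORDER BY PupilData.Surname, PupilData.Forename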

OptQ-TestG
Doing so further reduces the time required to about 0.77 s – almost 0.1 s faster.
Whilst the query is now running well, further improvements can still be made.

Aggregate queries (those with a GROUP BY clause) can have both a WHERE clause and a HAVING clause.

The WHERE is executed first - before aggregation; the HAVING is executed afterwards - when the counts have been calculated.

Therefore, in many cases (though not always), it can be faster to use WHERE.

The next query changes the HAVING clause to WHERE and the time drops to 0.65 s (another 0.12 s faster).

OptQ-TestH
See this separate article for detailed speed tests based on HAVING vs WHERE.
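
As a stripped-down sketch using just the PRecords table (its PupilID foreign key is an assumed field name, and the Year() criteria are only an illustration since the exact SQL is in the screenshot), the same count can be filtered after aggregation:

    SELECT PupilID, Count(PastoralRecordID) AS NumIncidents
    FROM PRecords
    GROUP BY PupilID, Year(DateOfIncident)
    HAVING Year(DateOfIncident) = 2018;

or before aggregation:

    SELECT PupilID, Count(PastoralRecordID) AS NumIncidents
    FROM PRecords
    WHERE Year(DateOfIncident) = 2018
    GROUP BY PupilID;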

However, although the WHERE clause looks simple, it does not make use of the indexes on the two date fields.
A better result is obtained using the indexes by indicating a range of values for each of the date fields:

WHERE (((PupilData.DateOfBirth) Between #1/1/2005# And #12/31/2005#)
   AND ((PRecords.DateOfIncident) Between #1/1/2018# And #12/30/2018#))



OptQ-TestI
This further reduces the time by another 0.1 s down to about 0.55 s for 3 loops. Every little helps!

All the above queries have used the default arrangement, grouping by all the fields from each table that are not being used for aggregation.

However, the PupilID field is the unique primary key in the PupilData table.
There is no need to group by the other fields in that table. Instead, optimise the query by choosing First instead of Group By in the Total row under those fields.
The same applies to the fields not required for the grouping in the other two tables.

OptQ-TestJ
This results in a further significant reduction in time to less than 0.5 s. The end result is now more than 50 times faster than the original 27.4 s!

Using First allows the database engine to return the value from the first matching record, without needing to group by the field.

In the query above I have used aliases for the fields now based on First.
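
A simplified sketch of the pattern (the alias names are arbitrary, and PRCodeID / IncidentType are again assumed field names):

    SELECT First(PupilData.Surname) AS StudentSurname,
           First(PupilData.Forename) AS StudentForename,
           First(PRCodes.IncidentType) AS Incident,
           Count(PRecords.PastoralRecordID) AS NumIncidents
    FROM (PupilData INNER JOIN PRecords ON PupilData.PupilID = PRecords.PupilID)
    INNER JOIN PRCodes ON PRecords.PRCodeID = PRCodes.PRCodeID
    WHERE PupilData.DateOfBirth Between #1/1/2005# And #12/31/2005#
    AND PRecords.DateOfIncident Between #1/1/2018# And #12/30/2018#
    GROUP BY PupilData.PupilID, PRecords.PRCodeID;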

Allen Browne also points out another benefit if you are grouping by Memo / Long Text fields:

If you GROUP BY a memo (Notes in the example), Access compares only the first 255 characters, and the rest are truncated!
By choosing First instead of Group By, JET is free to return the entire memo field from the first match.
So not only is it more efficient; it actually solves the problem of memo fields being chopped off.

Both stacked queries and subqueries are often useful in Access, though both can be slower than using a single query where that is achievable.

As a final test, I also created a stacked query version of test J.
The first query qryStacked1 filters the records in PupilData and PRecords for the required date ranges.
The second query, qryStacked2, is an aggregate query based on the first.
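
In outline (again with assumed linking field names), the pair of saved queries looks something like this. qryStacked1 applies the two date-range filters:

    SELECT PupilData.PupilID, PupilData.Surname, PupilData.Forename,
           PRecords.PastoralRecordID, PRecords.PRCodeID
    FROM PupilData INNER JOIN PRecords ON PupilData.PupilID = PRecords.PupilID
    WHERE PupilData.DateOfBirth Between #1/1/2005# And #12/31/2005#
    AND PRecords.DateOfIncident Between #1/1/2018# And #12/30/2018#;

and qryStacked2 then aggregates the filtered set:

    SELECT First(qryStacked1.Surname) AS StudentSurname,
           First(qryStacked1.Forename) AS StudentForename,
           First(PRCodes.IncidentType) AS Incident,
           Count(qryStacked1.PastoralRecordID) AS NumIncidents
    FROM qryStacked1 INNER JOIN PRCodes ON qryStacked1.PRCodeID = PRCodes.PRCodeID
    GROUP BY qryStacked1.PupilID, qryStacked1.PRCodeID;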

OptQ-TestK
The average time for 20 tests was 0.492 s – about 0.004 s FASTER than the single query equivalent in test J.
However, the fastest times in individual tests were recorded in test J.

I repeated tests J and K on separate workstations and there was no clear winner between the two methods, though the time difference was always very small.


NOTE:
If anyone can see ways in which the above query can be further optimised, please do let me know!



Query Execution Plans                                                                                                                           Return To Top

You can use the JET ShowPlan feature to view the query execution plans for your queries.

By doing so, you can often obtain useful information to assist with the process of optimising your queries.

Using this feature creates a text file, ShowPlan.out, which can be viewed in any text editor such as Notepad.

For further information, see my article ShowPlan – Go Faster elsewhere on this website.

I have attached three ShowPlan files for the above tests:

a)   ShowPlanA.out – this lengthy file just covers Test A, which uses a DLookup.
      It should help explain why using domain functions in a query will ALWAYS be SLOW.

b)   ShowPlanB2J.out – this covers all the other main tests: Tests B => Test J

c)   ShowPlanStacked.out – this just covers the stacked query version used in Test K

As an example, this is the query execution plan for test C. It is the shortest of the 11 query execution plans by a long way.

Phrases such as 'scanning' or 'using temporary index' are a clear indication of a poorly optimised query.

OptQ-ShowPlanC


ISAM Stats                                                                                                                                                 Return To Top

Older versions of Access (up to and including Access 2010) included another little-known feature: ISAM Stats.

NOTE: ISAM = Indexed Sequential Access Method.
For more info, see this article at W3Schools.com: Indexed Sequential Access Method (ISAM)

ISAMStats was a hidden and undocumented member of the DAO reference library.

ISAMStats
It could be used to determine the number of Disk Reads & Writes, Cache Reads, ReadAheadReads, LocksPlaced and LocksReleased.
For example:

        ISAM Stats for tests A & B

ISAMStatsTestsA&B
        ISAM Stats for tests I & J

ISAMStatsTestsI&J




In simple terms, the smaller each of these figures is, the more efficient the query will be.
However, the feature does not give any useful information on how to improve that efficiency.

Unsurprisingly, the ISAMStats feature was little used and was one of many features that were deprecated in Access 2013.

You can download the full set of ISAMStats results, which are included in the text file below.

NOTE:
Although ISAMStats is still listed as a hidden member of the DAO library in current versions of Access, running the ISAMStats code in versions later than Access 2010 will cause this error:

ISAMStatsError

NOTE:
1.   Information on using ISAM Stats can be found in Volume 1 (of 2) of the excellent Access 2000 Developers Handbook by Getz, Litwin & Gilbert.

A2000DevHandbook
      See chapter 15 pages 1114-1121.

      The chapter contains a large number of hints and tips on optimising Access apps and is well worth detailed study.

      Although over 20 years old and long out of print, copies of the book are often still available at a low price online.
      I strongly recommend purchasing this if you can find a copy. Versions were also written for Access 97 and 2002.

2.   The code used to run this feature is provided in the module modISAMStats in the attached example database.

      The code was taken from the CD accompanying the Access 2000 Developers Handbook



Downloads                                                                                                                                                 Return To Top

Click to download:
    Example database:                   OptimiseQueries.accdb        Approx 3.5 MB (zipped)

    Query execution plans:            ShowPlan.out                       Approx 1.5 MB (zipped)

    ISAMStats results:                   ISAMStats.txt                       Approx 3 kB

    PDF version of this article:      OptimiseQueries.pdf             Approx 1.0 MB (zipped)



Access Europe Presentation                                                                                                                   Return To Top

I will be giving a presentation called Optimising Queries and the JET ShowPlan to the online Access Europe User Group on Wed 7 Sept 2022.

For further details, times and login information, see: Access Europe – Colin Riddington (Optimising Queries and the JET ShowPlan)

The session will be recorded and the video uploaded to the Access Europe channel on YouTube.



Colin Riddington           Mendip Data Systems                 Last Updated 24 July 2022


