Skip to content

PERFORMANCE: SELECT COUNT() #30

@kfo2010

Description

@kfo2010

PROBLEM: The use of select count() in mysql is extremely inefficient and will not scale well with tables full of millions+ or even billions+ of rows.

PROOF:
sql: EXPLAIN SELECT COUNT(1) FROM transactions; # this shows that it did a full index/table scan of every row in the table. 1.5M rows. Full index/table scans are bad. They are SLOOOW and will not scale well.

sql: EXPLAIN SELECT COUNT(id) FROM transactions; # still full scan 1.5M rows

sql: EXPLAIN SELECT id FROM transactions ORDER BY id LIMIT 0,1; # still full scan 1.5M rows

sql: EXPLAIN SELECT id FROM transactions ORDER BY id LIMIT 1; # scanned ONE row.

sql: SHOW TABLE STATUS LIKE 'transactions'; # this took 80 milliseconds

RESULT: The difference is 2+ seconds vs 80 milliseconds. This is huge and the 2+ seconds will increase over time as data grows. The 80 milliseconds will not.

SOLUTION:

  1. If you need to check if at least one result is returned use ORDER BY id LIMIT 0,1; instead of SELECT COUNT. Avoid using COUNT if at all possible. This will not scale.
  2. SQL queries in code should be tested with EXPLAIN. If they are doing full table scans, they need to be removed completely from code and other methods should be used that can scale to billions of rows.
  3. NOTE: InnoDB (MyISAM is dead) do not store row counts and will count() all rows each time.
  4. if you need table counts try "SHOW TABLE STATUS LIKE ..." NOTE this is an approximation but should be fine for large tables when you need to do some sort of paging.
  5. This should also speed up PAGING on the block explorer. I am seeing that the explorer is starting to get slower. This is one reason why.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions