Justin Cooney

Web Development Tips and Examples

When you are working with an unfamiliar database, such as a third-party SQL Server database, you want to find out which tables actually store the important data. If there is no clean database diagram, then the best option is to write a T-SQL query in SQL Server Management Studio (SSMS).

If you are interested in this topic, then I’d suggest also reading my article on batch searching your SQL Server database to find out in which tables the information is being stored.

The Logic Behind the Code

To differentiate supporting tables from data storage tables, we can simply assume that tables with more than 50 rows of data are storage tables that we are interested in. To get this data and report on it, we can loop through the database table names in sys.tables, and then get a row count for each table within a sub-query, that can then be saved for a final query output to the user.

The Basic Working Example

Below is the T-SQL that will return the row count and table name for each table in a database that has more than 50 rows:

DECLARE @TableName VARCHAR(255)
DECLARE @ResultsTable TABLE(
 RowCnt INT,
 TableName VARCHAR(500)
 )
DECLARE MY_CURSOR Cursor
FOR
SELECT name FROM sys.tables
ORDER BY name
OPEN MY_CURSOR
FETCH NEXT FROM MY_CURSOR INTO @TableName
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT INTO @ResultsTable (RowCnt,TableName)
Exec('SELECT COUNT(*),''' + @TableName + ''' FROM ' + @TableName)
FETCH NEXT FROM MY_CURSOR INTO @TableName
END
CLOSE MY_CURSOR
DEALLOCATE MY_CURSOR
SELECT RowCnt,TableName FROM @ResultsTable
WHERE RowCnt > 50

As you can see, this can be very useful in getting to know the workings of a strange database.

Expanding on the Basic Example

But let’s take things one step further.

In addition to seeing the table name and the row count, let’s say that we also want to see a snapshot of what columns and datatypes exist in each table. This, for example, would be great for differentiating mapping tables from larger data-holding tables.

If we can get the column names and their datatypes into a single cell, then we can even write queries to further restrict the results to tables that have binary, or large text-field datatypes.

Example Code for Returning a Report of Row Count, Table Name, and Column/DataType

Below is the modified query from above, which now returns a list of each table’s columns and the column’s datatype in addition to the row count and table name.

DECLARE @TableName VARCHAR(255)
DECLARE @TmpColName VARCHAR(255)
DECLARE @ColsList VARCHAR(2000)
DECLARE @ResultsTable TABLE(
 RowCnt INT,
 TableName VARCHAR(500),
 TableCols VARCHAR(2000)
 )
DECLARE MY_CURSOR Cursor
FOR
SELECT name FROM sys.tables
ORDER BY name
OPEN MY_CURSOR
FETCH NEXT FROM MY_CURSOR INTO @TableName
WHILE @@FETCH_STATUS = 0
BEGIN
SET @ColsList = ''
SET @TmpColName = ''
DECLARE MY_INNERCURSOR Cursor
FOR
SELECT column_name + ' (' + DATA_TYPE + ')' [MyColumns]
FROM information_schema.columns
WHERE table_name = @TableName
OPEN MY_INNERCURSOR
FETCH NEXT FROM MY_INNERCURSOR INTO @TmpColName
WHILE @@FETCH_STATUS = 0
BEGIN
 IF LEN(@ColsList)>1
 BEGIN
 SET @ColsList = @ColsList +', ' + @TmpColName
 END
 ELSE
 BEGIN
 SET @ColsList = @TmpColName
 END
FETCH NEXT FROM MY_INNERCURSOR INTO @TmpColName
END
CLOSE MY_INNERCURSOR
DEALLOCATE MY_INNERCURSOR
INSERT INTO @ResultsTable (RowCnt,TableName,TableCols)
Exec('SELECT COUNT(*),''' + @TableName + ''',''' + @ColsList + ''' FROM ' + @TableName)
FETCH NEXT FROM MY_CURSOR INTO @TableName
END
CLOSE MY_CURSOR
DEALLOCATE MY_CURSOR
SELECT RowCnt,TableName,TableCols FROM @ResultsTable
WHERE RowCnt > 50
Posted in , , , , , , , ,

5 responses to “T-SQL Find Highly Populated Tables in a Database”

  1. Chris Barba Avatar

    This is a great post.

    I have a similar script to table row counts that doesn’t use select count(*).
    It’s a little faster then select count(*).
    http://chrisbarba.com/2009/03/02/record-count-of-all-tables-in-a-database/

    1. Justin Cooney Avatar

      Thanks for your feedback Chris, glad you found the post interesting. I took a look at your blog; you have a lot of really useful articles.

      Agreed that select count(*) is on the slower side… as you mentioned, using SELECT rows FROM sysindexes is the more performant option.

      Cheers!

  2. sql Avatar
    sql

    thanks, great info, here’s another piece of code that i like to use in order to find a list of the most populated tables within my database : http://www.sqlpentruincepatori.ro/most-populated-tables-from-the-database-sql-server/

    1. Justin Cooney Avatar

      Hi, thanks for your feedback and for linking to your script. I was playing around with your script a bit today. Its an interesting take on the issue. Cheers!

  3. […] a previous post I talked about how to find highly populated tables in a database. I’m going to expand on that in this post to show how to catalog a database into data-holding […]

Leave a reply to Justin Cooney Cancel reply