Which is the performance killer: SELECT INTO or INSERT INTO?

Insert Into / Select Into
There are many ways to kill performance in a script or stored procedure.  However, not many think about the possibility that adding columns to a temporary table can kill performance.  Here, I’m going to show you how a simple design decision – using INSERT vs ALTER TABLE – can have a huge impact on your procedure performance.

This comes from a procedure we wrote for Minion Reindex. It improves your index maintenance by letting you build some impressive dynamic routines without any extra jobs.

INSERT vs ALTER TABLE – SELECT INTO and ALTER TABLE

We recently wrote a helper procedure to allow you to limit the rows for your reindex runs with Minion Reindex. We got it working just like we wanted it, and it returned in just a couple of seconds. The problem was that sometimes it would take a couple of seconds, and other times it would take a couple of minutes. Then it started taking two minutes every time, and we weren’t able to get the performance back down at all.

We added indexes, we ran with recompile, we limited our result sets, etc. We did everything you usually do to fix the performance of a stored procedure. Then I noticed that it was loading data with a SELECT INTO instead of INSERT/SELECT. The code created a temp table with some rows, then added a couple of columns, and then updated those rows.

Here’s a look at the actual code:

SELECT
        F.DBID
        ,   F.TableID AS   objectID
        ,   F.IndexID
        ,   F.DBName
        ,   F.SchemaName
        ,   F.TableName
        ,   F.IndexName
INTO    #IndexTableFrag
FROM    Minion.IndexTableFrag  AS   F
INNER JOIN  ( SELECT    MAX(ExecutionDateTime) AS   ExecutionDateTime
                        ,   DBName
                FROM      Minion.IndexTableFrag
                GROUP BY   DBName
            )  AS   M
            ON F.ExecutionDateTime   = M.ExecutionDateTime
            AND F.DBName   = M.DBName
WHERE   F.DBName   LIKE @DBName;
	   
/*	Minion.IndexTableFrag has a separate ExecutionDateTime for each database. 
So, this INNER JOIN gets the most recent date for each DB.
*/

----------------------------------
-- 2. Get the most recent page count and row count information from Minion.IndexPhysicalStats.

ALTER TABLE #IndexTableFrag ADD page_count BIGINT NULL;
ALTER TABLE #IndexTableFrag ADD record_count BIGINT NULL;

CREATE NONCLUSTERED INDEX ix14587684759 ON #IndexTableFrag (DBID, objectID, IndexID);

UPDATE  I
SET     I.page_count =   S.page_count
        ,   I.record_count =   S.record_count
FROM    #IndexTableFrag  AS   I
INNER JOIN  Minion.IndexPhysicalStats  AS   S
            ON I.DBID   = S.database_id
            AND I.objectID  = S.object_id
            AND I.IndexID   = S.index_id
INNER JOIN  ( SELECT    MAX(ExecutionDateTime) AS   ExecutionDateTime
                        ,   database_id
                FROM      Minion.IndexPhysicalStats
                GROUP BY   database_id
            )  AS   M
            ON S.ExecutionDateTime   = M.ExecutionDateTime
            AND S.database_id   = M.database_id;

So the synopsis is: SELECT INTO, add two columns, update those two columns. That’s it. And most of the time was spent on the ALTER TABLE statements.

I’ll go ahead and say that each of these tables has only about 13,000 rows, so it’s not like we’re dealing with millions of rows here. This is the part of the procedure that was taking so long, and it’s what needed to be corrected.

INSERT vs ALTER TABLE – INSERT

Now that we’ve seen the original poor-performing code, let’s take a look at the fixed code.
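
The fixed code itself is in the full post. As a rough sketch of the general idea only (not necessarily the exact fix described there), one way to avoid the ALTER TABLE statements is to declare the temp table with every column up front and load it with INSERT/SELECT. The column data types and the @DBName declaration below are assumptions for illustration; the original snippet does not show them.

```sql
-- Assumption: in the real procedure @DBName is a parameter.
-- It is declared here only so the sketch stands alone.
DECLARE @DBName sysname = N'%';

-- Every column, including page_count and record_count, is declared up front,
-- so no ALTER TABLE is needed later.
-- Assumption: the data types are guesses, not taken from the Minion tables.
CREATE TABLE #IndexTableFrag
(
        DBID            INT     NULL
        ,   objectID    INT     NULL
        ,   IndexID     INT     NULL
        ,   DBName      sysname NULL
        ,   SchemaName  sysname NULL
        ,   TableName   sysname NULL
        ,   IndexName   sysname NULL
        ,   page_count  BIGINT  NULL
        ,   record_count BIGINT NULL
);

-- Same query as the original SELECT INTO, written as INSERT/SELECT.
INSERT INTO #IndexTableFrag
        (DBID, objectID, IndexID, DBName, SchemaName, TableName, IndexName)
SELECT
        F.DBID
        ,   F.TableID
        ,   F.IndexID
        ,   F.DBName
        ,   F.SchemaName
        ,   F.TableName
        ,   F.IndexName
FROM    Minion.IndexTableFrag  AS   F
INNER JOIN  ( SELECT    MAX(ExecutionDateTime) AS   ExecutionDateTime
                        ,   DBName
                FROM      Minion.IndexTableFrag
                GROUP BY   DBName
            )  AS   M
            ON F.ExecutionDateTime   = M.ExecutionDateTime
            AND F.DBName   = M.DBName
WHERE   F.DBName   LIKE @DBName;
```

With the table built this way, the CREATE INDEX and UPDATE statements from the original code can run against it unchanged, since page_count and record_count already exist as NULL columns.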

Continue reading at MinionWare.net.
