SQL Code i3: November 2011

Monday, 28 November 2011

Cool CROSS APPLY Tricks, Part 1

In most SQL Server books or tutorials, the APPLY operator would get no more than a 1 or 2 page treatment, if at all. (A notable exception is Itzik Ben-Gan’s book Inside Microsoft SQL Server 2005: T-SQL Querying, which goes into some depth). But in my opinion it’s a really cool and extremely useful operator.

The example below is representative of about 97% of the examples you see in books and tutorials regarding APPLY in that it uses a table-valued function. Using the AdventureWorks database, it finds all Vista credit cards that expire in June of 2008, and gives information about the contact associated with the credit card.

select f.FirstName
      ,f.LastName
      ,f.JobTitle
      ,f.ContactType
      ,cc.CardNumberfrom Sales.CreditCard ccjoin Sales.ContactCreditCard ccc on cc.CreditCardID=ccc.CreditCardIDcross apply dbo.ufnGetContactInformation(ccc.ContactID) fwhere cc.ExpYear=2008
  and cc.ExpMonth=6
  and cc.CardType='Vista'/*
FirstName LastName JobTitle         ContactType   CardNumber
--------- -------- ---------------- ------------- --------------
Peggy     Justice  Owner            Store Contact 11119759315644
John      McClane  Purchasing Agent Store Contact 11119490672347
Laura     Cai      NULL             Consumer      11112813884091
Natalie   Gonzales NULL             Consumer      11114369564985
Jarrod    Sara     NULL             Consumer      11116045498593
Katherine Bailey   NULL             Consumer      11119100149656
Stephanie Gray     NULL             Consumer      11112324154556
Shawna    Sharma   NULL             Consumer      11116413893110
Mindy     Rai      NULL             Consumer      11115163407997
Jackson   Jai      NULL             Consumer      11112011871602
And so on... (74 rows total)
*/

The SQL Engine first must evaluate everthing on the left-hand side of the APPLY, and then, for each row, it is able to process (or apply) the right-hand side to produce some kind of a (table-valued) result.

So in the example above, it first evaluates the left-hand side (i.e. JOINs the CreditCard and ContactCreditCard tables and filters through only the Vista credit cards that expire in June2008), and then for each row, it will apply the right-hand side (i.e. call the function,passing it the ContactID) to produce a result (the function returns a 1-row table consisting of FirstName, LastName, JobTitle, and ContactType).

That’s kind of cool, and it was something that you were not able to do prior to SQL2005, when the APPLY operator was introduced.

But the right-hand side doesn’t have to be a function.

Let’s take a look at all the Stores in AdventureWorks whose Main Office is in Wisconsin (we have to trudge through a lot of tables to get that information):

select c.CustomerID
      ,s.Name
from Sales.Customer cjoin Sales.Store s on c.CustomerID=s.CustomerIDjoin Sales.CustomerAddress ca on c.CustomerID=ca.CustomerIDjoin Person.Address a on ca.AddressID=a.AddressIDjoin Person.StateProvince sp on a.StateProvinceID=sp.StateProvinceIDwhere ca.AddressTypeID=3 --MainOffice  and sp.StateProvinceCode='WI'/*
CustomerID Name
----------- ---------------------------
        418 Good Bike Shop
        453 Unique Bikes
        543 Friendly Neighborhood Bikes
        606 Little Bicycle Supply Shop
*/

And for each of those customers, let’s find out the top 3 products that they bought in terms of dollars (and while we’re at it, let’s show the dollars they spent on the product):

select c.CustomerID
      ,s.Name      ,f1.ProductID
      ,f1.PurchaseAmtfrom Sales.Customer cjoin Sales.Store s on c.CustomerID=s.CustomerIDjoin Sales.CustomerAddress ca on c.CustomerID=ca.CustomerIDjoin Person.Address a on ca.AddressID=a.AddressIDjoin Person.StateProvince sp on a.StateProvinceID=sp.StateProvinceIDcross apply (select top (3) 
                    ProductID
                   ,PurchaseAmt=sum(LineTotal)             from Sales.SalesOrderHeader soh
             join Sales.SalesOrderDetail sod
               on soh.SalesOrderID=sod.SalesOrderID
             where CustomerID=c.CustomerID
             group by ProductID
             order by sum(LineTotal) desc) f1where ca.AddressTypeID=3 --MainOffice  and sp.StateProvinceCode='WI'/*
 CustomerID Name                        ProductID PurchaseAmt
----------- --------------------------- --------- -----------
        418 Good Bike Shop                    795  30367.3500
        418 Good Bike Shop                    794  24136.8075
        418 Good Bike Shop                    792  23508.5175
        453 Unique Bikes                      773  16319.9520
        453 Unique Bikes                      771  12239.9640
        453 Unique Bikes                      772  12239.9640
        543 Friendly Neighborhood Bikes       782   9638.9580
        543 Friendly Neighborhood Bikes       868    671.9040
        543 Friendly Neighborhood Bikes       869    335.9520
        606 Little Bicycle Supply Shop        717   1717.8000
        606 Little Bicycle Supply Shop        838    858.9000
        606 Little Bicycle Supply Shop        738    809.3280
*/

Notice that the right-hand side of the CROSS APPLY is a correlated query where we refer to c.CustomerID, a column from the left-hand side. So each of the 4 customers from the left-hand side were applied to our correlated query on the right-hand side to produce 3 rows each.

But let’s not stop there… We can keep chaining additional information. For each of those products we found, let’s find out the top customer for that product (i.e. who placed the most orders on the product?):

select c.CustomerID
      ,s.Name      ,f1.ProductID
      ,f1.PurchaseAmt
      ,f2.TopCustomerID
      ,f2.NumOrdsfrom Sales.Customer cjoin Sales.Store s on c.CustomerID=s.CustomerIDjoin Sales.CustomerAddress ca on c.CustomerID=ca.CustomerIDjoin Person.Address a on ca.AddressID=a.AddressIDjoin Person.StateProvince sp on a.StateProvinceID=sp.StateProvinceIDcross apply (select top (3) 
                    ProductID
                   ,PurchaseAmt=sum(LineTotal)             from Sales.SalesOrderHeader soh
             join Sales.SalesOrderDetail sod
               on soh.SalesOrderID=sod.SalesOrderID
             where CustomerID=c.CustomerID
             group by ProductID
             order by sum(LineTotal) desc) f1cross apply (select top (1)                    TopCustomerID=CustomerID
                   ,NumOrds=count(*)             from Sales.SalesOrderHeader soh
             join Sales.SalesOrderDetail sod
               on soh.SalesOrderID=sod.SalesOrderID
             where ProductID=f1.ProductID
             group by CustomerID
             order by count(*) desc) f2where ca.AddressTypeID=3 --MainOffice  and sp.StateProvinceCode='WI'/*
 CustomerID Name                        ProductID PurchaseAmt TopCustomerID NumOrds
----------- --------------------------- --------- ----------- ------------- -------
        418 Good Bike Shop                    795  30367.3500             4       8
        418 Good Bike Shop                    794  24136.8075             4       8
        418 Good Bike Shop                    792  23508.5175            16       8
        453 Unique Bikes                      773  16319.9520            75       4
        453 Unique Bikes                      771  12239.9640            20       4
        453 Unique Bikes                      772  12239.9640           118       4
        543 Friendly Neighborhood Bikes       782   9638.9580            10       8
        543 Friendly Neighborhood Bikes       868    671.9040            23       4
        543 Friendly Neighborhood Bikes       869    335.9520            10       4
        606 Little Bicycle Supply Shop        717   1717.8000           166       8
        606 Little Bicycle Supply Shop        838    858.9000            24       6
        606 Little Bicycle Supply Shop        738    809.3280           166      12
*/

This second CROSS APPLY (f2) took the f1.ProductID that resulted from the first CROSS APPLY (f1), and it applied it to the correlated query on its right-hand side to produce a single row of information, namely the TopCustomerID and NumOrds columns.

Note that if we wanted to add product descriptions to this query, which we would get from the Production.Product table, we must perform the JOIN to that table AFTER the CROSS APPLY, because everything with APPLY is processed left-to-right.

select c.CustomerID
      ,s.Name      ,f1.ProductID
      ,ProductName=p.Name      ,f1.PurchaseAmt
      ,f2.TopCustomerID
      ,f2.NumOrdsfrom Sales.Customer cjoin Sales.Store s on c.CustomerID=s.CustomerIDjoin Sales.CustomerAddress ca on c.CustomerID=ca.CustomerIDjoin Person.Address a on ca.AddressID=a.AddressIDjoin Person.StateProvince sp on a.StateProvinceID=sp.StateProvinceIDcross apply (select top (3) 
                    ProductID
                   ,PurchaseAmt=sum(LineTotal)             from Sales.SalesOrderHeader soh
             join Sales.SalesOrderDetail sod
               on soh.SalesOrderID=sod.SalesOrderID
             where CustomerID=c.CustomerID
             group by ProductID
             order by sum(LineTotal) desc) f1cross apply (select top (1)                    TopCustomerID=CustomerID
                   ,NumOrds=count(*)             from Sales.SalesOrderHeader soh
             join Sales.SalesOrderDetail sod
               on soh.SalesOrderID=sod.SalesOrderID
             where ProductID=f1.ProductID
             group by CustomerID
             order by count(*) desc) f2join Production.Product p on f1.ProductID=p.ProductIDwhere ca.AddressTypeID=3 --MainOffice  and sp.StateProvinceCode='WI'/*
 CustomerID Name                        ProductID ProductName                PurchaseAmt TopCustomerID NumOrds
----------- --------------------------- --------- -------------------------- ----------- ------------- -------
        606 Little Bicycle Supply Shop        717 HL Road Frame - Red, 62      1717.8000           166       8
        606 Little Bicycle Supply Shop        738 LL Road Frame - Black, 52     809.3280           166      12
        453 Unique Bikes                      771 Mountain-100 Silver, 38     12239.9640            20       4
        453 Unique Bikes                      772 Mountain-100 Silver, 42     12239.9640           118       4
        453 Unique Bikes                      773 Mountain-100 Silver, 44     16319.9520            75       4
        543 Friendly Neighborhood Bikes       782 Mountain-200 Black, 38       9638.9580            10       8
        418 Good Bike Shop                    792 Road-250 Red, 58            23508.5175            16       8
        418 Good Bike Shop                    794 Road-250 Black, 48          24136.8075             4       8
        418 Good Bike Shop                    795 Road-250 Black, 52          30367.3500             4       8
        606 Little Bicycle Supply Shop        838 HL Road Frame - Black, 44     858.9000            24       6
        543 Friendly Neighborhood Bikes       868 Women's Mountain Shorts, M    671.9040            23       4
        543 Friendly Neighborhood Bikes       869 Women's Mountain Shorts, L    335.9520            10       4
*/

If we had tried to place that JOIN with all the other JOINs (i.e. BEFORE the first CROSS APPLY), we would get an error: The multi-part identifier “f1.ProductID” could not be bound.

This is all interesting, but the real magic of CROSS APPLY is yet to come in Part 2.

Thursday, 24 November 2011

It's the Natural Order of Things... NOT!

In moving from the Visual FoxPro (VFP) world to the SQL Server world, the biggest thing that I had to un-learn was the concept of a “natural order” of a table. In VFP, when you add rows to a table, they ALWAYS get appended to the end of the table. No exceptions. So when you do a SELECT FROM the table, the rows will ALWAYS appear in the same order no matter what. You can create as many indexes as you like on the table, and it still won’t make any difference, because VFP will ALWAYS use the “natural order” un-indexed rows when you do a SELECT FROM the table.

All of that gets thrown out the window when it comes to SQL Server. I see many posts on the MSDN T-SQL Forum from people who present some sample data and want some kind of solution that assumes a certain order in the table. For example, here’s a typical example of a post I’ve seen many times:

My data looks like this:

Code  Quantity
----  --------
A           23
A           15
A           11
B           14
B           18
A           43
C           12

And I want to sum up the quantities by Code and get this result:

Code  TotQuantity
----  -----------
A              49
B              32
A              43
C              12

The inevitable and immediate question from others on the forum trying to help is: “You can’t do what you want unless you order on some other column. Is there a datetime column or something?” And many times, the answer is no, there is no order. This is just what they see when they do a SELECT * and they assume that this is a “natural” order of the rows as they were chronologically INSERTed into the table.

Wrong-o.

I can certainly understand that this is a natural assumption to make. Coming from the VFP world, it’s certainly the assumption I would have made… until I learned more about how SQL Server works.

Let’s see just how wrong this assumption is.

Create a table and insert 10 rows into it and let’s take a look at what we get when we do a SELECT * from it:

create table #test (  col1 varchar(900) not null ,col2 char(1) not null
)
  insert #test values ('B'+replicate('.',899),'N')insert #test values ('R'+replicate('.',899),'A')insert #test values ('A'+replicate('.',899),'T')insert #test values ('D'+replicate('.',899),'U')insert #test values ('S'+replicate('.',899),'R')insert #test values ('C'+replicate('.',899),'A')insert #test values ('H'+replicate('.',899),'L')insert #test values ('U'+replicate('.',899),'O')insert #test values ('L'+replicate('.',899),'R')insert #test values ('Z'+replicate('.',899),'D')
 
select * from #test/*
col1      col2
--------- ----
B........ N
R........ A
A........ T
D........ U
S........ R
C........ A
H........ L
U........ O
L........ R
Z........ D
*/

Great. That makes sense. It’s the “natural” chronologically-entered order of the rows. You can see my name (Brad Schulz) spelled down vertically in the first column and “Natural Ord” spelled down vertically in the second column. That’s the order we put them in, and so it’s not surprising that’s the way they come out.

Now let’s insert two more rows and do another SELECT *:

insert #test values ('X','X')insert #test values ('Y','Y')select * from #test/*
col1      col2
--------- ----
B........ N
R........ A
A........ T
D........ U
S........ R
C........ A
H........ L
U........ O
X         X
Y         Y
L........ R
Z........ D
*/

Huh? Why didn’t it append them at the end? Well, because SQL Server just doesn’t do an append. Table data is stored as a collection of 8-KB (8192-byte) pages. If you subtract the space for page header information and other overhead, that leaves 8060 bytes for the data on each page. And here’s the key: a row of data can not span multiple pages.

The first 10 rows I created were 901 bytes each. The first 8 rows took up 8 * 901 = 7208 bytes, leaving 8060 - 7208 = 852 bytes left in the page. The ninth row of 901 bytes was too big to fit in that page, so it was placed on a whole new page, along with the tenth row (since it, too, was too big to fit on the first page).

But when I added an 11th and 12th row, I purposely made those rows only 2 bytes each. And SQL Server plopped them on the first data page of the table because there was easily enough room for them. So now when we do a SELECT *, we are getting the first page’s worth of data followed by the second page.

You can read more at Books Online about SQL’s data storage in pages.

Now let's stir things up and create a clustered index on col1:

create clustered index ix_test1 on #test (col1)select * from #test/*
col1      col2
--------- ----
A........ T
B........ N
C........ A
D........ U
H........ L
L........ R
R........ A
S........ R
U........ O
X         X
Y         Y
Z........ D
*/

The table is no longer a heap of unordered rows… the table is now ordered on the column we called col1. So a SELECT is now going to retrieve data from the table in that order. If you look at the query plan, you can see that it’s doing a Clustered Index Scan.

But wait, that’s not all…

Now create a nonclustered index on col2:

create nonclustered index ix_test2 on #test (col2)select * from #test/*
col1      col2
--------- ----
C........ A
R........ A
Z........ D
H........ L
B........ N
U........ O
L........ R
S........ R
A........ T
D........ U
X         X
Y         Y
*/

Hmmm… Why does it now come out in col2 order? After all, we did a SELECT on ALL columns of the table, and all the data is stored in the clustered index. So why does the query plan show that an Index Scan was done on the nonclustered index instead of doing a Clustered Index Scan?

Our nonclustered index on col2 (and any nonclustered index we create) automatically includes col1 in the index as well, because it needs that to do a lookup into the clustered index in case it needs to acquire any further columns. So our nonclustered index has both col1 and col2 in it… all the columns of our table are in it!

And why does SQL prefer the NonClustered Index Scan over the Clustered Index Scan, even though they both have exactly the same amount of data? Well, in normal real-life situations, a nonclustered index will have less data in its pages than a clustered one (since the clustered index contains ALL columns). The size of a nonclustered index will be either equal to or less than the size of a clustered index… it will never be bigger… so for that reason, SQL builds an execution plan that will always favor a covering nonclustered index.

Now let’s remove the two indexes… no ordering at all. What happens?

drop index #test.ix_test1drop index #test.ix_test2select * from #test/*
col1      col2
--------- ----
A........ T
B........ N
C........ A
D........ U
H........ L
L........ R
R........ A
S........ R
U........ O
X         X
Y         Y
Z........ D
*/

Wait a minute… We got rid of the clustered index. Why is the data in col1 order now? I assume that when we removed the clustered index, SQL scanned the clustered index (in col1 order) and re-wrote the data into a heap (a bunch o’ data without a clustered index).

I find it interesting, though, that it didn’t follow the rules of putting a new row into the first page it can find that has room. In other words, it wrote out the first 8 rows (the ones where col1 starts with A, B, C, D, H, L, R, and S), taking up 7208 bytes. Then row number 9’s 901 bytes was too big to fit, so it started a new page. But then row numbers 10 and 11 were our 2-byte rows, and they were apparently APPENDED AFTER row 9 in the second page and WERE NOT put into the first page where there was ample room for them. Perhaps, in order to remove the clustered index as quickly as possible, it’s just faster to spit out the rows onto successive pages rather than trying to do a “first page that fits” algorithm for every single row. (If anyone has any insight into this, please chime in).

If we add a new row, we can see that it follows the “first page that fits” rule, just as we had seen the last time we manually inserted records, putting the row into the first page:

insert #test values ('Z','Z')select * from #test/*
col1      col2
--------- ----
A........ T
B........ N
C........ A
D........ U
H........ L
L........ R
R........ A
S........ R
Z         Z
U........ O
X         X
Y         Y
Z........ D
*/

So, after all this, I think we can throw the “natural order” concept out the window, don't you? There’s just no such thing.

Wednesday, 23 November 2011

Heigh-Ho! Heigh-Ho! With NTILE We Will Go!

It's not commonly known that the Seven Dwarfs use SQL Server to keep track of the precious gems that they find in the mine. They catalog and record each stone, and, believe it or not, they have a "Finders Keepers" policy. Whoever uncovers a gem with his pickaxe gets to keep it. Not surprisingly, Sleepy has the least amount of stones to his name, and Happy has the most. (Hey, how do you think Happy got his name? The guy’s swimming in it!)

Anyway, you may not have heard, but I'm afraid I have some very sad news. Sneezy passed away last night… he had the mother of all sneezes just before bed, and I'm afraid he ka-chooed his brains out. Memorial services are next Tuesday.

Sneezy's will stipulated that all of his possessions should be randomly split as evenly as possible among the remaining 6 dwarfs. (He figured Snow White had all she needed being married to that rich prince dude). The split would occur based on quantity rather than value (since the gems they find are priceless, after all).

And so, one of the dwarfs (I'll let you guess who), wrote up a T-SQL script to evenly divide Sneezy's possessions among his surviving buddies. It was actually a pretty simple (yet clever) bit of code:

with Sneezys_Stuff as(  select OwningDwarf
        ,InheritingNTileValue=ntile(6) over (order by newid())  from DwarfPossessions
  where OwningDwarf='Sneezy')update Sneezys_Stuffset OwningDwarf=case InheritingNTileValue
                  when 1 then 'Dopey'                  when 2 then 'Doc'                  when 3 then 'Grumpy'                  when 4 then 'Bashful'                  when 5 then 'Sleepy'                  when 6 then 'Happy'                 end;

Hey, is the NTILE function way cool or what? The CTE uses NTILE to evenly assign a value of 1 to 6 to all of Sneezy’s stuff and that NTILE value is used to determine the new owner (1=Dopey, 2=Doc, and so on). And ORDERing BY NEWID() spreads Sneezy’s estate in a random manner.

Now can you guess which dwarf wrote this script? I’ll give you a hint. Let’s say that Sneezy had 27152 items attributed to him. NTILE (6) will divide that number by 6 to come up with 27152 / 6 = 4525 items, with a remainder of 2 (or 27152 % 6). Guess which NTILE value(s) get the remainder? That’s right… the remainder is distributed one-by-one to the NTILE value of 1, then 2, and so on. So in this case, Dopey and Doc will get that little something extra... they’ll get 4526 items while the other poor saps will only get 4525.

Yep, you guessed it… Dopey wrote this. I guess that whole “Dopey” guise is a sham… “Greedy” is more like it. Chances are 5 in 6 (83%) that Dopey will get something extra. Happy, on the other hand, with an NTILE value of 6 will NEVER get something extra. But I guess one could argue that he doesn’t need it anyway.

However, Happy is not going to stand idly by. He slaps Dopey upside the head (and of course he’s happily smiling as he does it) and says he has a fairer way to distribute Sneezy’s stuff. He sits down and whips out this script:

with Dwarf_Vals as (  select Dwarf
        ,DwarfValue=row_number() over (order by newid())  from (select 'Dopey'   union all        select 'Doc'     union all        select 'Grumpy'  union all        select 'Bashful' union all        select 'Sleepy'  union all        select 'Happy') Dwarfs (Dwarf)
)
,Sneezys_Stuff as(  select OwningDwarf
        ,InheritingNTileValue=ntile (6) over (order by newid())  from DwarfPossessions
  where OwningDwarf='Sneezy')update Sneezys_Stuffset OwningDwarf=Dwarf_Vals.Dwarffrom Sneezys_Stuffjoin Dwarf_Vals on Sneezys_Stuff.InheritingNTileValue=Dwarf_Vals.DwarfValue;

He randomly assigns a value of 1 to 6 to each dwarf (via the ROW_NUMBER() function) and that is joined to the values of 1 to 6 that NTILE assigns to Sneezy’s property. Another window function to the rescue.

Grumpy makes one final suggestion. He says that they shouldn’t hard-code the 6 in the NTILE function. He suggests the following instead:

ntile ((select max(DwarfValue) from Dwarf_Vals)) over (order by newid())

Grumpy explains that this is more flexible because you never know when another dwarf might suddenly bite the dust. He says this as he looks at Dopey with a malicious grin.

Monday, 21 November 2011

#usp_CreateAndInsert

People will post questions on T-SQL forums and sometimes will provide sample data, but the data is not provided in a way that others can readily use to create identical sample tables. It is admittedly kind of a pain to write out the code necessary to CREATE TABLE and to populate it via INSERT commands with meaningful data. And that’s where this utility comes in.

The code below creates a temporary procedure called #usp_CreateAndInsert. Temporary procedures can be difficult animals, and you can see my experiences in trying to make this work in my last blog entry. I created a temporary procedure because I didn’t want to force a person to create my procedure in his/her own database but instead somewhere temporary where they don’t have to worry about cleaning up after themselves.

Currently this procedure only works for SQL2005 and beyond. It makes use of the sys.columns catalog view, which apparently wasn’t available in SQL2000. Depending on demand, I may amend the procedure to work in SQL2000.

To use this utility, just follow these steps:

Step 1: Set the current database to the source of the data. For example: USE AdventureWorks

Step 2: Copy/paste the huge chunk of code at the end of this blog entry into a new query window in SSMS and execute it by choosing Query -> Execute from the menu or hitting the F5 Key. It will create the temporary procedure called #usp_CreateAndInsert.

Step 3: Clear the same query window you're already in (don't open another query window because it won't be able to "see" the temporary procedure) and call the newly-created procedure with appropriate parameters. For example:

exec #usp_CreateAndInsert 'Sales.SalesReason'

Step 4: That will produce the following result (I suggest you set your results to go to Grid rather than Text):

--Create/Populate [#SalesReason]:if object_id('tempdb..[#SalesReason]','U') is not null drop table [#SalesReason]
gocreate table [#SalesReason](   [SalesReasonID] int  ,[Name] nvarchar(50)  ,[ReasonType] nvarchar(50)  ,[ModifiedDate] datetime)goset nocount on
insert [#SalesReason] select 1, N'Price', N'Other', '1998-06-01T00:00:00'insert [#SalesReason] select 2, N'On Promotion', N'Promotion', '1998-06-01T00:00:00'insert [#SalesReason] select 3, N'Magazine Advertisement', N'Marketing', '1998-06-01T00:00:00'insert [#SalesReason] select 4, N'Television  Advertisement', N'Marketing', '1998-06-01T00:00:00'insert [#SalesReason] select 5, N'Manufacturer', N'Other', '1998-06-01T00:00:00'insert [#SalesReason] select 6, N'Review', N'Other', '1998-06-01T00:00:00'insert [#SalesReason] select 7, N'Demo Event', N'Marketing', '1998-06-01T00:00:00'insert [#SalesReason] select 8, N'Sponsorship', N'Marketing', '1998-06-01T00:00:00'insert [#SalesReason] select 9, N'Quality', N'Other', '1998-06-01T00:00:00'insert [#SalesReason] select 10, N'Other', N'Other', '1998-06-01T00:00:00'go--select * from [#SalesReason]

Step 5: Just do a Ctrl+A,Ctrl+C (Select All, Copy) in the results window to put that code on your clipboard and you’re ready to paste it (Ctrl+V) into a file or into a forum message.

Step 6: Repeat steps 3-5 above to produce CREATE/INSERT code for other data.

Below is the code to create the #usp_CreateAndInsert procedure. I hope you find it useful. It will certainly be useful to people trying to help out on the forums.

if object_id('tempdb..#usp_CreateAndInsert','P') is not null 
  drop procedure #usp_CreateAndInsert
gocreate procedure #usp_CreateAndInsert
   @Table nvarchar(500) = null      --The source table of the data  ,@Select nvarchar(4000) = '*'     --Columns to SELECT from the table  ,@Join nvarchar(4000) = ''        --Specify optional JOINs   ,@Where nvarchar(4000) = '1=1'    --Specify an optional WHERE clause  ,@MaxRows bigint = 100            --Specify maximum records allowedas
begin
  /*
=====================================================================================
  
Procedure #usp_CreateAndInsert
  
Written by Brad Schulz (be_ess@yahoo.com) Jul2009
  
  
Description/Purpose:
  
  Produces the commands necessary to create and populate a temporary table with 
    sample data.
  The primary purpose is to provide an easy way for forum users to post the code
    necessary for others to get a sampling of their data in order to help.
  The output is just a list of rows (one command line per row) in the results window 
    of SSMS. Once produced, you can do a CTRL+A CTRL+C (Select All, Copy) in the 
    results window and then Paste (CTRL+V) the code into a file or in a message at 
    a SQL forum. (For best results in formatting and copying, set your Results to 
    Grid (as opposed to Text)).
  
  
Parameters:
  
  @Table (required)
     Supply the name of your source table in this parameter.  It can be a simple 
     table name or a 2-part or 3-part or 4-part name.  The final code that will 
     be produced will be create/insert code for a temporary table with the same name.  
     For example, if 'AdventureWorks.Sales.SalesOrderDetail' is supplied for @Table,
     then the resulting create/insert code will be for a temporary table called
     #SalesOrderDetail.  If you pass the name of a temporary table already 
     (for example #Temp) then the resulting create/insert code will be for a table 
     of the same name (#Temp).
  
  @Select (optional... defaults to '*')
     Most likely you will want to limit the columns of the sample.  Enter a comma-
     delimited list of the columns to be pulled from the table specified by @Table.  
     IMPORTANT: Only columns the following datatypes will be included and all others
     will be ignored: bigint, bit, decimal, int, money, numeric, smallint, smallmoney, 
     tinyint, float, real, date, datetime, datetime, datetimeoffset, smalldatetime, 
     time, char, varchar, nchar, nvarchar, uniqueidentifier.
     Note that if a uniqueidentifier field is included, the resulting code will just
     define it as a NVARCHAR field.
  
  @Join (optional... defaults to '')
     If you want to JOIN any tables to the table specified in the @Table parameter,
     you can specify a JOIN clause (or clauses) here. See examples below for details.
     (Note that if you do JOIN any other tables, you should specify @Select columns
     and those columns should have aliases that match the table specified in the 
     @Table parameter.
  
  @Where (optional... defaults to '1=1')
     This allows you to limit the rows of the sample. Enter valid WHERE predicates.
     See examples below for details.
  
  @MaxRows (optional... defaults to 100)
     This is a precaution to make sure you don't create a final output that is 
     thousands of lines long.  If the selection of rows that you make exceeds 
     @MaxRows, then an error will occur.  If you really need the number of rows to 
     be larger than the default of 100, then pass a different number for this 
     parameter.
  
  
Examples (assuming AdventureWorks database):
  
  Number 1: 
  Produce Create/Insert statements for all fields in Sales.SalesReason table:
  
      exec #usp_CreateAndInsert 'Sales.SalesReason'
  
  Number 2: 
  Produce Create/Insert statements for a few fields in Sales.SalesPerson:
  
      exec #usp_CreateAndInsert 
             'Sales.SalesPerson'
            ,@Select='SalesPersonID,SalesQuota,SalesYTD'
  
  Number 3: 
  Produce Create/Insert statements for certain rows in Sales.Customer
  
      exec #usp_CreateAndInsert 
             'Sales.Customer'
            ,@Where='TerritoryID=7 and CustomerType=''S'''
  
    Note that you can use double quotation marks (") if you don't like having to 
    double up your single quotes, but this only works if the @Where parameter is 
    less than 128 chars:
  
      exec #usp_CreateAndInsert 
             'Sales.Customer'
            ,@Where="TerritoryID=7 and CustomerType='S'"
  
  Number 4: 
  Produce Create/Insert statements for a few fields in Sales.SalesOrderHeader,
  and, in joining that table to Sales.Customer, filter out rows based on
  fields in Sales.Customer:
  
      exec #usp_CreateAndInsert 
             'Sales.SalesOrderHeader'
            ,@Select='SalesOrderID,OrderDate,PurchaseOrderNumber,Status'
            ,@Join='join Sales.Customer 
                      on SalesOrderHeader.CustomerID=Customer.CustomerID'
            ,@Where='Customer.CustomerID<80 and 
                     Customer.TerritoryID=7 and 
                     Customer.CustomerType=''S'''
  
    Or, alternately, you can use aliases:
  
      exec #usp_CreateAndInsert 
             'Sales.SalesOrderHeader'
            ,@Select='SalesOrderID,OrderDate,PurchaseOrderNumber,Status'
            ,@Join='oh join Sales.Customer c on oh.CustomerID=c.CustomerID'
            ,@Where="c.CustomerID<80 and c.TerritoryID=7 and c.CustomerType='S'"
  
    To be honest, it may be much easier to create a temp table that contains the 
    data you want and then just run this procedure off of the temp table:
  
      select SalesOrderID,OrderDate,PurchaseOrderNumber,Status
      into #SOH
      from Sales.SalesOrderHeader oh 
      join Sales.Customer c on oh.CustomerID=c.CustomerID
      where c.CustomerID<80 and c.TerritoryID=7 and c.CustomerType='S'
  
      exec #usp_CreateAndInsert '#SOH'
  
  Number 5:
  Produce Create/Insert statements for a few fields in Sales.SalesOrderDetail,
  and, in joining that table to Sales.SalesOrderHeader and Sales.Customer, 
  filter out rows based on fields in Sales.Customer:
  
      exec #usp_CreateAndInsert 
             'Sales.SalesOrderDetail'
            ,@Select='od.SalesOrderID,od.SalesOrderDetailID,od.OrderQty,od.ProductID'
            ,@Join='od join Sales.SalesOrderHeader oh 
                         on od.SalesOrderID=oh.SalesOrderID
                       join Sales.Customer c 
                         on oh.CustomerID=c.CustomerID'
            ,@Where="c.CustomerID<80 and c.TerritoryID=7 and c.CustomerType='S'"
  
    To be honest, it may be much easier to create a temp table that contains the 
    data you want and then just run this procedure off of the temp table:
  
      select od.SalesOrderID,od.SalesOrderDetailID,od.OrderQty,od.ProductID
      into #SOD
      from Sales.SalesOrderDetail od
      join Sales.SalesOrderHeader oh on od.SalesOrderID=oh.SalesOrderID
      join Sales.Customer c on oh.CustomerID=c.CustomerID
      where c.CustomerID<80 and c.TerritoryID=7 and c.CustomerType='S'
  
      exec #usp_CreateAndInsert '#SOD'
  
=====================================================================================
*/
  set nocount on
  
declare @EmptyString nchar(1)       ,@Asterisk nchar(1)       ,@TicTacToe nchar(1)       ,@SqlStmt nvarchar(4000)       ,@NumRows bigint       ,@ErrMsg nvarchar(4000)  
       ,@DestTable nvarchar(100)       ,@ColValues nvarchar(4000)
  set @EmptyString=''set @Asterisk='*'set @TicTacToe='#'
  if isnull(@Table,@EmptyString)=@EmptyStringbegin  raiserror('You must pass a @Table parameter.',16,1)  return -1end
  
if @Select=@Asterisk and @Join<>@EmptyStringbegin  set @ErrMsg=        'You must specify columns in the @Select parameter '       +'if you pass something in the @Join parameter.'  raiserror(@ErrMsg,16,1)  return -1end
  
set @SqlStmt=      'select @NumRows=count(*) from '+@Table+' '+@Join+' where '+@Where--print @SqlStmtexec sp_executesql @SqlStmt, N'@NumRows bigint output', @NumRows output
if @@error<>0begin  set @ErrMsg=        'The @Table, @Join, @Where parameters are invalid.'+char(13)       +'The SQL Statement attempted was the following:'+char(13)       +'  select count(*)'+char(13)       +'  from '+@Table+' '+@Join+char(13)       +'  where '+@Where
  raiserror(@ErrMsg,16,1)  return -1end
if @NumRows>@MaxRowsbegin  set @ErrMsg=        'Your selection produces '+ltrim(str(@NumRows))+' rows, '       +'which is more than the '+ltrim(str(@MaxRows))+' maximum allowed.'+char(13)       +'If you want more than the maximum, then use the @MaxRows parameter'  raiserror(@ErrMsg,16,1)  return -1end
  
if object_id('tempdb..##_CAI_Data','U') is not null drop table ##_CAI_Data
  --Acquire the data and put into a (global) temporary table called ##_CAI_Data.  
--The reason I use a global temp table is because that's the only kind of temp table
--that can be created by sp_executesql that can be used again by this procedure.set @SqlStmt=      'select '+@Select+' into ##_CAI_Data from '+@Table+' '+@Join+' where '+@Where--print @SqlStmtexec sp_executesql @SqlStmt
  if @@error<>0begin  set @ErrMsg=        'The @Table, @Select, @Join, @Where parameters are invalid.'+char(13)       +'The SQL Statement attempted was the following:'+char(13)       +'  select '+@Select+char(13)       +'  from '+@Table+' '+@Join+char(13)       +'  where '+@Where
  raiserror(@ErrMsg,16,1)  return -1end
  --Now that we've gotten our sample data into ##_CAI_Data, let's pull out the
--column definitions of that table and put that information into another
--temporary table called #_CAI_Struif object_id('tempdb..#_CAI_Stru','U') is not null drop table #_CAI_Struselect Name=Name      ,DataType=type_name(System_Type_ID) 
      ,Max_Length
      ,Precision      ,Scale
      ,Collation_Name=Collation_Name 
      ,Column_IDinto #_CAI_Strufrom tempdb.sys.columns where Object_ID=object_id('tempdb..##_CAI_Data','U')  and type_name(System_Type_ID) in ('bigint','bit','decimal','int','money','numeric'                                   ,'smallint','smallmoney','tinyint'                                   ,'float','real'                                   ,'date','datetime2','datetime','datetimeoffset'                                   ,'smalldatetime','time'                                   ,'char','varchar','nchar','nvarchar'                                   ,'uniqueidentifier')
  --Create the code that will eventually pull that actual data 
--out of our #_CAI_Data tableset @ColValues=''select @ColValues=          @ColValues
         +case 
            when @ColValues=@EmptyString 
              then '' 
            else '+'', '''+char(13)+space(11)+'+' 
          end         +'coalesce('+         +case 
            when DataType in ('char','varchar')              then 'quotename('+quotename(Name)+','''''''')'            when DataType in ('nchar','nvarchar')              then '''N''+quotename('+quotename(Name)+','''''''')'            when DataType in ('date')              then '''''''''+convert(nvarchar(100),'+quotename(Name)+',112)+'''''''''            when DataType in ('time')              then '''''''''+convert(nvarchar(100),'+quotename(Name)+',108)+'''''''''            when DataType in ('datetime','datetime2','smalldatetime','datetimeoffset')              then '''''''''+convert(nvarchar(100),'+quotename(Name)+',126)+'''''''''            when DataType in ('money','smallmoney','float','real')              then 'convert(nvarchar(100),'+quotename(Name)+',2)'            when DataType in ('uniqueidentifier')              then '''''''''+convert(nvarchar(100),'+quotename(Name)+')+'''''''''            else 'convert(nvarchar(100),'+quotename(Name)+')'          end          +',''NULL'')'from #_CAI_Stru--print @ColValues
  
--Set the destination table that will come out in the resulting code
--to be a temp table name.  Note how PARSENAME() is used to get the
--actual table name of whatever was passed in the @Table parameter
--and a '#' is placed before that to get the destination table.set @DestTable=case 
                 when parsename(@Table,1) is not null 
                   then parsename(@Table,1)                 when parsename(@Table,2) is not null                   then parsename(@Table,2)                 when parsename(@Table,3) is not null                   then parsename(@Table,3)                 else @Table
               end
set @DestTable=quotename(case when left(@DestTable,1)=@TicTacToe 
                              then '' 
                              else '#' 
                         end+@DestTable)
  --Put all the statements together into one giant UNIONed query which
--we will execute via sp_executesqlset @SqlStmt='
select ''--Create/Populate '+@DestTable+':'' collate database_default
  union all
select ''if object_id(''''tempdb..'+@DestTable+''''',''''U'''') is not null ''
      +''drop table '+@DestTable+'''
  union all
select ''go''
  union all
select ''create table '+@DestTable+'''
  union all
select ''(''
  union all
select ''  ''+case when FirstLine=1 then '' '' else '','' end+ColDef
from 
(
  select ColDef=
           quotename(Name)+'' ''
          +case 
             when DataType=''uniqueidentifier'' 
               then ''nvarchar(50)'' 
             else DataType 
           end
          +case 
             when DataType in (''char'',''varchar'',''binary'',''varbinary'')
               then ''(''+case when Max_Length=-1 
                               then ''max'' 
                               else convert(varchar,Max_Length) end+'')''
             when DataType in (''nchar'',''nvarchar'')
               then ''(''+case when Max_Length=-1 
                               then ''max'' 
                               else convert(varchar,Max_Length/2) end+'')''
             when DataType in (''decimal'',''numeric'')
               then ''(''+convert(varchar,Precision)+'',''+convert(varchar,Scale)+'')''
             when DataType in (''float'')
               then ''(''+convert(varchar,Precision)+'')''
             else '''' 
           end
        ,FirstLine=case when Column_ID=(select min(Column_ID) from #_CAI_Stru) 
                        then 1 
                        else 0 end
  from #_CAI_Stru
) ColInserts
  union all
select '')''
  union all
select ''go''
  union all
select ''set nocount on''
  union all
select ''insert '+@DestTable+' select ''
           +'+@ColValues+'
from ##_CAI_Data
  union all
select ''go''
  union all
select ''--select * from '+@DestTable+'''
'--print @SqlStmtexec sp_executesql @SqlStmt
  if @@error<>0begin  raiserror('Unknown error in executing final SQL Statement.',16,1)  return -1end
  
drop table ##_CAI_Datadrop table #_CAI_Stru
  end