A day late for the twenty-ninth post of the 31 Days of SSIS, but it is still coming out. And if things go right, I’ll be caught up where I need to be by the end of the day. If you’re behind on this series, check out the introductory post.
File Export Requirements
When considering a post for today, I got to thinking about something that I'll do for clients on performance tuning engagements. As a consultant, I can't always be on site to gather information and decide whether I want to look at one plan or another. To get around this, I'll often have the client execute an SSIS package that gathers this information for me.
To demonstrate this functionality, let's suppose that I am investigating some potential issues with parallelism at a client. These issues typically occur in the wee hours of the night. Since writing my post Can You Dig It? – The Plan Cache, I have a query I can run to find plans that use parallelism. Unfortunately, when we look at the plan cache each morning, there is nothing of interest to investigate.
We need to find out what information is in the plan cache around the time that the issues are occurring. Since this information does not persist in the plan cache, we need this information placed somewhere less volatile than the SQL Server plan cache.
File Export Solution
Hopefully, we can agree that exporting data into a file with SSIS is easy. What about when you want to export the contents of a column in each row into its own file? Fortunately, that is just as easy: all it takes is creating a file name and using the Export Column transformation.
Since the information that we want to get is in the plan cache, we can use a query against the plan cache to retrieve all of the plans that utilize parallelism. In the export, we’ll return a list of all of the plans that were found. At the same time, the package will use the Export Column transformation to save each of the execution plans to a file.
Package Control Flow
Our focus for this package is the Export Column transformation, so the control flow will be very simple. Add a data flow task named DFT_ExportPlans and the control flow is almost done.
We will need to add a variable to the package. This way, if the server doesn't have the same folder structure as my development environment, there won't be any issues.
The variable from above is:
- ExportFolder: Added to direct where to store the flat file with statistics and the execution plans.
Package Data Flow
We only have a single data flow. This will do the work of exporting the statistics and files from the SQL Server.
As mentioned, this data flow will export the information on parallel plans that is being sought. The data flow task will query the plan cache to identify the plans that should be exported. Once identified, they will be exported with some statistics for offline analysis.
This is done with the following transformations:
- DS_MSS_ParallelPlans: OLE DB Source that queries the plan cache for plans utilizing parallelism.
- DC_AddQueryPlanOutput: Derived Column transformation that adds a column containing the path and file name for each execution plan.
- EX_QueryPlan: Export Column transformation that exports each execution plan to its own file. These are saved with the SQLPLAN extension so that SSMS recognizes the files as execution plans.
- DD_FF_QueryPlanStats: Flat File destination that provides an index of the execution plans exported along with some basic information on the plans.
The query below is used to identify plans that utilize parallelism. It is derived from the post Can You Dig It? – The Plan Cache, where I discuss querying the plan cache in further detail.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP 10
    CAST('ParallelPlan' + '_'
        + CAST(ROW_NUMBER() OVER (ORDER BY UseCounts DESC) AS VARCHAR(2)) + '_'
        + REPLACE(REPLACE(REPLACE(CONVERT(VARCHAR(20), GETDATE(), 120), ' ', ''), '-', ''), ':', '') + '_'
        + CONVERT(VARCHAR(50), cp.plan_handle, 1) AS VARCHAR(128)) AS QueryPlanName
    ,cp.plan_handle
    ,COALESCE(DB_NAME(p.dbid),
        p.query_plan.value('(//RelOp/OutputList/ColumnReference/@Database)', 'nvarchar(128)')) AS DatabaseName --Works in a number of cases, but not perfect.
    ,DB_NAME(p.dbid) + '.' + OBJECT_SCHEMA_NAME(p.objectid, p.dbid) + '.'
        + OBJECT_NAME(p.objectid, p.dbid) AS ObjectName
    ,cp.objtype
    ,cp.UseCounts
    ,p.query_plan
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS q
WHERE cp.cacheobjtype = 'Compiled Plan'
    AND p.query_plan.exist('//RelOp[@Parallel = "1"]') = 1
The key to this package is setting up the Export Column transformation. Fortunately, compared to many other tasks, setting up this transformation is relatively easy.
The first piece to configure is the Extract Column (red arrow below). This will map to a column in the data flow that has a data type that can be exported to a file. The only data types that can be exported are DT_TEXT, DT_NTEXT, or DT_IMAGE. The benefit of this limitation is that the 4,000 character limit for string data (DT_STR and DT_WSTR) doesn’t interfere with this transformation.
Next, the File Path Column needs to be configured (orange arrow below). This column holds the name of the file that the data will be placed within. In this solution, I've configured this through the use of a Derived Column transformation, but you can create it in whichever manner you wish.
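For reference, the Derived Column expression I use to build the file path looks something like the fragment below. The column name QueryPlanName comes from the query earlier in the post; the exact expression in your package may differ:

```
@[User::ExportFolder] + QueryPlanName + ".SQLPLAN"
```

The expression concatenates the ExportFolder variable with the plan name and the SQLPLAN extension, giving each plan its own file path.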
With those two items configured, the next step is to configure how the export will occur. There are three options for setting this up (blue bubble above):
- Allow Append: Determines whether data is appended to the end of the file if it exists. Otherwise, the data is added to a new file.
- Force Truncate: Determines whether data overwrites any data already in the file.
- Write Byte-Order Mark: Determines whether a byte-order mark (BOM) is written to the file. For DT_NTEXT data I have found that this value should be false; otherwise, the file will be corrupted.
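If you suspect a BOM is what corrupted an exported file, a few lines of script can confirm it. This is a quick sketch of my own, not part of the package; the function name and the idea of pointing it at an exported .SQLPLAN file are just for illustration:

```python
import codecs

def has_utf16_bom(path):
    """Return True if the file begins with a UTF-16 byte-order mark."""
    with open(path, "rb") as f:
        head = f.read(2)
    return head in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
```

Running this against a file that SSMS refuses to open as a plan will tell you whether the Write Byte-Order Mark option is the culprit.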
There are plenty of reasons you may want to export data from your database in this manner. The scenario above is a prime example of how I use this as a consultant. Or maybe you want to use Plan Explorer and need a good selection of plans to investigate.
Outside of execution plans, you may need to change how a database stores image and document files. In that case, you can export each of the files and place them in their new directory. Or, if you need to delete data that contains image or document information, you could export it first to archive the data in a manner that doesn't require restoring the SQL Server database.
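The per-row export pattern is simple enough to emulate outside SSIS as well. As a rough sketch under my own assumptions (rows stands in for a result set of file-name/varbinary pairs, and export_rows is a name I made up for this example):

```python
from pathlib import Path

def export_rows(rows, folder):
    """Write each row's binary payload to its own file, one file per row,
    mirroring what the Export Column transformation does in the data flow."""
    out = Path(folder)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for name, payload in rows:
        (out / name).write_bytes(payload)
        count += 1
    return count
```

Feed it the rows from a query that returns a file name column and a binary column, and you get one file per row in the target folder.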
This transformation can be very useful when you include it in your tool belt. Take some time to play with it, and you may find that a solution or two you’ve already built could have leveraged this transformation.