Friday, 14 December 2012

Dynamic CREATE TABLE scripts

Not taking credit for this, but I really like the code and have not come across this sort of thing before.... Basically, if you want to generate CREATE TABLE scripts for multiple tables, or just on-the-fly this will save you....
select  'create table [' + so.name + '] (' + o.list + ')' + CASE WHEN tc.Constraint_Name IS NULL THEN '' ELSE 'ALTER TABLE ' + so.Name + ' ADD CONSTRAINT ' + tc.Constraint_Name  + ' PRIMARY KEY ' + ' (' + LEFT(j.List, Len(j.List)-1) + ')' END
from    sysobjects so
cross apply
    (SELECT 
        '  ['+column_name+'] ' + 
        data_type + case data_type
            when 'sql_variant' then ''
            when 'text' then ''
            when 'decimal' then '(' + cast(numeric_precision as varchar) + ', ' + cast(numeric_scale as varchar) + ')'
            else coalesce('('+case when character_maximum_length = -1 then 'MAX' else cast(character_maximum_length as varchar) end +')','') end + ' ' +
        case when exists ( 
        select id from syscolumns
        where object_name(id)=so.name
        and name=column_name
        and columnproperty(id,name,'IsIdentity') = 1 
        ) then
        'IDENTITY(' + 
        cast(ident_seed(so.name) as varchar) + ',' + 
        cast(ident_incr(so.name) as varchar) + ')'
        else ''
        end + ' ' +
         (case when IS_NULLABLE = 'No' then 'NOT ' else '' end ) + 'NULL ' + 
          case when information_schema.columns.COLUMN_DEFAULT IS NOT NULL THEN 'DEFAULT '+ information_schema.columns.COLUMN_DEFAULT ELSE '' END + ', ' 

     from information_schema.columns where table_name = so.name
     order by ordinal_position
    FOR XML PATH('')) o (list)
left join
    information_schema.table_constraints tc
on  tc.Table_name       = so.Name
AND tc.Constraint_Type  = 'PRIMARY KEY'
cross apply
    (select '[' + Column_Name + '], '
     FROM   information_schema.key_column_usage kcu
     WHERE  kcu.Constraint_Name = tc.Constraint_Name
     ORDER BY
        ORDINAL_POSITION
     FOR XML PATH('')) j (list)
where   xtype = 'U'
AND name    NOT IN ('dtproperties')

Friday, 7 December 2012

While Loop - Cursor Alternative

I don't like cursors much. They're simple to write, easy to implement and good starting point for understanding loops. However, they are slow and memory-thirsty...(is that the right phrase?). So here is an example of using a simple WHILE loop which has the same behaviour as a cursor but it much faster and uses less memory than a cursor.

 DECLARE @RecordCount int
 DECLARE @RowCount int

 DECLARE @Field1 varchar(100)
 DECLARE @Field2 int
 DECLARE @Field3 char(2)

 select distinct IDENTITY(int,1,1) as ID, Field1, Field2, Field3into #loop from dbo.TableName order by ID DESC
 
 SET @RecordCount = (select max(ID) from #loop)
 SET @RowCount = 1

 WHILE @RowCount <= @RecordCount
 BEGIN
  
  select @Field1 = Field1, @Field2 = Field2, @Field3 = Field3 from #loop where ID = @RowCount

  EXECUTE [dbo].[usp_RandomCodeToReplace] 
    @Field1
   ,@Field2
   ,@Field3 

  SET @RowCount = @RowCount + 1

 END

 DROP TABLE #loop

Wednesday, 5 December 2012

Sequence number (Rank function) in SSIS

This example takes a table from SQL Server, orders the columns and creates a Rank (or sequence number). Works well and I've used it many times. First drag a new DataFlow into your package:
Next write your SQL statement to gather your rows. Pay attention to the ordering. WK is the field I will be ranking on and WeekID is the field I will be partitioning on.
Now we have to tell SSIS that we have ordered the way we have (this circumvents the need for a Sort component, which is slow and inefficient). This has to be done for the 'Error Output' too.
Next we tell the package what order the fields are ordered in... In this case WK is sort position 1 and WeekID is position 2.
Next drag a Script Component on to the DataFlow.
Select the fields you want to work with in the code.
Add an new column that will contain the rank numbers.
Next click on 'Design' to add your code.
Here is the code you need to add:
' Microsoft SQL Server Integration Services user script component
' This is your new script component in Microsoft Visual Basic .NET
' ScriptMain is the entrypoint class for script components

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

Public Class ScriptMain
    Inherits UserComponent

    Dim WeekID As Integer

    Dim Counter As Integer = 0

    Dim PrevWeekID As Integer


    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

        WeekID = Row.weekID

        If WeekID = PrevWeekID Then
            Row.Rank = Counter + 1
            Counter = Counter + 1
        Else
            Row.Rank = 1
            Counter = 1
        End If

        PrevWeekID = Row.weekID

    End Sub

End Class



Rank and pick best record SQL

I used to use loops for this kind of thing. This is far more efficient.

WITH Table AS
( 
SELECT
    Field1, Field2, 
    Rank() OVER (PARTITION BY Field2 ORDER BY Field1 desc) AS Rank 
FROM dbo.TableName 
)
SELECT * from Table WHERE Rank = 1

;



Hope this helps.

Run an SSIS package on a remote server without SQL Agent

Situation: no SQL Agent, need to run SSIS package on a server remotely via a button-click event in a winforms app (C#).

This solution will run an SSIS package on a remote server while showing the progress in a multi-line textbox.  To do this effectively we will need to use a BackgroundWorker.

I used an executable called PsExec that is freely available to download.

To set up the logging, right click in your package ControlFlow and set up the logging to a SQL Server table.  This by default will be dbo.sysdtslog90 (for 2005 anyway; for 2008 it will be dbo.sysdtslog100 etc.).

PsExec allows for running command line routines on a remote server.  In this case I am calling a package through a command line routine using DtExec.  If you are unsure on DtExec please google it, I may do another post on it but it is pretty simple stuff and easy to work out.  PsExec simply allows you to pass a computer name that you would like to run the process on.

I start the PsExec, give it a computer name and pass in the Command Line routine to run a SSIS package.

The Background Worker then loops through the log table displaying the log results as the package runs.  Once an 'End of Package' log has been recorded the Background Worker stops. Simples.

If PsExec fails, or DtExec fails, you will also get an error message returned.  I was getting error code 128 for a while - it turned out the DBAs where remote desktoping on to the server and then not logging off, so there were too many users logged in.  Once I got them to log out all worked well.

private void RunSSISPackage()
        {
            try
            {
                SqlConnection conn = new SqlConnection(ConnectionsClass.Server);
                conn.Open();

                string TruncateSQL = "truncate table dbo.sysdtslog90";

                SqlCommand TruncateCMD = new SqlCommand(TruncateSQL, conn);

                TruncateCMD.ExecuteNonQuery();

                conn.Close();

                Process process = new Process();
                process.StartInfo.FileName = @"C:\Documents and Settings\Geoff.Parsons\My Documents\PsExec.exe";
                process.StartInfo.Arguments = "\\\\ServerName -d \"C:\\dtexec.exe\" /SQL PackageName /SERVER ServerName";
                process.StartInfo.UseShellExecute = false;
                process.StartInfo.CreateNoWindow = true;
                process.StartInfo.RedirectStandardOutput = true;
                process.StartInfo.RedirectStandardError = true;

                process.Start();
                process.WaitForExit();

                string strOutput = process.StandardOutput.ReadToEnd();
                string errOutput = process.StandardError.ReadToEnd();


                if (!process.HasExited)
                {
                    process.Kill();
                }

                PackageResultTB.AppendText(strOutput);
                PackageResultTB.AppendText(errOutput);

                if (errOutput != "")
                {
                    if (!errOutput.ToString().Contains("dtexec.exe started on SERVERNAME with process ID"))
                    {
                        return;
                    }
                }

                BW = new BackgroundWorker();
                BW.WorkerReportsProgress = true;
                BW.WorkerSupportsCancellation = false;
                BW.DoWork += new DoWorkEventHandler(BW_DoWork);
                BW.ProgressChanged += new ProgressChangedEventHandler(BW_ProgressChanged);
                BW.RunWorkerCompleted += new RunWorkerCompletedEventHandler(BW_RunWorkerCompleted);

                PrepareControlsForThreading(false);

                BW.RunWorkerAsync();
            }
            catch (Exception ex)
            {
                MessageBox.Show(ex.Message);
                return;
            }
        }
This DoWork event loops through a select query to return new rows. I added a column to sysdtslog90 called 'Logged' which I populated each time a row was read to prevent repeating lines from the log table. On the front-end of the application I have a multi-line textbox called PackageResultTB.

private void BW_DoWork(object sender, DoWorkEventArgs e)
        {
            BW.ReportProgress(0, "Loading Package...");

            SqlConnection conn = new SqlConnection(ConnectionsClass.Server);
            conn.Open();

            bool Continue = true;
            while (Continue)
            {
                string SqlQuery = "select id,event,source,message,datacode, Logged from dbo.sysdtslog90 order by id";
                
                SqlCommand cmd = new SqlCommand(SqlQuery, conn);
                da = new SqlDataAdapter(cmd);
                DataTable dt = new DataTable();
                da.Fill(dt);
                dt.PrimaryKey = new DataColumn[] { dt.Columns["id"] };

                string Update = "";
                

                foreach (DataRow row in dt.Rows)
                {
                    if (row["message"].ToString().Contains("End of package"))
                    {
                        Continue = false;
                    }
                    if (row["Logged"].ToString() == "")
                    {
                        BW.ReportProgress(0, "\r\n" + row["event"].ToString() + "\r\n        " + row["source"].ToString() + "\r\n        " + row["message"].ToString() + " - " + row["datacode"].ToString() + "%\r\n");
                        Update = string.Format("Update dbo.sysdtslog90 set Logged = 'TRUE' where id = '{0}'", row["id"].ToString());
                        UpCmd = new SqlCommand(Update, conn);
                        UpCmd.ExecuteNonQuery();
                    }

                }
                Thread.Sleep(1000);
            }
            conn.Close();
        }


private void BW_ProgressChanged(object sender, ProgressChangedEventArgs e)
        {
            PackageResultTB.AppendText(e.UserState.ToString());
        }


private void BW_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
        {
            try
            {
                //Catch any errors
                if (e.Error != null)
                {
                    MessageBox.Show(e.Error.Message);
                    return;
                }
                else
                {
                    return;
                }
            }
            catch (SystemException ex)
            {
                MessageBox.Show(ex.Message);
                return;
            }
        }

Creating SSIS package from C# application

Unusual one this.  Had a need to truncate about 70 Raw Files after a large ETL process.  There is no specific task in SSIS, but you can set a DataFlow up that passes no rows to the destination and this will sort the problem out. 

Rather than making 70+ DataFlows I initially used a loop, go through each raw file and truncate it - but this changed the meta-data each time (as RawFiles don't use connections) and caused a lot of validation errors.  I could have delayed validation on the tasks, but this would have taken a very long time to go through 10+ packages looking for the use of the raw files.

Instead I chose to create a class in C# that would loop through all the RawFiles in a folder and create a bespoke DataFlow for them within a new package.

The code creates a new package, then uses a class to return a new DataFlow task in a loop.

The DataFlow consists of a RawFile Source followed by a Conditional Split and a RawFile Destination.
The conditional split creates an output with the condition "1 == 0"; as 1 will never equal 0 it will never pass a row so will truncate the RawFile without affecting the meta-data.

This is my class:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.SqlServer.Dts.Runtime;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using wrap = Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using System.Xml;
using System.Data.Common;
using System.Configuration;
using System.Collections;
using System.IO;

namespace SpaceApp_v0._1.Classes
{
    public class RawFileCleanUp
    {

        public TaskHost SSISPackage(Package package, Executable dataFlowTask, string DataFlowTaskName, string RawFileLocation)
        {
            //// Add DataFlow
            TaskHost taskHost = dataFlowTask as TaskHost;
            taskHost.Name = DataFlowTaskName;

            MainPipe pipeline = taskHost.InnerObject as MainPipe;

            //// Add RawFile Source
            IDTSComponentMetaData90 rawSource = pipeline.ComponentMetaDataCollection.New();
            rawSource.ComponentClassID = "DTSAdapter.RawSource.1";
            rawSource.ValidateExternalMetadata = true;
            IDTSDesigntimeComponent90 rawSourceDesignTimeComponent = rawSource.Instantiate();
            rawSourceDesignTimeComponent.ProvideComponentProperties();
            rawSource.Name = "RawFile Source";

            rawSourceDesignTimeComponent.SetComponentProperty("AccessMode", 0);
            rawSourceDesignTimeComponent.SetComponentProperty("FileName", RawFileLocation);

            rawSourceDesignTimeComponent.AcquireConnections(null);
            rawSourceDesignTimeComponent.ReinitializeMetaData();
            rawSourceDesignTimeComponent.ReleaseConnections();
            

            //// Add Conditional Split to drop ALL rows
            IDTSComponentMetaData90 conSplit = pipeline.ComponentMetaDataCollection.New();
            conSplit.ComponentClassID = "DTSTransform.ConditionalSplit";
            conSplit.Name = "Conditional Split";
            conSplit.Description = "Conditional Split Transform";

            CManagedComponentWrapper splitWrapper = conSplit.Instantiate();
            splitWrapper.ProvideComponentProperties();
            splitWrapper.AcquireConnections(null);
            splitWrapper.ReinitializeMetaData();
            splitWrapper.ReleaseConnections();
            

            //// Connect RawFile Source to Split
            IDTSPath90 path = pipeline.PathCollection.New();
            path.AttachPathAndPropagateNotifications(rawSource.OutputCollection[0], conSplit.InputCollection[0]);

            //// Configure split
            IDTSOutput90 splitOutput = splitWrapper.InsertOutput(DTSInsertPlacement.IP_BEFORE, conSplit.OutputCollection[0].ID);
            splitOutput.Name = "Split Output 1";
            splitOutput.Description = "Passes zero rows out";
            splitOutput.IsErrorOut = false;

            IDTSInput90 splitInput = conSplit.InputCollection[0];
            IDTSInputColumnCollection90 splitInputColumns = splitInput.InputColumnCollection;
            IDTSVirtualInput90 splitVirtualInput = splitInput.GetVirtualInput();
            IDTSVirtualInputColumnCollection90 splitVirtualInputColumns = splitVirtualInput.VirtualInputColumnCollection;

            string ConColumn = "";
            int columnCount = splitVirtualInputColumns.Count;
            for (int i = 0; i < columnCount; i++)
            {
                splitWrapper.SetUsageType(splitInput.ID,splitVirtualInput,splitVirtualInputColumns[i].LineageID,DTSUsageType.UT_READONLY);
                if (i == 0)
                {
                    ConColumn = splitVirtualInputColumns[i].Name;
                }
            }

            splitWrapper.SetOutputProperty(splitOutput.ID, "EvaluationOrder",0);
            splitWrapper.SetOutputProperty(splitOutput.ID, "FriendlyExpression", "1 == 0");


            //// Add RawFile Destination
            IDTSComponentMetaData90 rawDestination = pipeline.ComponentMetaDataCollection.New();
            rawDestination.ComponentClassID = "DTSAdapter.RawDestination.1";
            rawDestination.ValidateExternalMetadata = true;
            IDTSDesigntimeComponent90 rawDestinationDeignTimeComponent = rawDestination.Instantiate();
            rawDestinationDeignTimeComponent.ProvideComponentProperties();
            rawDestination.Name = "RawFile Destination";

            rawDestinationDeignTimeComponent.SetComponentProperty("AccessMode", 0);
            rawDestinationDeignTimeComponent.SetComponentProperty("FileName", RawFileLocation);

            rawDestinationDeignTimeComponent.AcquireConnections(null);
            rawDestinationDeignTimeComponent.ReinitializeMetaData();
            rawDestinationDeignTimeComponent.ReleaseConnections();


            IDTSPath90 DestPath = pipeline.PathCollection.New();
            DestPath.AttachPathAndPropagateNotifications(conSplit.OutputCollection[0], rawDestination.InputCollection[0]);

            //// Connect Split to Destination
            IDTSInput90 rawInput = rawDestination.InputCollection[0];
            IDTSVirtualInput90 rawVInput = rawInput.GetVirtualInput();

            foreach (IDTSVirtualInputColumn90 vColumn in rawVInput.VirtualInputColumnCollection)
            {
                rawDestinationDeignTimeComponent.SetUsageType(rawInput.ID, rawVInput, vColumn.LineageID, DTSUsageType.UT_READONLY);
            }

            ////Repeat connections and validate to ensure package will run first time with no warnings and proper metadata.
            rawSourceDesignTimeComponent.AcquireConnections(null);
            rawSourceDesignTimeComponent.ReinitializeMetaData();
            rawSourceDesignTimeComponent.ReleaseConnections();

            splitWrapper.AcquireConnections(null);
            splitWrapper.ReinitializeMetaData();
            splitWrapper.ReleaseConnections();
            
            rawDestinationDeignTimeComponent.AcquireConnections(null);
            rawDestinationDeignTimeComponent.ReinitializeMetaData();
            rawDestinationDeignTimeComponent.ReleaseConnections();

            rawSourceDesignTimeComponent.Validate();
            splitWrapper.Validate();
            rawDestinationDeignTimeComponent.Validate();


            return taskHost;



        }

        


    }

}

You may notice that I call the AcquireConnections, ReinitialiseMetaData and ReleaseConnections more than once.  This seem to relieve an issue I had with validation warnings after the package was saved.  With the code as it is you can run the package the instant it has been created.

This is the code to call the class:

You pick the folder that holds the RawFiles and let it run.  I would recommend changing the NewFileName or setting it to a variable like a text box, or even the original FolderBrowserDialog plus a new file name.

string Folder = "";

            FolderBrowserDialog folderBrowser = new FolderBrowserDialog();
            folderBrowser.RootFolder = Environment.SpecialFolder.Desktop;

            DialogResult result = folderBrowser.ShowDialog();
            if (result == DialogResult.OK)
            {
                Folder = folderBrowser.SelectedPath;
            }

            string[] files = Directory.GetFiles(Folder);

            Microsoft.SqlServer.Dts.Runtime.Application app = new Microsoft.SqlServer.Dts.Runtime.Application();

            Package package = new Package();

            string RawFile = "";

            foreach (string file in files)
            {
                RawFile = file.Substring(Folder.Length, file.Length - Folder.Length);
                RawFile = RawFile.Replace("\\", "");

                RawFileCleanUp rawFileClean = new RawFileCleanUp();

                Executable dataFlowTask = package.Executables.Add("STOCK:PipelineTask");

                TaskHost taskHost = rawFileClean.SSISPackage(package, dataFlowTask, RawFile, file);


            }

            package.Validate(null, null, null, null);

            string NewFileName = Folder + "\\RawFileCleanUp.dtsx";

            XmlDocument myPkgDocument = new XmlDocument();
            package.SaveToXML(ref myPkgDocument, null, null);

            app.SaveToXml(NewFileName, package, null);