Simon Worth’s SQL Server Weblog

SQL Server and its many complexities

Archive for the ‘Change Data’ Category

SSIS Configuration Tables and Tracking Changes

with 2 comments

A typical scenario when developing SSIS packages is to use configurations for initialization of packages, connections, variable values, etc.  There are few different types of configurations available through SSIS including XML files, parent package values, environment variables, and SQL Server.

In the case of SQL Server, the configurations you specify are created, updated, and queried from a table.  The table can be named what ever you want (as long as it doesn’t break the naming rules of objects) and placed in what ever schema is required.  As long as the correct column names exist in the table (with the proper data types) SSIS can use this table to hold configuration values.

Figure 1 – Package Configuration Wizard

image

If you run a trace on SQL Server while adding a SQL Server configuration in SSIS (Figure 1) returns the following query executed by BIDS to determine if a configuration table candidate exists for the specified connection.

Script 1 – Query from BIDS to find configuration tables

SELECT
       TABLE_SCHEMA
      ,TABLE_NAME
  FROM
       INFORMATION_SCHEMA.COLUMNS
 WHERE
       COLUMN_NAME = 'ConfiguredValue'
            OR COLUMN_NAME = 'PackagePath'
                  OR COLUMN_NAME = 'ConfiguredValueType'
                        OR COLUMN_NAME = 'ConfigurationFilter'
 GROUP BY
       TABLE_SCHEMA
      ,TABLE_NAME
HAVING
       COUNT(1) = 4

 

The above query looks for any table within the specified connection that has the 4 required column names (ConfiguredValue, PackagePath, ConfiguredValueType, and ConfigurationFilter).  If those 4 column names exist in a table on the specified connection then the table name will be returned to the “Configuration Table” section of the configurations wizard dropdown box.  If there is no candidate in the current connection then the dropdown will be empty – and your only option within BIDS is to click the “New” button in order to create a table.  You can also create the table using SSMS, and the next time BIDS queries the specified connection, the table will show up in the dropdown – as long as the correct column names exist in the object.

Script 2 shows the default DDL that BIDS generates for creating a configuration table.  As you can see the default code places the table in the “dbo” schema and names the table “SSIS Configurations”.  I’m not a fan of object names that have spaces in them so when I create a configuration table in SSMS I generally remove the space.  But I’m also not a fan of having the table name as SSISConfigurations or a fan of the table being in the dbo schema.

Script 2 – DDL generated by BIDS for creating a configuration table

CREATE TABLE [dbo].[SSIS Configurations]
(
    ConfigurationFilter NVARCHAR(255) NOT NULL,
    ConfiguredValue NVARCHAR(255) NULL,
    PackagePath NVARCHAR(255) NOT NULL,
    ConfiguredValueType NVARCHAR(20) NOT NULL
)

 

Script 3 is the code I usually use for a basic configuration table.  There are some enhancements I like to add as well to the table, but this script mirrors the original BIDS script for basic configurations.

Script 3 – My DDL to create a configuration table

CREATE TABLE [Config].[ETLConfigurations](
       [ConfigurationFilter] NVARCHAR(255) NOT NULL
      ,[ConfiguredValue]     NVARCHAR(255) NULL
      ,[PackagePath]         NVARCHAR(255) NOT NULL
      ,[ConfiguredValueType] NVARCHAR(20) NOT NULL)

 

So far the only differences between the SSIS generated DDL and the DDL I use in SSMS are the use of the Config schema rather than the dbo schema and I have changed to the name to “[ETLConfigurations]” rather than using the provided “[SSIS Configurations]” name.

One thing I like to see on a configuration table – or any other table for that matter that is tracking meta data or lookup information – is the use of audit columns.  These columns can be as simple (and usually are) as the name of the user who inserted the data and a date and time for when the data was inserted.  Script 4 demonstrates adding on 3 audit columns to the configurations table to track who inserted the record and when it was inserted.

Script 4 – Alter config table to add 3 new audit columns

ALTER TABLE
       Config.ETLConfigurations
  ADD
       InsertUser NVARCHAR(100) NOT NULL
            CONSTRAINT DF_ETLConfigurations_InsertUser DEFAULT SUSER_SNAME()
      ,InsertDate DATE NOT NULL
            CONSTRAINT DF_ETLConfigurations_InsertDate DEFAULT GETDATE()
      ,InsertTime TIME(7) NOT NULL
            CONSTRAINT DF_ETLConfigurations_InsertTime DEFAULT GETDATE()

 

The column InsertUser captures the full login name of the user that added the record – and InsertDate and InsertTime capture the date and time (using the new to 2008 DATE and TIME data types) of transaction.  If you’re using SQL Server 2005 then these 2 data types won’t be available to you – so you can just have 1 column instead called InsertDate with the data type DATETIME and this will capture the date and the time.  All 3 columns have defaults assigned to them so the data will be automatically added when a record is inserted and there is no need for a trigger to manage this information as new records are added.

These 3 new columns are a good beginning for auditing and data change logging – but they do not help with updates and deletes to the configuration table.

Why would we care what changes take place within the configuration table?  Does it really matter who touched what, when, and with what application?  Well sometimes it does matter.  Generally the configuration table will be feeding information to many packages – not just one – and these packages could span multiple projects and data sources.  Reusing a configuration value is a common practice in developing packages, so if a value changes it can affect many packages.  If a value in the configuration table gets changed or removed by someone or something by accident (or deliberately) this change could affect a very large set of enterprise solutions and some sort of tracking is a good idea to find out what happened and why.

Now we have 3 audit columns on the configuration table to track insert information and at this point we have the option to add a few more columns to the configuration table – but instead of InsertUser, InsertDate, and InsertTime we could add UpdateUser, UpdateDate, and UpdateTime.  These columns would be NULL when data is inserted, and a trigger could be used to populate the columns when data is changed.  However, this doesn’t help with data being deleted – only updated, and the update columns would only reflect the last change to the specific row, not all the changes for that row.

Another approach is to create a table that will capture the history of the changes that take place in the configuration table – both updates and deletes.  This can be accomplished by creating a new table and adding a trigger to the original configuration table.

Script 5 – History table for configuration changes

CREATE TABLE [Config].[ETLConfigurationsHistory] (
       [ConfigurationFilter]    NVARCHAR(255) NOT NULL
      ,[OldConfiguredValue]     NVARCHAR(255) NULL
      ,[NewConfiguredValue]     NVARCHAR(255) NULL
      ,[OldConfiguredValueType] NVARCHAR(20)  NOT NULL
      ,[NewConfiguredValueType] NVARCHAR(20)  NOT NULL 
      ,[ChangeUser]             NVARCHAR(100) NOT NULL
      ,[ChangeDate]             DATE          NOT NULL
      ,[ChangeTime]             TIME          NOT NULL
      ,[ChangeType]             NVARCHAR(10)  NOT NULL
      ,[ChangeApplicationName]  NVARCHAR(255) NOT NULL)

 

This history table will capture the previous value and new value for ConfiguredValue and ConfiguredValueType as well as the user that made the change, the date and time the change was made, the type of change (IE – Insert, Delete, Update), and the application name that made the change.  To capture all this audit information we create a trigger for UPDATE and DELETE on the original configuration table.  It is important to note that when you alter a configuration value within BIDS, the current configured value is deleted from the table and a new row is inserted with the new information.  So, on a system that is not being touched manually by users you will only see a record deleted and then a record inserted.

Script 6 – Trigger on configuration table to capture changes

CREATE TRIGGER 
       Config.tr_ETLConfigurations 
    ON 
       Config.ETLConfigurations 
 AFTER INSERT, UPDATE, DELETE AS 

-- SSIS deletes and inserts a row when a configuration value changes rather than
-- updating the row itself.  So if a row is updated - then it was outside of the
-- BIDS environment.
DECLARE @Inserted INT, @Deleted INT

SELECT @Inserted = COUNT(1) FROM Inserted
SELECT @Deleted = COUNT(1) FROM Deleted

--if inserted and deleted have record counts then this transaction is an update
IF (@Inserted > 0 AND @Deleted > 0)
      BEGIN
            INSERT INTO [Config].[ETLConfigurationsHistory]
                       ([ConfigurationFilter]
                       ,[OldConfiguredValue]
                       ,[NewConfiguredValue]
                       ,[OldConfiguredValueType]
                       ,[NewConfiguredValueType]
                       ,[PackagePath]
                       ,[ChangeUser]
                       ,[ChangeDate]
                       ,[ChangeTime]
                       ,[ChangeType]
                       ,[ChangeApplicationName])
            SELECT
                   ec.ConfigurationFilter
                  ,d.ConfiguredValue AS OldConfiguredValue
                  ,i.ConfiguredValue AS NewConfiguredValue
                  ,d.ConfiguredValueType AS OldConfiguredValueType
                  ,i.ConfiguredValueType AS NewConfiguredValueType
                  ,ec.PackagePath
                  ,SUSER_SNAME() AS ChangeUser
                  ,CAST(GETDATE() AS DATE) AS ChangeDate
                  ,CAST(GETDATE() AS TIME) AS ChangeTime
                  ,'Updated' AS ChangeType
                  ,APP_NAME() AS ChangeApplicationName
              FROM
                   [Config].[ETLConfigurations] ec
             INNER JOIN
                   Inserted i
                ON i.ConfigurationFilter = ec.ConfigurationFilter
               AND i.PackagePath = ec.PackagePath
             INNER JOIN
                   Deleted d
                ON d.ConfigurationFilter = ec.ConfigurationFilter
               AND d.PackagePath = ec.PackagePath
      RETURN
      END

IF (@Inserted > 0 AND @Deleted = 0)
      BEGIN
            INSERT INTO [Config].[ETLConfigurationsHistory]
                       ([ConfigurationFilter]
                       ,[OldConfiguredValue]
                       ,[NewConfiguredValue]
                       ,[OldConfiguredValueType]
                       ,[NewConfiguredValueType]
                       ,[PackagePath]
                       ,[ChangeUser]
                       ,[ChangeDate]
                       ,[ChangeTime]
                       ,[ChangeType]
                       ,[ChangeApplicationName])
            SELECT
                   i.ConfigurationFilter
                  ,NULL AS OldConfiguredValue
                  ,i.ConfiguredValue AS NewConfiguredValue
                  ,NULL AS OldConfiguredValueType
                  ,i.ConfiguredValueType AS NewConfiguredValueType
                  ,i.PackagePath AS PackagePath
                  ,SUSER_SNAME() AS ChangeUser
                  ,CAST(GETDATE() AS DATE) AS ChangeDate
                  ,CAST(GETDATE() AS TIME) AS ChangeTime
                  ,'Inserted' AS ChangeType
                  ,APP_NAME() AS ChangeApplicationName
              FROM
                   Inserted i
      RETURN
      END

--a delete took place
IF (@Inserted = 0 AND @Deleted > 0)
      BEGIN
            INSERT INTO [Config].[ETLConfigurationsHistory]
                       ([ConfigurationFilter]
                       ,[OldConfiguredValue]
                       ,[NewConfiguredValue]
                       ,[OldConfiguredValueType]
                       ,[NewConfiguredValueType]
                       ,[PackagePath]
                       ,[ChangeUser]
                       ,[ChangeDate]
                       ,[ChangeTime]
                       ,[ChangeType]
                       ,[ChangeApplicationName])
            SELECT
                   d.ConfigurationFilter
                  ,d.ConfiguredValue AS OldConfiguredValue
                  ,NULL AS NewConfiguredValue
                  ,d.ConfiguredValueType AS OldConfiguredValueType
                  ,NULL AS NewConfiguredValueType
                  ,d.PackagePath AS PackagePath
                  ,SUSER_SNAME() AS ChangeUser
                  ,CAST(GETDATE() AS DATE) AS ChangeDate
                  ,CAST(GETDATE() AS TIME) AS ChangeTime
                  ,'Deleted' AS ChangeType
                  ,APP_NAME() AS ChangeApplicationName
              FROM
                   Deleted d
      RETURN
      END

 

If we apply the trigger to the configuration table and start making some changes – either manually using DML in SSMS or through the configurations wizard in SSIS the changes will now be captured to the history table.  As I said earlier, BIDS deletes a row then inserts it again if modifications are made so with any luck you will only see deletes and inserts in the history table.  But if you do encounter an update then you know that the data has been manually changed by a user or some other process outside of BIDS.

Written by Simon Worth

February 20, 2009 at 7:37 pm