JSON to DataSet SQL CLR

I have been working with SQL CLR functions and procedures recently, and have come up with a crazy solution involving JSON, Analysis Services, and Visual Studio Database projects. Before we get into the crazy solution, we are going to take another step into SQL CLR procedures, which will help with understanding the pain points of using an external library. In a previous post we looked at creating JSON objects and in another post creating SQL CLR functions and procedures. In this blog post we are going to deserialize a JSON object and return the data as a dataset to the results window.

The idea behind this blog post is to pass a JSON object to a SQL CLR procedure, deserialize the JSON object into a .NET object, load the .NET object into a dataset, and return the dataset to the client. Now, before you start thinking, “with SQL 2016 we have JSON support, why do we need this?”, there are a few reasons. One, an important part of this blog post is to show how to include external libraries in a CLR project and how to allow SQL Server to use them. Two, we will need JSON in the crazy solution later, so showing how to work with JSON in .NET will help in the next blog post. Three, it will make for an “interesting” blog post and I could not think of anything better.

Data Setup

Here is the sample JSON object that we will pass to the SQL CLR procedure, which happens to be the grades from a student in a previous JSON blog post.

{
  "Grades": [
    {
      "CourseCode": "ICS140",
      "CourseName": "Introduction to Programming",
      "Grade": "A"
    },
    {
      "CourseCode": "ICS225",
      "CourseName": "Web Programming",
      "Grade": "B+"
    },
    {
      "CourseCode": "ICS240",
      "CourseName": "Advanced Programming",
      "Grade": "B-"
    },
    {
      "CourseCode": "ICS310",
      "CourseName": "Database Management",
      "Grade": "B"
    }
  ]
}

What we expect to return is a data set that looks like the following table.

CourseCode CourseName                  Grade
ICS140     Introduction to Programming A
ICS225     Web Programming             B+
ICS240     Advanced Programming        B-
ICS310     Database Management         B

CLR Object List

JsonToDataSet will accept a JSON object in the form of an NVARCHAR(MAX) parameter and will return a dataset built from the Grades array. The element names will become the column headers in the dataset.

Project Setup

Even though we are going to deploy this SQL CLR procedure to the same database, we are going to create a new project called JsonSQLCLR, and we will follow the same process as we did for our FirstSQLCLR project. From the Visual Studio start page, click on “Create a new Project”; from the new project page, find and click on “SQL Server Database Project”, and click “Next”. In the “Configure your new project” window we are going to enter JsonSQLCLR for the project name and select our path again.

SQL CLR

Once Visual Studio finishes configuring the SQL Server Database Project type, right click on the “JsonSQLCLR” project and select the “Add” option, then the “New Item…” option. In the Add New Item window we will select the “SQL CLR C#” option in the left options list. In the center window we will select the “SQL CLR C# Stored Procedure” option, and name it “JsonToDataSet.cs”.

SQL CLR

JsonToDataSet

In the JsonToDataSet method we will add code to send data back to the client using a SqlPipe object, and code to allow the JSON object to be passed into the procedure as a SqlString parameter.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
 
public partial class StoredProcedures
{
  [Microsoft.SqlServer.Server.SqlProcedure]
  public static void JsonToDataSet(SqlString studentGrades)
  {
    SqlPipe sqlPipe = SqlContext.Pipe; 
  }
}

Now that we have the JSON object as a string in the method, we need to deserialize it and put it into a .NET object that we can access and work with. Before we can deserialize the JSON object we need to define a class that we can use to reference and populate data into. We could write our own class, however when there are free tools available to do the work, we will use the free tools. BTW, I like free tools. This Json Utils site will generate C#.NET classes for us by passing in our JSON string. The output will look like the following, which we will add below our StoredProcedures class. At a high level, the following structures state that we can have a student with a list of grades, and each grade has a Course Code, Course Name, and Grade.

public class Grades
{
    public string CourseCode { get; set; }
    public string CourseName { get; set; }
    public string Grade { get; set; }
}
 
public class Student
{
    public IList<Grades> grades { get; set; }
}

The Student class implementation is using an IList type, which will cause the build to fail, because the project does not currently know what an IList type is. The IList type is found in the System.Collections.Generic namespace, so we will need to import it. Add the following line near the top of the code file; you will see other using statements there as well.

using System.Collections.Generic;

After the line is added we should be able to compile the project.

Now that we have the Student and Grades classes implemented, we will need to deserialize the JSON object. In a SQL Server Database Project we do not have libraries loaded by default for working with JSON objects, so we will need to import a JSON library. There are a few libraries that can do this, however, some are not available to SQL Server project types. The ideal library is System.Text.Json, which is available through NuGet, but currently it is not an option for SQL Server database projects or integration services projects. Instead, we are going to use Json.NET from Newtonsoft, which is also a great utility for working with JSON in C#.NET. We can download the libraries from https://www.newtonsoft.com/json. Add the Newtonsoft.Json.dll to a location that is accessible to both your project and the SQL Server instance.

To add the JSON libraries into our project we will right click on “References” and select “Add References…” In the Add Reference window we will select Browse and navigate to the location where you stored Newtonsoft.Json.dll.

SQL CLR SQL CLR

After we have added a reference to the library in the project, we need to import the Newtonsoft.Json namespace in our code. Add the following line near the top of the code file; you will see the other using statements there as well.

using Newtonsoft.Json;

Now we are ready to deserialize the JSON object. Within the Newtonsoft.Json library, we are going to use a generic method called DeserializeObject to convert the JSON object into a Student object.

JsonConvert.DeserializeObject<ClassType>([Json Object]);

The DeserializeObject method accepts the Student class as its type parameter and uses the JSON object passed in as a parameter to create a Student object. At a high level, the Student class gives a mapping to the DeserializeObject method, and the DeserializeObject method will try to populate as much as it can based on that mapping. If the JSON object passed in does not match the mapping, no data will be mapped and you will be left with an empty Student object. Here is the full call, which will instantiate the student object and allow us to start working with it.

Student student = JsonConvert.DeserializeObject<Student>(studentGrades.ToString());
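
Since a mismatched JSON document can leave us with a null Student object or a null grades list, a small guard before we build the output is a reasonable safety net. This is a hedged sketch and not part of the post's final code:

// Hedged defensive check (not in the post's final code): exit gracefully when the
// JSON did not map to any grades instead of throwing a NullReferenceException later.
if (student == null || student.grades == null)
{
    sqlPipe.Send("No grades found in the supplied JSON.");
    return;
}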

Now that we have our JSON object deserialized, we can define our output. In our previous post we created a metadata object to store our formatted date. This worked for a single output, however this time we need to output three columns, so we will need to define a list of metadata objects. We will accomplish this by creating a generic list of type SqlMetaData. We will also use a shortcut syntax for creating the SqlMetaData objects. Instead of creating variables and separate objects like this:

SqlMetaData courseCodeMetaData = new SqlMetaData("CourseCode", SqlDbType.NVarChar, 20);
sqlMetaData.Add(courseCodeMetaData);

We are going to use a shortcut syntax by skipping the variable creation and creating the objects directly in the sqlMetaData list. The full syntax will look like this:

List<SqlMetaData> sqlMetaData = new List<SqlMetaData>();
 
sqlMetaData.Add(new SqlMetaData("CourseCode", SqlDbType.NVarChar, 20));
sqlMetaData.Add(new SqlMetaData("CourseName", SqlDbType.NVarChar, 250));
sqlMetaData.Add(new SqlMetaData("Grade", SqlDbType.NVarChar, 5));
 
SqlDataRecord sqlDataRecord = new SqlDataRecord(sqlMetaData.ToArray());

Now we are ready to start sending data back to the client. Before we can send any rows, we need to mark the beginning of the result set by calling the SendResultsStart method with the SqlDataRecord that defines our columns.

sqlPipe.SendResultsStart(sqlDataRecord);

Now that the pipe has been opened, we can send data from the Student object back to the client. We will accomplish this by traversing the student object with a foreach loop, populating the SqlDataRecord, and sending each record back to the client.

foreach (Grades grade in student.grades)
{
    sqlDataRecord.SetString(0, grade.CourseCode);
    sqlDataRecord.SetString(1, grade.CourseName);
    sqlDataRecord.SetString(2, grade.Grade);
    sqlPipe.SendResultsRow(sqlDataRecord);
}

All that is left is to mark the end of the dataset in the SQL Pipe.

sqlPipe.SendResultsEnd();
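
One optional hardening, not shown in the post's final code: if a row fails while we are streaming, the pipe can be left in the middle of a result set, so wrapping the loop in a try/finally guarantees SendResultsEnd always runs. A minimal sketch using the same variables as above:

sqlPipe.SendResultsStart(sqlDataRecord);
try
{
    foreach (Grades grade in student.grades)
    {
        sqlDataRecord.SetString(0, grade.CourseCode);
        sqlDataRecord.SetString(1, grade.CourseName);
        sqlDataRecord.SetString(2, grade.Grade);
        sqlPipe.SendResultsRow(sqlDataRecord);
    }
}
finally
{
    // Always close the result set, even if a row failed along the way.
    sqlPipe.SendResultsEnd();
}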

Full Code

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using Newtonsoft.Json;
public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void JsonToDataSet(SqlString studentGrades)
    {
        SqlPipe sqlPipe = SqlContext.Pipe;
 
        Student student = JsonConvert.DeserializeObject<Student>(studentGrades.ToString());
 
        List<SqlMetaData> sqlMetaData = new List<SqlMetaData>();
 
        sqlMetaData.Add(new SqlMetaData("CourseCode", SqlDbType.NVarChar, 20));
        sqlMetaData.Add(new SqlMetaData("CourseName", SqlDbType.NVarChar, 250));
        sqlMetaData.Add(new SqlMetaData("Grade", SqlDbType.NVarChar, 5));
 
        SqlDataRecord sqlDataRecord = new SqlDataRecord(sqlMetaData.ToArray());
 
        sqlPipe.SendResultsStart(sqlDataRecord);
        foreach (Grades grade in student.grades)
        {
            sqlDataRecord.SetString(0, grade.CourseCode);
            sqlDataRecord.SetString(1, grade.CourseName);
            sqlDataRecord.SetString(2, grade.Grade);
            sqlPipe.SendResultsRow(sqlDataRecord);
        }
        sqlPipe.SendResultsEnd();
    }
}
 
public class Grades
{
    public string CourseCode { get; set; }
    public string CourseName { get; set; }
    public string Grade { get; set; }
}
 
public class Student
{
    public IList<Grades> grades { get; set; }
}

SQL CLR Security

Before we deploy the JSON SQL CLR procedure we need to implement the same security features with an asymmetric key to allow the SQL CLR procedure to execute in the database. Here are the steps for creating an asymmetric key, signing the CLR project, and creating a login that binds the asymmetric key and allows the CLR objects to execute. More details from the previous post can be found here.

  1. Create an asymmetric key using the strong name utility command line tool installed with Visual Studio 2019.
  2. Create an ASYMMETRIC KEY in SQL Server
  3. Create a SQL Login based on the ASYMMETRIC KEY
  4. Grant Permissions to Assemblies using the SQL Login
  5. Sign the SQL CLR Project with the asymmetric key
SQL CLR

Deploy the CLR Objects

After the solution successfully compiles and the security has been configured on both the Database Project and the SQL instance, we are ready to deploy the solution to the SQL Server. To deploy the SQL CLR objects we will right click on the “JsonSQLCLR” project and select “Publish…” We are going to deploy to the same database as we did in the FirstSQLCLR blog post. We are doing this for two reasons: first, to show that we can, and second, we are being lazy and not creating another database.

SQL CLR

If the security is configured properly, the deployment should actually fail, which confirms that the referenced assemblies and permissions have not been deployed and the database is not set to TRUSTWORTHY = ON. We should see an error like the one below. We will need to click on the “View Results” link to see the error message.

SQL CLR

As the error message states, we are seeing this error because the Newtonsoft.Json library is not found in the database, so we will need to add the Newtonsoft.Json library to the FirstSQLCLR database. We have a few options for adding the library into the database: we can add the assembly as a trusted assembly in SQL Server, we can create logins for the assemblies, or we can create certificates for the assemblies. Here are the SQL scripts for each option.

  1. Using Trusted assemblies
    USE FirstSQLCLR
    GO
     
    DECLARE @clrName NVARCHAR(4000) = N'Newtonsoft.Json'
    DECLARE @clrBin VARBINARY(MAX) = 0x4D5A90000300000004000000FFFF0 -- simplified for the blog post; this value is usually very large
    DECLARE @hash VARBINARY(64)
    SET @hash = hashbytes('SHA2_512', @clrBin);
     
    EXEC sys.sp_add_trusted_assembly @hash, @clrName;
     
    -- Now that we have trusted the assembly, we can add it into the database.
    CREATE ASSEMBLY [Newtonsoft.Json]
        AUTHORIZATION [dbo]
        FROM 0x4D5A90000300000004000000FFFF0; -- simplified for the blog post; this value is usually very large
  2. Using Asymmetric Keys
    USE master
    GO
     
    CREATE ASYMMETRIC KEY NewtonsoftJsonKey
    FROM EXECUTABLE FILE = 'C:\Development\SQLCLR\Newtonsoft.Json.dll';
     
    CREATE LOGIN NewtonsoftJsonLogin FROM ASYMMETRIC KEY NewtonsoftJsonKey;
     
    GRANT UNSAFE ASSEMBLY TO NewtonsoftJsonLogin;
     
    --Now that we have a login that has permissions to the assembly we can add it into the database.
    USE FirstSQLCLR
    GO
     
    CREATE ASSEMBLY [Newtonsoft.Json]
    FROM 'C:\Development\SQLCLR\Newtonsoft.Json.dll'
    WITH PERMISSION_SET = UNSAFE
  3. Using Certificates
    USE master
    GO
     
    CREATE CERTIFICATE NewtonsoftCert
    FROM EXECUTABLE FILE = 'C:\Development\SQLCLR\Newtonsoft.Json.dll';
    GO
     
    CREATE LOGIN NewtonsoftJsonLogin 
    FROM CERTIFICATE NewtonsoftCert;
    GO
     
    GRANT UNSAFE ASSEMBLY TO NewtonsoftJsonLogin;
    GO
     
    --Now that we have a login that has permissions to the assembly we can add it into the database.
    USE FirstSQLCLR
    GO
     
    CREATE ASSEMBLY [Newtonsoft.Json]
    FROM 'C:\Development\SQLCLR\Newtonsoft.Json.dll'
    WITH PERMISSION_SET = UNSAFE
In this blog post we are going to use option three, the certificates. We will start by creating the certificate in the master database using the following script. Since we have placed the Newtonsoft.Json.dll in the proper directory, these statements should execute successfully.

    USE master
    GO
     
    -- CERTIFICATE [Newtonsoft.Json]
    CREATE CERTIFICATE NewtonsoftCert
    FROM EXECUTABLE FILE = 'C:\Development\SQLCLR\Newtonsoft.Json.dll';
    GO
     
    CREATE LOGIN NewtonsoftJsonLogin 
    FROM CERTIFICATE NewtonsoftCert;
    GO
     
    GRANT UNSAFE ASSEMBLY TO NewtonsoftJsonLogin;
    GO

    Now we are going to try adding the missing Newtonsoft.Json assembly into the database using the following script.

    USE FirstSQLCLR
    GO
     
    CREATE ASSEMBLY [Newtonsoft.Json]
    FROM 'C:\Development\SQLCLR\Newtonsoft.Json.dll'
    WITH PERMISSION_SET = UNSAFE

    Notice that we now get an error in SQL Server Management Studio. As the error states, we are missing another assembly. The reason is that the Newtonsoft.Json assembly has dependencies on other assemblies. If the assembly that you are importing has multiple references, you may have to import all of the dependent assemblies as well. This can be a tedious, repetitive task; however, for this JSON SQL CLR project here are the certificate, login, and assembly create statements. You may need to adjust the paths to the DLLs depending on the versions of Visual Studio and .NET installed.

    USE Master
    GO
    -- CERTIFICATE MicrosoftNetCert
    CREATE CERTIFICATE MicrosoftNetCert
    FROM EXECUTABLE FILE = 'C:\Windows\Microsoft.NET\assembly\GAC_MSIL\System.Runtime.Serialization\v4.0_4.0.0.0__b77a5c561934e089\System.Runtime.Serialization.dll'
    GO
     
    CREATE LOGIN MicrosoftNetLogin 
    FROM CERTIFICATE MicrosoftNetCert
    GO
     
    GRANT UNSAFE ASSEMBLY TO MicrosoftNetLogin
    GO
     
    -- CERTIFICATE [Newtonsoft.Json]
    CREATE CERTIFICATE NewtonsoftCert
    FROM EXECUTABLE FILE = 'C:\Development\SQLCLR\Newtonsoft.Json.dll';
    GO
     
    CREATE LOGIN NewtonsoftLogin 
    FROM CERTIFICATE NewtonsoftCert;
    GO
     
    GRANT UNSAFE ASSEMBLY TO NewtonsoftLogin;
    GO
     
     
    USE FirstSQLCLR
    GO
    -- ASSEMBLY [System.ServiceModel.Internals]
    CREATE ASSEMBLY [System.ServiceModel.Internals]
    FROM 'C:\Windows\Microsoft.NET\assembly\GAC_MSIL\System.ServiceModel.Internals\v4.0_4.0.0.0__31bf3856ad364e35\System.ServiceModel.Internals.dll'
    WITH PERMISSION_SET = UNSAFE
    GO
     
    --ASSEMBLY [SMDiagnostics]
    CREATE ASSEMBLY [SMDiagnostics]
    FROM 'C:\Windows\Microsoft.NET\assembly\GAC_MSIL\SMDiagnostics\v4.0_4.0.0.0__b77a5c561934e089\SMDiagnostics.dll'
    WITH PERMISSION_SET = UNSAFE
    GO
     
    -- ASSEMBLY [System.Runtime.Serialization]
    CREATE ASSEMBLY [System.Runtime.Serialization]
    FROM 'C:\Windows\Microsoft.NET\assembly\GAC_MSIL\System.Runtime.Serialization\v4.0_4.0.0.0__b77a5c561934e089\System.Runtime.Serialization.dll'
    WITH PERMISSION_SET = UNSAFE
    GO
     
    -- ASSEMBLY [Newtonsoft.Json]
    CREATE ASSEMBLY [Newtonsoft.Json]
    FROM 'C:\Development\SQLCLR\Newtonsoft.Json.dll'
    WITH PERMISSION_SET = UNSAFE;
    GO

    After executing the scripts, provided everything was successful, you should see the assemblies in the FirstSQLCLR database under the Programmability -> Assemblies folder. Now that we have the assemblies created, we can try to deploy our JSON SQL CLR procedure to the FirstSQLCLR database again.

    SQL CLR

    Execute the JSON CLR Procedure

    Now that we have successfully deployed the JSON SQL CLR procedure, we can try to execute it. We will need to create a variable to store the JSON object, and pass that to the procedure. We can execute the SQL CLR procedure with the following SQL Code. If everything goes well, we should see a result set with course and grade information populated based on the JSON object that was passed in.

    USE [FirstSQLCLR]
    GO
     
    DECLARE @studentGrades nvarchar(max) = '{
      "Grades": [
        {
          "CourseCode": "ICS140",
          "CourseName": "Introduction to Programming",
          "Grade": "A"
        },
        {
          "CourseCode": "ICS225",
          "CourseName": "Web Programming",
          "Grade": "B+"
        },
        {
          "CourseCode": "ICS240",
          "CourseName": "Advanced Programming",
          "Grade": "B-"
        },
        {
          "CourseCode": "ICS310",
          "CourseName": "Database Management",
          "Grade": "B"
        }
      ]
    }'
     
    EXECUTE [dbo].[JsonToDataSet]  @studentGrades
    GO
    SQL CLR

    Summary

    This was step two, the “partial craziness,” of creating SQL CLR procedures in Visual Studio database projects. This post was focused more on importing libraries and assemblies into SQL Server than on working with JSON. However, it also showed how we can build a bigger and more dynamic result set that can be returned from the CLR to the client. We have looked at a “slightly” more complex version of a CLR. Previously we looked at simple CLR objects and the intricacies of security and deployment. Now we are ready to look at the real craziness that I get myself into.

Simple SQL CLR Objects

I have been working with SQL CLR functions and procedures recently, and have come up with a crazy solution involving JSON, Analysis Services, and Visual Studio Database Projects. However, before we get into the crazy solution, we are going to take a look at a simple SQL CLR implementation. This simple implementation will include procedures and functions that return formatted dates. Using a simple solution will help with understanding the intricacies of Visual Studio projects, CLR security, project deployment, and CLR execution. In this blog post we will take a look at creating SQL CLR functions in Visual Studio 2019. We will return data as a message as well as a dataset to the results window. We will implement proper SQL CLR security with asymmetric keys and signed assemblies.

The idea behind this blog post is to enhance the current CONVERT function that accepts formats for dates. There are plenty of predefined formats; however, this SQL CLR function will allow you to build any date format that C#.NET allows (here is a reference for building custom date formats). We will build three different SQL CLR objects: two procedures and one function.
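
As a quick, standalone illustration of what the C#.NET custom format strings can do (a minimal sketch, not part of the CLR project; the sample date and formats are arbitrary):

using System;

public static class DateFormatDemo
{
    public static void Main()
    {
        DateTime sample = new DateTime(2020, 5, 13, 21, 25, 2);

        // Custom format strings go well beyond the predefined CONVERT styles.
        Console.WriteLine(sample.ToString("yyyyMMdd"));            // 20200513
        Console.WriteLine(sample.ToString("yyyy-MM-dd HH:mm:ss")); // 2020-05-13 21:25:02
        Console.WriteLine(sample.ToString("dddd, dd MMMM yyyy"));  // Wednesday, 13 May 2020 (en-US)
    }
}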

CLR Object List

DateFormatMessage will accept a timestamp and a format, and will return the formatted date to the Message window.

DateFormatResult will accept a timestamp and a format, and will return the formatted date to the Result window.

DateFormatFunction will accept a timestamp and a format, and will return the formatted date as a result from a function call within a query call.

Project Setup

We will create our SQL CLR objects by creating a database project within Visual Studio 2019. After opening Visual Studio 2019 we will select the “Create a new Project” option.

Visual Studio 2019 Landing Page

In the “Create new project” window we will select the “SQL Server Database Project” type. We can either scroll through all of the project types or do a keyword search for the type.

Visual Studio 2019 New Project

In the “Configure your new project” window we are going to name our project “FirstSQLCLR”, specify the location where our source code will be stored on disk, and leave the solution name the same as the project.

Visual Studio 2019 Configure Project

Once Visual Studio finishes configuring the SQL Server Database Project type, we will have an empty database project. There are a couple of ways to add the SQL CLR object to the project; we are going to use the Solution Explorer route. In the Solution Explorer window we will right click on the “FirstSQLCLR” project and select the “Add” option, then the “New Item…” option.

Visual Studio 2019 Add New

In the Add New Item window we will select the “SQL CLR C#” option in the left options list. In the center window we will select the “SQL CLR C# Stored Procedure” option, and name it “DateFormatMessage.cs”.

ADD SQL CLR Procedure

In the DateFormatMessage.cs code window we will have an empty C# method body similar to the following code snippet.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
 
public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void DateFormatMessage ()
    {
        // Put your code here
    }
}

DateFormatMessage

In the DateFormatMessage method we will add code to send data back to the caller using a SqlPipe object, and code to format the passed-in date with the passed-in format. One thing to note here is that we are passing a string value back to the SqlPipe object, which will be returned to the Message window in Management Studio.

Here is the completed code.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
 
public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void DateFormatMessage(SqlDateTime sqlDateTime, SqlString CSharpDateFormat)
    {
        SqlPipe sqlPipe = SqlContext.Pipe;
        DateTime dateTime = (DateTime)sqlDateTime;
 
        // Apply the date format to the ToString() function
        string formattedDate = dateTime.ToString(CSharpDateFormat.ToString());
 
        // send the results back
        sqlPipe.Send(formattedDate);
    }
}

DateFormatResult

We will repeat the previous steps for the DateFormatResult.cs SQL CLR procedure; however, the code will be slightly modified from the message version. The additional code will allow us to return a record set instead of a single string. The following lines will define a record set object and populate the single value in the object.

// using a meta data object, setup the resultset configuration
SqlMetaData sqlMetaData = new SqlMetaData("formattedDate", SqlDbType.NVarChar, 25);
SqlDataRecord sqlDataRecord = new SqlDataRecord(sqlMetaData);
 
// store the data into the dataset
sqlDataRecord.SetString(0, formattedDate);

Here is the completed code.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
 
public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void DateFormatResult(SqlDateTime sqlDateTime, SqlString CSharpDateFormat)
    {
        SqlPipe sqlPipe = SqlContext.Pipe; 
        DateTime dateTime = (DateTime)sqlDateTime;
 
        // Apply the date format to the ToString() function
        string formattedDate = dateTime.ToString(CSharpDateFormat.ToString());
 
        // using a meta data object, setup the resultset configuration
        SqlMetaData sqlMetaData = new SqlMetaData("formattedDate", SqlDbType.NVarChar, 25);
        SqlDataRecord sqlDataRecord = new SqlDataRecord(sqlMetaData);
 
        // store the data into the dataset
        sqlDataRecord.SetString(0, formattedDate);
 
        // send the results back
        sqlPipe.Send(sqlDataRecord);
    }
}

DateFormatFunction

To add the DateFormatFunction we will repeat most of the same steps, however we will need to choose a “SQL CLR C# User Defined Function” instead of the “SQL CLR C# Stored Procedure” option in the Add New Item window.

SQL CLR Add UDFunction

The code for a user defined function is simplified, as we will not need a SqlPipe to return the results. We are also only returning a single string value, so we will not need the record set object either. Since we are returning a single value from the function, we only need to return the formatted string.

Here is the completed code.

using System;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
 
public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlString DateFormatFunction(SqlDateTime sqlDateTime, SqlString CSharpDateFormat)
    {
        DateTime dateTime = (DateTime)sqlDateTime;
 
        // Apply the date format to the ToString() function
        string formattedDate = dateTime.ToString(CSharpDateFormat.ToString());
 
        // return the formatted string 
        return new SqlString(formattedDate);
    }
}

We are ready to try to compile our code. Again, there are multiple ways to compile our code; we will use the Solution Explorer route. In the Solution Explorer window we will right click on the “FirstSQLCLR” project and select the “Rebuild” option.

SQL CLR Add UDFunction

Provided everything was typed in correctly and we have no build errors, we should see that the build succeeded in the output window.

SQL CLR Add UDFunction

SQL CLR Security

Before we deploy the SQL CLR objects we need to implement the security features needed to allow custom code to run within SQL Server. There are a couple of ways to do this; however, we are only going to focus on a secure option implementing asymmetric keys. We will not set the database’s TRUSTWORTHY option to ON, nor will we change the CLR Strict Security option. Here are the steps for creating an asymmetric key, signing the CLR project, and creating a login that binds the asymmetric key and allows the CLR objects to execute.

  1. Create an asymmetric key using the strong name utility command line tool installed with Visual Studio 2019.
  2. Create an ASYMMETRIC KEY in SQL Server
  3. Create a SQL Login based on the ASYMMETRIC KEY
  4. Grant Permissions to Assemblies using the SQL Login
  5. Sign the SQL CLR Project with the asymmetric key

Step 1: Create the Asymmetric Key

Sn.exe is a free utility shipped with Visual Studio that will create an asymmetric key pair. We will need to open the command window with Administrator privileges to generate the key. We will run the command with just the -k switch and a path to store the key, for example: sn.exe -k C:\Development\SQLCLR\FirstSQLCLR\FirstSQLCLR.snk.

Strong Name Utility

Step 2: Create an ASYMMETRIC KEY in SQL Server

In the master database we need to create the asymmetric key from the file that was generated in Step 1. If you have not already created a master key for your SQL instance you will need to do that in this step.

USE MASTER
GO
 
-- Create the Master  Key for the SQL Instance
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '[Some Really Strong Password]';
 
-- Create the asymmetric key from the strong named key used to sign the CLR project
CREATE ASYMMETRIC KEY FirstSQLCLR FROM FILE = 'C:\Development\SQLCLR\FirstSQLCLR\FirstSQLCLR.snk'

Step 3: Create a SQL Login based on the ASYMMETRIC KEY

In the master database we will need to create a SQL Login that will be generated from the asymmetric key from Step 2. This login will allow SQL Server to access the assembly and execute it based on the permissions granted in Step 4.

USE MASTER
GO
 
-- Create a login from the asymmetric key to access the CLR
CREATE LOGIN FirstSQLCLR FROM ASYMMETRIC KEY FirstSQLCLR;

Step 4: Grant Permissions to Assemblies using the SQL Login

In the master database we need to assign assembly permissions to the login from Step 3. We are choosing UNSAFE because it gives us the most flexibility; however, depending on your needs you may want to choose SAFE or EXTERNAL_ACCESS. See the CLR Integration Code Access Security documentation for more details.

USE MASTER
GO
 
-- Set the assembly permission level for the login
GRANT UNSAFE ASSEMBLY TO FirstSQLCLR;

Step 5: Sign the SQL CLR Project with the asymmetric key

Back in Visual Studio, we need to access the project properties. We will right click on the “FirstSQLCLR” project and select “Properties”. When the properties window opens, we will select the “SQL CLR” option. We will select “UNSAFE” as the Permission Level and then click on the “Signing” button. This will open the Signing dialog box. In this dialog box we will select the “Sign the assembly” checkbox, then using the drop down list we will browse for the asymmetric key created in Step 1.

Visual Studio Project Settings

Deploy the CLR Objects

After the solution successfully compiles and the security has been configured on both the Database Project and the SQL instance, we are ready to deploy the solution to the SQL Server. To deploy the SQL CLR objects we will right click on the “FirstSQLCLR” project and select “Publish…”

Visual Studio Deploy SQL CLR

The Target Database Setting window will open. Here we will need to select the SQL instance and Database we want to deploy the SQL CLR object to. To configure the SQL Instance, select the “Edit…” button to complete the instance settings, specify the database name, and click on the “Publish” Button. We could script this out, however, that is a discussion for another day.

Deploy Database Settings

If the deployment was successful you will see a window like below.

Deployment Status

Moving over to the SQL Instance, if we refresh the database, we should now see the SQL CLR procedures and CLR function.

Management Studio CLR View

Execute the CLR Objects

Now that we have successfully deployed the SQL CLR objects, we can try to execute them. Calling a SQL CLR procedure looks very similar to calling a standard procedure in SQL Server. We will create variables for the date and format parameters and pass them to the procedure.

USE [FirstSQLCLR]
GO
 
DECLARE @sqlDateTime DATETIME = '2020-05-13 21:25:02.833'
      , @CSharpDateFormat VARCHAR(50) = 'yyyyMMdd'
 
EXEC [dbo].[DateFormatMessage] @sqlDateTime, @CSharpDateFormat

After we execute the procedure, we may see a message that looks like this: Msg 6263, Level 16, State 1, Line 7 Execution of user code in the .NET Framework is disabled. Enable “clr enabled” configuration option.

Here is a sample call.

SQL CLR Error

This error occurs because the current instance is not configured to allow SQL CLR objects to execute. We can turn this on with the following query to reconfigure the instance with the clr enabled flag.

EXEC sp_configure 'clr enabled', 1;  
RECONFIGURE;  
GO

After we apply the configuration setting and try to execute the procedure again we should have a successful query call and see the following result. Here you can see the formatted date in the Message window.

SQL CLR Message

Let’s try running the other procedure, DateFormatResult. Notice that this time, instead of the procedure returning a string value to the Message window, the data is returned in the Results window. Here is the code to call the procedure and the results.

USE [FirstSQLCLR]
GO
 
DECLARE @sqlDateTime DATETIME = '2020-05-13 21:25:02.833'
      , @CSharpDateFormat VARCHAR(50) = 'yyyyMMdd'
 
EXEC [dbo].[DateFormatResult] @sqlDateTime, @CSharpDateFormat
SQL CLR Result

Finally let’s take a look at executing the SQL CLR Function DateFormatFunction. Again we call this the same as we would call a user defined function. This time we will call the function in a select statement and the result is returned in the Results window.

USE [FirstSQLCLR]
GO
 
DECLARE @sqlDateTime DATETIME = '2020-05-13 21:25:02.833'
      , @CSharpDateFormat VARCHAR(50) = 'yyyyMMdd'
 
SELECT [dbo].[DateFormatFunction] (@sqlDateTime, @CSharpDateFormat)
SQL CLR Function

Summary

This was a “brief” introduction into creating SQL CLR objects from Visual Studio database projects. We took a look at the permission configuration with signing assemblies and creating asymmetric keys. In a future post we will look at creating another SQL CLR procedure where we read JSON objects. This will help show how to incorporate additional binaries into SQL Server. After that, we will finally get to the craziness with SQL Server, SSAS, CLR, JSON, and Visual Studio Database Projects.

SSIS Importing and Validating Credit Card Data

POS Touch Pad

I spent some of my early career working with POS data. The irony of Point of Sale data is that it really was POS data. I spent a lot of time validating all sorts of data, and one of the big pain points was validating credit card data. When I decided to write a presentation that’s all about SSIS scripting, the first demo idea that came to me was validating credit card data. In this blog post we will import credit card information from a flat text file. We will feed the credit card data into a Script Component as a transformation. Using the Script Component we will validate the credit card number against the LUHN algorithm, and if the credit card number is valid we will also determine the type of credit card. We will create separate flows for American Express, MasterCard, Visa, and invalid credit cards. From the flows we will persist the results into their respective flat text files.

LUHN Algorithm

Before we get into the SSIS package definition, having a basic understanding of the LUHN algorithm and the factors for credit card types may be useful. The LUHN algorithm uses the last digit in the credit card number as a check digit. Then, by processing the remaining numbers with a specific set of rules, shown below, we can calculate what the check digit should be; a short worked example follows the steps.

  1. Save and remove the last digit from the CC number. This digit is what we will check against the calculated check digit.
  2. Starting with the last remaining digit and working backwards, double every other value (the odd positions); for any result higher than 9, sum its digits (18 == 1 + 8 == 9).
  3. Add all the numbers together and multiply the sum by 9.
  4. The check digit (the last number of the card) should match the result modulus 10.
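
To make the steps concrete, take the common test number 4111111111111111. The check digit is the trailing 1. Working through the remaining fifteen digits, doubling every other value and digit-summing anything over 9, gives a sum of 29. Multiplying by 9 gives 261, and 261 modulus 10 is 1, which matches the check digit, so the number passes the LUHN check.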

Here is the implementation of the LUHN algorithm in C#.net.

private bool IsLUHN(string CCNumber) {
  /********************************************************************************************************************************************
  The LUHN Algorithm:
     1.  Save and remove the last digit from the CC Number. This digit is what we will use to check against the calculated check digit
     2.  Reverse the array, this is easier than going backward in an array from the last position.
     3.  Multiply the values in the odd indexes by 2 and subtract 9 from any result higher than 9.  This is the same as summing the digits 18 - 9 == 1 + 8
     4.  Add all the numbers together and multiply by 9
     5.  The check digit (the last number of the card) is the amount that you would need to add to get a multiple of 10 (Modulo 10)
  ********************************************************************************************************************************************/
  char[] charArray;
  int[] intArray;
  int lastValue;
  int sum = 0;
  int checkDigit = 0;  
 
  // Convert Credit Card Number to an integer Array
  charArray = CCNumber.ToCharArray();
  intArray = Array.ConvertAll(charArray, value => (int)Char.GetNumericValue(value));
 
  // Store the last value of the CC, this will be the check bit.
  lastValue = intArray[intArray.Length - 1];
 
  // Step 1 drop the last digit from the array
  Array.Resize(ref intArray, intArray.Length - 1);
  // Step 2 Reverse the array
  Array.Reverse(intArray);
  // Step 3 Double the odd array entries
  for (int i = 0; i < intArray.Length; i++)
  {
     if (i % 2 == 0)
     { 
        intArray[i] *= 2;
        if (intArray[i] > 9) intArray[i] -= 9;
     }
     // Step 4 Sum the array values.
     sum += intArray[i];
  }
  // Steps 4 and 5 multiply the sum by 9 and get the last digit by using modulus 10.
  checkDigit = (sum * 9) % 10;
  return checkDigit == lastValue;
}

Credit Card Types

There are more credit card types, and rules, than what we are going to look at in this blog post. For simplicity we are going to use American Express, Visa, and MasterCard, and only look at the beginning numbers of the credit card to determine the card type. The table below lists the rules that we will use for this blog post.

Credit Card Issuer Starts With
American Express 34, 37
Visa 4
MasterCard 51, 52, 53, 54, 55
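
To illustrate the prefix rules, here is a small hypothetical helper method (the package itself does not use it; the same checks are written inline in the Script Component later in the post):

public static class CardTypeRules
{
    // Hypothetical helper that applies the prefix rules from the table above.
    public static string GetCardIssuer(string ccNumber)
    {
        if (ccNumber.StartsWith("34") || ccNumber.StartsWith("37"))
            return "American Express";

        if (ccNumber.StartsWith("4"))
            return "Visa";

        string[] masterCardPrefixes = { "51", "52", "53", "54", "55" };
        foreach (string prefix in masterCardPrefixes)
        {
            if (ccNumber.StartsWith(prefix))
                return "MasterCard";
        }

        return "Unknown";
    }
}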

Credit Card Validation SSIS package

The Credit Card Validation SSIS package will open a flat text file in the data flow as its source and process the file through a Script Component transformation; depending on the results of the LUHN validation and the credit card number rules, each row will be sent to one of four outputs. Each of these outputs will be written to flat text files for persistence.

Let’s start by creating a new SSIS package and naming it CCValidation. We will use five flat text file connections to either read or write data. The following flat text file connections will be created to support the file processing.

CCValid_ConnectionManagers

Each of the flat file connections is set up the same, except for the file name, which will be unique to each file.

Connection Name Path
CCValidationData \\localhost\Development\SSIS\Files\CCValidation\CCValidationData.txt
InvalidCCData \\localhost\Development\SSIS\Files\CCValidation\CCValidationInvalid.txt
ValidAmex \\localhost\Development\SSIS\Files\CCValidation\CCValidationAmex.txt
ValidMC \\localhost\Development\SSIS\Files\CCValidation\CCValidationMC.txt
ValidVisa \\localhost\Development\SSIS\Files\CCValidation\CCValidationVisa.txt

Flat File Connections

In the general section of the flat file connection manager editor we will use a file name from the table above. Our files have header information, so we will need to check the “Column names in the first data row” option.

CCValid_FileGeneral

In the columns section of the flat file connection manager editor we will select the Vertical Bar {|} value for the Column delimiter drop down list, because the flat text file is pipe delimited.

CCValid_FileColumns

In the Advanced section of the flat file connection manager editor we will need to adjust each of the columns to match the data size and type for the corresponding database column.

CCValid_FileAdvanced

Package Tasks

DFT – Process CC Data File

The Credit Card Validation SSIS package will use a single Data Flow task to validate the credit card data. This Data Flow task “DFT – Process CC Data File” implements the credit card validation process, using a Flat File Source, Script Component and flat text file destinations.

CCValid_SSISControlFlow

Data Flows

DFT – Process CC Data File

This data flow implements a total of six components: a Flat File Source for the credit card data, a Script Component for splitting the data flow based on the credit card logic, and four Flat File Destinations for persisting the resulting output from the Script Component.

CCValid_SSISDataFlow

FF_SRC – Open the CC Validate Data File

This data source will define the connection manager and the output columns from the flat text file into a data stream. This data stream will provide the input to the Script Component. On the connection manager section, select the CCValidationData flat file connection manager.

CCValid_FlatFileSource_ConnectionManager

On the Columns section, we will output all three columns: FirstName, LastName, and CCNumber. As a best practice, if you are not going to use a column in the data flow, it’s best to deselect it here. This takes up less space in your data stream, increases the performance of your package, and prevents warnings that would be written to log files.

CCValid_FlatFileSource_Columns

SCR – Test Each CC number Determine the type of card and if its valid

This Script Component will be used to validate that the credit card data is valid by implementing the LUHN algorithm. We will also create data flows for each different credit card type that we find: American Express, MasterCard, and Visa. When we first drop the Script Component onto the Data Flow window, the “Select Script Component Type” window will appear; select Transformation.

CCValid_ScriptComponent_Type

Typically declaring the Read and Write variables would be done on the Script section of the Script Transformation Editor. However, in this Script Component we will not be using any variables.

CCValid_ScriptComponent_Script

Before we click on the Edit Script… button, we need to define our inputs and outputs. On the Input Columns section, we are going to select all three columns (FirstName, LastName, CCNumber) to be included in the data flow. Notice the Input Name (Input 0); this will become an object type that we can use in the script.

CCValid_ScriptComponent_Inputs

In the Inputs and Outputs section, we will define the input and the four outputs: MC, Invalid, Visa, and Amex. We could change the name of the input in the Common Properties for Input 0; however, for this package we will leave the default name of Input 0. We will select the Add Output button for each of the four outputs and name them. These outputs will become buffer objects in the data flow.

CCValid_ScriptComponent_Outputs

On each of the outputs we will add three columns: FirstName, LastName, and CCNumber. Each of the columns that we add will become a property on the corresponding buffer object in the data flow. Using the Common Properties we will define the types for each of the columns. For this Script Component all columns will be 50-character strings.

CCValid_ScriptComponent_Outputs2

We will now define the code that will implement our validation logic as well as separate the different types of credit cards. Here is the Input0_ProcessInputRow method. This method will be called one time for each row that is found in the flat text file. It will take the contents of the row and put them in the Row variable. Using the Row variable we can access the FirstName, LastName, and CCNumber properties. Each of our outputs can be accessed through buffer objects that were created when we defined our outputs: Amex == AmexBuffer, Visa == VisaBuffer, MC == MCBuffer, and Invalid == InvalidBuffer. Using these buffer objects we can add rows and give values to FirstName, LastName, and CCNumber.

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
  if (IsLUHN(Row.CCNumber))
  {
    /*
       Credit Card Issuer	Starts With
       American Express	34, 37
       Write to the Amex Buffer and write to the Valid Amex File.
    */
    if ((Row.CCNumber.StartsWith("34")) || (Row.CCNumber.StartsWith("37")))
    {
      AmexBuffer.AddRow();
      AmexBuffer.CCNumber = Row.CCNumber;
      AmexBuffer.FirstName = Row.FirstName;
      AmexBuffer.LastName = Row.LastName;
    }
    /*
      Credit Card Issuer	Starts With
      Visa	4
      Write to the Visa Buffer and write to the Valid Visa File.
    */
    else if (Row.CCNumber.StartsWith("4"))
    {
      VisaBuffer.AddRow();
      VisaBuffer.CCNumber = Row.CCNumber;
      VisaBuffer.FirstName = Row.FirstName;
      VisaBuffer.LastName = Row.LastName;
    }
    /*
      Credit Card Issuer	Starts With
      MasterCard	51, 52, 53, 54, 55 
      Write to the MC Buffer and write to the Valid MC File.
    */
    else if ((Row.CCNumber.StartsWith("51")) || (Row.CCNumber.StartsWith("52")) || (Row.CCNumber.StartsWith("53")) ||
     (Row.CCNumber.StartsWith("54")) || (Row.CCNumber.StartsWith("55")))
    {
      MCBuffer.AddRow();
      MCBuffer.CCNumber = Row.CCNumber;
      MCBuffer.FirstName = Row.FirstName;
      MCBuffer.LastName = Row.LastName;
    }
    // CC Number is not Amex, MC, or Visa so move to Invalid buffer and write to Invalid File.
    else
    {
      InvalidBuffer.AddRow();
      InvalidBuffer.CCNumber = Row.CCNumber;
      InvalidBuffer.FirstName = Row.FirstName;
      InvalidBuffer.LastName = Row.LastName;
    }
  } 
  // CC Number did not pass the LUHN Algorithm add to the Invalid Flow and write to the Invalid File.
  else
  {
    InvalidBuffer.AddRow();
    InvalidBuffer.CCNumber = Row.CCNumber;
    InvalidBuffer.FirstName = Row.FirstName;
    InvalidBuffer.LastName = Row.LastName;
  }
}

Looking at the American Express section of code, the IF statement checks whether the incoming credit card number, accessed through the Row variable, starts with either 34 or 37. If the credit card starts with either 34 or 37, we will call the AddRow() method on our buffer object. This will create a new row on our data flow that we can add values to. All we have to do now is set the FirstName, LastName, and CCNumber properties in our buffer object, in this case AmexBuffer, to the respective values in the Row variable. The other three outputs all follow this same logic.

/* 
   Credit Card Issuer	Starts With
   American Express	34, 37
   Write to the Amex Buffer and write to the Valid Amex File.
*/
if ((Row.CCNumber.StartsWith("34")) || (Row.CCNumber.StartsWith("37")))
{
  AmexBuffer.AddRow();
  AmexBuffer.CCNumber = Row.CCNumber;
  AmexBuffer.FirstName = Row.FirstName;
  AmexBuffer.LastName = Row.LastName;
}

FF_DST – Output valid Visa data to file
FF_DST – Output valid Master Card data to file
FF_DST – Output valid American Express data to file
FF_DST – Output invalid data to file

Each of these Flat File Destinations will define the destination of our data in the flat text files and which data stream columns will be mapped to which flat text file columns. When we map each output from the Script Component, it should correlate to the corresponding file type. Since all of the Flat File Destinations are configured the same way, we will only look at a single one. In the Connection Manager section, we will select the appropriate Flat File Connection Manager.

CCValid_FlatFileDesination_ConnectionManager

On the Mappings section, if the Flat File Destination can match the names, lines will be drawn connecting the two columns. If any columns are not automatically mapped, we will either need to drag and drop the column onto the corresponding column that should be mapped, or select the corresponding column value from the Input Column drop down list.

CCValid_FlatFileDesination_Mapping

Execution

After the package is executed, the credit card data will be loaded into one of four flat text files. Depending on whether the credit card data is valid, invalid, or a specific card type, the details will be written to the respective file. There you have it: a credit card validation SSIS package.

Simple SSIS: Date Formats

One feature that I would like to see added to SSIS is the ability to format a date string with a specified format. I have been tasked many times with outputting file names or folders with a specific date format, such as YYYYMMDD, and found myself writing custom functionality to accomplish this task. This is not difficult in T-SQL, but in SSIS it’s not all that easy. In this blog post I am going to show a few ways that we can format a date string into a specific format. I will show how to use an Expression task, SQL task, and a Script task.

DateFormat_SSISPackage

Expression Task

EXP – Format Date as YYYYMMDD
The Expression Task is a new feature in SSIS 2012 that allows us to store the result of an expression in a variable. Basically, we create an expression using the Expression Builder window and store the result in an SSIS variable. In the window below you can see an example of the expression being set to the SSISDateFormat variable.

@[User::SSISDateFormat] =  
               (DT_WSTR,4)YEAR(GETDATE())  
+ RIGHT ( "0" + (DT_WSTR,2)MONTH(GETDATE()),2) 
+ RIGHT ( "0" + (DT_WSTR,2)DAY(GETDATE()),2)

DateFormat_ExpressionBuilder

SQL Task

SQL – Format Date as YYYYMMDD
The SQL Task is a commonly used task in SSIS. This task allows us to execute queries against a data source and return a full result set, a single row, or nothing at all. We also have the opportunity to pass variables into the query. However, for this example we will not pass anything in, and only return a single row. Using T-SQL’s CONVERT function it’s very easy to format a date string. We will use this CONVERT function, return a single row, and store the result in an SSIS variable. The SQL syntax that we will use is:

SELECT SQLDateFormat = CONVERT(NVARCHAR(8),GETDATE(),112);

Here is a link to additional formats for the CONVERT function. https://technet.microsoft.com/en-us/library/ms187928.aspx

DateFormat_SQLTask

After the SQL Task executes the result will be stored in the SQLDateFormat SSIS variable, which is configured in the window below.

DateFormat_SQLTaskVariables

Script Task

SCR – Format Date YYYYMMDD
The Script Task is one of my favorite tasks, mainly because of all the additional functionality that I have access to, but it also gives me an opportunity to play with C#.NET code. By passing the SSIS variable ScriptDateFormat into the ReadWriteVariables box, we can access that variable through code. In programming languages like C#.NET it’s very easy to format a date string. There is a ToString() method that provides many date formats by using specific characters. When we click on the Edit Script button, the Visual Studio Tools for Applications (VSTA) design-time environment appears. We can access the ScriptDateFormat variable through the Dts.Variables array and save string values to it. The result of the C#.NET DateTime.Now.ToString(“yyyyMMdd”) method call will give us a formatted date, which we can save to our Dts.Variables array. To get the current date in T-SQL we use GETDATE(), and in C#.NET we use DateTime.Now. So the entire line of code would look like this.

Dts.Variables["ScriptDateFormat"].Value = DateTime.Now.ToString("yyyyMMdd");
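
For context, here is a minimal sketch of how that line sits inside the Script Task's Main method, assuming ScriptDateFormat has been added to the ReadWriteVariables list:

public void Main()
{
    // Format the current date and store it in the SSIS variable.
    Dts.Variables["ScriptDateFormat"].Value = DateTime.Now.ToString("yyyyMMdd");

    Dts.TaskResult = (int)ScriptResults.Success;
}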

Here is a link to MSDN page that has the format specifiers.
https://msdn.microsoft.com/en-us/library/8kb3ddd4.aspx

DateFormat_ScriptTask

Summary

So which is the best way? I personally prefer the Script Task, mainly because it’s clean and simple. The Expression Task requires a complicated expression, and the SQL Task involves an unnecessary round trip to the server to get a formatted date string. With that said, all three options work and I am sure there are other ways as well.

SSIS Find Non-Imported Files

Traversing a folder looking for files to import into a database table is a common practice in Integration Services. What if you have little control over the import folder or files because of security restrictions? What if there are many files stored in the folder and the number of files grows daily? One way to deal with this is to use a Foreach Loop Container and, for each file, make a call to a database table or log to determine if the file has previously been imported. This may be a sufficient solution if the number of files is small; however, if the number of files is larger, this could have a big impact on database performance and/or package performance. A better approach may be to use a Script Task. In this blog post I will show how we can use a Script Task with some simple C# code to capture a list of files, remove processed files, and use a Foreach Loop Container to traverse the new files.

File Import Table

Before we get into creating the Integration Services package, we will need a FileImport table to keep a list of all files that have already been imported.

CREATE TABLE dbo.FileImport(
     FileImportID INT IDENTITY(1, 1) NOT NULL
   , FileName NVARCHAR(1000) NOT NULL
   , ImportDate DATETIME NULL
   , RecordCount INT NULL
   , ErrorMessage NVARCHAR(MAX) NULL   
) ON  [PRIMARY];

New File Import SSIS package

The New File Import SSIS package will query the FileImport table to capture a list of previously imported files, as well as capture a list of the available files in the import folder. Using some code we will eliminate the files that have already been imported and then loop over the set of files that have not been imported.

Let’s start by creating a new SSIS package and naming it NewFileImport. We will use an OLE DB connection to query the FileImport table and a few package variables to store processing data. The following package variables will be created for the associated purposes.

Trav_SSIS Variables

Variable Definitions

  • The FileName variable will be used to store each of the new files as we traverse them using the Foreach Loop Container.
  • The ImportedFiles variable will be used to store the list of previously imported files returned from the FileImport table.
  • The ImportedPath variable is the path where the files to be imported are stored.
  • The NewFiles variable is the list of files that have not been imported into the system yet.

Package Tasks

The package will use a SQL Task, a Script Task, and a Foreach Loop Container.

Trav_SSIS Tasks

SQL – Get a list of all processed files

This task will capture a list of all the files that have previously been imported and will store them in the ImportedFiles variable. The ResultSet should be set to “Full result set” and a simple query to the FileImport table, “SELECT FileName FROM dbo.FileImport;”, should be saved in the SQLStatement.

Trav_SQLTaskGeneral

The Result Set should be saved to the User::ImportedFiles variable.

Trav_SQLTaskResultSet

SCR – Get a list of files to Import

This task will create a DataTable and an ArrayList object that will store all of the file names in the FileImport table and all of the file names in the import path, respectively. A loop will be used to traverse the file names in the FileImport object. On each pass through the loop, the corresponding file name will be removed from the import path object. When the loop is complete, only the new file names will remain in the import path object. This object will be stored in the NewFiles variable and used by the Foreach Loop Container. The following C# script implements this logic.

// Note: this script assumes using statements for System.Collections, System.Data,
// System.Data.OleDb, and System.IO at the top of the ScriptMain class.
public void Main()
{
   OleDbDataAdapter da = new OleDbDataAdapter();
   DataTable dt = new DataTable();
   object ImportedFiles = Dts.Variables["ImportedFiles"].Value;
   string ImportedPath = Dts.Variables["ImportedPath"].Value.ToString();
 
   // Load the processed files from the record set into a DataTable variable.              
   da.Fill(dt, ImportedFiles);
 
   // Load the files to be imported from the directory to be imported into an 
   // ArrayList variable.                        
   ArrayList ImportDataFiles = new ArrayList(Directory.GetFiles(ImportedPath));
 
   // Traverse the DataTable and remove any matching files from the ArrayList.
   foreach (DataRow dr in dt.Rows) {
      ImportDataFiles.Remove(dr[0].ToString());
   }
 
   // Return the ArrayList which will only have new files.
   Dts.Variables["NewFiles"].Value = ImportDataFiles;
 
   Dts.TaskResult = (int)ScriptResults.Success;
}

The ImportedFiles, ImportedPath, and NewFiles variables are used in the script, so we will need to declare their use in the Script Task.

Trav_ScriptTaskScript

FELC – Loop through all files

This Foreach Loop Container will loop through all of the files that are stored in the NewFiles variable. The enumerator type that will be used is the “Foreach From Variable Enumerator”. The selected Variable should be User::NewFiles.

Trav_FELCCollection

The Variable Mapping should store the result of each loop into the User::FileName variable.

Trav_FELCVariableMapping

On each iteration of the Foreach Loop Container, the full path to the file from the NewFiles variable will be stored in the FileName variable. Having the path to the file will help in the file import process inside the Data Flow Task. The details of importing file data can be found in this Simple SSIS: Importing Data from Flat Text Files blog post.