Iterate Over Databases Using SSIS – Part 2

In Part 1 of 2 I showed how to iterate over the databases in any given SQL Server Instance using SSIS. In Part 2 I’ll show how to extend the SSIS Package created in Part 1 to iterate over all databases across any number of  Instances in your environment.


Overview

The verb iterate means “to perform repeatedly.” Within the context of this article it will specifically mean “running the same code for each database in any number of SQL Server Instances.” We’ll be extending the sample SSIS 2012 Package developed in Part 1 so this walk-through section assumes you already have the package that resulted from the Part 1 walk-through. Let’s get started.

Design

In Part 1 we retrieved the list of databases in an Instance from the sys.databases System View and iterated over the results using a Foreach Loop Container (FeLC) sending each database name and id to the output window using a Script Task. We will extend this technique so we can iterate over any number of Database Instances and step 1 will be to create a place to store the list of Instances we want to iterate over. From there, constructing a Package with a loop that contains a nested loop will serve us well. In addition to extending the loop mechanism we’ll also explore how to map values made available by a FeLC to the properties of Connection Manager, another common need when developing SSIS Packages. Here is what we’re aiming for in terms of a workflow:

  1. Use an Execute SQL Task to issue a SELECT statement that retrieves a list of database Instances in which we want to do something for each database and store the resultset in a SSIS Variable of type Object.
  2. Use a Foreach Loop Container to iterate over the DataSet stored in a SSIS Variable and do something for each row in the resultset, i.e. do something for each Instance.
  3. Inside the Instance-loop, or outer-loop, use an Execute SQL Task to issue a SELECT statement that retrieves information about the databases on that Instance and store the resultset in a SSIS Variable of type Object.
  4. Use another Foreach Loop Container to iterate over the DataSet stored in a SSIS Variable and do something for each row in the resultset, i.e. do something for each database.
  5. The options in terms of what you can do inside the inner Foreach Loop Container are up to your imagination but for purposes of this demo we’ll simply be writing the instance and database names to the output window to show one way you might access each row in the resultset separately.

At this stage we have broken down the problem and defined a workflow so let’s begin diving into the development of the sample SSIS Package. If you want to follow along developing your own version of the sample Package open your preferred copy of Visual Studio or BIDS and create a new SSIS Package, else you can just follow along referring to the screenshots provided in each step. I’ll be using Visual Studio 2010 and SSIS 2012 but this sample is generic such that you can produce the sample Package using any version of Visual Studio or BIDS that supports SSIS development.

Development Walk-through

1. The first step is determining where we will get our list of Instances. Unless you have a Central Management Server or a utility database in your environment then you may not have a structured location where a list of the database Instances in your environment can be stored. My utility databases are named DBA and they contain tables with environment-metadata and administration information as well as code objects that help me maintain my environment. In my utility database there is a table named dbo.EnvironmentInstance. This table typically has more than one column and more than one entry but for this demo I am keeping it simple. If you do not have a utility database go ahead and create one now on the same instance you pointed the Part 1 demo-Package, and then create this table in it:

USE DBA;
CREATE TABLE dbo.EnvironmentInstance ( Name SYSNAME CONSTRAINT [PK_dbo.EnvironmentInstance] PRIMARY KEY );

You’ll need at least one entry to support the demo, but add as many as you like. My table contains only one entry named for my local instance of SQL Server 2012, ORLANDO\SQL2012 :

USE DBA;
INSERT INTO dbo.EnvironmentInstance (Name) VALUES ('ORLANDO\SQL2012');


2. From the Package we developed in Part 1 here is what you should have so far:

1.PackageAfterPart1


3. We’ll incorporate the list of Instances we want to iterate over in much the same way we incorporated the list of Databases in Part 1. Again we will need a Variable to temporarily store our list of Instances, i.e. a SQL Server resultset in the form of a .NET DataSet. Create a new Variable at the Package scope* named Instances and configure it as an Object data type.

5.InstancesVariable

* SSIS Variables, as with variables in most other programming languages, have scope. Visual Studio will create new Variables in the scope that has the focus when the variable is created. We want our Variables created in the Package scope. To give the Package focus click any open space on the Control Flow Design Surface, i.e. we do want the Execute SQL Task to have focus when the Variable is created.



4. The next thing we’ll do is add what will become the outer-loop, the Foreach Loop Container that will iterate over our list of Instances. Drag a new Foreach Loop Container from the SSIS Toolbox and drop it onto the Control Flow. 3.AddOuterFeLC


5. Holding the Ctrl key, click the Execute SQL Task and then the original Foreach Loop Container to select both of them. You’ll know they are both selected because they’ll have a thick black border around them. Then, release the Ctrl key and drag them (you can click anywhere on either of the selected objects to grab them) and drop them onto the new Foreach Loop Container.

4.DragAndDrop

When you drop them onto the new FeLC you’ll notice they will then exist inside the container. This means they will execute for every iteration of the outer-loop so now we have a loop within a loop, also known as a nested loop.

4.InsideLoop


6. At this stage we have our looping structure configured but we still do not have our Package setup to retrieve the list of Instances from our utility database. Drag a new Execute SQL Task from the SSIS Toolbox and drop it onto the Control Flow. You’ll notice Visual Studio has kindly marked the Task with a red-X denoting a validation error exists in the Task. You can safely ignore the validation error for now. The validation error will be resolved once we configure the Task to connect to our utility database in the next step.

03.ExecuteSQLTask


7. Double-click the Execute SQL Task to open the Editor and configure the General Page as follows:

8.ExecuteSQLProperties

SQL Statement:  SELECT * FROM DBA.dbo.EnvironmentInstance;


8. Configure the Result Set page as follows:

9.ExecuteSQLResultSet

Click OK and notice how Visual Studio lifted the Red-X from our Task letting us know it appears to be configured in a valid way:

6.ExecuteSQLTask


9. Connect the new Execute SQL to the FeLC. Here is the Package we have developed so far:

9.PackageSoFar

This is actually how the finished Control Flow will look visually however we have a bit more programming to do before it will do what we want it to do.


10. We will need a Variable to store the Instance Name as we iterate over our DataSet. Create a new Variable at the Package scope * named InstanceName and configure the data type as String and sets its value to the name of database instance, e.g. ORLANDO\SQL2012. Your Variables window should now look like this:

10.Variables

* SSIS Variables, as with variables in most other programming languages, have scope. Visual Studio will create new Variables in the scope that has the focus when the variable is created. We want our Variables created in the Package scope. To give the Package focus click any open space on the Control Flow Design Surface, i.e. we do want the Execute SQL Task to have focus when the Variable is created.


11. Double-click the outer Foreach Loop Container, i.e. Foreach Loop Container 1, to open the Editor and configure the General, Collection and Variable Mappings Pages as follows:

11.FeLC.General

11.FeLC.Collection

Notice we chose the Foreach ADO Enumerator. The name of this iterator fits with what we’re doing because 1) we’ll be iterating over an ADO.NET DataTable contained within a DataSet, and 2) our Instances Variable is of type Object. The Execute SQL Task stored the resultset it retrieved from the database into the Variable of type Object. When the FeLC is using the Foreach ADO Enumerator it treats the variable of type Object it is set to iterate over as an ADO.NET DataSet, which suits our needs quite well.

11.FeLC.VariableMappings


Status Check:

Our outer-loop is configured to iterate over the list of Instances we retrieved from our utility database, DBA, and the loop from our previous effort that iterates over the databases in an Instance, now our inner-loop,  is tucked inside. Now comes the fun part, namely configuring the Tasks contained within our outer-loop to take their cues from the information provided by that loop so we can iterate over the databases in any number of Instances per what we have stored in our Instance-table.

In the next steps we’ll be leveraging SSIS Expressions to make some of the Tasks within our Package work dynamically.


12. The Execute SQL Task inside the inner-loop, i.e. Execute SQL Task – Get List of Databases, must be changed to execute against the Instance made available by the outer-loop as it iterates over the list of Instances retrieved from our EnvironmentInstances table. To achieve this we will need a new Connection Manager configured with a dynamic Connection String which can be used to execute code against the Instances in our environment.

Create a new OLE DB Connection Manager and point it to the same database where the EnvironmentInstances table resides, but name the Connection Manager InstanceConnection. We are pointing it to our utility database initially but will configure it shortly such that it will change where it is pointing at runtime per the InstanceName variable as the outer-loop iterates over the Instances variable.

13. In the properties of InstanceConnection (highlight the Connection Manager and press F4) click into the Expressions field and click the ellipsis. Then map the InstanceName variable into the ServerName property. Also, we’ll change the DelayValidation property to True because the connection information won’t be available until runtime and we want to avoid any validation errors.

12.ConnectionProperties

14. Now that we have a dynamic Connection Manager ready to connect to the Instance provided by our outer-loop we need to configure the Execute SQL Task that selects the names of the databases on the Instance. Remember in Part 1 we were only looking at one Instance, and now we are looking at any number of Instances per what is stored in the EnvironmentInstance table.

Double-click the Execute SQL Task in the inner-loop, Execute SQL Task – Get List of Databases, to open the Task Editor. We only need to change the Connection property to point to the new InstanceName Connection Manager.

14.ExecuteSQLTask

15. The last piece for us to complete our demo Package is to modify the Script Task that writes information to the Output Window to show the Instance Name, in addition to the Database ID and Database Name.

Double-click the Script Task to open the Properties Editor and add User::InstanceName to the list of read-only variables passed to the script.

15.ScriptTask2

15.ScriptTask

Then click the Edit Script button to modify the VB code* and change the Main method to this:

Public Sub Main()
     Dim tempBool As Boolean = False
     Dts.Events.FireInformation(1, "Instance.Database (Id)",
        Dts.Variables("User::InstanceName").Value.ToString() + "." +
        Dts.Variables("User::DatabaseName").Value.ToString() + " (" +
        Dts.Variables("User::DatabaseId").Value.ToString() + ")", "", 0, tempBool)
     Dts.TaskResult = ScriptResults.Success
End Sub

Save and close the script, then Click OK on the Script Task Editor to save the changes.

This is where your imagination can takeover. What is it that you want to do for each database in each instance? Do you need to add columns to EnvironmentInstance and incorporate them in your process, or do you need to retrieve additional columns from sys.databases or add a WHERE-clause to the query?

* For the record I overwhelmingly prefer C# over Visual Basic. The fact is that SSIS 2005 Script Tasks only supported Visual Basic so the choice to use Visual Basic here is purely to allow people still using SSIS 2005 to continue to follow along without having to port C# code.


Results

Development is complete and we are ready to execute the Package:

Results

Success! You should see the Instance name, the database name and the database id of each database in each of your Instances shown in the Output Window, like this:

SSIS package “C:\SkyDrive\SSIS.IterateOverDatabases-Part2\SSIS.IterateOverDatabases-Part2\Package.dtsx” starting.
Information: 0x1 at Script Task, Instance.Database (Id): ORLANDO\SQL2012.master (1)
Information: 0x1 at Script Task, Instance.Database (Id): ORLANDO\SQL2012.tempdb (2)
Information: 0x1 at Script Task, Instance.Database (Id): ORLANDO\SQL2012.model (3)
Information: 0x1 at Script Task, Instance.Database (Id): ORLANDO\SQL2012.msdb (4)
Information: 0x1 at Script Task, Instance.Database (Id): ORLANDO\SQL2012.DBA (5)
SSIS package “C:\SkyDrive\SSIS.IterateOverDatabases-Part2\SSIS.IterateOverDatabases-Part2\Package.dtsx” finished: Success.

Tagged with: ,
Posted in SQL Server, SSIS
2 comments on “Iterate Over Databases Using SSIS – Part 2
  1. Shodhan says:

    Hi,

    I have done similar development using two for loops to fetch the list of companies and then extract data from webservice. The inner for loop is used to extract the data and retry the call to webservice if there is any failure.
    In my case if the inner for loop fails then the outer for loop still executes successfully. It doesn’t give any error. Can you guide me how to fail the outer loop on failure of inner for loop.

Leave a comment