Big picture thoughts on software and other topics

March 6, 2009

Why I'll Never Use SSIS (Unless My Family is Threatened)

by Brian Donahue

My friend Kyle BaleyBroken Link: http://codebetter.com/blogs/kyle.baley/ wrote a postBroken Link: http://codebetter.com/blogs/kyle.baley/archive/2009/03/05/ninject-contextual-binding-or-how-to-defy-mick-jagger-and-always-get-what-you-want.aspx about using Ninject for Dependency Injection in a console app that is basically performing an export ETL process to get data from a database into an XML format and ship it off to a third party.  The app is for my client, on a project that Kyle has been helping me out with for a while.  In that post, he joked about using SSIS to do so, and one of his commenters askedBroken Link: http://codebetter.com/blogs/kyle.baley/archive/2009/03/05/ninject-contextual-binding-or-how-to-defy-mick-jagger-and-always-get-what-you-want.aspx#188998 the earnest question,  "why not SSIS?"

Kyle’s short answer to this question is probably "because Brian asked me not to."  I showed him an example of an import console app I had done, and suggested we take a similar approach.  He asked me if I could elaborate in reply to the comment.  Never wanting to miss an opportunity to vent my SSIS frustration, I jumped at chance!  :)  If you want to commiserate with other SSIS users, I recommend reading my original SSIS Tries So Hard To Make You Hate It post, and the comments.  It is one of my most popular - and most commented on - posts.

The most compelling reason to avoid SSIS in this situation, as I have already covered in my brilliant essay "Reason #94,532 to Hate SSIS," is that SSIS has no native support for exporting data as XML.  If you want to read that sentence again, feel free.  In the year 2009, SQL Server Integration Services can not export your data (easily) to XML even if you use a "FOR XML" query which spits out xml right from SQL Server.  FOR XML queries are horrendous in practice, though, for anything beyond very simple XML schemas.

I’ve covered a lot of reasons why SSIS is painful to work with, and I quoted Ayende's very comprehensive list of pain points.  I strongly suggest reading that list if you are considering using SSIS for any business critical operations.  Rather than repeating that list, I’ll summarize some of the advantages I got from using a console application.

  1. I can write tests.  I actually did the last console app in a mostly TDD fashion, though I confess when it came to integration tests and testing with real sample data, I didn’t spend the extra effort.  But I ended up with a fairly extensible framework for performing a data import - fetching the data, parsing/mapping, and saving to my datastore.
  2. Logging - I could control when/where I logged, and what the messages told me.  They actually were human readable, and didn’t require a super-human DBA who speaks in SQL Server Error Codes to decipher.
  3. Error Handling
    1. Leveraging the logging (log4net) I can also send error emails out on failure that actually mean something to not only myself, but my clients.
    2. I can catch some errors and retry several times, such as when an FTP connection fails - which happens fairly regularly.
  4. Inversion of Control/Dependency Injection - with my framework above, I can just plug in different "DataRetrievers" (FTP, HTTP, File System, etc), different parsers, mappers, and Data Access components, and the logging, and general workflow does not have to be recreated.

There are other ways I could have done this, including checking out Ayende’s Rhino ETL package, which I am interested in, but wasn’t sure I was ready to add another tool into the mix.  In the end, I decided to stick with TDD and C# and see what I came up with.  I was very happy with the results, and even happier when I got my first production error email, and it was clear (the vendor’s FTP had been down), and easy for my client to understand without me having to go and parse the log table in SQL Server.  It simply read:

Subject: My Import Error

2009-02-17 12:30 [1] ERROR MyImport.Core.FtpClientCSCommands Files could not be downloaded: The remote name could not be resolved: 'ftp.datasource.com'

In conclusion, it’s not to say you can’t do any/all of the above with SSIS.  But in my experience, if you want to try, you really mustn’t value your time and mental health very much.