Friday, March 30, 2012

How to remove unwanted characters during ETL?

Can unwanted characters (e.g. control codes) be replaced or removed in varchar fields during extraction inside DTS package?

SQL Server/SSIS 2005.

Thanks, Andrei.

Inside of an SSIS package, yes. DTS is old news now! You can use the REPLACE() function along with a CHAR() function in a derived column transformation.

To remove tabs: REPLACE([column],char(9),"")|||

Thanks, Phil!

This is the way to go.

Unfortunatelly there are couple problems.
1) Function CHAR isn't supported in SSIS (DTSx) package.

2) I'd like to have a filter against any invalid character, e.g. ASCII range 0-31. I'm not sure if SSIS's REPLACE function can do that.

I wonder if a custom function, e.g. from a CLR assembly, could be used in SSIS.

Regards, Andrei.

|||

Andrei Kuzmenkov wrote:

Thanks, Phil!

This is the way to go.

Unfortunatelly there are couple problems.
1) Function CHAR isn't supported in SSIS (DTSx) package.

2) I'd like to have a filter against any invalid character, e.g. ASCII range 0-31. I'm not sure if SSIS's REPLACE function can do that.

I wonder if a custom function, e.g. from a CLR assembly, could be used in SSIS.

Regards, Andrei.

CHAR() is too supported, it's just not in the drop down list. You can use it inside the replace function, just as I noted. You just have to type it in there. You could always write 31|||

Andrei Kuzmenkov wrote:

I wonder if a custom function, e.g. from a CLR assembly, could be used in SSIS.

Yes, you can create Script Transform and easily call custom function from there. Or if it is one-time code, you may simply place it in the Script Transform code as well.

Note that the custom assembly needs to be placed to GAC and because of to limitation of VSA we use for scripting it also should be in Windows\Microsoft.NET\Framework\v2.0.50727 (this is only needed for design time).

|||

In VS2005 I'm getting "The Function CHAR was not recognized".

No comments:

Post a Comment