Wednesday, March 28, 2012

How to remove duplicate records from incoming text files

Is there a way to check whether duplicates exist in the incoming text files?

Run the text file through a Sort transformation and select the "remove duplicates" check box.

Using the Sort transformation seems to be too slow. Is there any other way?

sureshv wrote:

Using the Sort transformation seems to be too slow. Is there any other way?

Yep,

Some related links for you that will help:

A distinct component please

(http://blogs.conchango.com/jamiethomson/archive/2006/12/08/SSIS_3A00_-A-distinct-component-please.aspx)

How to get Distinct Count in SSIS

(http://sqlblog.com/blogs/marco_russo/archive/2007/03/09/how-to-get-distinct-count-in-ssis.aspx)

NSort (High performance sort component that contains functionality to remove duplicates)

(http://www.ordinal.com/NsortSSIS.pdf)

"Distinct" related posts from the SSIS Search Macro

(http://search.live.com/results.aspx?q=distinct&form=QBRE&q1=macro%3Ajamiet.ssis)

-Jamie
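
For anyone who wants to see the general idea of de-duplicating without a sort, here is a minimal Python sketch. It is not taken from any of the links above and it is not an SSIS component. It assumes a duplicate means an identical line and that the set of distinct records fits in memory. The function name `distinct_lines` and the sample data are made up for illustration.

```python
from typing import Iterable, Iterator


def distinct_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each record the first time it appears, keeping input order.

    Assumption: two records are duplicates only when the whole line matches.
    """
    seen = set()  # records already emitted; a hash lookup replaces the sort
    for line in lines:
        record = line.rstrip("\r\n")  # ignore line-ending differences
        if record not in seen:
            seen.add(record)
            yield record


# Hypothetical sample rows standing in for an incoming text file.
sample = ["1,apple\n", "2,pear\n", "1,apple\n", "3,plum\n", "2,pear\n"]
print(list(distinct_lines(sample)))  # ['1,apple', '2,pear', '3,plum']
```

The same function accepts an open file object, since iterating a file yields its lines. The trade-off against a sort is memory: every distinct record is held in the set until the end of the run.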

You could also load the flat file as is into a staging table in SQL Server, then run a SELECT DISTINCT against that staging table.
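
A rough sketch of the staging-table approach follows. To keep it runnable anywhere, it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for SQL Server. The table name `staging`, the columns `id` and `name`, and the helper `load_distinct` are all hypothetical. The GROUP BY ... HAVING query is an addition here, as one possible way to answer the original question of whether duplicates exist at all.

```python
import sqlite3


def load_distinct(rows):
    """Load rows into a staging table, then return (distinct rows, duplicate groups).

    Assumption: SQLite stands in for SQL Server; table and column names are invented.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Stage the data as is, duplicates included.
        conn.execute("CREATE TABLE staging (id INTEGER, name TEXT)")
        conn.executemany("INSERT INTO staging (id, name) VALUES (?, ?)", rows)

        # Read back only the distinct rows.
        distinct = conn.execute(
            "SELECT DISTINCT id, name FROM staging ORDER BY id, name"
        ).fetchall()

        # Count how many distinct rows occur more than once in the staged data.
        dup_groups = conn.execute(
            "SELECT COUNT(*) FROM ("
            " SELECT id, name FROM staging"
            " GROUP BY id, name HAVING COUNT(*) > 1"
            ")"
        ).fetchone()[0]
        return distinct, dup_groups
    finally:
        conn.close()


# Hypothetical rows standing in for the parsed flat file.
rows = [(1, "apple"), (2, "pear"), (1, "apple"), (3, "plum"), (2, "pear")]
distinct, dup_groups = load_distinct(rows)
print(distinct)    # [(1, 'apple'), (2, 'pear'), (3, 'plum')]
print(dup_groups)  # 2
```

On a real server the load step would normally be a bulk insert from the package, and the two SELECT statements carry over with little change.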
