Friday, March 8, 2013

A universe of staging paradigms in MS SQL Server


How do you pass bulk data from the application layer to a staging table in the database layer of a typical 3-tier Java application?
Options:
  1. Execute multiple single-row INSERT statements (with iBATIS/spring-jdbc/jdbc)
  2. Execute a single INSERT statement with multiple row-value constructors [SQL Server 2008 onwards] (with iBATIS/spring-jdbc/jdbc)
  3. Batch the INSERT statements with varying batch sizes (with iBATIS/spring-jdbc/jdbc)
  4. Create a bcp file and load it with the bcp executable
  5. Create a bcp file and load it with the BULK INSERT T-SQL command (with iBATIS/spring-jdbc/jdbc)
  6. Create a bcp file and load it with the OPENROWSET(BULK ...) T-SQL command (with iBATIS/spring-jdbc/jdbc)
  7. XML shredding (with iBATIS/spring-jdbc/jdbc)
We will analyze each of these options, benchmark them against two data sets of 1k and 100k records whose columns span a motley of data types, and try to establish best practices as to where each approach should be used. Minimal sketches of each option (plain JDBC for brevity; the iBATIS and spring-jdbc variants are analogous) follow the table definition below.
The staging candidate chosen looks something like this:
CREATE TABLE motley
  (
     date           DATETIME,
     name           VARCHAR(50),
     id             INT,
     price          NUMERIC(18, 4),
     amount         NUMERIC(18, 4),
     fx_rate        NUMERIC(18, 7),
     is_valid       TINYINT,
     knowledge_time DATETIME
  )
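Option 1 is the naive baseline: one single-row INSERT per record, so one round trip to the server per record. A minimal sketch, assuming plain JDBC; the connection URL and the row values are placeholders for whatever your application actually produces:

import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Timestamp;

// Option 1: one single-row INSERT per record, executed individually.
public class MultipleInserts {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                 "jdbc:sqlserver://localhost;databaseName=staging;integratedSecurity=true");
             PreparedStatement ps = con.prepareStatement(
                 "INSERT INTO motley (date, name, id, price, amount, fx_rate, is_valid, knowledge_time)"
               + " VALUES (?, ?, ?, ?, ?, ?, ?, ?)")) {
            for (int i = 0; i < 1000; i++) {                        // stand-in row source
                Timestamp now = new Timestamp(System.currentTimeMillis());
                ps.setTimestamp(1, now);                            // date
                ps.setString(2, "name-" + i);                       // name
                ps.setInt(3, i);                                    // id
                ps.setBigDecimal(4, BigDecimal.valueOf(10.5));      // price
                ps.setBigDecimal(5, BigDecimal.valueOf(1000));      // amount
                ps.setBigDecimal(6, BigDecimal.valueOf(1.2345678)); // fx_rate
                ps.setByte(7, (byte) 1);                            // is_valid
                ps.setTimestamp(8, now);                            // knowledge_time
                ps.executeUpdate();                                 // one round trip per row
            }
        }
    }
}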
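Option 2 collapses many rows into a single INSERT using row-value constructors, available from SQL Server 2008 onwards. Two server-side caps shape this approach: a VALUES list may hold at most 1000 rows, and a parameterized statement may carry at most 2100 parameters, which at 8 columns works out to roughly 262 rows per statement, so the data set must be chunked. A sketch under the same placeholder assumptions:

import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.Timestamp;

// Option 2: one INSERT carrying many row-value constructors (SQL Server 2008+).
public class MultiRowValues {
    // 2100-parameter cap / 8 columns = 262 rows per statement at most.
    private static final int ROWS_PER_CHUNK = 2100 / 8;

    static void insertChunk(Connection con, int rows) throws Exception {
        if (rows > ROWS_PER_CHUNK) {
            throw new IllegalArgumentException("chunk exceeds the parameter cap");
        }
        StringBuilder sql = new StringBuilder(
            "INSERT INTO motley (date, name, id, price, amount, fx_rate, is_valid, knowledge_time) VALUES ");
        for (int r = 0; r < rows; r++) {
            sql.append(r == 0 ? "" : ", ").append("(?, ?, ?, ?, ?, ?, ?, ?)");
        }
        try (PreparedStatement ps = con.prepareStatement(sql.toString())) {
            int p = 1;
            for (int r = 0; r < rows; r++) {                        // stand-in row source
                Timestamp now = new Timestamp(System.currentTimeMillis());
                ps.setTimestamp(p++, now);
                ps.setString(p++, "name-" + r);
                ps.setInt(p++, r);
                ps.setBigDecimal(p++, BigDecimal.valueOf(10.5));
                ps.setBigDecimal(p++, BigDecimal.valueOf(1000));
                ps.setBigDecimal(p++, BigDecimal.valueOf(1.2345678));
                ps.setByte(p++, (byte) 1);
                ps.setTimestamp(p++, now);
            }
            ps.executeUpdate();                                     // one round trip per chunk
        }
    }
}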
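Option 3 keeps the single-row statement but batches it through the JDBC driver, so many parameter sets travel together; the batch size is precisely the knob being varied in the benchmark. A sketch, again with placeholder connection details and row values:

import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Timestamp;

// Option 3: JDBC batching with a tunable batch size.
public class BatchedInserts {
    public static void main(String[] args) throws Exception {
        int batchSize = 500;                                        // the knob being benchmarked
        try (Connection con = DriverManager.getConnection(
                 "jdbc:sqlserver://localhost;databaseName=staging;integratedSecurity=true")) {
            con.setAutoCommit(false);                               // commit once per run, not per row
            try (PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO motley (date, name, id, price, amount, fx_rate, is_valid, knowledge_time)"
                   + " VALUES (?, ?, ?, ?, ?, ?, ?, ?)")) {
                for (int i = 0; i < 100000; i++) {                  // stand-in row source
                    Timestamp now = new Timestamp(System.currentTimeMillis());
                    ps.setTimestamp(1, now);
                    ps.setString(2, "name-" + i);
                    ps.setInt(3, i);
                    ps.setBigDecimal(4, BigDecimal.valueOf(10.5));
                    ps.setBigDecimal(5, BigDecimal.valueOf(1000));
                    ps.setBigDecimal(6, BigDecimal.valueOf(1.2345678));
                    ps.setByte(7, (byte) 1);
                    ps.setTimestamp(8, now);
                    ps.addBatch();
                    if ((i + 1) % batchSize == 0) ps.executeBatch();
                }
                ps.executeBatch();                                  // flush the final partial batch
            }
            con.commit();
        }
    }
}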
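Options 4 to 6 all start the same way: the application writes the rows out as a character-format bcp file, and the load itself happens server-side. The sketch below pairs the file with BULK INSERT (option 5); the file path and connection URL are placeholders, and note that the path must be readable by the SQL Server service account, not just by the JVM. The file is written with \r\n row endings because, per the BULK INSERT documentation, a '\n' row terminator is read as CRLF:

import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Options 4-6: stage via a character-format flat file loaded server-side.
public class BcpFileLoad {
    public static void main(String[] args) throws Exception {
        Path file = Paths.get("C:\\staging\\motley.bcp");           // must be visible to SQL Server
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.US_ASCII)) {
            for (int i = 0; i < 100000; i++) {                      // stand-in row source
                out.write("2013-03-08 00:00:00\tname-" + i + "\t" + i
                        + "\t10.5000\t1000.0000\t1.2345678\t1\t2013-03-08 00:00:00\r\n");
            }
        }
        try (Connection con = DriverManager.getConnection(
                 "jdbc:sqlserver://localhost;databaseName=staging;integratedSecurity=true");
             Statement st = con.createStatement()) {
            // TABLOCK lets the load qualify for minimal logging where the
            // database recovery model allows it.
            st.executeUpdate("BULK INSERT motley FROM '" + file + "'"
                           + " WITH (FIELDTERMINATOR = '\\t', ROWTERMINATOR = '\\n', TABLOCK)");
        }
    }
}

For option 4 you would instead shell out to the bcp utility against the same file (for example bcp staging.dbo.motley in C:\staging\motley.bcp -c -T -S localhost, where -c selects character format and -T a trusted connection), and for option 6 you would swap the statement for an INSERT ... SELECT FROM OPENROWSET(BULK 'C:\staging\motley.bcp', FORMATFILE = '...') AS src, which requires a format file but lets you filter or transform rows on the way in.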
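Option 7 sidesteps flat files entirely: the application serializes the rows into one XML document, ships it to the server as a single parameter, and the server shreds it back into rows with the xml type's nodes() and value() methods. A sketch; the attribute names and connection URL are hypothetical:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Option 7: ship all rows as one XML document and shred it server-side.
public class XmlShredding {
    public static void main(String[] args) throws Exception {
        StringBuilder xml = new StringBuilder("<rows>");
        for (int i = 0; i < 1000; i++) {                            // stand-in row source
            xml.append("<r d=\"2013-03-08T00:00:00\" n=\"name-").append(i)
               .append("\" id=\"").append(i)
               .append("\" p=\"10.5\" a=\"1000\" fx=\"1.2345678\" v=\"1\"")
               .append(" kt=\"2013-03-08T00:00:00\"/>");
        }
        xml.append("</rows>");

        // CROSS APPLY nodes() turns each <r> element into a relational row.
        String sql =
            "INSERT INTO motley (date, name, id, price, amount, fx_rate, is_valid, knowledge_time) " +
            "SELECT r.value('@d', 'datetime'), r.value('@n', 'varchar(50)'), " +
            "       r.value('@id', 'int'), r.value('@p', 'numeric(18,4)'), " +
            "       r.value('@a', 'numeric(18,4)'), r.value('@fx', 'numeric(18,7)'), " +
            "       r.value('@v', 'tinyint'), r.value('@kt', 'datetime') " +
            "FROM (SELECT CAST(? AS xml) AS doc) AS x " +
            "CROSS APPLY x.doc.nodes('/rows/r') AS t(r)";

        try (Connection con = DriverManager.getConnection(
                 "jdbc:sqlserver://localhost;databaseName=staging;integratedSecurity=true");
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, xml.toString());
            ps.executeUpdate();                                     // one round trip for the whole set
        }
    }
}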
The stats were as follows:

[Benchmark results chart]
As you may imagine, there are pros and cons to each approach, which I have tried to outline here.

Feel free to suggest ways to further fine-tune each of these approaches, to propose brand-new ones, or simply to comment on the comparison metrics or factors.
Hope this helps!