
Friday, March 8, 2013

Staging paradigms universe in MS SQL Server


How do you pass bulk data from the application to a staging table in the database layer of a typical 3-tier Java application?
Options:
  1. Use multiple INSERT clauses (with iBATIS/spring-jdbc/jdbc)
  2. Use a single INSERT clause with multiple VALUES clauses [SQL Server 2008 onwards] (with iBATIS/spring-jdbc/jdbc)
  3. Use batching of INSERT clause with varying batch sizes (with iBATIS/spring-jdbc/jdbc)
  4. Create a bcp file and use the bcp executable
  5. Create a bcp file and use the BULK INSERT T-SQL command (with iBATIS/spring-jdbc/jdbc)
  6. Create a bcp file and use the OPENROWSET BULK T-SQL command (with iBATIS/spring-jdbc/jdbc)
  7. XML shredding (with iBATIS/spring-jdbc/jdbc)
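To make option 2 concrete, a single INSERT with multiple row constructors can be assembled as a parameterized statement before being handed to JDBC/spring-jdbc/iBATIS. The sketch below is an illustration, not code from the benchmark: it only builds the SQL text (the class and method names are hypothetical), and assumes the SQL Server limit of 1000 row constructors per INSERT.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of option 2: one INSERT clause carrying multiple VALUES row
// constructors. SQL Server 2008+ accepts at most 1000 row constructors
// per statement, so larger data sets must be chunked by the caller.
public class MultiValuesInsertBuilder {

    public static String buildInsertSql(String table, List<String> columns, int rowCount) {
        if (rowCount < 1 || rowCount > 1000) {
            throw new IllegalArgumentException("SQL Server allows 1-1000 row constructors");
        }
        // One "(?, ?, ..., ?)" group per row, one placeholder per column.
        String placeholders = "(" + String.join(", ",
                Collections.nCopies(columns.size(), "?")) + ")";
        List<String> rows = new ArrayList<>(Collections.nCopies(rowCount, placeholders));
        return "INSERT INTO " + table + " (" + String.join(", ", columns) + ") VALUES "
                + String.join(", ", rows);
    }

    public static void main(String[] args) {
        String sql = buildInsertSql("motley", List.of("date", "name", "id"), 2);
        System.out.println(sql);
        // INSERT INTO motley (date, name, id) VALUES (?, ?, ?), (?, ?, ?)
    }
}
```

The resulting string would then be prepared once and have all row parameters bound in a single round trip, which is where this option gains over option 1's per-row INSERTs.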
We will analyze each of these options and try to establish benchmarks by staging two data sets, sized 1k and 100k records, into a table whose columns span a motley of data types, with the aim of establishing best practices for when each approach should be used.
The staging candidate chosen looks something like this:
CREATE TABLE motley
  (
     date           DATETIME,
     name           VARCHAR(50),
     id             INT,
     price          NUMERIC(18, 4),
     amount         NUMERIC(18, 4),
     fx_rate        NUMERIC(18, 7),
     is_valid       TINYINT,
     knowledge_time DATETIME
  )
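Options 4-6 all start from a bcp file, whose field and row terminators must match what bcp.exe (via its -t/-r switches) or BULK INSERT (via FIELDTERMINATOR/ROWTERMINATOR) expects. The formatter below is a hypothetical sketch, not the benchmark's code: it renders one row of the motley table as a line of a character-mode bcp file, assuming tab field terminators and newline row terminators.

```java
import java.math.BigDecimal;
import java.text.SimpleDateFormat;
import java.util.Date;

// Sketch: format one row of the motley table as a line of a character-mode
// bcp file. Tab and newline terminators are an assumption here; they must
// agree with the terminators given to bcp.exe or BULK INSERT.
public class BcpRowFormatter {

    // Matches the DATETIME columns' millisecond precision.
    private static final SimpleDateFormat SQL_DATETIME =
            new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");

    public static String formatRow(Date date, String name, int id,
                                   BigDecimal price, BigDecimal amount,
                                   BigDecimal fxRate, int isValid,
                                   Date knowledgeTime) {
        // Column order mirrors the CREATE TABLE statement above.
        return String.join("\t",
                SQL_DATETIME.format(date),
                name,
                Integer.toString(id),
                price.toPlainString(),
                amount.toPlainString(),
                fxRate.toPlainString(),
                Integer.toString(isValid),
                SQL_DATETIME.format(knowledgeTime)) + "\n";
    }
}
```

A file built from such lines could then be loaded with, for example, `BULK INSERT motley FROM '...' WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n')` (the file path being whatever the application writes).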
The stats were as follows:

[Benchmark results not reproduced here]

As you may imagine, there are pros and cons to each approach, which I have tried to outline here.

Feel free to suggest ways to further fine-tune each of these approaches, propose brand-new ones, or simply comment on the comparison metrics or factors.
Hope this helps!