Importing XML into SQL Server 2000
All the
examples in this article are based on a sample table having following schema:
|
USE
Northwind
GO
CREATE
TABLE tblBooks(
Row_ID INTEGER IDENTITY(1,1) PRIMARY
KEY,
BookTitle VARCHAR(100) NOT NULL,
Publisher VARCHAR(100) NULL,
DateOfPurchase DateTime NULL
)
GO
|
And the
input source XML document looks like:
|
<?xml
version="1.0"?>
<Books>
<Book>
<Title>XML
Application Development with MSXML 4.0</Title>
<Publisher>Wrox
Press</Publisher>
<DateOfPurchase>2/1/2002</DateOfPurchase>
</Book>
<Book>
<Title>Professional
SQL Server 2000 XML</Title>
<Publisher>Wrox
Press</Publisher>
<DateOfPurchase>9/10/2001</DateOfPurchase>
</Book>
<Book>
<Title>Professional
XML for .NET Developers</Title>
<Publisher>Wrox
Press</Publisher>
<DateOfPurchase>12/20/2001</DateOfPurchase>
</Book>
</Books>
|
Make sure you have
the above mentioned sample table (tblBooks) created and the source XML file saved somewhere on the
disk (c:\Books.xml).
This
article assumes that you have some familiarity with MSXML 4.0, ADO 2.7, .NET
Framework and SQLXML 3.0.
Using MSXML 4.0 DOM and ADO 2.7
In this method we
write the code to load and parse the source XML document using MSXML 4.0 DOM
and then use ADO 2.7 to insert the rows into the table. This method gives us
the maximum flexibility and can be used with any database, not just SQL Server
2000.
Start Microsoft
Visual Basic 6.0, create a standard EXE Project, add reference (Project |
Reference) to Microsoft XML,
v4.0 (MSXML4.dll) and Microsoft ActiveX Data Objects Library 2.7 (ADO 2.7),
double click on the form and write the following code in the Form_Load() method:
First, declare MSXML
and ADO objects:
|
Dim objXMLDOM As New MSXML2.DOMDocument40
Dim objNodes As IXMLDOMNodeList
Dim objBookNode As IXMLDOMNode
Dim objADORS As New ADODB.Recordset
Dim objADOCnn As
New ADODB.Connection
|
Next, load the source
XML file into an MSXML DOMDocument object:
|
objXMLDOM.async
= False
If Not
objXMLDOM.Load("C:\Books.xml") Then
MsgBox
"Error: " & objXMLDOM.parseError.reason
Exit Sub
End If
|
Let's now try to see
if there are any records to be inserted:
|
Set objNodes =
objXMLDOM.selectNodes("/Books/Book")
If
objNodes.length <= 0 Then
MsgBox
"No records to be inserted!"
Exit Sub
End If
|
So far, we have
loaded the XML document into memory using DOM and we know that there are some
records to be inserted into the database table. Let's now connect to the
database, create a blank tblBooks recordset and start adding rows into the table:
|
objADOCnn.Open _
"PROVIDER=SQLOLEDB;SERVER=.;UID=sa;PWD=;DATABASE=NorthWind;"
objADORS.Open _
"SELECT
* FROM tblBooks WHERE 1 = 2", objADOCnn,
adOpenKeyset,
adLockOptimistic
If Not
objADORS.Supports(adAddNew) Then
MsgBox
"RecordSet does not support adding new records!"
GoTo
DoCleanup
End If
|
We create a blank
recordset based on the tblBooks table by using the Where
1=2 condition.
Next, we make sure that the recordset supports adding new records.
Following lines of
code actually add new records to the recordset and save them into the database
table:
|
On Error Resume
Next
For Each
objBookNode In objNodes
With
objADORS
.AddNew
.fields("BookTitle") = _
objBookNode.selectSingleNode("Title").nodeTypedValue
.fields("Publisher") = _
objBookNode.selectSingleNode("Publisher").nodeTypedValue
.fields("DateOfPurchase") = _
objBookNode.selectSingleNode("DateOfPurchase").nodeTypedValue
.Update
End With
Next objBookNode
|
The above lines use
the ADO Recordset's AddNew method to add a blank record to the recordset.
Next we set the field's values by selecting the node values from the XML DOM Nodes;
the call to Update method saves the newly inserted row. Finally, we close the
recordset and database connection.
After executing the
above code, if you select the rows from the tblBooks database table in Query Analyzer, you'll see the newly
inserted records:

The above approach is
very easy to implement and gives the great amount of control as to how and what
to read and insert. The only problem with the above scheme is with the inherent
design of DOM: XML documents needs to be loaded (and then unloaded) entirely
into memory to work on them – this might not be the best solution when the
source XML document is large. This method will require significant amount of
memory and system resources when the source XML documents are huge. Also, as we
are inserting one record at a time, it is not required to load the entire XML
document into memory. Enter SAX – an excellent lightweight alternative to DOM
when working with large XML documents.
Using MSXML 4.0 SAX and ADO 2.7
Once again, we'll
insert rows into the sample table (tblBooks) from the source XML file (c:\Books.xml). You might
want to delete all the existing records before running this sample. To delete
all records and reset the identity field's value, run following T-SQL command
in the SQL Query Analyzer:
Start Visual Basic
6.0, create a new standard EXE Project, add reference to Microsoft XML 4.0 and
Microsoft ActiveX Data Objects 2.7, add a new class (Project
| Add Class Module), change the class
name from Class1 to ContentHandler and write the following line inside this class module:
|
Option Explicit
Implements IVBSAXContentHandler
|
After you write the
above two lines, select IVBSAXContentHandler from the first (General) combo box followed by selecting each item from the second
(Declarations) combo box. Basically, we are implementing the ContentHandler SAX interface. To learn more about SAX, visit the
PerfectXML MSXML focus area at http://www.perfectxml.com/msxml.asp.
Let's declare some
variables to maintain the state; write following lines below the Implements IVBSAXContentHandler line:
|
Dim objADORS As New ADODB.Recordset
Dim objADOCnn As New ADODB.Connection
Dim strTitle As String
Dim strPublisher As String
Dim strDate As String
Dim CurField As Integer
|
When the startDocument event is fired, we connect to the database and create an
empty Recordset; when the endDocument is called, we close the recordset and the database
connection:
|
Private Sub IVBSAXContentHandler_startDocument()
objADOCnn.Open _
"PROVIDER=SQLOLEDB;SERVER=.;UID=sa;PWD=;DATABASE=NorthWind;"
objADORS.Open "SELECT * FROM tblBooks
WHERE 1 = 2", objADOCnn, _
adOpenKeyset,
adLockOptimistic
End Sub
Private Sub IVBSAXContentHandler_endDocument()
objADORS.Close
objADOCnn.Close
Set objADORS =
Nothing
Set objADOCnn =
Nothing
End Sub
|
The startElement event handler initializes the variables based on the strLocalName:
|
Private Sub
IVBSAXContentHandler_startElement(strNamespaceURI As String, strLocalName As
String, strQName As String, ByVal oAttributes As MSXML2.IVBSAXAttributes)
If strLocalName
= "Book" Then
strTitle =
" "
strPublisher
= " "
strDate =
" "
Else
Select Case
(strLocalName)
Case
"Title"
CurField = 1
Case
"Publisher"
CurField = 2
Case
"DateOfPurchase"
CurField = 3
End Select
End If
End Sub
|
The characters event handler and endElement event handler are the two main functions that store the XML
data into local variables first and then insert a new record into the ADO Recordset,
respectively:
|
Private Sub IVBSAXContentHandler_characters(strChars As
String)
Select Case
(CurField)
Case 1
strTitle
= strTitle & strChars
Case 2
strPublisher
= strPublisher & strChars
Case 3
strDate
= strDate & strChars
End Select
End Sub
Private Sub
IVBSAXContentHandler_endElement(strNamespaceURI As String, strLocalName As
String, strQName As String)
If strLocalName
= "Book" Then
With
objADORS
.AddNew
.Fields("BookTitle") = strTitle
.Fields("Publisher") = strPublisher
.Fields("DateOfPurchase") = strDate
.Update
End With
Else
CurField = 0
End If
End Sub
|
The characters event handler stores the XML node value into a local
variable based on the current node (set in startElement handler); the endElement when called for the Book node, inserts a new record into the recordset, this code is
exactly same as in the earlier example.
The rest of the
ContentHandler methods have the blank implementation:
|
Private Property Set
IVBSAXContentHandler_documentLocator(ByVal RHS As MSXML2.IVBSAXLocator)
End Property
Private Sub
IVBSAXContentHandler_endPrefixMapping(strPrefix As String)
End Sub
Private Sub
IVBSAXContentHandler_ignorableWhitespace(strChars As String)
End Sub
Private Sub IVBSAXContentHandler_processingInstruction(strTarget
As String, strData As String)
End Sub
Private Sub IVBSAXContentHandler_skippedEntity(strName As
String)
End Sub
Private Sub
IVBSAXContentHandler_startPrefixMapping(strPrefix As String, strURI As
String)
End Sub
|
Now that we have the ContentHandler implementation ready, let's write the code to start parsing
the source XML document using SAX and tell it to use the above class as the ContentHandler implementation. Write the following code in the Form_Load
method:
|
Dim objReader As
New MSXML2.SAXXMLReader40
Dim CHandler As
New ContentHandler
Set
objReader.ContentHandler = CHandler
objReader.parseURL "c:\Books.xml"
|
Run the application
and you'll see that this example produces the same results as the first
example; inserts records into the tblBooks table from the c:\Books.xml file. Like the first example, this approach also provides
the maximum control over how and what data is imported, in addition this method
is more efficient when working with large XML documents. The only downside is
that SAX programming requires extra (little complicated) code and requires
application to maintain the state.
Using SQLXML 3.0 XML Bulk Load
Microsoft recently
(Feb 13, 2002) announced the SQLXML 3.0 final release. This release further enhances
the XML Bulk Load functionality, which was introduced in Web Release 1.0. SQLXML
3.0 provides full support for XSD schemas while using the Bulk Load functionality.
Let's look at how to
use SQLXML 3.0 XML Bulk Load functionality to import C:\Books.xml into tblBooks database table. Once again, you might want to truncate the
table. To learn more about SQLXML, visit PerfectXML SQLXML Focus section at http://www.perfectxml.com/sqlxml.asp.
SQLXML 3.0 Bulk load
functionality requires an XSD schema file, in addition to the source XML
document. Save the following text as Books.xsd:
|
<xsd:schema
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
<xsd:element
name="Book" sql:relation="tblBooks" >
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Title" type="xsd:string"
sql:field="BookTitle" />
<xsd:element name="Publisher" type="xsd:string"
/>
<xsd:element name="DateOfPurchase"
type="xsd:date" />
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
|
The above XSD schema
file defines the structure of Books.xml file, maps the Book node to the tblBooks table and also defines the table fields mapping.
Start Visual Basic
6.0, create a new standard EXE project, and add reference to Microsoft SQLXML
BulkLoad 3.0 Type Library (xblkld3.dll) and write the following code in the Form_Load
method:
|
Dim objXBL As
New SQLXMLBulkLoad3
objXBL.ConnectionString = _
"PROVIDER=SQLOLEDB.1;SERVER=.;UID=sa;PWD=;DATABASE=NorthWind;"
objXBL.ErrorLogFile = "C:\SQLXML3Books.errlog"
objXBL.KeepIdentity = False
objXBL.Execute
"C:\Books.xsd", "C:\Books.xml"
Set objXBL =
Nothing
|
The above code first
creates an instance of SQLXML Bulk Load object and initializes the ConnectionString, ErrorLogFile, and KeepIdentity
properties. Setting the KeepIdentity to False tells the bulk-load operation to ignore the identity column
and let SQL Server assigns a value to the identity column.
Note that the connection
string now refers to the SQLOLEDB.1 provider in this example.
If there are any
errors during the bulk load operation, they are saved into the error log file (C:\SQLXML3Books.errlog).
You may also use the
XML Bulk Load object model (SQLXMLBulkLoad.SQLXMLBulkLoad.3.0) from with a DTS
Package script, and schedule the package to run periodically to import XML data
files into SQL Server 2000 database tables.
Using OPENXML
SQL Server 2000
introduced a new Transact SQL function called OPENXML, which allows using XML data as a relational rowset. To
learn more about OPENXML, see http://www.perfectxml.com/articles/xml/openxml.asp.
We'll now see an
example of OPENXML in conjunction with some ADO code to import the C:\books.xml file into the tblBooks database table.
Let's start by
creating a stored procedure that actually uses the OPENXML
function to insert the data; we'll then call this stored procedure from the ADO
code.
|
CREATE PROCEDURE sp_Insert_Books_Openxml
@strXML ntext
AS
DECLARE @iDoc
int
EXECUTE
sp_xml_preparedocument @iDoc OUTPUT, @strXML
INSERT INTO
tblBooks (BookTitle, Publisher, DateOfPurchase)
(SELECT * FROM
OpenXML(@iDoc, '/Books/Book', 2)
WITH
(Title VARCHAR(100),
Publisher
VARCHAR(100),
DateOfPurchase
DateTime)
)
EXECUTE
sp_xml_removedocument @iDoc
GO
|
The above stored
procedure code accepts an ntext
datatype parameter, which contains the source XML document text. The INSERT..SELECT statement uses the OPENXML function to select all the /Books/Book nodes and insert rows into the tblBooks
table with the values of Title,
Publisher, and DateOfPurchase elements. The
value of 2 as the second parameter to OPENXML indicates to use the element-centric mapping.
Let's now write the
Visual Basic 6.0 code to call this stored procedure and pass the contents of C:\books.xml as the parameter:
Start Visual Basic
6.0, create a new standard EXE project, and add reference to MSXML 4.0 and ADO.
Note that once again, we are using MSXML 4.0 to get the contents of the XML
document; if you wish you may use the standard file I/O API and get the XML
document contents as we don't necessarily need the DOM processing in this
example. For simplicity, we'll use MSXML 4.0 DOM to get the source XML text:
|
Dim objXMLDOM As New MSXML2.DOMDocument40
Dim objADOCnn As New ADODB.Connection
Dim objADOComm
As New ADODB.Command
objXMLDOM.async
= False
If Not
objXMLDOM.Load("C:\Books.xml") Then
MsgBox
"Error: " & objXMLDOM.parseError.reason
Exit Sub
End If
|
Once the document is
loaded, let's use ADO to call the sp_Insert_Books_Openxml stored procedure described above:
|
objADOCnn.Open _
"PROVIDER=SQLOLEDB;SERVER=.;UID=sa;PWD=;DATABASE=NorthWind;"
Set
objADOComm.ActiveConnection = objADOCnn
objADOComm.CommandText
= "sp_Insert_Books_Openxml"
objADOComm.CommandType = adCmdStoredProc
objADOComm.Parameters.Refresh
objADOComm.Parameters.Item(1).Value = objXMLDOM.xml
objADOComm.Execute
objADOCnn.Close
Set objADOCnn =
Nothing
Set objADOComm =
Nothing
Set objXMLDOM =
Nothing
|
The above code calls
the sp_Insert_Books_Openxml stored procedure and passes the XML text (objXMLDOM.xml) to it.
The .NET Way
Microsoft .NET
Framework provides many ways to work with XML and relational data. The tight
integration of XML in .NET facilitates many ways in which XML data can be
imported or exported in/from the relational database. Let's look at one of the
ways in which XML file can be imported into SQL Server table using the DataSet class.
Start Visual Studio
.NET, create a Visual Basic .NET Console Application. Begin with adding the
following Imports statements at the top:
|
Imports
System
Imports
System.Data
Imports
System.Data.SqlClient
Imports System.Data.SqlTypes
|
Next, write the
following code in the Main function:
|
Dim objSQLConn As New _
SqlConnection("SERVER=.;UID=sa;PWD=;DATABASE=NorthWind;")
Dim
objAdapter As SqlDataAdapter
Dim
objDataRow, objDBRow As DataRow
Dim objDSXML As New
DataSet()
Dim objDSDBTable As New
DataSet("tblBooks")
Dim ObjCmdBuilder As
SqlCommandBuilder
objDSXML.ReadXml("C:\Books.xml")
objSQLConn.Open()
objAdapter = New
SqlDataAdapter("SELECT BookTitle, Publisher, DateOfPurchase FROM
tblBooks WHERE 1 = 2 ", objSQLConn)
objAdapter.Fill(objDSDBTable,
"tblBooks")
For Each objDataRow In
objDSXML.Tables(0).Rows
With objDSDBTable.Tables(0)
objDBRow = .NewRow()
objDBRow(0) = objDataRow(0)
objDBRow(1) = objDataRow(1)
objDBRow(2) = objDataRow(2)
.Rows.Add(objDBRow)
End With
ObjCmdBuilder = New
SqlCommandBuilder(objAdapter)
objAdapter.Update(objDSDBTable,
"tblBooks")
Next
objSQLConn.Close()
|
The above code loads
the DataSet (objDSXML) object with the content of the source XML document. Next,
we connect to the SQL Server 2000 database and create a blank DataSet (objDSDBTable)
based on the tblBooks table.
Now, for each row in
the source XML DataSet (objDSXML), we create a new row in the table DataSet
(objDSDBTable), set the field values and add the new row into the table.
At the end, we close the database connection.
DTS ActiveX Script & SQLXML 3.0 Bulk Load
Earlier in the article, we discussed using SQLXML 3.0 XML Bulk Load object model
to import the XML file into SQL Server by writing a Visual Basic Application. Let's do the same thing from inside
a DTS package by writing the ActiveX Script. Before we create the DTS package, make sure you have c:\Books.xml and
c:\Books.xsd files (discussed above) saved on your hard disk, and tblBooks table created within NorthWind sample database.
Start Enterprise Manager, expand Data Transformation Services node, right clik on Local Packages and select New Package
to enter the DTS designer. Add a new ActiveX Script Task from the Task toolbar (or right click in the
designer and select Add Task | ActiveX Script Task.... As an optional step,
name the script task as ImportBooks.
Write the following Visual Basic ActiveX Script under the Main() Function
Dim objXBulkLoad
Set objXBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad.3.0")
objXBulkLoad.ConnectionString = "PROVIDER=SQLOLEDB.1;SERVER=.;UID=sa;PWD=;DATABASE=NorthWind;"
objXBulkLoad.KeepIdentity = False
'Optional Settings
objXBulkLoad.ErrorLogFile = "c:\NWError.LOG"
objXBulkLoad.TempFilePath = "c:\temp"
'Executing the bulk-load
objXBulkLoad.Execute "c:\books.xsd", "c:\books.xml"
Main = DTSTaskExecResult_Success
|
Optionally, click on the Parse button just to make sure that the above code is error-free, and then click on OK button.
You can right click on the ActiveX Script Task and select Execute Step to actually run the above script code
and import the XML file data into SQL Server table. Execute the step and then verify that the data is insterted
using the Query Analyzer or Enterprise Manager. Save the package and try to execute it, if you wish.
The above VBScript code is very similar to the earlier Visual Basic Bulk Load example. The Bulk Load process
uses the schema file (books.xsd) to determine which table to import the data into. Change something in the XSD file
(such as from sql:relation="tblBooks" to sql:relation="ABCDtblBooks", save the XSD and run the
package (or ActiveX Script task) and it will generate the error and you can also see the log generated in the
c:\NWError.LOG file. Rollback the change in the XSD file and run the script again, and it should properly import
the data in the sample tblBooks table.
DTS ActiveX Script & MSXML 4.0 DOM
Earlier in the article, we discussed using MSXML 4.0 DOM object model
to import the XML file into SQL Server by writing a Visual Basic Application. Let's do the same thing from inside
a DTS package by writing the ActiveX Script. Before we create the DTS package, make sure you have c:\Books.xml
(discussed above) saved on your hard disk, and tblBooks table created within NorthWind sample database.
Start Enterprise Manager, expand Data Transformation Services node, right clik on Local Packages and select New Package
to enter the DTS designer. Add a new ActiveX Script Task from the Task toolbar (or right click in the
designer and select Add Task | ActiveX Script Task....
As an optional step, name the script task as ImportBooksDOM.
Write the following Visual Basic ActiveX Script under the Main() Function
Dim objXMLDOM
Dim objNodes
Dim objBookNode
Dim objADORS
Dim objADOCnn
Const adOpenKeyset = 1
Const adLockOptimistic = 3
Set objXMLDOM = CreateObject("MSXML2.DOMDocument.4.0")
objXMLDOM.async = False
objXMLDOM.validateOnParse = False
'No error handling done
objXMLDOM.load "c:\books.xml"
Set objNodes = objXMLDOM.selectNodes("/Books/Book")
Set objADOCnn = CreateObject("ADODB.Connection")
Set objADORS = CreateObject("ADODB.Recordset")
objADOCnn.Open "PROVIDER=SQLOLEDB;SERVER=.;UID=sa;PWD=;DATABASE=NorthWind;"
objADORS.Open "SELECT * FROM tblBooks WHERE 1 = 2", objADOCnn, adOpenKeyset, adLockOptimistic
For Each objBookNode In objNodes
With objADORS
.AddNew
.fields("BookTitle") = objBookNode.selectSingleNode("Title").nodeTypedValue
.fields("Publisher") = objBookNode.selectSingleNode("Publisher").nodeTypedValue
.fields("DateOfPurchase") = objBookNode.selectSingleNode("DateOfPurchase").nodeTypedValue
.Update
End With
Next
objADORS.Close
objADOCnn.Close
Main = DTSTaskExecResult_Success
|
Optionally, click on the Parse button just to make sure that the above code is error-free, and then click on OK button.
You can right click on the ActiveX Script Task and select Execute Step to actually run the above script code
and import the XML file data into SQL Server table. Execute the step and then verify that the data is insterted
using the Query Analyzer or Enterprise Manager. Save the package and try to execute it, if you wish.
The above VBScript code is exactly similar to the earlier ADO + MSXML 4.0 DOM example. We use MSXML DOM to
load the XML document, and then ADO to add rows into the database table.
Using DataDirect XML ADO Provider
As per the DataDirect Technologies Web site,
"DataDirect Technologies is the industry leader in data connectivity.
Our standards-based technology ensures consistent behavior and performance across
diverse environments such as J2EE, .NET, Web Services, and client/server.
With the most comprehensive support for ODBC, JDBC, and
dynamic XML/relational data transformation, we provide the easiest experience connecting software
to data. This helps customers bring applications to market faster by speeding development,
integration and deployment—no matter what the standard, platform, or
data source."
The DataDirect Connect for ADO, release 2.7
is the premier suite of native ADO providers, enabling reliable, high-performance data connectivity for
client/server and web-based applications on the Microsoft platform.
Connect for ADO supports the latest OLE DB specification,
providing data access across a broad range of client platforms and data sources, including XML.
One of the providers in this suite is the
Connect ADO for XML (the "XML data provider") that supports tabular- and hierarchical-formatted
XML documents that can be accessed from either a local file system or a web server, using the ADO API.
For instance, this provider can be used from within the traditional ADO code to load the XML documents as recordsets.
Let's use this OLE Provider and see how it can be used to import the XML data into SQL Server. We'll write
a Visual Basic application, but this can be easily converted into a VBScript that can be used
inside a DTS package as a ActiveX Script task, as we saw above.
- Download the Evaluation version of DataDirect Connect for ADO, release 2.7 from the DataDirect Web site.
- Install it. Make sure to install the Connect ADO for XML (the "XML data provider").
- Click on Start | Programs | DataDirect Connect ADO 2.7 | DataDirect Configuration Manager.
- Click on the New Data Source toolbar button OR right click on Data Sources in the tree and select New Data Source.
- From the New Data Source dialog, select DataDirect XML ADO Provider from the
OLE DB Provider select box, and type the Name of the data source as NWBooks and then
click on the Set Up Data Source button
-
In the Data Source Setup dialog, select the XML Document from the Location Type frame,
type the Location as c:\books.xml and click on Test Connect. If the connection
succeeded, click on OK button.
After you click OK, you should see the NWBooks data source in the Configuration Manager window under the Data Sources node in the tree.
Once the Data Source is configures, the code to access this data source (the books XML file) is as below:
Start Visual Basic 6.0, create a new Standard EXE project, add reference to Microsoft ADO 2.7, double click on
the form and write the following code in the Form_Load method:
Dim objADOConnXML As New ADODB.Connection
Dim objRSXML As New ADODB.Recordset
Dim objField As Field
Dim objADOConnDB As New ADODB.Connection
Dim objRSDB As New ADODB.Recordset
Dim iIndex As Integer
'Open the XML Data Source
objADOConnXML.Provider = "DataDirect XML ADO Provider"
objADOConnXML.ConnectionString = "Data Source=NWBooks"
objADOConnXML.Open
Set objRSXML = objADOConnXML.Execute("Books")
'Open the database recordset
objADOConnDB.Open "PROVIDER=SQLOLEDB.1;SERVER=.;UID=sa;PWD=;DATABASE=NorthWind;"
objRSDB.Open "SELECT * FROM tblBooks where 1 = 2 ", objADOConnDB, adOpenStatic, adLockOptimistic
'Insert data from XML Recordset to Database Recordset
While Not objRSXML.EOF
'Add a new Record
objRSDB.AddNew
'Populate Fields
'Starting the index from 1 to ignore the identity column
iIndex = 1
For Each objField In objRSXML.Fields
objRSDB.Fields(iIndex) = objField.Value
iIndex = iIndex + 1
Next
'Save the new database record
objRSDB.Update
objRSXML.MoveNext
Wend
objRSXML.Close
objADOConnXML.Close
objRSDB.Close
objADOConnDB.Close
Set objRSXML = Nothing
Set objADOConnXML = Nothing
Set objRSDB = Nothing
Set objADOConnDB = Nothing
|
The above code opens two ADO Connections and two recordsets and then copies the data from one (source)
recordset into the destination recordset. The source recordset is opened using the DataDirect XML ADO Provider and
it points to the c:\books.xml XML file. The DataDirect XML ADO Provider allows us to use XML file
as a ADO recordset from which we read the data and insert into the tblBooks database table in the Northwind database.
Using Microsoft XML OLE-DB Simple Provider
The Microsoft OLE-DB Simple Provider allows
ADO to access any data for which a provider has been written using the OLE DB Simple Provider Toolkit. Simple providers are intended to access data sources that require only fundamental OLE DB support, such as in-memory arrays or XML documents.
MSXML DSOControl is such simple provider that can be used to read hierarchical XML data as a ADO recordset.
See KB article Q271772: Access Hierarchical XML Data with the XML OLE DB Simple Provider for more details on this.
In the following example, we use ADO 2.7 and MSXML 4.0 DSOControl as a simple OLE DB Provider, to import XML file (books.xml) into
SQL Server table (tblBooks in the Northwind database).
Start Visual Basic 6.0, create a new standard EXE project, add reference to ADO 2.7, double click on the form and write following
code in the Form_Load method:
Dim objADOConnSrc As New ADODB.Connection
Dim objADORsSrc As New ADODB.Recordset
Dim objADOConnDest As New ADODB.Connection
Dim objADORsDest As New ADODB.Recordset
Dim srcFields As ADODB.Fields
'Load XML document as a ADO Recordset
objADOConnSrc.Open "Provider=MSDAOSP; Data Source=MSXML2.DSOControl.4.0;"
objADORsSrc.Open "c:\books.xml", objADOConnSrc
'Open the destination recordset
objADOConnDest.Open "PROVIDER=SQLOLEDB.1;SERVER=.;UID=sa;PWD=;DATABASE=Northwind;"
objADORsDest.Open "SELECT * FROM [tblBooks] WHERE 1 = 2", objADOConnDest, adOpenKeyset, adLockOptimistic
'For each record in the source XML recordset
While Not objADORsSrc.EOF
With objADORsDest
.AddNew
Set srcFields = objADORsSrc.Fields
.Fields("BookTitle").Value = srcFields("Title").Value
.Fields("Publisher").Value = srcFields("Publisher").Value
.Fields("DateOfPurchase").Value = srcFields("DateOfPurchase").Value
.Update
End With
objADORsSrc.MoveNext
Wend
objADORsSrc.Close
Set objADORsSrc = Nothing
objADORsDest.Close
Set objADORsDest = Nothing
objADOConnDest.Close
Set objADOConnDest = Nothing
Unload Me
|
 |
Using TextCopy.exe
The SQL Server 2000 "Binn" folder (Program Files\Microsoft SQL Server\MSSQL\Binn) contains a command-line
utility named TextCopy.exe. Written using DB-Library, this utility is not really documented (and possibly is not supported).
Passing the /? parameter to this executable results in following help text (alongwith help on each parameter):
Copies a single text or image value into or out of SQL Server. The value
is a specified text or image 'column' of a single row (specified by the
"where clause") of the specified 'table'.
In other words, in our case, where we want to import XML into SQL Server, for TextCopy to work,
- you need to already have a row in the table
- the entire XML text will be imported into a single column of type text or image
For this example, consider the sample table:
USE Northwind
GO
CREATE TABLE tblBooksEx(
Row_ID INTEGER IDENTITY(1,1) PRIMARY KEY,
BooksData text
)
GO
Let’s insert one row into this table
INSERT INTO [tblBooksEx] SELECT ''
Now, we have a single row in the sample table, where is the
Row_ID is 1 and BooksData is an empty string.
You can now use the following
TextCopy.exe /S . /U sa /P sa /D Northwind /T tblBooksEx /C BooksData /F c:\books.xml /W "WHERE Row_ID=1" /I
You can use xp_cmdshell to execute the TextCopy.exe from within your T-SQL code.
Note that xp_cmdshell operates synchronously and by Execute permissions for xp_cmdshell default to members of the sysadmin fixed server role.
Summary
XML is the popular
mechanism for the data on the move. But once the XML data is received at some
destination, at many instances it is required to import the XML text into the
database. In this article, we explored some of the possible ways in which XML
data can be imported into SQL Server. We saw how MSXML and ADO together can be
used to import XML data not into SQL Server but any database; We saw an example
of using DOM and SAX to read the source XML; We then saw one example each of
using OPENXML, SQLXML 3.0 Bulk Load and .NET Framework classes.
We also learned about writing the ActiveX Script Tasks inside a DTS package to import the XML data either using
SQLXML BulkLoad or MSXML DOM + ADO object models. And finally, we saw an example of importing XML data using the
DataDirect XML ADO Provider.
Q&A
Q:
I have a Client-Server database application. On the client side I am saving the local access database data to
a disk file using ADO XML Persistence (adPersistXML), as outlined in your
Exporting SQL Data as XML article.
Now I get this file from the client, and on the server-side I have to write a DTS package to import this data
into a SQL Server 2000 database.
In simple words, how do I import a ADO persisted XML file into SQL Server?
A:
Let's say you have a ADO Persisted XML file for Shippers table in the Northwind Access database.
Here is the DTS ActiveX Script Task VBScript code to import this file into SQL Server Northwind sample
database. The file is assumed to be saved on C:\.
Start SQL Server Enterprise Manager, right click on Data Transformation Services, select New Package,
create a new ActiveX Script Task and write the following VBScript code:
'**********************************************************************
' Visual Basic ActiveX Script
'************************************************************************
Function Main()
Dim objSrcADORS
Dim objDestADORS
Dim ObjSrcFields
Dim ObjSrcField
Const adOpenForwardOnly = 0
Const adLockBatchOptimistic = 4
Const adCmdFile = 256
Const adAddNew = 16778240
Const adOpenKeyset = 1
Const adLockOptimistic = 3
Set objSrcADORS = CreateObject("ADODB.Recordset")
Set objDestADORS = CreateObject("ADODB.Recordset")
'The XML File Recordset (Source)
objSrcADORS.Open "C:\ACC_Shippers.xml", "Provider=MSPersist;", _
adOpenForwardOnly, adLockBatchOptimistic, adCmdFile
'The Blank SQL Server Northwind database :: Shippers Recordset (Dest.)
objDestADORS.Open "SELECT * FROM Shippers WHERE 1 = 2", _
"PROVIDER=SQLOLEDB.1; SERVER=.;UID=sa;PWD=;DATABASE=Northwind", _
adOpenKeyset, adLockOptimistic
If objDestADORS.Supports(adAddNew) Then
While Not objSrcADORS.EOF
'Add new destination record
objDestADORS.AddNew
Set ObjSrcFields = objSrcADORS.Fields
'Update Fields
For each ObjSrcField in ObjSrcFields
'Skip the Identity Field
If ObjSrcField.Name <> "ShipperID" Then
objDestADORS.Fields(ObjSrcField.Name).Value = ObjSrcField.Value
End If
Next
'Save the destination record
objDestADORS.Update
objSrcADORS.MoveNext
Wend
Else
MsgBox "The destination recordset does not support AddNew"
End If
objSrcADORS.Close
Set objSrcADORS = Nothing
objDestADORS.Close
Set objDestADORS = Nothing
Main = DTSTaskExecResult_Success
End Function
|
The above code loads the ADO Persisted XML file using the "Provider=MSPersist;" as the
connection string, and it also opens the SQL Server destination Recordset containing 0 records.
For each record in the source Recordset (XML file), we add a new record in the destination Recordset,
update fields (except identity column), and saves the record.
As an exercise, try the alternative solution to this problem, and use MSXML to load the ADO-persisted
XML document, get the values from the XML document, and then use ADO to insert
records into the destination database table.
Q:
I have an ADO persisted XML file, similar to
ACC_Shippers.xml.
How do I read from this file and import into my SQL server table using Visual Basic .NET?
A:
Let's say you have an ADO-persisted XML file available as d:\temp\ACC_Shippers.xml,
and that you would like to import this XML file into Shippers table in the Northwind sample SQL Server database.
Here is a sample Visual Basic .NET Console Application code that uses
System.Xml and System.Data classes
to load and read XML document, and then run Insert statements.
Imports System
Imports System.IO
Imports System.Xml
Imports System.Data
Imports System.Data.SqlClient
Imports System.Data.SqlTypes
Module Module1
Sub Main()
Try
'Connect to the database
Dim dbConn As New SqlConnection("SERVER=.;UID=sa;PWD=;DATABASE=Northwind;")
dbConn.Open()
'The command object to execute the inserts
Dim sqlComm As New SqlCommand()
sqlComm.Connection = dbConn
'Load the XML document, and get the root element
Dim xmlDoc As New XmlDocument()
Dim xmlRoot As XmlElement
xmlDoc.Load("d:\temp\ACC_Shippers.xml")
xmlRoot = xmlDoc.DocumentElement
'Create an XmlNamespaceManager for resolving namespaces.
Dim nsManager As XmlNamespaceManager
nsManager = New XmlNamespaceManager(xmlDoc.NameTable)
nsManager.AddNamespace("z", "#RowsetSchema")
nsManager.AddNamespace("rs", "urn:schemas-microsoft-com:rowset")
'Select the rs:data/z:row nodes
Dim dataNodes As XmlNodeList
Dim aNode As XmlNode
'See KB Q316913 or http://samples.gotdotnet.com/quickstart/howto/doc/Xml/XmlNameSpace.aspx for details
dataNodes = xmlRoot.SelectNodes("rs:data/z:row", nsManager)
Dim strInsertStmt As String
For Each aNode In dataNodes
Try
'Build the insert statement
strInsertStmt = "INSERT INTO Shippers (CompanyName, Phone) VALUES ('"
strInsertStmt += Replace(aNode.Attributes("CompanyName").Value, "'", "''")
strInsertStmt += "','"
strInsertStmt += Replace(aNode.Attributes("Phone").Value, "'", "''")
strInsertStmt += "')"
'For Debugging
Console.WriteLine(strInsertStmt)
'Execute the insert statement
sqlComm.CommandText = strInsertStmt
sqlComm.ExecuteNonQuery()
Catch e As Exception
Console.WriteLine("Error: " & e.Message)
End Try
Next
'Close the database connection
dbConn.Close()
Catch e1 As Exception
Console.WriteLine("Error: " & e1.Message)
End Try
Console.WriteLine("Press Enter to continue...")
Console.ReadLine()
End Sub
End Module
|
The above code first connects to the SQL database using ADO.NET classes,
then uses XmlDocument to load the (ADO-persisted) XML document.
As the XML document makes use of namespaces, the code creates a
XmlNamespaceManager objects and associates it with the
XmlDocument instance. This XmlNamespaceManager
is then passed when selectNodes method is called to
get all the z:row nodes.
For each z:row node, we format an INSERT SQL statement, and
then run it using SqlCommand object.
In this example, we are directly running the INSERT commands;
ADO.NET and/or SQLXML offers many other ways to insert/update data, for instance you may use the
DataSet class and SqlDataAdapter to insert; or you may
transform ADO persisted XML into an Updategram and execute that updategram using SQLXML .NET classes,
and so on. In summary, there are many other ways to import the XML data into SQL Server using .NET, this example
presented just one approach. Also the above code sample uses += string concatenation
operator to generate the INSERT SQL statement, the better approach would be to use the
System.Text.StringBuilder class.
Q:
Is there any way to use an external XML document file with OPENXML? Essentially, what I would like to do is
import (insert) data into the database table using OPENXML; but instead of specifying the XML string
in the T-SQL code, I would like to pass an XML file to OPENXML?
A:
OPENXML T-SQL function uses the XML document handle created using the
sp_xml_preparedocument - that accepts (char, nchar, varchar, nvarchar, text, or ntext) -
and can only work on string. It does not accept stream or physical file name.
The maximum characters that you can pass while working in query analyzer is 8000 characters (varchar limit).
In summary, OPENXML (yet) does not accept XML input from a file/stream.
However, there are couple of (good) workarounds. One of them is to use an extended stored procedure to read
XML file as a text document into a temporary table, concatenate the rows and build a XML string;
and then use this XML string to create a handle to be passed to OPENXML.
Let's say the following XML document is saved as c:\books.xml:
<?xml version="1.0"?>
<Books>
<Book>
<Title>XML Application Development with MSXML 4.0</Title>
<Publisher>Wrox Press</Publisher>
<DateOfPurchase>2/1/2002</DateOfPurchase>
</Book>
<Book>
<Title>Professional SQL Server 2000 XML</Title>
<Publisher>Wrox Press</Publisher>
<DateOfPurchase>9/10/2001</DateOfPurchase>
</Book>
</Books>
|
And, let's say we want to import the above XML file into tblBooks table created in NorthWind sample database:
USE NorthWind
GO
IF OBJECT_ID('tblBooks', 'U') IS NOT NULL
DROP TABLE tblBooks
GO
CREATE TABLE tblBooks
(rowID int IDENTITY,
Title nvarchar(255),
Publisher nvarchar(255),
DateOfPurchase datetime)
|
Here is the T-SQL code that first uses xp_cmdshell to read the XML file as a text document and
add the lines into temporary table. The reason we are putting into a temporary table is because
xp_cmdshell returns NULL for blank lines; to get rid of these NULL rows we use ISNULL function on the table rows.
We then select from this temporary table and concatenate the rows to form a XML string. This string is then
used to create the handle used with INSERT...OPENXML statement.
SET NOCOUNT ON
-- Let's now first read the XML file into a temporary table
-- Create temporary table first
CREATE TABLE #tmpFileLines (rowID int IDENTITY, lineData nvarchar(255))
-- Insert lines from files into temp table (using xp_cmdshell)
INSERT #tmpFileLines EXEC master.dbo.xp_cmdshell 'TYPE c:\books.xml'
DECLARE @strXMLText nvarchar(4000)
-- Reading the XML data from the table into a string variable
-- This string variable is used with OPENXML
SELECT @strXMLText =
CASE rowID WHEN 1 THEN
ISNULL(RTRIM(lineData), '')
ELSE
@strXMLText + ISNULL(RTRIM(lineData), '')
END
FROM #tmpFileLines ORDER BY rowID ASC
PRINT '-------------------------------'
PRINT 'Bytes read from the file:'
PRINT DATALENGTH(@strXMLText)
PRINT '-------------------------------'
DROP TABLE #tmpFileLines
-- Preparing for calling OPENXML
DECLARE @hDoc int
EXEC sp_xml_preparedocument @hDoc OUTPUT, @strXMLText
-- Inserting using OPENXML
INSERT INTO tblBooks
SELECT *
FROM OPENXML(@hDOC, '/Books/Book', 2)
WITH
(
Title nvarchar(255),
Publisher nvarchar(255),
DateOfPurchase datetime
)
EXEC sp_xml_removedocument @hdoc
GO
SELECT * FROM tblBooks
GO
SET NOCOUNT OFF
|
To learn more about OPENXML, see my article at http://www.perfectxml.com/articles/XML/OPENXML.asp.
Note that the above code will work best for small documents (< 8000 bytes).
The workaround to deal with more than 8000 characters is discussed in the book
The Guru's Guide to SQL Server Stored Procedures, XML, and HTML (chapter 15, page 449).
We also make the assumption that each line in the XML file is not more than 255 characters (the column in #tmpFileLines is declared as varchar(255); however you can change this).
Also, to run the extended stored procedure (xp_cmdshell), special permissions are required.
By default, only members of the sysadmin fixed server role can execute this extended stored procedure. You may, however, grant other users permission to execute this stored procedure.
See SQL Server Books Online for more details on this.
Finally, instead of calling xp_cmdshell, we can also use xp_readerrorlog (as below) (or better, sp_readerrorlog) to read any text file:
EXEC master.dbo.xp_readerrorlog -1, 'c:\books.xml'
|
Q: We have to import all the XML files present in a
specified folder into SQL Server, using DTS. Is it possible?
A:
You can use FileSystemObject, MSXML, and ADO inside a DTS ActiveX Script to do this. Let's see an
example of this.
First we'll use FileSystemObject to get a list of all the files in the specified directory. Next, we'll
use MSXML 4.0 SP1 to load each XML file, and finally use ADO to add a record into the table. MSXML and ADO inside
a DTS package is discussed above. Instead of using MSXML and ADO, you can also
use SQLXML Bulk Load (provided you have the XSD schema for all the XML files) as discussed above.
Let's say we have three files (Ship1.xml, Ship2.xml, and Ship3.xml) within the folder D:\TEMP\XMLFiles\; and
we would like to insert the data into the Shippers table in the NorthWind sample database.
Ship1.xml
<?xml version="1.0"?>
<Shipper>
<CompanyName>Universal Shipping</CompanyName>
<Phone>1-877-000-UNIV</Phone>
</Shipper>
Ship2.xml
<?xml version="1.0"?>
<Shipper>
<CompanyName>Global Shipping</CompanyName>
<Phone>1-800-544-SHIP</Phone>
</Shipper>
Ship3.xml
<?xml version="1.0"?>
<Shipper>
<CompanyName>Express Shippers</CompanyName>
<Phone>1-800-EXPRESS</Phone>
</Shipper>
|
Start SQL Server Enterprise Manager, expand Data Transformation Services node in the tree, right click on Local Packages
and select "New Package". In the Package Designer, drag and Drop ActiveX Script Task (or right click in the
designer and select Add Task | ActiveX Script Task...). Write the following code in the Script Task:
'**********************************************************************
' Visual Basic ActiveX Script
'************************************************************************
CONST strFilesPath = "D:\TEMP\XMLFiles"
CONST adOpenKeyset = 1
CONST adLockOptimistic = 3
Function Main()
Dim objFSO
Dim objFolder
Dim objFilesColl
Dim iFilesCount
Dim objFile
Dim objXMLDOM
Dim objNodes
Dim objNodeItem
Dim objADORS
Dim objADOCnn
Dim strCurFileName
'Create and initialize (Open) ADO Connection
Set objADOCnn = CreateObject("ADODB.Connection")
objADOCnn.Open "PROVIDER=SQLOLEDB;SERVER=.;UID=sa;PWD=;DATABASE=NorthWind;"
'Create MSXML 4.0 DOM Object and initialize it
Set objXMLDOM = CreateObject("MSXML2.DOMDocument.4.0")
objXMLDOM.async = False
objXMLDOM.validateOnParse = False
'Get a list of files in the specified directory
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder(strFilesPath)
Set objFilesColl = objFolder.Files
'Load each file in MSXML DOM and use ADO to insert data into the table
For Each objFile in objFilesColl
strCurFileName = strFilesPath & "\" & objFile.Name
'Load the XML file
'No error handling done
objXMLDOM.load strCurFileName
Set objNodes = objXMLDOM.selectNodes("/Shipper")
'Create and Open the recordset
Set objADORS = CreateObject("ADODB.Recordset")
objADORS.Open "SELECT * FROM Shippers WHERE 1 = 2", objADOCnn, adOpenKeyset, adLockOptimistic
'Add records
For Each objNodeItem In objNodes
With objADORS
.AddNew
.fields("CompanyName") = objNodeItem.selectSingleNode("CompanyName").nodeTypedValue
.fields("Phone") = objNodeItem.selectSingleNode("Phone").nodeTypedValue
.Update
End With
Next
objADORS.Close
'Message box for debugging purposes
MsgBox "Copied data from " & strCurFileName & " into the database"
Next
objADOCnn.Close
Set objADORS = Nothing
Set objADOCnn = Nothing
Set objXMLDOM = Nothing
Set objFSO = Nothing
Main = DTSTaskExecResult_Success
End Function
|
The above code creates ADO Connection, FileSystemObject, and MSXML DOMDocument objects.
It then uses FileSystemObject to get a list of all files in the specified folder; and for each file
we load it in DOMDocument and use MSXML XPath and ADO to insert data into the Shippers table.
Q: I have a SQL Server 2000 database table which contains image
data type field; And an XML file that contains Base64 encoded binary data. I would like
to write a stored procedure and pass it the binary data stored in XML document, and the
stored procedure should insert a row into the table. The stored procedure will be called from
ADO code. Can you please suggest as to how to implement this?
A: As you said, you have an XML file that contains Base64 encoded binary data;
Let's take NWEmp.XML sample
XML file for this discussion. This file contains just one element whose value is a Base64 encoded binary data.
Let's assume you have a table and stored procedure in the NorthWind database as follows:
USE NorthWind
GO
CREATE TABLE NWEmp (Photo image)
GO
SELECT * FROM NWEmp
GO
CREATE PROCEDURE AddAnEmp
@EmpPhoto as image
as
INSERT INTO NWEmp Values(@EmpPhoto)
GO
|
The above T-SQL statements create a table named NWEmp, which has just one field named Photo of datatype image; and a
stored procedure AddAnEmp that accepts an image type input parameter.
Here is some Visual Basic code that uses ADO and MSXML 4.0 to
- Load the XML document
- Read the binary element value
- Convert it into byte array
- Call the stored procedure and pass the byte array to it.
'Declaring Variables
Dim oXMLDOMDoc As New MSXML2.DOMDocument40
Dim oNode As MSXML2.IXMLDOMNode
Dim connDB As New ADODB.Connection
Dim comm As New ADODB.Command
'Open the Connection
connDB.Open "PROVIDER=SQLOLEDB;UID=SA;PWD=;SERVER=.;DATABASE=NORTHWIND"
'Loading the XML document
oXMLDOMDoc.async = False
'Load the XML document that contains binary data (as base64 encoded)
If oXMLDOMDoc.Load("http://www.perfectxml.com/NWEMP.XML") Then
Set oNode = oXMLDOMDoc.selectSingleNode("EMPPHOTOS/PHOTO")
Dim btArr() As Byte
btArr = oNode.nodeTypedValue
'Now we have the binary data in a byte array
'Get ready to call the stored procedure
Set comm.ActiveConnection = connDB
comm.CommandText = "AddAnEmp"
comm.CommandType = adCmdStoredProc
comm.Parameters.Refresh
comm.Parameters(1).Type = adBinary
comm.Parameters(1).Value = btArr
'Call the stored procedure
comm.Execute
Set oXMLDOMDoc = Nothing
Set oNode = Nothing
Else
'XML document load error!
MsgBox "Error: " & oXMLDOMDoc.parseError.reason
End If
connDB.Close
|
The above code illustrates one way in which you can use MSXML 4.0 and ADO to insert binary data into SQL Server.