Wednesday, October 28, 2009

Importing XML into SQL Server 2008 using asp.net

Last week I needed to Import an XML file into a sql server 2008 database. Something I thought would be done often and should be fairly easy. Not being something I was familiar with or not in any of the books I had here at home I did some fruitless searches, and then posted on 2 forums that I thought would be able to help point me in the right direction. Neither of those sites generated even a single response.

http://www.daniweb.com/forums/thread232967.html

http://forums.asp.net/p/1485040/3475429.aspx

So with no responses after 3 days and my question now past page 3 with over 100 views I figured no one could help and I was just going to have to spend a day figuring it out.

Here was my post:

 

Hi,

I need a little help importing an XML file into sqlserver 2008.  I followed this example

http://www.sql-server-performance.com/articles/dba/Loading_XML_data_into_SQL_Server_2008_p1.aspx

(make sure you goto the rest of the pages in the article!) , generated the schema etc., but I can't get the import to do what I need. I was able to add the DataConversion on my own but I can't seem to figure out how to get both ID fields in the middle table.
I made a simplified version.  For this example I have 3 tables.  Books, BookAuthors, Authors.

Books Table (bookID, bookTitle, bookNumPages)
BookAuthors Table (bookID, authorID)
Authors (authorID, authorName)

After the Import the tables should look like this
Books
79, Some Good Book, 300
112, Another Book, 350
BookAuthors
79, 200
79, 205
79, 215
112, 200
Authors
200, First Author
205, Second Author
215, Third Author

The Problem I'm having is getting the BookID and AuthorID into the BookAuthors Table.  I'm just wondering if there is an easy way to set this up to Import or if I'm going to have to code it line by line.

<?xml version="1.0" encoding="UTF-8"?>
<result>
    <books>
        <book>
            <bookID>79</bookID>
            <bookTitle>Some Good Book</bookTitle>
            <bookNumPages>300</bookNumPages>
            <authors>
                <author>
                    <authorID>200</authorID>
                </author>
                <author>
                    <authorID>205</authorID>
                </author>
                <author>
                    <authorID>215</authorID>
                </author>
            </authors>
        </book>
        <book>
            <bookID>112</bookID>
            <bookTitle>Another Book</bookTitle>
            <bookNumPages>350</bookNumPages>
            <authors>
                <author>
                    <authorID>200</authorID>
                </author>
            </authors>
        </book>
    </books>
    <authors>
        <author>
            <authorID>200</authorID>
            <authorName>First Author</authorName>
        </author>
        <author>
            <authorID>205</authorID>
            <authorName>Second Author</authorName>
        </author>
        <author>
            <authorID>215</authorID>
            <authorName>Third Author</authorName>
        </author>
    </authors>
</result>


After some digging SQLXMLBULK load seemed like a good way to go.



The 1st problem (of course) is that for some reason this was not included in sql server 2008.  You need to download and install Microsoft SQLXML 4.0 SP1 from this location it’s WAAAAY down near the bottom of the page http://www.microsoft.com/downloads/details.aspx?FamilyId=228DE03F-3B5A-428A-923F-58A033D316E1&displaylang=en



The next thing to do was to code the .xsd file or schema using sql:relation sql:is-constant and setting up the Parentkey ChildKey relationships with sql:relationship



Since this was a test environment I was able to make the column names the same, CASE is Sensitive.  If you are not in control of both fields you’re going to have to map each element in the xml feed to it’s column in the sql table like this



        <xsd:element name="FName"  
                     sql:field="FirstName" 
                     type="xsd:string" /> 


 



Here is my finished schema:



 



<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
    <xsd:annotation>
        <xsd:appinfo>
            <sql:relationship name="BooksBookAuthors"
                parent="Books"
                parent-key="bookID"
                child="BookAuthors"
                child-key="bookID" />
        </xsd:appinfo>
    </xsd:annotation>
    
    <xsd:element name="result" sql:is-constant="1" >
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="books" sql:is-constant="1" >
                    <xsd:complexType>
                        <xsd:sequence>
                            <xsd:element maxOccurs="unbounded" name="book" sql:relation="Books">
                                <xsd:complexType>
                                    <xsd:sequence>
                                        <xsd:element name="bookID" type="xsd:integer" />
                                        <xsd:element name="bookTitle" type="xsd:string" />
                                        <xsd:element name="bookNumPages" type="xsd:integer" />
                                        <xsd:element name="authors" sql:is-constant="1" >
                                            <xsd:complexType>
                                                <xsd:sequence>
                                                    <xsd:element maxOccurs="unbounded" name="author"  
                                                                    sql:relation="BookAuthors"
                                                                    sql:relationship="BooksBookAuthors">
                                                        <xsd:complexType>
                                                            <xsd:sequence>
                                                                <xsd:element name="authorID" type="xsd:integer" />
                                                            </xsd:sequence>
                                                        </xsd:complexType>
                                                    </xsd:element>
                                                </xsd:sequence>
                                            </xsd:complexType>
                                        </xsd:element>
                                    </xsd:sequence>
                                </xsd:complexType>
                            </xsd:element>
                        </xsd:sequence>
                    </xsd:complexType>
                </xsd:element>
                <xsd:element name="authors" sql:is-constant="1" >
                    <xsd:complexType>
                        <xsd:sequence>
                            <xsd:element maxOccurs="unbounded" name="author" sql:relation="Authors">
                                <xsd:complexType>
                                    <xsd:sequence>
                                        <xsd:element name="authorID" type="xsd:integer" />
                                        <xsd:element name="authorName" type="xsd:string" />
                                    </xsd:sequence>
                                </xsd:complexType>
                            </xsd:element>
                        </xsd:sequence>
                    </xsd:complexType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>


 


Remember that XMLBulkload thing? Create a reference to it, when done it will add Interop.SQLXMLBULKLOADLib.dll to your Bin Folder



 



After that I had to write the code to do the import:



            Dim objbl As Object
            objbl = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
            'objbl.connectionstring = "provider=SQLOLEDB;data source=SQLSERVERINSTANCENAME;database=TESTDATABASE;integrated security=SSPI"
            objbl.connectionstring = "provider=SQLOLEDB;data source=SQLSERVERINSTANCENAME;database=TESTDATABASE;user id=sa;password=SUPERSECRETPASSWORD"
            objbl.ErrorLogFile = "c:\error.log"
            objbl.Execute("c:\temp\BookExample.xsd", "c:\temp\BookExample.xml")
            objbl = Nothing


I left the line commented out incase you want to use the Windows Integrated Security, it’s easy to switch back and forth. For me I was using the SQLServer authentication and it was near impossible to find the connection string formatted  correctly as this is not how it appears in my web.config file



 



That should be everything.  Hope this helps someone.

3 comments:

  1. Quick note: the code pastes nicely into wordpad but not notepad

    ReplyDelete
  2. Well, after 3 years, you have a comment! Great job! This is exactly what I was looking for.

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete