Parsing an RDF Format RSS Feed in VB.NET

Some RSS feeds are published in an RDF-based format (RSS 1.0), which has a schema that’s very different from your standard RSS feed. The root element is rdf:RDF rather than rss, and the elements are typically namespace-qualified, so your standard XPath expression may return no nodes at all when you apply something like SelectNodes("/rss/abc") to the XML document.

Instead of relying on XPath to filter the nodes we need, we’ll loop through the nodes and their child nodes. This is pretty much the only way I’ve found to parse RDF format RSS feeds, but once you’ve done it once, it’s fairly straightforward.
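The XPath mismatch described above can be reproduced with a small console sketch. The sample document is hand-written and trimmed, the namespace URIs are the usual RDF and RSS 1.0 ones (I’m assuming the real feed declares the same), and the module and function names are made up for illustration.

```vbnet
Imports System
Imports System.Xml

Module XPathCheck
    ' Hand-written, trimmed RDF sample. The namespace URIs are assumed
    ' to match what a real RSS 1.0 feed declares.
    Public ReadOnly SampleRdf As String = _
        "<rdf:RDF xmlns:rdf=""http://www.w3.org/1999/02/22-rdf-syntax-ns#"" " & _
        "xmlns=""http://purl.org/rss/1.0/"">" & _
        "<item rdf:about=""https://www.bankofcanada.ca/valet/fx_rss/FXAUDCAD"">" & _
        "<title>CA: 0.9511 CAD = 1 AUD 2019-01-16</title>" & _
        "</item>" & _
        "</rdf:RDF>"

    ' Counts the nodes matched by an XPath expression with no namespace manager.
    Function CountByXPath(xml As String, xpath As String) As Integer
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)
        Return doc.SelectNodes(xpath).Count
    End Function

    ' Counts the elements whose qualified name equals tagName.
    Function CountByTagName(xml As String, tagName As String) As Integer
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)
        Return doc.GetElementsByTagName(tagName).Count
    End Function

    Sub Main()
        ' There is no rss root element, so this matches nothing.
        Console.WriteLine(CountByXPath(SampleRdf, "/rss/channel/item"))
        ' An unprefixed XPath name only matches elements with no namespace.
        Console.WriteLine(CountByXPath(SampleRdf, "//item"))
        ' Matching on the tag name finds the item.
        Console.WriteLine(CountByTagName(SampleRdf, "item"))
    End Sub
End Module
```

With this sample, both XPath calls find no nodes, while the tag-name lookup finds the single item. That is why the rest of this post walks the node tree by name.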

In the example code below, I’m going to parse an exchange rates RDF feed hosted by the Bank of Canada at https://www.bankofcanada.ca/valet/fx_rss/. I’m using VB.NET, but you can easily convert the code to C# with a code converter.

Having a look at the RDF file format

You can see the RDF file by browsing to the URL I gave above. Basically, though, we are interested in the information in the nodes:

rdf:RDF / item

From there, we want the information found in title, description, and dc:date.

For reference, here’s what an item node looks like:

<item rdf:about="https://www.bankofcanada.ca/valet/fx_rss/FXAUDCAD">
    <title>CA: 0.9511 CAD = 1 AUD 2019-01-16</title>
    <link>https://www.bankofcanada.ca/?p=39898</link>
    <description>1 AUD = 0.9511 CAD (Australian dollar to Canadian dollar daily exchange rate)</description>
    <dc:date>2019-01-16T21:30:00Z</dc:date>
    <dc:language>en</dc:language>
    <cb:statistics>
        <cb:country>CA</cb:country>
        <cb:exchangeRate>
            <cb:value decimals="4">0.9511</cb:value>
            <cb:baseCurrency>CAD</cb:baseCurrency>
            <cb:targetCurrency>AUD</cb:targetCurrency>
            <cb:rateType>Bank of Canada exchange rate</cb:rateType>
            <cb:observationPeriod frequency="daily">2019-01-16T21:30:00Z</cb:observationPeriod>
        </cb:exchangeRate>
    </cb:statistics>
</item>
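The dc:date value in the item above ends in Z, which marks it as UTC. As a minimal sketch (the module and function names are made up for illustration), this is one way to parse such a value and keep it in UTC. By default, DateTime.TryParse converts a value like this to the server’s local time zone.

```vbnet
Imports System
Imports System.Globalization

Module FeedDates
    ' Hypothetical helper: parses a dc:date string and returns it formatted
    ' in UTC, or an empty string when the value cannot be parsed.
    Function FormatUtc(rawDate As String) As String
        Dim parsed As DateTime
        ' AdjustToUniversal keeps the parsed value in UTC instead of local time.
        If DateTime.TryParse(rawDate, CultureInfo.InvariantCulture, _
                             DateTimeStyles.AdjustToUniversal, parsed) Then
            Return parsed.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture)
        End If
        Return ""
    End Function

    Sub Main()
        ' Prints 2019-01-16 21:30:00
        Console.WriteLine(FormatUtc("2019-01-16T21:30:00Z"))
    End Sub
End Module
```

Whether you want UTC or local time on the page is a display choice. The point is only to make that choice explicit rather than depend on the server’s time zone.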

The VB.NET code to parse the RDF XML

Here is the code that I used to get the RDF feed from the site and then loop through the interesting nodes. Notice that the loops are nested three levels deep in order to reach the title, description, and dc:date information.

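First, import the necessary namespaces. In this example we need System.IO to read the response stream, System.Data for the DataTable class, System.Net for the web request, System.Xml for XmlDocument, and System.Globalization for CultureInfo.

<%@ Import Namespace="System.IO" %>
<%@ Import Namespace="System.Data" %>
<%@ Import Namespace="System.Net" %>
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Globalization" %>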

Then, inside a VB sub, we can add the code to create a DataTable, add some columns to it, and populate it with the exchange rate information from the RDF file using a series of loops through each level of nodes. Finally, we bind the DataTable to a DataList control to display the data on a Web page.

Note that if you want to change the order of the data, the best way is to wrap the final DataTable in a DataView, which you can then sort or filter as you need.

 ' Build the table that will hold one row per feed entry.
 Dim dt As DataTable = New DataTable("table")
 dt.Columns.Add("title", GetType(String))
 dt.Columns.Add("link", GetType(String))
 dt.Columns.Add("pubdate", GetType(String))
 dt.Columns.Add("realdate", GetType(DateTime))
 ' Download the feed and load it into an XmlDocument.
 ' The Using blocks close the response and stream when we are done.
 Dim rssDoc As New XmlDocument()
 Dim rssFeed As HttpWebRequest = DirectCast(WebRequest.Create("https://www.bankofcanada.ca/valet/fx_rss/"), HttpWebRequest)
 Using response As WebResponse = rssFeed.GetResponse()
     Using rssStream As Stream = response.GetResponseStream()
         rssDoc.Load(rssStream)
     End Using
 End Using
 ' Loop 1: the rdf:RDF root element.
 Dim NList As XmlNodeList = rssDoc.GetElementsByTagName("rdf:RDF")
 For Each Node As XmlNode In NList
     ' Loop 2: the children of the root, including each item.
     For Each Child As XmlNode In Node.ChildNodes
         Dim strInsertRowText As String = ""
         ' Loop 3: the fields inside each child.
         For Each itemChild As XmlNode In Child.ChildNodes
             Dim itemChildName As String = itemChild.Name
             If ("description").Equals(itemChildName) Then
                 strInsertRowText = itemChild.InnerText
             ElseIf ("dc:date").Equals(itemChildName) Then
                 Dim tmpDate As DateTime
                 If DateTime.TryParse(itemChild.InnerText, tmpDate) Then
                     ' The "u" format gives yyyy-MM-dd HH:mm:ssZ; drop the Z for display.
                     Dim strStartTime As String = tmpDate.ToString("u", CultureInfo.CreateSpecificCulture("en-US"))
                     strStartTime = strStartTime.Replace("Z", "")
                     strInsertRowText &= "<br>" & strStartTime
                 End If
             End If
         Next
         ' Only add a row when the child had something to show.
         If Not String.IsNullOrWhiteSpace(strInsertRowText) Then
             Dim icdr As DataRow = dt.NewRow()
             icdr("pubdate") = strInsertRowText
             dt.Rows.Add(icdr)
         End If
     Next
 Next
 ' Bind through a DataView so the rows can be sorted or filtered later.
 Dim dv As New DataView(dt)
 dlMyNewsFeed.DataSource = dv
 dlMyNewsFeed.DataBind()
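
The earlier note about sorting and filtering through a DataView can be sketched as follows. This is a standalone console example, not part of the page code. It assumes the title and realdate columns have been filled in, which the listing above does not do. The module name, function names, and all sample rows except the AUD rate from 2019-01-16 are invented for illustration.

```vbnet
Imports System
Imports System.Data

Module DataViewSketch
    ' Builds a small table with two of the columns used in this post.
    ' Apart from the first row, the values are invented sample data.
    Function BuildSampleTable() As DataTable
        Dim dt As New DataTable("table")
        dt.Columns.Add("title", GetType(String))
        dt.Columns.Add("realdate", GetType(DateTime))
        dt.Rows.Add("1 AUD = 0.9511 CAD", New DateTime(2019, 1, 16, 21, 30, 0))
        dt.Rows.Add("1 USD = 1.3000 CAD", New DateTime(2019, 1, 17, 21, 30, 0))
        dt.Rows.Add("1 AUD = 0.9500 CAD", New DateTime(2019, 1, 15, 21, 30, 0))
        Return dt
    End Function

    ' Returns a view limited to one currency code, newest rows first.
    Function NewestFirst(dt As DataTable, currencyCode As String) As DataView
        Dim dv As New DataView(dt)
        dv.RowFilter = "title LIKE '%" & currencyCode & "%'"
        dv.Sort = "realdate DESC"
        Return dv
    End Function

    Sub Main()
        Dim dv As DataView = NewestFirst(BuildSampleTable(), "AUD")
        For Each row As DataRowView In dv
            Console.WriteLine(CStr(row("title")))
        Next
    End Sub
End Module
```

Sorting on a DateTime column such as realdate gives a true chronological order, whereas sorting on the pubdate string would sort by text. The underlying DataTable is left untouched, so the same table can back several differently sorted views.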