This information was difficult to find, so I figured I would post it in case anyone else needed it in the future, including myself.
Given an XML file like the following, I needed to write a parser (in C#) that would verify the existence of tags and store them in a struct.
<feed xmlns="http://www.w3.org/2005/Atom" lang="en-US">
<entry>
<updated>
2008-09-05T17:05:01-07:00
</updated>
<link rel="self" href="http://www.google.com" />
<link rel="alternate" href="http://www.netflix.com" />
<link rel="assetlinklogo" href="http://www.google.com%5Cimagefile.gif" />
<link rel="assetlinkdemo" href="http://www.netflix.com%5Cdemo.aspx" />
<link rel="assetlinkscreenshot1" href="http://www.google.com%5Cscreenshot1.gif" />
<link rel="assetlinkscreenshot2" href="http://www.netflix.com%5Cscreenshot2.gif" />
<averagerating xmlns="http://testserver:8080/syndicate.xsd">
One
</averagerating>
</entry>
</feed>
There are two major problems.
The XML file resides in the "http://www.w3.org/2005/Atom" namespace, but specific tags reside in a different namespace: "http://testserver:8080/syndicate.xsd". Dealing with one namespace was annoying enough; how do we juggle two?
Also, the assetlink tags are optional: there may be one hundred or none. I wanted to put them in a list while excluding the self and alternate links, which are required and will always show up.
To store the data I used a struct like this:
public struct BasicData
{
public XmlNode feed;
public XmlNode updated;
public XmlNode selfLink;
public XmlNode altLink;
public XmlNodeList assetLinks;
public XmlNode AverageRating;
};
I was used to reading XML files with XmlDocument (in the System.Xml namespace):
BasicData basicData = new BasicData();
XmlDocument xmlFile = new XmlDocument();
XmlNode root;
xmlFile.Load(filePath);
root = xmlFile.DocumentElement;
"root" now points to the base node of the XML document, which in this example is the feed element. My first attempt at reading the updated node was:
basicData.updated = xmlFile.SelectSingleNode(@"/feed/entry/updated");
This call returns null because all the tags reside in a namespace, and an XPath name without a prefix only matches elements that are in no namespace. To deal with this we need to create a namespace manager and give our namespace a prefix of our own choosing (sometimes XML namespaces are declared with prefixes, but not in this case). The namespace manager is set up as follows:
XmlNamespaceManager nsmgr;
nsmgr = new XmlNamespaceManager(xmlFile.NameTable);
nsmgr.AddNamespace("atom", @"http://www.w3.org/2005/Atom");
nsmgr.AddNamespace("ppt", @"http://testserver:8080/syndicate.xsd");
Remember I pointed out that the file has two namespaces? We add both of them to the namespace manager, since we will need both. Now that the namespace manager knows about the namespaces, we can access the updated node and the self link node as follows:
basicData.updated = xmlFile.SelectSingleNode(@"/atom:feed/atom:entry/atom:updated", nsmgr);
basicData.selfLink = xmlFile.SelectSingleNode(@"/atom:feed/atom:entry/atom:link[@rel='self']", nsmgr);
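To see the difference the namespace manager makes, here is a minimal sketch that runs both queries against a trimmed-down copy of the feed. The class name NamespaceDemo and the inlined XML string are my own assumptions for illustration; the real code loads the file from filePath instead.

```csharp
using System;
using System.Xml;

public static class NamespaceDemo
{
    // Trimmed-down copy of the feed, inlined so the sketch is self-contained.
    public const string Xml =
        "<feed xmlns=\"http://www.w3.org/2005/Atom\">" +
        "<entry><updated>2008-09-05T17:05:01-07:00</updated></entry>" +
        "</feed>";

    public static XmlDocument LoadSample()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = LoadSample();

        // Unprefixed names only match elements in no namespace, so nothing is found.
        XmlNode plain = doc.SelectSingleNode("/feed/entry/updated");
        Console.WriteLine(plain == null); // True

        // Map a prefix of our choosing to the Atom namespace and use it in the XPath.
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("atom", "http://www.w3.org/2005/Atom");

        XmlNode updated = doc.SelectSingleNode("/atom:feed/atom:entry/atom:updated", nsmgr);
        Console.WriteLine(updated.InnerText); // 2008-09-05T17:05:01-07:00
    }
}
```

The prefix "atom" is arbitrary; only the namespace URI it maps to has to match the file.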
And the node in the second namespace is accessed as follows:
basicData.AverageRating = xmlFile.SelectSingleNode(@"/atom:feed/atom:entry/ppt:averagerating", nsmgr);
The second namespace is used simply by mixing both prefixes in a single XPath. Element names are case-sensitive, so the name in the XPath must match the file exactly.
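As a rough check of the two-namespace query, the sketch below asks for the rating element once with the wrong prefix and once with the right one. The class TwoNamespaceDemo, its Select helper, and the inlined XML are assumptions made for this example only.

```csharp
using System;
using System.Xml;

public static class TwoNamespaceDemo
{
    // Only the parts of the feed needed for this query, inlined for the example.
    public const string Xml =
        "<feed xmlns=\"http://www.w3.org/2005/Atom\"><entry>" +
        "<averagerating xmlns=\"http://testserver:8080/syndicate.xsd\">One</averagerating>" +
        "</entry></feed>";

    // Hypothetical helper: loads the sample and runs one XPath with both prefixes registered.
    public static XmlNode Select(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);

        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("atom", "http://www.w3.org/2005/Atom");
        nsmgr.AddNamespace("ppt", "http://testserver:8080/syndicate.xsd");

        return doc.SelectSingleNode(xpath, nsmgr);
    }

    public static void Main()
    {
        // The rating element is not in the Atom namespace, so this finds nothing.
        XmlNode wrong = Select("/atom:feed/atom:entry/atom:averagerating");
        Console.WriteLine(wrong == null); // True

        // Each step of the path carries the prefix of the namespace that element lives in.
        XmlNode rating = Select("/atom:feed/atom:entry/ppt:averagerating");
        Console.WriteLine(rating.InnerText); // One
    }
}
```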
Now, to get the asset links but not the other links, XPath has a "starts-with" function. In this case you would use it as follows:
basicData.assetLinks = xmlFile.SelectNodes(@"/atom:feed/atom:entry/atom:link[starts-with(@rel, 'assetlink')]", nsmgr);
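Here is a small sketch of the starts-with filter on a shortened feed with two required links and two asset links. The class AssetLinkDemo and the SelectLinks helper are hypothetical names; the helper assumes the prefix passed in contains no quote characters.

```csharp
using System;
using System.Xml;

public static class AssetLinkDemo
{
    // Shortened feed: two required links plus two optional asset links.
    public const string Xml =
        "<feed xmlns=\"http://www.w3.org/2005/Atom\"><entry>" +
        "<link rel=\"self\" href=\"http://www.google.com\" />" +
        "<link rel=\"alternate\" href=\"http://www.netflix.com\" />" +
        "<link rel=\"assetlinklogo\" href=\"http://www.google.com%5Cimagefile.gif\" />" +
        "<link rel=\"assetlinkdemo\" href=\"http://www.netflix.com%5Cdemo.aspx\" />" +
        "</entry></feed>";

    // Hypothetical helper: returns every link whose rel attribute starts with relPrefix.
    public static XmlNodeList SelectLinks(string relPrefix)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);

        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("atom", "http://www.w3.org/2005/Atom");

        string xpath = "/atom:feed/atom:entry/atom:link[starts-with(@rel, '" + relPrefix + "')]";
        return doc.SelectNodes(xpath, nsmgr);
    }

    public static void Main()
    {
        // Prints the two asset links and skips the self and alternate links.
        foreach (XmlNode link in SelectLinks("assetlink"))
        {
            Console.WriteLine(link.Attributes["rel"].Value + " -> " + link.Attributes["href"].Value);
        }

        // When nothing matches, the list is empty, which covers the "none" case.
        Console.WriteLine(SelectLinks("nosuchprefix").Count); // 0
    }
}
```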
And there you go! The full code for this will look like this:
public struct BasicData
{
public XmlNode feed;
public XmlNode updated;
public XmlNode selfLink;
public XmlNode altLink;
public XmlNodeList assetLinks;
public XmlNode AverageRating;
};
BasicData basicData = new BasicData();
XmlDocument xmlFile = new XmlDocument();
XmlNode root;
xmlFile.Load(filePath);
XmlNamespaceManager nsmgr;
nsmgr = new XmlNamespaceManager(xmlFile.NameTable);
nsmgr.AddNamespace("atom", @"http://www.w3.org/2005/Atom");
nsmgr.AddNamespace("ppt", @"http://testserver:8080/syndicate.xsd");
basicData.updated = xmlFile.SelectSingleNode(@"/atom:feed/atom:entry/atom:updated", nsmgr);
basicData.selfLink = xmlFile.SelectSingleNode(@"/atom:feed/atom:entry/atom:link[@rel='self']", nsmgr);
basicData.assetLinks = xmlFile.SelectNodes(@"/atom:feed/atom:entry/atom:link[starts-with(@rel, 'assetlink')]", nsmgr);
...(and so on)...
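Since the original goal was to verify that the required tags exist, one possible way to do that is to run each required XPath and collect the ones that come back null. This is only a sketch: the class RequiredTagCheck, the FindMissing helper, and the labels in the table are names I made up for the example, and the sample feed deliberately leaves out the alternate link.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class RequiredTagCheck
{
    // Label / XPath pairs for the tags that must always be present.
    private static readonly string[][] Required =
    {
        new[] { "updated", "/atom:feed/atom:entry/atom:updated" },
        new[] { "self link", "/atom:feed/atom:entry/atom:link[@rel='self']" },
        new[] { "alternate link", "/atom:feed/atom:entry/atom:link[@rel='alternate']" },
    };

    // Hypothetical helper: returns the labels of required tags that are not in the document.
    public static List<string> FindMissing(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("atom", "http://www.w3.org/2005/Atom");

        var missing = new List<string>();
        foreach (string[] pair in Required)
        {
            // SelectSingleNode returns null when the XPath matches nothing.
            if (doc.SelectSingleNode(pair[1], nsmgr) == null)
            {
                missing.Add(pair[0]);
            }
        }
        return missing;
    }

    public static void Main()
    {
        // Sample feed with the alternate link left out on purpose.
        string xml =
            "<feed xmlns=\"http://www.w3.org/2005/Atom\"><entry>" +
            "<updated>2008-09-05T17:05:01-07:00</updated>" +
            "<link rel=\"self\" href=\"http://www.google.com\" />" +
            "</entry></feed>";

        foreach (string label in FindMissing(xml))
        {
            Console.WriteLine("Missing: " + label); // Missing: alternate link
        }
    }
}
```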