RSS Feed Parser in 20 Lines with .NET LINQ

RSS is increasingly becoming the face of the web. I rarely visit sites directly anymore; instead I follow a bunch of them through an RSS reader. While working with RSS data I realized how much easier, more elegant and more terse LINQ to XML makes manipulating XML. Here is a very naive RSS parser based on the RSS 2.0 specification.

The Channel and Item classes are shown below. I first prototyped the parser with anonymous types, but an anonymous type cannot be named as the return type of a method, so once the query moved into its own method I had to replace them with named classes.

    // Needs: using System.Collections.Generic;

    // One RSS <channel>: its metadata plus the items it contains.
    public class Channel
    {
        public string Title { get; set; }
        public string Link { get; set; }
        public string Description { get; set; }
        public IEnumerable<Item> Items { get; set; }
    }

    // One RSS <item> (a single post or entry in the feed).
    public class Item
    {
        public string Title { get; set; }
        public string Link { get; set; }
        public string Description { get; set; }
        public string Guid { get; set; }
    }

Notice the use of automatic properties, which keeps the classes short and readable.

This is actually where F# shows its strength: had I implemented this in F#, I wouldn’t have needed to declare these helper classes, because of its type inference. In C#, inference stops at the method boundary, so types such as Channel and Item still have to be declared by hand.

To use the parser we need an XDocument, preferably one loaded from an RSS feed (that’s the purpose, actually :)). XDocument is the LINQ to XML counterpart of XmlDocument: it loads the whole document into an in-memory tree, so unlike the forward-only XmlReader it can be navigated both forward and backward. It is very flexible; we can transform, parse, write, read and query with it.
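As a minimal sketch of that back-and-forth navigation, assuming a tiny hand-written feed: the `XDocumentSketch` class, its `SampleFeed` string and both method names are made up for illustration and are not part of the original program.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class, used only to illustrate navigation in an XDocument.
public static class XDocumentSketch
{
    // A made-up, minimal RSS 2.0 document.
    public const string SampleFeed =
        "<rss version=\"2.0\"><channel>" +
        "<title>Sample feed</title>" +
        "<item><title>First</title></item>" +
        "<item><title>Second</title></item>" +
        "</channel></rss>";

    // Parses the XML and returns the title of the last <item>.
    public static string LastItemTitle(string xml)
    {
        XDocument xdoc = XDocument.Parse(xml);
        XElement last = xdoc.Descendants("item").Last();
        return (string)last.Element("title");
    }

    // Starts at the last <item>, walks back up to its parent <channel>
    // and reads the channel title from there.
    public static string ChannelTitleFromLastItem(string xml)
    {
        XDocument xdoc = XDocument.Parse(xml);
        XElement last = xdoc.Descendants("item").Last();
        return (string)last.Parent.Element("title");
    }
}
```

Calling `XDocumentSketch.LastItemTitle(XDocumentSketch.SampleFeed)` reads forward to the last item, while `ChannelTitleFromLastItem` goes from that item back up the tree, something a forward-only reader cannot do.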

The LINQ query that does the parsing is below:

    // Needs: using System.Collections.Generic; using System.Linq; using System.Xml.Linq;
    static IEnumerable<Channel> getChannelQuery(XDocument xdoc)
    {
        // One Channel per <channel> element; missing child elements become "".
        return from channels in xdoc.Descendants("channel")
               select new Channel
               {
                   Title = channels.Element("title") != null ? channels.Element("title").Value : "",
                   Link = channels.Element("link") != null ? channels.Element("link").Value : "",
                   Description = channels.Element("description") != null ? channels.Element("description").Value : "",
                   // Items is itself a lazy query over the channel's <item> elements.
                   Items = from items in channels.Descendants("item")
                           select new Item
                           {
                               Title = items.Element("title") != null ? items.Element("title").Value : "",
                               Link = items.Element("link") != null ? items.Element("link").Value : "",
                               Description = items.Element("description") != null ? items.Element("description").Value : "",
                               Guid = items.Element("guid") != null ? items.Element("guid").Value : ""
                           }
               };
    }
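The repeated null checks in the query could probably be shortened. LINQ to XML defines an explicit conversion from XElement to string that yields null when the element itself is null, so it combines well with the `??` operator. The sketch below is only one way to do it; `ElementTextSketch` and `ChildText` are hypothetical names, not part of the original code.

```csharp
using System.Xml.Linq;

// Hypothetical helper, shown only as a shorter alternative to the ternaries.
public static class ElementTextSketch
{
    // Returns the text of the named child element,
    // or "" when the parent has no such child.
    public static string ChildText(XElement parent, string name)
    {
        // (string)null-XElement evaluates to null, so ?? supplies the default.
        return (string)parent.Element(name) ?? "";
    }
}
```

With such a helper, a line of the query would read `Title = ElementTextSketch.ChildText(channels, "title")`.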

In the end there is no magic; it is like opening an XML reader and picking the elements and attributes we want. However, LINQ to XML does this so nicely that I don’t want to use XmlReader anymore.

We still need to create the XDocument and pass it to the function. Bear in mind that the query is not executed until we loop over the elements of the result (laziness).

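    // Needs: using System; using System.IO; using System.Net; using System.Xml.Linq;
    static void Main(string[] args)
    {
        string feedUri = "http://feeds.feedburner.com/canerten";

        // Download the feed and load it into an XDocument.
        // The response and the reader are disposed once loading has finished.
        XDocument xdoc;
        using (var response = WebRequest.Create(feedUri).GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            xdoc = XDocument.Load(reader);
        }

        var myFeed = getChannelQuery(xdoc);

        // Enumerating the query is what actually runs it.
        foreach (var channel in myFeed)
        {
            Console.WriteLine("{0} - {1}", channel.Title, channel.Description);

            foreach (var item in channel.Items)
            {
                Console.WriteLine("{0}", item.Title);
            }
        }
    }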

The output of the program is shown below:

    Coding Day - Adventures in Computing - Can Erten’s Blog
    Book Review-Expert F#
    Distributed Functional Programming with F# MPI Tools
    Power of Functional Programming, its Features and its Future

You can also get the program.cs file below.

LINQ to RSS: http://www.codingday.com/downloads/rsslinq.cs

What I like about LINQ is its laziness and the way it gives a unified data model over all kinds of sources. In the end everything is an IEnumerable, no matter whether it comes from a database query or an in-memory object.
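The laziness can be made visible with a small counter. This is a sketch under assumptions, not part of the original program: `LazinessSketch`, `ProjectionCalls` and `ItemTitles` are hypothetical names, and the counter exists only to show when the projection actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical demo class for observing deferred execution.
public static class LazinessSketch
{
    // Incremented every time the projection runs for one <item>.
    public static int ProjectionCalls = 0;

    // Builds the query but does not run it; nothing is read until enumeration.
    public static IEnumerable<string> ItemTitles(XDocument xdoc)
    {
        return from item in xdoc.Descendants("item")
               select ReadTitle(item);
    }

    // Projection with a visible side effect, so execution can be counted.
    static string ReadTitle(XElement item)
    {
        ProjectionCalls++;
        return (string)item.Element("title") ?? "";
    }
}
```

Calling `ItemTitles` leaves `ProjectionCalls` untouched; the counter only moves once the result is enumerated, for example with `foreach` or `ToList()`. Each new enumeration runs the query again, so if the results are needed more than once it may be worth storing them with a single `ToList()` call.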