I decided to pick up a personal project I had put on the back burner about a month ago: downloading the content of a web page for offline viewing and storing it in a database. I also took the opportunity to get into WPF and to explore LINQ further.

Those parts were easy (WPF is damn nice, and LINQ makes database access so intuitive). Parsing the HTML was another matter. I headed back to my old faithful tool for .NET 2.0, HtmlAgilityPack. I have used it before and it has been great, taking even malformed HTML and turning it into XML for use in applications.

This is where the trouble started. The site requires me to parse a selection box on the page to get every chapter of the content, so that the program knows what to download and can fetch the rest in the background. Since the site is not XHTML compliant, it gets away with omitting the closing tag for each <option> tag. HtmlAgilityPack does not seem to handle this correctly, so now I am stuck: I need to find something that does, fix it myself, or fall back to writing a custom rewriter with Regex. (euurgh - tedious)
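To make the unclosed <option> problem concrete, here is a minimal sketch of one way to handle it: treat every new <option> start tag (and the closing </select>) as implicitly ending the previous option. It is written in Python with the standard library's html.parser purely for illustration. It is not HtmlAgilityPack's API and not the code from this project, and the names OptionCollector and extract_options, the sample markup, and the chapter file names are all made up for the example.

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collects (value, label) pairs from <option> tags, closed or not."""

    def __init__(self):
        super().__init__()
        self.options = []    # finished (value, label) pairs
        self._value = None   # value attribute of the option being read
        self._text = []      # text fragments of the option being read
        self._open = False   # True while inside an option

    def _finish(self):
        # Store the option currently being read, if there is one.
        if self._open:
            self.options.append((self._value, "".join(self._text).strip()))
            self._open = False
            self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._finish()  # a new <option> implicitly closes the previous one
            self._value = dict(attrs).get("value")
            self._open = True

    def handle_endtag(self, tag):
        # An explicit </option> or the end of the <select> also closes it.
        if tag in ("option", "select"):
            self._finish()

    def handle_data(self, data):
        if self._open:
            self._text.append(data)


def extract_options(html):
    """Return the (value, label) pairs of every <option> in the markup."""
    parser = OptionCollector()
    parser.feed(html)
    parser.close()
    parser._finish()  # flush an option left open at the end of the input
    return parser.options


# Hypothetical markup: two options without closing tags, one with.
sample = """
<select name="chapter">
  <option value="ch1.html">Chapter 1
  <option value="ch2.html">Chapter 2
  <option value="ch3.html">Chapter 3</option>
</select>
"""
print(extract_options(sample))
```

The same idea should carry over to a hand-written fix-up pass in .NET: walk the tags in order and close the pending option whenever another one starts, instead of trying to patch the markup with a single Regex.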

On the XNA front, I intend to get that sample done really soon; I am just thinking of something to make it interesting. After that I might have a few more things I can crank out over these holidays.