noizZze

How to Download a Whole Podcast Archive

I was planning to write about something different today, but this just popped up in front of me and I can’t help sharing it with you. I hope you enjoy it and find it useful.

It’s always interesting and, more importantly, pleasant to learn a hack you can keep using; something really useful and not obvious. I have something of that sort for you today. A feather in my cap, actually; but first, here’s some background.

I found a couple of neat podcasts about English, and one of them sounds incredible. That’s why I decided to stick with it and follow it. I mean The Bob and Rob Show, which is terrific, and I really mean that. I will share my thoughts on it later, but today I want to turn your attention to something different. They have a wonderful podcast feed with detailed show notes and all the bells and whistles; admittedly, it’s great all round. But there’s a downside: the feed is limited to the latest eight episodes. What a nuisance!

I need to digress here to say that I have become an avid iTunes user since buying an iPod, and I have recently discovered how tightly podcasts are integrated into it. So, as you have probably realized, I’m using iTunes to fetch and monitor podcasts.

As you would expect, iTunes picks up only the eight most recent episodes from that feed. What about the rest? Am I supposed to wave them goodbye? No way; I decided to fight back. Here’s what I came up with.

Their feed link looks like this:

http://www.englishcaster.com/bobrob/?feed=rss2

The site runs on WordPress, so I took a deeper look at the sources and found that I can add a parameter to control which page is displayed, such as ‘paged=2’ for the second page. I also found the parameter that controls the number of posts per page, but it is protected and can’t be changed from the URL. That’s why I took the page number parameter and started experimenting.
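If you would rather not type the page addresses by hand, here is a minimal Python sketch of the same idea. The function name ‘paged_feed_url’ is my own invention for illustration; only the feed address and the ‘paged’ parameter come from what I described above, and the page count of 7 is simply what this feed had when I tried it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The feed address from this post.
FEED_URL = "http://www.englishcaster.com/bobrob/?feed=rss2"


def paged_feed_url(feed_url, page):
    """Return feed_url with the 'paged' query parameter set to the given page."""
    parts = urlsplit(feed_url)
    # Drop any 'paged' value already present, keep everything else in order.
    query = [(key, value) for key, value in parse_qsl(parts.query) if key != "paged"]
    query.append(("paged", str(page)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Assumption: the archive spans 7 pages, as it did for this feed.
page_urls = [paged_feed_url(FEED_URL, number) for number in range(1, 8)]
```

Each address in ‘page_urls’ can then be opened in a browser or fetched with any download tool and saved as a separate XML file.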

The idea was to dump all pages into XML files, merge them into one consolidated XML file and feed it to iTunes somehow. Preferably, it shouldn’t show up as a different podcast, as I still wanted to keep following the original once I had finished with the archive. The goal was clear and I jumped right into action.

It was an easy, if slightly tedious, task to dump all seven pages into separate files and copy-paste the item tags from them into one big consolidated XML file (200 KB).
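The copy-paste step could also be scripted. What follows is only a sketch, assuming every page is a well-formed RSS 2.0 document with a single channel element; ‘merge_feed_pages’ is a made-up name, and I did the merge by hand rather than with this code.

```python
import xml.etree.ElementTree as ET


def merge_feed_pages(pages):
    """Append the <item> elements of every later page to the first page's <channel>.

    'pages' is a list of RSS 2.0 documents (str or bytes), newest page first.
    Returns the merged document as a string.
    """
    if not pages:
        raise ValueError("need at least one feed page")
    root = ET.fromstring(pages[0])
    channel = root.find("channel")
    if channel is None:
        raise ValueError("the first page has no <channel> element")
    for page in pages[1:]:
        # Items are appended in page order, so older episodes end up at the bottom.
        for item in ET.fromstring(page).iterfind("channel/item"):
            channel.append(item)
    return ET.tostring(root, encoding="unicode")
```

One caveat: ElementTree may rename namespace prefixes (such as the ones WordPress uses for extended content) to generic ones like ‘ns0’ unless they are registered first with ‘ET.register_namespace’. The result is still valid XML, but it is worth checking that your podcast client accepts it.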

Here comes the tricky part. I needed iTunes to pick up this file the next time it looked for podcast updates. To do so, I had to put the handmade XML on some server and route the query to it, but how? The trick is to route all calls for the original server (www.englishcaster.com) to another server that holds my file. I decided to use my local Apache server for this purpose: I created a ‘bobrob’ directory with an ‘index.php’ in it. When no file is mentioned in the URL (as in our case; see the link above), the server looks for a default index file such as ‘index.php’ or ‘index.html’ (depending on how its directory index is configured) before waving the white flag.

I named the file ‘index.php’ and added some simple code to set the content type header so that iTunes treats the response as valid XML. Here’s what the code looks like:

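<?php
// Tell the client (iTunes) that the response is XML.
header("Content-type: text/xml");
// Echo the XML declaration from PHP, since a literal "<?xml" in the page could be parsed as a PHP short open tag.
echo '<?xml version="1.0" encoding="UTF-8"?>';
?>
<!-- generator="wordpress/2.0.2" -->
<rss version="2.0" > … </rss>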

The last and most interesting part was to tell Windows to route all requests to my local site. There’s a file called ‘hosts’ in ‘c:\Windows\system32\drivers\etc’. It holds aliases for site names and maps them to IP addresses. The address ‘127.0.0.1’ stands for the local PC (as you might know). All I had to do was add the desired host name, ‘www.englishcaster.com’, to the end of the localhost mapping line; once I saved the changes, the system started resolving that name to my own PC.
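To make the edit concrete, here is a small sketch of what it does to the file’s contents. It only transforms text and never touches the real ‘hosts’ file (which needs administrator rights to save); ‘add_host_alias’ is a hypothetical helper name, and I made the actual change by hand in a text editor.

```python
def add_host_alias(hosts_text, hostname, ip="127.0.0.1"):
    """Return hosts_text with hostname added to the first mapping line for ip.

    If there is no mapping line for ip, a new one is appended at the end.
    """
    lines = hosts_text.splitlines()
    for index, line in enumerate(lines):
        # Split off a trailing comment so the alias lands before it, not inside it.
        body, hash_mark, comment = line.partition("#")
        fields = body.split()
        if fields and fields[0] == ip:
            if hostname not in fields[1:]:
                body = body.rstrip() + " " + hostname
                lines[index] = body + (" " + hash_mark + comment if hash_mark else "")
            break
    else:
        # No existing line for this address: add a fresh mapping.
        lines.append(f"{ip} {hostname}")
    return "\n".join(lines) + "\n"


# Example input resembling a default hosts file (an assumption, yours may differ).
sample = "# sample hosts file\n127.0.0.1       localhost\n"
updated = add_host_alias(sample, "www.englishcaster.com")
```

In this example the mapping line in ‘updated’ reads ‘127.0.0.1       localhost www.englishcaster.com’, which is the line I ended up with in the real file.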

The final move was to open iTunes and ask it to update the feed. It showed the complete list of 51 episodes (at the time of writing) ready for fetching. Delicious…

I removed the fake mapping and carried on downloading. That’s all the tricks for today. I hope they help you some day.

Let me know what you think of this. Was it clear? Do you have any enhancements or alternative tricks? How would *you* cope with this situation? These are all interesting questions and I would love to hear from you.

Have a nice day and happy listening!