Pages

Monday, 30 July 2012

Download, slice and dice podcasts on Linux

I’m trying to replace my Windows applications with Linux applications. On Windows, I use I use Juice to download podcasts as MP3s. Recently I decided to switch over to Linux for receiving podcasts. After looking around at various podcast catchers (especially ones that ran on the command-line, so that I could automate them with a cron job), I ran across Podracer. I decided to combine Podracer with a script to split long MP3s into shorter MP3s so that I could play them more easily in my car. Here’s what I did on my Ubuntu Linux machine:
Step 1: Install and configure podracer
I used these commands:
sudo apt-get install podracer
mkdir ~/.podracer
vim ~/.podracer/subscriptions
and add the url of a podcast, e.g. http://feeds.webmasterradio.fm/tdsc for The Daily SearchCast.
cp /etc/podracer.conf ~/.podracer/podracer.conf
Edit ~/.podracer/podracer.conf so that you can pick the download directory you want. I changed
#poddir=$HOME/podcasts/$(date +%Y-%m-%d)
to
poddir=$HOME/rawpodcasts
because I want all my podcasts in one directory where I can do a batch process over them afterwards. Go ahead and run “mkdir ~/rawpodcasts” to create the directory that podcasts will be stored in.
sudo vim /usr/bin/podracer
(it’s okay, Podracer is a shell script). Find the line that says
m3u=$(date +%Y-%m-%d)-podcasts.m3u
and comment it out so that podracer won’t automatically create an .m3u playlist as it downloads podcasts.
Run podracer in “catchup” mode to avoid downloading all the old podcasts from your subscriptions with “podracer -c”. podracer will create a file ~/.podracer/podcast.log to keep a record of all the podcasts that have been downloaded (the “-c” catchup mode creates this text file without actually downloading the MP3s). If you want to re-download a file (e.g. while you’re testing your configuration), you can edit the file ~/.podracer/podcast.log and just delete the line for any MP3 you want to re-download.
Step 2: Install and configure mp3splt (optional)
At a terminal window, type “sudo apt-get install mp3splt”. In Step 1, we configured Podracer to download podcasts as MP3s into a “rawpodcasts” directory. In this step, we’re going to take those long MP3s and split them into individual segments into a new “finishedpodcasts” directory. Make the “finishedpodcasts” directory with the command “mkdir ~/finishedpodcasts”.
Make a file /home/username/download-mp3s-and-process.sh that looks like this.
#!/bin/bash
# Run podracer to download any new podcasts
/usr/bin/podracer
# Now split the podcasts into segments
for i in /home/username/rawpodcasts/*.mp3
do
nicename=`basename $i .mp3`
# Send both stderr and stdout to /dev/null so that this is a quiet cron job
mp3splt -eqd /home/username/finishedpodcasts -o $nicename-@n $i &> /dev/null
done
This script will run podracer to download any new podcasts. Then we list all the MP3 files in the rawpodcasts directory and run mp3splt on each podcast. If you had a file test.mp3, you would be running the command
“mp3splt -eqd /home/matt/finishedpodcasts -o test-@n test.mp3 &> /dev/null”
for example. What do the options to mp3splt mean?
-e means “split on sync errors.” If someone created an mp3 by concatenating multiple mp3s (e.g. with a program such as mp3wrap), that could cause sync errors. mp3splt looks at those sync errors to split the concatenated mp3 back into multiple mp3 files.
-q stands for “quiet.” Don’t ask user to respond to any questions. Normally “-e” says something like
Mp3Splt 2.1 (2004/Sep/28) by Matteo Trotta
THIS SOFTWARE COMES WITH ABSOLUTELY NO WARRANTY! USE AT YOUR OWN RISK!
MPEG 1 Layer 3 – 44100 Hz – Joint Stereo – 256 Kb/s – Total time: 35m.04s
Processing file to detect possible split points, please wait…
Total tracks found: 6
Is this a reasonable number of tracks for this file? (y/n)
Quiet mode suppresses this interactive question on the last two lines above.
-d is the directory to place the split mp3s.
-o lets you specific an output file. “@n” stands for the track number after splitting. So if test.mp3 were made out of two mp3 files, the output of the command above would be two files (in the finishedpodcasts directory) named test.mp3-001.mp3 and test.mp3-002.mp3 . It doesn’t hurt to run mp3splt on existing mp3s because it will just overwrite any old files that had been created.
Step 3: Periodically download and process podcasts
To download podcast files periodically and process them, make a crontab entry for podracer or your script. This will make the cron daemon run your script every few hours to download new mp3s.
I typed “crontab -e” and made the file look like this:
# At 3:03 am, 8:03 am, 10:03 am, 12:03 pm, and 4:03 pm, run this script
3 3,8,10,12,16 * * * /home/username/download-mp3s-and-process.sh
Whenever you’re ready to put the podcasts on some type of media (SD Card, iPod, iPhone, whatever), just copy over anything from the finishedpodcasts directory (if you used mp3splt in step 2) or the rawpodcasts directory if you skipped step 2. Then delete anything left over in either directory.

from http://www.mattcutts.com/blog/download-podcasts-on-linux/