Downloading content from Rapidshare.com using wget and bash

I was looking for a program to automate downloading content from rapidshare.com. I used to use RapidGet, but it only runs on Windows. It is possible to run it on other platforms such as Linux using Wine (I haven’t tried Darwine on OS X, but I’m sure it would work), but that is not an ideal solution. I also had some issues with RapidGet itself, the most irritating being that it sometimes wouldn’t download files correctly and I would have to download them manually in a browser. Another feature I wanted was to start downloads remotely from a different machine. That is also possible with Linux and Wine by forwarding the X11 calls over ssh, but you need to stay logged into the machine and leave the ssh session open until RapidGet has finished, which is pretty messy. So once rapidshare.de started forwarding all its content to rapidshare.com, I decided to bite the bullet and write my own bash script.

The script requires you to have a rapidshare premium account and the following programs and script to run correctly.

sed, wget, list_urls.sed

sed and wget are available on most if not all platforms: Cygwin on Windows, Fink or DarwinPorts on Mac OS X, and the package repositories of Linux and Unix systems. I use Ubuntu, so in the unlikely situation that wget or sed is not installed, you can install them by typing sudo apt-get install sed wget

list_urls.sed is a sed script available from the sed site at sed.sourceforge.net. It extracts all the hyperlink URLs from a file.
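
Before running anything it can help to confirm that all three dependencies can actually be found. The following is only a small sketch: check_deps is a helper name made up for this example and is not part of the original scripts.

```shell
# Report any required command that cannot be found on the PATH.
# Prints "missing: NAME" for each one and returns non-zero if any are missing.
check_deps() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return $missing
}

# Example: the three dependencies named above.
check_deps sed wget list_urls.sed || echo "Install the missing tools (or add them to your PATH) first."
```

Note that list_urls.sed has to be executable and on your PATH for the main script to call it by name.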

So how did I create the script? With a little bit of detective work and a beer while I looked at how rapidshare.com fits together. The first thing I did was to download LiveHTTPHeaders for Firefox, which lets you examine the headers of HTTP requests, including GET, POST and redirect data. Using this it became fairly easy to see what was going on. There are three steps to the script:

  1. Login and save the cookie
  2. Using the saved cookie retrieve the actual url of the file
  3. Download the url with wget

To use the script you must provide it with a username, a password and a URL, e.g. downloadFromCom.sh -u username -p password -l link

The first part of the script handles user input

#!/bin/bash
TEMP=`getopt -o u:p:l: --long user:,pass:,url: \
-n 'downloadFromCom.sh' -- "$@"`

if [ $? != 0 ] ; then echo "Error. Correct usage options are \" -u  -p  -l \" Terminating..." >&2 ; exit 1 ; fi
eval set -- "$TEMP"

while true ; do
case "$1" in
-u|--user) echo "Username: $2"; user=$2; shift 2;;
-p|--pass) echo "Password: $2"; pass=$2; shift 2;;
-l|--url) echo "URL: $2"; url=$2; shift 2;;
--) shift ; break ;;
*) echo "Internal error" ; exit 1 ;;
esac
done

RED='\e[1;31m'
CYAN='\e[1;36m'
NC='\e[0m' # No Color

echo "REALURL: $url"
fileName=`basename "$url"`
echo "FILENAME: $fileName"
cookie=cookie
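
The option loop accepts any subset of the three options, so a forgotten -l only shows up later as a string of "missing operand" and "missing URL" errors. A guard along the following lines would stop early instead. This is a sketch, not part of the original script, and require_args is a made-up name.

```shell
# Return non-zero (and print a usage hint) unless user, pass and url are all non-empty.
require_args() {
    local user="$1" pass="$2" url="$3"
    if [ -z "$user" ] || [ -z "$pass" ] || [ -z "$url" ]; then
        echo "Usage: downloadFromCom.sh -u USER -p PASS -l URL" >&2
        return 1
    fi
    return 0
}

# In the script this would sit right after the option loop, e.g.:
#   require_args "$user" "$pass" "$url" || exit 1
```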

Next it logs into Rapidshare and saves the user cookie

## LOGIN and save cookie ##
wget --save-cookies=$cookie -q --post-data="login=$user&password=$pass" https://ssl.rapidshare.com/cgi-bin/premiumzone.cgi
rm premiumzone.cgi
wget --load-cookies=$cookie -q $url -O $fileName.temp

server=`grep post $fileName.temp | tr " " "\n" | grep action | sed 's/[^"]*"\([^"]*\).*/\1/'`
uri=`grep post $fileName.temp | tr " " "\n" | grep value | sed 's/[^"]*"\([^"]*\).*/\1/'`
#tr " " "\n" replaces all spaces with a new line
#sed 's/[^"]*"\([^"]*\).*/\1/' searches a string and retrieves the content from in between double quotes

echo SERVER: $server
echo URI: $uri
newURL="$server$uri"
echo NEWURL: $newURL
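
To see what the grep, tr and sed pipeline does, it can be run on a single line of HTML. The sample line and the extract_attr helper below are invented for illustration; the real RapidShare page markup may look different.

```shell
# Pull the text between the first pair of double quotes out of the
# space-separated token that contains the given attribute name.
# $1 = attribute name (e.g. action), stdin = HTML
extract_attr() {
    grep post | tr " " "\n" | grep "$1" | sed 's/[^"]*"\([^"]*\).*/\1/'
}

# Hypothetical sample line, not copied from the real site.
sample='<form action="http://rs1.example.com/files/1/a.rar" method="post">'
printf '%s\n' "$sample" | extract_attr action
# prints: http://rs1.example.com/files/1/a.rar
```

Because the line is split on spaces, this approach only works while the attribute value itself contains no spaces.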

It now tries to find the actual URL

## RETRIEVE ACTUAL URL ##
wget --load-cookies=cookie -q --post-data=dl.start=PREMIUM $newURL -O $fileName.temp2
actualURL=`list_urls.sed $fileName.temp2 | grep /files | tail -1`
echo -e "${RED}ACTUALURL: ${CYAN}$actualURL${NC}"
fileName2=`basename $actualURL`
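
If list_urls.sed cannot be found, or the page contains no download link, actualURL ends up empty and the following commands fail with "missing operand" and "missing URL". The sketch below shows one way to guard against that. list_hrefs is a rough stand-in built on grep -o and is not a copy of list_urls.sed; both function names are made up, and it assumes links are written as href="..." with double quotes.

```shell
# Print every href="..." target found in the file given as $1, one per line.
list_hrefs() {
    grep -o 'href="[^"]*"' "$1" | sed 's/^href="//; s/"$//'
}

# Print the last link containing /files, or fail with a message if there is none.
last_file_url() {
    local found
    found=$(list_hrefs "$1" | grep /files | tail -n 1)
    if [ -z "$found" ]; then
        echo "No /files link found in $1" >&2
        return 1
    fi
    printf '%s\n' "$found"
}

# In the script this could be used as, e.g.:
#   actualURL=$(last_file_url "$fileName.temp2") || exit 1
```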

Lastly, it downloads the file to the “downloads” directory (which must already exist, because wget will not create it). The “-b” flag puts wget into background mode, which allows multiple files to download at once.

## DOWNLOAD ACTUAL URL ##
wget -b $actualURL -O ./downloads/$fileName2 --load-cookies=cookie

Finally, it does some clean-up

## CLEAN UP ##
rm $cookie
rm $fileName.temp
rm $fileName.temp2

This script will only download one file and then exit. What if we have a lot of URLs that we want to download? Simple: put all your links into a text file, one per line, and pass that file as an argument to the following script, e.g. ./download.sh urls.txt

#!/bin/bash
for url in `cat $1`
do
./downloadFromCom.sh -u username -p password -l $url
done
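
A variation on the same loop reads the file line by line, which skips blank lines and copes with a last line that has no trailing newline. This is a sketch only; for_each_url is a made-up helper and the command it runs is passed in, so it can be tried with echo before pointing it at the real downloader.

```shell
# Call the given command once per non-empty line of the URL file.
# $1 = file of URLs, remaining arguments = command to run with the URL appended.
for_each_url() {
    local list="$1" url
    shift
    while IFS= read -r url || [ -n "$url" ]; do
        [ -z "$url" ] && continue      # skip blank lines
        "$@" "$url"
    done < "$list"
}

# Dry run:        for_each_url urls.txt echo
# Real download:  for_each_url urls.txt ./downloadFromCom.sh -u username -p password -l
```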

All you have to do is make sure that you have edited this script and entered your username and password.

Now that we have run the script and the files are downloading, how do we know when they are finished? With this little script, of course.

#!/bin/bash
j=0
while true
do
clear
echo "===       Iteration $j    ==="
for i in `ls ./wget-log*`
do
head -1 $i
saved=`grep saved $i`
if [ -z "$saved" ]; then
tail -3 $i | head -1 #tail -1 $i
else
tail -2 $i | head -1 #tail -3 $i | head -1
fi
done
let j++
sleep 3
done

All this does is poll the logs that wget produces and continually print a summary of the logs it finds in the current directory. You will have to kill it manually with Ctrl+C. I could have made it quit by itself once all the files have been successfully downloaded, but it’s not really necessary.
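
For anyone who does want the monitor to stop on its own, a check like the following could be added to the loop. It is a sketch that relies on the same assumption as the script above, namely that a finished wget log contains the word "saved"; all_finished is a made-up name.

```shell
# Succeed only if at least one wget-log* file exists in directory $1
# (default: current directory) and every one of them contains "saved".
all_finished() {
    local dir="${1:-.}" log seen=0
    for log in "$dir"/wget-log*; do
        [ -e "$log" ] || continue      # the glob matched nothing
        seen=1
        grep -q saved "$log" || return 1
    done
    [ "$seen" -eq 1 ]
}

# Inside the polling loop, just before the sleep, e.g.:
#   if all_finished .; then echo "All downloads finished"; break; fi
```

A download that fails never writes "saved" to its log, so in that case the loop would still have to be stopped by hand.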

So there we go: a few scripts to download content from RapidShare. You could of course use the same process to write scripts for other sites.

Here’s a zip file containing all the scripts. RapidShare Download Scripts

22 Responses to “Downloading content from Rapidshare.com using wget and bash”


  1. 1 dan Dec 30th, 2006 at 8:17 am

    Have you used http://rapidbolt.com ? It’s a service which masks the download page and encrypts the URLs, thus nobody can report your files.

  2. 2 sullivanmark Dec 30th, 2006 at 8:46 am

    Haven’t come across that particular site but I have seen http://rapidsafe.de and .org seems to do the same type of thing.

  3. 3 Karl Dec 30th, 2006 at 4:41 pm

    What’s the download url’s of the exact files?

    Rapidshare gives urls, but it redirects somewhere, do you know where about?

    Thanks,
    Karl

  4. 4 sullivanmark Dec 30th, 2006 at 4:55 pm

    Hey Karl,
    I don’t know what the exact URL of each file is, but that’s what the script does. You give it a rapidshare URL and it follows the redirect to find the actual URL of the file, which it then proceeds to download.
    Hope that makes sense.
    Mark

  5. 5 Karl Dec 31st, 2006 at 7:26 am

    Hello Mark,

    Thanks for the reply. Yeah it makes sense.

    Suppose I just need to find the URL where all the files are saved.

    Thanks,
    Karl

  6. 6 Trevor Jan 6th, 2007 at 8:41 pm

    Hi Mark,

    First off to Dan above, rapidsafe, I find, is more powerful then rapidbolt.

    Secondly to Mark. I have been reading some interesting articles on converting Linux to Windows the past few weeks. My first thought was “Why bother!”

    The more I did think about it however, the more interested I got and I was just wondering if you had any thought on how you would go about it?

    Trevor

  7. 7 sullivanmark Jan 7th, 2007 at 1:42 pm

    Hey Trevor,

    Could you clarify what you mean by “converting Linux to Windows”. ?

    Do you mean, if you had some script/program which worked in linux and you wanted to use it in windows, you’d like some procedure to follow in-order to accomplish this, or do you mean you’d like to change the platform you are currently using altogether e.g. linux to windows or vice-versa.

    Mark

  8. 8 Nikola Jan 9th, 2007 at 10:08 am

    Hello,
    When I found your script yesterday I was thrilled because I was searching for an easy way to download from my gentoo server via ssh, bash & wget. Unfortunately it doesn`t work for me. When i set direct-download in my account options it had worked for some time, and then it stops and writes this for all files in the list:
    URL: http://rapidshare.com/files/10856645/The.Simpsons.S18E10.PROPER.PDTV.XviD-2HD.part1.rar
    REALURL: http://rapidshare.com/files/10856645/The.Simpsons.S18E10.PROPER.PDTV.XviD-2HD.part1.rar
    FILENAME: The.Simpsons.S18E10.PROPER.PDTV.XviD-2HD.part1.rar
    SERVER: http://ul66.rapidshare.com/files/10856645/The.Simpsons.S18E10.PROPER.PDTV.XviD-2HD.part1.rar
    URI:
    NEWURL: http://ul66.rapidshare.com/files/10856645/The.Simpsons.S18E10.PROPER.PDTV.XviD-2HD.part1.rar
    ./downloadFromCom.sh: line 47: list_urls.sed: command not found
    ACTUALURL:
    basename: missing operand
    Try `basename --help' for more information.
    wget: missing URL
    Usage: wget [OPTION]... [URL]...

    Try `wget --help' for more options.

    It works like this for some time. I saw that the file downloaded was ~60MB. Later, it just stopped.
    When I switched off the direct-download it won`t start the dl at all.
    Help?

  9. 9 Szati Jan 17th, 2007 at 3:11 pm

    That’s great!
    Thanks for it!

    Szati

  10. 10 Dan Jan 23rd, 2007 at 6:16 am

    “First off to Dan above, rapidsafe, I find, is more powerful then rapidbolt.” - Trevor

    What do you mean by that? I was just saying rapidbolt is a good service to help stop people reporting your files and potentially getting them deleted by RS.

  11. 11 kmbasu Jan 29th, 2007 at 4:05 pm

    Hi,

    can you suggest what is going wrong, because I cannot make it work! I typed
    ~>./downloadFromCom.sh -u ***** -p *****

    this is the first part of the error message:

    REALURL:
    basename: missing operand
    Try `basename --help' for more information.
    FILENAME:
    rm: cannot remove `premiumzone.cgi': No such file or directory
    wget: missing URL
    Usage: wget [OPTION]... [URL]...

    Try `wget --help' for more options.
    grep: .temp: No such file or directory
    grep: .temp: No such file or directory
    SERVER:
    URI:
    NEWURL:
    wget: missing URL
    Usage: wget [OPTION]... [URL]...

    Try `wget --help' for more options.
    /bin/sed: can't read .temp2: No such file or directory

    Thanks for your help!

  12. 12 kmbasu Jan 29th, 2007 at 4:18 pm

    P.S. just in case there is confusion over the previous post, I didn’t forget the link URL in my command, it just disappeared when converting to HTML because I wrote it in angular brackets! :-)

  13. 13 kmb Jan 30th, 2007 at 11:55 am

    Unfortunately this didn’t work. However, I’m using RapGet with Wine, and apart from some minor annoyances, that works fine!

  14. 14 Pat Feb 8th, 2007 at 4:32 am

    Nice work Mark. Wanted to write such script on my own but I see it would take some time to write it. I like the list_url.sed script :) ). Thank you for making this publicly available.

  15. 15 Pat Feb 9th, 2007 at 4:10 am

    You have small mistake in your script. The line actualURL=`list_urls.sed $fileName.temp2 | grep /files | tail -1` should have …`./list_urls.sed $fileName.temp2 …

  16. 16 Pat Feb 9th, 2007 at 4:11 am

    and tail -1 is deprecated … tail -n 1 is the new format.

  17. 17 Pat Feb 9th, 2007 at 4:32 am

    Eh, the change with the ./ is wrong due to the `. Should have bigger fonts… :)

  18. 18 Name Feb 16th, 2007 at 1:53 am

    http://www.google.co.uk/search?hl=en&q=wget+rapidshare.com&meta=

    http://en.wikipedia.org/wiki/RapidShare
    - see wget

    #
    # rapidbash
    # - use standard unix cmds and wget to rapidshare download
    #

    user=''
    pass=''
    link=''

    file=''

    WGOPT='-c -q --no-check-certificate'

    getpara(){
    user=$1
    pass=$2
    link=$3
    file=$(basename $3)
    return 0
    }

    help(){
    cat

  19. 19 Nanyo Feb 17th, 2007 at 5:01 pm

    I am having an error when trying to download look:

    ACTUALURL:
    wget: missing URL
    Usage: wget [OPTION]... [URL]...

    Try `wget --help' for more options.

    Any help?

  20. 20 marc Feb 25th, 2007 at 6:56 am

    awesome job! you’ve saved me a lot of work :)

  21. 21 somebody Feb 27th, 2007 at 5:06 am

    does this script still work, i’m running it on cygwin, and it doesn’t seem to do anything, actualurl always is empty

  22. 22 attrox Mar 3rd, 2007 at 5:15 pm

    That works Great! Thank You!
