Wget

From Augix' Wiki

Jump to: navigation, search

Contents

Usage

sh this_script.sh <URL>

Example

sh this_script.sh http://www.affymetrix.com/Auth/support/downloads/library_files/HuEx-1_0-st-v2_libraryfile.zip

Download a file in a website which requires login

wget

#need wget version > 1.10
 
#1. get cookie
 
~/bin/bin/wget -v -c --no-check-certificate --post-data="logon=augix.com%40gmail.com&password=PASSWORD"  \
--save-cookies=cookies.txt --keep-session-cookies http://www.affymetrix.com/site/processLogin.jsp
 
#2. download file $1
 
~/bin/bin/wget --no-check-certificate  --cookies=on --load-cookies=cookies.txt \
--keep-session-cookies $1


curl

curl -D headers_and_cookies -d "logon=augix.com%40gmail.com&password=PASSWORD&x=0&y=0" http://www.affymetrix.com/site/processLogin.jsp
 
curl -b headers_and_cookies http://www.affymetrix.com/Auth/support/downloads/demo_data/HuEx-1_0-st-v2.tissue-mixture-data-set.gcos-files.zip


Download all the files form FTP site by Wget

echo "wget -r -l2 --no-parent -A.bz2 ftp://ftp.hgsc.bcm.tmc.edu/pub/data/Rmacaque/fasta/" | at now

You want to download all the GIFs from a directory on an HTTP server. You tried

wget http://www.server.com/dir/*.gif

but that didn't work because HTTP retrieval does not support globbing. In that case, use:

wget -r -l1 --no-parent -A.gif http://www.server.com/dir/

More verbose, but the effect is the same. `-r -l1' means to retrieve recursively (see section Recursive Retrieval), with maximum depth of 1. `--no-parent' means that references to the parent directory are ignored (see section Directory-Based Limits), and `-A.gif' means to download only the GIF files. `-A "*.gif"' would have worked too.

wget -r -l1 --no-parent -A.gz ftp://hgdownload.cse.ucsc.edu/goldenPath/hg18/vsPanTro2/axtNet/ -o wget.log

Download the whole Website

wget -r -p -np -k url


Reference

Mastering Wget - Lifehacker.com

[1]

Personal tools