It took at least forty-five minutes of reading the wget manual, trying out variant commands, and searching the internet for existing solutions to similar problems to write that one line. I started with a lengthy wget command, which worked until I discovered it was still downloading the (large, slow) files I had told it to skip. That wasn't a bug, and there was no way to turn it off.
All you really need beyond a working knowledge of shell script syntax and a few core tools (like grep and the man pages) is the belief that *most* things can probably be automated in one way or another... and practice doing weird stuff in the command shell.
Experience will teach you how to identify a problem and break it into smaller sub-problems that have likely been solved before, either by you or by someone on the internet -- I had to dig through a few Stack Overflow threads before I learned that lynx can dump the links on a page. Many of these solutions hide in the manuals of the tools you're already using. Wget's --no-clobber flag, for example, skips files you've already downloaded.
You could view my code above as the combined solution to a chain of smaller problems: "Extract all the links from the page, find those that point to a PDF file, ignore those that I already have, and download the rest". Unix tools tend to do one job extremely well, so it's usually better to glue several commands together than to reach for one that claims to do it all.
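To make the shape concrete, a minimal sketch of that kind of pipeline might look like this (the URL is a placeholder, and the exact flags would depend on the page you're scraping):

    # List every link on the page, keep only the PDF URLs,
    # then hand them to wget, which skips anything already on disk.
    lynx -dump -listonly "http://example.com/papers/" \
        | grep -Eo 'https?://[^ ]+\.pdf' \
        | wget --no-clobber --input-file=-

Each stage handles exactly one of the sub-problems above, which makes the pieces easy to test and swap out independently.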
As for specifics, I learned a lot from The Grymoire (http://www.grymoire.com/Unix/). Other tips I've picked up by Googling for things like "bash tricks".