Solved: How to check urls in linux
Posted: Tue Jan 31, 2012 3:23 pm
by mister_v
Hi,
I want to check my sites for broken or missing URLs.
Does anyone know an easy and fast way to check the links of a website?
I want to use it on Linux (Ubuntu) and create a script with it,
so I can run it automatically on a regular basis.
Thanks,
Re: How to check urls in linux
Posted: Wed Feb 01, 2012 9:18 am
by chris
You can use wget.
It is not the best tool but it is probably already installed on your system.
Code: Select all
wget -r -nd --spider -o links.txt -np -p http://www.sitetocheck.com
It writes the results to
links.txt.
Just search for the 404 errors.
A better tool is linkchecker (http://linkchecker.sourceforge.net).
You can install it on Ubuntu/Kubuntu with:
Code: Select all
sudo apt-get install linkchecker
You can check for broken links,
but it can also validate your HTML and CSS.
It can even scan your site for viruses with ClamAV.
There is also a GUI client for linkchecker.
Re: How to check urls in linux
Posted: Wed Feb 01, 2012 7:57 pm
by mister_v
Thanks,
I used linkchecker and it works really great.
But it lists everything; I only want the errors.
It also checks the amazon URLs, and for some reason they also give errors.
I don't want them checked.
Re: How to check urls in linux
Posted: Thu Feb 02, 2012 2:17 pm
by chris
Just use grep to get the 404 errors out:
Code: Select all
grep -B 4 '404 Not Found' links.txt
-B 4 tells grep to also return the 4 lines before each match.
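For example, on a tiny sample file with the same kind of layout (the URLs and the two-line format here are made up, just to show what -B does):

```shell
# Build a tiny sample log (the URLs and layout are made up for illustration)
printf 'URL: http://example.com/ok\nResult: 200 OK\nURL: http://example.com/gone\nResult: 404 Not Found\n' > sample.txt
# -B 1 prints each matching line plus the 1 line before it
grep -B 1 '404 Not Found' sample.txt
# prints:
#   URL: http://example.com/gone
#   Result: 404 Not Found
```

In your real links.txt the context block is bigger, which is why -B 4 is used there.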
If you don't want linkchecker to test the amazon URLs,
you can exclude them:
Code: Select all
linkchecker --ignore-url="amazon" http://www.sitetotest.com > links.txt
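And since you wanted to run it on a regular basis: one way (just a sketch; the file names and the schedule are my own choices, not anything linkchecker requires) is to put the two commands in a small script and call it from cron:

```shell
# Write a small wrapper script ("check-links.sh" is just a name I picked)
cat > check-links.sh <<'EOF'
#!/bin/sh
# Check the site, skipping amazon URLs, then keep only the 404 blocks
linkchecker --ignore-url="amazon" http://www.sitetotest.com > links.txt
grep -B 4 '404 Not Found' links.txt > errors.txt
EOF
chmod +x check-links.sh
# A crontab line to run it every night at 03:00 would look like:
#   0 3 * * * /path/to/check-links.sh
```

Add the crontab line with `crontab -e`, and check errors.txt in the morning.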
Re: How to check urls in linux
Posted: Mon Feb 06, 2012 7:18 pm
by mister_v
Many Thanks
I got what I needed.