Thursday, October 23, 2008

Strip HTML

Sometimes an installation instruction file comes as HTML. You want to read it in the console, but you don't have links, elinks, or lynx installed. What do you do?

You pass the file through:


sed -e :a -e 's/<[^>]*>//g;/</N;//ba'
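
For the curious, here is a rough reading of that one-liner: the same four sed commands, one per -e, wrapped in a shell function so the comments have somewhere to live. The name strip_tags is made up for this sketch and is not part of the original.

```shell
# strip_tags is a hypothetical name; the sed commands are the ones from the one-liner.
strip_tags() {
  # :a            label marking the top of the loop
  # s/<[^>]*>//g  delete every complete <...> tag in the pattern space
  # /</N          a "<" is still left, so the tag spans lines: append the next line
  # //ba          empty regex reuses the last one (/</); if it matches, jump back to :a
  sed -e ':a' -e 's/<[^>]*>//g' -e '/</N' -e '//ba' "$@"
}

# A tag split across two lines is still removed.
printf '<a\nhref="x">link</a> text\n' | strip_tags
```

Note that this only deletes tags. Entities such as &amp; are left as they are, and a stray literal "<" in the text will make sed keep pulling in lines while it looks for a closing ">".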


I used something like this before but lost it. Hence this blog: I want to keep little useful things like this around so I can reuse them.

A convenient way to use this is to bind it to an alias in your .profile, named something like striphtml:


alias striphtml="sed -e :a -e 's/<[^>]*>//g;/</N;//ba'"


so that you have it handy. You can then use it like this:

striphtml readme.html | less
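
If you also want striphtml inside scripts, a shell function may be a safer bet than an alias, since bash does not expand aliases in non-interactive shells by default. A minimal sketch, assuming you keep the same name and put it in the same .profile:

```shell
# Function form of the striphtml alias; reads the named files, or stdin if none are given.
striphtml() {
  sed -e :a -e 's/<[^>]*>//g;/</N;//ba' "$@"
}

# Example: strip tags from text arriving on a pipe.
printf '<h1>Install</h1>\n<p>Run <code>make</code></p>\n' | striphtml
```

It is called the same way as the alias, either with a file name or at the end of a pipe.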


The original can be found here.
