HTML -> ASCII

Jaroslav Klaus J.Klaus at sh.cvut.cz
Thu Oct 7 13:36:53 CEST 1999


On Thu, Oct 07, 1999 at 01:12:44PM +0200, Tomas TPS Ulej wrote:
> I need to convert an HTML file into a form without the HTML tags. Is there
> anything like that?

a) see the attachment
b) lynx -dump

It depends on what exactly you need.
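
As a rough sketch of how the two options might be run, assuming the attached script has been saved as html2txt.pl and the page as sample.html (both names are made up for illustration):

```shell
# Create a small sample page to convert (hypothetical file name).
printf '<html><body><h1>Title</h1><p>Hello, <b>world</b>.</p></body></html>\n' > sample.html

# Option a): the attached filter reads HTML on stdin and writes text to stdout.
# html2txt.pl is an assumed name for the saved attachment.
if [ -f html2txt.pl ]; then
    perl html2txt.pl < sample.html > sample-stripped.txt
fi

# Option b): let lynx render the page and dump the result as plain text.
if command -v lynx >/dev/null 2>&1; then
    lynx -dump sample.html > sample-rendered.txt
fi
```

The script only deletes the tags and leaves the remaining text as it was, while lynx renders the page first, so the two outputs will generally differ in layout.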

Jarda Klaus

-------------- next part --------------
#!/usr/local/bin/perl

# converts html files into ascii by just stripping anything between
#  < and >
# written 4/21/96 by Michael Smith for WebGlimpse

$carry=0;	# 1 while we are inside a tag opened on an earlier line

while(<STDIN>){
	$line = $_;
	
	if($carry==1){
		# remove everything up to and including the first >
		unless($line=~s/^[^>]*>//) {
			# no > on this line, so the whole line is inside the tag
			next;
		}
		# if we didn't do next, it succeeded -- reset carry
		$carry=0;
	}

	# strip every complete <...> tag on this line
	$line=~s/<[^>]*>//g;

	# a leftover < means the tag continues on the next line
	if($line=~s/<.*$//){
		$carry=1;
	}
	print $line;
}

