=head1 NAME

C<urhtml_fmt> - Reformat HTML, indented according to structure

=head1 SYNOPSIS

    urhtml_fmt [uri|file]

=head1 EXAMPLE

    urhtml_fmt http://perl.org

=head1 DESCRIPTION

Given a URI or the name of a file, writes it to C<STDOUT>
reformatted and indented according to the HTML structure.
Missing start and end tags are supplied and comments added to indicate this.
Text inside C<< <pre> >> elements is not altered.

L<urhtml_fmt> tries to parse everything that is actually out there on the Web.
In fact,
L<urhtml_fmt> will assume any file fed to it was intended as HTML,
and will produce its best guess of the author's intent.

L<urhtml_fmt> supplies missing start and end tags.
Its parser is extremely liberal in what it accepts.
When even this liberal reading of the standards is not sufficient to make
a document into valid HTML,
L<urhtml_fmt>
will pick characters to treat as noise or "cruft".
The parser ignores cruft in determining
the structure of the document.

When L<urhtml_fmt> adds a missing start tag,
it precedes the new start tag with a comment.
When L<urhtml_fmt> adds a missing end tag,
it follows the new end tag with a comment.
When L<urhtml_fmt> classifies characters as "cruft",
it adds a comment to that effect before the "cruft".

C<< <pre> >> elements receive special treatment.
The contents of C<< <pre> >> elements are not reformatted.
When missing tags or cruft occur inside a C<< <pre> >> element,
the comments to that effect are placed
before the C<< <pre> >> start tag.

The argument to L<urhtml_fmt> can be either a URI or a file
name.  If it starts with alphanumerics followed by a colon, it is treated
as a URI.  Otherwise it is treated as a file name.
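
The rule above can be sketched in a few lines of Perl.
This is only an illustration of the rule as stated here,
not code taken from the program itself:
the helper name C<looks_like_uri> and the sample arguments are hypothetical.

    use strict;
    use warnings;

    # Hypothetical helper: true if the argument begins with one or more
    # alphanumerics followed by a colon, which is the rule described above.
    sub looks_like_uri {
        my ($argument) = @_;
        return $argument =~ m/\A [[:alnum:]]+ :/xms ? 1 : 0;
    }

    # Sample arguments, chosen only for illustration.
    for my $argument ( 'http://perl.org', 'page.html', './notes:old.html' ) {
        my $kind = looks_like_uri($argument) ? 'URI' : 'file name';
        print "$argument is treated as a $kind\n";
    }

Under this sketch, a name that merely contains a colon, such as
C<./notes:old.html>, is still treated as a file name,
because the colon does not directly follow a leading run of alphanumerics.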

=head1 SAMPLE OUTPUT

Given this input:

    Test page<tr>x<head attr="I am cruft"><p>Final graf

L<urhtml_fmt> returns

    <!-- Following start tag is replacement for a missing one -->
    <html>
      <!-- Following start tag is replacement for a missing one -->
      <head>
        <title>
          Test page
        
        
      
      
      
      
        
        
x

Final graf

=head1 PURPOSE

This program is a demo of a demo.
Its purpose is to show how easy it is to write applications which look at
the structure of web pages using L<Marpa::UrHTML>.
And the purpose of L<Marpa::UrHTML> is to demonstrate the power of its
parse engine, L<Marpa>.
L<Marpa::UrHTML> was written in a few days,
and its logic is a straightforward, natural expression of the structure of HTML.

=head1 AUTHOR

Jeffrey Kegler

=head1 BUGS

Please report any bugs or feature requests through CPAN's request tracker (RT).
I will be notified, and then you'll automatically be notified of progress on
your bug as I make changes.

=head1 SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Marpa

You can also look for information at:

=over 4

=item * AnnoCPAN: Annotated CPAN documentation

=item * CPAN Ratings

=item * RT: CPAN's request tracker

=item * Search CPAN

=back

=head1 ACKNOWLEDGMENTS

The starting template for this code was HTML::TokeParser, by Gisle Aas.

=head1 LICENSE AND COPYRIGHT

Copyright 2007-2009 Jeffrey Kegler, all rights reserved.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl 5.10.0.

=cut