Package translate :: Package storage :: Module html :: Class htmlfile

Class htmlfile

markupbase.ParserBase --+    
                        |    
    HTMLParser.HTMLParser --+
                            |
               object --+   |
                        |   |
    base.TranslationStore --+
                            |
                           htmlfile
Known Subclasses:

Nested Classes
  UnitClass
A unit of translatable/localisable HTML content
Instance Methods

__init__(self, includeuntaggeddata=None, inputfile=None)
Initialize and reset this instance.

guess_encoding(self, htmlsrc)
Returns the encoding of the html text.

do_encoding(self, htmlsrc)
Return the html text properly encoded based on a charset.

parse(self, htmlsrc)
Parse the given source string.

addhtmlblock(self, text)

strip_html(self, text)
Strip unnecessary html from the text.

has_translatable_content(self, text)
Check if the supplied HTML snippet has any content that needs to be translated.

startblock(self, tag)

endblock(self)

handle_starttag(self, tag, attrs)

handle_startendtag(self, tag, attrs)

handle_endtag(self, tag)

handle_data(self, data)

handle_charref(self, name)

handle_entityref(self, name)

handle_comment(self, data)

Inherited from HTMLParser.HTMLParser: check_for_whole_start_tag, clear_cdata_mode, close, error, feed, get_starttag_text, goahead, handle_decl, handle_pi, parse_endtag, parse_pi, parse_starttag, reset, set_cdata_mode, unescape, unknown_decl

Inherited from markupbase.ParserBase: getpos, parse_comment, parse_declaration, parse_marked_section, updatepos

Inherited from markupbase.ParserBase (private): _parse_doctype_attlist, _parse_doctype_element, _parse_doctype_entity, _parse_doctype_notation, _parse_doctype_subset, _scan_name

Inherited from base.TranslationStore: __str__, addsourceunit, addunit, findunit, getunits, isempty, makeindex, save, savefile, translate, unit_iter

Inherited from base.TranslationStore (private): _assignname

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__

Class Methods

Inherited from base.TranslationStore: parsefile, parsestring

Class Variables
  markingtags = ['p', 'title', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6...
  markingattrs = []
  includeattrs = ['alt', 'summary', 'standby', 'abbr', 'content']

Inherited from HTMLParser.HTMLParser: CDATA_CONTENT_ELEMENTS

Inherited from markupbase.ParserBase (private): _decl_otherchars

Properties

Inherited from object: __class__

Method Details

__init__(self, includeuntaggeddata=None, inputfile=None)
(Constructor)

Initialize and reset this instance.

Overrides: object.__init__
(inherited documentation)

guess_encoding(self, htmlsrc)

Returns the encoding of the html text.

We look for 'charset=' within a meta tag to do this.
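
The lookup described above can be illustrated with a minimal sketch. This is not the library's implementation: the helper name guess_encoding_sketch and the regular expression are assumptions made for illustration, and the input is assumed to be a text string.

```python
import re

# Hypothetical pattern (an assumption, not taken from the library):
# find "charset=" inside a <meta ...> tag and capture the name after it.
_META_CHARSET = re.compile(
    r'<meta\b[^>]*?charset\s*=\s*["\']?([\w.:-]+)',
    re.IGNORECASE,
)


def guess_encoding_sketch(htmlsrc):
    """Return the charset named in a meta tag, or None if there is none."""
    match = _META_CHARSET.search(htmlsrc)
    if match:
        return match.group(1)
    return None
```

A "charset=" that appears outside a meta tag is ignored in this sketch, and None is returned when no declaration is found, so the caller has to choose a fallback encoding.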

parse(self, htmlsrc)

Parse the given source string.

Overrides: base.TranslationStore.parse
(inherited documentation)

strip_html(self, text)

Strip unnecessary html from the text.

An HTML tag is deemed unnecessary if it fully encloses the translatable text, e.g. '<a href="index.html">Home Page</a>'.

HTML tags that occur within the normal flow of text are not removed, e.g. 'This is a link to the <a href="index.html">Home Page</a>.'
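
The rule can be illustrated with a minimal sketch built on regular expressions. It is not the library's implementation: the names strip_html_sketch and _encloses are hypothetical, and removing several nested enclosing tags in a loop is an assumption of this sketch.

```python
import re

# Text wrapped in one outer tag pair, e.g. <a href="x">Home</a>.
_ENCLOSING = re.compile(
    r'^<(\w+)\b[^>]*>(.*)</\1\s*>$',
    re.DOTALL | re.IGNORECASE,
)


def _encloses(tag, inner):
    """True if `inner` is balanced for `tag`, so the outer pair really wraps it."""
    depth = 0
    pattern = r'<(/?)%s\b[^>]*>' % re.escape(tag)
    for match in re.finditer(pattern, inner, re.IGNORECASE):
        depth += -1 if match.group(1) else 1
        if depth < 0:
            # A closing tag came first: the outer tags are siblings, not a pair.
            return False
    return depth == 0


def strip_html_sketch(text):
    """Remove tags that fully enclose the text; keep tags inside running text."""
    text = text.strip()
    while True:
        match = _ENCLOSING.match(text)
        if not match or not _encloses(match.group(1), match.group(2)):
            return text
        text = match.group(2).strip()
```

The balance check keeps a string such as '<b>one</b> and <b>two</b>' intact, because its first and last tags do not form a single enclosing pair.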

handle_starttag(self, tag, attrs)

Overrides: HTMLParser.HTMLParser.handle_starttag

handle_startendtag(self, tag, attrs)

Overrides: HTMLParser.HTMLParser.handle_startendtag

handle_endtag(self, tag)

Overrides: HTMLParser.HTMLParser.handle_endtag

handle_data(self, data)

Overrides: HTMLParser.HTMLParser.handle_data

handle_charref(self, name)

Overrides: HTMLParser.HTMLParser.handle_charref

handle_entityref(self, name)

Overrides: HTMLParser.HTMLParser.handle_entityref

handle_comment(self, data)

Overrides: HTMLParser.HTMLParser.handle_comment

Class Variable Details

markingtags

Value:
['p',
 'title',
 'h1',
 'h2',
 'h3',
 'h4',
 'h5',
 'h6',
...