lexor.core.elements module¶

This module defines the elements of the document object model (DOM). This implementation follows most of the recommendations of the W3C.

Inheritance Tree¶

lexor.core.node.Node (__builtin__.object)
    CharacterData
        Text
        ProcessingInstruction
        Comment
        CData
        Entity
        DocumentType
    Element
        RawText (Element, CharacterData)
        Void
        Document
            DocumentFragment

class lexor.core.elements.CharacterData(text='')[source]¶

Bases: lexor.core.node.Node

A simple interface to deal with strings.

__init__(text='')[source]¶: Set the data property to the value of text and set its name to '#character-data'.

node_value[source]¶: Return or set the value of the node. This property is a wrapper for the data attribute.
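The wrapper relationship between node_value and data can be pictured with a plain Python property. This is a minimal sketch, not lexor's implementation: the class name SketchCharacterData is hypothetical, and only the data attribute, the node_value property, and the '#character-data' name come from the description above.

```python
class SketchCharacterData:
    """Hypothetical stand-in for illustration; not lexor's actual class."""

    def __init__(self, text=''):
        self.name = '#character-data'
        self.data = text

    @property
    def node_value(self):
        # Reading node_value simply returns the underlying data.
        return self.data

    @node_value.setter
    def node_value(self, value):
        # Writing node_value stores the new value in data.
        self.data = value


node = SketchCharacterData('hello')
node.node_value = 'goodbye'
print(node.data)
```

Because the property holds no state of its own, reading data and reading node_value always agree.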

class lexor.core.elements.Text(text='')[source]¶

Bases: lexor.core.elements.CharacterData

A node to represent a string object.

__init__(text='')[source]¶: Call its base constructor and set its name to '#text'.

clone_node(_=True)[source]¶: Return a new Text node with the same data content.

class lexor.core.elements.ProcessingInstruction(target, data='')[source]¶

Bases: lexor.core.elements.CharacterData

Represents a “processing instruction”, used to keep processor-specific information in the text of the document.

__init__(target, data='')[source]¶: Create a ProcessingInstruction node with its target set to target and its data set to data.

target[source]¶: The target of this processing instruction.

clone_node(_=True)[source]¶: Returns a new PI with the same data content.

class lexor.core.elements.Comment(data='')[source]¶

Bases: lexor.core.elements.CharacterData

A node to store comments.

__init__(data='')[source]¶: Create a comment node.

comment_type[source]¶: Type of comment. This property is meant to help with documents that support different styles of comments.

clone_node(_=True)[source]¶: Returns a new comment with the same data content.

class lexor.core.elements.CData(data='')[source]¶

Bases: lexor.core.elements.CharacterData

Although this node has been deprecated in the DOM, it seems that XML still uses it.

__init__(data='')[source]¶: Create a CDATA node and set the node name to '#cdata-section'.

clone_node(_=True)[source]¶: Returns a new CData node with the same data content.

class lexor.core.elements.Entity(text='')[source]¶

Bases: lexor.core.elements.CharacterData

From the Merriam-Webster definition:

Something that exists by itself.
Something that is separate from other things.

This node acts in the same way as a Text node, with one main difference: the data it contains should have no whitespace. This node should be reserved for special characters or words that have different meanings across different languages. For instance, in HTML you write &amp; to represent &. In LaTeX you have to type \$ to represent $. Using this node should help you handle these entities more efficiently than simply finding and replacing them in a Text node.

__init__(text='')[source]¶: Create an Entity node and set the node name to #entity.

clone_node(_=True)[source]¶: Returns a new Entity with the same data content.
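To illustrate why separating entities from surrounding text can be convenient, the sketch below splits an HTML string into text pieces and entity pieces. It does not use lexor; the function split_entities and the pattern ENTITY_RE are hypothetical names, and the pattern assumes that an entity is an HTML character reference such as &amp;.

```python
import re

# Assumption: an "entity" is an HTML character reference like &amp; or &#38;.
ENTITY_RE = re.compile(r'(&#?\w+;)')


def split_entities(text):
    """Split text into ('text', s) and ('entity', s) pairs, in order."""
    pieces = []
    for chunk in ENTITY_RE.split(text):
        if not chunk:
            continue  # re.split yields empty strings around adjacent matches
        kind = 'entity' if ENTITY_RE.fullmatch(chunk) else 'text'
        pieces.append((kind, chunk))
    return pieces


pieces = split_entities('Tom &amp; Jerry')
print(pieces)
```

Once the entities are isolated this way, a writer can translate each one for the target language without rescanning the plain text around it.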

class lexor.core.elements.DocumentType(data='')[source]¶

Bases: lexor.core.elements.CharacterData

A node to store the doctype declaration. This node will not follow the specifications at this point (May 30, 2013). It will simply receive the string in between <!doctype and >.

Specs: http://www.w3.org/TR/2012/WD-dom-20121206/#documenttype

__init__(data='')[source]¶: Create a DocumentType node and set its name to #doctype.

clone_node(_=True)[source]¶: Returns a new doctype with the same data content.

class lexor.core.elements.Element(name, data=None)[source]¶

Bases: lexor.core.node.Node

Node object configured to have child Nodes and attributes.

__init__(name, data=None)[source]¶

The parameter data should be a dict object. The element will use the keys and values to populate its attributes. You may modify the element's internal dictionary. However, this may unintentionally overwrite the attributes defined by the __setitem__ method. If you wish to add another attribute to the Element object, use the convention of adding an underscore at the end of the attribute name, e.g.:

>>> strong = Element('strong')
>>> strong.message_ = 'An internal message'
>>> strong['message'] = 'Attribute message'

__call__(selector)[source]¶: Return a lexor.core.selector.Selector object.

update_attributes(node)[source]¶: Copies the attributes of the input node into the calling node.

__getitem__(k)[source]¶

Return the k-th child of this node if k is an integer. Otherwise return the attribute whose name is k.

>>> x.__getitem__(k) is x[k]
True
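The integer-versus-name dispatch described for __getitem__ can be sketched as follows. This is an illustration under stated assumptions, not lexor's code: the class IndexableElement and its child and _attrs members are hypothetical, and the sketch handles only string attribute values in __setitem__.

```python
class IndexableElement:
    """Hypothetical stand-in for illustration; not lexor's Element."""

    def __init__(self, name, data=None):
        self.name = name
        self.child = []                 # child nodes, in document order
        self._attrs = dict(data or {})  # attribute name -> value

    def __getitem__(self, k):
        if isinstance(k, int):
            return self.child[k]   # integer key: positional child lookup
        return self._attrs[k]      # any other key: attribute lookup

    def __setitem__(self, k, val):
        # Simplified: this sketch only stores attributes.
        self._attrs[k] = val


div = IndexableElement('div', {'class': 'note'})
div.child.append(IndexableElement('p'))
div['id'] = 'intro'
print(div[0].name, div['class'], div['id'])
```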

get(k, val='')[source]¶: Return the attribute whose name is k.

__setitem__(k, val)[source]¶

Overloaded array operator. Appends or modifies an attribute. See its base method lexor.core.node.Node.__setitem__() for documentation on when val is not string.

>>> x.__setitem__(attname, 'att') <==> x[attname] = 'att'

__delitem__(k)[source]¶

Remove a child or attribute.

>>> x.__delitem__(k) <==> del x[k]

__contains__(obj)[source]¶

Return True if obj is a node and it is a child of this element or if obj is an attribute of this element. Return False otherwise.

>>> x.__contains__(obj) == (obj in x)
True

contains(obj)[source]¶: Unlike __contains__, this method returns True if obj is any of the descendants of the node.
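The difference between the two membership tests can be sketched with a small tree. The class TreeNode below is a hypothetical stand-in, not lexor's Element: __contains__ looks only at direct children (attribute lookup is omitted here), while contains searches every level beneath the node.

```python
class TreeNode:
    """Hypothetical stand-in for an element that holds child nodes."""

    def __init__(self, name):
        self.name = name
        self.child = []

    def append(self, node):
        self.child.append(node)
        return node

    def __contains__(self, obj):
        # Direct children only (attribute lookup omitted in this sketch).
        return any(obj is c for c in self.child)

    def contains(self, obj):
        # Any descendant: check each child, then recurse into it.
        return any(obj is c or c.contains(obj) for c in self.child)


root = TreeNode('div')
para = root.append(TreeNode('p'))
em = para.append(TreeNode('em'))

print(para in root)       # direct child
print(em in root)         # grandchild, so not a direct child
print(root.contains(em))  # grandchild is still a descendant
```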

__iter__()[source]¶

Iterate over the element's attribute names.

>>> for attribute_name in node: ...

attlen[source]¶: The number of attributes.

attributes[source]¶: Return a list of the attribute names in the element.

values[source]¶: Return a list of the attribute values in the Element.

attribute(index)[source]¶: Return the name of the attribute at the specified index.

attr(index)[source]¶: Return the value of the attribute at the specified index.

items()[source]¶: Return all the items.

update(dict_)[source]¶: Update with the values of dict_. This is useful when the element is empty and you have created an Attr object: just update the values.

rename(old_name, new_name)[source]¶

Renames an attribute.

>>> from lexor.core.elements import Element
>>> node = Element('div')
>>> node['att1'] = 'val1'
>>> node
div[0x10a090750 att1="val1"]:
>>> node.rename('att1', 'new-att-name')
>>> node
div[0x10a090750 new-att-name="val1"]:

clone_node(deep=False, normalize=True)[source]¶: Returns a new node. When deep is True, it will also clone all the child nodes.

get_elements_by_class_name(classname)[source]¶: Return a list of all child elements which have all of the given class names.
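The phrase "all of the given class names" amounts to a subset test. The helper below is a hypothetical sketch, not part of lexor, and it assumes that both the element's class attribute and the classname argument are whitespace-separated strings, as in the DOM's getElementsByClassName.

```python
def has_all_classes(class_attr, classname):
    """Return True if class_attr lists every name given in classname."""
    wanted = set(classname.split())
    present = set(class_attr.split())
    # An empty request matches nothing; otherwise require a subset.
    return bool(wanted) and wanted <= present


print(has_all_classes('note big red', 'red note'))
print(has_all_classes('note', 'note big'))
```

The order of the names does not matter in this sketch, only whether each requested name is present.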

children(children=None, **keywords)[source]¶

Set the element's children by providing a list of nodes or a string. If using a string then you may provide any of the following keywords to dictate how to parse and convert:

parser_style: '_'
parser_lang: 'html'
parser_defaults: None
convert_style: '_'
convert_from: None
convert_to: 'html'
convert_defaults: None
convert: 'false'

If no children are provided then it returns a string of the children written in plain html. To change this behavior provide the following keywords:

writer_style: 'plain'
writer_lang: 'html'

Important

This requires the installation of lexor styles.

__weakref__¶: list of weak references to the object (if defined)

class lexor.core.elements.RawText(name, data='', att=None)[source]¶

Bases: lexor.core.elements.Element, lexor.core.elements.CharacterData

A few elements do not have children, instead they have data. Such elements exist in HTML: script, title among others.

__init__(name, data='', att=None)[source]¶: You may provide att as a dict object.

clone_node(deep=True, normalize=True)[source]¶: Returns a new RawText element.

__weakref__¶: list of weak references to the object (if defined)

class lexor.core.elements.Void(name, att=None)[source]¶

Bases: lexor.core.elements.Element

An element with no children.

__init__(name, att=None)[source]¶: You may provide att as a dict object.

clone_node(_=True, normalize=True)[source]¶: Returns a new Void element.

class lexor.core.elements.Document(lang='xml', style='default')[source]¶

Bases: lexor.core.elements.Element

Contains information about the document that it holds.

__init__(lang='xml', style='default')[source]¶: Creates a new document object and sets its name to #document.

clone_node(deep=False, normalize=True)[source]¶: Returns a new Document. Note: it does not copy the default values.

language[source]¶

The current document’s language. This property is used by the writer to determine how to write the document.

This property is a wrapper for the lang attribute.

writing_style[source]¶

The current document’s style. This property is used by the writer to determine how to write the document.

This property is a wrapper for the style attribute.

uri[source]¶: The Uniform Resource Identifier. This property may become useful if the document represents a file. This property should be set by a Parser object to tell us the location of the file that it parsed into the Document object.

static create_element(tagname, data=None)[source]¶: Utility function to avoid having to import lexor.core.elements module. Returns an element object.

get_element_by_id(element_id)[source]¶: Return the first element, in tree order, within the document whose ID is element_id, or None if there is none.
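"First element, in tree order" means a depth-first, pre-order walk: a node is visited before its children, and children are visited left to right. The sketch below shows that traversal on a hypothetical IdNode class; neither IdNode nor find_by_id is part of lexor, and the element_id attribute is an assumption made for the example.

```python
class IdNode:
    """Hypothetical node with an optional ID and a list of children."""

    def __init__(self, name, element_id=None):
        self.name = name
        self.element_id = element_id
        self.child = []


def find_by_id(root, element_id):
    """Return the first descendant of root, in tree order, with that ID."""
    stack = list(reversed(root.child))
    while stack:
        node = stack.pop()
        if node.element_id == element_id:
            return node
        # Push children reversed so the leftmost child is visited next.
        stack.extend(reversed(node.child))
    return None


doc = IdNode('#document')
section = IdNode('section', 'x')
inner = IdNode('p', 'dup')
later = IdNode('p', 'dup')
section.child.append(inner)
doc.child.extend([section, later])

print(find_by_id(doc, 'dup') is inner)
```

When two nodes share an ID, the nested one inside the earlier sibling wins, because it comes first in tree order.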

class lexor.core.elements.DocumentFragment(lang='xml', style='default')[source]¶

Bases: lexor.core.elements.Document

Takes in an element and “steals” its children. This element should only be used as a temporary container. Note that the __str__ method may not yield the expected results, since all it does is call the __str__ method of each of its children. First assign this object to an actual Document.

append_child(new_child)[source]¶: Adds the node new_child to the end of the list of children of this node. The children contained in a DocumentFragment only have a parent (the DocumentFragment). This is in contrast to lexor.core.node.Node.append_child(), which also takes care of the prev and next attributes.
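The contrast between the two ways of appending can be sketched as two small functions. Everything here is hypothetical and simplified: LinkNode, append_linked and append_parent_only are illustrative names, and only the parent, prev and next attributes are taken from the description above.

```python
class LinkNode:
    """Hypothetical node carrying parent, prev and next links."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.prev = None
        self.next = None
        self.child = []


def append_linked(parent, new_child):
    """Append and maintain parent, prev and next (the general case)."""
    if parent.child:
        last = parent.child[-1]
        last.next = new_child
        new_child.prev = last
    new_child.parent = parent
    parent.child.append(new_child)
    return new_child


def append_parent_only(fragment, new_child):
    """Append and set only the parent, as described for a fragment."""
    new_child.parent = fragment
    fragment.child.append(new_child)
    return new_child


element = LinkNode('div')
a = append_linked(element, LinkNode('a'))
b = append_linked(element, LinkNode('b'))

fragment = LinkNode('#fragment')
c = append_parent_only(fragment, LinkNode('c'))
d = append_parent_only(fragment, LinkNode('d'))

print(a.next is b, c.next is None)
```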

__repr__()[source]¶

>>> x.__repr__() <==> repr(x)

__str__()[source]¶

>>> x.__str__() <==> str(x)