It should be able to create, modify and read X/HTML in a highly object-oriented way that still feels DOM-like but is not obese, and is really Pythonic. Preferably it would deal with malformed HTML too, but we can skip this for templates.
For example, I'd like to do this:
>> from someAmazingTemplate import *
>> html = Template('<html><head><title>Hi</title></head><body></body></html>')
>> html.head.append('<link type="text/css" href="main.css" rel="stylesheet" />')
>> html.head.title
Hi
>> html['head']['title']
Hi
I should be able to use/define short functions and use them like this:
>> html.head.append(stylesheet(href="main.css"))
>> html.body.append(h1('BIG TITLE!12',Class="roflol"))
>> html.body.SOURCE
<body>
<h1 class="roflol">
BIG TITLE!12
</h1>
</body>
Note: If it doesn't exist, I'm going to make it under BSD/MIT/Python license. Help is most welcome. Anything that works towards more Pythonic web app development will be great. Very much appreciate it!
-Luke Stanley
HTML parsing has two stages: tokenization and tree construction. The tokenizer emits start tags, end tags, attribute names and values, and text; the tree builder consumes those tokens and builds up the document tree. A well-formed document is simpler and faster to parse, since the parser needs little error recovery.
html5lib: A pure-Python library for parsing HTML. It is designed to conform to the WHATWG HTML specification, as implemented by all major web browsers.
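To make the tokenization stage concrete, here is a minimal sketch using the standard library's `html.parser.HTMLParser`, which reports tokens through callbacks. The `TokenLogger` class and its `tokens` list are names made up for this illustration; the sketch only records tokens and does not build a tree.

```python
from html.parser import HTMLParser


class TokenLogger(HTMLParser):
    """Hypothetical helper: records the tokens the parser reports, in order."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.tokens.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))

    def handle_data(self, data):
        # Skip whitespace-only text between tags.
        if data.strip():
            self.tokens.append(("text", data))


logger = TokenLogger()
logger.feed('<html><head><title>Hi</title></head><body class="main"></body></html>')
logger.close()

for token in logger.tokens:
    print(token)
```

A tree builder would consume this token stream, pushing an element on each start tag and popping it on the matching end tag.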
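ElementTree can handle most of the first part, though it takes a few more steps. (Assigning new attributes such as html.head onto an element works with the pure-Python Element class; the C-accelerated Element in current CPython rejects new attributes, so there you would keep the results of find() in ordinary variables.)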
>>> import xml.etree.ElementTree as ET
>>> html = ET.XML('<html><head><title>Hi</title></head><body></body></html>')
>>> html.head = html.find('head')
>>> html.head.append(ET.XML('<link type="text/css" href="main.css" rel="stylesheet" />'))
>>> html.head.title = html.head.find('title')
>>> html.head.title.text
'Hi'
The second part can be completed by creating Element objects, but you'd need to do some of your own work to make it happen the way you really want:
>>> html.body = html.find('body')
>>> my_h1 = ET.Element('h1', {'class': 'roflol'})
>>> my_h1.text = 'BIG TITLE!12'
>>> html.body.append(my_h1)
>>> html.body.SOURCE = ET.tostring(html.body)
>>> html.body.SOURCE
'<body><h1 class="roflol">BIG TITLE!12</h1></body>'
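For anyone running this on a current Python 3, here is a sketch of the same two parts using plain variables rather than attributes attached to the elements. The variable names (`head`, `body`, `title_text`, `body_source`) are just for illustration, and `encoding='unicode'` is passed so that `ET.tostring` returns a `str` instead of `bytes`.

```python
import xml.etree.ElementTree as ET

html = ET.XML('<html><head><title>Hi</title></head><body></body></html>')

# Keep the sub-elements in ordinary variables instead of attaching them to html.
head = html.find('head')
body = html.find('body')

# First part: append a <link> parsed from markup and read the title text.
head.append(ET.XML('<link type="text/css" href="main.css" rel="stylesheet" />'))
title_text = head.find('title').text

# Second part: build an <h1> element by hand and append it to the body.
my_h1 = ET.Element('h1', {'class': 'roflol'})
my_h1.text = 'BIG TITLE!12'
body.append(my_h1)

# encoding='unicode' makes tostring return str rather than bytes.
body_source = ET.tostring(body, encoding='unicode')

print(title_text)
print(body_source)
```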
You could create a stylesheet function of your own:
>>> def stylesheet(href='', type='text/css', rel='stylesheet', **kwargs):
...     # Extra keyword arguments become additional attributes on <link>.
...     elem = ET.Element('link', href=href, type=type, rel=rel, **kwargs)
...     return elem
...
>>> html.head.append(stylesheet(href="main.css"))
And the whole document:
>>> ET.tostring(html)
'<html><head><title>Hi</title><link href="main.css" rel="stylesheet" type="text/css" /></head><body><h1 class="roflol">BIG TITLE!12</h1></body></html>'
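The same idea stretches to the `h1('BIG TITLE!12', Class="roflol")` style from the question. The sketch below is one possible way to do it: `make_tag` is a hypothetical factory, not part of ElementTree, and lower-casing the keyword names is an assumption made here so that `Class=` can stand in for the reserved word `class`. It uses a standalone `body` element so it runs on its own.

```python
import xml.etree.ElementTree as ET


def make_tag(name):
    """Hypothetical factory: returns a function that builds <name> elements."""
    def build(text=None, **attrs):
        # Lower-case the keyword names so Class="x" becomes class="x".
        elem = ET.Element(name, {key.lower(): value for key, value in attrs.items()})
        elem.text = text
        return elem
    return build


h1 = make_tag('h1')

# A standalone <body>, separate from any document built elsewhere.
body = ET.Element('body')
body.append(h1('BIG TITLE!12', Class="roflol"))

source = ET.tostring(body, encoding='unicode')
print(source)
```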
But, I think if you're going to end up writing your own thing, this is a good place to start. ElementTree is very powerful.
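As a rough idea of what "your own thing" could look like, here is a sketch of a thin wrapper that gives the `html.head.title` and `html['head']['title']` access from the question. `Node` is a made-up class for illustration, not an existing library; it relies on `ET.XML`, so it assumes well-formed markup and will raise on malformed HTML.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical wrapper: attribute and item access to child elements."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only reached when normal lookup fails; private names are never
        # treated as child tags, which avoids recursion on _element.
        if name.startswith('_'):
            raise AttributeError(name)
        child = self._element.find(name)
        if child is None:
            raise AttributeError(name)
        return Node(child)

    def __getitem__(self, name):
        child = self._element.find(name)
        if child is None:
            raise KeyError(name)
        return Node(child)

    def append(self, markup_or_element):
        # Accept either a markup string or a ready-made Element.
        if isinstance(markup_or_element, str):
            markup_or_element = ET.XML(markup_or_element)
        self._element.append(markup_or_element)

    def __str__(self):
        # Printing a node shows its text, so a <title> node prints its title.
        return self._element.text or ''

    @property
    def SOURCE(self):
        return ET.tostring(self._element, encoding='unicode')


html = Node(ET.XML('<html><head><title>Hi</title></head><body></body></html>'))
html.head.append('<link type="text/css" href="main.css" rel="stylesheet" />')

print(html.head.title)
print(html['head']['title'])
print(html.head.SOURCE)
```

One trade-off of this design: a child tag that shares a name with a wrapper method (such as `append`) can only be reached through the `[...]` form.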
Edit: I realize that this is probably not exactly what you're looking for. I just wanted to provide something as an available alternative and to also prove that it could actually be done without too much work.
Amara Bindery provides the most Pythonic XML API I've seen. See the quick reference, manual and FAQ.