Getting the CSS Properties of DOM Nodes with Headless Mozilla
Mozilla July 10th, 2009
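Project repository: http://git.lazytech.info/?p=dom-traversal.git
In short, this program uses the headless Mozilla back-end, so it runs without an X environment, and prints the DOM nodes of a given web page together with the CSS properties that apply to each node.
For an introduction to headless Mozilla, see: http://chrislord.net/files/fosdem-09-slides.odp
For build instructions, see the wiki: http://center.lazytech.info/wiki/OffscreenMozilla and http://center.lazytech.info/wiki/DOMTraversal
A fragment of the program's output for www.google.com: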
    <html>
        Style: html, div, map, dt, isindex, form { display: block; }
    <head>
        Style: area, base, basefont, head, meta, script, style, title, noembed, param { display: none; }
    <meta>
        Style: area, base, basefont, head, meta, script, style, title, noembed, param { display: none; }
    <title>
        Style: area, base, basefont, head, meta, script, style, title, noembed, param { display: none; }
    <script>
        Style: area, base, basefont, head, meta, script, style, title, noembed, param { display: none; }
    <style>
        Style: area, base, basefont, head, meta, script, style, title, noembed, param { display: none; }
    <script>
        Style: area, base, basefont, head, meta, script, style, title, noembed, param { display: none; }
    <style>
        Style: area, base, basefont, head, meta, script, style, title, noembed, param { display: none; }
    <style>
        Style: area, base, basefont, head, meta, script, style, title, noembed, param { display: none; }
    <body>
        Style: body { display: block; margin: 8px; }
               body, td, a, p, .h { font-family: arial,sans-serif; }
    <textarea>
        Style: display: none;
               textarea { margin: 1px 0pt; border: 2px inset threedface; background-color: -moz-field; color: -moz-fieldtext; font: medium -moz-fixed; text-rendering: optimizelegibility; text-align: start; text-transform: none; word-spacing: normal; letter-spacing: normal; vertical-align: text-bottom; cursor: text; -moz-binding: url("chrome://global/content/platformHTMLBindings.xml#textAreas"); -moz-appearance: textfield-multiline; text-indent: 0pt; -moz-user-select: text; text-shadow: none; word-wrap: break-word; }
               input:-moz-read-write, textarea:-moz-read-write { -moz-user-modify: read-write ! important; }
    <iframe>
        Style: display: none;
               iframe { border: 2px inset; }
    <div>
        Style: html, div, map, dt, isindex, form { display: block; }
               #gbar { height: 22px; }
               #gbar, #guser { font-size: 13px; padding-top: 1px ! important; }
               #gbar { float: left; }
    <nobr>
        Style: nobr { white-space: nowrap; }
    <b>
        Style: b, strong { font-weight: bolder; }
               .gb1, .gb3 { height: 22px; margin-right: 0.5em; vertical-align: top; }
    ......
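To make the shape of that output concrete, here is a rough Python sketch of the same idea: visit each element in document order and print it, followed by the style rules that match it. This is not how the program itself works. The real tool asks Mozilla's style system for the matched rules, whereas this sketch uses a tiny hand-written rule table (`RULES`) and a toy matcher (`simple_match`) that only understands bare tag, `#id` and `.class` selectors. The names `RULES`, `simple_match`, `StyleDumper` and `dump_styles` are made up for illustration.

```python
from html.parser import HTMLParser

# Toy stylesheet: (selector list, declarations). Hypothetical data for this
# sketch; the real program gets matched rules from Mozilla's style system.
RULES = [
    (["html", "div", "form"], "display: block;"),
    (["head", "title"], "display: none;"),
    (["body"], "display: block; margin: 8px;"),
    (["#gbar"], "height: 22px;"),
    (["b", "strong"], "font-weight: bolder;"),
    ([".gb1", ".gb3"], "height: 22px; margin-right: 0.5em; vertical-align: top;"),
]


def simple_match(selector, tag, attrs):
    """Match bare tag, #id and .class selectors only (a simplification)."""
    if selector.startswith("#"):
        return attrs.get("id") == selector[1:]
    if selector.startswith("."):
        return selector[1:] in (attrs.get("class") or "").split()
    return selector == tag


class StyleDumper(HTMLParser):
    """Record each start tag in document order, then its matching rules."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.lines = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.lines.append("<" + tag + ">")
        matched = []
        if attrs.get("style"):
            # Inline style comes first, as in the output fragment above.
            matched.append(attrs["style"])
        for selectors, decls in self.rules:
            if any(simple_match(s, tag, attrs) for s in selectors):
                matched.append(", ".join(selectors) + " { " + decls + " }")
        if matched:
            self.lines.append("  Style: " + " ".join(matched))


def dump_styles(html, rules=RULES):
    """Return the output lines for an HTML string."""
    dumper = StyleDumper(rules)
    dumper.feed(html)
    dumper.close()
    return dumper.lines


if __name__ == "__main__":
    page = ('<html><head><title>t</title></head><body>'
            '<div id="gbar"><b class="gb1">Web</b></div></body></html>')
    print("\n".join(dump_styles(page)))
```

The sketch ignores the cascade, specificity, inheritance and the browser's built-in style sheet, all of which the real layout engine handles. That is exactly why the program leans on Mozilla instead of reimplementing rule matching.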
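At run time the program uses roughly 20 MB of memory. To bring that down, the only option is probably to pull Mozilla's DOM and layout components out and reimplement them separately…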