On my list of things to do is read the document Handling character encodings in HTML and CSS from the W3C. For some reason I can’t quite bring myself to concentrate on it right now.
Tag Archives: HTML
HTML5 and CSS positioning
I stumbled across a weird bug today that I didn’t know about and wasn’t expecting. I’d written a small CSS file to go with a little bit of HTML that did some simple positioning of content. Then I validated my document on the W3C Markup Validation Service and it complained about a missing doctype. So I added a doctype for HTML5. After I did that my page looked all screwy: the CSS positioning wasn’t applying correctly. As far as I can tell, adding the doctype switched the browser from quirks mode to standards mode, and in standards mode a non-zero length without a unit is invalid, so the whole declaration gets ignored. In other words, the CSS wasn’t applying because I had property specifications like this:
#content { margin: 170 50 50 50; padding: 0; }
Whereas I needed to specify the units, like this:
#content { margin: 170px 50px 50px 50px; padding: 0px; }
HTML to text in PHP
On my list of things to do (at a rather low priority) is learning more about how to convert html to text in php.
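As a starting point, here is a minimal sketch using only PHP built-ins (strip_tags, html_entity_decode and preg_replace). The function name html_to_text is made up for illustration, and this is a rough first pass rather than a proper converter:

```php
<?php
// Hypothetical helper: a rough HTML-to-text conversion using PHP built-ins.
function html_to_text( $html ) {
  // Remove the tags themselves (the text between them is kept).
  $text = strip_tags( $html );
  // Turn entities such as &amp; back into plain characters.
  $text = html_entity_decode( $text, ENT_QUOTES, 'UTF-8' );
  // Collapse runs of whitespace into single spaces.
  $text = preg_replace( '/\s+/', ' ', $text );
  return trim( $text );
}

echo html_to_text( "<p>Hello &amp;\n <b>welcome</b></p>" ), "\n";
```

Note that strip_tags only removes the tags, so the contents of script and style elements would end up in the text. A real converter would need to drop those first and probably keep paragraph breaks too.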
Compressing HTML in PHP (no comments or whitespace)
Note: if you’re a web developer you might be interested in registering to become a ProgClub member. ProgClub is a free international club for computer programmers and we run some mailing lists you might like to hang out on to chat about software development, life, etc.
In addition to compressing CSS in PHP I’ve been compressing HTML. My HTML compressor is a bit of a hack. It doesn’t handle CDATA sections for instance. But it should generally work OK. Here it is:
function slib_compress_html( $buffer ) {
  $replace = array(
    "#<!--.*?-->#s" => "",   // strip comments
    "#>\s+<#" => ">\n<",     // strip excess whitespace
    "#\n\s+<#" => "\n<"      // strip excess whitespace
  );
  $search = array_keys( $replace );
  $html = preg_replace( $search, $replace, $buffer );
  return trim( $html );
}
I use this function to compress HTML generated by my PHP scripts by putting ob_start( 'slib_compress_html' ) at the beginning of my script (after the ob_gzhandler) and ob_end_flush() at the end. My HTML compression code looks like this:
if ( extension_loaded( 'zlib' ) ) {
  ob_start( 'ob_gzhandler' );
}
ob_start( 'slib_compress_html' );
run_app( $app_factory );
ob_end_flush();   // flush the slib_compress_html buffer
if ( extension_loaded( 'zlib' ) ) {
  ob_end_flush(); // flush the ob_gzhandler buffer
}
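To see what the compressor actually does, here is a small standalone sketch that runs it over a sample string. The function is repeated so the snippet runs on its own, and the sample input is made up for illustration:

```php
<?php
// Same compressor as in the post, repeated so this sketch is self-contained.
function slib_compress_html( $buffer ) {
  $replace = array(
    "#<!--.*?-->#s" => "",   // strip comments
    "#>\s+<#" => ">\n<",     // whitespace between tags becomes one newline
    "#\n\s+<#" => "\n<"      // drop indentation before a tag
  );
  $search = array_keys( $replace );
  $html = preg_replace( $search, $replace, $buffer );
  return trim( $html );
}

// Made-up sample input with a comment, indentation and a blank line.
$input = "<div>\n  <!-- a comment -->\n  <p>Hello</p>\n\n  <p>World</p>\n</div>\n";
$output = slib_compress_html( $input );
echo $output, "\n";
```

The comment, the indentation and the blank line are all gone, leaving one tag per line. The same patterns would also collapse whitespace between tags inside a pre element, which is another case (like CDATA) this hack doesn’t handle.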
Web page HTML/CSS/JavaScript file size
I found this article (Some Guidelines for Determining Web Page and File Size) today which talks about the average size of HTML and other files on the web. According to the article (and I’m not clear how they got their data) the average HTML file is 25k, JPEG 11.9k, GIF 2.9k, PNG 14.5k, SWF 32k, external scripts 11.2k and external CSS 17k, with the average total size of a web page being 130k. Interesting stuff. It’s particularly interesting that scripts are typically 11.2k, given that jQuery is 90k.
I’m really struggling with a design decision at the moment: I’m not sure whether it’s better to embed CSS/JavaScript content or to link it. If you link it then the client has to send extra HTTP requests (at least two) to get the content, which is overhead and takes time. On the other hand, if your users are returning customers then they might already have the linked files in their cache, meaning they don’t need to send extra HTTP requests, or if they do, maybe those requests won’t need to return content. But then maybe a browser will cache a file when it shouldn’t (this can be avoided with good design), or maybe the user’s connection will fail while loading the linked files and they’ll see an unstyled page in their browser.
So many pros and cons, and it’s all hypothetical… what I really need is data. Anyway, I don’t have data, nor do I really have the tools to get it. So given that I have to fly in the dark, here’s my plan:
When I’m processing a request for a user who doesn’t have a browser cookie set I will embed CSS and JavaScript in the HTML. This is because if their browser cookie isn’t set then this is their first request to my web-site, maybe ever, or maybe just in a while. Either way, it’s probably safe to assume they’re a first-time visitor so they won’t have any content in their cache and they’d need to send additional requests for linked files. So I can save those additional requests and hopefully make my web pages load faster for users who are probably one-off visitors.
But for regular users, having to download the same content over and over with every request gets old fast. The linked files can be about half the size of the page, so embedding doubles the size of each transfer. So when I’m processing a request and the user’s browser cookie is already set, I’ll assume they’re a regular visitor and link my JavaScript files rather than embedding them. I’ll still embed CSS content though, because my CSS content is relatively small and I want to avoid errors where the page loads but the styles don’t.
Then I’ll make the system configurable so users can change their link/embed settings for CSS and JavaScript if they’re not happy with the defaults. Regular power users can use this feature to turn on linking for all content so pages load as fast as possible for them.
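The plan above could be sketched roughly like this. Everything here is hypothetical: the function name choose_asset_modes, the cookie name 'visitor' and the shape of the preferences array are all assumptions made up for illustration, not code from my app:

```php
<?php
// Hypothetical helper: decide whether each asset type is embedded or linked.
// $cookies stands in for $_COOKIE; 'visitor' is an assumed cookie name.
// $prefs holds optional per-user overrides, e.g. array( 'css' => 'link' ).
function choose_asset_modes( array $cookies, array $prefs = array() ) {
  // No cookie means we treat the request as a first-time visit.
  $returning = isset( $cookies['visitor'] );
  $modes = array(
    'css' => 'embed',                       // CSS is small, so embed by default
    'js'  => $returning ? 'link' : 'embed'  // link scripts for returning visitors
  );
  // Apply user overrides, ignoring unknown asset types or modes.
  foreach ( $prefs as $type => $mode ) {
    if ( isset( $modes[$type] ) && in_array( $mode, array( 'embed', 'link' ), true ) ) {
      $modes[$type] = $mode;
    }
  }
  return $modes;
}

// A returning power user who has turned on linking for CSS as well.
print_r( choose_asset_modes( array( 'visitor' => 'abc123' ), array( 'css' => 'link' ) ) );
```

In a real page the first argument would be $_COOKIE, and the result would drive whether the template writes an inline block or a link/script tag pointing at the external file.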
AutoComplete in HTML
Ran across this article, Using AutoComplete in HTML, today while looking up how to disable autocompletion.
HTMLPurifier
I found out about HTMLPurifier today. It’s an HTML sanitisation library for PHP. Handy!
Cross-site scripting and HTML injection
Been reading about Cross-site scripting today on Wikipedia just to see if there was anything I didn’t already know. I’m in the process of code reviewing the entire Pcphpjs code base to remove all the XSS vulnerabilities that I left latent while hacking it together and learning the CodeIgniter and Doctrine frameworks. Now things are relatively stable so I’m going to go over the whole thing and refactor it with a view to code reviewing data handling for HTML injection while I’m at it.
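The basic fix for HTML injection is to escape untrusted data at the point where it’s written into the page. Here is a minimal sketch using PHP’s built-in htmlspecialchars. The wrapper name h and the sample input are made up for illustration and aren’t taken from the Pcphpjs code:

```php
<?php
// Hypothetical wrapper around PHP's built-in htmlspecialchars().
// ENT_QUOTES escapes both double and single quotes, so the result
// is safe in element content and in quoted attribute values.
function h( $text ) {
  return htmlspecialchars( $text, ENT_QUOTES, 'UTF-8' );
}

// Made-up example of untrusted input, e.g. a user-submitted comment.
$comment = '<script>alert("xss")</script>';
echo '<p>' . h( $comment ) . '</p>', "\n";
```

The script tag comes out as inert text rather than markup. This only covers plain HTML contexts; data written into JavaScript, CSS or URLs needs escaping appropriate to that context.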
Forcing MediaWiki to display math as PNG
I had a problem with MediaWiki math sections not always displaying as a PNG. For simple expressions HTML was used instead. This led to a very non-uniform look and feel where some expressions had a green background and large fonts (for PNG expressions) compared to a black background and different fonts (for HTML expressions). I wanted a uniform look and feel so I went looking for a configuration setting.
I haven’t been able to figure out how to force MediaWiki to always display a PNG as a global setting, but in the math preferences section of your user settings you can change from the default “HTML if very simple or else PNG” to “Always render PNG”. That fixes the problem on a per-user basis, which is good enough for me.