Friday, November 30, 2007

Stop writing garbage HTML!

In a world full of broadband connections, it can be tempting to let your page weight creep up a bit. While a bit more heft in a page is probably acceptable, it is still silly to write wasteful HTML. Wasteful HTML needlessly increases bandwidth bills, increases load times, and can (depending on how it is done) increase browser render times.

One of the things I noticed on a project that I am currently working on is that the dynamically generated HTML is about three times as large (measured in KB) as it needs to be, primarily due to whitespace. It is not the kind of whitespace that HTML writers put in to make the markup easier to read, such as indentation. It is just wasteful whitespace. For example, in one generated table, every table cell has about 40 lines of 30 to 50 spaces each between the tags. On this particular page, just removing the uselessly generated whitespace would reduce the page weight by 50%. On the same page, there are tables with alternating row colors. Instead of defining two row classes in an external style sheet, which gets downloaded once and cached, and using the right class in each tag, the HTML writers used an inline style attribute on every row tag. Wasteful!
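To put a rough number on the inline style problem, here is a minimal sketch in Python that builds the same striped table rows both ways and compares the sizes. The function names, the class names ("even" and "odd"), and the colors are made up for illustration; they are not taken from the project described above.

```python
def rows_inline(n):
    """Build n table rows, repeating an inline style attribute on every row."""
    colors = ("#ffffff", "#eeeeee")  # hypothetical stripe colors
    return "".join(
        f'<tr style="background-color: {colors[i % 2]};"><td>row {i}</td></tr>'
        for i in range(n)
    )


def rows_classed(n):
    """Build n table rows that only name a class; the colors would live
    in an external, cacheable style sheet."""
    classes = ("even", "odd")  # hypothetical class names
    return "".join(
        f'<tr class="{classes[i % 2]}"><td>row {i}</td></tr>'
        for i in range(n)
    )


if __name__ == "__main__":
    n = 100
    print("inline styles:", len(rows_inline(n)), "characters")
    print("class names:  ", len(rows_classed(n)), "characters")
```

The style sheet that goes with the second version is two short rules, and it is paid for once per visitor rather than once per row per page view.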

All said and done, this particular page is about three times as heavy as it needs to be, simply due to poor coding. And that does not even touch the JavaScript, which could and should be stored in an external file. Another killer is that many dynamic Web pages are set to not be cacheable, which means that instead of being wasteful once, they are wasteful on every page view. I can understand that many dynamic pages should not be cached. But on these types of pages, it is especially crucial that the HTML be kept to an acceptable minimum.

These kinds of coding practices are what separate the “shake ‘n bake” programmers from even the moderately decent ones. There is absolutely no acceptable excuse for this kind of coding. Not only are your HTML and templates more difficult to maintain, but all of the inline JavaScript and CSS styling discourages reuse, standardization, and other good habits. It makes for a miserable user experience. And it sends the bandwidth bills through the roof.

Not only should you be writing better HTML, but if your Web server has the CPU power to spare, you should turn on HTTP compression. HTML compresses quite well. In addition, you may want to investigate a post-processing engine (again, CPU needed for this) that strips all unneeded whitespace, comments, and other cruft from your HTML output. Even if none of that is an option, consider running such a weight reducer on your static pages and templates at deployment time. Such a system would probably take under ten lines of Perl to write; on this particular project, that ten-line script could probably save my employer 50% on their bandwidth bills. Who can object to that?
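As a rough illustration of the weight reducer idea, here is a minimal sketch, written in Python rather than Perl. The names strip_cruft and gzipped_size are made up for this example. It assumes the markup has no pre or textarea blocks, no inline scripts, and no conditional comments, since a regex pass like this would mangle all of those. It collapses whitespace between tags to a single space instead of deleting it, because a space between inline elements can affect rendering. The second function uses the standard gzip module to show how well the same HTML compresses.

```python
import gzip
import re

# Assumption: no <pre>, <textarea>, inline <script>, or conditional
# comments in the input; this simple pass would damage them.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
BETWEEN_TAGS_RE = re.compile(r">\s+<")


def strip_cruft(html):
    """Remove HTML comments and collapse whitespace runs between tags
    to a single space."""
    html = COMMENT_RE.sub("", html)
    return BETWEEN_TAGS_RE.sub("> <", html).strip()


def gzipped_size(html):
    """Return the size in bytes of the gzip-compressed, UTF-8 encoded HTML."""
    return len(gzip.compress(html.encode("utf-8")))


if __name__ == "__main__":
    # Simulate the bloat described above: 40 lines of 40 spaces per cell.
    padding = ("\n" + " " * 40) * 40
    page = "<table><tr>" + padding.join(["<td>value</td>"] * 50) + "</tr></table>"
    slim = strip_cruft(page)
    print("original:       ", len(page), "characters")
    print("stripped:       ", len(slim), "characters")
    print("original, gzip: ", gzipped_size(page), "bytes")
    print("stripped, gzip: ", gzipped_size(slim), "bytes")
```

Run against a copy of a real page, a script along these lines gives a quick estimate of how much of the page weight is cruft before anyone commits to changing the templates.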

courtesy @TechRepublic
