A while back I was complaining about the evils of CSS and its misuse to render the Web ugly, unreadable, or both. I even threatened to resort to writing code to clean pages up as they were downloaded. Over the course of April and May I spent quite a few spare minutes putting together a preliminary piece of JavaScript code to carry out that threat. Tip of the hat to the folks who make Greasemonkey, the Firefox extension I use to launch my code; it makes it straightforward to have a piece of code executed whenever a page finishes loading. I have to admit that I have been surprised by how much content can be given a consistent appearance by working in a bottom-up fashion, independent of the Web site -- I thought the code would have to be considerably more complicated.
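For a sense of scale, a minimal sketch of this kind of userscript might look like the following. It is not the script described above: the font settings, the list of tags, and the function name `normalizeText` are all assumptions chosen for illustration, and "bottom-up" is read here simply as visiting child elements before their parents.

```javascript
// ==UserScript==
// @name        Consistent text (illustrative sketch)
// @include     *
// @grant       none
// ==/UserScript==

// Hypothetical defaults; the original script's settings are not known.
const TEXT_STYLE = {
  fontFamily: "Georgia, serif",
  fontSize: "18px",
  lineHeight: "1.5",
};

// Tags treated as "text" for this sketch (an assumed list).
const TEXT_TAGS = new Set(["P", "LI", "BLOCKQUOTE", "TD", "DD"]);

// Visit children first, then the element itself, and set inline styles
// on text-bearing elements. Returns how many elements were restyled.
function normalizeText(element, style = TEXT_STYLE) {
  let changed = 0;
  for (const child of Array.from(element.children || [])) {
    changed += normalizeText(child, style);
  }
  if (TEXT_TAGS.has(element.tagName)) {
    Object.assign(element.style, style);
    changed += 1;
  }
  return changed;
}

// In a browser, run once against the loaded page; elsewhere, do nothing.
if (typeof document !== "undefined" && document.body) {
  normalizeText(document.body);
}
```

Inline styles take precedence over ordinary stylesheet rules, which is why a sketch this short can already flatten a lot of per-site variation; rules marked `!important` by the site would still win and would need separate handling.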
I installed the newest version of Firefox for desktop machines today, and noticed that "reader mode" is now available by default. If the software thinks that it can extract a simplified version of the page's principal content, an extra icon appears in the address bar. Click on the icon and a new pane opens, with the text that the software thinks is the primary page content rendered in a simple layout. Unsurprisingly, it doesn't get everything right; pictures might be excluded, and graphics that combine both images and text are likely to be garbled. On some pages -- like the individual post pages Blogger generates for this site -- the software doesn't recognize that there's a main piece of content that could be extracted, so reader mode isn't offered. Still, I am clearly not the only one who thinks the Web is being rendered unreadable by the designers. I'm not concerned with stripping away as much of the clutter as reader mode does; I simply want text presented consistently in terms of fonts and sizes.
Content extraction and formatting tools turn out not to be new (my bad; you can't keep up to date on everything). Readability, for example, is available for a variety of platforms, and is attempting to build commercial products and services around the idea of simplifying content. But having the software embedded in a popular desktop browser and enabled by default is probably a bigger thing for user acceptance. Mozilla offers guidelines for structuring page content that make it easier for the reader-mode software to recognize and extract content. Will page designers be encouraged by the folks paying the bills to conform to those guidelines? Will the advertising companies figure out ways to present ads so that they are still included in the simplified material? How long before there's a user preference setting that invokes reader mode automatically if the page content is recognized as conforming to the guidelines? And of course, Mozilla is a much bigger target, and it might be tempting for an ad-selling firm to go to court on the legal theory that tools that rewrite possibly copyrighted material should be illegal or that damages should be paid.
Speaking broadly, this is the type of war that I've long claimed the content providers can't win. Content providers have to conform to standards so that their content can be rendered. In this case, they have to stick with HTML and JavaScript's DOM. Content consumers have a steadily increasing amount of processing power at their disposal for tearing the HTML apart and extracting a subset of the content. Browsers give consumers the ability to write and/or install software on their own. Nor is the necessary software all that complex, as I've demonstrated. So the content providers can't win the war on legal grounds, because they can't put enough people in jail to matter. The only way to "win" is to make content that is compelling and pleasant to use.
For at least one personal situation, the whole thing is an amusing development. I was invited to be part of a group discussing site appearance and functionality for a multi-author blog (as a reader who regularly comments, not as a member of the editorial staff that makes the actual decisions). I've been thinking that perhaps I should resign that "position", since I now run my own rewriting software by default, and it does things that affect the size and proportions of various widgets on the blog's pages. More interesting for the long term, though, is the whole question of how much effort should go into appearance issues, since it seems likely that contemporary browsers will be making more and more decisions about what content to show and how to format that content.