Thursday, January 18, 2007

how to verify ownership to Yahoo on wordpress.org blog server


The mischief caused by the 404.php template turned out not to be a problem for authentication when I stumbled into Yahoo!'s Site Explorer. To claim your site, you are instructed to place a special file with special content under / of the site. This way, one GET will do, and customized 404 pages cause no trouble. Such pages are pretty common on sites managed by a CMS (Content Management System) or a blogging server. I wonder why the smart engineers at Google decided to do two GETs instead...

Once I added the required file under / on my wordpress.org blog server and clicked to continue, the next page asked me to keep the file there for 24 hours while Yahoo's bots take their sweet time to crawl, literally! In sharp contrast, Google's webmaster tools authenticate site ownership in real time, and suck in sitemaps in real time too!

Placing a special file under / is the only way to authenticate your ownership on Yahoo Site Explorer. The millions of hosted sites (blogs or otherwise) where content owners don't have access to / are out of luck. For now, at least. Hopefully, when Yahoo! Site Explorer comes out of beta, they will come up with a way to authenticate sites whose content owners have content-level access (META tokens, maybe?) instead of file-level access.

In comparison with Google's webmaster tools, Yahoo's Site Explorer is quite spartan right now. Its own blog hasn't been updated for a few months. I guess it is a real beta then.

Sunday, January 14, 2007

trial & errors :: SEO Dave's adSense-spiked themes

It was a godsend when I found SEO Dave's themes for wordpress.org blogs. He spiked some default themes with Google's adSense and optimized the placements. I gladly took his word for it, since he is an SEO consultant by trade. This way, I can get new sites up quickly w/o laboring on SEO first :)

On a new wordpress.org server I set up recently, I attempted to unzip the five theme zip files obtained from his blog. I was surprised to be prompted to overwrite this file and that file. I said 'None'. It turned out that only "connections", one of the five zip files, holds everything under a directory, as expected of a compliant theme package, which is to be extracted under /wordpress/wp-content/themes. No biggie: a little command-line bash magic and I got it.
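The unpacking chore can also be sketched in Python instead of bash. This is only a minimal sketch, not the commands actually used: it assumes that a zip without a single wrapping directory should land in a directory named after the zip file itself, and extract_theme is a made-up helper name.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_theme(zip_path, themes_dir):
    """Extract a theme zip so that it always ends up in its own directory.

    Assumption: an archive counts as well packaged when every entry sits
    under one common top-level directory. Anything else is extracted into
    a new directory named after the zip file.
    """
    zip_path, themes_dir = Path(zip_path), Path(themes_dir)
    with zipfile.ZipFile(zip_path) as zf:
        names = [n for n in zf.namelist() if n.strip("/")]
        roots = {PurePosixPath(n).parts[0] for n in names}
        # One root, and no plain file sitting at the top level.
        wrapped = len(roots) == 1 and all(
            len(PurePosixPath(n).parts) > 1 or n.endswith("/") for n in names
        )
        target = themes_dir if wrapped else themes_dir / zip_path.stem
        target.mkdir(parents=True, exist_ok=True)
        zf.extractall(target)
    # Report the directory that now holds the theme files.
    return themes_dir / next(iter(roots)) if wrapped else target
```

Called as extract_theme("blix.zip", "/wordpress/wp-content/themes"), a flat archive would end up under themes/blix, while a properly wrapped one keeps its own directory name.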

/wordpress/wp-admin/themes.php, the admin page to preview and activate a theme, now showed all of them. Many, however, didn't show preview screenshots. Some of them even reported errors. A long listing of these files on the server revealed that the permissions were too restrictive: 0700 for directories and 0600 for files. The root user's umask is 0022, so I am sure the restrictive permissions came from the theme zips, instead of from my sometimes overly-secure setup on the server. A few find and chmod commands later, all was well again.
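The find-and-chmod cleanup could look roughly like this in Python. It is a sketch under assumptions: the target modes 0755 for directories and 0644 for files are my guess at "readable by the web server", not values recorded from the server, and fix_theme_permissions is a hypothetical name.

```python
import os
import stat


def fix_theme_permissions(root, dir_mode=0o755, file_mode=0o644):
    """Loosen overly strict modes below root; return the paths that changed.

    The default modes are assumptions, chosen so that a web server running
    as another user can traverse directories and read files.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        entries = [(dirpath, dir_mode)]
        entries += [(os.path.join(dirpath, name), file_mode) for name in filenames]
        for path, mode in entries:
            if os.path.islink(path):
                continue  # leave symlinks alone
            if stat.S_IMODE(os.stat(path).st_mode) != mode:
                os.chmod(path, mode)
                changed.append(path)
    return changed
```

Running it a second time on the same tree should return an empty list, which makes for a cheap sanity check.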

There may be a bug in the original or spiked Blix theme, as it treats any new page as top-level, even if it is specified as a child page of the ubiquitous About page. The same parent-child page relationship is handled properly by other themes such as Kubrick.

I posted a comment on SEO Dave's site in the hope that he may check it out, along with a wish for Google's search box in place of the default search box.

how to verify ownership to Google on wordpress.org blog server

I was checking out Google's webmaster tools site the other night. In order to verify the ownership, I opted to create a static HTML file on my site. Google failed to verify the ownership, stating it received a 404 error inside a 200-status page. I saw the file on the server and could browse to it properly using a browser too.

Puzzled, I looked at the server's access log. It turned out that Google attempted to retrieve two files. One was the long-winded name it stipulated. The other was the same name with the 'google' prefix replaced by 'noexist_'. The logic is clear: Google wants to be sure the 200 code returned for the "magic" file is real, by verifying that a different code (404 in this case) is returned for a file that doesn't exist.

66.249.74.2 - - [11/Jan/2007:20:12:20 -0500] "GET /google0467d40068c96de7.html HTTP/1.1" 200 59
66.249.74.2 - - [11/Jan/2007:20:12:20 -0500] "GET /noexist_0467d40068c96de7.html HTTP/1.1" 200 5306

The help page claims Google does HEAD only. This obviously isn't true, or isn't true any more, per Apache's access log entries above.
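The two-request check, as I read it from the log, can be sketched in a few lines of Python. This is my reconstruction of the logic, not Google's actual code; parse_requests and ownership_verifiable are made-up names, and the regular expression only covers the fields visible in the two log lines above.

```python
import re

# Pulls method, path and status out of an Apache access log line.
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_requests(log_text):
    """Return (method, path, status) for every request found in log_text."""
    return [(m["method"], m["path"], int(m["status"])) for m in LOG_RE.finditer(log_text)]


def ownership_verifiable(real_status, missing_status):
    """The token file must exist (200) and a made-up name must not (404)."""
    return real_status == 200 and missing_status == 404


LOG = '''66.249.74.2 - - [11/Jan/2007:20:12:20 -0500] "GET /google0467d40068c96de7.html HTTP/1.1" 200 59
66.249.74.2 - - [11/Jan/2007:20:12:20 -0500] "GET /noexist_0467d40068c96de7.html HTTP/1.1" 200 5306
'''

(_, _, real), (_, _, missing) = parse_requests(LOG)
print(ownership_verifiable(real, missing))  # False: the made-up file also got a 200
```

Both requests were GETs and both came back 200, so a check of this shape has no way to tell the real file from a made-up one.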

I ended up using the META tag instead:
<META name="verify-v1" content="DGxlTrIdDwI9xwBYeYOMddr34POYb934o45vCpf3t+nvcI=" />
I copy+pasted it into /wp-content/themes/myTheme/header.php right before </HEAD><BODY>. This time it worked just fine.

The mischief was caused by the beautified 404 page generated by /wp-content/themes/myTheme/404.php, which came back with a 200 status, as the second log entry shows. I vaguely recall Apache's manual pages stating that the ErrorDocument directive and some other types of redirect tend to lose the original response status code, be it 404, 501, or 403.
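To make the difference concrete, here is a small stand-in written with Python's built-in http.server. It is not WordPress or Apache, just a toy server (PrettyErrorHandler, status_of and probe are hypothetical names) that serves the same pretty error page either with the proper 404 status or, like my theme, with a 200.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PRETTY_404 = b"<html><body><h1>Sorry, nothing here</h1></body></html>"


class PrettyErrorHandler(BaseHTTPRequestHandler):
    keep_status = True  # False mimics the status-losing "soft 404"

    def do_GET(self):
        known = self.path == "/"
        status = 200 if (known or not self.keep_status) else 404
        body = b"home" if known else PRETTY_404
        self.send_response(status)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def status_of(port, path):
    """GET path from the local toy server and return the status code."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def probe(keep_status):
    """Start the toy server, request a page that does not exist, stop it."""
    handler = type("Handler", (PrettyErrorHandler,), {"keep_status": keep_status})
    server = HTTPServer(("127.0.0.1", 0), handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return status_of(server.server_port, "/noexist_page.html")
    finally:
        server.shutdown()
        server.server_close()
```

probe(True) reports 404 and probe(False) reports 200 for the very same page body, which is exactly the distinction a verification bot depends on.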

Wednesday, January 10, 2007

AJP proxy enabled by default for Apache 2.2.3 on Fedora Core 6

As part of hardening an Apache instance on a new Fedora Core 6 Linux server, I commented out all _proxy_ modules in the main httpd.conf. When checking for syntax, however, I got
# /etc/init.d/httpd configtest
httpd: Syntax error on line 209 of /etc/httpd/conf/httpd.conf: Syntax error on line 2 of /etc/httpd/conf.d/proxy_ajp.conf: Cannot load /etc/httpd/modules/mod_proxy_ajp.so into server: /etc/httpd/modules/mod_proxy_ajp.so: undefined symbol: proxy_module

Surprised that something actually required proxy_module, I took a look at proxy_ajp.conf.
# cat proxy_ajp.conf

LoadModule proxy_ajp_module modules/mod_proxy_ajp.so

#
# When loaded, the mod_proxy_ajp module adds support for
# proxying to an AJP/1.3 backend server (such as Tomcat).
# To proxy to an AJP backend, use the "ajp://" URI scheme;
# Tomcat is configured to listen on port 8009 for AJP requests

This surely is nice. Building mod_jk for Apache 2.0 on CentOS 4 got old pretty fast. From a security standpoint, I'd think this would come from some httpd-tomcat package. Not exactly!
# rpm -qf proxy_ajp.conf
httpd-2.2.3-5

For my purposes, I just commented out the LoadModule directive inside proxy_ajp.conf. However, it'd make more sense if the module
  • came from an optional module package, something like httpd-tomcat or httpd-ajp, or
  • were disabled by default.
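For more than one config file, the same edit could be scripted. The sketch below is an assumption-laden helper (comment_out_proxy_modules is a made-up name) that comments out every active LoadModule line whose module name contains "proxy"; it works on text only and does not touch Apache itself.

```python
import re

# An active LoadModule line whose module name mentions "proxy".
LOAD_RE = re.compile(r"^(\s*)(LoadModule\s+(\S*proxy\S*)\s+\S+.*)$", re.IGNORECASE)


def comment_out_proxy_modules(conf_text):
    """Return (new_text, disabled_module_names) for an Apache config snippet."""
    disabled, out = [], []
    for line in conf_text.splitlines():
        m = LOAD_RE.match(line)
        if m:
            disabled.append(m.group(3))
            line = f"{m.group(1)}#{m.group(2)}"  # keep the indentation, add '#'
        out.append(line)
    return "\n".join(out) + "\n", disabled


conf = "LoadModule proxy_ajp_module modules/mod_proxy_ajp.so\n\n#\n# When loaded, ...\n"
new_conf, disabled = comment_out_proxy_modules(conf)
print(disabled)  # ['proxy_ajp_module']
```

Lines that are already commented out are left alone, so running it twice is harmless. A configtest afterwards is still the real check.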

Tuesday, January 09, 2007

weird wormholes :: set reply-to header to the list, or not to

Recently I asked on a mailing list why I had to remember to copy the list address into the CC whenever I reply to a post. To my surprise, it turned into a big flame war of sorts. Unwilling to make any changes, the list administrator essentially asked both parties to go away, politely.

In its current form, you need to 'reply-all' to reach the mailing list, and the list administrator makes sure the poster's email address is on a special list so that the poster does not receive two copies. Occasionally, someone has to request that his/her email address be added to the special list, yet again. What a hassle! Why would the administrator want all this hassle for nothing?!

It puzzled me in the past. It still puzzles me now. To my simple mind, the benefits of setting the reply-to header to the list are so obvious:
  • fewer keystrokes for the list members. Many modern MUAs (Mail User Agents) default to 'reply' instead of 'reply-to-all', esp. web-mail UIs.
  • no need for a mental note to reply-to-all instead of reply, or to copy the list's address to CC if you already hit 'reply'
  • no missing discussion
  • consistently threaded discussion, both in the archive and in live discussion. A reply sent to the poster alone is often forwarded to the list later, as a well-intentioned afterthought. The thread is then broken, making it extremely difficult to follow a discussion after you found an interesting excerpt by googling.
Instead,
  • Some said it is hard to do.
  • Some insisted that it is philosophically wrong to reply-to the list. <= Hello, the purpose of subscribing to a mailing list is to publish to and read from the list, not to find sensible partners to conduct private conversations!
  • Some insisted reply-to-all is good enough and is the only sensible way, so they went ahead and hacked their mail clients (MUAs) to detect whether a message is a list post or a private message, automating away the unnecessary keystrokes.
  • Well, maybe some of these people just accepted the dysfunctional setting as an inevitable fact of life, found a workaround, and moved on. If so, it is pretty sad.
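On the "hard to do" point: at the message level, the change is a single header. Below is a minimal sketch using Python's standard email package; it is not how any particular list server implements it, and munge_reply_to and the example.org addresses are made up for illustration.

```python
from email.message import EmailMessage


def munge_reply_to(msg, list_address):
    """Point replies at the list before the post is redistributed."""
    del msg["Reply-To"]  # no error if the header is absent
    msg["Reply-To"] = list_address
    return msg


post = EmailMessage()
post["From"] = "alice@example.org"
post["To"] = "users@lists.example.org"
post["Subject"] = "question about the list"
post.set_content("Why do I have to CC the list by hand?")

munge_reply_to(post, "users@lists.example.org")
print(post["Reply-To"])  # users@lists.example.org
```

Note that the sketch throws away any Reply-To the poster set on purpose, which is one of the arguments the other camp raises.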
Reflecting a bit more today on the need some people seem to have to invent convoluted ways to satisfy themselves, I thought of what the agent in Men in Black says in a Starbucks coffee shop. There are people who can't control their own destiny or fate or fortune, and they are well aware of it. Instead, they opt to pay a premium to select from a bloated, feel-rich selection of lattes, as if they were masters of the universe.

Monday, January 08, 2007

downloads.wordpress.org needs a face lift

Over this past weekend, I built a new wordpress.org blog server on a new CentOS 4.4 server. It seems the 'official' plugin/theme repository page at http://downloads.wordpress.org needs quite a face lift. In its current form, the plugin and theme repository page is
  • spartan: It gives an ordered list of plugins and themes by name (and/or version).
  • poor in function:
    • No descriptions to tell whether you need or like a theme or plugin. I had to download and install each one, then read the descriptions in /wp-admin/plugins.php. That wastes bandwidth on the server and time and effort for the users.
    • No option to download everything at once to try things out. Instead, you have to click on each one. The lack of descriptions certainly exacerbates the problem.
  • not up-to-date: For example, the link to the tiger-admin plugin doesn't work. Upstream now has tiger-admin-v3.0.zip.
  • malformed HTML: The plugin page has an empty link text for the tiger-admin plugin, causing the page to display funny in Firefox 1.5/Linux and Firefox 2.0/Windows.
  • missing previews: Many of the themes don't show preview screens under /wp-admin/themes.php. I have yet to check whether the themes themselves are at fault or the tiger-admin v3.0 plugin is the culprit.
  • without security or integrity assurance: No checksum or digital signature is provided to verify the authenticity and/or integrity of the files.
I guess I'll generate a more functional version and contribute it to the site. Or alternatively, host a beautified version here myself.
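As for the integrity point in the list above, this is roughly what a user could do if the site published a checksum next to each download. It is a sketch under assumptions: the site publishes nothing of the sort today, SHA-256 is my choice of hash, and sha256_of and verify_download are hypothetical names.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so that large zips need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Compare the file's digest with a published value (hypothetical)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A checksum only catches corruption or tampering in transit when the checksum itself comes from a trusted place; a digital signature would be needed to cover authenticity as well.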