Interesting read: Backing up Geocities: Lessons so far.

A side-effect of the whole process is I now know way, way, way too much about Geocities than I ever expected to. We’ve had to dissect every aspect of how the site functions to understand how to mirror things, from its history through how it does crazy javascript ads. Some of it is stupid and some is hilarious, but this contextual bit is important to understanding the data we have.

The Comics Archives has launched its 2008 DC Archives Survey [edit: it’s since been taken offline]. Readers are asked which DC Archive books they own, and which series they would be likely to buy if new volumes were released next year. Results will be collated and sent to DC Editorial.

DC’s Archive line is their line of hardcover reprints on nice, glossy paper, usually following a character or team starting at the beginning of the series. DC has two sets of Flash archives right now:

  • The Flash Archives: 4 volumes following Barry Allen from his first few appearances in Showcase through the start of his solo title, covering 1956–1962.
  • The Golden Age Flash Archives: 2 volumes following Jay Garrick through the first 2 years of Flash Comics and All-Flash, covering 1940–1941.

The survey also asks about other reprint formats, including the paperback Chronicles series, the Omnibus series (hardcover, but lower-quality paper), and more thematic reprint sets (one suggestion is Flash: The Death of Iris Allen).

So if, like me, you’re still hoping for that next volume of Golden Age Flash Archives—or any other classic DC book that hasn’t been reprinted in decades, if ever—stop on over and fill out the survey.

(via Comic Bloc Forums)

Newsarama reports that during the Q&A part of the DC Nation panel at this weekend’s Baltimore Comic-Con, a fan asked:

Are there more Legion, Flash or Justice League Archives coming? [VP of Sales Bob] Wayne said that when you get up to the issues that can be affordably bought by collectors the demand for the Archive Editions goes down.

Okay, this might apply to the Silver-Age material. The four Flash Archives books so far are up to Flash #132 (1962). When I was tracking down back-issues in the #133–140 range (the likely contents of a hypothetical book 5) about 6 or 7 years ago, I seem to remember finding reasonably good copies in the $5-15 range. (Better copies, of course, run into triple digits.)

But there are still 8 years of Golden-Age material to cover, from 1942–1949: more than 75% of Jay Garrick’s solo run. And those books are much harder to find, with battered readers’ copies often selling for $40–150.

Moreover, those 8 years include the first appearances of every major Golden-Age Flash villain.

I recently discovered exactly how the Wayback Machine deals with changes to robots.txt.

First, some background. I have a weblog I’ve been running since 2002, switching from B2 to WordPress and changing the permalink structure twice (with appropriate HTTP redirects each time) as nicer structures became available. Unfortunately, some spiders kept hitting the old URLs over and over again, despite the fact that they forwarded with a 301 permanent redirect to the new locations. So, foolishly, I added the old links to robots.txt to get the spiders to stop.

Flash forward to earlier this week. I’ve made a post on Slashdot, which reminds me of a review I did of Might and Magic IX nearly four years ago. I head to my blog, pull up the post… and to my horror, discover that it’s missing half a sentence at the beginning of a paragraph and I don’t remember the sense of what I originally wrote!

My backups are too recent (ironic, that), so I hit the Wayback Machine. They only have the post going back to 2004, which is still missing the chunk of text. Then I remember that the link structure was different, so I try hitting the oldest archived copies of the main page, and I’m able to pull up the summary with a link to the original location. I click on it… and I see:

Excluded by robots.txt (or words to that effect).

Now, this is a page that was not blocked at the time that ia_archiver spidered it, but that was blocked later. The Wayback Machine retroactively blocked access to the page based on the current robots.txt content. I searched through the documentation and couldn’t determine whether the data had actually been removed or just blocked, so I decided to alter my site’s robots.txt file, fire off a request for clarification, and see what happened.

As it turns out, several days later, they unblocked the file, and I was able to restore the missing text.

In summary, the Wayback Machine will block end-users from accessing anything that is in your current robots.txt file. If you remove the restriction from your robots.txt, it will re-enable access, but only if it had archived the page in the first place.
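If you want to check which old URLs a robots.txt file would hide before you edit it, here is a rough sketch using Python's standard `urllib.robotparser`. It only applies ordinary robots.txt matching for the `ia_archiver` user agent. It does not query the Wayback Machine, and it is not a description of how the archive works internally. The paths and robots.txt contents are made-up examples, not my actual site's files.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, user_agent="ia_archiver"):
    """Return the paths that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Hypothetical robots.txt that lists an old permalink directory.
ROBOTS_WITH_OLD_LINKS = """\
User-agent: *
Disallow: /archives/
"""

# Hypothetical robots.txt after removing the restriction.
ROBOTS_AFTER_FIX = """\
User-agent: *
Disallow:
"""

# Hypothetical old-style and new-style permalinks.
paths = ["/archives/000123.html", "/2004/05/some-review/"]

print(blocked_paths(ROBOTS_WITH_OLD_LINKS, paths))
print(blocked_paths(ROBOTS_AFTER_FIX, paths))
```

With the first file, only the old-style path comes back as blocked. With the second, nothing does, which matches the state I needed in order to get the archived copy back.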

(Originally posted as a Slashdot comment. I reposted it here several years later, and have since backdated it to the original time.)

As I’ve gone through what little Golden-Age Flash material I have access to, I’ve once again lamented that DC has not yet published a Golden Age Flash Archives Volume 2. (Volume 1 was released way back in 1999.) But in looking up info on the restoration process, I discovered a page that lists two volumes… and Amazon has a listing for volume 2, to be released on January 4, 2006! Apparently it will feature stories from Flash Comics #18-24 and All-Flash Quarterly #1-2!

Edit Oct. 10: Confirmed! Today’s DC Comics Direct Channel Special lists the archive and its contents among the books scheduled for January-February.