I was trying to access a particular website in the Wayback Machine, but it looks like it's completely broken because it's React-based and some of the key JavaScript wasn't archived. As a result, the site doesn't load at all - it's an entirely blank page.

Starting to worry about how many pages are going to end up like this. JS-free fallbacks aren't just about people who turn JS off in their browsers!
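
One rough way to see whether a page has any JS-free fallback is to look at what its static HTML contains before any script runs. Below is a minimal sketch using Python's standard `html.parser`; the class name `StaticSnapshot`, the helper `snapshot_of`, and the sample markup and script path are all made up for illustration, and this is not how the Wayback Machine itself works.

```python
from html.parser import HTMLParser


class StaticSnapshot(HTMLParser):
    """Collects what a crawler that does NOT run JavaScript would see."""

    # Text inside these elements is not rendered as page content.
    NON_CONTENT = {"script", "style", "template", "title"}

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0
        self.text = []      # visible text fragments
        self.scripts = []   # external scripts the page depends on

    def handle_starttag(self, tag, attrs):
        if tag in self.NON_CONTENT:
            self.hidden_depth += 1
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.scripts.append(src)

    def handle_endtag(self, tag):
        if tag in self.NON_CONTENT and self.hidden_depth > 0:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.hidden_depth == 0 and data.strip():
            self.text.append(data.strip())


def snapshot_of(html):
    """Parse an HTML string and return the filled-in StaticSnapshot."""
    snap = StaticSnapshot()
    snap.feed(html)
    snap.close()
    return snap


# Hypothetical React-style shell: an empty mount point plus one bundle.
react_shell = """<html><head><title>App</title>
<script src="/static/js/main.3f2a.js"></script></head>
<body><div id="root"></div></body></html>"""

shell = snapshot_of(react_shell)
print("visible text:", shell.text)
print("scripts that must be archived too:", shell.scripts)
```

If `text` comes back empty, the page is blank without its scripts, so every URL in `scripts` (and whatever those scripts fetch) has to be captured for the archived copy to render.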

@misty I don't have NoScript on this browser, it seems, but I wonder how sites like search.lib.umich.edu/articles? (built primarily with JavaScript) will work in these kinds of cases too... It was a trend I saw at the Big Ten Academic Alliance library conference.

@platypus @misty In the meantime it looks like most bots *can* do JavaScript :(

But accidents like this do and will happen, alas.


@saper @platypus @misty javascript perhaps, but are they also archiving whatever data the JS pulls down from the backend service? then I guess the archive bot would need to wait until all that's loaded and *then* scrape the HTML?

it's a bit of a weasel phrase to say "accidents like this do and will happen" when this is the technology (fat client webapps) under criticism functioning properly.

@wrl @platypus @misty I think the problem is replicating browser behaviour exactly. It can be complicated to figure out which JS should still be loaded. Probably impossible to do properly unless a full-blown rendering engine like Gecko is used.

I am fully with you guys that this is wrong. I believe that webpages should be simple and downloadable.

In the meantime though, tools like Googlebot learned JavaScript to deal with this... sad, but that is an adjustment to reality, alas.
