Website / Internet Archive

The Internet Archive is by far the largest digital library and web archiving organization ever made. Their stated mission is to provide universal access to all knowledge, and in order to do that, they have created both a digital library, which is a public file hosting system with content submitted by both the community and the site's staff, and, more notably, the Wayback Machine.

The Wayback Machine, which is by far their most famous feature, is essentially a tool that allows people to see past, archived versions of web pages — in other words, it's the browser version of a Time Machine.

Here's how it works: after going to the Internet Archive's front page, users can paste a URL into the input box below the Wayback Machine logo and press "Enter". After that, the user is shown a calendar-like list of archived pages (provided there are any). Dates written on blue dots are links to versions of that particular page archived on that particular date. If a dot is orange, however, then it means that the URL was not found at the time of the snapshot, usually indicating the site was already gone by that time. Green dots, on the other hand, indicate that the URL led to a redirect.

It's also possible to use the WM to instantly archive any given webpage: simply go to the address[url_of_the_webpage] to save the newest version in the archive.
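A minimal sketch of building such a "save" address in Python. The prefix `https://web.archive.org/save/` is an assumption here, since it is not given in the text above, and `build_save_url` is a hypothetical helper name. The function only builds the address; it does not contact the Internet Archive.

```python
# Assumed prefix for the Wayback Machine's "save this page" address.
# This value is not stated in the article text; treat it as an assumption.
SAVE_PREFIX = "https://web.archive.org/save/"


def build_save_url(page_url: str) -> str:
    """Return the address to visit in order to archive page_url."""
    page_url = page_url.strip()
    # Only full web addresses make sense here.
    if not page_url.startswith(("http://", "https://")):
        raise ValueError("expected a URL starting with http:// or https://")
    return SAVE_PREFIX + page_url


if __name__ == "__main__":
    print(build_save_url("https://example.com/"))
```

Opening the printed address in a browser would then, under that assumption, request a fresh snapshot of the page.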

That being said, though, the Wayback Machine is not 100% reliable. Sometimes, the particular page or image you remember most fondly will turn out to be missing from the archives either due to not many sites linking to it or, more commonly, due to it having a structure that's difficult to archive. Websites like DeviantArt as well as several Web Comic sites are notorious for being nigh-impossible to archive, meaning that once they're gone, they're, well, gone.

The Wayback Machine also used to follow the robots exclusion standard, so if your favorite website (say, like FanFiction.Net, which actually did block its contents from being archived) blocked the Wayback Machine from saving it in its robots.txt file, then it and its content became inaccessible to the public. Infuriatingly enough, if the domain was taken over by a cybersquatter who then implemented a robots file, it also blocked you from seeing the earlier, legitimate versions of the website. However, the abuse of robots.txt by these cybersquatters led to so many defunct websites losing their previous archives that the Internet Archive themselves decided in April 2017 to stop broadly honoring the standard, now requiring explicit requests for exclusion. They also stopped honoring the standard for crawling and displaying U.S. government and military websites from December 2016 onwards (a month before Donald Trump took office as President of the United States).
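To make the robots.txt mechanism concrete, here is a small sketch using Python's standard `urllib.robotparser` module. It shows how a single exclusion rule tells one named crawler to stay out of an entire site while leaving other crawlers unaffected. The user-agent name `ia_archiver` and the sample file are illustrative assumptions, not something taken from the text above, and `is_blocked` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: one crawler (assumed name "ia_archiver")
# is told not to fetch anything on the site.
EXAMPLE_ROBOTS = """\
User-agent: ia_archiver
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/old-page.html"
    print(is_blocked(EXAMPLE_ROBOTS, "ia_archiver", page))
    print(is_blocked(EXAMPLE_ROBOTS, "SomeOtherBot", page))
```

Because the rule lives on the current version of the domain, whoever controls the domain today controls the rule, which is why a cybersquatter's robots file could hide older, legitimate snapshots while the standard was still being honored.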

The IA also takes no chances with the law, and so all requests by the copyright owners to remove data from the Wayback Machine are immediately obeyed.

As for the Internet Archive's digital library, although it is by comparison a lesser-known feature, it is by no means lacking in content. It is not only where the Internet Archive hosts the films, photos, and books they've digitized (they are a library, after all), but is also widely used by archival initiatives such as the Archive Team, as well as by various users eager to either submit cool videos and such they found throughout the web, or upload their content somewhere secure. In fact, even we at TV Tropes are beginning to enjoy using this feature, as it allows us to post important videos, webcomics, and such somewhere they are guaranteed to stay (unlike, say, Google Drive or MediaFire). It also helps that image files hosted on the IA can be viewed through their "preview" feature. This makes the Internet Archive even more useful for archiving things like webcomics, as they can be read there in a surprisingly readable format without having to download them.

Oh, and yes, the Internet Archive is also behind archive-it (not related to Archive Team), which is a paid subscription service that lets you run crawler projects of your own, which comes in handy if the site you want preserved isn't archived by the Wayback Machine for one reason or the other.

The Internet Archive provides examples of:

  • April Fools: This video, which was uploaded into the Internet Archive's digital library (presumably) by its authors through the Ourmedia project, shows two guys enacting some common April Fools pranks.
  • Bad Future: As part of their 25th anniversary in 2021, the website created the "Wayforward Machine" as a satirical depiction of what the internet might look like in another 25 years if efforts by powerful interests to restrict access to information aren't stopped, depicting websites full of censorship, spying, and paywalls.
  • Great Big Library of Everything: As of October 2016, the Internet Archive's collection of books, videos, images, and websites has topped 15 Petabytes and continues to grow. For some context, the human brain can hold roughly 25 Petabytes worth of data (that's ten times the findings of previous studies). This means that if you were to memorize their entire database, it would take up well over half of your brain.
  • Information Wants to Be Free: As explained by this speech by none other than the site's founder himself, this is essentially the entire ideology behind the Internet Archive, and it shows. For example, once you upload something to their digital library, it technically doesn't belong to you anymore; this means not only that it will stay up even if you delete your account, but also that you can only delete it through a formal request, which can be declined. This is Averted, however, whenever Copyright is brought into the conversation, as the IA does not take any chances with the law and will remove anything from their archives upon a DMCA request.
    • This trope is also the underpinning of several of their community-maintained collections' works. The aforementioned (see April Fools above) Ourmedia project, for example, strives to give its members a way to share and preserve their amateur, possibly humorous, and otherwise endangered online works and such.
  • Mind Screw: This comic, which, according to its description, was uploaded into the IA's library by its authors because they thought it was So Bad, It's Good, can cause this effect if read through the preview feature. Its image files, although numbered, are out of order just enough to mess up your understanding of the comic's already confusing plot, but still orderly enough for you to see (or make up) at least some connection between the pages.
  • Rules Lawyer: There is a small, little-known provision in US Copyright law (section 108(h)) that dictates that libraries can make available, within their archives, copies of books whose Copyright would originally have expired (but has been repeatedly extended). This law was made as a counteraction to the Mickey Mouse Protection Act (which allowed companies to extend Copyright terms indefinitely) and was thought up years before The Internet was even conceived. Several decades later, in October 2017, the Internet Archive used this long-forgotten rule to make essentially all books published between 1923 and 1942 available within a collection in their digital library, and suggested that other libraries should do the same. To add insult to injury, they also named said collection after Sonny Bono, the man behind the law that made the provision behind all of this a necessity in the first place.
  • Screw This, I'm Outta Here: Due to fears of Internet censorship in the US, a Canadian mirror of the Internet Archive was made and is being kept up to date so the site can invoke this trope if necessary.
  • Shout-Out: Yes, their famous Wayback Machine function does take its name from the WABAC time machine from Peabody's Improbable History. And now you know.
  • Sudden Game Interface: This humorous, user-submitted GIF gets the infamous footage of George W. Bush narrowly dodging a shoe and adds to it an RPG-like interface.


Alternative Title(s): Wayback Machine