[[quoteright:300:https://static.tvtropes.org/pmwiki/pub/images/internetarchive_1.png]]
[[https://archive.org The Internet Archive]] is by far ''the'' largest digital library and web archiving organization ever made. Their stated mission is to provide [[InformationWantsToBeFree universal access to all knowledge]], and to that end they have created both a digital library, which is a public file hosting system with content submitted by both the community and the site's staff, and, more notably, the Wayback Machine.

The Wayback Machine, which is by far their most famous feature, is essentially a tool that allows people to see past, archived versions of web pages -- in other words, it's the browser version of a TimeMachine.

Here's how it works: after going to the Internet Archive's front page, users can paste a URL into the input box below the Wayback Machine logo and press "Enter". After that, the user is shown a calendar-like list of archived pages (provided there are any). Dates marked with blue dots link to versions of that particular page archived on that particular date. If a dot is orange, however, it means that the URL was not found at the time of the snapshot, usually indicating the site was already gone by then. Green dots, on the other hand, indicate that the URL led to a redirect.
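
The color scheme described above can be summed up in a few lines of Python. This is only an illustrative sketch: the function name `calendar_dot_color` is made up, and the mapping from HTTP status ranges to colors (2xx for blue, 3xx for green, 4xx for orange) is an assumption based on the description here, not something taken from the Wayback Machine's own code.

```python
def calendar_dot_color(status_code: int) -> str:
    """Map a capture's HTTP status code to the calendar dot color.

    Assumed ranges (not confirmed by the Internet Archive):
    2xx -> blue, 3xx -> green, 4xx -> orange.
    """
    if 200 <= status_code < 300:
        return "blue"    # the page was archived normally
    if 300 <= status_code < 400:
        return "green"   # the URL led to a redirect
    if 400 <= status_code < 500:
        return "orange"  # the URL was not found at capture time
    return "unknown"     # anything else is outside what this page describes


if __name__ == "__main__":
    for code in (200, 301, 404):
        print(code, calendar_dot_color(code))
```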

It's also possible to use the Wayback Machine to instantly archive any given webpage: simply go to the address ''[=https://web.archive.org/save/[url_of_the_webpage]=]'' to save the newest version in the archive.
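
For those who would rather build these addresses with a script, here is a minimal Python sketch. It is not an official client: the helper names `save_url` and `capture_url` are hypothetical, and the second pattern (`/web/<timestamp>/<url>`) is inferred from the shape of capture links such as the one in the note at the bottom of this page. Nothing here contacts the network; it only builds strings.

```python
WAYBACK = "https://web.archive.org"


def save_url(page_url: str) -> str:
    """Address that asks the Wayback Machine to archive page_url right now."""
    return f"{WAYBACK}/save/{page_url}"


def capture_url(page_url: str, timestamp: str) -> str:
    """Address of one capture; timestamp is digits in YYYYMMDDhhmmss order.

    The pattern is an assumption inferred from existing capture links.
    """
    if not timestamp.isdigit():
        raise ValueError("timestamp must contain only digits")
    return f"{WAYBACK}/web/{timestamp}/{page_url}"


if __name__ == "__main__":
    print(save_url("https://tvtropes.org/"))
    print(capture_url("https://tvtropes.org/", "20171015235345"))
```

Opening the first printed address in a browser is what triggers the save; the script itself archives nothing.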

That being said, the Wayback Machine is not 100% reliable. Sometimes the particular page or image you remember most fondly will turn out to be missing from the archives, either because not many sites linked to it or, more commonly, because its structure is difficult to archive. Websites like Website/DeviantArt as well as several WebComic sites are notorious for being nigh-impossible to archive, meaning that once they're gone, they're, well, ''gone''.

The Wayback Machine also ''used'' to follow the robots exclusion standard, so if your favorite website blocked the Wayback Machine in its robots.txt file (as Website/FanFictionDotNet actually ''did''), then it and its content became inaccessible to the public. Infuriatingly enough, if the domain was later taken over by a cybersquatter who then added a robots.txt file, that also blocked you from seeing the earlier, legitimate versions of the website. This abuse of robots.txt led to so many defunct websites losing their previous archives that the Internet Archive decided in April 2017 to stop broadly honoring the standard, and it now requires explicit requests for exclusion. They had already stopped honoring the standard for crawling and displaying U.S. government and military websites in December 2016.
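
To make the robots.txt mechanism concrete, here is a small Python sketch using the standard library's `urllib.robotparser`. The robots.txt contents, the `example.com` address, and the helper name `is_allowed` are invented for illustration, and `ia_archiver` is assumed here to be the user-agent name the Internet Archive's crawler answered to; treat that as an assumption rather than a documented fact.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out one crawler and allows everyone else.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/stories/1"
    print("ia_archiver:", is_allowed(ROBOTS_TXT, "ia_archiver", page))
    print("SomeOtherBot:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

A crawler that honors the standard runs a check like this before every fetch, which is why a single `Disallow: /` line added by a new domain owner was enough to hide a site's whole history.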

The IA also takes no chances with the law, so all requests by copyright owners to remove data from the Wayback Machine are immediately obeyed.

As for the Internet Archive's digital library, although it is a lesser-known feature by comparison, it is by no means lacking in content. Not only is it where the Internet Archive hosts the films, photos, and books they've digitized (they ''are'' a library, after all), but it's also widely used by archival initiatives such as the Archive Team, as well as by users eager to either submit cool videos and such that they found around the web or upload their own content somewhere secure. In fact, even ''we'' at TV Tropes are beginning to enjoy using this feature, as it allows us to post important videos, webcomics, and such somewhere they are far more likely to stay (unlike, say, Google Drive or [=MediaFire=]). It also helps that image files hosted on the IA can be viewed through its "preview" feature; this makes the Internet Archive even ''more'' useful for archiving things like webcomics, as they can be read there in a surprisingly readable format without actually having to download them.

Oh, and yes, the Internet Archive is also behind [[https://archive-it.org/ archive-it]] ('''not''' related to Archive Team), a paid subscription service that lets you run crawler projects of your own, which comes in handy if the site you want preserved isn't archived by the Wayback Machine for one reason or another.
----
!!The Internet Archive provides examples of:
* AprilFools: [[https://archive.org/details/April_Fools This]] video, which was (presumably) uploaded to the Internet Archive's digital library by its authors through the [[https://archive.org/details/ourmedia Ourmedia]] project, shows two guys enacting some common April Fools pranks.
* BadFuture: As part of their 25th anniversary in 2021, the website created the "[[https://wayforward.archive.org/ Wayforward Machine]]", a satirical depiction of what the internet might look like in another 25 years if efforts by powerful interests to restrict access to information aren't stopped: websites full of censorship, spying, and paywalls.
%%* FilmNoir: [[https://archive.org/details/Film_Noir&tab=about This]] small, user-maintained collection is dedicated to making copies of such films from the 40s and 50s available through the Internet Archive. As of 2017, they had collected about 97 films, among them ''Film/TheStranger'', ''Film/ScarletStreet'', ''Film/DeadOnArrival'', and ''Film/{{Suddenly}}''.
* GreatBigLibraryOfEverything: As of October 2016, the Internet Archive's collection of books, videos, images, and websites has [[https://blog.archive.org/2016/10/23/defining-web-pages-web-sites-and-web-captures/ topped 15 Petabytes]] and continues to grow. For some context, the human brain can hold roughly [[https://www.scientificamerican.com/article/new-estimate-boosts-the-human-brain-s-memory-capacity-10-fold/ 25 Petabytes worth of data]] (that's ten times the findings of [[https://www.scientificamerican.com/article/what-is-the-memory-capacity/ previous studies]]). This means that if you were to memorize their entire database, it would take up about 60% of your brain.
** Oh, and, if you're wondering, yes, the Internet Archive ''is'' an actual, official library -- [[https://archive.org/iathreads/post-view.php?id=121377 it was classified as such by the state of California in 2007]].
* InformationWantsToBeFree: As explained in [[https://archive.org/details/SDForumBK this]] speech by none other than [[http://en.wikipedia.org/wiki/Brewster%20Kahle the site's founder]] himself, this is essentially the entire ideology behind the Internet Archive -- and it shows. For example, once you upload something into their digital library, it technically doesn't belong to you anymore; this means that not only will it stay up even if you delete your account, but also that you can only delete it through a formal request, which can be declined. This is {{Averted}}, however, whenever MediaNotes/{{copyright}} enters the conversation, as the IA does ''not'' take any chances with the law and will remove anything from their archives upon a DMCA request.
** This trope is also the underpinning of several of their community-maintained collections. The aforementioned (see AprilFools above) [[https://archive.org/details/ourmedia Ourmedia]] project, for example, strives to give its members a way to share and preserve their amateur, possibly humorous, and otherwise endangered online works and such.
* MindScrew: [[https://archive.org/details/thediamondclub This]] comic, which, according to its description, was uploaded into the IA's library by its authors because they thought it was SoBadItsGood, can cause this effect if read through the preview feature, as its image files, although numbered, are out of order just enough to mess up your understanding of the comic's already confusing plot, but still orderly enough for you to see (or [[DeathOfTheAuthor make up]]) at least ''some'' connection between the pages.
* RulesLawyer: There is a small, little-known provision in US copyright law (section [[https://www.law.cornell.edu/uscode/text/17/108 108(h)]]) that lets libraries make copies of published works available during the last twenty years of their copyright term, provided those works are not being commercially sold. The provision was written into the [[http://heinonline.org/HOL/LandingPage?handle=hein.journals/uclalr48&div=35&id=&page= Mickey Mouse Protection Act]] of 1998 (which extended copyright terms by another twenty years) as a counterweight to that extension. Nearly two decades later, in October 2017, [[https://blog.archive.org/2017/10/10/books-from-1923-to-1941-now-liberated/ the Internet Archive used this long-forgotten rule]] to make books published between 1923 and 1941 that are no longer being sold available within [[https://archive.org/details/last20&tab=about a collection]] in their digital library, and suggested that other libraries do the same. To add insult to injury, they also named said collection after [[https://en.m.wikipedia.org/wiki/Sonny_Bono Sonny Bono]], the man behind the law that [[{{Irony}} made the provision behind all of this a necessity in the first place]].
* ScrewThisImOuttaHere: Due to fears of Internet censorship in the US, a [[http://www.motherjones.com/politics/2016/12/internet-freedom-wayback-machine-moving-copy-to-canada-donald-trump/ Canadian mirror of the Internet Archive was made]] and is being kept up to date so the site can invoke this trope if necessary.
* ShoutOut: Yes, their famous Wayback Machine ''does'' get its name from the WABAC time machine from ''[[WesternAnimation/RockyAndBullwinkle Peabody's Improbable History]]''. [[AndKnowingIsHalfTheBattle And now you know]].
* SuddenGameInterface: [[https://archive.org/details/Shoe-tosserGuyGif1 This]] humorous, user-submitted GIF takes the infamous footage of UsefulNotes/GeorgeWBush [[https://en.m.wikipedia.org/wiki/Bush_shoeing_incident narrowly dodging a shoe]] and adds an RPG-like interface to it.
%%* VlogSeries: Similarly to the SpeedRun collection noted above, the Internet Archive also has an entirely user submitted [[https://archive.org/details/vlogs&tab=about collection]] of videos from several notable vloggers.
----
[[labelnote:Fun fact time!]]If you're wondering why this page was created so recently, don't fret! We actually used to have a page on the Wayback Machine itself back in the day, but we then decided to turn it into a redirect to this new and improved page as per the [[https://tvtropes.org/pmwiki/posts.php?discussion=14116047640A25319600&page=1 Websites cleanup]] project. Though you can still see what the page used to look like by -- you guessed it -- [[{{Irony}} checking it on the Wayback Machine]]. We recommend you go to [[https://web.archive.org/web/20171015235345/https://tvtropes.org/pmwiki/pmwiki.php/Website/WaybackMachine this]] capture that was submitted by us tropers just before we deleted it.[[/labelnote]]
