Help:Using the Internet Archive

From LIMSWiki
Revision as of 04:18, 26 November 2014 by Shawndouglas (talk | contribs) (Added content.)

My name is Shawn Douglas, and I'm the curator and a senior editor here at LIMSwiki. One of the major tasks we undertake with the wiki is academic and historical research and the writing of articles related to laboratory informatics, with the goal of providing useful information to those interested in the field. Traditionally, academic and historical research entailed digging through physical stacks and archives in libraries and store rooms. Books, magazines, newspapers, journals, brochures, and other grey literature have long played an important role not only in learning more about specific topics but also in reconstructing fragments of history into a coherent whole.

The advent of the Internet and improved computing and storage technology, however, have brought with them new ways to create, publish, and archive. Books, film, music, news, and more have become staples of the Internet and other computer networks around the world, their popularity spurred by cheaper, more readily available digital publishing tools. Yet just as readily as new material is being uploaded and published to the Internet in large quantities, an alarming amount of digitally published material is either replaced or removed forever. This rapid and voluminous creation and destruction of mundane and creative cultural material is forcing researchers, historians, and data preservationists of all sorts to further examine what should be archived and how it should be done. One of many important tools to evolve from this examination is the Internet Archive.

What is the Internet Archive and why is it important?

The Internet Archive is a non-profit entity with the goal of building an Internet-based library. The non-profit explains its reasons as follows[1]:

Libraries exist to preserve society's cultural artifacts and to provide access to them. If libraries are to continue to foster education and scholarship in this era of digital technology, it's essential for them to extend those functions into the digital world.

Many early movies were recycled to recover the silver in the film. The Library of Alexandria — an ancient center of learning containing a copy of every book in the world — was eventually burned to the ground. Even now, at the turn of the 21st century, no comprehensive archives of television or radio programs exist.

But without cultural artifacts, civilization has no memory and no mechanism to learn from its successes and failures. And paradoxically, with the explosion of the Internet, we live in what [Applied Minds' Chief Technology Officer] Danny Hillis has referred to as our "digital dark age."

The Internet Archive is working to prevent the Internet — a new medium with major historical significance — and other "born-digital" materials from disappearing into the past. Collaborating with institutions including the Library of Congress and the Smithsonian, we are working to preserve a record for generations to come.

External audio
Louis Armstrong & His Orchestra - "Ain't Misbehavin'"
"Ain't Misbehavin'", as performed by Louis Armstrong & His Orchestra on July 19, 1929. Retrieved 25 Nov. 2014.

The Internet Archive's digital library includes more than seven million texts, two million audio recordings, and nearly two million videos that have fallen into, or been intentionally released into, the public domain. The non-profit also has another important tool: the Wayback Machine. This tool functions as "a three-dimensional index that allows browsing of web documents over multiple time periods,"[2] and contains nearly two petabytes of archived web data. The Wayback Machine gives researchers the ability to see past iterations of a website as long as the web address is known. For example, the now-defunct X-Files website can still be viewed in its various forms dating back to 1996. (It also includes a bit of history for X-Files buffs: the original owner of the domain allegedly received legal threats from Fox in September 1997 concerning the domain.[3]) And that fact in the parentheses? I was able to add a citation to that statement thanks to the Internet Archive!

So why is the Internet Archive important? Well, hopefully the previous example of The X-Files goes some way toward illustrating the importance of the service. From someone who's researching the history of the television series to an editor working on The X-Files Wikipedia entry, having access to content that was originally published at the associated web domain, via the Wayback Machine, is particularly useful, not only for finding facts but also for citing them. LIMSwiki editors also make good use of the Wayback Machine in their research of laboratory informatics vendors and open-source software projects: many companies and projects either alter their web content or disappear from the Internet completely, with only the Internet Archive to provide clues. Finally, cultural anthropologists, data preservationists, and historians aren't the only ones discovering cultural records on the site; people from all walks of life are tapping into both "born-digital" and digitized physical materials, some of which date back multiple centuries. From a 1929 recording of Louis Armstrong & His Orchestra's "Ain't Misbehavin'" to an English dictionary published in 1720, the Internet Archive gives everyone a chance to revisit a cultural past both old and recent.

Activities

Let's learn a bit more about the Internet Archive and its offerings, performing a few activities in the process.

Internet Archive as an Internet library

Citing files from the Internet Archive

The Wayback Machine

1. Learning about the Wayback Machine

Open the following YouTube video in a new browser tab and watch it: Internet Archive's Wayback Machine (4:44)

User Willie D. explains how to use the Wayback Machine and gives one reason for using it: it's "fun."

1a. Did you find the Wayback Machine interface easy-to-use or difficult? Explain.
1b. Beyond fun, what other uses can you imagine for the Wayback Machine? State two real-world problems or activities that the Wayback Machine could be applied to, and explain how it would resolve each problem or benefit each activity.
2. Using the Wayback Machine

Imagine you're researching the history of Keane International, an IT services and solutions company. In your research you discover Keane was officially acquired by NTT DATA Corporation on January 3, 2011, and you find its former web address was http://www.keane.com/.

2a. Using the Wayback Machine and the Keane domain name, find the following pieces of information:
* the first available archive date for the domain
* the first-quarter 1996 revenues as reported to investors
* the name of the founder of the company
* the year the company acquired GE Consulting Services

Citing web pages from the Wayback Machine

On Wikipedia, this wiki, or other wikis with citation tools

On this and other wikis, citations are required, and they are added using citation templates. Here are a few unpopulated citation templates:

  • <ref name="">{{cite web |url= |format= |title= |work= |author= |publisher= |date= |accessdate=}}</ref>
  • <ref name="">{{cite book |url= |chapter= |title= |author= |pages= |publisher= |year= |edition= |volume= |isbn= |accessdate=}}</ref>
  • <ref name="">{{cite journal |url= |format= |journal= |chapter= |title= |author= |year= |volume= |issue= |pages= |pmid= |doi= |accessdate=}}</ref>
  • <ref name="">{{cite news |url= |format= |title= |author= |agency= |publisher= |newspaper= |pages= |location= |date= |accessdate=}}</ref>

This guide isn't dedicated to showing you how to create citations; consult the advanced training section of the MediaWiki training guide to learn more. That said, citing an archived webpage is straightforward. Using any of these templates, you'll need to add two additional parameters: "archiveurl" and "archivedate". Let's use the November 25, 2005 version of the Keane website as an example:

<ref name="KeaneArch05">{{cite web |url=http://www.keane.com/ |archiveurl=https://web.archive.org/web/20051125010336/http://www.keane.com/ |title=Welcome to Keane |work=Keane.com |author= |publisher=Keane International |date= |archivedate=25 November 2005 |accessdate=25 November 2014}}</ref>

In this case we added the "archiveurl", which is the URL of the archived page as shown in the browser's address bar, and the "archivedate", which is the archive date we selected from the Wayback Machine interface. Note that if you forget the "archivedate" parameter, the system will show an error message in the citations stating "Error: If you specify |archiveurl=, you must also specify |archivedate=".
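If you cite archived pages often, the two extra parameters can be derived from the archive URL itself. The following is a minimal Python sketch, not part of any wiki tooling: it assumes the archive URL follows the pattern seen in the example above (a 14-digit capture timestamp followed by the original URL), and the function names `split_wayback_url` and `cite_web` are hypothetical names chosen for illustration.

```python
import re
from datetime import datetime

# Assumed shape of a Wayback Machine capture URL, based on the example above:
# https://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def split_wayback_url(archive_url):
    """Return (original_url, capture_datetime) for a capture URL."""
    match = WAYBACK_RE.match(archive_url)
    if match is None:
        raise ValueError("not a recognised Wayback capture URL: " + archive_url)
    captured = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return match.group(2), captured


def cite_web(ref_name, archive_url, title, work, publisher, access_date):
    """Build a populated cite web template (hypothetical helper)."""
    original_url, captured = split_wayback_url(archive_url)
    archive_date = f"{captured.day} {MONTHS[captured.month - 1]} {captured.year}"
    fields = [
        ("url", original_url),
        ("archiveurl", archive_url),
        ("title", title),
        ("work", work),
        ("author", ""),        # left empty, as in the example above
        ("publisher", publisher),
        ("date", ""),          # left empty, as in the example above
        ("archivedate", archive_date),
        ("accessdate", access_date),
    ]
    body = " ".join(f"|{key}={value}" for key, value in fields)
    return '<ref name="' + ref_name + '">{{cite web ' + body + "}}</ref>"


if __name__ == "__main__":
    print(cite_web(
        "KeaneArch05",
        "https://web.archive.org/web/20051125010336/http://www.keane.com/",
        "Welcome to Keane", "Keane.com", "Keane International",
        "25 November 2014",
    ))
```

Run as a script, this should reproduce the Keane citation shown above. Always check the generated "archivedate" against the date displayed in the Wayback Machine interface before saving the citation.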

3. Making a citation in a wiki

You've found the 1994 shareholder equity number for IBM using the Wayback Machine: $23,413,000,000. The source is this page: https://web.archive.org/web/19961026210557/http://www.ibm.com/IBM/ibmar94/finance4.2.html

Create a full wiki citation for this archived webpage using the guidance above.

In a research paper

MLA: This commentary on MLA formatting comes directly from the Internet Archive website[4]:

This question is a newer one. We asked MLA to help us with how to cite an archived URL in correct format. They did say that there is no established format for resources like the Wayback Machine, but it's best to err on the side of more information. You should cite the webpage as you would normally, and then give the Wayback Machine information. They provided the following example: McDonald, R. C. "Basic Canary Care." _Robirda Online_. 12 Sept. 2004. 18 Dec. 2006 [http://www.robirda.com/cancare.html]. _Internet Archive_. [http://web.archive.org/web/20041009202820/http://www.robirda.com/cancare.html]. They added that if the date that the information was updated is missing, one can use the closest date in the Wayback Machine. Then comes the date when the page is retrieved and the original URL. Neither URL should be underlined in the bibliography itself. Thanks MLA!

This information may be a bit outdated: as of 2012, the MLA Handbook states that web addresses are not required. It does prescribe a particular format, however, if your professor requires a URL: enclose it in angle brackets (< and >). That said, I'd probably format the example quoted above as:

McDonald, R. C. "Basic Canary Care." Robirda Online. 12 Sept. 2004. 25 Nov. 2014. <http://www.robirda.com/cancare.html>. Internet Archive. <http://web.archive.org/web/20041009202820/http://www.robirda.com/cancare.html>.

Note the formatting: author, last name first; title of the page, in quotation marks; title of the website; update date; access date; original URL; source of archived URL; archived URL. For more on MLA formatting, see the Purdue OWL MLA Style Guide.
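To make the order of the pieces easier to see, here is a minimal Python sketch that assembles the MLA-style example above from its parts. It only mirrors the formatting I suggested and is not an official MLA tool; the `ArchivedPage` class and the `mla_citation` function are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class ArchivedPage:
    """The pieces of information used in the MLA example above."""
    author: str        # last name first, e.g. "McDonald, R. C."
    page_title: str
    site_title: str
    updated: str       # already written MLA-style, e.g. "12 Sept. 2004"
    accessed: str      # already written MLA-style, e.g. "25 Nov. 2014"
    original_url: str
    archive_url: str


def mla_citation(page):
    """Join the pieces in the order used in the example above."""
    author = page.author.rstrip(".") + "."  # ensure exactly one trailing period
    return (
        f'{author} "{page.page_title}." {page.site_title}. '
        f"{page.updated}. {page.accessed}. <{page.original_url}>. "
        f"Internet Archive. <{page.archive_url}>."
    )


if __name__ == "__main__":
    canary = ArchivedPage(
        author="McDonald, R. C.",
        page_title="Basic Canary Care",
        site_title="Robirda Online",
        updated="12 Sept. 2004",
        accessed="25 Nov. 2014",
        original_url="http://www.robirda.com/cancare.html",
        archive_url="http://web.archive.org/web/20041009202820/"
                    "http://www.robirda.com/cancare.html",
    )
    print(mla_citation(canary))
```

Keeping the dates as plain text is a deliberate simplification: MLA month abbreviations are irregular, so the sketch leaves that formatting to you.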

APA: How to make the same citation in APA style is less clear, and I haven't found authoritative guidance. I've found at least one professor who states "just cite the web page where you found your information."[5] I would tend to agree, simply using the archive URL rather than the original:

McDonald, R. C. (2004, September 12). Basic Canary Care. Retrieved from http://web.archive.org/web/20041009202820/http://www.robirda.com/cancare.html

Note the formatting: author, last name first; update date, in parentheses; title of the page; "Retrieved from" followed by the archived URL. For more on APA formatting, see the Purdue OWL APA Style Guide.
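The same idea works for the APA-style example. This is a minimal Python sketch of the format I suggested above, not an official APA tool; `apa_citation` is a hypothetical function name used for illustration.

```python
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def apa_citation(author, updated, page_title, archive_url):
    """Assemble author, (year, month day), title, and the archived URL."""
    author = author.rstrip(".") + "."  # ensure exactly one trailing period
    when = f"{updated.year}, {MONTHS[updated.month - 1]} {updated.day}"
    return f"{author} ({when}). {page_title}. Retrieved from {archive_url}"


if __name__ == "__main__":
    print(apa_citation(
        "McDonald, R. C.",
        date(2004, 9, 12),
        "Basic Canary Care",
        "http://web.archive.org/web/20041009202820/"
        "http://www.robirda.com/cancare.html",
    ))
```

Only the archived URL is passed in, which matches the approach of citing the page where the information was actually found.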

4. Creating MLA and APA citations

Use one of the Wayback webpages you encountered for Keane.com from activity two. State the fact you are citing (one of the bullet point items) and include both an MLA and APA citation of that webpage.

Associated help pages

External links

References

  1. "About the Internet Archive". Internet Archive. https://archive.org/about/. Retrieved 24 November 2014. 
  2. "Frequently Asked Questions - The Wayback Machine". Internet Archive. https://archive.org/about/faqs.php#The_Wayback_Machine. Retrieved 24 November 2014. 
  3. Mitbo, Dale (September 1997). "Where's WWW.XFILES.COM?". Archived from the original on 10 February 1998. https://web.archive.org/web/19980210080348/http://www.xfiles.com/xfiles.htm. Retrieved 25 November 2014. 
  4. "How do I cite Wayback Machine urls in MLA format?". The Wayback Machine FAQ. The Internet Archive. https://archive.org/about/faqs.php#265. Retrieved 25 November 2014. 
  5. Tilleman, Doron; Pettigrew, Tonya. "How should I cite an archived version of a web page in APA style?". Quora.com. http://www.quora.com/How-should-I-cite-an-archived-version-of-a-web-page-in-APA-style. Retrieved 25 November 2014.