Monday, 20 September 2021

The problem of link rot

I recently watched a 2016 library science webinar on YouTube which introduced me to a term that was brand new to me: "link rot" (Choice Media Channel, 2016).

Link rot is where we click on a link in a journal article and get a 404 message, or find a "lack of specific content (resources) at the indicated internet address" (Król & Zdonek, 2020, p. 21). This happens because the resource, the website, or the journal has been moved, merged, or changed. The broader problem, which Klein et al. (2014) call reference rot, has two elements: the link itself being broken (link rot), or the content having been changed (content drift) (Klein et al., 2014; Król & Zdonek, 2020; Zhou et al., 2015).
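To make the difference between a broken link and changed content a bit more concrete, here is a minimal Python sketch. It is my own illustration, not a method from any of the papers cited: the function names `fingerprint` and `classify_reference` are hypothetical, and it assumes we saved a hash of the page at the time we cited it. Comparing exact hashes is also a crude test, because any change at all (even a new advert) would count as drift.

```python
import hashlib
from typing import Optional


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hash of the page content, saved when we cite it."""
    return hashlib.sha256(content).hexdigest()


def classify_reference(
    status_code: int,
    current_content: Optional[bytes],
    archived_fingerprint: str,
) -> str:
    """Label a cited link as 'link rot', 'content drift', or 'intact'.

    status_code: the HTTP status we got back today (e.g. 200 or 404).
    current_content: the page body we got back today, or None if nothing came back.
    archived_fingerprint: the hash we stored when we first cited the page.
    """
    # A 4xx/5xx status (such as 404) or no content at all: the link is broken.
    if status_code >= 400 or current_content is None:
        return "link rot"
    # The link still works, but the page is no longer what we cited.
    if fingerprint(current_content) != archived_fingerprint:
        return "content drift"
    return "intact"


# Example with made-up page contents (no network access needed).
cited = fingerprint(b"Original article text")
print(classify_reference(404, None, cited))                       # link rot
print(classify_reference(200, b"Rewritten article text", cited))  # content drift
print(classify_reference(200, b"Original article text", cited))   # intact
```

In practice the status code and content would come from actually fetching the URL; I have left that part out so the sketch runs anywhere.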

The grey block in the image accompanying this article shows the URLs cited in scholarly work; the yellow block shows the number of links which could not be found a year later (Choice Media Channel, 2016, citing Klein et al., 2014, p. 27). 

In 2013, Hennessey and Ge analysed the Thomson Reuters Web of Science citation index, which included 15,000 websites. The researchers found the average website "lifespan [...] was 9.3 years", with only 62% having been properly archived (Król & Zdonek, 2020). Work by Zhou et al. (2015) indicated that the archive rate is actually 72%. The actual archival rate is probably somewhere in between. But that still means that we are only archiving roughly two-thirds to three-quarters of our academic sources.

That means we are at risk of losing a significant portion of academic data. In fact, some academics have dubbed our times the 'digital dark ages' (Król & Zdonek, 2020). We have the potential to lose our digital resources at an alarming rate. I personally download items and keep them, but that only works as long as my PC can still read the materials I have. What happens to my 15,000+ PDFs when the PDF format changes? Eeek....!

Link rot is one of the reasons that the Digital Object Identifier (DOI) system was set up: "the (DOI) was introduced to persistently identify journal articles. In addition, the DOI resolver for the URI version of DOIs was introduced to ensure that web links pointing at these articles remain actionable, even when the articles change web location" (Klein et al., 2014, p. 3). A dedicated link can be redirected fairly easily to a new version of the file, or to a new location (Klein et al., 2014). I have written on the DOI system before (here), but had not realised that link rot was one of the drivers for the founding of the DOI system.
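As a small illustration of the "URI version of DOIs", the sketch below turns a bare DOI into a link that goes through the https://doi.org/ resolver, which is the form used in the reference list of this post. The function name `doi_to_url` and the list of prefixes it strips are my own assumptions for the example, not part of any official DOI tooling.

```python
from urllib.parse import quote

# The public DOI resolver; links built on it keep working when the article moves.
RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a DOI such as '10.1371/journal.pone.0115253' into a resolver link."""
    cleaned = doi.strip()
    # Strip common ways a DOI gets written down (assumed list, not exhaustive).
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # DOI names start with the directory indicator "10.".
    if not cleaned.startswith("10."):
        raise ValueError(f"Not a DOI: {doi!r}")
    # Percent-encode unusual characters, but keep the '/' between prefix and suffix.
    return RESOLVER + quote(cleaned, safe="/")


print(doi_to_url("doi:10.1371/journal.pone.0115253"))
# https://doi.org/10.1371/journal.pone.0115253
```

The point of the design is that the citation stores only the identifier; where the article actually lives is looked up by the resolver at the moment we click.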

We learn something new every day :-)


Sam

References:

  • Choice Media Channel (22 April 2016). Reimagining the Academic Library. https://youtu.be/8-SYUslsVfg
  • Klein, M., Van de Sompel, H., Sanderson, R., Shankar, H., Balakireva, L., Zhou, K., & Tobin, R. (2014). Scholarly context not found: one in five articles suffers from reference rot. PLoS ONE, 9(12), e115253. https://doi.org/10.1371/journal.pone.0115253
  • Król, K., & Zdonek, D. (2020). Peculiarity of the bit rot and link rot phenomena. Global Knowledge, Memory and Communication, 69(1/2), 20-37. https://doi.org/10.1108/GKMC-06-2019-0067
  • Zhou, K., Grover, C., Klein, M., & Tobin, R. (2015). No more 404s: predicting referenced link rot in scholarly articles for pro-active archiving. Paper presented at JCDL '15: Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, 21-25 June 2015, Knoxville, Tennessee, USA. https://doi.org/10.1145/2756406.2756940
