Could this paragraph be out of date regarding HTTP URI persistence?

It seems to me that this paragraph overlooks significant progress regarding HTTP URI persistence that has been achieved in the past 5 years. Web archives combined with the Memento protocol and associated infrastructure provide a rather impressive level of persistence for HTTP URIs:

  1. Web archives around the world contain archival snapshots of HTTP-URI-identified resources.

  2. The Memento protocol and associated infrastructure allow following HTTP URIs into these web archives in an interoperable manner. This works both for HTTP URIs that point at resources still on the live web (as a means of accessing one of their prior states) and for resources that have vanished from the live web. Check out this 1.15 minute video.
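To make the mechanism concrete, here is a minimal Python sketch of Memento datetime negotiation: the client sends the original URI to a TimeGate together with an `Accept-Datetime` header, and the TimeGate redirects to the snapshot closest to that datetime. The function names are my own, and using the Internet Archive's `https://web.archive.org/web/` prefix as the TimeGate is an assumption made for illustration; any Memento-compliant TimeGate should work in its place.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
import urllib.request

# Assumption: this prefix behaves as a Memento TimeGate for the original URI.
DEFAULT_TIMEGATE = "https://web.archive.org/web/"


def timegate_request(original_uri, when, timegate=DEFAULT_TIMEGATE):
    """Return (url, headers) for a Memento datetime-negotiation request."""
    # Accept-Datetime carries an HTTP-date expressed in GMT.
    when_utc = when.astimezone(timezone.utc)
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    return timegate + original_uri, headers


def resolve_memento(original_uri, when, timegate=DEFAULT_TIMEGATE):
    """Follow the TimeGate redirect; return (snapshot URL, Memento-Datetime)."""
    url, headers = timegate_request(original_uri, when, timegate)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:  # performs network I/O
        return response.geturl(), response.headers.get("Memento-Datetime")


if __name__ == "__main__":
    wanted = datetime(2015, 6, 1, 12, 0, tzinfo=timezone.utc)
    print(timegate_request("http://example.com/", wanted))
```

Only `resolve_memento` touches the network; `timegate_request` just shows what goes over the wire, which is essentially one extra request header.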

Admittedly:

  1. Memento is not natively supported by web infrastructure such as browsers, although chances are that one day it will be. It is, however, supported by all major public web archives in the world, and there are plenty of tools to experience or implement Memento.

  2. Web archives do not hold snapshots of all resources or of all versions of resources, so the persistence is not general. But if one needs a snapshot of a resource, it suffices to ask a web archive to create one. Several web archives now support on-demand archiving, including Internet Archive, archive.today, perma.cc, and webcitation, both via manual interaction and via an API. Hence, referring to Table 2 of the paper, taking action is not only, or necessarily, the responsibility of the object's owner; any user of the object can do so. In the Hiberlink project, we have started to call this pro-active archiving.

Back to data citation:

  1. Web archives are not the only systems that can support Memento, though they do provide a nice example of the protocol's power. Resource versioning systems such as wikis, software version control systems, and ... data archives can support Memento too, in order to facilitate interoperable access to temporal resource versions. The versions do not need to reside in a web archive; they can reside in any system that supports versioning. Actually, the Memento protocol is even more performant (read: requires way less infrastructure) in cases where a system holds on to its own versions instead of offloading that responsibility to a web archive. And there are tools to make a version control system support Memento, see e.g. the TimeGate server.

  2. References based on HTTP URIs can be made extra robust by decorating them, i.e. by conveying the original URI, the datetime of linking, and the URI of a snapshot/version in a machine-actionable manner. In this regard, see Robust Links, the Motivation for Robust Links, and this little demo.
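As an illustration of what such a decorated reference could look like, here is a small Python sketch that renders an HTML link carrying the original URI, the linking date, and a snapshot URI. The attribute names `data-originalurl`, `data-versionurl`, and `data-versiondate` reflect my reading of the Robust Links approach and should be checked against its specification; the function name and the example URLs are hypothetical.

```python
from datetime import date
from html import escape


def robust_link(original_url, snapshot_url, linked_on, text):
    """Render an <a> element decorated with Robust Links style attributes."""
    attrs = {
        "href": original_url,
        "data-originalurl": original_url,            # URI that was linked
        "data-versionurl": snapshot_url,             # URI of a snapshot/version
        "data-versiondate": linked_on.isoformat(),   # datetime of linking
    }
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attrs.items()
    )
    return f"<a {rendered}>{escape(text)}</a>"


if __name__ == "__main__":
    print(
        robust_link(
            "http://example.com/data",  # hypothetical original URI
            "https://web.archive.org/web/20150601120000/http://example.com/data",
            date(2015, 6, 1),
            "dataset",
        )
    )
```

If the original URI later breaks, a client can fall back to the snapshot URI, or use the original URI and date to look for another snapshot via Memento.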

It would be interesting to hear why the authors decided to overlook this state of affairs.
