A Modest Proposal: Modern Legal Citations
A better way to find legal resources.

January 15, 2014


When Tim Berners-Lee created the World Wide Web in 1991, it was a paradigm shift because it had previously been nearly impossible to link together data stored on different computers in a manner that the average person could understand. That's not to say that it had been completely impossible; the File Transfer Protocol (FTP) was proposed in 1971 as RFC 114. Yet it wasn't until the advent of HTTP and Mosaic that the internet exploded in growth.

Fast forward twenty-three years from the time that Berners-Lee uploaded the first web site, and lawyers still use (relatively) ancient legal citations rooted in physical page numbers and expensive bound volumes. It's widely understood that there should be a better way to encode references to legal texts. Committees have been and are being formed; academic debate abounds. The following is a simple proposal for legal citation formatting that aims to make maximum use of the ample tools that the past several decades have already brought us.

Start with the way in which you found this web page; most likely, you clicked on a link, either in another web page (encapsulated in the HREF, or Hypertext Reference, parameter of the A, or Anchor, tag) or in an e-mail somewhere. That link has a few parts: the protocol, HTTP; the server, www.plainsite.org; the file being requested, /articles/article.html; and parameters to send to the server about that file, in this case involving the unique identifier of the article. This system is not only functional but also simple. There's nothing extraneous in the request at all.
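
As a rough illustration of those parts, the following sketch takes a link apart with Python's standard library. The query string "id=123" is a placeholder assumed for the example; it is not this article's real identifier.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL: "id=123" is a placeholder, not the article's real identifier.
url = "http://www.plainsite.org/articles/article.html?id=123"

parts = urlsplit(url)
link = {
    "protocol": parts.scheme,                 # "http"
    "server": parts.netloc,                   # "www.plainsite.org"
    "file": parts.path,                       # "/articles/article.html"
    "parameters": parse_qs(parts.query),      # {"id": ["123"]}
}

for name, value in link.items():
    print(f"{name}: {value}")
```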

As robust and time-honored as the current West Publishing-based system of citations is (and there are many other articles that go into its pros and cons), legal citations could and should work in the same way as the hyperlink just described. We propose a legal citation format that works in a similar fashion, but instead of placing it in the venerable HREF parameter, we propose a second, optional parameter for the A tag called LREF, for Legal Reference. LREF links would look like this:

For the docket corresponding to California Northern District Court's Case No. 5:13-cv-02054-EJD:

For document 5 in the above docket:

For document 68, attachment 1 in the above docket:

For the docket corresponding to Court of Appeals for the First Circuit Case No. 12-1594:

For the docket corresponding to Supreme Court* Case No. 12-965:

For document 3 in the docket corresponding to San Mateo County (California) Case No. CLJ480021:

* Is it annoying, at least for the purposes of making a consistent standard, that the Supreme Court has its own separate domain name as though it is not part of the rest of the United States Courts? Yes, it is.

At this point a bit of explanation may be necessary. The link format is derived from the format used by the RECAP project, which uses an ARPA-style backwards domain name to go from the broadest scope to the most specific as one reads left to right. However, there is one major difference: RECAP, which was designed to store data from PACER, uses identification numbers unique to PACER in the links for its district court records. This nomenclature was broken when RECAP expanded to include appellate-level data, and thanks to PACER's myriad quirks, appellate cases do not appear to have unique identifiers that are exposed beyond the actual case number. Here, we simply use the case number itself, stripping out judge initials (which are frequently changed when cases are re-assigned, something that happens regularly), and converting all punctuation marks to hyphens. At the appellate and Supreme Court levels, no changes to the case number are necessary at all.
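
The rules just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the dot used to join the reversed domain, case number, document, and attachment is assumed from the RECAP-style layout, and judge initials are assumed to be trailing groups of two to four capital letters preceded by a hyphen.

```python
import re

def reverse_domain(domain: str) -> str:
    """Turn 'cand.uscourts.gov' into 'gov.uscourts.cand' (broadest scope first)."""
    return ".".join(reversed(domain.lower().split(".")))

def normalize_case_number(case_number: str) -> str:
    """Strip trailing judge initials, then convert punctuation to hyphens."""
    # Assumption: initials look like "-EJD" (2-4 capital letters) at the end.
    stripped = re.sub(r"(?:-[A-Z]{2,4})+$", "", case_number.strip())
    # Every run of non-alphanumeric characters becomes a single hyphen.
    return re.sub(r"[^A-Za-z0-9]+", "-", stripped).strip("-")

def build_lref(domain, case_number, document=None, attachment=None):
    """Join the pieces with dots (an assumed separator, not a settled one)."""
    parts = [reverse_domain(domain), normalize_case_number(case_number)]
    if document is not None:
        parts.append(str(document))
        if attachment is not None:
            parts.append(str(attachment))
    return ".".join(parts)

print(build_lref("cand.uscourts.gov", "5:13-cv-02054-EJD"))
# gov.uscourts.cand.5-13-cv-02054
print(build_lref("cand.uscourts.gov", "5:13-cv-02054-EJD", 68, 1))
# gov.uscourts.cand.5-13-cv-02054.68.1
print(build_lref("ca1.uscourts.gov", "12-1594"))
# gov.uscourts.ca1.12-1594
```

Note that the appellate case number passes through unchanged, which matches the point that no changes are needed at that level.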

The dependence upon the domain name system is convenient because most courts already have a domain name and domain names uniquely identify their owners in a manner that the vast majority of people already accept and understand. For those that have more than one for whatever reason (such as the County of San Francisco, which owns both sfcourts.org and sfsuperiorcourt.org), a primary domain name would have to be chosen.

An additional advantage of the proposed naming system is that it can be useful beyond courts. For example, the following link might identify a patent application:

For the docket corresponding to U.S. Patent Application 13/849,537:

Here, I'm cheating a bit, and more input from the community may be necessary. First, the USPTO covers both patents and trademarks, meaning that a sub-domain of "patents" is necessary to specify the proper scope. Second, the USPTO does not number the documents in its dockets, of which there are two per patent application (the "Transaction History" and the "Image File Wrapper"). I have also made an exception to the convert-punctuation-to-hyphens rule for the slash and comma that sometimes appear in the serial numbers of applications; they have simply been stripped away. The USPTO's numbering scheme for patent applications basically makes no logical sense; certain number ranges are reserved for specific purposes, but those can most easily be identified by the first three digits, even though the slash appears after two. Moving on...
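
A minimal sketch of that exception, assuming a hypothetical helper name and assuming the "patents" sub-domain is appended to the reversed uspto.gov domain with dots:

```python
def normalize_application_number(serial: str) -> str:
    """Strip (rather than hyphenate) the slash and comma in a serial number."""
    return serial.replace("/", "").replace(",", "").strip()

serial = normalize_application_number("13/849,537")
print(serial)
# 13849537

# Assumed layout: reversed domain, then the "patents" sub-domain, then the number.
print("gov.uspto.patents." + serial)
# gov.uspto.patents.13849537
```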

For the docket corresponding to USPTO Trademark Trial and Appeal Board Opposition No. 91125553:

Fortunately, TTAB docket items are numbered. On the patent side, PTAB proceedings could be addressed similarly.

One more use case could be for statutes, regulations, bills, and standards, as opposed to just dockets and linked documents.

For 18 U.S.C. § 1960(a):

For 18 C.F.R. § 1010.103:

For H.R. 2279 of the 113th Congress, 1st Session:

There are several reasons why using the LREF parameter would be advantageous.

  1. Simplicity. It would be relatively trivial to implement LREF as an extension to the HTML A tag in major web browsers if browser vendors chose to do so. This is not the case for standards such as LegalXML, which are incredibly complex and costly to implement. Implementation by a few browser vendors is also greatly preferred to implementation by a larger number of small vendors or users themselves; most notably, California tried and failed to use LegalXML or similar standards with CCMS, though LegalXML was hardly the only reason. (CCMS aside, LegalXML has also failed to make dockets actually transportable for average users on any court system web site we've ever seen.)
  2. Neutrality. The scheme described above does not grant any advantage to any particular software vendor, institution, or group. The identifiers are not borne of some hundred-year-old monopoly; they are simply the identifiers bestowed by government. Nor are they proprietary to any particular software, programming language, platform, or service.
  3. Flexibility. Because LREF links reference the most generic identifier possible, web browsers could be configured with a user setting to launch a specific web site or reader application in association with LREF tags. Services such as Cornell's LII, public.resource.org, Justia, Casetext, Google Scholar, and of course, PlainSite (not to mention others yet to be created) could establish a router script at a pre-defined handler URL to direct traffic from incoming LREF links. This approach has already worked in practice for search engine selection. Services could decide themselves whether to show raw materials or their own improved and/or annotated versions.
  4. Specificity. As described, it's possible with the above scheme to link to an entire docket, or a particular document, which covers the widely-used purpose of linking to opinions. It would further be possible to use # notation to jump to a particular paragraph or page number.
  5. International Appeal. Since the domain name system already works around the world, it's conceivable that one could access dockets, laws, regulations, and standards from any country using this same framework.
  6. Readability. Aside from the fact that the links above are pretty easy for a person to read and make sense of, they are extremely machine-readable. This means that even without web browser support, they could make the exchange of legal data between computers much more efficient than it is currently. In a perfect world, it might even be possible to teach law clerks how to use them for opinion drafting! (Just kidding, John Roberts doesn't use e-mail.)
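
To make the flexibility, specificity, and readability points concrete, here is a hedged sketch of what a router might do with an incoming LREF. Everything specific in it is an assumption made for illustration: the dotted layout, the small registry of recognized court domains, the handler URL at reader.example, and the "p12" fragment are hypothetical rather than part of any existing service.

```python
from urllib.parse import quote

# Hypothetical registry of reversed court domains the router recognizes.
KNOWN_AUTHORITIES = ("gov.uscourts.cand", "gov.uscourts.ca1", "gov.supremecourt")

# Hypothetical handler URL that a user might select in a browser setting.
HANDLER_TEMPLATE = "https://reader.example/lref?ref={ref}"

def parse_lref(lref: str) -> dict:
    """Split an LREF into authority, case, document, attachment, and fragment."""
    ref, _, fragment = lref.partition("#")
    # Try the longest known authority first so the most specific prefix wins.
    for authority in sorted(KNOWN_AUTHORITIES, key=len, reverse=True):
        if ref.startswith(authority + "."):
            rest = ref[len(authority) + 1:].split(".")
            break
    else:
        raise ValueError(f"unrecognized authority in {lref!r}")
    return {
        "authority": authority,
        "case": rest[0],
        "document": rest[1] if len(rest) > 1 else None,
        "attachment": rest[2] if len(rest) > 2 else None,
        "fragment": fragment or None,
    }

def handler_url(lref: str) -> str:
    """Build the URL a browser would open for the user's chosen service."""
    return HANDLER_TEMPLATE.format(ref=quote(lref, safe=""))

print(parse_lref("gov.uscourts.cand.5-13-cv-02054.68.1#p12"))
print(handler_url("gov.uscourts.cand.5-13-cv-02054.68.1#p12"))
# https://reader.example/lref?ref=gov.uscourts.cand.5-13-cv-02054.68.1%23p12
```

The registry lookup is one possible answer to a real ambiguity in a dotted format: without knowing which prefixes are court domains, a parser cannot tell where the domain ends and the case number begins.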

Feedback is welcome. Send e-mail to aaron dot greenspan at plainsite.org.
