escape coffins and patent classification

Blogging at the Death Reference Desk has been interesting, entertaining, befuddling and more. We get the occasional reference question, but it’s mostly pulling in news articles and other content through RSS feeds (the deathwire, as I calls it) and selecting, summarizing and commenting on items of interest.

I do, however, look for opportunities to dig deeper–to be a librarian, not a blogger, and add research value, not regurgitate the web. My recent post, Premature Burial Device Patents, was one such opportunity. As keen to explain the search process as share the information, I fear I may have gotten a tad too library science enthusiastic for the audience. So I figured I’d elaborate more here. In short, gasp! massive, wondrous patent classification system! And Google Patents is a bit broken yet still manages to be reasonably awesome.

Inspiration struck for this post when one of those skim-friendly web lists came down the deathwire: 10 Horrifying Premature Burials. This is not typical DeathRef fodder. It’s ad-laden, the photos are cheesy and the references are scattershot and vague. But it did get me thinking: premature burial was a genuine fear, rational or not, around the turn of the twentieth century, and inventors of the time were up to the task. Be that task cheating death and saving lives or exploiting the fear of paranoid Victorians, who knows. But the patents for such devices poured in: plans and designs for spring-loaded escape coffins and electrical systems that detected corpse movement then triggered alarm systems above ground, to name a couple.

As government documents, US patents are in the public domain, and I wondered if they were online. I started with the United States Patent and Trademark Office (USPTO), which, sure enough, provides patents online: full text (and full text searching) starting in 1976 and image-only patents since 1790. I couldn’t get the image plug-in to work, however (arrrrrghgh!) and search is impenetrable. All this data was at my fingertips but I couldn’t quite grasp it.

Wikipedia’s safety coffin article directed me to this marvelous page at USPTO. This was it: everything I wanted, as far as I could tell, in barely human-readable format. The 27/31 intrigued me the most. Is that what I think it is? Sure enough: classification numbers.

Like most classification systems, the United States Patent Classification System is at first glance amazing. I wanted to swan dive into classes, wallow in all its sprawling facets. But I’m sure upon deeper inspection, it’s driven many a patent librarian or poor legal assistant insane. For my domain of interest:

Class 27, Undertaking:
This class includes coffins or caskets and portable coffin-cases for receiving and transporting dead bodies for burial; processes and apparatus for embalming and preserving the bodies of persons after death; and various attachments, accessories, and devices used in connection with the preparation of the bodies or employed at the time of interment at the grave, such as head-rests, corpse-carriers, lowering devices, life-signals, and the like.

Subclass 31, Life Signals:
Alarms or signals used in connection with coffins for indicating life in persons supposed to be dead.

Bingo. Keywords got nothing on a calculated brain putting things in their places. But what to do with this cumbersome interface?

Enter Google Patents (GP). With a search and view structure much like Google Books, GP has mined all of USPTO’s content and delivers it much more digestibly. All those image-only patents I couldn’t get to work are now slick PDFs I can preview in-browser, see as copy-pastable HTML or download as PDFs. Everything is also now full-text searchable (unlike USPTO’s pre-1976 black hole).

Unfortunately, however, that doesn’t make searching for the patents any easier. The About GP page states:

As with Google Web Search, we rank patent results according to their relevance to a given search query. We use a number of signals to evaluate how relevant each patent is to a user’s query, and we determine our results algorithmically.

I’m assuming word frequency and fields play a part. For instance, “coffin” mentioned a lot in a patent, especially in important fields, will increase its relevancy ranking. Great. But so much of what drives web search ranking (a critical mass of users, search optimization, incoming and outgoing links, even domain extensions) simply isn’t part of a pile of patents, many of which have faulty information (whether an omission on Google’s part or from the start when extracted from USPTO). Fields are transposed, the inventors becoming their inventions. Other fields are left blank. Words are misspelled and other typos abound, likely from bad OCR.

In other words, Google Patents is familiar, clean and comforting, but keyword searching is still crap.
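To make that guess about word frequency and fields concrete, here is a toy Python sketch. It is emphatically not Google's algorithm, which isn't public; the field names, the weights and the two sample records are all invented for illustration. It only shows how a single OCR slip in a heavily weighted field could sink an otherwise relevant patent.

```python
import re
from collections import Counter

# Made-up weights: a hit in the title counts more than one in the body.
FIELD_WEIGHTS = {"title": 3.0, "abstract": 2.0, "description": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def score(record, query):
    """Sum weighted counts of the query's words across a record's fields."""
    terms = tokenize(query)
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        counts = Counter(tokenize(record.get(field, "")))
        total += weight * sum(counts[term] for term in terms)
    return total


# Two invented records that differ only by an OCR-style typo in the title.
clean = {
    "title": "Coffin signal",
    "description": "A coffin fitted with a bell; the coffin lid opens.",
}
garbled = {
    "title": "Coflin signal",
    "description": "A coffin fitted with a bell; the coffin lid opens.",
}

print(score(clean, "coffin"))    # 5.0 (one title hit, two description hits)
print(score(garbled, "coffin"))  # 2.0 (the title hit is lost to the typo)
```

Same patent, same text, one bad character, and it drops well down the list.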

If you know exactly what you’re looking for, you may have better luck but not necessarily. Advanced search allows you to search by patent number, inventor, date and so forth. You can also search by classification, US and international, which initially thrilled me, but my magic numbers 27/31 for life signal devices rounded up only a handful of results, none of them relevant (like the martial arts uniform top or “duck on the rock” kids’ game). Out of curiosity, I tried searching for other classification numbers: some results appeared relevant while others, again, were way off.

I’m stumped. USPTO can easily retrieve patents based on classification. If they’re using the same data, why can’t Google? Searching by patent number also retrieves a lot of irrelevant results in GP. Despite specifying a field search, it still seems to be doing a keyword search. Many patents refer to other similar patents (including their numbers) to explain how this new one compares or deviates, which can be helpful if researching the evolution of an invention or process. But extraneous, completely different items end up in the mix, too, which frustrates and impedes.

Because I couldn’t generate a list of what I wanted in Google Patents, I used the USPTO 27/31 list to grab the patent numbers which I then searched for in GP to compile a list of life signal coffin devices for the DeathRef post. These are linked to the easy-to-view and use (once you find them) GP patents.
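I did all of this by hand, but the same workaround could be roughed out in a few lines of Python. This is only a sketch under assumptions: it supposes the USPTO class listing has been pasted in as plain text, the URL template is a guess at a Google Patents address pattern (not something taken from either site's documentation), and the patent numbers in the usage example are placeholders, not real life-signal patents.

```python
import re

# Assumed URL pattern, for illustration only; check it against the live site.
GP_URL_TEMPLATE = "https://patents.google.com/patent/US{number}"

# Matches comma-grouped numbers (1,234,567) or bare runs of 6 to 8 digits.
PATENT_NUMBER = re.compile(r"\b\d{1,3}(?:,\d{3})+\b|\b\d{6,8}\b")


def extract_patent_numbers(listing):
    """Return patent numbers found in pasted text, in order, without repeats."""
    numbers = []
    for match in PATENT_NUMBER.findall(listing):
        number = match.replace(",", "")
        if number not in numbers:
            numbers.append(number)
    return numbers


def build_links(numbers):
    """Turn bare patent numbers into candidate Google Patents links."""
    return [GP_URL_TEMPLATE.format(number=n) for n in numbers]


# Placeholder listing text; these numbers are invented.
pasted = "1,234,567 Placeholder A\n765,432 Placeholder B\n1,234,567 Placeholder A\n901407"
for link in build_links(extract_patent_numbers(pasted)):
    print(link)
```

Looking each patent up by its exact number sidesteps both the keyword ranking and the unreliable classification search, which is the whole point of the workaround.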

As the titles of these patents are often similar or vague, I annotated a few of them with quotes from the patents. This is where the plain text view came in handy, for easy copying and pasting. But what really blew my mind is the clipping feature found in the upper right:

Google Patent clipping feature.

With Clip you can select with a bounding box any part of a PDF then immediately grab the embed code for the image and presumably do whatever you want with it. I threw a handful into the DeathRef post. These patents have marvelous line drawings. I had planned to download PDFs or take manual screenshots, resize as needed, upload them to the blog then link back to the PDFs. The clipping feature did everything automatically and instantly. Wowza!

I don’t know whether Google takes a snapshot of the image and stores it somewhere, or if the code is a script that generates the image on the fly based on the bounding box parameters. I think it’s the latter. While it’s always good practice to have local copies of images in case something happens to ones stored elsewhere (beyond your control), this is a slick feature I haven’t seen before, from Google or anyone else. I suspect it’s the absence of copyright that makes this possible more so than newly discovered technical ingenuity, but still: so handy, so cool.

In conclusion, I love what Google Patents is doing but arughg! it could be so much better. I have a hunch making improvements on providing access to something in theory already available is of pretty low priority, however, and it does say it’s beta, so *deep breath* I can settle down. And in the meantime, be excited. For all the endless ventures and questionable agendas of the Google Empire, this one seems pretty innocuous, and neat.

4 comments on “escape coffins and patent classification”

  1. Hi Meg,

    Great post. I work in patent information and I’m also in library school. I’m the senior editor for an online resource called Intellogist, where we post in-depth reviews of patent search systems (and a whole lot more). I can completely relate to your frustration with Google patent – here’s the deal. The USPTO only offers electronic full text of US patents back to 1976. Of course, a lot of people would like to have it go back further. In the mid-nineties a company called Corporate Intelligence (which then became MicroPatent, which then was purchased by Thomson Reuters) digitized the collection back to 1836 via OCR. Unfortunately reconstructing metadata for these documents was not easy (including classification codes). In 2006, Google tried again, re-scanning everything to create the collection they offer now. As you found, their metadata isn’t great either. But there are now at least two independently created collections of scanned US patent documents back to 1836. Of course, LexisNexis offers a collection as well, and they also purchased a company called Univentio, which was a big producer of scanned patent data back in the day. I don’t happen to know whether Univentio produced a third independently scanned collection, but since that has got to be a major undertaking, maybe not (as you know there are over 7 million US patent documents right about now).

    So forgive this informal history off the top of my head – but that’s what’s going on with the bad metadata in the US collection. We’re quite lucky to have the Google collection for free, as badly scanned as some of it is (as you might guess, you have to pay for Thomson Reuters’ or LexisNexis’ collections). My company’s website, Intellogist, offers a review of Google Patents with some additional information on this topic: http://www.intellogist.com/wiki/Report:Google_Patent_Search/Data_Coverage/Patent_Coverage/Full_Text_Coverage

    I hope you’ll stop by Intellogist if you ever need to do another patent search – we have a lot of information up there, including info on other classification systems (like the International Patent Classification or IPC, for example: http://www.intellogist.com/wiki/IPC_Classification_System) and we have a discussion board for any difficulties you might encounter!

    Awesome post, I loved reading the step-by-step re-creation of your strategy!

  2. Thanks for the short history and clearing some things up, Kristin! The Google Patent report at Intellogist is excellent. I’ve emailed Google in the past on an unrelated matter — not asking for or expecting secrets, just hoping for clarification (dealing with open access content in Google Scholar). I got a polite but evasive (non)response. So I didn’t think I’d get far drilling them about what’s going on with the patent metadata. But it’s terribly interesting stuff — nice to see some of my suspicions confirmed (bad OCR). Duplicated work, especially when it’s not done very well, drives me crazy — also when it’s public domain content that turns into toll-based. Argh!

    Glad you liked my post, and thanks again for filling me in and directing me to Intellogist — it’s a great resource. Keep up the good work!

  3. Hello Ms. Holle,

    I am intrigued by your website. I linked to it through the Death Reference Desk which was mentioned on the Morbid Anatomy email list.

    This is probably a stupid question on my part, but have you read Poe’s “The Premature Burial”? It refers to a number of the devices you describe. I have been fascinated with this morbid topic since I read this work when I was a kid!

  4. Hello, David,

    Actually I’m not sure if I’ve read The Premature Burial. Other Poe stories come to mind that I do recall (The Tell-Tale Heart particularly, and The Cask of Amontillado), but I don’t remember The Premature Burial specifically. I should check it out, especially if it mentions some of the escape coffin devices.

    Last fall I posted about Poe’s mysterious death at the Death Reference Desk. Interesting stuff, with a great podcast from the Memory Palace:

    http://deathreferencedesk.org/2009/11/03/vote-or-die-or-and/
