December 11, 2004

Gift ideas for hir and hir

John Lawler emailed a pointer to US patent application 20040249626, "Method for modifying English language compositions to remove and replace objectionable sexist word forms", filed 6/3/2003 and published 12/9/2004. The abstract:

A method for removing objectionable sexist word forms from English language text and substituting new non-sexist word forms for the objectionable sexist word forms provides ten new blended word forms. Each new blended word form provides a non-sexist substitute for a word from ten associated sexist word pairs. If the gender of the being under consideration by the word from the sexist word pair is unknown, an appropriate new non-sexist word form is substituted for the objectionable sexist word form.

After reading the text of this document, I had to check the URL to be sure that I was really looking at the site of the US Patent and Trademark Office, rather than a spoof on our out-of-control system for evaluating algorithm patents. A check on the USPTO site suggests that this application is still pending review. As I'll explain below, I hope that it turns out to be in the 30% of patent applications that the USPTO does not allow.

The inventor -- Richard S. Neal of Edmond, OK -- cites 68 claims, all of which are of the same form as claim 1:

1. A method for removing objectionable sexist word forms from a portion of English language text and substituting new non-sexist word forms for the removed sexist word forms, said method comprising the steps of: a. Providing a non-sexist word HIR; b. Searching the portion of English language text to locate each instance wherein the sexist word HIM is used in third person objective case; c. Determining in each instance from the context whether the gender of the being referred to by the sexist word HIM is known or unknown; d. Substituting said non-sexist word HIR for the sexist word HIM in each instance wherein the sexist word HIM is used in the third person objective case and, further, wherein the gender of the being referred to by the sexist word HIM is unknown; e. Searching the English language text to locate each instance wherein the sexist word HER is used in third person objective case; f. Determining in each instance from the context whether the gender of the being referred to by the sexist word HER is known or unknown; and g. Substituting said non-sexist word HIR for the sexist word HER in each instance wherein the sexist word HER is used in third person objective case, and, further, wherein the gender of the being referred to by the sexist word HER is unknown.

The essential idea in this claim is to substitute the word hir for the standard English words him or her, just in case these are objective-case pronouns referring to a "being" whose "gender" is unknown. This requires four steps:

  1. Finding instances of him and her as independent words in text. This is trivial.
  2. Determining whether each instance is an objective-case pronoun. This is hard to do with 100% accuracy (consider "...gave her dog food"), and no methods are specified.
  3. Determining what each objective-case pronoun refers to. This is very hard to do and no methods are specified.
  4. Determining whether the "gender" of the reference is "known or unknown". This is a completely undefined step, which would probably be hard to do if it were defined precisely enough to make it possible to do it. "Gender" is not defined in the text of the patent, as far as I can find -- does Mr. Neal mean grammatical gender? does he mean gender as a euphemism for biological sex, or a term for culturally-defined sex-related roles? Nor does he ever define to whom the "gender" is "known or unknown" -- the author of the text? a typical reader of the text? a person or machine implementing the algorithm (that would make it easy, since the machine could simply plead ignorance in every case)?
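
To make the division of labor concrete, here is a minimal Python sketch of the four steps. Only step 1 is really implemented. The other steps are passed in as hypothetical oracle functions, since the application specifies no method for them; the names `find_candidates`, `apply_claim_1`, `is_objective` and `gender_unknown` are my own, and `gender_unknown` bundles steps 3 and 4 together.

```python
import re
from typing import Callable

# Step 1: "him" and "her" as independent words. The \b anchors keep
# "there" and "hymn" from matching. This is the only step implemented here.
PRONOUN = re.compile(r"\b(?:him|her)\b", re.IGNORECASE)

# Hypothetical oracle: given the text and a match offset, answer yes or no.
Oracle = Callable[[str, int], bool]

def find_candidates(text: str) -> list[tuple[int, str]]:
    """Return (offset, word) for every standalone 'him' or 'her'."""
    return [(m.start(), m.group(0)) for m in PRONOUN.finditer(text)]

def apply_claim_1(text: str, is_objective: Oracle, gender_unknown: Oracle) -> str:
    """Substitute 'hir' wherever both oracles say yes (steps 2 to 4)."""
    def replace(m: re.Match) -> str:
        if is_objective(text, m.start()) and gender_unknown(text, m.start()):
            # Keep a leading capital if the original word had one.
            return "Hir" if m.group(0)[0].isupper() else "hir"
        return m.group(0)
    return PRONOUN.sub(replace, text)
```

A machine that "pleads ignorance in every case" corresponds to passing `lambda text, offset: True` as `gender_unknown`. Everything difficult lives in the two oracles, and the sketch says nothing about how to build them.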

The other 67 claims are all exactly of the same form, except that they deal with nine other proposed lexical substitutions, multiplied by a variety of grammatical and morphological distinctions (oddly analyzed in some cases, but never mind that): hirs as a substitute for his and hers, hir as a substitute for his and her, hesh as a substitute for he and she, fother in place of mother or father, mir in place of sir or madam, hirself for himself or herself, birl in place of boy or girl, wan as a substitute for man or woman, and wen as a substitute for men or women.
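
The ten substitutions can be written down as a small table. The Python sketch below does that and then checks which standard words have more than one proposed replacement. The grammatical role labels are my own gloss on the claims, not the patent's wording, and the function names are hypothetical.

```python
from collections import defaultdict

# (proposed word, grammatical role, standard words it replaces).
# Role labels are an assumption made for illustration.
SUBSTITUTIONS = [
    ("hir", "objective pronoun", ("him", "her")),
    ("hirs", "possessive pronoun", ("his", "hers")),
    ("hir", "possessive adjective", ("his", "her")),
    ("hesh", "subject pronoun", ("he", "she")),
    ("hirself", "reflexive pronoun", ("himself", "herself")),
    ("fother", "noun", ("mother", "father")),
    ("mir", "noun", ("sir", "madam")),
    ("birl", "noun", ("boy", "girl")),
    ("wan", "noun", ("man", "woman")),
    ("wen", "noun", ("men", "women")),
]

def targets_by_source() -> dict[str, set[str]]:
    """Map each standard word to the set of proposed replacements for it."""
    targets = defaultdict(set)
    for new_word, _role, old_words in SUBSTITUTIONS:
        for old in old_words:
            targets[old].add(new_word)
    return dict(targets)

def ambiguous_sources() -> dict[str, set[str]]:
    """Standard words whose replacement depends on grammatical role."""
    return {old: new for old, new in targets_by_source().items() if len(new) > 1}
```

On this table, `ambiguous_sources()` returns only `his`, which becomes `hir` or `hirs` depending on its role. So even with human oracles for reference and "gender", a lookup by spelling alone is not enough for that word.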

The patent description (following the patent claims) suggests that the author is really thinking of a technique of editing for use by humans, rather than an algorithm for use by machines, and sees his innovation as the provision of a set of words to use in defined circumstances.

Objectionable sexist word forms (especially pronouns, nouns, and the possessive adjectives his and her) have plagued the English language for generations [...] writers have used such phrases as "him or her," "her or his," or the awkward "their" in largely isolated attempts to avoid the problem of objectionable sexist language. [...]

Because of a general lack of a suitable substitute, writers are both reluctant to employ the objectionable sexist language and also reluctant to fashion a remedy. [...]

Entry rules for a contest may use "contestant" or "participant" repeatedly in an effort to avoid using "him or her" or "he or she", etc. As a manager was heard to say, "If somebody decides not to participate then tell that somebody that that somebody doesn't have to." [...]

All occurrences of "him" or "her" in English language composition are sexist, of course, but not all sexist occurrences are objectionable. If the being under consideration is clearly male (for example, George Washington), it would be completely appropriate (and non-sexist) to refer to him in a subsequent sentence. Likewise, it would be appropriate to refer to his horse or to his presidency. Similarly, a second reference to Emily Dickinson might refer to her poetry.

When gender is unknown, however, the use of multi-compounded expressions, the adoption of word forms from other contexts, and the interposition and repetition of needlessly larger word forms deny the English language (and writers of the English language) what is required--word forms which are simple, accurate, and easily expressed. After a century and a half, no set of words adequately solves the problem of sexist language. [...]

I'm not going to comment on the improbability of the idea that these innovations might become widely used. The patent office has always been open to silly inventions, and as far as I'm concerned it should be. And no significant harm would be done by this particular patent, since it seems unlikely to have any real impact. But this case strikes me as symptomatic of wider flaws in the patent system, which lead to a proliferation of inappropriate patents that sometimes do have a big impact.

First, there's no reference to the considerable "prior art". Many of the specific "non-sexist" words in this application have been around for a while. A few minutes of googling turns up a reference to hesh and hir in this document dated 1995, and the Wikipedia article on sie and hir mentions the first recorded use of hir on usenet in 1981, and possible roots as far back as the 1930s. Birl is common and/or obvious enough that birls is the name of the livejournal community for "boyish girls" -- though I don't know how far back the term can be documented, I suspect that it precedes 6/2/2003. I haven't checked the other new (?) words in the patent. There's good reason to believe that consideration of "prior art" is generally inadequate in the case of software patents. I can only imagine what would happen if the patent office had to investigate prior art in the (hypothetical) case of lexical patents.

Second, (one interpretation of) the scope of the application seems entirely inappropriate. I'm no kind of expert in patent law, but I don't believe that you should be able to patent a word. What makes this document look like a patent are the claims of (pseudo-) methods for substituting words in texts. But if these are interpreted as instructions to writers or editors, you could use this approach to patent any proposed new word or word usage -- just list a bunch of claims of the form "...find all words or phrases referring to the concept X and replace them with the (patented) word Y...", suitably fleshed out with restrictions on syntactic and semantic categories and structures.

Third, if these methods are interpreted as an algorithm to be implemented by a machine, they pose problems in automatic text understanding that can't now be solved, like accurately determining the intended referents of pronouns, and evaluating the epistemological status of the sex or gender of those referents. I happen to think that progress is to be expected, at least in the long term, but there are some contrary views, and if anyone wants to place a bet about the performance of relevant reference-resolution algorithms over the next 17 years, I might be willing to take the "under". Patents that involve the solution of genuinely impossible problems cause no problems, other than a waste of patent-office resources. However, if you let people patent any process they can imagine, even if they have no glimmer of an idea how to implement it, the mass of resulting fantasies is sure to include a fraction of processes that others might later be able to create -- if they weren't forestalled by a pre-existing fantasy patent. Thus this kind of patent would actually discourage innovation instead of encouraging it.

Finally, the algorithms as specified in the application don't do what the author wants them to. They're buggy. If we fill in all the vagueness in the most sympathetic way, and supply skilled human oracles to solve all the unsolved problems, the result is still junk.

Let's take a specific case. Remember that the patent is aiming to solve problems like this one:

[0004] Objectionable sexist word forms (especially pronouns, nouns, and the possessive adjectives his and her) have plagued the English language for generations. In countless papers and documents written over the last 150 [years] writers have used such phrases as "him or her," "her or his," or the awkward "their" in largely isolated attempts to avoid the problem of objectionable sexist language.

What's the proposed method for fixing up "him or her", say in the phrase "gift ideas for him or her"?

Well, the first step is to "locate each instance wherein the sexist word ... is used". Check -- we just found two.

Are these uses "in third person objective case"? Check.

Now we need to be "[d]etermining from the context whether the gender of the being referred to ... is known or unknown". Well, when I look at the context, at http://www.giftsforhimorher.com, I'd have to say "unknown". The page talks about "boys and girls", "bride and groom", "Mom and Dad", "boyfriend and girlfriend". You can't get much more gender-role inclusive than that.

OK, the pattern has matched, and so we take the recommended action, which is:

Substituting said non-sexist word HIR for the sexist word HIM in each instance wherein the sexist word HIM is used in the third person objective case and, further, wherein the gender of the being referred to by the sexist word HIM is unknown

and also

Substituting said non-sexist word HIR for the sexist word HER in each instance wherein the sexist word HER is used in third person objective case, and, further, wherein the gender of the being referred to by the sexist word HER is unknown.

This part I can even write a computer program to do. For the input "gift ideas for him or her", the output is "gift ideas for hir or hir".
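
A minimal Python sketch of such a program, assuming that the human oracle has already answered "objective case" and "gender unknown" for every match (the function name `substitute_hir` is just for illustration):

```python
import re

def substitute_hir(text: str) -> str:
    """Replace every standalone 'him' or 'her' with 'hir'.

    Assumes every match is an objective-case pronoun whose referent's
    gender is unknown; the program makes no attempt to check either.
    """
    return re.sub(r"\b(?:him|her)\b", "hir", text)

print(substitute_hir("gift ideas for him or her"))
# prints: gift ideas for hir or hir
```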

I guess there's another alternative here, which is that the "gender of the being referred to" is actually "known". But in that case, no substitution will take place, and the output is the same as the input: "gift ideas for him or her". The problem is still not solved. Probably what the patent author wants to recommend in this case is the phrase "gift ideas for hir" -- but although he features such disjunctions of gendered words as the problem, his specified methods fail to produce the appropriate solution when applied to them.
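
Getting "gift ideas for hir" would take an extra rule that treats the whole disjunction as a single unit. As far as I can tell, none of the claims contains such a rule, so the Python sketch below is my own guess at the missing step rather than anything in the application; the name `collapse_disjunction` and the choice of which phrases to match are assumptions.

```python
import re

# Hypothetical extra rule: match the disjunction as one unit,
# in either order, before any word-by-word substitution runs.
DISJUNCTION = re.compile(r"\b(?:him or her|her or him)\b")

def collapse_disjunction(text: str) -> str:
    """Replace 'him or her' / 'her or him' with a single 'hir'."""
    return DISJUNCTION.sub("hir", text)

print(collapse_disjunction("gift ideas for him or her"))
# prints: gift ideas for hir
```

A rule like this has to apply before the word-by-word substitution in claim 1, since afterwards there is no "him or her" left to find.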

Although I may seem to be beating up on the patent applicant here, I'm really concerned with the patent examiners. If this application as written were allowed, it would mean that some patent examiners are so careless in evaluating algorithmic applications that they don't notice that the specified methods produce wrong results (or no results) when applied to the sample problems that are listed in the application. More seriously, it would extend the patent system's notion of "invention" in a major way, by allowing what looks like a patent on a software method -- albeit a buggy one -- but is actually a patent on the use of some particular words.

The 1972 Gottschalk v. Benson decision "held that a patent cannot cover all possible uses of a mathematical procedure or equation", and for a while had the effect of preventing software patents. Then Diamond v. Diehr in 1981, though dealing with software control of a physical manufacturing process, opened the floodgates to software patents, and the 1994 Federal Circuit Court decision In re Alappat seems to have systematically validated the idea of software patents by ruling that software turns a general-purpose machine into a special-purpose one that can constitute a patentable invention. Since then, this line of development has been extended further to encompass business method patents, although (as far as I know) this took place purely as a matter of USPTO practice, without any new legal foundation. Allowing a patent like the one under discussion here would open the door to patents on modes of linguistic expression, by making them seem as if they are algorithms for transforming text, just as Diamond v. Diehr opened the door to software patents by treating them as methods for controlling machinery.

Looking on the bright side, this patent application does suggest some exam problems for a semantics course. The question of whether the "gender of the being referred to ... is unknown", in a phrase like "gifts for him or her", raises all sorts of interesting issues. Do the pronouns in "gift ideas for him or her" refer at all? If so, do they refer as individual words or only as a disjunction? If "the sexist word HIM" and "the sexist word HER" don't refer to any particular being at all in this context, is it true or false that "the gender of the being referred to ... is unknown"?

[If you like reading about patents that never should have been granted, Jason Schultz's LawGeek site has a good collection.]

[Today's IEEE Spectrum Careers has an article by Adam B. Jaffe & Josh Lerner on "A Radical Cure for the Ailing U.S. Patent System."]

 

Posted by Mark Liberman at December 11, 2004 10:41 AM