Are APIs Copyrightable?

Oracle v. Google resulted in a really bad opinion for everybody in software, including Oracle. The Court of Appeals for the Federal Circuit, applying 9th Circuit law, held that the structure, sequence, and organization of APIs are copyrightable. This is a radical change in the law: to date, everybody and everybody's lawyers have taken the position that APIs are not copyrightable, but the implementation of the APIs is.

First the mitigation.

As long as the APIs are reverse engineered, it's okay to clone them

A big part of the decision seems to be that Google did not have clean hands and the court is punishing Google for being a bully to Sun in the Java licensing negotiation and then verbatim copying 7,000 lines of API declarations. I get this.

For most of us, the key to the decision is on page 48:

As the former Register of Copyrights of the United States pointed out in his brief amicus curiae, “[h]ad Google reverse engineered the programming packages to figure out the ideas and functionality of the original, and then created its own structure and its own literal code, Oracle would have no remedy under copyright whatsoever.” Br. for Amicus Curiae Ralph Oman 29. Instead, Google chose to copy both the declaring code and the overall SSO of the 37 Java API packages at issue.

I read this (talk to your lawyer, don't follow my advice) as meaning the following retains copyright:

/*
 * Copyright (c) 1994, 2010, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */
package java.lang;

import java.io.ObjectStreamField;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Formatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/**
 * The <code>String</code> class represents character strings. All
 * string literals in Java programs, such as <code>"abc"</code>, are
 * implemented as instances of this class.
 * <p>
 * Strings are constant; their values cannot be changed after they
 * are created. String buffers support mutable strings.
 * Because String objects are immutable they can be shared. For example:
 * <p><blockquote><pre>
 *     String str = "abc";
 * </pre></blockquote><p>
 * is equivalent to:
 * <p><blockquote><pre>
 *     char data[] = {'a', 'b', 'c'};
 *     String str = new String(data);
 * </pre></blockquote><p>
 * Here are some more examples of how strings can be used:
 * <p><blockquote><pre>
 *     System.out.println("abc");
 *     String cde = "cde";
 *     System.out.println("abc" + cde);
 *     String c = "abc".substring(2,3);
 *     String d = cde.substring(1, 2);
 * </pre></blockquote>
 * <p>
 * The class <code>String</code> includes methods for examining
 * individual characters of the sequence, for comparing strings, for
 * searching strings, for extracting substrings, and for creating a
 * copy of a string with all characters translated to uppercase or to
 * lowercase. Case mapping is based on the Unicode Standard version
 * specified by the {@link java.lang.Character Character} class.
 * <p>
 * The Java language provides special support for the string
 * concatenation operator (&nbsp;+&nbsp;), and for conversion of
 * other objects to strings. String concatenation is implemented
 * through the <code>StringBuilder</code>(or <code>StringBuffer</code>)
 * class and its <code>append</code> method.
 * String conversions are implemented through the method
 * <code>toString</code>, defined by <code>Object</code> and
 * inherited by all classes in Java. For additional information on
 * string concatenation and conversion, see Gosling, Joy, and Steele,
 * <i>The Java Language Specification</i>.
 *
 * <p> Unless otherwise noted, passing a <tt>null</tt> argument to a constructor
 * or method in this class will cause a {@link NullPointerException} to be
 * thrown.
 *
 * <p>A <code>String</code> represents a string in the UTF-16 format
 * in which <em>supplementary characters</em> are represented by <em>surrogate
 * pairs</em> (see the section <a href="Character.html#unicode">Unicode
 * Character Representations</a> in the <code>Character</code> class for
 * more information).
 * Index values refer to <code>char</code> code units, so a supplementary
 * character uses two positions in a <code>String</code>.
 * <p>The <code>String</code> class provides methods for dealing with
 * Unicode code points (i.e., characters), in addition to those for
 * dealing with Unicode code units (i.e., <code>char</code> values).
 *
 * @author  Lee Boynton
 * @author  Arthur van Hoff
 * @author  Martin Buchholz
 * @author  Ulf Zibis
 * @see     java.lang.Object#toString()
 * @see     java.lang.StringBuffer
 * @see     java.lang.StringBuilder
 * @see     java.nio.charset.Charset
 * @since   JDK1.0
 */

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

    /**
     * Class String is special cased within the Serialization Stream Protocol.
     *
     * A String instance is written initially into an ObjectOutputStream in the
     * following format:
     * <pre>
     *      <code>TC_STRING</code> (utf String)
     * </pre>
     * The String is written by method <code>DataOutput.writeUTF</code>.
     * A new handle is generated to  refer to all future references to the
     * string instance within the stream.
     */
    private static final ObjectStreamField[] serialPersistentFields =
            new ObjectStreamField[0];

Because it was copied verbatim from the source code, but:

public final class java.lang.String implements java.io.Serializable, java.lang.Comparable<java.lang.String>, java.lang.CharSequence {
  public static final java.util.Comparator<java.lang.String> CASE_INSENSITIVE_ORDER;
  public java.lang.String();
  public java.lang.String(java.lang.String);
  public java.lang.String(char[]);
  public java.lang.String(char[], int, int);
  public java.lang.String(int[], int, int);
  public java.lang.String(byte[], int, int, int);
  public java.lang.String(byte[], int);
  public java.lang.String(byte[], int, int, java.lang.String) throws java.io.UnsupportedEncodingException;
  public java.lang.String(byte[], int, int, java.nio.charset.Charset);
  public java.lang.String(byte[], java.lang.String) throws java.io.UnsupportedEncodingException;
  public java.lang.String(byte[], java.nio.charset.Charset);
  public java.lang.String(byte[], int, int);
  public java.lang.String(byte[]);
  public java.lang.String(java.lang.StringBuffer);
  public java.lang.String(java.lang.StringBuilder);
  java.lang.String(char[], boolean);
  java.lang.String(int, int, char[]);

Does not retain copyright because it was reverse engineered using javap.
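
To make the mechanics concrete, here is a minimal sketch of how a listing like the one above can be derived from compiled classes without ever opening the source file. It uses plain reflection rather than javap itself. The ApiSurface class and its publicConstructors method are names made up for illustration, and the sketch only covers public constructors. It shows the technical step only and says nothing about what a court would make of it.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of javap or the JDK): lists the public
// constructor signatures of a compiled class using reflection only.
final class ApiSurface {

    static List<String> publicConstructors(Class<?> type) {
        List<String> signatures = new ArrayList<>();
        // getConstructors() returns only the public constructors.
        for (Constructor<?> ctor : type.getConstructors()) {
            String params = Arrays.stream(ctor.getParameterTypes())
                    .map(Class::getTypeName) // e.g. "char[]", "java.lang.String"
                    .collect(Collectors.joining(", "));
            signatures.add("public " + type.getName() + "(" + params + ");");
        }
        // Reflection does not guarantee an order, so sort for stable output.
        Collections.sort(signatures);
        return signatures;
    }

    public static void main(String[] args) {
        publicConstructors(String.class).forEach(System.out::println);
    }
}
```

Run against String.class, this prints lines such as public java.lang.String(char[]); the exact set depends on the JDK version. Nothing in it reads String.java; the declarations come out of the class file.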

This really weird dichotomy is necessary for the court to square this decision with Sony v. Connectix.

Basically, Sony held that it's fair use to make copies of code while reverse engineering the APIs in the code in order to make compatible systems. So somehow, the act of reverse engineering something (the Oracle opinion is very light on guidance, so this shouldn't be a surprise), no matter how little the effort, is enough to remove copyright from the structure, sequence, and organization of the code.

Google's lawyers and the EFF are not the sharpest tools...

At this point, the post is going to get a bit personal. Basically, Oracle hired the best IP litigator in the world and she mopped the floor with Google's lawyers. Much of the court's opinion was based on the framing of things like "what's a computer language?", "what's compatibility?", and "what's originality?" The opinion shifts the meaning of each of these things to suit the outcome, and that's the hallmark of an excellent lawyer.

And the EFF is just weak in its amicus brief, which should have been one line: "Open good. Not open, bad." There's very little substance beyond that in the brief.

The Takedown that should have happened

Google's lawyers should have done a much, much better job both at trial and on the appeal with the facts.

The Java API SSO is not an original work

The first and most important fact, the one that the court rests its whole opinion on, is that (page 21):

At this stage, it is undisputed that the declaring code and the structure and organization of the Java API packages are original. The testimony at trial revealed that designing the Java API packages was a creative process...

The structure, sequence, and organization of the 37 packages in question is an evolution of many other libraries and systems. The Java APIs derive from Eiffel and Smalltalk and C++ and many other programming languages. The java.net package is shaped like the BSD socket library that it acts as a facade to.

Sadly, this is one of the framing issues that Google's lawyers didn't get across. Yes, the decisions about how to name and organize the packages, classes, and methods involved some creativity, but it was the organization and normalization of many existing APIs and systems. And then it becomes a naming issue, and once it's a naming issue, we arrive back at the District Court's analysis that copyright doesn't apply to short things like names.

All the lawyers would have had to do was to look through the JCP discussions about some of the APIs to get a ton of evidence that the APIs are in fact mostly derived from other systems.

The Idea/Expression Dichotomy

The worst thing about the opinion is that in computing, we used to have a very simple way of expressing the Idea/Expression Dichotomy: an API is the idea and the implementation is the expression. Simple.
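
A small sketch of that line, in Java. The names IntStack, ArrayIntStack, and LinkedIntStack are invented for this example and come from no real library. The interface states what is offered (the idea), and the two classes are independently written ways of delivering it (the expression).

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// Hypothetical API: the declarations say what is offered, not how.
interface IntStack {
    void push(int value);
    int pop();
    boolean isEmpty();
}

// One implementation: a growable array.
final class ArrayIntStack implements IntStack {
    private int[] items = new int[8];
    private int size;

    public void push(int value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2); // double the capacity when full
        }
        items[size++] = value;
    }

    public int pop() {
        if (size == 0) throw new NoSuchElementException("stack is empty");
        return items[--size];
    }

    public boolean isEmpty() { return size == 0; }
}

// A second implementation sharing no code with the first: a linked list.
final class LinkedIntStack implements IntStack {
    private static final class Node {
        final int value;
        final Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    private Node head;

    public void push(int value) { head = new Node(value, head); }

    public int pop() {
        if (head == null) throw new NoSuchElementException("stack is empty");
        int value = head.value;
        head = head.next;
        return value;
    }

    public boolean isEmpty() { return head == null; }
}

final class IntStackDemo {
    public static void main(String[] args) {
        // Callers are written against the interface only.
        for (IntStack stack : new IntStack[] { new ArrayIntStack(), new LinkedIntStack() }) {
            stack.push(1);
            stack.push(2);
            System.out.println(stack.pop() + " " + stack.pop() + " " + stack.isEmpty());
        }
    }
}
```

Both implementations print 2 1 true. Code written against IntStack cannot tell them apart, which is the sense in which the declaration and the implementation sit on opposite sides of the line.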

Yes, ideas can be very complex things. Sadly, Google's lawyers didn't come up with about 50 examples of very complex ideas such as the Theory of Relativity.

Nor did Google's lawyers help the court understand that the API/Interface vs. Implementation line is a very nice bright line that actually avoids a lot of the messiness with multi-factor tests. Further, having a bright line, rather than saying, "everything that's written in source code and all machine translations of that source code have copyright attached, but there may be fair use which is a darned squishy concept," means that developers and companies know what acceptable behavior is.

Put simply, the API vs. implementation used to be a bright line. It's not any more. The bright line worked.

What's a Language

This is another place where the court was all over the map... defining the Java language (something that apparently doesn't have copyright protection).

In some places, that language is just the syntax of the grammar.

In some places, it's bytecode that can run on the JVM.

In some places, it's the syntax of the language and the libraries required to make a minimal program operate (e.g., java.lang).
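
That third reading is easy to see in code. A minimal sketch (the MinimalProgram class name is made up): even a program with no import statements leans on java.lang, because every Java compilation unit implicitly imports that package.

```java
// No import statements: String and System both resolve through the
// implicit import of java.lang that every compilation unit gets.
final class MinimalProgram {

    static String greeting() {
        return "hello"; // a string literal is an instance of java.lang.String
    }

    public static void main(String[] args) {
        System.out.println(greeting());
        System.out.println(String.class.getName()); // fully qualified name of String
    }
}
```

This prints hello and then java.lang.String, so where the "language" ends and the "library" begins is not obvious from the program text alone.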

So, the court gives very little guidance as to what copyright attaches to. Is it all the Java APIs? Is it the APIs that are not referenced in the Java compiler?

Yet another place where the Oracle lawyers convinced the court to define "language" to suit their goals rather than anything that a developer can use to know where "language to which copyright does not attach" ends.

Cleanliness of Hands

Early in the opinion, the court discusses how Google and Sun did not reach a licensing agreement because (page 9):

The parties negotiated for months but were unable to reach an agreement. The point of contention between the parties was Google’s refusal to make the implementation of its programs compatible with the Java virtual machine or interoperable with other Java programs. Because Sun/Oracle found that position to be anathema to the “write once, run anywhere” philosophy, it did not grant Google a license to use the Java API packages.

The above had to do with trademark. Sun was preserving its "write once, run anywhere" mark. The court seems to put this as a negative on Google, but it seems to me that Sun could have licensed the code but not the Java mark to Google and Google would have paid for the license to the code. Much of the opinion rests on Google not having clean hands, but the above indicates to me that it was Sun that didn't have clean hands.

Sure, it was a really bad thing for Google to copy 7,000 lines of source code. But at the end of the day, they could have hired a clean-room team to reverse engineer the APIs with javap and it would have cost all of $500K. But Google's lawyers did a weak job of pointing to Sun's lack of clean hands.

Done for the night

I have some more notes that I want to get out on this case, but I'm done for the night. Maybe more tomorrow or maybe not.

Net-net, this is generally a bad decision. It's badly reasoned. It gives little guidance as to how we are supposed to act. The silver lining is that it's upholding Sony and it's clear that reverse engineering rather than verbatim source copying removes any copyright. So for those folks that are using packet sniffers to figure out network protocols and for those folks that are disassembling object code, I think you're safe.

For the folks that are cloning published APIs, I think you're going to have to do a clean-room approach where the API documentation is sent to a clean-room team that recreates the API docs and then you code to the clean-roomed docs.