While this is not a brand new software-related decision, it appears to be highly relevant because it is one of two decisions cited in the artificial intelligence-related section G-II, 3.3.1 of the revised Guidelines for Examination of the EPO, which have been in force since November 2018. In this decision, the European Patent Office refused to grant a software patent for a method of classifying text documents based on their content. Here are the practical takeaways from decision T 1358/09 (Classification/BDGB ENTERPRISE SOFTWARE) of Technical Board of Appeal 3.5.06:
The European patent application relates to a method and an apparatus for building a classification scheme that can be used to classify text documents in an efficient and flexible manner. This is done by first building a “classification model” and then classifying documents using this classification model.
Claim 1
A method for the computerized classification of an unclassified text document into one of a plurality of predefined classes based on a classification model obtained from the classification of a plurality of preclassified text documents which respectively have been classified as belonging to one of said plurality of classes, said document and said documents respectively comprising a plurality of terms which respectively comprise one or more symbols of a finite set of symbols;
a) wherein said method involves the computerized building of said classification model, comprising the following method steps:
a1) representing each of said plurality of text documents, which are digitally represented in a computer, by a vector of n dimensions, said n dimensions forming a vector space, whereas the value of each dimension of said vector corresponds to the frequency of occurrence of a certain term in the document corresponding to said vector, so that said n dimensions span up a vector space;
a2) representing the classification of said already classified documents into classes by separating said vector space into a plurality of subspaces by calculating one or more hyperplanes, such that each subspace comprises one or more documents as represented by their corresponding vectors in said vector space, so that said each subspace corresponds to a respective class;
a3) calculating a maximum margin surrounding said hyperplanes in said vector space such that said margin contains none of the vectors contained in the subspaces corresponding to said classification classes;
b) wherein said method further involves, on basis of said classification model, the computerized classification of said unclassified text document as belonging to one of said plurality of classes, comprising the following method steps:
b1) representing said text document, which is digitally represented in a computer, by a vector of n dimensions, said n dimensions spanning up said vector space, whereas the value of each dimension of said vector corresponds to the frequency of occurrence of a certain term in the document corresponding to said vector;
b2) classifying said document into one of said plurality of classes by determining into which of said plurality of subspaces of said vector space said vector falls and identifying said document as belonging to a certain class which corresponds to the subspace into which said vector falls;
b3) calculating a confidence level for the classification of said document as belonging to said certain class based on the distances between the vector representing said document and all hyperplanes surrounding said subspace which corresponds to said certain class normalized by the corresponding margins such that a document which lies outside said margins is assigned a confidence level of ‘1’ and a document which falls into said margins is assigned a value between ‘0’ and ‘1’.
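To make the claimed steps more tangible, here is a minimal Python sketch of steps a1/b1 (term-frequency vectors), b2 (deciding on which side of a hyperplane a vector falls) and b3 (a confidence level derived from the distance to the hyperplane, normalized by the margin). It is an illustration only and not taken from the application: the vocabulary, the class names, the weights and the bias are hypothetical, the example is reduced to two classes and a single hyperplane, and the hyperplane is assumed to be scaled so that the margin boundaries lie at a score of +1 and -1, as is common for support vector machines. Computing the maximum-margin hyperplane itself (steps a2 and a3) is not shown.

```python
import math
from collections import Counter

# Hypothetical vocabulary and hyperplane; none of these values come from the application.
VOCABULARY = ["goal", "match", "stock", "market"]
WEIGHTS = [0.5, 0.5, -0.5, -0.5]   # normal vector of the separating hyperplane
BIAS = 0.0                         # offset of the hyperplane
CLASSES = ("sports", "finance")    # positive side, negative side


def term_frequency_vector(text, vocabulary):
    """Steps a1/b1: one dimension per term, value = number of occurrences."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def classify_with_confidence(vector, weights, bias, classes=CLASSES):
    """Steps b2/b3 for a single hyperplane (two classes)."""
    score = sum(w * x for w, x in zip(weights, vector)) + bias
    label = classes[0] if score >= 0 else classes[1]

    norm = math.sqrt(sum(w * w for w in weights))
    distance = abs(score) / norm   # geometric distance to the hyperplane
    margin = 1.0 / norm            # assumed scaling: margin boundary at |score| = 1

    # Outside the margin the confidence is capped at 1,
    # inside the margin it is a value between 0 and 1.
    confidence = min(1.0, distance / margin)
    return label, confidence


clear_case = term_frequency_vector("goal match goal", VOCABULARY)
borderline = term_frequency_vector("stock goal market", VOCABULARY)

print(classify_with_confidence(clear_case, WEIGHTS, BIAS))   # ('sports', 1.0)
print(classify_with_confidence(borderline, WEIGHTS, BIAS))   # ('finance', 0.5)
```

Under the assumed scaling, the distance divided by the margin is simply the absolute value of the score, which shows how little beyond plain arithmetic is needed for step b3. For more than two classes, one would presumably combine the normalized distances to all hyperplanes surrounding the subspace, as the claim wording suggests.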
Is it patentable?
According to the Board, determining whether two text documents belong to the same class does not provide a technical contribution:
5.2 A mathematical algorithm contributes to the technical character of a computer-implemented method only in so far as it serves a technical purpose (see decision T 1784/06 of 21 September 2012, reasons 3.1.1). In the present case, the algorithm serves the general purpose of classifying text documents.
Classification of text documents is certainly useful, as it may help to locate text documents with a relevant cognitive content, but in the Board’s view it does not qualify as a technical purpose. Whether two text documents in respect of their textual content belong to the same “class” of documents is not a technical issue. […]
To convince the Board of the patentability of the claimed subject-matter, the appellant argued that a human being would manually read through the document and assign a particular class to it on the basis of his understanding of the document. In contrast, the claimed automatic classification method involved precise computation steps which no human being would ever perform when classifying documents.
However, the Board takes the position that a comparison with what a human being would do is not a suitable basis for distinguishing between technical and non-technical steps:
5.4 The Board agrees that a human being would not apply the claimed classification method to perform the task of classifying text documents. The Board further accepts that the proposed computerised method may be faster than classification methods known from the prior art. However, the determination of the claim features which contribute to the technical character of the invention is made, at least in principle, without reference to the prior art (cf. T 154/04, OJ EPO 2008, 46, reasons 5, under (E) and (F)). It follows that a comparison with what a human being would do or with what is known from the prior art is not a suitable basis for distinguishing between technical and non-technical steps (see also decision T 1954/08 of 6 March 2013, reasons 6.2).
By the way, if you are interested in a deeper look into how the European Patent Office examines software-related inventions, this article provides more details.
The appellant further argued that the claimed method provided more reliable and objective results than manual classification, which the Board did not contest. However, the Board stated that the mere fact that an algorithm leads to reproducible results does not imply that it makes a technical contribution:
5.6 The Board does not contest that the claimed classification method may provide reliable and objective results, but this is an inherent property of deterministic algorithms. The mere fact that an algorithm leads to reproducible results does not imply that it makes a technical contribution.
As a result, the Board ruled that the claimed mathematical algorithm does not contribute to the technical character of the claimed method. The only implementation features specified in the claim are references to the method being “computerized” and the text documents being “digitally represented in a computer”. The skilled person, when given the task of implementing the algorithm, would certainly have chosen to represent text documents “digitally in a computer”. The Board further considered that the skilled person, using only his common general knowledge, would have had no difficulty in implementing the various steps of claim 1 on a computer, and thus rejected the present application for lack of inventive step.
You can read the whole decision here: T 1358/09 (Classification/BDGB ENTERPRISE SOFTWARE) of November 21, 2014