The application underlying this decision relates to a sparse neural network architecture. However, the European Patent Office refused to grant a patent because claim 1 merely specifies an abstract computer-implemented mathematical operation on unspecified data, namely the approximation of the network's weight values. Here are the practical takeaways of decision T 0702/20 (Sparsely connected neural network/MITSUBISHI) of November 7, 2022, of Technical Board of Appeal 3.5.06:
The application underlying the present decision mainly concerns an adaptation of the connections between the layers of a neural network. Specifically, the application discloses a concept called “loose couplings”, which reduces the number of connections between the nodes of the individual layers of the neural network compared to conventional fully-connected architectures, as illustrated by the dotted lines in Fig. 2 shown below (cf. paras.  and  of the application). The “loose couplings” are defined by a sparse parity-check matrix of an error correcting code such as an LDPC code, a spatially-coupled code or a pseudo-cyclic code (cf. para.  of the application).
Fig. 2 of EP 3089081 A1
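To make the “loose couplings” concept more concrete, the following Python sketch shows how a sparse parity-check-style matrix could act as a connectivity mask between two layers. The matrix H and all variable names are our own illustrative assumptions, not taken from the application:

```python
import numpy as np

# Illustrative "loose coupling" of two layers: H[i, j] == 1 means
# node j of the previous layer is connected to node i of the next
# layer. H is a small, hand-made sparse matrix in the style of an
# LDPC parity-check matrix, not one from the application.
H = np.array([
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
])  # 4 next-layer nodes x 6 previous-layer nodes

rng = np.random.default_rng(0)
W = rng.normal(size=H.shape)   # dense weight matrix
W_loose = W * H                # zero out all non-coupled weights

def layer_forward(x, W_masked):
    """Forward pass through one loosely coupled layer."""
    return np.tanh(W_masked @ x)

x = rng.normal(size=6)         # activations of the previous layer
y = layer_forward(x, W_loose)  # activations of the next layer, shape (4,)
```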
Here is how the invention is defined in claim 1 of the main request:
Claim 1 (Main Request)
A hierarchical neural network apparatus (1) implemented on a computer comprising
a weight learning unit (20) to learn weights between a plurality of nodes in a hierarchical neural network, the hierarchical neural network being formed by loose couplings between the nodes in accordance with a sparse parity-check matrix of an error correcting code, wherein the error correcting code is a LDPC code, spatially-coupled code or pseudo-cyclic code, and comprising an input layer, intermediate layer and output layer, each of the layers comprising nodes; and
a discriminating processor (21) to solve a classification problem or a regression problem using the hierarchical neural network whose weights between the nodes coupled are updated by weight values learned by the weight learning unit (20) or comprising
a weight pre-learning unit (22) to learn weights between a plurality of nodes in a deep neural network, the deep neural network being formed by loose couplings between the nodes in accordance with a sparse parity-check matrix of an error correcting code, wherein the error correcting code is a LDPC code, spatially-coupled code or pseudo-cyclic code, and comprising an input layer, a plurality of intermediate layers and an output layer, each of the layers comprising nodes; and
a discriminating processor (21) to solve a classification problem or a regression problem using the deep neural network whose weights between the nodes coupled are updated by weight values learned by the weight pre-learning unit (22) and
a weight adjuster (23) to perform supervised learning to adjust the weights learned by the weight pre-learning unit (22) by supervised learning; and wherein the weights are learned by the weight pre-learning unit (22) by performing unsupervised learning; and
the weights between the nodes coupled are updated by weight values adjusted by the weight adjuster (23).
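The claim further requires that only the weights “between the nodes coupled” are updated by the learned values. As a rough sketch of what such training could look like (our assumption, not the application's actual algorithm), a gradient step on a single loosely coupled tanh layer can simply be masked by the same matrix H so that non-coupled weights stay zero:

```python
import numpy as np

# Hypothetical training loop: the parity-check mask H stays fixed,
# and only weights between coupled nodes are ever updated.
rng = np.random.default_rng(1)
H = (rng.random((4, 6)) < 0.3).astype(float)  # toy sparse mask
W = rng.normal(size=H.shape) * H
lr = 0.05

for _ in range(100):
    x = rng.normal(size=6)
    t = np.sin(x[:4])                  # toy regression target
    y = np.tanh(W @ x)
    err = y - t                        # dL/dy for L = 0.5 * sum(err**2)
    grad = (err * (1 - y**2))[:, None] * x[None, :]
    W -= lr * (grad * H)               # mask keeps the couplings loose
```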
Is it technical?
Both the Board in charge and the Appellant agreed that claim 1 is distinguished from the closest prior art by the “loose couplings” as defined by the following feature:
neural network being formed by loose couplings between the nodes in accordance with a sparse parity-check matrix of an error correcting code, wherein the error correcting code is a LDPC code, spatially-coupled code or pseudo-cyclic code
The Appellant’s arguments may be structured as follows:
In a first line of argument, the Appellant addressed the effects achieved “within a computer” by the provided architecture, asserting that:
Machine learning serves a technical purpose by solving a well defined technical problem by mathematical means.
In support of this statement, the Appellant referred to decision T 1326/06, from which the Appellant concluded that:
methods relating to data encoding and/or decoding can serve a technical purpose even though they are almost entirely based on mathematical algorithms and used for encrypting and decoding abstract data.
Accordingly, the Appellant argued that:
the possibility that the neural network apparatus may process unknown, possibly abstract data in- and outputs should not necessarily take away the technical character of the distinguishing feature,
and supported this statement by referring to T 0697/17 according to which describing a technical feature at a high level of abstraction does not necessarily take away the feature’s technical character.
Therefore, the Appellant argued that the technical problem of improving the learning capability and efficiency of a machine is solved by reducing the required computational resources and preventing overfitting.
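A back-of-envelope calculation illustrates the kind of resource savings the Appellant had in mind; the layer sizes and the number of couplings per node below are assumed for illustration only:

```python
# Assumed figures, for illustration only: two layers of 1000 nodes,
# and a parity-check mask with 4 couplings per next-layer node.
n_prev, n_next = 1000, 1000
dense_weights = n_prev * n_next               # 1,000,000 weights to store/train
couplings_per_node = 4
loose_weights = n_next * couplings_per_node   # 4,000 weights
print(dense_weights // loose_weights)         # 250x fewer parameters
```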
In a second line of argument, the Appellant presented the neural network as an automation tool, arguing that:
an artificial neural network was a mathematical algorithm meant to mimic the human brain, by replicating biological optimization. It was implemented and trained in hardware, on a computer … It allowed the automation of complex tasks, so that the computer could perform them instead of a human.
With automation being generally recognized in the case law as a technical problem, the Appellant concluded that a neural network is not an abstract mathematical method, but instead uses mathematics to solve a technical problem. Accordingly, the present application would contribute to this field by providing a new network structure that allows a more efficient implementation with reduced computing and storage requirements.
However, the Board did not follow these arguments:
Regarding the allegedly achieved effects “within the computer”, the Board stated that:
while the storage and computation requirements are indeed reduced in comparison with the fully-connected network, this does not in and by itself translate to a technical effect, for the simple reason that the modified network is different and will not learn in the same way [as the fully-connected network]. So it requires less storage, but it does not do the same thing.
The Board also provided an illustrative example:
For instance, a one-neuron neural network requires the least storage, but it will not be able to learn any complex data relationship.
Therefore, the Board found that:
the proposed comparison is incomplete, as it only focuses on the computational requirements.
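The Board's one-neuron example is easy to reproduce: a single neuron is minimal in storage terms, yet it cannot learn even the XOR relationship. The following toy demonstration (our own illustration, not taken from the decision) trains a single logistic neuron on XOR and gets stuck at chance level:

```python
import numpy as np

# A single logistic neuron: the cheapest possible "network".
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0., 1., 1., 0.])        # XOR targets

rng = np.random.default_rng(2)
w, b = rng.normal(size=2), 0.0
for _ in range(5000):                 # plain gradient descent
    y = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.1 * X.T @ (y - t) / len(t)
    b -= 0.1 * (y - t).mean()
print(np.round(y, 2))  # ~[0.5 0.5 0.5 0.5]: XOR is not linearly separable
```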
Regarding the neural network as an automation tool, the Board stated that:
[it] sees no evidence that a neural network functions like a human brain. While its structure is inspired by that of the human brain, this does not imply that it can actually function like one.
A general “brain automation problem” can thus not be considered to be solved by claim 1. In addition, as claim 1 does not further specify any particular task (i.e., the type of relationship to be learned), claim 1 also does not solve a corresponding automation problem.
As a result, the Board concluded that the claim as a whole merely specifies abstract computer-implemented mathematical operations on unspecified data. Accordingly, its subject-matter cannot be said to solve any technical problem and thus does not go beyond a computer-implemented mathematical method within the meaning of Article 52(2) EPC.
Therefore, the Board dismissed the appeal due to lack of inventive step.
You can read the whole decision here: T 0702/20 (Sparsely connected neural network/MITSUBISHI) of November 7, 2022