The invention relates to the automated assessment of scripts written in examinations. The Examining Division refused the application because the claimed features were primarily mathematical or linguistic operations, and grading written scripts had no inherent technical purpose. The Appellant argued that the invention made a technical contribution in “educational technology” by automatically grading text scripts so that the grades closely match those given by human markers.
The Board disagreed with the Appellant. It doubted that human grading, a cognitive and partly subjective task, is defined well enough to assess whether the problem of automating it has been solved. Furthermore, while the training procedure might in principle constitute a contribution, that contribution lay in the excluded field of mathematical methods and was therefore disregarded. The Board noted that at least an indirect link to physical reality is required; since automated script grading for evaluating linguistic competences has no implied technical use or purpose, it was not taken into account in assessing inventive step.
Here are the practical takeaways from the decision T 0761/20 (Automated script grading/UNIVERSITY OF CAMBRIDGE) of May 22, 2023, of the Technical Board of Appeal 3.5.06.
The invention was defined by the Board as follows:
1. The application relates to automated assessment of scripts written in examination, in particular English for Speakers of Other Languages (ESOL) examinations (paragraphs 1 to 3).
1.1 The system comprises a feature analysis module, denoted as RASP (robust accurate statistical parsing) which extracts and numerically quantifies linguistic features of text (paragraphs 52 to 56) to form a feature vector.
1.2 This feature vector is used to grade scripts on the basis of discriminative models, such as SVM or large margin perceptrons, including a variant, said to be new, called the Timed Aggregate Perceptron (TAP, see paragraph 28). In the TAP training procedure, unlike in standard perceptron training, a timing parameter reduces the update rate as a function of how far the process has progressed, of the magnitude of the increase in empirical loss, and of the balance of the training distributions (paragraph 36). This has the role of providing an approximate solution that prevents overfitting (by early stopping).
1.3 The application describes embodiments with binary outputs based on SVM or TAP, useful for pass/fail grading systems (paragraphs 24 to 39), and an embodiment denoted as a modification of the TAP using preference ranking (paragraphs 41 to 49).
1.4 In the latter embodiment, the perceptron’s success is measured by its ability to correctly rank pairs of training samples on the basis of its scalar output; this scheme is conceptually aimed at reducing errors in relative grading (i.e. the decision which test to assign a higher score) as opposed to errors in absolute grading. The output of a perceptron is in essence the result of a dot product between the learned weight vector and the incoming sample. In the standard perceptron, to reduce the ranking errors, the weight vector is updated in the direction of the misclassified samples; in the proposed variant, the update direction is provided by the sum of the difference vectors between the samples of the misclassified pairs. This variant can be used both for binary fail/pass grading and for non-binary grading.
1.5 The application discloses performance assessments of the described methods (paragraphs 62 to 73) based on how well its results correlate with those of prior art systems and of human markers (or examiners/raters) (see Tables 4 and 5, paragraphs 63 and 71). According to those results, the preference ranking TAP model outputs grades that correlate with those provided by human markers almost as well as the human markers’ grades correlate with one other. Also, the preference ranking TAP outperforms TAP on a binary task, while binary TAP and SVM outperform prior art systems.
1.6 The requests on file are all based on the TAP preference ranking model.
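The preference-ranking update described in point 1.4 can be illustrated with a minimal sketch. It is not the applicant's TAP implementation: the timing parameter and early stopping from point 1.2 are omitted, and the function names, the fixed learning rate, and the epoch limit are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    """Scalar output of the perceptron: dot product of weights and sample."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_ranking_perceptron(
    vectors: Sequence[Vector],
    ranked_pairs: Sequence[Tuple[int, int]],
    epochs: int = 50,           # assumed limit, not taken from the application
    learning_rate: float = 1.0,  # assumed constant; TAP would scale this down over time
) -> Vector:
    """Learn a weight vector that ranks pairs of samples correctly.

    Each pair (hi, lo) states that vectors[hi] should receive a higher
    score than vectors[lo].
    """
    dim = len(vectors[0])
    w = [0.0] * dim
    for _ in range(epochs):
        # Aggregate update: sum of the difference vectors of all mis-ranked pairs.
        update = [0.0] * dim
        misranked = 0
        for hi, lo in ranked_pairs:
            diff = [a - b for a, b in zip(vectors[hi], vectors[lo])]
            if dot(w, diff) <= 0.0:  # pair is ranked wrongly (or tied)
                misranked += 1
                update = [u + d for u, d in zip(update, diff)]
        if misranked == 0:
            break  # every training pair is ranked correctly
        w = [wi + learning_rate * ui for wi, ui in zip(w, update)]
    return w
```

The sketch only measures success on relative ordering: a pair counts as an error when the higher-ranked script does not receive the strictly higher score, regardless of the absolute values involved.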
Method Claim 1
A computer-implemented method of grading scripts (145) comprising text, the method comprising:
training an automated computerized text assessment system to grade text of scripts, the training including, by a computer device (900): receiving (210) a plurality of training linguistic vectors (x1, x2, x3,…xn) each training linguistic vector comprising a plurality of numerical values representing linguistic features of text within a training script (105); receiving, for each of a plurality of pairs of said training linguistic vectors, ranking data (x1
generating a linguistic vector comprising a plurality of numerical values representing linguistic features of text of an input script (145) that is to be graded;
calculating, a dot product between the trained model weight vector and the linguistic vector for the text of the input script that is to be graded to generate a scalar value for the input script; and
outputting a grade for the input script using the scalar value generated for the input script (145).
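The last two steps of the claim (dot product, then a grade derived from the scalar) might look like the following sketch. The claim does not say how the scalar is mapped to a grade, so the cutoff values, the labels, and the function names `score_script` and `grade_from_score` are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def score_script(weights: Sequence[float], features: Sequence[float]) -> float:
    """Dot product between the trained weight vector and a script's linguistic vector."""
    if len(weights) != len(features):
        raise ValueError("weight vector and linguistic vector must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def grade_from_score(score: float, cutoffs: Sequence[float], labels: Sequence[str]) -> str:
    """Map a scalar score to a grade label.

    `cutoffs` must be sorted in ascending order and `labels` must contain
    exactly one more entry than `cutoffs` (both are illustrative assumptions).
    """
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("need exactly one more label than cutoffs")
    return labels[bisect_right(cutoffs, score)]


# Hypothetical pass/fail example with a single cutoff at 0.0.
weights = [0.5, -1.0, 2.0]
features = [4.0, 1.0, 0.5]
score = score_script(weights, features)
grade = grade_from_score(score, cutoffs=[0.0], labels=["fail", "pass"])
```

With one cutoff this reduces to the binary pass/fail case; adding further cutoffs gives the non-binary grading mentioned in the Board's summary.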
Is it patentable?
In the decision under appeal, the Examining Division considered that the features are primarily mathematical or linguistic operations performed on a regular computer. There was no clear technical purpose, especially because grading text scripts is not inherently a technical task.
The Appellant argued that, even though grading scripts by a human marker might not be a technical task in itself, the invention does provide a technical contribution in the field of “educational technology”, which combines computer hardware, software, and educational theory to enhance learning, by automatically grading text scripts so that the grades closely match those given by humans. According to the Appellant, the system can handle more complex responses, uses novel AI techniques to grade more accurately, and can give students immediate feedback, which improves their learning experience.
The Board first assessed whether the claim, as a whole, was linked to a technical effect:
6. As the Board understands the argument of the Examining Division, it does not depend on whether the differences to D2 also comprise the ones advanced by the Appellant, as the claim as a whole can be said to only define “mathematical or linguistic steps” used for “grading text scripts”. This means that, if the argument of the Examining Division is correct, the claim as a whole is not “causally linked to a technical effect”.
7. Also the Appellant, challenging the finding of the Examining Division, refers to the claim as a whole when it states an alleged contribution to the art and the corresponding technical problem solved. This is appropriate, as the specific effects of any distinguishing features over D2 are only relevant for inventive step if it can be acknowledged at all that a technical problem is solved. If that is the case, the differences themselves might give rise, for instance, to an argument that the results according to the invention correlate better with those of human markers than the prior art methods (instead of merely “well”).
8. The Board shares this view and will therefore also address the claim in its entirety to assess whether a combination of features solving a technical problem can be identified.
9. The claim defines a method of automated script grading using machine learning, which is effectively a computer implemented process. Such processes may have technical effects – and thus be deemed to solve a technical problem – at their input or output, but also by way of their execution (see G 1/19, reasons 85). A technical effect may also be acknowledged in view of their purpose, i.e. an (implied) technical use of their output (see G 1/19, reasons 137).
The Board then assessed whether there was a technical effect “within the computer”:
10. The claimed method contains steps for extracting numerical “linguistic” vectors from scripts (for all considered samples, training scripts and scripts to be graded), a step of training a perceptron, and a step of using the perceptron to grade the scripts.
10.1 The extraction of linguistic vectors, which is the step providing the input to the grading perceptron, is not detailed in the claim. According to the description (see paragraph 52), they are defined and selected to capture sufficient information for evaluating the degree of linguistic competence; they can be said to provide a “mathematical” summary of a script. Since the claim provides no detail as to the contents of the vector, this step cannot be considered to provide any contribution on its own, be it related to the script acquisition (e.g. scanning or OCR) or modelling, or to any optimization within the computer.
10.2 The claimed perceptron model is a linear mathematical function mapping the input numerical vectors to output grades. Specific details are only claimed with regard to its training procedure, which is optimized to preserve the ranking of grades, as opposed to minimizing the absolute error in output grades (see point 1.4 above). The model is not based on technical considerations relating to the internal functioning of a computer (e.g. targeting specific hardware or satisfying certain computational requirements), and the preference ranking is chosen merely according to its educational purpose, which does not relate to any effects within the computer either.
10.3 Also the final step of using the perceptron to grade the scripts provides no effects within the computer.
11. In principle, the claimed training procedure might constitute a technical contribution to the state of the art (see e.g. G1/19, reasons 33). Taken alone, however, this is a mathematical method, so this contribution is in the – excluded – field of mathematical methods (see T 0702/20 and T 0755/18, catchwords) and is therefore not a patentable contribution.
12. Thus the Board cannot identify any technical problem solved be it at the input, or in generating the output grade output, or by execution of the claimed process.
The Board then assessed whether a technical effect was provided via an “implied technical use”:
13. What remains as a potentially patentable contribution is the purpose of the claimed system to provide an automated tool for script grading. This corresponds to the problem formulated by the Appellant, namely “providing a computer system that can automatically grade text scripts [and provide grades] that correlate well with the grades provided by human markers”. The questions to be answered are (i) whether this problem is, or implies, a technical one, and (ii) whether it is actually solved (T 641/00, reasons 5 and 6).
14. Turning first to question (ii), the Board remarks that the human grading process is a cognitive task in which the marker evaluates the content of the script (e.g. language richness and grammatical correctness) to assign a grade.
14.1 The assigned grade depends on the content of the script itself, but is also at least partly subjective: the marker will have preferences as to style and language, and will be influenced by experience and grades assigned to scripts in the past.
14.2 The Board thus doubts that the problem of automating script grading is defined well enough that one can properly assess whether it has been solved, i.e. in the sense that it provides a system that can actually replace different human markers and provide “correct” grades.
14.3 The Appellant has captured this in the problem formulation by the qualifier “correlate well”. Given the results in the application, showing that the claimed system provides results that agree with the ground truth on the same level as the markers agree with each other, the Board is satisfied that the system can produce outputs that “correlate well” with the training data from human markers. The Board has no occasion to challenge that the invention may for instance be useful, as the Appellant submitted, for the (self-)evaluation of linguistic competences by students.
15. In its communication, the Board questioned under Article 84 EPC whether the claims of the main request comprised all the features necessary to produce this result. However, given that the Appellant was willing to amend the claims to overcome this objection, the Board leaves this question open and proceeds on the assumption that the problem, as qualified by the Appellant, is solved.
16. Under this assumption, there is a first argument that any automation of human tasks, irrespective of the task, is sufficient to conclude that a technical problem is solved, as it reduces human labor.
16.1 This argument, however, contradicts the requirement of G 1/19 that there must be a technical purpose. Though G 1/19 was related to computer-implemented simulations, its reasons apply to computer-implemented methods other than simulations as well.
16.2 The Enlarged Board stated that “information which may reflect properties possibly occurring in the real world […] may be used in many different ways”, that “a claim concerning the calculation of technical information with no limitation to specific technical uses would therefore routinely raise concerns with respect to the principle that the claimed subject-matter has to be a technical invention” (reasons 98), and that “[i]f the claimed process results in a set of numerical values, it depends on the further use of such data (which use can happen as a result of human intervention or automatically within a wider technical process) whether a resulting technical effect can be considered in that assessment” (reasons 124), and concluded that “such further [technical] use has to be at least implicitly specified in the claim” (reasons 137).
16.3 Therefore, the argument that a technical problem is already solved by the mere provision of any automated tool cannot succeed.
17. As stated above, the Board assumes that the claimed invention serves the purpose of supporting its users in evaluating linguistic competences, as the Appellant argued. The Board also cannot see any other implied purposes. The question remains whether the assessment of linguistic competences, or maybe merely providing a grade, is a technical purpose.
The Appellant then argued that educational technology was a technical field and raised the question of what constitutes a technical field, or field of technology:
18. The Appellant considers that automated grading makes a technical contribution in the field of “educational technology” and, if the Board disagrees, asks the question “what is a technical field?” or “a field of technology?”.
19. The Board understands these two questions to be equivalent. The express reference to “fields of technology” in Article 52(1) EPC, introduced with the EPC 2000 in order to bring Article 52 EPC in line with Article 27(1) TRIPS, was not intended to change the established understanding that patent protection is “reserved for creations in a technical field”, i.e. involving a “technical teaching […] as to how to solve a particular technical problem” (see OJ EPO Special edition 4/2007, 48, but also G 1/19, reasons 24, and T 1784/06, reasons 2.4).
19.1 The Board further notes that the field of “educational technology” as defined by the Appellant (see point 5 above) is a rather inhomogeneous one, covering insights from – and presumably contributions to – a wide range of “fields”, technical ones and non-technical ones. It appears questionable, therefore, that this field can be considered a technical one as a whole. However, this question is not decisive.
19.2 What is decisive, according to established case law of the Boards of appeal, is whether the invention makes a contribution which may be qualified as technical in that it provides a solution to a technical problem. If this is the case, a contribution to a field of technology may be said to also be present. It is noted that the “field” of this contribution may be different from the one to which the patent more generally relates: for instance, inventions within the broad field of “educational technology” may make contributions in the field of computer science.
20. In G 1/19, the Enlarged Board followed its earlier case law and “refrain[ed] from putting forward a definition for ‘technical'”, because this term must remain open (section E.I.a, especially reasons 75 and 76; see also OJ EPO Special Edition 4/2007, 48). Nonetheless, the Enlarged Board provided considerations as to what may be considered technical.
20.1 The referring Board had suggested that a technical effect might require a “direct link with physical reality, such as a change in or a measurement of a physical entity” (see T 489/14, reasons 11).
20.2 The Enlarged Board accepted that such a “direct link with physical reality […] is in most cases sufficient to establish technicality” (reasons 88) and, in this context, that “[i]t is generally acknowledged that measurements have technical character since they are based on an interaction with physical reality at the outset of the measurement method” (reasons 99). It also stressed that an effect could also be “within the computer system or network” (i.e. internal rather than “(external) physical reality”, see G 1/19, reasons 51 and 88).
20.3 It recalled that potential technical effects might also be sufficient (see also reasons E.I.e), i.e. “effects which, for example when a computer program […] is put to its intended use, necessarily become real technical effects” (reasons 97).
20.4 And it also considered that calculated data, while “routinely raising concerns with respect to the principle that the claimed subject-matter has to be a technical invention over substantially the whole scope of the claims” might contribute to a technical effect by way of an implied technical use (reasons 98 and 137), “e.g. a use having an impact on physical reality” (reasons 137).
20.5 While the Enlarged Board of Appeal has thus found that a direct link with physical reality may not be required for a technical effect to exist, it has, in this Board’s view, confirmed that an at least indirect link to physical reality, internal or external to the computer, is indeed required. The link can be mediated by the intended use or purpose of the invention (“when executed” or when put to its “implied technical use”).
21. Returning to the case at hand, the Board finds that automated script grading, by itself or via its intended use for evaluating linguistic competences, does not have an implied use or purpose which would be technical via any direct or indirect link with physical reality.
The Board found that the method of automated script grading does not contribute to any technical, non-excluded field, be it by way of how the automation is carried out or by way of its use. Therefore, it could not be taken into account in the assessment of inventive step.
You can read the full decision here: T 0761/20 (Automated script grading/UNIVERSITY OF CAMBRIDGE) of May 22, 2023, of the Technical Board of Appeal 3.5.06.