Automatically generating a list of expressions semantically related to an input linguistic expression: non-technical

In this decision, the European Patent Office did not grant a patent on the concept of automatically generating a list of expressions semantically related to an input linguistic expression. Here are the practical takeaways of decision T 1569/05 (Method for retrieving data/CANON) of 26.6.2008 of the Technical Board of Appeal 3.5.01:

Catchwords

If a method is computer-implemented, it is considered technical.

However, features relating to automatically generating a list of expressions semantically related to an input linguistic expression are basically not of a technical nature but a matter of the meaning of those expressions, i.e. of their abstract linguistic information content.

The invention

This European patent application generally relates to a data retrieving apparatus and a corresponding method. More particularly, it relates to a database system or an interface system between a database system and users (cf. EP 0 822 506 A2, page 1, lines 3-4). In the retrieval operation of common pattern matching systems, a user is unable to retrieve data that has the same meaning but is represented in a different form, or data with a similar meaning. Moreover, pattern matching cannot deal with the polysemy of words (cf. EP 0 822 506 A2, page 1, lines 7-12). The Board in charge summarized the subject-matter of the underlying application as follows:

The invention is a data processing method (claim 13) and apparatus (claim 1) for searching a database. The stored data could be of any kind but for illustration it is here assumed they represent images in accordance with the third embodiment of the invention. Each image is described by a number of words (“comparison-subjected word group”) representing its contents. A user searching for an image inputs a keyword as well as a number of “context words” intended to define the appropriate semantic context. The keyword, the context words and the comparison-subjected words are transformed to vectors in what is referred to as “semantic space”. This space has been created using eigenvalue decomposition of “space generation words”, taken for example from a dictionary. The context vectors form a “semantic center”, which is a subspace of semantic space corresponding to the given context. The semantic center does not include (…) the axes corresponding to the most frequent meanings of the space generation words. The keyword vector and the comparison-subjected vector group are projected onto the semantic center and the distances (“correlation amounts”) between the keyword vector and the comparison-subjected vectors are calculated. The closest comparison-subjected vector is identified and the corresponding image retrieved from the database (see also p.4, l.44 to p.11, l.9 of the A-publication).
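To make the Board's summary more concrete, here is a minimal Python sketch of the retrieval idea: compute a semantic center from the context vectors, keep only the axes on which that center is strong, project everything onto those axes, and pick the closest image description. It is an illustration only and not the applicant's implementation. The eigenvalue decomposition that builds the semantic space is left out, the "logical operation" is assumed to be a plain vector sum, the "correlation amount" is assumed to be a Euclidean distance, and all names, vectors and the threshold value are hypothetical.

```python
import math

def semantic_center(context_vectors):
    """Sum the context vectors and divide by the norm of the sum.
    Assumption: the claim's "logical operation" is treated as a plain sum."""
    total = [sum(column) for column in zip(*context_vectors)]
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total] if norm else total

def context_axes(center, min_abs):
    """Axes on which the semantic center is strong enough to matter."""
    return [i for i, c in enumerate(center) if abs(c) > min_abs]

def project(vector, axes):
    """Projection onto a coordinate subspace: keep only the selected axes."""
    return [vector[i] for i in axes]

def distance(a, b):
    """Euclidean distance, used here as the "correlation amount" (assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(keyword_vec, context_vecs, described_items, min_abs=0.3):
    """Return the name of the item whose description is closest to the
    keyword inside the subspace selected by the context words."""
    axes = context_axes(semantic_center(context_vecs), min_abs)
    keyword = project(keyword_vec, axes)
    return min(
        described_items,
        key=lambda name: distance(keyword, project(described_items[name], axes)),
    )

# Hypothetical three-axis semantic space and two image descriptions.
keyword = [1.0, 0.0, 1.0]
context = [[0.0, 0.0, 1.0], [0.0, 0.1, 1.0]]
images = {
    "sunset.jpg": [0.0, 1.0, 0.9],
    "invoice.png": [1.0, 0.0, 0.1],
}
best_match = retrieve(keyword, context, images)
```

With these toy vectors the context words select only the third axis, so the keyword ends up closest to "sunset.jpg", although "invoice.png" would be closer if all three axes were compared. This is the sense in which the context words decide which meaning of the keyword counts.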

Fig. 3 of EP 0 822 506 A2

Here is how the invention is defined in claim 13:

Claim 13 (main request)

A computer-implemented semantic data processing method performed by a computer to search a database for data, said method comprising:

a first inputting step (S1801) of inputting a keyword;

a space generation word transforming step (S401, S402) of transforming a predetermined space generation word group into a space generation vector group by transforming each space generation word of the space generation word group into a space generation vector which has elements corresponding to a predetermined characteristic word group to represent a meaning of the space generation word;

a semantic space generation step (S203, S402,-S407 /sic/) of generating a semantic space on the basis of the space generation vector group obtained in said space generation word transforming step;

a second inputting step (S1301) of inputting a context word group;

a first transforming step (S1802-S1804) of transforming the keyword into a keyword vector in the semantic space, the keyword vector corresponding to a combination of words which are used to describe a meaning of said keyword in a dictionary where meanings of words are described by predetermined basic words;

a second transforming step (S1302) of transforming each context word in the context word group into a vector in a context word vector group in the semantic space, the vector corresponding to a combination of words which are used to describe a meaning of said context word in said dictionary;

a third inputting step (S1601) of inputting a comparison-subjected vector group in the semantic space, each vector in the comparison-subjected vector group corresponds to respective data in the database (205, 305);

a semantic center calculating step (S1306) of calculating a semantic center of the context word vector group by performing a logical operation on all vectors of the context word vector group and dividing the results of the logical operation by a norm thereof;

a projector generating step (S1307, S1308) of generating a projector for projecting a vector in the semantic space into a substance /sic, should be subspace/ of the semantic space corresponding to the context word group, on the basis of the semantic center;

a projecting step (S1603, S1804) of projecting the keyword vector and the comparison-subjected vector group in the substance /sic, should be subspace/ by utilizing the projector;

a calculating step (S1805) of calculating a correlation amount between each word of a comparison-subjected word group and the keyword; and

a selecting step (S2309, S2809) of selecting at least one vector from the comparison-subjected vector group on the basis of the correlation amount; and

a retrieving step (S2310, S2810) of retrieving data from the database (205, 305) based on the selected vector and outputting the retrieved data as a search result,

characterized in that,

in said second inputting step, the comparison-subjected vector group is input by transforming each comparison-subjected word in a comparison-subjected word group into a vector in the comparison-subjected vector group in the semantic space, the vector corresponding to a combination of words which are used to describe a meaning of said comparison-subjected word in said dictionary;

in said semantic space generation step, a principal-axis index set is generated (S407) by calculating a sum vector of the space generation words /sic/ vector group and selecting an axis of the sum vector as the principal-axis index set if an absolute value of corresponding element satisfies a condition for a ratio to an absolute value of a succeeding element in descending order of the absolute values;

in said projector generating step, the projector is generated (S1307, S1308) so as to project the vector in the subspace consisting of axes that correspond to elements of the semantic center, the absolute values of which are larger than a predetermined value, and that do not belong to the principal-axis index set, and

in said retrieving step, data associated with a word corresponding to the selected vector in the database are retrieved.

Is it technical?

The first-instance examining division had refused the patent application for lack of inventive step in light of the cited prior art. In reaction thereto, the applicants appealed the decision.

With respect to technicality, the Board in charge stated as follows:

Claim 13 is directed to a “computer-implemented method… performed by a computer”. A computer being a technical means, the subject-matter of claim 13 is an invention within the meaning of Article 52(1) EPC.

The appellants agreed with the Board’s assessment that the subject-matter of claim 13 of the main request differs from the teaching of the closest prior art in that:

– a principal-axis index set is generated by calculating a sum vector of the space generation vector group and selecting an axis of the sum vector as the principal-axis index set if an absolute value of the corresponding element satisfies a condition for a ratio to an absolute value of a succeeding element in descending order of the absolute values, and

– the subspace into which the projector projects a vector contains no axes belonging to the principal-axis index set.
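The two distinguishing features can be sketched in a few lines of Python. The claim does not state the exact ratio condition, so the sketch assumes one possible reading: the axes of the sum vector are sorted by absolute value in descending order, and an axis joins the principal-axis index set when its absolute value is at least `min_ratio` times that of the next axis in that order. The function names, the thresholds and the sample vectors are hypothetical.

```python
def principal_axis_index_set(space_generation_vectors, min_ratio=2.0):
    """Pick the axes of the sum vector that dominate their successor.
    Assumed reading of the claim: |element| >= min_ratio * |next element|
    when the elements are sorted by absolute value in descending order."""
    total = [sum(column) for column in zip(*space_generation_vectors)]
    order = sorted(range(len(total)), key=lambda i: abs(total[i]), reverse=True)
    selected = set()
    for position in range(len(order) - 1):
        current = abs(total[order[position]])
        succeeding = abs(total[order[position + 1]])
        if current > 0 and current >= min_ratio * succeeding:
            selected.add(order[position])
    return selected

def subspace_axes(center, principal_axes, min_abs=0.3):
    """Axes of the projection subspace: strong in the semantic center
    and not part of the principal-axis index set."""
    return [
        i for i, c in enumerate(center)
        if abs(c) > min_abs and i not in principal_axes
    ]

# Hypothetical space generation vectors: axis 0 is present in nearly every word.
space_generation_vectors = [
    [5.0, 1.0, 0.0],
    [4.0, 1.0, 1.0],
    [6.0, 0.0, 1.0],
]
principal = principal_axis_index_set(space_generation_vectors)

# Hypothetical semantic center computed from some context words.
center = [0.8, 0.1, 0.6]
axes = subspace_axes(center, principal)
```

Here the sum vector is [15, 2, 2], so only axis 0 satisfies the assumed ratio condition. Although the semantic center is strongest on axis 0, that axis is excluded, axis 1 falls below the threshold, and the projection subspace is left with axis 2 alone.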

The Board summarized the teaching of these two distinguishing features as follows:

3.2 (…) Hence, in essence the claimed data processing method differs from the prior art by a modification of the mathematical model of meaning used for data retrieval. Put simply, common elements of meaning, having no distinguishing power, are determined, and the corresponding axes are excluded from the subspace (“semantic center”) where the correlations between the keyword and the image descriptions (“comparison-subjected word group”) are evaluated.

However, the Board agreed with the assessment of the first-instance examining division and considered the distinguishing features to be non-technical:

3.3 Also the examining division found that the above two features (as they were then formulated) distinguished the invention from D3 (cf the decision under appeal, point 1.1). In the division’s opinion, the features merely caused a further restriction of the subspace to be searched (cf the decision under appeal, point 1.2). This was a technically non-functional modification of the known “mathematical model of meaning”, relating to the field of linguistics. The invention thus did not involve an inventive step.

In response, the appellants argued that the application relates to the technical field of utilizing a natural language as a search input, as allegedly confirmed by T 208/84. However, the Board in charge was not persuaded and held that the present case could not be compared to the decision referred to by the appellants:

3.5 In the Board’s view, (…) the modified model according to the invention [is not] within the technical area, since only the meaning of the words determines how they are represented, stored and selected, and since mathematical algorithms completely define the processing.

3.6 A technical aspect can therefore at most be seen in the application of these models for retrieving data in a computer database, such retrieval being normally considered to have technical character.

In this respect, the Board further argued that using such a modified model would be obvious in light of the cited prior art:

3.8 (…) To use such a modified model for data retrieval is obvious in the light of D3. Search efficiency is a standard problem in data retrieval applications and any modification leading to faster and arguably better search results would be clearly desirable.

As a result, the Board in charge dismissed the appeal due to lack of inventive step.

More information

You can read the whole decision here: T 1569/05 (Method for retrieving data/CANON) of 26.6.2008.
