The application underlying this decision relates to translating text within images. The EPO refused it because the independent claims merely provide an obvious implementation of improvements at the user’s cognitive level. Here are the practical takeaways from decision T 0415/21 (Translations of text in images/GOOGLE) of November 14, 2023, of the Technical Board of Appeal 3.5.07:

Key takeaways

Designing a GUI layout and presenting information in a way that produces an improvement merely at the user’s cognitive level are non-technical aspects.

Interactive elements of a GUI which a user may trigger to display additional information are considered technical.

The invention

The application underlying the present decision concerns a system for identifying text depicted in an image, translating the text, and presenting the translation, the presentation being based on the arrangement and/or other visual characteristics of the text within the image. The purpose is to present the translation in a manner that is useful to a user and avoids cluttering the display (see original description, page 9, lines 11 to 19).

Based on the arrangement and visual characteristics of the text in the image, the system selects a “presentation context”. The system then selects a user interface corresponding to the selected presentation context for presenting additional information about text identified in the image (e.g. a language translation of text identified in the image or a currency translation of a monetary amount identified in the image, see page 11, line 26, to page 12, line 3, and original claim 2).
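The two-step selection described above can be illustrated with a minimal Python sketch. It is not taken from the application or the decision: the context names (`SINGLE_BLOCK`, `MULTIPLE_BLOCKS`), the user-interface labels, and the rule of choosing a context purely by the number of text blocks are all hypothetical stand-ins for whatever arrangement and visual characteristics a real system would evaluate.

```python
from dataclasses import dataclass
from enum import Enum


class PresentationContext(Enum):
    """Hypothetical presentation contexts (names are assumptions)."""
    SINGLE_BLOCK = "single_block"        # one block of text in the image
    MULTIPLE_BLOCKS = "multiple_blocks"  # several separate blocks of text


@dataclass(frozen=True)
class TextBlock:
    """A block of text identified in the image."""
    text: str


# Hypothetical mapping: each presentation context has its own user interface.
USER_INTERFACES = {
    PresentationContext.SINGLE_BLOCK: "overlay_on_image",
    PresentationContext.MULTIPLE_BLOCKS: "one_block_at_a_time",
}


def select_presentation_context(blocks: list[TextBlock]) -> PresentationContext:
    """Pick a context from the arrangement of the text.

    Assumption: the arrangement is reduced to the number of blocks.
    """
    if not blocks:
        raise ValueError("no text identified in the image")
    if len(blocks) == 1:
        return PresentationContext.SINGLE_BLOCK
    return PresentationContext.MULTIPLE_BLOCKS


def select_user_interface(blocks: list[TextBlock]) -> str:
    """Return the user interface that corresponds to the selected context."""
    return USER_INTERFACES[select_presentation_context(blocks)]
```

For instance, an image of a sign with a single block of text would map to the overlay interface in this sketch, while an image of a menu with several blocks would map to the interface that presents one block at a time.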

Fig. 2 of WO 2015/069737 A1

Here is how the invention is defined in claim 1 of the main request:

  • Claim 1 (Main Request)

Is it technical?

It was common ground that claim 1 differs over the closest prior art D1 by the following features:

  1. the presentation context is selected from a plurality of presentation contexts based on an arrangement of the text depicted by the image, wherein each presentation context corresponds to a particular arrangement of text within images and each presentation context has a corresponding user interface for presenting additional information related to the text depicted in the image, wherein the user interface for each presentation context is different from the user interface for other presentation contexts;
  2. the user interface that corresponds to the selected presentation context is identified;
  3. additional information for a second portion of the text is not automatically presented;
  4. a selectable user-interface element is provided, which is selectable by a user to present additional information for the second portion of text depicted in the image.

The appellant argued that these features solve the following objective technical problem:

How to modify the image to include further information while resolving the conflict with the technical constraints of a limited display area and the physical features of the original image.

In support of the technicality of this problem, the appellant cited decision T 928/03. In that case, the location of the nearest teammate was conveyed to the user by dynamically displaying a guide mark on the edge of the screen whenever the teammate was off-screen. This particular manner of display produced the technical effect of facilitating a continued human-machine interaction by resolving conflicting technical requirements.

Thus, the appellant argued that:

The distinguishing features of claim 1 likewise achieved an enhanced human-machine interaction by resolving conflicting technical requirements.

The Board considered features (1) to (3) non-technical for the following reasons:

Features (1), (2) and (3) concern the layout design of a graphical user interface (GUI) and presentation of information.

It cannot be derived from claim 1 that the choice of presentation of information of features (1) to (3) is determined by constraints of the display area. Instead, information is presented such that the user is not confused by too much information or multiple text blocks being displayed simultaneously.

Accordingly, the improvement caused by distinguishing features (1) to (3) is merely at the user’s cognitive level, which is not a technical effect.

Therefore, distinguishing features (1) to (3) merely reflect non-technical requirements of how to present the translated text. This case is different from that of T 928/03 because features (1) to (3) do not solve technical constraints of the display and there is no zone of interest outside of the displayed area.

With respect to feature (4), the Board stated:

Feature (4) specifies a selectable user-interface element which the user may select to view additional information for a second portion of the text which is not automatically displayed (feature (3)). The selectable user-interface element of feature (4) is an interactive element which the user can activate to trigger the display of the additional information.

Feature (4) is thus a technical part of the user interface (see also T 2028/11, reasons 3.6). It solves the problem of modifying the method of D1 to implement the optional display of additional information for a second portion of the text (feature (3)).

Selectable user-interface elements such as buttons or links which, when selected, cause further information to be displayed are notoriously known, for example from web applications and from mobile devices. In particular, buttons of a graphical user interface and links (or hyperlinks) to further information in a web application are universally used by members of the general public, e.g. when operating smart phones or laptops.

It would thus have been obvious for the skilled person given the task of implementing the desired non-technical presentation of the optional display of a second portion of the text to use a selectable user-interface element in the system of D1, which runs on a mobile device (see e.g. D1, abstract and paragraph [0002]).

As a result, the Board came to the conclusion that claim 1 lacks inventive step and dismissed the appeal.

More information

You can read the whole decision here: T 0415/21 (Translations of text in images/GOOGLE) of November 14, 2023.

