The application relates to identifying so-called keypoints in a digital image, for e.g. scene or object identification/matching. The Board agreed with the applicant that the distinguishing feature provides the effect of eliminating spurious or unstable candidate keypoints since the points outside the validity range of the approximation are eliminated. Here are the practical takeaways from the decision T 1482/19 (Scale space approximation/TELECOM ITALIA) of February 11, 2022, of the Technical Board of Appeal 3.5.06:
The application relates to a computation method for identifying so-called keypoints in a digital image, for e.g. scene or object identification/matching.
The physical parameters of an image such as luminance are used to identify the size and position of the salient features of the image, and its location by identifying the keypoint, which is the centre of the detail. The known method detects keypoints as extrema in the image “scale space”. However, these are not reliable as they must settle for a solution that trades off efficiency with completeness.
The application proposes a method of approximating the scale space to determine keypoint. Although scale space approximation was a generally known computational method, it was not known to be used with keypoints and the invention additionally included eliminating spurious or unstable candidate keypoints.
A method for identifying keypoints in a digital image (I) comprising a set of pixels, each pixel having associated thereto a respective value of an image representative parameter, said method comprising:
– approximating (108), without explicitly computing, a filtered image (L(σ), 102) of said digital image (I), said filtered image being obtained from said digital image (I) by means of a filtering function, said filtering function being a Laplacian of Gaussian or a Difference of Gaussians and having a filtering parameter (σ) being the standard deviation of the Gaussian function, said approximating comprising:
a) generating (102) a set of base filtered images (L(σ)) of said digital image (I), each base filtered image (L(σi) being the digital image (I) filtered by means of a respective base filter, each base filter being a matrix the elements of which are obtained by calculating said filtering function with a respective value (σi) of the filtering parameter;
b) for each pixel (xt,yt) of at least a subset of said set of pixels of the digital image (I), approximating (106) the filtered image (L(xt, yt, σ)) obtained from said digital image (I) by means of said filtering function having a filtering parameter value by means of a respective approximation function based on the base filtered images, said approximation function being a function of the filtering parameter (σ) within a predefined range (104) of values of the filtering parameter (σ), said predefined range ranging from a lowest value of the filtering parameter to a highest value of the filtering parameter, wherein:
– wherein said approximating function is a polynomial expressed as follows:
L(x, y, σ) is the filtered image to be approximated
c(x, y) is the polynomial coefficient for σ**(r)
wherein said polynomial coefficients cr(x, y) are obtained as a linear combination of said base filtered images according to the following equations:
wherein F**(T) is calculated by solving the following equation:
F is the unknown which, after being calculated and transposed, gives F**(T)
S is a matrix formed by specific values of σi as follows:
wherein σ’i (with i varying from 1 to q) are specific values of the filtering parameter σ
W is a matrix calculated by solving the following equation:
AW = D
A is a matrix formed by the base filters, wherein matrix A has a number of columns equal to the number of base filters, wherein each column of matrix A represents a respective base filter, wherein each column of matrix A is built by arranging in columns the columns of each base filter
D is a matrix each element of which is obtained by calculating the filtering function, for each point (x, y) and for each of said specific values σ’j (with j varying from 1 to q)
W is the unknown which, after being calculated and transposed, gives W**(T)
– for each pixel of said at least a subset, identifying (120) such pixel as a candidate keypoint if, in such pixel, the approximation function has a local extreme (110) which is also a global extreme (116) with respect to the filtering parameter (σ) for a value σm of the filtering parameter in a respective sub-range of values of the filtering parameter internal to said predefined range said respective sub-range having a lower end larger than said lowest value of the filtering parameter and an upper end lower than said highest value of the filtering parameter, wherein the difference between said lowest value and said lower end, and the difference between said highest value and said upper end, are such that only the maximum or minimum points that happen in σm of which the behavior of the approximation function is known in a neighborhood of σm that is sufficiently large are considered as candidate keypoints;
– for each pixel identified as a candidate keypoint:
c) comparing (130) the value assumed by the approximation function at the value of the filtering parameter corresponding to the global extreme of the pixel with the values assumed by the approximation functions of the pixels adjacent to said pixel in the image at the values of the filtering parameters of the respective global extremes of such adjacent pixels, and
d) selecting (136) such pixel based on this comparison.
Is it patentable?
The Examining Division refused the application and considered the subject-matter of the claims lacked inventive steps starting from D1 (J Ng et al: “Steering in Scale Space to Optimally Detect Image Structures”) in view of D4 (D.G. Lowe: “Distinctive Image Features from Scale-Invariant Keypoints”).
The Board summarised the cited documents D1 and D4 as follows:
12. Document D1 describes a method of scale space approximation which is the same as that of the application (see D1, equations 5 to 15). This was also stated in the decision at point 2 and was not denied by the appellant.
12.1 The purpose of D1 is (introduction, page 483, paragraph 2) to address the “main problem … that exhaustive filtering with kernels over a wide range of finely-sampled scales is computationally intensive and inefficient”. The proposed method, by the formulation of what are called polynomial steering functions “lends itself well to the detection of global maxima by analytically finding the roots of the derivatives of the polynomials, rather than exhaustively performing operations over multiple scales to detect maxima as in the previous aforementioned works” (section 4, paragraph 1).
12.2 D1 uses global maxima over scales (section 4) to detect the scale of image structures in order to improve low-level feature detection and extraction by spatial filters (introduction). It is said though that the other polynomial roots “provide scale information about the other local energy maxima, minima and inflection points” (conclusion, last paragraph).
13. Document D4 represents the seminal work of Lowe regarding keypoint detection with scale invariant features. The method starts (section 3) by detecting local extrema in (DoG) scale space. These local extrema are obtained by comparing the value of a point with its neighbours both in the scale and the spatial direction (adjacent points – see figure 2). This raises the question of the sampling frequency in both scale and space domains (section 3.1). D4 uses experimentation in order to determine the frequency of sampling in the scale domain and “must settle for a solution that trades off efficiency with completeness” (end of section 3.1). To reduce computation, D4 further proposes to work on octaves (where the scale doubles), i.e. to repeat the method with the same set of scales, but on reduced versions of the image for each octave (see figure 1).
The applicant argued that document D1 was not a suitable starting document, but rather it was document D4.
16. The Board remarks that document D4 is commonly known in the field of image analysis. A person skilled in this art, i.e. the intended addressee of D1, would see that D1, although it mentions edges and ridges, proposes a general method of scale space approximation, which is said to be more computationally efficient than computation by discretization of the scale space. Furthermore, D1 teaches that with this approximation, for each image point, local and global extrema in scale can be computed analytically in an efficient way. So the skilled person would consider employing the method of D1 to improve the commonly known keypoint detection method of D4, which uses the standard scale space discretization.
17. That said, the Board accepts that this way of presenting the argument does not follow the established (but not mandatory) problem-solution approach, according to which the starting point would be the prior art to be improved and the objective technical problem to achieve that improvement. One does not, however, come to a different conclusion that way.
Therefore, the Board noted that irrespective of which document was used as the closest prior art, all features, except the last point of feature b, were obvious. However, the Board agreed with the applicant that the feature indeed provided a technical effect and acknowledged in the application as filed.
19. Thus the Board agrees in essence with the reasoning of the Examining Division regarding the combination of D1 and D4, though indeed the natural starting point according to the problem-solution approach is D4 rather than D1, because D4 is improved by D1 and not vice versa. It follows that all the claimed features but the last in feature b are obvious in view of said combination.
20. Regarding this last feature, the Examining Division and the appellant agree that this step solves a technical problem in that it allows to eliminate spurious or unstable candidate keypoints, due to the effects of the approximation.
20.1 The Board also notes that the Examining Division did not find this feature obvious over D1 and D4 but did not accept that the alleged technical effect could be acknowledged based on the application documents as filed.
20.2 The Board accepts this as a legitimate concern but finds to the contrary. The skilled person understands from page 17 that by keeping only the points for which “the behavior of the approximation functions are known in a neighborhood … that is sufficiently large”, the points outside the validity range of the approximation are eliminated, and that these are potentially not true extrema, and hence no real keypoints, but are due to errors in the approximation. Thus the technical problem of eliminating “false” extrema is clear from the application.
Therefore, the Board concluded that the subject-matter of the claim involves an inventive step, and the appeal is allowed with an order to grant a patent
You can read the whole decision here: T 1482/19 (Scale space approximation/TELECOM ITALIA) of February 11, 2022, of the Technical Board of Appeal 3.5.06