Digital Signal Processing

Scientific & Technical

“Digital Signal Processing” No. 3-2018

Digital image processing

In the issue:

- applied television system adaptation
- SAR image
- geometric processing of images
- mathematical model of synthesized images
- subpixel identification of objects
- wavelet transform in video compression
- increasing contrast of small image details
- aerial object detection and recognition
- neural networks for object recognition
- pathology detection in images of gastric
- system-on-chip optical flow computation

Applied television system adaptation to the plots dynamic
Bobrovsky A. I.
Federal State Unitary Enterprise «State Research Institute of Applied Problems» (FSUE «GosNIIPP»), Russia, St. Petersburg

Keywords: optimization, adaptation, video control, information processing, plots dynamic.

The paper considers the main methods for adapting the image decomposition parameters of an applied television system to the dynamics of the plot, together with the optimization criteria for its control system.

Adaptation of the television system to the dynamics of the plot is based on minimizing the error in measuring the time-varying coordinates of objects, subject to the limited speed at which information can be read from the photodetector arrays. The variable parameters are image clarity and frame rate, which are traded against each other by discrete switching, following the principle of equal variance of the inter-element and inter-frame increments of the video signal.

To make the control independent of the illumination level of the scene, the traditional scheme, which estimates the difference between the compared values, is replaced by one that estimates their ratio. Because the raster is discrete, this ratio changes by a factor of four when the system switches between its two states ("full clarity and low frame rate" and "reduced clarity and high frame rate"): each of the two estimated variances changes by a factor of two, in opposite directions.

Stable automatic control of the decomposition parameters of the measuring system requires hysteresis in switching between the two states. Because the observation statistics fluctuate, the relative width of the hysteresis (the ratio of the thresholds) must exceed its minimum possible value of 4. The optimal relative hysteresis width is determined by the criterion of maximum time spent within the interval [1/2, 2].

The trade-off between clarity and frame rate described here realizes a new paradigm in the theory of applied television systems, replacing the old paradigm of passively accounting for the loss of resolution caused by motion of the observed object. Optimization of the space television system under consideration aims to extract information of maximum quality within the limited bandwidth of the channels linking the television camera to the on-board computer and/or the ground receiving terminal. The proposed development of the theory of computer vision system synthesis takes into account the influence of solid-state imaging technology on the methods of system analysis and synthesis, optimization, control, decision-making and information processing based on the principle of dominant information.
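The switching rule with hysteresis can be illustrated by a minimal sketch. It rests on assumptions that the abstract does not state: the decision statistic is taken to be the ratio of the inter-frame to the inter-element increment variance, the two thresholds are placed symmetrically around 1 on a logarithmic scale, and the names `variance_ratio` and `HysteresisSwitch`, as well as the default width of 6, are hypothetical.

```python
import numpy as np


def variance_ratio(prev_frame, frame):
    """Ratio of inter-frame to inter-element increment variances (assumed statistic)."""
    frame = np.asarray(frame, dtype=float)
    prev_frame = np.asarray(prev_frame, dtype=float)
    inter_frame = frame - prev_frame        # increments between consecutive frames
    inter_element = np.diff(frame, axis=1)  # increments between neighbouring pixels
    return inter_frame.var() / inter_element.var()


class HysteresisSwitch:
    """Two-state controller: 'full_clarity' (low frame rate) or 'high_rate' (reduced clarity)."""

    def __init__(self, width=6.0):
        # Switching states changes the ratio by a factor of 4, so a
        # hysteresis no wider than 4 would let the system oscillate.
        if width <= 4.0:
            raise ValueError("relative hysteresis width must exceed 4")
        self.upper = width ** 0.5   # hypothetical symmetric threshold placement
        self.lower = 1.0 / self.upper
        self.state = "full_clarity"

    def update(self, ratio):
        """Feed one measured ratio and return the (possibly new) state."""
        if self.state == "full_clarity" and ratio > self.upper:
            self.state = "high_rate"      # dynamic scene: trade clarity for frame rate
        elif self.state == "high_rate" and ratio < self.lower:
            self.state = "full_clarity"   # static scene: restore full clarity
        return self.state
```

With a width of 6, a ratio of 3 measured at full clarity triggers the switch; the same scene then yields 3/4 in the new state, which stays above the lower threshold, so the controller does not switch back immediately.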


Ascending and descending pass SAR image fusion based on fuzzy logic
A.E. Moskvitin
V.A. Ushenkin
The Ryazan State Radio Engineering University (RSREU), Russia, Ryazan

Keywords: SAR image, ascending pass, descending pass, orbit, fuzzy logic, fusion.


The problem of fusing SAR images acquired on ascending and descending passes is considered. Such images show the scene from different sides. The goal of the fusion is to reduce the amount of geometric distortion and shadowing. A fusion algorithm based on fuzzy logic and rigorous computation of layover and shadow masks is proposed.
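One way such a fusion might be organized is sketched below. The abstract gives neither the membership functions nor the fusion rule, so everything here is an assumption: both images are taken to be already geocoded to a common grid, each pixel carries a hypothetical fuzzy degree of membership in the "distorted" set (1 inside a layover or shadow mask, 0 in undistorted areas, intermediate values near mask borders), and the output is a weighted mean that favours the less distorted pass.

```python
import numpy as np


def fuse_passes(asc, desc, mu_asc, mu_desc):
    """Blend two co-registered images using fuzzy distortion degrees in [0, 1].

    mu_asc / mu_desc are hypothetical membership degrees: 1 means the pixel is
    fully inside a layover or shadow mask, 0 means it is undistorted.
    """
    asc = np.asarray(asc, dtype=float)
    desc = np.asarray(desc, dtype=float)
    w_asc = 1.0 - np.clip(np.asarray(mu_asc, dtype=float), 0.0, 1.0)
    w_desc = 1.0 - np.clip(np.asarray(mu_desc, dtype=float), 0.0, 1.0)
    total = w_asc + w_desc
    safe_total = np.where(total > 0, total, 1.0)  # avoid division by zero
    weighted = (w_asc * asc + w_desc * desc) / safe_total
    # Where both passes are fully distorted, fall back to the plain average.
    return np.where(total > 0, weighted, 0.5 * (asc + desc))
```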

Effective organization of mass coordinate transformations in the geometric processing of SAR images
N.A. Egoshkin
V.A. Ushenkin
The Ryazan State Radio Engineering University (RSREU), Russia, Ryazan

Keywords: SAR image, geometric processing, interpolation, geocoding, orthotransformation.

The problem of efficiently organizing mass coordinate transformations in the geometric processing of SAR images is considered, with the aim of achieving high processing speed at low RAM cost. An approximating regular grid with piecewise parabolic interpolation between nodes is proposed. Its efficiency in terms of speed and memory is substantiated and confirmed experimentally.
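The idea of replacing an expensive transformation by a regular grid with piecewise parabolic interpolation can be sketched in one dimension as follows. This is an illustration only: the paper deals with image coordinates, whereas the sketch uses a single variable, and the function names `build_grid` and `parabolic_interp` are hypothetical. The exact transformation is evaluated once at the grid nodes; every other point is then obtained from the parabola through the three nearest nodes.

```python
import numpy as np


def build_grid(transform, x_min, x_max, step):
    """Evaluate the exact (expensive) transform once at regular grid nodes."""
    n = int(round((x_max - x_min) / step)) + 1
    nodes = x_min + step * np.arange(n)
    return x_min, step, transform(nodes)


def parabolic_interp(x, x_min, step, values):
    """Piecewise parabolic interpolation through the three nearest nodes."""
    x = np.asarray(x, dtype=float)
    pos = (x - x_min) / step
    # Central node of the 3-node stencil, kept away from the grid borders.
    i = np.clip(np.rint(pos).astype(int), 1, len(values) - 2)
    t = pos - i  # offset from the central node in units of the grid step
    y0, y1, y2 = values[i - 1], values[i], values[i + 1]
    return y1 + 0.5 * t * (y2 - y0) + 0.5 * t * t * (y2 - 2.0 * y1 + y0)
```

Only the node values are stored, so memory grows with the number of nodes rather than the number of pixels, and each interpolated point costs a handful of multiplications.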

Decimeter-resolution spaceborne SAR raw data focusing model

V.A. Ushenkin
The Ryazan State Radio Engineering University (RSREU), Russia, Ryazan

Keywords: SAR, raw data, focusing, SAR image.

The paper considers the new factors that must be taken into account when focusing spaceborne SAR raw data as the spatial resolution of the SAR image becomes sub-meter and approaches the decimeter level. A mathematical model of focusing that provides higher quality in this case is proposed. The model is confirmed by simulation results.

Subpixel identification of objects by multi- and hyperspectral data applying sequential quadratic programming and a method of spectral components analyses
O.V. Grigoreva
A.F. Mozhaisky Military Space Academy, Russia, Saint Petersburg

Keywords: hyperspectral data, treatment, sequential quadratic programming, spectral components, spatial-scalable filtering, sequence analysis.

The paper presents an original ensemble algorithm for the thematic processing of hyperspectral remote sensing data. The algorithm uses the sequential quadratic programming (SQP) method to determine the spectral components of the hyperspectral data; analysis of these components makes it possible to identify objects of interest at the subpixel level. Linear mixing of objects within a pixel is assumed. The spectral component corresponding to the object being identified is detected by special features. To prepare informative features, the reference (etalon) spectra are convolved using spatial-scalable filtering and sequence analysis. This formalization of the spectral signature yields features based on shifts and local inflection points of the spectra that are not typical of the background. It also reduces the amount of processed information without lowering the probability of target detection compared with traditionally used indices.

The paper describes in detail how the matrix of spectral signatures of the image components is obtained with the SQP method. Thanks to SQP, the developed algorithm fully accounts for the non-negativity and additivity (sum-to-one) constraints imposed on the coefficients of the mixture decomposition.

Examples of verifying the algorithm on experimental aerial hyperspectral images are given. The input data were images acquired by the video spectrometer made by the Research and Production Association "Lepton". The reference spectral signatures of objects needed to form the feature values were obtained from ground-based measurements with an ASD FieldSpec®4 spectroradiometer. The experiments showed that the developed algorithm reduces the number of false targets by a factor of 1.5 to 2 compared with other linear spectral unmixing methods.
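The role of the two constraints can be illustrated with a simplified sketch. It is not the algorithm of the paper: here the spectral signatures (the columns of `endmembers`) are assumed to be already known and only the mixture coefficients of one pixel are estimated, whereas the paper uses SQP to obtain the signature matrix itself. The solver is SciPy's SLSQP, a sequential least-squares quadratic programming method; the function name `unmix_pixel` is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def unmix_pixel(pixel, endmembers):
    """Estimate mixture coefficients a >= 0, sum(a) = 1, with pixel ~ endmembers @ a."""
    pixel = np.asarray(pixel, dtype=float)
    endmembers = np.asarray(endmembers, dtype=float)  # shape: (bands, components)
    k = endmembers.shape[1]

    def cost(a):
        r = endmembers @ a - pixel
        return float(r @ r)  # squared reconstruction error

    def grad(a):
        return 2.0 * endmembers.T @ (endmembers @ a - pixel)

    result = minimize(
        cost,
        np.full(k, 1.0 / k),        # start from an equal mixture
        jac=grad,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * k,    # non-negativity of the coefficients
        constraints=[{"type": "eq", "fun": lambda a: np.sum(a) - 1.0}],  # additivity
        options={"ftol": 1e-12, "maxiter": 200},
    )
    return result.x
```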

Applications of complex wavelet transform in video compression
Dam Trong Nam
The Moscow Institute of Physics and Technology (MIPT), Russia, Moscow

Keywords: inter-frame coding, discrete wavelet transform, complex wavelet transform, dual tree complex wavelet transform, motion compensation, block matching, overlapped block matching.

The article studies applications of the complex wavelet transform in video compression. Most modern video codecs use the discrete cosine transform (DCT), which suffers from blocking and mosaic artifacts. Unlike the DCT, the discrete wavelet transform (DWT) operates not on small blocks but on tiles or even whole frames, and is therefore free of such defects. For moving-picture compression, however, the DWT has disadvantages of its own. The first is shift variance: a small shift in the frame causes significant changes in the values of the wavelet coefficients. As a result, wavelet coefficients cannot be used for motion estimation and motion compensation, which video codecs rely on to reduce temporal redundancy and thereby increase the compression ratio. The second is the absence of a notion of phase, such as the Fourier transform has, which would describe motion between frames more accurately and could therefore be used for motion compensation. The Fourier transform, widely used in signal processing, has attractive properties: its magnitude is perfectly shift-invariant, and the shift is encoded in a simple linear phase. Inspired by the Fourier transform, the complex wavelet transform (CWT) is a powerful tool for overcoming these problems of the DWT in video compression.
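The two Fourier properties that motivate the CWT can be checked numerically. The sketch below is an illustration rather than part of the article; it uses a circular shift so that the discrete relations hold exactly, and the function name `shift_phase_demo` is hypothetical.

```python
import numpy as np


def shift_phase_demo(signal, shift):
    """Return the spectrum, the spectrum of the shifted signal, and the prediction.

    For a circular shift by `shift` samples the DFT obeys
    Y[k] = X[k] * exp(-2j*pi*k*shift/N): same magnitude, linear phase ramp.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    spectrum = np.fft.fft(signal)
    shifted_spectrum = np.fft.fft(np.roll(signal, shift))
    k = np.arange(n)
    predicted = spectrum * np.exp(-2j * np.pi * k * shift / n)
    return spectrum, shifted_spectrum, predicted
```

Comparing `np.abs` of the first two results shows the shift-invariant magnitude, while the third result reproduces the shifted spectrum from the linear phase term alone.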

The aim of this work is to study the application of the CWT to motion compensation. To this end, the article considers the construction and properties of the DWT and of the CWT based on the dual-tree complex wavelet transform (DTCWT). The principle of inter-frame coding is briefly outlined using a wavelet-based video coder as an example. A motion compensation method for real video using the CWT is proposed. To evaluate the proposed method, the work also examines reference methods that operate on the values of the luma component: block matching (BM) and overlapped block matching (OBM), the latter being used in the Dirac video codec.
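For orientation, the simplest of the reference methods, block matching on luma values, can be sketched as an exhaustive search. The matching criterion (sum of absolute differences), the block size and the search range are assumptions chosen for illustration; the abstract does not specify them, and the function name `block_match` is hypothetical.

```python
import numpy as np


def block_match(ref, cur, top, left, block=8, search=4):
    """Full-search block matching: find the displacement (dy, dx) into `ref`
    that best predicts the block of `cur` at (top, left), using SAD."""
    target = cur[top:top + block, left:left + block].astype(np.int64)
    best_vector, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            candidate = ref[y:y + block, x:x + block].astype(np.int64)
            sad = int(np.abs(candidate - target).sum())
            if best_sad is None or sad < best_sad:
                best_vector, best_sad = (dy, dx), sad
    return best_vector, best_sad
```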

Experiments with the high-definition video sequences "city" and "stockholm" gave the following results:
- The prediction quality of the proposed method depends very little on the type of wavelet, so the DTCWT can be built from any wavelet filter and the method is easy to adapt to existing wavelet codecs.
- The proposed method yields a lower prediction error energy. Specifically, when the difference between the current and reference frames is small, it gains 0.2 to 0.5 dB in PSNR of the prediction error over the reference methods; otherwise the gain exceeds 0.5 dB.
- For the same quality of the reconstructed frame, the proposed method reduces the compressed output for the quantized prediction error by 20% to 25% for the video "city" and by 6% to 8% for the video "stockholm", which means that it is better suited to video compression than the reference methods.
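The PSNR figures quoted for the prediction error follow the usual definition, sketched below for reference. The peak value of 255 assumes 8-bit luma samples, which the abstract does not state; the function name `psnr` is hypothetical.

```python
import numpy as np


def psnr(frame, prediction, peak=255.0):
    """Peak signal-to-noise ratio in dB between a frame and its prediction."""
    diff = np.asarray(frame, dtype=float) - np.asarray(prediction, dtype=float)
    mse = np.mean(diff ** 2)  # mean energy of the prediction error
    if mse == 0:
        return float("inf")   # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)
```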

Optimization of quantization method for wavelet-based video codec
Dam Trong Nam
The Moscow Institute of Physics and Technology (MIPT), Russia, Moscow

Keywords: video codec Dirac, video coding, wavelet decomposition, quantization, uniform quantization, nonlinear quantization, Lloyd-Max algorithm, entropy-constrained quantization.

The article deals with quantization methods for a wavelet-based video codec. Optimal quantization of transform coefficients is known to be crucial for image compression. Uniform, nonlinear and entropy-constrained quantization methods are investigated.

In the wavelet-based video codec Dirac considered here, the wavelet transform coefficients are divided into frequency bands and quantized with simple uniform quantization. However, the coefficients can also be quantized with regard to the subjective sensitivity of the human eye to different spatial frequencies, so that different bands are quantized with different quantization steps, as in the JPEG 2000 standard. It is also possible to apply the nonlinear Lloyd-Max quantization algorithm and the entropy-constrained quantization method, which are mostly used for converting analog signals to digital form. Because the quantization implemented in this codec is so simple, there is potential for finding a better method.

The quantization methods mentioned above were adapted to the wavelet-based video codec under consideration. Based on an analysis of the results obtained for various types of video with different quantization methods, an optimized quantization method is proposed: uniform quantization for some high-frequency bands combined with the Lloyd-Max algorithm for the other bands. The proposed method noticeably reduces the output data rate for given distortion levels. Compared with the uniform quantization implemented in the codec, it provides up to 4.2% bitrate saving at a low distortion level in intra-frame coding mode, and up to 9.8% bitrate saving at a low distortion level and up to 11.2% at a medium (acceptable) distortion level in inter-frame coding mode.
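The difference between uniform and Lloyd-Max quantization can be illustrated with a minimal scalar sketch. It is not the codec's implementation: the quantile initialization, the fixed number of iterations and the function names are assumptions made for the example. Lloyd-Max alternates between placing decision thresholds midway between reproduction levels and moving each level to the mean of the samples in its cell.

```python
import numpy as np


def uniform_quantize(x, step):
    """Uniform quantizer: round each value to the nearest multiple of `step`."""
    return step * np.round(np.asarray(x, dtype=float) / step)


def lloyd_max(samples, levels, iters=50):
    """Fit reproduction levels and decision thresholds by Lloyd-Max iteration."""
    samples = np.asarray(samples, dtype=float)
    # Start from evenly spaced quantiles of the data (illustrative choice).
    centers = np.quantile(samples, (np.arange(levels) + 0.5) / levels)
    for _ in range(iters):
        edges = 0.5 * (centers[:-1] + centers[1:])  # thresholds between levels
        cells = np.searchsorted(edges, samples)     # cell index of every sample
        for k in range(levels):
            members = samples[cells == k]
            if members.size:                        # keep the level if its cell is empty
                centers[k] = members.mean()
    edges = 0.5 * (centers[:-1] + centers[1:])
    return centers, edges


def quantize(x, centers, edges):
    """Map each value to the reproduction level of the cell it falls into."""
    return centers[np.searchsorted(edges, np.asarray(x, dtype=float))]
```

For peaked, heavy-tailed data such as wavelet coefficients, the fitted levels cluster near zero, which is why a Lloyd-Max quantizer can reach a lower distortion than a uniform one with the same number of levels.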

No-reference integrated-multiplicative quality index for digital grayscale images
A.S. Sychev
I.S. Kholopov
The Ryazan State Radio Engineering University (RSREU), Russia, Ryazan

Keywords: no-reference quality index, integral quality index, brightness, standard deviation, histogram, contrast, brightness levels, entropy.

The aim of the work is to develop a no-reference normalized integrated-multiplicative quality index (IMQI) for digital grayscale images, intended for evaluating the efficiency of vision enhancement algorithms and for selecting channels when fusing images from different sensors.

The article reviews several published sources which show that assessing visual image quality by individual measures is not objective. It is therefore appropriate to use combined quality criteria that operate on several partial indicators. It is shown that, despite the obvious advantages of the well-known integral quality index (IQI), which operates on normalized values of brightness, standard deviation, number of brightness levels, contrast and entropy, its values for noisy images do not correlate with the subjective perception of their quality. An integrated-multiplicative quality criterion is introduced that operates on the following partial indicators: estimates of the average brightness, of the standard deviation of the brightness of the high-frequency components and of the noise, and of the average local contrasts of the analyzed frame and of its low-frequency component. The results of applying the developed quality criterion to images in various spectral ranges (visible, short-wave infrared and long-wave infrared) are analyzed. It is concluded that infrared image defects (dead pixels, structural noise) must be compensated before the IMQI is calculated. It is also shown that for images obtained by nonlinear brightness transformations (for example, Multiscale Retinex), the quality index given by the IMQI formula is overestimated.

The results of a semi-real experiment showed that, unlike the known IQI, the proposed IMQI decreases rather than increases when the power of additive white Gaussian noise is high.
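The partial indicators on which the known IQI operates can be computed as in the sketch below. Only the individual measures are shown: the abstract gives neither the normalization nor the formula that combines them, so none is assumed here. An 8-bit image is assumed, the global contrast is taken in Michelson's form, and the function name `partial_indicators` is hypothetical.

```python
import numpy as np


def partial_indicators(image):
    """Brightness, spread, level count, contrast and entropy of an 8-bit image."""
    img = np.asarray(image).astype(np.int64)
    hist = np.bincount(img.ravel(), minlength=256)  # brightness histogram
    probs = hist[hist > 0] / img.size
    lo, hi = int(img.min()), int(img.max())
    return {
        "mean": float(img.mean()),
        "std": float(img.std()),
        "levels": int(np.count_nonzero(hist)),      # distinct brightness levels
        "contrast": (hi - lo) / (hi + lo) if hi + lo > 0 else 0.0,  # Michelson form
        "entropy": float(-np.sum(probs * np.log2(probs))) + 0.0,    # bits per pixel
    }
```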


Aerial object detection and recognition based on multispectral image fusion and processing
Muraviev M.S., Smirnov S.A., Strotov V.V.

Keywords: aerial object detection, object position estimation, object recognition, outer contour descriptor, multispectral image processing

This work proposes an approach to aerial object detection, position estimation, and recognition based on multispectral image fusion and processing.

Object detection and position estimation are based on multi-step spatial filtering of the image. The approach is extended to multispectral imaging. An algorithm performance indicator based on object brightness is proposed.

The object recognition algorithm is based on matching outer contour descriptors. The proposed orientation estimation algorithm consists of two stages: learning and recognition. The learning stage is devoted to studying the objects of interest. Using 3D models of the reference objects, a set of training images is collected by capturing each model from viewpoints evenly distributed on a sphere. The object contour can be obtained with various border extraction techniques, the active contour method, etc.
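The viewpoint sampling described above can be sketched as follows. This is only an illustration: the abstract does not say how the viewpoints are distributed, so the Fibonacci lattice used here and the name `sphere_viewpoints` are assumptions, not the authors' method.

```python
import math


def sphere_viewpoints(n):
    """Return n approximately evenly spaced unit vectors (Fibonacci lattice).

    Assumption: the source only says "evenly distributed on a sphere";
    the Fibonacci lattice is one common way to approximate that.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # Heights are spread uniformly over (-1, 1), which gives equal-area bands.
        z = 1.0 - (2.0 * i + 1.0) / n
        radius = math.sqrt(1.0 - z * z)
        azimuth = golden_angle * i
        points.append((radius * math.cos(azimuth), radius * math.sin(azimuth), z))
    return points


if __name__ == "__main__":
    for x, y, z in sphere_viewpoints(5):
        print(f"{x:+.3f} {y:+.3f} {z:+.3f}")
```

Each returned vector can serve as a camera direction pointing at the model placed at the origin; the number of viewpoints is a free parameter.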

The recognition stage matches the descriptor of the observed image against the descriptors of the reference images. The contour descriptor of the object is shifted cyclically to achieve rotation invariance. The matching yields a measure of the difference between the captured object and the n-th reference object from the database. The recognition stage involves a limited number of operations and can run in real-time image processing systems. The source data for the recognition algorithm is a set of binary images produced by the object detection algorithm.
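A minimal sketch of cyclic-shift matching is given below. It assumes the descriptor is a fixed-length list of numbers (for example, centroid-to-contour distances sampled at uniform angles) and uses the mean squared difference as the dissimilarity measure. Both choices, and the names `cyclic_match` and `recognize`, are illustrative assumptions rather than the authors' definitions.

```python
def cyclic_match(observed, reference):
    """Return (smallest mean squared difference, shift that achieves it).

    The observed descriptor is rotated through every cyclic shift, so the
    result does not depend on the in-plane rotation of the object.
    """
    n = len(observed)
    best_cost, best_shift = float("inf"), 0
    for shift in range(n):
        cost = sum(
            (observed[(i - shift) % n] - reference[i]) ** 2 for i in range(n)
        ) / n
        if cost < best_cost:
            best_cost, best_shift = cost, shift
    return best_cost, best_shift


def recognize(observed, references):
    """Return the index of the closest reference descriptor and all scores."""
    scores = [cyclic_match(observed, ref)[0] for ref in references]
    return scores.index(min(scores)), scores


if __name__ == "__main__":
    references = [[1, 2, 3, 4, 5, 6], [3, 3, 3, 3, 3, 3]]
    observed = [5, 6, 1, 2, 3, 4]  # first reference, cyclically shifted
    print(recognize(observed, references))
```

The exhaustive search costs O(n²) operations per reference descriptor, which stays small for short descriptors.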

The results of experimental studies are given. The experiments were performed on a set of natural multispectral video sequences. They show that the detection true positive rate is better than 0.9 at a false alarm rate below 0.05. The recognition true positive rate exceeds 90%.


1. J. Dong, D. Zhuang, Y. Huang and J. Fu. Advances in Multi-Sensor Data Fusion: Algorithms and Applications // Sensors. – 2009. – No. 9(10). – P. 7771-7784.

2. Lanir J., Maltz M., Rotman S.R. Comparing multispectral image fusion methods for a target detection task // Optical Engineering. – 2007. – Vol. 46(6). – P. 066402-1–066402-8

3. Hailiang Shi., Baohui Tian, Yuanzheng Wang Fusion of multispectral and panchromatic satellite images using Principal Component Analysis and Nonsubsampled Contourlet Transform / Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD). – 2010. – PP. 2312 – 2315.

4. Mitianoudis N., Stathaki T. Adaptive image fusion using ICA bases / Proceedings of the International Conference on Acoustics, Speech and Signal Processing. – Toulouse, 2006. – PP. II-829–II-832.

5. Kaarna A. Integer PCA and wavelet transforms for multispectral image compression / IEEE 2001 International Geoscience and Remote Sensing Symposium (IGARSS). – 2001. – Vol.4. – PP. 1853 – 1855.

6. A. Sarkar et al. A MRF model-based segmentation approach to classification for multispectral imagery // IEEE Transactions on Geoscience and Remote Sensing. – 2002. – Vol. 40, Issue 5. – pp. 1102-1113.

7. F. Samadzadegan. Data integration related to sensors, data and models // ISPRS Congress, Vol. XXXV, Proceedings of Commission IV, Istanbul, Turkey, 2004.– p. 569-574.

8. Vidya Manian, Luis O. Jimenez, Land cover and benthic habitat classification using texture features from hyperspectral and multispectral images // Journal of Electronic Imaging 16(2), 023011 (Apr–Jun 2007), pp. 1-12.

9. Babayan P.V., Smirnov S.A. Object tracking using the template matching algorithm for multispectral (visible and infrared band) visual systems // Digital signal processing – 2010.–Pp.18-21.

10. Alpatov B.A. Optimal moving object parameter estimation in image sequences. // Avtometriya. – No. 2. – Pp. 32-37.

11. Muraviev V.S., Muraviev S.I. Object extraction and position estimation algorithm for the cloudy backgrounds // The Bulletin of the RSREU. – 2007. – No. 21. – Pp. 20-24.

12. Alpatov B.A., Babayan P.V., Smirnov S.A., Maslennikov E.A. The special orientation prior estimation algorithm using the outer contour descriptor // Digital signal processing. – 2014. – No. 3. – Pp. 43-46.

13. Repin V.G., Tartakovskiy G.P. Statistical synthesis in case of the apriority ambiguity and the information system adaptation – Moscow: Soviet Radio, 1977. – 432 pages.

14. Alpatov B.A., Babayan P.V., Balashov O.E. and Stepashkin A.I. The methods of the automatically object detection and tracking. Image processing and control – Moscow: Radiotechnika, 2008. – 176 pages.

15. Alpatov B.A., Babayan P.V., Smirnov S.A. The composite aerial object tracking algorithm. // The Bulletin of the RSREU. – 2011. – No. 37. – Pp. 7-12.

16. Muraviev V. S., Smirnov S. A., Strotov V. V. Aerial vehicles detection and recognition for UAV vision system // Computer Optics. – 2017. – Vol. 41. – No. 4. – Pp. 545-551.

Modern convolutional neural networks for object detection and recognition
Erokhin D.Y., e-mail:
Ershov M.D., e-mail:
The Ryazan State Radio Engineering University (RSREU), Russia, Ryazan

Keywords: intelligent systems, image processing, object detection, pattern recognition, neural networks, machine learning.

Intelligent video processing systems are now widely used in many areas of human life. Their development is driven by advances in computer technology and by new methods of video processing and analysis. The key tasks of most video processing systems are the detection, recognition and tracking of objects.

Modern artificial neural networks are able to detect and localize objects of known classes. This allows them to be used in various technical vision systems. The article contains a comparison of different neural network architectures that are used to solve the problem of object detection and recognition.

Neural network architectures for the detection and recognition of objects can be divided into two large groups. The first group includes architectures that process regions in the image (Region-based Convolutional Neural Network – R-CNN). The second group includes architectures that process the entire image (You Only Look Once – YOLO; Single Shot MultiBox Detector – SSD).

In this work we compare three architectures (YOLO, Faster R-CNN, SSD) by the following criteria: processing speed, mean Average Precision (mAP), precision and recall. Five neural network detectors were trained for comparison purposes: YOLOv3; Faster R-CNN with the InceptionResnet-2 network for feature extraction; Faster R-CNN with the Resnet-101 network for feature extraction; SSD with the MobileNet-1 network for feature extraction; SSD with the MobileNet-2 network for feature extraction.

During the experiment we used images containing objects of the classes “pedestrian” and “vehicle”. About 6,700 annotated images were used for training and 750 images for processing. The quality of the object detectors was assessed by plotting the precision-recall curve, as well as graphs of precision, recall and F-measure for different thresholds. To assess quality as a function of the training iteration, we also calculated the average precision (AP) metric for each class of objects and the mAP metric (the average AP value over all classes). The area under the precision-recall curve (AUC) and mAP were used as integral measures of detector accuracy. Computational efficiency was evaluated by processing images with a resolution of 720×468 on a personal computer with an NVIDIA GeForce GTX 1070 graphics processor.
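The metrics named above can be illustrated with a short sketch. It assumes that every detection has already been matched to the ground truth, so each detection carries a confidence score and a true/false label. The matching rule, the function names, and the non-interpolated form of AP are assumptions, since the abstract does not specify them.

```python
def precision_recall_f(scores, labels, n_ground_truth, threshold):
    """Precision, recall and F-measure for detections with score >= threshold."""
    kept = [label for score, label in zip(scores, labels) if score >= threshold]
    true_pos = sum(1 for label in kept if label)
    precision = true_pos / len(kept) if kept else 1.0
    recall = true_pos / n_ground_truth if n_ground_truth else 0.0
    total = precision + recall
    f_measure = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def average_precision(scores, labels, n_ground_truth):
    """Non-interpolated AP: mean of the precision at each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos, accumulated = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            true_pos += 1
            accumulated += true_pos / rank
    return accumulated / n_ground_truth if n_ground_truth else 0.0


def mean_average_precision(per_class_ap):
    """mAP as the plain average of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6]
    labels = [True, False, True, True]
    print(precision_recall_f(scores, labels, n_ground_truth=4, threshold=0.75))
    print(average_precision(scores, labels, n_ground_truth=4))
```

Sweeping `threshold` over the score range produces the precision, recall and F-measure graphs; benchmark suites often use interpolated variants of AP, which give slightly different values.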

Faster R-CNN networks demonstrated an advantage in accuracy. According to the experimental results, Faster R-CNN based on the InceptionResnet-2 network has the highest accuracy, but its average processing time is much longer. The SSD architecture is the most suitable for real-time image processing (especially with MobileNet networks), but it usually cannot satisfy high accuracy requirements. The detector based on the YOLOv3 neural network shows intermediate accuracy and computational efficiency compared with the other detectors.


1. Lukyanitsa A.A., Shishkin A.G. Digital video processing. – Moscow: Ai-Es-Es Press, 2009. – 518 p. (in Russian).

2. Alpatov B.A., Babayan P.V. Image processing and recognition technologies in on-board technical vision systems // Vestnik of Ryazan State Radio Engineering University. – Ryazan. – 2017. – No. 2. – pp. 34-44 (in Russian).

3. Alpatov B.A., Babayan P.V., Balashov O.E., Stepashkin A.I. Methods for automatic detection and tracking of objects. Control and image processing. – Moscow: Radiotehnika, 2008. – 176 p. (in Russian).

4. Alpatov B.A., Babayan P.V., Ershov M.D. Vehicle Detection and Counting System for Real-Time Traffic Surveillance // Proceedings of 7th Mediterranean Conference on Embedded Computing (MECO). – IEEE, 2018. – pp. 120-123.

5. Gouk H.G.R., Blake A.M. Fast sliding window classification with convolutional neural networks // Proceedings of the 29th International Conference on Image and Vision Computing, New Zealand. – ACM, 2014. – pp. 114-118.

6. Boser B.E., Guyon I.M., Vapnik V.N. A training algorithm for optimal margin classifiers // Proceedings of the fifth annual workshop on Computational learning theory. – ACM, 1992. – pp. 144-152.

7. Redmon J., Divvala S., Girshick R., Farhadi A. You only look once: Unified, real-time object detection // Proceedings of the IEEE conference on computer vision and pattern recognition. – 2016. – pp. 779-788.

8. Redmon J., Farhadi A. YOLO9000: better, faster, stronger // arXiv preprint, arXiv:1612.08242. – 2016. – 9 p.

9. Redmon J., Farhadi A. YOLOv3: An incremental improvement // Tech report, arXiv:1804.02767. – 2018. – 6 p.

10. Bishop C.M. Pattern Recognition and Machine Learning. – Springer-Verlag, New York, 2006. – 738 p.

11. Ren S., He K., Girshick R., Sun J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks // Extended tech report, arXiv:1506.01497. – 2016. – 14 p.

12. Girshick R.B., Donahue J., Darrell T., Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation // IEEE Conference on Computer Vision and Pattern Recognition (CVPR). – 2014. – 21 p.

13. Girshick R. Fast R-CNN // IEEE International Conference on Computer Vision (ICCV). – 2015. – 9 p.

14. Uijlings J.R.R., van de Sande K.E.A., Gevers T., Smeulders A.W.M. Selective Search for Object Recognition // International Journal of Computer Vision. – 2013. – Vol. 104. – pp. 154-171.

15. Liu W., Anguelov D., Erhan D., Szegedy C., Reed S., Fu Ch.-Y., Berg A.C. SSD: Single Shot MultiBox Detector // European Conference on Computer Vision (ECCV), Springer, Cham. – 2016. – Vol. 9905. – pp. 21-37.

16. Wan S., Chen Z., Zhang T., Zhang B., Wong K. Bootstrapping Face Detection with Hard Negative Examples // arXiv:1608.02236. – 2016. – 7 p.

17. Geiger A., Lenz P., Urtasun R. Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite // Conference on Computer Vision and Pattern Recognition (CVPR). – 2012. – 8 p.

18. Cordts M., Omran M., Ramos S., Rehfeld T., Enzweiler M., Benenson R., Franke U., Roth S., Schiele B. The Cityscapes Dataset for Semantic Urban Scene Understanding // Conference on Computer Vision and Pattern Recognition (CVPR). – 2016. – 11 p.

Development and analysis of an algorithm for pathology detection in gastric endoscopic images based on a convolutional neural network
V.V. Khryashchev1, e-mail:
A.N. Ganin2, e-mail:
A.A. Lebedev1, e-mail:
O.A. Stepanova1, e-mail:
S.V. Kashin3, e-mail:
R.O. Kuvaev3, e-mail:
P.G. Demidov Yaroslavl State University1, Russia, Yaroslavl
CEO “Point of View”2, Russia, Yaroslavl
Yaroslavl Clinical Oncological Hospital3, Russia, Yaroslavl

Keywords: machine learning, convolution neural network, endoscopic image analyses, gastric cancer.

Computer-aided diagnostics of cancer pathologies based on endoscopic image analysis is a promising area in the field of computer vision and machine learning. The introduction of such systems in clinical medicine is aimed at improving the efficiency of diagnosis and therapy and at reducing the time and cost of examinations. Moreover, such systems can provide quality control and support the training and skill development of medical specialists. Convolutional neural networks are one of the most popular approaches in endoscopic image analysis.

The paper presents an algorithm for pathology detection and classification in endoscopic images. The algorithm is based on the SSD convolutional neural network. This network detects and classifies objects with the best trade-off between speed and quality among currently existing approaches.

Training and testing of the developed algorithm were carried out on the NVIDIA DGX-1 supercomputer using endoscopic images from the test base assembled together with the Yaroslavl Clinical Oncological Hospital.

The study yielded the following metrics: AP (average precision), mAP (mean average precision), and precision-recall curves. The results show that the proposed SSD-based algorithm can be successfully used for endoscopic image analysis in real medical practice, which is confirmed by the high similarity of the obtained results to the expert markup.


1. Goodfellow I., Bengio Y., Courville A. Deep Learning // MIT Press, 2016, 652 p.

2. Bisschops R. et al Performance measures for upper gastrointestinal endoscopy: a European Society of Gastrointestinal Endoscopy (ESGE) Quality Improvement Initiative // Endoscopy, 48(9), 2016, 843-64

3. Kuvayev R.O., Nikonov Ye.L., Kashin S.V., Kapranov V.A., Gvozdev A.A. Kontrol' kachestva endoskopicheskikh issledovaniy, perspektivy avtomatizirovannogo analiza endoskopicheskikh izobrazheniy (Quality control of endoscopic studies, prospects for automated analysis of endoscopic images) // Kremlevskaya meditsina. Klinicheskiy vestnik, 2, 2013, 51-56.

4. Lebedev A.A., Stepanova O.A., Yurchenko Ye.A., Khryashchev V.V. Razrabotka algoritmov analiza izobrazheniy dlya klassifikatsii patologiy slizistoy obolochki zheludka (Development of image analysis algorithms for the classification of pathologies of the gastric mucosa) // Tsifrovaya obrabotka signalov i yeye primeneniye (DSPA-2018): dokl. 20-y mezhdunar. konf. – Moskva, 2018. T. 2. S. 644-649.

5. Batukhtin D.M., Peganova Ye.V., Mitrakova N.N., Rozhentsov A.A., Furman YA.A. Analiz uzkospektral'nykh endoskopicheskikh izobrazheniy na vnutrenney poverkhnosti pishchevoda (Analysis of narrow-spectrum endoscopic images on the inner surface of the esophagus) // Vestnik Povolzhskogo gosudarstvennogo tekhnologicheskogo universiteta. Seriya: radiotekhnicheskiye i infokommunikatsionnyye sistemy, No. 4 (23), 2014. – s. 45-57.

6. Peganova Ye. V., Batukhtin D. M., Mitrakova N. N. Avtomatizirovannaya sistema segmentatsii uzkospektral'nykh izobrazheniy dlya optimizatsii endoskopicheskoy diagnostiki pri patologii pishchevoda (Automated system for segmentation of narrow-spectrum images to optimize endoscopic diagnostics for esophageal pathology) // EiKG. 2014. No. 3 (103).

7. O. A. Dunayeva, D. B. Malkova, M. L. Myachin, KH. Edel'sbrunner, Segmentatsiya klinicheskikh endoskopicheskikh izobrazheniy, osnovannaya na klassifikatsii vektornykh topologicheskikh priznakov (Segmentation of clinical endoscopic images based on the classification of vector topological features) // Model. i analiz inform. sistem, 20:6 (2013), 162–173.

8. Kovalenko D.A., Gnatyuk V.C. Assotsiatsiya stsen v endoskopicheskikh video (Association of scenes in endoscopic videos) // GraphiCon 2017: Obrabotka i analiz biomeditsinskikh izobrazheniy, Perm', 2017. – s. 269-274.

9. T. Tamaki, S. Sonoyama, T. Hirakawa, B. Raytchev, K. Kaneda, T. Koide, Computer-Aided Colorectal Tumor Classification in NBI Endoscopy Using CNN Features // in The Korea-Japan joint workshop on Frontiers of Computer Vision (FCV2016), 2016, pp. 61–65.

10. E. Ribeiro, A. Uhl, G. Wimmer, M. Hafner, Exploring Deep Learning and Transfer Learning for Colonic Polyp Classification // Computational and Mathematical Methods in Medicine, Volume 2016, 16 p.

11. Y. Bar, L. Wolf, I. Diamant, H. Greenspan, Deep Learning with Non-Medical Training Used for Chest Pathology Identification // In: SPIE Medical Imaging. 2015. p. 94140V-V-7.

12. Kuvayev R.O., Kashin S.V., Nikonov Ye.L., Itoh T., Gotoda T., Gono K. Ranniy rak zheludka: metodiki skrininga, endoskopicheskoy diagnostiki i maloinvazivnogo lecheniya (Early gastric cancer: screening, endoscopic diagnosis and minimally invasive treatment methods) // Dokazatel'naya gastroenterologiya, 3 (3), 2014, 44-51.

13. Kuvayev R.O., Kashin S.V. Sovremennoye endoskopicheskoye issledovaniye zheludka s ispol'zovaniyem metodik uzkospektral'noy i uvelichitel'noy endoskopii: tekhnika provedeniya i algoritmy diagnostiki // Dokazatel'naya gastroenterologiya, 2 (5), 2016, 3-13.

14. Kuvayev R.O., Nikonov Ye.L., Kashin S.V. Helicobacter pylori-assotsiirovannyy khronicheskiy gastrit: novyye tekhnologii endoskopicheskoy diagnostiki (Helicobacter pylori-associated chronic gastritis: new technologies for endoscopic diagnosis) // Dokazatel'naya gastroenterologiya, 4 (1), 2015, 19-24.

15. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, and S. E. Reed. SSD: Single Shot Multibox Detector. CoRR, abs/1512.02325, 2015.

16. Fully convolutional reduced VGGNet [Web-site]. URL:

17. Canny, J. A Computational Approach to Edge Detection // IEEE Transactions On Pattern Analysis And Machine Intelligence, vol. Pami-8, no. 6, 1986. pp. 679-698.

18. ImageNet Image Database [Web-site]. URL:

System-on-Chip variational optical flow computation
P.V. Belyakov, e-mail:
M.B. Nikiforov, e-mail:
The Ryazan State Radio Engineering University (RSREU), Russia, Ryazan

Keywords: optical flow, variational methods, FPGA, system-on-chip.

The paper is devoted to the variational method of optical flow computation and its implementation in a system-on-chip (SoC). Optical flow is a velocity field in which each point carries a two-component vector describing the displacement of that point between the first image and the second. The results of optical flow computation are widely used in image processing tasks such as motion detection, object tracking, 3D reconstruction, and autonomous robot navigation.

The nonlinear variational method of optical flow computation is the most accurate, but also the most computationally intensive. Its implementation on a system-on-chip is a trade-off between design difficulty and a high-performance hardware implementation. The variational approach is based on solving a system of nonlinear partial differential equations. The article investigates a technique for approximating them by finite differences. Discretizing the nonlinear equations leads to a system of linear algebraic equations, which can be solved numerically by the successive over-relaxation (SOR) method, an iterative Gauss-Seidel variant with enhanced convergence.
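The SOR iteration mentioned above can be sketched on a small dense system. This is a generic illustration in software, not the article's hardware design: the matrix, the right-hand side, the relaxation factor `omega`, and the function name `sor_solve` are all assumptions chosen for the example.

```python
def sor_solve(A, b, omega=1.2, max_iterations=200, tolerance=1e-10):
    """Solve A x = b by successive over-relaxation.

    With omega = 1 this reduces to the plain Gauss-Seidel method; for a
    symmetric positive definite A the iteration converges for 0 < omega < 2.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iterations):
        largest_change = 0.0
        for i in range(n):
            # Already-updated entries of x are used immediately (Gauss-Seidel).
            off_diagonal = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel_value = (b[i] - off_diagonal) / A[i][i]
            updated = (1.0 - omega) * x[i] + omega * gauss_seidel_value
            largest_change = max(largest_change, abs(updated - x[i]))
            x[i] = updated
        if largest_change < tolerance:
            break
    return x


if __name__ == "__main__":
    # Small diagonally dominant test system with exact solution (1, 2, 3).
    A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    b = [2.0, 4.0, 10.0]
    print(sor_solve(A, b))
```

In a discretized optical flow problem the matrix is large and sparse, so each update would touch only the neighbouring pixels instead of a full row.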

The proposed methodology was implemented in a system-on-chip containing a processor system (ARM processor) and programmable logic (FPGA). A hardware architecture based on distributing the tasks of optical flow computation between the software (SW) and hardware (HW) parts of the system was justified. The Verilog hardware description language was used for the most effective hardware implementation.

The solution proposed in the article is capable of real-time computation of dense nonlinear optical flow and can act as an SoC hardware accelerator for optical flow computation in various image processing tasks.


1. B. Lucas and T. Kanade. An iterative image registration technique with an application in stereo vision. In Proc. IEEE Int. Joint Conf., Artificial Intelligence, 1981, pp. 674–679.

2. Elesina S.I., Nikiforov M.B., Loginov A.A., Kostyashkin L.N. Monografiya pod red. L.N. Kostyashkina, M.B. Nikiforova. Sovmeshenie izobrageniy v correlyacionno-extrimalnih navigacionnih sistemah (Image complexing in correlation-extreme navigation systems). M.: Radiotechnika, 2015, p. 208.

3. B. K. P. Horn and B. G. Schunck. Determining optical flow, Artificial Intelligence, 17:185–203, 1981.

4. Abukhalikov A.A., Belyakov P.V., Nikiforov M.B., Poisk kluchevih tochek na izobragenii (Key points detection on the image). Megdunarodnaya nauchno-tehnicheskaya i nauchno-metodicheskaya konferencia «Sovremennie tehnologii v nauke I obrazovanii » STNO-2016, 2016, pp. 103-108.

5. T. Brox, A. Bruhn, N. Papenberg, and J. Weickert. High accuracy optical flow estimation based on a theory for warping. In Proc. European Conf., Computer Vision, volume 4, 2004, pp. 25–36.

6. Obrabotka izobragenii v aviacionnih sistemah tehnicheskogo zreniya (Image processing in aviation technical vision systems) / Pod red. L.N. Kostyashkina, M.B. Nikiforova. M.: FIZMATLIT, 2016, pp. 28-32

7. D. Ustukov, Y. Muratov, M. Nikiforov, V. Gurov. Implementing one of stereovision algorithms on FPGA. Mediterranean Conference on Embedded Computing, Jun 2016.

8. A. Bruhn and J. Weickert. Towards ultimate motion estimation: combining highest accuracy with real-time performance. In Proc. 10th IEEE Int. Conf., Computer Vision, 2005, pp. 749–755.

9. A. Bruhn, J. Weickert, and C. Schnorr. Lucas/Kanade meets Horn/Schunck: Combining local and global optic flow methods. Int. J. Computer Vision, 2005, 61:211–231.

10. M. Kunz, A. Ostrowski, P. Zipf. An FPGA-optimized architecture of Horn and Schunck optical flow algorithm for real-time applications. Field Programmable Logic and Applications (FPL), 2014 24th International Conference.

11. J. L. Martin, A. Zuloaga, C. Cuadrado, J. Lazaro, and U. Bidarte. Hardware implementation of optical flow constraint equation using fpgas. Computer Vision and Image Understanding, 2005, pp 462–490.

12. Z. Chai, H. Zhou, Z. Wang and D. Wu Using C to implement high-efficient computation of dense optical flow on FPGA-accelerated heterogeneous platforms. IEEE 14 International Conference on Field-Programmable Technology (FPT), 2014.

13. Ortega, James M. Introduction to Parallel and Vector Solution of Linear Systems, 1988.

14. Xilinx. Zynq-7000 SoC. ug479_7Series_DSP48E1.pdf.

15. Xilinx. Zynq-7000 SoC.

16. Xilinx. Vivado Design Suite.

17. Larkin E.V. Modelirovanie processa distancionnogo upravlenia robotom (Remote robot control process simulation). Izvestiya TulGU. Technicheckie nauki, 2016, Vip. 12. P. 4, pp. 202-214
