Digital Signal Processing

Russian Scientific & Technical Journal


“Digital Signal Processing” No. 3-2019

Digital image processing

In the issue:

- integration of multispectral images
- increase of the noise stability
- surface large scale mapping
- influence of angular elements precision
- creation of crucial object coefficient
- increase the dynamic range of the video system
- generating a panoramic video
- satellite images segmentation
- object contour detection
- modified motion compensation method
- the optical function of the atmosphere



Improving the performance of the algorithm for generating a panoramic video in vision systems with a distributed aperture
I.A. Kudinov, e-mail: i.a.kudinov@yandex.ru
I.S. Kholopov, e-mail: kholopov.i.s@rsreu.ru
The Ryazan State Radio Engineering University named after V.F. Utkin, Russia, Ryazan

Keywords: panorama image, homography matrix, blending, dilation, Manhattan distance, bilinear interpolation.

Abstract
The aim of the work is to optimize computationally intensive procedures such as bilinear interpolation, morphological dilation and spatial filtering, which are performed when forming a personal region of interest (RoI) in panoramic vision systems with a distributed aperture.

The article provides a comparative analysis of four methods for computing bilinear interpolation, which differ in the number of additions and multiplications and in the number of writes of auxiliary variables to RAM, as well as numerical methods aimed at reducing the asymptotic computational complexity of the multi-band blending algorithm. These methods consist in replacing the spatial 2D box filter with a two-stage 1D box filter (by rows and by columns) that uses a brightness accumulator and an arithmetic right shift instead of division, and in forming the blending mask with a morphological dilation algorithm based on a map of Manhattan distances. The latter reduces the computational cost of morphological dilation with a k×k-pixel structuring element by almost k² times.
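The two-stage box filter described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `box_1d` and `box_2d`, the edge-replicating padding and the choice of a window length of 2**shift samples (so that the division becomes a right shift) are assumptions made for illustration.

```python
import numpy as np

def box_1d(line, shift):
    """Running-sum mean over a window of 2**shift samples (edges replicated)."""
    win = 1 << shift                       # window length, assumed to be a power of two
    left = win // 2
    padded = np.pad(line.astype(np.int64), (left, win - 1 - left), mode="edge")
    out = np.empty(line.shape[0], dtype=np.int64)
    acc = int(padded[:win].sum())          # brightness accumulator for the first window
    out[0] = acc >> shift                  # right shift replaces division by win
    for i in range(1, line.shape[0]):
        # slide the window: add the entering sample, drop the leaving one
        acc += int(padded[i + win - 1]) - int(padded[i - 1])
        out[i] = acc >> shift
    return out

def box_2d(img, shift):
    """Two-stage filtering: every row first, then every column of the result."""
    rows = np.stack([box_1d(r, shift) for r in img])
    cols = np.stack([box_1d(c, shift) for c in rows.T]).T
    return cols.astype(img.dtype)
```

In this sketch each output pixel costs two additions and one shift per pass regardless of the window length, whereas a direct 2D box filter sums every pixel under the window.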

It is noted that the considered optimization methods are effective when the calculations are implemented on a single-core processor and are not applicable to parallel computations, because the processing is not uniform across image pixels.
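The sequential nature of these methods is visible in a sketch of dilation through a Manhattan distance map. The code below is an assumed two-pass formulation, not the code from the article: the names `manhattan_distance_map` and `dilate` are hypothetical, and thresholding the map at a given radius yields a diamond-shaped neighbourhood rather than a square k×k one.

```python
import numpy as np

def manhattan_distance_map(mask):
    """Distance (L1 metric) from every pixel to the nearest True pixel of mask."""
    h, w = mask.shape
    far = h + w                                   # exceeds any distance inside the image
    dist = np.where(mask, 0, far).astype(np.int64)
    # forward pass: each pixel depends on its already processed upper and left neighbours
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y, x] = min(dist[y, x], dist[y - 1, x] + 1)
            if x > 0:
                dist[y, x] = min(dist[y, x], dist[y, x - 1] + 1)
    # backward pass: each pixel depends on its lower and right neighbours
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y, x] = min(dist[y, x], dist[y + 1, x] + 1)
            if x < w - 1:
                dist[y, x] = min(dist[y, x], dist[y, x + 1] + 1)
    return dist

def dilate(mask, radius):
    """Dilation as a threshold on the distance map (radius assumed smaller than h + w)."""
    return manhattan_distance_map(mask) <= radius
```

The cost of the two passes does not depend on the radius, but every pixel reads a neighbour updated in the same pass, which is why this formulation does not map directly onto per-pixel parallel execution.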

Experiments performed on a prototype of a distributed panoramic system with five cameras and an RoI size of 1024×768 pixels showed that, when the computations are implemented on a single PC processor core, the considered optimization algorithms increase the speed of RoI forming by 5.4 times.



Increasing the dynamic range of a video system by logical addition of digital images
V.A. Kottsov
Space Research Institute of the Russian Academy of Sciences (IKI RAS), Russia, Moscow, e-mail: vladkott@mail.ru

Keywords: dynamic range, multi-channel shooting, logical addition, logical filter.

Abstract

The article shows that the dynamic range of digital images can be increased rapidly by acquiring images with mutually complementary characteristics and summing them logically in parallel in streaming mode.

Dynamic range is one of the important characteristics of an imaging system. It determines the ability to display the full variety of brightness levels of the objects observed simultaneously in the field of view. The wide range of object brightness in real scenes does not always fit within the limited technical capabilities of the observation equipment. When the imaging results are important, a simple way to increase the dynamic range of the resulting image is needed.

One solution is to obtain a set of images of the same scene with different, complementary shooting parameters and sum them in an arithmetic unit. The article considers another summation method, more efficient for digital systems, that uses simple logic functions. It performs the summation without intermediate storage and transfer operations, so the combined image is formed at the rate at which the data arrive.

The sequence of operations of the proposed method is described. An example of its implementation is given.



Modification of the U-Net convolutional neural network architecture for the multi-channel satellite image segmentation problem
V.V. Khryashchev, e-mail: v.khryashchev@uniyar.ac.ru
A.L. Priorov, e-mail: andcat@yandex.ru
V.A. Pavlov, e-mail: vladimir@1pavlov.com
R.V. Larionov, e-mail: rv.larionov@yandex.ru
P.G. Demidov Yaroslavl State University (YSU), Russia, Yaroslavl


Keywords: Earth remote sensing, segmentation, satellite images, convolutional neural network, deep learning.

Abstract
The article deals with the segmentation of multi-channel satellite images. Automatic segmentation of objects in satellite imagery is relevant for areas such as agriculture, urban planning and the protection of natural resources. Most modern approaches to this problem are based on deep learning algorithms, specifically convolutional neural networks. This work presents an implementation of an algorithm that segments buildings and structures in multispectral images.

The original U-Net convolutional neural network architecture has been modified: two encoders are used for the RGB and NIR channels, and their outputs are combined in the central block. The final architecture has 47 convolutional layers, 47 ReLU activation functions, 47 batch normalization operations, 1 sigmoid activation function, 10 upsampling operations, 5 downsampling operations and 11 feature merging operations.

A total of 9784 images of 250×250 pixels from the SpaceNet database were used to train the neural network. The quality of the segmentation algorithm was assessed with the Sørensen-Dice similarity coefficient (Dice) and the Jaccard index (IoU).
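Both metrics can be computed from a predicted and a reference binary mask. The sketch below is illustrative only: the function name `dice_and_iou` and the convention of returning 1.0 when both masks are empty are assumptions, and the article does not state how the scores were averaged over images.

```python
import numpy as np

def dice_and_iou(pred, target):
    """Sørensen-Dice coefficient and Jaccard index (IoU) of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = int(np.logical_and(pred, target).sum())   # pixels marked in both masks
    total = int(pred.sum()) + int(target.sum())
    union = total - inter
    dice = 2.0 * inter / total if total else 1.0      # assumed convention for empty masks
    iou = inter / union if union else 1.0
    return dice, iou
```

For a single pair of masks the two values are linked by IoU = Dice / (2 - Dice), so Dice is never smaller than IoU.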

Pretraining on SpaceNet images made it possible to achieve a Sørensen-Dice coefficient of 0.783 and a Jaccard index of 0.649. The proposed modification of the convolutional neural network can find application in urban development for tracking the construction of large objects.



Research on approaches to object contour detection on the basis of preliminary filtration and fuzzy logic

Ershov M.D., e-mail: ershov.m.d@rsreu.ru
Georgieva S.S., e-mail: frolowa.sofia@yandex.ru
The Ryazan State Radio Engineering University named after V.F. Utkin (RSREU), Russia, Ryazan


Keywords: image processing, feature extraction, contours of objects, edge detection, pre-processing, filtering, fuzzy logic, Mamdani model, Takagi-Sugeno model.

Abstract
The paper addresses one of the fundamental problems in image processing: the detection of object edges in the observed scene. In general, contour detection is used to significantly reduce the amount of data in the image while preserving structural properties that can be used in further processing. The contours of objects and scene elements can serve as key features, for example, when combining heterogeneous images obtained from sensors of different types. The aim of the work is to develop and study an algorithm for object contour detection based on pre-filtering and fuzzy logic.

Preliminary filtering is used because it can suppress the noise component of the signal and emphasize the edges. The following approaches to preliminary filtering are considered: global and local contrast enhancement, morphology-based image correction, and bilateral and guided filtering. Pre-processing is an optional step and can also be applied in conjunction with well-known edge detection algorithms.

The developed contour detection algorithm is based on gradient calculation and on the application of a fuzzy inference system built on the Mamdani or Takagi-Sugeno model. The considered systems determine the degree of membership of a point in a contour or in a homogeneous area. The algorithm consists of the following steps:
1. Calculation of the image gradient. For the Mamdani system, horizontal and vertical gradients are calculated by convolving the image with gradient filters. For the Takagi-Sugeno system, the gradient modulus is calculated.
2. Evaluation of the fuzzy inference system output for each row of the image.
3. Threshold selection and thresholding to obtain a binary contour image.
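The three steps can be sketched for the Takagi-Sugeno variant as follows. This is a simplified illustration, not the authors' Matlab implementation: the Sobel kernels, the two-rule zero-order system, the linear membership ramp between `low` and `high`, and all numeric defaults are assumptions.

```python
import numpy as np

def gradient_modulus(img):
    """Step 1: Sobel gradient modulus, computed on an edge-replicated image."""
    p = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def sugeno_edge_degree(grad, low, high):
    """Step 2: zero-order Takagi-Sugeno inference with two assumed rules.

    Rule 1: if the gradient is LOW  then the output is 0 (homogeneous area).
    Rule 2: if the gradient is HIGH then the output is 1 (contour).
    """
    mu_high = np.clip((grad - low) / (high - low), 0.0, 1.0)  # membership in HIGH
    mu_low = 1.0 - mu_high                                    # membership in LOW
    return (mu_low * 0.0 + mu_high * 1.0) / (mu_low + mu_high)

def detect_contours(img, low=50.0, high=200.0, threshold=0.5):
    """Step 3: threshold the inference output to obtain a binary contour image."""
    return sugeno_edge_degree(gradient_modulus(img), low, high) >= threshold
```

The inference here is evaluated for the whole image at once; processing it row by row, as in step 2, gives the same result because each pixel is treated independently.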

The problems of configuring fuzzy inference systems are considered, including the choice of membership functions for the inputs and the output and the choice of fuzzy rules.

Software implementation and experimental studies were carried out in the Matlab development environment. The proposed approaches were compared with the well-known Sobel and Canny edge detectors. The studies consisted in processing a database of real images with reference contour images. The quality of the algorithm was estimated numerically by calculating the precision, recall and F-measure. Contrast enhancement increases the F-measure by 1.5-4.5%. The best results were achieved with bilateral and guided filters, which increased the F-measure by 6-15% compared with the results without preliminary processing.
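The three quality metrics can be computed from a detected and a reference binary contour image. The sketch below assumes exact pixel-to-pixel matching, which is a simplification (contour benchmarks often allow a small localization tolerance), and the function name `precision_recall_f` is hypothetical.

```python
import numpy as np

def precision_recall_f(detected, reference):
    """Precision, recall and F-measure for binary contour images (exact pixel match)."""
    detected = np.asarray(detected, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = int(np.logical_and(detected, reference).sum())    # contour pixels found
    fp = int(np.logical_and(detected, ~reference).sum())   # false contour pixels
    fn = int(np.logical_and(~detected, reference).sum())   # missed contour pixels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```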

The approach based on fuzzy inference systems yields higher values of the quality metrics than the Sobel operator (by 7-11% in the F-measure) and the Canny algorithm (by 8-21% in the F-measure). The type and size of the gradient filters have a significant influence on the performance of the approaches based on both the Mamdani and Takagi-Sugeno systems.



The quality improvement of the traditional motion compensation method
Dam Trong Nam, The Moscow Institute of Physics and Technology (MIPT), Russia, Moscow, e-mail: chong.dam@phystech.edu

Keywords: video coding, approximation function, motion estimation, motion compensation.

Abstract
To reduce temporal redundancy, most modern video codecs use motion compensation. The traditional motion compensation method is based on motion vectors obtained by block matching, which finds the blocks of the current frame that correspond to blocks of the reference frame using some cost function [1]. In modern implementations of the video codecs x264 [2] and x265 [3], as well as in the wavelet-based video codec Dirac [4], a simple cost function, the Sum of Absolute Differences (SAD), is used solely because it is easy to compute, which does not guarantee that the motion compensation is optimal. Indeed, the traditional method with the SAD cost function works well for simple videos, where translation (parallel movement) is the main type of object motion between the reference frame and the current frame. For more complex videos that contain changes in scale (zooming), rotation, brightness changes, etc., the traditional method is ineffective.
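Block matching with the SAD cost function can be sketched as an exhaustive search over a small window. The code is a generic illustration rather than the search used in any of the codecs named above: the function names, the 8×8 block size and the ±4-pixel search range are assumptions.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

def best_motion_vector(cur, ref, top, left, block=8, search=4):
    """Full search for the displacement (dy, dx) in ref that best matches a block of cur."""
    target = cur[top:top + block, left:left + block]
    best = (0, 0)
    best_cost = sad(target, ref[top:top + block, left:left + block])
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # skip candidate blocks that fall outside the reference frame
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            cost = sad(target, ref[y:y + block, x:x + block])
            if cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best, best_cost
```

A pure translation is recovered with zero cost, while a brightness or scale change leaves a nonzero SAD at every candidate position, which is the weakness noted above.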

It should be noted that modern technologies and the possibility of parallelizing the calculations make it feasible to perform motion compensation with more complex but more accurate cost functions. In this paper, we propose a modification of the traditional motion compensation method that applies other cost functions.

This paper is devoted to improving the prediction quality of the traditional motion compensation method for video codecs based on inter-frame block motion compensation. It proposes a new motion compensation method that uses approximation functions with additional parameters. To apply the considered approximation functions in a video codec, the subtasks associated with the accuracy and the transmission method of the additional parameters have been solved. Based on the results of processing high-definition video with the approximation functions studied, the most suitable function has been chosen. The proposed method significantly reduces the amount of compressed data (by 15% to 34%) at a given quality of the reconstructed frame for high-definition videos.

References
1. V. P. Dvorkovich, A.V. Dvorkovich, “Digital video information system (theory and practice)”, Moscow: Tekhnosfera, 2012.

2. x264 video codec // http://www.videolan.org/developers/x264.html

3. x265 video codec // http://x265.org/

4. Dirac video codec // https://sourceforge.net/projects/dirac/

5. ITU-T Recommendation H.265, High efficiency video coding, 04/2013 // ISO/IEC FDIS 23008-2, Information technology – High efficiency coding and media delivery in heterogeneous environments – Part 2: High efficiency video coding (MPEG-H HEVC).

6. Gary J. Sullivan, Jens-Rainer Ohm, Woo-Jin Han, Thomas Wiegand, Overview of the High Efficiency Video Coding (HEVC) Standard // IEEE Transactions on Circuits and Systems for Video Technology, Vol. 22, #12, 12/2012, pp. 1649-1668.

7. Jill M. Boyce, Yan Ye, Jianle Chen, Adarsh K. Ramasubramonian. Overview of SHVC: Scalable Extensions of the High Efficiency Video Coding Standard. IEEE Trans. Circuits Syst. Video Techn. 26(1): 20-34 (2016).

8. Jianguo Zhang, Ling Shao, Lei Zhang, Graeme A. Jones. Intelligent video event analysis and understanding. Springer, 2010 edition. - 251 pages. ISBN 978-3-642-17554-1.

9. V. A. Gritsenko, E. V. Belosevic, E. K. Artemeva. Mathematical methods in geography: Textbook / Kaliningrad University – Kaliningrad, 1999. – 75 p. – ISBN 5-88874-151-5.

10. Ferster E., Rents B. Methods of correlation and regression analysis. A guide for economists. Translation from German and Preface by V. M. Ivanova, Moscow: "Finance and statistics", 1983 - 304 p.

11. Test videos // https://media.xiph.org/video/derf/

Implementation features of the modified motion compensation method
Dam Trong Nam, The Moscow Institute of Physics and Technology (MIPT), Russia, Moscow, e-mail: chong.dam@phystech.edu


Keywords: video coding, approximation function, motion estimation, motion compensation.

Abstract
To reduce temporal redundancy, most modern video codecs use motion compensation. Paper [1] proposed a modified motion compensation method based on a different cost function, into which a so-called approximation function is introduced. It was noted in [1] that applying the modified method requires solving problems related to the additional parameters of the approximation function: how they are determined, the required accuracy and how they are transmitted. To test the practical significance of the modified method, an experimental program was written in C++ and used to study the proposed approximation functions. Based on the entropy estimate of the data required for encoding and on the reconstruction quality, it was concluded that, among the investigated approximation functions, the linear approximation function is the most suitable for video compression and that the modified method can significantly reduce the amount of encoded data at a given reconstruction quality.
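The abstract does not give the form of the linear approximation function. One plausible reading, shown here purely as an assumption, is a gain-offset model in which a block of the current frame is approximated as a * reference + b, with a and b fitted by least squares and the cost measured on the residual. The names `fit_linear` and `linear_cost` are hypothetical.

```python
import numpy as np

def fit_linear(cur_block, ref_block):
    """Least-squares gain a and offset b such that cur is approximately a * ref + b."""
    x = np.asarray(ref_block, dtype=float).ravel()
    y = np.asarray(cur_block, dtype=float).ravel()
    var = x.var()
    if var == 0.0:
        a = 1.0                      # flat reference block: only an offset can be fitted
    else:
        a = float(((x - x.mean()) * (y - y.mean())).mean() / var)
    b = float(y.mean() - a * x.mean())
    return a, b

def linear_cost(cur_block, ref_block):
    """Sum of absolute residuals after the fit, with the two additional parameters."""
    a, b = fit_linear(cur_block, ref_block)
    residual = np.asarray(cur_block, dtype=float) - (a * np.asarray(ref_block, dtype=float) + b)
    return float(np.abs(residual).sum()), a, b
```

Under this assumed model a uniform brightness change between frames is absorbed by a and b, which must then be quantized and transmitted along with the motion vector; these are the accuracy and transmission subtasks mentioned in the abstract.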

This work is devoted to the implementation of the modified method in a specific video codec, Dirac, and to a description of the method's features. In contrast to the traditional method implemented in Dirac, the proposed method uses a different cost function based on an approximation function that contains additional parameters. The subtasks concerning the accuracy and the transmission method of the additional parameters have been solved in order to adapt the proposed method to the wavelet-based Dirac codec. Tests on various types of video [8] led to the following conclusions:

- Applying the proposed method in the Dirac codec can reduce the bitrate by up to 33% at good reconstructed video quality (PSNR = 40 dB) and by up to 40% at excellent quality (PSNR = 42 dB).

- The more complex the test video (that is, the more the changes in the scene differ from parallel, translational motion), the greater the advantage of the proposed method over the traditional one.
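
The idea of a cost function built on a linear approximation function can be illustrated with a small sketch. The abstract does not give the exact form of the approximation function or of the cost used in [1], so the model cur ≈ a·ref + b, the least-squares fit, the use of absolute residuals, and the names `sad_cost`, `linear_fit_cost` and `best_match` are all assumptions made for illustration only. Blocks are represented as flattened lists of pixel values.

```python
def sad_cost(cur, ref):
    """Traditional cost: sum of absolute differences between two blocks."""
    return sum(abs(c - r) for c, r in zip(cur, ref))


def linear_fit_cost(cur, ref):
    """Assumed modified cost: residual left after fitting cur ~ a*ref + b.

    Returns (cost, a, b). The parameters a and b are the additional data
    that would have to accompany the motion vector (an assumption here).
    """
    n = len(cur)
    mean_c = sum(cur) / n
    mean_r = sum(ref) / n
    var_r = sum((r - mean_r) ** 2 for r in ref)
    if var_r == 0:
        a = 0.0  # flat reference block: only an offset can be fitted
    else:
        cov = sum((r - mean_r) * (c - mean_c) for r, c in zip(ref, cur))
        a = cov / var_r
    b = mean_c - a * mean_r
    cost = sum(abs(c - (a * r + b)) for c, r in zip(cur, ref))
    return cost, a, b


def best_match(cur, candidates):
    """Return (index, cost, a, b) of the candidate with the lowest modified cost."""
    results = [(linear_fit_cost(cur, ref), i) for i, ref in enumerate(candidates)]
    (cost, a, b), idx = min(results, key=lambda item: item[0][0])
    return idx, cost, a, b
```

For a block whose brightness changes linearly relative to the reference (for example a fade), the plain difference cost stays large while the fitted cost drops to zero, which is one plausible reason such a method helps when scene changes are not purely translational.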

References
1. Dam Trong Nam. The quality improvement of the traditional motion compensation method. Digital Signal Processing. 2019. No. 3 (current issue).

2. Dirac video codec // https://sourceforge.net/projects/dirac/

3. V. A. Gritsenko, E. V. Belosevic, E. K. Artemeva. Mathematical methods in geography: Textbook / Kaliningrad University – Kaliningrad, 1999. – 75 p. – ISBN 5-88874-151-5.

4. Förster E., Rönz B. Methods of correlation and regression analysis. A guide for economists. Translated from German, with a preface by V. M. Ivanova. Moscow: Finance and Statistics, 1983. – 304 p.

5. Test videos // https://media.xiph.org/video/derf/


Voice activity detection methods and algorithms
V.A. Volchenkov, e-mail: volchenkov.rzn@yandex.ru
The Ryazan State Radio Engineering University named after V.F. Utkin (RSREU), Russia, Ryazan

Keywords: voice activity detection, VAD, speech detection error, pause detection error.

Abstract
Existing pause detection algorithms rely on the assumption that speech is a nonstationary signal whose spectrum typically changes over short intervals of 10-30 ms. The background noise, by contrast, is assumed to remain stable over longer periods and to vary only slightly with time, and the level of the speech signal is usually above the level of the background noise. The signal is usually divided into segments of 16-32 ms, and the energy of the signal in each segment and its number of zero crossings are then analyzed. When the detector classifies a segment as a pause, the system must detect several more such frames in succession (5-6 in the GSM system) before finally deciding that the signal is absent. Thus, most existing voice activity detection methods can detect only pauses longer than 40 ms.
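
The conventional scheme described above (short frames, frame energy, zero-crossing count, and a run of confirming frames before a pause is declared) can be sketched as follows. This is a minimal illustration and not any particular standardized detector: the decision rule (a frame looks active if its energy or its zero-crossing count exceeds a threshold), the fixed thresholds, and the names `frame_features` and `simple_vad` are assumptions.

```python
def frame_features(frame):
    """Return (mean energy, zero-crossing count) for one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return energy, crossings


def simple_vad(samples, frame_len, energy_thr, zcr_thr, hangover=5):
    """Label each full frame True (speech) or False (pause).

    A pause is declared only after more than `hangover` consecutive frames
    look inactive; until then the frames are still labeled as speech.
    Leading silence is therefore also held as speech for `hangover` frames.
    """
    decisions = []
    quiet_run = 0  # consecutive frames that looked like a pause
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        energy, crossings = frame_features(samples[start:start + frame_len])
        looks_active = energy > energy_thr or crossings > zcr_thr
        if looks_active:
            quiet_run = 0
        else:
            quiet_run += 1
        decisions.append(looks_active or quiet_run <= hangover)
    return decisions
```

With 20 ms frames and `hangover=5`, a pause is reported only from its sixth inactive frame onward, which shows why the confirmation run limits how short a pause such a detector can find.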

This work proposes a voice activity detector (VAD) that increases the probability of correctly detecting pauses in human speech. The developed method was compared with the likelihood-ratio-based VAD (LR-VAD) and the G.729B VAD. The test data consisted of 108 seconds of speech mixed with vehicle noise at SNRs of 5, 10, 15, 20 and 25 dB. The active and inactive regions of the speech material were marked manually. The proportions of the inactive and active regions of the speech material were 0.46 and 0.54, respectively.
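
Preparing test material at a prescribed SNR amounts to scaling the noise before adding it to the clean speech. The sketch below shows one common way to do this; the abstract does not describe how the mixtures were produced, so the procedure and the name `mix_at_snr` are assumptions. Both inputs are assumed to be equal-length lists of samples with nonzero power.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals snr_db, then add."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    # Target noise power is P_speech / 10^(SNR/10); gain is applied to amplitude.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

Calling the function once per target value (5, 10, 15, 20 and 25 dB) yields a set of mixtures that share the same speech and noise recordings and differ only in noise level.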

The proposed VAD outperformed the G.729B VAD at almost all SNR values; the G.729B VAD was better only at 5 dB SNR. The LR-VAD method was better than the other methods in terms of speech detection error, but significantly worse in terms of pause detection error. The lowest pause detection error at almost all SNR values was achieved by the VAD proposed in this article. Further work is intended to improve the proposed VAD by reducing its speech detection error.
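
The two error measures used in the comparison can be computed from frame-level labels once a manual marking is available. The abstract does not define them, so the definitions below are assumptions: speech detection error is taken as the fraction of manually marked speech frames that the detector labels as pause, and pause detection error as the fraction of manually marked pause frames that it labels as speech. The name `detection_errors` is likewise hypothetical.

```python
def detection_errors(reference, decisions):
    """Return (speech_error, pause_error) for two lists of booleans.

    True marks a speech frame, False a pause frame. `reference` is the
    manual marking and `decisions` is the detector output.
    """
    speech_total = sum(1 for r in reference if r)
    pause_total = len(reference) - speech_total
    missed_speech = sum(1 for r, d in zip(reference, decisions) if r and not d)
    missed_pause = sum(1 for r, d in zip(reference, decisions) if not r and d)
    speech_err = missed_speech / speech_total if speech_total else 0.0
    pause_err = missed_pause / pause_total if pause_total else 0.0
    return speech_err, pause_err
```

Reporting the two rates separately matters because a detector can trade one for the other: labeling nearly everything as speech gives a low speech detection error at the price of a high pause detection error.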

References
1. O.I. Shelukhin, N.F. Lukjanceva. Digital processing and voice transmission. Edited by O.I. Shelukhin. – Moscow: Radio and communication, 2000. – 456 p.

2. Kondoz A.M. Digital Speech. Coding for Low Bit Rate Communication Systems. – John Wiley & Sons, Ltd. 2004. – 442 p.

3. ITU-T (1996) A silence compression scheme for G.729 optimised for terminals conforming to ITU-T V.70, ITU-T Rec. G.729 Annex B.

4. ITU-T (1996) Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP), ITU-T Rec. G.729.

5. J. Sohn, N. S. Kim, and W. Sung (1999) ‘A statistical model-based voice activity detection’, in IEEE Signal Processing Letters, 6(1):1–3.

6. Y. Ephraim and D. Malah (1984) ‘Speech enhancement using a minimum mean square error short-time spectral amplitude estimator’, in IEEE Trans. on Acoust., Speech and Signal Processing, 32(6):1109–20.

7. Y. Ephraim and D. Malah (1985) ‘Speech enhancement using a minimum mean square error log-spectral amplitude estimator’, in IEEE Trans. on Acoust., Speech and Signal Processing, 33(2):443–5.

8. O. Cappé (1994) ‘Elimination of musical noise phenomenon with the Ephraim and Malah noise suppression’, in IEEE Trans. Speech and Audio Processing, 2(2):345–9.

9. V.V. Vitjazev, V.I. Rozov, V.A. Volchenkov. The Russian Federation patent for an invention No. RU 2436173 C1, Int. Cl. G10L 15/00, G10L 11/02, Method of Detecting Pauses in Speech Signals and Device for Realising Said Method. Proprietor – Ryazan State Radio Engineering University. Application: 2010124342/08, Date of filing: 15.06.2010, Date of publication: 10.12.2011, Bull. 34.


The optical modulation-transfer function of the atmosphere
E.Z. Soroka, e-mail: soroka@mniti.ru
V.S. Filatov, e-mail: filatov_vs@mniti.ru
The Moscow Television Research Institute (MNITI), 105094, Moscow, Golianovskaya ul., 7a, bld. 1


Keywords: atmospheric transfer channel, modulation-transfer function (MTF), image transmission.

Abstract
The quality of images obtained in optical vision and registration systems (in the visible, UV and IR ranges) depends not only on the parameters of these systems but also on the parameters of the atmospheric image transmission channel. The atmospheric channel not only attenuates the overall optical signal and adds background illumination, but also attenuates the high-frequency components of the image, which reduces image sharpness. The modulation-transfer function (MTF) of the atmospheric channel can be used to quantify the effect of the atmosphere on image sharpness. In this article we analyze previously published experimental data on fog, haze and other phenomena that change the optical MTF of the atmosphere. We derive mathematical expressions and plot graphs showing how the attenuation of the high-frequency image components grows as the overall attenuation of the optical signal increases. This enables rapid estimation of the MTF of an atmospheric image transfer channel.
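
As a reminder of what an MTF measures, the sketch below computes it in the standard way, as the magnitude of the Fourier transform of a line spread function (LSF) normalized to 1 at zero spatial frequency. It does not reproduce the expressions derived in the article, which are not given in the abstract; the sampled LSF values in the usage note and the name `mtf_from_lsf` are illustrative assumptions. The LSF is assumed to be non-negative with a nonzero sum.

```python
import cmath


def mtf_from_lsf(lsf):
    """Return |DFT(lsf)| normalized by its zero-frequency value.

    Only bins 0..N//2 are returned (zero frequency up to Nyquist).
    """
    n = len(lsf)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(v * cmath.exp(-2j * cmath.pi * k * i / n)
                  for i, v in enumerate(lsf))
        spectrum.append(abs(acc))
    return [m / spectrum[0] for m in spectrum]
```

An ideal, one-sample-wide LSF such as `[0, 0, 0, 1, 0, 0, 0, 0]` gives a flat MTF, whereas a spread-out LSF such as `[0, 0, 1, 2, 1, 0, 0, 0]` gives an MTF that falls toward the highest spatial frequency, which is the loss of sharpness the abstract attributes to the atmospheric channel.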


References

1. Zuev V.E., Kabanov M.V. Transfer of optical signals in terrestrial atmosphere (under interference conditions). Moscow: Sovetskoye Radio, 1977. – 368 p. – Ch. 7.

2. Zuev V.E., Kabanov M.V. Optics of atmospheric aerosol. Leningrad: Gidrometeoizdat, 1987. – 255 p. – Ch. 2.

3. O’Neill E.L. Introduction to statistical optics. Palo Alto, London: Addison-Wesley Publishing Company, 1963.

4. Papoulis Athanasios. Systems and transforms with applications in optics. New York: McGraw-Hill Book Company, 1968. – Ch. 5.

5. Levshin W.L. Spatial filtering in optical direction-finding systems. Moscow: Sovetskoye Radio, 1971. – Ch. 1.

6. Naoyoshi Nameda. Fog modulation transfer function and signal lighting. Lighting Research and Technology. 1992, no 24(2): 103-106.

7. Smirnov W.A. Theory and method for solving of optic image transfer problems in diffuse environments. Voprosy radioelectroniki – Technika Televideniya, 1965, no 6: 109-124.



If you have any questions, please write to: info@dspa.ru