Vision Models for High Dynamic Range and Wide Colour Gamut Imaging: Techniques and Applications
Ebook, 629 pages, 6 hours

About this ebook

To enhance the overall viewing experience (for cinema, TV, games, AR/VR) the media industry is continuously striving to improve image quality. Currently the emphasis is on High Dynamic Range (HDR) and Wide Colour Gamut (WCG) technologies, which yield images with greater contrast and more vivid colours. The uptake of these technologies, however, has been hampered by the significant challenge of understanding the science behind visual perception. Vision Models for High Dynamic Range and Wide Colour Gamut Imaging provides university researchers and graduate students in computer science, computer engineering, vision science, as well as industry R&D engineers, an insight into the science and methods for HDR and WCG. It presents the underlying principles and latest practical methods in a detailed and accessible way, highlighting how the use of vision models is a key element of all state-of-the-art methods for these emerging technologies.

  • Presents the underlying vision science principles and models that are essential to the emerging technologies of HDR and WCG
  • Explores state-of-the-art techniques for tone and gamut mapping
  • Discusses open challenges and future directions of HDR and WCG research
Language: English
Release date: Nov 6, 2019
ISBN: 9780128138953
Author

Marcelo Bertalmío

Marcelo Bertalmío (Montevideo, 1972) is a full professor at Universitat Pompeu Fabra, Spain, in the Information and Communication Technologies Department. He received B.Sc. and M.Sc. degrees in electrical engineering from the Universidad de la República, Uruguay, and a Ph.D. degree in electrical and computer engineering from the University of Minnesota in 2001. He was awarded the 2012 SIAG/IS Prize of the Society for Industrial and Applied Mathematics (SIAM) for co-authoring the most relevant image processing work published in the period 2008–2012. He has received the Femlab Prize, the Siemens Best Paper Award, the Ramón y Cajal Fellowship, and the ICREA Academia Award, among other honours. He was Associate Editor for SIAM-SIIMS and elected secretary of SIAM’s activity group on imaging. He has obtained an ERC Starting Grant for his project “Image processing for enhanced cinematography” and two ERC Proof of Concept Grants to bring to market tone mapping and gamut mapping technologies. He is co-coordinator of two H2020 projects, HDR4EU and SAUCE, involving world-leading companies in the film industry. He has written a book titled “Image Processing for Cinema”, published by CRC Press in 2014, and edited the book “Denoising of Photographic Images and Video”, published by Springer in 2018. His current research interests are in developing image processing algorithms for cinema that mimic neural and perceptual processes in the visual system, and in investigating new vision models based on efficient representation, with fine-tuning by movie professionals.

    Book preview

    Vision Models for High Dynamic Range and Wide Colour Gamut Imaging - Marcelo Bertalmío

    2019

    Chapter 1

    Introduction

    Abstract

    The media industry is continuously striving to improve image quality and to enhance the overall viewing experience, with higher frame rates, larger resolution, more vivid colours, and greater contrast. Currently there is significant emphasis on high dynamic range (HDR) and wide colour gamut (WCG) imaging. But we are still a long way from a fully functional HDR ecosystem, upon whose existence depends the realisation of all the possibilities that the HDR format could offer in terms of revenue and market growth for companies and in terms of improved user experience for viewers. Very recent reports from several international organisations and standardisation bodies concur that there are a number of challenges that need to be addressed for a successful and complete adoption of HDR and WCG technology. This book is focused on solutions, based on vision science, for several of the issues just mentioned but especially for the problems of tone and gamut mapping. Our conclusions are that imaging techniques based on vision models are the ones that perform best for these problems and a number of other applications, but also that the performance of these methods is still far below what cinema professionals can achieve, and that vision models are likewise lacking: most key problems in visual perception remain open.

    Keywords

    HDR; WCG; tone mapping; gamut mapping

    1.1 HDR and WCG

    The media industry is continuously striving to improve image quality and to enhance the overall viewing experience, with higher frame rates, larger resolution, more vivid colours, and greater contrast. Currently there is significant emphasis on high dynamic range (HDR) and wide colour gamut (WCG) imaging; let us briefly see what these concepts mean.

    The contrast in a scene is measured by the ratio, called dynamic range, of light intensity values between its brightest and darkest points. While common natural scenes may have a contrast of 1,000,000:1 or more, our visual system allows us to perceive contrasts of roughly 10,000:1, much greater than the available simultaneous contrast range of traditional cameras and screens. For this reason, the capture and display of content with a dynamic range approaching that of real scenes has been a long-term challenge, limiting the ability to reproduce more realistic images. Currently it is not uncommon for state-of-the-art camera sensors to have a dynamic range of 3 or 4 orders of magnitude, therefore matching in theory the range of human visual perception; computer generated images can have an arbitrarily high dynamic range; and recent display technologies have made brighter TV sets available without increasing the luminance of the black level, which has enabled the appearance of HDR displays.
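
The definition of dynamic range as a ratio can be made concrete with a small sketch. The function name and the luminance values below are hypothetical, chosen only so that the ratios match the figures quoted in the text; the sketch simply expresses a ratio in orders of magnitude and in photographic stops.

```python
import math


def dynamic_range(l_max: float, l_min: float) -> dict:
    """Express the ratio between brightest and darkest luminance in three ways."""
    if l_min <= 0 or l_max < l_min:
        raise ValueError("luminances must satisfy 0 < l_min <= l_max")
    ratio = l_max / l_min
    return {
        "ratio": ratio,                            # e.g. 10000 means 10,000:1
        "orders_of_magnitude": math.log10(ratio),  # powers of ten
        "stops": math.log2(ratio),                 # doublings of light
    }


# Hypothetical luminances (cd/m^2), picked only to reproduce the quoted ratios.
scene = dynamic_range(1_000_000.0, 1.0)   # natural scene, 1,000,000:1
eye = dynamic_range(10_000.0, 1.0)        # simultaneous perception, ~10,000:1

print(f"scene: {scene['orders_of_magnitude']:.1f} orders, {scene['stops']:.1f} stops")
print(f"eye:   {eye['orders_of_magnitude']:.1f} orders, {eye['stops']:.1f} stops")
```

Under these assumptions, a sensor spanning 4 orders of magnitude covers the same ratio as the 10,000:1 figure given for human perception, which is the sense in which the text says the two match in theory.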

    The colour gamut of a display is the set of colours it is capable of reproducing, and the wider the gamut is, the closer the displayed images come to representing the colours we naturally perceive. The colour gamut of a display depends mostly on the properties of the light primaries it uses, and new display technologies like laser RGB projectors have monochromatic primaries with such a high colour purity that their colour gamut is truly wide, covering virtually every colour found in nature.

    HDR and WCG are independent image attributes [1], i.e. a picture may have a high dynamic range but a reduced colour gamut, and the other way round. But there is definitely a connection, since HDR displays, just by being brighter, can reproduce more saturated colours than an SDR display can.

    1.2 Improved image appearance with HDR/WCG

    An HDR system is specified and designed for capturing, processing, and reproducing a scene, conveying the full range of perceptible shadow and highlight detail, with sufficient precision and acceptable artefacts, including sufficient separation of diffuse white and specular highlights and separation of colour details, far beyond current SDR (Standard Dynamic Range) video and cinema systems capabilities [2].

    HDR/WCG technology can provide a never-before-seen increase in contrast, colour and luminance and is lauded in the creative sector as a truly transformative experience: for the cinema and TV industries, HDR represents the most exciting format to come along since colour TV [3]. Besides the enthusiasm about the unparalleled improvement in picture quality for new content, the industry expects, through repurposing for HDR, to monetise existing film and TV libraries once HDR screens are established, in theatres, at home and on mobile devices.

    The ultra high-definition (UHD) TV community has realised that the transition from high-definition (HD) to the 4K or 8K resolutions represents an insufficient improvement in viewing experience for the audience; UHD is therefore being aligned with HDR, since the visual experience of material displayed on an HDR screen is significantly richer, looks more lifelike, and produces a better sense of immersion [4]. As a content creator puts it: Unlike other emerging distribution technologies such as Stereo 3D, high-frame-rate exhibition, wide gamuts, and ever-higher resolutions (4K, 8K) which engender quite a bit of debate about whether or not they're worth it, HDR is something that nearly everyone I've spoken with, professional and layperson alike, agree looks fantastic once they've seen it [...] Furthermore, it's easy for almost anyone to see the improvement, no matter what your eyeglass prescription happens to be [5].

    The contribution of HDR to the sense of immersion stems from the fact that HDR images appear much more faithful to an actual perceived scene, allowing the viewer to resolve details in quite dark or quite bright regions, to distinguish subtle colour gradations, to perceive highlights as much brighter than diffuse white surfaces, all things that we associate with everyday experiences and that cannot be reproduced using SDR systems. Content creators that have been exposed to HDR are extremely enthusiastic about it, because they have realised that it allows them to overcome the artistic limitations (in terms of colour and contrast) imposed by SDR systems, giving them the perceptual tools that fine artists working in the medium of painting have had for hundreds of years [5]. With HDR/WCG technologies, colourists have previously unavailable means for creating dramatically differentiated planes of highlights, for directing the viewer's gaze by sprinkling HDR highlights strategically across the image, for letting the brighter mid tones of an image breathe, for surprising the audience with sudden flares or cuts to bright frames and, in general, for creating compelling narrative opportunities in storytelling [5].

    1.3 The need for an HDR ecosystem

    At present many if not most of the new TV sets in the market are HDR, and they generally have a wider colour gamut as well. Some TV programming is broadcast in HDR, there's HDR streaming of select movies and series, most major new-release discs come in HDR, and there's substantial re-mastering of older movies for HDR.

    But we are still a long way from a fully functional HDR ecosystem, upon whose existence depends the realisation of all the possibilities that the HDR format could offer in terms of revenue and market growth for companies and in terms of improved user experience for viewers. Very recent reports from several international organisations and standardisation bodies concur that there are a number of challenges that need to be addressed for a successful and complete adoption of HDR technology.

    For instance, a September 2018 guideline [6] by the UltraHDForum, an industry organisation of technology manufacturers set up to promote next generation UHD TV technology that has embraced HDR as a key component in the UHD roadmap, states that no de facto standards emerged which define best practices in configuring colour grading systems for 2160p/HDR/WCG grading and rendering.

    A report [7] by the International Telecommunication Union-Radiocommunication (ITU-R) from April 2019 states that: As HDR-TV is at a formative stage of research and development as presented in this Report, a call for further studies is made, in particular on the characteristics and performance of the recommended HDR-TV image parameter values, for use in broadcasting [...] The introduction of HDR imagery poses a number of housekeeping challenges, associated with the increased number of picture formats that will be in use.

    And the Study Group report on an HDR imaging ecosystem [2] by the Society of Motion Picture and Television Engineers (SMPTE) states that: The HDR Ecosystem needs imaging performance requirements that must be met with sufficient precision [...]. There is a need for a better understanding of the set of elements, including standards, required to form a complete functional and interoperable ecosystem for the creation, delivery and playback of HDR image content. The parameters that make up HDR [...] are beyond the capabilities of existing standards to deliver.

    1.4 Problems to solve to enable an HDR ecosystem

    1.4.1 New colour management and grading tools

    There is a need for colour management and grading tools so that the mastering process has a simple workflow that can deliver consistent images across the whole spectrum of possible display devices and environments.

    The goal is to make sure that the standard post-production pipeline can smoothly handle the latest generation of HDR camera-shot footage, and that the final delivered footage has the fullest range for the new array of HDR displays. It is also essential to guarantee that the look specified by the cinematographer is applied correctly, and kept consistent and as expected throughout the whole chain.

    The situation is significantly complex as there is not just a single HDR, but a whole family of device technologies, formats and viewing environments. Differences between implementations and for varying display luminances are subtle but can be very significant, making the task really difficult: what might be acceptable in a TV showroom as an HDR viewing experience is not enough in a mastering suite.

    1.4.2 New projection systems for movie theatres

    Traditional digital cinema projectors rely on an illumination system that delivers a flat-field illumination onto a two-dimensional light valve. The light valve sends the light that is needed to form the image on-screen towards the projection lens, and either absorbs the remainder or directs it away from the projection lens. The average light level of conventional cinema images has been evaluated to be below 10% of its maximum value, and thus the majority of the light is being blocked inside the projector. When moving to HDR cinema content, the average light level relative to the peak white brightness is expected to drop below 2%, so the conventional projector illumination approach becomes even more inefficient.

    From the above, light-valve technology is limited in peak brightness, so the only way to achieve HDR with it would be to lower the black level. The sequential contrast is defined as the luminance of white, measured on a fully white screen, divided by the luminance of black, measured when the screen is fully dark. With a single, standard DLP projector the sequential contrast is around 2,000:1, and can be extended to 6,000:1 with state-of-the-art technology. Dolby Vision cinemas use two modulators in series, so the contrast is multiplied and Dolby claims values of 1,000,000:1. Recent studies have assessed the perceivable dynamic range in a cinema environment, considering the influence on the final contrast of scattering in the projection optics and room reflections. It was found that only in a very limited number of image frames, those with extremely low average picture level, can an improved projector black level effectively result in an extended dynamic range. As soon as the average picture level rises, reflections by the room will dominate the on-screen contrast ratio. In addition, as the eye adapts to a higher average luminance level, the black detection threshold rises.
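
The interplay between native projector contrast and room reflections can be illustrated with a toy calculation. This is a sketch under loud assumptions, not the model used in the studies mentioned above: it assumes that reflected light adds a uniform veiling luminance proportional to the average picture level (APL), and the peak white of 48 cd/m^2 and the `flare_fraction` of 1% are hypothetical illustration values.

```python
def sequential_contrast(l_white: float, l_black: float) -> float:
    """Full-white luminance divided by full-black luminance."""
    return l_white / l_black


def effective_contrast(l_white: float, native_contrast: float,
                       apl: float, flare_fraction: float) -> float:
    """On-screen contrast when room reflections add a uniform veil.

    ASSUMPTION: the veil equals flare_fraction * apl * l_white and is added
    to both the brightest and the darkest point of the image.
    """
    l_black = l_white / native_contrast
    veil = flare_fraction * apl * l_white
    return (l_white + veil) / (l_black + veil)


PEAK_WHITE = 48.0      # cd/m^2, hypothetical cinema peak white
FLARE = 0.01           # hypothetical: 1% of average screen light returns

for apl in (0.0, 0.001, 0.02, 0.10):
    single = effective_contrast(PEAK_WHITE, 2_000, apl, FLARE)
    dual = effective_contrast(PEAK_WHITE, 1_000_000, apl, FLARE)
    print(f"APL {apl:6.1%}: single {single:12,.0f}:1   dual {dual:12,.0f}:1")
```

With these made-up numbers, a fully dark frame keeps the 500-fold advantage of the dual-modulator system, while at 10% APL the two systems land at about 670:1 and 1,000:1, which mirrors the qualitative finding that room reflections dominate as soon as the picture level rises.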

    As such, with the current light valve technology, it appears impossible to substantially raise the white level, even with setups consisting of multiple projectors, and the actual gain in black level in a movie theatre by using two light valves in sequence is on average much lower than the theoretical improvement.

    The conclusion is that there is a need for new projection systems for bringing HDR to movie theatres, since traditional cinema projector technology is not capable of providing images with a dynamic range that is actually high. Also, the conventional projector illumination approach is highly inefficient in terms of energy consumption.

    1.4.3 Tools for conversion between HDR and SDR

    Tone mapping (TM) and inverse tone mapping (ITM) algorithms are expected to be used at several stages of the HDR production chain. For instance, real-time TM will be used during shoots for on-set monitoring of HDR material (cinema, broadcast) on SDR displays, for combining HDR and SDR sources (live broadcast), and in cinema projection systems for presenting HDR material on SDR projectors. Real-time ITM is to be used for broadcast of SDR material over an HDR channel and, in cinema projection systems, for presenting SDR material on HDR projectors. Other methods for TM and ITM, allowing for interaction but not necessarily real-time, are to be used in post-production and grading as tools for performing a single grade regardless of the dynamic range of source and output.
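
To make the TM/ITM pair concrete, here is a minimal sketch using the classic global curve L/(1+L) and its algebraic inverse. This is a well-known textbook operator swapped in purely for illustration; it is not one of the vision-model-based methods the book develops, and the `scale` parameter is a hypothetical exposure control.

```python
def tone_map(l_hdr: float, scale: float = 1.0) -> float:
    """Compress a non-negative relative luminance into [0, 1)."""
    if l_hdr < 0:
        raise ValueError("luminance must be non-negative")
    x = scale * l_hdr
    return x / (1.0 + x)


def inverse_tone_map(l_sdr: float, scale: float = 1.0) -> float:
    """Expand a display value in [0, 1) back to relative luminance."""
    if not 0.0 <= l_sdr < 1.0:
        raise ValueError("display value must lie in [0, 1)")
    return l_sdr / (scale * (1.0 - l_sdr))


# Hypothetical relative luminances spanning four orders of magnitude.
for l in (0.01, 0.1, 1.0, 10.0, 100.0):
    d = tone_map(l)
    print(f"HDR {l:7.2f} -> SDR {d:.4f} -> HDR {inverse_tone_map(d):7.2f}")
```

A global curve like this treats every pixel identically, so it is cheap enough for real-time use but ignores the local context; the interactive, non-real-time methods mentioned in the text have more freedom in that respect.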

    Current TV programs are exchanged among broadcasters without difficulty in terms of tone mapping, because all program producers have a similar understanding of how to map the scene from 0 to 100% video level, which is the reference white level. This practice will not work for HDR systems, and a new practical guideline is needed, which should also be effective for ensuring consistency between SDR and HDR viewing. Efficient and high quality conversion between HDR and SDR, in both directions, is essential for the consolidation of the HDR format, given that for the foreseeable future a mix of both SDR and HDR devices and techniques will coexist: the need to merge SDR and HDR content is inevitable. To avoid the massive cost of a complete system overhaul as HDR technologies enter broadcast production, a single-stream, merged path of HDR and SDR content is economically necessary.

    1.4.4 Tools for colour gamut conversions

    The colour gamut of a device is the set of colours that it can reproduce, and brighter displays (such as those associated with emerging technologies for HDR TV sets) have a larger colour volume. Pointer analysed many samples of frequently occurring real surface colours and derived what is commonly known as Pointer's gamut. Although both the standard colour gamuts for cinema (DCI-P3) and TV (BT.709) cover a reasonable amount of Pointer's gamut, many interesting real world colours fall outside them. In 2012 ITU-R recommended a new standard gamut for the next generation UHDTV, called BT.2020, that encompasses DCI-P3 and BT.709 and covers 99.9% of Pointer's gamut.
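
A rough feel for the relative sizes of the three standard gamuts can be had by comparing the areas of their primary triangles in the CIE 1931 xy chromaticity diagram. The primaries below are the published chromaticity coordinates of each standard; the choice of xy triangle area as a size measure is a simplification made here for illustration (it ignores luminance, so it says nothing about colour volume, and xy is not perceptually uniform).

```python
# CIE 1931 xy chromaticities of the red, green and blue primaries.
PRIMARIES = {
    "BT.709":  ((0.640, 0.330), (0.300, 0.600), (0.150, 0.060)),
    "DCI-P3":  ((0.680, 0.320), (0.265, 0.690), (0.150, 0.060)),
    "BT.2020": ((0.708, 0.292), (0.170, 0.797), (0.131, 0.046)),
}


def triangle_area(p1, p2, p3) -> float:
    """Area of a triangle from its vertices (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return 0.5 * abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))


areas = {name: triangle_area(*prims) for name, prims in PRIMARIES.items()}
reference = areas["BT.709"]

for name, area in areas.items():
    print(f"{name:8s} area {area:.4f}  ({area / reference:.2f}x BT.709)")
```

By this crude measure the ordering BT.709 < DCI-P3 < BT.2020 is clear, which is consistent with the need for both gamut extension and gamut reduction discussed in this section.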

    New laser projectors are able to cover the very wide BT.2020 gamut, but if the inputs are movies with DCI-P3 gamut, as virtually all professional movies currently are, the full colour rendering potential of these new projectors cannot be realised. There is a pressing need then to develop gamut extension techniques that enlarge the gamut of the movie content while ensuring that the gamut extended result preserves as much as possible the artistic intent of the content creator.

    Likewise, there is a need for automatic gamut reduction techniques, for display of wide-colour-gamut HDR content on current screens, the vast majority of which have a colour gamut no larger than DCI-P3. Professional movies are colour-graded in DCI-P3 and thus the final result is ready for cinema, but its gamut must be modified to fit into BT.709 for TV, disc release or streaming: this is done through a gamut modification process based on the use of three-dimensional look-up tables (LUTs). These LUTs contain millions of entries and colourists only specify a few colours manually, while the rest are interpolated without taking into account their spatial or temporal context. As a result, the resulting video may have false colours that were not present in the original material, and intensive manual correction is usually necessary, commonly performed on a shot-by-shot, object-by-object basis.
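
The way a three-dimensional LUT is applied can be sketched as follows. This is a generic trilinear-interpolation lookup written for illustration, not the tooling used in any particular grading suite; the `desaturate` transform, its `amount` parameter and the grid size of 5 are hypothetical stand-ins for a real gamut-reduction LUT.

```python
def build_lut(transform, size: int):
    """Sample `transform` on a regular size^3 grid over the unit RGB cube."""
    step = 1.0 / (size - 1)
    return [[[transform(r * step, g * step, b * step)
              for b in range(size)]
             for g in range(size)]
            for r in range(size)]


def apply_lut(lut, rgb):
    """Map one RGB triplet through the LUT with trilinear interpolation."""
    size = len(lut)
    idx, frac = [], []
    for c in rgb:
        pos = min(max(c, 0.0), 1.0) * (size - 1)
        i = min(int(pos), size - 2)      # lower grid node along this axis
        idx.append(i)
        frac.append(pos - i)             # position inside the grid cell
    out = [0.0, 0.0, 0.0]
    for dr in (0, 1):                    # blend the 8 corners of the cell
        for dg in (0, 1):
            for db in (0, 1):
                w = ((frac[0] if dr else 1.0 - frac[0]) *
                     (frac[1] if dg else 1.0 - frac[1]) *
                     (frac[2] if db else 1.0 - frac[2]))
                node = lut[idx[0] + dr][idx[1] + dg][idx[2] + db]
                for k in range(3):
                    out[k] += w * node[k]
    return tuple(out)


def desaturate(r, g, b, amount=0.3):
    """Hypothetical stand-in transform: pull colours towards BT.709 luma."""
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return tuple(amount * y + (1.0 - amount) * c for c in (r, g, b))


lut = build_lut(desaturate, size=5)
print(apply_lut(lut, (0.2, 0.55, 0.9)))
```

Note that `apply_lut` receives a single pixel value and nothing else: neighbouring pixels and neighbouring frames play no role, which is the lack of spatial and temporal context that the paragraph above identifies as a source of false colours.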

    In TV production, gamut mapping is performed in real-time with the use of pre-determined LUTs, and therefore issues are unavoidable. There is also the need for gamut mapping when the output is an AR/VR headset: this type of technology mostly uses OLED screens, where the gamut is considerably wider than BT.709, but not nearly as wide as BT.2020, hence either gamut extension or gamut reduction is desirable, depending on the input content.

    1.4.5 Production and editing guidelines for HDR material

    The current take on HDR imaging is still rather conservative: for instance, the standard encoding curve ST2084 aims to enable the creation of video images with an increased luminance range, not the creation of video images with overall higher luminance levels; that is, ST2084 HDR does not attempt to make the whole image brighter, only highlight detail such as chrome reflections or light sources [8]. The movie industry, on the other hand, would like to assess what the complete range of possibilities that HDR allows is, so that movie makers can exploit the full potential of the technology and bring the medium to a new level.
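
For readers who want to see the ST2084 curve in numbers, the following sketch implements the perceptual quantiser (PQ) encoding and its inverse, using the constants published in the standard. The function names and the sample luminance values are choices made here for illustration.

```python
# Constants of the SMPTE ST 2084 (PQ) curve.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK = 10_000.0          # cd/m^2, the luminance mapped to signal value 1.0


def pq_encode(luminance: float) -> float:
    """Absolute luminance (cd/m^2) -> non-linear signal value in [0, 1]."""
    y = min(max(luminance, 0.0), PEAK) / PEAK
    yp = y ** M1
    return ((C1 + C2 * yp) / (1.0 + C3 * yp)) ** M2


def pq_decode(signal: float) -> float:
    """Non-linear signal value in [0, 1] -> absolute luminance (cd/m^2)."""
    vp = min(max(signal, 0.0), 1.0) ** (1.0 / M2)
    return PEAK * (max(vp - C1, 0.0) / (C2 - C3 * vp)) ** (1.0 / M1)


for nits in (0.1, 1.0, 100.0, 1_000.0, 10_000.0):
    print(f"{nits:8.1f} cd/m^2 -> signal {pq_encode(nits):.4f}")
```

A luminance of 100 cd/m^2 lands at roughly half of the signal range, so about half of the code values are reserved for levels above that, which fits the description of the curve as extending the range for highlights rather than brightening the whole image.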

    This would require the elaboration of a set of guidelines to assist content creators in shooting and processing HDR material, and these guidelines will have to address a number of issues that may arise due to the complex and not yet fully understood interactions of HDR images with the human visual system, including: bright light exposure reduces pupil size; there is a shift from rod-dominated vision to cone-dominated vision when light conditions brighten; bright lights can reduce retinal sensitivity and produce after-images; adaptation to changes of average luminance levels may take seconds; high luminance can affect the perception of flicker and judder artefacts; fast cuts with high brightness elements might cause eye strain; the appearance of cross dissolves, fade-ins and fade-outs will be different in HDR movies; there might be a need for different production procedures depending on whether the intended output is TV or large cinema screens, because of expected differences in eye-tracking; etc.

    1.4.6 Tools for personalisation

    The cinema industry has a successful, proven record of ensuring that moving pictures have the intended, optimal contrast and colour in the controlled scenarios given by cinema theatres and home TV viewing, and post-production is always performed for these two types of viewing conditions. But we are currently living through an explosive growth of video consumption on mobile devices. Some manufacturers have started to tackle this issue with partial, proprietary solutions, but since media convergence and movie watching on the go may happen under a variety of very much uncontrolled viewing conditions, we can expect that image appearance will vary as well, both fluctuating as the surroundings change and departing from the creator's intent, sometimes very noticeably as in the case of memory colours and skin tones, which colourists take pains to preserve during post-production but only for a home TV or cinema screen output. A related issue is that individual differences in colour perception become more prominent as the colour capabilities of displays are improved.

    1.5 Overview of the book

    This book is focused on solutions, based on vision science, for several of the issues just mentioned but especially for the problems of tone and gamut mapping.

    We start with an overview of the biology of vision. In Chapter 2 we will describe the impact of the optics of the eye in the formation of the retinal image, the layered structure of the retina and its different types of cell, and the neural interactions and transmission channels by which information is represented and conveyed. Whenever possible we try to stress two ideas: first, that many characteristics of the retina and its processes can be explained in terms of maximising efficiency, because they simply are the optimal choices; this concept of efficient representation is key for our applications and is the underlying theme in most of the book. And second, that much is still unknown about how the retina works.

    In Chapter 3 we will describe the layered structure, cell types and neural connections in the lateral geniculate nucleus and the visual cortex, with an emphasis on colour representation. We will also introduce linear+nonlinear (L+NL) models, which are arguably the most popular form of model not just for cell activity in the visual system but for visual perception as well. An important take-away message is that despite the enormous advances in the field, the most relevant questions about colour vision and its cortical representation remain open: which neurons encode colour, how does the cortex transform the cone signals, how shape and form are perceptually bound, and how do these neural signals correspond to colour perception. Another important message is that the parameters of L+NL models change with the image stimulus, and the effectiveness of these models decays considerably when they are tested on natural images. This has grave implications for our purposes, since in colour imaging many essential methodologies assume a L+NL form.

    In Chapter 4 we discuss adaptation and efficient representation. Adaptation is an essential feature of the neural systems of all species, a change in the input–output relation of the system that is driven by the stimuli and that is intimately linked with the concept of efficient representation. Through adaptation the sensitivity of the visual system is constantly adjusted taking into account multiple aspects of the input stimulus, matching the gain to the local image statistics through processes that aren't fully understood and contribute to make human vision so hard to emulate with devices. Adaptation happens at all stages of the visual system, from the retina to the cortex, with its effects cascading downstream; it's a key strategy that allows the visual system to deal with the enormous dynamic range of the world around us while the dynamic range of neurons is really limited.

    In Chapter 5 we discuss brightness perception, the relationship between the intensity of the light (a physical magnitude) and how bright it appears to us (a psychological magnitude). It has been known for a long time that this relationship is not linear, that brightness isn't simply proportional to light intensity. But we'll see that determining the brightness perception function is a challenging and controversial problem: results depend on how the experiment is conducted, what type of image stimuli are used and what tasks the observers are asked to perform. Furthermore, brightness perception depends on the viewing conditions, including image background, surround, peak luminance and dynamic range of the display, and, to make things even harder, it also depends on the distribution of values of the image itself. This is a very important topic for imaging technologies, which require a good brightness perception model in order to encode image information efficiently and without introducing visible
