Biomedical Image Synthesis and Simulation: Methods and Applications

Ebook · 1,321 pages · 13 hours
About this ebook

Biomedical Image Synthesis and Simulation: Methods and Applications presents the basic concepts and applications of image-based simulation and synthesis in medical and biomedical imaging. The first part of the book introduces and describes the simulation and synthesis methods developed and successfully used over the last twenty years, from parametric to deep generative models. The second part gives examples of successful applications of these methods. Together, the two parts give the reader insight into the technical background of image synthesis and how it is used in the disciplines of medical and biomedical imaging. The book ends with several perspectives: on the best practices to adopt when validating image synthesis approaches, on the crucial role that uncertainty quantification plays in medical image synthesis, and on research directions worth exploring in the future.

  • Gives state-of-the-art methods in (bio)medical image synthesis
  • Explains the principles (background) of image synthesis methods
  • Presents the main applications of biomedical image synthesis methods
Language: English
Release date: Jun 18, 2022
ISBN: 9780128243503

    Book preview

    Biomedical Image Synthesis and Simulation - Academic Press

    Part 1: Methods and principles

    Outline

    Chapter 2. Parametric modeling in biomedical image synthesis

    Chapter 3. Monte Carlo simulations for medical and biomedical applications

    Chapter 4. Medical image synthesis using segmentation and registration

    Chapter 5. Dictionary learning for medical image synthesis

    Chapter 6. Convolutional neural networks for image synthesis

    Chapter 7. Generative adversarial networks for medical image synthesis

    Chapter 8. Autoencoders and variational autoencoders in medical image analysis

    Chapter 2: Parametric modeling in biomedical image synthesis

    Pekka Ruusuvuori (a,b)

    (a) Institute of Biomedicine, University of Turku, Turku, Finland

    (b) Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland

    Abstract

    Parametric model-based simulation approaches enable the flexible generation of images for specific purposes. Parametric modeling allows prior knowledge of the physical properties of the image acquisition device, and of the underlying biological phenomena and objects, to be incorporated into the simulation system. The simulation process is controlled by a set of model parameters, which allow synthetic images to be generated with full control over the outcome. Parametric models have been introduced in various areas of biomedical image simulation and object synthesis, ranging from the modeling of different imaging and measurement modalities to objects at various scales and dimensions, such as cells and organelles, populations, and tissues. The control of the simulation process through parameters enables various conditions to be simulated, making validation of image analysis algorithms and tools with simulated images an appealing alternative to manual annotation-based validation. We introduce parametric model-based simulation approaches for generating synthetic cell images, covering the modeling of shape, appearance, spatial distribution, and the image acquisition system.

    Keywords

    Shape synthesis; Microscope image simulation; Cell modeling; Parametric modeling

    Acknowledgements

    The author would like to thank Dr. Antti Lehmussola and Dr. David Svoboda for their helpful comments and input to this chapter.

    2.1 Introduction

    Simulation and modeling have been widely used to generate synthetic data across multiple disciplines of the biomedical sciences, including many different imaging modalities and applications. Depending on the imaging modality, the targeted model object, and the purpose of the synthesized data, the modeling naturally differs in various respects. Clearly, medical imaging of macro-level objects, such as magnetic resonance imaging of brains and positron emission tomography of animal models, differs from microscope imaging of cell populations or tissue samples. Even within microscopy, the abundance of imaging modalities, let alone the differences in the objects and samples that can be imaged, makes a general approach covering all of biomedical image simulation almost impossible. Here, we focus on presenting the typical phases of parametric modeling-based simulation of microscopy images, with cells and cellular structures as the objects of interest.

    Despite the focus on a specific application area, simulation processes share some general principles. The modeling process aims to capture the essential characteristics of the underlying phenomenon or physical objects, and then to mimic the image acquisition device through which we capture distorted images of the true objects. In more detail, the process can be divided into two steps. First, an ideal, undistorted image of the imaged object is generated. This is done using a model for the objects of interest, which may include prior knowledge of the physical properties of the objects, domain knowledge from biology or medicine, as well as ad-hoc modeling based on visual properties. The result, an ideal object representation, is then fed to the second phase of the simulation process, the modeled measurement system. This phase distorts the ideal object representation with the aberrations and errors caused by the image acquisition process, producing an image that resembles realistic data obtained from imaging devices.

    One way to categorize simulation approaches is the division into parametric modeling and learning-based modeling [1]. The former category, parametric modeling, is a widely used approach in which a simplified, explicitly defined model controlled by user-defined parameters is used to generate shapes and image properties that, to a useful level, resemble images of real objects. For example, defining a fluorescence-labeled object as a 2D Gaussian surface, and its movement over time as a random walk, can be used for simulating time-lapse images with parametric models (a sketch of this example follows below). The approach enables efficient control over the simulated image properties, making it well suited for creating synthetic experiments for testing automated image analysis algorithms under various conditions. The latter, learning-based modeling, is an approach in which the model properties are learned from training examples, so that the natural variation of the generated shapes can be captured by including representative samples in the training data.
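    To make the parametric example above concrete, the following minimal Python sketch (an illustration of our own, not code from any of the cited simulators; all names and parameter values are assumptions) renders a 2D Gaussian spot and moves it with a random walk to produce a synthetic time-lapse stack.

```python
import numpy as np

def gaussian_spot(shape, center, sigma):
    """Render a fluorescence-labeled object as a 2D Gaussian surface."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    cy, cx = center
    return np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))

def simulate_time_lapse(n_frames=10, shape=(64, 64), sigma=3.0, step=2.0, seed=0):
    """Move the spot with a random walk to obtain a time-lapse image stack."""
    rng = np.random.default_rng(seed)
    pos = np.array(shape, dtype=float) / 2          # start at the image center
    frames = []
    for _ in range(n_frames):
        pos += rng.normal(scale=step, size=2)       # random-walk displacement
        pos = np.clip(pos, 0, np.array(shape) - 1)  # keep the object in view
        frames.append(gaussian_spot(shape, pos, sigma))
    return np.stack(frames)
```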

    Here, we focus on simulation processes generating synthetic images using parametric modeling. In particular, we discuss parametric modeling of the cellular objects in 2D, both as a generic random shape model and as cell-type specific ad-hoc models. We will also discuss the modeling of microscopy measurement and image acquisition systems. We limit to using the SIMCEP simulation framework [2,3] and its refined versions for specific use cases [4,5] as an example implementation. Finally, we provide example use cases for simulated images generated through parametric modeling, and discuss potential future directions.

    2.2 Parametric modeling paradigm

    Parametric modeling provides control over the characteristics of the simulated images, allowing unconstrained quantities of synthetic data with desired properties to be generated. In the context of microscopy images of cell populations, the obvious characteristics relate to the number of objects, their spatial arrangement in the images, and the shape and appearance of the objects. Further, when generating simulated images with realistic appearance for validating image analysis tools, the image acquisition process and other sources of noise and variation also need to be modeled.

    In the parametric modeling simulation paradigm, each property is in principle controlled through specific parameters defining, e.g., the attributes of the statistical model from which the generated instance is drawn. Complex biological objects and underlying phenomena, biochemical processes, and the physical properties of the measurement system can be modeled either with practical approximations providing coarse representations, or with representations tailored through detailed parameterization to correspond carefully to realistic visual appearance. While introducing more parameters offers unlimited possibilities for adding properties to the simulation model, the resulting complexity of the parameter space is also a shortcoming of the parametric modeling paradigm.

    2.2.1 Modeling of the cellular objects

    A cell, the fundamental functional unit of biology, is a very challenging object to model given its versatile phenotypes and functions, despite having been extensively studied with different microscopy imaging platforms. Incorporating all the knowledge on various cell types accumulated over decades would be practically impossible with parametric modeling approaches. Instead of fully realistic modeling, the aim is typically to provide simulated images with a realistic enough appearance from the perspective of image processing tasks. Modeling objects also imposes different challenges and limitations depending on the setting: in temporal 2D modeling, instead of generating independent instances from a random model, the model parameters need to incorporate temporal similarity, so that shapes and spatial appearance evolve with dependency between time points. Modeling is also dependent on the targeted modality; for example, 2D versus 3D object generation poses different limitations on object placement, since overlap in a 2D projection is not actual overlap in 3D space. Here, we focus on describing the typical steps in modeling cellular objects in 2D microscopy, which can be considered a less complex task than temporal 2D, 3D, or temporal 3D simulation.

    2.2.1.1 Generic parameter-controlled shape modeling: random shape model for nucleus and cell body

    Parametric shape modeling enables efficiently controlling the shape of the simulated objects. By modifying the model parameters, it is possible to adjust the object shape and appearance in a fully controlled manner. Parametric random models for generic cell shapes have been introduced in, e.g., [2,3] and [6]. The former, SIMCEP cell simulator, proposed a generic shape model, which can be used for generating various combinations of random shapes with varying degrees of complexity. The model is based on a polygon with randomly dislocated vertices, and a smooth contour fit between the vertices, representing the shape outline.

    As presented in [3], a polygon with regular vertices can be generated by equidistantly sampling a circle, and by dislocating the vertices into random spatial locations. The parametric shape model then becomes a random polygon with scale r, generated using a uniform distribution as follows:

    $(x_i, y_i) = \rho_i\,(\cos\theta_i, \sin\theta_i), \quad \rho_i = r\,(1 + \alpha\,u_i), \quad \theta_i = \frac{2\pi i}{N} + \beta\,v_i, \quad u_i, v_i \sim U(-1, 1),$  (2.1)

    where $N$ is the number of vertices and $i = 0, \dots, N-1$; see [3] for the exact formulation.

    The key parameters for generating varying shapes are the range-defining parameter β, which controls the randomness of the vertex sampling, and the parameter α, which adds a random component to the scale (radius) of the object. The final outline of the object is obtained by interpolating a cubic spline between the vertices. This contour forms a smooth outline for the parameter-controlled random shape.

    In Fig. 2.1, an example of the shape generation is illustrated. The key property of the parameter-controlled random shape is its versatility – through setting the parameter values controlling the object radius (α) and vertex randomness (β), the same model can be used for generating objects with varying shapes. For example, cell nucleus (relatively round, moderate variation between cells) and cell body (more complex shape with significant variation between cells) can be generated using the same basic model.

    Figure 2.1 Shape complexity controlled through parameters α and β, controlling the randomness of the object radius and vertex sampling. Increasing the values from α = β = 0 on the left creates shapes with increasing complexity. Adapted from [3].
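    The following Python sketch illustrates the idea of Eq. (2.1) (our own illustration, not the SIMCEP implementation): vertices are sampled on a perturbed circle and joined with a periodic cubic spline to form a smooth closed outline.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def random_shape(n_vertices=10, r=1.0, alpha=0.3, beta=0.3, seed=None):
    """Random polygon with radius randomness alpha and vertex randomness beta."""
    rng = np.random.default_rng(seed)
    # Equidistant angles on a circle, each perturbed by beta (vertex randomness).
    theta = (2 * np.pi * np.arange(n_vertices) / n_vertices
             + beta * rng.uniform(-np.pi / n_vertices, np.pi / n_vertices, n_vertices))
    # Object radius perturbed by alpha (scale randomness).
    rho = r * (1 + alpha * rng.uniform(-1, 1, n_vertices))
    x, y = rho * np.cos(theta), rho * np.sin(theta)
    # Close the polygon and fit periodic cubic splines for a smooth outline.
    t = np.arange(n_vertices + 1)
    sx = CubicSpline(t, np.append(x, x[0]), bc_type='periodic')
    sy = CubicSpline(t, np.append(y, y[0]), bc_type='periodic')
    ts = np.linspace(0, n_vertices, 200)
    return sx(ts), sy(ts)  # smooth contour coordinates
```

    With α = β = 0 the contour reduces to a circle of radius r; increasing either parameter produces increasingly irregular nucleus- or cell-body-like outlines.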

    Several other parametric random shape models have been presented in the literature. To provide a few examples, in [7] the nucleus shape was modeled using a truncated Fourier series for randomized radius for the nucleus outline, and in [6] the object surface was generated as a deformable model implemented as a fast level-set method with artificial noise as the speed function. In [8], the nuclei shapes (in 3D) were generated using level-set deformed Voronoi diagrams while in [7] the Voronoi diagrams were used for generating cell body outlines.

    2.2.1.2 Cell-type specific parametric shape models

    The shapes generated using this relatively simple parametric model, however, are limited to generic random outlines and cannot reproduce the shapes of any particular cell type. From the analysis perspective, the capability to capture differences (and similarities) between cell phenotypes, and to characterize appearance properties in terms of shape and intensity, is a fundamental requirement for an image analysis algorithm. Thus, when generating synthetic images, the capability to produce objects (cells) with distinctive phenotypic characteristics is a desired feature. When modeling shapes outside the range of the generic parametric model described above, for example when aiming at shapes typical of specific cell types, the model needs to be defined specifically for that purpose.

    Several studies in the literature have proposed cell-type specific parametric models to generate shapes representing certain cell types, or to enable generating different cell types flexibly as in [9]. For example, in [4], ad-hoc parameter-controlled models for five bacteria types with distinctive shapes were presented. Table 2.1 summarizes the shape models, controlled with simple range restricting parameters setting the object lengths. In [7], the cervical cells in Pap smears were modeled using a combination of random parametric models for nuclei and Voronoi tessellation for cell body. In [10], more cell-type specific models were introduced (bacilli as simple linear shapes and white blood cells using the parametric model derived from Eq. (2.1)). In [11], a model-based approach for simulating cells with filopodial protrusions was presented, and used for generating simulated images of lung cancer cells of two phenotypes.

    Table 2.1

    2.2.1.3 Modeling appearance: texture and subcellular organelle models

    In cell microscopy, the intensity profile and texture of the cells are fundamental characteristics of the objects, and they often carry crucial information about the studied phenomenon. Especially in fluorescence microscopy, staining can be used to obtain a detailed readout of cell status and function, as well as to reveal the role and function of various subcellular organelles. Thus, modeling the appearance of cells, as their intensity distribution and texture, is a complex and challenging task, but it also offers possibilities for introducing biologically relevant content at the cellular level.

    Despite these possibilities, the parametric modeling of appearance is typically guided by visual similarity rather than biological insight. In fact, the texture is often modeled as noise, or more precisely, with a multiscale random texture generator, the so-called Perlin noise [12]. The texture t, generated at a spatial coordinate $\mathbf{x}$, is defined as

    $t(\mathbf{x}) = B + \sum_{i=0}^{n-1} p^{\,i}\, \eta\left(2^{i}\mathbf{x}\right),$  (2.2)

    which forms the texture as multifrequency noise by taking a weighted sum of n octaves of a basic noise function $\eta$, where each octave doubles the frequency of the previous one. The parameter B defines a bias term for the texture intensity, and the persistence parameter p controls the relative scale of the summed noise functions. As with the generic random shape, the texture synthesis generates versatile appearances by controlling only a few parameters. Fig. 2.2 shows an example object with multiple realizations of the parametric noise model, with varying levels of detail in the object texture.

    Figure 2.2 Texture generation using a multiscale noise model. By controlling the noise model scale, textures with different frequency components can be generated. Low-frequency components are visible on the top left and high-frequency components on the bottom right.
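    A minimal octave-noise sketch in the spirit of Eq. (2.2) is shown below (our own illustration; value noise, i.e., smoothly upsampled random grids, stands in for Perlin's gradient noise).

```python
import numpy as np
from scipy.ndimage import zoom

def multiscale_texture(shape=(128, 128), n_octaves=4, persistence=0.5,
                       bias=0.0, seed=None):
    """Weighted sum of n octaves of a basic noise function, cf. Eq. (2.2)."""
    rng = np.random.default_rng(seed)
    texture = np.full(shape, bias, dtype=float)   # bias term B
    for i in range(n_octaves):
        freq = 2 ** i                             # frequency doubles per octave
        grid = rng.random((freq + 1, freq + 1))   # coarse random grid
        # Smoothly upsample the coarse grid to the image size.
        layer = zoom(grid, (shape[0] / grid.shape[0], shape[1] / grid.shape[1]),
                     order=3)
        texture += (persistence ** i) * layer[:shape[0], :shape[1]]
    return texture
```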

    After its introduction in SIMCEP, the multiscale noise generator has been used for texture synthesis in many other studies. As an exception, in SimuCell [9] the modeling of multimarker cell instances includes the possibility of guiding the marker intensity through, e.g., the spatial micro-environment and marker codependency, providing enhanced opportunities for more biologically guided simulation of cell appearance. Further, in [13], the fluorescence pattern forming the cell texture is modeled using a fluorescence cluster generator, forming a cloud of point-like spots at varying focal levels around the nucleus shape. Some simulators also allow generating bright spots in additional image channels, representing fluorescence labeling of specific small subcellular organelles [14].

    2.2.1.4 Modeling spatial distribution and populations

    The spatial distribution and quantity of objects in the image are among the key properties when characterizing microscopy images. In reality, numerous factors affect the spatial distribution of cells in a population, ranging from the properties of the cells to the environment of the experiment. In some cases the cells tend to grow rather evenly spaced on a plate, while in others the cells cluster or group to varying degrees. The tendency to overlap also varies.

    In order to generate images with such population-level characteristics, the spatial locations need to be controlled at the population level. One possibility is to define the locations as random spatial coordinates with a uniform distribution, with an additional parameter $p_c$ controlling the probability of a cell belonging to a cluster. Thus, with probability $1 - p_c$ a cell is placed randomly in the image, and with probability $p_c$ it is placed within one of the clusters, whose number is user-defined. The maximum allowed overlap of generated objects can be controlled through a parameter with values in $[0, 1]$, where zero means no overlap is allowed and one allows full overlap. By tuning this handful of parameter values, the simulation process can be set to generate images with varying population-level characteristics. Fig. 2.3 illustrates how the overall image appearance can be efficiently altered through the parameters controlling the spatial arrangement of cells; the spatial appearance, and also the challenge posed to, for instance, cell segmentation algorithms, is adjusted through the clustering probability $p_c$.

    Figure 2.3 Modeling population-level differences in the spatial locations of cells (from [14]). The probability pc of a cell belonging to a cluster ranges from 0 on the left, through 0.30 in the middle, to 0.60 on the right, while the number of objects remains constant.
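    The placement logic can be sketched as follows (an illustration with assumed parameter names; the overlap check described above is omitted for brevity).

```python
import numpy as np

def place_cells(n_cells, shape=(512, 512), p_c=0.3, n_clusters=3,
                cluster_sigma=25.0, seed=None):
    """Uniform placement with probability 1 - p_c, clustered with probability p_c."""
    rng = np.random.default_rng(seed)
    centers = rng.uniform([0, 0], shape, size=(n_clusters, 2))  # cluster centers
    positions = []
    for _ in range(n_cells):
        if rng.random() < p_c:
            c = centers[rng.integers(n_clusters)]          # pick a random cluster
            pos = rng.normal(loc=c, scale=cluster_sigma)   # scatter around it
        else:
            pos = rng.uniform([0, 0], shape)               # uniform placement
        positions.append(np.clip(pos, 0, np.array(shape) - 1))
    return np.array(positions)
```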

    So far we have considered the spatial organization of cells in the context of cultured cell populations. The tissue micro-environment presents another natural context, in which cell organization is highly complex and highly regulated. Parametric population models for the spatial organization of cells in tissue have also been proposed for various tissue types. For example, in [15], colonic crypt micro-environments were generated by defining clusters as elliptic crypts, while keeping the simulation framework otherwise similar to the one presented here. In [8], 3D images representing simulated human colon tissue were generated by placing simulated nuclei on toroid-shaped clusters, representing villi in colon tissue.

    2.2.2 Modeling microscopy and image acquisition: from object models to simulated microscope images

    Recall the two-step simulation process: first an ideal representation of the physical objects is generated, and then this ideal image is distorted by a process mimicking image formation in the optical system and digital image acquisition. The latter part can in general be modeled in more detail using existing knowledge of the process. For example, several technical characteristics of microscopy imaging systems, such as optical aberrations, out-of-focus effects, background intensity, and detector noise, are well-known features and error sources in microscopy. Thus, their modeling can be based more directly on known physical limitations than the modeling of object appearance.

    Uneven background illumination is a visually striking feature often present in microscopy images, and one of the common properties complicating automated analysis. Thus, when generating simulated images with realistic characteristics, uneven illumination needs to be included. A common approximation is to use a 2D quadratic function to model the additive background intensity, with control over the strength of the intensity profile and potentially additional parameters to control the shift from the image center. More accurate models, with separate additive and multiplicative intensity profiles, have also been proposed. Effectively, they produce the visual effect clearly visible in Fig. 2.4 (left).

    Figure 2.4 Modeling unidealities introduced by microscopy and image acquisition process: (left) uneven background illumination, seen as a slowly-varying background intensity in the simulated image; (middle) zoom-in showing blurring caused by the optical system and between-object focus differences; (right) zoom-in showing noise introduced by the imaging device and implemented with an additive noise model. Adapted from [3].
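    A quadratic illumination profile of the kind described above can be sketched as follows (our own illustration; the strength and shift parameters are assumed names).

```python
import numpy as np

def uneven_illumination(shape, strength=0.3, shift=(0.0, 0.0)):
    """Additive 2D quadratic background, brightest at a (shifted) center."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]].astype(float)
    cy = shape[0] / 2 + shift[0] * shape[0]   # shifted profile center (rows)
    cx = shape[1] / 2 + shift[1] * shape[1]   # shifted profile center (cols)
    d2 = ((yy - cy) / shape[0]) ** 2 + ((xx - cx) / shape[1]) ** 2
    return strength * (1.0 - d2 / d2.max())   # quadratic falloff to the borders
```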

    Another typical error source in microscopy is blur from the optical system. As deconvolution has been studied intensively in microscopy, it is in principle possible to build a fully realistic model of optical blurring based on the point spread function (PSF). However, approximations are also useful here, and an obvious choice is to model the blurring with a Gaussian kernel representing the PSF. A common property of microscopy images is the limited depth of focus, which causes objects outside the in-focus plane to appear blurred in the image. Thus, to model optical blurring while taking into account objects on different focal planes, a spatially varying PSF was presented in [3], where the objects were assigned to varying depth levels affecting the width of the Gaussian kernel used for blurring. Fig. 2.4 (middle) shows an example of the effect of blurring with a varying Gaussian kernel approximating the PSF.

    In the final phase in imaging devices, the continuous signal arriving through the optical system is captured into discrete digital images. In reality, this is done using detectors, sensor arrays (e.g., CCD/CMOS) that convert the flood of photons arriving through the optical system into voltage differences, which are then discretized and converted into a representation suitable for computers. While a discrete representation is of course built into the simulation process, the acquisition process also introduces sensor noise. Finally, the acquired image is stored as an image file, where potentially lossy compression introduces the last artifacts to the image. Thus, the image acquisition process can be modeled by a combination of photon shot noise and additive Gaussian noise, with optionally applied lossy compression [1]. Fig. 2.4 (right) illustrates the result of implementing such a noise model in a simulated microscopy image.
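    Putting the pieces together, a minimal acquisition sketch (an illustration under the assumptions above, with an assumed photon-count scale) blurs the ideal image with a Gaussian PSF and adds shot and detector noise.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def acquire(ideal, psf_sigma=1.5, photons=200.0, read_noise=0.01, seed=None):
    """Distort an ideal image with PSF blur, photon shot noise, and sensor noise."""
    rng = np.random.default_rng(seed)
    blurred = gaussian_filter(ideal, sigma=psf_sigma)  # Gaussian-kernel PSF
    # Poisson photon shot noise at an assumed photon-count scale.
    shot = rng.poisson(np.clip(blurred, 0, None) * photons) / photons
    noisy = shot + rng.normal(scale=read_noise, size=ideal.shape)  # detector noise
    return np.clip(noisy, 0.0, 1.0)
```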

    2.3 On learning the parameters

    Parametric modeling usually refers to the kind of ad-hoc defined models with experimentally defined values described in this chapter. It is, however, also possible to define the parameter values and their range in a data-driven, learning-based manner. In such modeling, either the parameter values or the whole model is defined by learning the representation from training samples. This is especially useful when considering the shape, appearance, and also population and spatial arrangement of objects, for which it is more challenging to define well-grounded models than for the technical measurement and image acquisition process.

    Examples of learning-based models include the approaches by [16,17] for learning subcellular localization models, [4] for learning bacterial cell shapes, [10] for squamous intermediate cell modeling using Fourier shape descriptors and [18] for urothelial cell modeling. Using learning-based modeling, the modeling is largely controlled through the samples included in the training data.

    The subcellular localization models in [16] showcase the learning-based parametric modeling approach. The localization models were built to facilitate capturing protein localization patterns and regenerating their natural variation. Such learning-based automated modeling can be used for the purposes of generating statistically accurate simulations covering various biological phenomena in a systematic manner. The model parameters were learned from real microscope images to capture the parameters for a nested conditional model of medial axis for nucleus shape, nucleus texture, and cell shape, and Gaussian mixture models for the protein localization patterns.

    For learning-based modeling of object classes with distinct shapes, such as different bacteria types, a deformable shape model was proposed in [4]. The model learns shapes from training objects extracted from segmented microscope images representing shapes characteristic of particular cell types – in this case, bacteria with clearly separable round and longish shapes were used as training and target objects. Fig. 2.5 shows examples of real bacteria (upper row) with round (left) and longish (right) shapes, with the shapes generated by the deformable shape models trained on the two object classes illustrated in the lower row. Despite the somewhat successful learning-based modeling approaches referred to here, the representative power of such models falls short of that of modern generative machine learning approaches.

    Figure 2.5 Learning-based shape simulation using a deformable shape model for two object classes, with parameters learned from real microscope images of two bacteria types: (upper row) examples of real bacteria with round (left) and longish (right) shapes; (lower row) shapes generated by the deformable shape models. Adapted from [4].

    2.4 Use cases

    2.4.1 SIMCEP: a parametric modeling framework aimed at generating and understanding microscopy images of cells

    The SIMCEP simulation framework, with the basic framework presented in [2–4], was developed during the era when biomedical imaging was quickly developing from visual inspection and low-throughput manual analysis into a high-throughput quantitative field, with an effort to interpret image data and the underlying biomedical phenomenon at the level of computational systems [19,20]. The SIMCEP simulation framework follows the workflow described in the schema of Fig. 2.6. First an ideal version of the simulated objects and their spatial arrangement in the population present in the image field-of-view is generated, which also serves as the ground truth for the objects vs. background classification task at the pixel level. Second, the image is distorted with several error sources, emulating aberrations, noise and unidealities introduced in a typical microscope measurement.

    Figure 2.6 Simulation workflow of the SIMCEP framework. The process is controlled through input variables, allowing prior knowledge of the studied samples or conditions to be incorporated, and provides the possibility to generate desired characteristics in the output. The first part generates an ideal image of the underlying sample by generating the objects, their appearance, and their spatial locations in the image. The second part distorts the ideal image with various sources of error, resembling the effects of typical microscope image acquisition.

    The simulation framework had two goals. The first was to provide means for validating automated image analysis algorithms, tools, and software with large volumes of image data in various, controlled, scenarios without the need for manual annotation. This goal has been clearly met, with the simulator being actively used and further developed as of today. One key to this success has been the freely available open source implementation of the SIMCEP framework [3] as well as the benchmark image set [14] generated using it, both of which have been popular resources for the community. The second goal was to create a framework for modeling cells and the microscopy measurement system in a comprehensive manner, possibly later enabling simulation-based experiments by connecting biology-driven models. This would require feeding genomic or protein level information controlling the appearance of cells directly into the image simulation framework. In [21], a somewhat similar approach was proposed for a different measurement platform, where gene regulatory networks were used as a basis for generating data based on which artificial microarray images were generated. Connecting similar regulatory information into cell appearance is obviously a significantly more complex task. To reach this ambitious goal, the feedback loop, from our understanding of cells and their function at the molecular level to the modeling of cells as images, still needs to be strengthened. Here, the modern deep learning-based approaches are likely to provide a stronger computational basis than the simple parametric models covered in this chapter.

    In the previous sections, we described the two basic steps in parametric modeling, generating an ideal representation of the underlying objects and modeling the measurement system, using the SIMCEP framework as an example implementation. The SIMCEP approach sets some limitations; for example, we do not extensively cover time-lapse or 3D simulations. Also, the provided examples and alternative approaches do not represent a systematic review of all available approaches.

    2.4.2 Simulated data for benchmarking

    One of the main benefits of parametric modeling is the control over the simulated image characteristics. Parametric modeling enables generating large quantities of simulated image data with desired properties. For example, by controlling the number of cells and their probability of clustering together, as well as by allowing the cells to overlap, scenarios of varying complexity from the perspective of automated image analysis can be generated. Similarly, by controlling the shape through the size and especially the shape randomness parameters, image data with varying levels of shape complexity can be generated. Such properties become useful when analyzing how different image analysis algorithms handle specific tasks, such as the under-/over-segmentation of touching and overlapping cells, or the quantification of cell properties by extracting numerical descriptors at the population or single-cell level.

    For example, in [14], a benchmark dataset was generated to remove the need to run the simulation framework in order to access synthetic validation data. The idea stems from the machine learning community, where benchmark datasets are a popular way to validate and compare algorithms. The simulated benchmark dataset offers the possibility to evaluate various attributes of image analysis algorithms, such as cell counting, spot counting, shape description, and background noise removal. The obvious downside of a fixed dataset is naturally the lack of control over the simulation process – thus the parameter setting files were also provided, to enable easy modification for tailored dataset generation. Similarly, several other publications have presented simulated benchmark datasets for various purposes; see, e.g., [22], where images simulated with the methods presented in [6,23] were used alongside real time-lapse experiments for evaluating the performance of cell tracking algorithms. Simulated images have also been accepted in bioimage benchmark collections [24].

    As an example of how cell-type specific parametric models can be used for validating phenotype prediction algorithms, in [5] a simulated dataset of bacterial colony images was generated for testing a shape-based classifier. Simulated images of bacterial colonies with varying concentrations of three artificial bacterial types with distinctive shapes were generated, representing a time-lapse experiment in which the population dynamics (the relative fraction of each cell type at a given time point) vary over time.

    2.5 Future directions

    Recent years have witnessed deep neural networks conquering practically all computational data analysis and modeling application areas, including biomedical imaging. The unprecedented accuracy obtained with deep learning-based methods in numerous areas suggests that these methods will be increasingly used in biomedical image synthesis tasks as well, shifting the focus from traditional parametric modeling towards data-driven modeling.

    Modern deep learning-based modeling, however, creates extremely complex models that are difficult to interpret. As a result, explainable artificial intelligence is used for gaining insight into complex machine learning models [25]. This approach is gaining increasing interest in biomedicine, where many applications traditionally require human experts. In simulation, too, there remains value in representing the samples and processes in a simple, human-interpretable, and controllable manner. One predictable future direction is the combination of computationally powerful deep learning models with simple parametric models. Another remaining challenge is to move towards system-level modeling, where the modeling process is driven by the underlying biological phenomenon instead of ad-hoc parameter tuning.

    2.6 Summary

    Parametric modeling enables efficient incorporation of prior knowledge into the modeled object shape, appearance, and distribution, as well as knowledge of the physical measurement system. With full control over the simulation process through user-defined parameters, such a modeling approach offers a flexible tool for simulating images of complex biological samples, such as cells and tissue, with realistic characteristics. Some of the pioneering simulation frameworks widely used in the biomedical image analysis community, such as SIMCEP [3], PapSynth [7], SimuCell [9], MitoGen [23], and CytoPacq [26], rely heavily on parametric modeling.

    These tools, and many more, have played an important role in validating automated image analysis algorithms for various tasks. In biomedicine, validation with extensive datasets representing the characteristics and variation of the underlying sample distribution is particularly crucial, and at the same time, ground truth is often expensive and tedious to obtain. Thus, biomedical imaging applications, such as microscopy imaging of cells, have been among the pioneers in simulating synthetic, realistic images of samples and the image acquisition system.

    Through increased information on biological processes, and through rapidly improving capability to computationally model complex targets, simulation can be expected to become an integral part of quantitative bioimaging in the future – and the parametric modeling paradigm has its role as the basis of explainable models characterizing the samples and underlying phenomena.

    References

    [1] V. Ulman, D. Svoboda, M. Nykter, M. Kozubek, P. Ruusuvuori, Virtual cell imaging: a review on simulation methods employed in image cytometry, Cytometry. Part A 2016;89(12):1057–1072.

    [2] A. Lehmussola, J. Selinummi, P. Ruusuvuori, A. Niemisto, O. Yli-Harja, Simulating fluorescent microscope images of cell populations, 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference. IEEE; 2006:3153–3156.

    [3] A. Lehmussola, P. Ruusuvuori, J. Selinummi, H. Huttunen, O. Yli-Harja, Computational framework for simulating fluorescence microscope images with cell populations, IEEE Transactions on Medical Imaging 2007;26(7):1010–1016.

    [4] A. Lehmussola, P. Ruusuvuori, J. Selinummi, T. Rajala, O. Yli-Harja, Synthetic images of high-throughput microscopy for validation of image analysis methods, Proceedings of the IEEE 2008;96(8):1348–1360.

    [5] P. Ruusuvuori, J. Seppala, T. Erkkila, A. Lehmussola, J.A. Puhakka, O. Yli-Harja, Efficient automated method for image-based classification of microbial cells, 2008 19th International Conference on Pattern Recognition. IEEE; 2008:1–4.

    [6] D. Svoboda, M. Kozubek, S. Stejskal, Generation of digital phantoms of cell nuclei and simulation of image formation in 3D image cytometry, Cytometry. Part A 2009;75(6):494–509.

    [7] P. Malm, A. Brun, E. Bengtsson, PapSynth: simulated bright-field images of cervical smears, 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro. IEEE; 2010:117–120.

    [8] D. Svoboda, O. Homola, S. Stejskal, Generation of 3D digital phantoms of colon tissue, International Conference Image Analysis and Recognition. Springer; 2011:31–39.

    [9] S. Rajaram, B. Pavie, N.E. Hac, S.J. Altschuler, L.F. Wu, SimuCell: a flexible framework for creating synthetic microscopy images, Nature Methods 2012;9(7):634.

    [10] P. Malm, A. Brun, E. Bengtsson, Simulation of bright-field microscopy images depicting pap-smear specimen, Cytometry. Part A 2015;87(3):212–226.

    [11] D.V. Sorokin, I. Peterlík, V. Ulman, D. Svoboda, T. Nečasová, K. Morgaenko, L. Eiselleová, L. Tesařová, M. Maška, FiloGen: a model-based generator of synthetic 3-D time-lapse sequences of single motile cells with growing and branching filopodia, IEEE Transactions on Medical Imaging 2018;37(12):2630–2641.

    [12] K. Perlin, An image synthesizer, ACM Siggraph Computer Graphics 1985;19(3):287–296.

    [13] J. Ghaye, G. De Micheli, S. Carrara, Simulated biological cells for receptor counting in fluorescence imaging, BioNanoScience 2012;2(2):94–103.

    [14] P. Ruusuvuori, A. Lehmussola, J. Selinummi, T. Rajala, H. Huttunen, O. Yli-Harja, Benchmark set of synthetic images for validating cell image analysis algorithms, 2008 16th European Signal Processing Conference. IEEE; 2008:1–5.

    [15] V.N. Kovacheva, D. Snead, N.M. Rajpoot, A model of the spatial microenvironment of the colonic crypt, 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI). IEEE; 2015:172–176.

    [16] T. Zhao, R.F. Murphy, Automated learning of generative models for subcellular location: building blocks for systems biology, Cytometry. Part A 2007;71(12):978–990.

    [17] V.N. Kovacheva, D. Snead, N.M. Rajpoot, A model of the spatial tumour heterogeneity in colorectal adenocarcinoma tissue, BMC Bioinformatics 2016;17(1):1–16.

    [18] M. Scalbert, F. Couzinie-Devy, R. Fezzani, Generic isolated cell image generator, Cytometry. Part A 2019;95(11):1198–1206.

    [19] H. Peng, Bioimage informatics: a new area of engineering biology, Bioinformatics 2008;24(17):1827–1836.

    [20] R. Murphy, The quest for quantitative microscopy, Nature Methods 2012;9:627.

    [21] M. Nykter, T. Aho, M. Ahdesmäki, P. Ruusuvuori, A. Lehmussola, O. Yli-Harja, Simulation of microarray data with realistic characteristics, BMC Bioinformatics 2006;7(1):1–17.

    [22] M. Maška, V. Ulman, D. Svoboda, P. Matula, P. Matula, C. Ederra, A. Urbiola, T. España, S. Venkatesan, D.M. Balak, et al., A benchmark for comparison of cell tracking algorithms, Bioinformatics 2014;30(11):1609–1617.

    [23] D. Svoboda, V. Ulman, MitoGen: a framework for generating 3D synthetic time-lapse sequences of cell populations in fluorescence microscopy, IEEE Transactions on Medical Imaging 2016;36(1):310–321.

    [24] V. Ljosa, K.L. Sokolnicki, A.E. Carpenter, Annotated high-throughput microscopy image sets for validation, Nature Methods 2012;9(7):637.

    [25] A. Adadi, M. Berrada, Peeking inside the black-box: a survey on explainable artificial intelligence (XAI), IEEE Access 2018;6:52138–52160.

    [26] D. Wiesner, D. Svoboda, M. Maška, M. Kozubek, CytoPacq: a web-interface for simulating multi-dimensional cell imaging, Bioinformatics 2019;35(21):4531–4533.

    Chapter 3: Monte Carlo simulations for medical and biomedical applications

    Julien Bert (a,b); David Sarrut (c,d,e)

    (a) LaTIM, INSERM, UMR1101, Brest, France

    (b) Brest University Hospital Center, Brest, France

    (c) CREATIS, CNRS UMR5220, INSERM U1294, Villeurbanne, France

    (d) Université de Lyon, INSA-Lyon, Villeurbanne, France

    (e) Centre Léon Bérard, Lyon, France

    Abstract

    Monte Carlo simulations (MCSs) are well known for producing biomedical nuclear image synthesis with extreme realism. This is achieved by simulating the complete acquisition physics, from particle emission to detection. The use of Monte Carlo in this context encompasses modeling imaging systems, optimizing acquisition protocols, producing raw data for advanced image reconstruction, and computing the absorbed dose in tissues. The first section of this chapter, after a brief history of the Monte Carlo method, focuses on the principles of particle transport through matter. Subsequent sections explain the main elements, structures, and mechanisms used to achieve a complete MCS in the context of image synthesis. Since MCSs are computationally demanding, due to the large number of particles required to simulate real images, a section of this chapter is dedicated to advanced methods for improving simulation efficiency. Finally, some example applications are illustrated, showing the main advantages and purposes of this simulation technique in image synthesis. Monte Carlo approaches can be used in many domains to solve complex problems; several examples of applications in computational biology that use the concept of the Markov chain Monte Carlo method are presented as well.

    Keywords

    Monte Carlo simulations; Image synthesis; Medical physics; Computational biology

    3.1 Introduction

    3.1.1 A brief history

    Monte Carlo methods refer to calculation algorithms based on random numerical sampling that make it possible to simulate complex physical phenomena, for example, the transport of particles through matter. The first recorded use of a random process to determine the outcome of a phenomenon was by the Comte de Buffon in 1733. At that time, he was studying the probability of winning or losing the Franc-Carreau game. This French game, practiced since the Middle Ages, consists in tossing a coin onto a tiled floor and betting on the final position of the coin: you win if the coin does not overlap any edge of a tile; otherwise you lose. Georges Louis Leclerc de Buffon studied the probability of winning the Franc-Carreau game by randomly and repeatedly tossing a needle onto the floor. This was the first stochastic sampling, known today as Buffon's needle problem.

    The modern Monte Carlo method, as known today, appeared at the same time as computer science, during World War II, to meet the needs of the Manhattan Project in modeling the process of a nuclear explosion. Long after Buffon's needle problem, Ulam and von Neumann were working on particle transport simulation and published an abstract about a method combining stochastic and deterministic processes [2]. The name Monte Carlo method appeared for the first time in 1949, in an article by Metropolis and Ulam [3]; it refers to the famous gambling casino of Monte Carlo in Monaco. A few years later, the first numerical simulation in theoretical physics was carried out: the Fermi–Pasta–Ulam virtual experiment of 1953 [1]. The Monte Carlo method was not the only ingredient needed to successfully transport particles through matter; the development of quantum theory, which furnished cross-section data for the interaction of radiation with matter, was the keystone for implementing the method in nuclear physics. Nowadays, Monte Carlo simulation is a general method for estimating a numerical quantity using random numbers. The method has the capability to explore large configuration spaces in order to extract information, and to solve a very wide range of problems (finance, biology, mathematics, physics, etc.), with the advantage of being easy to use.

    In the following years, the development of Monte Carlo methods to transport particles whose energies are much lower than those involved in thermonuclear applications was investigated. In this context, the simulation of photon transport in matter was essentially solved by Kahn [4] in 1956. The pioneering article [5], published in 1963 by Berger, reviews all the methods necessary for transporting electrons with the Monte Carlo method. In the same period, Zerby (1963) used a Monte Carlo calculation to estimate the response of gamma-ray scintillation counters [6]. The early 1970s marked the rise of Monte Carlo simulations for medical physics applications. From that time on, their evolution has never ceased, following improvements in computing and discoveries in nuclear physics.

    The use of Monte Carlo in the field of biomedical physics consists of (1) modeling imaging systems, in particular in nuclear medicine, (2) characterizing beams and particle accelerators in radiation therapy, and (3) computing the absorbed dose in patients. Nowadays, all nuclear imaging systems and all treatment planning systems in radiation therapy use Monte Carlo simulations during several stages of their research and development. As an example, the next-generation total-body PET projects (Explorer at UC Davis, PennPET in Pennsylvania, PET20.0 in Ghent, and J-PET in Krakow) rely on Monte Carlo simulation to design, control, and test instrumentation, but also to perform research in image reconstruction. All treatment planning systems use Monte Carlo to characterize photon/particle beams, to compute dose point kernels for analytical dose engines, or to compute the absorbed dose in patients directly. This is particularly true for promising new radiotherapy protocols, such as hypo-fractionated schedules, Flash radiotherapy, and proton/hadron therapy, which are mainly oriented towards very high dose-rate treatments and thus require extremely precise dose distributions.

    Nowadays, several reference Monte Carlo codes can be used for realistic medical physics applications, the result of a legacy of more than half a century of work in the field. The most widely used are Geant4 [7,8], Penelope [9], MCNP [10], EGSnrc [11], and FLUKA [12,13], among others. Most of them are generic codes; however, GAMOS [14] and GATE [15,16] extend the Geant4 code to propose solutions fully dedicated to medical applications, targeting both imaging and particle therapy.

    3.1.2 Monte Carlo method and biomedical physics

    Monte Carlo simulation (MCS) is widely used in the field of biomedical physics. It has many advantages: with the same method, any nuclear imaging modality can be simulated, such as computed tomography (CT), cone beam CT, direct digital radiography, positron emission tomography (PET), single-photon emission computed tomography (SPECT), gamma cameras, etc. In addition to recovering raw image data, MCS makes it possible to estimate the dose within the imaged subject, enabling dosimetry studies as well. Imaging systems that are not based on particle transport through matter, such as magnetic resonance imaging (MRI) or ultrasound, cannot be handled directly by MCS. Even though the Monte Carlo method can be applied to electromagnetic signal or wave propagation, here the term MCS refers to the use of the Monte Carlo method in particle physics domains.

    MCSs are well known for producing medical nuclear image synthesis with extreme realism. This is achieved by simulating the complete acquisition physics, from particle emission to detection. For example, standard deterministic synthetic image simulators mostly simulate noise with a constant, simple additive model applied everywhere in the image, which is an approximation. MCSs have the capability to simulate realistic noise similar to real data. Another advantage is the ability to derive simulations from real clinical data. For example, MRI and CT images of a patient or a small animal can be used to build a digitized phantom (a digital twin of the real object) for realistic simulation. Similarly, the radiotracer distribution can be derived from real clinical data to achieve realistic emission tomography simulations [17].

    MCS is also used for its capability to evaluate new system designs and new image detectors. A further benefit of MCS is full knowledge of the imaged object, since it is specified by the user. With this ground truth, it is easy to evaluate and compare new protocols or new reconstruction algorithms. Indeed, MCS results are raw data directly collected on the image detector, so a 2D/3D reconstruction step is necessary to obtain an exploitable image. What is a disadvantage for some is a crucial advantage in the field of tomographic reconstruction: direct digital radiography, histogram, sinogram, or list-mode data are easily obtained from MCS, which is mostly not possible with real clinical systems.

    The validity domain of a standard particle physics MCS is at the scale of tissue, i.e., human body and small animal simulations; the paradigm and physics models were originally designed for this scale. Even though 3D cellular models exist [18], MCSs are not yet capable of performing particle simulation at the cell scale; this is a limitation of the current models. Some efforts to extend the physics processes to model biological damage induced by ionizing radiation at the DNA scale have been made with Geant4-DNA [19]. Although promising results were obtained, single-cell and single-molecule simulations are largely limited to fundamental work. Bridging the gap between the cell and tissue scales within the same simulation remains a major challenge in medical physics MCS.

    However, since the Monte Carlo method is a general concept, it has been used in diverse domains, especially in computational biology. Such applications use the Monte Carlo method to simulate biological processes, for example, cell population behavior, the cell cycle, molecular folding, tumor growth, etc. A few details and application examples are provided at the end of this chapter.

    3.2 Underlying theory and principles

    In general, Monte Carlo methods are algorithms that estimate an approximate value using a random sampling process. A very simple example for understanding the method is the numerical estimation of an approximate value of π. Consider a circle with radius r inscribed in a square of side length l (Fig. 3.1(a)). Within this square, we repeatedly and independently add points at uniformly random positions. This corresponds to the random sampling process of the Monte Carlo method. After a certain number of random draws N, we can calculate the ratio q = N_in/N of the number of points N_in that fall inside the circle to the total number N. Knowing the area of the square, we can deduce an approximate value of the area of the disk and therefore, using the formula A = πr² for the disk area, estimate an approximate value of π. This principle of estimation by a stochastic process is the essence of the Monte Carlo method.

    Figure 3.1 Random sampling process of the Monte Carlo method. (a) A circle with a known radius is inscribed in a square with a known dimension. By repeatedly and independently adding points within the square at uniformly random positions, the area of the disk can be estimated from the ratio of points inside the circle (green dots – dark gray in print version) to the total number of points (green + red dots – dark gray + mid gray in print version). Subsequently, the value of π can be estimated using the formula for the disk area, and (b) the associated relative error can be calculated as a function of the number of sampled points. Here, the relative error is illustrated for several independent realizations.

    The uncertainty of the estimated value is conditioned by the number of random draws. It follows the law of large numbers and decreases as $1/\sqrt{N}$. This is illustrated in Fig. 3.1(b), where the relative error between the estimated and real values is plotted as a function of the number of random draws, repeated for several independent realizations. The estimated value at a given number of draws differs between realizations due to the statistical uncertainty, but all realizations converge to the same expected value (π). The greater the number of random draws, the closer the estimate will be to the real value.
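    The π example translates directly into a few lines of Python (our own sketch; the variable names are illustrative). Sampling in the unit square and testing against the unit quarter circle gives q ≈ π/4, and the relative error shrinks roughly as 1/√N.

```python
import numpy as np

def estimate_pi(n_draws, seed=None):
    """Fraction of uniform points inside the quarter circle approximates pi/4."""
    rng = np.random.default_rng(seed)
    x, y = rng.random(n_draws), rng.random(n_draws)
    q = np.mean(x**2 + y**2 <= 1.0)   # ratio of points inside the circle
    return 4.0 * q

for n in (10**2, 10**4, 10**6):
    est = estimate_pi(n, seed=42)
    print(n, est, abs(est - np.pi) / np.pi)  # error decreases roughly as 1/sqrt(N)
```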

    The strength of the Monte Carlo method is its capability to achieve an estimate with high precision using a simple procedure, even for problems that are very complex and cannot be solved with classical methods. For example, in Fig. 3.1(a), if we replace the simple circle with a complex shape that has no known analytical description, the Monte Carlo method still estimates its area easily and straightforwardly; this is not the case with conventional integration methods. However, the main drawback of the method is the correlation between accuracy and the number of random draws: high precision requires a very large number of draws.

    The Monte Carlo method can be used to solve nonlinear, stochastic, multi-dimensional, and complex problems. Here, the method is used for image synthesis in nuclear medical imaging, where the aim is to simulate as realistic an image as possible. In this context, the Monte Carlo principle remains the same as for estimating the value of π, but the estimated value is a 2D image, resulting from the transport of particles in the patient's body. Each random realization is the path of a particle from the radioactive source to its contribution on the image detector. We speak of a random walk, because the transport of the particle undergoes physical interactions that are themselves determined by stochastic processes. Each Monte Carlo draw amounts to simulating the transport of one particle through matter.
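    As a taste of what one such stochastic step looks like, the following sketch (our own illustration; the attenuation coefficient is an assumed, order-of-magnitude value) samples a photon's free path length in a homogeneous medium by inverse-transform sampling of the exponential distribution.

```python
import numpy as np

def sample_free_path(mu, rng):
    """Free path length s with density p(s) = mu * exp(-mu * s)."""
    return -np.log(rng.random()) / mu   # inverse CDF of the exponential law

rng = np.random.default_rng(0)
mu = 0.2  # linear attenuation coefficient in 1/cm (illustrative value)
steps = [sample_free_path(mu, rng) for _ in range(5)]
print(steps)  # each draw is one stochastic step of a photon's random walk
```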

    3.3 Particle transport through matter

    There are different models of particle transport through a medium, but most of them remain similar and are based on the same common work proposed in the early 1970s. In the following sections, we will mostly focus on models from the Geant4 toolkit [8], because this library is the foundation of the GATE software [15,16], which is dedicated to medical applications and is one of the major platforms in the field of emission and transmission tomography simulation. As noted, the models and methods presented in this chapter remain similar and equivalent to those of most Monte Carlo codes for medical physics. Since most simulations of conventional nuclear imaging systems are chiefly concerned with photon transport, we will mostly focus on this particle. In addition, photon interactions are discrete processes, meaning photons are not considered to interact continuously along their trajectory like charged particles (electrons, positrons), but only at specific points. Therefore, the principle of particle navigation in Monte Carlo simulation is easier to describe and understand while considering only photons.
