Generation and study of the synthetic brain electron microscopy dataset for segmentation purpose

Advanced microscopy technologies such as electron microscopy (EM) have opened up a new field of vision for biomedical researchers. Applying artificial intelligence methods to EM data is difficult largely because little annotated data is available for training. Therefore, we either add synthetic images to an annotated real EM dataset or use a fully synthetic training dataset. In this work, we present an algorithm for the synthesis of 6 types of organelles. Based on the EPFL dataset, we generated a training set of 1161 real fragments of size 256×256 (ORG), a set of 2000 synthetic fragments (SYN), and their combination (MIX). Experiments on training models for 6-class, 5-class and binary segmentation showed that, despite the imperfections of the synthetic data, training on the mixed (MIX) dataset gave a significant increase (about 0.1) in the Dice metric for 6 and 5 classes and the same results for binary segmentation. The synthetic data strategy gives annotations for free, but shifts the effort to producing sufficiently realistic images.


Introduction
Microscopy plays an indispensable role in biomedical research. Advanced microscopy technologies such as electron microscopy (EM) have opened up new horizons for biomedical researchers. The image resolution is very high, which is why a cubic millimeter of brain tissue can take up more than 1000 terabytes. The resulting images are analyzed to identify individual cells. Segmentation is usually done manually by biologists. Processing one experiment takes up to six months of manual work.
One of the first papers to use serial block scanning electron microscopy as a source of high-resolution three-dimensional nanohistology for cells and tissues was [1] (2010). A subsequent series of works was aimed at creating datasets for training deep learning networks, as well as DNN methods and models for EM data segmentation designed for binary segmentation of brain cell organelles: neural membranes [2] and supervoxel segmentation of mitochondria [3]. Simultaneously, the problem of 3D reconstruction of the brain neural network and the problem of brain connectomics based on neuron organelles and connections between neurons (synapses) were stated [4]. In this problem, the segmentation of such organelles as postsynaptic densities (PSD), vesicles, and axons is of particular importance.
The invention of U-Net in 2015 [5] opened a series of novel models and adaptations for segmenting brain EM data. The source of the success of U-Net is that it involves the contextual information of the input image at all levels of processing. Almost immediately, the publication [6] experimentally confirmed that the skip connections of the U-Net architecture are effective for solving segmentation problems in biomedicine.
The application of artificial intelligence methods to EM data processing is largely hampered by the small amount of labeled data for training and testing DNNs. Open EM data as a whole are represented by only a few labeled datasets, both because preparing samples for an electron microscope is laborious and because there is a lack of specialists for manual labeling. We found four open EM datasets, the earliest and most popular of which are labeled for only one class (mitochondria or membranes). In the two other datasets, several classes are distinguished. As a result, the majority of neural networks used in EM processing are trained only to perform binary segmentation.
In our previous article [15], we presented the first results of creating an algorithm that synthesizes EM data and markup. We used synthetic data to supplement the original dataset with a synthetic axon. It should be noted that attempts to synthesize electron microscopy data of highly porous structures are well described in an article published in 2022 [16]. Examples of synthetic scanning electron microscopy image data from Fend et al. are shown in Fig. 1.
In connection with the above, the main aims of this work are (1) to improve algorithms for automatic generation of a dataset of synthetic objects; (2) to develop a parameterization for an algorithm that synthesizes data; (3) to study the capabilities of multiclass segmentation of U-Net-like architectures, starting with U-Net (in this work), using the EM dataset, the synthetic dataset and the mixed dataset.

Data and methods
In this section, we describe publicly available datasets. The most popular datasets for assessing the segmentation of mitochondria were collected by Lucchi et al. in [3].
In three of the four labeled open datasets, only one class is labeled. Only one dataset contains more than one labeled class. For this reason, the vast majority of neural networks in EM are trained to classify only two classes (object and background). We used the EPFL dataset, also known as the Lucchi mitochondria segmentation dataset, available at https://www.epfl.ch/labs/cvlab/data/data-em/.
Initially, these data contain masks only for mitochondria. For this reason, to assess multiclass segmentation algorithms, we manually labeled 27 layers in the training sample (1024×768) and 5 test layers for the following classes: (1) mitochondria, including their boundaries; (2) boundaries of mitochondria; (3) cell membranes; (4) postsynaptic densities (PSD); (5) axon sheaths; and (6) vesicles. Accurate manual labeling of one layer takes 5-8 hours. Our labeling of the EPFL dataset is available at https://github.com/GraphLabEMproj/unet. We plan to continue the work on labeling and do this for both volumes. It so happens that the axon sheath in the training dataset is present only in the first 36 layers and looks completely different from the axon sheath in the test dataset (Fig. 2). In the test dataset, the axon is present in the first 70 layers, changes its shape from elongated to more rounded, and also has a darker interior and an inner ring.
For the synthesized dataset, we generated 2000 images of size 256×256 pixels containing the least represented classes: postsynaptic densities and axon sheaths. An example of one labeled sample is shown in Fig. 3, and an example of several samples is shown in Fig. 4. The shape, size, and gray levels of compartments are chosen to be similar to those of the test EPFL dataset. The advantage of a synthetic set is that any required number of images can be obtained automatically along with their labeling. The studies presented in this article continue our previous work. In this study, we expand the topic of synthetic image generation by modifying the generation of some organelles and by adding the ability to parameterize the synthetic image generation algorithms, so that high similarity with different real datasets can be achieved using a single universal code.

Algorithm of synthetic data generation
The implementation of our algorithms for synthesizing scanning electron microscopy data can be viewed and downloaded at https://github.com/GraphLabEMproj/Synthetics/. Next, we provide a brief description of the synthesis algorithms. The generation of synthetic images follows this sequence: (1) generation of the organelles (axon, mitochondria, PSD, vesicles); (2) placing the organelles in the image; (3) generation of the membranes; (4) blurring the image and adding noise.
Organelle location. The position of an organelle is determined by its central point. All other organelle points (such as the shell) are calculated relative to the center point. A random position of the center point is chosen in the range of [32: image_size - 32] pixels from the edge for the PSD and [5: image_size - 5] pixels for the rest of the organelles. The organelle is rotated by a random angle around its center. After generating and rotating the points, we check for intersections with organelles already existing in the layer; if there is an intersection, the procedure for setting the location and rotation of the organelle is repeated. If after several iterations (we use 300) it is not possible to add a new organelle to the layer without intersecting the existing ones, the organelle is skipped.
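The placement procedure described above is a form of rejection sampling. The following is a minimal sketch of it, not the code from the authors' repository: the function names `rotate` and `try_place`, the representation of an organelle as a list of point offsets, and the use of a pixel set for the intersection test are all assumptions made for illustration.

```python
import math
import random

MAX_ATTEMPTS = 300  # iteration limit quoted in the text


def rotate(points, angle):
    """Rotate (x, y) offsets around the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def try_place(offsets, occupied, image_size=256, margin=5, rng=random):
    """Return the pixel set of a placed organelle, or None if it is skipped.

    offsets  -- organelle points relative to its center point
    occupied -- set of (x, y) pixels already used by other organelles
    margin   -- 32 for the PSD, 5 for the other organelles (per the text)
    """
    for _ in range(MAX_ATTEMPTS):
        cx = rng.randint(margin, image_size - margin)
        cy = rng.randint(margin, image_size - margin)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        pixels = {(round(cx + dx), round(cy + dy))
                  for dx, dy in rotate(offsets, angle)}
        if pixels.isdisjoint(occupied):  # no intersection: accept
            return pixels
    return None  # no free position found: the organelle is skipped
```

A caller would keep one `occupied` set per layer and merge each returned pixel set into it before placing the next organelle.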

Number of organelles
The number of organelles of each class is set by parameters. Empirically, we decided to add one axon, 3 mitochondria, 3 PSDs, and 3 vesicle regions to a single 256×256 patch. Organelles are added randomly one after another. However, if it is physically impossible to arrange a given number of organelles without intersections, new organelles are not added. For training networks, it is important that images containing the target class are available in sufficient volume. For high-quality training, both the number of images and the area occupied by the required class in a layer must be sufficient (for example, if the class is too small in area, the optimizer can treat it as noise and ignore it, and if there are few examples with the class, there is a chance that the network will forget this class). Therefore, we do not follow the statistics of the initial data but, on the contrary, add more organelles to the synthetic set that are either poorly represented in the initial data (axon) or occupy a small area (PSD).
Axons. We generate two types of axons: filled inside and empty (only shell). For each axon, from 7 to 14 shell points are created around the center of the organelle. The distance from the center to each point is set randomly within a given range (from 15 to 82 pixels for the shell and from 40 to 139 pixels for the axon with internal filling). To draw the shell, a closed spline through the shell points is used. To draw the thickening of the shell, we take a subset of successive shell points and use an open spline with a larger line thickness. When an axon with an internal part is generated, the internal part is filled with a darkened and blurred background texture. The inner shell is drawn as an ellipse.
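The shell construction can be sketched as follows. The text does not say which spline family is used or how the angles of the shell points are chosen, so the evenly spaced angles and the closed Catmull-Rom spline below are assumptions, and the function names are hypothetical. The default radius range is the one quoted for the empty axon; 40 to 139 would be passed for the filled one.

```python
import math
import random


def shell_points(center, r_min=15, r_max=82, rng=random):
    """7 to 14 control points around `center`, each at a random radius."""
    n = rng.randint(7, 14)
    cx, cy = center
    pts = []
    for k in range(n):
        angle = 2.0 * math.pi * k / n  # assumed: evenly spaced angles
        r = rng.uniform(r_min, r_max)
        pts.append((cx + r * math.cos(angle), cy + r * math.sin(angle)))
    return pts


def closed_catmull_rom(pts, samples_per_segment=10):
    """Sample a closed Catmull-Rom spline that passes through every point."""
    n = len(pts)
    curve = []
    for i in range(n):
        # Four consecutive control points, wrapping around the closed shape.
        p0, p1, p2, p3 = (pts[(i + j - 1) % n] for j in range(4))
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            t2, t3 = t * t, t * t * t
            curve.append(tuple(
                0.5 * (2 * b + (c - a) * t
                       + (2 * a - 5 * b + 4 * c - d) * t2
                       + (3 * b - a - 3 * c + d) * t3)
                for a, b, c, d in zip(p0, p1, p2, p3)))
    return curve
```

The sampled curve can then be rasterized as a polyline with the desired line thickness; the thickened part of the shell would reuse a run of successive points with a wider line.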
Mitochondria are generated in the form of an elongated, smoothed, slightly asymmetrical figure with a shell of varying thickness. We use from 4 to 10 shell points created around the center line of the mitochondrion for drawing the spline. To generate an oblique cut of the mitochondrion, an open spline is used, consisting of points located near the most distant parts of the mitochondrion. For the interior of the mitochondrion, a texture is generated, consisting of non-intersecting line segments of different lengths.
Vesicles. The generation of vesicles begins with the selection of the area in which the vesicles will be located. After that, a certain number of single vesicles (circles of a certain radius) are added to this area; for the most part they are added without intersections.
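A minimal sketch of the vesicle scattering step is given below. The rectangular region, the fixed radius, the retry limit `max_tries` and the function name are assumptions; in addition, this sketch forbids overlaps entirely, whereas the text says only that vesicles are added without intersections for the most part.

```python
import math
import random


def place_vesicles(region, count, radius, max_tries=50, rng=random):
    """Scatter up to `count` non-overlapping circles inside `region`.

    region -- (x0, y0, x1, y1) bounding box chosen for the vesicle cluster
    Returns a list of (cx, cy, radius) tuples.
    """
    x0, y0, x1, y1 = region
    circles = []
    for _vesicle in range(count):
        for _attempt in range(max_tries):
            cx = rng.uniform(x0 + radius, x1 - radius)
            cy = rng.uniform(y0 + radius, y1 - radius)
            # Accept only if the new circle touches no existing circle.
            if all(math.hypot(cx - ox, cy - oy) >= 2 * radius
                   for ox, oy, _r in circles):
                circles.append((cx, cy, radius))
                break
    return circles
```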
PSD is generated as a curve segment, and additional areas in front of and behind the curve mimic the darkening from the PSD.
Masks. The masks of all organelles (axons, mitochondria, vesicles, PSD, membranes) are drawn by switching the color used by the drawing algorithm to white. We use the organelle masks to place the organelles in the image without intersections, making several attempts to fit each mask into an empty space.
Membranes. We use the organelle masks as starting points of the region growing algorithm. The boundaries of the resulting regions become our membranes. We also add lines from the PSD to the membranes to connect them. On some tiles we draw double thin borders instead of single thick borders to better resemble the original dataset.
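The membrane step can be illustrated with a simple multi-source region growing. This is a sketch under assumptions: the text does not specify the growth rule, so a 4-connected breadth-first growth over an integer label grid is used here, and the function names are hypothetical.

```python
from collections import deque


def grow_regions(labels):
    """Multi-source breadth-first growth of labeled seeds over a 2D grid.

    labels -- list of lists; 0 = empty pixel, k > 0 = seed of organelle k
    Returns a new grid in which every pixel carries the label of the seed
    that reached it first (4-connected growth).
    """
    h, w = len(labels), len(labels[0])
    grown = [row[:] for row in labels]
    queue = deque((y, x) for y in range(h) for x in range(w) if grown[y][x])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and grown[ny][nx] == 0:
                grown[ny][nx] = grown[y][x]
                queue.append((ny, nx))
    return grown


def region_boundaries(grown):
    """Mark pixels whose right or lower neighbor has a different label."""
    h, w = len(grown), len(grown[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            right = x + 1 < w and grown[y][x + 1] != grown[y][x]
            below = y + 1 < h and grown[y + 1][x] != grown[y][x]
            mask[y][x] = 1 if (right or below) else 0
    return mask
```

The boundary mask is one pixel wide; thick or double borders would be obtained by drawing it with a different line style.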
Blur and Noise. We use Gaussian filtering and Poisson noise to simulate the image blurring and noise introduced by the registration device. All images were blurred with a Gaussian filter with a kernel of radius 7 pixels. According to Vulović et al. [22], the appearance of Poisson noise is due to the statistical nature of electromagnetic waves such as X-rays, visible light and gamma rays. X-ray sources emit a certain number of photons per unit time. Such sources have random fluctuations in the number of emitted photons. As a result, the resulting image has spatial and temporal randomness corresponding to the Poisson distribution. We add Poisson noise with λ = 1. It makes the generated images more similar to real-life images.
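The blur and noise stage can be sketched with NumPy as follows. Several details are assumptions, because the text gives only the kernel radius and the Poisson parameter: the Gaussian sigma (taken as radius / 3), the edge padding, and the way the noise is applied (zero-mean Poisson samples multiplied by a hypothetical `scale` amplitude) are illustrative choices, not the authors' settings.

```python
import numpy as np


def gaussian_kernel(radius=7, sigma=None):
    """1D Gaussian kernel of length 2*radius + 1, normalized to sum to 1."""
    sigma = radius / 3.0 if sigma is None else sigma  # assumed default
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def blur(image, radius=7):
    """Separable Gaussian blur with edge padding (rows, then columns)."""
    k = gaussian_kernel(radius)
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")


def add_poisson_noise(image, lam=1.0, scale=10.0, rng=None):
    """Add zero-mean Poisson(lam) noise; `scale` is a hypothetical amplitude."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.poisson(lam, size=image.shape) - lam
    return np.clip(image + scale * noise, 0, 255)
```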

Personification of generation algorithm for a specific real dataset
Although different datasets are obtained from similar biological materials, their stored digital images can vary greatly depending on the equipment used, the scale selected, the quality of the sensor matrix, etc. As a result, datasets from different developers have different statistical characteristics. For synthetic datasets to better match real ones, the generation algorithms must be parameterized so that they can generate data for a specific type of dataset. Fig. 6 presents example images from the EPFL (a) and AC4 (b) datasets. It can be seen that the images differ greatly in the average brightness of organelles, the thickness of the lines, and the blurring of the contours.
To make it possible to adjust the synthetic set to real data, the characteristics of the generated objects were parameterized. The parameterization covers the average brightness of all objects and the deviation from the average brightness, as well as the thickness and color of the borders.
To select the parameters, we used the histograms of the classes labeled on the zero layer (layer0000) of the EPFL dataset. We compared the histograms of the synthetic dataset with the histograms of the entire labeled dataset (Fig. 5).
Of course, exactly reproducing the distribution of a histogram does not guarantee that the synthetic data are similar to the original ones, and still less does it guarantee a good AI model trained on them. But it is a fairly simple method for checking the correctness of the algorithm.
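A per-class histogram check of this kind can be sketched as below. The text describes a comparison of histograms in a figure; the numerical overlap score (histogram intersection) is an addition made here for illustration, and both function names are hypothetical.

```python
import numpy as np


def class_histogram(image, mask, bins=256):
    """Normalized gray-level histogram of the pixels where mask is nonzero."""
    values = image[mask > 0]
    hist, _ = np.histogram(values, bins=bins, range=(0, 256))
    total = hist.sum()
    return hist / total if total else hist.astype(np.float64)


def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms: 1.0 = identical, 0.0 = disjoint."""
    return float(np.minimum(h1, h2).sum())
```

In use, one histogram would come from a real labeled layer and the other from the synthetic images, computed separately for each organelle class.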
As can be seen from the histogram of axons, we did not follow the histograms of the training and test datasets, but rather proceeded from the generalized appearance of the axon in electron microscopy data, because the axon is the least represented class with the least variability. Nevertheless, this approach gave good results when training an AI model.
Thanks to parameterization, the generator can generate images similar to EPFL (Fig. 6c) and AC4 dataset (Fig. 6d), adjusting to a specific dataset.

Network architecture
U-Net is considered to be a standard convolutional network architecture for image segmentation tasks. This architecture consists of a contracting path for capturing the global context and a symmetric expanding path that enables accurate localization. Our network is based on the U-Net project at https://github.com/zhixuhao/unet. In the original project, U-Net was used for the binary classification of membranes. In this work, we use U-Net for multiclass segmentation. We copied the original repository and modified it; the modifications are available at https://github.com/GraphLabEMproj/unet together with our labeling of the Lucchi data. Following the author of the code at https://github.com/zhixuhao/unet, the implementation of U-Net has some differences from the classical U-Net network [5]: (1) the network input is an image of size 256×256×1; (2) the network output is 256×256×N, where N is the number of classes; (3) the sigmoid activation function guarantees that the mask is in the range [0, 1]. In addition, we added batch normalization after each ReLU activation layer.

Computer Optics, 2023, Vol. 47(5)

In the process of improving the model, we constructed more compact modifications of the U-Net model and, as a result, present the tiny-unet-v2 model, which differs from the previous architecture as follows. Number of channels in the original U-Net convolutional blocks: 64 → 128 → 256 → 512 → 1024; number of channels in our architecture: 32 → 32 → 64 → 128 → 256. The resulting model contains 15.7 times fewer parameters than the original model and takes up 15.5 times less memory (24 MB instead of 364 MB).
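The effect of the narrower channel widths on model size can be checked with a short parameter count. The layer layout below is an assumption modeled on the referenced zhixuhao/unet project (two 3×3 convolutions per level, a 2×2 convolution after each upsampling, a 3×3 convolution to 2 channels and a final 1×1 convolution); batch normalization parameters are ignored, and the function names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution layer."""
    return k * k * c_in * c_out + c_out


def unet_params(widths, in_channels=1, n_classes=6):
    """Parameter count of a U-Net with the layout assumed in the lead-in."""
    total, prev = 0, in_channels
    for w in widths:  # contracting path
        total += conv_params(prev, w, 3) + conv_params(w, w, 3)
        prev = w
    # Expanding path: pairs (deeper level width, current level width).
    for deeper, w in zip(widths[:0:-1], widths[-2::-1]):
        total += conv_params(deeper, w, 2)   # convolution after upsampling
        total += conv_params(2 * w, w, 3)    # convolution on the concatenation
        total += conv_params(w, w, 3)
    total += conv_params(widths[0], 2, 3) + conv_params(2, n_classes, 1)
    return total


original = unet_params([64, 128, 256, 512, 1024])
tiny = unet_params([32, 32, 64, 128, 256])
ratio = original / tiny
```

Under these assumptions the ratio comes out close to the 15.7 quoted in the text, which suggests that the reduction is explained by the channel widths alone.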

Assessment criteria
We use the Dice-Sørensen coefficient (DSC), which is commonly used for segmenting biomedical images. The values of the DSC vary from zero to one. Denote by TP the number of pixels correctly classified as belonging to the target class (true positives), by TN the number of correctly classified background pixels (true negatives), by FP the number of pixels erroneously classified as belonging to the target class (false positives), and by FN the number of pixels erroneously classified as background (false negatives). Then the metric is defined as follows:

$$\mathrm{DSC} = \frac{2\,TP}{2\,TP + FP + FN}.$$

Since we consider multiclass segmentation in this work, we are interested in multiclass metrics. Since the Dice metric compares two sets, in the case of multiclass classification the result is a vector of Dice metrics, one for each class. For training a neural network, a scalar error function is used. Therefore, for multiclass segmentation, we should convolve the metrics vector. To convolve a vector into a scalar, we use the linear convolution

$$W_{scalar} = \sum_{i=1}^{N} \alpha_i W_i,$$

where $\alpha_i$ is a weighting coefficient and $W_i$ is the value of the DSC for the i-th class. $W_{scalar}$ is a scalar value, the convolution of the metrics vector, and $N$ is the number of classes. In this work, we use the linear convolution of the DSC with the weighting coefficients $\alpha_i$ equal to $1/N$.
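The per-class Dice vector and its convolution with equal weights 1/N can be sketched with NumPy as follows. The array layout (H, W, N), the small `eps` term that keeps the ratio defined for an empty class, and the function names are assumptions made for this example.

```python
import numpy as np


def dice_per_class(y_true, y_pred, eps=1e-7):
    """Dice-Sorensen coefficient for each class channel.

    y_true, y_pred -- arrays of shape (H, W, N) with binary masks per class
    Returns N values, DSC_i = 2*TP / (2*TP + FP + FN).
    """
    y_true = y_true.astype(bool)
    y_pred = y_pred.astype(bool)
    tp = np.sum(y_true & y_pred, axis=(0, 1))
    fp = np.sum(~y_true & y_pred, axis=(0, 1))
    fn = np.sum(y_true & ~y_pred, axis=(0, 1))
    return (2 * tp + eps) / (2 * tp + fp + fn + eps)


def mean_dice(y_true, y_pred):
    """Linear convolution of the per-class vector with equal weights 1/N."""
    scores = dice_per_class(y_true, y_pred)
    weights = np.full(scores.shape, 1.0 / scores.size)
    return float(np.dot(weights, scores))
```

A training loss would typically be taken as one minus this scalar, computed on the sigmoid outputs rather than on thresholded masks.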

Experiments
We manually labeled 27 slices (1024×768) of the EPFL dataset for training and 5 slices (1024×768) for testing.
ORG is a training dataset that includes only EPFL data. Twenty-seven high-resolution images of the original training sample were cut into 256×256, 512×512 and 768×768 fragments, with an overlap of half the fragment size. In total, the ORG training part includes 1161 fragments.
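The fragment counts can be reproduced with a short calculation. The sketch below assumes that, when a half-overlapping grid does not reach the image border, one extra tile aligned to the border is added; this rule is not stated in the text, but it is the one under which the counts agree with the reported 1161 training fragments and 175 test fragments. The function names are hypothetical.

```python
def tile_origins(length, tile, stride):
    """Start offsets of tiles along one axis.

    If the regular grid stops short of the border, one extra tile aligned
    to the border is appended (an assumption, see the lead-in).
    """
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] + tile < length:
        origins.append(length - tile)
    return origins


def count_tiles(width, height, tile):
    """Number of tiles of a given size with an overlap of half the tile."""
    stride = tile // 2
    return (len(tile_origins(width, tile, stride))
            * len(tile_origins(height, tile, stride)))


per_layer = [count_tiles(1024, 768, t) for t in (256, 512, 768)]
train_total = 27 * sum(per_layer)          # 27 labeled training layers
test_total = 5 * count_tiles(1024, 768, 256)  # 5 labeled test layers
```

With these rules a 1024×768 layer yields 35, 6 and 2 fragments for the three sizes, so 27 layers give 1161 fragments and 5 test layers give 175 fragments of size 256×256.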

SYN is a training dataset that includes only synthetic data.
To obtain the synthetic (SYN) training dataset, we generated 2000 synthesized fragments of size 256×256.
MIX is a mixed training dataset. It includes 1161 fragments of EPFL data and 2000 synthesized fragments; thus, we have 3161 fragments in total.
To further enlarge the training datasets, we applied random rotations, random shifts, and random scale changes in a small range (5%). We moved 20% of the images from the training sample into a validation sample. The batch size was seven.
All three models were tested on five layers (175 fragments of size 256×256 with an overlap of half the fragment size). We used the Adam optimizer with a dynamic learning rate from 1×10⁻⁴ to 1×10⁻⁶. The learning rate starts to change at the 100th epoch and then decreases by a factor of 5 every 25 epochs.
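The learning rate schedule can be written as a small step function. The sketch assumes that the first reduction is applied at epoch 100 itself and that the rate is clipped at the lower bound; the function and parameter names are hypothetical.

```python
def learning_rate(epoch, lr_start=1e-4, lr_min=1e-6,
                  first_drop=100, every=25, factor=5.0):
    """Step schedule: constant until `first_drop`, then divided by `factor`
    every `every` epochs, never going below `lr_min`."""
    if epoch < first_drop:
        return lr_start
    drops = (epoch - first_drop) // every + 1
    return max(lr_min, lr_start / factor ** drops)
```

A function of this form can be passed to a per-epoch scheduler callback of the training framework in use.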
Experiment 1. Five segmentation classes: mitochondria with their boundary, membranes, PSD, axon sheaths, and vesicles. The number of epochs is 200.
Experiment 2. Six segmentation classes: mitochondria with their boundary, boundaries of mitochondria, membranes, PSD, axon, and vesicles. Compared with Experiment 1, one more class (mitochondria boundaries) is added. The number of epochs is 200.
Experiment 3. One segmentation class: mitochondria with their boundary. The number of epochs is 200.
It is seen from Tab. 2 that the quality of multiclass segmentation is only slightly better or slightly worse than that of binary segmentation across the experiments. The class of mitochondria boundaries is a subclass of the class of mitochondria with their boundaries, and the additional edge enhancement improves the segmentation results of the unifying class. The network was trained on unbalanced classes, since the sizes of compartments and their occurrence differ by a factor of dozens.
In commercial applications based on deep learning, in addition to quality metrics, the performance characteristics of algorithms also play a large role. Based on the values of the Dice quality metric given in Tab. 2 and 3, we see that with a tenfold decrease in the number of model weights (and hence in the execution time), the quality of the results remains comparable.

Discussion
We test our models on the full EPFL test volume and the full Lucchi++ test volume and use these values instead of the results of Tab. 2. We cannot directly compare the results in Tab. 4 because our models were trained on a significantly reduced version of the EPFL dataset (27 layers instead of 165) and we use our own markup.
The Lucchi++ dataset is based on the EPFL Hippocampus dataset, as published in Structured Image Segmentation using Kernelized Features by Lucchi et al. [3]. The experts re-annotated the two EPFL Hippocampus stacks. The goal was to achieve consistency for all mitochondria membrane annotations and to correct any misclassifications in the ground truth labelings. The markup was manually corrected by one senior biologist and additionally validated by two neuroscientists. In cases of disagreement the biologist corrected the annotations until consensus was reached. The dataset can be downloaded from https://casser.io/connectomics/. Thus, the table uses three markups of the same dataset; the difference between the markups is shown in Fig. 7.
Computer Optics, 2023, Vol. 47(5), DOI: 10.18287/2412-6179-CO-1273

Tab. 2. Dice coefficient of electron microscopy data segmentation for U-Net model for the original dataset (ORG), the dataset enriched with synthesized images (MIX) and full synthesized dataset (SYN)
We calculate differences between markups using the formula

areas in pixels difference count intersection in pixels 
The differences computed over 27 layers are: Lucchi++ vs. ours 0.09, EPFL vs. ours 0.21, Lucchi++ vs. EPFL 0.19. These differences are significant and explain why our test results are better for the Lucchi++ markup. At the same time, it can be noted that Lucchi++, owing to more thorough markup checking than EPFL, is, like our markup, a more accurate reference overall. This can explain the results in Tab. 4, in which the model trained on our markup gives a Dice metric of 0.93+ for Lucchi++ and 0.85+ for EPFL.
We tested U-Net (1, 5, 6 classes) and tiny-unet-v2 (1, 5, 6 classes) trained on the original dataset and on the dataset with synthetic data. For Tab. 4 we chose, for each model, the best result for the original dataset and the best result for the original + synthetic dataset. For example, the Dice results of tiny-unet-v2 on the Lucchi++ dataset when trained on original + synthetic data are 0.935 for 1 class, 0.931 for 5 classes and 0.93 for 6 classes. In the table we put only the best result, 0.935.
The results with the synthetic additive are better. The best results in Tab. 4 belong to 3D models. This leads us to the idea of building a 3D synthetic dataset. According to the comparison in Tab. 4, our approach based on the generation of synthetic datasets gives a segmentation quality comparable to the current results of other researchers on the same class.
No single deep learning model architecture can solve the problem of limited real training data, and the generation of synthetic data will help to obtain high-quality segmentation even with medium and small models, which we have demonstrated using tiny-unet-v2 as an example.
The use of synthetic data for training deep models is becoming more widespread in various fields of science, in particular in computer vision, and we follow this trend. We developed software for generating synthetic EM brain datasets; the main purposes of our work were (1) to improve algorithms for automatic generation of a dataset of synthetic objects; (2) to develop a parameterization for an algorithm that synthesizes data; (3) to study the capabilities of multiclass segmentation of U-Net-like architectures, starting with U-Net (in this work), using the EM dataset, the synthetic dataset and the mixed dataset.
To synthesize electron microscopy images, classic computer graphics methods were used. A parametric model of a slice of EM data was constructed. The gray level parameters were chosen in accordance with the histograms of one labeled layer of the base dataset. The parameters of the geometric model were based on the shape of the organelles in the layers of the simulated data volume. For example, to select the size of an organelle, the largest and the smallest organelles were found, and the size was sampled uniformly from this range.
In order to verify the correct statistical distribution of intensities in synthetic images, the gray levels of the resulting synthetic dataset were checked using histograms of the entire available labeled data volume.
The algorithm needs to be developed further by collecting statistics of the geometric parameters of organelles and by studying the influence of parameter variability on the quality of the trained model.
The approach we have developed can be used in two different directions. The first, narrower direction is the rapid creation of synthetic datasets with the characteristics of individual tissues, which is demonstrated in this article. The other, broader goal is to create a more versatile synthetic dataset that can be used to segment various types of data with high quality. This option is much more difficult to develop and validate; however, the potential benefit of such a solution, owing to its universality, could be much greater.

Conclusion
An algorithm for the automatic generation of a synthetic electron microscopy dataset was developed. The proposed approach allows generating synthetic datasets of any size, as well as quickly changing the generation parameters to simulate data from various image registration devices.
As can be seen from Tab. 2, synthetic axon generation gives good results in both purely synthetic and mixed tests. The test results depend on how similar the structures in the training and test data are. The segmentation of well-represented mitochondria is degraded by imperfect synthetic data. The use of synthetic data for the rarely occurring axon takes segmentation to a new level of quality.
A fully synthetic dataset parameterized according to the real EPFL dataset allows training the neural network to Dice values on the test dataset from 0.55 to 0.92 for different classes, while training on real data gives values from 0.35 to 0.94, and training on the mixed dataset gives values from 0.70 to 0.94. Synthetic data needs to be improved to achieve micro-realistic quality. It is especially necessary to add the "imperfections" of the EM technology, such as rough noise, blurred boundaries, smeared vesicles, and smearing of the mitochondrial interior areas. Further efforts should be made to increase the realism of the images.

Fig. 1. Examples of synthetic scanning electron microscopy image data from Fend et al. [16]

Fig. 2. Axon sheaths in the training and test EPFL datasets: (a) axon sheath in the training set; (b) axon sheath in the test set, first layer; (c) axon sheath in the test set, 35th layer; (d) axon sheath in the test set, 70th layer

Fig. 3. Example of synthesized data (only nonzero masks are shown): (a) layer, (b) mask of axon sheaths, (c) mask of vesicles,