Image Denoising: Adversarial Distortion Learning

I found this paper while looking for an example of super-resolution and denoising of X-ray images.

It was written by Morteza Ghahremani, Mohammad Khateri, Alejandra Sierra, and Jussi Tohka.


The introduced model is based on Adversarial Distortion Learning (ADL) for "denoising two- and three-dimensional (2D/3D) biomedical image data".


The basic concept of the model is similar to a Generative Adversarial Network (GAN). It has a denoiser that removes noise from the input data, and a discriminator that evaluates how close the denoiser's output is to the ground truth image. This repeats until the discriminator can no longer differentiate between the generated image and the ground truth image.
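
To make the adversarial setup concrete, here is a minimal sketch of that training loop in PyTorch; the denoiser and discriminator modules and the optimizers are hypothetical stand-ins, not the paper's actual training code.

    import torch
    import torch.nn as nn

    bce = nn.BCEWithLogitsLoss()

    def train_step(denoiser, discriminator, opt_g, opt_d, noisy, clean):
        # 1) Update the discriminator: real images -> 1, denoised images -> 0.
        opt_d.zero_grad()
        fake = denoiser(noisy).detach()  # stop gradients into the denoiser
        d_real = discriminator(clean)
        d_fake = discriminator(fake)
        loss_d = bce(d_real, torch.ones_like(d_real)) + \
                 bce(d_fake, torch.zeros_like(d_fake))
        loss_d.backward()
        opt_d.step()

        # 2) Update the denoiser: try to make the discriminator predict "real".
        opt_g.zero_grad()
        d_out = discriminator(denoiser(noisy))
        loss_g = bce(d_out, torch.ones_like(d_out))
        loss_g.backward()
        opt_g.step()
        return loss_d.item(), loss_g.item()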


In ADL, an Efficient-Unet is used for both the discriminator and the denoiser.

The Efficient-Unet employs a pyramidal learning scheme to "further participate high-level features in the output results during training."

"Residual Block of Encoder and Decoder"


The unique ideas behind the model are that it does not require any prior knowledge about the noise in the images, and that its light architecture, thanks to the Efficient-Unet, allows fast evaluation of images.

Moreover, the model can be applied to 2D or 3D biomedical images without re-training.

In ADL, residual blocks are used when the model downsamples and upsamples the image, and the image information at each downsampling unit is added to the corresponding upsampling unit.
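
As an illustration of the pattern (not the exact Efficient-Unet layers), a residual block in PyTorch might look like the sketch below, where the skip connection is a simple addition.

    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """A plain 2D residual block: the input is added back to the output."""
        def __init__(self, channels):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            )

        def forward(self, x):
            return x + self.body(x)

The same additive idea links the encoder and decoder: features saved at each downsampling stage are added to the upsampled features of the matching decoder stage.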


"Transformer Block"

During upsampling, the output of each residual block is passed into a transformer block, and denoised images at different scales are created.

Each transformer unit is a set of n residual blocks followed by a 1 × 1 × 1 convolution and a sigmoid activation layer.

In the figure, three denoised images at different scales are generated, and all of them are used in calculating the loss.
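
A hedged sketch of such a transformer unit, reusing the hypothetical ResidualBlock from the sketch above (the exact layer settings in ADL may differ):

    import torch.nn as nn

    class TransformerUnit(nn.Module):
        """n residual blocks, then a 1x1 convolution and a sigmoid."""
        def __init__(self, channels, n_blocks, out_channels=1):
            super().__init__()
            self.blocks = nn.Sequential(
                *[ResidualBlock(channels) for _ in range(n_blocks)]
            )
            self.head = nn.Sequential(
                nn.Conv2d(channels, out_channels, kernel_size=1),  # 1x1 conv
                nn.Sigmoid(),  # map the output to [0, 1] image intensities
            )

        def forward(self, x):
            # One denoised image at this decoder scale; ADL produces one such
            # output per scale and feeds all of them into the loss.
            return self.head(self.blocks(x))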



"Content Enhancer Block"

Another interesting concept in the model is the "Content Enhancer," which adds information from the noisy input image to the last residual block.
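
The paper's exact wiring is more involved, but the idea can be sketched as a small module that extracts shallow features directly from the noisy input and merges them into the last decoder stage; all names and layer choices below are hypothetical.

    import torch.nn as nn

    class ContentEnhancer(nn.Module):
        """Injects features of the raw noisy input into the decoder."""
        def __init__(self, in_channels=1, feat_channels=96):
            super().__init__()
            self.conv = nn.Conv2d(in_channels, feat_channels,
                                  kernel_size=3, padding=1)

        def forward(self, noisy_input, decoder_features):
            # Shallow features of the noisy image are added to the features
            # entering the last residual block, so fine content that survived
            # the noise is carried straight to the output side of the network.
            return decoder_features + self.conv(noisy_input)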


 

Now, one question that can be asked is: what is the difference between the denoiser and the discriminator?

The only difference here is the number of filters in the convolution layers. The denoiser uses a set of [96, 128, 160, 192] while the discriminator uses [48, 64, 80, 96].

The discriminator has fewer learnable parameters because its only purpose is to compute the loss.

Therefore, the output of each transformer is "mapped into a binary mask by a mapper," as shown above.


Loss Function of GAN:

    L_D ≜ −E_x{log D(x)} − E_y{log(1 − D(G(y)))},    L_G ≜ −E_y{log D(G(y))}


Loss Function of ADL:

    L ≜ λ_1 L_1 + λ_p L_pyr + λ_H L_Hist
      = λ_1 E{ |G(y) − x| } + λ_p E{ Σ_{j=1}^{J} |Δ_j G(y) − Δ_j x| } + λ_H E{ logcosh( H[G(y)] − H[x] ) }

where Δ_j is the j-th level of the image pyramid and H[·] denotes the image histogram.


One thing that differentiates ADL from typical adversarial nets is that its loss function combines three distinct losses: an L1 loss, a pyramidal textural loss, and a novel histogram loss.

Firstly, the L1 loss measures the fidelity between the generated and ground truth images, and it is more robust against outliers than the L2 loss.

Secondly, the pyramidal textural loss (λ_p L_pyr) preserves edges and texture by measuring the textural difference between "to-be-denoised images and their corresponding reference only."

Lastly, the histogram loss (λ_H L_Hist) makes sure the histogram of a denoised image is close to that of the ground truth image.
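
A rough PyTorch sketch of the three terms is below. It is only an illustration under simplifying assumptions: the pyramid is approximated by comparing average-pooled scales rather than the paper's Δ_j operator, and torch.histc is not differentiable, so the paper's differentiable histogram would be needed for actual training.

    import torch
    import torch.nn.functional as F

    def l1_loss(denoised, target):
        return torch.mean(torch.abs(denoised - target))

    def pyramidal_loss(denoised, target, levels=3):
        # Compare the images at several downsampled scales.
        loss = 0.0
        for _ in range(levels):
            denoised = F.avg_pool2d(denoised, 2)
            target = F.avg_pool2d(target, 2)
            loss = loss + torch.mean(torch.abs(denoised - target))
        return loss

    def histogram_loss(denoised, target, bins=256):
        # log-cosh penalty on the difference of normalized histograms.
        h_d = torch.histc(denoised, bins=bins, min=0.0, max=1.0)
        h_t = torch.histc(target, bins=bins, min=0.0, max=1.0)
        diff = (h_d - h_t) / denoised.numel()
        return torch.mean(torch.log(torch.cosh(diff)))

    def adl_loss(denoised, target, w_l1=1.0, w_pyr=1.0, w_hist=1.0):
        return (w_l1 * l1_loss(denoised, target)
                + w_pyr * pyramidal_loss(denoised, target)
                + w_hist * histogram_loss(denoised, target))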


These three losses make the output of the ADL model closer to the ground truth image.


I trained the ADL model on my private dataset, and instead of adding artificial Gaussian noise, I used a noisy X-ray image as the input and the average of 50 noisy images as the ground truth for calculating the loss.
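
For reference, this is roughly how such a ground truth can be built by averaging repeated noisy acquisitions; the folder name and file pattern below are hypothetical.

    import numpy as np
    from pathlib import Path
    from PIL import Image

    # Average all noisy frames of the same scene into one ground truth image.
    frames = sorted(Path("noisy_frames").glob("*.png"))  # e.g. 50 noisy shots
    stack = np.stack([np.asarray(Image.open(f), dtype=np.float64)
                      for f in frames])
    mean_img = stack.mean(axis=0)
    Image.fromarray(mean_img.astype(np.uint8)).save("ground_truth.png")

Averaging N independent noise realizations reduces the noise standard deviation by a factor of √N, so 50 frames gives roughly a 7× cleaner reference.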


I changed the code to train the model on Windows with a private dataset of ground truth images and noisy images.

To train on your own private dataset, clone the code from my GitHub and put your noisy images in "noiseDir" and your ground truth images in "gtDir".


I trained the model on my laptop on an RTX 3060 GPU with CUDA, and due to its memory capacity, I downsized the images to 512 × 512 (the original images were 1024 × 1024) with a batch size of 1.
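
The downsizing itself is a one-liner with Pillow (file names here are hypothetical):

    from PIL import Image

    # Downsize a 1024x1024 X-ray image to 512x512 to fit in GPU memory.
    img = Image.open("xray_1024.png")
    img.resize((512, 512), Image.LANCZOS).save("xray_512.png")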


I trained this model for 100 epochs with a learning rate of 0.001. 


Training images are below:


                                        "Ground Truth Image (Training)"



"Noisy Image Input (Training)"




For inference, I also downsized the image to 512 × 512 with a batch size of 1.
The test image was not included in the training dataset.

The output of the model was brighter than the input image; however, I believe this is because I changed the gamma value of the model during training.

 

Inference images are below:



"Noisy Image Input (Test)"

    


"Denoised Image Output (Test)"

            



Paper link: https://arxiv.org/abs/2204.14100

References:

    M. Ghahremani, M. Khateri, A. Sierra, and J. Tohka, "Adversarial Distortion Learning for Medical Image Denoising," arXiv:2204.14100.



My GitHub link: https://github.com/AdvancedUno



