Abstract
The advent of deep learning has brought about remarkable advancements in various fields, including computer vision, natural language processing, and reinforcement learning. However, the vulnerability of deep neural networks to adversarial examples has raised significant concerns regarding their robustness and reliability. Adversarial examples are carefully crafted inputs that are imperceptibly perturbed to cause misclassification or incorrect behavior of machine learning models. While extensive research has been conducted to understand and mitigate this vulnerability, a relatively novel perspective has emerged—reversible adversarial examples. In this chapter, we delve into the concept of reversible adversarial examples, exploring their characteristics and generation methods. We review existing literature on reversible adversarial examples, highlighting their significance in safeguarding privacy. Moreover, we introduce potential applications of reversible adversarial examples and discuss future directions for this new research field.
Keywords
- adversarial example
- reversible data hiding
- deep neural network
- privacy protection
- artificial intelligence
1. Introduction
Deep learning models have demonstrated exceptional capabilities across various domains such as image recognition [1], natural language processing [2], and so on. However, the susceptibility of these models to adversarial examples poses a significant challenge to their reliability and security. Adversarial examples, which add carefully crafted perturbations to input data, can lead to misclassification or incorrect behavior of machine learning models, even with imperceptible changes to human observers. As a result, ensuring robustness against adversarial attacks has become a crucial area of research in machine learning security.
In recent years, a novel approach to understanding and mitigating adversarial vulnerabilities has emerged through the exploration of reversible adversarial examples (RAE) [3]. These examples are crafted with the specific goal of being reversible: the original input can be exactly recovered from the adversarially perturbed version, so that correct model predictions can be restored.
In this chapter, we delve into the realm of reversible adversarial examples, aiming to provide a comprehensive overview of this emerging field. We begin by introducing the concept of reversible adversarial examples and elucidating their distinguishing characteristics compared to traditional adversarial examples. Building upon this foundation, we review recent advancements in the generation methods of reversible adversarial examples, including white-box attacks and black-box attacks. Furthermore, we explore the applications and implications of reversible adversarial examples.
Moreover, we investigate multiple prominent white-box attack strategies for crafting reversible adversarial examples. These methods leverage various techniques such as perturbation generation, reversible data hiding, and denoising to achieve reversibility while maintaining adversarial potency. Additionally, we explore several black-box attack approaches. These methods aim to generate reversible adversarial examples without accessing the model’s internal parameters or architecture, thereby simulating real-world scenarios where limited information about the target model is available.
By analyzing and comparing these diverse approaches, we gain insights into the capabilities, limitations, and trade-offs associated with reversible adversarial examples. Furthermore, we discuss possible applications and future directions in RAE research, including developing more sophisticated attack algorithms, improving adversarial transferability, and investigating the practical implications of RAEs in real-world applications. Reversible adversarial examples will play a crucial role in scenarios such as privacy protection, access control, and model authorization. Research in the field of reversible adversarial examples will also drive advancements in the broader adversarial machine learning community.
2. Related work
2.1 Adversarial example
The exploration of adversarial examples has been a prominent area of research in the field of machine learning security. Adversarial examples, which are carefully crafted inputs designed to deceive machine learning models, have raised concerns about the robustness and reliability of these models. Several methods have been proposed to generate adversarial examples, both in the context of white-box attacks, where attackers have full access to model parameters, and black-box attacks, where attackers have limited knowledge about the target model.
In the domain of white-box attacks, various algorithms have been developed to generate adversarial examples effectively. The fast gradient sign method (FGSM) introduced by Goodfellow et al. [4] computes the perturbation by taking a small step in the direction of the gradient of the loss function with respect to the input image. Iterative approaches, such as the basic iterative method (BIM) [5] and projected gradient descent (PGD) [6], iteratively perturb the input image to maximize the model’s loss, resulting in stronger adversarial attacks.
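As a concrete illustration, the single FGSM step can be sketched on a model whose input gradient is available in closed form; the tiny logistic-regression victim below is a toy stand-in (all weights and inputs are hypothetical), not any specific model from the literature:

```python
import numpy as np

def fgsm_attack(x, y, w, b, epsilon):
    """Craft an FGSM adversarial example for a logistic-regression victim.

    For binary cross-entropy L(x) = -log p(y|x), the input gradient has the
    closed form (p - y) * w, so no autodiff framework is needed here.
    """
    p = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))  # predicted P(y=1|x)
    grad = (p - y) * w                              # dL/dx
    return x + epsilon * np.sign(grad)              # one signed-gradient step

# Toy example: a point initially classified as class 1 (w.x + b = 1.5 > 0).
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.5])
x_adv = fgsm_attack(x, y=1, w=w, b=b, epsilon=1.0)
# The step moves x in the direction that increases the loss on the true
# label; here the perturbed point crosses the decision boundary.
```

Iterative variants such as BIM/PGD simply repeat this signed step with a small step size and clip the result back into an epsilon-ball around the original input.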
On the other hand, black-box attacks operate under more constrained settings, where attackers have no or limited access to the target model. There are two primary categories of black-box attacks: query-based attacks and transfer-based attacks. Query-based attacks involve iteratively querying the victim model to gather gradient information, which is then used to optimize the input to produce adversarial examples. Transfer-based attacks leverage the effectiveness of adversarial examples generated on surrogate models to deceive the victim model. Several techniques have been proposed to enhance the transferability of adversarial examples. These include employing advanced optimization algorithms, applying input transformations, and utilizing ensemble-model attacks. By refining these strategies, attackers can achieve higher success rates in deceiving diverse models with adversarial examples, posing significant challenges for robust model defenses.
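For intuition on query-based attacks, the gradient of a black-box loss can be estimated from queries alone via central finite differences; the quadratic `f` below is a hypothetical stand-in for a victim model's loss, used only to make the sketch runnable:

```python
import numpy as np

def estimate_gradient(f, x, delta=1e-4):
    """Estimate the gradient of f at x using only black-box queries
    (two loss evaluations per input dimension)."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = delta
        grad[i] = (f(x + e) - f(x - e)) / (2 * delta)
    return grad

# Hypothetical victim "loss" we can only query, not differentiate.
f = lambda x: float(np.sum(x ** 2))
x = np.array([1.0, -2.0, 3.0])
g = estimate_gradient(f, x)        # approximates the true gradient 2x
x_attacked = x + 0.1 * np.sign(g)  # one signed ascent step on the loss
```

Real query-based attacks reduce the query count with techniques such as random directional estimates, but the principle is the same: the attacker reconstructs gradient information purely from model outputs.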
The study of adversarial examples is crucial for understanding the vulnerabilities of machine learning models and developing robust defenses against potential attacks.
2.2 Reversible data hiding
Reversible data hiding (RDH) is a unique form of data hiding that allows for the recovery of the original image without any distortion from the marked image while also enabling the extraction of embedded hidden data. RDH algorithms are typically categorized into three main groups: compression embedding [7], difference expansion [8], and histogram shift [9]. Each category of RDH algorithms offers distinct approaches to achieve reversible embedding while preserving data integrity.
Compression embedding methods focus on exploiting the redundancy in the image to embed additional data without causing irreversible changes. These techniques often leverage lossless compression algorithms to compress the image data and then embed additional information into the compressed domain. Upon extraction, the original image can be fully recovered without any loss of information.
Difference expansion techniques operate by modifying the difference values between neighboring pixel intensities to embed hidden data. By carefully adjusting the differences, data can be embedded in a reversible manner, allowing for accurate extraction without distortion of the original image.
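Difference expansion on a single pixel pair, in the style of Tian's scheme [8], can be sketched as follows (integer arithmetic only; the overflow/underflow handling a full scheme needs is omitted):

```python
def de_embed(a, b, bit):
    """Embed one bit into a pixel pair (a, b) by difference expansion."""
    l = (a + b) // 2        # integer average, preserved by the embedding
    h = a - b               # difference between the pair
    h2 = 2 * h + bit        # expanded difference carries the bit in its LSB
    a2 = l + (h2 + 1) // 2  # marked pair reconstructed from (l, h2)
    b2 = l - h2 // 2
    return a2, b2

def de_extract(a2, b2):
    """Recover the embedded bit and the original pair exactly."""
    l = (a2 + b2) // 2
    h2 = a2 - b2
    bit = h2 & 1
    h = h2 // 2             # floor division inverts the expansion
    return bit, l + (h + 1) // 2, l - h // 2

a2, b2 = de_embed(100, 95, 1)   # marked pair (103, 92)
bit, a, b = de_extract(a2, b2)  # bit 1 and the original (100, 95) come back
```

Because the integer average `l` is invariant and the expansion is invertible, extraction restores both the hidden bit and the original pixels with no distortion.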
Histogram shift methods manipulate the histogram of the image to embed data. By shifting the histogram bins within certain bounds, additional information can be embedded without causing irreversible changes to the image. This allows for the extraction of hidden data, while ensuring the recovery of the original image remains distortion-free.
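A minimal histogram-shift sketch follows, under the simplifying assumptions that the peak bin `peak` lies below an empty bin `zero` and that the payload length equals the number of peak-valued pixels:

```python
import numpy as np

def hs_embed(img, bits, peak, zero):
    """Histogram-shift embedding: bins in (peak, zero) shift up by one to
    free the bin peak+1; each peak-valued pixel then carries one bit
    (peak -> bit 0, peak+1 -> bit 1). Assumes len(bits) peak pixels."""
    out = img.copy()
    out[(img > peak) & (img < zero)] += 1
    carriers = np.flatnonzero(img == peak)  # scan order = embedding order
    for idx, bit in zip(carriers, bits):
        out.flat[idx] = peak + bit
    return out

def hs_extract(marked, peak, zero):
    """Recover the bits and the original image exactly."""
    bits = [int(v == peak + 1) for v in marked.flatten()
            if v in (peak, peak + 1)]
    restored = marked.copy()
    restored[(marked > peak) & (marked <= zero)] -= 1
    return bits, restored

img = np.array([[3, 5, 7], [5, 6, 5]])  # peak bin 5, empty bin 8
marked = hs_embed(img, [1, 0, 1], peak=5, zero=8)
bits, restored = hs_extract(marked, peak=5, zero=8)
```

Shifting the bins back down undoes the histogram modification, which is why the recovery is exact.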
The performance of reversible data hiding methods is often evaluated based on metrics such as embedding capacity, distortion introduced to the cover signal, and extraction efficiency. Higher embedding capacity allows for more data to be hidden within the cover signal, while minimizing distortion ensures that the original signal can be accurately reconstructed. Extraction efficiency measures the accuracy and reliability of recovering the hidden data from the stego signal.
Overall, reversible data hiding techniques provide a valuable means of embedding additional information into digital images while preserving their integrity and ensuring the reversible extraction of hidden data. These methods find applications in various domains, including image authentication, watermarking, and data hiding for secure communication. Research in reversible data hiding continues to explore new techniques and applications, aiming to strike a balance between embedding capacity, distortion, and reversibility to meet the diverse requirements of different applications and scenarios.
3. Generation methods for reversible adversarial example
3.1 White-box reversible adversarial example
3.1.1 Post smoothing and in-the-loop smoothing method
Liu et al. [3] proposed white-box reversible adversarial example algorithms by combining adversarial attacks with an RDH algorithm. The overall framework is illustrated in Figure 1. They proposed two RAE methods: the post-smoothing method and the in-the-loop smoothing method. A straightforward approach to achieving reversible adversarial examples is to embed the adversarial perturbation into the adversarial image using a reversible data hiding scheme, enabling the receiver to invalidate the adversarial perturbation. However, RDH is primarily designed for embedding a short message into a large image, which makes it ill-suited to carrying a full per-pixel perturbation. To address this limitation, they divide the image into super-pixels and embed the adversarial perturbation generated for these super-pixels, which greatly reduces the payload. The general process of RAE is described in Algorithm 1.
![](/media/chapter/a043Y00000yGSwdQAG/a09Tc0000004r5lIAA/media/F1.png)
Figure 1.
Generation process of reversible adversarial example [3].
Algorithm 1. Generation of a reversible adversarial example:

1: Generate an adversarial example from the original image using an adversarial attack.

2: Compress the adversarial perturbation (the difference between the adversarial example and the original image) losslessly.

3: Embed the compressed perturbation into the adversarial example using RDH to obtain the reversible adversarial example.

Let $x$ denote the original image, $x^{adv}$ the adversarial example, and $S(\cdot)$ the smoothing operation that averages a perturbation within each super-pixel.

The post-smoothing method represents the most straightforward approach to generate adversarial examples over super-pixels. Initially, an adversarial example $x'$ is generated using any arbitrary method. Denoting the resulting perturbation by $r = x' - x$, the adversarial example generated using the post-smoothing method is obtained as

$x^{adv} = x + S(r),$

i.e., the per-pixel perturbation is averaged within each super-pixel after the attack has finished.
The disadvantage of the post-smoothing method is that the adversarial perturbation is smoothed only after the optimization completes, which weakens the attack ability of the generated RAE. To mitigate this issue, they propose the in-the-loop smoothing super-pixel adversarial attack, which requires more computational overhead but is expected to have less impact on attack ability. They take the basic iterative method (BIM) [10] as an example and propose an in-the-loop smoothing version of it. BIM adds adversarial perturbations by iteratively updating the original image:

$x_0^{adv} = x, \quad x_{t+1}^{adv} = \mathrm{Clip}_{x,\epsilon}\{x_t^{adv} + \alpha\,\mathrm{sign}(\nabla_x J(x_t^{adv}, y))\},$

where $\alpha$ is the step size, $J(\cdot, \cdot)$ is the classification loss, $y$ is the ground-truth label, and $\mathrm{Clip}_{x,\epsilon}\{\cdot\}$ restricts the result to an $\epsilon$-neighborhood of $x$. The in-the-loop smoothing version applies super-pixel smoothing to the accumulated perturbation at every iteration:

$x_{t+1}^{adv} = \mathrm{Clip}_{x,\epsilon}\{x + S(x_t^{adv} + \alpha\,\mathrm{sign}(\nabla_x J(x_t^{adv}, y)) - x)\},$

where $S(\cdot)$ averages the accumulated perturbation within each super-pixel, so the final perturbation is constant over super-pixels by construction.
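The super-pixel smoothing operation that both methods rely on can be sketched as follows, assuming a precomputed segmentation label map (in practice produced by a super-pixel algorithm such as SLIC; the labels here are hand-made for illustration):

```python
import numpy as np

def smooth_over_superpixels(perturbation, labels):
    """Replace each pixel's perturbation by the mean over its super-pixel,
    so the perturbation is constant within every segment and can be
    described by one value per segment for RDH embedding."""
    smoothed = np.empty_like(perturbation, dtype=float)
    for seg in np.unique(labels):
        mask = labels == seg
        smoothed[mask] = perturbation[mask].mean()
    return smoothed

# 2x4 perturbation with two hand-made segments (left half 0, right half 1).
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])
pert = np.array([[1.0, 3.0, 10.0, 10.0],
                 [1.0, 3.0,  6.0,  6.0]])
smoothed = smooth_over_superpixels(pert, labels)
# left segment mean = 2.0, right segment mean = 8.0
```

Because the smoothed perturbation takes a single value per segment, the payload shrinks from one value per pixel to one value per super-pixel.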
Liu et al. [3] introduced the concept of RAE and presented the first prototype framework to verify its feasibility. The proposed method integrates adversarial examples, reversible data hiding, and encryption to achieve RAE. Moreover, RAE can be viewed as a form of encryption for computer vision, with the reversibility of RAE serving as the corresponding decryption.
3.1.2 Reversible adversarial example based on reversible data hiding in YUV Colorspace
Yin et al. [3] proposed a reversible adversarial example scheme where the adversarial perturbation is embedded in the UV channels using the reversible data hiding (RDH) technique. Specifically, the prediction error extension embedding algorithm [11] is utilized to embed the perturbation. This algorithm leverages the correlation between image pixels.
Initially, the image is converted to YUV color space and the adversarial perturbation is generated in the Y (luminance) channel. In addition, the class activation mapping (CAM) [12] technique is utilized to narrow down the region of adversarial perturbation. Next, the adversarial distortion in the Y channel is embedded into the UV channels using RDH. Finally, the image is converted from YUV back to RGB color space using the standard conversion

$R = Y + 1.140V, \quad G = Y - 0.395U - 0.581V, \quad B = Y + 2.032U.$
This process is iteratively repeated until the victim model is deceived by the generated reversible adversarial example.
The RDH algorithm [11] guarantees the exact recovery of the original images. First, the reversible adversarial example is converted from RGB to YUV color space. Next, the perturbation is extracted from the UV channels by the RDH algorithm [11] and removed from the Y channel. Finally, the image is converted from YUV back to RGB color space to recover the original image.
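The color-space round trip underlying this scheme can be sketched with BT.601 full-range constants (an assumption; the paper's exact conversion matrix may differ):

```python
import numpy as np

# BT.601 full-range RGB -> YUV matrix (one common convention).
RGB2YUV = np.array([[ 0.299,  0.587,  0.114],
                    [-0.147, -0.289,  0.436],
                    [ 0.615, -0.515, -0.100]])

def rgb_to_yuv(rgb):
    """Convert (..., 3) RGB values to YUV."""
    return rgb @ RGB2YUV.T

def yuv_to_rgb(yuv):
    """Invert the conversion with the exact matrix inverse, so the
    round trip is lossless up to floating-point precision."""
    return yuv @ np.linalg.inv(RGB2YUV).T

rgb = np.array([[0.5, 0.2, 0.7]])
yuv = rgb_to_yuv(rgb)  # perturbation targets yuv[..., 0] (Y, luminance);
rt = yuv_to_rgb(yuv)   # RDH embedding targets channels 1-2 (U, V)
```

In the actual scheme, integer rounding during the YUV/RGB conversion must also be accounted for so that recovery stays exact.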
This reversible adversarial example scheme thus achieves exact recovery of the original images, which enables further use of the images at the receiver end. Moreover, this method embeds the information in the chrominance (UV) channels while introducing the adversarial perturbation in the luminance (Y) channel, which reduces the influence of the embedded information on the attack ability of the RAE.
3.1.3 Reversible adversarial example based on local visible adversarial perturbation
Yin et al. [13] proposed a RAE scheme based on local visible adversarial perturbation. In the process of generating adversarial examples, they adopt AdvPatch [14] in their method. Rao et al. [15] suggest that the placement of the patch within the image significantly impacts the effectiveness of the attack. Thus, they employ basin hopping evolution (BHE) [16] to determine the position of the patch within the image. BHE combines basin hopping and evolutionary techniques, utilizing multiple starting points and crossover operations to maintain solution diversity, thereby facilitating the attainment of the global optimum. They initialize the population and commence the iterative process. In each iteration, the basin hopping algorithm is employed to generate a series of improved solutions. Subsequently, crossover and selection operations are conducted to choose the next generation of the population.
In the process of generating reversible adversarial examples, the segment of the original image obscured by the adversarial patch is treated as the secret image and is embedded into the adversarial example using RDH. They compress the secret image and convert it into binary form to reduce the amount of embedded information. Then, they adopt prediction-error expansion [11] to embed the data. The embedding process mainly includes two steps. First, they predict each pixel value $\hat{x}_{i,j}$ from its neighborhood: the predictor leverages the inherent correlation within the pixel neighborhood, estimating a pixel from its already-scanned neighbors. Second, the prediction error is calculated as

$e_{i,j} = x_{i,j} - \hat{x}_{i,j},$

where $x_{i,j}$ is the original pixel value and $\hat{x}_{i,j}$ is its prediction. The error is then expanded to $e'_{i,j} = 2e_{i,j} + b$ to carry one secret bit $b$, and the marked pixel value becomes $x'_{i,j} = \hat{x}_{i,j} + e'_{i,j}$.
In the process of recovering original images, they first extract auxiliary information and image data. Then, they decompress the extracted data and recover the original image with no distortion by the auxiliary information.
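A single-pixel sketch of prediction-error expansion with the median edge detector (MED) predictor commonly used in PEE schemes such as [11] (overflow/underflow handling omitted):

```python
def med_predict(a, b, c):
    """Median edge detector predictor: a = left neighbor, b = upper
    neighbor, c = upper-left neighbor of the current pixel."""
    if c <= min(a, b):
        return max(a, b)
    if c >= max(a, b):
        return min(a, b)
    return a + b - c

def pee_embed(x, a, b, c, bit):
    """Expand the prediction error e = x - x_hat to 2e + bit."""
    p = med_predict(a, b, c)
    return p + 2 * (x - p) + bit

def pee_extract(x_marked, a, b, c):
    """Recover the bit and the original pixel. The neighbors are still
    unmodified at extraction time, so the prediction is identical."""
    p = med_predict(a, b, c)
    e2 = x_marked - p
    return e2 & 1, p + e2 // 2

marked = pee_embed(100, 98, 103, 97, 1)  # error -3 expands to -5
bit, x = pee_extract(marked, 98, 103, 97)
```

Scanning pixels in a fixed causal order guarantees that each pixel's neighbors are recovered before the pixel itself, which is what makes blind, distortion-free extraction possible.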
3.1.4 Reversible adversarial example based on diffusion model
Xing et al. [17] proposed a RAE scheme based on a diffusion model. First, they define a backdoor trigger that biases the Gaussian noise used in the diffusion process, so that a denoising diffusion probabilistic model (DDPM) is trained on a biased Gaussian distribution (BGD). The standard generative (reverse) process of a DDPM denoises step by step,

$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\, \mu_\theta(x_t, t),\, \sigma_t^2 I\big),$

where $\mu_\theta(x_t, t)$ is the learned denoising mean and $\sigma_t^2$ is the reverse-process variance at step $t$. Given an input image, the model trained on the BGD stamps the trigger-induced perturbation onto it, yielding a reversible adversarial example (self-generation); given the reversible adversarial example, the same model removes this perturbation through its denoising process and recovers the original image (self-recovery), where the reversibility is guaranteed by the bias learned during training.
3.2 Black-box reversible adversarial example
3.2.1 B-RAE method
Xiong et al. [18] proposed a black-box reversible adversarial example scheme (B-RAE). This scheme comprises three components: perturbation generative network (PGN) training, reversible adversarial example (RAE) generation, and original image recovery.
The perturbation generative network is trained to generate robust black-box adversarial perturbations. To enhance the resemblance between the adversarial image and the original image, they employ a discriminator to impose constraints on the PGN, ensuring the generation of small and precise noise. A noise layer is designed to simulate typical image-processing operations, aiming to enhance the robustness of the adversarial example; this noise robustness must be addressed during PGN training because RDH unavoidably introduces noise into the image. By incorporating the noise layer, the adversarial example becomes less sensitive to minor noise, thereby decreasing the impact of information embedding. To augment the significance of the perturbation sign on attack ability, they further devise a perturbation loss that emphasizes the sign of the generated perturbation while keeping its magnitude small.
In addition, they utilize the ensemble strategy [20] to enhance the transferability of adversarial examples: the classification loss aggregates the losses of multiple surrogate models, so that the generated perturbation deceives the ensemble rather than overfitting a single model.
After the perturbation generative network generates the adversarial perturbation, they employ pre-processing operation and lossless compression to compress the generated adversarial perturbation, thereby reducing information size. Then, the RAE is obtained by embedding the compressed data into the preliminary adversarial example using the RDH technique.
To recover the original image, they first apply the inverse of the RDH algorithm to extract the embedded data from the RAE, restoring the preliminary adversarial example as it was before embedding. The extracted data comprise the compressed binary data required for restoring the original image. Then, they recover the adversarial perturbation by decompressing the data and replicating it from a single channel to the other two channels. Finally, they restore the original image by subtracting the adversarial perturbation from the preliminary adversarial example.
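The compression and channel-replication steps described above can be sketched as follows, with `zlib` standing in for the paper's pre-processing and lossless coder (an assumption) and the perturbation assumed identical across the three channels, as the replication step implies:

```python
import numpy as np
import zlib

def pack_perturbation(pert):
    """Keep a single channel of the (channel-replicated) perturbation
    and compress it losslessly to shrink the embedded payload."""
    return zlib.compress(pert[..., 0].astype(np.int16).tobytes())

def recover_original(adv, blob):
    """Decompress one channel, replicate it to three channels, and
    subtract it from the preliminary adversarial example."""
    h, w, _ = adv.shape
    chan = np.frombuffer(zlib.decompress(blob), dtype=np.int16).reshape(h, w)
    pert = np.repeat(chan[:, :, None], 3, axis=2)
    return adv - pert

# Toy demo: a 2x2 "image" with a +/-1 perturbation shared by all channels.
orig = np.full((2, 2, 3), 100, dtype=np.int16)
pert1 = np.array([[1, -1], [0, 1]], dtype=np.int16)
adv = orig + pert1[:, :, None]
blob = pack_perturbation(adv - orig)
restored = recover_original(adv, blob)
```

Storing one channel instead of three is what makes the payload small enough to embed reversibly; the RDH embedding and extraction steps themselves are abstracted away here.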
3.2.2 Reversible adversarial example based on flipping transformation
Fang et al. [21] proposed a black-box RAE scheme based on flipping transformation. First, they adopt the CAM [22] technique to obtain the attention map. Inspired by Yang et al. [23], they incorporate flipping transformation into the process of generating adversarial examples to enhance adversarial transferability: in each iteration, the input image is randomly flipped with a probability $p$ before the gradient is computed, which diversifies the attacked inputs and prevents the perturbation from overfitting the surrogate model.
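The flipping transformation itself is simple; a minimal sketch (horizontal flip with probability `p`, applied before each gradient computation) might look like:

```python
import numpy as np

def random_flip(x, p, rng):
    """With probability p, horizontally flip the input before the gradient
    step -- an input transformation intended to improve transferability."""
    if rng.random() < p:
        return x[:, ::-1].copy()
    return x

rng = np.random.default_rng(0)
img = np.arange(6.0).reshape(2, 3)
always = random_flip(img, p=1.0, rng=rng)  # columns reversed
never = random_flip(img, p=0.0, rng=rng)   # unchanged
```

In the attack loop, the gradient computed on the (possibly flipped) input is flipped back before being accumulated, so the perturbation stays aligned with the original image.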
4. Applications of reversible adversarial example
4.1 Privacy protection
More and more users would like to share their personal images on social network software. However, malicious commercial companies can utilize deep models to collect user data and obtain personal information. By employing RAE, users can ensure the legitimate utilization of shared data by authorized parties and prevent unauthorized access by illegitimate parties, thereby protecting their privacy.
4.2 Dataset access control
On the internet, there exist numerous commercial image datasets that have been meticulously collected through substantial human effort. The RAE scheme can be employed to safeguard such datasets: RAE image datasets evade recognition by unauthorized AI models, thereby controlling access to the original image datasets.
4.3 Model authorization
There are many AI-model-based applications on the market, but the quality of these models is not guaranteed. The market needs to authenticate AI models that meet application requirements and allow only authorized models to be published. The RAE scheme can be applied to this model-authorization setting: using a set of reversible adversarial examples, authorized models can correctly recognize the images, while unauthorized models will misclassify them. Thus, authorized models can be identified according to their classification accuracy.
5. Conclusion and future directions
In this chapter, we have delved into the fascinating realm of reversible adversarial examples and explored multiple distinct methods for generating them. Through our exploration of both white-box and black-box attack methods, we have witnessed the effectiveness of RAE schemes against deep models. Each method introduced in this chapter offers unique insights and techniques for crafting reversible adversarial examples. Whether leveraging adversarial perturbation, reversible data hiding, or innovative transformation strategies, these methods highlight the ingenuity and creativity in the adversarial attack landscape.
Looking ahead, the field of RAE holds promise for further exploration and innovation. Future research efforts may focus on developing more sophisticated attack methods, enhancing the transferability of adversarial examples, and exploring the practical implications of RAE in real-world applications. Additionally, continued efforts in adversarial defense and ethical guidelines will be essential to ensure the responsible use and deployment of AI technologies in society.
Our journey into the realm of reversible adversarial examples has provided valuable insights and perspectives, paving the way for continued exploration and advancement in this captivating field.
Acknowledgments
The author acknowledges the use of AI tools for language polishing of the manuscript.
References
- 1. Zhao H, Jia J, Koltun V. Exploring self-attention for image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE Computer Society; 2020. pp. 10076-10085
- 2. Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M, et al. Transformers: State-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics; 2020. pp. 38-45
- 3. Liu J, Zhang W, Fukuchi K, Akimoto Y, Sakuma J. Unauthorized AI cannot recognize me: Reversible adversarial example. Pattern Recognition. 2023;134:109048
- 4. Goodfellow IJ, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. arXiv. 2014
- 5. Kurakin A, Goodfellow I, Bengio S. Adversarial examples in the physical world. In: International Conference on Learning Representations. Chapman and Hall/CRC; 2017
- 6. Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A. Towards deep learning models resistant to adversarial attacks. In: International Conference on Learning Representations. 2018
- 7. Fridrich J, Goljan M, Du R. Lossless data embedding for all image formats. Electronic Imaging. 2002;4675:572-583
- 8. Tian J. Reversible data embedding using a difference expansion. IEEE Transactions on Circuits and Systems for Video Technology. 2003;13(8):890-896
- 9. Ni Z, Shi Y, Ansari N, Su W. Reversible data hiding. IEEE Transactions on Circuits and Systems for Video Technology. 2006;16(3):354-362
- 10. Kurakin A, Goodfellow I, Bengio S. Adversarial machine learning at scale. arXiv. 2016
- 11. Thodi DM, Rodríguez JJ. Expansion embedding techniques for reversible watermarking. IEEE Transactions on Image Processing. 2007;16(3):721-730
- 12. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision. IEEE Computer Society; 2017. pp. 618-626
- 13. Chen L, Zhu S, Andrew A, Yin Z. Reversible attack based on local visible adversarial perturbation. Multimedia Tools and Applications. 2024;83(4):11215-11227
- 14. Brown TB, Mané D, Roy A, Abadi M, Gilmer J. Adversarial patch. arXiv. 2017
- 15. Rao S, Stutz D, Schiele B. Adversarial training against location-optimized adversarial patches. In: European Conference on Computer Vision. Springer; 2020. pp. 429-448
- 16. Jia X, Wei X, Cao X, Han X. Adv-watermark: A novel watermark perturbation for adversarial examples. In: Proceedings of the 28th ACM International Conference on Multimedia. 2020. pp. 1579-1587
- 17. Xing F, Zhou X, Fan X, Tian Z, Zhao Y. RAEDiff: Denoising diffusion probabilistic models based reversible adversarial examples self-generation and self-recovery. arXiv. 2023
- 18. Xiong L, Wu Y, Yu P, Zheng Y. A black-box reversible adversarial example for authorizable recognition to shared images. Pattern Recognition. 2023;140:109549
- 19. Rezatofighi H, Tsoi N, Gwak JY, Sadeghian A, Reid I, Savarese S. Generalized intersection over union: A metric and a loss for bounding box regression. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE; 2019. pp. 658-666
- 20. Che Z, Borji A, Zhai G, Ling S, Li J, Le Callet P. A new ensemble adversarial attack powered by long-term gradient memories. In: Proceedings of the AAAI Conference on Artificial Intelligence. 2020. pp. 3405-3413
- 21. Fang Y, Jia J, Yang Y, Lyu W. Improving transferability reversible adversarial examples based on flipping transformation. In: International Conference of Pioneering Computer Scientists, Engineers and Educators. Springer; 2023. pp. 417-432
- 22. Hou Q, Zhou D, Feng J. Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE; 2021. pp. 13713-13722
- 23. Yang B, Zhang H, Li Z, Xu K. Adversarial example generation method based on image flipping transform. Journal of Computer Applications. 2022;42(8):2319
- 24. He W, Cai Z. Reversible data hiding based on dual pairwise prediction-error expansion. IEEE Transactions on Image Processing. 2021;30:5045-5055