Unlearnable adversarial samples
Jonathan Peck
21 May 2021
Adversarial samples
Can also target other objectives, e.g.
● Making files uncompressible
● Producing erroneous machine translations
● Circumventing content filtering systems

Basically any disruption of ML at inference time
Not necessarily always a bad thing!
Maximize model loss via minimal perturbations, subject to a norm constraint:
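In the notation commonly used for this objective (assumed here, not taken from the original slides: $f_\theta$ is the model, $L$ the loss, $(x, y)$ a labelled sample, $\delta$ the perturbation and $\varepsilon$ the norm budget), the attack roughly solves

\[
\max_{\|\delta\| \le \varepsilon} \; L\bigl(f_\theta(x + \delta),\, y\bigr)
\]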
Basically add imperceptible “decoy” signals to a sample that trigger incorrect model outputs
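As a concrete illustration of this idea, here is a minimal one-step (FGSM-style) perturbation in PyTorch. This is a generic sketch, not the specific attack used in this work, and it assumes image tensors with values in [0, 1]:

```python
import torch

def fgsm_perturb(model, loss_fn, x, y, eps):
    """One-step FGSM-style perturbation: move x in the direction that
    increases the loss, staying within an L-infinity ball of radius eps.
    Assumes x contains image tensors with values in [0, 1]."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)   # loss the attacker wants to maximize
    loss.backward()
    x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Iterating this step and projecting back onto the norm ball gives the usual PGD-style attack.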
Common countermeasure: adversarial training
Turns model fitting into a bi-level optimization, subject to a norm constraint:
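A sketch of the min-max formulation usually given for adversarial training, in the same assumed notation as above (with $\mathcal{D}$ the training distribution):

\[
\min_{\theta} \; \mathbb{E}_{(x,y) \sim \mathcal{D}} \left[ \max_{\|\delta\| \le \varepsilon} L\bigl(f_\theta(x + \delta),\, y\bigr) \right]
\]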
Can be very effective against known attacks
Unlearnable samples
Interesting case studies in unauthorized exploitation of personal data:
● ImageNet
● CASIA-WebFace
● VGG Face
● The Human Genome Diversity Project
Train the model via a bi-level optimization, subject to a norm constraint:
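One common way to write this (following the error-minimizing-noise formulation from the unlearnable-examples literature; the exact form on the original slide may differ) replaces the inner maximization of adversarial training with a minimization over the perturbation, so the noise makes the data trivially easy to fit:

\[
\min_{\theta} \; \mathbb{E}_{(x,y) \sim \mathcal{D}} \left[ \min_{\|\delta\| \le \varepsilon} L\bigl(f_\theta(x + \delta),\, y\bigr) \right]
\]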
Basically add imperceptible “decoy” signals to make the model overfit!
Goal is to make data useless for training
Problem. Classifiers trained on natural data will still generalize to unlearnable samples
Unlearnable adversarial samples
Can we have unlearnable samples that are also adversarial?

“Best” of both worlds:
1. Existing models are unreliable against our samples
2. Models cannot be retrained on our data without sacrificing generalization
Observation. Unlearnable and adversarial perturbations can be restricted to specific regions of the input images
Basic idea. Create two disjoint regions: one for the unlearnable perturbation and one for the adversarial perturbation.
Regions created uniformly at random: each pixel has equal probability of belonging to the unlearnable region or the adversarial region.
Leads to discontiguous regions: harder to detect and remove but still effective!
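A minimal sketch of how such a random split might be implemented (illustrative only: the function name is made up, and delta_u / delta_a stand for precomputed unlearnable and adversarial perturbations of the same shape as the image):

```python
import numpy as np

def combine_perturbations(x, delta_u, delta_a, rng=None):
    """Apply an unlearnable perturbation (delta_u) and an adversarial
    perturbation (delta_a) to disjoint, randomly chosen pixel regions of an
    image x of shape (H, W, C), with all values assumed to lie in [0, 1].

    Each pixel is assigned independently and uniformly at random to one of
    the two regions, so the regions are disjoint."""
    rng = np.random.default_rng() if rng is None else rng
    # Per-pixel coin flip: True -> unlearnable region, False -> adversarial region.
    mask = rng.integers(0, 2, size=x.shape[:2]).astype(bool)[..., np.newaxis]
    perturbed = x + np.where(mask, delta_u, delta_a)
    return np.clip(perturbed, 0.0, 1.0), mask
```

Because every pixel is assigned independently, the two regions end up scattered across the image rather than forming contiguous blocks, which is what makes them harder to detect and remove.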
Preliminary results. Using a ResNet-50 on the CIFAR-10 data set:

Model       Natural    Unlearnable    Adversarial    Unlearnable adversarial
Standard    81.05%     99.76%         44.03%         27.20%
Retrained   39.20%     99.94%         39.66%         100.00%
Preliminary results. Using a ResNet-50 on the ImageNette data set:

Model       Natural    Unlearnable    Adversarial    Unlearnable adversarial
Standard    83.72%     100.00%        31.40%         10.56%
Retrained   33.71%     98.65%         39.02%         98.17%
Perfect storm. More complex models typically yield more impressive results, however…
… they are more susceptible to adversarials
… they overfit more easily on unlearnables

Smaller networks are more resistant to this attack but significantly less accurate!
Unlearnable and adversarial perturbations seem to be “orthogonal” to a significant extent
Can use any adversarial attack, including transferable and universal perturbations!
Adversarial training by itself already tends to negatively affect generalization
Conclusion
Unlearnable adversarial samples
● look normal to people
● cause existing models to fail
● cannot be effectively learned from

Can theoretically be used to protect data from being exploited for training as well as inference
Possible shortcomings:
● Image transformations (cropping, rotating)
● Adversarially robust models
● Specific restoration techniques

Likely another “arms race” scenario