Johnson, J., Alahi, A., & Fei-Fei, L. (2016, October). Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision (pp. 694-711). Springer, Cham.

VGG features

The data

path = untar_data(URLs.IMAGENETTE)
db = DataBlock(blocks=(ImageBlock, ImageBlock),
               get_items=get_image_files,
               splitter=RandomSplitter(valid_pct=0.01),
               get_x=noop, get_y=noop,
               item_tfms=Resize(256),
               batch_tfms=Normalize.from_stats(0.5*torch.ones(3), 0.5*torch.ones(3)))
dls = db.dataloaders(path, bs=4, num_workers=4)
dls.show_batch()

For style transfer we choose an image as the style target and normalize it with imagenet_stats.

def get_style_target(artist, size=256, **kwargs):
    r = requests.get(artists_sources[artist], stream=True)
    style_target_img = PILImage.create(r.content)
    p = Pipeline([ToTensor,
                  Resize(size, **kwargs),
                  IntToFloatTensor,
                  Normalize.from_stats(*imagenet_stats, cuda=False)])
    return p(style_target_img), p
style_target, p = get_style_target('picasso')
p.decode(style_target)[0].show(figsize=(10,10));

The Loss

Following the authors, we extract features from the VGG16 model.

These are the original weights used in the paper.

!wget http://cs.stanford.edu/people/jcjohns/fast-neural-style/models/vgg16.t7 -O vgg16.t7
URL transformed to HTTPS due to an HSTS policy
--2020-11-11 17:36:44--  https://cs.stanford.edu/people/jcjohns/fast-neural-style/models/vgg16.t7
Resolving cs.stanford.edu (cs.stanford.edu)... 171.64.64.64
Connecting to cs.stanford.edu (cs.stanford.edu)|171.64.64.64|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 553452665 (528M)
Saving to: ‘vgg16.t7’

vgg16.t7            100%[===================>] 527.81M  14.7MB/s    in 38s     

2020-11-11 17:37:23 (13.8 MB/s) - ‘vgg16.t7’ saved [553452665/553452665]

The PerceptualLoss module computes the feature loss based on feature_layer and the style loss on style_layers_names.

gramm_matrix[source]

gramm_matrix(x)
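
For reference, the Gram matrix of a batch of feature maps can be computed as follows (a minimal sketch; the library's gramm_matrix may use a different normalization):

import torch

def gram_matrix_sketch(x):
    "x: (bs, ch, h, w) activations -> (bs, ch, ch) Gram matrices"
    bs, ch, h, w = x.shape
    feats = x.view(bs, ch, h*w)                        # flatten spatial dims
    return feats @ feats.transpose(1, 2) / (ch*h*w)    # normalized inner products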

anisotropic_total_variation[source]

anisotropic_total_variation(x)
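
Anisotropic total variation penalizes differences between neighboring pixels, encouraging spatially smooth outputs. A minimal sketch of the standard formulation:

import torch

def anisotropic_tv_sketch(x):
    "Sum of absolute differences between horizontally and vertically adjacent pixels."
    dh = (x[..., :, 1:] - x[..., :, :-1]).abs().sum()  # horizontal neighbors
    dv = (x[..., 1:, :] - x[..., :-1, :]).abs().sum()  # vertical neighbors
    return dh + dv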

class PerceptualLoss[source]

PerceptualLoss(style_target=None, style_weight=5, feature_weight=1, renormalize=True, feature_layer='relu2_2', style_layers_names=['relu1_2', 'relu2_2', 'relu3_3', 'relu4_3'], bs=1, cuda=True, tv_weight=1e-05) :: Module
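
Schematically, the loss combines three terms from the paper: a feature reconstruction loss at feature_layer, a style loss (MSE between Gram matrices) over style_layers_names, and a total variation penalty. A rough sketch of the weighting, assuming MSE for both comparisons (names here are illustrative, not the actual implementation):

import torch.nn.functional as F

def perceptual_loss_sketch(pred_feats, target_feats, pred_grams, style_grams, pred,
                           feature_weight=1, style_weight=5, tv_weight=1e-5):
    # pred_feats/target_feats: activations at feature_layer
    # pred_grams/style_grams:  Gram matrices at each of the style layers
    feat_loss  = F.mse_loss(pred_feats, target_feats)
    style_loss = sum(F.mse_loss(pg, sg) for pg, sg in zip(pred_grams, style_grams))
    tv_loss    = anisotropic_tv_sketch(pred)
    return feature_weight*feat_loss + style_weight*style_loss + tv_weight*tv_loss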


style_target_test = TensorImage(torch.rand(1, 3, 256, 256)*2-1)
feature_loss = PerceptualLoss(style_target_test, renormalize=True, style_weight=1, bs=4)
input = TensorImage(torch.rand(2, 3, 256, 256)*2-1).cuda()
target = TensorImage(torch.rand(2, 3, 256, 256)*2-1).cuda()
loss = feature_loss(input, target)
loss
TensorImage(16743.9863, device='cuda:0', grad_fn=<AliasBackward>)

Test that the style image is properly normalized

style_unnorm = TensorImage(torch.rand(1, 3, 256, 256))
style_imagenet = Normalize.from_stats(*imagenet_stats, cuda=False)(style_unnorm)
style_norm = style_unnorm*2-1 
feature_loss = PerceptualLoss(style_imagenet, renormalize=True, feature_weight=0, cuda=False)
target = TensorImage(torch.rand(1, 3, 256, 256)*2-1)
loss = feature_loss(style_norm, target)
test_eq(loss, 0)

Test that cuda=True works

style_target_test = TensorImage(torch.rand(1, 3, 256, 256)*2-1)
feature_loss = PerceptualLoss(style_target_test, renormalize=True)
input = TensorImage(torch.rand(1, 3, 256, 256)*2-1).cuda()
target = TensorImage(torch.rand(1, 3, 256, 256)*2-1).cuda()
loss = feature_loss(input, target)
loss
TensorImage(83739.4297, device='cuda:0', grad_fn=<AliasBackward>)
style_target_test = TensorImage(torch.rand(1, 3, 256, 256)*2-1)
feature_loss = PerceptualLoss(style_target_test, renormalize=True, bs=4)
input = TensorImage(torch.rand(4, 3, 256, 256)*2-1).cuda()
target = TensorImage(torch.rand(4, 3, 256, 256)*2-1).cuda()
loss = feature_loss(input, target)
loss
TensorImage(85110.8438, device='cuda:0', grad_fn=<AliasBackward>)
style_target_test = TensorImage(torch.rand(1, 3, 256, 256)*2-1)
feature_loss = PerceptualLoss(style_target_test, renormalize=True, style_weight=1e5, feature_weight=1)
target = TensorImage(torch.rand(1, 3, 256, 256)*2-1).to('cuda')
loss = feature_loss(target, target)
loss
TensorImage(1.6949e+09, device='cuda:0', grad_fn=<AliasBackward>)
style_target_test = TensorImage(torch.rand(1, 3, 256, 256)*2-1)
feature_loss = PerceptualLoss(style_target_test, renormalize=True, style_weight=1, bs=4)
input = TensorImage(torch.rand(4, 3, 256, 256)*2-1).cuda()
target = TensorImage(torch.rand(4, 3, 256, 256)*2-1).cuda()
loss = feature_loss(input, target)
loss
TensorImage(17311.0059, device='cuda:0', grad_fn=<AliasBackward>)

We use LBFGS optimization to find images that minimize the style loss.
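
A minimal sketch of this kind of direct image optimization, assuming a PerceptualLoss built with feature_weight=0 so that only the style term matters (variable names are illustrative):

style_only_loss = PerceptualLoss(style_target, feature_weight=0)
img = TensorImage(torch.rand(1, 3, 256, 256)*2 - 1).cuda().requires_grad_()
opt = torch.optim.LBFGS([img])

def closure():
    opt.zero_grad()
    loss = style_only_loss(img, img)   # the target is irrelevant when feature_weight=0
    loss.backward()
    return loss

for _ in range(100): opt.step(closure)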


We can also visualize images that minimize the feature reconstruction loss at different layers.

Resnet Generator

The authors use a generator with residual connections. Their residual block omits the final activation.
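
A plausible sketch of such a block (the normalization choice is an assumption; the original paper used batch normalization):

import torch.nn as nn

class ResBlockSketch(nn.Module):
    "Residual block with no activation after the skip connection is added."
    def __init__(self, n):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n, n, 3, padding=1), nn.BatchNorm2d(n), nn.ReLU(),
            nn.Conv2d(n, n, 3, padding=1), nn.BatchNorm2d(n))
    def forward(self, x): return x + self.body(x)   # no final ReLU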

class JohnsonResBlock[source]

JohnsonResBlock(n) :: Module


jrb = JohnsonResBlock(32)
x = torch.randn(4, 32, 16, 16)
y = jrb(x)
test_eq(y.shape, x.shape)

ResnetGenerator[source]

ResnetGenerator(ni=3, nout=3, nf=32, n_downsamples=2, n_resblocks=5, n_upsamples=2, superres=False)
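
The shape of the network follows the paper: an initial convolution, n_downsamples strided convolutions, n_resblocks residual blocks, n_upsamples upsampling stages, and a final tanh that keeps outputs in [-1, 1] (consistent with the ±0.9999 range below). A hedged structural sketch, reusing the ResBlockSketch above (layer details are assumptions):

import torch.nn as nn

def resnet_generator_sketch(ni=3, nout=3, nf=32, n_down=2, n_res=5, n_up=2):
    layers, ch = [nn.Conv2d(ni, nf, 9, padding=4), nn.ReLU()], nf
    for _ in range(n_down):   # strided downsampling, doubling the channels
        layers += [nn.Conv2d(ch, ch*2, 3, stride=2, padding=1), nn.ReLU()]
        ch *= 2
    layers += [ResBlockSketch(ch) for _ in range(n_res)]   # residual trunk
    for _ in range(n_up):     # upsample back to the input resolution
        layers += [nn.Upsample(scale_factor=2), nn.Conv2d(ch, ch//2, 3, padding=1), nn.ReLU()]
        ch //= 2
    layers += [nn.Conv2d(ch, nout, 9, padding=4), nn.Tanh()]
    return nn.Sequential(*layers)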

style_transfer_generator = ResnetGenerator()
x = torch.randn(1, 3, 256, 256)
y = style_transfer_generator(x)
y.shape, y.max(), y.min()
(torch.Size([1, 3, 256, 256]),
 tensor(0.9999, grad_fn=<MaxBackward1>),
 tensor(-0.9999, grad_fn=<MinBackward1>))

Learning

class LossToDevice[source]

LossToDevice(after_create=None, before_fit=None, before_epoch=None, before_train=None, before_batch=None, after_pred=None, after_loss=None, before_backward=None, after_backward=None, after_step=None, after_cancel_batch=None, after_batch=None, after_cancel_train=None, after_train=None, before_validate=None, after_cancel_validate=None, after_validate=None, after_cancel_epoch=None, after_epoch=None, after_cancel_fit=None, after_fit=None) :: Callback

Basic class handling tweaks of the training loop by changing a Learner in various events
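
The name suggests it moves the loss function onto the DataLoaders' device before training starts; a hedged guess at the behavior, not the actual implementation:

from fastai.vision.all import Callback

class LossToDeviceSketch(Callback):
    "Move the loss function's internal modules to the dls device before fitting."
    def before_fit(self):
        if hasattr(self.learn.loss_func, 'to'):
            self.learn.loss_func.to(self.dls.device)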

style_learner[source]

style_learner(dls, style_target=None, cbs=None, plkwargs={}, loss_func=None, opt_func=Adam, lr=0.001, splitter=trainable_params, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95))

style_target, _ = get_style_target('picasso')
sgc = ShowGraphCallback()
picasso_learn = style_learner(dls, style_target=style_target, cbs=sgc, plkwargs={'style_weight': 0.5, 'feature_weight':5})
with picasso_learn.removed_cbs(sgc):
    picasso_learn.fit(1, lr=1.e-3)                    
picasso_learn.fit(7, lr=1.e-3)
picasso_learn.show_results()

Super-Resolution 4x

ResImageBlock[source]

ResImageBlock(res)

Like fastai's ImageBlock, but resizes images to res.
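
Assuming it mirrors fastai's ImageBlock with an added Resize item transform, a sketch might look like this (not the actual source):

from fastai.vision.all import TransformBlock, PILImage, IntToFloatTensor, Resize

def res_image_block_sketch(res):
    return TransformBlock(type_tfms=PILImage.create,
                          item_tfms=Resize(res),
                          batch_tfms=IntToFloatTensor)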

db = DataBlock(blocks=(ResImageBlock(72), ResImageBlock(288)),
               get_items=get_image_files,
               get_x=noop, get_y=noop,
               batch_tfms=Normalize.from_stats(0.5*torch.ones(3), 0.5*torch.ones(3)))
dls = db.dataloaders(path, bs=4, num_workers=4)
dls.show_batch()
b = dls.one_batch()

superres_learner[source]

superres_learner(dls, superres_factor=4, cbs=None, loss_func=None, opt_func=Adam, lr=0.001, splitter=trainable_params, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95))

learn = superres_learner(dls)
learn.fit(16, lr=1e-3, wd=0)
learn.show_results()

Super-Resolution 8x

db = DataBlock(blocks=(ResImageBlock(36), ResImageBlock(288)),
               get_items=get_image_files,
               get_x=noop, get_y=noop,
               batch_tfms=Normalize.from_stats(0.5*torch.ones(3), 0.5*torch.ones(3)))
dls = db.dataloaders(path, bs=4, num_workers=4)
dls.show_batch()
b = dls.one_batch()
learn = superres_learner(dls, superres_factor=8)
learn.fit(16, lr=1e-3, wd=0)
learn.show_results()