百度网盘AI大赛 Intelligent Watermark Removal Track: 19th-Place Solution

P粉084495128 · 2025-07-30
Summary: This project targets the 百度网盘AI大赛 intelligent watermark removal track with an improved UNet: leaky_relu to preserve pixel information, an attention-weighted dual-branch pathway to increase model capacity, and a residual connection to speed up convergence. The data is processed and split into training and validation sets, the model is trained with PSNR and SSIM losses, the A-leaderboard score improves accordingly, and finally the submission file and predictions are produced.



百度网盘AI大赛 Image Processing Challenge: Intelligent Watermark Removal

This project uses a lightly modified baseline UNet to remove watermarks from images for the 百度网盘AI大赛 Image Processing Challenge: Intelligent Watermark Removal track.

Competition link

1. Competition Overview

Watermarked images are common in daily life, and even a Photoshop expert can hardly remove a watermark quickly and without leaving traces. An intelligent watermark-removal algorithm, by contrast, can strip watermarks from images automatically and quickly. Participants train deep-learning models that process watermarked images captured in real-world scenarios and output the cleaned result images.

The competition encourages participants to combine state-of-the-art image processing and computer vision techniques to improve training performance and generalization, while also paying attention to runtime performance in real applications: the model should be as small and fast as possible.

Evaluation criteria

  • The evaluation metrics are PSNR and MSSSIM;
  • The evaluation machine provides only two framework runtimes, paddlepaddle and onnxruntime; models from other frameworks must be converted to one of these two;
  • Machine configuration: V100 GPU, 15 GB of GPU memory, 10 GB of RAM;
  • If a single image takes more than 1.2 s, the performance score in the final round is recorded as 0.

Consequently, the model should not be made too large.
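
Given the 1.2 s budget, it is worth timing inference before committing to a bigger network. Below is a minimal latency-check sketch; it assumes the UNet class defined in Section 4 is in scope, and the 512×512 input matches this project's training resolution:

import time
import numpy as np
import paddle

model = UNet()   # the UNet from Section 4; untrained weights are fine for timing
model.eval()
x = paddle.rand([1, 3, 512, 512], dtype='float32')

with paddle.no_grad():
    model(x)     # warm-up pass: the first call pays one-off setup costs
    times = []
    for _ in range(10):
        start = time.time()
        model(x)
        times.append(time.time() - start)

print("mean latency: {:.3f}s (budget: 1.2s per image)".format(np.mean(times)))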

2. Method

Building on the baseline, this project keeps the UNet network and performs a pixel-level transformation of the watermarked image. Compared with the baseline we make four modifications: (1) we use leaky_relu instead of relu, to preserve as much pixel information as possible; (2) in the transition layer between the Encoder and the Decoder, we replace the original single branch with two branches fused by attention weights, feeding the adaptively weighted result into the Decoder to increase model capacity; (3) likewise, the watermark output uses a dual-branch head weighted by attention values, improving per-sample adaptivity; (4) we add a residual connection between the network's input and output, so the network focuses on learning the difference between the clean and watermarked images, which speeds up convergence. With training on a single data archive, these changes raised our A-leaderboard score from 0.58307 to 0.61742. We then trained with 5× more data; as with the small dataset, the model converged quickly, but the score only rose to 0.61805. Presumably the model's capacity is too small, and resizing to 512×512 for training and prediction discards a lot of image information, but there was no time left for further tuning.
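
To make modifications (2) and (3) concrete before the full network definition in Section 4, here is a minimal sketch of the attention-weighted dual-branch fusion pattern. The DualBranchFusion name and the single-layer attention are illustrative only, not the exact CALayer used below:

import paddle
from paddle import nn

class DualBranchFusion(nn.Layer):
    """Sketch of modifications (2)/(3): two parallel convs fused by channel attention."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.branch1 = nn.Conv2D(c_in, c_out, 1)             # 1x1 path
        self.branch2 = nn.Conv2D(c_in, c_out, 3, padding=1)  # 3x3 path
        self.attn = nn.Sequential(                           # squeeze-style channel attention
            nn.AdaptiveAvgPool2D(1),
            nn.Conv2D(c_out, c_out, 1),
            nn.Sigmoid())

    def forward(self, x):
        b1, b2 = self.branch1(x), self.branch2(x)
        a = self.attn(b1 + b2)          # per-channel weight in (0, 1)
        return b1 * a + b2 * (1. - a)   # adaptive weighted sum

fuse = DualBranchFusion(512, 1024)
y = fuse(paddle.rand([1, 512, 32, 32]))  # -> [1, 1024, 32, 32]

# Modification (4) lives in the top-level forward: predict the watermark layer x
# and return inputs - x, so the network only learns the input/output difference.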

3. Data Processing

Unpacking the data

Four data archives are used here; in fact a single one should be enough, or watermarked data can be generated automatically following the competition's learning materials.
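
If you prefer synthesizing pairs over downloading all the archives, here is a minimal sketch of automatic watermark generation. The text, tiling, opacity, and the example file name are arbitrary illustrative choices, not the competition's generator:

import cv2
import numpy as np

def add_watermark(bg, text="watermark", alpha=0.35):
    """Overlay semi-transparent tiled text on a clean image -> (watermarked, clean) pair."""
    overlay = bg.copy()
    h, w = bg.shape[:2]
    for y in range(60, h, 120):      # tile the text across the image
        for x in range(0, w, 240):
            cv2.putText(overlay, text, (x, y), cv2.FONT_HERSHEY_SIMPLEX,
                        1.0, (128, 128, 128), 2, cv2.LINE_AA)
    return cv2.addWeighted(overlay, alpha, bg, 1 - alpha, 0)

bg = cv2.imread("data/train/mask/bg_images/bg_image_00001.jpg")  # hypothetical file name
wm = add_watermark(bg)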

In [1]
%cd data
!mkdir train
%cd train/
!mkdir image
!mkdir mask
%cd ../../
/home/aistudio/data
/home/aistudio/data/train
/home/aistudio
In [2]
! tar -xf data/data142446/watermark_datasets.part1.tar
!rm data/data142446/watermark_datasets.part1.tar
!cp -r watermark_datasets.part1 -d data/train/image
!rm -r watermark_datasets.part1/
! tar -xf data/data142446/watermark_datasets.part2.tar
!rm data/data142446/watermark_datasets.part2.tar
!cp -r watermark_datasets.part2 -d data/train/image
!rm -r watermark_datasets.part2/
! tar -xf data/data142446/watermark_datasets.part3.tar
!rm data/data142446/watermark_datasets.part3.tar
!cp -r watermark_datasets.part3 -d data/train/image
!rm -r watermark_datasets.part3/
! tar -xf data/data142446/watermark_datasets.part10.tar
!rm data/data142446/watermark_datasets.part10.tar
!cp -r watermark_datasets.part10 -d data/train/image
!rm -r watermark_datasets.part10/
! tar -xf data/data142446/bg_images.tar
!rm data/data142446/bg_images.tar
!cp -r bg_images -d data/train/mask/
!rm -r bg_images/

Building the data reader

The reader is built on paddle.io.Dataset to make data loading convenient.

Preprocessing consists of:

  1. Resize both the watermarked and the clean images to shape (3, 512, 512)
  2. Normalize the images
In [1]
# Split into training and validation sets
import os

watermark_dir = "data/train/image"
bg_dir = "data/train/mask/bg_images"
watermark_sub_dir = list(os.listdir(watermark_dir))

all_watermark_list = []
for sub_dir in watermark_sub_dir:
    images_path = list(os.listdir(os.path.join(watermark_dir, sub_dir)))
    for path in images_path:
        if 'jpg' in path:
            all_watermark_list.append(os.path.join(sub_dir, path))

all_gt_list = list(os.listdir(bg_dir))

train_ratio = 0.985
all_watermark_list = sorted(all_watermark_list)

train_len = int(train_ratio * len(all_watermark_list))
train_data_list = all_watermark_list[:train_len]
val_data_list = all_watermark_list[train_len:]
print("total data num: {}, train num: {}, val_num: {}".format(
    len(all_watermark_list), len(train_data_list), len(val_data_list)))
total data num: 405020, train num: 398944, val_num: 6076
In [2]
import os

import cv2
import numpy as np
import paddle


class MyDateset(paddle.io.Dataset):
    def __init__(self, mode='train', train_list=None, watermark_dir=None, bg_dir=None, data_transform=None):
        super(MyDateset, self).__init__()

        self.mode = mode
        self.watermark_dir = watermark_dir
        self.bg_dir = bg_dir
        self.data_transform = data_transform

        self.train_list = train_list
        print(len(self.train_list))

    def __getitem__(self, index):
        item = self.train_list[index]

        # The clean background image shares the first 14 characters of the file name
        bg_item = item.split('/')[1][:14] + '.jpg'

        img = cv2.imread(os.path.join(self.watermark_dir, item))
        label = cv2.imread(os.path.join(self.bg_dir, bg_item))

        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        label = cv2.cvtColor(label, cv2.COLOR_BGR2RGB)

        img = paddle.vision.transforms.resize(img, (512, 512), interpolation='bilinear')
        label = paddle.vision.transforms.resize(label, (512, 512), interpolation='bilinear')

        img = img.transpose((2, 0, 1))
        label = label.transpose((2, 0, 1))

        img = img / 255
        label = label / 255

        img = paddle.to_tensor(img).astype('float32')
        label = paddle.to_tensor(label).astype('float32')
        return img, label

    def __len__(self):
        return len(self.train_list)

4. Network Architecture

We apply the four modifications to the baseline UNet described in Section 2: (1) leaky_relu instead of relu; (2) an attention-weighted dual-branch transition layer between the Encoder and the Decoder; (3) an attention-weighted dual-branch output head; (4) a residual connection between the network's input and output. On a single archive of training data, these changes raised the A-leaderboard score from 0.58307 to 0.61742.

In [3]
import paddle
from paddle import nn
import paddle.nn.functional as F


class CALayer(nn.Layer):
    """Channel attention: squeeze to one weight per channel in (0, 1)."""
    def __init__(self, channels, reduction=16):
        super(CALayer, self).__init__()
        mid_c = max(channels // reduction, 16)
        self.conv1 = nn.Sequential(
            nn.Conv2D(channels, mid_c, 1),
            nn.ReLU(),
            nn.Conv2D(mid_c, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        y = x.mean(axis=(-1, -2), keepdim=True)  # global average pooling
        y = self.conv1(y)
        return y


class Encoder(nn.Layer):
    # Downsampling block: two convolutions, two batch norms, then pooling.
    def __init__(self, num_channels, num_filters):
        super(Encoder, self).__init__()
        self.conv1 = nn.Conv2D(in_channels=num_channels,
                               out_channels=num_filters,
                               kernel_size=3,  # 3x3 kernel, stride 1, padding 1: spatial size [H, W] unchanged
                               stride=1,
                               padding=1,
                               bias_attr=False)
        self.bn1 = nn.BatchNorm(num_filters)  # normalization; the activation is applied in forward()

        self.conv2 = nn.Conv2D(in_channels=num_filters,
                               out_channels=num_filters,
                               kernel_size=3,
                               stride=1,
                               padding=1,
                               bias_attr=False)
        self.bn2 = nn.BatchNorm(num_filters)

        self.pool = nn.MaxPool2D(kernel_size=2, stride=2, padding="SAME")  # pooling halves the spatial size [H/2, W/2]

        # 1x1 projection so the block-level residual matches in channels
        if num_channels != num_filters:
            self.downsample = nn.Sequential(
                nn.Conv2D(num_channels, num_filters, 1, bias_attr=False),
                nn.BatchNorm2D(num_filters)
            )
        else:
            self.downsample = lambda x: x

    def forward(self, inputs):
        x = self.conv1(inputs)
        x = self.bn1(x)
        x = F.leaky_relu(x, 0.2)
        x = self.conv2(x)
        x = self.bn2(x)
        x = F.leaky_relu(x + self.downsample(inputs), 0.2)
        x_conv = x             # output 1: skip connection to the decoder
        x_pool = self.pool(x)  # output 2: pooled features to the next encoder
        return x_conv, x_pool


class Decoder(nn.Layer):
    # Upsampling block: one transposed convolution, two convolutions, two batch norms.
    def __init__(self, num_channels, num_filters):
        super(Decoder, self).__init__()
        self.up = nn.Conv2DTranspose(in_channels=num_channels,
                                     out_channels=num_filters,
                                     kernel_size=2,
                                     stride=2,
                                     padding=0,
                                     bias_attr=False)  # doubles the spatial size [2H, 2W]
        self.up_bn = nn.BatchNorm(num_filters)

        self.conv1 = nn.Conv2D(in_channels=num_filters * 2,
                               out_channels=num_filters,
                               kernel_size=3,
                               stride=1,
                               padding=1,
                               bias_attr=False)
        self.bn1 = nn.BatchNorm(num_filters)

        self.conv2 = nn.Conv2D(in_channels=num_filters,
                               out_channels=num_filters,
                               kernel_size=3,
                               stride=1,
                               padding=1,
                               bias_attr=False)
        self.bn2 = nn.BatchNorm(num_filters)

        if num_channels != num_filters:
            self.upsample = nn.Sequential(
                nn.Conv2D(num_filters * 2, num_filters, 1, bias_attr=False),
                nn.BatchNorm2D(num_filters)
            )
        else:
            # Bug fix: the original assigned self.downsample here while forward()
            # uses self.upsample; this branch is never taken in the UNet below.
            self.upsample = lambda x: x

    def forward(self, input_conv, input_pool):
        x = self.up_bn(self.up(input_pool))
        x = F.leaky_relu(x, 0.2)
        # Pad the upsampled feature map to the size of the saved encoder feature map
        # (Pad2D order is [left, right, top, bottom]; inputs are square here)
        h_diff = input_conv.shape[2] - x.shape[2]
        w_diff = input_conv.shape[3] - x.shape[3]
        pad = nn.Pad2D(padding=[w_diff // 2, w_diff - w_diff // 2, h_diff // 2, h_diff - h_diff // 2])
        x = pad(x)
        x = paddle.concat(x=[input_conv, x], axis=1)  # concat the skip connection: in_channels doubles
        x_sc = self.upsample(x)
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.leaky_relu(x, 0.2)
        x = self.conv2(x)
        x = self.bn2(x)
        x = F.leaky_relu(x + x_sc, 0.2)
        return x


class UNet(nn.Layer):
    def __init__(self, num_classes=3):
        super(UNet, self).__init__()
        self.down1 = Encoder(num_channels=3, num_filters=64)  # encoder path
        self.down2 = Encoder(num_channels=64, num_filters=128)
        self.down3 = Encoder(num_channels=128, num_filters=256)
        self.down4 = Encoder(num_channels=256, num_filters=512)

        # Bottleneck: two parallel branches (1x1 and 3x3) fused by channel attention
        self.mid_conv1 = nn.Sequential(
            nn.Conv2D(512, 1024, 1, bias_attr=False),
            nn.BatchNorm(1024),
            nn.LeakyReLU(0.2)
        )
        self.mid_conv2 = nn.Sequential(
            nn.Conv2D(512, 1024, 3, padding=1, bias_attr=False),
            nn.BatchNorm(1024),
            nn.LeakyReLU(0.2)
        )
        self.ca_layer1 = CALayer(1024, 32)
        self.mid_conv3 = nn.Sequential(
            nn.Conv2D(1024, 1024, 1, bias_attr=False),
            nn.BatchNorm(1024),
            nn.LeakyReLU(0.2)
        )

        self.up4 = Decoder(1024, 512)  # decoder path
        self.up3 = Decoder(512, 256)
        self.up2 = Decoder(256, 128)
        self.up1 = Decoder(128, 64)

        # Output head: 1x1 and 3x3 branches, again fused by channel attention
        self.last_conv1 = nn.Conv2D(64, num_classes, 1)
        self.last_conv2 = nn.Conv2D(64, num_classes, 3, padding=1)
        self.ca_layer2 = CALayer(num_classes)

    def forward(self, inputs):
        x1, x = self.down1(inputs)
        x2, x = self.down2(x)
        x3, x = self.down3(x)
        x4, x = self.down4(x)

        x_m1 = self.mid_conv1(x)
        x_m2 = self.mid_conv2(x)
        attn = self.ca_layer1(x_m1 + x_m2)
        x = x_m1 * attn + x_m2 * (1. - attn)  # attention-weighted fusion
        x = self.mid_conv3(x)

        x = self.up4(x4, x)
        x = self.up3(x3, x)
        x = self.up2(x2, x)
        x = self.up1(x1, x)

        out1 = self.last_conv1(x)
        out2 = self.last_conv2(x)
        attn = self.ca_layer2(out1 + out2)
        x = out1 * attn + out2 * (1. - attn)
        # Residual connection: the network predicts the watermark layer,
        # which is subtracted from the input
        return inputs - x

# Inspect the output shape of every layer:
# paddle.summary(UNet(), (1, 3, 600, 600))
In [ ]
net = UNet()
train_dataset = MyDateset(train_list=train_data_list, watermark_dir=watermark_dir, bg_dir=bg_dir)
train_loader = paddle.io.DataLoader(
    train_dataset,
    batch_size=1,
    shuffle=True,
    drop_last=False)

# Sanity check: run a single batch through the network
for data in train_loader:
    img, label = data
    pred = net(img)
    break

5. Loss Functions

In the same take-what-works spirit, the MSSSIM code below is copied from the project 图像评价指标PSNR、SSIM以及MS-SSIM.

~It doesn't matter whether you can follow all the code; what matters is the text: someone has already written a ready-made loss function you can call directly~

MSSSIM alone is not enough, of course; a PSNR loss can also be written, following 通过Sub-Pixel实现图像超分辨率 (a Sub-Pixel super-resolution project).
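
For reference, the PSNR of images scaled to [0, 1] is

PSNR = 20 · log10(1 / RMSE), where RMSE = sqrt(mean((input − label)²)),

so the PSNRLoss below returns 100 − PSNR: minimizing it maximizes PSNR, and the constant 100 merely keeps the loss positive so it can be averaged with the SSIM term.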

In [5]
import paddle
import paddle.nn.functional as F


def gaussian1d(window_size, sigma):
    # e.g. window_size = 11 -> a length-11 Gaussian kernel
    x = paddle.arange(window_size, dtype='float32')
    x = x - window_size // 2
    gauss = paddle.exp(-x ** 2 / float(2 * sigma ** 2))
    return gauss / gauss.sum()


def create_window(window_size, sigma, channel):
    _1D_window = gaussian1d(window_size, sigma).unsqueeze(1)
    _2D_window = _1D_window.mm(_1D_window.t()).unsqueeze(0).unsqueeze(0)
    return _2D_window.expand([channel, 1, window_size, window_size])


def _ssim(img1, img2, window, window_size, channel=3, data_range=255., size_average=True, C=None):
    padding = window_size // 2

    # Local means via depthwise Gaussian filtering
    mu1 = F.conv2d(img1, window, padding=padding, groups=channel)
    mu2 = F.conv2d(img2, window, padding=padding, groups=channel)

    mu1_sq = mu1.pow(2)
    mu2_sq = mu2.pow(2)
    mu1_mu2 = mu1 * mu2
    # Local variances and covariance
    sigma1_sq = F.conv2d(img1 * img1, window, padding=padding, groups=channel) - mu1_sq
    sigma2_sq = F.conv2d(img2 * img2, window, padding=padding, groups=channel) - mu2_sq
    sigma12 = F.conv2d(img1 * img2, window, padding=padding, groups=channel) - mu1_mu2

    if C is None:
        C1 = (0.01 * data_range) ** 2
        C2 = (0.03 * data_range) ** 2
    else:
        C1 = (C[0] * data_range) ** 2
        C2 = (C[1] * data_range) ** 2

    # Contrast-structure term, then the full SSIM map (luminance * cs)
    sc = (2 * sigma12 + C2) / (sigma1_sq + sigma2_sq + C2)
    lsc = ((2 * mu1_mu2 + C1) / (mu1_sq + mu2_sq + C1)) * sc

    if size_average:
        # mean over every value in the tensor
        return lsc.mean()
    else:
        # per-channel values: (batch, channel)
        return lsc.flatten(2).mean(-1), sc.flatten(2).mean(-1)


def ms_ssim(img1, img2, window, data_range=255, size_average=True, window_size=11,
            channel=3, sigma=1.5, weights=None, C=(0.01, 0.03)):
    r"""Interface of ms-ssim.
    Args:
        img1 (Tensor): a batch of images, (N,C,H,W)
        img2 (Tensor): a batch of images, (N,C,H,W)
        data_range (float or int, optional): value range of input images (usually 1.0 or 255)
        size_average (bool, optional): if True, the ssim of all images is averaged into a scalar
        window_size (int, optional): size of the Gauss kernel
        sigma (float, optional): sigma of the normal distribution
        window (Tensor, optional): Gauss kernel; if None, a new kernel is created from window_size and sigma
        weights (list, optional): weights for the different scales
        C (list or tuple, optional): scalar constants (K1, K2); try a larger K2 (e.g. 0.4) on negative or NaN results
    Returns:
        Tensor: ms-ssim results
    """
    if not img1.shape == img2.shape:
        raise ValueError("Input images should have the same dimensions.")
    if not img1.dtype == img2.dtype:
        raise ValueError("Input images should have the same dtype.")

    if len(img1.shape) == 4:
        avg_pool = F.avg_pool2d
    elif len(img1.shape) == 5:
        avg_pool = F.avg_pool3d
    else:
        raise ValueError(f"Input images should be 4-d or 5-d tensors, but got {img1.shape}")

    smaller_side = min(img1.shape[-2:])
    assert smaller_side > (window_size - 1) * (2 ** 4), \
        "Image size should be larger than %d due to the 4 downsamplings " \
        "with window_size %d in ms-ssim" % ((window_size - 1) * (2 ** 4), window_size)

    if weights is None:
        weights = [0.0448, 0.2856, 0.3001, 0.2363, 0.1333]
    weights = paddle.to_tensor(weights)

    if window is None:
        window = create_window(window_size, sigma, channel)
    assert window.shape == [channel, 1, window_size, window_size], "window.shape error"

    levels = weights.shape[0]  # 5
    mcs = []
    for i in range(levels):
        ssim_per_channel, cs = _ssim(img1, img2, window=window, window_size=window_size,
                                     channel=channel, data_range=data_range, C=C, size_average=False)
        if i < levels - 1:
            mcs.append(F.relu(cs))
            padding = [s % 2 for s in img1.shape[2:]]
            img1 = avg_pool(img1, kernel_size=2, padding=padding)
            img2 = avg_pool(img2, kernel_size=2, padding=padding)

    ssim_per_channel = F.relu(ssim_per_channel)  # (batch, channel)
    mcs_and_ssim = paddle.stack(mcs + [ssim_per_channel], axis=0)  # (level, batch, channel)
    ms_ssim_val = paddle.prod(mcs_and_ssim ** weights.reshape([-1, 1, 1]), axis=0)  # product over levels
    # print(ms_ssim_val.shape)  # debug

    if size_average:
        return ms_ssim_val.mean()
    else:
        # per-image, per-channel values (the original flatten(2) here would fail on a 2-D tensor)
        return ms_ssim_val


class SSIMLoss(paddle.nn.Layer):
    """SSIM loss, 1 - SSIM(input, label); inherits paddle.nn.Layer."""
    def __init__(self, window_size=11, channel=3, data_range=255., sigma=1.5):
        super(SSIMLoss, self).__init__()
        self.data_range = data_range
        self.C = [0.01, 0.03]
        self.window_size = window_size
        self.channel = channel
        self.sigma = sigma
        self.window = create_window(self.window_size, self.sigma, self.channel)

    def forward(self, input, label):
        # input: model output; label: ground truth. Returns a scalar loss.
        return 1 - _ssim(input, label, data_range=self.data_range,
                         window=self.window, window_size=self.window_size, channel=self.channel,
                         size_average=True, C=self.C)


class MS_SSIMLoss(paddle.nn.Layer):
    """MS-SSIM loss, 1 - MS-SSIM(input, label); inherits paddle.nn.Layer."""
    def __init__(self, data_range=255., channel=3, window_size=11, sigma=1.5):
        super(MS_SSIMLoss, self).__init__()
        self.data_range = data_range
        self.C = [0.01, 0.03]
        self.window_size = window_size
        self.channel = channel
        self.sigma = sigma
        self.window = create_window(self.window_size, self.sigma, self.channel)

    def forward(self, input, label):
        return 1 - ms_ssim(input, label, data_range=self.data_range,
                           window=self.window, window_size=self.window_size, channel=self.channel,
                           size_average=True, sigma=self.sigma,
                           weights=None, C=self.C)


class PSNRLoss(paddle.nn.Layer):
    """100 - PSNR, so minimizing the loss maximizes PSNR."""
    def __init__(self):
        super(PSNRLoss, self).__init__()

    def forward(self, input, label):
        # 20*log10(1/RMSE) is the PSNR of images scaled to [0, 1]
        return 100 - 20 * paddle.log10(((input - label) ** 2).mean(axis=[1, 2, 3]) ** -0.5)

6. Training

In [6]
import random

def seed_paddle(seed=1024):
    seed = int(seed)
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    paddle.seed(seed)
In [7]
class AverageMeter(object):
    """Computes and stores the average and current value"""
    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


def evaluate(val_loader, model, criterion, print_interval=100):
    losses = AverageMeter()
    psnr = AverageMeter()
    ssim = AverageMeter()
    batch_time = AverageMeter()
    lossfn, losspsnr = criterion

    for step, data in enumerate(val_loader):
        img, label = data
        end = time.time()
        pre = model(img)
        batch_time.update(time.time() - end)
        loss1 = lossfn(pre, label).mean()
        loss2 = losspsnr(pre, label).mean()
        loss = (loss1 + loss2 / 100) / 2

        losses.update(loss.item(), img.shape[0])
        psnr.update(100. - loss2.item(), img.shape[0])
        ssim.update(1. - loss1.item(), img.shape[0])

        if step % print_interval == 0:
            print('Test: [{0}/{1}]\t'
                  'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
                  'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
                  'SSIM {ssim.val:.3f} ({ssim.avg:.3f})\t'
                  'PSNR {psnr.val:.3f} ({psnr.avg:.3f})'.format(
                      step, len(val_loader),
                      batch_time=batch_time,
                      loss=losses,
                      ssim=ssim,
                      psnr=psnr))

    print(' * SSIM {ssim.avg:.3f} PSNR {psnr.avg:.3f} Time {batch_time.avg:.3f}'
          .format(ssim=ssim, psnr=psnr, batch_time=batch_time))
    return losses.avg, ssim.avg, psnr.avg
In [8]
batch_size = 8
max_epoch = 10
init_lr = 0.005
print_interval = 100
val_interval = 1
save_dir = "models/output"
save_interval = 1
save_interval_s = 8000
start_epoch = 0
start_step = 0
init_loss = 999
In [12]
import time
from visualdl import LogWriter

model = UNet()
model.train()

train_dataset = MyDateset(train_list=train_data_list, watermark_dir=watermark_dir, bg_dir=bg_dir)
val_dataset = MyDateset(train_list=val_data_list, watermark_dir=watermark_dir, bg_dir=bg_dir)

# To resume training from an earlier run, set start_epoch > 0
if start_epoch > 0:
    # param_dict = paddle.load(os.path.join(save_dir, 'model_step_{}.pdparams'.format(str(start_step))))
    param_dict = paddle.load(os.path.join(save_dir, 'model_{}.pdparams'.format(str(start_epoch - 1))))
    model.set_state_dict(param_dict)

train_loader = paddle.io.DataLoader(
    train_dataset,
    batch_size=batch_size,
    shuffle=True,
    drop_last=False)

val_loader = paddle.io.DataLoader(
    val_dataset,
    batch_size=batch_size,
    shuffle=False,
    drop_last=False)

losspsnr = PSNRLoss()
lossfn = SSIMLoss(window_size=3, data_range=1)

scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=init_lr, T_max=max_epoch)
opt = paddle.optimizer.Adam(learning_rate=scheduler, parameters=model.parameters())

if start_epoch > 0:
    # param_dict = paddle.load(os.path.join(save_dir, 'opt_step_{}.pdopt'.format(str(start_step))))
    param_dict = paddle.load(os.path.join(save_dir, 'opt_{}.pdopt'.format(str(start_epoch - 1))))
    opt.set_state_dict(param_dict)

writer = LogWriter(os.path.join(save_dir, "logs"))
398944
6076
In [ ]
now_step = start_epoch * len(train_loader)
min_loss = init_loss

for epoch in range(start_epoch, max_epoch):
    losses = AverageMeter()
    psnr = AverageMeter()
    ssim = AverageMeter()
    batch_time = AverageMeter()
    data_time = AverageMeter()
    end = time.time()

    for step, data in enumerate(train_loader):
        # When resuming mid-epoch, skip the remainder of the interrupted epoch
        if epoch == start_epoch and step >= ((start_epoch + 1) * len(train_loader) - now_step):
            # print(step + start_epoch * len(train_loader))
            break

        img, label = data
        data_time.update(time.time() - end)
        pre = model(img)
        loss1 = lossfn(pre, label).mean()
        loss2 = losspsnr(pre, label).mean()
        loss = (loss1 + loss2 / 100) / 2

        loss.backward()
        opt.step()
        opt.clear_gradients()

        losses.update(loss.item(), img.shape[0])
        psnr.update(100. - loss2.item(), img.shape[0])
        ssim.update(1. - loss1.item(), img.shape[0])
        batch_time.update(time.time() - end)

        if now_step % print_interval == 0:
            writer.add_scalar('train/loss', losses.val, now_step)
            writer.add_scalar('train/ssim', ssim.val, now_step)
            writer.add_scalar('train/psnr', psnr.val, now_step)
            print('Epoch: [{0}][{1}/{2}]\t'
                  'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
                  'Data {data_time.val:.3f} ({data_time.avg:.3f})\t'
                  'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
                  'SSIM {ssim.val:.3f} ({ssim.avg:.3f})\t'
                  'PSNR {psnr.val:.3f} ({psnr.avg:.3f})'.format(
                      epoch, step, len(train_loader),
                      batch_time=batch_time,
                      data_time=data_time,
                      loss=losses,
                      ssim=ssim,
                      psnr=psnr))

        if now_step % save_interval_s == 0:
            paddle.save(model.state_dict(), os.path.join(save_dir, 'model_step_{}.pdparams'.format(str(now_step))))
            paddle.save(opt.state_dict(), os.path.join(save_dir, 'opt_step_{}.pdopt'.format(str(now_step))))
        now_step += 1
        end = time.time()

    writer.add_scalar('train/lr', opt.get_lr(), epoch)
    scheduler.step()

    if epoch % val_interval == 0:
        with paddle.no_grad():
            model.eval()
            val_loss, val_ssim, val_psnr = evaluate(val_loader, model, criterion=(lossfn, losspsnr), print_interval=print_interval)
            model.train()
            writer.add_scalar('val/loss', val_loss, epoch)
            writer.add_scalar('val/ssim', val_ssim, epoch)
            writer.add_scalar('val/psnr', val_psnr, epoch)
        if val_loss < min_loss:
            min_loss = val_loss
            paddle.save(model.state_dict(), os.path.join(save_dir, 'model_best.pdparams'))

    if epoch % save_interval == 0:
        paddle.save(model.state_dict(), os.path.join(save_dir, 'model_{}.pdparams'.format(str(epoch))))
        paddle.save(opt.state_dict(), os.path.join(save_dir, 'opt_{}.pdopt'.format(str(epoch))))

7. Prediction and Submission

For this task you must submit the model together with a prediction script. predict.py has to read the model from its own directory, predict the de-watermarked images, and save them.

To plug in your own trained model, just replace the model in predict.py and the "do something" placeholder in its process function with your own code.

The baseline notes: the raw UNet output may not be ideal, and not every problem has to be solved by tweaking the network. In cases like the one below, clamping values within a threshold to pure black (the text color) or pure white (the background color) can make the result better match what the eye expects. predict.py already includes this strategy via the following lines:

pre[pre>0.9]=1
pre[pre<0.1]=0

We removed this strategy, however: on the A leaderboard, dropping it raised our score from 0.61805 to 0.61919.
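
predict.py itself is not reproduced in this post. As a reference, here is a minimal sketch of what such a script has to do under the competition's contract (read images from the first argument, write results to the second); the resize-back step and the BGR/RGB handling are assumptions based on how the model was trained here, not the exact script that was submitted:

# Hypothetical predict.py sketch
import os
import sys

import cv2
import numpy as np
import paddle

# The UNet definition from Section 4 must be included in (or importable by) predict.py.

def main(src_dir, dst_dir):
    model = UNet()
    model.set_state_dict(paddle.load('model_best.pdparams'))  # model file sits next to the script
    model.eval()
    os.makedirs(dst_dir, exist_ok=True)

    for name in os.listdir(src_dir):
        img = cv2.imread(os.path.join(src_dir, name))
        h, w = img.shape[:2]
        x = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)               # training used RGB
        x = cv2.resize(x, (512, 512), interpolation=cv2.INTER_LINEAR)
        x = paddle.to_tensor(x.transpose(2, 0, 1)[None] / 255.).astype('float32')
        with paddle.no_grad():
            pre = model(x).numpy()[0].transpose(1, 2, 0)
        pre = (np.clip(pre, 0., 1.) * 255).astype(np.uint8)
        pre = cv2.resize(pre, (w, h), interpolation=cv2.INTER_LINEAR)  # restore the original size
        cv2.imwrite(os.path.join(dst_dir, name), cv2.cvtColor(pre, cv2.COLOR_RGB2BGR))

if __name__ == '__main__':
    main(sys.argv[1], sys.argv[2])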

In [1]
# Zip up the submission files
! zip submit_removal.zip model_best.pdparams predict.py
  adding: model_best.pdparams (deflated 7%)
  adding: predict.py (deflated 72%)

Inspecting the predictions (optional, very time-consuming)

Curious what the images actually look like after your trained network removes the watermarks? Just download test set A and see for yourself~

Download the test set

In [ ]
! wget https://staticsns.cdn.bcebos.com/amis/2022-4/1649745356784/watermark_test_datasets.zip
! unzip -oq watermark_test_datasets.zip
! rm -rf watermark_test_datasets.zip

Predict on the test set

In [ ]
! python predict.py watermark_test_datasets/images results

When prediction finishes, open the results folder to see the de-watermarked images~

Take bg_image_00005_0002.jpg as an example:

[Figure: bg_image_00005_0002.jpg with watermark (left) vs. without watermark (right)]
