CVPR 2022 NAS Competition Track 2: First-Place Technical Solution

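This article presents our solution to CVPR 2022 NAS Track 2, which focuses on predicting architecture performance from a small number of training samples. Preprocessing covers converting the depth encoding, normalization, and a Sigmoid-based transform of the rank targets. In model selection, gradient-boosting algorithms performed best and reached 0.78 after tuning. Multi-task learning did not help. We then ensembled GBRT and other models, stacked them with GPNAS as the final estimator, and reached a score of 0.7991 after further tuning.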

Our 2022 CVPR Track2 Solution

Accurately predicting the performance of an architecture from a small training sample is an important but difficult task. The core problem is how to extract as much information as possible from the data while avoiding overfitting. When several tasks are involved, we should also consider whether their correlation can be exploited.

In this track, the super-network defines a search space based on ViT-Base. The search space covers depth, num_heads, mlp_ratio and embed_dim.

Preprocessing the data

The simplest approach is to convert the structure encoding to floats, train on it directly, and predict. The problem with this approach is singularity, and it ignores background knowledge about the encoding. We therefore preprocess the data with the following steps.

Convert the depth encoding (j, k, l) to the integers (1, 2, 3). We use ordinal encoding rather than one-hot encoding because we assume depth correlates with the quantity being predicted, and our experiments confirmed this.

If the actual depth of a sub-network is less than 12, the tail of its encoding is padded with 0. We change 0 to 2 because the valid inputs are 1, 2, 3 and 2 represents neutral information. After this transformation, the correlation between the depth feature and the other features drops considerably, which helps the later fitting. We then normalize the data by mapping (1, 2, 3) to (-1, 0, 1).

Transform the targets with the Sigmoid function and its inverse. Ranks follow a uniform distribution. Before training we apply the inverse Sigmoid (logit), written as log((rank+1)/(500-rank)) in the data-loading code, which maps the uniform ranks to a bell-shaped, approximately Gaussian distribution that suits most models better. After prediction, we apply the Sigmoid function to map the output back to the uniform rank scale, then round to the nearest integer.

We also tried many other ideas, such as different activation functions and adding Gaussian noise. None of them gave a clear improvement across the different tasks.
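The rank transform described above can be sketched as follows. This is a minimal illustration, assuming ranks run from 0 to n-1 with n = 500 as in the training code; the helper names `rank_to_logit` and `logit_to_rank` are hypothetical and do not appear in the original notebook.

```python
import numpy as np

def rank_to_logit(rank, n):
    """Inverse Sigmoid (logit): map ranks 0..n-1 to an unbounded, bell-shaped target."""
    rank = np.asarray(rank, dtype=float)
    return np.log((rank + 1.0) / (n - rank))

def logit_to_rank(z, n):
    """Sigmoid: map model outputs back to the rank scale, then round to the nearest integer."""
    z = np.asarray(z, dtype=float)
    return np.round((n - 1) / (1.0 + np.exp(-z)))

n = 500                          # assumed number of ranked architectures
ranks = np.arange(n)
z = rank_to_logit(ranks, n)      # symmetric around 0, increasing in rank
recovered = logit_to_rank(z, n)  # back on the rank scale

print("largest round-trip shift:", np.max(np.abs(recovered - ranks)))
```

Because the forward and backward maps use slightly different scalings, the round trip is monotone (non-decreasing) but can shift a rank by at most one position.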

Model selection

What we did and how we picked our model:

GPNAS only (the baseline, or modifications of it): the final score is about 0.67.

Other machine learning models, such as linear regression, random forest, kernel ridge regression, tree models, XGBoost and Gradient Boosting. Gradient Boosting and other boosting-based algorithms gave the best average result. Our first attempt scored about 0.73; after tuning the parameters, we reached 0.78.

Since the problem is multi-task, could we train the different tasks at the same time? We considered Multivariate Gradient Boosting based on muliMSE. However, the result was worse than with univariate gradient boosting. We think the reason is that different tasks need different hyperparameters, and that we could not reduce the noise effectively in Multivariate Gradient Boosting.

Because our training sample is very small, we ensemble our models to avoid overfitting. We chose GBRT, HISTGB, CATGB and XGB as sub-models. The obvious approach is to average all their predictions, which improved the score to about 0.79. We then tried to improve on naive averaging.
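Naive averaging can be sketched with scikit-learn's `VotingRegressor`. The example below is only an illustration under stated assumptions: the data is synthetic (not the competition data), only two gradient boosting members are used instead of the four listed above, and the hyperparameters are placeholders rather than the tuned values.

```python
import numpy as np
from scipy.stats import kendalltau
from sklearn.ensemble import GradientBoostingRegressor, VotingRegressor

# Synthetic stand-in for the encoded architectures: 25 features in {-1, 0, 1}.
rng = np.random.default_rng(0)
X = rng.integers(-1, 2, size=(300, 25)).astype(float)
y = X @ rng.normal(size=25) + 0.5 * rng.normal(size=300)
X_train, X_test, y_train, y_test = X[:200], X[200:], y[:200], y[200:]

# Two boosting members with different loss functions and depths (placeholder settings).
members = [
    ("gbrt_huber", GradientBoostingRegressor(loss="huber", max_depth=1, n_estimators=200,
                                             learning_rate=0.1, subsample=0.8, random_state=1)),
    ("gbrt_default", GradientBoostingRegressor(max_depth=2, n_estimators=200,
                                               learning_rate=0.1, subsample=0.8, random_state=1)),
]

# VotingRegressor fits every member and averages their predictions with equal weights.
averaged = VotingRegressor(estimators=members).fit(X_train, y_train)
tau, _ = kendalltau(averaged.predict(X_test), y_test)
print("Kendall tau of the averaged ensemble:", round(tau, 3))
```

Kendall's tau is used here because the commented-out evaluation cell in the notebook scores predictions the same way.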

We tried bagging Gradient Boosting, which did not improve the result. We then stacked the group of gradient boosting models under a final estimator, for which we use GPNAS. Stacking exploits the strength of each individual estimator by using their outputs as the input of a final estimator. Since sklearn already provides this regressor, we wrapped GPNAS in a sklearn-compatible API. This gave a score of about 0.792.
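The stacking idea can be sketched with scikit-learn's `StackingRegressor`. This is a simplified illustration, assuming synthetic data and a plain `Ridge` model as a stand-in for the GPNAS final estimator; the base models and their settings are placeholders.

```python
import numpy as np
from scipy.stats import kendalltau
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import Ridge

# Synthetic stand-in for the encoded architectures and one transformed rank target.
rng = np.random.default_rng(0)
X = rng.integers(-1, 2, size=(300, 25)).astype(float)
y = X @ rng.normal(size=25) + 0.5 * rng.normal(size=300)
X_train, X_test, y_train, y_test = X[:200], X[200:], y[:200], y[200:]

base = [
    ("gbrt_huber", GradientBoostingRegressor(loss="huber", max_depth=1, n_estimators=100,
                                             learning_rate=0.1, random_state=1)),
    ("gbrt_default", GradientBoostingRegressor(max_depth=2, n_estimators=100,
                                               learning_rate=0.1, random_state=1)),
]

# The final estimator is trained on out-of-fold predictions of the base models (cv=5);
# passthrough=False means it sees only those predictions, not the raw features.
stack = StackingRegressor(estimators=base, final_estimator=Ridge(alpha=1.0),
                          cv=5, passthrough=False)
stack.fit(X_train, y_train)

meta_features = stack.transform(X_test)   # one column per base model
tau, _ = kendalltau(stack.predict(X_test), y_test)
print("meta-feature shape:", meta_features.shape, "Kendall tau:", round(tau, 3))
```

Training the final estimator on cross-validated predictions is what keeps it from simply rewarding the base model that overfits the training set most.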

Up to this point, the sub-models of the stack still shared the same hyperparameters, such as loss function, learning rate and depth. We then selected different loss functions, learning rates and depths for different models, on the assumption that different sub-models and tasks have very different characteristics. We also added two more sub-models, GBRT and CATGB with different loss functions. This round scored about 0.798.

We also modified and tuned our final estimator GPNAS, for example its ridge parameter, and we chose inv(X.T*X) instead of the identity matrix as the prior covariance matrix. The final score is 0.7991.
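The effect of the prior covariance can be illustrated with a small Bayesian linear regression sketch. It assumes an identity noise covariance instead of the kernel matrix used by GPNAS, a zero prior mean, and synthetic data; `posterior_mean` is a hypothetical helper that mirrors the structure of the `m_flag=2` update in `get_posterior_mean`, not the notebook's exact code.

```python
import numpy as np

def posterior_mean(X, y, w0, prior_cov, noise_cov=None, ridge=1e-7):
    """Posterior mean of w for y ~ N(Xw, noise_cov) with prior w ~ N(w0, prior_cov).

    w_post = (X^T K^-1 X + P^-1)^-1 (X^T K^-1 y + P^-1 w0)
    """
    n, d = X.shape
    if noise_cov is None:
        noise_cov = np.eye(n)                                 # assumption: identity noise
    K_inv = np.linalg.inv(noise_cov + ridge * np.eye(n))      # small ridge for stability
    P_inv = np.linalg.inv(prior_cov + ridge * np.eye(d))
    A = X.T @ K_inv @ X + P_inv
    b = X.T @ K_inv @ y + P_inv @ w0
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=60)
w0 = np.zeros(4)
hp_cov = 3.0                                                  # prior scale, as in hp_cov

w_identity = posterior_mean(X, y, w0, hp_cov * np.eye(4))                # identity prior
w_gprior = posterior_mean(X, y, w0, hp_cov * np.linalg.inv(X.T @ X))     # inv(X.T*X) prior
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]

print("identity prior :", np.round(w_identity, 3))
print("inv(X.T X) prior:", np.round(w_gprior, 3))
print("least squares  :", np.round(w_ols, 3))
```

Under these assumptions, a prior covariance proportional to inv(X.T*X) shrinks the least-squares solution by the same factor hp_cov/(1+hp_cov) in every direction, whereas an identity prior shrinks directions with little data support more strongly.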

In [12]

#----------------------------------------------------------------------------------------------------------
#   Install Packages
#----------------------------------------------------------------------------------------------------------
# If a persistence installation is required, 
# you need to use the persistence path as the following: 
#!mkdir /home/aistudio/external-libraries
#!pip install --upgrade sklearn -i https://mirrors.aliyun.com/pypi/simple/ -t /home/aistudio/externallibraries
#!conda install lightgbm 
#!conda install xgboost -i https://mirrors.aliyun.com/pypi/simple/
!pip install catboost -i https://mirrors.aliyun.com/pypi/simple/
#----------------------------------------------------------------------------------------------------------
#   Import Packages
#----------------------------------------------------------------------------------------------------------
import matplotlib.pyplot as plt
%matplotlib inline
import catboost
import lightgbm
import xgboost
import sklearn
from sklearn import ensemble
from sklearn.experimental import enable_hist_gradient_boosting
from sklearn.ensemble import * #HistGradientBoostingRegressor,StackingRegressor,BaggingRegressor,ExtraTreesRegressor
from sklearn.kernel_ridge import *
from sklearn.linear_model import * #LinearRegression
from sklearn.semi_supervised import *
from sklearn.svm import *
from sklearn.metrics import mean_squared_error
import numpy as np
import scipy
import copy
import json
from sklearn.model_selection import cross_val_score
from scipy.linalg import hankel
print(sklearn.__version__)

Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Requirement already satisfied: catboost in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.0.6)
Requirement already satisfied: graphviz in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (0.13)
Requirement already satisfied: plotly in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (5.8.0)
Requirement already satisfied: pandas>=0.24.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (1.1.5)
Requirement already satisfied: matplotlib in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (2.2.3)
Requirement already satisfied: six in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (1.16.0)
Requirement already satisfied: scipy in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (1.6.3)
Requirement already satisfied: numpy>=1.16.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from catboost) (1.20.3)
Requirement already satisfied: python-dateutil>=2.7.3 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pandas>=0.24.0->catboost) (2.8.2)
Requirement already satisfied: pytz>=2017.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pandas>=0.24.0->catboost) (2019.3)
Requirement already satisfied: cycler>=0.10 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->catboost) (0.10.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->catboost) (3.0.8)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->catboost) (1.1.0)
Requirement already satisfied: tenacity>=6.2.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from plotly->catboost) (8.0.1)
Requirement already satisfied: setuptools in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from kiwisolver>=1.0.1->matplotlib->catboost) (56.2.0)
WARNING: You are using pip version 22.0.4; however, version 22.1.1 is available.
You should consider upgrading via the '/opt/conda/envs/python35-paddle120-env/bin/python -m pip install --upgrade pip' command.
0.24.2

In [2]

#----------------------------------------------------------------------------------------------------------
#   Load training and testing data
#----------------------------------------------------------------------------------------------------------
def convert_X(arch_str):
    temp_arch = []
    for i,elm in enumerate(arch_str):
        if i in [3,6,9,12,15,18,21,24,27,30,33,36]: pass #Drop non-informative columns, all are 768
        elif elm == 'j': temp_arch.append(1-2) #Convert to a number, then center and normalize the data
        elif elm == 'k': temp_arch.append(2-2) #Convert to a number, then center and normalize the data
        elif elm == 'l': temp_arch.append(3-2) #Convert to a number, then center and normalize the data
        elif int(elm) == 0: temp_arch.append(2-2) #Treat 0 as 2 since it should carry neutral information (reduces correlation), then center and normalize
        else: temp_arch.append(int(elm)-2) #Center and normalize the data
    return(temp_arch)

with open('./data/data134077/CVPR_2022_NAS_Track2_train.json', 'r') as f:
    train_data = json.load(f)
with open('./data/data134077/CVPR_2022_NAS_Track2_test.json', 'r') as f:
    test_data = json.load(f)

test_arch_list = []
for key in test_data.keys():
    test_arch =  convert_X(test_data[key]['arch'])
    test_arch_list.append(test_arch)
bb = np.array(test_arch_list).T

train_list = [[],[],[],[],[],[],[],[]]
arch_list_train = []
name_list = ['cplfw_rank', 'market1501_rank', 'dukemtmc_rank', 'msmt17_rank', 'veri_rank', 'vehicleid_rank', 'veriwild_rank', 'sop_rank']
for key in train_data.keys():
    for idx, name in enumerate(name_list):
        train_list[idx].append(train_data[key][name])
    xx = train_data[key]['arch']
    arch_list_train.append(convert_X(train_data[key]['arch']))

Y_all0 = np.array(train_list)
Y_all = np.log((Y_all0+1)/(500-Y_all0)) #Transform rank data with the inverse Sigmoid (logit) function

In [3]

#----------------------------------------------------------------------------------------------------------
#   Correlation research
#----------------------------------------------------------------------------------------------------------
regcoef = np.round(np.array([LinearRegression().fit(np.array(arch_list_train), Y_all[i]).coef_ for i in range(8)]),2)
print('Task Corr: ')
print(np.round(np.corrcoef(regcoef),2)) #correlation between different tasks
print('Parameters Corr: ')
print(np.round(np.corrcoef(np.array(arch_list_train).T.astype('float')),2)) #correlation between different arch parameters; we can see it is very low

Task Corr: 
[[ 1.    0.18  0.12 -0.17 -0.29 -0.3  -0.3  -0.13]
 [ 0.18  1.    0.78  0.81  0.52  0.3   0.53  0.4 ]
 [ 0.12  0.78  1.    0.85  0.62  0.2   0.57  0.52]
 [-0.17  0.81  0.85  1.    0.8   0.42  0.82  0.66]
 [-0.29  0.52  0.62  0.8   1.    0.46  0.77  0.48]
 [-0.3   0.3   0.2   0.42  0.46  1.    0.52  0.12]
 [-0.3   0.53  0.57  0.82  0.77  0.52  1.    0.69]
 [-0.13  0.4   0.52  0.66  0.48  0.12  0.69  1.  ]]
Parameters Corr: 
[[ 1.    0.   -0.02  0.02 -0.09 -0.01  0.03 -0.02  0.09  0.01  0.   -0.05
   0.1  -0.04 -0.05 -0.02  0.04  0.01  0.   -0.08 -0.03  0.05  0.07  0.06
  -0.11]
 [ 0.    1.    0.01 -0.05  0.05  0.12  0.    0.02 -0.01 -0.02  0.01  0.1
   0.05 -0.03 -0.01  0.    0.03 -0.09 -0.01 -0.07  0.02  0.02  0.03  0.05
  -0.07]
 [-0.02  0.01  1.    0.01  0.03  0.01  0.1  -0.06  0.05  0.04  0.01  0.
  -0.01 -0.03 -0.01 -0.08 -0.01  0.03  0.05  0.04  0.01  0.03 -0.01  0.05
   0.03]
 [ 0.02 -0.05  0.01  1.    0.05 -0.03  0.01  0.   -0.02 -0.03 -0.08  0.08
  -0.04 -0.01  0.04 -0.01  0.04  0.03  0.01 -0.03 -0.02 -0.01  0.04  0.05
   0.09]
 [-0.09  0.05  0.03  0.05  1.   -0.02  0.04 -0.06  0.05  0.01 -0.03  0.
  -0.04 -0.03 -0.04  0.01  0.03 -0.03  0.07  0.08  0.04  0.03  0.03  0.
  -0.03]
 [-0.01  0.12  0.01 -0.03 -0.02  1.    0.02 -0.04 -0.04 -0.02 -0.06 -0.
   0.01  0.05  0.01 -0.09  0.   -0.05 -0.08 -0.04  0.01  0.08  0.07 -0.04
  -0.03]
 [ 0.03  0.    0.1   0.01  0.04  0.02  1.   -0.03  0.01 -0.02 -0.02  0.01
  -0.05 -0.04 -0.03  0.02 -0.04 -0.06 -0.03  0.04 -0.03  0.01 -0.02  0.
   0.01]
 [-0.02  0.02 -0.06  0.   -0.06 -0.04 -0.03  1.    0.01  0.04 -0.01 -0.09
   0.07  0.07 -0.01  0.05 -0.    0.05  0.05 -0.11 -0.02 -0.04 -0.   -0.04
   0.02]
 [ 0.09 -0.01  0.05 -0.02  0.05 -0.04  0.01  0.01  1.   -0.06 -0.03 -0.04
   0.06  0.05 -0.09  0.   -0.01  0.01  0.03  0.08 -0.01 -0.03  0.1   0.02
   0.08]
 [ 0.01 -0.02  0.04 -0.03  0.01 -0.02 -0.02  0.04 -0.06  1.   -0.05 -0.06
   0.02 -0.06 -0.06  0.   -0.04 -0.13 -0.03 -0.06 -0.03 -0.01 -0.02  0.03
  -0.06]
 [ 0.    0.01  0.01 -0.08 -0.03 -0.06 -0.02 -0.01 -0.03 -0.05  1.   -0.04
  -0.06  0.   -0.02 -0.06  0.03 -0.04  0.07 -0.09  0.02 -0.01 -0.04  0.01
  -0.02]
 [-0.05  0.1   0.    0.08  0.   -0.    0.01 -0.09 -0.04 -0.06 -0.04  1.
   0.02  0.02  0.09 -0.02 -0.04  0.02  0.    0.02  0.03 -0.03  0.03 -0.04
  -0.03]
 [ 0.1   0.05 -0.01 -0.04 -0.04  0.01 -0.05  0.07  0.06  0.02 -0.06  0.02
   1.   -0.07 -0.01  0.09 -0.06  0.06  0.03  0.01 -0.   -0.03 -0.06 -0.06
   0.02]
 [-0.04 -0.03 -0.03 -0.01 -0.03  0.05 -0.04  0.07  0.05 -0.06  0.    0.02
  -0.07  1.    0.02  0.04  0.    0.05 -0.   -0.04 -0.06 -0.05  0.06 -0.05
  -0.07]
 [-0.05 -0.01 -0.01  0.04 -0.04  0.01 -0.03 -0.01 -0.09 -0.06 -0.02  0.09
  -0.01  0.02  1.    0.02  0.03  0.03  0.02 -0.04  0.03  0.01 -0.05 -0.05
  -0.01]
 [-0.02  0.   -0.08 -0.01  0.01 -0.09  0.02  0.05  0.    0.   -0.06 -0.02
   0.09  0.04  0.02  1.   -0.05  0.05  0.04 -0.04 -0.02 -0.04  0.01  0.03
  -0.03]
 [ 0.04  0.03 -0.01  0.04  0.03  0.   -0.04 -0.   -0.01 -0.04  0.03 -0.04
  -0.06  0.    0.03 -0.05  1.    0.06 -0.   -0.04  0.04 -0.02  0.03  0.02
   0.08]
 [ 0.01 -0.09  0.03  0.03 -0.03 -0.05 -0.06  0.05  0.01 -0.13 -0.04  0.02
   0.06  0.05  0.03  0.05  0.06  1.    0.    0.04 -0.06 -0.02  0.05 -0.
  -0.06]
 [ 0.   -0.01  0.05  0.01  0.07 -0.08 -0.03  0.05  0.03 -0.03  0.07  0.
   0.03 -0.    0.02  0.04 -0.    0.    1.   -0.03 -0.02  0.03 -0.03  0.03
  -0.08]
 [-0.08 -0.07  0.04 -0.03  0.08 -0.04  0.04 -0.11  0.08 -0.06 -0.09  0.02
   0.01 -0.04 -0.04 -0.04 -0.04  0.04 -0.03  1.   -0.    0.07 -0.05 -0.1
  -0.  ]
 [-0.03  0.02  0.01 -0.02  0.04  0.01 -0.03 -0.02 -0.01 -0.03  0.02  0.03
  -0.   -0.06  0.03 -0.02  0.04 -0.06 -0.02 -0.    1.    0.02  0.03 -0.03
   0.05]
 [ 0.05  0.02  0.03 -0.01  0.03  0.08  0.01 -0.04 -0.03 -0.01 -0.01 -0.03
  -0.03 -0.05  0.01 -0.04 -0.02 -0.02  0.03  0.07  0.02  1.   -0.1   0.05
  -0.11]
 [ 0.07  0.03 -0.01  0.04  0.03  0.07 -0.02 -0.    0.1  -0.02 -0.04  0.03
  -0.06  0.06 -0.05  0.01  0.03  0.05 -0.03 -0.05  0.03 -0.1   1.   -0.04
  -0.05]
 [ 0.06  0.05  0.05  0.05  0.   -0.04  0.   -0.04  0.02  0.03  0.01 -0.04
  -0.06 -0.05 -0.05  0.03  0.02 -0.    0.03 -0.1  -0.03  0.05 -0.04  1.
   0.05]
 [-0.11 -0.07  0.03  0.09 -0.03 -0.03  0.01  0.02  0.08 -0.06 -0.02 -0.03
   0.02 -0.07 -0.01 -0.03  0.08 -0.06 -0.08 -0.    0.05 -0.11 -0.05  0.05
   1.  ]]

In [13]

#----------------------------------------------------------------------------------------------------------
#    Modify GPNAS code as sklearn API and add some more functions
#----------------------------------------------------------------------------------------------------------
__all__ = ["GPNAS_API"]

class GPNAS_API(object):
    _estimator_type = "regressor"
    def __init__(self, cov_w = None, w = None, c_flag=2, m_flag=2, hp_mat = 0.0000001, hp_cov = 0.01, icov = 1):
        self.hp_mat = hp_mat
        self.hp_cov = hp_cov
        self.cov_w = cov_w
        self.w = w 
        self.c_flag = c_flag
        self.m_flag = m_flag
        self.icov = icov #1: use inv(X.T*X) as prior covariance, 0: use identity matrix

    def get_params(self, deep=True):
        return {
            "hp_mat": self.hp_mat, 
            "hp_cov": self.hp_cov, 
            "cov_w": self.cov_w, 
            "w": self.w, 
            "c_flag": self.c_flag, 
            "m_flag": self.m_flag, 
            "icov": self.icov, 
            }

    def set_params(self, **parameters):
        for parameter, value in parameters.items():
            setattr(self, parameter, value)
        return self

    def _get_corelation(self, mat1, mat2):
        """
        give two typical kernel function
        Auto kernel hyperparameters estimation to be updated
        """

        mat_diff = abs(mat1 - mat2)
        if self.c_flag == 1:
            return 0.5 * np.exp(-np.dot(mat_diff, mat_diff) / 16)
        elif self.c_flag == 2:
            return 1 * np.exp(-np.sqrt(np.dot(mat_diff, mat_diff)) / 12)

    def _preprocess_X(self, X):
        """
        preprocess of input feature/ tokens of architecture
        more complicated preprocess can be added such as nonlinear transformation
        """
        X = X.tolist()
        p_X = copy.deepcopy(X)
        for feature in p_X: feature.append(1) #append a constant 1 as the intercept term
        return p_X

    def _get_cor_mat(self, X):
        """get kernel matrix"""
        X = np.array(X)
        l = X.shape[0]
        cor_mat = []
        for c_idx in range(l):
            col = []
            c_mat = X[c_idx].copy()
            for r_idx in range(l):
                r_mat = X[r_idx].copy()
                temp_cor = self._get_corelation(c_mat, r_mat)
                col.append(temp_cor)
            cor_mat.append(col)
        return np.mat(cor_mat)

    def _get_cor_mat_joint(self, X, X_train):
        """
        get kernel matrix
        """
        X = np.array(X)
        X_train = np.array(X_train)
        l_c = X.shape[0]
        l_r = X_train.shape[0]
        cor_mat = []
        for c_idx in range(l_c):
            col = []
            c_mat = X[c_idx].copy()
            for r_idx in range(l_r):
                r_mat = X_train[r_idx].copy()
                temp_cor = self._get_corelation(c_mat, r_mat)
                col.append(temp_cor)
            cor_mat.append(col)
        return np.mat(cor_mat)

    def fit(self, X,y):
        self.get_initial_mean(X[0::2],y[0::2])
        self.get_initial_cov(X)
        # Update (train) the GPNAS predictor hyperparameters
        self.get_posterior_mean(X[1::2],y[1::2])
        return self #follow the sklearn estimator convention

    def predict(self, X):
        X = self._preprocess_X(X)
        X = np.mat(X)
        #print('beta',self.w.flatten())
        return X * self.w

    def get_predict(self, X):
        """
        get the prediction of network architecture X
        """
        X = self._preprocess_X(X)
        X = np.mat(X)
        return X * self.w

    def get_predict_jiont(self, X, X_train, Y_train):
        """
        get the prediction of network architecture X based on X_train and Y_train
        """
        X = np.mat(X)
        X_train = np.mat(X_train)
        Y_train = np.mat(Y_train)
        m_X = self.get_predict(X)
        m_X_train = self.get_predict(X_train)
        mat_train = self._get_cor_mat(X_train)
        mat_joint = self._get_cor_mat_joint(X, X_train)
        return m_X + mat_joint * np.linalg.inv(mat_train + self.hp_mat * np.eye(
            X_train.shape[0])) * (Y_train.T - m_X_train)

    def get_initial_mean(self, X, Y):
        """
        get initial mean of w
        """
        X = self._preprocess_X(X)
        X = np.mat(X)
        Y = np.mat(Y)
        self.w = np.linalg.inv(X.T * X + self.hp_mat * np.eye(X.shape[1])) * X.T * Y.T #inv(X.T*X)X.T*Y as initial mean
        print('Variance',np.var(Y-X * self.w)) #Show variance of residual, which can guide the tuning of self.hp_cov
        return self.w

    def get_initial_cov(self, X):
        """
        get initial covariance matrix of w
        """
        X = self._preprocess_X(X)
        X = np.mat(X)
        if self.icov == 1: #use inv(X.T*X) as initial covariance
            self.cov_w = self.hp_cov * np.linalg.inv(X.T * X)
        elif self.icov == 0: #use identity matrix as initial covariance
            self.cov_w = self.hp_cov * np.eye(X.shape[1])
        else:
            assert 0,'not available yet'
        return self.cov_w

    def get_posterior_mean(self, X, Y):
        """
        get posterior mean of w
        """
        X = self._preprocess_X(X)
        X = np.mat(X)
        Y = np.mat(Y)
        cov_mat = self._get_cor_mat(X)
        if self.m_flag == 1:
            self.w = self.w + self.cov_w * X.T * np.linalg.inv(
                np.linalg.inv(cov_mat + self.hp_mat * np.eye(X.shape[0])) + X *
                self.cov_w * X.T + self.hp_mat * np.eye(X.shape[0])) * (
                    Y.T - X * self.w)
        else:
            self.w = np.linalg.inv(X.T * np.linalg.inv(
                cov_mat + self.hp_mat * np.eye(X.shape[0])) * X + np.linalg.inv(
                    self.cov_w + self.hp_mat * np.eye(X.shape[1])) + self.hp_mat * np.eye(X.shape[1])) * (
                            X.T * np.linalg.inv(cov_mat + self.hp_mat * np.eye(
                                X.shape[0])) * Y.T +
                            np.linalg.inv(self.cov_w + self.hp_mat * np.eye(
                                X.shape[1])) * self.w)
        return self.w

    def get_posterior_cov(self, X, Y):
        """
        get posterior covariance matrix of w
        """
        X = self._preprocess_X(X)
        X = np.mat(X)
        Y = np.mat(Y)
        cov_mat = self._get_cor_mat(X)
        self.cov_mat = np.linalg.inv(
            np.linalg.inv(X.T * cov_mat * X + self.hp_mat * np.eye(X.shape[1]))
            + np.linalg.inv(self.cov_w + self.hp_mat * np.eye(X.shape[1])) + self.hp_mat * np.eye(X.shape[1]))
        return self.cov_mat

In [14]

#----------------------------------------------------------------------------------------------------------
#    Define the sub-models and their per-task hyperparameters
#----------------------------------------------------------------------------------------------------------
max_iter = [10000,10000,10000,10000,10000,10000,10000,10000] 

#learning_rate = [0.008,0.038,0.032,0.02,0.025,0.012,0.025,0.006]
#learning_rate = [0.01.,0.04.,0.04.,0.04.,0.02.,0.02.,0.04.,0.001.]
learning_rate = [0.005,0.038,0.035,0.03,0.025,0.01,0.03,0.01] #Final learning rate
max_depth = [1,3,2,2,2,3,1,3] #depth for GBRT(huber),CATGB(MSE),GBRT2(MSE),CATGB2(huber)
max_depth2 = [1,1,1,1,1,1,1,1] #depth for HISTGB,LIGHTGB,XGB
list_est = []

model_GBRT,model_HISTGB,model_CATGB,model_LIGHTGB,model_XGB,model_GBRT2,model_CATGB2= [],[],[],[],[],[],[]
for i in range(8):

    params_GBRT = {
        "n_estimators": max_iter[i],
        "max_depth": max_depth[i],
        "subsample": .8,
        "learning_rate": learning_rate[i],
        "loss": 'huber',
        "max_features": 'sqrt',
        "random_state":1,
    } 
    model_GBRT.append(ensemble.GradientBoostingRegressor(**params_GBRT)) 
    
    params_HISTGB = {
        "max_depth": max_depth2[i],
        "max_iter":max_iter[i] ,
        "learning_rate": learning_rate[i],
        "loss": 'least_squares',
        "max_leaf_nodes":31,
        "min_samples_leaf":5,
        "l2_regularization":5,
        "random_state":1,
    }
    model_HISTGB.append(HistGradientBoostingRegressor(**params_HISTGB))


    model_CATGB.append(catboost.CatBoostRegressor(iterations= max_iter[i] ,
                             learning_rate= learning_rate[i],
                             depth= max_depth[i],
                             silent=True,
                             task_type="CPU",
                             loss_function= 'RMSE',                     
                             eval_metric='RMSE',
                             random_seed = 1,
                             od_type='Iter',
                             metric_period = 75,
                             od_wait=100,
                             ))

    
    model_LIGHTGB.append(lightgbm.LGBMRegressor(boosting_type='gbdt',learning_rate = learning_rate[i],num_leaves=31,
    max_depth = max_depth2[i], alpha = 0.1, n_estimators = max_iter[i] ,random_state=1))
    
    model_XGB.append(xgboost.XGBRegressor(learning_rate = learning_rate[i],tree_method = 'auto',
    max_depth = max_depth2[i], alpha = 0.8, n_estimators = max_iter[i] ,random_state=1))   

    params_GBRT2 = {
        "n_estimators": max_iter[i] ,
        "max_depth": max_depth[i],
        "subsample": .8,
        "learning_rate": learning_rate[i],
        "loss": 'ls',
        "max_features": 'log2',
        "random_state":1,
    } 
    model_GBRT2.append(ensemble.GradientBoostingRegressor(**params_GBRT2)) 
    
    model_CATGB2.append(catboost.CatBoostRegressor(iterations= max_iter[i] ,
                             learning_rate= learning_rate[i],
                             depth= max_depth[i],
                             silent=True,
                             task_type="CPU",
                             loss_function=  'Huber:delta=2',                     
                             eval_metric= 'Huber:delta=2',
                             random_seed = 1,
                             od_type='Iter',
                             metric_period = 75,
                             od_wait=100,
                             l2_leaf_reg = 1,
                             subsample = 0.8,
                             ))

for i in range(8): 
    list_est.append([
    ('GBRT', model_GBRT[i]),
    ('HISTGB', model_HISTGB[i]),
    ('CATGB',model_CATGB[i]),
    ('LIGHTGB', model_LIGHTGB[i]),
    ('XGB', model_XGB[i]),
    ('GBRT2', model_GBRT2[i]),
    ('CATGB2', model_CATGB2[i]),
    ])

In [6]

#----------------------------------------------------------------------------------------------------------
#    In sample training and testing, just for research and parameter selection
#----------------------------------------------------------------------------------------------------------
#X_all_k = np.array(arch_list_train)
#X_val = np.array(test_arch_list)
#print(np.array(test_arch_list).shape,X_val.shape)
#train_num = 400;
#gb_list = []
#for i in range(8):
#    model_final = StackingRegressor(estimators=list_est[i],final_estimator=GPNAS_API(c_flag=2, m_flag=2, hp_mat = hp_list[i], hp_cov = 3, icov = 1),passthrough=False,n_jobs=4)
#    Y_all_k = Y_all[i]
#    X_train_k, Y_train_k, X_test_k, Y_test_k = X_all_k[0:train_num:1], Y_all_k[0:train_num:1], X_all_k[train_num::1], Y_all_k[train_num::1]
#    model_final.fit(X_train_k,Y_train_k)
#    gb_list.append(copy.copy(model_final))
#    print('Kendalltau:',i,scipy.stats.stats.kendalltau(model_final.predict(X_test_k),Y_test_k))
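As an alternative to the single 400-sample split in the commented-out cell above, Kendall's tau can also be estimated by cross-validation. The following is a minimal sketch on synthetic data; the scorer name `kendall_tau_score`, the model settings, and the fold count are illustrative assumptions rather than part of the original notebook.

```python
import numpy as np
from scipy.stats import kendalltau
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_val_score

def kendall_tau_score(y_true, y_pred):
    """Kendall rank correlation between true and predicted targets."""
    return kendalltau(y_true, y_pred)[0]

# Synthetic stand-in for the encoded architectures and one transformed rank target.
rng = np.random.default_rng(0)
X = rng.integers(-1, 2, size=(300, 25)).astype(float)
y = X @ rng.normal(size=25) + 0.5 * rng.normal(size=300)

model = GradientBoostingRegressor(loss="huber", max_depth=1, n_estimators=200,
                                  learning_rate=0.1, subsample=0.8, random_state=1)

# Each fold trains on 4/5 of the data and reports Kendall's tau on the held-out 1/5.
scores = cross_val_score(model, X, y, cv=5, scoring=make_scorer(kendall_tau_score))
print("Kendall tau per fold:", np.round(scores, 3), "mean:", round(scores.mean(), 3))
```

With only a few hundred samples, averaging over folds gives a less noisy basis for hyperparameter choices than a single hold-out split.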

In [7]

#----------------------------------------------------------------------------------------------------------
#    Plot training and testing params result to help make prediction.
#----------------------------------------------------------------------------------------------------------
#i = 0
#params = gb_list[i].get_params()
#itr = "n_estimators"
#test_score0 = np.zeros((params[itr],), dtype=np.float64)
#test_score1 = np.zeros((params[itr],), dtype=np.float64)
#for i,y_pred0 in enumerate(model_gb.staged_predict(X_train_k)):
#    test_score0[i] = scipy.stats.stats.kendalltau(Y_train_k, y_pred0)[0]
#for i,y_pred1 in enumerate(model_gb.staged_predict(X_test_k)):
#    test_score1[i] = scipy.stats.stats.kendalltau(Y_test_k, y_pred1)[0]
#fig = plt.figure(figsize=(6, 6))
#plt.subplot(1, 1, 1)
#plt.title("Deviance")
#plt.plot(
#    np.arange(params[itr])[:] + 1,
#    test_score0[:],
#    "b-",
#    label="Training Set Deviance",
#)
#plt.plot(np.arange(params[itr])[:] + 1, test_score1[:], "r-", label="Test Set Deviance")
#plt.legend(loc="upper right")
#plt.xlabel("Boosting Iterations")
#plt.ylabel("Deviance")
#fig.tight_layout()
#plt.show()

In [8]

#----------------------------------------------------------------------------------------------------------
#   Feature Importance
#----------------------------------------------------------------------------------------------------------
#from sklearn.inspection import permutation_importance
#for model_gb in gb_list:
#  feature_importance = model_gb.feature_importances_
#  sorted_idx = np.argsort(feature_importance)
#  pos = np.arange(sorted_idx.shape[0]) + 0.5
#  fig = plt.figure(figsize=(6, 4))
#  plt.barh(pos, feature_importance[sorted_idx], align="center")
#  plt.yticks(pos, np.array(range(len(feature_importance)))[sorted_idx])
#  plt.title("Feature Importance ")

In [9]

#----------------------------------------------------------------------------------------------------------
#    Result 1, use GPNAS with inv(X.T*X) as initial covariance prior to stack 'GBRT','HISTGB','CATGB','LIGHTGB','XGB','GBRT2','CATGB2' with cross validation
#----------------------------------------------------------------------------------------------------------
X_train_k = np.array(arch_list_train)
X_val = np.array(test_arch_list)

rank_all1= []
for i in range(len(list_est)):
    print('No: ',i)
    #stack different regressor by GPNAS with cross validation
    model_final = StackingRegressor(estimators=list_est[i],final_estimator=GPNAS_API(c_flag=2, m_flag=2, hp_mat = 0.5, hp_cov = 3, icov = 1),passthrough=False,n_jobs=4)
    Y_train_k = Y_all[i]
    model_final.fit(X_train_k,Y_train_k)
    zz = np.round((X_val.shape[0]-1)/(1+ np.exp(-1*model_final.predict(X_val)))) #Transfer by Sigmoid function
    print(zz[0])
    rank_all1.append(zz)

No:  0
Variance 3.3439391282630595

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
/tmp/ipykernel_165/1501576174.py in <module>
     13     Y_train_k = Y_all[i]
     14     model_final.fit(X_train_k,Y_train_k)
---> 15     zz = np.round((X_val.shape[0]-1)/(1+ np.exp(-1*model_final.predict(X_val)))) #Transfer by Sigmoid function
     16     print(zz[0])
     17     rank_all1.append(zz)

[intermediate frames in sklearn/ensemble/_stacking.py and sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py omitted]

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py in predict(self, X, known_cat_bitsets, f_idx_map)
     65         out = np.empty(X.shape[0], dtype=Y_DTYPE)
     66         _predict_from_raw_data(self.nodes, X, self.raw_left_cat_bitsets,
---> 67                                known_cat_bitsets, f_idx_map, out)
     68         return out
     69 

KeyboardInterrupt:

In [ ]

#----------------------------------------------------------------------------------------------------------
# Save Result 1 (my results were run locally and differ slightly from the ones shown here)
#----------------------------------------------------------------------------------------------------------
for idx,key in enumerate(test_data.keys()):
    #print(key)
    test_data[key]['cplfw_rank'] = int(rank_all1[0][idx])
    test_data[key]['market1501_rank'] = int(rank_all1[1][idx])
    test_data[key]['dukemtmc_rank'] = int(rank_all1[2][idx])
    test_data[key]['msmt17_rank'] = int(rank_all1[3][idx])
    test_data[key]['veri_rank'] = int(rank_all1[4][idx])
    test_data[key]['vehicleid_rank'] = int(rank_all1[5][idx])
    test_data[key]['veriwild_rank'] = int(rank_all1[6][idx])
    test_data[key]['sop_rank'] = int(rank_all1[7][idx])
print('Ready to save results!')
with open('./CVPR_2022_NAS_Track2_submit_ACCNAS_1.json', 'w') as f:
    json.dump(test_data, f)

In [10]

#----------------------------------------------------------------------------------------------------------
# Result 2: use GPNAS, with an identity matrix as the initial covariance prior, to stack
# 'GBRT','HISTGB','CATGB','LIGHTGB','XGB','GBRT2','CATGB2' with cross validation
#----------------------------------------------------------------------------------------------------------
X_train_k = np.array(arch_list_train)
X_val = np.array(test_arch_list)

rank_all2 = []
for i in range(len(list_est)):
    print('No: ',i)
    # Stack the different regressors, with GPNAS as the final estimator, using cross validation
    model_final = StackingRegressor(estimators=list_est[i],final_estimator=GPNAS_API(c_flag=2, m_flag=2, hp_mat = 0.5, hp_cov = 0.01, icov = 0),passthrough=False,n_jobs=4)
    Y_train_k = Y_all[i]
    model_final.fit(X_train_k,Y_train_k)
    zz = np.round((X_val.shape[0]-1)/(1+ np.exp(-1*model_final.predict(X_val)))) #Transfer by Sigmoid function
    print(zz[0])
    rank_all2.append(zz)

No:  0
Variance 3.3439391282630595
[[42459.]]
No:  1

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py:706: UserWarning: A worker stopped while some jobs were given to the executor. This can be caused by a too short worker timeout or by a memory leak.
  "timeout or by a memory leak.", UserWarning

Variance 6.0485427006730506
[[28879.]]
No:  2
Variance 6.006150913480946
[[47402.]]
No:  3
Variance 6.103473245508546
[[72536.]]
No:  4
Variance 6.025486965953786
[[92206.]]
No:  5
Variance 5.8364155067895815
[[70782.]]
No:  6
Variance 6.52441906729945
[[92304.]]
No:  7
Variance 5.725986746928656
[[90033.]]
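
The prediction loops above turn each model output back into a rank with `(N-1)/(1+exp(-y))`, where `N` is `X_val.shape[0]`, and then round to the nearest integer. The following is a minimal sketch of that back-transform using only the standard library. The forward transform `rank_to_score` (a logit of the normalized rank, with a half-step offset to avoid infinities at the extremes) is an assumption made for illustration; it is not taken from the notebook.

```python
import math


def rank_to_score(rank, n):
    """Hypothetical forward transform (an assumption, not the notebook's code):
    logit of the rank rescaled into the open interval (0, 1)."""
    p = (rank + 0.5) / n  # the half-step offset keeps p away from 0 and 1
    return math.log(p / (1.0 - p))


def score_to_rank(score, n):
    """Back-transform used in the prediction loop: round((n-1) * sigmoid(score))."""
    return round((n - 1) / (1.0 + math.exp(-score)))


n = 10
# Map ranks 0..n-1 to unbounded scores and back again.
round_trip = [score_to_rank(rank_to_score(r, n), n) for r in range(n)]
print(round_trip)
```

A score of 0 lands in the middle of the rank range, and large positive or negative scores saturate towards `N-1` or 0, which is why the regressors can be trained on an unbounded target.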

In [11]

#----------------------------------------------------------------------------------------------------------
# Save Result 2 (my results were run locally and differ slightly from the ones shown here)
#----------------------------------------------------------------------------------------------------------
for idx,key in enumerate(test_data.keys()):
    #print(key)
    test_data[key]['cplfw_rank'] = int(rank_all2[0][idx])
    test_data[key]['market1501_rank'] = int(rank_all2[1][idx])
    test_data[key]['dukemtmc_rank'] = int(rank_all2[2][idx])
    test_data[key]['msmt17_rank'] = int(rank_all2[3][idx])
    test_data[key]['veri_rank'] = int(rank_all2[4][idx])
    test_data[key]['vehicleid_rank'] = int(rank_all2[5][idx])
    test_data[key]['veriwild_rank'] = int(rank_all2[6][idx])
    test_data[key]['sop_rank'] = int(rank_all2[7][idx])
print('Ready to save results!')
with open('./CVPR_2022_NAS_Track2_submit_ACCNAS_2.json', 'w') as f:
    json.dump(test_data, f)

Ready to save results!
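
Both save cells repeat the same eight assignments, one per task. As a possible simplification, the sketch below drives the same writes from a list of task names. The helper `fill_ranks` and the toy `test_data` and `rank_all` values are hypothetical and exist only to show the shape of the data; the task names and the `<task>_rank` keys are the ones used in the cells above.

```python
import json

# Task order matches the index order used in the save cells (0 = cplfw ... 7 = sop).
TASKS = ['cplfw', 'market1501', 'dukemtmc', 'msmt17',
         'veri', 'vehicleid', 'veriwild', 'sop']


def fill_ranks(test_data, rank_all, tasks=TASKS):
    """Write rank_all[t][idx] into test_data[key]['<task>_rank'] for every architecture."""
    for idx, key in enumerate(test_data):
        for t, task in enumerate(tasks):
            test_data[key][task + '_rank'] = int(rank_all[t][idx])
    return test_data


# Toy stand-ins for the notebook's variables (hypothetical values).
test_data = {'arch1': {}, 'arch2': {}}
rank_all = [[float(10 * t + i) for i in range(2)] for t in range(len(TASKS))]

filled = fill_ranks(test_data, rank_all)
serialized = json.dumps(filled)  # the notebook writes to a file with json.dump instead
print(serialized)
```

Keeping the task order in one list means the mapping between list index and task name is stated once, instead of being implied by eight separate lines.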

In [ ]

#----------------------------------------------------------------------------------------------------------
#    Multivariate Gradient Boosting (result was not as good: final score 0.78759)
#----------------------------------------------------------------------------------------------------------
#from catboost import Pool, CatBoostRegressor
#import numpy as np
#train_num = 400;gb_list = [];rank_list= []
#index_list = [[0],[1],[2],[3],[4],[5],[6],[7]]
#index_list = [[0],[0,1],[0,2],[0,3],[0,4],[0,5],[0,6],[0,7]]#[50, 17, 43, 18, 3, 10, 28, 31]
#index_list = [[1,0],[1,1],[1,2],[1,3],[1,4],[1,5],[1,6],[1,7]]#1,1
#index_list = [[1],[1,2],[1,3],[1,4],[1,2],[1,2,3],[1,2,3,4],[1,2,3,4,5]]#[48, 19, 36, 30, 0, 22, 23, 22]
#index_list = [[2,0],[2,1],[2],[2,3],[2,4],[2,5],[2,6],[2,7]]#[1, 30, 68, 35, 25, 15, 14, 12]
#index_list = [[3,0],[3,1],[3,2],[3,3],[3,4],[3,5],[3,6],[3,7]]#3,3
#index_list = [[4,0],[4,1],[4,2],[4,3],[4,4],[4,5],[4,6],[4,7]]#[0, 25, 27, 31, 70, 8, 23, 16]
#index_list = [[5,0],[5,1],[5,2],[5,3],[5,4],[5,5],[5,6],[5,7]]#[7, 18, 27, 22, 21, 56, 33, 16]
#index_list = [[6,0],[6,1],[6,2],[6,3],[6,4],[6,5],[6],[6,7]]#[0, 25, 19, 29, 19, 24, 79, 5]
#index_list = [[7,0],[7,1],[7,2],[7,3],[7,4],[7,5],[7,6],[7]]#[3, 14, 23, 20, 28, 19, 11, 82]
#xx = np.zeros([len(index_list),50])
#for j in range(xx.shape[1]):
#  for i in range(xx.shape[0]):
#    md_catboost = CatBoostRegressor(iterations=10000,
#                             learning_rate=.005,
#                             depth=2,
#                             verbose=0,
#                             #silent=True,
#                             task_type="CPU",
#                             l2_leaf_reg=1,
#                             loss_function= 'MultiRMSE',
#                             eval_metric= 'MultiRMSE',
#                             random_seed = 1,
#                             bagging_temperature = 0.1,
#                             od_type= 'Iter',
#                             metric_period = 75,
#                             od_wait=100)
#    X_all_k, Y_all_k  = np.array(arch_list_train).astype('float'), Y_all.astype('float').T[:,index_list[i]]
#    X_train_k, X_test_k, Y_train_k, Y_test_k = sklearn.model_selection.train_test_split(
#        X_all_k, Y_all_k, test_size=0.2, shuffle=True,random_state=j)
#
#    md_catboost.fit(X_train_k,Y_train_k)
#    y_predict = md_catboost.predict(X_test_k)
#    if len(index_list[i]) == 1:
#        mse = sklearn.metrics.mean_squared_error(y_predict,Y_test_k)
#    else:
#        mse = sklearn.metrics.mean_squared_error(y_predict[:,0],Y_test_k[:,0])
#    print('MSE:','i',i,'j',j,index_list[i],mse)
#    print('Kendalltau:',scipy.stats.kendalltau(y_predict[:,0],Y_test_k[:,0]))
#    xx[i,j] = np.round(mse,5)
#from catboost import Pool, CatBoostRegressor
#X_train_k, Y_train_k = X_all_k, Y_all_k
#print(X_train_k.shape, Y_train_k.shape, X_test_k.shape, Y_test_k.shape)
#dtrain = Pool(X_train_k, label=Y_train_k)
#dvalid = Pool(X_test_k, label=Y_test_k)
#md_catboost.fit(dtrain,eval_set=dvalid, use_best_model=True,early_stopping_rounds=None)
#y_predict = md_catboost.predict(dvalid)
#[print('Kendalltau:',scipy.stats.kendalltau(y_predict,Y_test_k))]

In [ ]

#----------------------------------------------------------------------------------------------------------
#    Other experiments we tried
#----------------------------------------------------------------------------------------------------------
#Only CAT regressor
#    Kendalltau: 0.79394
#Only HIST regressor
#    Kendalltau: 0.78626
#Only GB regressor
#    Kendalltau: 0.79383
#Only XGB regressor
#    Kendalltau: 0.78741
#Only LIGHTGB regressor
#    Kendalltau: 0.78647
#5 regressors combined, same learning rate and same depth parameter (depth = 1)
#    Kendalltau: 0.79321
#5 regressors combined, n_iters = 5000, different learning rate for each regressor
#    Kendalltau: 0.79457
#5 regressors combined, n_iters = 10000
#    Kendalltau: 0.79696
#7 regressors combined, hp_mat = 1, learn_rate*0.8
#    Kendalltau: 0.79608
#7 regressors combined, hp_mat = 1
#    Kendalltau: 0.79697
#7 regressors combined, hp_mat = 1, learn_rate*2
#    Kendalltau: 0.79732
#7 regressors combined, hp_mat = 1, tuning learn_rate
#    Kendalltau: 0.79769
#7 regressors combined, hp_mat = 0.4
#    Kendalltau: 0.79785
#7 regressors combined, hp_mat = 0.5, all depth parameters equal to 1
#    Kendalltau: 0.79389
#7 regressors combined, hp_mat = 0.5, tuning depth parameter
#    Kendalltau: 0.79788
#7 regressors combined, hp_mat = 0.5, round up final int rank
#    Kendalltau: 0.79796
#7 regressors combined, hp_mat = 0.5, tuning learn_rate after new depth parameter
#    Kendalltau: 0.79859
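
Every experiment above is scored by Kendall's tau between predicted and true ranks. For readers unfamiliar with the metric, here is a small pure-Python sketch of the simplest variant, tau-a, which counts concordant and discordant pairs and applies no tie correction. This is an illustration only: the function name `kendall_tau_a` is hypothetical, and `scipy.stats.kendalltau`, which the commented-out code in the previous cell calls, computes the tie-corrected tau-b by default, so the two can differ when ranks are tied.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs, with no tie correction."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # the pair is ordered the same way in x and y
        elif sign < 0:
            discordant += 1   # the pair is ordered oppositely
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


true_rank = [0, 1, 2, 3, 4]
pred_rank = [0, 2, 1, 3, 4]   # one adjacent pair swapped
tau = kendall_tau_a(true_rank, pred_rank)
print(tau)
```

The quadratic pair loop is fine for a few hundred validation architectures; for the full test set a library implementation would be the practical choice.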
