This post presents our solution for the CVPR 2022 NAS competition Track 2, which focuses on predicting architecture performance from a small training sample. Preprocessing includes a depth-encoding transform, normalization, and a sigmoid-based label transform. For model selection, gradient-boosting algorithms work best, reaching 0.78 after tuning. Multi-task learning did not pay off; the final score of 0.7991 comes from ensembling GBRT and related models, stacked under GPNAS as the final estimator.

Accurately predicting the performance of an architecture from small-sample training is important but far from easy. The core problem is to extract as much information as possible from the data while avoiding overfitting. In a multi-task setting, we should also ask whether we can exploit the correlation between tasks.
In this track, the super network builds a search space based on ViT-Base. The search space contains depth, num_heads, mlp_ratio and embed_dim.
The simplest approach is to convert each architecture string directly to floats and train on all of it. The problems with this approach are singularity and that it ignores the background information carried by the encoding. We therefore preprocess the data with the following steps.
- Transfer the depth encoding (j, k, l) to integers (1, 2, 3). We use ordinal encoding instead of one-hot encoding because we assume depth correlates with predictability, and our experiments confirmed this.
- If the actual depth of a sub-network is less than 12, the tail of its encoding is padded with 0. We replace this 0 with 2, since the inputs are 1, 2, 3 and 2 carries neutral information. After this transformation, the correlation between the depth-encoding feature and the other features drops considerably, which helps the later fitting. We then normalize the data by mapping (1, 2, 3) to (-1, 0, 1).
- Transform the labels with the sigmoid/logit pair. Ranks follow a uniform distribution; applying the logit (the inverse of the sigmoid) maps them to an approximately Gaussian variable, which suits most models better. After prediction we apply the sigmoid to map the output back to the uniform rank scale and round to the nearest integer (a minimal sketch follows this list).
- We also tried many other ideas, such as different activation functions and adding Gaussian noise, but none of them gave a clear improvement across tasks.
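As a concrete illustration of the label transform above, here is a minimal sketch (our own, assuming the 500 training architectures are ranked 0-499, which matches the constants in the full code below):

import numpy as np

def rank_to_target(rank, n=500):
    # logit (inverse sigmoid): maps uniform ranks in [0, n-1]
    # to an unbounded, roughly Gaussian-shaped training target
    return np.log((rank + 1) / (n - rank))

def prediction_to_rank(pred, n_test):
    # sigmoid maps the model output back to the uniform rank
    # scale; we then round to the nearest integer
    return np.round((n_test - 1) / (1 + np.exp(-pred)))

ranks = np.arange(500)
targets = rank_to_target(ranks)            # training labels
approx = prediction_to_rank(targets, 500)  # approximate round-trip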
What we did and how we picked our model:
- Using only the GPNAS model (the baseline, or modifications of it), the final score is about 0.67.
- We tried other machine learning models such as linear regression, random forest, kernel ridge regression, plain tree models, XGBoost, gradient boosting and so on. Gradient boosting and other boosting-based algorithms give the best average result: our first attempt scored about 0.73, and after tuning parameters we reached 0.78.
- Since our problem is multi-task, could we train different tasks at the same time? We tried multivariate gradient boosting based on MultiRMSE, but the result was worse than the univariate gradient boosting method. We believe the reason is that different tasks need different hyperparameters, and we could not reduce the noise effectively in the multivariate model.
- Because our training sample is very small, we ensemble models to avoid overfitting. We chose GBRT, HISTGB, CATGB and XGB as sub-models. Simply averaging all their results improves the score to about 0.79; we then tried to go beyond naive averaging.
- Bagging the gradient-boosting models did not improve the result. We then stacked the group of gradient-boosting models under a final estimator, for which we used GPNAS. Stacking exploits the strength of each individual estimator by feeding their outputs into a final estimator. Since sklearn already provides this regressor, we wrapped GPNAS as a sklearn-compatible API. This gives a result of about 0.792.
- Up to this point, all sub-models of the stack still used the same hyperparameters (loss function, learning rate and depth). We then selected a different loss function, learning rate and depth per model, assuming different sub-models and tasks have quite different characteristics, and added two more sub-models (GBRT and CATGB) with different loss functions. This round scored about 0.798.
- Finally, we modified and tuned the final estimator GPNAS itself, e.g. the ridge parameter, and chose inv(X.T*X) instead of the identity matrix as the prior covariance matrix (a short note follows this list). We get a final score of 0.7991.
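A short note on that prior choice (our own reading, not part of the original GPNAS code): taking the prior covariance proportional to inv(X.T*X) is essentially a Zellner-style g-prior for Bayesian linear regression. Below is a minimal sketch of the posterior mean under a plain Gaussian-noise model; the real GPNAS posterior in the code also carries a kernel term, so this is a simplification, not the actual implementation:

import numpy as np

def posterior_mean(X, y, C0, s2=1.0):
    # Posterior mean of w for y = X w + N(0, s2*I) with prior w ~ N(0, C0):
    #   w_post = inv(X.T X / s2 + inv(C0)) @ X.T y / s2
    A = X.T @ X / s2 + np.linalg.inv(C0)
    return np.linalg.solve(A, X.T @ y / s2)

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = X @ np.ones(5) + 0.1 * rng.normal(size=40)
w_gprior = posterior_mean(X, y, 3.0 * np.linalg.inv(X.T @ X))  # Result-1-style prior
w_identity = posterior_mean(X, y, 0.01 * np.eye(5))            # Result-2-style prior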
#----------------------------------------------------------------------------------------------------------
# Install Packages
#----------------------------------------------------------------------------------------------------------
# If a persistent installation is required,
# you need to use the persistence path as the following:
#!mkdir /home/aistudio/external-libraries
#!pip install --upgrade sklearn -i https://mirrors.aliyun.com/pypi/simple/ -t /home/aistudio/external-libraries
#!conda install lightgbm
#!conda install xgboost -i https://mirrors.aliyun.com/pypi/simple/
!pip install catboost -i https://mirrors.aliyun.com/pypi/simple/
#----------------------------------------------------------------------------------------------------------
# Import Packages
#----------------------------------------------------------------------------------------------------------
import matplotlib.pyplot as plt
%matplotlib inline
import catboost
import lightgbm
import xgboost
import sklearn
from sklearn import ensemble
from sklearn.experimental import enable_hist_gradient_boosting
from sklearn.ensemble import *  # HistGradientBoostingRegressor, StackingRegressor, BaggingRegressor, ExtraTreesRegressor
from sklearn.kernel_ridge import *
from sklearn.linear_model import *  # LinearRegression
from sklearn.semi_supervised import *
from sklearn.svm import *
from sklearn.metrics import mean_squared_error
import numpy as np
import scipy
import copy
import json
from sklearn.model_selection import cross_val_score
from scipy.linalg import hankel
print(sklearn.__version__)
# Output: catboost and its dependencies already satisfied; sklearn version 0.24.2
#----------------------------------------------------------------------------------------------------------
# Load training and testing data
#----------------------------------------------------------------------------------------------------------
def convert_X(arch_str):
    temp_arch = []
    for i, elm in enumerate(arch_str):
        if i in [3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36]:
            pass  # Drop non-informative columns: these positions are all 768
        elif elm == 'j':
            temp_arch.append(1 - 2)  # Transform to a number, then center/normalize
        elif elm == 'k':
            temp_arch.append(2 - 2)
        elif elm == 'l':
            temp_arch.append(3 - 2)
        elif int(elm) == 0:
            temp_arch.append(2 - 2)  # Treat padded 0 as 2: neutral information, reduces correlation
        else:
            temp_arch.append(int(elm) - 2)  # Center/normalize
    return temp_arch

with open('./data/data134077/CVPR_2022_NAS_Track2_train.json', 'r') as f:
    train_data = json.load(f)
with open('./data/data134077/CVPR_2022_NAS_Track2_test.json', 'r') as f:
    test_data = json.load(f)

test_arch_list = []
for key in test_data.keys():
    test_arch = convert_X(test_data[key]['arch'])
    test_arch_list.append(test_arch)
bb = np.array(test_arch_list).T

train_list = [[], [], [], [], [], [], [], []]
arch_list_train = []
name_list = ['cplfw_rank', 'market1501_rank', 'dukemtmc_rank', 'msmt17_rank',
             'veri_rank', 'vehicleid_rank', 'veriwild_rank', 'sop_rank']
for key in train_data.keys():
    for idx, name in enumerate(name_list):
        train_list[idx].append(train_data[key][name])
    arch_list_train.append(convert_X(train_data[key]['arch']))

Y_all0 = np.array(train_list)
Y_all = np.log((Y_all0 + 1) / (500 - Y_all0))  # Transform rank data by the logit (inverse sigmoid)

#----------------------------------------------------------------------------------------------------------
# Correlation research
#----------------------------------------------------------------------------------------------------------
regcoef = np.round(np.array([LinearRegression().fit(np.array(arch_list_train), Y_all[i]).coef_
                             for i in range(8)]), 2)
print('Task Corr: ')
print(np.round(np.corrcoef(regcoef), 2))  # correlation between different tasks
print('Parameters Corr: ')
print(np.round(np.corrcoef(np.array(arch_list_train).T.astype('float')), 2))  # correlation between architecture parameters

# Output (Task Corr): all tasks except cplfw are clearly positively correlated with each other:
# [[ 1.    0.18  0.12 -0.17 -0.29 -0.3  -0.3  -0.13]
#  [ 0.18  1.    0.78  0.81  0.52  0.3   0.53  0.4 ]
#  [ 0.12  0.78  1.    0.85  0.62  0.2   0.57  0.52]
#  [-0.17  0.81  0.85  1.    0.8   0.42  0.82  0.66]
#  [-0.29  0.52  0.62  0.8   1.    0.46  0.77  0.48]
#  [-0.3   0.3   0.2   0.42  0.46  1.    0.52  0.12]
#  [-0.3   0.53  0.57  0.82  0.77  0.52  1.    0.69]
#  [-0.13  0.4   0.52  0.66  0.48  0.12  0.69  1.  ]]
# Output (Parameters Corr): 25x25 matrix omitted; all off-diagonal entries are small (|r| <= 0.13),
# i.e. the architecture parameters are nearly uncorrelated with each other.
#----------------------------------------------------------------------------------------------------------
# Modify the GPNAS code into a sklearn API and add some more functions
#----------------------------------------------------------------------------------------------------------
__all__ = ["GPNAS_API"]

class GPNAS_API(object):
    _estimator_type = "regressor"

    def __init__(self, cov_w=None, w=None, c_flag=2, m_flag=2,
                 hp_mat=0.0000001, hp_cov=0.01, icov=1):
        self.hp_mat = hp_mat
        self.hp_cov = hp_cov
        self.cov_w = cov_w
        self.w = w
        self.c_flag = c_flag
        self.m_flag = m_flag
        self.icov = icov  # whether we use inv(X.T*X) as the prior covariance

    def get_params(self, deep=True):
        return {
            "hp_mat": self.hp_mat,
            "hp_cov": self.hp_cov,
            "cov_w": self.cov_w,
            "w": self.w,
            "c_flag": self.c_flag,
            "m_flag": self.m_flag,
            "icov": self.icov,
        }

    def set_params(self, **parameters):
        for parameter, value in parameters.items():
            setattr(self, parameter, value)
        return self

    def _get_corelation(self, mat1, mat2):
        """
        Two typical kernel functions.
        Automatic kernel hyperparameter estimation is still to be added.
        """
        mat_diff = abs(mat1 - mat2)
        if self.c_flag == 1:
            return 0.5 * np.exp(-np.dot(mat_diff, mat_diff) / 16)
        elif self.c_flag == 2:
            return 1 * np.exp(-np.sqrt(np.dot(mat_diff, mat_diff)) / 12)

    def _preprocess_X(self, X):
        """
        Preprocess the input features / architecture tokens.
        More complicated preprocessing, e.g. nonlinear transformations, can be added.
        """
        X = X.tolist()
        p_X = copy.deepcopy(X)
        for feature in p_X:
            feature.append(1)  # bias term
        return p_X

    def _get_cor_mat(self, X):
        """Get the kernel matrix."""
        X = np.array(X)
        l = X.shape[0]
        cor_mat = []
        for c_idx in range(l):
            col = []
            c_mat = X[c_idx].copy()
            for r_idx in range(l):
                r_mat = X[r_idx].copy()
                temp_cor = self._get_corelation(c_mat, r_mat)
                col.append(temp_cor)
            cor_mat.append(col)
        return np.mat(cor_mat)

    def _get_cor_mat_joint(self, X, X_train):
        """Get the joint kernel matrix."""
        X = np.array(X)
        X_train = np.array(X_train)
        l_c = X.shape[0]
        l_r = X_train.shape[0]
        cor_mat = []
        for c_idx in range(l_c):
            col = []
            c_mat = X[c_idx].copy()
            for r_idx in range(l_r):
                r_mat = X_train[r_idx].copy()
                temp_cor = self._get_corelation(c_mat, r_mat)
                col.append(temp_cor)
            cor_mat.append(col)
        return np.mat(cor_mat)

    def fit(self, X, y):
        self.get_initial_mean(X[0::2], y[0::2])
        self.get_initial_cov(X)  # update (train) the GPNAS predictor hyperparameters
        self.get_posterior_mean(X[1::2], y[1::2])

    def predict(self, X):
        X = self._preprocess_X(X)
        X = np.mat(X)
        # print('beta', self.w.flatten())
        return X * self.w

    def get_predict(self, X):
        """Get the prediction for network architecture X."""
        X = self._preprocess_X(X)
        X = np.mat(X)
        return X * self.w

    def get_predict_jiont(self, X, X_train, Y_train):
        """Get the prediction for X conditioned on X_train and Y_train."""
        X = np.mat(X)
        X_train = np.mat(X_train)
        Y_train = np.mat(Y_train)
        m_X = self.get_predict(X)
        m_X_train = self.get_predict(X_train)
        mat_train = self._get_cor_mat(X_train)
        mat_joint = self._get_cor_mat_joint(X, X_train)
        return m_X + mat_joint * np.linalg.inv(
            mat_train + self.hp_mat * np.eye(X_train.shape[0])) * (Y_train.T - m_X_train)

    def get_initial_mean(self, X, Y):
        """Get the initial mean of w."""
        X = self._preprocess_X(X)
        X = np.mat(X)
        Y = np.mat(Y)
        self.w = np.linalg.inv(X.T * X + self.hp_mat * np.eye(X.shape[1])) * X.T * Y.T  # inv(X.T*X)*X.T*Y as initial mean
        print('Variance', np.var(Y - X * self.w))  # residual variance, used to tune self.hp_cov
        return self.w

    def get_initial_cov(self, X):
        """Get the initial covariance matrix of w."""
        X = self._preprocess_X(X)
        X = np.mat(X)
        if self.icov == 1:    # use inv(X.T*X) as the initial covariance
            self.cov_w = self.hp_cov * np.linalg.inv(X.T * X)
        elif self.icov == 0:  # use the identity matrix as the initial covariance
            self.cov_w = self.hp_cov * np.eye(X.shape[1])
        else:
            assert 0, 'not available yet'
        return self.cov_w

    def get_posterior_mean(self, X, Y):
        """Get the posterior mean of w."""
        X = self._preprocess_X(X)
        X = np.mat(X)
        Y = np.mat(Y)
        cov_mat = self._get_cor_mat(X)
        if self.m_flag == 1:
            self.w = self.w + self.cov_w * X.T * np.linalg.inv(
                np.linalg.inv(cov_mat + self.hp_mat * np.eye(X.shape[0])) + X *
                self.cov_w * X.T + self.hp_mat * np.eye(X.shape[0])) * (
                    Y.T - X * self.w)
        else:
            self.w = np.linalg.inv(X.T * np.linalg.inv(
                cov_mat + self.hp_mat * np.eye(X.shape[0])) * X + np.linalg.inv(
                    self.cov_w + self.hp_mat * np.eye(X.shape[1])) + self.hp_mat * np.eye(X.shape[1])) * (
                X.T * np.linalg.inv(cov_mat + self.hp_mat * np.eye(X.shape[0])) * Y.T +
                np.linalg.inv(self.cov_w + self.hp_mat * np.eye(X.shape[1])) * self.w)
        return self.w

    def get_posterior_cov(self, X, Y):
        """Get the posterior covariance matrix of w."""
        X = self._preprocess_X(X)
        X = np.mat(X)
        Y = np.mat(Y)
        cov_mat = self._get_cor_mat(X)
        self.cov_mat = np.linalg.inv(
            np.linalg.inv(X.T * cov_mat * X + self.hp_mat * np.eye(X.shape[1]))
            + np.linalg.inv(self.cov_w + self.hp_mat * np.eye(X.shape[1]))
            + self.hp_mat * np.eye(X.shape[1]))
        return self.cov_mat

#----------------------------------------------------------------------------------------------------------
# Set up the seven sub-models and their hyperparameters for each of the 8 tasks
#----------------------------------------------------------------------------------------------------------
max_iter = [10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000]
#learning_rate = [0.008, 0.038, 0.032, 0.02, 0.025, 0.012, 0.025, 0.006]
#learning_rate = [0.01, 0.04, 0.04, 0.04, 0.02, 0.02, 0.04, 0.001]
learning_rate = [0.005, 0.038, 0.035, 0.03, 0.025, 0.01, 0.03, 0.01]  # final learning rates
max_depth = [1, 3, 2, 2, 2, 3, 1, 3]   # depth for GBRT(huber), CATGB(MSE), GBRT2(MSE), CATGB2(huber)
max_depth2 = [1, 1, 1, 1, 1, 1, 1, 1]  # depth for HISTGB, LIGHTGB, XGB
list_est = []
model_GBRT, model_HISTGB, model_CATGB, model_LIGHTGB, model_XGB, model_GBRT2, model_CATGB2 = [], [], [], [], [], [], []
for i in range(8):
    params_GBRT = {
        "n_estimators": max_iter[i],
        "max_depth": max_depth[i],
        "subsample": .8,
        "learning_rate": learning_rate[i],
        "loss": 'huber',
        "max_features": 'sqrt',
        "random_state": 1,
    }
    model_GBRT.append(ensemble.GradientBoostingRegressor(**params_GBRT))
    params_HISTGB = {
        "max_depth": max_depth2[i],
        "max_iter": max_iter[i],
        "learning_rate": learning_rate[i],
        "loss": 'least_squares',
        "max_leaf_nodes": 31,
        "min_samples_leaf": 5,
        "l2_regularization": 5,
        "random_state": 1,
    }
    model_HISTGB.append(HistGradientBoostingRegressor(**params_HISTGB))
    model_CATGB.append(catboost.CatBoostRegressor(iterations=max_iter[i],
                                                  learning_rate=learning_rate[i],
                                                  depth=max_depth[i],
                                                  silent=True,
                                                  task_type="CPU",
                                                  loss_function='RMSE',
                                                  eval_metric='RMSE',
                                                  random_seed=1,
                                                  od_type='Iter',
                                                  metric_period=75,
                                                  od_wait=100,
                                                  ))
    model_LIGHTGB.append(lightgbm.LGBMRegressor(boosting_type='gbdt', learning_rate=learning_rate[i], num_leaves=31,
                                                max_depth=max_depth2[i], alpha=0.1, n_estimators=max_iter[i], random_state=1))
    model_XGB.append(xgboost.XGBRegressor(learning_rate=learning_rate[i], tree_method='auto',
                                          max_depth=max_depth2[i], alpha=0.8, n_estimators=max_iter[i], random_state=1))
    params_GBRT2 = {
        "n_estimators": max_iter[i],
        "max_depth": max_depth[i],
        "subsample": .8,
        "learning_rate": learning_rate[i],
        "loss": 'ls',
        "max_features": 'log2',
        "random_state": 1,
    }
    model_GBRT2.append(ensemble.GradientBoostingRegressor(**params_GBRT2))
    model_CATGB2.append(catboost.CatBoostRegressor(iterations=max_iter[i],
                                                   learning_rate=learning_rate[i],
                                                   depth=max_depth[i],
                                                   silent=True,
                                                   task_type="CPU",
                                                   loss_function='Huber:delta=2',
                                                   eval_metric='Huber:delta=2',
                                                   random_seed=1,
                                                   od_type='Iter',
                                                   metric_period=75,
                                                   od_wait=100,
                                                   l2_leaf_reg=1,
                                                   subsample=0.8,
                                                   ))

for i in range(8):
    list_est.append([
        ('GBRT', model_GBRT[i]),
        ('HISTGB', model_HISTGB[i]),
        ('CATGB', model_CATGB[i]),
        ('LIGHTGB', model_LIGHTGB[i]),
        ('XGB', model_XGB[i]),
        ('GBRT2', model_GBRT2[i]),
        ('CATGB2', model_CATGB2[i]),
    ])

#----------------------------------------------------------------------------------------------------------
# In-sample training and testing, just for research and parameter selection
#----------------------------------------------------------------------------------------------------------
#X_all_k = np.array(arch_list_train)
#X_val = np.array(test_arch_list)
#print(np.array(test_arch_list).shape, X_val.shape)
#train_num = 400
#gb_list = []
#for i in range(8):
#    model_final = StackingRegressor(estimators=list_est[i],
#                                    final_estimator=GPNAS_API(c_flag=2, m_flag=2, hp_mat=hp_list[i], hp_cov=3, icov=1),
#                                    passthrough=False, n_jobs=4)
#    Y_all_k = Y_all[i]
#    X_train_k, Y_train_k, X_test_k, Y_test_k = X_all_k[0:train_num:1], Y_all_k[0:train_num:1], X_all_k[train_num::1], Y_all_k[train_num::1]
#    model_final.fit(X_train_k, Y_train_k)
#    gb_list.append(copy.copy(model_final))
#    print('Kendalltau:', i, scipy.stats.stats.kendalltau(model_final.predict(X_test_k), Y_test_k))
#----------------------------------------------------------------------------------------------------------
# Plot training and testing results per boosting iteration to help choose parameters
#----------------------------------------------------------------------------------------------------------
#i = 0
#params = gb_list[i].get_params()
#itr = "n_estimators"
#test_score0 = np.zeros((params[itr],), dtype=np.float64)
#test_score1 = np.zeros((params[itr],), dtype=np.float64)
#for i, y_pred0 in enumerate(model_gb.staged_predict(X_train_k)):
#    test_score0[i] = scipy.stats.stats.kendalltau(Y_train_k, y_pred0)[0]
#for i, y_pred1 in enumerate(model_gb.staged_predict(X_test_k)):
#    test_score1[i] = scipy.stats.stats.kendalltau(Y_test_k, y_pred1)[0]
#fig = plt.figure(figsize=(6, 6))
#plt.subplot(1, 1, 1)
#plt.title("Deviance")
#plt.plot(np.arange(params[itr])[:] + 1, test_score0[:], "b-", label="Training Set Deviance")
#plt.plot(np.arange(params[itr])[:] + 1, test_score1[:], "r-", label="Test Set Deviance")
#plt.legend(loc="upper right")
#plt.xlabel("Boosting Iterations")
#plt.ylabel("Deviance")
#fig.tight_layout()
#plt.show()
#----------------------------------------------------------------------------------------------------------
# Feature importance
#----------------------------------------------------------------------------------------------------------
#from sklearn.inspection import permutation_importance
#for model_gb in gb_list:
#    feature_importance = model_gb.feature_importances_
#    sorted_idx = np.argsort(feature_importance)
#    pos = np.arange(sorted_idx.shape[0]) + 0.5
#    fig = plt.figure(figsize=(6, 4))
#    plt.barh(pos, feature_importance[sorted_idx], align="center")
#    plt.yticks(pos, np.array(range(len(feature_importance)))[sorted_idx])
#    plt.title("Feature Importance ")
#----------------------------------------------------------------------------------------------------------
# Result 1: use GPNAS with inv(X.T*X) as the prior covariance to stack
# 'GBRT','HISTGB','CATGB','LIGHTGB','XGB','GBRT2','CATGB2' with cross validation
#----------------------------------------------------------------------------------------------------------
X_train_k = np.array(arch_list_train)
X_val = np.array(test_arch_list)
rank_all1 = []
for i in range(len(list_est)):
    print('No: ', i)
    # stack the different regressors under GPNAS with cross validation
    model_final = StackingRegressor(estimators=list_est[i],
                                    final_estimator=GPNAS_API(c_flag=2, m_flag=2, hp_mat=0.5, hp_cov=3, icov=1),
                                    passthrough=False, n_jobs=4)
    Y_train_k = Y_all[i]
    model_final.fit(X_train_k, Y_train_k)
    zz = np.round((X_val.shape[0] - 1) / (1 + np.exp(-1 * model_final.predict(X_val))))  # map back to ranks via the sigmoid
    print(zz[0])
    rank_all1.append(zz)
# Output: No: 0  Variance 3.3439391282630595  (and so on for each task)
#----------------------------------------------------------------------------------------------------------
# Save Result 1 (my result was run locally, which is a little different from here)
#----------------------------------------------------------------------------------------------------------
for idx, key in enumerate(test_data.keys()):
    #print(key)
    test_data[key]['cplfw_rank'] = int(rank_all1[0][idx])
    test_data[key]['market1501_rank'] = int(rank_all1[1][idx])
    test_data[key]['dukemtmc_rank'] = int(rank_all1[2][idx])
    test_data[key]['msmt17_rank'] = int(rank_all1[3][idx])
    test_data[key]['veri_rank'] = int(rank_all1[4][idx])
    test_data[key]['vehicleid_rank'] = int(rank_all1[5][idx])
    test_data[key]['veriwild_rank'] = int(rank_all1[6][idx])
    test_data[key]['sop_rank'] = int(rank_all1[7][idx])
print('Ready to save results!')
with open('./CVPR_2022_NAS_Track2_submit_ACCNAS_1.json', 'w') as f:
    json.dump(test_data, f)

#----------------------------------------------------------------------------------------------------------
# Result 2: use GPNAS with the identity matrix as the prior covariance to stack
# 'GBRT','HISTGB','CATGB','LIGHTGB','XGB','GBRT2','CATGB2' with cross validation
#----------------------------------------------------------------------------------------------------------
X_train_k = np.array(arch_list_train)
X_val = np.array(test_arch_list)
rank_all2 = []
for i in range(len(list_est)):
    print('No: ', i)
    # stack the different regressors under GPNAS with cross validation
    model_final = StackingRegressor(estimators=list_est[i],
                                    final_estimator=GPNAS_API(c_flag=2, m_flag=2, hp_mat=0.5, hp_cov=0.01, icov=0),
                                    passthrough=False, n_jobs=4)
    Y_train_k = Y_all[i]
    model_final.fit(X_train_k, Y_train_k)
    zz = np.round((X_val.shape[0] - 1) / (1 + np.exp(-1 * model_final.predict(X_val))))  # map back to ranks via the sigmoid
    print(zz[0])
    rank_all2.append(zz)
# Output: per-task residual variances 3.34, 6.05, 6.01, 6.10, 6.03, 5.84, 6.52, 5.73
# (a harmless joblib "worker stopped" warning also appears during fitting)
#----------------------------------------------------------------------------------------------------------
# Save Result 2 (my result was run locally, which is a little different from here)
#----------------------------------------------------------------------------------------------------------
for idx, key in enumerate(test_data.keys()):
    #print(key)
    test_data[key]['cplfw_rank'] = int(rank_all2[0][idx])
    test_data[key]['market1501_rank'] = int(rank_all2[1][idx])
    test_data[key]['dukemtmc_rank'] = int(rank_all2[2][idx])
    test_data[key]['msmt17_rank'] = int(rank_all2[3][idx])
    test_data[key]['veri_rank'] = int(rank_all2[4][idx])
    test_data[key]['vehicleid_rank'] = int(rank_all2[5][idx])
    test_data[key]['veriwild_rank'] = int(rank_all2[6][idx])
    test_data[key]['sop_rank'] = int(rank_all2[7][idx])
print('Ready to save results!')
with open('./CVPR_2022_NAS_Track2_submit_ACCNAS_2.json', 'w') as f:
    json.dump(test_data, f)
#----------------------------------------------------------------------------------------------------------
# Multivariate Gradient Boosting (the result is not that good; final score 0.78759)
#----------------------------------------------------------------------------------------------------------
#from catboost import Pool, CatBoostRegressor
#import numpy as np
#train_num = 400; gb_list = []; rank_list = []
#index_list = [[0],[1],[2],[3],[4],[5],[6],[7]]
#index_list = [[0],[0,1],[0,2],[0,3],[0,4],[0,5],[0,6],[0,7]]  # [50, 17, 43, 18, 3, 10, 28, 31]
#index_list = [[1,0],[1,1],[1,2],[1,3],[1,4],[1,5],[1,6],[1,7]]  # 1,1
#index_list = [[1],[1,2],[1,3],[1,4],[1,2],[1,2,3],[1,2,3,4],[1,2,3,4,5]]  # [48, 19, 36, 30, 0, 22, 23, 22]
#index_list = [[2,0],[2,1],[2],[2,3],[2,4],[2,5],[2,6],[2,7]]  # [1, 30, 68, 35, 25, 15, 14, 12]
#index_list = [[3,0],[3,1],[3,2],[3,3],[3,4],[3,5],[3,6],[3,7]]  # 3,3
#index_list = [[4,0],[4,1],[4,2],[4,3],[4,4],[4,5],[4,6],[4,7]]  # [0, 25, 27, 31, 70, 8, 23, 16]
#index_list = [[5,0],[5,1],[5,2],[5,3],[5,4],[5,5],[5,6],[5,7]]  # [7, 18, 27, 22, 21, 56, 33, 16]
#index_list = [[6,0],[6,1],[6,2],[6,3],[6,4],[6,5],[6],[6,7]]  # [0, 25, 19, 29, 19, 24, 79, 5]
#index_list = [[7,0],[7,1],[7,2],[7,3],[7,4],[7,5],[7,6],[7]]  # [3, 14, 23, 20, 28, 19, 11, 82]
#xx = np.zeros([len(index_list), 50])
#for j in range(xx.shape[1]):
#    for i in range(xx.shape[0]):
#        md_catboost = CatBoostRegressor(iterations=10000,
#                                        learning_rate=.005,
#                                        depth=2,
#                                        verbose=0,
#                                        #silent=True,
#                                        task_type="CPU",
#                                        l2_leaf_reg=1,
#                                        loss_function='MultiRMSE',
#                                        eval_metric='MultiRMSE',
#                                        random_seed=1,
#                                        bagging_temperature=0.1,
#                                        od_type='Iter',
#                                        metric_period=75,
#                                        od_wait=100)
#        X_all_k, Y_all_k = np.array(arch_list_train).astype('float'), Y_all.astype('float').T[:, index_list[i]]
#        X_train_k, X_test_k, Y_train_k, Y_test_k = sklearn.model_selection.train_test_split(
#            X_all_k, Y_all_k, test_size=0.2, shuffle=True, random_state=j)
#        md_catboost.fit(X_train_k, Y_train_k)
#        y_predict = md_catboost.predict(X_test_k)
#        if len(index_list[i]) == 1:
#            mse = sklearn.metrics.mean_squared_error(y_predict, Y_test_k)
#        else:
#            mse = sklearn.metrics.mean_squared_error(y_predict[:, 0], Y_test_k[:, 0])
#        print('MSE:', 'i', i, 'j', j, index_list[i], mse)
#        print('Kendalltau:', scipy.stats.stats.kendalltau(y_predict[:, 0], Y_test_k[:, 0]))
#        xx[i, j] = np.round(mse, 5)
#from catboost import Pool, CatBoostRegressor
#X_train_k, Y_train_k = X_all_k, Y_all_k
#print(X_train_k.shape, Y_train_k.shape, X_test_k.shape, Y_test_k.shape)
#dtrain = Pool(X_train_k, label=Y_train_k)
#dvalid = Pool(X_test_k, label=Y_test_k)
#md_catboost.fit(dtrain, eval_set=dvalid, use_best_model=True, early_stopping_rounds=None)
#y_predict = md_catboost.predict(dvalid)
#print('Kendalltau:', scipy.stats.stats.kendalltau(y_predict, Y_test_k))
#----------------------------------------------------------------------------------------------------------
# Other experiments we tried (Kendall tau)
#----------------------------------------------------------------------------------------------------------
# Only CAT regressor:                                                     0.79394
# Only HIST regressor:                                                    0.78626
# Only GB regressor:                                                      0.79383
# Only XGB regressor:                                                     0.78741
# Only LIGHTGB regressor:                                                 0.78647
# 5 regressors combined, same learning rate and depth (depth = 1):        0.79321
# 5 regressors combined, n_iters = 5000, per-regressor learning rates:    0.79457
# 5 regressors combined, n_iters = 10000:                                 0.79696
# 7 regressors combined, hp_mat = 1, learn_rate * 0.8:                    0.79608
# 7 regressors combined, hp_mat = 1:                                      0.79697
# 7 regressors combined, hp_mat = 1, learn_rate * 2:                      0.79732
# 7 regressors combined, hp_mat = 1, tuned learn_rate:                    0.79769
# 7 regressors combined, hp_mat = 0.4:                                    0.79785
# 7 regressors combined, hp_mat = 0.5, all depth parameters equal to 1:   0.79389
# 7 regressors combined, hp_mat = 0.5, tuned depth parameters:            0.79788
# 7 regressors combined, hp_mat = 0.5, rounding the final integer rank:   0.79796
# 7 regressors combined, hp_mat = 0.5, tuned learn_rate after new depths: 0.79859