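1. Model Evaluation Metrics
Model evaluation is an important step in measuring model performance, and different task types call for different evaluation metrics.
1.1 Classification Metrics
1.1.1 Accuracy
Accuracy is the most commonly used classification metric. It is the proportion of correctly predicted samples among all samples:
Accuracy = correctly predicted samples / total samples
# Compute accuracy (y_true and y_pred are class labels, not probabilities)
import tensorflow as tf
accuracy = tf.keras.metrics.Accuracy()
accuracy.update_state(y_true, y_pred)
print(f"Accuracy: {accuracy.result().numpy()}")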
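1.1.2 Precision
Precision is the proportion of samples predicted as positive that are actually positive:
Precision = true positives / (true positives + false positives)
# Compute precision (binary labels; predictions are thresholded at 0.5 by default)
precision = tf.keras.metrics.Precision()
precision.update_state(y_true, y_pred)
print(f"Precision: {precision.result().numpy()}")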
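1.1.3 Recall
Recall is the proportion of actually positive samples that are correctly predicted as positive:
Recall = true positives / (true positives + false negatives)
# Compute recall (binary labels; predictions are thresholded at 0.5 by default)
recall = tf.keras.metrics.Recall()
recall.update_state(y_true, y_pred)
print(f"Recall: {recall.result().numpy()}")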
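1.1.4 F1 Score
The F1 score is the harmonic mean of precision and recall, so it balances the two:
F1 = 2 * (Precision * Recall) / (Precision + Recall)
# Compute the F1 score from the precision and recall metrics above
p = precision.result().numpy()
r = recall.result().numpy()
# Guard against division by zero when both precision and recall are 0
f1_score = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
print(f"F1 score: {f1_score}")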
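To make the three formulas concrete, here is a minimal pure-Python sketch that counts true positives, false positives, and false negatives by hand. The function name `precision_recall_f1` and the sample labels are illustrative assumptions, not part of the TensorFlow API.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 3 true positives, 1 false positive, 1 false negative
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

With these labels all three values come out to 0.75, because precision and recall happen to be equal and the harmonic mean of two equal numbers is that number.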
1.1.5 Confusion Matrix
A confusion matrix is a table for visualizing classification performance. For each true class, it shows how the model's predictions are distributed across the classes:
# Compute the confusion matrix (rows: true labels, columns: predicted labels)
confusion_matrix = tf.math.confusion_matrix(y_true, y_pred)
print("Confusion matrix:")
print(confusion_matrix.numpy())
# Visualize the confusion matrix
import matplotlib.pyplot as plt
import seaborn as sns
def plot_confusion_matrix(cm, class_names):
    plt.figure(figsize=(10, 8))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_names, yticklabels=class_names)
    plt.xlabel('Predicted label')
    plt.ylabel('True label')
    plt.title('Confusion Matrix')
    plt.show()
# Assume class_names is the list of class names
class_names = ['Class 0', 'Class 1', 'Class 2']
plot_confusion_matrix(confusion_matrix.numpy(), class_names)
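The layout of the matrix is easier to see when it is built by hand. The following is a small pure-Python sketch, with hypothetical labels and a hypothetical helper name, that uses the same convention as above: rows are true classes and columns are predicted classes.

```python
def confusion_matrix_counts(y_true, y_pred, num_classes):
    """Build a num_classes x num_classes table: rows = true class, columns = predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Hypothetical labels for a 3-class problem
labels_true = [0, 0, 1, 1, 2, 2, 2]
labels_pred = [0, 1, 1, 1, 2, 0, 2]
cm = confusion_matrix_counts(labels_true, labels_pred, num_classes=3)

# The diagonal holds the correct predictions, so accuracy is trace / total
overall_accuracy = sum(cm[i][i] for i in range(3)) / len(labels_true)
# Per-class recall is the diagonal entry divided by its row sum
per_class_recall = [cm[i][i] / sum(cm[i]) for i in range(3)]

for row in cm:
    print(row)
print(overall_accuracy, per_class_recall)
```

Reading across a row shows where samples of one true class ended up, which is often more informative than a single accuracy number.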
1.1.6 ROC Curve and AUC
The ROC curve (Receiver Operating Characteristic curve) is used to evaluate binary classifiers: it plots the true positive rate against the false positive rate as the decision threshold varies. AUC (Area Under the Curve) is the area under the ROC curve and summarizes the model's overall performance in a single number.
# Compute the ROC curve and AUC
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt
# Assume model is a trained binary classifier with a two-column (softmax) output
y_pred_proba = model.predict(X_test)[:, 1]  # Predicted probability of the positive class
# Compute the ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
# Compute the AUC
roc_auc = auc(fpr, tpr)
# Plot the ROC curve
plt.figure(figsize=(10, 8))
plt.plot(fpr, tpr, color='blue', lw=2, label=f'ROC curve (AUC = {roc_auc:.2f})')
# The diagonal corresponds to random guessing
plt.plot([0, 1], [0, 1], color='gray', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate (FPR)')
plt.ylabel('True Positive Rate (TPR)')
plt.title('ROC Curve')
plt.legend(loc="lower right")
plt.show()
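As a rough illustration of what the ROC computation does internally, the sketch below sweeps the decision threshold over the distinct scores and integrates the resulting curve with the trapezoid rule. It is a simplified stand-in, not scikit-learn's implementation; the function names and sample data are assumptions, and it requires at least one positive and one negative sample.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) lists obtained by lowering the threshold over the distinct scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    fpr, tpr = [0.0], [0.0]
    for threshold in sorted(set(scores), reverse=True):
        # Everything scoring at or above the threshold is predicted positive
        tp = sum(1 for t, s in zip(labels, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(labels, scores) if s >= threshold and t == 0)
        tpr.append(tp / positives)
        fpr.append(fp / negatives)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear curve through the (fpr, tpr) points."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# Hypothetical labels and predicted scores
demo_labels = [0, 0, 1, 1]
demo_scores = [0.1, 0.4, 0.35, 0.8]
demo_fpr, demo_tpr = roc_points(demo_labels, demo_scores)
demo_auc = trapezoid_auc(demo_fpr, demo_tpr)
print(list(zip(demo_fpr, demo_tpr)), demo_auc)
```

An AUC of 0.5 corresponds to the diagonal (random guessing) and 1.0 to a perfect ranking of positives above negatives.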
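1.2 Regression Metrics
1.2.1 Mean Squared Error (MSE)
Mean squared error is the most commonly used metric for regression tasks. It is the average of the squared differences between predicted and true values:
MSE = (1/n) * Σ(y_pred - y_true)²
# Compute the mean squared error
mse = tf.keras.metrics.MeanSquaredError()
mse.update_state(y_true, y_pred)
print(f"Mean squared error: {mse.result().numpy()}")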
1.2.2 Mean Absolute Error (MAE)
Mean absolute error is the average of the absolute differences between predicted and true values:
MAE = (1/n) * Σ|y_pred - y_true|
# Compute the mean absolute error
mae = tf.keras.metrics.MeanAbsoluteError()
mae.update_state(y_true, y_pred)
print(f"Mean absolute error: {mae.result().numpy()}")
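1.2.3 Root Mean Squared Error (RMSE)
Root mean squared error is the square root of the mean squared error, so it has the same units as the original data:
RMSE = √MSE
# Compute the root mean squared error from the MSE metric above
rmse = tf.math.sqrt(mse.result())
print(f"Root mean squared error: {rmse.numpy()}")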
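1.2.4 R² Score
The R² score measures how much of the variance in the data the model explains. Its maximum is 1, and values closer to 1 indicate a better fit; a value of 0 matches always predicting the mean, and the score becomes negative when the model does worse than that:
# Compute the R² score
# R2Score is only available in recent TensorFlow releases and expects 2-D inputs of shape (batch_size, num_outputs)
r2_score = tf.keras.metrics.R2Score()
r2_score.update_state(y_true, y_pred)
print(f"R² score: {r2_score.result().numpy()}")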
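The four regression metrics can be checked by hand on a tiny example. This is a minimal pure-Python sketch with made-up numbers and a hypothetical helper name; it assumes the true values are not all identical, since R² is undefined in that case.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE and R² for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "mae": mae, "rmse": rmse, "r2": r2}


targets = [1.0, 2.0, 3.0, 4.0]
good = regression_metrics(targets, [1.0, 2.0, 3.0, 5.0])  # one prediction off by 1
bad = regression_metrics(targets, [4.0, 3.0, 2.0, 1.0])   # predictions in reverse order
print(good)
print(bad)
```

The second set of predictions gives a negative R², which shows that the score is bounded above by 1 but has no lower bound of 0.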
2. Model Evaluation Methods
2.1 Evaluating a Model with evaluate()
TensorFlow provides the evaluate() method for measuring a model's performance on test data:
# Evaluate the model (returns the loss followed by the metrics passed to compile())
loss, accuracy = model.evaluate(X_test, y_test, verbose=1)
print(f"Test loss: {loss}")
print(f"Test accuracy: {accuracy}")
2.2 Making Predictions with predict()
Use the predict() method to make predictions on new data:
# Predict on the test data
y_pred = model.predict(X_test)
print(f"Prediction shape: {y_pred.shape}")
# For classification tasks, take the class with the highest predicted probability
y_pred_classes = tf.argmax(y_pred, axis=1)
print(f"Predicted classes: {y_pred_classes.numpy()}")
2.3 Cross-Validation
Cross-validation is a method for estimating how well a model generalizes. It splits the dataset into k folds, trains on k-1 folds, and validates on the remaining fold, repeating this k times so that each fold serves as the validation set once:
# Cross-validation with KFold (X and y are assumed to be NumPy arrays)
from sklearn.model_selection import KFold
# Number of folds
k = 5
kfold = KFold(n_splits=k, shuffle=True, random_state=42)
# Store the accuracy of each fold
accuracies = []
# Cross-validation loop
for train_index, val_index in kfold.split(X, y):
    # Split into training and validation sets
    X_train_fold, X_val_fold = X[train_index], X[val_index]
    y_train_fold, y_val_fold = y[train_index], y[val_index]
    # Build a fresh model for every fold so no weights carry over
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(X.shape[1],)),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    # Compile the model
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    # Train the model
    model.fit(X_train_fold, y_train_fold, epochs=50, batch_size=32, verbose=0)
    # Evaluate the model on the held-out fold
    _, accuracy = model.evaluate(X_val_fold, y_val_fold, verbose=0)
    accuracies.append(accuracy)
    print(f"Fold {len(accuracies)} accuracy: {accuracy}")
# Compute the mean accuracy across folds
print(f"Mean accuracy: {sum(accuracies) / len(accuracies)}")
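The fold-splitting step itself needs no machine learning library. The sketch below is a simplified pure-Python illustration of how k shuffled folds can be derived from sample indices; `k_fold_indices` is a hypothetical helper and does not reproduce scikit-learn's shuffling.

```python
import random


def k_fold_indices(n_samples, k, seed=42):
    """Yield (train_indices, val_indices) pairs for k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so fold sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for fold_number, (train, val) in enumerate(folds, start=1):
    print(f"Fold {fold_number}: {len(train)} training samples, validation indices {sorted(val)}")
```

Every sample index appears in exactly one validation fold, which is what lets the averaged score use the whole dataset for validation without ever scoring a model on data it was trained on.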
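3. Saving and Loading Models
Saving a trained model is essential for later use and deployment. TensorFlow provides several ways to save and load models.
3.1 Saving a Model in HDF5 Format
HDF5 is a file format designed for storing large datasets, and it can hold a complete TensorFlow model:
# Save the complete model to an HDF5 file
# (recent Keras releases treat HDF5 as a legacy format and recommend the .keras extension)
model.save('model.h5')
print("Model saved to model.h5")
The saved HDF5 file contains:
- The model architecture
- The model weights
- The compile information (loss function, optimizer, etc.)
- The optimizer state (unless include_optimizer=False is passed when saving)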
3.2 Loading a Model in HDF5 Format
# Load the HDF5 model
loaded_model = tf.keras.models.load_model('model.h5')
print("Model loaded from model.h5")
# Evaluate the loaded model
loss, accuracy = loaded_model.evaluate(X_test, y_test, verbose=1)
print(f"Loaded model test loss: {loss}")
print(f"Loaded model test accuracy: {accuracy}")
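3.3 Saving a Model in SavedModel Format
SavedModel is TensorFlow's native model format. It stores the complete model and is well suited to deployment:
# Save the model in SavedModel format
# (with Keras 3, i.e. TensorFlow 2.16 and later, model.save() requires a .keras or .h5 extension;
# there, model.export('saved_model') writes a SavedModel instead)
model.save('saved_model')
print("Model saved to the saved_model directory")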
3.4 Loading a Model in SavedModel Format
# Load the SavedModel (this call applies to the Keras 2 API shipped with TensorFlow 2.15 and earlier)
loaded_model = tf.keras.models.load_model('saved_model')
print("Model loaded from the saved_model directory")
# Evaluate the loaded model
loss, accuracy = loaded_model.evaluate(X_test, y_test, verbose=1)
print(f"Loaded model test loss: {loss}")
print(f"Loaded model test accuracy: {accuracy}")
3.5 Saving Only the Model Weights
Sometimes we only need to save the model's weights, not its architecture:
# Save the model weights (Keras 3 requires the file name to end in .weights.h5)
model.save_weights('model.weights.h5')
print("Model weights saved to model.weights.h5")
3.6 Loading Model Weights
To load weights, first create a model with the same architecture as the original, then load the weights into it:
# Create a model with the same architecture as the original
new_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(X.shape[1],)),
    tf.keras.layers.Dense(10, activation='softmax')
])
# Compile the model
new_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# Load the model weights
new_model.load_weights('model.weights.h5')
print("Model weights loaded")
# Evaluate the model
loss, accuracy = new_model.evaluate(X_test, y_test, verbose=1)
print(f"Test loss after loading weights: {loss}")
print(f"Test accuracy after loading weights: {accuracy}")
3.7 Saving the Model Architecture
The model architecture can be saved on its own for later use. An architecture file contains no weights, so a model rebuilt from it starts untrained:
3.7.1 Saving in JSON Format
# Save the model architecture as JSON
model_json = model.to_json()
with open('model_architecture.json', 'w') as f:
    f.write(model_json)
print("Model architecture saved to model_architecture.json")
3.7.2 Loading the Model Architecture from JSON
# Load the model architecture from JSON
with open('model_architecture.json', 'r') as f:
    model_json = f.read()
loaded_model = tf.keras.models.model_from_json(model_json)
print("Model architecture loaded from model_architecture.json")
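3.7.3 Saving in YAML Format
Note: model.to_yaml() and model_from_yaml() were removed in TensorFlow 2.6 because YAML deserialization could be abused to execute arbitrary code. On current versions both calls raise an error, so prefer the JSON format. The code in 3.7.3 and 3.7.4 applies only to TensorFlow 2.5 and earlier.
# Save the model architecture as YAML (TensorFlow 2.5 and earlier only)
model_yaml = model.to_yaml()
with open('model_architecture.yaml', 'w') as f:
    f.write(model_yaml)
print("Model architecture saved to model_architecture.yaml")
3.7.4 Loading the Model Architecture from YAML
# Load the model architecture from YAML (TensorFlow 2.5 and earlier only)
with open('model_architecture.yaml', 'r') as f:
    model_yaml = f.read()
loaded_model = tf.keras.models.model_from_yaml(model_yaml)
print("Model architecture loaded from model_architecture.yaml")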
4. Custom Evaluation Metrics
In addition to TensorFlow's built-in metrics, we can define our own:
# Custom metric class
class CustomAccuracy(tf.keras.metrics.Metric):
    def __init__(self, name='custom_accuracy', **kwargs):
        super().__init__(name=name, **kwargs)
        self.correct = self.add_weight(name='correct', initializer='zeros')
        self.total = self.add_weight(name='total', initializer='zeros')
    def update_state(self, y_true, y_pred, sample_weight=None):
        # Convert predicted probabilities to class indices
        y_pred = tf.argmax(y_pred, axis=1)
        # Flatten the labels and match the dtype of y_pred so tf.equal can compare them
        y_true = tf.cast(tf.reshape(y_true, [-1]), y_pred.dtype)
        # Mark correct predictions with 1.0 and incorrect ones with 0.0
        correct = tf.cast(tf.equal(y_true, y_pred), dtype=tf.float32)
        # If sample weights are provided, weight both the hits and the total
        if sample_weight is not None:
            sample_weight = tf.cast(tf.reshape(sample_weight, [-1]), dtype=tf.float32)
            correct = tf.multiply(correct, sample_weight)
            total = tf.reduce_sum(sample_weight)
        else:
            total = tf.cast(tf.size(y_true), dtype=tf.float32)
        # Update the state
        self.correct.assign_add(tf.reduce_sum(correct))
        self.total.assign_add(total)
    def result(self):
        # Compute accuracy; returns 0 if no samples have been seen yet
        return tf.math.divide_no_nan(self.correct, self.total)
    def reset_state(self):
        # Reset the state (older TensorFlow releases call reset_states() instead)
        self.correct.assign(0.0)
        self.total.assign(0.0)
# Use the custom metric
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy', CustomAccuracy()]
)
# Evaluate the model (one return value per metric, after the loss)
loss, accuracy, custom_accuracy = model.evaluate(X_test, y_test, verbose=1)
print(f"Custom accuracy: {custom_accuracy}")
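The reason a metric keeps running totals, instead of averaging per-batch results, can be seen in a framework-free sketch. `RunningAccuracy` below is a hypothetical pure-Python class that only mirrors the update_state / result / reset_state pattern; it is not a Keras metric.

```python
class RunningAccuracy:
    """Accumulates correct and total counts across batches, like a stateful metric."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update_state(self, y_true, y_pred):
        # Add this batch's counts to the running totals
        self.correct += sum(1 for t, p in zip(y_true, y_pred) if t == p)
        self.total += len(y_true)

    def result(self):
        # Accuracy over everything seen since the last reset
        return self.correct / self.total if self.total else 0.0

    def reset_state(self):
        self.correct = 0
        self.total = 0


metric = RunningAccuracy()
metric.update_state([0, 1, 1], [0, 1, 0])  # batch 1: 2 of 3 correct
metric.update_state([1, 0], [1, 0])        # batch 2: 2 of 2 correct
accuracy_over_batches = metric.result()    # 4 of 5 correct overall
metric.reset_state()
accuracy_after_reset = metric.result()
print(accuracy_over_batches, accuracy_after_reset)
```

Averaging the two batch accuracies (2/3 and 1) would give about 0.83, while the true accuracy over all five samples is 0.8; accumulating counts avoids this bias when batch sizes differ.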
5. Model Export and Deployment
5.1 Exporting to TensorFlow Lite
TensorFlow Lite is the lightweight version of TensorFlow, suited to deployment on mobile and embedded devices:
# Convert the SavedModel directory to a TensorFlow Lite model
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model')
tflite_model = converter.convert()
# Save the TensorFlow Lite model
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
print("Model exported to TensorFlow Lite format")
5.2 Exporting to ONNX
ONNX (Open Neural Network Exchange) is an open model format that allows models to be converted between frameworks:
# Install onnx and tf2onnx
# pip install onnx tf2onnx
# Export to ONNX format
import tf2onnx
# Convert the model
tf2onnx.convert.from_keras(model, output_path='model.onnx')
print("Model exported to ONNX format")
5.3 Deploying a Model with TensorFlow Serving
TensorFlow Serving is a system for deploying machine learning models. It supports model version management and A/B testing:
# Save the model in the directory layout TensorFlow Serving expects
import os
# Create the model version directory
model_version = 1
export_dir = os.path.join('serving_model', str(model_version))
# Save the model
tf.saved_model.save(model, export_dir)
print(f"Model saved to {export_dir}")
Then start the service with TensorFlow Serving. The REST API is exposed through --rest_api_port (the --port flag configures the gRPC endpoint):
tensorflow_model_server --model_base_path="/path/to/serving_model" --model_name="my_model" --rest_api_port=8501
Call the model through the REST API:
import requests
import json
# Prepare the test data
test_data = X_test[:1].tolist()
# Build the request body
request_data = json.dumps({
    "instances": test_data
})
# Send the request
response = requests.post(
    'http://localhost:8501/v1/models/my_model:predict',
    data=request_data,
    headers={'Content-Type': 'application/json'}
)
# Fail early on HTTP errors, then parse the response
response.raise_for_status()
response_data = response.json()
predictions = response_data['predictions']
print(f"Predictions: {predictions}")
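The request above targets whichever model version the server is serving by default. The REST API also accepts a version segment in the path, and the sketch below shows one way to build both forms. `build_predict_request` is a hypothetical helper, and the host, port, and model name are the same assumed values as above; the code only builds strings and does not contact a server.

```python
import json


def build_predict_request(host, port, model_name, instances, version=None):
    """Return (url, body) for a TensorFlow Serving REST predict call."""
    # An optional /versions/<n> segment pins the request to one model version
    version_part = f"/versions/{version}" if version is not None else ""
    url = f"http://{host}:{port}/v1/models/{model_name}{version_part}:predict"
    body = json.dumps({"instances": instances})
    return url, body


sample = [[0.1, 0.2, 0.3]]  # one hypothetical input row
url, body = build_predict_request("localhost", 8501, "my_model", sample)
versioned_url, _ = build_predict_request("localhost", 8501, "my_model", sample, version=1)
print(url)
print(versioned_url)
print(body)
```

Pinning a version this way is useful for A/B testing, since two clients can call different versions of the same model side by side.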
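6. Best Practices
6.1 Best Practices for Saving Models
- Save the complete model: in most cases, save the complete model (architecture, weights, and compile information).
- Save periodically: save the model at regular intervals during training so that training can be resumed after an interruption.
- Save the best model: use the ModelCheckpoint callback to keep only the model with the lowest validation loss.
- Use versioning: give each model a version number to simplify management and rollback.
- Choose an appropriate format: pick the model format that matches the deployment requirements.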
6.2 Best Practices for Model Evaluation
- Use an independent test set: the test set should be completely separate from the training and validation sets so that the evaluation results are reliable.
- Use several metrics: different metrics measure performance from different angles, and considering them together gives a more complete picture of the model.
- Use cross-validation: for small datasets, cross-validation gives more reliable evaluation results.
- Visualize the results: use tools such as the confusion matrix and the ROC curve to show model performance at a glance.
- Consider the data distribution: when evaluating a model, take the distribution of the data into account and check that the model performs well on different subsets of the data.
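As an illustration of the last point, the following minimal pure-Python sketch computes accuracy separately for each subset of the data. The group labels, sample data, and the helper name `accuracy_by_group` are hypothetical; in practice the groups might be classes, data sources, or user segments.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return a dict mapping each group label to the accuracy within that group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


# Hypothetical labels, predictions and group membership
sample_true = [1, 0, 1, 1, 0, 1]
sample_pred = [1, 0, 1, 0, 1, 0]
sample_groups = ["a", "a", "a", "b", "b", "b"]

per_group = accuracy_by_group(sample_true, sample_pred, sample_groups)
overall = sum(int(t == p) for t, p in zip(sample_true, sample_pred)) / len(sample_true)
print(overall, per_group)
```

Here the overall accuracy of 0.5 hides the fact that the model is perfect on group "a" and wrong on every sample in group "b", which is exactly the kind of gap a single aggregate metric can conceal.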
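7. Exercises
Exercise 1: Model Evaluation
- Build an image classification model and train it on the MNIST dataset.
- Evaluate the model with several metrics (accuracy, precision, recall, F1 score).
- Plot the confusion matrix and the ROC curve.
- Use cross-validation to assess how well the model generalizes.
Exercise 2: Saving and Loading Models
- Train a regression model on the Boston housing dataset.
- Save the complete model in HDF5 format.
- Save the model weights and architecture to separate files.
- Load the saved model and evaluate its performance.
- Export the model to TensorFlow Lite format.
Exercise 3: Custom Evaluation Metrics
- Create a custom metric for a binary classification task that computes specificity.
- Use the custom metric when compiling the model.
- Evaluate the model and compare the results of the custom metric with those of the built-in metrics.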