TensorFlow Model Evaluation and Saving

Learn how to evaluate model performance with metrics such as accuracy, loss, and the confusion matrix, and how to save and load models.

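1. Model Evaluation Metrics

Evaluating a model is an essential step in measuring its performance. Different task types call for different evaluation metrics.

1.1 Evaluation Metrics for Classification Tasks

1.1.1 Accuracy

Accuracy is the most commonly used classification metric. It is the proportion of correctly predicted samples out of all samples:

Accuracy = correctly predicted samples / total samples

# Compute accuracy
import tensorflow as tf

# y_true and y_pred must both hold class labels (not probabilities),
# because Accuracy checks the two for exact equality
accuracy = tf.keras.metrics.Accuracy()
accuracy.update_state(y_true, y_pred)
print(f"Accuracy: {accuracy.result().numpy()}")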

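1.1.2 Precision

Precision is the proportion of samples predicted as positive that are actually positive:

Precision = true positives / (true positives + false positives)

# Compute precision
# y_pred may hold probabilities; values above the default threshold of 0.5 count as positive
precision = tf.keras.metrics.Precision()
precision.update_state(y_true, y_pred)
print(f"Precision: {precision.result().numpy()}")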

1.1.3 Recall

Recall is the proportion of actually positive samples that are correctly predicted as positive:

Recall = true positives / (true positives + false negatives)

# Compute recall
recall = tf.keras.metrics.Recall()
recall.update_state(y_true, y_pred)
print(f"Recall: {recall.result().numpy()}")

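1.1.4 F1 Score

The F1 score is the harmonic mean of precision and recall, so it balances the two:

F1 = 2 * (Precision * Recall) / (Precision + Recall)

# Compute the F1 score from the precision and recall metrics above
p = precision.result().numpy()
r = recall.result().numpy()
# Guard against division by zero when both precision and recall are 0
f1_score = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
print(f"F1 score: {f1_score}")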
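To make the three formulas above concrete, here is a minimal pure-Python sketch that counts true/false positives and negatives by hand. The function name `binary_classification_report` and the demo labels are made up for illustration; in practice the Keras metrics shown above would be used.

```python
def binary_classification_report(y_true, y_pred):
    """Derive accuracy, precision, recall, and F1 from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Fall back to 0.0 when a denominator would be zero
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for illustration: 3 TP, 2 FP, 1 FN, 2 TN
y_true_demo = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred_demo = [1, 0, 0, 1, 1, 1, 1, 0]
report = binary_classification_report(y_true_demo, y_pred_demo)
print(report)
```

With these labels the accuracy is 0.625, the precision 0.6, and the recall 0.75, which gives an F1 score of about 0.667: the harmonic mean sits closer to the lower of the two values.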

1.1.5 Confusion Matrix

A confusion matrix is a table for visualizing the performance of a classification model. It shows, for each class, how the model's predictions are distributed:

# Compute the confusion matrix (rows are true labels, columns are predicted labels)
confusion_matrix = tf.math.confusion_matrix(y_true, y_pred)
print("Confusion matrix:")
print(confusion_matrix.numpy())

# Visualize the confusion matrix
import matplotlib.pyplot as plt
import seaborn as sns

def plot_confusion_matrix(cm, class_names):
    plt.figure(figsize=(10, 8))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_names, yticklabels=class_names)
    plt.xlabel('Predicted label')
    plt.ylabel('True label')
    plt.title('Confusion Matrix')
    plt.show()

# Assume class_names is the list of class names
class_names = ['Class 0', 'Class 1', 'Class 2']
plot_confusion_matrix(confusion_matrix.numpy(), class_names)
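As a rough illustration of what the matrix contains, the sketch below builds one by hand with plain Python lists, using the same layout as tf.math.confusion_matrix (rows are true labels, columns are predicted labels). The helper names and the sample labels are assumptions made for this example only.

```python
def confusion_matrix_counts(y_true, y_pred, num_classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Recall of class i is the diagonal entry divided by the sum of row i."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


# Made-up labels for three classes
labels_true = [0, 0, 1, 1, 2, 2, 2]
labels_pred = [0, 1, 1, 1, 2, 0, 2]

cm_demo = confusion_matrix_counts(labels_true, labels_pred, num_classes=3)
recalls = per_class_recall(cm_demo)
for row in cm_demo:
    print(row)
print(recalls)
```

Off-diagonal entries show which classes get confused with each other: here one sample of class 0 is predicted as class 1, and one sample of class 2 is predicted as class 0.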

1.1.6 ROC Curve and AUC

The ROC curve (Receiver Operating Characteristic curve) is used to evaluate binary classification models. AUC (Area Under the Curve) is the area under the ROC curve and summarizes the model's overall performance.

# Compute the ROC curve and AUC
from sklearn.metrics import roc_curve, auc

# Assume model is a trained binary classifier whose output has two columns
# (for example, a two-unit softmax layer)
y_pred_proba = model.predict(X_test)[:, 1]  # predicted probability of the positive class

# Compute the ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)

# Compute the AUC
roc_auc = auc(fpr, tpr)

# Plot the ROC curve
plt.figure(figsize=(10, 8))
plt.plot(fpr, tpr, color='blue', lw=2, label=f'ROC curve (AUC = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='gray', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate (FPR)')
plt.ylabel('True Positive Rate (TPR)')
plt.title('ROC Curve')
plt.legend(loc="lower right")
plt.show()
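One way to build intuition for AUC: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it that way by comparing every positive/negative pair. This is an illustrative O(n²) version, not how scikit-learn computes it; the function name and the scores are made up.

```python
def auc_by_pairs(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0      # positive ranked above negative
            elif pos_score == neg_score:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up labels and predicted probabilities of the positive class
y_true_small = [0, 0, 1, 1]
scores_small = [0.1, 0.4, 0.35, 0.8]
auc_value = auc_by_pairs(y_true_small, scores_small)
print(auc_value)
```

Three of the four positive/negative pairs are ordered correctly here, so the AUC is 0.75. A value of 0.5 corresponds to the gray diagonal in the plot above (random ranking).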

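1.2 Evaluation Metrics for Regression Tasks

1.2.1 Mean Squared Error (MSE)

Mean squared error is the most commonly used metric for regression tasks. It is the average of the squared differences between predicted and true values:

MSE = (1/n) * Σ(y_pred - y_true)²

# Compute the mean squared error
mse = tf.keras.metrics.MeanSquaredError()
mse.update_state(y_true, y_pred)
print(f"MSE: {mse.result().numpy()}")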

1.2.2 Mean Absolute Error (MAE)

Mean absolute error is the average of the absolute differences between predicted and true values:

MAE = (1/n) * Σ|y_pred - y_true|

# Compute the mean absolute error
mae = tf.keras.metrics.MeanAbsoluteError()
mae.update_state(y_true, y_pred)
print(f"MAE: {mae.result().numpy()}")

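1.2.3 Root Mean Squared Error (RMSE)

Root mean squared error is the square root of the mean squared error, so it is expressed in the same units as the original data:

RMSE = √MSE

# Compute the root mean squared error from the MSE metric above
rmse = tf.math.sqrt(mse.result())
print(f"RMSE: {rmse.numpy()}")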

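1.2.4 R² Score

The R² score measures how much of the variance in the data the model explains. Its maximum is 1, and values closer to 1 indicate better performance. A score of 0 means the model does no better than always predicting the mean of the targets, and the score is negative when the model does worse than that:

R² = 1 - Σ(y_true - y_pred)² / Σ(y_true - mean(y_true))²

# Compute the R² score
# (R2Score is only available in recent TensorFlow releases and expects 2D inputs
#  of shape (batch_size, num_outputs); reshape 1D targets first if needed)
r2_score = tf.keras.metrics.R2Score()
r2_score.update_state(y_true, y_pred)
print(f"R² score: {r2_score.result().numpy()}")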
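The four regression metrics can be checked by hand on a tiny example. The sketch below is plain Python with made-up numbers and a hypothetical helper name, `regression_metrics`; it also shows that R² drops below 0 for predictions that are worse than always predicting the mean.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, MAE, RMSE, and R² for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "mae": mae, "rmse": rmse, "r2": r2}


# Made-up targets with a reasonable and a poor set of predictions
targets = [3.0, -0.5, 2.0, 7.0]
good = regression_metrics(targets, [2.5, 0.0, 2.0, 8.0])
poor = regression_metrics(targets, [10.0, 10.0, 10.0, 10.0])

print(good)
print(poor["r2"])
```

For the first set of predictions the MSE is 0.375, the MAE is 0.5, and R² is about 0.949. The constant prediction of 10.0 produces a negative R².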

2. Model Evaluation Methods

2.1 Evaluating a Model with evaluate()

TensorFlow provides the evaluate() method for measuring a model's performance on test data:

# Evaluate the model
# (returns the loss followed by the metrics passed to compile(), here only accuracy)
loss, accuracy = model.evaluate(X_test, y_test, verbose=1)
print(f"Test loss: {loss}")
print(f"Test accuracy: {accuracy}")

2.2 Making Predictions with predict()

Use the predict() method to make predictions on new data:

# Predict on the test data
y_pred = model.predict(X_test)
print(f"Prediction shape: {y_pred.shape}")

# For classification tasks, take the class with the highest predicted probability
y_pred_classes = tf.argmax(y_pred, axis=1)
print(f"Predicted classes: {y_pred_classes.numpy()}")
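The argmax step simply picks, for each row of predicted probabilities, the index of the largest value. A minimal pure-Python sketch of the same idea, with made-up probabilities and a hypothetical helper name:

```python
def argmax_rows(probabilities):
    """Return the index of the largest value in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]


# Made-up softmax outputs for two samples and three classes
probabilities_demo = [
    [0.1, 0.7, 0.2],   # class 1 has the highest probability
    [0.8, 0.1, 0.1],   # class 0 has the highest probability
]
predicted_classes_demo = argmax_rows(probabilities_demo)
print(predicted_classes_demo)
```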

2.3 Cross-Validation

Cross-validation is a way to assess how well a model generalizes. It splits the dataset into k folds, trains on k-1 folds, validates on the remaining fold, and repeats this k times so that each fold serves as the validation set once:

# Cross-validation with KFold (X and y are assumed to be NumPy arrays)
from sklearn.model_selection import KFold

# Number of folds
k = 5
kfold = KFold(n_splits=k, shuffle=True, random_state=42)

# Accuracy of each fold
accuracies = []

# Cross-validation loop
for train_index, val_index in kfold.split(X, y):
    # Split into training and validation sets
    X_train_fold, X_val_fold = X[train_index], X[val_index]
    y_train_fold, y_val_fold = y[train_index], y[val_index]

    # Build a fresh model for every fold so no weights carry over
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(X.shape[1],)),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    # Compile the model
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

    # Train the model
    model.fit(X_train_fold, y_train_fold, epochs=50, batch_size=32, verbose=0)

    # Evaluate the model
    _, accuracy = model.evaluate(X_val_fold, y_val_fold, verbose=0)
    accuracies.append(accuracy)
    print(f"Fold {len(accuracies)} accuracy: {accuracy}")

# Compute the mean accuracy
print(f"Mean accuracy: {sum(accuracies) / len(accuracies)}")
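To see what the fold split itself does, independent of any model, the following sketch generates the train/validation index lists by hand. It is a simplified stand-in for KFold without shuffling; the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for k contiguous folds, without shuffling."""
    # The first n_samples % k folds get one extra sample so every sample is used
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_indices = list(range(start, start + size))
        train_indices = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_indices, val_indices
        start += size


# Split 10 samples into 3 folds
folds = list(k_fold_indices(10, 3))
for fold_number, (train_indices, val_indices) in enumerate(folds, start=1):
    print(f"Fold {fold_number}: train={train_indices} val={val_indices}")
```

Each sample lands in the validation set exactly once and is never in the training and validation set of the same fold, which is what keeps the k scores honest estimates of generalization.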

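3. Saving and Loading Models

Saving a trained model is essential for later use and deployment. TensorFlow provides several ways to save and load models.

3.1 Saving a Model in HDF5 Format

HDF5 is a file format for storing large datasets, and it can hold a complete TensorFlow model:

# Save the complete model to an HDF5 file
# (recent Keras releases treat HDF5 as a legacy format and recommend the native .keras format)
model.save('model.h5')
print("Model saved to model.h5")

The saved HDF5 file contains:

  • the model architecture
  • the model weights
  • the compile information (loss function, optimizer, etc.)
  • the optimizer state (included by default, so training can be resumed)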

3.2 Loading a Model from HDF5 Format

# Load the model from the HDF5 file
loaded_model = tf.keras.models.load_model('model.h5')
print("Model loaded from model.h5")

# Evaluate the loaded model
loss, accuracy = loaded_model.evaluate(X_test, y_test, verbose=1)
print(f"Loaded model test loss: {loss}")
print(f"Loaded model test accuracy: {accuracy}")

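3.3 Saving a Model in SavedModel Format

SavedModel is TensorFlow's native model format. It stores the complete model and is well suited to deployment:

# Save the model in SavedModel format
# (with Keras 2, a path without a file extension selects SavedModel;
#  with Keras 3, i.e. TensorFlow 2.16 and later, use model.export('saved_model') instead)
model.save('saved_model')
print("Model saved to the saved_model directory")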

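3.4 Loading a Model from SavedModel Format

# Load the model from the SavedModel directory
# (with Keras 3, load_model() no longer reads SavedModel directories;
#  use tf.saved_model.load() or keras.layers.TFSMLayer for inference instead)
loaded_model = tf.keras.models.load_model('saved_model')
print("Model loaded from the saved_model directory")

# Evaluate the loaded model
loss, accuracy = loaded_model.evaluate(X_test, y_test, verbose=1)
print(f"Loaded model test loss: {loss}")
print(f"Loaded model test accuracy: {accuracy}")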

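3.5 Saving Only the Model Weights

Sometimes we only need to save the model's weights, not its architecture:

# Save the model weights
# (Keras 3 requires the file name to end in .weights.h5, e.g. 'model.weights.h5')
model.save_weights('model_weights.h5')
print("Model weights saved to model_weights.h5")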

3.6 Loading Model Weights

To load weights, first create a model with the same architecture as the original, then load the weights into it:

# Create a model with the same architecture as the original
new_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(X.shape[1],)),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile the model
new_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Load the model weights
new_model.load_weights('model_weights.h5')
print("Model weights loaded")

# Evaluate the model
loss, accuracy = new_model.evaluate(X_test, y_test, verbose=1)
print(f"Test loss after loading weights: {loss}")
print(f"Test accuracy after loading weights: {accuracy}")

3.7 Saving the Model Architecture

The model architecture can be saved on its own for later use:

3.7.1 Saving as JSON

# Save the model architecture as JSON
model_json = model.to_json()
with open('model_architecture.json', 'w') as f:
    f.write(model_json)
print("Model architecture saved to model_architecture.json")

3.7.2 Loading the Model Architecture from JSON

# Load the model architecture from JSON
with open('model_architecture.json', 'r') as f:
    model_json = f.read()

# The restored model has the same layers but freshly initialized weights,
# and it must be compiled again before training or evaluation
loaded_model = tf.keras.models.model_from_json(model_json)
print("Model architecture loaded from model_architecture.json")

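3.7.3 Saving as YAML

Note: model.to_yaml() and model_from_yaml() were removed in TensorFlow 2.6 because YAML deserialization could execute arbitrary code. On TensorFlow 2.6 and later the calls below raise an error, so use the JSON format instead. The code applies only to older TensorFlow releases.

# Save the model architecture as YAML (TensorFlow releases before 2.6 only)
model_yaml = model.to_yaml()
with open('model_architecture.yaml', 'w') as f:
    f.write(model_yaml)
print("Model architecture saved to model_architecture.yaml")

3.7.4 Loading the Model Architecture from YAML

# Load the model architecture from YAML (TensorFlow releases before 2.6 only)
with open('model_architecture.yaml', 'r') as f:
    model_yaml = f.read()

loaded_model = tf.keras.models.model_from_yaml(model_yaml)
print("Model architecture loaded from model_architecture.yaml")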

4. Custom Evaluation Metrics

Besides TensorFlow's built-in metrics, we can define our own:

# Custom metric class
class CustomAccuracy(tf.keras.metrics.Metric):
    def __init__(self, name='custom_accuracy', **kwargs):
        super().__init__(name=name, **kwargs)
        self.correct = self.add_weight(name='correct', initializer='zeros')
        self.total = self.add_weight(name='total', initializer='zeros')

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Convert predicted probabilities to class indices
        y_pred = tf.argmax(y_pred, axis=1)
        # Flatten the labels and match the integer dtype returned by tf.argmax,
        # otherwise tf.equal fails on mismatched dtypes or shapes
        y_true = tf.cast(tf.reshape(y_true, [-1]), y_pred.dtype)
        # 1.0 where the prediction is correct, 0.0 otherwise
        correct = tf.cast(tf.equal(y_true, y_pred), dtype=tf.float32)

        if sample_weight is not None:
            # Apply the sample weights and use their sum as the denominator
            sample_weight = tf.cast(tf.reshape(sample_weight, [-1]), dtype=tf.float32)
            correct = tf.multiply(correct, sample_weight)
            count = tf.reduce_sum(sample_weight)
        else:
            count = tf.cast(tf.size(y_true), dtype=tf.float32)

        # Update the state
        self.correct.assign_add(tf.reduce_sum(correct))
        self.total.assign_add(count)

    def result(self):
        # Accuracy so far; returns 0 instead of NaN before any update
        return tf.math.divide_no_nan(self.correct, self.total)

    def reset_state(self):
        # Reset the state (this method was named reset_states in older TensorFlow releases)
        self.correct.assign(0.0)
        self.total.assign(0.0)

# Use the custom metric
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy', CustomAccuracy()]
)

# Evaluate the model (values are returned in the order: loss, then each metric)
loss, accuracy, custom_accuracy = model.evaluate(X_test, y_test, verbose=1)
print(f"Custom accuracy: {custom_accuracy}")
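The reason a metric keeps running totals instead of averaging per-batch results is easiest to see without TensorFlow. The sketch below mirrors the update_state / result / reset_state pattern in plain Python; the class name `StreamingAccuracy` and the batches are made up for illustration.

```python
class StreamingAccuracy:
    """Accumulate correct/total counts across batches, like a Keras metric."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update_state(self, y_true, y_pred):
        # Add this batch's counts to the running totals
        self.correct += sum(1 for t, p in zip(y_true, y_pred) if t == p)
        self.total += len(y_true)

    def result(self):
        return self.correct / self.total if self.total else 0.0

    def reset_state(self):
        self.correct = 0
        self.total = 0


# Two made-up batches of different sizes: (true labels, predicted labels)
batches = [
    ([0, 1, 1, 0], [0, 1, 0, 0]),  # 3 of 4 correct
    ([1], [0]),                    # 0 of 1 correct
]

metric = StreamingAccuracy()
batch_accuracies = []
for y_true_batch, y_pred_batch in batches:
    metric.update_state(y_true_batch, y_pred_batch)
    hits = sum(1 for t, p in zip(y_true_batch, y_pred_batch) if t == p)
    batch_accuracies.append(hits / len(y_true_batch))

streaming_result = metric.result()
mean_of_batches = sum(batch_accuracies) / len(batch_accuracies)
print(streaming_result, mean_of_batches)
```

The running totals give 3/5 = 0.6, the true accuracy over all five samples, while naively averaging the two batch accuracies gives 0.375 because the small last batch is weighted as heavily as the full one.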

5. Model Export and Deployment

5.1 Exporting to TensorFlow Lite

TensorFlow Lite is a lightweight version of TensorFlow, suited to deployment on mobile and embedded devices:

# Convert the SavedModel directory from section 3.3 to a TensorFlow Lite model
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model')
tflite_model = converter.convert()

# Save the TensorFlow Lite model
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
print("Model exported to TensorFlow Lite format")

5.2 Exporting to ONNX

ONNX (Open Neural Network Exchange) is an open model format that allows models to be converted between frameworks:

# Install onnx and tf2onnx
# pip install onnx tf2onnx

# Export to ONNX format
import tf2onnx

# Convert the model
tf2onnx.convert.from_keras(model, output_path='model.onnx')
print("Model exported to ONNX format")

5.3 Deploying a Model with TensorFlow Serving

TensorFlow Serving is a system for deploying machine learning models. It supports model version management and A/B testing:

# Save the model in the layout TensorFlow Serving expects
import os

# Create the model version directory
model_version = 1
export_dir = os.path.join('serving_model', str(model_version))

# Save the model
tf.saved_model.save(model, export_dir)
print(f"Model saved to {export_dir}")

Then start the service with TensorFlow Serving. The REST API is exposed through --rest_api_port (--port sets the gRPC port):

tensorflow_model_server --model_base_path="/path/to/serving_model" --model_name="my_model" --rest_api_port=8501

Call the model through the REST API:

import requests
import json

# Prepare the test data (one sample; the outer list is the batch dimension)
test_data = X_test[:1].tolist()

# Build the request body
request_data = json.dumps({
    "instances": test_data
})

# Send the request
response = requests.post(
    'http://localhost:8501/v1/models/my_model:predict',
    data=request_data,
    headers={'Content-Type': 'application/json'}
)
# Fail early with a clear error if the server returned an HTTP error status
response.raise_for_status()

# Parse the response
response_data = response.json()
predictions = response_data['predictions']
print(f"Predictions: {predictions}")
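The request and response bodies are plain JSON, so their shape can be inspected without a running server. The sketch below builds a request body in the "instances" row format and parses a reply of the "predictions" form. The feature values and the reply string are hand-written stand-ins, not real model output.

```python
import json

# One sample with three features; the outer list is the batch dimension
# (hypothetical values standing in for X_test[:1].tolist())
test_data = [[0.5, 1.2, -0.3]]
request_body = json.dumps({"instances": test_data})
print(request_body)

# Hand-written stand-in for a server reply, not real model output
sample_response_text = '{"predictions": [[0.1, 0.7, 0.2]]}'
predictions = json.loads(sample_response_text)["predictions"]
print(len(predictions), "prediction row(s) returned")
```

The server returns one prediction row per instance sent, so the length of "predictions" should match the length of "instances".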

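6. Best Practices

6.1 Best Practices for Saving Models

  • Save the complete model: in most cases, save the complete model (architecture, weights, and compile information).
  • Save checkpoints regularly: save the model periodically during training so that training can resume after an interruption.
  • Save the best model: use the ModelCheckpoint callback to keep only the model with the lowest validation loss.
  • Use versioning: give each model a version number to make management and rollback easier.
  • Choose a suitable format: pick the model format that matches your deployment requirements.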
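Version numbering can be automated with a few lines of standard-library Python. The sketch below picks the next numeric subdirectory under a base directory, matching the numbered layout that TensorFlow Serving reads (it serves the highest version by default). The helper name `next_version_dir` is hypothetical, and the demo runs in a temporary directory.

```python
import os
import tempfile


def next_version_dir(base_dir):
    """Return base_dir/<n+1>, where n is the highest existing numeric subdirectory."""
    os.makedirs(base_dir, exist_ok=True)
    versions = [
        int(name)
        for name in os.listdir(base_dir)
        if name.isdigit() and os.path.isdir(os.path.join(base_dir, name))
    ]
    return os.path.join(base_dir, str(max(versions, default=0) + 1))


# Demo in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "serving_model")
    first = next_version_dir(base)    # no versions yet
    os.makedirs(first)                # pretend a model was exported here
    second = next_version_dir(base)   # version 1 now exists
    first_name = os.path.basename(first)
    second_name = os.path.basename(second)

print(first_name, second_name)
```

The returned path could then be passed as the export directory to tf.saved_model.save, so each export lands in a new version directory and older versions remain available for rollback.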

6.2 Best Practices for Model Evaluation

  • Use an independent test set: the test set should be completely separate from the training and validation sets so that the evaluation results are reliable.
  • Use several metrics: different metrics measure performance from different angles, and considering them together gives a fuller picture of the model.
  • Use cross-validation: for small datasets, cross-validation gives more reliable evaluation results.
  • Visualize the results: use tools such as the confusion matrix and the ROC curve to show model performance at a glance.
  • Consider the data distribution: when evaluating, take the distribution of the data into account and make sure the model performs well on different subsets.
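Checking performance on subsets can be as simple as grouping the samples by some attribute and computing the metric per group. A minimal sketch, assuming each sample carries a group label (the labels 'a' and 'b', the data, and the helper name are made up):

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Compute accuracy separately for every distinct group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


# Made-up labels: the model is perfect on group 'a' and always wrong on group 'b'
y_true_groups = [1, 0, 1, 1, 0]
y_pred_groups = [1, 0, 1, 0, 1]
group_labels = ['a', 'a', 'a', 'b', 'b']

per_group = accuracy_by_group(y_true_groups, y_pred_groups, group_labels)
overall = sum(int(t == p) for t, p in zip(y_true_groups, y_pred_groups)) / len(y_true_groups)
print(per_group, overall)
```

The overall accuracy of 0.6 hides the fact that every sample in group 'b' is misclassified, which is exactly the kind of gap a single aggregate number can mask.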

7. Exercises

Exercise 1: Model Evaluation

  1. Build an image classification model and train it on the MNIST dataset.
  2. Evaluate the model with several metrics (accuracy, precision, recall, F1 score).
  3. Plot the confusion matrix and the ROC curve.
  4. Use cross-validation to assess how well the model generalizes.

Exercise 2: Saving and Loading Models

  1. Train a regression model on the Boston housing dataset.
  2. Save the complete model in HDF5 format.
  3. Save the model weights and the architecture to separate files.
  4. Load the saved model and evaluate its performance.
  5. Export the model to TensorFlow Lite format.

Exercise 3: Custom Evaluation Metrics

  1. Create a custom metric for a binary classification task that computes specificity.
  2. Use the custom metric when compiling the model.
  3. Evaluate the model and compare the custom metric with the built-in metrics.