1. What Is Deep Learning?
Deep learning is a branch of machine learning that uses neural networks with many layers to learn complex representations of data. Compared with traditional machine learning, deep learning can extract features from data automatically, without manual feature engineering.
Tip
The "deep" in deep learning refers to the number of hidden layers in the neural network. Deeper networks can learn more abstract, more complex feature representations.
1.1 How Deep Learning Differs from Traditional Machine Learning
| Aspect | Traditional Machine Learning | Deep Learning |
|---|---|---|
| Feature extraction | Requires manually designed features | Learns features automatically from data |
| Data requirements | Works well on small to medium datasets | Needs large amounts of data to show its advantage |
| Compute resources | Relatively low compute requirements | Needs powerful compute (GPU/TPU) |
| Interpretability | Usually fairly interpretable | Often treated as a "black box"; harder to interpret |
| Suitable tasks | Structured data, simple patterns | Unstructured data, complex patterns |
2. The History of Deep Learning
The development of deep learning has gone through several important stages:
2.1 Early Development (1940s-1980s)
- 1943: McCulloch and Pitts propose the artificial neuron model
- 1957: Rosenblatt invents the perceptron
- 1969: Minsky and Papert point out the limitations of the perceptron
- 1986: Hinton and colleagues popularize the backpropagation algorithm
2.2 Revival (2000s-2010s)
- 2006: Hinton and colleagues propose deep belief networks and layer-wise pretraining
- 2009: Fei-Fei Li and colleagues create the ImageNet dataset
- 2012: AlexNet achieves a breakthrough in the ImageNet competition; deep learning begins its rise
- 2014: Generative adversarial networks (GANs) are proposed
- 2015: ResNet is proposed, addressing the vanishing-gradient problem in very deep networks
- 2017: The Transformer architecture is proposed, revolutionizing natural language processing
2.3 Boom (2010s-present)
- 2018: BERT is proposed, dramatically improving natural language processing
- 2020: GPT-3 is released, scaling to 175 billion parameters
- 2022: ChatGPT is released, sparking a global AI boom
- 2023: Large language models such as GPT-4 and Claude are released in quick succession
3. Core Deep Learning Architectures
3.1 Convolutional Neural Networks (CNN)
A convolutional neural network (CNN) is a deep learning architecture specialized for grid-like data such as images.
3.1.1 Basic Components of a CNN
- Convolutional layers: extract local features with convolution kernels
- Pooling layers: shrink the feature maps while keeping the important information
- Fully connected layers: perform the final classification or regression
- Activation functions: introduce non-linearity
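The components above can be sketched by tracing how a single image tensor flows through each one. A minimal PyTorch sketch, where the layer sizes and the fake 28x28 grayscale input are illustrative choices, not fixed requirements:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)   # fake batch: (batch, channels, height, width)

conv = nn.Conv2d(1, 16, kernel_size=3, padding=1)   # convolutional layer
pool = nn.MaxPool2d(2, 2)                           # pooling layer
fc = nn.Linear(16 * 14 * 14, 10)                    # fully connected layer
relu = nn.ReLU()                                    # activation function

x = relu(conv(x))     # -> (1, 16, 28, 28): 16 local-feature maps
x = pool(x)           # -> (1, 16, 14, 14): spatial size halved
x = fc(x.flatten(1))  # -> (1, 10): flattened features to class scores
print(x.shape)        # torch.Size([1, 10])
```

Note how only the pooling step changes the spatial size here (the convolution uses `padding=1` to preserve it), and the fully connected layer consumes whatever the flattened feature map happens to be.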
3.1.2 CNN Applications
- Image classification
- Object detection
- Image segmentation
- Face recognition
- Medical image analysis
3.2 Recurrent Neural Networks (RNN)
A recurrent neural network (RNN) is a deep learning architecture specialized for sequential data.
3.2.1 How RNNs Work
An RNN introduces recurrent connections into the network so it can model temporal dependencies in sequential data. Vanilla RNNs, however, suffer from vanishing and exploding gradients.
3.2.2 RNN Variants
- LSTM (long short-term memory): solves the long-range dependency problem
- GRU (gated recurrent unit): a simplified version of LSTM with better computational efficiency
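The two variants can be compared side by side on the same fake sequence. A minimal PyTorch sketch (the batch, length, and hidden sizes are illustrative): both modules take `(batch, seq_len, features)` with `batch_first=True`, but the LSTM carries an extra cell state while the GRU, being the simplified variant, has fewer parameters at the same sizes.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 20, 8)   # batch of 4 sequences, 20 steps, 8 features each

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
gru = nn.GRU(input_size=8, hidden_size=16, batch_first=True)

out_lstm, (h, c) = lstm(x)  # LSTM returns a hidden state h and a cell state c
out_gru, h_gru = gru(x)     # GRU keeps only a hidden state

print(out_lstm.shape, out_gru.shape)  # both: torch.Size([4, 20, 16])

# The GRU's three gates (vs the LSTM's four) mean fewer parameters:
print(sum(p.numel() for p in lstm.parameters()),
      sum(p.numel() for p in gru.parameters()))
```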
3.2.3 RNN Applications
- Natural language processing
- Speech recognition
- Time-series forecasting
- Machine translation
- Video analysis
3.3 Generative Adversarial Networks (GAN)
A generative adversarial network (GAN) is a deep learning architecture for generating new data.
3.3.1 How GANs Work
A GAN consists of two networks:
- Generator: tries to produce realistic data
- Discriminator: tries to tell real data from generated data
Through adversarial training the two networks keep improving each other, until the generator can produce high-quality fake data.
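One round of that adversarial training can be sketched in a few lines. A minimal PyTorch sketch on 1-D toy data, where the tiny networks and the stand-in "real" distribution are illustrative, not a production GAN:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))  # generator
D = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))  # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(32, 2) + 3.0           # stand-in for a batch of real samples
noise = torch.randn(32, 8)
ones, zeros = torch.ones(32, 1), torch.zeros(32, 1)

# Discriminator step: label real samples 1, generated samples 0
fake = G(noise).detach()                  # detach so this step does not update G
d_loss = bce(D(real), ones) + bce(D(fake), zeros)
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to make D label fresh fakes as real
g_loss = bce(D(G(noise)), ones)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(f'd_loss={d_loss.item():.3f}, g_loss={g_loss.item():.3f}')
```

The `detach()` is the key move: during the discriminator step the generator's output is treated as fixed data, and only during its own step does the generator receive gradients, via the discriminator's opinion of its fakes.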
3.3.2 GAN Applications
- Image generation
- Style transfer
- Super-resolution reconstruction
- Text-to-image generation
- Data augmentation
3.4 The Transformer Architecture
The Transformer is a deep learning architecture based on self-attention. Originally designed for machine translation, it is now widely used across natural language processing tasks.
3.4.1 How Transformers Work
The Transformer relies entirely on attention, dropping the traditional recurrent and convolutional structures. Because every position can attend to every other position at once, it processes sequences in parallel, greatly improving training efficiency.
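The core self-attention operation can be demonstrated with PyTorch's built-in `nn.MultiheadAttention`. A minimal sketch over a fake token sequence (the dimensions are illustrative); passing the same tensor as query, key, and value is exactly what makes it *self*-attention:

```python
import torch
import torch.nn as nn

x = torch.randn(2, 10, 32)   # (batch, seq_len, embedding_dim)

attn = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
out, weights = attn(x, x, x)  # self-attention: query = key = value = x

print(out.shape)      # torch.Size([2, 10, 32]): same shape as the input
print(weights.shape)  # torch.Size([2, 10, 10]): one weight per position pair

# Each row of the attention matrix is a softmax distribution summing to 1:
print(torch.allclose(weights.sum(-1), torch.ones(2, 10), atol=1e-5))
```

All ten positions are processed in one matrix operation, with no step-by-step recurrence, which is the parallelism the paragraph above refers to.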
3.4.2 Transformer Applications
- Natural language processing (BERT, GPT, etc.)
- Machine translation
- Speech recognition
- Image processing
- Multimodal learning
4. Deep Learning Frameworks
Deep learning frameworks give developers the tools and interfaces to build, train, and deploy deep learning models.
4.1 Mainstream Deep Learning Frameworks
4.1.1 TensorFlow
- Developed by Google
- Supports static computation graphs
- Well suited for production deployment
- Rich ecosystem
4.1.2 PyTorch
- Developed by Facebook
- Supports dynamic computation graphs
- Well suited for research and prototyping
- Pythonic API design
4.1.3 Other Frameworks
- Keras: high-level API that can run on top of TensorFlow or Theano
- Caffe: well suited for computer vision tasks
- MXNet: efficient distributed training
- JAX: a numerical computing library developed by Google
5. The Deep Learning Training Process
5.1 Data Preparation
- Data collection
- Data cleaning
- Data augmentation
- Data splitting (training, validation, and test sets)
5.2 Model Design
- Choose a suitable network architecture
- Decide on network depth and width
- Choose activation functions
- Design the loss function
5.3 Model Training
- Choose an optimization algorithm (e.g., Adam, SGD)
- Set the learning rate and batch size
- Monitor the training process
- Prevent overfitting (regularization, dropout, etc.)
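The two overfitting defenses named above can be sketched in a few lines of PyTorch. The layer sizes and hyperparameters here are illustrative: `Dropout` randomly zeroes activations during training only, and `weight_decay` on the optimizer adds L2 regularization to every update.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # dropout: zeroes half the activations while training
    nn.Linear(64, 2),
)

# weight_decay applies L2 regularization inside the optimizer step
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

x = torch.randn(8, 20)
model.train()            # dropout active
train_out = model(x)
model.eval()             # dropout disabled for evaluation/inference
with torch.no_grad():
    eval_out = model(x)
print(train_out.shape, eval_out.shape)  # both torch.Size([8, 2])
```

The `train()`/`eval()` switch matters: forgetting `model.eval()` at evaluation time leaves dropout on and degrades measured accuracy.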
5.4 Model Evaluation
- Evaluate model performance on the validation set
- Tune the hyperparameters
- Run the final evaluation on the test set
5.5 Model Deployment
- Model export
- Model optimization (quantization, pruning, etc.)
- Deployment to the production environment
- Model monitoring and maintenance
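The export step can be sketched with TorchScript, one common way to package a PyTorch model for production (the tiny model below is illustrative). Tracing records the computation graph into a self-contained file that a serving environment can reload without the original Python class:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()  # always export in eval mode

example_input = torch.randn(1, 4)
scripted = torch.jit.trace(model, example_input)  # record the computation graph
scripted.save('model_scripted.pt')                # self-contained artifact

# A deployment environment can reload and run it without the model's code:
reloaded = torch.jit.load('model_scripted.pt')
with torch.no_grad():
    out = reloaded(example_input)
print(out.shape)  # torch.Size([1, 2])
```

Quantization and pruning, mentioned above, would typically be applied before this export step to shrink the saved artifact.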
6. Code Example: Building a Simple CNN with PyTorch
Below is an example of using PyTorch to build a simple CNN for MNIST handwritten-digit recognition:
```python
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np

# Data preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Load the dataset
trainset = torchvision.datasets.MNIST(root='./data', train=True,
                                      download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64,
                                          shuffle=True, num_workers=2)
testset = torchvision.datasets.MNIST(root='./data', train=False,
                                     download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64,
                                         shuffle=False, num_workers=2)

# Define the CNN model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Convolutional layers
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        # Pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # Fully connected layers
        self.fc1 = nn.Linear(32 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 10)
        # Activation function
        self.relu = nn.ReLU()

    def forward(self, x):
        # First convolutional block
        x = self.pool(self.relu(self.conv1(x)))
        # Second convolutional block
        x = self.pool(self.relu(self.conv2(x)))
        # Flatten
        x = x.view(-1, 32 * 7 * 7)
        # Fully connected layers
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Create a model instance
net = Net()

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)

# Train the model
epochs = 5
for epoch in range(epochs):
    run_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        # Zero the gradients
        optimizer.zero_grad()
        # Forward pass
        outputs = net(inputs)
        # Compute the loss
        loss = criterion(outputs, labels)
        # Backward pass
        loss.backward()
        # Update the parameters
        optimizer.step()
        run_loss += loss.item()
        if i % 100 == 99:
            print(f'[{epoch + 1}, {i + 1}] loss: {run_loss / 100:.3f}')
            run_loss = 0.0
print('Finished Training')

# Test the model
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print(f'Accuracy of the network on the 10000 test images: {100 * correct / total}%')

# Visualize some predictions
dataiter = iter(testloader)
images, labels = next(dataiter)

# Print the predictions
outputs = net(images)
_, predicted = torch.max(outputs, 1)
print('Predicted: ', ' '.join(f'{predicted[j]}' for j in range(4)))
print('Actual:    ', ' '.join(f'{labels[j]}' for j in range(4)))

# Display the images
fig, axes = plt.subplots(1, 4, figsize=(10, 3))
for i in range(4):
    img = images[i].numpy().squeeze()
    axes[i].imshow(img, cmap='gray')
    axes[i].set_title(f'Pred: {predicted[i]}, True: {labels[i]}')
    axes[i].axis('off')
plt.tight_layout()
plt.show()
```
7. Hands-On Case: Building a Simple RNN with TensorFlow
Below is an example of using TensorFlow to build a simple RNN for text classification:
7.1 Data Preparation and Model Definition
```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence

# Set the parameters
max_features = 10000  # vocabulary size
maxlen = 500          # maximum sequence length
batch_size = 32       # batch size

# Load the IMDB dataset
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(f'Training data shape: {x_train.shape}')
print(f'Testing data shape: {x_test.shape}')

# Pad the sequences
print('Padding sequences...')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print(f'Training data shape after padding: {x_train.shape}')
print(f'Testing data shape after padding: {x_test.shape}')

# Build the RNN model
print('Building model...')
model = Sequential()
model.add(Embedding(max_features, 32))        # embedding layer
model.add(SimpleRNN(32))                      # RNN layer
model.add(Dense(1, activation='sigmoid'))     # output layer

# Compile the model
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Inspect the model structure
model.summary()
```
7.2 Training and Evaluating the Model
```python
# Train the model
print('Training model...')
history = model.fit(x_train, y_train,
                    epochs=10,
                    batch_size=batch_size,
                    validation_split=0.2)

# Evaluate the model
print('Evaluating model...')
test_loss, test_acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print(f'Test accuracy: {test_acc:.4f}')

# Plot the training curves
import matplotlib.pyplot as plt

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)

plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.tight_layout()
plt.show()
```
8. Deep Learning Application Areas
- Computer vision: image recognition, object detection, face recognition, autonomous driving, etc.
- Natural language processing: machine translation, sentiment analysis, text generation, question answering, etc.
- Speech: speech-to-text, voice assistants, speech synthesis, etc.
- Recommender systems: e-commerce recommendations, content recommendations, personalized services, etc.
- Healthcare: disease diagnosis, medical image analysis, drug discovery, etc.
- Finance: fraud detection, risk assessment, algorithmic trading, etc.
- Gaming: game AI, character animation, procedural content generation, etc.
- Robotics: robot perception, motion control, autonomous navigation, etc.
9. Interactive Exercises
Exercise 1: Building a CNN Model
- Build a CNN model with PyTorch or TensorFlow.
- Train and test it on the CIFAR-10 dataset.
- Try different network structures (e.g., add convolutional layers, adjust the pooling layers).
- Compare the performance of the different structures.
Exercise 2: RNN Text Classification
- Build an RNN model with PyTorch or TensorFlow.
- Perform sentiment analysis on the IMDB dataset.
- Try different RNN variants (e.g., LSTM, GRU).
- Compare the performance and training speed of the different models.
Exercise 3: Deep Learning Application Survey
- Pick a deep learning application area that interests you.
- Survey the latest techniques and application cases in that area.
- Analyze the challenges the area faces and its future directions.
- Write a short survey report.