A question about CNN facial expression recognition

    larryli1995 · 2018-03-14 13:22:05 +08:00 · 3443 views

    Hi everyone, I'm a complete beginner and would like to ask for some help.

    The facial expression dataset is FER2013.

    The CNN structure is: conv1(3×3, 64) -> conv2(3×3, 64) -> maxpool1 -> conv3(3×3, 128) -> conv4(3×3, 128) -> maxpool2 (dropout=0.2) -> conv5(3×3, 256) -> conv6(3×3, 256) -> maxpool3 (dropout=0.25) -> conv7(3×3, 512) -> conv8(3×3, 512) -> maxpool4 (dropout=0.25) -> fc1 (dropout=0.25) -> fc2 (dropout=0.25) -> softmax

    All activations are ReLU, batch_size=50, learning_rate=0.001, 30,000 training samples, 5,000 test samples.
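
    For reference, a rough tf.keras sketch of this stack (a sketch only, not the OP's actual code: the 48×48×1 input and 7 classes are assumed from FER2013, and the fully connected widths are guesses since they aren't given above):

    import tensorflow as tf
    from tensorflow.keras import layers, models

    def build_model():
        m = models.Sequential([
            # FER2013 images are 48x48 grayscale (assumed input shape)
            layers.Conv2D(64, 3, padding='same', activation='relu',
                          input_shape=(48, 48, 1)),                     # conv1
            layers.Conv2D(64, 3, padding='same', activation='relu'),    # conv2
            layers.MaxPooling2D(),                                       # maxpool1
            layers.Conv2D(128, 3, padding='same', activation='relu'),   # conv3
            layers.Conv2D(128, 3, padding='same', activation='relu'),   # conv4
            layers.MaxPooling2D(), layers.Dropout(0.2),                  # maxpool2 (dropout=0.2)
            layers.Conv2D(256, 3, padding='same', activation='relu'),   # conv5
            layers.Conv2D(256, 3, padding='same', activation='relu'),   # conv6
            layers.MaxPooling2D(), layers.Dropout(0.25),                 # maxpool3 (dropout=0.25)
            layers.Conv2D(512, 3, padding='same', activation='relu'),   # conv7
            layers.Conv2D(512, 3, padding='same', activation='relu'),   # conv8
            layers.MaxPooling2D(), layers.Dropout(0.25),                 # maxpool4 (dropout=0.25)
            layers.Flatten(),
            layers.Dense(1024, activation='relu'), layers.Dropout(0.25),  # fc1 (width guessed)
            layers.Dense(512, activation='relu'), layers.Dropout(0.25),   # fc2 (width guessed)
            layers.Dense(7, activation='softmax'),                        # 7 emotion classes
        ])
        # integer class labels are assumed; lr matches the 0.001 above
        m.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
        return m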

    I've found that after about 50 epochs, the per-batch loss and accuracy start repeating the previous epoch exactly. How should I improve this? The model comes from a paper that reports around 60% accuracy, but I'm stuck at 25% and it won't move.

    If anyone here has worked on this, I'd really appreciate some pointers. Thanks.

    Here is my training log (reposted with line breaks in my first reply below).

    16 replies    last reply 2018-04-19 09:41:59 +08:00
    #1  larryli1995 (OP)  2018-03-14 13:23:41 +08:00
    Epoch: 72, Test Loss= 0.018, Test Accuracy= 0.256
    Epoch: 73, Batch: 0, Loss= 1.779, Training Accuracy= 0.240
    Epoch: 73, Batch: 50, Loss= 1.774, Training Accuracy= 0.320
    Epoch: 73, Batch: 100, Loss= 1.803, Training Accuracy= 0.220
    Epoch: 73, Batch: 150, Loss= 1.802, Training Accuracy= 0.260
    Epoch: 73, Batch: 200, Loss= 1.882, Training Accuracy= 0.180
    Epoch: 73, Batch: 250, Loss= 1.808, Training Accuracy= 0.220
    Epoch: 73, Batch: 300, Loss= 1.932, Training Accuracy= 0.160
    Epoch: 73, Batch: 350, Loss= 1.811, Training Accuracy= 0.300
    Epoch: 73, Batch: 400, Loss= 1.801, Training Accuracy= 0.300
    Epoch: 73, Batch: 450, Loss= 1.775, Training Accuracy= 0.280
    Epoch: 73, Batch: 500, Loss= 1.754, Training Accuracy= 0.280
    Epoch: 73, Batch: 550, Loss= 1.737, Training Accuracy= 0.280
    Epoch: 73, Test Loss= 0.018, Test Accuracy= 0.256
    Epoch: 74, Batch: 0, Loss= 1.779, Training Accuracy= 0.240
    Epoch: 74, Batch: 50, Loss= 1.774, Training Accuracy= 0.320
    Epoch: 74, Batch: 100, Loss= 1.803, Training Accuracy= 0.220
    Epoch: 74, Batch: 150, Loss= 1.802, Training Accuracy= 0.260
    Epoch: 74, Batch: 200, Loss= 1.882, Training Accuracy= 0.180
    Epoch: 74, Batch: 250, Loss= 1.808, Training Accuracy= 0.220
    Epoch: 74, Batch: 300, Loss= 1.932, Training Accuracy= 0.160
    Epoch: 74, Batch: 350, Loss= 1.811, Training Accuracy= 0.300
    Epoch: 74, Batch: 400, Loss= 1.801, Training Accuracy= 0.300
    Epoch: 74, Batch: 450, Loss= 1.775, Training Accuracy= 0.280
    Epoch: 74, Batch: 500, Loss= 1.754, Training Accuracy= 0.280
    Epoch: 74, Batch: 550, Loss= 1.737, Training Accuracy= 0.280
    Epoch: 74, Test Loss= 0.018, Test Accuracy= 0.256
    #2  winglight2016  2018-03-14 14:13:42 +08:00
    Why is one conv layer followed directly by another conv layer? Shouldn't every conv layer be followed by a pooling layer and a dropout?
    #3  Hzzone  2018-03-14 14:20:34 +08:00
    Did you shuffle the training data?
    #4  enenaaa  2018-03-14 14:27:16 +08:00
    Try a larger batch size.
    #5  ttvlls  2018-03-14 14:33:56 +08:00 via Android
    @winglight2016 Of course not.
    #6  ioiogoo  2018-03-14 14:35:29 +08:00
    Could you post the paper so we can take a look?
    My feeling is that this structure uses too much dropout (just for discussion). Dropout is meant to prevent overfitting caused by having too many parameters; since conv layers share their weights and have relatively few parameters, overfitting there is usually not a big problem. With this much dropout, could so much information be thrown away that the model underfits, or at least trains much more slowly?

    Some discussions I found after seeing this thread about whether dropout layers belong on conv layers:
    https://www.quora.com/Why-would-I-need-to-apply-a-dropout-layer-before-a-convolutional-layer
    https://stats.stackexchange.com/questions/240305/where-should-i-place-dropout-layers-in-a-neural-network
    https://www.zhihu.com/question/52426832
    #7  takato  2018-03-14 14:43:53 +08:00 via iPhone
    Directions I would personally try:
    1. batchnorm (sketch below)
    2. add residual connections
    3. reduce the dropout in the middle layers
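
    A minimal tf.keras sketch of points 1 and 2, a conv block with batchnorm and no dropout plus a simple residual shortcut (filter counts are just examples, not taken from the thread):

    from tensorflow.keras import layers

    def conv_bn_block(x, filters):
        # Conv -> BatchNorm -> ReLU, with no dropout inside the conv stack
        x = layers.Conv2D(filters, 3, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x)
        return layers.Activation('relu')(x)

    def residual_block(x, filters):
        # Identity shortcut around two conv+bn blocks;
        # assumes x already has `filters` channels so the Add is valid
        shortcut = x
        x = conv_bn_block(x, filters)
        x = layers.Conv2D(filters, 3, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x)
        return layers.Activation('relu')(layers.Add()([shortcut, x]))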
    #8  glasslion  2018-03-14 15:40:12 +08:00
    Is this model VGG? The original VGG has no dropout on its conv layers. As others have said, conv layers generally don't need dropout; consider adding batchnorm instead.

    You could also tune the learning rate and the optimizer.
    #9  glasslion  2018-03-14 15:49:53 +08:00
    First plot a confusion matrix and see where each class is actually going wrong.
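
    For example, with scikit-learn (a sketch; model, x_test and y_test are assumed names, not from the thread):

    import numpy as np
    from sklearn.metrics import confusion_matrix

    def show_confusion(model, x_test, y_test):
        # Rows = true class, columns = predicted class. A single dominant
        # column would mean the network predicts one class for everything.
        y_pred = np.argmax(model.predict(x_test), axis=1)
        print(confusion_matrix(y_test, y_pred))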
    #10  Suddoo  2018-03-14 16:49:53 +08:00
    I usually set batch_size to a power of two; if GPU memory is tight, that limits how large it can be. A 4:1 train/validation split is typical. Which framework are you using? Also, consider standardizing the data during preprocessing: subtract the mean and divide by the standard deviation.

    For the learning rate I usually start at 0.001 and keep reducing it.

    And as mentioned above, shuffle the data before training.
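
    A sketch of the standardization and shuffling being suggested (numpy only; x_train, y_train and x_test are assumed arrays, not from the thread):

    import numpy as np

    def standardize_and_shuffle(x_train, y_train, x_test):
        # Standardize using training-set statistics only
        mean = x_train.mean()
        std = x_train.std() + 1e-7
        x_train = (x_train - mean) / std
        x_test = (x_test - mean) / std
        # Shuffle the training set (ideally re-shuffle every epoch)
        idx = np.random.permutation(len(x_train))
        return x_train[idx], y_train[idx], x_test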
    #11  larryli1995 (OP)  2018-03-14 23:36:43 +08:00
    @winglight2016 I followed a paper that does it this way; they report over 60% accuracy, no idea how they managed it.
    @Hzzone Isn't this data already in random order? Do you mean sampling the batches randomly?
    @enenaaa I'll try that shortly, thanks.
    @ioiogoo Thanks a lot, I'll look into it and ask again if I get stuck.
    @takato Thanks, I'll try that. I think switching to an Inception-style model might also work well.
    @glasslion Thanks, I'll plot a confusion matrix and analyze it.
    @Suddoo I'm using TensorFlow. Isn't FER2013 already shuffled? Would shuffling still help? About standardization: I've already standardized the data, and I also removed the final softmax; I'm not sure whether that's OK. Before standardizing, keeping the softmax at the end gave even lower accuracy.
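
    One detail on the "removed the final softmax" point above: if the softmax layer is dropped, the cross-entropy loss has to be computed on raw logits, otherwise the probabilities and the loss no longer match. A TensorFlow sketch (not the OP's code):

    import tensorflow as tf

    # With a softmax output layer, the loss takes probabilities:
    loss_with_softmax = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
    # Without the softmax layer, tell the loss it gets raw logits:
    loss_from_logits = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    # Likewise, tf.nn.softmax_cross_entropy_with_logits expects raw logits,
    # so no softmax should be applied before it.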
    #12  inflationaaron  2018-03-15 07:48:29 +08:00 via iPad
    Dropout acts as regularization. While your training accuracy is still this low, turn off all the dropout first; once you've tuned the rest of the architecture and it starts to overfit, add dropout back and tune its rates.
    #13  YRodT  2018-03-15 08:54:28 +08:00 via Android
    Directions worth trying:
    1. Add batchnorm to every conv layer and remove all the dropout.
    2. If you're using VGG, initialize the first few layers from pretrained VGG weights.
    3. Use a step schedule for the learning rate: halve it when the loss stops improving (see the sketch after this list).
    4. You didn't say which optimizer you use; Adam and SGD with momentum suit rather different learning rates.
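
    For point 3, a tf.keras sketch using a callback that halves the learning rate when the validation loss plateaus (the parameter values are just examples):

    import tensorflow as tf

    reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss',  # watch the validation loss
        factor=0.5,          # halve the learning rate
        patience=3,          # after 3 epochs without improvement
        min_lr=1e-6)

    # model.fit(x_train, y_train, batch_size=50, epochs=100,
    #           validation_data=(x_val, y_val), callbacks=[reduce_lr])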
    #14  larryli1995 (OP)  2018-03-15 09:38:59 +08:00
    @inflationaaron Thanks.
    @YRodT Thank you.
    #15  larryli1995 (OP)  2018-03-15 09:57:16 +08:00
    @YRodT The optimizer is AdamOptimizer.
    #16  smit  2018-04-19 09:41:59 +08:00
    Hi, I think I'm running into the same problem as you; it almost feels like black magic. With a different network my test loss also gets stuck at 0.018, exactly the same as your result. Could we get in touch over QQ? Mine is 2640062655. Looking forward to your reply.