
A Brief Discussion of the dim Argument of torch.max and F.softmax in PyTorch


When using torch.max and F.softmax, it is easy to be unsure which dimension to pass, so here is a summary. First, some examples on a 2-D tensor:

import torch

import torch.nn.functional as F

input = torch.randn(3, 4)
print(input)

tensor([[-0.5526, -0.0194,  2.1469, -0.2567],
        [-0.3337, -0.9229,  0.0376, -0.0801],
        [ 1.4721,  0.1181, -2.6214,  1.7721]])

b = F.softmax(input, dim=0)  # softmax over dim=0: each column sums to 1
print(b)

tensor([[0.1018, 0.3918, 0.8851, 0.1021],
        [0.1268, 0.1587, 0.1074, 0.1218],
        [0.7714, 0.4495, 0.0075, 0.7762]])

c = F.softmax(input, dim=1)  # softmax over dim=1: each row sums to 1
print(c)

tensor([[0.0529, 0.0901, 0.7860, 0.0710],
        [0.2329, 0.1292, 0.3377, 0.3002],
        [0.3810, 0.0984, 0.0064, 0.5143]])

d = torch.max(input, dim=0)  # max over dim=0 (per column)
print(d)

torch.return_types.max(
values=tensor([1.4721, 0.1181, 2.1469, 1.7721]),
indices=tensor([2, 2, 0, 2]))

e = torch.max(input, dim=1)  # max over dim=1 (per row)
print(e)

torch.return_types.max(
values=tensor([2.1469, 0.0376, 1.7721]),
indices=tensor([2, 2, 3]))
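As a rule of thumb: softmax normalizes along exactly the dimension you pass, and torch.max with a dim argument returns a named (values, indices) tuple. A minimal sanity check of both points, assuming an input like the one above:

import torch
import torch.nn.functional as F

input = torch.randn(3, 4)

# dim=0 -> every column sums to 1; dim=1 -> every row sums to 1
assert torch.allclose(F.softmax(input, dim=0).sum(dim=0), torch.ones(4))
assert torch.allclose(F.softmax(input, dim=1).sum(dim=1), torch.ones(3))

# torch.max over a dim returns both the maxima and their positions
e = torch.max(input, dim=1)
print(e.values)   # shape (3,): the maximum of each row
print(e.indices)  # shape (3,): which column each maximum came from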

Now let us look at a 3-D tensor example:

The softmax function outputs a probability distribution over the chosen dimension of the given tensor. Here b holds the distribution along dim=0, so for any fixed trailing indices the three entries sum to 1, e.g. b[0][5][6] + b[1][5][6] + b[2][5][6] = 1:

a = torch.rand(3, 16, 20)
b = F.softmax(a, dim=0)
c = F.softmax(a, dim=1)
d = F.softmax(a, dim=2)
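That sum property is easy to verify; a small sketch (the trailing indices [5][6] are just an example, any fixed pair works):

import torch
import torch.nn.functional as F

a = torch.rand(3, 16, 20)
b = F.softmax(a, dim=0)

# the three entries sharing one spatial position form a distribution
print(b[0][5][6] + b[1][5][6] + b[2][5][6])  # tensor(1.), up to rounding

# equivalently, summing b along dim=0 yields a 16 x 20 tensor of ones
assert torch.allclose(b.sum(dim=0), torch.ones(16, 20))

A fuller interactive session makes all three choices of dim concrete: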

In [1]: import torch as t

In [2]: import torch.nn.functional as F

In [4]: a = t.Tensor(3,4,5)

In [5]: b = F.softmax(a, dim=0)

In [6]: c = F.softmax(a, dim=1)

In [7]: d = F.softmax(a, dim=2)

In [8]: a
Out[8]:
tensor([[[-0.1581,  0.0000,  0.0000,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000],
         [-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000]],

        [[-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000],
         [-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000]],

        [[-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000],
         [-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000]]])

In [9]: b
Out[9]:
tensor([[[0.3064, 0.3333, 0.3410, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333]],

        [[0.3468, 0.3333, 0.3295, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333]],

        [[0.3468, 0.3333, 0.3295, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333]]])

In [10]: b.sum()
Out[10]: tensor(20.0000)

In [11]: b[0][0][0]+b[1][0][0]+b[2][0][0]
Out[11]: tensor(1.0000)

In [12]: c.sum()
Out[12]: tensor(15.)

b.sum() is 20 because softmax over dim=0 (size 3) leaves 4 x 5 = 20 groups that each sum to 1; likewise c.sum() is 3 x 5 = 15, and d.sum() further below is 3 x 4 = 12.

In [13]: c
Out[13]:
tensor([[[0.2235, 0.2543, 0.2521, 0.2543, 0.2457],
         [0.2618, 0.2457, 0.2521, 0.2457, 0.2543],
         [0.2529, 0.2543, 0.2436, 0.2543, 0.2457],
         [0.2618, 0.2457, 0.2521, 0.2457, 0.2543]],

        [[0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543],
         [0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543]],

        [[0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543],
         [0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543]]])

In [14]: n = t.rand(3,4)

In [15]: n
Out[15]:
tensor([[0.2769, 0.3475, 0.14, 0.6845],
        [0.9251, 0.3976, 0.8690, 0.4510],
        [0.8249, 0.1157, 0.3075, 0.3799]])

In [16]: m = t.argmax(n, dim=0)

In [17]: m
Out[17]: tensor([1, 1, 0, 0])

In [18]: p = t.argmax(n, dim=1)

In [19]: p
Out[19]: tensor([2, 0, 0])
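As an aside, t.argmax returns exactly the indices half of what t.max returns; a quick sketch of the equivalence (reusing a random n like the one above):

import torch as t

n = t.rand(3, 4)

# argmax keeps only the positions of the maxima...
m = t.argmax(n, dim=0)

# ...which match the .indices field of the (values, indices) tuple from max
assert t.equal(m, t.max(n, dim=0).indices)

Back in the session, d, the dim=2 softmax, behaves analogously: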

In [20]: d.sum()

Out[20]: tensor(12.0000)

In [22]: d
Out[22]:
tensor([[[0.1771, 0.2075, 0.2075, 0.2075, 0.2005],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027],
         [0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027]],

        [[0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027],
         [0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027]],

        [[0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027],
         [0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027]]])

In [23]: d[0][0].sum()
Out[23]: tensor(1.)

With dim=2, every length-5 row of d sums to 1, which is exactly what d[0][0].sum() confirms.

Bonus knowledge: using torch.nn.Softmax for multi-class problems

Why bring this topic up? Because at work I ran into a semantic-segmentation model whose prediction had 16 output feature maps, i.e. a 16-class problem. The value of a pixel in each channel measures how strongly that pixel belongs to the class of that channel, and in order to display the result on a single image in different colours I had to learn how torch.nn.Softmax works.

First, a simple example. Suppose the output is (3, 4, 4), i.e. three 4x4 feature maps.

import torch

img = torch.rand((3, 4, 4))
print(img)

The output is:

tensor([[[0.0413, 0.8728, 0.8926, 0.0693],
         [0.4072, 0.0302, 0.9248, 0.6676],
         [0.4699, 0.9197, 0.3333, 0.4809],
         [0.3877, 0.7673, 0.6132, 0.5203]],

        [[0.4940, 0.7996, 0.5513, 0.8016],
         [0.1157, 0.8323, 0.9944, 0.2127],
         [0.3055, 0.4343, 0.8123, 0.3184],
         [0.8246, 0.6731, 0.3229, 0.1730]],

        [[0.0661, 0.1905, 0.4490, 0.7484],
         [0.4013, 0.1468, 0.2145, 0.8838],
         [0.0083, 0.5029, 0.0141, 0.8998],
         [0.8673, 0.2308, 0.8808, 0.0532]]])

There are three feature maps; the larger the value at a position on a map, the more likely that pixel belongs to the class that map represents.

import torch.nn as nn

softmax = nn.Softmax(dim=0)
img = softmax(img)
print(img)

The output is:

tensor([[[0.2780, 0.4107, 0.4251, 0.1979],
         [0.3648, 0.2297, 0.3901, 0.3477],
         [0.4035, 0.4396, 0.2993, 0.2967],
         [0.2402, 0.4008, 0.3273, 0.4285]],

        [[0.4371, 0.3817, 0.3022, 0.4117],
         [0.2726, 0.5122, 0.4182, 0.2206],
         [0.3423, 0.2706, 0.4832, 0.2522],
         [0.3718, 0.3648, 0.2449, 0.3028]],

        [[0.2849, 0.2076, 0.2728, 0.3904],
         [0.3627, 0.2581, 0.1917, 0.4317],
         [0.2543, 0.2898, 0.2175, 0.4511],
         [0.3880, 0.2344, 0.4278, 0.2686]]])

As you can see, the code above applies Softmax to the pixel values at matching positions across the feature maps: pick any position, and the three values there (one per map) now add up to 1.

So Softmax operates on every pixel along the chosen dimension (here dim=0, the first one), squashing each value into the range 0~1 while leaving the tensor's shape unchanged.
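Both claims are easy to check; a minimal sketch with a fresh random image of the same shape:

import torch
import torch.nn as nn

img = torch.rand((3, 4, 4))
out = nn.Softmax(dim=0)(img)

# the shape is unchanged by softmax...
assert out.shape == img.shape

# ...and across the 3 maps, the values at every pixel position sum to 1
assert torch.allclose(out.sum(dim=0), torch.ones(4, 4))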

Next, take the per-pixel maximum across the three maps:

print(torch.max(img, 0))

The output is:

torch.return_types.max(
values=tensor([[0.4371, 0.4107, 0.4251, 0.4117],
        [0.3648, 0.5122, 0.4182, 0.4317],
        [0.4035, 0.4396, 0.4832, 0.4511],
        [0.3880, 0.4008, 0.4278, 0.4285]]),
indices=tensor([[1, 0, 0, 1],
        [0, 1, 1, 2],
        [0, 0, 1, 2],
        [2, 0, 2, 0]]))

You can see the 3x4x4 tensor collapses to 4x4: values holds, at each position, the pixel's maximum over the channels, and indices records which class (channel) that maximum came from. Once this flow is clearly understood, the rest is easy to handle.
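A quick shape check of that collapse, using a toy 3x4x4 image like the one above:

import torch

img = torch.rand((3, 4, 4))
res = torch.max(img, 0)

print(res.values.shape)   # torch.Size([4, 4]): the channel dim is reduced away
print(res.indices.shape)  # torch.Size([4, 4]): the winning class id per pixel

Now the concrete case: here the output tensor has size 16x416x416.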

import cv2
import numpy as np
import torch
import torch.nn as nn

output = torch.tensor(output)  # the network prediction, sized 16 x 416 x 416

sm = nn.Softmax(dim=0)
output = sm(output)

# collapse the 16 channels to a 416 x 416 map of per-pixel class indices
mask = torch.max(output, 0).indices.numpy()

# to convert the mask into an RGB colour image, add a channel dimension
# (note that cv2.imwrite interprets the channel order as BGR)
rgb_img = np.zeros((output.shape[1], output.shape[2], 3))

# one colour per class id, 16 classes in total
palette = [
    (255, 255, 255), (255, 180, 0),   (255, 180, 180), (255, 180, 255),
    (255, 255, 180), (255, 255, 0),   (255, 0, 180),   (255, 0, 255),
    (255, 0, 0),     (180, 0, 0),     (180, 255, 255), (180, 0, 180),
    (180, 0, 255),   (180, 255, 180), (0, 180, 255),   (0, 0, 0),
]

for i in range(len(mask)):
    for j in range(len(mask[0])):
        rgb_img[i][j] = palette[mask[i][j]]

cv2.imwrite('output.jpg', rgb_img)
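For what it is worth, the per-pixel double loop can also be written as a single NumPy fancy-indexing lookup; a sketch, assuming the same mask and the 16-colour palette packed into a (16, 3) array:

import numpy as np

# the same 16 colours as above, as a (16, 3) uint8 lookup table
palette = np.array([
    (255, 255, 255), (255, 180, 0),   (255, 180, 180), (255, 180, 255),
    (255, 255, 180), (255, 255, 0),   (255, 0, 180),   (255, 0, 255),
    (255, 0, 0),     (180, 0, 0),     (180, 255, 255), (180, 0, 180),
    (180, 0, 255),   (180, 255, 180), (0, 180, 255),   (0, 0, 0),
], dtype=np.uint8)

# indexing with the (416, 416) integer mask yields a (416, 416, 3) image
rgb_img = palette[mask]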

The final saved image shows each of the 16 classes in its own colour.

That is the whole of this brief discussion of the dimension arguments of torch.max and F.softmax in PyTorch. I hope it gives everyone a useful reference.
