
Keras input explained: input_shape, units, batch_size, dim, etc.

Updated: 2023-12-01 23:34:04

Units:

The amount of "neurons", or "cells", or whatever the layer has inside it.

It's a property of each layer, and yes, it's related to the output shape (as we will see later). In your picture, except for the input layer, which is conceptually different from other layers, you have:

  • Hidden layer 1: 4 units (4 neurons)
  • Hidden layer 2: 4 units
  • Last layer: 1 unit

Shapes are consequences of the model's configuration. Shapes are tuples representing how many elements an array or tensor has in each dimension.

Ex: a shape (30,4,10) means an array or tensor with 3 dimensions, containing 30 elements in the first dimension, 4 in the second and 10 in the third, totaling 30*4*10 = 1200 elements or numbers.
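The tuple arithmetic can be checked with NumPy, whose arrays use the same notion of shape:

```python
import numpy as np

# A shape is a tuple: one entry per dimension.
t = np.zeros((30, 4, 10))
print(t.shape)  # (30, 4, 10)
print(t.size)   # 30 * 4 * 10 = 1200 elements
```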

What flows between layers are tensors. Tensors can be seen as matrices, with shapes.

In Keras, the input layer itself is not a layer, but a tensor. It's the starting tensor you send to the first hidden layer. This tensor must have the same shape as your training data.

Example: if you have 30 images of 50x50 pixels in RGB (3 channels), the shape of your input data is (30,50,50,3). Then your input layer tensor must have this shape (see details in the "shapes in keras" section).
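As a quick sanity check, such a batch can be mimicked with NumPy (random data standing in for real images — this is just a sketch of the layout, not real training data):

```python
import numpy as np

# 30 RGB images of 50x50 pixels, channels_last layout:
# (batch, height, width, channels)
images = np.random.rand(30, 50, 50, 3)
print(images.shape)  # (30, 50, 50, 3)
```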

Each type of layer requires the input with a certain number of dimensions:

  • Dense layers require inputs as (batch_size, input_size)
    • or (batch_size, optional,...,optional, input_size)
  • 2D convolutional layers require inputs as:
    • if using channels_last: (batch_size, imageside1, imageside2, channels)
    • if using channels_first: (batch_size, channels, imageside1, imageside2)

Now, the input shape is the only one you must define, because your model cannot know it. Only you know that, based on your training data.

All the other shapes are calculated automatically based on the units and particularities of each layer.

Given the input shape, all other shapes are results of layers calculations.

The "units" of each layer will define the output shape (the shape of the tensor that is produced by the layer and that will be the input of the next layer).

Each type of layer works in a particular way. Dense layers have output shape based on "units", convolutional layers have output shape based on "filters". But it's always based on some layer property. (See the documentation for what each layer outputs)

Let's show what happens with "Dense" layers, which is the type shown in your graph.

A dense layer has an output shape of (batch_size, units). So, yes, units, the property of the layer, also defines the output shape.

  • Hidden layer 1: 4 units, output shape: (batch_size,4).
  • Hidden layer 2: 4 units, output shape: (batch_size,4).
  • Last layer: 1 unit, output shape: (batch_size,1).

Weights will be entirely automatically calculated based on the input and the output shapes. Again, each type of layer works in a certain way. But the weights will be a matrix capable of transforming the input shape into the output shape by some mathematical operation.

In a dense layer, weights multiply all inputs. It's a matrix with one column per input and one row per unit, but this is often not important for basic work.
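Ignoring biases and activations, the Dense layers of the example boil down to matrix multiplications. A minimal NumPy sketch (using the (input_size, units) kernel layout that Keras stores for Dense layers) shows how each weight matrix carries the input shape to the output shape:

```python
import numpy as np

batch_size = 30
x = np.random.rand(batch_size, 3)  # input: (batch_size, 3)

# One weight matrix per layer, mapping input_size -> units:
W1 = np.random.rand(3, 4)  # hidden layer 1: 3 inputs, 4 units
W2 = np.random.rand(4, 4)  # hidden layer 2: 4 inputs, 4 units
W3 = np.random.rand(4, 1)  # last layer:     4 inputs, 1 unit

h1 = x @ W1    # (batch_size, 4)
h2 = h1 @ W2   # (batch_size, 4)
out = h2 @ W3  # (batch_size, 1)
print(h1.shape, h2.shape, out.shape)
```

Note how the batch dimension passes through untouched: only the last dimension is transformed by each layer.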

In the image, if each arrow had a multiplication number on it, all numbers together would form the weight matrix.

Earlier, I gave an example of 30 images, 50x50 pixels and 3 channels, having an input shape of (30,50,50,3).

Since the input shape is the only one you need to define, Keras will demand it in the first layer.

But in this definition, Keras ignores the first dimension, which is the batch size. Your model should be able to deal with any batch size, so you define only the other dimensions:

input_shape = (50,50,3)
    #regardless of how many images I have, each image has this shape

Optionally, or when it's required by certain kinds of models, you can pass the shape containing the batch size via batch_input_shape=(30,50,50,3) or batch_shape=(30,50,50,3). This limits your training possibilities to this unique batch size, so it should be used only when really required.

Either way you choose, tensors in the model will have the batch dimension.

So, even if you used input_shape=(50,50,3), when Keras sends you messages, or when you print the model summary, it will show (None,50,50,3).

The first dimension is the batch size; it's None because it can vary depending on how many examples you give for training. (If you defined the batch size explicitly, then the number you defined will appear instead of None)

Also, in advanced work, when you actually operate directly on the tensors (inside Lambda layers or in the loss function, for instance), the batch size dimension will be there.

  • So, when defining the input shape, you ignore the batch size: input_shape=(50,50,3)
  • When doing operations directly on tensors, the shape will be again (30,50,50,3)
  • When Keras sends you a message, the shape will be (None,50,50,3) or (30,50,50,3), depending on what type of message it sends you.

Finally, what is dim?

If your input shape has only one dimension, you don't need to give it as a tuple; you give input_dim as a scalar number.

So, in your model, where your input layer has 3 elements, you can use either of these two:

  • input_shape=(3,) -- The comma is necessary when you have only one dimension
  • input_dim = 3

But when dealing directly with the tensors, often dim will refer to how many dimensions a tensor has. For instance, a tensor with shape (25,10909) has 2 dimensions.
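NumPy makes the distinction between a tensor's shape and its number of dimensions explicit:

```python
import numpy as np

t = np.zeros((25, 10909))
print(t.shape)  # (25, 10909) -- the shape tuple
print(t.ndim)   # 2           -- the number of dimensions ("dim" in this sense)
```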

Keras has two ways of doing it: Sequential models, or the functional API Model. I don't like using the Sequential model; you will have to forget it later anyway, because you will want models with branches.

PS: here I ignored other aspects, such as activation functions.

Using the Sequential model:

from keras.models import Sequential
from keras.layers import *

model = Sequential()

#start from the first hidden layer, since the input is not actually a layer
#but inform the shape of the input, with 3 elements.
model.add(Dense(units=4,input_shape=(3,))) #hidden layer 1 with input

#further layers:
model.add(Dense(units=4)) #hidden layer 2
model.add(Dense(units=1)) #output layer

Using the functional API Model:

from keras.models import Model
from keras.layers import *

#Start defining the input tensor:
inpTensor = Input((3,))

#create the layers and pass them the input tensor to get the output tensor:
hidden1Out = Dense(units=4)(inpTensor)
hidden2Out = Dense(units=4)(hidden1Out)
finalOut = Dense(units=1)(hidden2Out)

#define the model's start and end points
model = Model(inpTensor,finalOut)

Shapes of the tensors:

Remember you ignore batch sizes when defining layers:

  • inpTensor: (None,3)
  • hidden1Out: (None,4)
  • hidden2Out: (None,4)
  • finalOut: (None,1)