Updated: 2023-12-02 20:24:58
I want to use Keras Resnet50 model using OpenCV for reading and resizing the input image. I'm using the same preprocessing code from Keras (with OpenCV I need to convert to RGB since this is the format expected by preprocess_input()). I get slightly different predictions using OpenCV and Keras image loading. I don't understand why the predictions are not the same.
Here is my code:
import numpy as np
import json
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
import cv2
model = ResNet50(weights='imagenet')
img_path = '/home/me/squirle.jpg'
# Keras prediction
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
print('Predicted Keras:', decode_predictions(preds, top=3)[0])
# OpenCV prediction
imgcv = cv2.imread(img_path)
dim = (224, 224)
imgcv_resized = cv2.resize(imgcv, dim, interpolation=cv2.INTER_LINEAR)
x = cv2.cvtColor(imgcv_resized , cv2.COLOR_BGR2RGB)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
print('Predicted OpenCV:', decode_predictions(preds, top=3)[0])
Predicted Keras: [('n02490219', 'marmoset', 0.28250763), ('n02356798', 'fox_squirrel', 0.25657368), ('n02494079', 'squirrel_monkey', 0.19992349)]
Predicted OpenCV: [('n02356798', 'fox_squirrel', 0.5161952), ('n02490219', 'marmoset', 0.21953616), ('n02494079', 'squirrel_monkey', 0.1160824)]
How can I use OpenCV imread() and resize() to get the same prediction as Keras image loading?
# Keras prediction
img = image.load_img(img_path, target_size=(224, 224))
# OpenCV prediction
imgcv = cv2.imread(img_path)
dim = (224, 224)
imgcv_resized = cv2.resize(imgcv, dim, interpolation=cv2.INTER_LINEAR)
If you look closely, the interpolation you specify for cv2 is cv2.INTER_LINEAR (bilinear interpolation); however, image.load_img() defaults to nearest-neighbour interpolation (interpolation='nearest', the counterpart of INTER_NEAREST).
Next, look at img_to_array(img). Its dtype argument defaults to None, in which case the global setting tf.keras.backend.floatx() is used (unless you changed it, it defaults to "float32"). Therefore, img_to_array(img) gives you an image made of float32 values, while cv2.imread(img_path) returns a NumPy array of uint8 values.
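A minimal sketch of making the dtypes match, assuming you want the OpenCV array in the same float32 type that img_to_array() produces by default. The one-pixel array is a hypothetical stand-in for the result of cv2.imread().

```python
import numpy as np

# Stand-in for what cv2.imread() returns: a uint8 array (hypothetical data).
imgcv = np.array([[[12, 200, 233]]], dtype=np.uint8)

# Cast explicitly to float32, the dtype img_to_array() produces by default.
x = imgcv.astype(np.float32)

print(imgcv.dtype, "->", x.dtype)
```

The cast changes only the type; the stored values stay the same.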
Finally, OpenCV loads images in BGR order, so convert to RGB with image = image[:,:,::-1] or image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB); otherwise you will have the R and B channels reversed, resulting in an incorrect comparison. Since the preprocessing that you apply is the same in both cases, the only differences are the ones mentioned above; aligning them should ensure reproducibility.
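The channel swap can be checked on a single pixel. This is a small sketch with made-up values; np.ascontiguousarray is added here as a precaution, because the reversed slice is a view with a negative stride.

```python
import numpy as np

# One pixel in OpenCV's BGR order: B=10, G=20, R=30 (hypothetical values).
bgr = np.array([[[10, 20, 30]]], dtype=np.uint8)

# Reverse the last axis to get RGB; copy so the result is contiguous in memory.
rgb = np.ascontiguousarray(bgr[:, :, ::-1])

print(rgb[0, 0].tolist())  # [30, 20, 10]
```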
There is one observation I would like to make: when a library (cv2 in this case) loads only integers rather than floats, the only correct way to compare is to cast the first prediction array (Keras) to uint8, because casting the cv2 array to float32 cannot bring back information that was already lost. For example, with cv2 you load uint8 values, and casting 233 to float gives you 233.0. However, the initial pixel value may have been 233.3, and the fractional part was lost in the first conversion.
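The round trip described above can be reproduced with NumPy alone. The value 233.3 is the hypothetical pixel value from the example, not something read from a real image.

```python
import numpy as np

original = np.array([233.3], dtype=np.float32)  # hypothetical value before quantization
stored = original.astype(np.uint8)              # what an integer-only loader keeps
restored = stored.astype(np.float32)            # casting back to float

print(stored[0], restored[0])  # 233 233.0
```

Casting back to float32 yields 233.0, not 233.3, which is why the comparison is better made in uint8.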