
Fast Feature Engineering in Python

2021-12-20 11:26

磐創AI


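"Finding patterns is easy in any kind of data-rich environment. The key is in determining whether the patterns represent noise or signal." (Nate Silver)

This article covers some best practices to follow when image processing is part of a machine learning workflow.

Libraries

import random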

from PIL import Image
import cv2
import numpy as np
from matplotlib import pyplot as plt
import json
import albumentations as A
import torch
import torchvision.models as models
import torchvision.transforms as transforms
import torch.nn as nn
from tqdm import tqdm_notebook
from torch.utils.data import DataLoader
from torchvision.datasets import CIFAR10

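Resizing / Scaling Images

Resizing is the most basic transformation that deep learning practitioners apply to images. The main reason is to ensure that the inputs our deep learning system receives are consistent in shape. Another reason is to reduce the number of parameters in the model: smaller input dimensions mean a smaller neural network, which saves the time and compute needed to train it.

What about information loss? Downsizing a larger image does discard some information. Depending on your task, however, you can choose how much information you are willing to trade for training time and compute. For example, an object detection task will require you to preserve the aspect ratio of the image, because the goal is to locate objects accurately. In contrast, an image classification task may only require resizing all images to one fixed size (224x224 is a good rule of thumb).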

img = Image.open("goldendoodle-1234760_960_720.jpeg")
img_resized = img.resize((224, 224))
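When the aspect ratio has to be preserved, as in the object detection case mentioned above, one common option is to scale the longer side to the target size and pad the rest. The helper below is only a sketch of that arithmetic; the name `letterbox_size` and the choice of centered padding are assumptions made for illustration, not something taken from the original article.

```python
def letterbox_size(width, height, target):
    """Return the resized (w, h) and the (left, top) padding that centers
    the resized image in a target x target square, keeping the aspect ratio."""
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return (new_w, new_h), (pad_left, pad_top)


# A 960x720 image fitted into a 224x224 square
new_size, padding = letterbox_size(960, 720, 224)
print(new_size, padding)
```

With Pillow, the result could then be applied by resizing to `new_size` and pasting the resized image onto a blank 224x224 canvas at offset `padding`.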

After resizing, the image looks like this:

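Why scale images? As with tabular data, scaling images for a classification task helps training converge to a minimum more reliably. Scaling ensures that no particular dimension dominates the others. A very good answer on this topic can be found on StackExchange. One form of feature scaling is standardizing the pixel values: we subtract each channel's mean from that channel's pixel values and then divide by the channel's standard deviation. This is a common feature engineering choice when training models for classification tasks.

# A PIL image converts to an array of shape (height, width, channels),
# so per-channel statistics are taken over axes 0 and 1
img_arr = np.array(img_resized, dtype=np.float32)
mean = np.mean(img_arr, axis=(0, 1), keepdims=True)
std = np.std(img_arr, axis=(0, 1), keepdims=True)
img_std = (img_arr - mean) / std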
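Standardization is one of several scaling schemes; centering and min-max normalization are two others. The sketch below compares the three on a random array that stands in for an RGB image. The synthetic data and the variable names are assumptions chosen so the example runs without an image file.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for an RGB image: shape (height, width, channels), values in [0, 255]
image = rng.integers(0, 256, size=(224, 224, 3)).astype(np.float32)

# Per-channel statistics, kept broadcastable against the image
channel_mean = image.mean(axis=(0, 1), keepdims=True)
channel_std = image.std(axis=(0, 1), keepdims=True)

centered = image - channel_mean                      # zero mean, original spread
standardized = (image - channel_mean) / channel_std  # zero mean, unit variance
normalized = image / 255.0                           # rescaled to the range [0, 1]

print(standardized.mean(axis=(0, 1)), standardized.std(axis=(0, 1)))
```

Which variant is appropriate depends on the model: pretrained networks usually expect the same scaling that was used when they were trained.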

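Note: as with resizing, you may not want to scale images when working on object detection or image generation tasks. The sample code above demonstrates scaling by standardization. There are other forms of scaling as well, such as centering and normalization.

Augmentation (Classification)

The main motivation for augmenting images is the considerable data requirement of computer vision tasks. Acquiring enough images for training is often a challenge, for a variety of reasons. Image augmentation lets us create new training samples by slightly modifying the original ones. In this example, we look at how to apply common augmentations for a classification task. We can use the Albumentations library for this:

img_cropped = Image.fromarray(A.RandomCrop(width=225, height=225)(image=np.array(img))['image'])
img_gau_blur = Image.fromarray(A.GaussianBlur(p=0.8)(image=np.array(img_resized))['image'])
# Pass the probability by keyword: positionally, 0.8 may be bound to a
# different parameter (always_apply in older Albumentations releases)
img_flip = Image.fromarray(A.Flip(p=0.8)(image=np.array(img_resized))['image'])

Gaussian blur, random crop, and flip: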
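To make the effect of two of these augmentations concrete, here is a NumPy-only sketch of a random crop and a random horizontal flip. It is an illustration of the idea rather than Albumentations' implementation; the function names are hypothetical, and unlike `A.Flip` the flip here is horizontal only.

```python
import numpy as np


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position of an HxWxC array."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return image[top:top + crop_h, left:left + crop_w]


def random_hflip(image, p, rng):
    """Mirror the image left-to-right with probability p."""
    if rng.random() < p:
        return image[:, ::-1]
    return image


rng = np.random.default_rng(42)
# Tiny synthetic "image" so the result is easy to inspect
image = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)

crop = random_crop(image, 4, 4, rng)
flipped = random_hflip(image, p=1.0, rng=rng)
print(crop.shape, flipped.shape)
```

Each call draws a new position or flip decision, so the same source image yields a different training sample every epoch.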

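By applying image augmentation, our deep learning model can generalize better to the task (avoiding overfitting), which improves its predictive power on unseen data.

Augmentation (Object Detection)

The Albumentations library can also be used to create augmentations for other tasks, such as object detection. Object detection requires us to draw bounding boxes around the objects of interest. Working with raw data can be challenging when you have to annotate images with bounding box coordinates yourself. Fortunately, there are many public, freely available datasets that we can use to build an augmentation pipeline for object detection. One such dataset is the chess dataset. It contains 606 images of chess pieces on a chessboard. Along with the images, a JSON file is provided that holds all the information about the bounding box of every chess piece in each image. By writing a simple function, we can visualize the data after applying augmentations: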

with open("_annotations.coco.json") as f:
    json_file = json.load(f)

# COCO boxes are stored as [x_min, y_min, width, height]
x_min, y_min, w, h = json_file['annotations'][0]['bbox']
x_min, x_max, y_min, y_max = int(x_min), int(x_min + w), int(y_min), int(y_min + h)

def visualize_bbox(img, bbox, class_name, color=(0, 255, 0), thickness=2):
    """Draw a COCO-format bounding box and its class label on img."""
    x_min, y_min, w, h = bbox
    x_min, x_max, y_min, y_max = int(x_min), int(x_min + w), int(y_min), int(y_min + h)
    cv2.rectangle(img, (x_min, y_min), (x_max, y_max), color=color, thickness=thickness)
    ((text_width, text_height), _) = cv2.getTextSize(class_name, cv2.FONT_HERSHEY_SIMPLEX, 0.35, 1)
    # Filled background for the label, drawn in the same color as the box
    cv2.rectangle(img, (x_min, y_min - int(1.3 * text_height)), (x_min + text_width, y_min), color, -1)
    cv2.putText(
        img,
        text=class_name,
        org=(x_min, y_min - int(0.3 * text_height)),
        fontFace=cv2.FONT_HERSHEY_SIMPLEX,
        fontScale=0.35,
        color=(255, 255, 255),
        lineType=cv2.LINE_AA,
    )
    return img

bbox_img = visualize_bbox(np.array(img),
                          json_file['annotations'][0]['bbox'],
                          class_name=json_file['categories'][0]['name'])
Image.fromarray(bbox_img)
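The visualization code relies on the COCO convention of storing a box as [x_min, y_min, width, height], while drawing functions such as `cv2.rectangle` expect two corner points. The conversion can be sketched in isolation as follows; the helper names are hypothetical and the box values are the first annotation shown in this article.

```python
def coco_to_corners(bbox):
    """[x_min, y_min, width, height] -> integer (x_min, y_min, x_max, y_max)."""
    x_min, y_min, w, h = bbox
    return int(x_min), int(y_min), int(x_min + w), int(y_min + h)


def corners_to_coco(x_min, y_min, x_max, y_max):
    """(x_min, y_min, x_max, y_max) -> [x_min, y_min, width, height]."""
    return [x_min, y_min, x_max - x_min, y_max - y_min]


corners = coco_to_corners([220, 14, 18, 46.023746508293286])
print(corners)
```

Note that truncating to integers loses the fractional part of the height, so converting back gives a box that is slightly smaller than the original annotation.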

Now, let's try to create an augmentation pipeline with Albumentations. The JSON file containing the annotation information has the following keys:

dict_keys(['info', 'licenses', 'categories', 'images', 'annotations'])

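images holds information about the image files, while annotations holds the bounding box information for each object in an image. Finally, categories holds the keys that map to the types of chess pieces in the images.

image_list = json_file.get('images')
anno_list = json_file.get('annotations')
cat_list = json_file.get('categories')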

image_list:

[{'id': 0,
  'license': 1,
  'file_name': 'IMG_0317_JPG.rf.00207d2fe8c0a0f20715333d49d22b4f.jpg',
  'height': 416,
  'width': 416,
  'date_captured': '2021-02-23T17:32:58+00:00'},
 {'id': 1,
  'license': 1,
  'file_name': '5a8433ec79c881f84ef19a07dc73665d_jpg.rf.00544a8110f323e0d7721b3acf2a9e1e.jpg',
  'height': 416,
  'width': 416,
  'date_captured': '2021-02-23T17:32:58+00:00'},
 {'id': 2,
  'license': 1,
  'file_name': '675619f2c8078824cfd182cec2eeba95_jpg.rf.0130e3c26b1bf275bf240894ba73ed7c.jpg',
  'height': 416,
  'width': 416,
  'date_captured': '2021-02-23T17:32:58+00:00'},
 ...]

anno_list:

[{'id': 0,
  'image_id': 0,
  'category_id': 7,
  'bbox': [220, 14, 18, 46.023746508293286],
  'area': 828.4274371492792,
  'segmentation': [],
  'iscrowd': 0},
 {'id': 1,
  'image_id': 1,
  'category_id': 8,
  'bbox': [187, 103, 22.686527154676014, 59.127992255841036],
  'area': 1341.4088019136107,
  'segmentation': [],
  'iscrowd': 0},
 {'id': 2,
  'image_id': 2,
  'category_id': 10,
  'bbox': [203, 24, 24.26037020843023, 60.5],
  'area': 1467.752397610029,
  'segmentation': [],
  'iscrowd': 0},
 ...]

cat_list:

[{'id': 0, 'name': 'pieces', 'supercategory': 'none'},
 {'id': 1, 'name': 'bishop', 'supercategory': 'pieces'},
 {'id': 2, 'name': 'black-bishop', 'supercategory': 'pieces'},
 {'id': 3, 'name': 'black-king', 'supercategory': 'pieces'},
 {'id': 4, 'name': 'black-knight', 'supercategory': 'pieces'},
 {'id': 5, 'name': 'black-pawn', 'supercategory': 'pieces'},
 {'id': 6, 'name': 'black-queen', 'supercategory': 'pieces'},
 {'id': 7, 'name': 'black-rook', 'supercategory': 'pieces'},
 {'id': 8, 'name': 'white-bishop', 'supercategory': 'pieces'},
 {'id': 9, 'name': 'white-king', 'supercategory': 'pieces'},
 {'id': 10, 'name': 'white-knight', 'supercategory': 'pieces'},
 {'id': 11, 'name': 'white-pawn', 'supercategory': 'pieces'},
 {'id': 12, 'name': 'white-queen', 'supercategory': 'pieces'},
 {'id': 13, 'name': 'white-rook', 'supercategory': 'pieces'}]

We have to restructure these lists to build an efficient pipeline:

new_anno_dict = {}
new_cat_dict = {}

# Map each category id to its name
for item in cat_list:
    new_cat_dict[item['id']] = item['name']

# Group the annotations by the id of the image they belong to
for item in anno_list:
    img_id = item.get('image_id')
    if img_id not in new_anno_dict:
        temp_list = []
        temp_list.append(item)
        new_anno_dict[img_id] = temp_list
    else:
        new_anno_dict.get(img_id).append(item)
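The same regrouping can be written more compactly with `collections.defaultdict` and a dict comprehension. The sketch below runs on a small toy sample: the category names are taken from the listing above, but the annotation records are made up for illustration (two boxes are deliberately assigned to image 0 to show the grouping).

```python
from collections import defaultdict

# Toy records in the same shape as the COCO lists (illustrative values only)
toy_cats = [
    {'id': 7, 'name': 'black-rook', 'supercategory': 'pieces'},
    {'id': 8, 'name': 'white-bishop', 'supercategory': 'pieces'},
    {'id': 10, 'name': 'white-knight', 'supercategory': 'pieces'},
]
toy_annos = [
    {'id': 0, 'image_id': 0, 'category_id': 7, 'bbox': [220, 14, 18, 46.0]},
    {'id': 1, 'image_id': 0, 'category_id': 8, 'bbox': [187, 103, 22.7, 59.1]},
    {'id': 2, 'image_id': 1, 'category_id': 10, 'bbox': [203, 24, 24.3, 60.5]},
]

# category id -> category name
cat_names = {cat['id']: cat['name'] for cat in toy_cats}

# image id -> list of annotations for that image
annos_by_image = defaultdict(list)
for anno in toy_annos:
    annos_by_image[anno['image_id']].append(anno)

labels_image_0 = [cat_names[a['category_id']] for a in annos_by_image[0]]
print(labels_image_0)
```

Looking up all boxes of an image becomes a single dictionary access instead of a scan over the full annotation list, which matters once the dataset is indexed repeatedly during training.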

Now, let's create a simple augmentation pipeline that flips the image horizontally and adds a parameter for the bounding boxes:

transform = A.Compose(
    [A.HorizontalFlip(p=0.5)],
    bbox_params=A.BboxParams(format='coco', label_fields=['category_ids']),
)
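The reason the pipeline needs to know about the boxes is that a geometric transform moves them along with the pixels. For a horizontal flip of a COCO box, only the x coordinate changes. The function below sketches that geometry by hand; it is an illustration with a hypothetical name, not how Albumentations implements it internally.

```python
def hflip_coco_bbox(bbox, image_width):
    """Mirror a COCO [x_min, y_min, width, height] box across the vertical axis."""
    x_min, y_min, w, h = bbox
    # The box's old right edge becomes its new left edge, measured from the other side
    return [image_width - x_min - w, y_min, w, h]


# A box in a 416-pixel-wide image, as in the chess dataset
flipped = hflip_coco_bbox([220, 14, 18, 46.0], image_width=416)
print(flipped)
```

Applying the flip twice returns the original box, which is a quick sanity check for this kind of coordinate transform.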

Finally, we create a dataset similar to the PyTorch Dataset class. To do this, we need to define a class that implements the __len__ and __getitem__ methods.

class ImageDataset:
    def __init__(self, path, img_list, anno_dict, cat_dict, albumentations=None):
        self.path = path
        self.img_list = img_list
        self.anno_dict = anno_dict
        self.cat_dict = cat_dict
        self.albumentations = albumentations

    def __len__(self):
        return len(self.img_list)

    def __getitem__(self, idx):
        # Each image may contain several objects, hence several boxes
        annotations = self.anno_dict[int(idx)]
        bboxes = [item['bbox'] for item in annotations]
        cat_ids = [item['category_id'] for item in annotations]
        categories = [self.cat_dict[cat_id] for cat_id in cat_ids]
        image = self.img_list[idx]
        img = Image.open(f"{self.path}{image.get('file_name')}")
        img = img.convert("RGB")
        img = np.array(img)
        if self.albumentations is not None:
            augmented = self.albumentations(image=img, bboxes=bboxes, category_ids=cat_ids)
            img = augmented["image"]
            # Keep the transformed boxes and labels so they match the augmented image
            bboxes = augmented["bboxes"]
            cat_ids = augmented["category_ids"]
        return {
            "image": img,
            "bboxes": bboxes,
            "category_ids": cat_ids,
            "category": categories
        }

# path is the location of the JSON file and the images
dataset = ImageDataset(path, image_list, new_anno_dict, new_cat_dict, transform)
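One practical detail when batching such a dataset: each image can carry a different number of objects, so the per-sample lists cannot simply be stacked into one array. A possible workaround is a custom collate function that keeps every field as a list with one entry per sample. The sketch below is plain Python, uses a hypothetical function name, and is exercised on fake samples rather than on the chess images.

```python
def detection_collate(batch):
    """Turn a list of sample dicts into one dict of lists, one entry per sample."""
    return {key: [sample[key] for sample in batch] for key in batch[0]}


# Fake samples with different numbers of objects per image
fake_batch = [
    {"image": "img_0", "category_ids": [7], "category": ["black-rook"]},
    {"image": "img_1", "category_ids": [8, 10], "category": ["white-bishop", "white-knight"]},
]

collated = detection_collate(fake_batch)
print(collated["category_ids"])
```

Assuming PyTorch is installed, a function like this could be passed as the `collate_fn` argument of `torch.utils.data.DataLoader`, for example `DataLoader(dataset, batch_size=4, collate_fn=detection_collate)`.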

Here are some results from iterating over the custom dataset:

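We can now easily pass this custom dataset to a data loader to train our model.

Feature Extraction

You may have heard of pretrained models being used to train image classifiers and for other supervised learning tasks. But did you know that you can also use pretrained models to extract features from images? In short, feature extraction is a form of dimensionality reduction in which a large number of pixels is reduced to a more efficient representation. This is mainly useful for unsupervised machine learning tasks. Let's try to extract features from images with a pretrained PyTorch model. To do this, we first have to define our feature extractor class:

class ResnetFeatureExtractor(nn.Module):
    def __init__(self, model):
        super(ResnetFeatureExtractor, self).__init__()
        self.model = nn.Sequential(*model.children())[:-1]

    def forward(self, x):
        return self.model(x)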

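Note that on line 4 (the self.model assignment) a new model is created that keeps every layer of the original model except the last one. As you will recall, the last layer of a neural network is the dense layer that predicts the output. Since we are only interested in extracting features, we do not need that layer, so it is excluded. We then make use of torchvision's pretrained resnet34 model by passing it to the ResnetFeatureExtractor constructor. Let's use the famous CIFAR10 dataset (50,000 images) and loop over it to extract features.

CIFAR10 dataset (source)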