%%capture
!pip install kornia
!pip install kornia-rs
Geometric image and points transformations
In this tutorial we will learn how to combine Kornia with plain torch components to perform data augmentation.
Kornia recently introduced a module called kornia.augmentation which, among other functionalities, provides a set of operators to perform geometric data augmentation with the option to retrieve the transformation applied to the original image, so that the same transformation can be applied to additional data such as keypoints, bounding boxes, or others.
Our geometric transformations API is compliant with torchvision, including a few extras such as the ability to retrieve the transformation matrix applied to the original image. Additionally, our API inherits from nn.Module, meaning that it can be combined with nn.Sequential to chain the different transformations that are applied. Moreover, we can compute on batches of images using different devices such as CPU/GPU (and TPU in the future). Finally, all the operators are fully differentiable, a topic that we will cover in future tutorials so that users can make use of this feature.
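To make these two points concrete, here is a minimal, self-contained sketch (the shapes and parameter values are arbitrary choices for illustration, not part of the tutorial's data): chaining two augmentations with nn.Sequential and backpropagating through them.

import torch
import torch.nn as nn
import kornia as K

# chain two augmentations with nn.Sequential
aug_seq = nn.Sequential(
    K.augmentation.RandomAffine(degrees=[-45.0, 45.0], p=1.0),
    K.augmentation.ColorJitter(0.5, 0.5, p=1.0),
)

x = torch.rand(2, 3, 64, 64, requires_grad=True)  # BxCxHxW batch
out = aug_seq(x)  # same shape, randomly transformed

# the operators are differentiable: gradients flow back to the input
out.mean().backward()
print(x.grad.shape)  # torch.Size([2, 3, 64, 64])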
In brief, in this tutorial we will learn how to:

- Use kornia.augmentation.RandomAffine to generate random views and retrieve the transformation.
- Use kornia.geometry.transform_points to manipulate points between views.
- Combine the above in a nn.Module with other kornia.augmentation components to generate a complete augmentation pipeline.
Installation
We first install Kornia and Matplotlib for visualisation.
To play with the data we will use some samples from the HPatches dataset [1].
[1] HPatches: A benchmark and evaluation of handcrafted and learned local descriptors, Vassileios Balntas, Karel Lenc, Andrea Vedaldi and Krystian Mikolajczyk, CVPR 2017.
import io

import requests


def download_image(url: str, filename: str = "") -> str:
    filename = url.split("/")[-1] if len(filename) == 0 else filename
    # Download
    bytesio = io.BytesIO(requests.get(url).content)
    # Save file
    with open(filename, "wb") as outfile:
        outfile.write(bytesio.getbuffer())
    return filename


download_image("https://github.com/kornia/data/raw/main/homography/img1.ppm")
download_image("https://github.com/kornia/data/raw/main/v_dogman.ppm")
download_image("https://github.com/kornia/data/raw/main/v_maskedman.ppm")
download_image("https://github.com/kornia/data/raw/main/delorean_side.png")
Setup
We will import the needed libraries and create a few small utility functions to make use of OpenCV I/O.
%matplotlib inline
import cv2
import kornia as K
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
Define a function for visualisation using Matplotlib.
def imshow(image: np.ndarray, height: int, width: int):
    """Utility function to plot images."""
    plt.figure(figsize=(height, width))
    plt.imshow(image)
    plt.axis("off")
    plt.show()
Since Kornia doesn't provide rendering functionalities, let's use OpenCV's cv2.circle to draw points.
def draw_points(img_t: torch.Tensor, points: torch.Tensor) -> np.ndarray:
    """Utility function to draw a set of points in an image."""
    # cast image to numpy (HxWxC)
    img: np.ndarray = K.utils.tensor_to_image(img_t)

    # using the cv2.circle() method,
    # draw circles with blue borders of thickness of 5 px
    img_out: np.ndarray = img.copy()

    for pt in points:
        x, y = int(pt[0]), int(pt[1])
        img_out = cv2.circle(img_out, (x, y), radius=10, color=(0, 0, 255), thickness=5)
    return np.clip(img_out, 0, 1)
Transform single image
In this section we show how to open a single image, generate 2d random points and plot them using OpenCV and Matplotlib.
Next, we will use kornia.augmentation.RandomAffine to generate a random synthetic view of the given image and show how to retrieve the generated transformation, which we will later use to transform the points between images.
# load original image
img1: torch.Tensor = K.io.load_image("img1.ppm", K.io.ImageLoadType.RGB32)[None, ...]  # BxCxHxW

# generate N random points within the image
N: int = 10  # the number of points
B, CH, H, W = img1.shape

points1: torch.Tensor = torch.rand(1, N, 2)
points1[..., 0] *= W
points1[..., 1] *= H

# draw points and show
img1_vis: np.ndarray = draw_points(img1[0], points1[0])

imshow(img1_vis, 10, 10)
Now let's move to a slightly more complex example and start using the kornia.augmentation API to transform an image and retrieve the applied transformation. We'll show how to reuse this transformation to project the 2d points between images.
# declare an instance of our random affine generator with p=1.0, so that
# the transformation is always applied; we can then retrieve the
# transformation applied to the original image.
transform: nn.Module = K.augmentation.RandomAffine(degrees=[-45.0, 45.0], p=1.0)

# transform image and retrieve transformation
img2 = transform(img1)
trans = transform.get_transformation_matrix(img1)

# transform the original points
points2: torch.Tensor = K.geometry.transform_points(trans, points1)

img2_vis: np.ndarray = draw_points(img2, points2[0])

imshow(img2_vis, 15, 15)
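For intuition, trans is a Bx3x3 transformation matrix. Below is a minimal sketch of what kornia.geometry.transform_points computes (up to numerical details, which are an assumption here; points2_manual is a hypothetical name introduced for illustration): convert the points to homogeneous coordinates, apply the matrix, and dehomogenize.

# sketch: apply the 3x3 transform to the points by hand
ones = torch.ones(points1.shape[0], points1.shape[1], 1)
points1_h = torch.cat([points1, ones], dim=-1)                   # BxNx3, homogeneous
points2_h = (trans @ points1_h.transpose(1, 2)).transpose(1, 2)  # BxNx3
points2_manual = points2_h[..., :2] / points2_h[..., 2:]         # dehomogenize

print(torch.allclose(points2_manual, points2, atol=1e-4))  # expected: True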
Transform batch of images
In the introduction we explained the capability of kornia.augmentation to be integrated with other torch components such as nn.Module and nn.Sequential.
We will create a small component to perform data augmentation on batched images, reusing the same ideas shown before to transform images and points.
First, let's define a class that will generate samples of synthetic views with a small color augmentation, using the kornia.augmentation.ColorJitter and kornia.augmentation.RandomAffine components.
NOTE: we set the forward pass to have no gradients with the decorator @torch.no_grad() to make it more memory efficient.
from typing import Dict


class DataAugmentator(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # declare kornia components as class members
        self.k1 = K.augmentation.RandomAffine([-60, 60], p=1.0)
        self.k2 = K.augmentation.ColorJitter(0.5, 0.5, p=1.0)

    @torch.no_grad()
    def forward(self, img1: torch.Tensor, pts1: torch.Tensor) -> Dict[str, torch.Tensor]:
        assert len(img1.shape) == 4, img1.shape

        # apply geometric transform and retrieve the transformation matrix
        img2 = self.k1(img1)
        trans = self.k1.get_transformation_matrix(img1)

        # apply color transform
        img1, img2 = self.k2(img1), self.k2(img2)

        # finally, lets use the transform to project the points
        pts2: torch.Tensor = K.geometry.transform_points(trans, pts1)

        return dict(img1=img1, img2=img2, pts1=pts1, pts2=pts2)
Let's use the defined component and generate some synthetic data!
# load data and make a batch
img1: torch.Tensor = K.io.load_image("v_dogman.ppm", K.io.ImageLoadType.RGB32)[None, ...]  # BxCxHxW
img2: torch.Tensor = K.io.load_image("v_maskedman.ppm", K.io.ImageLoadType.RGB32)[None, ...]  # BxCxHxW

# crop data to make it homogeneous
crop = K.augmentation.CenterCrop((512, 786))

img1, img2 = crop(img1), crop(img2)

# visualize
img_vis = torch.cat([img1, img2], dim=-1)
imshow(K.tensor_to_image(img_vis), 15, 15)
# create an instance of the augmentation pipeline
# NOTE: remember that this is an nn.Module and could be
# placed inside any network, pytorch-lightning module, etc.
aug: nn.Module = DataAugmentator()

for _ in range(5):  # create some samples
    # generate batch
    img_batch = torch.cat([img1, img2], dim=0)

    # generate random points (or from a network)
    N: int = 25
    B, CH, H, W = img_batch.shape

    points: torch.Tensor = torch.rand(B, N, 2)
    points[..., 0] *= W
    points[..., 1] *= H

    # sample data
    batch_data = aug(img_batch, points)

    # plot and show: visualize both images
    img_vis_list = []

    for i in range(2):
        img1_vis: np.ndarray = draw_points(batch_data["img1"][i], batch_data["pts1"][i])
        img_vis_list.append(img1_vis)

        img2_vis: np.ndarray = draw_points(batch_data["img2"][i], batch_data["pts2"][i])
        img_vis_list.append(img2_vis)

    img_vis = np.concatenate(img_vis_list, axis=1)

    imshow(img_vis, 20, 20)
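Since DataAugmentator is a plain nn.Module, the same pipeline also runs on GPU, as mentioned in the introduction. A minimal sketch, assuming a CUDA device is available (aug_gpu and batch_data_gpu are names introduced here for illustration):

# move the pipeline and the data to the GPU and sample again
if torch.cuda.is_available():
    aug_gpu: nn.Module = DataAugmentator().to("cuda")
    batch_data_gpu = aug_gpu(img_batch.to("cuda"), points.to("cuda"))
    print(batch_data_gpu["img2"].device)  # cuda:0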
BONUS: Backprop to the future
One of the main motivations during the design of the kornia.augmentation API was to give the user the flexibility to retrieve the applied transformation, in order to achieve one of the main purposes of Kornia: reverse engineering.
In this case we will show how easily one can combine Kornia and PyTorch components to undo the transformations and go back to the original data.
“Wait a minute, Doc. Are you telling me you built a time machine… out of a PyTorch?” - Marty McFly
# lets start the Delorean engine
delorean: torch.Tensor = K.io.load_image("delorean_side.png", K.io.ImageLoadType.RGB32)[None, ...]  # BxCxHxW

imshow(K.utils.tensor_to_image(delorean), 10, 10)
“If my calculations are correct, when this baby hits 88 miles per hour, you’re gonna see some serious shit.” - Doc. Brown
# turn on the time machine panel (TMP)
TMP = K.augmentation.RandomHorizontalFlip(p=1.0)

delorean_past = TMP(delorean)  # go!
time_coords_past = TMP.get_transformation_matrix(delorean)

imshow(K.utils.tensor_to_image(delorean_past), 10, 10)
Let’s go back to the future!
“Marty! You’ve gotta come back with me!” - Doc. Brown
# lets go back to the future by inverting the transformation
time_coords_future: torch.Tensor = torch.inverse(time_coords_past)

H, W = delorean_past.shape[-2:]
delorean_future = K.geometry.warp_perspective(delorean_past, time_coords_future, (H, W))

imshow(K.utils.tensor_to_image(delorean_future), 10, 10)
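As a sanity check, the round trip should land us (almost) exactly where we started: a horizontal flip maps pixels onto integer coordinates, so the inverse warp is essentially lossless here, up to interpolation and border effects (the tolerance below is an assumption):

# the future should match the original (up to interpolation/border effects)
print(torch.allclose(delorean, delorean_future, atol=1e-4))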