Introduction to PyTorch


Bikash Santra

Indian Statistical Institute, Kolkata


Author: Soumith Chintala

PyTorch

PyTorch is a Python-based scientific computing package aimed at two audiences:

  • those who want a replacement for NumPy that can use the power of GPUs
  • those who want a deep learning research platform that provides maximum flexibility and speed

Tensors

Tensors are similar to NumPy’s ndarrays, with the addition that Tensors can also be used on a GPU to accelerate computing.

In [1]:
%matplotlib inline
from __future__ import print_function
import torch
1. Construct a 5x3 matrix, uninitialized (its entries are whatever values were already in the allocated memory):
In [2]:
x = torch.empty(5, 3)
print(x, x.dtype)
tensor([[ 0.0000,  0.0000, -0.0000],
        [ 0.0000, -0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000],
        [ 0.0000, -0.0000,  0.0000],
        [-0.0000,  0.0000, -0.0000]]) torch.float32
2. Construct a randomly initialized matrix:
In [3]:
x = torch.rand(5, 3)
print(x, x.dtype)
tensor([[0.9331, 0.7786, 0.3861],
        [0.3608, 0.8381, 0.5882],
        [0.5534, 0.1672, 0.1186],
        [0.9113, 0.5418, 0.4060],
        [0.1186, 0.9712, 0.5674]]) torch.float32
3. Construct a matrix filled with zeros and of dtype long:
In [4]:
x = torch.zeros(5, 3, dtype=torch.long)
print(x)
tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])
4. Construct a tensor directly from data:
In [5]:
x = torch.tensor([5.5, 3])
print(x)
tensor([5.5000, 3.0000])
5. Help on 'torch' functions:
In [6]:
help(torch.transpose)
Help on built-in function transpose:

transpose(...)
    transpose(input, dim0, dim1) -> Tensor
    
    Returns a tensor that is a transposed version of :attr:`input`.
    The given dimensions :attr:`dim0` and :attr:`dim1` are swapped.
    
    The resulting :attr:`out` tensor shares it's underlying storage with the
    :attr:`input` tensor, so changing the content of one would change the content
    of the other.
    
    Args:
        input (Tensor): the input tensor
        dim0 (int): the first dimension to be transposed
        dim1 (int): the second dimension to be transposed
    
    Example::
    
        >>> x = torch.randn(2, 3)
        >>> x
        tensor([[ 1.0028, -0.9893,  0.5809],
                [-0.1669,  0.7299,  0.4942]])
        >>> torch.transpose(x, 0, 1)
        tensor([[ 1.0028, -0.1669],
                [-0.9893,  0.7299],
                [ 0.5809,  0.4942]])

6. Transpose of a tensor:
In [7]:
x = torch.tensor([5.5, 3])
print(x, torch.transpose(x,0,0))
tensor([5.5000, 3.0000]) tensor([5.5000, 3.0000])
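
Transposing a 1-D tensor along dimension 0 with itself, as in the cell above, returns the tensor unchanged. A minimal sketch on a 2-D tensor makes the effect visible and also shows the storage sharing mentioned in the help text. The names ``m`` and ``mt`` and the values are illustrative, not part of the original notebook.

```python
import torch

# A 2x3 matrix with known entries (illustrative values)
m = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])

# Swap dimension 0 and dimension 1: the result has shape 3x2
mt = torch.transpose(m, 0, 1)
print(m.size(), mt.size())

# The transposed tensor shares storage with the input,
# so a write through mt is visible in m
mt[0, 1] = 40.
print(m)
```

Here ``mt[0, 1]`` and ``m[1, 0]`` refer to the same element, which is why the write shows up in ``m``.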
7. Creating tensor based on existing tensor:

These methods reuse properties of the input tensor, e.g. dtype, unless new values are provided by the user.

In [8]:
x = x.new_ones(5, 3, dtype=torch.double)      # new_* methods take in sizes
print(x)

x = torch.randn_like(x, dtype=torch.float)    # override dtype!
print(x)                                      # result has the same size
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
tensor([[-0.4720, -0.5281,  0.0683],
        [-0.2435,  0.0228,  1.3377],
        [ 0.4934, -1.1588,  0.8927],
        [-1.7224, -0.5271,  0.1744],
        [ 0.7074,  2.3632, -0.6392]])
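
To see which properties are inherited, the following sketch checks the dtype and size of the results directly. It assumes a small ``base`` tensor of dtype double; the names ``base``, ``ones`` and ``noise`` are illustrative.

```python
import torch

base = torch.zeros(2, 2, dtype=torch.double)

# new_ones takes a size; dtype and device are inherited from base
ones = base.new_ones(3, 4)
print(ones.dtype, ones.size())

# randn_like takes a tensor and inherits its size; the dtype is overridden here
noise = torch.randn_like(ones, dtype=torch.float)
print(noise.dtype, noise.size())
```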
8. Getting size of a tensor:

torch.Size is in fact a tuple, so it supports all tuple operations.

In [9]:
y = x.size()
print(y[0])
5
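
Because torch.Size behaves like a tuple, the usual tuple idioms apply. A short sketch, with illustrative variable names:

```python
import torch

t = torch.zeros(5, 3)
size = t.size()

rows, cols = size                # tuple unpacking
print(rows, cols)
print(len(size))                 # number of dimensions
print(isinstance(size, tuple))   # torch.Size is a subclass of tuple
print(t.size(1), t.shape)        # size of one dimension; .shape holds the same value as .size()
```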

Operations on Tensors

There are multiple syntaxes for operations. In the following example, we will take a look at various operations.

1. Addition: syntax 1
In [10]:
y = torch.rand(5, 3)
print(x + y)
tensor([[-0.0722, -0.0095,  0.3673],
        [ 0.2229,  0.0690,  1.6079],
        [ 0.6903, -0.3592,  1.2882],
        [-1.6149,  0.0336,  0.8815],
        [ 0.9481,  2.6690,  0.1140]])
2. Addition: syntax 2
In [11]:
print(torch.add(x, y))
tensor([[-0.0722, -0.0095,  0.3673],
        [ 0.2229,  0.0690,  1.6079],
        [ 0.6903, -0.3592,  1.2882],
        [-1.6149,  0.0336,  0.8815],
        [ 0.9481,  2.6690,  0.1140]])
3. Addition: providing an output tensor as argument
In [12]:
result = torch.empty(5, 3)
torch.add(x, y, out=result)
print(result)
tensor([[-0.0722, -0.0095,  0.3673],
        [ 0.2229,  0.0690,  1.6079],
        [ 0.6903, -0.3592,  1.2882],
        [-1.6149,  0.0336,  0.8815],
        [ 0.9481,  2.6690,  0.1140]])
4. Addition: in-place
In [13]:
# adds x to y
y.add_(x)
print(y)
tensor([[-0.0722, -0.0095,  0.3673],
        [ 0.2229,  0.0690,  1.6079],
        [ 0.6903, -0.3592,  1.2882],
        [-1.6149,  0.0336,  0.8815],
        [ 0.9481,  2.6690,  0.1140]])

Note

Any operation that mutates a tensor in-place is post-fixed with an ``_``. For example: ``x.copy_(y)``, ``x.t_()``, will change ``x``.
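
The difference between the two forms can be sketched as follows, using small illustrative tensors ``p`` and ``q``:

```python
import torch

p = torch.ones(2, 2)
q = torch.full((2, 2), 3.0)

r = p.add(q)     # out-of-place: returns a new tensor, p is unchanged
print(p)

p.add_(q)        # in-place: p itself now holds the sum
print(p)
```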

NumPy-like Indexing

In [14]:
print(x[:, 1])
tensor([-0.5281,  0.0228, -1.1588, -0.5271,  2.3632])
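
A few more indexing patterns that carry over from NumPy, sketched on a small tensor with known entries (the tensor ``t`` is illustrative):

```python
import torch

t = torch.arange(12).view(3, 4)   # values 0..11 arranged in 3 rows and 4 columns

print(t[0])          # first row
print(t[:, 1])       # second column
print(t[1:, :2])     # rows 1 and 2, columns 0 and 1
print(t[t > 8])      # boolean mask: elements greater than 8
```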

Deep Copy of Tensors

In [15]:
a = torch.zeros((2,3))
print(a)
tensor([[0., 0., 0.],
        [0., 0., 0.]])
In [16]:
b = a.t().clone()
print(b)
tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])
In [17]:
b[0,0] = 5
print(b)
print(a)
tensor([[5., 0.],
        [0., 0.],
        [0., 0.]])
tensor([[0., 0., 0.],
        [0., 0., 0.]])
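
The call to ``clone()`` is what makes ``b`` independent of ``a``. For contrast, here is a sketch comparing a transposed view with a cloned copy; the names ``view_of_a`` and ``copy_of_a`` are illustrative.

```python
import torch

a = torch.zeros(2, 3)

view_of_a = a.t()           # shares storage with a
copy_of_a = a.t().clone()   # independent copy with its own memory

view_of_a[0, 0] = 5         # also changes a[0, 0]
copy_of_a[1, 0] = 7         # leaves a untouched

print(a)
print(copy_of_a)
```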

Resizing / Reshaping Tensors with torch.view

In [18]:
x = torch.randn(4, 4, 4)
y = x.view(64)
z = x.view(-1, 2, 8)  # the size -1 is inferred from other dimensions
print(x.size(), y.size(), z.size())
torch.Size([4, 4, 4]) torch.Size([64]) torch.Size([4, 2, 8])
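
Two properties of ``view`` are worth checking by hand: the result looks at the same data as the original tensor, and the requested shape must account for every element. A minimal sketch with illustrative names:

```python
import torch

x = torch.arange(16.)        # 16 elements: 0., 1., ..., 15.
grid = x.view(4, 4)
wide = x.view(-1, 8)         # -1 is inferred as 16 / 8 = 2

print(grid.size(), wide.size())

# view returns a tensor over the same data, so a write through grid shows up in x
grid[0, 0] = 100.
print(x[0])

# A shape whose element count differs from 16 is rejected
try:
    x.view(3, 5)
except RuntimeError as err:
    print("view failed:", err)
```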

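If you have a one-element tensor, use .item() to get the value as a Python number. Note that indexing alone, as in the cell below, still returns a zero-dimensional tensor rather than a Python number.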

In [19]:
x = torch.randn(5)
print(x, type(x))
print(x[0], type(x[0]))  # indexing returns a 0-dim tensor, not a Python number
tensor([-0.1033,  0.6318, -0.0639,  0.2737,  0.0403]) <class 'torch.Tensor'>
tensor(-0.1033) <class 'torch.Tensor'>
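
The cell above stops at the zero-dimensional tensor. A short sketch of the conversion itself, using ``.item()`` for a single element and ``.tolist()`` for several (variable names are illustrative):

```python
import torch

x = torch.randn(5)

first = x[0]              # indexing gives a zero-dimensional tensor
value = x[0].item()       # .item() converts it to a Python float
print(type(first), type(value))

total = x.sum().item()    # reductions return one-element tensors as well
print(type(total))

values = x.tolist()       # for more than one element, tolist() gives Python numbers
print(len(values))
```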

Read later: 100+ Tensor operations, including transposing, indexing, slicing, mathematical operations, linear algebra, random numbers, etc., are described here <http://pytorch.org/docs/torch>.

NumPy Bridge

Converting a Torch Tensor to a NumPy array and vice versa is a breeze.

When the Torch Tensor is on the CPU, it and the NumPy array share their underlying memory locations, so changing one will change the other.

In [20]:
a = torch.ones(5)
print(a, type(a))
tensor([1., 1., 1., 1., 1.]) <class 'torch.Tensor'>
1. Converting a Torch Tensor to a NumPy Array
In [21]:
b = a.numpy()
print(b, type(b))
[1. 1. 1. 1. 1.] <type 'numpy.ndarray'>

See how the NumPy array changes in value when the tensor is modified in place.

In [22]:
a.add_(10)
print(a)
print(b)
tensor([11., 11., 11., 11., 11.])
[11. 11. 11. 11. 11.]
2. Converting NumPy Array to Torch Tensor

See how changing the NumPy array changes the Torch Tensor automatically.

In [23]:
import numpy as np
a = np.ones(5)
print(a)
[1. 1. 1. 1. 1.]
In [24]:
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(b)
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)

Note

All the Tensors on the CPU except a CharTensor support converting to NumPy and back.
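
When the sharing is not wanted, the data can be copied instead. The sketch below contrasts ``torch.from_numpy``, which shares memory, with ``torch.tensor``, which copies; the names ``arr``, ``shared`` and ``copied`` are illustrative.

```python
import numpy as np
import torch

arr = np.ones(3)

shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # copies the data

arr += 1                         # in-place update of the NumPy array

print(shared)   # reflects the update
print(copied)   # still holds the original values
```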

CUDA Tensors

Tensors can be moved onto any device using the .to method.

In [25]:
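# Query the device name only if CUDA is available;
# torch.cuda.get_device_name(0) raises an error on a machine without a GPU
# We will use ``torch.device`` objects to move tensors in and out of GPU
if torch.cuda.is_available():
    print(torch.cuda.is_available(), torch.cuda.get_device_name(0))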
True TITAN Xp
In [26]:
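if torch.cuda.is_available():
    torch.cuda.empty_cache()            # release cached GPU memory that is no longer in use
    torch.cuda.set_device(0)            # make GPU 0 the current device
    device = torch.device("cuda:0")     # a device object referring to GPU 0

    y = torch.ones_like(x).to(device)   # create y on the CPU, then move it to the GPU
    x = x.to(device)                    # or just use strings ``.to("cuda")``
    z = x + y
    print(z)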
tensor([0.8967, 1.6318, 0.9361, 1.2737, 1.0403], device='cuda:0')
In [27]:
x = torch.rand(3,3)
y = torch.rand(3,3)

print(x)
print(y)
tensor([[0.2721, 0.4461, 0.3787],
        [0.8215, 0.2232, 0.7930],
        [0.1594, 0.0333, 0.2712]])
tensor([[0.2953, 0.6222, 0.1779],
        [0.7139, 0.0724, 0.6469],
        [0.8526, 0.4841, 0.4538]])
In [28]:
x = x.cuda()
print(x)
tensor([[0.2721, 0.4461, 0.3787],
        [0.8215, 0.2232, 0.7930],
        [0.1594, 0.0333, 0.2712]], device='cuda:0')
In [29]:
print(x+y)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-29-39cb3db33052> in <module>()
----> 1 print(x+y)

RuntimeError: TensorIterator expected type torch.cuda.FloatTensor but got torch.FloatTensor[3, 3]
In [30]:
y = y.cuda()
print(x+y)
tensor([[0.5673, 1.0683, 0.5566],
        [1.5354, 0.2957, 1.4399],
        [1.0119, 0.5174, 0.7250]], device='cuda:0')
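
The error above arises because the two operands live on different devices. One common way to avoid it is to choose a device once and place every tensor on it. The sketch below falls back to the CPU when no GPU is present; that fallback is an assumption added for illustration and is not part of the original notebook.

```python
import torch

# Fall back to the CPU when no GPU is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(3, 3, device=device)   # create directly on the chosen device
y = torch.rand(3, 3).to(device)       # or create on the CPU and move

z = x + y                             # both operands are on the same device
print(z.device)
```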

From GPU to CPU

In [31]:
print(y.cpu())
tensor([[0.2953, 0.6222, 0.1779],
        [0.7139, 0.0724, 0.6469],
        [0.8526, 0.4841, 0.4538]])
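
Moving a tensor back to the CPU is also the step needed before converting it to NumPy, since the NumPy bridge works on CPU tensors. A minimal sketch, written so that it also runs on a machine without a GPU (the fallback and the names ``t`` and ``arr`` are illustrative):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t = torch.ones(2, 2, device=device)

# .cpu() returns the tensor itself if it is already on the CPU,
# otherwise a copy in CPU memory
arr = t.cpu().numpy()
print(type(arr), arr.shape)
```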