# PyTorch frameworks, a few comparisons

Catalyst, Fastai, Ignite and Pytorch-Lightning are all amazing frameworks, but which one should I use for project X? I have been asking myself the same question, and there is no easy answer.

There are several factors at play, and framework selection also depends on your background. I will outline some basic statistics and library code-style examples. I believe it is important to be able to delve into the source code of these libraries when you inevitably get stuck on a problem, can't work out how to implement something, or want to debug your code.

## Catalyst

I started to use Catalyst last year…

# Pytorch tensor operations

This post covers some of the key operations used in PyTorch.

# argmax

Returns the index of the maximum value of all elements in the input tensor. When no `dim` is given, this is an index into the flattened tensor.

When a `dim` is passed, this is the second value returned by `torch.max()`.

```python
import torch

a = torch.randn(4, 3)
a
>>tensor([[ 2.0149,  1.0420, -1.3816],
        [-1.0265, -0.5212, -0.7570],
        [-0.5141,  0.5674,  0.1039],
        [-0.1549, -0.3003, -0.1086]])
torch.argmax(a)
>>tensor(0)
b = torch.randn(4)
b
>>tensor([0.6022, 1.1465, 0.3250, 1.0555])
torch.argmax(b)
>>tensor(1)
```
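As a small sketch of how the flat index relates to rows and columns, the example below uses fixed tensor values of my own choosing (not from the post) so the result is reproducible. It also checks that, once a `dim` is given, `torch.argmax` agrees with the `indices` field returned by `torch.max`.

```python
import torch

# Illustrative fixed values (an assumption, not taken from the post).
a = torch.tensor([[1.0, 5.0, 2.0],
                  [7.0, 0.0, 3.0]])

flat_idx = torch.argmax(a).item()         # index into the flattened tensor
row, col = divmod(flat_idx, a.shape[1])   # convert the flat index back to (row, col)

per_row = torch.argmax(a, dim=1)          # column index of the maximum in each row
per_col = torch.argmax(a, dim=0)          # row index of the maximum in each column

# With dim given, argmax matches the indices returned by torch.max.
same = torch.equal(per_row, torch.max(a, dim=1).indices)
```

Here the largest value (7.0) sits at flat index 3, which corresponds to row 1, column 0.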

# max

`torch.max(input)`

Returns the maximum value of all elements in the input tensor.

`torch.max(input, dim, keepdim=False, out=None)`

Returns a namedtuple (values, indices) where values is the maximum value of each row of the input…
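The two call forms can be compared side by side. This is a minimal sketch on an illustrative tensor (the values are my own, not from the post), showing the single-value result, the `(values, indices)` namedtuple, and the effect of `keepdim` on the output shape.

```python
import torch

# Illustrative fixed values (an assumption, not taken from the post).
x = torch.tensor([[1.0, 5.0, 2.0],
                  [7.0, 0.0, 3.0]])

overall = torch.max(x)                    # 0-dim tensor holding the global maximum

result = torch.max(x, dim=1)              # namedtuple with fields .values and .indices
values, indices = result                  # it also unpacks like a plain tuple

kept = torch.max(x, dim=1, keepdim=True)  # reduced dimension is kept with size 1
kept_shape = tuple(kept.values.shape)
```

With `dim=1` there is one maximum per row; `keepdim=True` keeps the result two-dimensional, which is convenient when broadcasting against the original tensor.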

# Numpy Axes

This post gives a quick overview of axes in NumPy (NB: NumPy and PyTorch use the same representation).

`import numpy as np`

# Axes

## 2D matrices

For both numpy and pytorch, axis 0 = row, 1 = column

Note how when we specify axis=0 for sum, we are collapsing along that (row) axis

```python
np.sum([[1, 0], [3, 5]], axis=0)  # -> [1+3, 0+5]
>>array([4, 5])
```

If we sum along axis 1 we collapse the column dimension, i.e. we add across the columns within each row and get one value per row (not a total per column :-) )

```python
np.sum([[1, 0], [3, 5]], axis=1)  # -> [1+0, 3+5]
>>array([1, 8])
```
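One way to check which axis is being collapsed is to look at the shape of the result. The sketch below repeats the two sums and then uses a non-square array (my own illustrative addition) so that the removed axis is visible in the output shape.

```python
import numpy as np

m = np.array([[1, 0],
              [3, 5]])

col_totals = m.sum(axis=0)   # axis 0 (rows) collapsed -> array([4, 5])
row_totals = m.sum(axis=1)   # axis 1 (columns) collapsed -> array([1, 8])

# On a non-square array the removed axis shows up in the shape.
r = np.ones((2, 3))
shape_axis0 = r.sum(axis=0).shape                  # (3,): the length-2 row axis is gone
shape_axis1 = r.sum(axis=1).shape                  # (2,): the length-3 column axis is gone
shape_kept = r.sum(axis=1, keepdims=True).shape    # (2, 1): axis kept with size 1
```

The rule of thumb is that the axis you name is the one that disappears from the shape.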

# 3D arrays / tensors

Moving to 3D gets a bit more complicated; the trick is to use…

# Supermicro IPMI when a board won't POST

I bought a used Supermicro X9DRi-LN4F+ from eBay to use for data science projects involving large datasets, where having >128GB RAM and 20+ CPU cores would speed up data processing. The boards, E5-26xx v1/v2 series CPUs and ECC DDR3 RAM can be sourced cheaply compared with newer boards and CPUs that use DDR4 RAM.

After installing a single E5-2690 v1 CPU and one RAM DIMM, I found the board emitted two beeps plus a short blip and would not POST (nothing displayed on screen).

I removed the board from the PC case I had installed it in and put it on a desk…

# Batch Norm and Transfer Learning: what's going on?

A commonly used technique in deep learning is transfer learning, whereby the learned weights of a model pre-trained on one dataset are used to 'bootstrap' training of the early layers, or all but the final layer, of a modified version of the model applied to a different dataset.

This technique allows faster training: the model first learns only the weights of the last fully connected layers, then a low learning rate is used to fine-tune the weights of the entire model.

This post explores the effect of batch-normalisation on transfer-learning based models.
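The two-stage recipe described above can be sketched in PyTorch roughly as follows. The tiny `backbone` and `head` modules, the layer sizes and the learning rates are all hypothetical stand-ins chosen for illustration; a real project would load an actual pre-trained network instead.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pre-trained network plus a new final layer.
backbone = nn.Sequential(nn.Linear(8, 16), nn.BatchNorm1d(16), nn.ReLU())
head = nn.Linear(16, 2)                  # new head for the target dataset
model = nn.Sequential(backbone, head)


def trainable_params(module):
    """Count parameters that will receive gradient updates."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Stage 1: freeze the backbone so only the head's weights are learned.
for p in backbone.parameters():
    p.requires_grad = False
stage1_opt = torch.optim.SGD(head.parameters(), lr=1e-2)
n_stage1 = trainable_params(model)

# Stage 2: unfreeze everything and fine-tune with a much lower learning rate.
for p in model.parameters():
    p.requires_grad = True
stage2_opt = torch.optim.SGD(model.parameters(), lr=1e-4)
n_stage2 = trainable_params(model)
```

Note that setting `requires_grad = False` only stops gradient updates. A `BatchNorm` layer left in training mode still updates its running mean and variance, which is one reason batch norm deserves attention in transfer learning.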

Batch Norm:

Sergey Ioffe and Christian Szegedy in 2015 proposed…

# Gumtree sale scammer

Gumtree is a UK and Australian website for used-item sales, a bit like eBay but more personable. Recently I put an item up for sale, an RTX 2080 GPU (I was looking to upgrade to an RTX 2080 Ti). I received a response in less than an hour, and I have posted the message exchange in italics so you can see the scam play out.

*Hi A, I’m interested in “RTX 2080”. Is this still available? If so, when and where can I pick it up? Cheers David*

This is a default Gumtree email message. A quick lookup of David on Gumtree…

# Hitting a brick wall in a Kaggle Competition

Here I am reviewing my experience in the VSB Power Line Fault Detection Kaggle Competition.

The aim of the competition was to “detect partial discharge patterns in signals acquired from these power lines with a new meter designed at the ENET Centre at VSB — Technical University of Ostrava”.

A Partial Discharge (PD) is “A localized and quick electric discharge that only partially bridges the insulation material between the conductor’s materials or electrodes”. PDs are identified by localised, high-frequency spikes of higher than normal amplitude in the signal. …

# Building a Multi-GPU Deep Learning Machine on a budget

Here’s another story on building your own deep learning rig, containing the information I wish I had known a couple of years ago.

This story is aimed at building a single machine with 3 or 4 GPUs. The big factors limiting my deep learning training capability have been the number of available GPUs and the amount of available GPU VRAM. Having access to 3 or 4 GPUs on a single machine can be really useful, but such a machine can be tricky to build.

The first consideration is which CPU/motherboard combination to use. Each GPU should have a CPU/GPU bandwidth of x8 or…

# Watercooling a Deep Learning Machine

This article documents the lessons I have learnt building two watercooled deep learning boxes. There are plenty of YouTube videos and articles about watercooling gaming rigs, but much less information about multi-GPU watercooling setups.

If you research which type of GPU to use for multi-GPU deep learning rigs, most people nowadays will recommend blower-type GPUs such as the ASUS Turbo GeForce® RTX 2080 Ti. (Note that for single-GPU setups, open-air, blower and AIO styles are all totally fine.)

This makes sense, as when you have multiple GPUs you want to redirect hot air away…

# Pandas for time series data — tricks and tips

There are some Pandas DataFrame manipulations that I keep looking up how to do. I am recording these here to save myself time. These may help you too.

## Time series data

Convert column to datetime with given format

```python
import pandas as pd

# Assumes df already exists and has a 'day_time' column of timestamp strings
df['day_time'] = pd.to_datetime(df['day_time'], format='%Y-%m-%d %H:%M:%S')
# 0   2012-10-12 00:00:00
# 1   2012-10-12 00:30:00
# 2   2012-10-12 01:00:00
# 3   2012-10-12 01:30:00
```

Re-index a dataframe to fill in missing values (e.g. every 30 mins below). You need to have a datetime index on the df before running this.
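To see what the conversion buys you, here is a minimal self-contained sketch on a two-row frame. The data is made up for illustration; only the `day_time` column name and the format string come from the snippet above.

```python
import pandas as pd

# Illustrative data (an assumption): timestamps stored as plain strings.
df = pd.DataFrame({"day_time": ["2012-10-12 00:00:00", "2012-10-12 00:30:00"]})

df["day_time"] = pd.to_datetime(df["day_time"], format="%Y-%m-%d %H:%M:%S")

# The column now has a datetime dtype, so time arithmetic works directly.
is_datetime = pd.api.types.is_datetime64_any_dtype(df["day_time"])
half_hour = df["day_time"].iloc[1] - df["day_time"].iloc[0]
```

Passing an explicit `format` avoids ambiguous parsing and is usually faster than letting pandas infer the format.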

```python
# '30min' is the same frequency as the older '30T' alias
full_idx = pd.date_range(start=df['day_time'].min(), end=df['day_time'].max(), freq='30min')
df = (
    df
    .groupby('LCLid', as_index=False)
    .apply(lambda group: group.reindex(full_idx, method='nearest'))
    .reset_index(level=0, drop=True)
    .sort_index()
)
```
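It can help to see what `method='nearest'` actually does to the gaps. The sketch below uses a single made-up meter series (the `kwh` column and its values are hypothetical) and compares nearest-neighbour filling with time-based interpolation, which is an alternative I am adding for contrast and not something taken from the snippet above.

```python
import pandas as pd

# Illustrative readings (an assumption) with 00:30 and 01:00 missing.
readings = pd.DataFrame(
    {"kwh": [1.0, 4.0]},
    index=pd.to_datetime(["2012-10-12 00:00:00", "2012-10-12 01:30:00"]),
)

full_idx = pd.date_range(
    start=readings.index.min(), end=readings.index.max(), freq="30min"
)

# Copy the closest existing row into each missing slot.
nearest = readings.reindex(full_idx, method="nearest")

# Alternatively, insert NaN rows first and interpolate linearly in time.
linear = readings.reindex(full_idx).interpolate(method="time")
```

Here `nearest` repeats the existing readings (1.0, 1.0, 4.0, 4.0), while `linear` ramps between them (approximately 1, 2, 3, 4). Which one is appropriate depends on what the measurements represent.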

Find missing dates in…