ML Blog – Just another WordPress site

Image DataGenerator getting X and Y for each picture

The function below uses ImageDataGenerator followed by flow_from_directory. It solves the shuffle problem that appears when we predict with a model using test_generator as data, where the Y we get back does not match its paired X.
    import math
    import numpy as np
    number_of_examples = len(test_generator.filenames)
    number_of_generator_calls = math.ceil(number_of_examples / (1.0 * batch_size))
    x_final = np.empty((0,…
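The excerpt is cut off, so the following is only a minimal sketch of the general idea, not the post's own function. It assumes the generator yields (x_batch, y_batch) pairs in a fixed order (for flow_from_directory that would mean shuffle=False). The names collect_xy and fake_generator are hypothetical; the stand-in generator replaces test_generator so the example runs without Keras.

```python
import math
import numpy as np

def collect_xy(generator, number_of_examples, batch_size):
    """Draw batches from `generator` and stack them into aligned X and Y arrays."""
    number_of_generator_calls = math.ceil(number_of_examples / batch_size)
    x_batches, y_batches = [], []
    for _ in range(number_of_generator_calls):
        # X and Y come from the same call, so each pair stays aligned.
        x_batch, y_batch = next(generator)
        x_batches.append(x_batch)
        y_batches.append(y_batch)
    # Trim in case the generator wrapped around and returned extra samples.
    x_final = np.concatenate(x_batches, axis=0)[:number_of_examples]
    y_final = np.concatenate(y_batches, axis=0)[:number_of_examples]
    return x_final, y_final

def fake_generator(images, labels, batch_size):
    """Hypothetical stand-in for test_generator: yields ordered batches forever."""
    while True:
        for start in range(0, len(images), batch_size):
            stop = start + batch_size
            yield images[start:stop], labels[start:stop]

# Ten tiny fake "images" of shape 2x2x3 with labels 0..9.
images = np.arange(10 * 2 * 2 * 3, dtype=np.float32).reshape(10, 2, 2, 3)
labels = np.arange(10)
x_final, y_final = collect_xy(fake_generator(images, labels, 4), 10, 4)
```

Predicting on x_final instead of on the generator itself keeps every prediction lined up with the matching entry of y_final.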

GPU activation

This installation guide is for Ubuntu 20.04. First open a terminal and run nvidia-smi to check whether the GPU is on. If you see this, then run: sudo nvidia-smi -pm 1. This changes the Off in the nvidia-smi output to On (the -pm flag controls persistence mode). Now install the required package and set the library paths for the GPU:
    sudo apt install nvidia-cuda-toolkit
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64/
    lspci -nn
with this line…

Loading large image data

In machine learning we often struggle with running out of memory. There is a good solution that lets you load the data without exhausting your RAM. The problem starts when you load all the data in one step and hold it in a single variable. That can easily fill your memory.…
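One common way to avoid holding everything in a single variable is to load the data lazily, one batch at a time. The sketch below illustrates that idea under stated assumptions and is not necessarily the approach the full post takes. The names iter_batches and fake_load are hypothetical, and fake_load stands in for a real image loader.

```python
def iter_batches(paths, load_one, batch_size):
    """Yield lists of loaded items, keeping only one batch in memory at a time."""
    for start in range(0, len(paths), batch_size):
        batch_paths = paths[start:start + batch_size]
        # Only this batch is loaded; earlier batches can be garbage collected.
        yield [load_one(path) for path in batch_paths]

def fake_load(path):
    """Hypothetical loader: a real one would read and decode the image file."""
    return {"path": path, "pixels": [0] * 4}

paths = [f"img_{i}.jpg" for i in range(7)]
batch_sizes = [len(batch) for batch in iter_batches(paths, fake_load, 3)]
```

Because iter_batches is a generator, a training loop can consume one batch, update the model, and move on without the full dataset ever being resident in RAM.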

Best pre-trained model

Is NASNet the best model? https://towardsdatascience.com/how-to-choose-the-best-keras-pre-trained-model-for-image-classification-b850ca4428d4

Liveness Dataset

The SiW database is now open to industrial institutes for research purposes.

Optimal Split for Categorical Features

It is common to represent categorical features with one-hot encoding, but this approach is suboptimal for tree learners. Particularly for high-cardinality categorical features, a tree built on one-hot features tends to be unbalanced and needs to grow very deep to achieve good accuracy. Instead of one-hot encoding, the optimal solution is to split on a…
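The excerpt is cut off before it describes the alternative, so the sketch below rests on an assumption: that the alternative is to partition the categories into two subsets, found by sorting categories by their mean target value and scanning the sorted order. It compares that against a one-hot style split, which can only separate a single category from the rest. All function names are hypothetical, and the toy data is invented for illustration.

```python
from collections import defaultdict

def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def split_cost(rows, left_categories):
    """Total SSE after sending `left_categories` left and the rest right."""
    left = [y for c, y in rows if c in left_categories]
    right = [y for c, y in rows if c not in left_categories]
    return sse(left) + sse(right)

def best_one_hot_split(rows):
    """A one-hot feature can only isolate one category per split."""
    categories = sorted({c for c, _ in rows})
    return min(({c} for c in categories), key=lambda s: split_cost(rows, s))

def best_sorted_subset_split(rows):
    """Sort categories by mean target, then try each prefix as the left subset."""
    groups = defaultdict(list)
    for c, y in rows:
        groups[c].append(y)
    ordered = sorted(groups, key=lambda c: sum(groups[c]) / len(groups[c]))
    candidates = [set(ordered[:i]) for i in range(1, len(ordered))]
    return min(candidates, key=lambda s: split_cost(rows, s))

# Toy regression data: categories a and b have low targets, c and d high ones.
rows = [("a", 1.0), ("a", 1.2), ("b", 0.9), ("b", 1.1),
        ("c", 5.0), ("c", 5.2), ("d", 4.9), ("d", 5.1)]
one_hot = best_one_hot_split(rows)
subset = best_sorted_subset_split(rows)
```

On this toy data a single subset split separates the low categories from the high ones, while a one-hot tree would need several one-category splits to reach the same partition.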

Stratification

If we go ahead and train our model on sample data that has the wrong proportions, the model is likely to be over-fitted to the training data. It is also likely to underperform when we run it against real-world or testing data that has the right proportions.…
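A stratified split keeps the class proportions the same in the training and test sets. The following is a minimal standard-library sketch of that idea; stratified_split is a hypothetical helper written for illustration and not taken from the post.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction, seed=0):
    """Return (train_indices, test_indices) preserving per-class proportions."""
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)
    train, test = [], []
    for indices in indices_by_class.values():
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

# Imbalanced toy labels: 80% "cat", 20% "dog".
labels = ["cat"] * 80 + ["dog"] * 20
train_idx, test_idx = stratified_split(labels, 0.25)
dogs_in_test = sum(labels[i] == "dog" for i in test_idx)
dogs_in_train = sum(labels[i] == "dog" for i in train_idx)
```

Both resulting sets keep the original 80/20 ratio, which a purely random split of a small dataset does not guarantee.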

What is Vaex?

Vaex is a Python library for lazy Out-of-Core DataFrames (similar to Pandas), to visualize and explore big tabular datasets. It can calculate statistics such as mean, sum, count, standard deviation etc, on an N-dimensional grid up to a billion (10^9) objects/rows per second. Visualization is done using histograms, density plots and 3d volume rendering, allowing interactive exploration of big data. Vaex uses memory mapping, a zero memory copy…
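To give a feel for what memory mapping buys you, here is a small sketch that uses numpy.memmap rather than Vaex itself (so none of this is the Vaex API). It computes the mean of a column stored on disk by reading it in chunks, so the whole file never has to be copied into RAM at once. The helper chunked_mean and the file name column.bin are hypothetical.

```python
import os
import tempfile
import numpy as np

def chunked_mean(path, dtype, n_rows, chunk_rows=100_000):
    """Mean of a 1-D binary column, read through a memory map chunk by chunk."""
    data = np.memmap(path, dtype=dtype, mode="r", shape=(n_rows,))
    total = 0.0
    for start in range(0, n_rows, chunk_rows):
        # Only the pages backing this slice need to be read from disk.
        total += float(data[start:start + chunk_rows].sum(dtype=np.float64))
    return total / n_rows

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "column.bin")
    # Write one million float32 values (0, 1, 2, ...) as a raw binary column.
    np.arange(1_000_000, dtype=np.float32).tofile(path)
    mean_value = chunked_mean(path, np.float32, 1_000_000)
```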
