Overview
Easy parallelization over multiple GPUs can be accomplished in TensorFlow 2 using the ‘MirroredStrategy’ approach, especially if one is using Keras through the TensorFlow integration. This serves as a replacement for ‘multi_gpu_model’ in Keras. There are a few caveats (bugs) with using this on TF 2.0 (see below).
An example illustrating its use is shown below where two of the GPU devices are selected.
import tensorflow as tf from tensorflow.
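A minimal sketch of the MirroredStrategy approach described above, assuming TensorFlow 2.x is installed and the machine exposes two GPU devices named ‘/gpu:0’ and ‘/gpu:1’ (the device names and the toy model are illustrative, not the post's exact code):

```python
def build_distributed_model():
    # Import inside the function so the sketch can be read on machines
    # without TensorFlow installed.
    import tensorflow as tf

    # Restrict the strategy to two of the available GPU devices.
    strategy = tf.distribute.MirroredStrategy(devices=["/gpu:0", "/gpu:1"])

    # Model creation and compilation must happen inside strategy.scope()
    # so that variables are mirrored across both replicas.
    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
            tf.keras.layers.Dense(1),
        ])
        model.compile(optimizer="adam", loss="mse")
    return model
```

Calling `model.fit(...)` on the returned model then splits each batch across the two GPUs automatically.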
[Read More]
Disease modeling using Julia for COVID-19
SIR, SIS, SIRS models and the impact of social distancing
This Jupyter notebook contains disease propagation modeling using the SIR, SIS and SIRS models in Julia. The impact of social distancing is assessed using these various models. Also shown is the outbreak size as a function of the mean degree of connectivity (physical interaction) of the social network.
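The notebook itself is in Julia; as a language-neutral illustration of the SIR dynamics it models, here is a forward-Euler sketch in plain Python (the β and γ values are illustrative, not taken from the notebook):

```python
def sir(beta, gamma, S0, I0, R0, days, dt=0.1):
    """Forward-Euler integration of the SIR model:
    dS/dt = -beta*S*I/N,  dI/dt = beta*S*I/N - gamma*I,  dR/dt = gamma*I
    """
    N = S0 + I0 + R0
    S, I, R = float(S0), float(I0), float(R0)
    for _ in range(int(days / dt)):
        new_infections = beta * S * I / N * dt
        new_recoveries = gamma * I * dt
        S -= new_infections
        I += new_infections - new_recoveries
        R += new_recoveries
    return S, I, R

# Baseline epidemic (basic reproduction number R0 = beta/gamma = 3)
S_hi, I_hi, R_hi = sir(beta=0.3, gamma=0.1, S0=999, I0=1, R0=0, days=160)

# "Social distancing": halving the contact rate beta shrinks the outbreak
S_lo, I_lo, R_lo = sir(beta=0.15, gamma=0.1, S0=999, I0=1, R0=0, days=160)
```

Comparing the final recovered counts `R_hi` and `R_lo` shows the outbreak-size reduction from lowering the contact rate, which is the effect the notebook explores.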
RVATECH/DataSummit 2020
Introduction to AutoML
The following slides are an overview of AutoML. This is an updated version of the slides presented at SuperComputing18. Additionally, this session covers an introduction to H2O for model selection and Comet.ml for hyperparameter optimization.
Introduction to AutoML
H2O
H2O is a tool that allows you to perform Automated Machine Learning. A Jupyter notebook with an introduction to H2O can be found in the GitHub repository. The Binder path to the repository is located here.
[Read More]
Hyperparameter Optimization with Comet.ml
Data science workflow management tool & collaboration hub
Reader level: Introductory
Introduction to Comet.ml
Comet.ml is an API-driven framework for workflow management in Machine Learning and Data Science experiments. Comet’s hyperparameter optimization is roughly based on the Advisor hyperparameter black-box optimization tool. It allows you to add API calls to your code to perform optimization on a selected set of hyperparameters using Comet’s cloud service. This requires that you install the Python package ‘comet_ml’.
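To make the idea concrete without depending on Comet's cloud service, here is a plain-Python stand-in for the kind of black-box loop Comet's optimizer drives: sample hyperparameters, score them with an objective, keep the best. The objective and parameter ranges below are toy examples, not Comet's API:

```python
import random

random.seed(42)

def objective(lr, batch_size):
    # Toy stand-in for a training run's validation loss; in practice this
    # would train a model and report the metric back to the optimizer.
    return (lr - 0.01) ** 2 + 0.001 * abs(batch_size - 64)

best = None
for _ in range(50):
    params = {
        "lr": random.uniform(1e-4, 1e-1),
        "batch_size": random.choice([16, 32, 64, 128]),
    }
    loss = objective(**params)
    if best is None or loss < best[0]:
        best = (loss, params)
```

With Comet, each iteration of such a loop is supplied by the service and logged as an experiment, so the whole search is tracked in one place.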
[Read More]
Tensorflow in Jupyter Notebook for Multi-GPU environments
Options/Best Practices
When running Jupyter notebooks on machines with multiple GPUs, one might want to run individual notebooks on separate GPUs to take advantage of the available resources. This is not the only type of parallelism available in TensorFlow, but not knowing how to do it can severely limit your ability to run multiple notebooks simultaneously, since TensorFlow selects physical device 0 by default. If you have two notebooks running and one happens to use up all the GPU memory on physical device 0, the second notebook will refuse to run, complaining that it is out of memory!
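One common way to pin a notebook to a single GPU is to set `CUDA_VISIBLE_DEVICES` before TensorFlow is imported (the device index ‘1’ below is just an example; each notebook would use a different index):

```python
import os

# Pin this notebook's process to physical GPU 1. This must run BEFORE
# `import tensorflow`, because TensorFlow reads CUDA_VISIBLE_DEVICES
# when it initializes the CUDA runtime.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# Optionally, also enable memory growth so TensorFlow does not grab all
# of the GPU's memory up front:
# import tensorflow as tf
# for gpu in tf.config.list_physical_devices("GPU"):
#     tf.config.experimental.set_memory_growth(gpu, True)
```

With this, the pinned GPU appears to TensorFlow as device 0 inside the notebook, and two notebooks pinned to different indices no longer contend for the same memory.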
[Read More]
Conditional Variational Autoencoders
With code in Keras
The following slides are an overview of Variational Autoencoders. A notebook that modifies this to implement a Conditional Variational Autoencoder can be found below.
A Jupyter notebook with the implementation can be found here.
Data Science with Neptune.ml
Data science workflow management tool & collaboration hub
Reader level: Introductory
Table of Contents
1. Introduction
2. Overview of Neptune UI
3. How I used Neptune in my Keras ML project
4. What I have not covered
Introduction
Neptune.ml is a workflow management and collaboration tool for Data Science and Machine Learning (DS/ML). I have had the pleasure of testing this platform out for my own work, and I must admit that I am convinced that every Data Science team needs something like this.
[Read More]
Self-attention for Text Analytics
Visualization
Reader level: Intermediate
This post covers the self-attention mechanism as presented in the paper ‘A Structured Self-attentive Sentence Embedding’, which is, IMHO, one of the best papers for illustrating the workings of self-attention for Natural Language Processing. The structure of self-attention is shown in the image below, courtesy of the paper:
Suppose one has a bidirectional LSTM of hidden dimension ‘u’ that takes as input batches of sentences of ‘n’ words each.
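The attention computation from the paper can be sketched in NumPy. With hidden states H of shape (n, 2u), the paper computes the annotation matrix A = softmax(W_s2 · tanh(W_s1 · Hᵀ)) and the sentence embedding M = A · H; the dimensions d_a and r below are illustrative choices:

```python
import numpy as np

np.random.seed(0)
n, u = 5, 8          # n words per sentence, LSTM hidden size u (bidirectional -> 2u)
d_a, r = 10, 3       # attention hidden dimension and number of attention hops

H = np.random.randn(n, 2 * u)        # LSTM hidden states, one row per word
W_s1 = np.random.randn(d_a, 2 * u)   # first attention weight matrix
W_s2 = np.random.randn(r, d_a)       # second attention weight matrix

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# A has shape (r, n): r separate attention weightings over the n words
A = softmax(W_s2 @ np.tanh(W_s1 @ H.T), axis=1)

# Sentence embedding M = A H, shape (r, 2u): r weighted sums of hidden states
M = A @ H
```

Each row of A sums to 1, so each of the r hops is a proper attention distribution over the words of the sentence.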
[Read More]
Using RQ for scheduling tasks
RQ and remote scheduling
Reader level: Introductory RQ can be used to set up queues for executing long-running tasks on local or remote machines. Some steps on how to install and get started with RQ are listed below.
Installation
Create a virtual environment; we will then need to install the following components:
- Redis-server
- RQ
- RQ-scheduler
Install Redis using the following:
wget http://download.redis.io/redis-stable.tar.gz
tar xvzf redis-stable.tar.gz
cd redis-stable
make
Run ‘make test’ to make sure things are working properly, followed by ‘sudo make install’ to complete the installation.
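Once Redis is running and RQ is installed (‘pip install rq’), enqueueing a task looks roughly like the sketch below. The task function and its argument are made-up examples; the enqueue step is wrapped in a function because it needs a live redis-server:

```python
def word_count(text):
    # Toy stand-in for a long-running task executed by an RQ worker.
    return len(text.split())

def enqueue_example():
    # Requires a running redis-server and the `rq` package installed.
    from redis import Redis
    from rq import Queue

    q = Queue(connection=Redis())
    job = q.enqueue(word_count, "hello rq world")
    return job
```

A worker started with ‘rq worker’ in the same directory then picks the job off the queue and executes it, and the result becomes available on the job object.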
[Read More]
Publishing Jupyter Notebooks using Gatsby and Netlify
A quick overview
Reader level: Introductory
Serve a Gatsby website locally using the following command. This will start a development server at port 8000, which you can navigate to with your browser. You can also access the GraphQL query page at localhost:8000/___graphql.
gatsby develop
Once you are done developing, you can build the website so it can be deployed to a host such as Netlify or GitLab Pages.
gatsby build
Once you have the above, you can go ahead and set up your Netlify account and link your current folder.
[Read More]