Researcher, educator and solver of computationally intensive mathematical problems. I currently work as a Senior Data Science Developer Advocate at Databricks, where I focus on Data Science and Machine Learning. Prior to this, I worked as a Computational Scientist at Virginia Tech (2014 - 2020), where I had the privilege of working with some great minds and state-of-the-art science.

## SuperComputing18 Presentations

### Slides

The slides below were used for presentations at the SuperComputing 2018 conference in Dallas.

### Overview of PyTorch

The posts associated with these slides can be found here and here.

### Quick introduction to AutoML

The post associated with these slides can be found here. Note that this is still a work in progress and will be updated periodically.

## AutoML

### An overview of Automated Machine Learning

Reader level: Intermediate

Disclaimer: This post is a work in progress and will be updated periodically. It is not meant to be a comprehensive overview of the topic, but rather an introduction to AutoML and some of its tools and techniques.

Overview: Finding a model that works for a specific problem or class of problems can be a time-consuming task. Usually, an engineer or scientist determines which model class to use either based on their prior knowledge of the problem at hand or by evaluating several models and picking the best one. [Read More]
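The "evaluate several models and pick the best one" workflow that AutoML automates can be sketched in a few lines. This is a minimal, hypothetical illustration (not from the post itself): the candidate "model classes" are polynomial fits of different degrees, scored on a held-out validation split.

```python
import numpy as np

# Toy regression data: a noisy sine wave (hypothetical example).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal(x.size)

# Hold out a validation split for model selection.
idx = rng.permutation(x.size)
train, val = idx[:150], idx[150:]

def fit_and_score(degree):
    """Fit a polynomial of the given degree on the training split
    and return its mean squared error on the validation split."""
    coeffs = np.polyfit(x[train], y[train], degree)
    pred = np.polyval(coeffs, x[val])
    return float(np.mean((pred - y[val]) ** 2))

# The "search space": each degree stands in for a candidate model class.
candidates = [1, 3, 5, 9, 15]
scores = {d: fit_and_score(d) for d in candidates}
best = min(scores, key=scores.get)
```

Real AutoML systems search far larger spaces (model families, hyperparameters, preprocessing pipelines) and use smarter strategies than exhaustive evaluation, but the selection loop is the same shape.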

## Gaussian Process Regression (Draft)

### Uncertainty quantification

Reader level: Advanced

Gaussian Distributions: A Gaussian distribution exists over variables, i.e. the distribution describes how (relatively) frequently the values of those variables show up in observations. A Gaussian distribution for an n-dimensional vector variable is fully specified by a mean vector μ and covariance matrix Σ:

$$\mathrm{x} = (x_{1}, \ldots, x_{n})^{T} \sim \mathcal{N}(\mu, \Sigma)$$

A univariate Gaussian distribution is given by

$$p(x|\mu,\sigma^2) = \dfrac{1}{\sqrt{2\pi \sigma^2}} \, e^{ -\dfrac{(x - \mu)^2 }{2 \sigma^2} }$$

where μ is the mean and σ is the standard deviation for the Gaussian. [Read More]

## Word2Vec in Pytorch - Continuous Bag of Words and Skipgrams

### Pytorch implementation

Reader level: Intermediate

Overview of Word Embeddings: Word embeddings, in short, are numerical representations of text. They are represented as 'n-dimensional' vectors, where the number of dimensions 'n' depends on the corpus size and the expressiveness desired. The larger your corpus, the larger you want 'n'; a larger 'n' also allows you to capture more features in the embedding. However, a larger dimension makes the optimization longer and more difficult, so you want an 'n' that is just large enough; determining this size is often problem-specific. [Read More]
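The core idea behind the CBOW variant covered in the post can be sketched without a deep learning framework: each word maps to a row of an embedding matrix, the context embeddings are averaged, and the average is projected back to vocabulary scores. This is a hypothetical forward-pass-only illustration with random (untrained) weights, not the post's PyTorch implementation:

```python
import numpy as np

# Toy vocabulary (hypothetical example).
vocab = ["the", "quick", "brown", "fox", "jumps"]
word_to_ix = {w: i for i, w in enumerate(vocab)}

n = 3            # embedding dimension 'n'
V = len(vocab)   # vocabulary size
rng = np.random.default_rng(0)

E = rng.standard_normal((V, n))  # input embeddings: one n-dim row per word
W = rng.standard_normal((n, V))  # output projection back to vocabulary scores

def cbow_forward(context_words, target_word):
    """Average the context word embeddings, project to vocabulary scores,
    and return the softmax probability assigned to the target word."""
    h = E[[word_to_ix[w] for w in context_words]].mean(axis=0)
    scores = h @ W
    probs = np.exp(scores - scores.max())  # numerically stable softmax
    probs /= probs.sum()
    return float(probs[word_to_ix[target_word]])

p = cbow_forward(["the", "brown"], "quick")  # probability in (0, 1)
```

Training would adjust E and W to raise the probability of each observed target word given its context; in PyTorch, E corresponds to an `nn.Embedding` layer and W to a linear output layer.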