A little about me…
I am currently employed as a Senior Data Science Developer Advocate at Databricks, where I work on Data Science and Machine Learning problems.
I was previously employed at Virginia Tech as a Computational Scientist from 2014 to 2020. My research interests lie in Numerical Methods, Computational Science and Machine Learning using High-Performance Computing with traditional and novel accelerator technologies. While at VT, I also had a chance to hone my skills in Data Visualization. My Doctoral work at the SimCenter: National Center for Computational Engineering was in CFD and Electromagnetics using the Finite-Element Method.
I have a Doctorate in Computational Engineering from the University of Tennessee and a Masters in Electrical Engineering from The Pennsylvania State University. A web version of my current resume can be found here. A PDF version can be generated by hitting ‘Print’ from the browser; the CSS is set up to format this appropriately.
My current research interests are:
- Natural Language Processing
- Weak supervision
- Automated hyperparameter optimization
- Dimensionality reduction
- Docker and other forms of virtualization
- Cloud infrastructure and serverless computing
Note: I am no longer updating my activities below on this page. Please visit this website for an up-to-date version of my research, teaching and publications.
Introduction to PyTorch for Natural Language Processing, CS 4984: Big Data Text summarization
This class introduced students to PyTorch for text analytics. It also covered neural network-based word embedding generation using CBOW and Skip-gram models, followed by their respective implementations in PyTorch. Additionally, students learned how to run this code on GPUs. It finished by showing them how to do a rudimentary topic extraction by clustering the generated word embeddings.
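As a rough illustration of the difference between the two embedding objectives (a minimal sketch, not the class material itself), the snippet below builds training examples from a tokenized sentence. Skip-gram predicts each context word from the center word, while CBOW predicts the center word from its whole context. The function names and the `window` parameter are hypothetical, chosen only for this example, and no PyTorch is needed for this step.

```python
def context_windows(tokens, window=2):
    """Yield (center, context_words) for every position in the token list."""
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        yield center, left + right


def skipgram_pairs(tokens, window=2):
    """Skip-gram examples: one (center, context_word) pair per context word."""
    return [
        (center, ctx)
        for center, context in context_windows(tokens, window)
        for ctx in context
    ]


def cbow_pairs(tokens, window=2):
    """CBOW examples: one (context_words, center) pair per position."""
    return [(context, center) for center, context in context_windows(tokens, window)]


tokens = "the quick brown fox".split()
sg = skipgram_pairs(tokens, window=1)
cbow = cbow_pairs(tokens, window=1)
```

In a real model, these word pairs would be mapped to integer indices and fed to an embedding layer; that part is omitted here.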
Introduction to Python, CS1064
This was an introductory class that taught the basics of programming with Python to non-Computer Science majors.
Introduction to OpenACC, CMDA 3634: Comp Sci Foundations for CMDA
Lectured on the OpenACC framework for GPU computing in Prof. Tim Warburton’s class ‘Foundations for Computational Modelling’. This framework simplifies GPU computing by using directive-based acceleration, instead of the explicit parallelization required by CUDA.
Hands-on Workshop for the Industrial and Systems Engineering (ISE) Department
1. Introduction to Python for Scientific Computing
This introduced the basics of Python as well as numerical libraries such as NumPy and SciPy tailored to the ISE department. This class also introduced students to the JupyterHub framework for collaborative computing and visualization.
2. Introduction to Data Visualization with Plotly
This class introduced the Data Visualization framework for exploratory visualization and presenting data relevant to the ISE department.
Introduction to Scientific Computing using Python
This introduced the basics of Python, numerical libraries such as NumPy and SciPy, debugging, interfacing with C, and plotting with Matplotlib.
Introduction to Data Visualization
Introduced the basics of Visualization techniques and taught users how to use the visualization package Plotly for two-dimensional and three-dimensional information and scientific visualization. At the end of this session, users should be able to understand and generate interactive bar charts, line charts, scatterplots, choropleths, proportional symbol maps, 3D topographic maps and network charts, to name a few.
Introduction to CUDA
CUDA is a parallel programming paradigm used for GPUs which can provide massively parallel execution of high-performance codes. Almost all major supercomputers are now equipped with GPUs and they have played a pivotal role in the last decade for expediting many scientific discoveries. This course introduces participants to the GPU architecture and the programming paradigm CUDA.
Introduction to Scientific Visualization using ParaView
This course introduces participants to the high-performance visualization tool ParaView, which is capable of using multi-core architectures (including GPUs) on supercomputers to visualize very large datasets. ParaView has been shown to scale to interactive visualization of scientific datasets of more than a billion points. Scientists rarely have access to personal machines that have the memory or compute capabilities to deal with such massive data. Participants will learn how to load and remotely render data that lives on the VT supercomputers, thereby leveraging cluster computing to visualize data.
TensorFlow for Machine Learning
TensorFlow is an open source software library for numerical computation using data flow graphs. This is used for distributed and scalable computation of large problems using CPUs and GPUs. In this class, the users will learn the basics of machine learning and how to create computational graphs with the TensorFlow API. Participants will have access to a live version of TensorFlow installed on a remote server and will be able to create a linear regression and neural network model on the MNIST and the Iris datasets. At the end of this session, participants should be able to understand the parameters and algorithms for solving large-scale machine learning problems.
Dask for Out-of-Core Computing: Big Data solutions on your laptop
Dask is a flexible parallel computing library for analytic computing. It allows users to compute on data that won’t fit into a machine’s memory, hence the name ‘Out-of-core’ computing. While a lot of users have massive data that requires BigData tools, many more have data that is large but does not quite call for such complex tools. Such users can utilize the Dask framework, which integrates right into the Python ecosystem and uses NumPy and Pandas for computation.
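The out-of-core idea itself can be sketched without Dask: process the data in fixed-size chunks and keep only small running aggregates in memory. The example below is a plain-Python illustration under that assumption, not Dask’s API; `chunked_mean`, `chunk_size` and the temporary demo file are hypothetical names introduced only for this sketch.

```python
import os
import tempfile


def chunked_mean(path, chunk_size=1000):
    """Mean of a file with one number per line, holding at most
    chunk_size values in memory at any time."""
    total, count = 0.0, 0
    chunk = []
    with open(path) as f:
        for line in f:
            chunk.append(float(line))
            if len(chunk) == chunk_size:
                # Fold the full chunk into the running aggregates, then drop it.
                total += sum(chunk)
                count += len(chunk)
                chunk = []
    # Fold in the last, possibly partial, chunk.
    total += sum(chunk)
    count += len(chunk)
    if count == 0:
        raise ValueError("no values found in file")
    return total / count


# Demo: write the integers 1..10000 to a temporary file and average them
# 256 values at a time.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("\n".join(str(i) for i in range(1, 10001)))
    demo_path = f.name

demo_mean = chunked_mean(demo_path, chunk_size=256)
os.remove(demo_path)
```

Libraries such as Dask apply the same pattern to NumPy arrays and Pandas DataFrames and additionally schedule the per-chunk work in parallel.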
This tutorial covers the Python programming language including all the information needed to participate in the XSEDE15 Modeling Day event on Tuesday, July 27th, 2015. Topics covered are variables, input/output, control structures, math libraries, and plotting libraries.
This tutorial is a beginner-level course on tackling data analytics using the Python pandas module. Python is a high-level object-oriented language that has found wide acceptance in the scientific computing community. Ease of use and an abundance of software packages are a few of the reasons for this extensive adoption. Pandas is a high-level open-source library that provides data analysis tools for Python. We will also introduce necessary modules such as numpy for fast numeric computation and matplotlib/bokeh for plotting to supplement the data analysis process. Experience with a programming language such as Python, C, Java or R is recommended but not necessary.
This tutorial is an intermediate-level course on tackling the problems facing data scientists using Python. Python is a high-level object-oriented language that has found wide acceptance in the scientific computing and data science community. Ease of use and an abundance of software packages are a few of the reasons for this extensive adoption. Pandas is a high-level open-source library that provides data analysis tools for Python. It provides an efficient and comprehensive platform for a large number of analytics problems. For generating sophisticated visualizations, two packages are introduced: Seaborn and Plotly. While Seaborn is aimed at statisticians, Plotly provides a rich, interactive visualization framework that is ideal for visualizing large data. Plotly also allows visualization-rich dashboards that can be shared online. To conclude, out-of-core computing with Dask/Blaze is introduced for those datasets that won’t quite fit into memory. The goal of dask is to “extend the size of convenient datasets from ‘fits in memory’ to ‘fits on disk’”, effectively fitting between Pandas and PySpark in the Python ecosystem for analytics.
Introduction to Machine Learning with Scikit-learn and TensorFlow: Unsupervised Learning
This class introduces participants to the basics of Machine Learning, specifically unsupervised learning, using the open-source TensorFlow framework and the popular machine-learning framework Scikit-learn. We will look at clustering and dimensionality reduction and implement both using the Scikit-learn and TensorFlow frameworks.
Introduction to Machine Learning with Scikit-learn and TensorFlow: Supervised Learning
This class introduces participants to the basics of Machine Learning, specifically supervised learning, using the open-source TensorFlow framework and the popular machine-learning framework Scikit-learn. We will cover Decision Surfaces, Support Vector Machines, Linear Regression and Logistic Regression. Participants will be introduced to implementations of the above using both Scikit-learn and TensorFlow. Users will also learn about the optimization algorithms popularly used in Machine Learning problems.
Introduction to Deep Learning with TensorFlow and Keras
This is the third class in a three-part series on Machine Learning, and it introduces the concepts of Deep Learning using the open-source tools TensorFlow and Keras. Users are required to have taken Classes 1 and 2 on Machine Learning as a prerequisite to this class. Users will learn how to model and train a Convolutional Neural Network for an image classification problem and a Recurrent Neural Network for a language modeling problem in TensorFlow. We will conclude by introducing Keras, a high-level framework for deep-learning problems, and showing how the solution of the above problems can be simplified using the Keras API.
Srijith Rajamohan, Alana Romanella, Amit Ramesh. “Weakly-Supervised Attention-based Visualization Tool for Assessing Political Affiliation”. Aug 2019, https://arxiv.org/abs/1908.02282
Valerio Mascolino, Alireza Haghighat, Nicholas Polys, Nathan J. Roskoff, and Srijith Rajamohan. 2019. “A Collaborative Virtual Reality System (VRS) with X3D Visualization for RAPID”, The 24th International Conference on 3D Web Technology (Web3D ’19), ACM, New York, NY, USA, 1-8.
Srijith Rajamohan and Faiz Abidi, “Web-based Visualization and Querying of Food and Beverage Endorsements by Celebrities”, PEARC19, ACM, Chicago
Rajamohan, S., Romanella, A., Ramesh, A., “A Human-in-the-Loop Deep Learning Based Document Tagging for Stance Detection”, CHCI 2019: Algorithms that make you think, Blacksburg.
Rajamohan, S. and Anderson, W.K. “A Modified Streamline Upwind/Petrov-Galerkin Stabilization Matrix for Time-Domain FEM”, ACES 2018 Denver
Rajamohan, S. and Anderson, W.K. “Using an Approximate Streamline Upwind/Petrov-Galerkin Stabilization Matrix for the Solution of Maxwell’s Equations in Dispersive Materials”, ACES 2018 Denver
Faiz Abidi, Nicholas Polys, Srijith Rajamohan, Lance Arsenault and Ayat Mohammed. “Remote High Performance Visualization of Big Data for Immersive Science.”, 26th High Performance Computing Symposium 18, Baltimore
Zhou M, Kraak VI, Rajamohan S, Abidi F, Polys N. “Mapping the Celebrity Marketing of Branded Food and Beverage Products in the United States: Policy Implications and Research Needs”, 15th World Congress on Public Health, April 2017
Polys, N., Mohammed, A., Iyer, J., Radics, P., Abidi, F., Arsenault, L., & Rajamohan, S. (2016, March). “Immersive analytics: Crossing the gulfs with high-performance visualization.” In Immersive Analytics (IA), 2016 Workshop on (pp. 13-18). IEEE.
Rajamohan, S. and Anderson, W.K. “HPC for Legacy EM Code, a Mixed Language Approach using CUDA.” Applied Computational Electromagnetic Society 2012, Volume: GPU for CEM (paper)
- XSEDE Campus Champion
- ACI-REF Campus Representative
- OpenACC Campus Representative
- XSEDE Conference Session Chair and Paper Reviewer
- NSF Proposal Reviewer