I am a PhD candidate in Biostatistics at the Johns Hopkins Bloomberg School of Public Health and a pre-doctoral fellow in the Genetic Epidemiology Research Branch of the National Institute of Mental Health. My research involves developing methods to model intensive longitudinal data of mixed types and highly multivariate spatial data. My work is driven by applications in wearables, smartphone diaries, mobile health and environmental sciences.
With my advisor, Dr. Vadim Zipunnikov, I am developing methods that use semi-parametric Gaussian copulas to model mixed-type data from large health surveys and smartphone diaries (ecological momentary assessment). I am part of the data science core of the mMarch consortium, where we investigate scientific questions at the intersection of physical activity, mental health and various biochemical processes. This research is done in collaboration with Dr. Kathleen Merikangas.
My research interests also extend to functional data analysis and its applications to wearables. My paper, "Re-evaluating the effect of age on physical activity over the lifespan," reported findings with major public health implications and received wide coverage. The work was highlighted in press releases by the Johns Hopkins Bloomberg School of Public Health and the Johns Hopkins School of Medicine, and featured in major media outlets including TIME ("Teens Are Just As Sedentary As 60 Year Olds"), the Washington Post, the Wall Street Journal, BBC Radio and WYPR, among others. In this work, we described circadian patterns of physical activity in nationally representative data and identified the times of day when activity was highest and lowest. These patterns could inform programs aimed at increasing physical activity by targeting, for each age group and sex, the times of day with subpar activity, such as the morning for children and adolescents.
I am also working with my co-advisor, Dr. Abhirup Datta, to develop methods for modeling highly multivariate spatial data (a large number of variables observed at a large number of locations). This has direct applications in modeling the distribution of multiple pollutants across many locations.
I also like to combine my passion for soccer and statistics, and I have been actively working on soccer analytics for the past five years. In 2018, I was part of a finalist team at the inaugural U.S. Soccer Hackathon, where we developed key metrics to judge defensive contributions in a game and provided tools to generate in-game substitution suggestions. Later, I worked with StatsBomb event data from the 2018 FIFA World Cup. Our contributions were to (1) identify the riskiest turnover locations for different formations, (2) find associations between patterns of play and effective shots, and (3) build dynamic in-game team-level summary measures that are predictive of performance. You can find the details of the project here: The good, the bad and the ugly of the beautiful game. Recently, we were chosen as one of six groups to present at the Opta Pro Forum 2021. We adopted a semi-supervised learning approach to build predictive models that capture hidden patterns in the available data, with the objective of drawing inferences about tracking data from an event-only dataset.
Outside of work, I like to play soccer, indulge in culinary adventures and travel around the world.
Download my resumé.
PhD in Biostatistics, 2022 (Expected)
Johns Hopkins Bloomberg School of Public Health
Master of Statistics, 2017
Indian Statistical Institute
Bachelor of Statistics, 2015
Indian Statistical Institute