PACT Pharma is a clinical-stage immuno-oncology company dedicated to engineering personalized TCR T-cell therapies. As a bioinformatics data scientist, I work on multiple projects, including selecting neoantigens to target, building predictive models, analyzing clinical data, and developing cloud-scalable pipelines.
Antigen Map is a collaborative project between Adaptive Biotech and Microsoft Research whose goal is to identify disease-associated T-cell receptor sequences and repertoire features to help build immune-based diagnostics. As an intern on the Antigen Map team, I developed a novel computational method, Ordered Clustering of Clones using weak Classifiers (OC3), for identifying TCR-antigen pairs that bind from data generated by Adaptive's MIRA assay. OC3 comprises three steps: feature generation, maximum-likelihood ordered clustering using those features, and cross-validation to select the optimal features. I built a Python package (oc3-py) that includes a scalable implementation of the method and easy-to-use CLI commands. Using oc3-py, we identified tens of thousands of antigen-specific TCRs by analyzing data from multiple MIRA experiments.
(Joint work with Pavel Pevzner, Yana Safonova, Massimo Franceschetti, and Ramesh Rao)
Antibodies provide specific binding to an enormous range of antigens and represent a key component of the adaptive immune system. With immunosequencing, we can sample antibody repertoires and generate millions of reads that provide insights into immune responses to disease and vaccination. Most immunogenomics studies rely on reference germline genes rather than the germline genes of the specific patient. This is a limitation because the set of known germline V, D, and J genes is incomplete (particularly for non-European humans and non-human species) and contains alleles that resulted from sequencing and annotation errors. We developed a tool for de novo inference of a patient's germline D genes from their immunosequencing data.
We first formulated this problem as a mathematical string reconstruction problem and proposed a dynamic programming algorithm that finds the optimal solution in linear time. Based on this, we developed a heuristic algorithm, Method for Inference of Immunoglobulin Genes -D (MINING-D), that additionally accounts for the complexities of real data that the string reconstruction formulation ignores. Using MINING-D, we inferred 25 novel D genes across 5 species from ~600 publicly available immunosequencing datasets and validated the inferred novel D genes by analyzing ~100 diverse WGS datasets. We also revealed D genes that are potentially associated with antigen-specific response and showed that heterozygous D genes can be used for V gene haplotyping.
(Joint work with Jain Lab (UCSD School of Medicine) and Shriram Nallamshetty)
In joint work with the Jain Lab at UCSD School of Medicine, we examined the role of FAHFAs (a class of bioactive lipids that exert anti-diabetic effects in preclinical models) in human diseases, and investigated their regulation under various physiological conditions, including fasting, feeding, and specific dietary interventions. We analyzed metabolomics and phenotype data from multiple large human cohorts totaling ~20,000 subjects. The results will be posted as a preprint soon.
(Joint work with Dewleen Baker, Abigail Angkaw, Massimo Franceschetti, and Ramesh Rao)
In joint work with the VA Healthcare System and UCSD School of Medicine, we studied the influence of war on the mental health of veterans. Post-traumatic stress disorder (PTSD) has been linked with negative outcomes such as hostility, depression, physical aggression, and suicidal tendencies. These outcomes occur more often among veterans with PTSD than among civilians or veterans without PTSD. We analyzed clinical mental health data from a sample of veterans returning from Iraq or Afghanistan to better understand the interconnections among these outcomes.
Bhardwaj V., Franceschetti M, Rao R, Pevzner PA, Safonova Y (2020) Automated analysis of immunosequencing datasets reveals novel immunoglobulin D genes across diverse species. PLoS Comput Biol 16(4): e1007837. See publication.
(Novel algorithm development, exploratory data analysis, bioinformatics, software, data visualization)
Bhardwaj V., et al. (2021). 820 Machine learning significantly improves neoantigen-HLA predictions utilizing >26,000 data points from the PACTImmune™ Database. Journal for ImmunoTherapy of Cancer, 9(Suppl 2), A858-A858. See poster abstract.
(Machine learning, exploratory data analysis, bioinformatics)
Bhardwaj V., Pevzner, P. A., Rashtchian, C., & Safonova, Y. (2020). Trace reconstruction problems in computational biology. IEEE Transactions on Information Theory, 67(6), 3295-3314. See publication.
(Algorithms, theoretical computer science, immunogenomics)
Bhardwaj V., Angkaw AC, Franceschetti M, Rao R, Baker DG (2019). Direct and indirect relationships among posttraumatic stress disorder, depression, hostility, anger, and verbal and physical aggression in returning veterans. Aggressive Behavior. 2019;1–10. See publication.
(Statistical analysis)