Arkash Jain
Software Engineer
I graduated with a combined BA [Mathematics and Computer Science] and MS [Computer Science] from Boston University, bringing with me experience from a Venture Capital Firm [Battery Ventures 2x sourcing intern, diligence intern - https://www.battery.com], BCH [through BU Spark] and a tech startup [Zerosync - https://www.zerosync.co]. My master's theses focused on Streaming & Event-Driven Systems [more specifically, adaptive checkpointing in Flink] and Numerically Stable Momentum-Based SGD Algorithms. My core passion lies in System Design, on which I write a monthly blog [https://medium.com/@arkjain] highlighting various online algorithms for Stream Processing Engines, Sorting Algorithms, KV and Columnar Stores, Adaptive Techniques for Task Placement on the Cloud, ML, etc.
Warren Alpert Building, Room 138
617.713.8684
Arkash.Jain [at] childrens.harvard.edu
Current Projects

Here at the Kirchhausen Lab, I've worked on the Incasem [https://github.com/kirchhausenlab/incasem] and CryoSamba [https://x.com/KirchhausenLab/status/1813651584427184490] projects so far, automating entire workflows and introducing standard architectural decisions such as local config and runtime managers, custom loggers, end-to-end tests, benchmarks, scripts, and lightweight APIs for easy-to-use GUIs.

I built one of the world's first completely unsupervised, annotation-free, 3D-native segmentation models by adapting DINOv2, an open-source 2D model from Meta for natural images, to highly convoluted biological data. Biologists collect data of viruses interacting with cells using novel microscopes; however, the data is so noisy that one cannot discern the foreground from the background. The intensity of the particles is close to that of the background and fluctuates, making it hard to apply simple thresholding techniques across time or even within a single volume. Understanding how viruses interact with cells helps biologists come up with techniques to prevent diseases like Alzheimer's and Parkinson's. Until now, no method existed to segment such data using supervised or other existing ML techniques, especially across domains for multiple types of viruses and cells. Our model, once pretrained, allows biologists to segment any viral data with extremely high accuracy, uncovering interesting biology. With tracking of these objects added on top, we have built a pathway toward solving every virus-borne disease on the planet.
🔸 Co-authored 3 Deep Learning papers, being the first author on SpatialDINO, one of the world's first 3D self-supervised, annotation-free Deep Learning models. Relevance: it allows a deeper understanding of how viruses interact with cells and makes it possible to track them, allowing biologists to study pathways of entry/exit.
🔹 Key Contributions: feature-pyramid-based encoder, 3D-native transformations, patch-token-augmented segmentation, optimization of the networking stack to use streaming datasets, multi-node training/inference, hardware acceleration using diverse datatypes (bfloat16), model quantization.
🔹 Papers: DINOv2, DesD Loss, ADOPT optimizer, FeatUp, LiFT, SINDER, NoPE
🔹 https://kirchhausen.hms.harvard.edu/publications

Alumni Graduation Year
2025
Alumni Present Position

Machine Learning Engineer at Benmore Technologies / Dakota Systems