Projects
I get curious about a lot of random stuff and try to make it a reality. Here is a (very non-exhaustive) list of curiosities I have followed:
-
Experimented with Group Relative Policy Optimization (GRPO) for various LLM fine-tuning tasks. Created a reward function to measure the “rhymingness” of two lines of verse, in order to fine-tune a poetry LLM with RL. Also created a custom reward function to incentivize the use of more frequent words, in an attempt to fine-tune an LLM to speak using only the most common English words.
-
Script Generation
Built a custom corpus of video scripts, and fine-tuned an LLM using SFT to generate original video scripts.
-
Experimented with different deep learning methods for morpheme segmentation, training multiple model architectures. Built a Python package utilizing the Tü Seg architecture, as well as an integration into the spaCy NLP library.
-
Trained an RNN-based model to predict phonemes from English text, and built a web application around it to make a text-to-speech site.
-
Led a team of first-year students to build this as their introduction to machine learning with MassAI. Reviewed research literature and tested document clustering methodologies to provide advice and direction to the team. Created an email categorizer that can cluster emails and create a human-readable label for each of the clusters it defines. The team earned 4th place in the MassAI project showcase.
-
Ship Name Generator
Made an intelligent combiner of two names that takes all possible combinations of any number of syllables from the two names, then ranks their well-formedness with a bigram phonetic model.
-
Buffalo Buffalo Buffalo Buffalo Buffalo
Computationally searched for as many words as possible that can form a complete sentence with just themselves, ideally repeated more than twice.
-
Phoneme Spanning Set
Wrote an efficient search algorithm to find a set of English words that spans the set of all phonemes commonly occurring in American English.
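A minimal sketch of the idea behind the Phoneme Spanning Set, for the curious. It assumes a greedy set-cover heuristic, which may not be the search the project actually uses, and a hypothetical toy pronunciation dictionary (`TOY`) with ARPAbet-style symbols rather than real project data. The function name `greedy_spanning_set` is also made up for illustration.

```python
def greedy_spanning_set(pronunciations):
    """Pick words until every phoneme seen in the dictionary is covered.

    `pronunciations` maps each word to its list of phoneme symbols.
    This is a greedy set-cover heuristic: it is fast, but it is not
    guaranteed to return the smallest possible set of words.
    """
    # Target set: every phoneme that appears in at least one pronunciation.
    uncovered = set()
    for phonemes in pronunciations.values():
        uncovered.update(phonemes)

    chosen = []
    while uncovered:
        # Take the word covering the most still-uncovered phonemes.
        # Iterating in sorted order makes tie-breaking alphabetical.
        best = max(
            sorted(pronunciations),
            key=lambda word: len(uncovered & set(pronunciations[word])),
        )
        chosen.append(best)
        uncovered -= set(pronunciations[best])
    return chosen


# Hypothetical toy dictionary (assumption, not data from the project).
TOY = {
    "cat": ["K", "AE", "T"],
    "dog": ["D", "AO", "G"],
    "tack": ["T", "AE", "K"],
    "god": ["G", "AA", "D"],
    "see": ["S", "IY"],
}

if __name__ == "__main__":
    print(greedy_spanning_set(TOY))
```

With a real pronunciation dictionary in place of `TOY`, the same loop would keep adding words until every phoneme in the dictionary appears in at least one chosen word. An exact search for the smallest such set is a harder problem, which is why a heuristic like this is a common starting point.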