Aspiring data scientist and machine learning engineer.
Current Data Science and Applied Mathematics student at UC Berkeley.
I am especially interested in and
currently pursuing
You can access my full resume here.
I am a third-year student at the University of California, Berkeley, originally from a small town in the Sacramento Valley called Wilton. I am pursuing a double major in Data Science and Applied Mathematics, with a concentration in Numerical Analysis. Currently, I work as a Data Intern at UC Berkeley's Fung Institute for Engineering Leadership, analyzing current student and alumni data to assess program impact. Outside my formal studies and work, I pursue data-focused extracurricular projects and mentor fellow students, driven by my passion for learning and sharing knowledge. Looking ahead, I am primarily interested in contributing to the data science, machine learning, and artificial intelligence domains, applying my mathematics background to enhance my work in these areas. Feel free to reach out if you'd like to discuss my experiences or the worlds of data science and math in general!
Data Science (BA) - Applied Math and Modeling concentration
Applied Mathematics (BA) - Numerical Analysis concentration
At the Fung Institute, I work to uncover insights from student and alumni data, driving improvements and highlighting the impact of initiatives such as the Fung Fellowship and Master of Engineering programs. Using tools like Salesforce and Google Workspace, I manage and refine databases to ensure accurate and efficient data tracking. My role also involves creating compelling data visualizations that effectively communicate findings, enabling collaboration with teaching teams and stakeholders to advance the institute's mission and share the value of its programs.
As a Data Analyst, I leveraged SQL, Python, and Tableau to query, clean, and visualize student demographic and academic data for strategic enrollment decisions. Collaborating with department heads and fellow analysts, I regularly translated complex data into actionable insights; for instance, in 2024, my analysis contributed to enrolling over 800 additional in-state students (a 1% increase from previous years). I also developed interactive, data-driven visualizations for EM's website, writing accessible HTML code to enhance user experience and highlight key trends for stakeholders.
At SUMaC, I evaluated admissions exams for Stanford’s most competitive high school math programs, assessing students’ mathematical creativity, logical reasoning, and proof-based problem-solving skills beyond rigid rubrics. My role involved providing comprehensive written evaluations that highlighted each applicant’s strengths, weaknesses, and problem-solving approach, directly shaping admission decisions for top candidates.
Under the supervision of Dr. Stefano Bertozzi, I am conducting an analysis of UC Berkeley's financial landscape for the Academic Senate's Committee on Academic Planning and Funding Allocation (CAPRA). I query and analyze campus-wide financial and enrollment data from UC Berkeley's central database, CalAnswers, utilizing Excel to visualize and interpret revenue streams and expenditure trends across undergraduate and graduate programs. My findings inform CAPRA's strategic planning and resource allocation decisions.
Through a collaboration with Daanmatch, I led a project that identified geographic areas with high NGO concentrations, guiding the strategic allocation of funding and support. I performed exploratory data analysis in Python (Pandas, Seaborn, Matplotlib), constructed data pipelines for cleaning and analysis, and standardized and mapped the addresses of over 10,000 NGOs across India, primarily using RegEx for address formatting. I presented the team's findings in a poster at the Data Science Discovery Program Symposium. View the project report here.
Built a logistic regression model to classify more than 8,000 emails as spam or not spam, achieving 85% test set accuracy through feature engineering and model tuning. Leveraged RegEx to parse email text, engineering features like custom word presence indicators and punctuation frequency. Evaluated performance with precision, recall, and ROC curves, iteratively refining the feature set and optimizing hyperparameters to minimize false positives and enhance overall reliability.
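As an illustration only, the feature-engineering step described above (word presence indicators and punctuation frequency parsed with RegEx) might look like the following sketch. The word list and the function name `extract_features` are hypothetical and are not taken from the project itself.

```python
import re

# Hypothetical indicator words; the project's actual word list is not given.
INDICATOR_WORDS = ["free", "offer", "click"]


def extract_features(email_text, words=INDICATOR_WORDS):
    """Return word-presence indicators followed by a punctuation frequency."""
    text = email_text.lower()
    # 1 if the word appears as a whole word in the email, else 0.
    presence = [
        1 if re.search(r"\b" + re.escape(word) + r"\b", text) else 0
        for word in words
    ]
    # Fraction of characters that are punctuation (neither word chars nor whitespace).
    punct_count = len(re.findall(r"[^\w\s]", text))
    punct_freq = punct_count / len(text) if text else 0.0
    return presence + [punct_freq]
```

Each email becomes one numeric row of this form, and rows like these could then be passed to a logistic regression classifier.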
Developed a predictive model for housing prices using linear regression and a custom data pipeline on 500,000+ records. Performed extensive feature engineering—including outlier removal, log transformations, and one-hot encoding—while analyzing potential biases to ensure fair and accurate results. Validated the model’s performance on real-world data, demonstrating the effectiveness of exploratory analysis and iterative refinement in delivering actionable insights.
Built a 2D Java game (2,500+ lines of code) featuring randomly generated worlds driven by user-provided seed numbers. Implemented multiplayer modes and interactive mechanics, incorporating mouse and keyboard inputs for character control. Integrated PNG/GIF for graphics, OTF for fonts, and WAV for music, and stored player progress locally via a text file to maintain persistent gameplay.
Downloads for Mac and select Windows versions can be found here.
Recreated core Git version control functionality (add, commit, branch, and merge) entirely from scratch in Java. Designed and implemented internal data structures to track file states, handle branching logic, and manage snapshot history, while ensuring error handling and workflow consistency. Focused extensively on system architecture, object-oriented design, and testing to deliver a fully functioning mini Git system from the ground up.
Implemented a multi-phase Python project emphasizing object-oriented programming, inheritance, and composition through replicating the video game "Plants vs. Zombies". Developed specialized classes of 'Ants' and 'Bees' that override and extend base classes, manage dynamic game states, and interact seamlessly with each other. Strengthened debugging and testing skills by writing local tests and iteratively refining the game mechanics.
Collaborated on a project to approximate π and e using both numerical and geometric methods in Python. Leveraged these approximations in a damped pendulum physics simulation to demonstrate real-world applications of fundamental constants, highlighting the practical intersection between theoretical math and applied science. View the project code, report, and presentation summary here.
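The project's actual methods are in the linked code and report. As a rough sketch of what one numerical and one geometric approach could look like, the example below approximates e from the partial sums of its factorial series and π from the perimeter of a regular polygon inscribed in a unit circle. Both function names and the choice of these two methods are assumptions made for illustration.

```python
import math


def approx_e(n_terms=15):
    """Approximate e with the partial sum of 1/k! for k = 0 .. n_terms - 1."""
    total, term = 0.0, 1.0  # term starts at 1/0!
    for k in range(n_terms):
        total += term
        term /= (k + 1)  # turn 1/k! into 1/(k+1)!
    return total


def approx_pi_polygon(doublings=10):
    """Approximate pi from a regular polygon inscribed in a unit circle.

    Starts from a hexagon (side length 1) and repeatedly doubles the number
    of sides; half the perimeter approaches pi from below.
    """
    sides = 6
    side_len = 1.0
    for _ in range(doublings):
        # Side length after doubling, written in a form that avoids
        # subtracting two nearly equal numbers.
        side_len = side_len / math.sqrt(2.0 + math.sqrt(4.0 - side_len ** 2))
        sides *= 2
    return sides * side_len / 2.0
```

Increasing `n_terms` or `doublings` tightens each approximation, which makes it easy to compare how quickly the two methods converge.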
As a board member, I advise the College of CDSS on student needs and emerging concerns by participating in regular pulse surveys and providing feedback on core initiatives. I contribute to discussions around advising processes, diversity and inclusion, and student organization support, ensuring that student perspectives shape college-wide decisions.
In this role, I mentor LGBTQ undergraduates majoring in math, holding biweekly group sessions and one-on-one meetings to offer tailored academic advice and share my personal experiences. My goal is to foster a supportive community within the MPS department, helping students plan their academic/professional careers and stay connected to valuable resources.
Coding Languages: Python, Java, SQL, R, HTML, CSS, JavaScript, RegEx, LaTeX
Libraries: NumPy, Pandas, Scikit-learn, Matplotlib/Seaborn
Data Analysis Tools: Tableau, Excel, Google Sheets, Power BI, ATLAS.ti
Specializations: Data Wrangling, ETL, A/B Testing, Predictive Modeling
Languages: Spanish (California State Seal of Biliteracy)
Data Science (DS): Foundations of Data Science, Computational Structures in Data Science, Principles & Techniques of Data Science, Data Structures and Programming Methodology, Numerical Analysis for Data Science, Probability for Data Science, Introduction to Programming in R
Mathematics: Multivariable Calculus, Discrete Math, Linear Algebra, Abstract Linear Algebra, Abstract Algebra, Analysis, Complex Analysis