Winning the TU Data Analytics Competition 2025

Notebook repository

About

The TU Data Analytics Competition is hosted by the College of Business and Economics. It runs every year during the spring semester: students are given a dataset and asked to uncover patterns that answer real-world questions. We had a full month to submit our findings and slides. The top 10 presentations are selected as finalists, and the finalists are then asked to present to a panel of judges.

The dataset we were given listed universities in the U.S. along with features such as the number of students enrolled and how much each school spends on certain categories.

The challenge was for us to rank the list of universities in the dataset and suggest where Towson University should invest to increase its standing in our list.

I was recruited by a member of the Data Science club at my university. I was initially hesitant to join since it didn’t sound interesting at the time, but there was a cash prize involved, so I decided, why not.

Initial Mistakes

This competition was definitely eye-opening with regard to how little I knew about the field of Data Science.

Writing Python wasn’t the hard part. The hard part was learning the different techniques, such as dimensionality reduction and scaling numerical features to a common standard.

Not having that statistical background was definitely a setback.

At the start of the competition, I looked at the data and became overwhelmed. I’d sort the columns and do some comparisons, but it didn’t lead to anything. I was hitting a wall with nothing to show for it.

I turned to LLMs to get me on the right track, but I was bombarded with machine learning algorithms that I’d never heard of before.

Internally, I was ready to call it quits.

Breakthrough

As the deadline to submit our findings was approaching, I told a friend, who was a PhD student, about the issues I was having. He suggested that I read up on principal component analysis (PCA). It’s a dimensionality reduction technique, and it became the gateway to our breakthrough.

The biggest breakthrough, and what got us the win, was when my partner prompted an LLM to generate categories and weights based on Towson University’s Strategic Plan.

It gave us the following weights:

weights = {
    "Student Success": 0.37,
    "Access": 0.09,
    "Equity": 0.09,
    "Academic Resources": 0.15,
    "Innovation & Research": 0.12,
    "Sustainability & Efficiency": 0.06,
    "Community Engagement": 0.17,
}

We then had to choose the features that belonged in each category. We didn’t think about prompting an LLM again to bin the features into those categories, but that honestly could’ve saved us a lot of time.

Once we had binned the features into their respective categories, we used PCA to reduce each category’s features to a single component. We then multiplied each category’s value by its generated weight and summed the results into one score per university, which determined the rankings.
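For anyone who wants to see the idea in code, here is a minimal, dependency-free sketch. It is not the code from our notebook: the school names, feature values, and helper functions (`standardize`, `first_component_scores`, `rank_universities`) are made up for illustration, only two of the seven categories are shown, and the first principal component is found with plain power iteration. In practice a library such as scikit-learn would likely handle the scaling and PCA steps. The sketch also assumes that higher feature values are better, since the sign of a principal component is otherwise arbitrary.

```python
import math

def standardize(column):
    """Scale one feature column to zero mean and unit variance (z-scores)."""
    mean = sum(column) / len(column)
    var = sum((x - mean) ** 2 for x in column) / len(column)
    std = math.sqrt(var) or 1.0  # constant columns would otherwise divide by zero
    return [(x - mean) / std for x in column]

def first_component_scores(rows, iterations=200):
    """Project each row onto the first principal component of its category."""
    n, d = len(rows), len(rows[0])
    cols = [standardize([row[j] for row in rows]) for j in range(d)]
    z = [[cols[j][i] for j in range(d)] for i in range(n)]
    # Covariance matrix of the standardized features.
    cov = [[sum(z[i][a] * z[i][b] for i in range(n)) / n for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the direction of largest variance.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w)) or 1.0
        v = [x / norm for x in w]
    # Assumption: orient the component so larger feature values score higher.
    if sum(v) < 0:
        v = [-x for x in v]
    return [sum(z[i][j] * v[j] for j in range(d)) for i in range(n)]

def rank_universities(names, features_by_category, weights):
    """Weighted sum of per-category component scores, highest first."""
    totals = [0.0] * len(names)
    for category, rows in features_by_category.items():
        for i, score in enumerate(first_component_scores(rows)):
            totals[i] += weights[category] * score
    return sorted(zip(names, totals), key=lambda pair: pair[1], reverse=True)

# Made-up example data: one row per school, one list of rows per category.
names = ["School A", "School B", "School C", "School D"]
features_by_category = {
    "Student Success": [[0.90, 0.80], [0.70, 0.65], [0.60, 0.50], [0.40, 0.45]],
    "Academic Resources": [[120.0, 14.0], [100.0, 12.0], [80.0, 11.0], [50.0, 9.0]],
}
weights = {"Student Success": 0.37, "Academic Resources": 0.15}

ranking = rank_universities(names, features_by_category, weights)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because every feature is standardized first, a category with large raw numbers (like spending in dollars) doesn’t drown out one measured in percentages.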

To answer the prompt “provide a list of peer institutions that Towson University could learn from”, we used PCA again, this time to place the universities on a scatter plot. We then used K-means to group the points into clusters.
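A rough sketch of the clustering step could look like the following. Again, this is not our notebook code: the coordinates stand in for PCA output and are invented, "Target U" and the other school names are placeholders, and `kmeans` and `peers_of` are hypothetical helpers. To keep the result reproducible, the sketch seeds the centroids with a simple farthest-point rule, which is not how library implementations such as scikit-learn initialize K-means by default.

```python
import math

def kmeans(points, k, iterations=100):
    """Plain K-means (Lloyd's algorithm) on a list of coordinate tuples."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

def peers_of(target, names, labels):
    """Schools that landed in the same cluster as the target school."""
    target_label = labels[names.index(target)]
    return [n for n, label in zip(names, labels)
            if label == target_label and n != target]

# Invented 2D coordinates standing in for two PCA components per school.
names = ["Target U", "School B", "School C", "School D", "School E", "School F"]
points = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (5.0, 5.1), (5.2, 4.8), (4.9, 5.3)]

labels, centroids = kmeans(points, k=2)
peers = peers_of("Target U", names, labels)
print(peers)
```

The schools that share a cluster with the target are the natural candidates for a peer list.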

Practicing speech

I have to admit, I’m not a great speaker, especially on topics that I have very little knowledge about. My lack of confidence and my need to be 100% correct have probably held me back from amazing opportunities.

My partner, on the other hand, was a great speaker. He had to coach me and correct my shortcomings when it came to public speaking (I took a public speaking course a semester ago 😭). I eventually accepted that nobody is perfect and that I should just “send it”. Saying things with confidence started to feel like the times when I conquered my fear of doing tricks on a skateboard. Chances are, you’ll probably land it if you don’t chicken out midway.

I finished writing up this post on Fri Dec 12 20:44:54 EST 2025, so I’m simply going off memory and can’t say much more about how the presentation went. I just remember freaking out when we were announced as the first-place winners.