Improving Student Communication with Data
I worked with Shamen Kalmurzaev and Baitur Boobekov on CCE Email Optimization Analysis, a data analytics project at the American University of Central Asia (AUCA).

The main goal was to understand how students and other university members respond to emails sent by the Center for Civic Engagement (CCE). Because CCE organizes many events and opportunities, its emails need to be clear, attractive and easy to read.
To study this problem, we collected and analyzed survey responses about email reading behavior, subject line preferences, email body length, keywords, reminders, link placement and overall satisfaction. In total, we analyzed 46 survey responses and created a full data analysis pipeline using Python.
For the analysis, we used several techniques, including data cleaning, feature engineering, K-Modes clustering, SMOTE balancing, Logistic Regression, statistical testing and data visualization. The project helped us identify different groups of respondents and understand what makes people more likely to read CCE emails.
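For readers unfamiliar with K-Modes, here is a minimal sketch of the idea in plain Python. It is not the project's code: the answer values, the `k_modes` helper and the choice of starting modes are all made up for illustration. K-Modes groups categorical records by counting mismatched answers instead of measuring numeric distance, and it represents each cluster by the most common answer in every column.

```python
from collections import Counter


def matching_distance(a, b):
    """Count the positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Return the most common value in each column of a group of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(records, initial_modes, max_iter=10):
    """Assign each record to its nearest mode, recompute modes, repeat until stable."""
    modes = list(initial_modes)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(modes)), key=lambda k: matching_distance(r, modes[k]))
            for r in records
        ]
        new_modes = []
        for k in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == k]
            # Keep the old mode if a cluster ends up empty.
            new_modes.append(column_modes(members) if members else modes[k])
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes


# Hypothetical answers: (how often emails are read, subject length, body length).
responses = [
    ("always", "short", "short"),
    ("always", "short", "medium"),
    ("rarely", "long", "long"),
    ("rarely", "long", "medium"),
]
labels, modes = k_modes(responses, initial_modes=[responses[0], responses[2]])
print(labels, modes)
```

In practice a library implementation with smarter initialization would normally be used; the sketch only shows why the method suits survey answers that are categories and not numbers.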
Some key results from the project include:
- 46 survey responses analyzed
- 32 engineered features created
- 3 respondent clusters identified
- 95.4% cross-validated model accuracy
- Visual dashboards created for better decision-making
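A cross-validated accuracy like the one listed above is an average over held-out folds. The sketch below shows how such a number is computed, under stated assumptions: the labels are invented, the helper names are hypothetical, and a simple majority-class baseline stands in for the Logistic Regression model.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def majority_baseline(train_X, train_y, test_X):
    """Placeholder model: always predict the most common training label."""
    top = Counter(train_y).most_common(1)[0][0]
    return [top for _ in test_X]


def cross_val_accuracy(X, y, fit_predict, k=5):
    """Train on k-1 folds, score on the remaining fold, and average the scores."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        preds = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in fold]
        )
        correct = sum(p == y[i] for p, i in zip(preds, fold))
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)


# Invented labels for illustration, not the survey data.
X = [[i] for i in range(10)]
y = ["reads", "reads", "reads", "skips", "reads",
     "reads", "skips", "reads", "reads", "reads"]
score = cross_val_accuracy(X, y, majority_baseline, k=5)
print(score)
```

When oversampling such as SMOTE is part of the pipeline, it is generally applied to the training folds only, so that synthetic points never appear in the held-out fold. With a small sample, comparing the model's score against a baseline like the one in this sketch also helps put the accuracy figure in context.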
The results showed that students usually prefer emails with clear subject lines, short titles, short body text, and useful keywords such as Opportunity, Event, Job and Internship. Respondents also valued practical features like registration buttons, contact information and relevant attachments.
This project gave us practical experience in applying machine learning and statistical analysis to a real communication problem. It also showed how survey data can be transformed into useful recommendations that help organizations improve how they communicate with their audience.
Overall, the project was a valuable learning experience in data analytics, machine learning and teamwork. It helped us understand how data can support decision-making in university communication.
You can find the full project code and report on GitHub:
https://lnkd.in/dVtVtthF