Analysis of Academic Preferences

Context

I have been co-lecturing at Ngee Ann Polytechnic for about 3 months now, developing two modules, a Research Studio module and a Data Analytics module, both for product innovation diploma students. The modules take an evidence-based approach, teaching research skills that support the early stages of design thinking.

For their last assignment, we posed a collaboration project with NEA to the students. They are to study 8 different hawker centres and propose design solutions for them. To aid their research process, we suggested that the students take up different focuses and share their findings with one another.

As such, they were given the opportunity to choose site locations and roles, which we crafted intelligently and humorously for them. The 39 students will be allocated among 40 distinct site/role combinations, as illustrated by the matrix above. A survey was then created for them to bid for their preferred responsibilities.

The survey lets students express their top 4 choices of responsibility (site location/role). To ensure fairness, I assigned them on a first-come, first-served basis.

Motivations / Research Questions

Contrary to popular belief, I didn’t intend to analyse the survey data from the start. However, while I was manually assigning the roles, I had many “gut-feeling” observations about the students’ choices, which reflected very interesting characteristics of the class at large.

This inspired me to pursue an in-depth study; it is perhaps a good introspection of our influence on the students. Also, this might be a good chance to show our students that there is massive potential in survey data if it is crafted strategically to target certain trends or phenomena. The methods I used are rather advanced because they compare 3 different dimensions of the data at once: Location Preference, Role Preference and Degree of Preference (1st, 2nd, 3rd, 4th choices).

Here are some research questions of the study:

1) Understanding patterns of students' academic interests

2) Based on choice of roles, find out who might work well with each other

3) Based on choice of location, make an educated guess at where students might stay

1) Patterns of students' academic interests (Sankey diagrams)

Sankey diagrams were used to show how each student's choices progress from the 1st choice to the 4th.

Note that 1 alluvial line represents 1 student.
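For anyone who wants to try this on their own survey, here is a minimal sketch of how the links of such a diagram could be counted before handing them to a plotting tool. The data and the names `choices` and `sankey_links` are hypothetical and not taken from the actual survey; the sketch only assumes that each student lists four options in rank order.

```python
from collections import Counter

# Hypothetical toy data: each student's 1st-4th site choices, in rank order.
choices = {
    "student_a": ["Site 1", "Site 1", "Site 1", "Site 1"],
    "student_b": ["Site 1", "Site 2", "Site 2", "Site 3"],
    "student_c": ["Site 2", "Site 2", "Site 3", "Site 3"],
}

def sankey_links(choices):
    """Count students flowing between options at consecutive choice ranks."""
    links = Counter()
    for ranked in choices.values():
        # Pair each choice with the next one: (1st, 2nd), (2nd, 3rd), (3rd, 4th).
        for rank, (src, dst) in enumerate(zip(ranked, ranked[1:]), start=1):
            links[(rank, src, dst)] += 1
    return links

links = sankey_links(choices)
for (rank, src, dst), n in sorted(links.items()):
    print(f"choice {rank} -> {rank + 1}: {src} -> {dst} ({n} student(s))")
```

Each key is one band between two neighbouring columns of the diagram, and its count is the band's thickness, which is why one alluvial line corresponds to one student.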

1a) Choice by Location

The majority of students have distinct favourites in site location.

Most students repeated a site location at least once.
23% of our students (9/39) chose the same location for ALL choices.
All students who listed the Commonwealth Crescent Site as the 1st choice stayed with the site for all their remaining choices (see blue alluvial lines).
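Counts like these can be reproduced with a few lines of code. The sketch below is only illustrative: the toy data and the names `choices` and `repetition_summary` are made up, and it assumes each student's site choices are stored in rank order.

```python
# Hypothetical toy data (not the real survey responses).
choices = {
    "student_a": ["Site 1", "Site 1", "Site 1", "Site 1"],
    "student_b": ["Site 1", "Site 2", "Site 2", "Site 3"],
    "student_c": ["Site 2", "Site 4", "Site 3", "Site 5"],
}

def repetition_summary(choices):
    """Return (students who repeat a site, students who pick one site throughout)."""
    # A repeat exists when there are fewer distinct sites than choices.
    repeated = [s for s, ranked in choices.items() if len(set(ranked)) < len(ranked)]
    # One distinct site means the same location for all choices.
    same_all = [s for s, ranked in choices.items() if len(set(ranked)) == 1]
    return repeated, same_all

repeated, same_all = repetition_summary(choices)
print(f"{len(repeated)} of {len(choices)} students repeat a site at least once")
print(f"{len(same_all) / len(choices):.0%} choose the same site for all choices")
```

The same function works for roles if the lists hold role names instead of site names.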

I am convinced that students want to study a site close to their house, perhaps for convenience, for familiarity, or even to work with their friends if they happen to choose the same site. In future, we could create projects that inspire students to get out of their comfort zones and bring more diversity to the sites of interest.

1b) Choice by Role

The analysis by role choice is particularly interesting. Apart from observing the distribution of academic interests, we also get to see the behavioural tendencies of students with a particular interest.

The Influencer and Journalist roles are overwhelmingly popular.

Together, the Influencer and Journalist roles make up, on average, 50% of all choices listed.
The Influencer role took up a whopping 44% (17/39) of 1st choices, the highest proportion of any role in any choice category (1st/2nd/3rd/4th).

The qualitative work scope of the Influencer and Journalist roles is new and unfamiliar, especially in this research module where we have mostly been teaching quantitative methods so far. The fact that so many students still chose these roles possibly shows that they remain more comfortable with qualitative approaches in their work. Or perhaps the names are just too enticing.

There is a trend of persistence among those who chose the Influencer and Journalist roles.

Of the 10 students who chose the same role for all of their 1st to 4th choices, 7 chose the Influencer or Journalist role.

This raises a question: is this persistence due to the anticipated demand for the role, or to a strong insistence on qualitative work among qualitatively inclined designers? If the latter is true, it could be because the students have already given in to their weaknesses/preferences, or even have heavy objections to the quantitative methods we taught them (haha). This is a topic that I will have to seek feedback on.

The role of Spatial Analyst seemed like a good backup plan for many.

The Spatial Analyst role makes up 28% (11/39) of 2nd choices, which is rather high compared with its counts of 8, 4 and 6 at the 1st, 3rd and 4th choices respectively.
It is also the most picked 2nd choice.

I found this very comforting. Spatial analysis is something unorthodox that we taught, straying from the traditional product design syllabi, and the students are required to pick it up from scratch. I was frankly worried about their interest in spatial analysis, since our students did, in fact, choose to study product design instead of architecture. It made sense that Spatial Analyst was not the most picked 1st choice (even though its share is still quite significant), but I was glad that people found enough interest in our content to make it the top 2nd choice.

Students interested in Spatial Analysis/Town Planning have the most diverse interests.

Tracing the blue/light green alluvial lines, students who chose Spatial Analyst at any of their 1st/2nd/3rd/4th choices tend to pick a different role at the next choice.

It's quite amusing that students who have a preference for spatial analysis also tend to be more willing to explore other research methods and scopes of work. This could imply that these students have interdisciplinary/transdisciplinary interests.

2) Who might work well with each other? (SNA by Role)

This Social Network Analysis (SNA) seeks to reveal hidden groups of students with shared academic interests; as such, community detection will be the main method explored here. To summarize the methodology: the survey results provide sufficient data for a bipartite graph linking students to choice options, which is then translated into a one-mode projection for analysis. Put simply, the network treats students as nodes and connects two of them with an edge if they happened to pick the same choice.
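To make the projection step concrete, here is a minimal sketch in plain Python. It rests on assumptions that are not spelled out in this post: the toy data is hypothetical, `project_students` is an illustrative name, and "the same choice" is taken to mean the same option at the same rank, with the edge weight counting how many such matches two students share.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: the role each student listed at each rank (1st-4th).
role_choices = {
    "student_a": ["Influencer", "Influencer", "Journalist", "Journalist"],
    "student_b": ["Influencer", "Spatial Analyst", "Journalist", "Influencer"],
    "student_c": ["Spatial Analyst", "Spatial Analyst", "Town Planner", "Journalist"],
}

def project_students(choices):
    """One-mode projection: weight = number of (rank, option) pairs two students share."""
    # Bipartite step: each (rank, option) node collects the students attached to it.
    members = defaultdict(set)
    for student, ranked in choices.items():
        for rank, option in enumerate(ranked, start=1):
            members[(rank, option)].add(student)
    # Projection step: link every pair of students attached to the same node.
    weights = defaultdict(int)
    for students in members.values():
        for u, v in combinations(sorted(students), 2):
            weights[(u, v)] += 1
    return dict(weights)

edges = project_students(role_choices)
for (u, v), w in sorted(edges.items()):
    print(u, v, w)
```

Keying the bipartite nodes by rank as well as option is one simple way to keep the degree of preference in the network; dropping the rank from the key would instead link students who picked the same option at any rank.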

SNA by Role

(Academic Interests)

The diagram highlights connections among students who share a similar pattern of role choices. To put it simply, they are being clustered by academic interest. It is important to note that this also accounts for “degree of interest”, since the data differentiates 1st/2nd/3rd/4th choices, which imply different extents of interest.

Students are assigned community predictions (represented by colours) based on the greedy optimization algorithm. They are likely grouped by common sequences of role choices, which are treated as the traits that define each community. For instance, the blue community seems to be made up only of students who chose Influencer as their first choice.

I fondly believe that there is much potential utility in such an application. Grouping students by interest could be a more humanistic alternative to banding them traditionally on pure academic ability. Schools could give students academic preference surveys and use the information collected to direct teaching resources to the respective groups. Education that considers a student's interests and diverse abilities will definitely be more effective.
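For readers unfamiliar with this kind of community detection, here is a minimal sketch of one common greedy approach: start with every student in their own community and keep merging the pair of communities that raises modularity the most. This is an illustrative assumption, not necessarily the exact algorithm or library behind the diagram; `greedy_modularity` and the toy edges are hypothetical names and data.

```python
from collections import defaultdict

def greedy_modularity(edges):
    """Naive greedy modularity merging on a weighted, undirected edge dict."""
    m = sum(edges.values())  # total edge weight
    community = {}
    for u, v in edges:
        community.setdefault(u, u)  # every node starts in its own community
        community.setdefault(v, v)
    while True:
        degree = defaultdict(float)   # summed weighted degree per community
        between = defaultdict(float)  # edge weight between pairs of communities
        for (u, v), w in edges.items():
            cu, cv = community[u], community[v]
            degree[cu] += w
            degree[cv] += w
            if cu != cv:
                between[tuple(sorted((cu, cv)))] += w
        # Modularity gain of merging a and b: e_ab / m - d_a * d_b / (2 * m^2).
        best_gain, best_pair = 0.0, None
        for (a, b), w in sorted(between.items()):
            gain = w / m - degree[a] * degree[b] / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # no merge improves modularity any further
        a, b = best_pair
        for node in list(community):
            if community[node] == b:
                community[node] = a
    groups = defaultdict(set)
    for node, c in community.items():
        groups[c].add(node)
    return sorted(groups.values(), key=sorted)

# Hypothetical toy network: two tight triangles joined by a single edge.
toy_edges = {
    ("s1", "s2"): 1, ("s1", "s3"): 1, ("s2", "s3"): 1,
    ("s3", "s4"): 1,
    ("s4", "s5"): 1, ("s4", "s6"): 1, ("s5", "s6"): 1,
}
communities = greedy_modularity(toy_edges)
print(communities)
```

On a class-sized network this naive version is fast enough; a library implementation would be the sensible choice for anything larger.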

There is also the possibility of encouraging students of different interest communities to work together, so as to gather better inspiration and to generate better ideas.

3) Where might students stay? (SNA by Location)

Here we also use SNA, except that edges come from shared location choices rather than shared role choices as in (2).

This experiment serves more as a proof of concept than as a question in real need of an answer.

SNA by Location

This experiment assumes that students have most likely chosen the site locations closest to their homes. Hence, the network could approximate the geographical relationships between students.

Model - 1st Choice

If modelled exclusively by the 1st choice, you will see 8 different clusters of people who likely live close to the 8 different site locations.

Model - 1st Choice + 2nd Choice

However, if modelled additionally with the 2nd choice, the clusters start to connect. Why? It can be assumed that if students do not get their 1st choice (the site nearest to their house), they will pick as their 2nd choice the next nearest site. Hence, you would see the “proximity” of each cluster of students to another cluster. In this diagram, students of the cyan group likely live geographically between the green group and the red group.
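The effect of adding the 2nd choice can be sketched with a small connected-components example. The toy data, the site names and the function `clusters` are hypothetical, and the sketch assumes that two students are linked whenever they share any site within the choices being modelled, which may differ from how the diagrams here were built.

```python
from itertools import combinations

# Hypothetical toy data: each student's 1st and 2nd site choices.
site_choices = {
    "student_a": ["Site 1", "Site 2"],
    "student_b": ["Site 1", "Site 1"],
    "student_c": ["Site 2", "Site 3"],
    "student_d": ["Site 3", "Site 3"],
}

def clusters(choices, depth):
    """Group students linked by sharing any site within their first `depth` choices."""
    parent = {s: s for s in choices}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for u, v in combinations(choices, 2):
        if set(choices[u][:depth]) & set(choices[v][:depth]):
            parent[find(u)] = find(v)  # merge the two clusters

    groups = {}
    for s in choices:
        groups.setdefault(find(s), set()).add(s)
    return sorted(groups.values(), key=sorted)

print(clusters(site_choices, depth=1))  # clusters from the 1st choice only
print(clusters(site_choices, depth=2))  # clusters once the 2nd choice is added
```

With only the 1st choice, each site forms its own island of students; adding the 2nd choice lets students such as `student_a` and `student_c` act as bridges between islands.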

Model - 1st Choice + 2nd Choice + 3rd Choice...

One might expect the estimate to become more accurate as more choices are modelled. However, I found this unlikely, because the original assumption, that people choose the site locations closest to their homes, might no longer hold by the 3rd or 4th choice.

Well, gotta admit this one didn't work the way I wanted it to...

Reflecting upon this, I was reminded of something important: the necessity of having a strong awareness of what the data actually represents and of its underlying assumptions. More often than not, data analysts (including myself) get so consumed by the goal of saving the world with our models that we detach ourselves from what they represent in the real world. Especially as numerical models only get more complex in this day and age, I think it is important to stop and think about their boundaries and limitations, and to assess whether they are useful in giving us the answers we want.