Ever wondered why it is so hard to get on a shuttle bus during lunch hours? As students, we constantly gripe about boarding the bus to get to class on time, and while NUS is trying its best to resolve this problem, it still persists till today.

ST

(Image from the Straits Times)

Wolf Pack in NUS Business Analytics Society

As part of NUS BAS’s new initiative, Wolf Pack, our group decided to pick up a familiar student life-related problem and investigate its root cause. To tackle the insufficient supply of shuttle buses during peak hours, we planned to model travel demand and submit a report to the Office of Student Affairs (OSA) to see if they can adjust the bus schedule to meet the required demand.

Modelling Data Based on Modules

Our group decided to model the demand based on the number of classes occuring at once and class size. We can then estimate demand based on the number of students travelling to the main campus from University Town or Kent Ridge MRT.

Therefore, our first step to take was to scrape module-related data from the IVLE or CORS. We found out later that the data was easily available from the NUSMods API. We then converted it from JSON to CSV format (Excel-readable) using an online converter. Due to the sheer amount of data available and our inexperience, we decided to limit our scope to Business modules taken in BIZ 2 and COM 2.

Data Cleaning Using Pandas

As expected from online-sourced data, some data cleaning is required, and in our case our data had included duplicate modules and modules with two different module codes (e.g. ACC1002/1701). An experienced senior in Wolf Pack recommended us to learn Pandas, a useful data structures and data analysis tool in Python.

BAS3

(An example of a duplicate. Class K9 for ACC2002 appears twice.)

To remove duplicates from our data set, we used the drop_duplicates function. However, inputting too few or too many parameters into the argument would result in duplicates still being present or non-duplicates being removed. After reading posts on Stack Overflow and some trial and error, we finally managed to get the code right.

BAS1

Then, we filtered the data based on the module venues. The relevant venues would be close to BIZ 2 and COM 2 bus stops.

BAS4

After exporting the cleaned dataset into Excel, the next step was to visualise the data.

Data Visualisation using Tableau

After learning Tableau from the BAS tutorial, we wanted to see the starting and ending times of all the modules across the week.

4.PNG

(Most of the Biz modules held in BIZ1 and BIZ2 ended around 11 am or 2 pm. This was especially so for Tuesdays and Thursdays)

Our initial plan was to scrape the class sizes for each module from IVLE. However, due to the class sizes being kept in different pages we were unable to do so. We tried doing it with CORS but because most of the foundation business modules were allocated, the bidding reports did not give reliable results.

BAS5.PNG

Next, we wanted to create a heat map, similar to the one that we learnt in the Tableau tutorial.

BAS6.PNG

(This heat map shows the profits for each state and is colour coded to show the difference between high and low profits, and negative profits)

Another Tableau guide gave us information about how we should start. We decided to do a scatterplot, using point coordinates to plot various locations.

A friend recommended the NUS ArcGIS Directory, which contained a folder containing data with lat long coordinates for each location in BIZ 2 and COM 2. This is exportable to JSON. Spatial data (shapefile or geoJSON) was required, so we used an online converter to do that. Following that, we realised the coordinates pointed us to the middle of the South Altantic Ocean, so it was necessary to transform the coordinates. There were online resources to help us do it, but we would need to input each pair of coordinates manually. Looking to Stack Overflow once again, we hoped a Python library could help us. The relevant articles we found were too technical and we did not have the experience or the time to understand them, so we decided to scrap that idea.

Nevertheless, we manually converted some coordinates so that we can input them into Tableau. Unfortunately, what we saw was a tiny dot in Singapore, as shown below.

5.PNG

Perhaps Singapore was too small compared to the United States? It was possible to build a custom map in Tableau, but due to the lack of time, we decided to wrap up our project.

Learning Outcomes

Despite not accomplishing the goals we originally set for this project (i.e. submitting a report to OSA, scraping data from IVLE and creating a heat map using Tableau), the learning curve was indeed quite steep. Nevertheless, we actually enjoyed the experience because for most of us, it was our first time embarking on an independent project where we had to learn skills tailored to our tasks.

We hope that you have enjoyed reading about our project and learnt something from it!

The members of the team include:

Nelson Lim Jun Kang, Year 1, BAC

Jonathan Low Ka Heen, Year 2, BBA

Moh Song En Joseph, Year 2, BBA

Quek Yew Hong Ryan, Year 2, BBA + Econs

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s