Webinar on Using Mobile Phone and Big Data in Transport
The AITPM, in conjunction with the NZ Modelling User Group (NZMUGS) and leading consultants from Aimsun and Mott MacDonald, held a webinar titled “Current and future strategies for utilising mobile phone/big data in transportation modelling”.
Over 400 registered, with some 300 from Australia, ?? hundred from New Zealand and also attendees from the UK.
It is worth pondering how much the concept of using big data in transport planning is going through the Gartner hype cycle.
In terms of the Gartner hype cycle, perhaps this is not a case of “Inflated Expectations”. Rather, we need to move up the “Slope of Enlightenment” in understanding just how much professional effort and understanding we need to apply to ensure the accuracy of the data, comprehend its limitations and apply it effectively.
In his opening remarks to the webinar, Tom van Vuren said:
My personal position is that mobile phone data is a data source that we cannot ignore. It has got its challenges but the speakers today all have spent many years learning and improving their data sets and processes turning mobile phone traces into useful valuable data in Transport Planning here in Australia and New Zealand. We can piggyback on speeding up this kind of delivery at the lower cost whilst protecting credibility and analytical robustness.
First the Good News
In the first presentation, Claire Cheriyan from Transport for London, in a joint presentation with Tim Beech from Jacobs, said that there are opportunities for using mobile network data in transport planning and modelling, in both short- and long-term monitoring and evaluation: understanding how the city is changing over time, or looking at shorter bursts such as major events.
The main focus of their talk was on the construction of matrices for transport modelling and the in-depth insights into the movement of people that the data offers, broken down by mode and by purpose.
There are opportunities to get a huge temporal range in the data, to look at things that are important to large urban areas such as the night-time economy, and to understand how people from outside the city are influencing the flows of traffic, say, on the outer edges.
The main benefits come from working with a huge number of data points and also removing the human element of data collection.
What are we getting?
Mobile data gives us a lot of records but it is critically important to know just how the records are produced and consequently what they really represent.
Philippe Perret, Technical Director at Citi Logik, began his presentation by explaining what data is being generated. Edited comments from his presentation are as follows:
Data is based on monitoring technical messages existing within the 2G, 3G and 4G networks, and it basically helps to locate a mobile device accurately within a cell.
The higher the G, the smaller the coverage. Also, the very important thing is that the data is collected for each individual device behind a firewall, and each device is given a random identifier so that we can follow it over a long period of time.
What is also very important is that the data can be categorised in two forms, which we call on-call and off-call.
It is also important to realise that this is different from Call Data Records (CDR). The new mobile data goes way beyond CDR data.
We have what we call the passive connections, where the user of the device is not active on their phone: they are not on a call, browsing or anything else. This kind of data, the off-call data, does not generate a lot of evidence. Events are generated at regular intervals, or as a device travels and passes through groups of cells, which are collected together to help mobile operators understand where devices are.
The important element of the passive connections, or off-call data, is that they may not be very spatially granular, because we do not go to the level of the exact cell but only to a group of cells. However, this is an extremely large sample.
Then we have a second set, which is what we call the on-call or active connections. This is when users are using their phones: making calls, browsing the Internet, or even when apps are just pushing information. This generates a lot of evidence, hundreds of events a minute, which are very granular because at that time mobile operators are directing the information to the exact cell.
However, this is a far more limited sample, because mobile device users are not active on their devices all the time. You will find that probably 95 per cent of the time your device is in off-call mode, generating only passive connections; thus it is generating active connections only about 5 per cent of the time.
But what is very important for us, to be able to really understand movement, is how many events tend to be generated. Typically, a device might generate around 250 to 300 events a day in a useful form.
So, when we remove the events that are not really useful to us, we have about 250 events per day per device. What becomes very interesting is how events are recorded depending on the device's situation. We find that when devices are travelling with their owners, more evidence is collected. This is critical for us, because the analysis we do is based on our perception of movement.
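The distinction Philippe draws between off-call and on-call events can be illustrated with a small data structure. The following is only a sketch for readers who work with such records: the field names (`device_id`, `timestamp`, `on_call`) and the time unit are assumptions made for illustration, not the format used by any mobile operator or by Citi Logik.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One network event for one device (hypothetical layout)."""
    device_id: str    # random identifier assigned behind the firewall
    timestamp: float  # seconds since midnight (assumed unit)
    on_call: bool     # True = active connection, False = passive

    @property
    def granularity(self) -> str:
        # On-call events are located to an exact cell;
        # off-call events only to a group of cells.
        return "cell" if self.on_call else "cell_group"


def events_per_device(events):
    """Count how many events each device generated."""
    return Counter(e.device_id for e in events)


def count_by_granularity(events):
    """Count events by the spatial level at which they are located."""
    return Counter(e.granularity for e in events)


sample = [
    Event("a1", 28800.0, False),  # passive update while idle
    Event("a1", 30600.0, True),   # user starts browsing
    Event("a1", 30601.0, True),
    Event("b2", 29000.0, False),
]
print(events_per_device(sample))
print(count_by_granularity(sample))
```

Note that the 95 per cent and 5 per cent figures quoted above describe the share of time a device spends in each mode, not the share of events, so a count like this would not be expected to reproduce them.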
Philippe showed a slide of an example following a trip from South London to Gatwick Airport.
Looking at the cell size starts to show the limitations of mobile phone data. The data itself does not give us a location as accurate as GPS could. What we get is an area in which a device tends to be.
Philippe’s full presentation, which is available on the internet, covers issues such as:
• After the data is processed, algorithms created by the mobile operators are used to identify trips and dwell times.
• Looking at identifying road trips versus rail trips
• Cell size can vary greatly, and cells do not have exact boundaries.
• Cell size can vary with the weather.
• There is a critical element here in that we are only following devices. We are not following people.
• Expanding the records to represent a total sample means you have to be aware of the market share and market penetration.
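The last point, expanding device records to a total, can be illustrated with some simple arithmetic. The sketch below is not the method used by any operator or presenter. It assumes `market_share` is the operator's share of all mobile devices and `penetration` is the average number of devices per person, and both function names are hypothetical. In practice these factors would likely vary by area and by population group.

```python
def expansion_factor(market_share: float, penetration: float) -> float:
    """Factor to scale observed devices up to an estimate of people.

    market_share: operator's share of all devices, in (0, 1] (assumed definition)
    penetration:  average devices per person, > 0 (assumed definition)
    """
    if not 0 < market_share <= 1:
        raise ValueError("market_share must be in (0, 1]")
    if penetration <= 0:
        raise ValueError("penetration must be positive")
    return 1.0 / (market_share * penetration)


def expand(observed_devices: float, market_share: float, penetration: float) -> float:
    """Estimate the number of people represented by the observed devices."""
    return observed_devices * expansion_factor(market_share, penetration)


# Illustrative numbers only: 250 devices seen by an operator with a
# 25 per cent share, assuming one device per person.
print(expand(250, market_share=0.25, penetration=1.0))
```

Because we are following devices and not people, a penetration value above 1 (more than one device per person) would reduce the estimate, which is one reason validation against other data sources matters.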
In future editions of the newsletter we will look at some of the other comments that came out of the webinar. Issues to be covered include the need to validate the data against data collected through other means, such as a household survey.
Links to the on-line material are as follows: