Cellular Communications Data
Cell phones regularly exchange control channel messages with their networks. This cell-tower signaling can be used to locate and track individual phones through trilateration and other inferences from the signals sent between phones and towers. It was one of the first technologies harnessed to provide origin-destination (OD) big data on a large scale and remains widely used. Because the data comes from cell phone service providers, which are particularly concerned with customer privacy, it is released only as aggregated data products, never as disaggregate records.
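As a rough illustration of the trilateration idea mentioned above, the following sketch estimates a phone's position from three towers with known coordinates and an estimated distance to each. It is an idealized example, not a provider's actual method: the function name `trilaterate`, the planar coordinates in meters, and the noise-free distances are all assumptions made for illustration. Real positioning relies on noisy signal measurements and additional inferences.

```python
def trilaterate(towers, distances):
    """Estimate (x, y) from three tower positions and a range to each.

    towers: three (x, y) tuples in meters (hypothetical planar coordinates).
    distances: three estimated phone-to-tower distances in meters.
    """
    (x0, y0), (x1, y1), (x2, y2) = towers
    d0, d1, d2 = distances

    # Subtracting the first circle equation from the other two cancels the
    # squared unknowns, leaving two linear equations of the form a*x + b*y = c.
    a1, b1 = 2 * (x1 - x0), 2 * (y1 - y0)
    c1 = d0**2 - d1**2 + x1**2 - x0**2 + y1**2 - y0**2
    a2, b2 = 2 * (x2 - x0), 2 * (y2 - y0)
    c2 = d0**2 - d2**2 + x2**2 - x0**2 + y2**2 - y0**2

    # Solve the 2x2 system with Cramer's rule.
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        raise ValueError("towers are collinear; position is not unique")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# Illustrative towers one kilometer apart and exact ranges to one point.
towers = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]
distances = [500.0, 650000.0 ** 0.5, 450000.0 ** 0.5]
estimate = trilaterate(towers, distances)
```

With exact distances the estimate coincides with the true position. With realistic measurement error the three circles do not meet at a single point, which is one reason location precision depends so heavily on tower coverage.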
# Precision and Coverage
Cell-based OD matrices can be obtained for most time periods of potential interest, including average weekday, average weekend day, individual day of the week, and multi-hour periods within the day. The spatial precision of the data, however, is limited. Locations are generally known only to within one hundred meters or more, and sometimes only to within one to two kilometers where tower coverage is limited; precision tends to be better in urban areas with better tower coverage. Cell-based data is typically purchased in observation periods of one month, so despite its higher penetration it can yield a smaller net sample than multiple months of data from other sources.
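The net sample size comparison above comes down to simple arithmetic: observed trips scale with both penetration and the length of the observation period. The sketch below uses purely hypothetical inputs (50,000 daily trips in a corridor, an 8% penetration for one month of cell-based data, and a 2% penetration for twelve months of another dataset); none of these figures describes a specific product.

```python
def net_observed_trips(daily_trips, penetration, days):
    """Approximate number of trips observed over an observation period."""
    return daily_trips * penetration * days


DAILY_TRIPS = 50_000  # hypothetical corridor volume

# One month of cell-based data at a higher (assumed) penetration.
cell_sample = net_observed_trips(DAILY_TRIPS, penetration=0.08, days=30)

# Twelve months of another dataset at a lower (assumed) penetration.
other_sample = net_observed_trips(DAILY_TRIPS, penetration=0.02, days=365)

print(f"cell-based, 1 month:    {cell_sample:,.0f} observed trips")
print(f"other data, 12 months:  {other_sample:,.0f} observed trips")
```

Under these assumed values the longer observation period more than offsets the lower penetration, which is the effect the paragraph describes.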
# Sample Penetration
Sample penetration can vary significantly depending on the market shares of the various cell phone service providers, and some areas have no coverage at all. The authors have found that these samples typically include approximately 6–10% of vehicles in a corridor. The samples may include observations from a significantly larger portion of the population, perhaps 30% or more depending on service provider market shares. However, not every trip made by a sampled person is necessarily observed, so the portion of trips observed is smaller than the portion of the population included in the sample. These figures vary by region: some regions may have even larger samples than this range, while others (especially rural areas) may not reach this level of penetration.
Cell-based data is typically pre-expanded by the provider based on proprietary estimates of service provider market share at imputed residence locations. This residence- or population-based expansion can help address demographic biases related to market shares. However, it does not address systematic biases that can arise when people travel to and from locations with poor coverage, nor the trip-length bias, documented in this type of data, that arises from the varying frequency of observations.
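The residence-based expansion described above can be sketched as weighting each sampled trip by the inverse of the provider's market share in the traveler's imputed home zone. Providers' actual procedures are proprietary, so this is only a minimal illustration under that assumption; the function `expand_trips`, the zone labels, and the market shares are hypothetical.

```python
from collections import defaultdict


def expand_trips(sample_trips, market_share):
    """Expand sampled trips into an OD table using home-zone market shares.

    sample_trips: iterable of (home_zone, origin_zone, destination_zone).
    market_share: dict mapping home_zone to the provider's assumed share (0-1].
    """
    od = defaultdict(float)
    for home, origin, destination in sample_trips:
        share = market_share[home]
        if not 0 < share <= 1:
            raise ValueError(f"invalid market share for zone {home!r}: {share}")
        # Each sampled trip stands in for 1 / share trips by residents of
        # the traveler's imputed home zone.
        od[(origin, destination)] += 1.0 / share
    return dict(od)


# Hypothetical sample: two residents of zone A and one resident of zone B.
sample = [("A", "A", "B"), ("A", "A", "B"), ("B", "B", "A")]
shares = {"A": 0.25, "B": 0.5}
expanded = expand_trips(sample, shares)
```

Because the weights depend only on the home zone, a trip that was never observed (for example, one ending in an area with poor coverage) receives no weight at all, which is why this kind of expansion cannot correct coverage-related or trip-length biases.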