HubCab is an interactive visualization that invites you to explore the ways in which over 170 million taxi trips connect the City of New York in a given year. This interface provides a unique insight into the inner workings of the city from the previously invisible perspective of the taxi system, at a never-before-seen granularity. HubCab allows you to investigate exactly how and when taxis pick up or drop off individuals and to identify zones of condensed pickup and dropoff activities. It allows you to navigate to the places where your taxi trips start and end and to discover how many other people in your area follow the same travel patterns. What do these visualizations tell us about collective mobility? How many of these cabs might you have been able to share with the people around you? And how might entertaining these questions be the first step in building a more efficient and cheaper taxi service?
With an ever-increasing trove of real-time urban data streams, we are able to see precisely where, how, and at what times different parts of our cities become stitched together as hubs of mobility. By using these pervasive, interconnected, and “smart” technologies, we can begin to unravel the complexity of our travel patterns and identify how we can reduce the social and environmental costs embedded in our transportation systems. In HubCab we target taxicab services as a way to understand the linkages between our travel habits and the places we travel to and from most often.
Seven days of taxi pickups (yellow) and taxi drop offs (blue) in New York (Mon Oct 24 2011, 0:00 AM to Sun Oct 30 2011, 12:00 PM). Please note: the activity drops to almost zero on Saturday due to the 2011 Halloween nor’easter storm.
Screenshot of HubCab, showing pickups and drop offs of all 170 million taxi trips over one year in New York City.
Screenshot of HubCab, showing all taxi pickups and drop offs at JFK airport daily between 3AM and 6AM.
Screenshot of HubCab, showing taxi flows and potential taxi sharing benefits between two locations in Manhattan.
Screenshot of HubCab, highlighting all taxi dropoff points in New York City of passengers who were picked up at Times Square daily between 12 PM and 3 PM.
The Science of Sharing
The HubCab tool expands and changes the perception of urban space using a large-scale data set. Studying this data, we show in a scientific study the vast potential of taxi shareability. Our analysis introduces the novel concept of “shareability networks”, which allows for efficient modeling and optimization of trip-sharing opportunities. This mathematical approach makes use of network densification effects and represents a substantial advance over the existing state-of-the-art solutions to social sharing problems. Significant improvements of such a shared system are expected to lead to less congestion in road traffic, lower running costs and split fares, and a cleaner, less polluted environment.
The sharing benefits displayed on the map refer to total fare savings to passengers, distance savings in travelled miles, and emission savings in kg of CO2 that come from potentially shared trips. Our research shows that taxi sharing could reduce the number of trips by 40% with only minimal inconvenience to the passengers. Here we assume this 40% shareability rate, together with the following highly simplifying assumptions: a fare of $3.00 + $2.50/mi, using Rate Code 1 and not accounting for low-motion fares or special surcharges, and average CO2 emissions of 423 g/mi. Travelled distance is simplified as the straight-line distance between pickup and drop off points.
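The arithmetic behind these figures can be sketched as follows. This is a minimal illustration and not the code behind HubCab: it assumes that 40% of the trips between two locations are saved outright and that each saved trip avoids its full fare, distance, and emissions. The function names, the haversine formula for the linear distance, and the Earth radius constant are assumptions made for this example.

```python
import math

# Figures stated in the text above.
SHAREABILITY = 0.40          # assumed fraction of trips saved by sharing
BASE_FARE_USD = 3.00         # flat part of the fare
FARE_PER_MILE_USD = 2.50     # distance part of the fare
CO2_G_PER_MILE = 423.0       # average emissions per mile

EARTH_RADIUS_MILES = 3958.8  # assumption: mean Earth radius


def linear_distance_miles(lat1, lon1, lat2, lon2):
    """Straight-line (great-circle) distance between two GPS points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def sharing_benefits(n_trips, distance_miles):
    """Estimate the savings for n_trips of equal length between two locations."""
    saved_trips = SHAREABILITY * n_trips
    fare_per_trip = BASE_FARE_USD + FARE_PER_MILE_USD * distance_miles
    return {
        "saved_trips": saved_trips,
        "fare_savings_usd": saved_trips * fare_per_trip,
        "distance_savings_miles": saved_trips * distance_miles,
        "co2_savings_kg": saved_trips * distance_miles * CO2_G_PER_MILE / 1000.0,
    }


if __name__ == "__main__":
    print(sharing_benefits(100, 2.0))
```

Under these assumptions, 100 trips of 2 miles each yield 40 saved trips, $320 in fare savings, 80 saved miles, and 33.84 kg of CO2.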
The basis of the HubCab tool is a data set of all 170 million taxi trips of all 13,500 Medallion taxis in New York City in 2011. The data set contains GPS coordinates of all pickup and drop off points and corresponding times. Cartographic data of street shapes were obtained from OpenStreetMap. Using a Python script and the shapely library, the streets were cut into over 200,000 street segments of 40 m length each and imported into a MongoDB database. Pickup and drop off points were matched to the closest street segments. Street types unlikely to contain taxi drop offs or pickups, such as footpaths, trunks, and service roads, were not used in the matching process. Line widths of yellow and blue street segments on low zoom levels were styled on a logarithmic scale. The pickup and drop off points, represented as dots on the high zoom levels, were generated via an Arcpy script, being placed randomly within a box around a given street segment, with the box width again following a logarithmic scale. GPX files of the dots were styled using Maperitive, then merged and amended for different zoom levels. The dots and street line files were layered together with MapBox, which is the platform that streams all the map content.
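The segmentation and matching steps can be illustrated with a small, dependency-free sketch. It is not the original script, which used shapely and MongoDB. It assumes street coordinates are already projected to metres, uses a brute-force nearest-segment search, and all function names and the `base_width` and `scale` parameters are hypothetical.

```python
import math

SEGMENT_LENGTH_M = 40.0  # segment length stated in the text


def cut_polyline(points, segment_length=SEGMENT_LENGTH_M):
    """Cut a polyline (list of (x, y) in metres) into pieces of at most segment_length."""
    pieces = []
    current = [points[0]]
    remaining = segment_length  # length still available in the current piece
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        edge_len = math.hypot(x1 - x0, y1 - y0)
        pos = 0.0  # distance already consumed along this edge
        while edge_len - pos > remaining + 1e-9:
            pos += remaining
            t = pos / edge_len
            cut = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            current.append(cut)
            pieces.append(current)
            current = [cut]
            remaining = segment_length
        remaining -= edge_len - pos
        current.append((x1, y1))
        if remaining <= 1e-9:  # the piece ends exactly at this vertex
            pieces.append(current)
            current = [(x1, y1)]
            remaining = segment_length
    if len(current) > 1:
        pieces.append(current)
    return pieces


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_piece(p, pieces):
    """Index of the street piece closest to point p (linear scan)."""
    def piece_distance(piece):
        return min(point_segment_distance(p, a, b) for a, b in zip(piece, piece[1:]))
    return min(range(len(pieces)), key=lambda i: piece_distance(pieces[i]))


def line_width(count, base_width=0.5, scale=1.0):
    """Logarithmic line width for a segment with `count` pickups or drop offs."""
    return base_width + scale * math.log10(1 + count)


if __name__ == "__main__":
    street = cut_polyline([(0.0, 0.0), (100.0, 0.0)])
    print(len(street), nearest_piece((50.0, 5.0), street), line_width(99))
```

A linear scan is only practical for small inputs; with hundreds of thousands of segments and millions of points, a spatial index would be needed.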
Urban Richter Scale
In 2008, more than half of the world’s population could be found living in towns or cities, with urbanization intensifying in the world’s mega-cities and extensifying in smaller ones. Here, I imagined the world as a Richter scale with lines being drawn from west to east, mapping the density of cities across lines of latitude (5 degree intervals). The lines connect the centroid of each city or town.
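One possible reading of this construction, sketched under stated assumptions: cities are grouped into 5-degree latitude bands, and within each band the centroids are ordered from west to east so that a line can be drawn through them. The input format and the function name are hypothetical, and the original map may have been built differently.

```python
import math
from collections import defaultdict

BAND_DEGREES = 5  # latitude interval stated in the text


def richter_lines(cities):
    """Group (name, lat, lon) city centroids into latitude bands, ordered west to east.

    Returns a dict mapping the southern edge of each band to the list of
    cities in that band, sorted by longitude.
    """
    bands = defaultdict(list)
    for name, lat, lon in cities:
        band = math.floor(lat / BAND_DEGREES) * BAND_DEGREES
        bands[band].append((name, lat, lon))
    return {band: sorted(members, key=lambda c: c[2]) for band, members in bands.items()}


if __name__ == "__main__":
    sample = [("Berlin", 52.5, 13.4), ("London", 51.5, -0.1), ("Sydney", -33.9, 151.2)]
    for band, members in sorted(richter_lines(sample).items()):
        print(band, [name for name, _, _ in members])
```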
In the United States, the 3-1-1 telephone number allows city residents at the municipal level to phone in their various maintenance requests. From broken street lights to potholes, citizens become the “eyes and ears of their neighborhoods” and are able to file their complaints and document the problems that they would like to see resolved in their city. This map shows all of the 3-1-1 calls from Boston, MA, from 2011, collected from the Citizen Connect Mobile Application.
Imagine swimming across Los Angeles as if pool-by-pool they form a river through the city; 43123 oases stitched together in a desert of hyper-urban reality. You float unabashed down your unmapped highway of water, but are confronted very quickly by the fact that you are not welcome in this realm of kidney and clover bowls, Olympic-sized parallelograms, and hot tubs. Threatened by an unforgiving obstacle course of disgruntled homeowners and an impending court order you continue from pool to pool, your reconciliation awaiting you in the next chlorinated ecosystem.
The LA Swimmer is based on the dataset of “The Big Atlas of LA Pools” and inspired by Frank Perry’s 1968 film, “The Swimmer”. We imagined that through the LA Swimmer, we might be able to make an allusion to the vision of a “river of pools”. However, instead of a journey of self-discovery, we invite you to leap-frog from pool to pool and to see LA through the Google Car’s perfectly tall street view camera.
The “Big Atlas of LA Pools” is about the process of mapping and map-making in the contemporary age of big data, open data, crowdsourcing, and citizen science. The project attempts to highlight on one hand the emerging and powerful role of non-domain experts in the discovery of scientifically and socially relevant information, and on the other hand seeks to emphasize the darker, creepier, and more contentious issues surrounding data processing and exploration. As a “two-person army”, Benedikt Groß (DE) and Joseph K. Lee (US) located and traced the contours of over 43000 pools and other manmade water boundaries, features which computer vision could not adequately demarcate. Throughout their project, the two exploited the idea of “crowdsourcing” to process the aerial ortho-imagery of their study area in Los Angeles County and to validate their dataset using commercial online third-party services, namely clipping farms in India and Amazon Mechanical Turk. In addition, they then mashed together additional layers of contextual information that might suggest surprising, intriguing, or sinister spatial relationships within LA’s social and physical landscapes.
The Atlas of LA Pools is a process-based exploration of data and an experiment in data processing methods, analysis, and resourcing. While the final visualizations and text are intended to stand alone as visual and scientific contributions, they are limited in describing the journey through which the authors arrived at these maps.
The authors started with 39 georeferenced orthoimages of their study area in Los Angeles County. Two sets of images, one true color composite (red, green and blue wavelengths) and one false color composite (near infrared, red, and green wavelengths), were sent to clipping factory farms to delineate the pool and manmade water boundaries as vectors. Clipping factory farms are services in India normally used to prepare product placement photos for online shops; they “clip” the product out and place it on a white background. The authors performed a quality check of the initial dataset, deleting falsely drawn pool and water boundaries and marking additional features from the source aerial footage which had been overlooked. The images were then resubmitted to the clipping factory to be reprocessed and were again reviewed by the authors. After passing this rigorous inspection, the geopositions (latitude and longitude) of each pool were viewed against higher resolution (~1 ft) Bing maps as provided by Microsoft. The authors then submitted these high-res images to Amazon Mechanical Turk and asked the workers to visually check whether a feature might be a pool (blue water) or not (blue roof); this was done twice to ensure that the geolocated features were accurately classified. Finally, the authors manually resolved any conflicts in the dataset. The result: “The Big Atlas of LA Pools”.
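The double-check step can be pictured as a simple reconciliation of two classification rounds. This is a hypothetical sketch, not the authors' tooling: it assumes each round yields one pool or not-pool answer per feature, and that any disagreement or missing answer is set aside for manual review.

```python
def reconcile(round_one, round_two):
    """Compare two rounds of worker answers (feature_id -> True if pool).

    Returns three sorted lists: features both rounds call a pool, features
    both rounds reject, and features that need manual review.
    """
    accepted, rejected, conflicts = [], [], []
    for feature_id in sorted(round_one):
        first = round_one[feature_id]
        second = round_two.get(feature_id)
        if second is None or first != second:
            conflicts.append(feature_id)  # disagreement or missing answer
        elif first:
            accepted.append(feature_id)
        else:
            rejected.append(feature_id)
    return accepted, rejected, conflicts


if __name__ == "__main__":
    first = {"a": True, "b": False, "c": True}
    second = {"a": True, "b": False, "c": False}
    print(reconcile(first, second))
```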
NAIP Orthoimagery: The National Agriculture Imagery Program (NAIP) acquires aerial imagery during the agricultural growing seasons throughout the continental U.S. This data was used to find and trace the pools. The dataset is also used for the aerial imagery overview pages.
Bing Maps: Bing Maps by Microsoft were used to show higher resolution images of the pool locations. This data was used to help validate the pools through Amazon Mechanical Turk. The images are also used to catalog the individual pools throughout these books.
Google Geocoder: Google provides a service for geocoding and reverse geocoding (translating latitude/longitude to addresses). This service was used to give each pool an address.
Eight Maps: Eight Maps is a map mash-up of the people in the State of California who actively opposed same-sex marriage and made monetary contributions towards passing the proposition that would prevent it. Proposition 8 changed the California State constitution to prohibit same-sex marriage.
Megan’s Law Sex Offender List: Megan’s Law is the informal term for the United States’ Sex Offender Act of 1994. This allows law enforcement agencies to make public the addresses and crimes of the registered sex offenders living in the United States.
Los Angeles Crime Dataset: The Los Angeles Sheriff’s Department provides an extensive crime dataset which includes information regarding the geoposition, date, and type of crimes that have occurred in Los Angeles County. We used a yearly data file for 2011.
Land Use and Parcel Boundaries: The Los Angeles County Office of the Assessor provides data regarding the land lot parcel boundaries and their specific land uses, property value, year built, and much more.
LA Neighborhood Definition: The LA neighborhood boundaries were taken from the Los Angeles Times neighborhood definition dataset from June 2010.
"The Big Atlas of LA Pools" is a presentation of the data derived from satellite and aerial imagery, sourced from various governmental and non-governmental organizations, and processed by the authors. The information expressed in this collection is neither conclusive nor tested for statistical significance. The "swimming pool" dataset is enriched through spatial autocorrelation, or in other terms by incorporating Tobler’s First Law of Geography: "Everything is related to everything else, but near things are more related than distant things." The data is therefore provided merely to suggest certain relationships and curiosities, not to define them.
74 Books, ca. 6000 pages
Covers show all shapes of the swimming pools overlaid
Aerial Imagery of the study area of the specific book, here all of LA
Process and workflow
LA’s neighborhoods with pool count
Each page shows all pool shapes of the particular neighborhood overlaid = neighborhood pool shape fingerprint
The map shows all sex offenders in LA and whether they have pools or not
Neighborhood maps and pool locations
Aerial Imagery of the study area of the specific book, here Downey
Pool catalog page, all pools are ordered according to their postal address
Pool features e.g. area, dimensions, water evaporation, parcel price, crime …
All posters are based on the dataset of “The Big Atlas of LA Pools”.
Satellite Overview RGB (from the Big Atlas of LA Pools)
Satellite overview of the analyzed area 35 km by 49 km, from Beverly Hills to Long Beach (1716 km²).
Lambda Print, 100 cm x 145 cm
LA Sorted (from the Big Atlas of LA Pools)
Each pixel of the source satellite imagery is classified into four categories (top to bottom): Sea, Developed Areas (streets, houses etc.), Pools, and Vegetation + Soil. Within each category, the pixels are sorted by greyscale value. Each pixel represents an area of 2 m × 2 m; the image shows 428,955,900 pixels (17450 × 24582).
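The sorting idea can be sketched in a few lines. This is only an illustration under assumptions: pixels arrive already classified, the greyscale value is computed with the common Rec. 601 luma weights, and the sort runs from dark to light, none of which the description confirms. The category labels and function names are hypothetical.

```python
# Category order from top to bottom, as described in the text.
CATEGORY_ORDER = ["sea", "developed", "pools", "vegetation_soil"]


def greyscale(rgb):
    """Greyscale value of an (r, g, b) pixel using Rec. 601 weights (an assumption)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def sort_pixels(pixels):
    """Sort (category, (r, g, b)) pixels by category, then by greyscale value."""
    rank = {name: i for i, name in enumerate(CATEGORY_ORDER)}
    return sorted(pixels, key=lambda p: (rank[p[0]], greyscale(p[1])))


if __name__ == "__main__":
    # The stated image size and pixel count are consistent with each other.
    print(17450 * 24582)
    sample = [("pools", (0, 0, 255)), ("sea", (10, 10, 10)), ("sea", (0, 0, 0))]
    print(sort_pixels(sample))
```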
Lambda Print, 100 cm x 145 cm
Pool Shapes (from the Big Atlas of LA Pools)
The poster shows each pool’s shape overlaid one onto the other. The orientation (compass directions) of the pools has been normalized according to the longest axis and the centre points. The average pool size of all pools in Los Angeles is 4.99 m × 10.13 m.
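A normalization of this kind could look like the sketch below. It is an assumption-laden illustration, not the authors' method: the centre point is taken as the mean of the outline vertices, the longest axis is estimated as the principal axis of those vertices, and each shape is rotated so that this axis lies along the x-axis. The function name is hypothetical.

```python
import math


def normalize_shape(vertices):
    """Centre a pool outline on the origin and align its longest axis with the x-axis.

    vertices: list of (x, y) outline points in metres.
    """
    n = len(vertices)
    cx = sum(x for x, _ in vertices) / n
    cy = sum(y for _, y in vertices) / n
    centred = [(x - cx, y - cy) for x, y in vertices]

    # Second moments of the vertex cloud give the principal axis direction.
    sxx = sum(x * x for x, _ in centred) / n
    syy = sum(y * y for _, y in centred) / n
    sxy = sum(x * y for x, y in centred) / n
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)

    # Rotate by -angle so the principal axis becomes horizontal.
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in centred]


if __name__ == "__main__":
    # A 4 m by 10 m rectangle standing upright, away from the origin.
    pool = [(100.0, 200.0), (104.0, 200.0), (104.0, 210.0), (100.0, 210.0)]
    print(normalize_shape(pool))
```

After normalization, shapes can be drawn on top of one another with a shared centre and orientation.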
Lambda Print, 100 cm x 145 cm
43123 (from the Big Atlas of LA Pools)
The map shows all 43123 pools of Los Angeles. Each pool has been hand-traced and quality checked via commercial online crowdsourcing services.
Lambda Print, 100 cm x 145 cm