Joey Lee explores geographies mediated through data and computation.
Designer & Researcher.
Based in Brooklyn, NY
Working as an interaction designer, creative technologist, and educator.
New York University | Interactive Telecommunications Program (ITP)
School of Visual Arts | Design for Social Innovation
Generative design & geography, machine learning, data collection systems, climate change, urban climates, open source & access, visualization, algorithmic ethics, critical mapping, & education.
The Big Atlas of LA Pools
A seventy-three volume atlas cataloging the pools of Los Angeles and emerging issues around data privacy.
The “Big Atlas of LA Pools” is about the process of mapping and map-making in the contemporary age of big data, open data, crowdsourcing, and citizen science. The project attempts to highlight on one hand the emerging and powerful role of non-domain experts in the discovery of scientifically and socially relevant information and on the other hand seeks to emphasize the darker, creepier, and more contentious issues surrounding data processing and exploration.
As a “two-person army”, Benedikt Groß (DE) and Joey Lee (US) located and traced the contours of over 43,000 pools and other manmade water boundaries, features which computer vision could not adequately demarcate. Throughout their project, the two exploited the idea of “crowdsourcing” to process the aerial ortho-imagery of their study area in Los Angeles County and to validate their dataset using commercial online third-party services, namely clipping farms in India and Amazon Mechanical Turk. In addition, they then mashed together additional layers of contextual information that might suggest surprising, intriguing, or sinister spatial relationships within LA’s social and physical landscapes.
The Atlas of LA Pools is a process-based exploration of data and an experiment in data processing methods, analysis, and resourcing. While the final visualizations and text are intended to stand alone as visual and scientific contributions, they are limited in describing the journey through which the authors arrived at these maps.
The authors started with 39 georeferenced orthoimages of their study area in Los Angeles County. Two sets of images, one true color composite (red, green, and blue wavelengths) and one false color composite (near infrared, red, and green wavelengths), were sent to clipping factory farms to delineate the pool and manmade water boundaries as vectors. Clipping factory farms are services in India normally used to prepare product placement photos for online shops; they “clip” the product out and place it on a white background. The authors performed a quality check of the initial dataset, deleting falsely drawn pool and water boundaries and marking additional features from the source aerial footage which had been overlooked. The images were then resubmitted to the clipping factory to be reprocessed and were again reviewed by the authors. After passing this rigorous inspection, the geopositions (latitude and longitude) of each pool were viewed against higher resolution (~1 ft) Bing maps as provided by Microsoft. The authors then submitted these high-res images to Amazon Mechanical Turk and asked the workers to visually check whether a feature might be a pool (blue water) or not (blue roof); this was done twice to ensure that the geolocated features were accurately classified. Finally, the authors manually resolved any remaining conflicts in the dataset. The result: “The Big Atlas of LA Pools”.
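The double-check step described above (two rounds of worker classification, with disagreements resolved by hand) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `reconcile`, the feature ids, and the labels "pool" and "roof" are hypothetical, chosen only to show how two independent label rounds might be compared so that conflicting features are set aside for manual review.

```python
def reconcile(round_one, round_two):
    """Compare two independent label rounds keyed by feature id.

    Returns (agreed, conflicts): the features on which both rounds
    gave the same label, and the ids that need a manual decision.
    """
    agreed, conflicts = {}, []
    for feature_id in sorted(set(round_one) | set(round_two)):
        first = round_one.get(feature_id)
        second = round_two.get(feature_id)
        if first is not None and first == second:
            agreed[feature_id] = first       # both rounds agree
        else:
            conflicts.append(feature_id)     # mismatch or missing label


    return agreed, conflicts


# Hypothetical labels from two classification rounds.
round_one = {"f1": "pool", "f2": "roof", "f3": "pool"}
round_two = {"f1": "pool", "f2": "pool", "f3": "pool"}

agreed, conflicts = reconcile(round_one, round_two)
print(agreed)
print(conflicts)
```

A feature missing from either round is treated as a conflict here, on the assumption that an unlabeled feature should be looked at by a person rather than silently accepted.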
“The Big Atlas of LA Pools” is a presentation of the data derived from satellite and aerial imagery, sourced from various governmental and non-governmental organizations, and processed by the authors. The information expressed in this collection is neither conclusive nor tested for statistical significance. The “swimming pool” dataset is enriched through spatial autocorrelation, or in other terms by incorporating Tobler’s 1st Law of Geography: “Everything is related to everything else, but near things are more related than distant things.” The data is therefore provided merely to suggest certain relationships and curiosities, not to define them.
- NAIP Orthoimagery – The National Agriculture Imagery Program (NAIP) acquires aerial imagery during the agricultural growing seasons throughout the continental U.S. This data was used to find and trace the pools. The dataset is also used for the aerial imagery overview pages.
- Bing Maps – Bing Maps by Microsoft were used to show higher resolution images of the pool locations. This data was used to help validate the pools through Amazon Mechanical Turk. The images are also used to catalog the individual pools throughout these books.
- Google Geocoder – Google provides a service for geocoding and reverse geocoding (translating latitude/longitude to addresses). This service was used to give each pool an address.
- Eight Maps – Eight Maps is a map mash-up of the people in the State of California who actively opposed same-sex marriage by making monetary contributions towards passing Proposition 8, which changed the California State constitution to prohibit same-sex marriage.
- Megan’s Law Sex Offender List – Megan’s Law is the informal term for the United States’ Sex Offender Act of 1994. This allows law enforcement agencies to make public the addresses and crimes of the registered sex offenders living in the United States.
- Los Angeles Crime Dataset – The Los Angeles Sheriff’s Department provides an extensive crime dataset which includes information regarding the geoposition, date, and type of crimes that have occurred in Los Angeles County. We used a yearly data file for 2011.
- Land Use and Parcel Boundaries – The Los Angeles County Office of the Assessor provides data regarding the land lot parcel boundaries and their specific land uses, property value, year built, and much more.
- LA Neighborhood Definition – The LA neighborhood boundaries were taken from the Los Angeles Times neighborhood definition dataset from June 2010.
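
One simple way to relate a pool to a contextual point layer such as those listed above is to count how many points fall within a given distance of it, in the spirit of Tobler's "near things are more related". The Python sketch below is an illustration only and is not the authors' method: the function names, the 500 m radius, and the coordinates are all hypothetical, and the haversine great-circle formula is used as a convenient approximation of ground distance.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_nearby(pool, points, radius_m):
    """Count the (lat, lon) points lying within radius_m of the pool."""
    return sum(
        1 for lat, lon in points
        if haversine_m(pool[0], pool[1], lat, lon) <= radius_m
    )


# Hypothetical coordinates, for illustration only.
pool = (34.05, -118.25)
incidents = [(34.0501, -118.25), (34.2, -118.25)]

nearby = count_nearby(pool, incidents, radius_m=500)
print(nearby)
```

Comparing every pool against every point is slow for tens of thousands of features; a spatial index would be the usual choice at that scale, but the brute-force version keeps the idea visible.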