NYC’s Urban Textscape.

Matt Daniels writes about an amazing use of Street View:

What if you could search every visible word on New York City’s streets?

First, we’d need to transcribe every business sign, bumper sticker, ad, flyer—anything with text. All these transcribed words are a wealth of information: non-English text could indicate a cultural enclave, like this one in Sunset Park, Brooklyn. We can pinpoint the phrases that comprise NYC vernacular. From everyday common words… …to phrases that uniquely blanket the city.

This is possible because media artist Yufeng Zhao fed millions of publicly-available panoramas from Google Street View into a computer program that transcribes text within the images (anyone can access these Street View images; you don’t even need a Google account!). The result is a search engine of much of what’s written in NYC’s streets. It’s limited to what a Google Street View car can capture, so it excludes text in areas such as alleyways and parks, or any writing too small to be read by a moving vehicle.

The scale of the data is immense: over 8 million Google Street View images (from the past 18 years) and 138 million identified snippets of text.

There are sections on Broadway (Matches for “Broadway” identify street signs, of course. The resulting map is oddly satisfying, illuminating each of the five boroughs’ Broadway), Luxury, Beware (There’s a simple answer to why “beware” has such a clear geographic footprint, oddly completely absent from Manhattan), Gold (The map of “gold” depicts Manhattan’s Diamond District, as well as streets lined with “we buy gold” jewelry and pawn shops), and many more, and there’s a list of the most common words in the dataset (“Stop” is the #1 word in the dataset, appearing 1,304,417 times, followed by “One way”), with explanations for many of them (NYC is home to [the] largest Muslim community in the United States, so it’s no surprise to see halal—food prepared according to Islamic dietary law—ranked at #304). It’s impressive and loads of fun. Thanks, Y!

Comments

  1. J.W. Brewer says

    Although meatwise “halal” is outranked by “Sabrett.”

  2. As far as I can tell, the search engine only works with Latin letters. I couldn’t find “כשר” for example.

    Ed. I now see that this is acknowledged explicitly. I suppose because of some limitations in the OCR software.
