Peer-to-peer marketplaces and the sharing economy are reshaping many markets. For example, the nonstandardized, diverse selections on Airbnb produce both superior value and uncertainty for customers. Visual information, such as background images that appear in search results, can help resolve such uncertainty. The visual information that such images provide consists of both technical quality and content features. Using a convolutional neural network approach to process the images, this study reviews Airbnb listings in two cities and derives a descriptive model of image technical features, content, and other property attributes (e.g., price, textual information, characteristics) to predict demand at the property level. To correct for endogeneity, a novel instrumental variable approach matches Airbnb listings across cities and establishes instruments according to distance from city centers. The results show that consumers value interior views in the main image; the effect of such views on demand is greater than that of image technical features and complementary to textual information. The effect sizes are substantial and consistent with the literature; for example, a one standard deviation increase in the probability of showing the living room translates into a 13% increase in the booking rate, equivalent to about two more nights of bookings.
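To illustrate why an instrumental variable correction matters for a demand model of this kind, the following is a minimal sketch of generic two-stage least squares (2SLS) on synthetic data. It is not the study's estimator or data: the variable names, the coefficients, and the data-generating process in `simulate` are all assumptions made for illustration, and the instrument here is a plain random variable rather than the cross-city matching and distance-based instruments described above.

```python
import numpy as np


def ols(y, x):
    """Ordinary least squares of y on [1, x]; returns (intercept, slope)."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def two_stage_least_squares(y, x_endog, z):
    """2SLS of y on [1, x_endog] using z as the instrument.

    Returns (intercept, slope).
    """
    ones = np.ones(len(y))
    Z = np.column_stack([ones, z])        # instruments (with constant)
    X = np.column_stack([ones, x_endog])  # regressors (with constant)
    # Stage 1: project the regressors onto the instrument space.
    first_stage, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ first_stage
    # Stage 2: regress the outcome on the fitted regressors.
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta


def simulate(seed=0, n=20_000, true_slope=0.5):
    """Hypothetical data: x is correlated with an unobserved driver of y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                 # instrument: shifts x, not y directly
    u = rng.normal(size=n)                 # unobserved confounder
    x = 0.8 * z + u + 0.5 * rng.normal(size=n)   # endogenous attribute
    y = 1.0 + true_slope * x + 2.0 * u + rng.normal(size=n)  # demand outcome
    return y, x, z


if __name__ == "__main__":
    y, x, z = simulate()
    print("OLS slope: ", ols(y, x)[1])
    print("2SLS slope:", two_stage_least_squares(y, x, z)[1])
```

In this synthetic setup the confounder pushes the OLS slope well above the true value of 0.5, while the 2SLS slope should land close to it. The correction only works when the instrument affects the outcome solely through the endogenous attribute, which is an assumption built into `simulate` here and would have to be argued for in any real application.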
Note: Research papers posted on SSRN, including any findings, may differ from the final version chosen for publication in academic journals.