This small design project analyzes the language used in Zillow.com listings for two Atlanta neighborhoods: Old Fourth Ward (O4W) and Inman Park. Old Fourth Ward, a neighborhood on the east side of Atlanta, is no stranger to gentrification. It stretches from Piedmont Avenue and Downtown Atlanta on the west to the BeltLine and the Poncey-Highland and Inman Park areas on the east. Boulevard, a main road running through the Old Fourth Ward, has become a notorious hub for drug activity, prostitution, homelessness and other crimes. Boulevard is also known for having the highest concentration of Section 8 housing in the Southeast.
Inman Park is also a neighborhood on the east side of Atlanta, and one of the city's most desirable intown neighborhoods. It is a gathering place for diverse cultures, art galleries, and savory cuisine. The neighborhood has developed a strong sense of community over the years, welcoming new businesses and residents.
We thought it would be interesting to compare two distinctly different but historically rich neighborhoods. First, we built a custom scraper in Python that extracts information directly from Zillow.com’s listings. We avoided Zillow's APIs because the content they returned did not include textual descriptions. For each listing we collected the price, square footage, number of beds and baths, property description, and listing type (“For Rent”, “For Sale”, “For Sale by Owner”, “Make Me Move” or “Sold”).
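As an illustration of the fields listed above, here is a minimal sketch of how one scraped listing could be represented in Python. The names `Listing`, `LISTING_TYPES` and `parse_price` are hypothetical and not taken from our scraper, and the price parser assumes a plain display string such as "$450,000" (it would misread abbreviated forms like "$1.2M").

```python
from dataclasses import dataclass
from typing import Optional

# The five listing types named in the write-up.
LISTING_TYPES = {"For Rent", "For Sale", "For Sale by Owner", "Make Me Move", "Sold"}


@dataclass
class Listing:
    """One scraped listing (hypothetical record layout)."""
    neighborhood: str
    listing_type: str
    price: Optional[int]        # None when no price is shown
    square_feet: Optional[int]
    beds: Optional[float]
    baths: Optional[float]      # float so that half baths fit
    description: str

    def __post_init__(self):
        # Reject anything outside the five known listing types.
        if self.listing_type not in LISTING_TYPES:
            raise ValueError(f"unknown listing type: {self.listing_type!r}")


def parse_price(text: str) -> Optional[int]:
    """Keep only the digits of a display string; return None if there are none."""
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else None
```

Keeping the numeric fields optional lets a record be stored even when a listing omits a value, so the description text is still available for analysis.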
At a higher level, we wanted to analyze what people care about when writing listings in the two neighborhoods. In particular, we looked for the top words people use to advertise properties. We hypothesized that Inman Park real estate listings contain words that imply higher-end, more luxurious features, whereas Old Fourth Ward listings imply impoverishment and gentrification. To test this, we used the AYLIEN Text Analysis API to analyze the listing descriptions we collected. The AYLIEN package for Natural Language Processing, Information Retrieval and Machine Learning extracts insights from textual content. We performed “Entity Extraction” and “Concept Extraction” on these textual descriptions. Entities are critical words that appear in the descriptions, determined using the TF-IDF algorithm. Concepts, however, are themes determined using an LDA topic modeling algorithm.
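To make the idea of ranking "critical words" concrete, here is a minimal standalone TF-IDF sketch in Python. It is not AYLIEN's implementation and does not call its API; the function names and the simple letters-only tokenizer are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters only (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def top_tfidf_terms(descriptions, k=5):
    """Return the k terms with the highest TF-IDF score summed over all descriptions."""
    docs = [tokenize(d) for d in descriptions]
    n_docs = len(docs)

    # Document frequency: in how many descriptions does each term appear?
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in docs:
        if not tokens:
            continue
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)                   # term frequency in this description
            idf = math.log(n_docs / doc_freq[term])    # 0 for terms found in every description
            scores[term] += tf * idf
    return [term for term, _ in scores.most_common(k)]
```

A word that appears in every description gets an inverse document frequency of zero, so generic terms drop to the bottom of the ranking while words concentrated in a few listings rise.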
We extracted 449 listings for Inman Park and 801 listings for O4W from Zillow.com. The “Entity Extraction” method in AYLIEN extracts entities such as people, places, organizations, and products. These entities provide important identifiers for the textual data. The “Concept Extraction” method disambiguates named entities more precisely, which lets us find the abstract or generic idea behind a particular entity. Being able to extract entities and concepts from these listings allows us to delve deeper into the text and perform a cross-comparison of the two neighborhoods.
We plotted three comparison charts for each neighborhood, arranged as three pairs. The first pair plots the top concepts. Since AYLIEN generated more top entities than concepts, we split the entities list into two categories: neighborhood-related words (Entities Chart 1) and home-feature-related words (Entities Chart 2). Entities Chart 1 plots the top words that refer to the surrounding area, such as “Ponce City Market” and “Midtown”. Entities Chart 2 contains words that describe home features, such as “granite”, “HVAC” and “countertops”.
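The split into the two entity charts can be sketched as a lookup against two term lists followed by a count. This is only one possible approach, not necessarily how our charts were produced; the term sets below contain just the examples named in this write-up, and the names `NEIGHBORHOOD_TERMS`, `HOME_FEATURE_TERMS` and `split_entities` are hypothetical.

```python
from collections import Counter

# Illustrative term lists built from the examples in the text;
# a real run would need much longer, hand-curated lists.
NEIGHBORHOOD_TERMS = {"ponce city market", "midtown"}
HOME_FEATURE_TERMS = {"granite", "hvac", "countertops"}


def split_entities(entities):
    """Count entities per category: (neighborhood, home_features, other)."""
    neighborhood, home_features, other = Counter(), Counter(), Counter()
    for entity in entities:
        key = entity.strip().lower()  # normalize case and surrounding whitespace
        if key in NEIGHBORHOOD_TERMS:
            neighborhood[key] += 1
        elif key in HOME_FEATURE_TERMS:
            home_features[key] += 1
        else:
            other[key] += 1
    return neighborhood, home_features, other
```

Calling `most_common(n)` on the first two counters would give the top words for Entities Chart 1 and Entities Chart 2 respectively, while the third counter shows which entities still need a category.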
During the analysis, some interesting findings emerged. For example, O4W listings mentioned terms such as “Fannie Mae” and “HVAC”, whereas Inman Park listings mentioned “Coffee” and “Surround Sound”. From close readings of the two neighborhoods' listings, we found that listings in O4W focus more on the “convenience” of the location, basic amenities, and proximity to attractions such as Ponce City Market (PCM) and Midtown. Inman Park listings advertise a luxurious renovation style, top-of-the-line amenities and a posh in-town lifestyle.
Some quotes from O4W listings worth pointing out are:
“This is a Fannie Mae property.”
“Solid bones but full rehab needed – roof, HVAC, electric, plumbing, kitchen, baths, etc.”
“Newer HVAC and Water Heater!”
Some quotes from Inman Park listings worth pointing out are:
“Take out one non structural wall and turn the two kitchens into the kitchen of your dreams. You’ve admired these Victorian icons from afar.”
“…Award winning schools.”
“All new bedroom level with new marble baths. Top of the line fixtures.”