Sample Presentation

My team made this presentation in 2012 to suggest improvements to ticket sales and service quality at Emory’s Dobbs University Center (DUC). We interviewed DUC employees, customers, and student workers to develop a set of solutions for improving service quality and operational efficiency.

[Slides 01 through 31 of the presentation]

Using Natural Language Processing Analysis on Zillow.com Listings

This small design project analyzes the language used in Zillow.com listings in two Atlanta neighborhoods: Old Fourth Ward (O4W) and Inman Park. Old Fourth Ward, a neighborhood on the east side of Atlanta, is no stranger to gentrification. It stretches from Piedmont Avenue and Downtown Atlanta on the west to the BeltLine and the Poncey-Highland and Inman Park areas on the east. Boulevard, a main road running through the Old Fourth Ward, has become a notorious hub for drug activity, prostitution, and other crime, as well as homelessness. Boulevard is also known for having the highest concentration of Section 8 housing in the Southeast.

Inman Park is also a neighborhood on the east side of Atlanta, and one of the most desirable intown neighborhoods in the city. It is a gathering place for diverse cultures, art galleries, and savory cuisine. The neighborhood has developed a strong sense of community over the years, welcoming new businesses and residents.

We thought it would be interesting to compare these two distinctly different but historically rich neighborhoods. First, we built a custom scraper in Python that extracts information directly from Zillow.com’s listings. We avoided Zillow’s APIs because the content they returned did not include textual descriptions. For each listing, we collected the price, square footage, number of beds and baths, property description, and listing type (“For Rent”, “For Sale”, “For Sale by Owner”, “Make Me Move”, and “Sold”).

At a higher level, we wanted to analyze what people care about when writing listings in the two neighborhoods. In particular, we looked for the top words people use to advertise properties. We hypothesized that Inman Park listings contain words that imply higher-end, more luxurious features, whereas Old Fourth Ward listings imply impoverishment and gentrification. To test this, we used the AYLIEN Text Analysis API to analyze the listing descriptions we collected. AYLIEN is a package for natural language processing, information retrieval, and machine learning that extracts insights from textual content. We performed “Entity Extraction” and “Concept Extraction” on the descriptions. Entities are critical words that appear in the descriptions, determined using the TF-IDF algorithm. Concepts, however, are themes determined using an LDA topic modeling algorithm.
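To make the idea of ranking “top words” concrete, here is a minimal sketch of TF-IDF scoring in plain Python. It is not AYLIEN’s implementation and not the code used in this project; the tokenizer, the function names, and the three sample descriptions (loosely adapted from quotes in this post) are assumptions for illustration only.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a description and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def top_tfidf_terms(descriptions, k=3):
    """Return the k highest-scoring TF-IDF terms for each description."""
    docs = [tokenize(d) for d in descriptions]
    n_docs = len(docs)
    # Document frequency: number of descriptions containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        if not doc:
            results.append([])
            continue
        counts = Counter(doc)
        # Term frequency times inverse document frequency.
        scores = {
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:k])
    return results

# Hypothetical sample descriptions, for illustration only.
sample_listings = [
    "Newer HVAC and water heater",
    "Full rehab needed: roof, HVAC, electric, plumbing",
    "New marble baths with top of the line fixtures",
]
for terms in top_tfidf_terms(sample_listings):
    print(terms)
```

A word that appears in every description gets a score of zero, which is why this kind of weighting surfaces words that set one listing apart from the others.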

We extracted 449 listings for Inman Park and 801 listings for O4W from Zillow.com. The “Entity Extraction” method in AYLIEN extracts entities such as people, places, organizations, and products. These entities provide important identifiers for the textual data. The “Concept Extraction” method disambiguates named entities more precisely, which lets us find the abstract or generic idea behind particular entities. Extracting entities and concepts from these listings allowed us to delve deeper into the text and perform a cross-comparison of the two neighborhoods.

We plotted three comparison charts for each neighborhood. The first pair plots the top concepts. Since AYLIEN generated more top entities than concepts, we split the entity list into two categories: neighborhood-related words (Entities Chart 1) and home-feature-related words (Entities Chart 2). Entities Chart 1 plots the top words that refer to the surrounding area, such as “Ponce City Market” and “Midtown”. Entities Chart 2 contains words that describe home features, such as “granite”, “HVAC”, and “countertops”.

During the analysis, some interesting findings emerged. For example, O4W listings mentioned terms such as “Fannie Mae” and “HVAC”, whereas Inman Park listings mentioned “Coffee” and “Surround Sound”. From close readings of the two sets of listings, we found that O4W listings focus more on the “convenience” of the location, basic amenities, and proximity to attractions such as Ponce City Market (PCM) and Midtown. Inman Park listings advertise luxurious renovation styles, top-of-the-line amenities, and a posh in-town lifestyle.

Some quotes from O4W listings worth pointing out are:

“This is a Fannie Mae property.”
“Solid bones but full rehab needed – roof, HVAC, electric, plumbing, kitchen, baths, etc.”
“Newer HVAC and Water Heater!”

Some quotes from Inman Park listings worth pointing out are:

“Take out one non structural wall and turn the two kitchens into the kitchen of your dreams. You’ve admired these Victorian icons from afar.”
“…Award winning schools.”
“All new bedroom level with new marble baths. Top of the line fixtures.”

House Feature Entities - Inman

House Feature Entities - O4W

Neighborhood Entities - Inman

Neighborhood Entities - O4W

Top Concepts - Inman

Top Concepts - O4W

Japanese Kamishibai “Paper Drama”

I made this paper drama video for my Japanese class when I studied at Emory. Kamishibai is a form of storytelling that originated in Japan to convey stories with moral lessons to a mostly illiterate audience. Kamishibai visuals resemble the frames of a movie.

I wrote and narrated the script for the Kamishibai, and also created all the graphics. The story is an ancient myth about moon worship during the traditional Chinese Mid-Autumn Festival, featuring the lunar deity Chang’e, the Moon Goddess of Immortality. The script is as follows:

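Long, long ago, the children of the God of Heaven were suns. Once, ten suns were in the sky at the same time. It was terribly hot, no rain fell at all, and the land and the sea dried up. There was a man named Gei (Hou Yi). Gei took up his bow and divine arrows and shot down nine of the suns in one go.
Because Gei became famous, the people revered him. After that, he married Chang’e. Chang’e was a beautiful, kind, and intelligent woman.
One day, while out hunting, Gei met an old man. This man was a god. He gave Gei an elixir of eternal youth and long life. Whoever drank this elixir could rise to heaven and become a god. But Gei did not want to part from his wife, so he went home and gave the elixir to Chang’e.
But there was a wicked man named Peng Meng (蓬蒙). Peng Meng planned to steal the elixir, drink it himself, and become a god.
On the 15th day of the 8th month of that year, Gei was away from home. Before nightfall, Peng Meng came to Chang’e’s room and pressed her to hand over the elixir. With no other choice, Chang’e drank all of it. Then her body suddenly became light, and she flew out of the window and soared straight up, high into the sky. Because her feelings for Gei were strong, she alighted on the moon, the place closest to the earth.
Gei came home and could not find Chang’e. Then he learned what had happened. He hurried outside and looked at the moon, and it looked rounder and brighter than usual.
Every year on the night of the 15th day of the 8th month, the moon was always round and bright. On this day Gei would set out a table with plenty of fruit and think of his wife. Ever since, every year on the 15th day of the 8th month, people gather with their families to gaze at the moon and eat sweets and fruit. This is the legend of the Mid-Autumn Festival.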

Amazon Review: Visualization Design on Document Collections (Micro Design Project)

This micro design project provides further experience in analyzing and understanding multivariate data sets. Its particular focus is a data set rich in textual data: a document collection consisting of product reviews of a Samsung TV from amazon.com.

Snapshot of the document collection

The goal is to design a visualization of the data set that helps a person learn about the television and understand the issues that were identified earlier. The design addresses concepts including visualizing aspects of the data set that would be difficult to extract from simple search queries.

amazon review mockup-01

amazon review mockup-02

amazon review mockup-03

When users first approach the visualization, they see two types of product overview at the top of the page. The average star rating of the product is represented as a star-shaped progress bar. Below the progress bar, a bar chart shows the individual ratings in the 5-star rating system, with each square/pixel of a bar representing one review. When the user hovers over a square, the details of that review appear in a pop-up speech bubble. To the right of the bar chart are several donut charts that show the overall rating of the product’s different features, and the proximity of the keywords to other keywords in the reviews.

For example, when many people give low ratings because of bad delivery, the system recognizes keywords related to the concept “delivery”, such as “UPS”, “deliver”, and “package”. The system extracts these keywords and looks at the sentiment of nearby words that suggest service quality, like “bad”, “disappointed”, and “fast”. All these relationships are factored into an overall rating score for the “delivery” feature and represented on the donut chart. The five shades of the donut chart represent the five star ratings; the bigger the slice, the more reviews there are for that star.
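The keyword-matching part of this idea can be sketched in a few lines of Python. This is a simplified illustration, not a finished design: it only counts star ratings among reviews that mention a feature keyword and leaves out the sentiment weighting described above. The keyword list, function names, and sample reviews are all hypothetical.

```python
from collections import Counter

# Hypothetical keyword list for the "delivery" feature.
FEATURE_KEYWORDS = {
    "delivery": {"ups", "deliver", "delivered", "delivery", "package"},
}

def feature_star_distribution(reviews, feature):
    """Count reviews per star rating among reviews that mention the feature."""
    keywords = FEATURE_KEYWORDS[feature]
    counts = Counter()
    for stars, text in reviews:
        # Normalize words by stripping punctuation and lowercasing.
        words = {w.strip(".,!?\"'").lower() for w in text.split()}
        if words & keywords:
            counts[stars] += 1
    # One entry per star, so each donut slice has a value (possibly zero).
    return {star: counts.get(star, 0) for star in range(1, 6)}

def average_rating(distribution):
    """Average star rating for a distribution, or None if it is empty."""
    total = sum(distribution.values())
    if total == 0:
        return None
    return sum(star * n for star, n in distribution.items()) / total

# Invented sample reviews as (stars, text) pairs, for illustration only.
sample_reviews = [
    (1, "UPS left the package in the rain."),
    (2, "Bad delivery, box was damaged."),
    (5, "Fast delivery and great picture!"),
    (4, "Great picture quality."),
]
slices = feature_star_distribution(sample_reviews, "delivery")
print(slices, average_rating(slices))
```

Each value in `slices` would set the size of one shade in the donut chart for that feature.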

Below the overview, the user can hover over individual tabs and view the top reviews for each star rating. The “thumb-up” and “thumb-down” scores are users’ ratings of the reviews, depending on whether they found the reviews helpful. Below the review section is a keyword cloud generated from all the reviews under that star rating. When the user clicks on a keyword, “Samsung” for example, the review section narrows down to display only reviews that contain “Samsung”. The section below that is a line chart showing the distribution of ratings throughout the entire review period. For instance, the user can examine in which month people gave the most 5-star ratings, and find out why. Maybe there was a big price drop or an Amazon promotion at that time, as suggested by the keyword cloud and the frequency line chart. The line chart readjusts according to which star rating tab the user clicks on at the top. The line chart and review section also readjust every time the user clicks on a keyword in the keyword cloud. For example, a user who wants to see all the 5-star reviews that contain the keyword “price” clicks on the “5 stars” tab and then on “price” in the keyword cloud, and finds that this word is mentioned most in November. The user can narrow the reviews down further to just November by clicking on the “November” node in the line chart. Last but not least, the user can compare similar products by choosing one of the following: keyword match, overall rating, similar features, and review dates.
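The linked filtering described above (star tab, then keyword, then month) could be backed by logic along these lines. This is a rough sketch under assumed data structures: the review dictionaries, field names, function names, and sample data are all invented for illustration.

```python
from collections import Counter
from datetime import date

def filter_reviews(reviews, stars=None, keyword=None):
    """Keep reviews matching the selected star tab and clicked keyword."""
    selected = []
    for review in reviews:
        if stars is not None and review["stars"] != stars:
            continue
        # Simple case-insensitive substring match on the review text.
        if keyword is not None and keyword.lower() not in review["text"].lower():
            continue
        selected.append(review)
    return selected

def monthly_counts(reviews):
    """Count reviews per calendar month; these are the line chart's points."""
    return Counter(r["date"].strftime("%Y-%m") for r in reviews)

# Invented sample data, for illustration only.
sample = [
    {"stars": 5, "date": date(2013, 11, 29), "text": "Great price on Black Friday."},
    {"stars": 5, "date": date(2013, 11, 30), "text": "Unbeatable price."},
    {"stars": 5, "date": date(2013, 12, 5), "text": "Sharp picture."},
    {"stars": 1, "date": date(2013, 11, 12), "text": "Price dropped right after I bought it."},
]
five_star_price = filter_reviews(sample, stars=5, keyword="price")
print(monthly_counts(five_star_price).most_common(1))
```

Recomputing `monthly_counts` on the filtered list after every click is what would make the line chart readjust together with the review section.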

The Clubs That Connect The World Cup

world cup figure 1

Every four years, the 32 best national soccer teams come together to compete for one ultimate championship in the FIFA World Cup. Although the World Cup is the epitome of competition, teamwork, and patriotism in the world of sports, the players themselves are no strangers to each other. For most of the year, they build camaraderie in professional leagues all over the world. The series of four informative, interactive visualizations titled “The Clubs that Connect the World Cup” shows the connections between players, national teams, and professional leagues (clubs) in the 2014 FIFA World Cup in Brazil.

The first chord graph is an overview of all 32 teams and their affiliation with the clubs that have players on at least 2 national teams. The circles on the outer rim represent the 32 national teams. The circles in the middle of the graph, color-coded by geographic region, stand for different clubs. The size of each circle represents the number of “connections” that club has with the 32 teams. The closer the circle is to the center of the graph, the more connections the club shares with the world. When hovered over, a club circle singles out its first-degree links while the rest of the graph fades into the background. For example, hovering over Manchester United shows that it has players on 9 national teams. Similarly, one can hover over a national team’s circle to show the team’s affiliation with clubs and other countries. The graph also contains small grey dots congregating around the national team circles. These dots represent individual players. The graph even replaces the dots with profile pictures for some of the star players. If users hover over Balotelli’s picture, for instance, they can see Balotelli’s affiliation with the clubs that have continuing connections with other players, clubs, and national teams.
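The “connections” behind the circle sizes can be thought of as a simple grouping of players by club. The sketch below is my own guess at that logic, not the visualization’s actual code; the record fields and function name are assumptions, and the clubs, teams, and players are placeholders rather than real rosters.

```python
from collections import defaultdict

def club_connections(players, min_teams=2):
    """Map each club to the national teams its players represent.

    Only clubs linked to at least `min_teams` national teams are kept,
    mirroring the overview graph's "at least 2 national teams" rule.
    """
    teams_by_club = defaultdict(set)
    for player in players:
        teams_by_club[player["club"]].add(player["national_team"])
    return {
        club: teams
        for club, teams in teams_by_club.items()
        if len(teams) >= min_teams
    }

# Placeholder records, not real rosters.
players = [
    {"name": "Player 1", "club": "Club A", "national_team": "Team X"},
    {"name": "Player 2", "club": "Club A", "national_team": "Team Y"},
    {"name": "Player 3", "club": "Club B", "national_team": "Team X"},
]
connected = club_connections(players)
# Circle size would scale with the number of linked national teams.
sizes = {club: len(teams) for club, teams in connected.items()}
print(sizes)
```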


world cup figure 2

The second graph shows the connections between Brazil and Argentina at a granular level. The hovering interaction works the same way as in the overall graph. On a personal note, it would have been more informative if this graph had shown the connections between Germany and Argentina instead, since those two teams played in the final.

world cup figure 3

world cup figure 4

The third graph displays the connections among European leagues, World Cup players, and the 32 national teams. The circles are still quite dense compared to the first chord graph, since the European leagues are predominant in the world. The fourth chord graph displays the connections among players, countries, and clubs in non-European leagues. Not surprisingly, this graph is much less dense than its European counterpart.

world cup figure 5

Last but not least, at the very bottom of the website, the user can search for a particular team using the filters provided. The legend here is consistent with the first chord graph, and the visualization changes according to the user’s inputs.

Overall, the visualization effectively portrays the non-linear, multi-dimensional connections among three variables (nationalities, individual players, and professional leagues) in an all-inclusive, consistent fashion. The presentation resonates with the shape of a soccer ball. However, given the amount of data underneath the visualization, the real-time interaction is quite glitchy. Users sometimes need to wait 2-3 seconds before seeing changes on the graph, and the hovering interaction does not function well. Some interactions are not straightforward: I was not sure whether I should click or hover over a circle to bring up the information, because there was a lag between my trackpad gestures and the visualization. It is also not obvious, especially to soccer novices, that the grey dots represent individual players and the colored circles represent clubs. For example, only the last names of players are displayed, which can easily be mistaken for club names, and the different kinds of names are not distinguished by font color or size. The legend should therefore have labeled every element on the chord graph.

Spotify: The Marriage of Music Sharing and Social Media Platforms

Spotify is a commercial music sharing platform that provides content restricted by digital rights management (DRM). Spotify encourages music lovers around the world to explore their passion by giving them access to an encyclopedic library of music.

Figure 1: Spotify Premium desktop interface

Figure 2: Spotify iOS interface

Like many other systems with recommendation algorithms, Spotify uses a robust collaborative filtering method based on users’ behavior to suggest playlists, artists, genres, and songs, as if the computer were “aware” of the user. The user’s account is also synchronized and can be accessed on different devices. As a social platform, Spotify has a Twitter-like “Follow” tab that allows the user to subscribe to influential and knowledgeable users to discover new music. For example, the user can follow Taylor Swift and her playlists to discover the artist’s personal taste. Spotify brings artists and fans closer together through music as a form of shared consciousness.
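For readers unfamiliar with collaborative filtering, here is a toy user-based version in Python. It is only a sketch of the general technique and makes no claim about how Spotify’s system actually works; the Jaccard similarity measure, the function names, and the listening histories are all assumptions chosen for brevity.

```python
def jaccard(a, b):
    """Similarity of two track sets: shared tracks over all tracks."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, listening_history, k=2):
    """Suggest up to k tracks the target has not heard yet.

    Each candidate track is scored by the summed similarity of the
    other users who listened to it.
    """
    target_tracks = listening_history[target]
    scores = {}
    for user, tracks in listening_history.items():
        if user == target:
            continue
        similarity = jaccard(target_tracks, tracks)
        if similarity == 0:
            continue  # Users with nothing in common contribute no suggestions.
        for track in tracks - target_tracks:
            scores[track] = scores.get(track, 0.0) + similarity
    # Highest score first, ties broken alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]

# Made-up listening histories, for illustration only.
history = {
    "ana": {"track_a", "track_b", "track_c"},
    "ben": {"track_a", "track_b", "track_d"},
    "cam": {"track_e"},
}
print(recommend("ana", history))
```

Here “ana” and “ben” share two tracks, so the one track only “ben” has heard becomes the suggestion for “ana”, while “cam”, who shares nothing with her, has no influence.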

Figure 3: “Follow” function in Spotify

In addition to the traditional procedural affordance of music recommendation, Spotify has a “Browse” tab that distinguishes it from competitors. The “Browse” tab allows the user to choose from a host of pre-defined playlists depending on the user’s context and mood, which highlights Spotify’s role as a social accompaniment in everyday life. Spotify also partners with Facebook to maximize this social companionship.

Figure 4: Facebook displays what songs the user has listened to recently on the newsfeed

Figure 5: “Browse” function in Spotify with different genres of music

Wacom Tablet: Interactive Pen-and-Touchpad Design that Allows Direct Manipulation in a Traditional Way

Wacom’s pressure-sensitive pen and tablet are designed to let users bring their creativity seamlessly into the digital world by mimicking the use of conventional tools.

Compared to traditional point-and-click navigation, Wacom’s pen-and-touchpad design allows users to digitize their work the old-fashioned way, by hand. The pen serves as a transparent and efficient pointing device and as the threshold object of the system. Through the pen, information in the atomic world is translated into pixel information in the digital world. The user can adjust lighting or tonal value by controlling the opacity of an effect based on how hard they press the pen into the tablet. The tablet has a binding dimension that corresponds to the screen’s working area.
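The two mappings mentioned here, tablet area to screen area and pen pressure to opacity, can be illustrated with a short sketch. It assumes simple linear mappings and a normalized pressure range, which may differ from what Wacom’s drivers or any given drawing application really do; the function names and dimensions are hypothetical.

```python
def tablet_to_screen(x, y, tablet_size, screen_size):
    """Map a pen position on the tablet to the matching screen position.

    Assumes the tablet's active area maps linearly onto the whole screen.
    """
    tablet_w, tablet_h = tablet_size
    screen_w, screen_h = screen_size
    return (x / tablet_w * screen_w, y / tablet_h * screen_h)

def pressure_to_opacity(pressure, max_pressure=1.0):
    """Map pen pressure to an opacity between 0.0 and 1.0.

    Assumes a linear response; pressure is clamped to the valid range.
    """
    level = min(max(pressure, 0.0), max_pressure)
    return level / max_pressure

# Hypothetical tablet and screen dimensions (arbitrary units and pixels).
print(tablet_to_screen(100, 50, (200, 100), (1920, 1080)))
print(pressure_to_opacity(0.5))
```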

wacom figure 1

Wacom offers different models with functionality to support work ranging from simple tasks like doodling to complex tasks such as film editing. Wacom tablets elicit encyclopedic expectations by providing a large repertoire of effects. For Photoshop’s “Pen Tool” alone, users can choose various combinations of effects such as the type of artist brush, brush stroke, spread, and size. Wacom also extends the use of fingers by allowing users to interact with the touchpad to point, drag, scroll, flip, zoom, and rotate files on the computer.

wacom figure 2

Immigration Explorer: Interactive Map

Immigration Explorer is an interactive map that shows where various foreign-born groups settled across the United States from the 1880s to the 2000s.

immigration explorer figure 1

The tool presents information with multiple data retrieval methods, levels of granularity, and configurations. On the linear level, the tool uses a unisequential timeline to represent chronological and spatial changes in immigrants’ settlement patterns. A dropdown filter allows the user to select a specific ethnic group, and bubbles of different sizes represent the population census data proportionally.
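One common way to size bubbles proportionally is to scale their area, not their radius, with the underlying value. The sketch below shows that approach; whether Immigration Explorer itself scales by area is an assumption on my part, and the function name and default maximum radius are hypothetical.

```python
import math

def bubble_radius(population, max_population, max_radius=40.0):
    """Radius for a bubble whose area is proportional to the population.

    The largest population gets `max_radius`; non-positive inputs give 0.
    """
    if max_population <= 0 or population <= 0:
        return 0.0
    # Area grows with radius squared, so take the square root of the ratio.
    return max_radius * math.sqrt(population / max_population)

# A quarter of the largest population yields half the largest radius.
print(bubble_radius(25, 100))
```

Scaling the radius directly would make large groups look disproportionately big, since a bubble with twice the radius covers four times the area.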

immigration explorer figure 2
immigration explorer figure 3

Users can click on individual bubbles to retrieve more textual information about the population of an ethnic group and the total population of a county. If no group is chosen, the map displays data for all countries of origin in different colors. Darker shades indicate a larger ethnic presence. The map also allows viewers to zoom in on a particular region of the U.S. However, there are some missing data points, such as the “Korean” category from the 1890s to the 1970s, which cause design inconsistency. Nor does the segmentation break down the Middle East region at the same level of granularity as other regions.

immigration explorer figure 4
immigration explorer figure 5

“The Depth of the Problem” Infographic: The Computer as a Spatial Medium

“The Depth of the Problem” infographic depicts the detection of flight MH370’s pinger 15,000 feet below sea level. It efficiently exploits the spatial affordance of infographics by combining visual design concepts and key informational elements. The graph mainly portrays the depth of the debris and its relationship to viewers’ existing perception of depth. However, the infographic could better explain the difficulties of the salvage by incorporating other key information about the incident, such as underwater lighting and water pressure.

MH370TheDepthoftheProblem_5346c53156127

The infographic refrains from using too many decorative elements and draws viewers’ attention straight to the statistical information in a unified presentation that includes the depth of the water, the heights of different structures, the depths at which different sea creatures live, historical disasters of a similar kind, and so on. It dramatizes the contrast between the depths people are familiar with and the unfathomable ocean beyond imagination. The graph also combines quantitative and qualitative elements by using the silhouettes of buildings as histogram bars and other visual elements as depth markers.

The graph successfully translates “depth”, the most difficult obstacle in the search for the wreckage, visually and spatially. However, the author could also add visual elements to represent other obstacles in the salvage without making the graph overwhelming. For instance, the author could realistically represent the pitch darkness of the underwater world by gradually darkening the background starting at the depth where light begins to diminish. The author could also represent the drastic increase in water pressure more vividly than by plainly labeling its magnitudes.