Electric Symbols: Internet Words and Culture
Abstract
The famous Sapir-Whorf Hypothesis posits a linguistic determinism, arguing that language plays a central role in the creation of a worldview. In the sense that language is a product of words, one can say that a culture's worldview is affected and influenced by the words of its particular language. Words both create and communicate worldviews. The greatest potential in history for the observation and analysis of words exists on the Internet. Indeed, the Internet can be considered history's greatest observatory and laboratory of words.
Watching Words
There is a relationship between Internet words and leading events, people and products of cultures. More specifically, there is a relationship between Internet words in specific cultures and leading events and things in those cultures. While our main interest throughout this report is the relationship between leading American Internet words and leading events, people and products of American culture, the techniques and theories we discuss have cross-cultural applications. Whatever culture one examines, the relationship of Internet words to culture has the potential to offer startling new insights into that culture. However, these insights will come only when the Internet is viewed as a tool for observation rather than economic production.
Not long ago, the Internet began as an interactive forum where people from all over the world could meet and speak to each other despite the barriers of land, oceans and culture. Then came major corporations that tried to erect a central stage to make the Internet a platform to preach their own messages. Despite the early interactive history of the Internet, its superstructure was increasingly dominated by major corporations and their business paradigms of one-way communication based around broadcasting models. In effect, the Internet was used as an economic tool to promote business rather than as an informational resource to understand culture. Economic goals dominated cultural insight.
The dot.com crash caused the disappearance of many Internet companies and forced a re-evaluation of business models. Yet the crash had little effect on the Internet's interactive social community and its untapped potential for understanding culture. Today, the Internet offers a vast unexplored territory of social and cultural insight ready to be mapped and mined by a new generation of Internet explorers. Members of this new generation will be observers of information more than producers of it.
Their efforts will help reduce information rather than produce information. As Web inventor Tim Berners-Lee notes in Weaving The Web, "The Web is more a social creation than a technical one."
Back To The Future of Linguistics
In Language and the Internet, David Crystal, one of the world's leading linguists, observes that "as the Internet comes increasingly to be viewed from a social perspective, the role of language becomes central." Crystal notes that "what is immediately obvious when engaging in any Internet function is its linguistic character. If the Internet is a revolution, therefore, it is likely to be a linguistic revolution." Much of the Internet linguistics revolution might be viewed in the larger context of general linguistic history. In a sense, it involves a journey back to the nineteenth century in order to journey forward into the future.
During the nineteenth century, the fragmentary approach to reality prevented scholars from getting beyond the immediate facts in matters of language. During this time, language was seen as mechanical and atomistic, a conception of language which was reflected in the historical studies of comparative philologists. The diachronic study of language, or study of the structure of language over a period of time, prevailed over the synchronic study of language, or study of language at a moment in time. During the early years of the twentieth century, the great Swiss linguist Ferdinand de Saussure changed this nineteenth-century conception of language. As a result of his influential work, particularly his Course in General Linguistics, the atomistic and diachronic methods gave way to the development of a synchronic perspective of language and a change from the past history of language to the present structure of language. In a sense, the Internet allows a return to the study of the pre-Saussure atomistic elements of language in words.
At the same time, it also applies the synchronic method Saussure developed to study overall language structure to words. The Internet makes it possible to return to the earlier atomistic parts of language with the powerful new synchronic ability to rank and study these word atoms of language as never before. Both the diachronic and synchronic methods of linguistic analysis live on in studying words on the Internet. The diachronic method continues in studying not the history of languages but rather the histories and cycles of words. And, the synchronic method continues onward in studying word ascendance (what we term the "rise" index) and word ranking depth. Whatever the case, it is becoming increasingly obvious that the ability to index, search for and rank words on the Internet is giving new meaning to those word atoms of language we have taken for granted for so long. In a postmodern world of increasing symbols, the electric symbols of words are becoming our greatest symbols.
Search Function
For members of the new generation of Internet observers, there are three major benefits of being associated with the search function on the Internet. The first involves the great quantity of traffic passing through Internet search functions. The second involves the ability to quantify this huge amount of traffic. The third involves the ability to segment searches from various communities.
Quantity
First, the quantity benefit involves the strategic position of search engines at the leading "doors" to cyberspace. Search engines are the virtual equivalents of real world airports, with vast traffic passing through them each day on its way to particular destinations. As Google CEO Eric Schmidt reminds us in the August 2001 American Spectator, "Search is the number-one thing people do on the Web today." The number two thing is e-mail. Most of the top Internet search engines generate more than ten million searches each day. The leading search engine Google generates 150 million searches a day. Even the smallest of these virtual "airports" represents far more traffic than even the greatest real world airports. While the search engine traffic is immense, one needs to add the caveat that search still involves only Internet users and not the entire "real world" population. In this sense, search engines are subject to the current overall demographics of Internet users. As Internet use becomes more widespread, these particular demographics will diminish as they "bleed" into the general population.
Quantification
Second, there is the quantification benefit of search engines and their association with one of the most trackable actions of Internet users. Words keyed into search engines are easily ranked for specific periods of time. Histories of word and phrase ascendance and descendance are part of this quantification process.
As we suggest, the dynamics of word movement might tell Internet observers as much or even more than the rankings of words. With search engine companies, the greatest quantity of traffic on the Internet is matched with the greatest ability for quantification.
Segmentation
Certainly Internet search functions have value as barometers of worldwide Internet activity. However, much greater value resides in the ability to segment searches into various real world communities such as nations and corporations. The ability for basic segmentation into nations (by identifying origination points for searches) is already possible on most search engines. Further segmentation into various communities will increase the application and practicality of search functions to more real world (rather than virtual world) scenarios. Segmentation by nations offers a powerful new tool for analysis in a number of disciplines. While the techniques might prove to have high transferability between nations, our interest currently is mostly in segmentation of searches by those originating in America.
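The quantification and segmentation benefits described above can be illustrated with a minimal sketch. It assumes a hypothetical query log in which each entry records a day, a country code for the origination point and the query text. The log entries, the field layout and the function name `rank_queries` are invented for illustration and do not describe how any search engine actually stores or ranks its traffic.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical query log: (day, country code of origination point, query text).
# The entries and counts are invented for illustration only.
QUERY_LOG = [
    (date(2002, 5, 6), "US", "star wars"),
    (date(2002, 5, 6), "US", "kentucky derby"),
    (date(2002, 5, 7), "FR", "le pen"),
    (date(2002, 5, 7), "US", "Star Wars"),
    (date(2002, 5, 8), "FR", "loft story"),
    (date(2002, 5, 8), "FR", "le pen"),
    (date(2002, 5, 20), "US", "star wars"),  # falls outside the period ranked below
]


def rank_queries(log, start, end):
    """Rank queries per country for start <= day <= end, most frequent first."""
    counts = defaultdict(Counter)
    for day, country, query in log:
        if start <= day <= end:
            # Normalize case and whitespace so variants count as one word.
            counts[country][query.strip().lower()] += 1
    return {country: counter.most_common() for country, counter in counts.items()}


rankings = rank_queries(QUERY_LOG, date(2002, 5, 6), date(2002, 5, 12))
for country, ranked in sorted(rankings.items()):
    print(country, ranked)
```

Keeping one counter per origination point is what makes the later cross-community comparisons possible: the same ranking step runs unchanged whether the segment is a nation, a corporation or a town.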
Search Engines
Search functions on the Internet are embodied in various search engines. While the search function in general draws large quantities of quantifiable traffic, some search engines draw far more traffic than others. Mega-search engines like Google offer the best positions for observation of Internet activity. They draw more traffic mainly because of the quality of their results and their commitment to the traffic passing through them rather than to the destination Web sites of that traffic. These differences can be seen by segmenting search engine types into portals and directories, advertising-sponsored search (pay for performance) and pure search engines.
Portals & Directories
Portals and directories such as Yahoo! rely on human editors to scour the Web and appropriately categorize pages and their associated links. Portal editors are much like librarians. One problem for portals is that directories take tremendous effort to maintain. Finding new links, updating old ones, and maintaining the database technology add to a portal's administrative burden and operating costs.
Advertising
The leading pay-for-performance search network, Overture (http://www.overture.com), enables Web sites to enhance their revenues and user functionality by offering Overture's search results to their users. More than 95 percent of Overture's paid introductions are generated through its thousands of affiliate partners. Overture's search results reach 85 percent of active U.S. Internet users. Overture's search results are distributed to thousands of sites across the Internet, including Yahoo!, MSN, Netscape, AltaVista, Lycos, HotBot and many others. Advertisers pay Overture the amount of their bid only when a consumer clicks on their listing, providing them with one of the most cost-effective ways to drive targeted customer leads to their sites.
Pure Search
Search engines pay special attention to metadata in the pages that they spider through and add to their index databases. In the simplest case, this metadata might take the form of content in <Meta> tags. If the new librarians of the Internet run portals, mathematicians run search engines. More precisely, they are run by the formulas of mathematicians. Many search engines return results based on how often keywords appear in a Web site. More advanced search engines, like Google, rely on more subtle information. For example, Google evaluates not only the occurrence of key words on a page, but also the number of outside links to the page itself, as a measure of importance or popularity.
Google developed an advanced search technology that involves a series of simultaneous calculations typically occurring in under half a second - without any human intervention. At the heart of this technology are PageRank™ and hypertext-matching analysis, developed by Larry Page and Sergey Brin. Google's search architecture is also scalable, which enables it to continue to index the Internet as it expands. PageRank performs an objective measurement of the importance of Web pages and is calculated by solving an equation of 500 million variables and more than two billion terms. Google does not simply count links; instead PageRank uses the vast link structure of the Web as an organizational tool. In essence, Google interprets a link from Page A to Page B as a "vote" by Page A for Page B. Google assesses a page's importance by the votes it receives. Google also analyzes the pages that cast the votes. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages important. Important, high-quality pages receive a higher PageRank and are ordered or ranked higher in the results. Google's technology uses the collective intelligence of the Web to determine a page's importance.
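The "voting" idea described above can be made concrete with a toy sketch. What follows is the commonly published power-iteration form of PageRank applied to a four-page link graph, not Google's production system. The damping factor of 0.85, the iteration count, the handling of pages with no outgoing links and the example graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. links maps each page to the pages it links to.

    Returns {page: score}; the scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to,
                # so votes from high-scoring pages weigh more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: A, C and D all "vote" for B; B votes for C.
LINKS = {"A": ["B"], "B": ["C"], "C": ["B"], "D": ["B", "C"]}
scores = pagerank(LINKS)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this graph B ranks first because it receives the most votes, and C ranks second largely because its one strong vote comes from B, which is the "important pages cast heavier votes" effect described in the text.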
Google does not use editors or its own employees to judge a page's importance. Finally, unlike conventional search engines, Google is hypertext-based. It analyzes all the content on each Web page and factors in fonts, subdivisions, and the precise positions of all terms on the page. Google also analyzes the content of neighboring Web pages. All of this data enables Google to return results that are more relevant to user queries. As a result, millions of users worldwide look to Google as the fastest, easiest way to find exactly the information they're looking for on the Web the first time.
Search + Advertising
A new service from Google is AdWords Select™. Basically, it combines Google concepts with those of Overture, http://www.overture.com/ (formerly GoTo.com). An advertiser first chooses how they want to segment their market. (Currently, only large segmentations based on nations and languages are possible. However, someday perhaps we will be able to choose local areas such as states, counties, cities or metro areas.) Secondly, an advertiser chooses the key words specific to their business. This creates an ad group of words. When a Google user searches for these key words, an ad in a box appears to the right of the Google search results. Each ad group has a maximum cost-per-click (CPC) the advertiser is willing to pay. This and the click-through rate (CTR) determine where the ad will be shown. If the CTR is high, the advertiser pays less to stay in a top position. If it is the same, the advertiser pays only pennies more to stay in the top position. If the CTR is lower, the advertiser pays more for the top position. All of this is monitored by Google's AdWords Discounter, which automatically adjusts rates. For example, the words "furnished office space boston" yield a list of sponsored links on the right which go directly to the sites of the paying advertisers. The box positioned at the top of the right column list is based on the above formula.
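The interaction between maximum CPC and CTR can be sketched as follows. The text above says only that the two together determine placement; the specific scoring rule used here (maximum CPC multiplied by CTR), the discount rule (pay just enough to stay ahead of the next ad, plus one cent) and the advertiser names and numbers are assumptions made for illustration, not a description of Google's actual AdWords Discounter.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str       # hypothetical advertiser label
    max_cpc: float  # maximum cost-per-click the advertiser will pay, in dollars
    ctr: float      # observed click-through rate, between 0 and 1


def rank_ads(ads):
    """Order ads by max_cpc * ctr, highest first (assumed scoring rule)."""
    return sorted(ads, key=lambda ad: ad.max_cpc * ad.ctr, reverse=True)


def discounted_cpc(ad, next_ad, increment=0.01):
    """Lowest price keeping `ad` ahead of `next_ad`, capped at ad.max_cpc (assumed rule)."""
    needed = next_ad.max_cpc * next_ad.ctr / ad.ctr + increment
    return round(min(ad.max_cpc, needed), 2)


ADS = [
    Ad("office_a", max_cpc=1.00, ctr=0.02),
    Ad("office_b", max_cpc=0.60, ctr=0.05),
    Ad("office_c", max_cpc=0.80, ctr=0.01),
]

ordered = rank_ads(ADS)
print([ad.name for ad in ordered])
print(discounted_cpc(ordered[0], ordered[1]))
```

Under these assumptions the advertiser with the highest bid does not take the top position: office_b bids less than office_a but earns far more clicks, so it ranks first and pays less than its maximum, which matches the behavior described in the text.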
Unlike Overture, though, Google does not place these results among its regular hypertext links on the left. Rather, it puts them into their own column on the right. A Google user can therefore easily see which links are sponsored and which ones result from Google's internal math formulas. Pure search resides side by side with paid-for search.
The Semantic Web
Yet as sophisticated as leading search technology has become, the ultimate quality of search on the Web is a two-way street. Search engines can get more and more sophisticated at locating information. But Web pages also need to get more sophisticated at helping search engines locate information. Currently, the methods for providing information lag far behind the methods for searching for information. This is because there is little standardization in how information is provided on the Web. As Uche Ogbuji writes in the June 2002 issue of New Architect, "The Semantic Web is a vision of a next-generation network that lets content publishers provide notations designed to express a crude 'meaning' of the page, instead of merely dumping arbitrary text onto a page. Autonomous agent software can then use this information to organize and filter data to meet the user's needs." Semantic Web proponents are looking to XML and RDF to meet these new challenges.
Analysis
"And the big complaint we get isn't about speed - we're the fastest there is. What people want is even more information, from sources that are not currently searched. Research reports, news files, historical archives, university projects - they're out there, but not yet in publicly searchable sources. Getting that information indexed is a Herculean, but incredibly important job."
As one might suspect, a number of search engines post their word search rankings. Some attempt analysis of the relationship between their top words and events in culture. For the most part, though, their analysis focuses on the top ranked words, and analysis of their meaning involves a good deal of speculation. Some are niche products of large branded search engines, like The Lycos 50. The Lycos 50 (http://50.lycos.com/) attempts interpretations from the Lycos search engine, which receives about 12 million searches each day. It has strong filtering requirements, eliminating words such as those associated with corporations, pornography and computer technology. There are also a number of independent word ranking consulting services that combine rankings from a number of search engines. One of the oldest and best is Beyond Engineering's Word Spot (http://www.wordspot.com/), which customizes searches for words clustered around client products. For example, word clusters around the search word "MP3" offer a client insight into top words associated with products relating to MP3. The online report from Word Spot is at http://www.wordspot.com/samplecustom.html.
Google Zeitgeist™
The search engine Google has focused on the quantity of traffic and information rather than the quantification and analysis of it. As Google CEO Eric Schmidt notes, the task of indexing information is a "Herculean" task by itself. Google has had incredible success in this effort so far.
Among search engines, Google has become the largest in just a few years, processing over 150 million searches a day from its two billion Web-page database. While Google has focused on search services rather than interpretation of results, it has created a weekly and monthly "Google Zeitgeist" of the top ten ascending and descending words in Google rankings. It has also created an archive of this information that goes back to June 2001 at http://www.google.com/press/zeitgeist/archive.html.
Weekly Zeitgeist
For example, consider the Google Zeitgeist top gainers and losers for the week ending 13 May 2002. The list is arrived at by comparing the week ending 6 May with the week ending 13 May and identifying search queries that have risen or dropped by a significant percentage. Some of the top ascenders were #3 "Mother Day" for Mother's Day on 12 May, #4 "Star Wars" for the new Star Wars film which was opening and #6 "Dia de la Madre," the Spanish for Mother's Day. Some of the top decliners from the previous week were #1 "Cinco de Mayo," the recently passed Mexican holiday, #2 "Kentucky Derby," which was run on 5 May, #3 "Le Pen," the conservative French politician, #7 "Spiderman," which had opened the previous week and #9 "Kirsten Dunst," the female star of the film Spiderman. All top ten ascenders and descenders are represented in the chart below.
Table 1: Weekly Google Zeitgeist - 13 May 2002
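The week-over-week comparison behind such a list can be sketched in a few lines. The sketch computes the percentage change in query counts between two weeks and returns the largest gainers and decliners. The counts are invented, and the `min_count` threshold and the decision to skip queries with no searches in the earlier week are our assumptions; Google does not publish the exact rule it uses to decide what counts as a "significant percentage."

```python
def weekly_movers(previous, current, min_count=100, top_n=3):
    """Return (gainers, decliners) as lists of (query, percent_change)."""
    changes = []
    for query in set(previous) | set(current):
        before = previous.get(query, 0)
        after = current.get(query, 0)
        # Assumption: ignore rare queries and queries absent in the earlier
        # week, where a percentage change is noisy or undefined.
        if before == 0 or max(before, after) < min_count:
            continue
        changes.append((query, 100.0 * (after - before) / before))
    changes.sort(key=lambda item: item[1], reverse=True)
    gainers = [item for item in changes if item[1] > 0][:top_n]
    decliners = [item for item in reversed(changes) if item[1] < 0][:top_n]
    return gainers, decliners


# Hypothetical query counts for two consecutive weeks (invented numbers).
WEEK_ENDING_6_MAY = {"star wars": 4000, "cinco de mayo": 9000,
                     "kentucky derby": 6000, "weather": 50000,
                     "dia de la madre": 1000}
WEEK_ENDING_13_MAY = {"star wars": 12000, "cinco de mayo": 900,
                      "kentucky derby": 1200, "weather": 50500,
                      "dia de la madre": 2500}

gainers, decliners = weekly_movers(WEEK_ENDING_6_MAY, WEEK_ENDING_13_MAY)
print("Gainers:", gainers)
print("Decliners:", decliners)
```

Note that "weather" has by far the highest raw count in both weeks yet barely registers as a mover, which illustrates why word movement and word ranking are distinct measures.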
Notice the close connection between the weekly Zeitgeist and leading news of popular culture.
Monthly Zeitgeist
The weekly Google Zeitgeist focuses on the fast "revolving door" of popular culture and popular culture's increasingly short attention span. On the other hand, the monthly Google Zeitgeist casts a wider net, picking up larger trends or those things that gain attention for a month rather than a week. For example, some of the top ascenders for April 2002 reflect events and people with longer staying power than those in the weekly indexes. These were #3 "Linda Lovelace," the infamous pornography film star who died in April, #4 "Spiderman," the movie with the largest opening box office in history, #6 "Le Pen," the conservative French politician, #8 "Robert Blake," the film star accused of murdering his wife and #10 "Lisa Lopes," the singer who died in an accident. Some of the top descenders were events which had passed, such as #1 "Oscars," #2 "NCAA" and #7 the film "Ice Age." The full list of ascenders and descenders for April 2002 is reproduced below.
Table 2: Monthly Google Zeitgeist - April 2002
While the monthly Google Zeitgeist is more indicative of broader trends than the weekly Zeitgeist, in months with major news or events the monthly Zeitgeist might represent even broader cultural trends. For example, April 2002 had no big news events while September 2001 had one huge news event. The top ascenders and descenders for September 2001 are reproduced below. The ascendance of the September 2001 words on Google is fairly self-explanatory. It is interesting, though, that the word "Nostradamus" has a top ascendance ranking, possibly indicating that the event was of such proportions as to place it in the category of the prophetic. Top descending words are also fairly self-explanatory. Interestingly, the idea of travel is one of the largest decliners, as Americans decided to stay home after the terrible airline tragedy of September 11th.
Table 3: Monthly Google Zeitgeist - September 2001
Apart from top ascending and descending words on the monthly Google Zeitgeist, Google is also experimenting with other listings.
Table 4: Top Product Searches - April 2002
For example, for the month of April 2002, Google posted the top brands searched for, in the order listed in Table 4 above. Since advertising and publicity are tied to products, many of the positions in the table probably reflect advertising or publicity campaigns.
Annual Zeitgeists
The Google Zeitgeist (as well as the word rankings of other search engines) is relatively new. As the years pass, though, Internet rankings of top yearly words in particular cultures will develop. Exploring relationships between these top yearly words and leading cultural events offers much research potential. Long before the era of Internet word rankings, a number of scholars and organizations were tracking leading words in culture. For example, each issue of American Speech, published by the well-respected American Dialect Society, features leading new words in its "Among The New Words" (ATNW) section. The "ATNW" section has been a feature of the journal for most of the twentieth century. A fascinating book project by leading linguists David Barnhart and Allan Metcalf is America in So Many Words. It is an attempt to identify the leading words which have entered the American vocabulary on a yearly basis, starting in the year 1555 and ending in 1998.
Table 5: American Words added to Dictionaries in the 1990s
For example, leading words during the beginning years of American culture were "canoe" (1555), "skunk" (1588), "Indian" (1602) and "turkey" (1607). At the other end of American history, leading words for the 1990s make up the revealing list in Table 5 above. Lacking the ability of the great search engines like Google to rank word popularity, the authors have simply relied on newly invented words that became so prominent they were added to American dictionaries. Theoretically, if historical content analysis of American print media were possible on a massive scale, a particular type of ranking of leading words by year might be possible. Of course, if Internet search engines like Google had been around, we would have immediate rankings of words on a yearly basis. The exciting thing is that we have already developed a database of the most popular words in America over a period of a few years. This extremely important database for rating American word rankings continues to grow. Google alone adds 150 million searches each day to this database.
Century Zeitgeists
Since 1879, when James Murray began the work of compiling the Oxford English Dictionary (OED), readers all over the world have been collecting examples of new words, idioms and meanings for the Oxford database. Once sufficient evidence for a new word has been collected, Oxford's lexicographers prepare a new entry that is then reviewed by expert consultants before it is added to the OED database. The OED database has reached one million words. During the last century, some 90,000 new words were added to the Oxford English Dictionary and its supplements. This represents a 25 percent increase in the total vocabulary over the previous thousand years. This is not surprising when one realizes the number of native speakers of English in the world nearly tripled in the twentieth century, from 140 million to 400 million. In addition, a further 100 million people added English as a second language.
From these 90,000 words added to the OED in the twentieth century, British lexicographer John Ayto has selected the most salient new words coined in each decade - some 5,000 words in all - for his book 20th Century Words. As Ayto notes: "Words are a mirror of their times. By looking at the areas in which the vocabulary of a language is expanding fastest in a given period, we can form a fairly accurate impression of the chief preoccupations of society at that time and the points at which the boundaries of human endeavour are being advanced." Grouped by decade, the words reflect the events, preoccupations, inventions and discoveries of each decade.
In a February 2000 interview on Lingua Franca, an Australian radio program, Ayto observes that words help us define new decades and help us make a new start. "We throw off the fashions of the previous decade and perhaps make a conscious effort to move into the new one." One example is how the names of popular dances have defined decades. "And so the names for all the dances, for example, that were popular in the 1920s, were very swiftly superseded and made to seem very old hat in the 1930s, and the same thing happened in the '40s and '50s and so on. You can really encapsulate a decade almost by the names of the dances that were popular at a particular time."
In a general sense, Ayto observes in the radio interview that the twentieth century could be characterized as the century of abbreviation. "I suppose you could try and draw all sorts of morals about our hurried lifestyle from that. There has been the acronym, for example, like AIDS and NATO where you take the initial letter of a string of words, put them all together and make another word out of it. That was virtually unknown before the twentieth century." In the coming years, it will be interesting to see how new areas such as Internet word analysis develop in relationship to traditional areas such as lexicography, semantics and linguistics.
Segmentation
Google is a worldwide search engine and its overall, non-segmented results reflect a mixture of interests from its worldwide users. However, Google demographics show that this mixture of interests is heavily weighted to the English language and western nations. Currently, Google has the ability to segment searches into five languages and most of the major nations of the world. In order for Google to be an effective tool for cultural analysis, searches originating in specific countries need to be segmented from the overall worldwide searches on Google. Apart from key segmentation into various nations for cultural analysis, searches might also be segmented into various forms of real world (rather than virtual) communities. Search segmentation most likely will be based on physical origination points. In this manner, interests of smaller search communities (within the larger overall search community) are defined and ranked. Examples of some of these word search communities are corporations, cities, universities and zip codes. However, it is conceivable that search segmentation might also be done by non-locality identifiers such as industry SIC codes.
The ability to segment search data into various communities has two important consequences. First, it obviously demonstrates the interests of community members. Second, and less apparent, it allows for cross-community comparisons of word rankings. For example, in the illustration below, assume four different segmented word search communities 1, 2, 3 and 4. These might represent four different corporations, universities or towns which are able to segment internally generated searches from externally generated ones via firewalled Intranet technology. For comparison purposes, it is important that they are similar communities. For example, all should be corporations or all should be towns. Better comparisons are obtainable if they are similar corporations or similar towns.
Assume the letters A, B and C represent word ranking levels (top, middle and lower) in these communities. Communities 1 and 2 are in word alignment with each other. Communities 1 and 2 are in alignment with top ranked words in community 4 but out of alignment with its middle and lower ranked words. Communities 3 and 4 are out of alignment on top ranked words but in alignment on lower ranked words.
Figure 1: Word Rankings In Four Communities
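The cross-community comparison in this example can be sketched in code. The sketch represents each community's top (A), middle (B) and lower (C) ranked words as sets and measures alignment at each level. The words themselves are invented, and the use of Jaccard overlap (shared words divided by all words) as the alignment measure is our assumption; the text above does not prescribe a particular measure.

```python
# Hypothetical word tiers for four communities: A = top, B = middle, C = lower.
# The words are invented; only the pattern of matches follows the example above.
COMMUNITIES = {
    1: {"A": {"merger", "layoffs"}, "B": {"benefits", "parking"}, "C": {"stress", "vacation"}},
    2: {"A": {"merger", "layoffs"}, "B": {"benefits", "parking"}, "C": {"stress", "vacation"}},
    3: {"A": {"election", "stadium"}, "B": {"schools", "traffic"}, "C": {"lottery", "horoscope"}},
    4: {"A": {"merger", "layoffs"}, "B": {"taxes", "housing"}, "C": {"lottery", "horoscope"}},
}


def overlap(first, second):
    """Jaccard overlap of two word sets: 1.0 means identical, 0.0 means disjoint."""
    union = first | second
    return len(first & second) / len(union) if union else 1.0


def alignment(communities, x, y):
    """Alignment between communities x and y at each ranking level."""
    return {tier: overlap(communities[x][tier], communities[y][tier])
            for tier in ("A", "B", "C")}


for pair in [(1, 2), (1, 4), (3, 4)]:
    print(pair, alignment(COMMUNITIES, *pair))
```

With this data, communities 1 and 2 align at every level, communities 1 and 4 align only on top ranked words, and communities 3 and 4 align only on lower ranked words, reproducing the pattern described for the figure.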
Of course this is a simplified example. Yet it represents the basic principles of cross-community word comparisons. What might this information tell us? In later sections of this report, we discuss issues of word rankings such as depth, cycles and clusters (with their content and context words). One of our main arguments is that top ranked words (which we term "content" words) are more reflective of external events in the community, while middle and particularly lower ranked words (termed "context" words) are more expressive of the collective psychology, moods, emotions and attitudes of the internal world. Applying this to the above example, word communities 1, 2 and 4 have content word alignment (matching "A"s) yet lack context word alignment ("C"s are not aligned with "B"s). However, communities 3 and 4, while lacking content word alignment, have context word alignment at their lowest level of words. This low level word alignment might ultimately be more important than the surface alignment of top ranked content words. (For purposes of better understanding, the reader is encouraged to return to this section after reading the upcoming sections on depth, cycles and clusters. However, we feel it is important that segmentation be discussed first for these sections to have more practical application and meaning.)
Below are some examples of prime segmentation candidates whose word searches might be separated by search engines from outside traffic for a focused understanding of the dynamics of their own communities. These are nations, corporations, towns, counties, states, universities, zip codes and industries.
Nations
Segmentation of search by nation is perhaps the first essential action for cultural analysis. But it also offers powerful new tools for international marketers as well as for governments. Searches originating in different nations might be compared to locate commonality or differentiation of interests between nations.
A word ranked high in one nation might be ranked low in another nation. A word with a high "Rise Index" in one nation might have a low "Rise Index" in another nation. On the other hand, there may be much commonality between some words. President Bush (as well as other political and military leaders) has observed that modern wars will increasingly be fought on many fronts using a number of means outside of conventional military force. As modern wars increasingly become "battles of symbols" it becomes more and more important to gauge the top symbols in other nations and formulate a strategy in some type of alignment with these symbols. The top ranked words of one's enemies and allies become important in this emerging global scenario. By segmenting word searches originating in specific nations, search engines are able to create national lists of top ranked words.
Table 6: Fictional Top Search Words of Nations
In the above illustration, assume that the words "Man" and "Woman" are subject to a close cross-cultural interpretation in search words on the Internet and that searches are conducted at close to the same moment in search time. A lower rank number indicates greater importance; a higher rank number indicates less importance. Most interesting are the relative weights of the rankings between various nations. For instance, at the particular moment in time represented by the above chart, America gives less importance to the word "Man" than all the other nations in the study. However, it gives greater importance to the word "Woman" than the other nations. Might variations such as these in key symbol words suggest the dominance of a feminine context or zeitgeist over a masculine one? Might this simply be a coincidence? Might there not be enough data to draw any meaningful conclusion? For a minute, assume there is enough information to draw some broad general conclusions. Assume that an American culture with a high rank number for "Man" (less importance) and a low rank number for "Woman" (greater importance) indicates a general American feminine cultural zeitgeist, in contrast to a general masculine cultural zeitgeist in Afghanistan. Much can be gained for communication purposes if a culture understands it is in a feminine cycle and that it is communicating with another culture that is in its masculine cycle. Beyond speculation about top word rankings for various nations, Google is already posting top word rankings from various nations in its monthly Google Zeitgeist listings. The month of April 2002 is below.
Table 7: April 2002 Word Rankings - Four Nations
In the above, note the perfect alignment between America and France on "Loft Story" as well as a close alignment between America and Britain on Spiderman.
Corporations
One of the more interesting and valuable word search communities is currently served by the Google Search Appliance™. This appliance is licensed from Google and provides the ability to segment Google Internet search technology behind firewalls and inside various Intranet communities such as corporate Web sites. Besides offering greater communication within corporations, the shifts in page hits and word rankings potentially offer corporate management an incredible new tool for determining the key interests of employees.
Towns, Counties And States
While Google can create segmented searches behind firewalls and corporate Intranets, it currently does not have the ability to segment inquiries originating in localized areas (smaller than nations) such as towns, counties or states. Connected communities or cyber-cities presently utilize Internet technology to increase local "social capital" by attempting to maximize economic, educational, political and social value. Word searches originating in particular cities or local communities could offer a valuable method of gauging local interests.
Universities
In the early years of the Internet, many universities were locked behind firewalls. However, today almost all universities have removed firewalls, allowing for searching outside their specific communities. Some universities, though, still maintain various forms of firewalls. To the extent that these firewalls capture intranet activity, they offer school administrators valuable insight into the leading issues among faculty and students.
Zip Codes
Searches segmented by zip code could be related to what is perhaps the leading marketing segmentation tool, the Claritas PRIZM database.
Basically, Claritas has segmented America into a number of distinct psychographic and demographic groups with direct relationships to zip codes. For example, if a marketer provides a zip code to Claritas, Claritas can provide an excellent model of the consumers living in that zip code. If zip codes could be isolated for search purposes, leading word ranks could be cross-indexed with Claritas data for a powerful new understanding of markets.
Industries
As we mentioned earlier, it is conceivable that search segmentation might also be undertaken by non-locality identifiers such as industry SIC codes. For example, searches originating in different industries might be isolated for industry analysis as well as for cross-industry comparison. Further segmentation of niches within industries might also be possible. Industry-specific words might be defined and word rankings obtained from this community of words. The rise or decline of certain words (discussed in the following Ascendance chapter) associated with industry products might greatly help in forecasting industry or product trends.
Depth (Synchronic Words)
While the Google Zeitgeist is interesting, for the most part the top ten Google words (weekly or monthly) simply reflect leading things and people coming and going from the attention of popular culture. The top ten words offer few surprises and little insight into the hidden forces behind popular culture. Far greater cultural insight exists in the larger database of words ranked outside the Google top ten words. It is at the lower ranking levels that words move away from reflecting external cultural events to expressing internal attitudes.
Figure 2: Deep Zeitgeist
Just as the surface of a lake reflects the clouds over it, so too do top ranked words reflect general interest in external current events. In the above illustration, top ranked words (red dot) are closer to popular culture and the external world. In this sense, they are more reflective of the external world. On the other hand, lower ranked words (orange dot) are closer to collective psychology and the internal world and are more expressive of the internal world. The arrows in the illustration show where the forces creating the word rankings originate. Top ranked words are reflections of leading events in culture and the arrow moves down from the external world to the words. Lower ranked words are expressions of attitudes and moods in individuals and the arrow moves up from this internal world to the words. The real "zeitgeist" or "spirit of the times" behind the flash-in-the-pan events of popular culture exists within the patterns obtained from these larger and deeper rankings of words.
Ascendance
In addition to exploring a deeper Google Zeitgeist, the history of the ranked words may also offer new insights into culture. Here, the speed of a word's rise might be given an index and rating. Words that rise the fastest (and are not related to obvious cultural events) may indicate the collective psychology of a culture, or what Carl Jung termed the collective unconscious. A word whose searches are concentrated in a shorter period of time may be more indicative of collective psychology than the same word searched for over a longer period of time. Words might be given a "Rise Index" to rate them apart from their "Rank Index."
Table 8: Rise Index
In the above chart, although the word "Baby" is ranked higher than "Clouds" it may not be as important as "Clouds" since it took 1,000 minutes (16.7 hours) to rise to #200 while "Clouds" took only 10 minutes (1/6 hour) to rise to #800. A "Rise Index" is obtained by dividing the total number of searches by the time period over which they occurred. This index measures the speed of ascent. Note in the above example that the "Rise Index" for "Clouds" (at 10,000 per minute) is 20 times greater than the "Rise Index" for "Baby" (at 500 per minute). Most interesting within the "Rise Index" are words searched for simultaneously, demonstrating a type of synchronicity of search. One might posit that the higher the Rise Index, the higher the degree of synchronicity involved. For example, a word ranked #1,500 but based on 50,000 simultaneous searches and with no observable external stimulus offers valuable research potential into the concept of word search synchronicity on the Internet and its relationship with collective psychology.
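The Rise Index arithmetic can be checked with a short sketch. The total search counts used here (500,000 for "Baby" and 100,000 for "Clouds") are an assumption: they are back-calculated from the per-minute rates and time periods quoted in the text, and the `WordStats` structure and `rise_index` function are illustrative names only.

```python
from dataclasses import dataclass

@dataclass
class WordStats:
    word: str
    rank: int            # position in the ranking (#1 is the top)
    total_searches: int  # assumed totals, back-calculated from the text
    minutes: float       # time the word took to reach its rank

def rise_index(stats: WordStats) -> float:
    """Searches per minute: total searches divided by the time period."""
    if stats.minutes <= 0:
        raise ValueError("time period must be positive")
    return stats.total_searches / stats.minutes

baby = WordStats("Baby", rank=200, total_searches=500_000, minutes=1_000)
clouds = WordStats("Clouds", rank=800, total_searches=100_000, minutes=10)

for w in (baby, clouds):
    print(f"{w.word}: rank #{w.rank}, Rise Index {rise_index(w):,.0f} per minute")

# "Clouds" ranks lower than "Baby" but rises 20 times faster.
ratio = rise_index(clouds) / rise_index(baby)
print(f"Ratio: {ratio:.0f}x")
```

Keeping rank and Rise Index as separate fields reflects the point that the two measures can disagree about which word matters more.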
Cycles (Diachronic Words)
Cycles are composed of beginnings, endings and a sequence of stages between the two. Research into cycles has shown that cycles move between the symbolic opposites inherent in the symbolism of beginnings and endings. While many cycles have been observed in nature, there is growing evidence there are also cycles in culture. Some of the key researchers in the area of cultural cycles are Arthur Schlesinger Sr. and Arthur Schlesinger Jr., Frank Klingberg, William Mayer, Harold Lasswell, Lloyd deMause, Pitirim Sorokin, William Strauss and Neil Howe. One of the most famous theories of cultural cycles is the Elliott Wave Theory proposed by Ralph Nelson Elliott. It relates the economy to cyclic social moods. Robert Prechter in The Wave Principle of Human Social Behavior extends the Elliott Wave theory to wider areas of sociology and psychology to create a theory of socionomics. Within Prechter's theory there is the suggestion that words go through cycles similar to Elliott waves. Within the area of cultural word cycles, one needs to distinguish between external word cycles and internal word cycles. The first involves word relationships to annual cyclic events. The second more closely resembles Elliott Wave cycles and collective cultural moods.
External Cycles
Certain words and classes of words demonstrate repeating cycles related to external annual cultural events such as major holidays, sporting events or governmental deadlines. For example, the term "Internal Revenue Service" or "IRS" evidences a cyclic rise before the April 15th U.S. tax deadline and a decline after it. Or, as another example, words associated with "Super Bowl" show an annual rise in January and a decline in February.
Internal Cycles
Far more interesting, though, are word cycles relating to internal factors rather than external events.
According to cycle theory, the general types of words clustered around the beginning of these internal cycles should be different from (in fact, opposite to) the words clustered around the end of these cycles. They should also demonstrate a sequential clustering in stages between the beginnings and endings of these cycles.
Figure 3: External and Internal Word Cycles
For instance, one would expect to see a dominant cluster of feminine related words at the beginning of an internal word cycle and a dominant cluster of masculine related words at the end of an internal word cycle. Whether the clusters of feminine or masculine words have a relationship to popular culture (feminine film and story genres, political leaders, colors, etc.) is an important research question by itself. In the above illustration, external word cycles (Yellow) demonstrate repeating cycles related to external annual cultural events such as major holidays, sporting events or governmental deadlines. Internal word cycles (Blue) demonstrate repeating cycles related to biological rhythms and internal psychology.
Clusters
"The meaning of an episode was not inside like a kernel but outside in the unseen, enveloping the tale which could only bring it out as a glow brings out a haze."
Besides showing a depth archaeology of ranking levels, Internet words also demonstrate a clustering phenomenon centered around key words searched for. This clustering phenomenon can be analogized to the orbits of planets around stars and of moons around planets. The force of certain words to draw other words to them in orbits is similar in many ways to gravitational forces of the universe. Words may have a ranking on the Internet but words also possess their own internal gravity pulling other words into orbit around them. All words are symbols and all symbols refer to other words or symbols. Much of this sounds esoteric until one considers that many key words on the Internet relate to products. People search the Internet largely to find these leading products of popular culture. But more importantly, they unconsciously search the brand contexts and narrative entertainment genres containing these products. We suggest culture first creates a narrative context before placing products in it. Just as the outer planets of the solar system are symbols for the context of collective psychology, so too might words in the outer orbits around products present collective factors associated with these core products.
Brands
All physical products and non-physical services are wrapped in the images, emotions and words of brands. The core product can be viewed as a few particular words with a number of related words orbiting around it in increasing concentric orbits. Those brand words most directly related to the product have the smallest (closest) orbits while those indirectly related to the product have the largest (most distant) orbits. Words clustering around products can be isolated and given various levels of importance.
For example, a word directly related to a product might be given more weight than one in a larger orbit farther away from the core product. As a result, lower word rankings of direct product attributes might mean as much or more than higher word rankings of indirect product attributes in more distanced orbits.
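One way to make this weighting concrete is sketched below. The formulas are assumptions chosen purely for illustration, not a method given in the text: a word's weight falls off as the inverse of its orbit number, its importance is the inverse of its search rank, and the example words, orbits and ranks are invented.

```python
def orbit_weight(orbit: int) -> float:
    """Assumed weighting: orbit 1 (closest to the product) counts fully,
    and more distant orbits count proportionally less."""
    if orbit < 1:
        raise ValueError("orbit numbers start at 1")
    return 1.0 / orbit

def weighted_score(rank: int, orbit: int) -> float:
    """Combine search rank (#1 is the top) with orbit distance.
    Inverse rank as the importance measure is also an assumption."""
    if rank < 1:
        raise ValueError("ranks start at 1")
    return orbit_weight(orbit) * (1.0 / rank)

# Hypothetical words around a rugged off-road vehicle.
direct = weighted_score(rank=400, orbit=1)    # e.g. "four-wheel drive"
indirect = weighted_score(rank=200, orbit=4)  # e.g. "desert"

print(f"direct attribute:   {direct:.5f}")
print(f"indirect attribute: {indirect:.5f}")
```

Under these assumed weights, the direct attribute outscores the indirect one even though its search rank is lower, which is the situation the paragraph describes.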
Figure 4: Products & Word Orbits
In the illustration above, a product in the green circle is defined by words relating to its specific features and benefits. Words describing the actual product (like color, size and shape) are clustered within the small yellow circle. Outside the yellow circle various words orbit around in growing concentric circles. In effect, these words create the overall "brand" surrounding the product. The orbits of the Inner Brand Words are closer to the product and more objective in defining it. The Outer Brand Words are less objective and more subjective in defining the product. One might visualize the green product as placed within an advertisement or a commercial. The words of the outer circles are words relating to the context of the advertisement, the setting or the background environment which best presents the product. For example, assume the product is a rugged off-road vehicle. Key words describing it would center on masculine words associated with freedom and power. Words in the outer circles would most likely center on a non-urban context for the product, such as the desert, suggestive of the setting for westerns. As Philip Kotler notes in Marketing Management (8th Edition), there are five levels of a product that expand out in circles. At the center is the Core Benefit, then the Generic Product, the Expected Product, the Augmented Product and finally the Potential Product. Kotler notes that much competition takes place at the Augmented Product level rather than the Core Benefit level. The Augmented Product level is really the brand level. As Harvard Business School's Theodore Levitt notes in The Marketing Mode: "The new competition is not between what companies produce in their factories, but between what they add to their factory output in the form of packaging, services, advertising, customer advice, financing, delivery arrangements, warehousing, and other things that people value."
One might add that this has increasing application to products searched for on the Internet through search engines. As the power of and interest in brands increase, consumer attention shifts away from our green circle (containing what Kotler calls the product's Core Benefits) and toward Levitt's outer bundle of brand elements in the blue circle. Attention moves away from key descriptive words of the product itself and toward words relating to the context of the product. In effect, for the marketer selling rugged off-road vehicles, word searches involving contextual places like "desert" might indicate a more subliminal and predictive interest trend in the product than straight searches based on words like "rugged vehicle." In this way, the outer words of product brands are similar to the deeper layers of word rankings.
Content And Context Words
We suggest outer brand words and inner brand words might be termed context words and content words, respectively. In many ways, these words represent symbols on opposite ends of a spectrum or continuum. In a general sense, one observes that content words are more objective and descriptive of the external world while context words express more subjective aspects of the internal world. Context is a pervasive, ubiquitous medium surrounding the messages within it. It is similar to the water surrounding a fish and difficult for our visually oriented Western culture to sense. As Marshall McLuhan once commented, "While we're not sure who discovered water, we're pretty sure it wasn't a fish."
Table 9: Content & Context Words As Opposite Symbols
The oppositions can be shown in a few ways. First, they can be seen (below) in a cluster manner as context and content words orbiting around products.
Figure 5: Content & Context Word Clusters
In the illustration above, the Product is represented by the small green circle (P). Content words are objective, describing such product features as shape, size and color. These words have a close physical relationship to products (P). In effect, they form the clothing of products. Context words are subjective, providing containers for products such as space, place and time. Both content words and context words surround the actual product (P). They can also be seen in a depth manner as top ranked words or lower ranked words. For example, in the illustration below, the yellow box represents top ranked content words close to leading products in culture while the blue box represents lower ranked words closer to the context of leading products in culture.
Figure 6: Content & Context Word Depth (Rankings)
However one views these words, it is important to see the dichotomies of content and context words and their attachment to products of culture.
Narratives
|