The current online version contains 8995 cues and 2.85 million responses. This work would not have been possible without the kind collaboration of thousands of volunteers who participated in the study. One way of thanking them is by making these data available not only to the scientific community but to anyone who is interested in word associations, language and memory. We hope this website will contribute to this goal.
Most of the functionality described here is self-explanatory and is easiest to understand by trying it out. Simply submit a word on the appropriate pages and see what happens. We are open to any requests or comments, so don't hesitate to get in touch (see info page). This website is a continuous work in progress, so please check back for updates. For this and other reasons, it is best to contact us in advance if you want to use any of these results for scientific purposes. We hope to submit a new manuscript about these data by the end of 2010, at which point most of the new data will be made publicly available.
In a word association network, some words are more important than others. This difference in centrality can be expressed by the number of different associates a word has (outdegree) or by the number of times it is given as an association response (indegree). Apart from outdegree and indegree, more complex measures have been formulated to capture how important a node is in a network. Some of these measures, like PageRank and HITS, are currently used to measure how central certain pages are on the World Wide Web. Below you can look up the centrality values for words and take a look at the twenty most important nodes in the network according to these measures.
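As an illustration of these measures, the sketch below computes outdegree, indegree and a basic PageRank on a tiny invented network. The toy data, the function names, the damping factor of 0.85 and the choice to count distinct neighbours are all assumptions made for the example. It is not the code used to produce the values on this website.

```python
from collections import defaultdict

# Hypothetical toy data: (cue, response, frequency) triples.
ASSOCIATIONS = [
    ("rose", "red", 40),
    ("rose", "flower", 30),
    ("flower", "rose", 25),
    ("flower", "garden", 20),
    ("garden", "flower", 35),
    ("blood", "red", 50),
]


def degree_centrality(triples):
    """Return (outdegree, indegree) dicts that count distinct neighbours."""
    out_nbrs = defaultdict(set)
    in_nbrs = defaultdict(set)
    for cue, response, _freq in triples:
        out_nbrs[cue].add(response)
        in_nbrs[response].add(cue)
    words = set(out_nbrs) | set(in_nbrs)
    outdegree = {w: len(out_nbrs[w]) for w in words}
    indegree = {w: len(in_nbrs[w]) for w in words}
    return outdegree, indegree


def pagerank(triples, damping=0.85, iterations=50):
    """Frequency-weighted PageRank computed by plain power iteration."""
    out_weights = defaultdict(dict)
    words = set()
    for cue, response, freq in triples:
        out_weights[cue][response] = out_weights[cue].get(response, 0) + freq
        words.update((cue, response))
    n = len(words)
    rank = {w: 1.0 / n for w in words}
    for _ in range(iterations):
        # Rank held by words without outgoing links is spread evenly.
        dangling = sum(rank[w] for w in words if w not in out_weights)
        new_rank = {w: (1 - damping) / n + damping * dangling / n for w in words}
        for cue, targets in out_weights.items():
            total = sum(targets.values())
            for response, freq in targets.items():
                new_rank[response] += damping * rank[cue] * freq / total
        rank = new_rank
    return rank


outdegree, indegree = degree_centrality(ASSOCIATIONS)
ranks = pagerank(ASSOCIATIONS)
```

A weighted indegree, which sums response frequencies instead of counting distinct cues, would be an equally reasonable reading of the definition above.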
If no match is found, consider entering a different wordform.
You can pan the network by holding the left mouse button on the background of the visualization and dragging. If the mouse cursor hovers over a node, additional information about its incoming and outgoing edges is displayed. Sink nodes (words never presented as cues) are shown as squares. Only associations with an association frequency larger than 7 are shown.
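The two display rules above (the frequency threshold and the sink nodes) can be sketched as follows. The triple format, the example data and the function names are hypothetical. The actual visualization code is not described here.

```python
def visible_edges(triples, min_frequency=7):
    """Keep only edges whose frequency is strictly larger than the threshold."""
    return [(cue, resp, freq) for cue, resp, freq in triples if freq > min_frequency]


def sink_nodes(triples):
    """Words that occur as responses but were never presented as cues."""
    cues = {cue for cue, _resp, _freq in triples}
    responses = {resp for _cue, resp, _freq in triples}
    return responses - cues


# Hypothetical (cue, response, frequency) triples.
example = [
    ("rose", "red", 40),
    ("rose", "thorn", 7),
    ("flower", "rose", 25),
    ("flower", "bee", 3),
]

shown = visible_edges(example)   # drops the edges with frequency 7 and 3
squares = sink_nodes(example)    # words drawn as squares
```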
The average Dutch university student knows about 45,000 words. This lexicon can also be characterized by the relationships between these words. One way of doing this is by using the metaphor of a semantic associative network. Our aim is to collect word association data to study the properties of human semantic knowledge. We use a snowball approach to gradually expand the set of cues. We will gather data from 100 persons for each cue. When all these data are collected, we select the most frequent association responses that are not yet in our set and present them as cues in a second round.
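The selection step of this snowball procedure can be sketched as below. The data layout (a dictionary from each cue to the list of responses it received), the function name and the `n_new` parameter are assumptions made for illustration. The actual selection criteria may involve further filtering.

```python
from collections import Counter


def next_round_cues(responses_by_cue, n_new):
    """Pick the most frequent responses that are not yet in the cue set."""
    cues = set(responses_by_cue)
    counts = Counter()
    for responses in responses_by_cue.values():
        counts.update(resp for resp in responses if resp not in cues)
    return [word for word, _count in counts.most_common(n_new)]


# Hypothetical first-round data.
round_one = {
    "rose": ["red", "flower", "red"],
    "flower": ["rose", "garden", "red"],
}

new_cues = next_round_cues(round_one, n_new=2)
```

Here "red" and "garden" would become cues in the second round, while "rose" and "flower" are skipped because they are already in the set.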
This project is part of our research into semantic networks and concept representation at the Research Group on Concepts and Categories in Leuven, Belgium, in collaboration with Prof. Gert Storms. There is also a Dutch text that gives a general overview of the word association project. The project started in 2003 with the collection of associations for a set of 400 concrete concepts (see Ruts et al., 2004 for a description). Over the years, this set was gradually expanded to a stimulus set of about 1,400 words. The results at this point were reported in De Deyne and Storms (2008). We hope to make the latest version of the dataset available to other scientists in the beginning of 2010.
Participants. By the end of September 2010, a total of 58,335 persons had participated in the study.
Some participant statistics are shown below. These graphs show that most participants were young, which is not surprising since many of them were university students. In addition, the majority of the volunteers were female and came from the Flemish-speaking part of Belgium.
Evolving Networks. The core assumption of distributional approaches to lexical semantics is that the meaning of a word is shaped by the way it is used in language. This would imply that the meaning of words and the way we use them change over the course of our lives. Word associations can be used to unravel how this meaning changes over time. This can be done by comparing the association responses of young and older participants.
Small World Networks. Word associations also offer us a method to investigate how our memory works in general. When word associations are represented as a network, we notice that the links between words are not formed at random. Instead, the network shows a specific topology similar to that of other growing networks. These networks are referred to as small world networks because any two nodes in the network are separated by a small number of steps. Just like social networks, where two arbitrary persons are only a few handshakes away, two random words in the association network are on average separated by 4 associations. Such a network structure might explain why some words are retrieved quite easily, while others require more effort.
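The average separation between two words can be estimated with a breadth-first search from every node, as in the minimal sketch below. It assumes directed edges from cue to response and averages only over pairs that can actually reach each other. Whether the figure of 4 reported above was computed on a directed or an undirected network is not stated here, and the function name and toy edges are invented for the example.

```python
from collections import deque


def average_path_length(edges):
    """Mean shortest-path length over all reachable ordered pairs of words."""
    graph = {}
    for cue, response in edges:
        graph.setdefault(cue, set()).add(response)
        graph.setdefault(response, set())
    total, pairs = 0, 0
    for source in graph:
        # Breadth-first search gives shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else float("nan")


# Hypothetical three-word cycle: every word reaches the others in 1 or 2 steps.
toy_edges = [("rose", "flower"), ("flower", "garden"), ("garden", "rose")]
mean_steps = average_path_length(toy_edges)
```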
Missing Links. When people generate associations, there are occasions where a certain response is generated by almost all participants. When presented with the word rose, most people will respond red. It is, however, likely that the semantic network around rose contains other information as well. The goal of predicting missing links is to find which weak links are lacking from the network and to investigate the role of these weak links.
Cross-Cultural Comparisons. Finally, we are also interested in studying word associations in different cultures. These studies allow us to compare which words are central in other languages and where two languages converge or differ. One of the topics of interest is how the meaning of certain responses differs between Dutch speakers from Flanders and Dutch speakers from the Netherlands.
The word associations are currently used as part of the Dutch Womima association website. It allows you to explore personal association networks in a unique way and offers many different types of visualizations.
The results reported in De Deyne and Storms (2008a) are available for download. The comma-separated text files contain word association data for 1424 stimulus words, collected between 2003 and 2006. For each cue, at least 83 participants each generated three different associations.