Text Analysis Overview
Excerpt from text analysis of Gormenghast
-
An observer's perception of an environment contributes to their sense of a place.
-
Cultural and biological factors influence these perceptions
- First Nations people may avoid a place they consider sacred based on a shared oral and/or written history.
-
Species possessing sentience, language and culture generate complex narratives about places that can vary between members of the same cultural group.
Significance
-
What motivated our inquiry?
- Desire to extract subjective descriptions of place from text (e.g., adjectives, place names) and parse them to identify patterns and trends (e.g., common associations between nouns and descriptive words).
-
Who could this research benefit?
- Our intended audience is designers, particularly those working with nonhumans.
- Benefits extend to collaborators such as ecologists, biologists and botanists.
- Greater knowledge of nonhuman perspectives can improve design outcomes.
- The timeline can support exploration of the novel and guide deep reading.
- Tools can support creative efforts by designers and artists.
-
How does it relate to the exhibition themes?
- Differences in perceptual abilities between species influence their conception of place, which informs culture, social dynamics, and ways of understanding their environment.
- Concepts from animal justice studies contend that nonhuman species enact behaviours that satisfy ecocentric definitions of culture [REF]. Animals develop instinctive (fight or flight) responses to places based on positive and negative experiences and share this knowledge with others [REF].
- For example, crows communicate rituals ([Example]) to each other using rudimentary language, hereditary exchange, and mirroring [REF].
- Plant species "prefer" places with resources that support flourishing.
- One can argue that nonhuman species can construct place attachments depending on the sophistication of their perceptual abilities.
-
Integrate the paragraph below.
-
Provide tools and methods for "quantifying" the use of language related to place in Gormenghast with a focus on the relationships between:
- Phenomena such as colour, light, texture and atmosphere.
- Discrete geographical locations and their component parts, e.g. buildings and rooms, forests and trees, etc.
-
Explore and discuss issues relevant to place, design and communication that emerge from attempts to create these tools, particularly those that emphasize the differences in subjectivity between species.
-
Describe the difficulties inherent in attempting to "quantify" subjective phenomena such as atmosphere, mood and events from unstructured text.
- Ambiguous or loosely defined cultural phenomena such as place amalgamate many sources and are not limited to the impression created directly by language. Collective or cultural memory (e.g., awareness of other works, history and symbolic associations) plays a large part.
Research Gap
- Challenges and limitations of quantifying subjective text
- Statistical analysis of subjective descriptions is difficult
- Current methods for quantifying interpretations of places (e.g., journal entries, online reviews, social media posts) capture few descriptive place terms.
- These methods produce binary or hierarchical interpretations of a place, e.g., which adjectives visitors use to describe a Google Maps location [REF].
- Large datasets exist to support analysis, but these are intended for commercial applications (e.g., market research, improving search engine results).
- Quantifying a 'place' is difficult because places are subjective and complex.
- Cultural, physical, biological, evolutionary, social, and economic dynamics contribute to an individual's sense of place. It is difficult to delineate and extract these factors from texts alone.
Opportunity
- Literature describes the subjective qualities (atmosphere, geography, histories, culture, emotions, etc.) of fictional places.
- Use methods from other disciplines to analyse representative texts.
- Digital humanities and natural language processing
- Digital humanities scholars use natural language processing and statistical analysis to quantify literary texts. These approaches identify trends in word usage (how many times a word or word pair is used) in a single text (e.g., Moby Dick by Herman Melville) or large corpora (e.g., the works of Jane Austen or all Gothic novels published during the 19th century).
- Sentiment analysis uses lexicons to approximate the sentiment of words. The process determines the 'sentiment' of a text (e.g., a sentence or paragraph) by assigning each word to a lexicon category.
- The NRC Emotion Lexicon is an open-source lexicon for sentiment analysis that associates words with two sentiments (positive or negative) and eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise and trust).
- Geographical information (imagination) systems
- Automating semi-supervised methods for extracting place data establishes a basic standard for
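The lexicon-based sentiment scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not our implementation: the lexicon below is a hypothetical toy dictionary (its entries are not taken from the NRC Emotion Lexicon), the sample sentence is invented rather than quoted from the novel, and the names `TOY_LEXICON`, `tokenize` and `score_passage` are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only.
# These word-category pairs are NOT entries from the NRC Emotion Lexicon.
TOY_LEXICON = {
    "dark": {"negative", "fear"},
    "cold": {"negative"},
    "ruin": {"negative", "sadness"},
    "bright": {"positive", "joy"},
    "garden": {"positive"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_passage(text, lexicon=TOY_LEXICON):
    """Count how many tokens in the passage fall into each lexicon category."""
    counts = Counter()
    for token in tokenize(text):
        # Words missing from the lexicon contribute nothing to the score.
        for category in lexicon.get(token, ()):
            counts[category] += 1
    return counts


# Invented sample sentence (not a quotation from Gormenghast).
sample = "The dark hall was cold, but the garden beyond was bright."
scores = score_passage(sample)
print(scores)
```

Because every word is looked up in isolation, a sketch like this ignores negation, irony and context, which is one reason the results need checking against the text itself.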
Limitations
- Our only source of information is the English language, which is biased toward Enlightenment-era worldviews founded on rationalist principles [REF]. Anthropocentric language is blind to the richness of nonhuman life.
- As society becomes less enmeshed with nonhuman nature, language has lost many place-based terms [REF, Robert Macfarlane].
- Reference material about nonhuman species (textbooks, academic papers, natural history) that adopts an ecocentric position eliminates some anthropocentric bias, as does fiction focused on places.
- The influence of errors decreases as corpus size increases: trends observed in the works of Jane Austen are more reliable than one paragraph from Pride and Prejudice.
- An author's experiences, biases and style influence results.
- Our knowledge of statistical methods limits the replicability of our findings. We acknowledge that errors will exist, but hope others will find our conceptual approach interesting. We aim to demonstrate that applying these methods to novel scenarios can produce interesting results that others can extend.
Research Question
- Can text analysis tools automate the extraction of descriptive words about places in literary text(s)?
Methods
-
Select a text containing descriptive language about places.
- This research grant uses Gormenghast by Mervyn Peake to explore synergies between design, literature and place.
-
Use text analysis tools to extract data about place from this text
- Sentiment analysis summarises moods and emotions in text(s).
- Measuring collocations between words highlights the descriptive words associated with place nouns.
- Frequency searches measure the importance of words in text(s).
- Reviewing words in context helps verify assumptions derived from automated methods.
- Arranging words by occurrence reveals trends in language usage over 'time'.
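The frequency, collocation, context and occurrence steps listed above can be sketched with the Python standard library. This is an illustrative sketch under stated assumptions, not the tooling used in our experiments: the function names, the window sizes and the sample text (invented, not quoted from the novel) are all hypothetical choices made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, target, window=2):
    """Count words found within `window` tokens of each occurrence of `target`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Skip position i so the target does not count itself.
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


def keyword_in_context(tokens, target, window=3):
    """Return each occurrence of `target` with its neighbouring words."""
    return [
        " ".join(tokens[max(0, i - window): i + window + 1])
        for i, tok in enumerate(tokens)
        if tok == target
    ]


def occurrences_by_segment(tokens, target, segments=4):
    """Split the text into equal segments and count `target` in each one."""
    counts = [0] * segments
    for i, tok in enumerate(tokens):
        if tok == target:
            counts[i * segments // len(tokens)] += 1
    return counts


# Invented sample text (not a quotation from Gormenghast).
sample = (
    "The grey tower rose above the grey walls. "
    "Beyond the walls the tower cast a long shadow."
)
tokens = tokenize(sample)

frequencies = Counter(tokens)              # frequency search
near_tower = collocates(tokens, "tower")   # collocations with a place noun
contexts = keyword_in_context(tokens, "tower")  # words in context
timeline = occurrences_by_segment(tokens, "tower", segments=2)  # usage over 'time'

print(frequencies.most_common(3))
print(near_tower.most_common(3))
print(contexts)
print(timeline)
```

In practice, function words such as "the" dominate both the frequency and collocation counts, so a stop-word filter would usually be applied before interpreting the results.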
-
Explore the limitations of this data and suggest improvements.
- A working schema relates this data about place language.
- Exploring the biases implicit in language reveals the limitations.
-
Suggest applications for this data.
- This data can inform representations of places.
-
The full text of the novel (unstructured text).
-
Images, graphs and tables that interpret the novel (item 1) using information from item 2.
Results
- See Experiments for the outcomes of our attempts to apply the above logic to several phenomena in the novel.
Footnotes
Saif M. Mohammad and Peter D. Turney, “Crowdsourcing a Word-Emotion Association Lexicon,” Computational Intelligence 29, no. 3 (2013): 436–65, https://doi.org/10/f45xmb.
Michael E. Martin and Nadine Schuurman, “Area-Based Topic Modeling and Visualization of Social Media for Qualitative GIS,” Annals of the American Association of Geographers 107, no. 5 (2017): 1028–39, https://doi.org/10/gg6twg.
Luke Bergmann and Nick Lally, “For Geographical Imagination Systems,” Annals of the American Association of Geographers, 2020, 1–10, https://doi.org/10/gg6tb6.
Mervyn Peake, Anthony Burgess, and Quentin Crisp, The Gormenghast Novels (Woodstock: Overlook Press, 1995).
Imogen Lesser Woods, “Literary Language as a Tool for Design: An Architectural Study of the Spaces of Mervyn Peake’s The Gormenghast Trilogy and ‘Boy in Darkness’” (PhD, Kent, University of Kent, 2018).