
Textual Analysis tools beyond the technical peculiarities

started by Ridha Ben Rejeb on 10 Feb 14
    Following last week's class topic, Text and Discourse Analysis, I thought I would invigorate the discussion around this particular topic of interest, given my academic background in applied linguistics and discourse studies blended with DH. According to Rockwell, text analysis tools have not responded well to the expectations and needs of the research community, partly because these tools are not user-friendly: "The tools have emerged from the private sector and from the open source community; they just haven't been designed for us and need to be adapted to fit into our research practices" (Rockwell 9), and partly because "These industry tools provide access to licensed digital archives. . . . They can only be deployed on more sophisticated (and expensive) systems by people with a certain level of technical proficiency. Thus only well-funded projects can deploy them and they therefore tend to be used to publish scholarly corpora by well-funded projects" (Rockwell 9).

    In my opinion, textual analysis projects require collaboration among a number of like-minded research institutions to develop corpora and data banks of textual materials that can accommodate future, unanticipated queries and questions. Such projects also need reanimation through innovative legislation and regulations to replace the existing conventional research rules and ethics frameworks.

    One example I can think of is my research project on formulaic language (set phrases) relevant to business meetings. Though the infrastructure is available, I ran into obstacles before I could even use textual materials for discourse analysis, partly because no corpus of business meeting minutes exists. Many establishments publish their minutes online as e-minutes, yet when they are approached for open access to these texts, so that phrases frequent and salient in that particular genre of communication can be identified and classified, one senses reluctance to approve or collaborate in disseminating the materials, despite the fact that the university Research Ethics Board explicitly states that publicly available information is exempt from research ethics clearance. The issue stems from the fact that many textual materials contain names, or data associated with certain names. These names were published on publicly accessible sites; why can they not remain available for textual analysis of the digital material? For future generations, names surfaced during textual analysis can reflect the ratio of female to male members at a certain establishment, the distribution and representation of gender and ethnicity, and even which male and female names were popular at a given point in time. My focal point is that there are far more emerging issues, beyond the technical side or infrastructure, that may impede collaboration and the creation of textual analysis corpora and databases.
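    To make the phrase-identification step above concrete, here is a minimal sketch of how frequent multi-word phrases (n-grams) might be surfaced from a small corpus of minutes. Everything here is illustrative: the sample texts are invented placeholders, not real meeting minutes, and no particular text analysis tool is assumed.

```python
# Minimal sketch: count n-word phrases (n-grams) across a tiny corpus and
# keep those that recur, as a crude proxy for "frequent and salient" set
# phrases in a genre. The sample minutes below are invented placeholders.
import re
from collections import Counter

def ngrams(tokens, n):
    """Yield successive n-word tuples from a token list."""
    return zip(*(tokens[i:] for i in range(n)))

def frequent_phrases(texts, n=3, min_count=2):
    """Count n-grams across all texts; keep those at or above a threshold."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(ngrams(tokens, n))
    return [(" ".join(g), c) for g, c in counts.most_common() if c >= min_count]

minutes = [
    "The meeting was called to order at 9:00. The minutes were approved as circulated.",
    "The meeting was called to order at 10:30. The motion was carried unanimously.",
]
for phrase, count in frequent_phrases(minutes, n=5, min_count=2):
    print(count, phrase)
```

    With a real corpus one would also filter by salience (e.g. comparing against a reference corpus) rather than raw frequency alone, but even this toy version surfaces the formulaic "the meeting was called to order at" frame shared by both texts.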
    Another issue that emerges from textual and discourse analysis is that certain tools fail to consider context-related words, especially in an era of English as a Lingua Franca (also known as World Englishes) across many contexts and situations. An illustration is Huang's claim: "As I looked closely at the words in the 'Other' category [off-list words that are highly specific], I found some of them were Chinese names and terms that can be readily understood by people living in Taiwan . . . Since the chosen article reported Taiwanese news . . . If we add those context-related words to a 'most frequent 1000 words used in Taiwan list', my finding will probably be closer to *Nation's". This clearly motivates collaboration on multiple levels, involving the wider international community, including the creation of corpora of textual materials specific to certain contexts, which would yield more accurate textual and discourse analysis findings. This is echoed in Rockwell: "the aim is to support not only the researchers and existing projects, [...] but also to provide a portal to appropriately configure tools for researchers and to significantly improve the research infrastructure [...] in the humanities and other disciplines that make heavy use of textual evidence and [...] computer assisted text analysis in the interpretation of texts. We hope to trigger a re-examination of the presuppositions, the types of questions, and the interpretative theories that form our practices" (14).
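    Huang's point can be made concrete with a small coverage computation: what share of a text's tokens fall within a given frequency word list, before and after the list is extended with context-related terms. The word lists and sentence below are tiny invented placeholders, not Nation's actual 1K/2K lists or Huang's data.

```python
# Sketch of a vocabulary-coverage check: the fraction of a text's tokens
# found in a frequency word list, with and without context-specific terms.
# All lists and the sample sentence are invented placeholders.
import re

def coverage(text, word_list):
    """Fraction of tokens in `text` that appear in `word_list`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    known = sum(1 for t in tokens if t in word_list)
    return known / len(tokens)

base_list = {"the", "in", "was", "a", "news", "reported", "by"}
article = "The news reported in Taipei was covered by Liberty Times"
context_terms = {"taipei", "liberty", "times"}  # names readable in context

print(f"base coverage:     {coverage(article, base_list):.2f}")      # 0.60
print(f"extended coverage: {coverage(article, base_list | context_terms):.2f}")  # 0.90
```

    Adding the context-related names to the list raises coverage noticeably, which is exactly the effect Huang predicts for a Taiwan-specific frequency list.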

    Can you think of other issues specific to your discipline associated with digital tools beyond the technological peculiarities or challenges?


    *Paul Nation (1997): vocabulary lists of the 1,000 (1K) and 2,000 (2K) most frequent English words.


    References:

    Huang, Hsing-fei. "The Exposure to English Vocabulary for a University-level
    Learner in Taiwan." McGill University, EDSL-617, 18 May 2004. Online
    investigation.

    Nation, Paul. "Vocabulary Size, Text Coverage, and Word Lists." Vocabulary:
    Description, Acquisition and Pedagogy. Ed. Norbert Schmitt and Michael
    McCarthy. Cambridge: Cambridge University Press, 1997. 6-19. Print.

    Rockwell, Geoffrey. "What is Text Analysis, Really?" Literary and Linguistic
    Computing 18.2 (2003): 209-220. Print.
