Cate Cross
Abstract
The purpose of the paper was to analyse the structure of a small number of abstracts
that have appeared in the CABI database over a number of years, during which time the
authorship of the abstracts changed from CABI editorial staff to journal article authors
themselves. This paper reports a study of the semantic organisation and thematic structure of
12 abstracts from the field of protozoology in an effort to discover whether these abstracts
followed generally agreed abstracting guidelines. The method adopted was a move analysis
of the text of the abstracts. This move analysis revealed a five-move pattern: Move 1 situates
the research within the scientific community; Move 2 introduces the research by either
describing the main features of the research or presenting its purpose; Move 3 describes the
methodology; Move 4 states the results; and Move 5 draws conclusions or suggests practical
applications. Thematic analysis shows that scientific abstract authors thematise their subject
by referring to the discourse domain or the ‘real’ world. Not all of the abstracts succeeded in
following the guideline advice. However, there was general consistency regarding semantic
organisation and thematic structure. The research limitations were the small number of
abstracts examined, from just one subject domain. The practical implications are the need for
abstracting services to be clearer and more prescriptive regarding how they want abstracts to
be structured as the lack of formal training in abstract writing increases the risk of subjectivity
and verbosity and reduces clarity in scientific abstracts. Another implication of the research
is that abstracting and indexing services must ensure that they maintain abstract quality if
they introduce policies of accepting author abstracts. This is important as there is probably
little formal training in abstract writing for science students at present. Recommendations for
further research are made.
He has been a member of JISC and/or some of its committees since 1992. He is currently a
member of the JISC Scholarly Publishing Working Group and of the HEFCE/UUK Working
Group on Intellectual Property Rights.
Charles is an Honorary Fellow of the Chartered Institute of Library and Information
Professionals. He is a member of the Legal Advisory Board of the European Commission.
He was the Specialist Advisor to the House of Lords’ Inquiry into the Information
Superhighway. He is a regular contributor to conferences and to the professional and
scholarly literature, and is on the editorial board of a number of professional and learned
journals, and of Annual Review of Information Science and Technology.
Introduction
In the last thirty years, the publication of research articles and monographs by
academics and practitioners has increased dramatically and this growth has been well
documented in both the respective academic fields and within the Information and Library
Management (ILM) literature. This growth has resulted in what some have termed
‘information overload’ or ‘documentary inundation’ (Lancaster 2003, p.104; Pinto 1994,
p.111). The increase in research literature combined with growing interdisciplinarity has
strengthened the need for efficient information retrieval systems (Tibbo 1993, p.4). This is
particularly true in science, where scholars and practitioners perceive periodicals to be the
most valuable source for their continuing education and for sharing new knowledge. Yet due
to the burgeoning growth in the output of material, readers are unable to survey all the
literature that is relevant to their field of expertise (Salager-Meyer 1990, p.366; Maizell 1971,
p.2). Consequently, the demand for accurate and thorough condensed document
representations, which present similar content in a consistent manner, but which,
simultaneously are able to differentiate between individual items from a number of related
ones, grows (Tibbo 1993, p.7). To satisfy this need, abstracts have become a standard
gateway into the research literature for the scientific community (Hartley 1996, pp.349-356).
Scientists rely on the abstract as a concise and accurate representation of the contents
of a document (Salager-Meyer 1990, p.366; Rowley 1988, p.10). There are a number of
reasons for the importance of abstracts. They (1) save reading time, as the reader is able to
gauge whether the full-text document is likely to be of sufficient interest to warrant reading in
its entirety; (2) help overcome the language barrier – written in the parent language, they
allow the reader access to the central themes of an article written in a foreign language; (3)
can provide some language preparation for the text by using key words and ideas that are used
in the full-text document; (4) can, when well written, serve as a key to understanding
fully the argument of the original article (Swales 1990, p.179); and (5) serve as
a current awareness tool. Finally, in the post-reading phase, the abstract can act as a reminder
of the contents of the article and can help to consolidate ideas and opinions regarding the
research (Salager-Meyer 1990, p.367).
However, despite the obvious need for well-formed and consistent abstracts to aid
searching and selection and to overcome documentary inundation, abstracts suffer from
certain flaws that can have deleterious effects on their usefulness. For example, Tibbo (1993)
highlighted the inability of abstracts in certain disciplines to effectively mirror the structure of
the original text and to fully clarify key concepts. Furthermore, she stated that poorly
constructed abstracts could have a direct impact on the precision and recall of a search.
Meanwhile, the very essence of the abstract means that writing and reading such documents is
difficult, as they should be short, yet still contain the main arguments of the original text.
This necessarily means that the content of the abstract will be lexically and propositionally
dense (Hartley 1994, p.332; Kaplan 1994, p.413). Thus, abstracts should be subject to a fairly
strict measure of quality control in order to maximise text comprehension. Pinto and
Lancaster (1999, p.238) explicitly state that the abstract must be coherent syntactically and
semantically. However, currently there are no generally accepted abstract standards, nor are
there any criteria against which the abstract can be assessed.
There have been several studies undertaken in the past that approach the subject of
abstracts and abstracting. The American National Standards Institute (ANSI) has published
material on abstracts in an attempt to characterise the essential elements and style of such
material (American National Standards Institute, 1979). A number of monographs have also
been written to help construct a successful abstract (e.g., Borko and Bernier 1975; Rowley
1988; and Lancaster 2003). In recent years, several authors have attempted to place the
abstract in the context of linguistics, cognitive psychology and philosophy. They suggest that
abstracting should develop within a scientific, operational context that draws on structural and
textual linguistics, formal and fuzzy logic and cognitive psychology (Pinto 1994, pp.111-133;
Pinto 1995, pp.225-234; Liddy 1990, pp.39-52; Endres-Niggemeyer 1995, pp.631-674;
Endres-Niggemeyer 1998).
Whilst studies on the readability of abstracts are not uncommon (see, e.g.,
Blakeborough and Oppenheim, 1980; King, 1976; Fox and Hartley, 2003; Armstrong and
Wheatley, 1998; Wheatley and Armstrong, 1997; Hartley and Sydes, 1995; Snizek, Oehler
and Mullins, 1991), only a small number of studies that posit abstracts within the context of
textual, discourse and genre analysis have appeared. These studies view abstracts as a
particular text-type, one that lends itself particularly well to genre analysis. Genre analysis is
seen as valuable because it is clarificatory, and it ‘provides a communication system for the
use of writers and writing, and readers and critics in reading and interpreting’ (Swales 1990,
pp.42-45). Furthermore, there is growing evidence that scientific and technical
communication depends on both linguistic competence and knowledge of appropriate
structure of genres and the forms of their linguistic representation (Busch-Lauer 1995, p.769).
To date, there have only been very few studies that approach the subject of abstracts and
genre analysis – both Swales (1990) and Bhatia (1993) briefly mention abstracts, but only
within the larger context of research writing in general.
1) To define the typology and functions of abstracts to fully understand their purpose, scope
and use;
2) To clarify the nature of scientific language, discourse and knowledge and its effect on
subject-specific abstracts;
3) To establish the structure of science abstracts through the definition of ‘moves’;
4) To discover how the thematic structure of the abstracts reflects the subject discourse;
5) To determine how closely abstracts follow stipulated guidelines.
What is an abstract?
Dictionary defines ‘representation’ as a set of means by which one thing stands for another.
The term “condensed document” furthers the notion of surrogacy. The idea of condensing a
document, or extracting the most pertinent details, is not particular to the different types of
condensed document representation. It is one example of a cognitive process that is
employed in all circumstances that require comprehension, memory and reasoning (Anderson
2000; Sternberg 2003; Endres-Niggemeyer 1998, pp.45-94).
Van Dijk’s work on macrostructures, although specifically directed towards the
cognitive sciences, is particularly useful in understanding the processes, production and uses
of condensed documents. Van Dijk (1980, pp.4-6) suggests that language users implicitly and
explicitly differentiate between local and global structures of discourse: although users
employ both notions of detail and particulars to speak of the parts of a discourse, to refer to
the general discourse, language users employ terms such as gist, topic or theme. These
general terms point to what is considered relevant, central or crucial to what is being
communicated; he observes that these notions represent the meaning or content of the
discourse and not the style of expression. The former he labels macrostructures; the latter
microstructures. Recently, Endres-Niggemeyer (1998, pp.57-59) has studied the theory of
macrostructures specifically in relation to summarising and condensed document
representation and has consequently underlined how the aboutness of a document can be
realised through the application of the macrostructure concept. Furthermore, she stresses that
the macrostructure of a document plays an important role in the selection of meaning units for
producing effective condensed documents.
micro-information and thereby efficiently storing complex information for future retrieval and
problem solving tasks.
The ability to present clearly, concisely, and unambiguously the main points of an
original document determines the usability and effectiveness of an abstract. Certain factors and
characteristics could all be considered instrumental in whether an abstract fulfils its role as
facilitator in selecting or rejecting material, acting as a substitute for an original document
or saving time for the user during the research process.
Brevity. The asset of using abstracts in information retrieval is that they are
considerably shorter than the original document. All languages are full of redundancy, the
majority of which can be removed during the abstracting of the parent text (Borko 1975, pp.9-
10). However, the lack of redundancy can impede reading comprehension. Pinto and
Lancaster (1999, p.243) observe that although the length of the abstract is one of the few
things that abstracting guidelines can prescribe, in terms of recommended number of words,
brevity ‘should always be secondary to other considerations such as exhaustivity and
accuracy’.
Exhaustivity. This is the extent to which the abstract succeeds in including all of the
information that is relevant to the intended audience. By covering all the essential points of
the original document in the abstract, the user will have more access points to the contents of
the parent text, thereby enabling a more informed selection decision.
Accuracy. For an abstract to actually act as a substitute for the original document, it
is essential that the information summarised is completely accurate and represents the parent
document faithfully. A study by Pitkin (1999, pp.1110-1111) found that a considerable
percentage of abstracts failed to give correct data and in some instances failed to contain the
correct data at all. Such inaccuracy destroys the ability of the abstract to save the user time
and to provide an accurate representation of the original text to help selection and retrieval.
Density. Density is related to the brevity and exhaustivity of the abstract – given that
all the relevant information is included, the shorter the length of the abstract, the greater the
density of information found therein (Pinto 1999, p.244). Both Kaplan et al. (1994, pp.413-
414) and King (1976, pp.119-120) suggest that propositionally and lexically dense texts are
generally more difficult to read and can therefore reduce the quality of the abstract.
Content. An informative or indicative abstract has two main parts: the reference,
which points the users to the original document; and the body, which contains the condensed
information from the original document. A study by Borko and Chapman (cited in Borko
1975, pp.36-53) found that the body of a well-formed research article abstract must contain, at
the very least, information on purpose, method, conclusions, and specialised content. In her
example of abstract structure, Liddy (1991, pp.55-81) proposes that abstracts of scientific
research articles should closely follow the text structure of the parent document. Anecdotal
evidence suggests that whilst the vast majority of abstracts provide accurate bibliographic
citations, a minority do not.
Context
The final factor that can influence the format of the abstract is the context in which it
is written. Factors such as the intended audience and the function of the abstract determine
how the abstract is structured. Generally, the information services that provide abstracts can
be divided into those that are discipline-oriented services and those that are mission-oriented
services. Discipline-oriented services aim to provide comprehensive coverage of a given field
of knowledge by capturing the literature at the time of its primary publication and adequately
indexing and abstracting it. The mission-oriented services direct their publications to an
identified user group that has a specific area of interest usually defined in terms of a task
rather than a traditional discipline (Borko 1975, p.4).
The information contained in an abstract can also be slanted in two other ways in an
effort to meet the needs of the intended audience and how they use the abstract. The abstract
can be purpose-oriented or findings-oriented. In the former, information on the primary
objectives, scope and methodology is presented first; in the latter, emphasis is given to the
results and discussion elements, which are placed in a prominent position.
Scientific discourse
and grammar, but also as a system of resources for generating meaning. As Lemke (1990,
p.ix) notes, language gives us semantics, i.e., ‘the study of meaning as it is expressed through
language’. Brown and Yule (1983, p.26) propose an analysis of discourse, or the analysis of
language in use, in an effort to see how language is employed as an instrument of
communication by speakers/writers and how these users express meanings and achieve
specific communicative intentions. As they stress, language ‘cannot be restricted to the
description of linguistic forms independent of the purpose or functions which those forms are
designed to serve in human affairs’.
Discourse communities
Recently, there have been a number of studies that seek to demonstrate that the
scientific style of writing is much more than a mere objective channel through which to
communicate scientific ‘facts’ and incontrovertible ‘truths’. Bhatia (1993, p.13) notes that the
nature and construction of a specific register and genre are characterised by the
communicative purpose that it is intended to fulfil; that the language used in scientific writing
reflects the position it holds in a particular context.
Other features that identify scientific expression as such are: (1) a strong
temporal and causal perception, a style of narrative that reflects scientific discourse’s
preference for reporting a story as a chain of events that can easily be replicated (Junker 1999,
p.253); and (2) lexical density, i.e., how many lexical words are packed into the clause. In the
formalised, written style of scientific expression, the number often rises to six to eight lexical
words per clause; this reveals the planned nature of scientific communication and highlights
its tendency to alienate readers who are not part of the discourse community (Halliday 1993,
p.76). These features are often subtle, but they make scientific writing clearly identifiable
and show whether an example of writing belongs to that specific discourse community.
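As a rough illustration of the lexical density measure described above, the following sketch counts lexical (content) words per clause. It assumes that clauses have already been segmented by hand and that a lexical word is any token not found in a small, hypothetical list of function words; neither assumption comes from Halliday, and a fuller analysis would rely on a part-of-speech tagger.

```python
import re

# Hypothetical, deliberately small list of English function words.
# This list is an assumption made for illustration only.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "in", "on", "to", "and", "or", "but", "is", "are",
    "was", "were", "be", "been", "by", "with", "for", "that", "which", "it",
    "this", "these", "not", "as", "at", "from",
}


def lexical_words(clause: str) -> list[str]:
    """Return the tokens of a clause that are not in FUNCTION_WORDS."""
    tokens = re.findall(r"[A-Za-z][A-Za-z'/-]*", clause.lower())
    return [t for t in tokens if t not in FUNCTION_WORDS]


def lexical_density(clauses: list[str]) -> float:
    """Mean number of lexical words per clause (clauses segmented by hand)."""
    if not clauses:
        raise ValueError("at least one clause is required")
    return sum(len(lexical_words(c)) for c in clauses) / len(clauses)


# Invented example clauses, not taken from the corpus.
example = ["The mice were infected with the parasite", "resistance increased"]
density = lexical_density(example)
```

On this measure, a value approaching six to eight would correspond to the dense written style that Halliday describes.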
Methodology
There have been a number of studies that have succeeded in defining the types and
functions of abstracts in information retrieval; studies that describe the characteristics of the
abstract (e.g., length, clarity, content and style) and that offer prescriptive advice on how to
write a well-formed and effective abstract. The intention of this study, however, was to offer
a systematic linguistic analysis of how scientific abstracts fulfil their function as condensed
document representations. This study also attempted to gauge whether abstracts follow the
advice stipulated in abstracting guidelines. The language employed in scientific discourse
varies between different branches of science. We focussed on scientific abstracts from the
field of protozoology, which exemplifies the experimental and empirical discourse of science
but does not rely too heavily on its own subject-specific language (unlike, say, the language
of mathematics, chemistry or physics).
The first step in selecting the appropriate kind and size of corpus for linguistic
analysis is to define the genre/sub-genre that one is working with, so that it is distinguishable
from other genres that are either similar or closely related to it (Bhatia 1993, pp.22-24). The
abstracts selected for analysis in this study were all from Protozoological Abstracts – an
online bibliographic and abstracts database of published research on parasitic protozoa
provided by CABI Publishing (CABI). Until 2001, CABI provided abstracts written by
professional abstractors; however, in the last few years, CABI has introduced a policy of
accepting only author-produced abstracts with minimal or no editing.
Genre analysis was chosen for analysing the selected abstracts, as it provides an
‘insightful and thick description of academic and professional texts’ and it is a powerful tool
for determining form-function correlations (Bhatia 1993, p.11). A genre is a class of
communicative events, the members of which share some set of communicative purposes
(Swales 1990, p.58). The communicative function of the genre is what shapes it in terms of
structure, style, content and intended audience. By focussing on the abstract as a type of
genre, it was hoped that the formal structure, communicative purpose and forms of linguistic
realisation of the abstract would be revealed.
Levels of analysis
Bhatia (1993, pp.22-24) suggests the following steps to perform a successful genre
analysis:
1) Placing the given genre-text in a situational context.
2) Surveying existing literature. This includes literature on 1) tools, methods or theories of
linguistic/discourse/genre analysis; 2) practitioner advice, guide books etc. relevant to
the speech community; and 3) discussions of the social structure, history, beliefs and
goals of the academic community which uses the genre in question.
3) Refining the situational/contextual analysis. One needs to 1) define the
speaker/writer of the text, the audience, their relationships and goals; 2) identify the
network of surrounding texts and linguistic traditions that form the background to the
particular genre; and 3) identify the topic/subject which the text is trying to represent,
change or use and the relationship of that text to reality.
4) Selecting the corpus.
5) Studying the institutional context. This information can be found in guidebooks,
manuals and practitioner advice and discussions. It is important if the data is
collected from a particular organisation, which often imposes its own constraints for
genre construction.
6) Levels of linguistic analysis. This constitutes the actual analysis of the abstracts in
this study and therefore needs further elucidation.
Move analysis
Thematic structure
i. Participant domain.
Discourse participant – direct reference to the writer(s) which offers highest visibility to
the writer.
Participant viewpoint – reference to the writer through focus on research activities and/or
outcomes.
Interactive participant – direct reference to other researchers by name.
ii. Discourse domain.
Discourse event/process – reference to the processes of reporting one’s research.
Macro-discourse entity – reference to units of discourse.
Micro-discourse entity – reference to discourse internal entities.
Interactive discourse entity – references to units of discourse other than the writer’s own
discourse entity.
Empty discourse theme – themes beginning with it as dummy subject.
iii. Hypothesised/Objectivised domain.
Hypothesised viewpoint – comments and judgments about research matters.
Objectivised viewpoint – reference to evaluative judgments (involving adjectival or
adverbial modifications of the nominal forms).
Hypothesised entity – models and/or research devices that are hypothesised to
measure/produce something.
Empty hypothesised and objectivised theme – empty themes introducing evaluation
through formulaic expressions.
iv. Real world domain.
Mental processes – implicates intellectual entities/processes as part of the ‘real’ world
research domain.
Real world entity – material entities/objects.
Real world event process – actions/processes as the target of research.
Empty real world theme – empty themes introducing ‘real’ world entities/actions.
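The four domains and their theme categories listed above can be held in a simple lookup table, so that themes coded by hand can be tallied by domain. The sketch below is only one possible representation: the category names are copied from the list, while the variable names, the `tally_domains` helper and the example codes are hypothetical.

```python
from collections import Counter

# Domains and theme categories as named in the list above.
THEME_DOMAINS = {
    "Participant": [
        "Discourse participant",
        "Participant viewpoint",
        "Interactive participant",
    ],
    "Discourse": [
        "Discourse event/process",
        "Macro-discourse entity",
        "Micro-discourse entity",
        "Interactive discourse entity",
        "Empty discourse theme",
    ],
    "Hypothesised/Objectivised": [
        "Hypothesised viewpoint",
        "Objectivised viewpoint",
        "Hypothesised entity",
        "Empty hypothesised and objectivised theme",
    ],
    "Real world": [
        "Mental processes",
        "Real world entity",
        "Real world event process",
        "Empty real world theme",
    ],
}

# Reverse lookup: theme category -> domain.
CATEGORY_TO_DOMAIN = {
    category: domain
    for domain, categories in THEME_DOMAINS.items()
    for category in categories
}


def tally_domains(coded_themes):
    """Count hand-coded theme categories by the domain they belong to."""
    return Counter(CATEGORY_TO_DOMAIN[category] for category in coded_themes)


# Invented codes for three clause themes, for illustration only.
sample = ["Real world entity", "Empty discourse theme", "Real world entity"]
domain_counts = tally_domains(sample)
```

A category that is not in the table raises a `KeyError`, which guards against miscoded themes going unnoticed.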
This study was an attempt to define the structural and linguistic elements of
abstracts to establish their cognitive structure (vis-à-vis ‘moves’) and to observe the particular
use of language in this sub-genre. Our results will only be applicable to this specific sub-
genre, although the methods could be used in similar further studies. Only a small number of
abstracts were studied; therefore, the results of this study are exploratory. Furthermore, the
results were the outcome of qualitative field research, or naturalistic enquiry, and were
obtained inductively by systematic analysis of the data; because of this, Endres-Niggemeyer
(1998, pp.114-121) recommends that any regularities be described by
context-dependent rules rather than by general laws.
Results
Move analysis
A move analysis of the 12 abstracts in the corpus revealed that the structure of such
material was encapsulated in five moves:
Move 2 – Purpose
In all instances of Move 2 in the corpus, it announced the research article’s content by
describing the key features of the research, or by presenting the main purpose. Eight of the
twelve abstracts contained this move. There were five instances which described the main
features of the research. Again, there was evidence of move embedding in this move, as Move
2 appeared with Move 3 (methodology) in two instances. In both instances, Move 3 was
introduced into this move only partially and did not carry substantial information. There were
also a number of ways in which the abstract authors indicated the main purpose of the
research. Two out of the three instances of this sub-move occurred as a discrete move. The
purposive nature of this sub-move was conveyed via the verb phrase: “the study was
conducted in order to test…” or through the nominal phrase: “the aim of this study…”
Move 3 – Methodology
This move occurred in all twelve of the abstracts included in the corpus, and therefore
can be considered an obligatory element of the abstract. In all cases, this move indicated the
subjects, apparatus, procedures and variables of the research. There was a high incidence of
Move 3 merging with both Move 2 and Move 4, either partially or completely. Move
embedding between Moves 2 and 3 occurred twice in the corpus and between Moves 4 and 3
six times, whereas Move 3 appeared separately four times. In 50% of
the cases where Move 3 appeared independently of any other move, it was signalled by the
data, procedures or materials being placed in subject position. In one example, rather than
thematising the methods etc., the author indicated the onset of Move 3 by overtly drawing the
reader’s attention to the methodology by outlining the steps taken in the experimental process:
“The initial approach, which involved…” In the majority of occurrences of Move 3, the
author chose to use the past tense with the passive voice. However, in one instance, the
author broke with convention and used the active voice, making the researchers the subject of
the sentence.
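The counts reported for Move 3 can be checked against the corpus size with a short tally. This is a sketch for verification only; the labels and the `shares` helper are hypothetical, and the figures are those given in the paragraph above.

```python
from collections import Counter

# How Move 3 was realised in the 12 abstracts, as reported in the text.
move3_realisations = Counter({
    "embedded with Move 2": 2,
    "embedded with Move 4": 6,
    "separate move": 4,
})


def shares(counts):
    """Convert raw counts into proportions of the total."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


total_abstracts = sum(move3_realisations.values())
move3_shares = shares(move3_realisations)
```

The three categories sum to twelve, which is consistent with Move 3 occurring in every abstract in the corpus.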
“Passive immunization with MAb/SRIF therefore increased resistance to
E.vermiformis infection in susceptible C57BL/6 mice but not in resistant
BALB/c mice, suggesting that SRIF modulates the gut immune function in
parasitic infection.”
Sub-move 2 – Recommendations
Sub-move 2 appeared in the corpus as an attempt to suggest practical implications for
the research. There were four instances of this sub-move in the twelve abstracts under
analysis. In all four cases, it immediately followed Sub-move 1; furthermore, it occurred
within the same sentence boundary. All instances were in the present simple tense and
employed the active voice. However, in 50% of Sub-move 2 occurrences, the author used
either a modal verb or lexical items that suggested hedging to signal the onset of this move.
Thematic structure
Empty discourse theme – themes beginning with it as dummy subject; e.g., it is
concluded…. Three instances in the corpus.
Discussion
Move analysis
Following on from the work of Swales (1990, p.42), the identification of moves, or
semantic units, in each of the abstracts in the corpus was an attempt to uncover whether the
abstract authors succeeded in writing a particular type of goal-directed communicative event
(the science abstract) and how the abstracts reflected the appropriate schematic structure of
this genre. Together with Swales (1990, p.43) and Pinto (1994, pp.116-117), the CABI
abstracting guidelines suggest a four-move argumentational structure, i.e., purpose,
methodology, results and conclusions, which reflects the structure of scientific research
papers. However, a move analysis of the abstracts revealed a five-move pattern that included
a semantic unit about relation to other research. This reflects the results of Liddy (1991,
pp.55-81) and Santos (1996, pp.481-499) on the textual organisation of abstracts. Salager-
Meyer (1990, p.370) writes that appropriate ‘move selection’ is one of a number of
characteristics that define a well-formed abstract and writes that these moves ‘are
fundamental and obligatory in the process of scientific enquiry and patterns of thought’.
Furthermore, she states that the ‘semantic organisation of moves’ should be coherent and
logical, that is, that the progression of ideas should be presented in a logical order.
The results of the move analysis in this study showed that 33% of the corpus included
Move 1, suggesting that it is not considered an obligatory move in the genre. Pinto (1994,
p.117) suggests that scientific discourse contains much implicit knowledge, ‘the already-
known ‘old’ information, accumulated throughout the centuries by humanity thanks to
documentary tradition’, and that, as an effect, the literature of scientific research chooses to
imply this ‘old’ knowledge and only state that knowledge which is ‘new’. It is possible that
when Move 1 is included, it is because the information is still relatively new and it is felt that
such information will be useful to the reader; alternatively, in two of the abstracts, ‘old’
information was given in order to situate the current research within a scientific paradigm and
to increase the perceived relevance of the research. There was a 50% incidence of move
embedding in Move 1; both instances occurred with Move 4 (results). Although Salager-
Meyer’s definition of a well-structured abstract prohibits hybrid moves, in these two
instances, they served to increase the relevance of the current research as they challenged
‘old’ information. The choice of the present tense in this move reflects that the information it
contains is beyond doubt and already ‘out there’ in scientific discourse. As Bhatia (1993, pp.
6-7) writes, choice of tense is not solely dependent on syntactic and semantic considerations,
but also involves rhetorical judgments.
Further, 67% of the corpus (eight of the twelve abstracts), a surprisingly low figure, contained
Move 2 - Purpose. This element is considered an essential component of the experimental-empirical
scientific research article, and it would be expected that this move would be included in every
abstract (Endres-Niggemeyer 1998, p.107). Move 2 appeared logically placed in the text, i.e.,
it either opened the abstract or immediately followed Move 1. However, there were a number
of anomalies. In one instance, Move 1 followed Move 3 (methodology) and Move 4 (results):
this resulted in conceptual scattering, which had an adverse effect on reading comprehension;
and secondly, in the same abstract, the author(s) employed the present tense and active voice,
placing the researchers as the syntactic subject: this is highly unusual in scientific discourse,
which usually seeks to remove the scientist from the scene and prefers to thematise the
methods, apparatus or results (Lemke 1990, p.130). A final point concerning this move: when
the object of the study was not thematised, the author showed a strong preference for ‘this’
(e.g., ‘in this report…’, ‘This study…’), suggesting that the author wanted to incorporate the
abstract into the body of the paper. However, it is revealing that this type of formulaic
introduction is more prevalent in the abstracts that were produced by CABI after 2001, that is,
when CABI adopted their policy of only accepting author-produced abstracts.
Move 3 - Methodology was present in all twelve abstracts, reflecting its importance
in the discourse of scientific research. In all occurrences, it constituted a minimum of 25% of
the text space. However, there was a high incidence of move embedding. It was observed
that these hybrid moves vary in a number of ways. When Move 3 does occur as a separate
move, the apparatus, data and procedures are presented as the syntactic subject. This might
be explained by the author’s desire to describe the methodology as objectively as possible,
thereby following the prescribed advice in writing science (Lemke 1990, p.130). Similarly,
apart from one instance, the authors chose to use the past tense and passive voice to express
the methodology; this not only serves to remove the human element from the experiment but
also acts as signifier for the onset of this move. As Santos (1996, p.492) notes, in Move 3
‘tense-voice correlation necessarily implies signalling’.
Again, the results suggest that Move 4 - Summarising the results is an obligatory
move in the abstract genre as it is present in all twelve of the abstracts under analysis. All but
one of the instances of Move 4 were in the past tense. A substantial number of Move 4
sentences were written in the passive voice, which is likely the cause of the thematisation of
the results in this section (Santos 1996, p.493). However, wherever the author used the active
voice, (s)he succeeded in achieving an objective, ‘scientific’ tone. In effect, the author
achieved a sense of the results carrying the information, whilst the researcher remained
detached, thereby following the norm for communicating science. A final point
concerning Move 4 is the frequent use of evaluative terms in this section. This may be for
two reasons: firstly, the author is attempting to position their research as relevant and
important and is competing for the time of a busy readership; and secondly, as the abstracts in
the corpus are either indicative or informative-indicative abstracts, the author uses
comparative lexical items as a way of implying measurements and quantitative data in a
format that does not usually prescribe including such data.
Move 5 - Discussing the research, which was realised either by a concluding statement or by
a concluding statement together with a statement of recommendation, was found in eight of
the twelve abstracts. This suggests that the eight abstracts that included this move were
informative-indicative abstracts whereas the remaining four abstracts were indicative
abstracts.
Thematic structure
Kaplan et al. (1994, p.406) suggest that the choices that the abstract authors make
regarding thematisation affect the persuasive quality of the abstract. Theme, as defined by
Rashidi, is the clause-level constituent that the author uses as the starting point of the message.
Rashidi (1992, p.192) observes that this constituent moves ‘the decoder [reader] towards the
core of the communication’. Thus, its importance as a persuasive tool is evident.
Furthermore, Brown and Yule (1983, p.99) propose that an analysis of theme is clarificatory
as it serves as a way to realise the structure of a text and what meaning the author wishes to
impart to that piece of information. They note that the ‘thematic organisation appears to be
exploited by speakers/writers to provide a structural framework for their discourse’.
Moreover, it appears that the choice of grammatical subject as a marker of theme reveals how
the writer seeks to position him/herself in their discourse community and against other
discourse communities.
The results of this study suggest that the authors of the abstracts in the corpus prefer
to position themselves and their research in two main ways: (1) there were 28 instances of
presenting the theme in the discourse domain, that is, focussing either on the processes of
reporting one’s research, referring to units of discourse, to discourse internal entities or to
units of discourse other than the writer’s own discourse entity. This suggests that the writer
presents his/her research as part of the discourse community and that this thematisation results
in the research being accepted into the wider discourse community; and (2) there were 45
instances of the writer thematising the grammatical subject as part of the ‘real’ world domain,
that is, that the research in question constitutes an answer to a relevant ‘real’ world research
problem. A third, but less usual, way in which the writers thematise their subject is to present
the theme in the participant domain (three instances). It is possible that this method is less
prevalent as it includes direct reference to either the researcher or to other named researchers
in the discourse community. Such writing is unusual in scientific communication, which
seeks to impose an objective, de-humanised tone to its research literature (Lemke 1990,
p.130).
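The relative weight of the three domains can be made explicit with a short tally. The sketch below is illustrative only: it assumes that the three counts reported above (28, 45 and 3) cover all of the themes classified, which the study does not state, and the names `theme_counts` and `domain_shares` are hypothetical.

```python
# Theme counts as reported in the text. Treating them as the full set
# of classified themes is an assumption made for this illustration.
theme_counts = {
    "discourse domain": 28,
    "'real' world domain": 45,
    "participant domain": 3,
}


def domain_shares(counts):
    """Return each domain's share of all counted themes, as a percentage."""
    total = sum(counts.values())
    return {domain: round(100 * n / total, 1) for domain, n in counts.items()}


shares = domain_shares(theme_counts)
for domain, share in shares.items():
    print(f"{domain}: {theme_counts[domain]} instances ({share}%)")
```

On these figures, the ‘real’ world domain accounts for roughly 59% of the themes, the discourse domain for roughly 37% and the participant domain for roughly 4%.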
Conclusions and recommendations
This study was an attempt to establish the structure of protozoological research article
abstracts through the definition of moves; to discover how the thematic structure of the
abstracts reflected the subject discourse; and last, to determine how closely abstracts follow
stipulated guidelines.
The abstracts in the corpus followed a five-move pattern: relation to other research,
purpose, methodology, results and discussion of the research. However, these five moves
were dependent on the type of abstract involved and the communicative function that it
serves. These findings correspond to the conclusions of a similar study on applied linguistics
abstracts undertaken by Santos (1996, p.496). For an abstract to be effective, Endres-
Niggemeyer (1998, pp. 57-59) suggests that it must mirror the macrostructure of the parent
document. However, in the corpus, only Moves 3 and 4 appear all the time, with the other
moves only being used to suit the communicative needs of the author. This provides evidence
that the move selection in the corpus falls short of an acceptable standard for abstracts. In
addition, there was some inconsistency in the corpus regarding semantic organisation of
moves, with a minority of abstracts showing conceptual scatter and consequently impeding
reading comprehension. Similarly, Salager-Meyer (1990, p.380) noted that conceptual
scattering and illogical ordering of moves revealed an inability to structure the semantic units
of the text in a way that enabled the reader to easily understand the meaning of the text. This
study also revealed considerable use of move embedding. Again, this substantiated the results
in Santos’ study. This approach allows the writer to organise information in a way that
meets the need to impart information succinctly and cohesively in a necessarily condensed
document.
Secondly, the thematic structure of the abstracts showed that the authors generally
succeed in reflecting the scientific discourse. The results demonstrated that the abstracts
thematise the grammatical subject by referring to the discourse domain and the ‘real’ world
domain, thereby positioning their research within the objectivised, detached world of modern
scientific communication.
Finally, in answer to the question regarding how closely abstracts follow stipulated
guidelines, we conclude that, in general, the abstracts succeed in this task. However, there
were a few discrepancies. Some moves were introduced by formulaic expressions, which
were evidence of lexical redundancy. Further, a small number of abstracts showed authorial
presence in the text via personal pronouns, which demonstrated subjectivity. Last, some of the
abstracts revealed conceptual scatter that impeded clarity of meaning.
Many of these inconsistencies were more evident in the abstracts published by CABI
after 2001, that is, abstracts that were written by the authors. Generally, the quality of
abstracts written after 2001 has not greatly diminished compared with those written by CABI’s own
professional abstractors before this date. However, it is notable that there was a higher
incidence of the briefer and less information-rich indicative abstract in the last three years.
Furthermore, the author-produced abstracts demonstrate a tendency towards the first person
pronoun and a greater propensity for thematising the grammatical subject of the sentence in
the participant domain. We suggest that the lack of formal training in abstract writing
increases the risk of subjectivity and verbosity and reduces clarity in scientific abstracts.
This study raises a number of issues that have relevance for both abstracting and
indexing services and abstract authors. Firstly, it is imperative that the abstract fulfils its
function as a type of condensed document representation and successfully represents the main
arguments of the parent document logically, coherently and briefly, so that the reader can
assess relevance and gain access quickly. Abstracting and indexing services must ensure that
they maintain abstract quality if they seek to reduce production costs by introducing policies
of accepting author abstracts only.
Secondly, abstracting is a complex and difficult task that requires sound knowledge
of the principles of summarising and defining macrostructures. As abstracting and indexing
services move towards accepting author produced abstracts, it is important that students are
taught some of the processes of constructing a well-formed and effective abstract, so that they
can successfully present, communicate and persuade others of the importance of their
research. We suspect that there is little formal training in abstract writing for science students
at present.
Further research in this subject is needed. One approach would be to analyse a larger
corpus. Other approaches would be an examination of the specific differences and similarities
between abstracts and the original documents that they represent; a linguistic analysis of how
meaning is realised in the abstract and how this reflects the discourse community of which it
is a part; and an exploration of the processes involved in summarising information,
specifically in regard to abstracts. Such studies would be able to assess, in close detail, how a
well-formed abstract is constructed and what constitutes such a document. It may also
provide the basis for future research on developing computer-generated abstracts as the link
between applied linguistics and artificial intelligence grows.
References
American National Standards Institute (ANSI), (1979), American National Standard for
writing abstracts (ANSI Z39.14-1979).
Anderson, J.R. (2000), Cognitive psychology and its implications, 5th ed., Worth Publishers,
New York.
Armstrong, C.J. and Wheatley, A. (1998), “Writing abstracts for online databases”, Program,
Vol. 32 No. 4, pp. 359-371.
Bhatia, V.K. (1993), Analysing genre: language use in professional settings, Longman,
London.
Blakeborough, L. and Oppenheim, C. (1980), “The readability and information content of new
law abstracts and old law abridgements”, J. Chartered Institute of Patent Agents, Vol. 10, pp.
86-92.
Borko, H. and Bernier, C.L. (1975), Abstracting concepts and methods, Academic Press,
London.
Brown, G. and Yule, G. (1983), Discourse analysis, Cambridge University Press, Cambridge.
Cremmins, E.T. (1996), The art of abstracting, 2nd ed., Information Resources Press,
Arlington.
Fox, C. and Hartley, J. (2003), “Abstracts, introductions and discussions: how far do they differ
in style?”, Scientometrics, Vol. 57 No. 7, pp. 389-398.
Halliday, M.A.K. and Martin, J.R. (1993), Writing science: literacy and discursive power,
Falmer Press, London.
Hartley, J. (1994), “Three ways to improve the clarity of journal abstracts”, British Journal of
Educational Psychology, Vol. 64 No. 1, pp. 331-343.
Hartley, J. and Sydes, M. (1995), Structured abstracts in the social sciences: presentation,
readability, recall, BLR&DD Report 6211.
Hartley, J., Sydes, M. and Burton, A. (1996), “Obtaining information accurately and quickly:
are structured abstracts more efficient?”, Journal of Information Science, Vol. 22 No. 5,
pp. 349-356.
Junker, K. (1999), “Law and science serving one master…narrative”, in Scanlon, E., Hill, R.
and Junker, K. (Eds.), Communicating science: professional contexts: reader, Open
University Press, Buckingham, p.253.
Kaplan, R.B. et al. (1994), “On abstract writing”, Text, Vol. 14 No. 3, pp. 401-426.
King, R. (1976), “A comparison of the readability of abstracts with their source documents”,
Journal of the American Society of Information Science, Vol. 27 No. 2, pp. 118-121.
Lancaster, F.W. (2003), Indexing and abstracting in theory and practice, 3rd ed., Facet,
London.
Lemke, J.L. (1990), Talking Science: language, learning and values, Ablex Publishing
Corporation, New Jersey.
Maizell, R.E., Smith, J.F. and Singer, T.E.R. (1971), Abstracting scientific and technical
literature: an introductory guide and text for scientists, abstractors and management,
Wiley-Interscience, London.
Montgomery, S. (1999), “Scientific discourse and its history: reflections and prospects”, in
Scanlon, E., Hill, R. and Junker, K. (Eds.), Communicating science: professional
contexts: reader, Open University Press, Buckingham, p.32.
Pinto, M. (1994), “Interdisciplinary approaches to the concept and practice of written text
documentary content analysis (WTDCA)”, Journal of Documentation, Vol. 50 No. 2,
pp. 111-133.
Pinto, M. and Lancaster, F.W. (1999), “Abstracts and abstracting in knowledge discovery”,
Library Trends, Vol. 48 No. 1, pp. 234-248.
Pitkin, R.M., Branagan, M.A. and Burmeister, L.F. (1999), “Accuracy of data in abstracts of
published research articles”, Journal of the American Medical Association (JAMA),
Vol. 281 No. 12, pp. 1110-1111.
Rashidi, L.S. (1992), “Toward an understanding of the notion of Theme: an example from
Dari”, in Davies, M. and Ravelli, L. (Eds.), Advances in systemic linguistics: recent
theory and practice, Pinter, New York, p.192.
Rowley, J.E. (1988), Abstracting and indexing, 2nd ed., Bingley, London.
Salager-Meyer, F. (1990), “Discoursal flaws in medical English abstracts: a genre analysis per
research- and text- type”, Text, Vol. 10 No. 4, pp. 365-384.
Santos, M.B.d. (1996), “The textual organisation of research paper abstracts in applied
linguistics”, Text, Vol. 16 No. 4, pp. 481-499.
Snizek, W.E., Oehler, K. and Mullins, N.C. (1991), “Textual and non-textual characteristics of
scientific papers”, Scientometrics, Vol. 20, Part 1, pp. 23-35.
Sternberg, R.J. (2003), Cognitive psychology, 3rd ed., Wadsworth, London.
Swales, J.M. (1990), Genre analysis: English in academic and research settings, Cambridge
University Press, Cambridge.
Tibbo, H. (1993), Abstracting, information retrieval and the humanities: providing access to
historical literature, American Library Association, London.
Wheatley, A. and Armstrong, C.J. (1997), “Metadata, recall and abstracts”, Aslib Proceedings,
Vol. 49, Part 8, pp. 206-213.