Oleksii V. Samoilenko, Cand. Sc. (Eng.), Assoc. Prof.
National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute"
Technical Problems of Academic Plagiarism Identifying in Higher Engineering Education
No one doubts that academic plagiarism is a shameful phenomenon that must be fought uncompromisingly.
Academic plagiarism is less common in the technical sciences and higher engineering education. This is primarily due to two reasons:
the high labour intensity of the work;
the high visibility of the academic results obtained.
The problem of detecting academic plagiarism in large bodies of information arises more and more frequently, so it is logical to use computer technology for this purpose. A great deal of software (including free software) is now available to search for borrowings both on the Internet and on a local computer.
However, work in technical science and engineering education has features that significantly limit automated processing or even make it impossible:
a large amount of graphic information that determines the essence of the work;
the presence of mathematical and chemical formulas, graphs, etc. in the text;
limited lexical variety due to the standardization of terminology;
the presence of formally identical fragments in different places.
For example, a bachelor's degree project contains, on average, 8 blueprints in A1 format and an explanatory note of about 70 pages. The blueprints are the main part of the project.
However, software tests conducted by the author have shown that anti-plagiarism programs work very poorly with graphic information. Even the search for matches in full-colour images is handled poorly, and processing monochrome images causes further problems.
For example, a program does not distinguish between a monochrome image of text and a monochrome sketch. Nor does it distinguish significant elements of a blueprint from insignificant ones (the frame and the title block).
A further difficulty is the abundance of vector graphics formats in which blueprints can be stored.
Mathematical and chemical formulas are very poorly recognized by anti-plagiarism software, especially because the same mathematical formula can be written in several ways (for example, with its terms permuted). Anti-plagiarism software should therefore recognize not only the appearance but also the essence of mathematical and chemical formulas. This capability has not yet been observed.
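
The point about permuted terms can be illustrated with a minimal Python sketch. It is not a method from the tests described above; the function name `canonical_sum` is hypothetical, and the sketch assumes formulas limited to sums and differences of products written with `+`, `-` and `*` (no parentheses, powers, division or exponent notation). Under those assumptions it sorts factors and terms so that two differently ordered notations of the same sum compare equal.

```python
import re


def canonical_sum(formula: str) -> str:
    """Return an order-independent form of a simple sum of products.

    Assumption: the formula uses only '+', '-' and '*' with plain
    operands; parentheses, powers and division are not supported.
    """
    text = re.sub(r"\s+", "", formula)  # drop all whitespace
    if not text:
        return ""
    if text[0] not in "+-":
        text = "+" + text  # make the leading sign explicit
    # Split into signed terms such as "+a*b" or "-c".
    terms = re.findall(r"[+-][^+-]+", text)
    normalized = []
    for term in terms:
        sign, body = term[0], term[1:]
        factors = sorted(body.split("*"))  # order of factors is irrelevant
        normalized.append(sign + "*".join(factors))
    # Order of terms is irrelevant as well.
    return "".join(sorted(normalized))


# Two notations of the same sum map to the same canonical string.
same = canonical_sum("a*b + c") == canonical_sum("c + b*a")
# A genuinely different expression does not.
different = canonical_sum("a - b") != canonical_sum("b - a")
```

A comparison of this kind only covers the simplest case; recognizing equivalence of arbitrary formulas would require a computer algebra system rather than string normalization.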
The language used in technical science and engineering education is highly formalized. For example, a search engine finds, on the first attempt, 23 Ukrainian state standards (which are mandatory for use) that regulate terminology in the engineering industry, and this list is not complete.
In addition, the explanatory notes of several diploma projects can contain the same elements. These can be, for example, a description of the basic equipment (the same for several students who share a common object of modernization) or typical calculations (which are also often standardized).
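
The effect of such legitimately shared fragments on a similarity score can be sketched in Python. This is an illustration under assumptions, not the algorithm of any particular anti-plagiarism program: it compares documents by word 3-grams ("shingles") with the Jaccard measure, and the optional `common` argument (a hypothetical parameter holding text that several students are allowed to share) is excluded before comparison.

```python
from __future__ import annotations


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams of the text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(doc_a: str, doc_b: str, common: str = "", size: int = 3) -> float:
    """Jaccard similarity of two documents, ignoring permitted shared text."""
    ignored = shingles(common, size)  # shingles of the permitted fragment
    a = shingles(doc_a, size) - ignored
    b = shingles(doc_b, size) - ignored
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical example: two notes share one equipment description.
shared = "the press has a nominal force of ten kilonewtons"
note_a = shared + " my own design uses a hydraulic drive"
note_b = shared + " the proposed modernization replaces the gearbox entirely"

raw_score = similarity(note_a, note_b)                        # above zero
adjusted_score = similarity(note_a, note_b, common=shared)    # 0.0
```

Without the exclusion the two notes look partly copied from each other; with it, only the students' own text is compared.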
The citation problem is solved satisfactorily mainly for modern publications. Publications from the countries of the former USSR, however, are not yet sufficiently represented in open access.
Citation checking also faces the following difficulties:
incomplete bibliographic descriptions of information sources, or even typographical or grammatical errors in them;
placement of the list of sources in a separate file;
the variety of ways in which information sources are specified and located in the text of the document under examination.
One solution to this problem could be the future establishment of a single identifier for scientific publications and technical documents.
In the short term there are two possible solutions:
development of special high-performance anti-plagiarism software;
adaptation of the work being tested to the analytical capabilities of existing anti-plagiarism software.
Obviously, the second approach is preferable.
The author suggests the following ways of adapting the work under examination:
conversion of graphic information from vector to raster form;
presentation of textual information in the simplest possible form.
The optimal choice is the HTML markup language with attached CSS style sheets; mathematical formulas can be written in LaTeX.
Advantages of the HTML/CSS format:
complete document formatting;
ease of learning;
a large number of free editors, including WYSIWYG editors;
openness and the lack of encryption;
placement of information in appropriate containers (tags).
Requirements for preparing a document in the HTML/CSS format:
strict unification of references to information sources;
placement of quotations and other borrowings in special tags;
minimal formatting.
However, anti-plagiarism software would need to be upgraded to use the HTML/CSS format.
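
How software might take advantage of quotations placed in special tags can be sketched with Python's standard `html.parser` module. The text does not name the tags, so the choice of the standard HTML elements `q` and `blockquote` is an assumption, and `QuoteSplitter` and `split_quotes` are hypothetical names. The sketch separates the author's own text from marked quotations so that only the former would be checked for borrowings.

```python
from html.parser import HTMLParser

# Assumption: quotations are marked with these standard HTML elements.
QUOTE_TAGS = {"q", "blockquote"}


class QuoteSplitter(HTMLParser):
    """Collect text inside quotation tags separately from all other text."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting level of quotation tags
        self.own = []     # text written by the author
        self.quoted = []  # text marked as quotation

    def handle_starttag(self, tag, attrs):
        if tag in QUOTE_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in QUOTE_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            (self.quoted if self.depth else self.own).append(text)


def split_quotes(html: str):
    """Return (own_text, quoted_text) extracted from an HTML fragment."""
    parser = QuoteSplitter()
    parser.feed(html)
    parser.close()
    return " ".join(parser.own), " ".join(parser.quoted)


# Hypothetical fragment of an explanatory note.
fragment = (
    '<p>The shaft is loaded in torsion. '
    '<q cite="src1">Torque is the product of force and arm.</q> '
    'Therefore the diameter is increased.</p>'
)
own_text, quoted_text = split_quotes(fragment)
```

Only `own_text` would then be passed to the borrowing search, while `quoted_text` could be checked against the cited source.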
The problems described can lead to false positive results. Supervisory authorities should therefore not make decisions solely on the basis of the output of anti-plagiarism software. Each case must be considered individually. Any doubts should be interpreted in favour of the author.