Book Form
Abstract- The purpose of this paper is to explain the transformation and migration process from Libsys to Koha, an open source library management software. Open source is a development methodology that offers practical access to a product's source code. Koha, being open source software, is cost effective (freely available) and customizable to one's requirements, unlike Libsys. Free/open source Koha is therefore an economical alternative to reliance on the commercially supplied Libsys. To migrate from Libsys to Koha, the source data is transformed into the target format. The paper discusses the steps taken to accomplish this task and the benefits of using Koha over Libsys.

Keywords- open source, library management, linux, marcedit, mysql, transformation, migration, Z39.50 protocol, marc21, libsys, Koha

1. INTRODUCTION

Data migration is an emerging field because, with the advancement in technology, the need grows to exploit newer technologies instead of older ones. Newer systems contain advanced features compared to existing systems. Hence migration from an existing system to a new one is the need of the hour. Data migration is the process of transferring data from one system to another and is divided into two processes: (a) extracting data from an existing system into an extract file and (b) loading data from the extract file into the new application. The new application usually requires data in a different format, hence transformation of data is required for successful migration. Data transformation is the process of converting data from one format to another and is a mandatory step in data migration, as the architecture of the target system may differ from that of the source system [1]. In this paper, we discuss the transformation and migration process from LIBSYS to KOHA. LIBSYS is a proprietary software product that aims at the most convenient and pleasing library experience through its value-added features [12]. KOHA, on the other hand, is an open source library management software. The use of OSS, i.e. open source software, is becoming very popular nowadays in digital libraries across the globe. According to a survey, satisfaction ratings of Koha's performance on some aspects were found to be "good" and value for money. The use of OSS has tremendously lowered the initial cost of setting up libraries and improves flexibility in the delivery of services to a great extent. This is why a number of researchers and librarians are interested in, and continuously working on, the implementation of OSS in digital libraries [2].

2. WHAT IS KOHA?

KOHA is the world's first free and open source library management software and is being implemented in digital libraries. By open source software we mean that the source code of the software is freely available and can be modified, customized or redistributed according to a person's requirements. With the enhancement in technology, the need arises for a compliant replacement of the existing library system that gives the user the ability to receive free software and to customize and redistribute it for the benefit of the whole community. The library system should also be advanced enough to meet present needs. So, in the year 1999, Katipo Communications proposed a new system, KOHA (the Maori word for "gift" or "donation"), which was the first open source Integrated Library Automation Package (ILAP) built using open-source tools to be released under the General Public Licence (GPL). It was installed at Horowhenua Library Trust (HLT) in New Zealand in the year 2000.

2.1 Technical Features:

The current version is Koha-3.22. It runs on different platforms, including Linux, Mac OS X, FreeBSD, Solaris, and Windows [3].
Developed on the Linux OS, Koha is written in Perl, uses the Apache web server, and supports multiple RDBMSs such as MySQL and PostgreSQL [3].
The Online Public Access Catalog (OPAC) interface is in CSS with XHTML. It supports all major library standards such as MARC record import/export (MARC 21), Z39.50 and SRU/W.
Records are stored internally in an SGML-like format and can be retrieved in MARCXML, Dublin Core, OAI-DC, and EndNote; the OPAC can also be used by citation tools such as Zotero [3].

2.2 Key Features:

Full-featured ILS: Koha is a true enterprise-class ILS with comprehensive functionality, including basic and advanced features, and the software can be customized to a person's requirements. Koha works for consortia of all sizes, multi-branch, and single-branch libraries.
Multilingual and translatable: Koha is available in a large number of languages, and its translations continue to be enhanced.
Full text searching: Koha supports powerful searching and an enhanced catalogue display that can fetch data from Amazon, Google, etc. It uses the Zebra search engine, with a Z39.50 server and client, to enhance searchability and data interchange and to import data from the Library of Congress.
Web-based Interfaces: KOHA's OPAC is based entirely on worldwide web technologies (XHTML, CSS, JavaScript, etc.), making it platform independent.
User Management: Koha manages users by providing integration with systems like Lightweight Directory Access Protocol (LDAP), RADIUS and Central Authentication Service (CAS) to allow single sign-on.

2.3 Koha Modules:

Koha includes various modules that support its users and extend its functionality. It includes:

ACQUISITION: Koha's acquisition module holds suggestions, budgets, invoices, funds and currencies.
ADMINISTRATION: An exclusive module of Koha that enables users to change global system preferences and other parameters to provide better customizability.
CIRCULATION: Koha includes a fully featured circulation module with circulation rules that are customizable to meet the needs of the user. It covers checking books in and out and also provides an offline circulation feature.
CATALOGING: Koha provides cataloguing features that enable users to search migrated data for both books and serials, amend existing records, add a new record in any framework (default or user-created) and fetch records from external sources if required.

FOLLOWED:
[Figure: labels SEARCH CATALOG, MYSQL Database, MARCEDIT TOOL]
4. WHY KOHA?
S.NO. | CHARACTERISTICS | LIBSYS | KOHA
1. | Nature of developing organization | Commercial | Open source, i.e. free of cost
2. | Ownership | Libsys | Katipo Communications
3. | License | Commercial | Under GPL (General Public License)
4. | Price | In lacs | Freely available and free support
5. | Customization | Libsys charges users to provide customized solutions [12] | Source code is freely available for innovation to provide new features at the user's end. New versions are added freely.
6. | Training manual | No system manual is provided to users except the user manual to get AMC [4] | Yes, the manual includes everything for user convenience [4]
7. | Database | Software can be used with SQL Server, ORACLE or MYSQL as a backend RDBMS with ODBC compatibility [4] | MYSQL dual database design (text based and RDBMS). Scalable enough to meet the transaction load of a library [4]
8. | Support | Costly, on the basis of an AMC (annual maintenance contract), usually 10 to 20% of total costs [4] | Online support and discussion forums free of cost. No human ware for this purpose. Open and constant dialogue with developers [4]
9. | Vendor lock-in | Restrictions: support can be requested only from a particular vendor | No restrictions, no set-term contracts on changing support
10. | Addition of new features/new version | Extra cost is charged to upgrade to a new version or add new features [4] | New versions arrive very frequently and are added for free [4]
11. | Web Server | Only Apache and IIS | Apache, IIS and others [4]
5. DATA TRANSFORMATION

The transformation of data is a necessary step in data migration, as the target system may have a different architecture from the source system. It includes data collection, combination, filtration, reformatting and so on. It is necessary to find an efficient and effective method for this so as to improve the quality of the data. One of the solutions we have used for the transformation of data is as follows:

Also, the data contains various blank lines and unwanted content after every record; one record may be split across several lines, and a record may be repeated. So it is required to remove all such flaws, including duplication, and consolidate the data into the desired format. The following snapshot gives a clearer view of the source data:
File 1:

The following figure explains the procedure:
[Figure: flowchart; recoverable boxes: "Received text file", "Remove unwanted stuff, blank lines; bring record in [...]"]

[...] we may have multiple fields in the same column. So we can use MS-EXCEL functionality as well as VBA code for processing our data.

b.
At first we will use "Text to Columns" in [...]

c.
[...] all errors are removed. Then select the first column and, using the fixed-width option of the Text to Columns functionality, separate the accession number and title into different fields. All blank lines must also be removed, so we created another macro for this task:

Algorithm for removing blanks
Step 1, Step 2, Step 3 and Step 5 are the same as in the above algorithm
Step 4: Repeat the steps until iRow = LastRow
    4.1 If the cells in the 1st, 2nd, 3rd, 4th and 5th columns of the iRowth row are all blank, then delete the iRowth row
    4.2 If the cell in the 1st column of the iRowth row is not blank but the cells in the 2nd, 3rd, 4th and 5th columns are blank, then delete the iRowth row
    4.3 iRow = iRow + 1

d.
All blank lines are now removed. Some titles are divided across multiple lines, so they must be brought into a single line. For this we have created another macro.

Algorithm for merging multi row records
Step 1, Step 2, Step 3 and Step 5 are the same as in the above algorithm
Step 4: Repeat the steps until iRow = LastRow
    4.1 If the 1st column of the (iRow + 1)th row is blank, then the data in the 2nd column of the iRowth row and the 2nd column of the (iRow + 1)th row is merged into the 2nd column of the iRowth row, and so on for the 3rd column
    4.2 If the 1st column of the (iRow + 1)th row is blank, then delete the (iRow + 1)th row
    4.3 iRow = iRow + 1

Now we have all the titles in one line. The data also contains repeated records: if a title is repeated in the next row then, instead of writing the title again, " is written in the next row to signify that the title repeats. To solve this, we created another macro:

Algorithm for same records
Step 4: Repeat the steps until iRow = LastRow
    4.1 If the cell in the 2nd column of the iRowth row contains double quotes as a symbol of repetition, then the data in the 2nd column of the (iRow - 1)th row replaces the data in the 2nd column of the iRowth row
    4.2 iRow = iRow + 1

f.
Some of the data is now sorted, but a column with multiple fields separated by delimiters is not yet split. In this data, 'year' is separated by a comma (,), 'publisher' by a colon (:) and place by (--). So it is required to create a macro for separating them.

Algorithm for using delimiter to separate using macro
Step 1: Start
Step 2: Declare variables iRow, LastRow, pos, str, le.
Step 3: Initialize variables
    iRow = 1
    LastRow = ActiveSheet.UsedRange.Rows.Count
Step 4: Repeat the steps until iRow = LastRow
    4.1 str = data in the cell of the iRowth row and 2nd column
        le = length of str
        pos = position of the first comma found searching str from right to left
    4.2 If pos = 0, then
            the cell in the 3rd column of the iRowth row is blank
            the cell in the 4th column of the iRowth row is the string str
        Else
            the cell in the 3rd column of the iRowth row is the right part after the comma
            the cell in the 4th column of the iRowth row is the left part before the comma
    4.3 iRow = iRow + 1
Step 5: Stop
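The clean-up algorithms of this section can be sketched compactly in Python. This is only an illustration under stated assumptions, not the VBA macros used in the paper: rows are modelled as lists of five strings (column 1 the accession number, column 2 the title), the sample rows are hypothetical, and all columns of a continuation row are merged rather than only the 2nd and 3rd.

```python
# Illustrative sketch only; the paper's macros are written in VBA for MS-EXCEL.
# Each row is a list of five strings. The sample data below is hypothetical.

DITTO = '"'  # mark written in place of a repeated title


def remove_blank_rows(rows):
    """Drop rows that are entirely blank or hold data only in column 1."""
    return [r for r in rows if any(cell.strip() for cell in r[1:])]


def merge_continuation_rows(rows):
    """Fold a row whose column 1 is blank into the record above it."""
    merged = []
    for r in rows:
        if merged and not r[0].strip():
            prev = merged[-1]
            for col in range(1, len(prev)):
                parts = (prev[col].strip(), r[col].strip())
                prev[col] = " ".join(p for p in parts if p)
        else:
            merged.append(list(r))
    return merged


def resolve_ditto_marks(rows):
    """Replace a ditto mark in column 2 with the title of the row above."""
    out = []
    for r in rows:
        r = list(r)
        if out and r[1].strip() == DITTO:
            r[1] = out[-1][1]
        out.append(r)
    return out


def split_last(text, sep):
    """Split text at the last occurrence of sep -> (left part, right part).

    If sep is absent, the whole text is the left part and the right is blank.
    """
    pos = text.rfind(sep)
    if pos == -1:
        return text.strip(), ""
    return text[:pos].strip(), text[pos + len(sep):].strip()


rows = [
    ["A001", "Data structures", "", "", ""],
    ["", "", "", "", ""],                 # blank line
    ["", "and algorithms", "", "", ""],   # continuation of the title above
    ["A002", '"', "", "", ""],            # same title as the previous record
    ["page 1", "", "", "", ""],           # unwanted content after a record
]
cleaned = resolve_ditto_marks(merge_continuation_rows(remove_blank_rows(rows)))
rest, year = split_last("Some Place : Some Publisher, 2004", ",")
```

The same `split_last` helper could be applied again with ":" and "--" to peel off the publisher and the place, mirroring the delimiter step described above.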
6. DATA MAPPING
The fields in the final Excel sheet are mapped to MARC tags. Before moving ahead, let us explain what MARC is and why it is required.
Fig-10: Mapping with MARC tags
Fig-12: Select MARC Tools
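As a rough illustration of what such a field-to-tag mapping looks like, the sketch below groups the values of one spreadsheet row by MARC tag and subfield. The column names are hypothetical, and the tag choices are assumptions based on common MARC 21 usage (245 for the title statement, 260 for publication details, and 952 for Koha item data with $p as the barcode); the mapping actually used in the paper is the one shown in Fig-10.

```python
# Hypothetical column names; tag/subfield choices are assumptions based on
# common MARC 21 usage, not the paper's actual mapping.
FIELD_TO_MARC = {
    "title": ("245", "a"),
    "place": ("260", "a"),
    "publisher": ("260", "b"),
    "year": ("260", "c"),
    "accession_no": ("952", "p"),  # assumed: accession number used as barcode
}


def row_to_marc_fields(row):
    """Group a row's non-empty values by tag -> {tag: [(subfield, value), ...]}."""
    fields = {}
    for name, (tag, code) in FIELD_TO_MARC.items():
        value = row.get(name, "").strip()
        if value:
            fields.setdefault(tag, []).append((code, value))
    return fields


sample_row = {
    "accession_no": "A001",
    "title": "Data structures and algorithms",
    "place": "Some Place",
    "publisher": "Some Publisher",
    "year": "2004",
}
marc_fields = row_to_marc_fields(sample_row)
```

Keeping the mapping in a single table makes it easy to review against the MARC 21 documentation before any records are generated.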
7. DATA MIGRATION
Data migration is the process of transferring data from
one system to another. It is an important step and is a
Go to KOHA Home > Tools > Stage MARC Records for Import
Browse and upload the .mrc file created
mysql> SELECT * FROM items;
{This table holds all information about the items migrated to Koha}

Referential integrity is maintained in the following way:
The Koha database uses various tables that are connected to each other via primary key/foreign key relationships, hence fulfilling referential integrity. The following figure shows referential integrity among 3 tables: Biblio, Biblioitems and Items [10].

Fig-17: Referential Integrity

At the data entry level, problems of misspellings, redundancy and contradictory values are resolved in the data transformation process itself (refer Fig. 3 and Fig. 7).

Hence the correctness and effectiveness of the transformation and migration process have been validated, and data quality is thereby ensured in Koha.

9. CONCLUSION

With the advent of new technology and the growth of information technology, it becomes necessary to migrate data from a legacy system to a new one. Migration cannot be dismissed as a simple step. It is a complex process that comprises various phases, which makes [...]

Special thanks and appreciation go to Sanjay Burde, Senior Principal Scientist, Charu Verma, Principal Scientist, and Salim Ansari, Senior Technical Officer, for their tremendous support.

11. REFERENCES
[1] Cheong Youn and Cyril S. Ku, Bell Communications Research, "Data Migration", Piscataway, NJ 08855-1379, p. 1255, 1992.
[2] Dr. Sanjay Kataria, Mohit Sharma and Anshul Pachouri, "Integrating Open Source Knowledge Management Tools into Library Management for Automation: A case study of Jaypee Institute of Information Technology University", Noida, India, p. 317, 2010.
[3] K.T. Anuradha, R. Sivakaminathan and P. Arun Kumar, "Open-source tools for enhancing full-text searching of OPACs - Use of Koha, Greenstone and Fedora", Bangalore, India, p. 233, 2011.
[4] Shivpal Singh Kushwah, J. N. Gautam and Ritu Singh, "Library Automation and Open Source Solutions Major Shifts & Practices: A Comparative Case Study of Library Automation Systems in India", India, p. 148, 2008.
[5] Zahiruddin Khurshid, "From MARC to MARC 21 and beyond: some reflections on MARC and the Arabic language", Dhahran, Saudi Arabia, p. 370, 2002.
[6] Dhrubajit Das, "MARC 21: The Standard Exchange Format for the 21st Century", Ahmedabad, India, p. 154, 2004.
[7] Branko Milosavljevic, Danijela Boberić and Dušan Surla, "Retrieval of bibliographic records using Apache Lucene", Novi Sad, Serbia, p. 526, 2009.
[8] Ikhlas Fuad Zamzami, Hanan Abdullah A. Fatani and Nuha Abdullah H. Zammarah, "Data Migration Challenges: The Impact of Data Quality", Kuala Lumpur, Malaysia, p. 1.
[9] http://www.loc.gov/marc/bibliographic/
[10] http://schema.koha-community.org
[11] http://manual.koha-community.org/
[12] http://www.libsys.co.in/
[13] https://support.office.com/
12. BIOGRAPHIES
Principal Scientist & Principal Investigator, CSIR Knowledge Gateway Project at CSIR-National Institute of Science Communication and Information Resources, New Delhi
E-mail: [email protected]

Senior Project Fellow, CSIR Knowledge Gateway Project at CSIR-National Institute of Science Communication & Information Resources, New Delhi
E-mail: [email protected]