Well-established industrial IT solutions often do not suit the needs of researchers.
Industrial technologies for data management and processing, perfectly tailored to serve businesses or banks, often appear weak and clumsy in scientific applications.
For example, in genomics a single analytical job typically requires retrieving and storing hundreds of millions of objects, such as short DNA sequences.
At such volumes, the transaction-processing overhead inherent in industrial SQL DBMSs becomes unbearable.
SciDM Group provides an alternative: the SciDM Database Management System, a zero-maintenance object-relational NoSQL database engine optimized for large aggregate transactions.
On typical bioinformatics tasks like sequence assembly and clustering, the SciDM Data Manager runs thousands of times faster than conventional database engines.
It also provides a rich and flexible data model, native mapping to an array of programming languages, and very efficient over-the-network access.
For full details, please see the SciDM Whitepaper [PDF].
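The difference between per-object transactions and one large aggregate transaction, mentioned above, can be sketched with a generic SQL engine. The example below uses SQLite from the Python standard library purely as a stand-in; it is not SciDM and says nothing about SciDM's own interface. The table name `reads` and the helper functions are hypothetical names chosen for illustration.

```python
import sqlite3


def new_db():
    """Create an in-memory SQLite database with one illustrative table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reads (id INTEGER PRIMARY KEY, seq TEXT)")
    return conn


def store_per_object(conn, sequences):
    """Commit after every insert: one transaction per stored object."""
    for seq in sequences:
        conn.execute("INSERT INTO reads (seq) VALUES (?)", (seq,))
        conn.commit()


def store_aggregate(conn, sequences):
    """Insert the whole batch inside a single transaction."""
    with conn:  # commits once on success, rolls back on error
        conn.executemany(
            "INSERT INTO reads (seq) VALUES (?)",
            ((seq,) for seq in sequences),
        )


def count_reads(conn):
    """Return the number of stored sequences."""
    return conn.execute("SELECT COUNT(*) FROM reads").fetchone()[0]


if __name__ == "__main__":
    sequences = ["ACGT", "TTGA", "GGCC"]
    per_object_db, aggregate_db = new_db(), new_db()
    store_per_object(per_object_db, sequences)
    store_aggregate(aggregate_db, sequences)
    print(count_reads(per_object_db), count_reads(aggregate_db))
```

Both functions store the same data; the first pays the commit cost once per object, the second once per batch. With hundreds of millions of objects, that per-object cost is the overhead the text refers to.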
Scientific Data Manager (SciDM) is a DBMS designed for applications that need to:
- operate quickly on heavy data sets over a network
- run on standard, inexpensive hardware, free of maintenance
- work through a light and easy-to-use programming interface
- use multiple platforms and various programming languages.
SciDM provides:
- Unmatched performance:
  - retrieving 1,000,000 data objects in 0.9 seconds
  - updating 1,000,000 data objects in 2.1 seconds
  - creating 1,000,000 data objects in 37 seconds
  - deleting 1,000,000 data objects in 21.5 seconds
  - processing 1,000,000 select queries in 5.2 seconds
  - handling 10,000 client sessions simultaneously
  All of this over a network, running on standard hardware, with 540 GB of data pre-loaded.
- Strong reliability: All features are tested in real-world
applications that operate intensively on huge volumes of data.
These applications include genomic data management, literature
databases, document management and bug tracking.
- Truly unlimited storage – limited by hardware only.
A SciDM-managed database of 12.5 terabytes exists. The storage is
theoretically limited to 2^48 objects; each object can
contain up to 2 billion data attributes; each data attribute can
contain up to 2^63 bytes.
- Rich data management capabilities at unprecedented speed. SciDM
uses an object-relational model for data representation and provides
methods for storing, retrieving, and deleting data objects, for
sequential and indexed access, and for dynamic data structure
discovery. Along with traditional indexing by entire attribute
contents, SciDM provides ‘context’-style indices on the
individual words in stored texts.
- Structured framework for data processing formalized
in terms of a particular application field. This makes SciDM highly
suitable for research and prototyping. It also simplifies
development by removing the traditional data translation layer
between application and DBMS, and by bringing the structured data
directly into the processing modules.
- Interoperability over various platforms, operating systems
and programming languages.
- Security model with object-level protection. The access
rights are controlled individually for every object.
- Other features of SciDM include a flexible object
structure, allowing dynamic addition of new attributes to
existing objects; integrated set management, allowing both
persistence and server-side operations for arbitrary object sets; and data
integrity support through locking and object subordination.
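The feature list above distinguishes indexing by entire attribute contents from ‘context’-style indexing by individual words. A minimal sketch of that distinction follows, written as a generic in-memory inverted index. The source does not describe how SciDM implements its indices, so the class `AttributeIndex`, its methods, and the tokenization rule are all assumptions made for illustration only.

```python
import re
from collections import defaultdict


def words(text):
    """Split text into lowercase alphanumeric words (an assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class AttributeIndex:
    """Hypothetical index over one text attribute of stored objects."""

    def __init__(self):
        self.by_value = defaultdict(set)  # whole attribute contents -> object ids
        self.by_word = defaultdict(set)   # individual word -> object ids

    def add(self, object_id, text):
        """Index an object under its full text and under each distinct word."""
        self.by_value[text].add(object_id)
        for word in set(words(text)):
            self.by_word[word].add(object_id)

    def find_exact(self, text):
        """Traditional lookup: the attribute must equal `text` exactly."""
        return set(self.by_value.get(text, ()))

    def find_word(self, word):
        """Word-level lookup: the attribute must contain `word`."""
        return set(self.by_word.get(word.lower(), ()))


if __name__ == "__main__":
    index = AttributeIndex()
    index.add(1, "Sequence assembly of short reads")
    index.add(2, "Clustering of short DNA sequences")
    print(index.find_word("short"))
    print(index.find_exact("short"))
```

A whole-value index answers only exact matches, while the word-level index finds every object whose text mentions the word, which is what makes it useful for literature and document databases.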
We are presently starting a company, SciDM Co., to commercialize the SciDM database engine.
Please see scidm.com for details.