Data Management
Data management plays a very important role in research. As data are collected during a study, researchers need to keep them confidential, organize them so they are easy to find and analyze, and protect them from tampering or loss. Meeting these needs calls for a custom database system designed to the highest standards of quality, reliability, consistency, and security, which involves:

  • Design of relational databases and data entry/querying applications
  • Design of data collection and case report forms
  • Variable coding and embedded quality control
  • Data entry and integrity reporting
  • Programming of customized SQL queries and reports
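As a minimal sketch of variable coding, embedded quality control, and a customized SQL report, the example below uses Python's built-in sqlite3 module. The table, column names, and codes are hypothetical and are not taken from any Bioinformatics RIG system.

```python
import sqlite3

# Hypothetical case report table: names and codes are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE case_report (
        subject_id TEXT PRIMARY KEY,
        sex        INTEGER NOT NULL CHECK (sex IN (1, 2)),  -- 1 = male, 2 = female
        age_years  INTEGER NOT NULL CHECK (age_years BETWEEN 0 AND 120)
    )
""")

def enter_records(rows):
    """Insert rows one by one; return the rows rejected by the constraints."""
    rejected = []
    for row in rows:
        try:
            conn.execute("INSERT INTO case_report VALUES (?, ?, ?)", row)
        except sqlite3.IntegrityError as err:
            rejected.append((row, str(err)))
    conn.commit()
    return rejected

rejected = enter_records([
    ("S001", 1, 34),
    ("S002", 3, 28),   # invalid sex code
    ("S003", 2, 150),  # age out of range
    ("S001", 2, 41),   # duplicate subject ID
])

# A customized SQL report: number of accepted records per coded sex value.
report = conn.execute(
    "SELECT sex, COUNT(*) FROM case_report GROUP BY sex ORDER BY sex"
).fetchall()
```

Putting the checks in the schema means every data entry application that writes to the table is held to the same rules, and the list of rejected rows can feed an integrity report.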

Bioinformatics RIG has developed a research data management system for the Indonesian Center for Agricultural Biotechnology & Genetic Resources, called the Germplasm Genetic Variation Agriculture Database, which was configured and populated with genome-wide rice data, including 1536-SNP and 384-SNP sets.

Software Engineering

Software engineering is the study of the management activities, technical methods, tools, and processes used to develop tailored software and web applications that meet the needs of a research project. It is divided into sub-disciplines, including software requirements, software design, software construction, software testing, software deployment, and software maintenance.

The purpose of software engineering is to ensure that a project is finished on time and on budget. The waterfall and agile models are the most popular software development methods. Both share common activities such as system analysis, design, testing, and maintenance. The difference is that the agile model is iterative and incremental, while in the waterfall model each activity is performed separately, one after another.

Bioinformatics RIG has successfully implemented both approaches. The agile model was applied in developing a web-based application for KKP3N (Kerjasama Kemitraan Penelitian dan Pengembangan Pertanian Nasional), a program of the Indonesian Agency for Agricultural Research and Development. The waterfall model was applied in developing the Germplasm Genetic Variation Agriculture Database, a web-based data management project of the Indonesian Center for Agricultural Biotechnology & Genetic Resources.

Cloud Computing

Cloud computing makes it possible to tackle big data problems and to share computing resources (storage, networks, servers, applications, and services). It allows data to be analyzed faster with no investment in local hardware or software; users pay only for the resources they use. It also provides the tools needed to prepare for and cope with unpredictable traffic.

Bioinformatics RIG was selected to receive a machine learning grant and an education coursework grant from the cloud computing services provider Amazon Web Services (AWS). AWS services were taught to graduate students in the research methodology course, and at the end of the semester each student presented a research assignment that used AWS services.

Genome-Wide Association Study (GWAS)

A genome-wide association study (GWAS) examines markers across the complete set of DNA, or genome, to find associations between single-nucleotide polymorphisms (SNPs) and traits. In rice, for example, desirable traits such as high yield, higher nutritional content, and time to maturity are cultivated and preserved, while undesirable traits such as genetic susceptibility to disease are removed.
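To make the idea of a SNP-trait association concrete, here is a minimal sketch of one common single-marker test: a chi-square test on allele counts for a binary trait (for instance, disease-susceptible versus resistant lines). It is an illustration only, not the method used in the studies described here; real GWAS pipelines also handle quantitative traits, population structure, and multiple-testing correction.

```python
import math

def allelic_chi_square(case_counts, control_counts):
    """Chi-square test (1 d.f.) on a 2x2 table of allele counts.

    Each argument is (count of allele A, count of allele B).
    Returns (chi-square statistic, p-value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain observations")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical allele counts for one SNP.
chi2, p_value = allelic_chi_square((60, 40), (40, 60))
```

A genome-wide scan repeats this test for every SNP and then compares the p-values against a threshold adjusted for the number of markers tested.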

In a collaborative study with the Indonesian Center for Agricultural Biotechnology & Genetic Resources, Bioinformatics RIG is performing genome-wide association studies for various traits in cattle and rice.

GWAS is also useful for developing better strategies to detect, treat, and prevent complex diseases such as cancer. Smokescreen is one of many health solution products that use human GWAS. It has two major components: a targeted genotyping array and a centralized software application. The array design includes multi-population coverage of 1,015 genomic regions associated with addiction, as well as biological pathways related to the metabolism of nicotine and the brain's reward system.

High-Performance Computing (HPC)
High-performance computing (HPC) is the use of clustered computers and parallel processing techniques for solving complex computational problems. HPC technology focuses on developing parallel processing algorithms and systems by incorporating both administration and parallel computational techniques.

High-performance computing is typically used for solving advanced problems and performing research activities through computer modeling, simulation and analysis. HPC systems have the ability to deliver sustained performance through the concurrent use of computing resources.
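The concurrent use of computing resources can be sketched on a single machine with Python's standard concurrent.futures module. The example below is an assumption-laden toy: the task (counting G and C bases) and the function names are hypothetical, and a thread pool is used only to keep the sketch portable. For CPU-bound work on a multi-core server or cluster, a process pool or a message-passing framework would be the more realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor

def gc_count(sequence):
    """Work done by one worker: count G and C bases in one sequence."""
    return sum(1 for base in sequence if base in "GC")

def parallel_gc_count(sequences, n_workers=4):
    """Distribute the sequences over a pool of workers and combine the results."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(gc_count, sequences))
    return sum(partial_counts)

total = parallel_gc_count(["GATTACA", "GGCC", "ATAT"])
```

The pattern is the same at any scale: split the problem into independent pieces, process the pieces concurrently, and then reduce the partial results into one answer.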

Currently, Bioinformatics RIG is customizing its own HPC system using an HP ProLiant ML350p Gen8 server with 24x Intel® Xeon® CPU E5-2620 0 @ 2.00GHz, 8GB memory, an NVIDIA Tesla M2075 GPU, and RAID 5 storage.

Biological Database

Biological databases are libraries of life sciences information collected from scientific experiments, published literature, high-throughput experiment technologies, and computational analysis. They contain information from research areas including genomics, proteomics, metabolomics, microarray gene expression, and phylogenetics. The information they hold includes gene function, structure, and localization (both cellular and chromosomal), the clinical effects of mutations, and similarities between biological sequences and structures.

Bioinformatics RIG developed a database with batch data input features for genetic data from DNA genotyping or sequencing machines and for phenotypic/trait data measured in the greenhouse and in the field. The SNP map table describes the SNPs contained on the array: their location (chromosome and position), the polymorphic nucleotides (e.g., adenine or cytosine), and attributes of the array design. The sample map table contains data on the DNA samples. The final report holds the genotypes for all samples as well as information on the quality of the genotype calling. Trait data are stored in the phenotype table, which is linked to the sample map by a one-to-many relationship.
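The four tables described above could be laid out roughly as follows. This is a sketch using Python's built-in sqlite3 module; the column names, sample values, and the choice of SQLite are assumptions made for illustration, not the actual schema.

```python
import sqlite3

# Illustrative schema only: table and column names are assumptions.
SCHEMA = """
CREATE TABLE snp_map (
    snp_id     TEXT PRIMARY KEY,
    chromosome TEXT NOT NULL,
    position   INTEGER NOT NULL,
    alleles    TEXT NOT NULL            -- polymorphic nucleotides, e.g. 'A/C'
);
CREATE TABLE sample_map (
    sample_id  TEXT PRIMARY KEY,
    accession  TEXT
);
CREATE TABLE final_report (             -- one genotype call per sample and SNP
    sample_id  TEXT NOT NULL REFERENCES sample_map(sample_id),
    snp_id     TEXT NOT NULL REFERENCES snp_map(snp_id),
    genotype   TEXT NOT NULL,
    call_score REAL,                    -- quality of the genotype call
    PRIMARY KEY (sample_id, snp_id)
);
CREATE TABLE phenotype (                -- many trait rows per sample
    sample_id  TEXT NOT NULL REFERENCES sample_map(sample_id),
    trait      TEXT NOT NULL,
    value      REAL NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

# Hypothetical rows for one sample and one SNP.
conn.execute("INSERT INTO snp_map VALUES ('snp1', '1', 10250, 'A/C')")
conn.execute("INSERT INTO sample_map VALUES ('s1', 'ACC-001')")
conn.execute("INSERT INTO final_report VALUES ('s1', 'snp1', 'AC', 0.92)")
conn.executemany(
    "INSERT INTO phenotype VALUES (?, ?, ?)",
    [("s1", "plant_height_cm", 98.5), ("s1", "days_to_maturity", 112.0)],
)

# Join trait values to the genotype of the same sample at one SNP.
rows = conn.execute("""
    SELECT p.trait, p.value, f.genotype
    FROM phenotype AS p
    JOIN final_report AS f ON f.sample_id = p.sample_id
    WHERE f.snp_id = 'snp1'
    ORDER BY p.trait
""").fetchall()
```

Because each phenotype row carries the sample identifier, one sample can have any number of trait measurements, which is the one-to-many relationship mentioned above.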

Genome Databases

These databases collect organism genome sequences, annotate and analyze them, and provide public access. Some add curation of experimental literature to improve computed annotations. These databases may hold many species’ genomes, or a single model organism genome.

Bioinformatics RIG has developed rice and bovine genome databases in collaboration with the Indonesian Center for Agricultural Biotechnology & Genetic Resources. Multiple versions of arrays are linked by creating a SNP index between versions. A web application is ideal for parties that want to share data on a species but have separate genotyping or sequencing facilities. Once data are in the system, they can be combined in a pooled analysis, or summarized by study and combined in large, statistically powerful meta-analyses.
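One plausible way to build a SNP index between array versions is to match SNPs by genomic coordinate. The sketch below assumes that both versions report positions on the same reference assembly; the SNP identifiers and coordinates are hypothetical, and the source does not say how the actual index is constructed.

```python
def build_snp_index(version_a, version_b):
    """Link SNP IDs of two array versions that share a genomic coordinate.

    Each version maps a SNP ID to its (chromosome, position) pair.
    Returns {id_in_version_a: id_in_version_b}.
    """
    by_coordinate = {coord: snp_id for snp_id, coord in version_b.items()}
    return {
        snp_id: by_coordinate[coord]
        for snp_id, coord in version_a.items()
        if coord in by_coordinate
    }

# Hypothetical SNP IDs and coordinates, for illustration only.
array_v1 = {"v1_001": ("1", 10250), "v1_002": ("1", 20480), "v1_003": ("2", 512)}
array_v2 = {"v2_101": ("1", 10250), "v2_102": ("2", 512), "v2_103": ("3", 77)}
snp_index = build_snp_index(array_v1, array_v2)
```

With such an index, genotypes collected on different array versions can be lined up marker by marker before they are pooled.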

Web-Services in Bioinformatics

A web service in bioinformatics is a standards-based, language-agnostic software entity that accepts specially formatted requests from other software entities on remote machines via vendor- and transport-neutral communication protocols, and produces application-specific responses in the biological scientific domain. Such services are software components that communicate using pervasive, standards-based web technologies, including HTTP and XML-based messaging. Because they are based on open standards such as HTTP and XML-based protocols, including SOAP and WSDL, web services are independent of hardware, programming language, and operating system. They are fundamental building blocks for service-oriented architectures and grid computing applied in the biological scientific domain.
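As a rough illustration of XML-based messaging, the sketch below builds and parses a SOAP 1.1 request with Python's standard xml.etree.ElementTree module. The service namespace, operation name, and parameters are hypothetical; only the SOAP envelope namespace is a real, fixed value. No network call is made.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/germplasm"             # hypothetical service namespace

def build_request(operation, parameters):
    """Serialize an operation call as a SOAP envelope (returned as text)."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in parameters.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def parse_request(xml_text):
    """Return (operation name, parameter dict) from a SOAP request."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]

    def local_name(tag):
        # Drop the '{namespace}' prefix that ElementTree puts on tags.
        return tag.split("}", 1)[1]

    return local_name(call.tag), {local_name(c.tag): c.text for c in call}

# Hypothetical operation and parameters.
request = build_request("GetGenotypes", {"sampleId": "s1", "chromosome": "1"})
operation, parameters = parse_request(request)
```

Because the message is plain XML, the client that builds the request and the server that parses it can be written in different languages and run on different platforms.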

Bioinformatics RIG is currently working on web services for Agricultural Biotechnology & Genetic Resources. The web services will enable one-way data transfer from a restricted-access server in Bogor to a public-access server in Pasar Minggu, Jakarta.

Applied Biostatistics
Biostatistics is the branch of applied statistics that applies statistical methods to medical and biological practice and research. It combines mathematics with careful reasoning, and entails formulating research questions and designing processes for exploring and testing theories. In some biostatistical applications, standard methods do not apply and must be modified. In these circumstances, biostatisticians are involved in selecting, modifying, and implementing methods such as:

  • Prediction and risk model building
  • Data and model simulations
  • Analysis of variance and covariance
  • Survival analysis
  • Validity and reliability testing
  • Bayesian hierarchical modeling
  • Bayesian model selection
  • Interpretation and presentation of statistical results
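As one example from the list above, survival analysis often starts with the Kaplan-Meier estimator. The following is a minimal sketch in plain Python, with made-up follow-up times; it is a textbook illustration, not code from any of the collaborations mentioned here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  -- follow-up time of each subject
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if observed:
            # Multiply by the fraction of at-risk subjects surviving past t.
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t leave the risk set.
        at_risk -= leaving
    return curve

# Hypothetical data: five subjects, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored subjects never lower the curve themselves, but they do shrink the risk set, which is why each later event causes a larger drop.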

Bioinformatics RIG has applied biostatistical methods in its research collaboration with the Indonesian Center for Agricultural Biotechnology & Genetic Resources and will also use them in further research collaborations, including colorectal cancer, breast cancer, lung cancer, and Smokescreen research.

High-Throughput Image Analysis

High-throughput research uses computational technologies to accelerate or fully automate the processing, quantification and analysis of large amounts of high-information-content biomedical imagery.

Dharmais Hospital National Cancer Center and Bioinformatics RIG have been collaborating on high-throughput image analysis research using various instruments and systems, such as PET scanners, MRI, CT scanners, and RIS/PACS.


Biometrics

Biometrics (or biometric authentication) refers to the identification of humans by their characteristics or traits, including fingerprints, iris patterns, retinal patterns, facial features, signatures, and speech. Biometrics is used in computer science as a form of identification and access control. It is also used to identify individuals in groups that are under surveillance. The performance of a biometric system is measured by the accuracy, speed, and robustness of the technology used.
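Accuracy in biometric systems is commonly summarized by two error rates: the false acceptance rate (impostors wrongly accepted) and the false rejection rate (genuine users wrongly rejected). The text above does not name these metrics, so the sketch below is an illustration of one common practice, with hypothetical similarity scores and threshold.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false acceptance rate, false rejection rate) at a threshold.

    A comparison is accepted when its similarity score is >= threshold.
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))

# Hypothetical similarity scores between 0 and 1.
far, frr = error_rates(
    genuine_scores=[0.9, 0.8, 0.4, 0.95],
    impostor_scores=[0.1, 0.3, 0.6, 0.2],
    threshold=0.5,
)
```

Raising the threshold lowers the false acceptance rate at the cost of a higher false rejection rate, so the operating point depends on how the system is used.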

In collaboration with Fujitsu, Bioinformatics RIG successfully organized the Fujitsu Innovation Challenge. The event harnessed the creativity and programming skills of students using Fujitsu's palm vein recognition biometric technology (PalmSecure).

Sequence Analysis

The DNA sequences of thousands of organisms have been decoded and stored in databases. A comparison of genes within a species or between different species can show similarities between protein functions, or relations between species. With the growing amount of data, it long ago became impractical to analyze DNA sequences manually, so computer programs are used to align and compare them. A variant of sequence alignment is also used in the sequencing process itself.
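Sequence comparison by computer is usually based on alignment. As a minimal sketch, the function below computes the Needleman-Wunsch global alignment score of two sequences by dynamic programming. The scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score of the best global alignment of a and b."""
    # previous[j] holds the best score for aligning a[:i-1] with b[:j].
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        current = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            current.append(max(diagonal, previous[j] + gap, current[j - 1] + gap))
        previous = current
    return previous[-1]

score = global_alignment_score("ACGT", "ACT")
```

Only two rows of the scoring table are kept in memory, which is enough to obtain the score; recovering the alignment itself would require storing the full table or traceback pointers.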

In the Smokescreen study, fine-mapping targets the top loci for smoking behavior and cessation treatment response: 15q25.1 (nicotinic receptors) and CYP2A6/B6 (nicotine and bupropion metabolizers). All known SNPs and indels (MAF > 0 in any population) were selected from 1000 Genomes, HapMap, and the Exome Sequencing Project (m = 11,188), giving an average of 1 marker per 45-75 bp over a total length of 664 kb.

Modeling Biological Systems

Systems biology may be defined as the emerging discipline that asks how physiology and phenotype emerge from molecular interactions. Mathematical models are being used in support of this, continuing a long tradition inherited from genetics, physiology, biochemistry, evolutionary biology, and ecology. Models, however, mean different things to physicists, mathematicians, engineers, and computer scientists, not to mention biologists of varying persuasions. These different perspectives need to be unravelled and their advantages distilled if model building is to fulfill its potential as an explanatory tool for studying biological systems.

Bioinformatics RIG and its research partner BioRealm are implementing a Flexible Statistical Modeling Framework in the Smokescreen study, with the following features:

  • Pathway models to guide statistical analyses.
  • Identifies associations missed by marginal scans (e.g., gene-gene interactions).
  • Uses Bayesian approaches to model many variables together, incorporate existing biological information, and account for model uncertainty.
  • Multiple models used in prediction and updated as more data becomes available.
  • The form of the model is flexible, supporting various study designs and mechanistic or hierarchical pathway models.
  • Ability to search over complex interaction models or vast model spaces.
  • Integration of Pharmacogenomics Knowledgebase (PharmGKB) pathways into the modeling framework.
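The idea of using multiple models in prediction while accounting for model uncertainty can be illustrated with Bayesian model averaging. The source does not say how the framework weights its models, so the sketch below swaps in a common textbook approximation: posterior model probabilities derived from BIC values under equal prior probabilities. The function names and numbers are hypothetical.

```python
import math

def bma_weights(bic_values):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probability for every model.
    """
    best = min(bic_values)
    # Subtracting the best BIC keeps the exponentials numerically stable.
    raw = [math.exp(-0.5 * (bic - best)) for bic in bic_values]
    total = sum(raw)
    return [r / total for r in raw]

def averaged_prediction(predictions, weights):
    """Weighted average of per-model predictions."""
    return sum(p * w for p, w in zip(predictions, weights))

# Hypothetical BIC values and predictions for two competing models.
weights = bma_weights([100.0, 102.0])
prediction = averaged_prediction([1.0, 3.0], weights)
```

As more data become available, the BIC values change and the weights shift toward the models that the data support best.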
