Topics/Areas

Cloud Computing
Cloud computing enables solving big data problems and sharing computing resources (storage, networks, servers, applications, and services). It allows data to be analyzed faster with no local hardware or software investment: users pay only for the resources they use. It also provides the tools needed to prepare for and cope with unpredictable traffic.

Bioinformatics RIG has been selected to receive a machine learning grant and an education coursework grant from the cloud computing services provider Amazon Web Services (AWS). AWS services were taught to graduate students in the research methodology course, and at the end of the semester each student presented a research assignment that used AWS services.

Genome-Wide Association Study (GWAS)
GWAS is a study that examines markers across the complete set of DNA, or genome, to find associations between single-nucleotide polymorphisms (SNPs) and a trait. For example, in rice, desirable traits such as high yield, higher nutritional content, and time to maturity are cultivated and preserved, while undesirable traits such as genetic susceptibility to disease are removed.
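
As a minimal sketch of the association step for a single SNP, assuming a case/control trait and a 2x2 table of allele counts, a Pearson chi-square test can be written as below. The function name and the counts in the usage note are hypothetical and chosen only for illustration; real GWAS pipelines add quality control, covariates, and multiple-testing correction.

```python
import math

def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 d.f.) and p-value for a 2x2 allele-count table.

    Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, `allelic_chi_square(30, 10, 10, 30)` compares made-up counts of the alternate and reference allele in cases and controls; a small p-value suggests the SNP is associated with the trait.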

GWAS is also useful for developing better strategies to detect, treat, and prevent complex diseases such as cancer. Smokescreen is one of many health products that use human GWAS. It has two major components: a targeted genotyping array and a centralized software application. The array design includes multi-population coverage of 1,015 genomic regions associated with addiction, as well as biological pathways related to the metabolism of nicotine and the brain’s reward system.

Database in Bioinformatics
Bioinformatics databases are libraries of life sciences information collected from scientific experiments, published literature, high-throughput experimental technologies, and computational analysis. They contain information from research areas including genomics, proteomics, metabolomics, microarray gene expression, and phylogenetics. The information in biological databases includes gene function, structure, localization (both cellular and chromosomal), and the clinical effects of mutations, as well as similarities between biological sequences and structures.
Genome Databases
These databases collect organism genome sequences, annotate and analyze them, and provide public access. Some add curation of experimental literature to improve computed annotations. These databases may hold many species’ genomes, or a single model organism genome.

Bioinformatics RIG has developed rice and bovine genome databases in collaboration with the Indonesian Center for Agricultural Biotechnology & Genetic Resources. Multiple versions of arrays are linked by creating a SNP index between versions. A web application is ideal for parties that want to share data on a species but have separate genotyping or sequencing facilities. Once data is in the system, it can be combined in a pooled analysis, or summarized by study and combined in large, statistically powerful meta-analyses.
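
To illustrate how per-study summaries can be combined, the following sketch uses an inverse-variance fixed-effect meta-analysis. This is one common choice and is assumed here for illustration; it is not necessarily the method used in the database. The function name and inputs are hypothetical.

```python
import math

def fixed_effect_meta(estimates, std_errors):
    """Pool per-study effect estimates with inverse-variance weights.

    estimates  -- effect estimate for the same SNP from each study
    std_errors -- the corresponding standard errors (all > 0)
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    weight_sum = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    return pooled, pooled_se
```

Studies with smaller standard errors receive larger weights, which is why pooling several studies gives a more precise estimate than any single study.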

Web-Services in Bioinformatics
A web service in bioinformatics is a standards-based, language-agnostic software entity that accepts specially formatted requests from other software entities on remote machines via vendor- and transport-neutral communication protocols and produces application-specific responses in the biological domain. Such services are software components that communicate using pervasive, standards-based web technologies, including HTTP and XML-based messaging. Because they are based on open standards such as HTTP and XML-based protocols including SOAP and WSDL, web services are independent of hardware, programming language, and operating system. They are fundamental building blocks for service-oriented architectures and grid computing applied in the biological domain.
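
The XML-based messaging described above can be sketched with Python's standard library. The service namespace, the `getSequence` operation, and the `accession` and `sequence` elements are hypothetical names invented for this example; only the SOAP 1.1 envelope namespace is standard. No network call is made here.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.org/bio"  # hypothetical service namespace

def build_request(accession):
    """Build a SOAP request asking a (hypothetical) service for one sequence."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{SVC_NS}}}getSequence")
    ET.SubElement(operation, f"{{{SVC_NS}}}accession").text = accession
    return ET.tostring(envelope, encoding="unicode")

def parse_response(xml_text):
    """Return the text of the first <sequence> element, or None if absent."""
    root = ET.fromstring(xml_text)
    node = root.find(f".//{{{SVC_NS}}}sequence")
    return node.text if node is not None else None
```

Because both sides agree only on the XML message format, the client and the server can be written in different languages and run on different platforms.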
Data Analysis
Biostatistics is the branch of applied statistics that applies statistical methods to medical and biological practice and research. It combines mathematics with careful reasoning, and entails formulating research questions and designing processes for exploring and testing theories. It overlaps with other areas of statistics, but in some biostatistical applications standard methods do not apply and must be modified. In such cases, biostatisticians are involved in selecting, modifying, and implementing methods such as:
  • Prediction and risk model building
  • Data and model simulations
  • Analysis of variance and covariance
  • Survival analysis
  • Validity and reliability testing
  • Bayesian hierarchical modeling
  • Bayesian model selection
  • Interpretation and presentation of statistical results
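
As an illustration of one method from the list above, survival analysis, the following is a minimal sketch of the Kaplan-Meier estimator. The function name and the data layout (parallel lists of follow-up times and event indicators) are assumptions made for this example.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each subject
    events -- 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events observed at time t
        removed = 0   # subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Censored subjects contribute to the number at risk until their censoring time but never trigger a drop in the curve.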
Biometrics
Biometrics (or biometric authentication) refers to the identification of humans by their characteristics or traits, including fingerprints, iris patterns, retina patterns, facial features, signatures, and speech. Biometrics is used in computer science as a form of identification and access control. It is also used to identify individuals in groups that are under surveillance. The performance of a biometric system is measured by the accuracy, speed, and robustness of the technology used.
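
Accuracy is often summarized by the false acceptance rate (FAR) and false rejection rate (FRR) at a chosen decision threshold. The sketch below assumes that a matcher produces similarity scores where higher means a closer match; the function name and the scores in the usage note are hypothetical.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted.

    genuine_scores  -- scores from comparisons of the same person
    impostor_scores -- scores from comparisons of different people
    """
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    frr = false_rejects / len(genuine_scores)
    far = false_accepts / len(impostor_scores)
    return far, frr
```

Raising the threshold lowers the FAR at the cost of a higher FRR, so the threshold is usually chosen to suit the application, for example `error_rates([0.9, 0.8, 0.4, 0.7], [0.1, 0.3, 0.6, 0.2], 0.5)`.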
Sequence Analysis
The DNA sequences of thousands of organisms have been decoded and stored in databases. A comparison of genes within a species or between different species can show similarities between protein functions, or relations between species. With the growing amount of data, it long ago became impractical to analyze DNA sequences manually, so sequences are now compared computationally by sequence alignment. A variant of sequence alignment is used in the sequencing process itself.
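
A minimal sketch of pairwise comparison is the Needleman-Wunsch global alignment score, computed by dynamic programming. The scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for illustration; production tools use substitution matrices and more refined gap models.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]
```

Only two rows of the scoring table are kept in memory, which is enough when the score alone is needed; recovering the alignment itself requires the full table or a traceback structure.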
Medical Image Analysis
High-throughput research uses computational technologies to accelerate or fully automate the processing, quantification and analysis of large amounts of high-information-content biomedical imagery.
Data Modeling 
Systems biology may be defined as the emerging discipline that asks how physiology and phenotype emerge from molecular interactions. Mathematical models are being used in support of this, continuing a long tradition inherited from genetics, physiology, biochemistry, evolutionary biology, and ecology. Models, however, mean different things to physicists, mathematicians, engineers, and computer scientists, not to mention biologists of varying persuasions. These different perspectives need to be unravelled and their advantages distilled if model building is to fulfill its potential as an explanatory tool for studying biological systems. This topic area includes:
  • Pathway models to guide statistical analyses.
  • Identification of associations missed by marginal scans (e.g., gene-gene interactions).
  • Bayesian approaches to model many variables together, incorporate existing biological information, and account for model uncertainty.
  • Multiple models used in prediction and updated as more data become available.
  • Flexible model forms that support various study designs and mechanistic or hierarchical pathway models.
  • Ability to search over complex interaction models or vast model spaces.
  • Integration of Pharmacogenomics Knowledgebase (PharmGKB) pathways into the modeling framework.
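
The idea of using multiple models in prediction while accounting for model uncertainty can be sketched with Bayesian model averaging. The example assumes that a log marginal likelihood (log evidence) is already available for each candidate model; the function names and inputs are hypothetical.

```python
import math

def posterior_model_probs(log_evidences, priors=None):
    """Posterior probability of each model given its log marginal likelihood.

    priors defaults to a uniform prior over the models.
    """
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n
    log_post = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    peak = max(log_post)
    unnormalized = [math.exp(lp - peak) for lp in log_post]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def averaged_prediction(predictions, probs):
    """Weight each model's prediction by its posterior probability."""
    return sum(pred * w for pred, w in zip(predictions, probs))
```

When a new batch of data arrives, the posterior probabilities from the previous step can be passed back in as `priors` together with the log evidences computed on the new batch, which updates the model weights incrementally.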