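Bioinformatics Practical 1

Date: 2024.4.1

Practical 1: Sequence Retrieval

Objectives:

1. Become familiar with the resources of the three major bioinformatics centers;

2. Learn to find target sequences with the Entrez system;

3. Learn to find target sequences with the SRS system.

Contents:

(1) Browsing the three major bioinformatics centers

NCBI, EBI, DDBJ

(2) Using Entrez

Limits and Advanced Search

(3) Using SRS

Assignment: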

1. Introduce the following NCBI databases in your own words: MMDB, CDD, dbGaP, PMC, OMIM, UniGene, PubChem, RefSeq.

1). MMDB: The Structure database (Molecular Modeling Database, MMDB) holds three-dimensional structures, which provide a wealth of information on the biological function and the evolutionary history of macromolecules.

2). CDD: The Conserved Domain Database (CDD) is a protein annotation resource that consists of a collection of well-annotated multiple sequence alignment models for ancient domains and full-length proteins.

3). dbGaP: The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype.

4). PMC: PubMed Central (PMC) is a free archive of biomedical and life sciences journal literature, in keeping with NLM's legislative mandate to collect and preserve the biomedical literature.

5). OMIM: Online Mendelian Inheritance in Man (OMIM) is a comprehensive, authoritative, and timely compendium of human genes and genetic phenotypes.

6). UniGene: UniGene is a database that computationally identifies transcripts from the same locus; analyzes expression by tissue, age, and health status; and reports related proteins (protEST) and clone resources.

7). PubChem: PubChem is a database of chemical molecules that provides information on the biological activities of small molecules. It holds substance information, compound structures, and bioactivity data in three primary databases: PubChem Substance, PubChem Compound, and PubChem BioAssay, respectively.

8). RefSeq: The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.

2. Make a list of the molecular biology related books on the NCBI Bookshelf, specifying the book title, authors and publisher. How about bioinformatics related books?

Molecular biology books:

1) Alberts B, Johnson A, Lewis J, et al. Molecular Biology of the Cell. 4th edition. New York: Garland Science; 2002.

2) Lodish H, Berk A, Zipursky SL, et al. Molecular Cell Biology. 4th edition. New York: W. H. Freeman; 2000.

Bioinformatics related books:

1) Madame Curie Bioscience Database [Internet]. Austin (TX): Landes Bioscience; 2000.

2) National Research Council (US) Committee on Frontiers at the Interface of Computing and Biology; Wooley JC, Lin HS, editors. Catalyzing Inquiry at the Interface of Computing and Biology. Washington (DC): National Academies Press (US); 2005.

3. Introduce the following EBI databases in your own words: ChEBI, ENA, UniProt, ArrayExpress, Ensembl, PDBe.

1) ChEBI: Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on 'small' chemical compounds.

2) ENA: The European Nucleotide Archive (ENA) is a database that captures and presents information relating to experimental workflows that are based around nucleotide sequencing.

3) UniProt: UniProt is a database that provides the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information.

4) ArrayExpress: The ArrayExpress Archive is a database of functional genomics experiments, including gene expression, where you can query and download data collected to MIAME and MINSEQE standards.

5) Ensembl: The Ensembl project produces genome databases for vertebrates and other eukaryotic species, and makes this information freely available online.

6)PDBe: PDBe is the European resource for the collection, organisation and dissemination of data on biological macromolecular structures.

4. Do a search for the 16S ribosomal RNA gene from Aeromonas hydrophila strain AE7.

a. Give the search details that you used to find this sequence.

Open Entrez → type the keywords "16S ribosomal RNA gene Aeromonas hydrophila strain AE7" in the search box → hits appear only in the Nucleotide database → click it and review the result.
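The same search can also be expressed as an E-utilities ESearch request instead of typing keywords into the web form. The sketch below only builds the request URL and does not contact NCBI. The helper name `build_esearch_url` is hypothetical, and the choice of the `[Organism]` and `[Title]` field tags is an assumption about how the record is indexed, not the query actually used above.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(organism, title_words, db="nuccore", retmax=20):
    """Return an ESearch URL that restricts a query by organism and title words.

    Hypothetical helper for illustration; it only assembles the URL.
    """
    # Entrez field tags limit each term to one index, e.g. [Organism] or [Title].
    terms = [f"{organism}[Organism]"]
    terms += [f"{word}[Title]" for word in title_words]
    query = " AND ".join(terms)
    params = {"db": db, "term": query, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


# Assumed search terms: the organism name plus two words expected in the title.
url = build_esearch_url("Aeromonas hydrophila", ["16S", "AE7"])
print(url)
```

Field tags make the query less ambiguous than free keywords, which is the same idea as the Limits and Advanced Search pages in the web interface.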

b. What is the accession number?

The accession number is DQ855289.

c. How many base pairs are in this sequence?

The sequence is 992 bp long.

d. When was the entry last modified?

The entry was last modified on 21-AUG-2006.

e. Is there another organism that produces the same protein? If so, name the organism and show your evidence.

Yes. Searching for "16S ribosomal RNA" in the Protein database returns 8132 results. For example, Lactobacillus salivarius CECT 5713 has a 16S ribosomal RNA dimethyladenosine transferase (16S ribosomal RNA dimethylase).
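The accession, length and modification date reported in parts b to d all come from the LOCUS line at the top of the GenBank flat file. As a minimal sketch, that line can be split on whitespace. The sample line below is reconstructed from the answers above; the molecule type (DNA), topology (linear) and division (BCT) in it are assumptions, not values copied from the record. The function `parse_locus_line` is a hypothetical helper and handles nucleotide records only.

```python
def parse_locus_line(line):
    """Split a nucleotide GenBank LOCUS line into named fields.

    Hypothetical helper: assumes the eight whitespace-separated fields
    LOCUS, name, length, unit, molecule, topology, division, date.
    """
    fields = line.split()
    if len(fields) < 8 or fields[0] != "LOCUS":
        raise ValueError("not a complete nucleotide LOCUS line")
    return {
        "name": fields[1],
        "length": int(fields[2]),   # sequence length as an integer
        "unit": fields[3],          # "bp" for nucleotide records
        "molecule": fields[4],
        "topology": fields[5],
        "division": fields[6],      # three-letter GenBank division code
        "modified": fields[7],      # date of last modification
    }


# Reconstructed sample; DNA, linear and BCT are assumed values.
sample = "LOCUS       DQ855289                 992 bp    DNA     linear   BCT 21-AUG-2006"
record = parse_locus_line(sample)
print(record["name"], record["length"], record["modified"])
```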

5. Search for the nucleotide sequence with accession number NM_013161.

a. What organism is this sequence from?

The sequence comes from Rattus norvegicus (Norway rat)

b. What is the accession number of the protein linked to this sequence?

The accession number is NP_037293

c. What is the function of this protein?

Pancreatic lipase hydrolyzes dietary fat molecules in the digestive system, converting triglyceride substrates found in ingested oils to monoglycerides and free fatty acids.

d. Find a reference by Hjorth, et al, related to this protein. What is the PubMed ID for this article?

The PubMed ID is 8490016.

e. In your own words, briefly describe what the researchers reported in the article.

A structural domain (the lid) found in pancreatic lipases is absent in the guinea pig (phospho)lipase. The amino acid sequence of guinea pig (phospho)lipase is highly homologous to that of other known pancreatic lipases, with the exception of a deletion in the so-called lid domain that regulates access to the active centers of other lipases. The authors propose that this deletion is directly responsible for the anomalous behavior of this enzyme. Thus GPL challenges the classical distinction between lipases, esterases, and phospholipases.
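The records used in question 5 can also be requested by accession through the E-utilities EFetch endpoint, which is one way to save a sequence in both FASTA and GenBank format. The sketch below only builds the URLs and does not download anything. The helper name `build_efetch_url` is hypothetical; the `rettype` values `gb`, `gp` and `fasta` are the standard EFetch return types for GenBank, GenPept and FASTA output.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities EFetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore", rettype="gb"):
    """Return an EFetch URL for one accession as plain text.

    Hypothetical helper for illustration; it only assembles the URL.
    """
    params = {
        "db": db,            # "nuccore" for nucleotide, "protein" for protein
        "id": accession,
        "rettype": rettype,  # "gb" GenBank, "gp" GenPept, "fasta" FASTA
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


# The nucleotide record from question 5 in both formats, and its protein.
genbank_url = build_efetch_url("NM_013161")
fasta_url = build_efetch_url("NM_013161", rettype="fasta")
protein_url = build_efetch_url("NP_037293", db="protein", rettype="gp")
for u in (genbank_url, fasta_url, protein_url):
    print(u)
```

Opening one of these URLs in a browser, or fetching it with any HTTP client, should return the record as text that can be saved to a file for the next practical.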

6. Search for orthologous nucleotide and protein sequences in more than 5 organisms, and save all the sequences in FASTA and GenBank format for the next practical.

a. Give the titles of those sequences (the first line of the FASTA format).

b. What organisms are those orthologous sequences from?

c. What is the sequence length of each sequence?

d. When was each entry last modified?

Sequence 1 - gene:

Title: >gi|360039204|ref|NM_001008215.2| Homo sapiens cytochrome C oxidase assembly factor 5 (COA5), mRNA

Organism: Homo sapiens

Length: 1767 bp

Modified: PRI 09-JUN-2012

Sequence 1 - protein:

Title: >gi|56118949|ref|NP_001008216.1| cytochrome c oxidase assembly factor 5 [Homo sapiens]

Organism: Homo sapiens

Length: 74 aa

Modified: PRI 09-JUN-2012

Sequence 2 - gene:

Title: >gi|303324588|ref|NM_001195024.1| Bos taurus cytochrome c oxidase assembly factor 5 (COA5), mRNA

Organism: Bos taurus

Length: 834 bp

Modified: MAM 29-APR-2012

Sequence 2 - protein:

Title: >gi|303324589|ref|NP_001181953.1| cytochrome c oxidase assembly factor 5 [Bos taurus]

Organism: Bos taurus

Length: 74 aa

Modified: MAM 29-APR-2012

Sequence 3 - gene:

Title: >gi|390474111|ref|XM_002757406.2| PREDICTED: Callithrix jacchus cytochrome C oxidase assembly factor 5 (COA5), mRNA

Organism: Callithrix jacchus

Length: 786 bp

Modified: PRI 08-JUN-2012

Sequence 3 - protein:

Title: >gi|296223030|ref|XP_002757452.1| PREDICTED: cytochrome c oxidase assembly factor 5 [Callithrix jacchus]

Organism: Callithrix jacchus

Length: 74 aa

Modified: PRI 08-JUN-2012

Sequence 4 - gene:

Title: >gi|284520949|ref|NM_001171783.1| Xenopus laevis cytochrome C oxidase assembly factor 5 (coa5), mRNA

Organism: Xenopus laevis

Length: 730 bp

Modified: VRT 26-MAY-2012

Sequence 4 - protein:

Title: >gi|284520950|ref|NP_001165254.1| cytochrome c oxidase assembly factor 5 [Xenopus laevis]

Organism: Xenopus laevis

Length: 75 aa

Modified: VRT 26-MAY-2012

Sequence 5 - gene:

Title: >gi|238859532|ref|NM_001161497.1| Danio rerio cytochrome C oxidase assembly factor 5 (coa5), mRNA

Organism: Danio rerio

Length: 504 bp

Modified: VRT 26-MAY-2012

Sequence 5 - protein:

Title: >gi|238859533|ref|NP_001154969.1| cytochrome c oxidase assembly factor 5 [Danio rerio]

Organism: Danio rerio

Length: 75 aa

Modified: VRT 26-MAY-2012
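The titles listed for question 6 share one layout: `>gi|<GI number>|ref|<accession>| <description>`, with the organism in square brackets at the end of protein titles. As a minimal sketch, a title can be split into those parts so that accessions and organisms are tabulated without retyping. The function `parse_defline` is a hypothetical helper and assumes exactly this pipe-delimited layout.

```python
def parse_defline(defline):
    """Parse a '>gi|<gi>|ref|<accession>| <description>' FASTA title line.

    Hypothetical helper; other FASTA header styles are rejected.
    """
    if not defline.startswith(">"):
        raise ValueError("FASTA title lines start with '>'")
    # Split on the first four pipes only, so the description stays intact.
    parts = defline[1:].split("|", 4)
    if len(parts) != 5 or parts[0] != "gi":
        raise ValueError("expected a gi|...|ref|...| style title")
    description = parts[4].strip()
    organism = None
    # Protein titles end with the organism name in square brackets.
    if description.endswith("]") and "[" in description:
        organism = description[description.rindex("[") + 1:-1]
    return {
        "gi": parts[1],
        "source": parts[2],
        "accession": parts[3],
        "description": description,
        "organism": organism,
    }


protein = parse_defline(
    ">gi|56118949|ref|NP_001008216.1| cytochrome c oxidase assembly factor 5 [Homo sapiens]"
)
mrna = parse_defline(
    ">gi|238859532|ref|NM_001161497.1| Danio rerio cytochrome C oxidase assembly factor 5 (coa5), mRNA"
)
print(protein["accession"], protein["organism"])
print(mrna["accession"], mrna["organism"])
```

For mRNA titles the organism is part of the free-text description rather than a bracketed suffix, so the sketch leaves `organism` as `None` there instead of guessing.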
