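1. Gene
A gene is a DNA segment that gives rise to a functional protein or RNA product; the order of its nucleotides determines the gene's function.
2. Chromosome
A chromosome is a structure in the cell nucleus made up of a long DNA molecule and nuclear proteins. Chromosomes are the main carriers of genes and appear thread-like or rod-shaped under the microscope. Normal human somatic cells contain 22 pairs of autosomes and 1 pair of sex chromosomes.
3. Gene sequencing
Determining the order of the bases in a DNA molecule with a sequencing instrument.
4. Liquid biopsy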
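Liquid biopsy uses high-throughput sequencers to detect small DNA fragments floating in the blood. These fragments may come from dead cells: whenever "foreign" DNA is present in the blood, such as fragments from a placenta, a tumor, or a transplanted organ, it can be detected by liquid biopsy. With a simple blood test, liquid biopsy can locate, study, and monitor such genetic intrusions, and its range of applications is growing rapidly.
For tumors, "liquid biopsy" means detecting ctDNA (small fragments of tumor DNA shed and released by dead tumor cells) in the circulating blood. A non-invasive blood draw is enough to survey tumor information from the whole body, which makes the technique suitable for early cancer diagnosis and precision medicine. In that sense, liquid biopsy is the gunsight and radar of cancer precision medicine.
The main analytes of liquid biopsy are CTCs (circulating tumor cells), ctDNA (circulating tumor DNA), circulating microRNA, and exosomes. Of these, ctDNA can be used for early tumor diagnosis, dynamic monitoring of tumor development and treatment response, detection of drug resistance, and assessment of recurrence risk. It has broad prospects in tumor diagnosis and treatment and is attracting growing attention.
The techniques currently able to detect gene mutations in trace amounts of nucleic acid are mainly ARMS (including Super-ARMS), next-generation sequencing (NGS), and digital PCR (ddPCR, including BEAMing). These are also the techniques recommended in a recent expert consensus on liquid biopsy.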
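5. Prognosis
Prognosis is the prediction of the likely course and outcome of a disease, based on clinical observation and analysis. It covers both judging specific consequences of the disease (such as recovery; the appearance or disappearance of symptoms, signs, complications, or other abnormalities; and death) and giving a time frame, for example the probability that a given outcome occurs within a given period. Because a prognosis is a probability, it applies mainly to groups of patients rather than to an individual.
5.1 Natural prognosis
The predicted course and consequences of a disease when it is left untreated.
5.2 Treatment prognosis
The predicted course and final consequences of a disease under medical intervention.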
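6. Blood-brain barrier
The blood-brain barrier (BBB) is a "barrier" between the blood vessels and the brain that selectively prevents certain substances from passing from the blood into the brain.
It comprises the barrier between plasma and brain cells, formed by the walls of the brain capillaries together with glial cells, and the barrier between plasma and cerebrospinal fluid, formed by the choroid plexus. These barriers keep certain substances (mostly harmful ones) from entering brain tissue from the blood. Solutes in the blood cross from the brain capillaries into brain tissue with varying ease: some pass quickly, some slowly, and some not at all. This selective permeability led researchers to postulate a structure that restricts the passage of solutes. Such a structure protects brain tissue from much or all of the damage that harmful substances in the circulation could cause, keeps the internal environment of the brain largely stable, and is therefore important for maintaining the normal physiological state of the central nervous system.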
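7. Granulocytes
Granulocytes are white blood cells whose cytoplasm contains granules. Because their nuclei vary in shape, they are also called polymorphonuclear leukocytes (PMN or PML). The term "polymorphonuclear leukocyte" usually refers specifically to the neutrophil, the most common type.
White blood cells fall into two broad morphological classes, granular and agranular. The agranular white blood cells are lymphocytes and monocytes. Granular white blood cells (granulocytes) contain granules with distinctive staining properties; Wright's stain distinguishes three types: neutrophils, eosinophils, and basophils. The great majority of granulocytes are neutrophils.
The total white blood cell count and the percentage of each cell type are relatively stable. A healthy person has 5,000 to 10,000 white blood cells per cubic millimetre of blood. The percentages are: neutrophils 50-70%; eosinophils 1-4%; basophils 0-1%; lymphocytes 20-40%; monocytes 1-7%. Inflammation and other diseases can change both the total count and the percentages, so the total white cell count and the differential count are an important aid to diagnosis.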
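The reference ranges above lend themselves to a small lookup. The following is a minimal sketch, not a clinical tool: the function name `flag_differential` and the "low"/"normal"/"high" labels are assumptions introduced here, and the ranges are simply the ones quoted in this entry.

```python
# Reference ranges as quoted in the glossary entry above.
TOTAL_WBC_RANGE = (5000, 10000)          # cells per cubic millimetre
DIFFERENTIAL_RANGES = {                  # percent of all white blood cells
    "neutrophil": (50, 70),
    "eosinophil": (1, 4),
    "basophil": (0, 1),
    "lymphocyte": (20, 40),
    "monocyte": (1, 7),
}


def _grade(value, bounds):
    """Compare a value with an inclusive (low, high) range."""
    low, high = bounds
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


def flag_differential(total_wbc, counts):
    """Flag the total count and each cell type (hypothetical helper).

    counts maps a cell type to the number of cells of that type seen
    in a manual or automated differential count.
    """
    counted = sum(counts.values())
    if counted == 0:
        raise ValueError("no cells were counted")
    flags = {"total": _grade(total_wbc, TOTAL_WBC_RANGE)}
    for name, bounds in DIFFERENTIAL_RANGES.items():
        percent = 100.0 * counts.get(name, 0) / counted
        flags[name] = _grade(percent, bounds)
    return flags


example = flag_differential(
    12000,
    {"neutrophil": 85, "eosinophil": 2, "basophil": 0,
     "lymphocyte": 10, "monocyte": 3},
)
print(example)
```

A raised total with a high neutrophil percentage, as in the example, is the kind of shift the entry describes for inflammation.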
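8. Common clinical treatment indicators
8.1 Five-year survival rate
The five-year survival rate is the proportion of patients with a given tumor who are still alive five or more years after combined treatment. There is a rationale for choosing five years. After treatment, some tumors metastasize or recur, and some of those patients die as the tumor progresses to a late stage. About 80% of metastases and recurrences occur within three years of radical surgery, and a further 10% or so occur by the five-year mark. A tumor that has not recurred within five years of radical surgery is therefore unlikely to recur later, which is why the five-year survival rate is commonly used to express treatment outcomes for cancers. During the five years after surgery, patients should continue consolidation treatment and have regular check-ups to guard against recurrence, so that any metastasis or recurrence can be treated early.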
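8.2 Complete response (CR)
All tumor masses and all clinical manifestations of the tumor disappear completely, and this persists for at least 1 month.
8.3 Partial response (PR)
The sum of the two perpendicular diameters of the measurable tumors shrinks by at least 50% relative to baseline, and this persists for at least 1 month.
8.4 Progression
The sum of the two perpendicular diameters of the tumors increases by at least 25% relative to its lowest recorded value, a new tumor appears, or evaluable disease shows clear progression.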
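The three definitions in 8.2 to 8.4 can be read as a decision rule. The sketch below is one way to encode them, under several assumptions: the function and parameter names are hypothetical, the clinical manifestations required for CR are reduced to a measured sum of zero, "clear progression of evaluable disease" is folded into the `new_lesion` flag, and anything that meets none of the three definitions is returned as "unclassified" because the text defines no fourth category.

```python
def classify_response(baseline_sum, nadir_sum, current_sum,
                      new_lesion=False, sustained_one_month=False):
    """Classify a tumor assessment using the thresholds in 8.2 to 8.4.

    baseline_sum: measurement sum before treatment.
    nadir_sum: lowest sum recorded so far (including the current one).
    current_sum: sum at the current assessment.
    new_lesion: True if a new tumor has appeared.
    sustained_one_month: True if the current state has lasted >= 1 month.
    """
    if baseline_sum <= 0:
        raise ValueError("baseline_sum must be positive")
    # Progression: new tumor, or >= 25% increase over the lowest value.
    if new_lesion:
        return "progression"
    if current_sum > nadir_sum and current_sum >= 1.25 * nadir_sum:
        return "progression"
    # Responses must persist for at least one month.
    if sustained_one_month and current_sum == 0:
        return "CR"
    if sustained_one_month and current_sum <= 0.5 * baseline_sum:
        return "PR"
    return "unclassified"


print(classify_response(100, 45, 45, sustained_one_month=True))  # PR
print(classify_response(100, 40, 50))                            # progression
```

Progression is checked first so that a patient who shrank to 40 and then grew back to 50 is not reported as a partial response relative to baseline.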
8.5 Disease-free survival (DFS)
The time from randomization until disease recurrence or death from any cause. DFS is often used as the primary endpoint in phase III trials of anticancer drugs.
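8.6 Overall survival (OS)
The time from randomization until death from any cause. OS is often regarded as the best efficacy endpoint in oncology trials, and even a small improvement in survival can be taken as evidence of meaningful clinical benefit. As an endpoint, survival should be assessed daily; this can be done through direct contact with the patient at hospital visits or by telephone, both of which are relatively easy to record. Establishing the date of death is usually straightforward, and the time of death is recorded independently of its cause. Patients lost to follow-up before death are usually censored at the last documented contact.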
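Censoring at the last documented contact is what a survival estimate has to account for. The text does not name a method; the sketch below uses the Kaplan-Meier product-limit estimator, a standard choice, with hypothetical function and variable names and made-up follow-up times in months.

```python
def kaplan_meier(durations, died):
    """Kaplan-Meier survival estimate.

    durations: time from randomization to death or to last contact.
    died: True if the patient died at that time, False if censored.
    Returns a list of (time, estimated survival probability) pairs,
    one per distinct time at which at least one death occurred.
    """
    records = sorted(zip(durations, died))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(records) and records[i][0] == t:
            deaths += 1 if records[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


# Illustrative data only: months of follow-up and death indicator.
curve = kaplan_meier([6, 12, 12, 20, 30], [True, False, True, True, False])
for month, prob in curve:
    print(month, round(prob, 3))
```

Censored patients contribute to the risk set up to their last contact and then drop out without counting as deaths. Reading the curve at 60 months would give an estimate of the five-year survival rate from 8.1.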
8.7 Time to progression (TTP)
The time from randomization until disease progression. Unlike PFS (see 8.9), TTP does not count death as an event.
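8.8 Time to treatment failure (TTF)
The time from randomization until disease progression, death, withdrawal because of an adverse event, the patient's refusal to continue in the study, or the start of a new treatment. TTF is by nature a composite measure, so a gain from reduced toxicity can offset or hide a loss of the expected efficacy. For this reason TTF is seldom used as the primary endpoint in phase III oncology trials.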
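8.9 Progression-free survival (PFS)
The time from randomization until the first disease progression or death from any cause. PFS differs from TTP in that it includes deaths, and it therefore correlates better with OS. When most deaths are unrelated to the tumor, however, TTP is an acceptable endpoint.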
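The difference between OS, PFS, and TTP is easiest to see on a single patient record. The sketch below follows the distinction drawn in 8.9 (PFS counts death as an event, TTP does not). The function name, the argument names, and the convention of censoring TTP at the date of death are assumptions for illustration; the dates are invented.

```python
from datetime import date


def endpoints(randomized, last_contact, progressed=None, died=None):
    """Return {endpoint: (days, event_observed)} for one patient.

    event_observed is False when the value is censored.
    """
    def span(end):
        return (end - randomized).days

    # OS: the event is death from any cause.
    os_value = (span(died), True) if died else (span(last_contact), False)

    # PFS: the event is progression or death, whichever comes first.
    events = [d for d in (progressed, died) if d is not None]
    if events:
        pfs_value = (span(min(events)), True)
    else:
        pfs_value = (span(last_contact), False)

    # TTP: only progression is an event; a death without progression
    # is treated as censoring (an assumption of this sketch).
    if progressed:
        ttp_value = (span(progressed), True)
    else:
        ttp_value = (span(died or last_contact), False)

    return {"OS": os_value, "PFS": pfs_value, "TTP": ttp_value}


# A patient who dies without documented progression.
result = endpoints(date(2020, 1, 1), date(2020, 12, 31), died=date(2020, 7, 1))
print(result)
```

For this patient the death counts as a PFS event but only as a censored observation for TTP, which is why PFS tracks OS more closely.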
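9. Cis- and trans-acting elements
9.1 Cis-acting element
Also called a cis-element: a sequence in the flanking regions of a gene that affects the gene's expression. Cis-acting elements include promoters, enhancers, and silencers, and take part in regulating gene expression. A cis-acting element does not itself encode a protein. It provides a binding site; trans-acting factors bind there and change the properties of the site, thereby regulating the genes under that element's influence. Regulation can act on alternative splicing of the transcript, on the choice of transcription start site, and on transcription efficiency.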
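9.2 Trans-acting factor
Proteins of various kinds that bind directly to, or act indirectly on, nucleic acids such as DNA and RNA and thereby regulate gene expression, either activating or repressing it. The term mainly refers to transcription factors, specific proteins that bind to gene sequences. With the development of epigenetics, however, certain DNA and RNA fragments were found to have similar regulatory functions, so they are now also counted as trans-acting factors.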
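10. Targeted cancer therapies
Treatments that act at the molecular level on well-characterized oncogenic sites ("molecular targets"), using drugs or other means to interfere with and block the initiation, growth, and spread of tumors.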
11. Germline mutation
Germline mutations, also called hereditary mutations, are passed on from parents to offspring.
12. Somatic mutation
Somatic or acquired mutations are non-heritable mutations that can arise spontaneously in somatic cells due to mistakes in DNA replication, or from exposure to mutagens like UV radiation or certain chemicals, and the changes resulting from these mutations can lead to cellular transformation.
Somatic mutations can be identified by examining the genetic material in a suspect cell and comparing it to a cell from elsewhere in the body; the DNA in the two cells will differ, although it would normally be identical.
A change in the genetic structure that is not inherited from a parent, and also not passed to offspring, is called a somatic cell genetic mutation or acquired mutation.
Disease-causing mutations can also occur during the mitotic cell divisions that generate the embryo after fertilization and zygote formation. These mutations lead to individuals who are mosaic, with only a subset of their cells harboring the mutation. These mutations are de novo in the sense that they are not detectable in the parents of the affected individuals but are more specifically termed somatic mutations.
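13. Mitochondrial DNA
Mitochondrial DNA (mtDNA) is the DNA located inside mitochondria. It has a different evolutionary origin from the DNA in the cell nucleus and probably derives from early bacteria. Each human cell contains roughly 1,000 to 10,000 mitochondria, and each mitochondrion holds about 2 to 10 copies of mtDNA. Each mtDNA molecule is 16,569 base pairs long and carries 37 genes, which encode 13 proteins, 22 tRNAs, and 2 rRNAs. Its genes have fewer introns than nuclear genes, and some, such as the tRNA genes, have none.
Human mtDNA can also be used for individual identification.
Normally mitochondria are inherited only from the mother; in mammals, the egg cell generally destroys the mitochondria of the sperm after fertilization.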
14. de novo mutation
A de novo mutation is an alteration in a gene that is present for the first time in one family member as a result of a mutation in a germ cell (egg or sperm) of one of the parents or in the fertilized egg itself.
De novo mutations are typically present in the sperm or egg of one parent and yet are not detectable in blood taken from the parents; once transmitted to the embryo, they are present in all tissues of the offspring.
- Somatic Mutation, Genomic Variation, and Neurological Disease. Science 341, (2013); DOI:10.1126/science.1237758
15. Recurrent mutation
Recent analysis of mutations across cancer types (pan-cancer analysis) has revealed that relatively few genes are recurrently mutated in a high proportion of samples above expectation from a random distribution without clonal selection.
A recurrent de novo mutation would be a de novo mutation that occurs repeatedly.
e.g. Haemophilia A is caused by mutation of the FVIII gene. Two different inversion mutations explain half of all severe cases of haemophilia A. The inversion mutations occur repeatedly, i.e. de novo. They are caused by recombination between near-identical copies of a DNA segment found within and outside of the gene, in opposite orientation relative to one another (PMID:8275087; PMID:11756167).
Recurrent de novo point mutations can occur, for example, at CpG sites (PMID:25401298).
16. Driver mutations
Somatic mutations that have a role in creating, controlling and/or directing some aspect of the cancer phenotype.
17. Paired-end reads
Sequencing reads from each end of the same DNA molecule. Knowing the sequence of both reads and the length of the DNA molecule improves mapping to a reference sequence, de novo assembly and detecting structural variations.
18. Redundant sequence coverage
The total number of bases sequenced divided by the total number of bases in the haploid genome.
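The definition is a single division, shown here as a minimal sketch. The function names are hypothetical, and the 3.0 Gb haploid genome size is a round illustrative figure rather than a value taken from this glossary.

```python
def redundant_coverage(total_bases_sequenced, haploid_genome_size):
    """Total bases sequenced divided by haploid genome size."""
    if haploid_genome_size <= 0:
        raise ValueError("haploid_genome_size must be positive")
    return total_bases_sequenced / haploid_genome_size


def coverage_from_reads(n_reads, read_length, haploid_genome_size):
    """Same quantity, starting from a read count and a fixed read length."""
    return redundant_coverage(n_reads * read_length, haploid_genome_size)


# 600 million reads of 150 bases against an assumed 3.0 Gb haploid genome.
depth = coverage_from_reads(600_000_000, 150, 3_000_000_000)
print(depth)  # 30.0
```

This is the figure usually quoted as "30x"; it is an average and says nothing about how evenly the reads are spread.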
19. Mitochondrial inheritance
Mitochondria are cytoplasmic organelles important in cellular respiration
Have their own DNA
Carry 37 genes
Transmitted from mother to ALL of her offspring
No recombination
Males and females equally affected
High mutation rate
20. Pseudoautosomal region (PAR)
The pseudoautosomal regions (PAR1 and PAR2) are short regions of homology between the mammalian X and Y chromosomes.
The PARs behave like autosomes and recombine during meiosis. Genes in these regions are therefore inherited in an autosomal rather than a strictly sex-linked fashion.
PAR1 comprises 2.6 Mb of the short-arm tips of both X and Y chromosomes in humans and other great apes and is required for pairing of the X and Y chromosomes during male meiosis.
PAR2 is located at the tips of the long arms and is a much shorter region, spanning only 320 kb.
Genes in [PAR1](http://www.genenames.org/cgi-bin/genefamilies/set/715) and [PAR2](http://www.genenames.org/cgi-bin/genefamilies/set/716)
The locations of the PARs within GRCh38 are:
PAR1: chrY:10,000-2,781,479 and chrX:10,000-2,781,479
PAR2: chrY:56,887,902-57,217,415 and chrX:155,701,382-156,030,895
PAR3: chrY:3,571,959-5,881,959 and chrX:89,145,000-92,745,001
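With the coordinates listed above, checking whether a variant falls in a PAR is a simple interval lookup. The sketch below copies the GRCh38 intervals exactly as given; the function name `par_region` is hypothetical, and treating the intervals as closed on both ends is an assumption, since the list does not state its coordinate convention.

```python
# GRCh38 intervals copied from the list above: region -> chrom -> (start, end).
PAR_GRCH38 = {
    "PAR1": {"chrX": (10_000, 2_781_479), "chrY": (10_000, 2_781_479)},
    "PAR2": {"chrX": (155_701_382, 156_030_895),
             "chrY": (56_887_902, 57_217_415)},
    "PAR3": {"chrX": (89_145_000, 92_745_001),
             "chrY": (3_571_959, 5_881_959)},
}


def par_region(chrom, pos):
    """Return the name of the PAR containing chrom:pos, or None.

    Intervals are treated as closed on both ends (an assumption).
    """
    for name, spans in PAR_GRCH38.items():
        if chrom in spans:
            start, end = spans[chrom]
            if start <= pos <= end:
                return name
    return None


print(par_region("chrX", 1_000_000))   # PAR1
print(par_region("chrX", 50_000_000))  # None
```

A lookup like this is useful when deciding whether a heterozygous call on a male X chromosome is expected.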
21. PLINK - founder & heterozygous haploid genotypes
--filter-founders Include only founders; this excludes every sample with at least one known parental ID from the current analysis (that parent does not need to be in the current dataset)
--filter-nonfounders Include only nonfounders
--make-founders Set non-founders without two parents to founders. By default, if parental IDs are provided for a sample, they are not treated as a founder even if neither parent is in the dataset. With no modifiers, --make-founders clears both parental IDs whenever at least one parent is not in the dataset, and the affected samples are now considered founders.
--nonfounders Include all individuals in MAF/HWE calculations. Only founders are normally considered by these filters
--hwe-all HW filtering based on all founder individuals for binary trait (instead of just unaffecteds)
plink.nof List of SNPs with no observed founders
plink.hh List of heterozygous haploid genotypes (SNPs/individuals). This is usually caused by male heterozygous calls in the X chromosome pseudo-autosomal region. It can also be caused by incorrect sex information and/or an incorrect chromosome set.
nonfounder: at least one parental ID is known.
founder: neither parental ID is known.
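The founder rule can be applied directly to the lines of a PLINK .fam file, whose six whitespace-separated columns are family ID, individual ID, father ID, mother ID, sex, and phenotype, with `0` marking an unknown parent. The sketch below is a plain-Python illustration with hypothetical function names and made-up sample IDs. It follows the default rule stated above and does not reproduce what `--make-founders` does.

```python
def is_founder(father_id, mother_id):
    """A founder has no known parental ID ('0' means unknown in .fam)."""
    return father_id == "0" and mother_id == "0"


def split_founders(fam_lines):
    """Split .fam records into founders and nonfounders.

    Returns two lists of (family_id, individual_id) tuples.
    """
    founders, nonfounders = [], []
    for line in fam_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        family_id, individual_id, father_id, mother_id = fields[:4]
        target = founders if is_founder(father_id, mother_id) else nonfounders
        target.append((family_id, individual_id))
    return founders, nonfounders


# Made-up trio plus a child whose mother is listed but not in the dataset.
fam = [
    "FAM1 DAD 0 0 1 1",
    "FAM1 MOM 0 0 2 1",
    "FAM1 KID DAD MOM 1 2",
    "FAM2 SOLO 0 ABSENT_MOM 2 2",
]
founders, nonfounders = split_founders(fam)
print(founders)
print(nonfounders)
```

The last sample is a nonfounder even though its listed mother is absent from the data, which is the situation `--make-founders` is described as handling.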
22. What is allelic imbalance?
A difference in the expression between two alleles. Humans are diploid organisms, which means we have 2 copies of each gene. Normally, these two copies are expressed at the same level. This means that the mRNA transcript from the mother and the transcript from the father will have roughly the same number of copies.
Sometimes, however, this is not the case. When the ratio of the expression levels is not 1 to 1, we call it "allelic imbalance". There are a variety of reasons why expression may differ between the alleles. Genomic imprinting, in which epigenetic marks silence either the maternal or the paternal allele depending on its parent of origin, is one case.
If one allele is silenced completely, then there will be an extreme case of allelic imbalance. Other scenarios may increase or decrease expression of one particular allele only slightly, resulting in imbalance to a lesser degree.
Cis-acting mutations may alter regulation for just one allele through a change to promoter/enhancer regions (transcription factor binding sites), or even through 3′ UTR mutations that affect mRNA stability or microRNA binding.
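One common way to test for a departure from the 1 to 1 ratio is to count RNA-seq reads carrying each allele at a heterozygous site and apply a binomial test. The sketch below is an exact two-sided binomial test written with the standard library only; the function name and read counts are hypothetical, and real analyses usually also need to handle reference-mapping bias and overdispersion, which this does not.

```python
from math import comb


def allelic_imbalance_pvalue(ref_reads, alt_reads, expected_ref_fraction=0.5):
    """Exact two-sided binomial p-value for allele-specific read counts.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the expected reference-allele fraction.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    p = expected_ref_fraction
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so ties are not lost to rounding.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


print(allelic_imbalance_pvalue(5, 5))    # balanced: p-value of 1
print(allelic_imbalance_pvalue(10, 0))   # one allele silent: small p-value
```

With ten reads all from one allele, the p-value is 2/1024, the combined probability of the two most extreme outcomes.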
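23. Genomic imprinting
Genomic imprinting is a non-Mendelian phenomenon in which certain genetic traits are transmitted through one parent only. Paternal and maternal alleles are modified as they pass to the offspring through the sperm and the egg, so that alleles carrying a parental imprint have different expression properties.
Imprinting is an epigenetic regulatory mechanism. During the development of diploid mammals it controls the differential expression of alleles according to their parent of origin.
Studies show that most imprinted genes involve transcription of long non-coding RNAs (lncRNAs, non-coding RNAs longer than 200 nt), which bring about imprinting mainly through transcriptional interference in cis.
Abnormal imprinting, and abnormal expression of the associated lncRNAs, are linked to many congenital diseases. Dozens of human genetic diseases have so far been found to involve imprinting, and lncRNA-mediated imprinting plays an important role in the onset and treatment of disease.
Imprinted genes are generally clustered. A cluster contains 3 to 12 genes, spans 20 to 3,700 kb, and forms one or more imprinting control regions (ICRs), also known as differentially methylated regions (DMRs).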
24. Targeted amplicon sequencing (TAS) vs. Targeted sequencing
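Targeted amplicon sequencing (TAS): primers are designed for the genomic regions of interest, the target DNA is enriched by PCR amplification, and the enriched DNA is then sequenced at high throughput.
Targeted capture sequencing (Target Seq): custom probes for the genomic regions of interest are hybridized to genomic DNA, and the enriched target DNA is then sequenced at high throughput.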
Amplicon resequencing is a kind of targeted resequencing.
With amplicon sequencing, you do PCR with two primers flanking the region you care about. Presumably, you'd try to multiplex a bunch of them together per sample. Then you make a sequencing library of all those amplicons.
The other common way to sequence a particular subset of a genome is to use a capture probe that binds to your sequence of interest, then you sequence that library. Large sets of probes designed to capture exonic sequence are commonly used.
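To make the "two primers flanking the region" idea concrete, here is a minimal in-silico sketch that pulls the amplicon out of a template sequence. The function names and the toy sequences are made up for illustration, and the match is exact, whereas real primers tolerate some mismatches.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def extract_amplicon(template, forward_primer, reverse_primer):
    """Return the region amplified by a primer pair, primers included.

    The forward primer matches the template strand as written; the
    reverse primer anneals to the opposite strand, so its reverse
    complement is what appears in the template. Returns None if either
    site is missing.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    site = template.find(reverse_site, start + len(forward_primer))
    if site == -1:
        return None
    return template[start:site + len(reverse_site)]


# Toy template: flank + forward site + target + reverse site + flank.
template = "TTTT" + "ACGTAC" + "GGATTACAGGC" + "CTTAGC" + "AAAA"
amplicon = extract_amplicon(template, "ACGTAC", "GCTAAG")
print(amplicon)
```

Capture-based targeting has no equivalent of this step: the probe only needs to hybridize somewhere within the fragment, so fragment ends are not fixed by the design.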
25. Allele
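One of two or more alternative versions of a gene found at the same position (locus) on homologous chromosomes. A diploid individual carries two alleles at each autosomal locus, one inherited from each parent.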