Welcome back. This is the second module in our unit, which is essentially an introduction to molecular genetics, at least enough molecular genetics that we can really talk about what behavioral geneticists are doing in this area. Today I'm going to talk about what a gene is. Last time we talked about DNA, and that DNA is the code for life, in that it codes for the proteins that make up our bodies. Proteins are chains of amino acids. The way this takes place in our bodies is through what's called the central dogma of molecular biology. It's a multi-step process. DNA is first transcribed into RNA, and then RNA serves as a template for the synthesis of a protein. That second step is called translation.

So let's look at this process a little more closely. Here's a representation of a gene. When you read the literature and a researcher gives a graphic of a gene, it's probably going to look a little bit like this. What's usually represented when a publication gives you a gene, or the sequence of a gene, is the sense strand of the DNA. Remember, DNA is double stranded. One strand is called the sense strand, the other is called the antisense strand. The sense strand of the DNA will look like the transcribed RNA. So the schematic representation of a gene is usually in terms of that sense strand.

I want to highlight three things about this representation of the gene. The first is that there's an orientation. There's a five prime end and a three prime end. Last time I mentioned that the designation of five prime versus three prime has to do with the carbon atoms in the DNA molecule. But for our purposes, what's really important is that the way the gene is being read, that is, the way it's being transcribed, is from this end to that end, from five prime to three prime.
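To make the sense-strand convention concrete, here is a minimal sketch in Python. The short sequence and the function names are hypothetical, chosen only for illustration. The sketch assumes two standard facts beyond what was said above: the two DNA strands are complementary and run in opposite directions, and RNA uses U where DNA has T.

```python
# Toy illustration of sense strand, antisense strand, and transcript.
# The sequence is an arbitrary example, not taken from a real gene.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def antisense_strand(sense: str) -> str:
    """Return the antisense strand written 5' to 3' (the reverse complement)."""
    return "".join(COMPLEMENT[base] for base in reversed(sense))

def transcribe(sense: str) -> str:
    """Return the RNA written 5' to 3'.

    The RNA is built against the antisense strand, so it ends up
    looking like the sense strand with U in place of T.
    """
    return sense.replace("T", "U")

sense = "GATTACAGGC"                 # written 5' -> 3'
antisense = antisense_strand(sense)  # also written 5' -> 3'
rna = transcribe(sense)

print(sense)      # GATTACAGGC
print(antisense)  # GCCTGTAATC
print(rna)        # GAUUACAGGC
```

Reading the printed sense strand and the RNA side by side shows why publications draw the sense strand: apart from T versus U, it is the transcript.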
So the five prime end is called the upstream end of the gene. The three prime end is the downstream end. You'll often hear those terms when geneticists talk, upstream versus downstream. That's the first thing I want to highlight here.

The second thing I want to highlight is that on the upstream end, that five prime end, there are regulatory regions. Last time I said I would say something, and we won't get very in depth into this, about the fact that there are two strands of DNA. How does the transcription machinery know which strand to transcribe? The way it knows is that there is, first of all, a promoter region, a sequence of DNA that has certain properties. It might be just several hundred bases, or it might be as many as a thousand bases. Sitting upstream, on the five prime end of the start of the gene, the promoter tells the transcription machinery where to begin to transcribe. So although it's not within the transcribed region of the gene itself, it's very important to the transcription process. It's a regulatory region. We'll talk about promoter regions as we go through the course.

The third thing, and maybe this was obvious when I first put up the slide, is that a gene, and this is a representation of a single gene here, is not a continuous sequence of translated DNA bases. Rather, genes contain both exons and introns. Exons are the expressed regions of the gene. The coding regions of the gene, the DNA that actually will be translated into protein, are in the exons. The typical human gene has more than one exon. This gene here has four exons. In between the exons are what are called intervening sequences, or introns. The intervening sequences are not translated into protein.
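The layout just described, a promoter upstream followed by alternating exons and introns, can be sketched as a small data structure. This is only an illustration in Python: the `Segment` class, the helper functions, and every length below are hypothetical and do not describe the gene on the slide, apart from its having four exons.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    kind: str    # "promoter", "exon", or "intron"
    length: int  # number of bases (made-up values)

# Segments listed from the 5' (upstream) end to the 3' (downstream) end.
toy_gene = [
    Segment("promoter", 500),
    Segment("exon", 120),
    Segment("intron", 800),
    Segment("exon", 200),
    Segment("intron", 350),
    Segment("exon", 150),
    Segment("intron", 600),
    Segment("exon", 300),
]

def count(gene, kind):
    """Number of segments of the given kind."""
    return sum(1 for seg in gene if seg.kind == kind)

def total_length(gene, kind):
    """Total number of bases in segments of the given kind."""
    return sum(seg.length for seg in gene if seg.kind == kind)

n_exons = count(toy_gene, "exon")
n_introns = count(toy_gene, "intron")
exonic_bases = total_length(toy_gene, "exon")
intronic_bases = total_length(toy_gene, "intron")

print(n_exons, n_introns)            # 4 3
print(exonic_bases, intronic_bases)  # 770 1750
```

With four exons there are three introns, since each intron sits between two exons. The promoter is kept as its own segment because it lies upstream of the gene rather than inside it.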
So the third thing I want to highlight here is that the basic gene structure is made up of exons and introns. The coding regions, the part of the gene that actually gets translated into protein, will be in the exons, not the introns.

Here's a very famous gene called the beta hemoglobin gene, and I'm going to use it as an example as we go through some of the molecular genetic principles here. It's the gene that's involved in sickle cell anemia. Like most human genes, it's made up of both exons and introns. In this case there are three exons and two introns.

Now, under the central dogma of molecular biology, the DNA is first transcribed into RNA. It turns out that both the exons and the introns are transcribed. Transcription begins here and ends there. So for the beta hemoglobin gene, which codes for a sequence 146 amino acids long, the whole 1,606 bases in this sequence will be transcribed into RNA. That RNA is then spliced: the intronic regions are spliced out. Here the introns are designated by blue, and you can see that in what is called the mature RNA the introns have been spliced out, and we're left with just the exonic material. In this case, the mature RNA has 143 bases from the first exon, 222 bases from the second, and 261 bases from the third, for 626 bases in total.

The last step is to translate the RNA into protein. Now, it turns out that in the mature RNA there are some regions that don't actually get translated. I'm going to look at those in a second and introduce the terms, because we'll probably see them when we talk about schizophrenia. So in this last step we're going to translate the RNA molecule into a sequence of amino acids, and I'll again use the beta hemoglobin gene to illustrate this. Here are the three exons. The first exon is actually made up of two parts.
One part is in the mature RNA but doesn't end up being translated. In the case of this particular gene, it's made up of 53 bases. It's called the five prime untranslated region, or five prime UTR, UTR for untranslated region. It's in the mature RNA, but again, it's not going to code for protein. The last exon of a human gene will also have two parts: a part that actually gets translated into protein, and then something comparable to the five prime UTR, what's called the three prime untranslated region. These untranslated regions can have regulatory function, so they're important, but they won't get translated into protein.

In this case, for the beta hemoglobin gene to produce 146 amino acids, there are 90 coding bases here, 222 bases here, and 126 bases there. If you add those up, you get 438 bases. Then you divide that by three, because remember, the basic coding unit here is a codon, three nucleotide bases, and a base again is a G, C, T, or A. 438 divided by 3 gives us a polypeptide chain, a sequence of amino acids, 146 amino acids long. That'll give us the beta hemoglobin protein.

Next time we'll begin to scale this up and talk about the whole human genome.
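The base counts for the beta hemoglobin gene can be checked with a few lines of Python. This is a sketch that uses only the figures quoted in the lecture. The variable names are hypothetical, and two values, the total intron length and the length of the three prime UTR, were not stated and are derived here by subtraction, on the assumption that the 1,606 bases cover exactly the three exons and two introns.

```python
# Figures quoted in the lecture for the beta hemoglobin gene.
TRANSCRIBED_BASES = 1606        # exons plus introns
EXON_BASES = [143, 222, 261]    # bases per exon in the mature RNA
FIVE_PRIME_UTR = 53             # untranslated part of the first exon
CODING_BASES = [90, 222, 126]   # translated bases in each exon
CODON_SIZE = 3                  # one codon = three nucleotide bases

# Splicing removes the introns, leaving only exonic material.
mature_rna = sum(EXON_BASES)
intronic = TRANSCRIBED_BASES - mature_rna           # derived, not quoted

# The untranslated regions sit at the two ends of the mature RNA.
three_prime_utr = EXON_BASES[2] - CODING_BASES[2]   # derived, not quoted

# Translation reads the coding bases three at a time.
coding = sum(CODING_BASES)
amino_acids, leftover = divmod(coding, CODON_SIZE)

print(mature_rna)       # 626
print(intronic)         # 980
print(three_prime_utr)  # 135
print(coding)           # 438
print(amino_acids)      # 146
```

As a consistency check, the first exon's 143 bases split into 53 untranslated plus 90 coding, and the mature RNA's 626 bases split into 53 + 438 + 135.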