Anatomy of the Pfizer/BioNTech COVID19 mRNA Vaccine

Kevin Folta
Dec 24, 2020


The WHO Published the Details of the Injected Sequence - Here’s What it Means

“You have no idea what’s in that injection, other than the microchips,” the meme proclaims.

But thanks to great transparency we know exactly what’s in the new Pfizer / BioNTech mRNA vaccine directed against SARS-CoV2. First, no microchips! The following information comes from a release by the World Health Organization. To the average person it probably looks frightening, an injection of jargon. That’s why I’m here to sort it out for you.

What is mRNA?

RNA is a chemical compound very similar to DNA, the master blueprint of the cell. Messenger RNA (or mRNA) is a temporary copy of that blueprint that carries the information from the place in the cell where the master code is stored, the nucleus, to a different part of the cell, the cytoplasm. In this compartment it is literally decoded and used to assemble a specific protein that plays a role in metabolism or structure in the cell. Most important, while DNA is stable and secure (as a master blueprint should be), RNA is inherently unstable and goes away shortly after its information is delivered.

The mRNA vaccine is a special molecule made in the lab that instructs your cells to make one tiny part of the virus (an antigen) that will trick your body into thinking there’s an infection when there really is not. It has various features that aid in production of that protein antigen.

Anatomy of the Vaccine mRNA Molecule

The vaccine contains 30 micrograms of mRNA. A gram is about the weight of a dollar bill. So if you cut the dollar in half, then in half again, then 13 more times, the piece left over is about the same weight as the mRNA injected. That’s a small amount, equivalent to 30 seconds out of the last 11.5 days. It contains the information that your cells decipher into the “spike” protein from SARS-CoV2, the infectious agent that causes COVID19 symptoms. Because your cells are instructed to be factories creating this little signature of the virus, your body is ready to mount a defense when the virus actually comes along. Neat trick!
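The two comparisons above are easy to check. Here is a minimal Python sketch of the arithmetic, assuming (as the text does) that a dollar bill weighs about one gram; the variable and function names are just for illustration.

```python
# Assumption taken from the text: a dollar bill weighs about 1 gram.
BILL_GRAMS = 1.0
DOSE_MICROGRAMS = 30.0
MICROGRAMS_PER_GRAM = 1_000_000


def halve(mass_grams: float, times: int) -> float:
    """Return the mass left after halving a piece `times` times."""
    return mass_grams / (2 ** times)


# "In half, then in half again, then 13 more times" is 15 halvings in total.
piece_micrograms = halve(BILL_GRAMS, 15) * MICROGRAMS_PER_GRAM

# The dose as a fraction of the whole bill.
dose_fraction = DOSE_MICROGRAMS / (BILL_GRAMS * MICROGRAMS_PER_GRAM)

# 30 seconds as a fraction of 11.5 days.
seconds_in_11_5_days = 11.5 * 24 * 60 * 60
time_fraction = 30 / seconds_in_11_5_days

print(f"Piece after 15 halvings: {piece_micrograms:.1f} micrograms")
print(f"Dose as a fraction of the bill: {dose_fraction:.2e}")
print(f"30 seconds as a fraction of 11.5 days: {time_fraction:.2e}")
```

Both fractions come out at roughly three parts in 100,000, which is why the two analogies describe the same proportion.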

The “Cap”

From left to right, the RNA has multiple parts. The “cap” is a modification that the cell makes to most RNA molecules. It has several functions, including protecting the end from degradation and connecting with the ribosome, the cellular machinery that takes the information in RNA and creates a protein based on it. The image above shows the detail of the cap structure. You can see that the left half looks like the right half, connected by three phosphates (the P’s between them). This is how just about all of your RNA is capped, with a backwards “G”. The information in RNA starts with the stuff on the right side.

The “5'-UTR”

The next feature is called the 5'-UTR or 5'-untranslated region. The “ 5' ” part is just a chemical notation that tells us it is on the front part of this RNA molecule. “Untranslated Region” means that this part of the RNA does not contain information about the protein to be made, but it has information that will make it happen efficiently.

The description in their text says, “human alpha globin RNA with an optimized Kozak sequence.” The 5'-UTR of the human alpha globin gene imparts stability to the alpha globin RNA (Russell and Liebhaber, 1996), so adding it to the sequence here makes the spike protein RNA more stable. That is important, as the cell has many mechanisms that attack and degrade RNA. Like the aforementioned cap, the alpha globin 5'-UTR helps keep this molecule intact so its information can be translated into a protein that will induce an immune response. Sometimes the 5'-UTR also contains signals that make the conversion of that information into a protein more efficient.

Modified Kozak Sequence

The DNA molecule is a chain of four molecules we familiarly call A, G, C, and T, letters representing the actual chemical constituents. RNA is slightly different. It reflects the information in DNA, only replacing “T” with “U” (instead of thymine there is uracil, a slightly different chemical found in RNA).

Thymidine nucleotides in DNA are represented by uracil in RNA. Note the slight differences in the upper right ring and the lower ring. The one on the left is missing an “OH” (it is just an H, hence “deoxy”) and it also has that extra line, which implies a methyl group.
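The T-to-U swap can be sketched in a few lines of Python. This is only an illustration of the change in lettering, not a model of how cells copy DNA. The function name `dna_to_rna` and the short example sequence are made up for this sketch, and it assumes the DNA is written as the coding strand, so the RNA reads the same apart from T becoming U.

```python
def dna_to_rna(dna: str) -> str:
    """Rewrite a DNA coding-strand sequence in RNA letters (T becomes U)."""
    return dna.upper().replace("T", "U")


# Hypothetical nine-letter snippet, for illustration only.
example_dna = "ATGTTCGTG"
example_rna = dna_to_rna(example_dna)

print(example_dna)  # ATGTTCGTG
print(example_rna)  # AUGUUCGUG
```

Every other letter stays in place, so the information carried by the sequence is the same.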

The “Kozak Sequence” is defined as the A, C, G, and U that are present right around where the information in the RNA will be translated into protein. It is kind of a “start here” signal, originally identified through brilliant experiments by Dr. Marilyn Kozak, starting in the 1980s. The letters in the Kozak Sequence greatly affect the efficiency of translating the information in RNA into a protein, in this case the Spike protein of the virus.
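To make the “start here” idea concrete, here is a rough Python sketch. It relies on a common textbook simplification that is not spelled out in the WHO release: the two letters that matter most are the one three positions before the AUG start codon (ideally A or G) and the one right after it (ideally G). The function name, the strong/adequate/weak labels, and the example sequences are all illustrative assumptions, not the vaccine's actual sequence.

```python
def kozak_strength(rna: str, aug_index: int) -> str:
    """Classify the context around a start codon using the simplified -3/+4 rule."""
    if rna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG start codon at the given index")

    # Letter three positions before the A of AUG (empty if the sequence is too short).
    minus3 = rna[aug_index - 3] if aug_index >= 3 else ""
    # Letter immediately after the G of AUG (empty if the sequence ends there).
    plus4 = rna[aug_index + 3] if aug_index + 3 < len(rna) else ""

    hits = (minus3 in ("A", "G")) + (plus4 == "G")
    return {2: "strong", 1: "adequate", 0: "weak"}[hits]


# Hypothetical examples: the first matches the classic consensus, the second does not.
print(kozak_strength("GCCACCAUGG", 6))  # strong
print(kozak_strength("UUUUUUAUGU", 6))  # weak
```

Real start-site efficiency depends on more than two letters, so treat this as a cartoon of the concept rather than a predictor.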

What Does the “Psi” Mean in the Sequence?

Here is part of the mRNA sequence:

Not G, A, C and U. What’s up with that?

The normal U (for uracil) in the RNA sequence is replaced by its cousin, pseudouridine, written with the Greek letter psi (Ψ). More precisely, it is replaced by a slightly different form called 1-methylpseudouridine. Pseudouridine occurs naturally and is a feature of other types of RNA molecules, namely something called transfer RNA or tRNA.

Replacing the normal U with 1-methylpseudouridine helps the molecule evade recognition by cellular receptors that would flag the molecule as foreign and start a response against it that could trigger negative physiological events (for geeks: dendritic cells produce fewer cytokines). This discovery was a key advance in the creation of mRNA-based therapeutics and vaccines (Karikó et al., 2005).

The replacement does not change the information in the RNA message. It only changes its visibility to cellular surveillance, helping to ensure that the vaccine will do its job without collateral effects.
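Here is a small Python sketch of why the swap leaves the message intact: the cell's decoding machinery reads Ψ as if it were U, so the same three-letter words spell out the same protein building blocks. The sequence is a made-up nine-letter example and the codon table holds only the three entries it needs; both are illustrative assumptions, not the vaccine sequence.

```python
PSI = "Ψ"

# A tiny slice of the standard genetic code, just enough for the example.
CODONS = {"AUG": "Met", "UUC": "Phe", "GUG": "Val"}


def to_modified(rna: str) -> str:
    """Write every U as psi, the way the published sequence is displayed."""
    return rna.replace("U", PSI)


def read_like_ribosome(rna: str) -> list[str]:
    """Translate three letters at a time, treating psi exactly like U."""
    plain = rna.replace(PSI, "U")
    return [CODONS[plain[i:i + 3]] for i in range(0, len(plain) - 2, 3)]


original = "AUGUUCGUG"  # hypothetical snippet
modified = to_modified(original)

print(modified)                      # AΨGΨΨCGΨG
print(read_like_ribosome(original))  # ['Met', 'Phe', 'Val']
print(read_like_ribosome(modified))  # ['Met', 'Phe', 'Val']
```

The letters on the page change, but the protein they encode does not.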

“Sig” The Signal Peptide

Once synthesized, the protein has to leave the cell or at least collect on the membrane to be recognized by the immune system. The next feature on the mRNA is referred to as “sig” in the World Health Organization release, and it is the first part of the mRNA vaccine that is translated into a protein. It is not exactly clear where the information for this sequence came from, but it is noted in public databases as the signal sequence from the SARS-CoV2 antigen. The sequence directs the synthesized protein to secretion and is typically cleaved off, revealing the final protein that induces the immune response.

S Prot_mut

This section of the mRNA vaccine is the most important. It creates the physical structure, part of the “spike” protein, that will induce the immune response.

On the diagram it is labeled as “S Prot_mut” which is shorthand for mutant spike protein. The spike proteins are the little projections that decorate the coronavirus. These are the features that physically connect to discrete receptors on your cell’s surface, shaking hands with it, and telling the cell “Bring me aboard” as the first step in infection.

Why is it a mutant? The spike protein does not exist in a constant shape. It changes slightly, and that could make it challenging for antibodies to detect. Scientists figured out how to lock the spike protein in its prefusion state, the form it takes before it connects to the cell. They lock it into that conformation by changing a few amino acids, the protein’s building blocks (for geeks only: changing a few to proline) (Hsieh et al., 2020).

The mutant form is more stable and always present in the prefusion state, the one that is present on the virus, so that the body’s immune system will be primed to recognize it.


The “3'-UTR”

Like the 5'-UTR, the 3'-UTR (UTR = untranslated region) is just extra RNA that can confer stability to the mRNA as a whole. There are two elements in the Pfizer/BioNTech mRNA SARS-CoV2 vaccine, derived from the amino-terminal enhancer of split (AES) and the mitochondrial 12S RNA genes. Both were described as conferring stability to the mRNA, as well as enhancing the production of the encoded protein, and are common features in mRNA-based therapeutics and vaccines (von Niessen et al., 2019). These features help the information in the mRNA stay around the cell a bit longer and create more of the important protein that activates the immune response against the SARS-CoV2 coronavirus if/when it actually arrives.

poly (A) Tail

The poly(A) part of the diagram is simply a modification that helps keep the mRNA from being digested by cellular enzymes. Just about all mRNAs in the cell are decorated with this terminal extension, the “tail” of the mRNA. The tail helps resist the enzymes that constantly assault the mRNA, trying to degrade it. It is important in the vaccine because it provides stability to the molecule.
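Putting the pieces together, the whole molecule can be written out as an ordered list, left to right. The Python sketch below is only a summary of the sections above. The `Segment` type and its field names are invented for illustration, and the one-line role descriptions are paraphrases of this article rather than wording from the WHO release.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    name: str         # label used in this article
    translated: bool  # does this part end up encoded in protein?
    role: str         # one-line summary of what it does


# Order follows the diagram described in the text, front (5') to back (3').
VACCINE_MRNA = [
    Segment("cap", False, "protects the front end and connects with the ribosome"),
    Segment("5'-UTR", False, "adds stability; includes the optimized Kozak sequence"),
    Segment("sig", True, "signal peptide that directs the protein toward secretion"),
    Segment("S Prot_mut", True, "spike protein locked in its prefusion shape"),
    Segment("3'-UTR", False, "adds stability and boosts protein production"),
    Segment("poly(A) tail", False, "resists enzymes that degrade the mRNA"),
]

translated = [segment.name for segment in VACCINE_MRNA if segment.translated]

print(" -> ".join(segment.name for segment in VACCINE_MRNA))
print("Translated into protein:", translated)
```

Only two of the six parts are read out as protein. Everything else is there to keep the message stable and help it get translated efficiently.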

In Conclusion

While the concepts of the mRNA vaccine may seem new, these products have been in development for about 30 years. Development has been punctuated by several breakthroughs that hastened drug design. The most important take-home message is that this is not magic, not anything wacky and experimental — the vaccines are the outcome of three decades of hard work, creative experimentation, and a substantial investment.

I’m glad to answer your questions @kevinfolta on Twitter.

Cited References

Hsieh, Ching-Lin, et al. “Structure-based Design of Prefusion-stabilized SARS-CoV-2 Spikes.” bioRxiv (2020).

Karikó, Katalin, et al. “Suppression of RNA recognition by Toll-like receptors: the impact of nucleoside modification and the evolutionary origin of RNA.” Immunity 23.2 (2005): 165–175.

Russell, J. Eric, and Stephen A. Liebhaber. “The stability of human beta-globin mRNA is dependent on structural determinants positioned within its 3′ untranslated region.” Blood 87.12 (1996): 5314–5323.

von Niessen, Alexandra G. Orlandini, et al. “Improving mRNA-based therapeutic gene delivery by expression-augmenting 3′ UTRs identified by cellular library screening.” Molecular Therapy 27.4 (2019): 824–836.


