Abstract: Helicobacter pylori (H. pylori) bacteria residing in the human stomach can cause gastrointestinal diseases and cancer. Identifying effective sequence biomarkers would help to estimate disease risk. The CagA protein, present in some strains, is one virulence factor. In this work, 272 H. pylori strain sequences were retrieved from NCBI. Some types and patterns of CagA EPIYA motifs, including amino acid variations, were found only in our study when compared with previously published clinical data. Two phylogenetic trees, one built from CagA proteins translated from cagA genes and the other from the concatenated seven housekeeping genes of the same strains, showed two similar main clades. In a significance test, our set of CagA proteins from EPIYA-ABD strains still showed the same distribution of two deletion sites before the first EPIYA motif region. This agrees with previous research, in which the two deletions were significantly associated with EPIYA-ABD sequences isolated from gastric cancer patients. Moreover, the best alignments between the seven allele sequences of each PubMLST sequence type and the seven housekeeping genes of the EPIYA-ABD strains enabled us to identify individual EPIYA-ABD strains or strain groups. In conclusion, sequence analyses such as those in this work may further improve protocols for assessing H. pylori-associated gastric cancer risk.