Open access edition 31_2: 18th April 2025, refreshed Oct 2025 release notes
- Where?
- A web-based version of this application is currently available at GRCh38 Reads Generator; just click the link, no user name or password is needed.
- How? Check the Help page for advice on browser settings and application features.
- What? SyRGen creates reads with known variants that you can put through an informatics pipeline to ensure the variants are being correctly identified.
- Why? We need standards and this system creates test data that may anticipate data recoverable from rare tissue types.
- When? – refresh this page in case of maintenance updates
- Tell us what you think: Email syrgenreads@gmail.com
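The check described under "What?" can be sketched as a comparison between the variants built into the reads and the variants a pipeline reports. This is only an illustrative sketch: the function name `compare_calls`, the use of plain HGVS strings as identifiers, and the example "called" list are assumptions for illustration, not part of SyRGen or of any particular pipeline.

```python
def compare_calls(expected, called):
    """Compare the variants placed in the reads with those a pipeline reported.

    Both arguments are iterables of variant identifiers (here, HGVS strings).
    """
    expected, called = set(expected), set(called)
    return {
        "found": sorted(expected & called),       # correctly identified
        "missed": sorted(expected - called),      # in the reads, not reported
        "unexpected": sorted(called - expected),  # reported, not in the reads
    }


# Two AK2 variants from this edition as the known (expected) set.
expected = ["NM_001625.4:c.453del", "NM_001625.4:c.336_338del"]
# Hypothetical pipeline output, invented for this example.
called = ["NM_001625.4:c.336_338del", "NM_001625.4:c.100A>G"]

report = compare_calls(expected, called)
for category, variants in report.items():
    print(category, variants)
```

A real comparison would normally work on normalised genomic coordinates rather than raw strings, since the same variant can be written in more than one way.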
Contents of the data in this edition:
Genomic, mRNA and cDNA sequence data are available for the following reference genes, amongst others:
AK2 – Mutations in AK2 are known to cause reticular dysgenesis, a severe neonatal condition. Six different Haplotype Variants are presented here, including a splice-junction SNV; known and imagined variants are included.
NB: The MANE_Select definition for AK2 is absent from Ensembl’s GRCh37 feature data, but it has been edited into the GRCh37 data for SyRGen. To emphasise this, the hyperlink text to Ensembl for this Haplotype includes “GRCh38”, to differentiate it from links to GRCh37 for the other transcripts.


Table of AK2 variants included in SyRGen:
| Haplotype Variant Source Name | HGVS expression (CDS-sequence definition) | Alternative definitions or comment |
| --- | --- | --- |
| AK2 EB0001 | NM_001625.4(AK2): c.453del(p.Tyr152fs) implemented as NM_001625.4(AK2): c.452_453CC>C(p.Tyr152fs) | Single-base deletion clinvar:18254 dbSNP:rs1553151177 |
| AK2 EB0002 | NM_001625.4(AK2): c.336_338del(p.Asp113del) | Three-base deletion clinvar:840748 |
| AK2 EB0003 | NM_001625.4(AK2): c.331-1G>A | Splice-site SNV clinvar:18253 dbSNP:rs1192619329 |
| AK2 EB0004 | NM_001625.4(AK2): c.698_699del(p.Lys233fs) | Deprecated clinvar:1034623 |
| AK2 EB0005 | NM_001625.4(AK2): c.350_402del(p.Lys117Thrfs*32) | Theoretical; 53-base-long deletion |
| AK2 EB0006 | NM_001625.4(AK2): c.406_425+3dup dup=ATCCGAAGAATCACAGGAAGGTA | Theoretical; 23-base duplication crossing a splice-boundary. |
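The lengths quoted in the table (53 bases for EB0005, 23 bases for EB0006) follow from the HGVS coordinates. The sketch below shows the arithmetic. It is a simplified illustration, not SyRGen's code: the helper names are hypothetical, and it assumes the range starts in an exon and ends either in the same exon or at a "+offset" position in the intron immediately after it.

```python
import re

# A CDS position such as "402" or "425+3" (exon base plus intron offset).
_POSITION = re.compile(r"^(\d+)([+-]\d+)?$")


def parse_position(text):
    """Split an HGVS CDS position into (exon_base, intron_offset)."""
    match = _POSITION.match(text)
    if match is None:
        raise ValueError(f"unsupported position: {text!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def span_length(start, end):
    """Count the bases from start to end, inclusive.

    Assumes start is an exonic position and end is in the same exon,
    optionally extended by a '+N' offset into the following intron.
    """
    start_base, start_offset = parse_position(start)
    end_base, end_offset = parse_position(end)
    if start_offset != 0 or end_offset < 0:
        raise ValueError("only exon-to-exon or exon-to-'+N' ranges are handled")
    return (end_base - start_base + 1) + end_offset


# AK2 EB0005: c.350_402del
print(span_length("350", "402"))    # 53
# AK2 EB0006: c.406_425+3dup
print(span_length("406", "425+3"))  # 23
print(len("ATCCGAAGAATCACAGGAAGGTA"))  # 23, the duplicated sequence
```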
ATM – A long gene with multiple alternate mRNA transcripts and a few example variants.
BRCA1 – A demonstration, in the GRCh38 set, that the Sequence Reads Simulator accurately excludes variant features within introns when selecting an mRNA Template, compared to a genomic Template. The output data files for BRCA1 are listed in detail, as examples, with some annotation.
CIITA – A gene associated with bare lymphocyte syndrome (BLS). Seven Haplotype variants are listed; known and imagined variants are included.
Table of CIITA variants included in SyRGen:
| Haplotype Variant Source Name | HGVS expression (CDS-sequence definition) | Alternative definitions or comment |
| --- | --- | --- |
| CIITA EB0101 | NM_000246.4(CIITA): c.36C>A (p.Tyr12Ter) | SNV clinvar:1076860 dbSNP:rs367628451 |
| CIITA EB0102 | NM_000246.4(CIITA): c.359-2A>G | clinvar:1068098 |
| CIITA EB0103 | NM_000246.4(CIITA): c.2890_2969+1del81;NP_000237.2: p.(Leu964Profs*6) | HGMDPro:P_000237.2 dbSNP:rs1555507411 |
| CIITA EB0104 | NM_000246.4: c.3229_3233+7delATGGAGTGAGTG | HGMDPro:unknown |
| CIITA EB0105 | NM_000246.4(CIITA): c.1820_1848dup dup=ACAGCCACAGCCCTACTTTGTGCCGGGCA | Theoretical duplication. 1820_1848 defines the sub-sequence to duplicate. Written as a delins of that range, the inserted sequence would have to contain the sub-sequence twice, so another way to define this in “Create a Haplotype Variant” is a delins anchored on the preceding base: c.1819C>CACAGCCACAGCCCTACTTTGTGCCGGGCA NB: the CIGARs are different |
| CIITA EB0106 | NM_000246.4(CIITA): c.975_976insCTTTTGGAATA | Theoretical insert between these two positions. In “Create a Haplotype Variant”, use a delins: c.975_976AG → ACTTTTGGAATAG |
| CIITA EB0107 | NM_000246.4(CIITA): c.975_976insCTTTTGGAATA | As EB0106; but EB0107 uses an insert definition that is not currently definable in “Create a Haplotype Variant” |
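The delins forms given for EB0105 and EB0106 can be built mechanically from the reference bases next to the change. The following sketch illustrates this. The function names are hypothetical and this is not the code behind “Create a Haplotype Variant”; the flanking bases (A and G at c.975 and c.976, C at c.1819) are taken from the table above.

```python
def insertion_as_delins(left_base, right_base, inserted):
    """Express an insertion between two adjacent bases as (ref, alt).

    The two flanking reference bases are replaced by the same two bases
    with the inserted sequence between them.
    """
    ref = left_base + right_base
    alt = left_base + inserted + right_base
    return ref, alt


def duplication_as_anchored_delins(anchor_base, duplicated):
    """Express a duplication as (ref, alt) anchored on the preceding base.

    The anchor base is replaced by itself followed by one extra copy of
    the duplicated sequence.
    """
    return anchor_base, anchor_base + duplicated


# CIITA EB0106: c.975_976insCTTTTGGAATA, flanked by A (975) and G (976)
print(insertion_as_delins("A", "G", "CTTTTGGAATA"))
# ('AG', 'ACTTTTGGAATAG')

# CIITA EB0105: c.1820_1848dup, anchored on C at c.1819
print(duplication_as_anchored_delins("C", "ACAGCCACAGCCCTACTTTGTGCCGGGCA"))
# ('C', 'CACAGCCACAGCCCTACTTTGTGCCGGGCA')
```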
EGFR – In the GRCh38 set: includes exon-19 deletions not detected in available tests, as identified in “Molecular characteristics and clinical outcomes of EGFR exon 19 indel subtypes to EGFR TKIs in NSCLC patients” by Su et al., Oncotarget v.8(67); 2017 Dec 19.
When curated by Replicon Genetics, some of these types were not defined in public-domain databases. The help section on EGFR gives more information.
KRAS – In the GRCh38 set: includes a demonstration of the KRAS G12C variant in KRAS_som1. This is dbSNP:rs121913530, where a C>A on the + strand is a G>T on the reverse, coding, strand. In HGVS nomenclature this is (KRAS):c.34G>T (p.Gly12Cys).
A clustered set of variants is shown in KRAS_hap2, for comparison with the KRAS_minus equivalent.
KRAS_minus – In the GRCh38 set: a demonstration of the same data as KRAS, but on the reverse (-) coding strand. KRAS_hap2 shows the subtle difference between the CIGARs created when reading on the opposite strand, and that the application successfully calculates the same global position for the same variant on the opposite strand.
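The strand relationship described for KRAS G12C (C>A on the + strand is G>T on the coding strand) is a base-complement operation. A minimal sketch, with hypothetical function names and no connection to SyRGen's internals:

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Return the sequence as read on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def variant_on_opposite_strand(ref, alt):
    """Translate a ref>alt change to the opposite strand."""
    return reverse_complement(ref), reverse_complement(alt)


# KRAS G12C: C>A on the + strand
print(variant_on_opposite_strand("C", "A"))  # ('G', 'T')
```

For a single-base substitution the reversal has no effect and only the complement matters; the reversal is included so that the same function also handles multi-base alleles.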
NCF1 – NCF1 variants are involved in autoimmune disease. Five Haplotype variants are listed; known and imagined variants are included.
Table of NCF1 variants included in SyRGen:
| Haplotype Variant Source Name | HGVS expression (CDS-sequence definition) | Alternative definitions or comment |
| --- | --- | --- |
| NCF1 EB0201 | NM_000265.6(NCF1): c.502del(p.Glu168fs) | G is deleted dbSNP:rs1563003964 HGMD-PUBLIC:CD931024 |
| NCF1 EB0202 | NM_000265.6(NCF1): c.73_74GT[1] (p.Tyr26fs) seems to be a synonym for c.73_74delGT. The [1] seems to mean “one GT”, where the variant is GTGT > GT | GT deletion Clinvar:2249 dbSNP:rs4029402 |
| NCF1 EB0203 | NM_000265.6(NCF1): c.574G>A(p.Gly192Ser) | Clinvar:2255 dbSNP:rs119103273 HGMD-PUBLIC:CS014967 HGMD-PUBLIC:CM104214 |
| NCF1 EB0204 | NM_000265.6(NCF1): c.331_339delTGTCCCCAC | Theoretical deletion |
| NCF1 EB0205 | NM_000265.6(NCF1): c.765_800+2del38 | Theoretical long deletion into intron |
With thanks to Dr Eleanor Baker at North West Genomic Laboratory Hub for defining the AK2, CIITA and NCF1 variants.
© Copyright Replicon Genetics & Cary O’Donnell 2021-2025 & available under the terms of the GNU Affero General Public License version 3 (AGPL-3.0 license)