Welcome to VarSem

Next-Gen Variant Standardization with AI-Driven Precision

🤖
Auto Extraction

Automatically extract complex variant descriptions from medical text using a large language model (LLM).

🧬
HGVS Standard

Full support for cDNA, Protein, and Genomic HGVS formats (c., p., g.) ensuring clinical compliance.

📊
Multi-Format

Integrates with the V-NLP-Loc format for compatibility with downstream bioinformatics pipelines.

Overview

VarSem is an engine that bridges the gap between natural-language genetic reporting and structured bioinformatics data. It uses reasoning models to normalize each extracted variant to standard nomenclature such as HGVS.

Supported Variant Types

  • SNV: Single Nucleotide Variants
  • Deletions: Removal of one or more nucleotides
  • Insertions: Addition of one or more nucleotides
  • Inversions: Reversal of a DNA segment
  • Delins: Deletion-insertion events
  • Tandem Repeats: Expansion or contraction of repeat units
  • Duplications: A copy of a sequence inserted directly after the original
  • Others: Other supported types

Example Input

In the patient, APOE a novel variant at codon 130 (C→A, introducing a premature stop codon) was identified, together with a c.461G>T change.

Standardized Variant Format Examples

HGVS Format Example
{
  "genesymbol": "APOE",
  "cDNA_HGVS_standardized_form": "c.461G>T",
  "type": "HGVS"
}
V-NLP-Loc Format Example
{
  "genesymbol": "APOE",
  "V-NLP-Loc_standardized_form": "APOE|codon|130|UNK|snv_C>A",
  "type": "V-NLP-Loc"
}
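
The two records above can be read programmatically. The following is a minimal Python sketch, not VarSem's own code: the helper names (`parse_vnlp_loc`, `parse_cdna_substitution`, `standardized_form`) are hypothetical, and the five V-NLP-Loc field names are guesses inferred from the single example string, since no schema is given here. The regular expression covers only simple cDNA substitutions such as c.461G>T, not the full HGVS grammar.

```python
import json
import re
from typing import NamedTuple

# The two example records, copied verbatim from the section above.
RECORDS = json.loads("""
[
  {"genesymbol": "APOE",
   "cDNA_HGVS_standardized_form": "c.461G>T",
   "type": "HGVS"},
  {"genesymbol": "APOE",
   "V-NLP-Loc_standardized_form": "APOE|codon|130|UNK|snv_C>A",
   "type": "V-NLP-Loc"}
]
""")


class VNlpLoc(NamedTuple):
    # ASSUMPTION: field names are inferred from one example, not from a spec.
    gene: str
    location_type: str
    position: str
    qualifier: str
    change: str


def parse_vnlp_loc(text: str) -> VNlpLoc:
    """Split a pipe-separated V-NLP-Loc string into its five fields."""
    parts = text.split("|")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {text!r}")
    return VNlpLoc(*parts)


# Matches simple cDNA substitutions only (e.g. c.461G>T).
CDNA_SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")


def parse_cdna_substitution(text: str) -> tuple:
    """Return (position, ref, alt) for a simple c. substitution."""
    match = CDNA_SUBSTITUTION.match(text)
    if match is None:
        raise ValueError(f"not a simple cDNA substitution: {text!r}")
    return int(match.group(1)), match.group(2), match.group(3)


def standardized_form(record: dict) -> str:
    """Pick the standardized string based on the record's "type" key."""
    if record["type"] == "HGVS":
        return record["cDNA_HGVS_standardized_form"]
    return record["V-NLP-Loc_standardized_form"]


if __name__ == "__main__":
    for record in RECORDS:
        print(record["type"], standardized_form(record))
```

Keeping the V-NLP-Loc position as a string avoids assuming it is always a plain integer; the example only shows `130`.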

Example Output

Gene  Variant Description         Transcript      Chr  Start(hg38)  End(hg38)  Ref  Alt  Protein    Strand
APOE  APOE|codon|130|UNK|snv_C>A  NM_001302688.2  19   44908607     44908607   C    A    Ala130Glu  +
APOE  APOE|codon|130|UNK|snv_C>A  NM_000041.4     19   44908686     44908686   C    A    Cys130Ter  +
APOE  APOE|codon|130|UNK|snv_C>A  NM_001302689.2  19   44908686     44908686   C    A    Cys130Ter  +
APOE  APOE|codon|130|UNK|snv_C>A  NM_001302690.2  19   44908686     44908686   C    A    Cys130Ter  +
APOE  APOE|codon|130|UNK|snv_C>A  NM_001302691.2  19   44908686     44908686   C    A    Cys130Ter  +
APOE  c.461G>T                    NM_000041.4     19   44908757     44908757   G    T    Arg154Leu  +
APOE  c.461G>T                    NM_001302689.2  19   44908757     44908757   G    T    Arg154Leu  +
APOE  c.461G>T                    NM_001302690.2  19   44908757     44908757   G    T    Arg154Leu  +
APOE  c.461G>T                    NM_001302691.2  19   44908757     44908757   G    T    Arg154Leu  +
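
In the example output, one input description yields one row per transcript, and the protein consequence can differ between transcripts. The following Python sketch shows one way a downstream script might load such rows and group them by description. It assumes the output is whitespace-delimited with exactly ten fields per row, as it appears above; the names `OutputRow`, `parse_row`, and `group_by_description` are hypothetical and not part of VarSem.

```python
from collections import defaultdict
from dataclasses import dataclass

# Three data rows copied verbatim from the example output above.
ROWS = """\
APOE APOE|codon|130|UNK|snv_C>A NM_001302688.2 19 44908607 44908607 C A Ala130Glu +
APOE APOE|codon|130|UNK|snv_C>A NM_000041.4 19 44908686 44908686 C A Cys130Ter +
APOE c.461G>T NM_000041.4 19 44908757 44908757 G T Arg154Leu +
"""


@dataclass(frozen=True)
class OutputRow:
    gene: str
    description: str
    transcript: str
    chrom: str
    start: int  # hg38 coordinate
    end: int    # hg38 coordinate
    ref: str
    alt: str
    protein: str
    strand: str


def parse_row(line: str) -> OutputRow:
    """Parse one whitespace-delimited output row (ASSUMED layout: 10 fields)."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
    gene, desc, tx, chrom, start, end, ref, alt, protein, strand = fields
    return OutputRow(gene, desc, tx, chrom, int(start), int(end),
                     ref, alt, protein, strand)


def group_by_description(rows):
    """Collect rows under the variant description they were derived from."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row.description].append(row)
    return dict(grouped)


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
grouped = group_by_description(rows)

if __name__ == "__main__":
    for description, matches in grouped.items():
        print(description)
        for row in matches:
            print(f"  {row.transcript}: chr{row.chrom}:{row.start} "
                  f"{row.ref}>{row.alt} {row.protein}")
```

Grouping by description makes the transcript dependence visible: the codon 130 description maps to Ala130Glu on NM_001302688.2 but to Cys130Ter on NM_000041.4.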

External References