Add accession to sequence type mapping #77

Open
jsstevenson opened this issue Apr 7, 2025 · 0 comments
@jsstevenson
Contributor

A while back, this method got added to VRS-Python to help with HGVS translation:

def extract_sequence_type(alias: str) -> str | None:
    """Provide a convenient way to extract the sequence type from an accession by matching its prefix to a known set of prefixes.

    Args:
        alias (str): The accession string.

    Returns:
        str or None: The sequence type associated with the accession string, or None if no matching prefix is found.
    """
    prefix_dict = {
        "refseq:NM_": "c",
        "refseq:NC_012920": "m",
        "refseq:NG_": "g",
        "refseq:NC_00": "g",
        "refseq:NW_": "g",
        "refseq:NT_": "g",
        "refseq:NR_": "n",
        "refseq:NP_": "p",
        "refseq:XM_": "c",
        "refseq:XR_": "n",
        "refseq:XP_": "p",
        "GRCh": "g",
    }

    for prefix, seq_type in prefix_dict.items():
        if alias.startswith(prefix):
            return seq_type
    return None

I don't really know the context or whether something here already fulfills this need, but it struck me as a bioutils-esque task and I figured I'd throw out the idea of moving it here.
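
For discussion, here is a rough sketch of what a standalone version might look like if it did move over. The name `infer_sequence_type` and the module-level table `_PREFIX_TO_SEQ_TYPE` are hypothetical, not existing bioutils API. The prefix table is copied from the function above, with the mitochondrial accession listed first so the more specific prefix is always checked before the general `NC_00` one.

```python
from typing import Optional

# Prefix table copied from the VRS-Python function quoted above.
# Hypothetical name; the more specific mitochondrial prefix comes first.
_PREFIX_TO_SEQ_TYPE = (
    ("refseq:NC_012920", "m"),
    ("refseq:NC_00", "g"),
    ("refseq:NG_", "g"),
    ("refseq:NW_", "g"),
    ("refseq:NT_", "g"),
    ("refseq:NM_", "c"),
    ("refseq:XM_", "c"),
    ("refseq:NR_", "n"),
    ("refseq:XR_", "n"),
    ("refseq:NP_", "p"),
    ("refseq:XP_", "p"),
    ("GRCh", "g"),
)


def infer_sequence_type(alias: str) -> Optional[str]:
    """Return the HGVS sequence type for a namespaced accession, or None.

    Hypothetical helper for illustration only; it mirrors the behavior of
    extract_sequence_type by returning the type of the first matching prefix.
    """
    for prefix, seq_type in _PREFIX_TO_SEQ_TYPE:
        if alias.startswith(prefix):
            return seq_type
    return None


if __name__ == "__main__":
    for alias in ("refseq:NM_000551.3", "refseq:NC_012920.1", "ensembl:ENST00000288602"):
        print(alias, infer_sequence_type(alias))
```

Keeping the table at module level would let callers inspect or extend it, though whether that is desirable here is an open question.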
