Accessing comprehensive and relevant documentation is crucial for developers. This project proposes a novel approach: a RAG-enhanced LLaMa-like Transformer Neural Network (RAG-LLTN) that helps developers find and understand programming language and framework documentation.
The RAG-LLTN model combines Retrieval-Augmented Generation (RAG) with a LLaMa-like Transformer Neural Network (LLTN): RAG lets the model retrieve relevant passages from a knowledge base, while the LLTN efficiently processes and generates natural language, so responses are both contextually grounded and accurate. A Streamlit front-end exposes the model through an interactive interface for developers.
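In broad strokes, the flow is: embed the developer's query, retrieve the most similar documentation passages from the knowledge base, and condition generation on them. The sketch below illustrates that loop under stated assumptions: the toy hash-based `embed` function, the in-memory `corpus`, and the prompt layout are illustrative placeholders, not the project's actual components.

```python
# Minimal sketch of the RAG loop: embed, retrieve, then condition generation.
# The embedder, corpus, and prompt layout are illustrative placeholders.
import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic 'embedding' (hash-seeded random unit vector).
    A real system would use a trained text encoder instead."""
    seed = int(hashlib.sha256(text.encode()).hexdigest()[:8], 16)
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

corpus = [
    "pandas.DataFrame.merge joins two DataFrames on key columns.",
    "streamlit.text_input renders a single-line text input widget.",
    "torch.nn.Linear applies an affine transformation y = x @ W.T + b.",
]
index = np.stack([embed(doc) for doc in corpus])     # (n_docs, dim)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k passages whose embeddings score highest for the query."""
    scores = index @ embed(query)                    # cosine sim (unit vectors)
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

def build_prompt(query: str) -> str:
    """Prepend retrieved passages so the generator answers grounded in them."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# In DocAssist the prompt would go to the RAG-LLTN generator; a Streamlit
# front-end would wrap it along the lines of:
#   query = st.text_input("Ask about the docs")
#   st.write(model.generate(build_prompt(query)))
print(build_prompt("How do I join two DataFrames?"))
```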
The core of DocAssist is a LLaMa-like transformer architecture enhanced with Retrieval-Augmented Generation (RAG). Its key architectural components are:
- Pre-normalization Using RMSNorm: Employs RMSNorm to normalize the input of each transformer sub-layer, reducing computational cost relative to standard Layer Normalization while maintaining comparable performance (a minimal sketch follows this list).
- SwiGLU Activation Function: Uses the SwiGLU activation function (inspired by PaLM and extending Swish) in the feed-forward network, offering flexibility by combining smoothness and piecewise linearity (sketched below).
- Rotary Embeddings (RoPE): Encodes absolute positional information with a rotation matrix while naturally incorporating explicit relative-position dependency into self-attention; it scales to longer sequences, with inter-token dependency decaying as distance grows (sketched below).
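For concreteness, here is a minimal PyTorch sketch of RMSNorm as a pre-normalization layer; the module name, default epsilon, and gain initialization are assumptions rather than the project's exact code.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square normalization: x is rescaled by 1/RMS(x), then a
    learned per-channel gain is applied. Unlike LayerNorm there is no
    mean subtraction and no bias, which is what saves computation."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learned gain g

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # RMSNorm(x) = x / sqrt(mean(x^2) + eps) * g
        rms_inv = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms_inv * self.weight

# Pre-normalization means the norm is applied to each sub-layer's *input*,
# e.g. h = x + attention(RMSNorm(dim)(x)), rather than norm(x + attention(x)).
```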
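Similarly, a sketch of the SwiGLU feed-forward block in the common LLaMa-style formulation FFN(x) = W2(SiLU(W1·x) ⊙ W3·x); the projection names and the absence of biases follow that convention and are assumptions here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFeedForward(nn.Module):
    """Feed-forward block FFN(x) = W2( SiLU(W1 x) * W3 x ).
    The smooth SiLU (Swish) branch gates the linear W3 branch elementwise,
    combining smoothness with gated (piecewise-linear-like) behavior."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # value projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))
```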
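Finally, a sketch of rotary embeddings in the rotate-half convention used by LLaMa-style implementations; the function name and signature are illustrative, not the project's API.

```python
import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary position embeddings to x of shape (..., seq_len, head_dim).

    Channel pairs (x_i, x_{i+half}) are rotated by an angle
    pos * base**(-i/half); the dot product between rotated queries and keys
    then depends only on relative position, and decays with distance."""
    seq_len, head_dim = x.shape[-2], x.shape[-1]
    half = head_dim // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs
    cos, sin = angles.cos(), angles.sin()            # (seq_len, half) each
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

# Usage: rotate queries and keys (not values) before the attention dot product.
q = torch.randn(1, 8, 16, 64)   # (batch, heads, seq_len, head_dim)
q_rot = apply_rope(q)           # same shape; position is now encoded by rotation
```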