CaptionThis is a Python deep learning project that generates descriptive captions for images supplied as input. It is used through a command-line interface and leverages a large training dataset to improve caption quality.
The CaptionThis team consists of six Cal Poly students. The project covered the following areas:
- Deep Learning and AI: Built and trained the models with TensorFlow and PyTorch.
- Image Processing and Computer Vision: Used image processing techniques and computer vision libraries to handle and analyze image data.
- Natural Language Processing (NLP): Applied NLP techniques together with the BLIP model to generate image captions.
- Python Programming: Developed the tool in Python.
- GPU Acceleration with CUDA: Used CUDA for GPU acceleration so the deep learning models train efficiently.
- Data Collection and Preprocessing: Implemented web scraping and data preprocessing to gather and prepare the training datasets.
- Concurrency and Multithreading: Used a multithreaded approach to make web scraping efficient.
- Command Line Interface (CLI) Development: Built a command-line interface to make the tool easy to access and use.
- Software Engineering Best Practices: Applied version control with GitHub, code optimization, and modular design.
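To make the command-line interface item concrete, here is a minimal sketch of how such an entry point could be wired up with Python's standard `argparse` module. It is not taken from this repo: the program name `captionthis`, the `images` argument, and the `generate_caption` placeholder are all hypothetical, and the real tool would call the captioning model where the placeholder returns a dummy string.

```python
import argparse
from pathlib import Path


def generate_caption(image_path: Path) -> str:
    """Hypothetical stand-in for the model call (the real tool would run the captioning model here)."""
    return f"(caption for {image_path.name})"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="captionthis",  # hypothetical program name
        description="Generate a caption for each input image.",
    )
    # One or more image paths, converted to pathlib.Path objects.
    parser.add_argument("images", nargs="+", type=Path, help="paths to image files")
    return parser


def main(argv=None) -> int:
    # argv=None makes argparse read sys.argv[1:]; passing a list eases testing.
    args = build_parser().parse_args(argv)
    for image_path in args.images:
        print(f"{image_path}: {generate_caption(image_path)}")
    return 0

# In a script, call main() under an `if __name__ == "__main__":` guard.
```

Keeping argument parsing in `build_parser` and the work in `main(argv)` means the interface can be exercised from tests without spawning a subprocess.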
Here is all you need to know to set up this repo on your local machine and start developing!
- Clone this repository
git clone https://github.com/Jkozmo10/CaptionThis.git
- TrainingSets: Contains the scripts that scrape images from Google's Conceptual Captions dataset
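As a rough illustration of the multithreaded scraping approach, the sketch below downloads images for a list of (caption, URL) pairs with a thread pool from the standard library. It is an assumption-laden example, not the code in TrainingSets: the function names `fetch_image` and `download_all`, the row layout, and the choice to silently skip failed downloads are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_image(url: str, timeout: float = 10.0) -> bytes:
    """Download one image; network errors propagate to the caller."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(rows, fetch=fetch_image, max_workers=8):
    """Fetch every (caption, url) row concurrently.

    Returns a list of (caption, image_bytes) for the downloads that succeeded.
    Rows whose fetch raised are skipped (an assumption: dead links are
    expected in a scraped dataset).
    """
    def safe_fetch(row):
        caption, url = row
        try:
            return caption, fetch(url)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(safe_fetch, rows)  # map preserves input order
        return [r for r in results if r is not None]
```

Threads suit this workload because each download spends most of its time waiting on the network, so several requests can be in flight at once. Passing a custom `fetch` function makes it possible to try the logic without network access.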
Here are all of the steps you should follow whenever contributing to this repo!
- Before you start making changes, always make sure you're on the main branch, then
git pull
to make sure your code is up to date
- Create a branch with a name relating to the change you will make
git checkout -b <name-of-branch>
- Make changes to the code
When interacting with Git/GitHub, feel free to use the command line, the VSCode extension, or GitHub Desktop. These steps assume you have already made a branch using
git checkout -b <branch-name>
and have made all necessary code changes for the provided task.
- View diffs of each file you changed using the VSCode GitHub extension or GitHub Desktop
- Stage your changes:
git add .
(to stage all files) or
git add <file-name>
(to stage a specific file)
- Commit your changes:
git commit -m "<description>"
or
git commit
to get a message prompt
- Push your branch:
git push -u origin <name-of-branch>
- Go to the Pull Requests tab on this repo
- Find your PR, and provide a description of your change, steps to test it, and any other notes
- Link your PR to the corresponding Issue
- Request a reviewer to check your code
- Once approved, your code is ready to be merged in 🎉