This project implements an intelligent property matching system that leverages Large Language Models (LLMs) and embedding-based similarity search to connect home buyers with their ideal properties. Built as part of the Udacity Generative AI Nanodegree, the system demonstrates practical applications of modern AI techniques in the real estate domain.
- 🏠 Synthetic property listing generation using LLMs
- 🔍 Semantic search using embedding-based similarity
- 🎯 Personalized property recommendations
- 💡 AI-powered listing refinement based on user preferences
- 🖥️ Interactive web interface for preference collection
The system consists of two main components, each implemented as a notebook:
- `SyntheticDataGenerator.ipynb`
  - Generates realistic property listings using OpenAI's GPT models
  - Computes embeddings for efficient similarity search
  - Stores listings and embeddings in ChromaDB for quick retrieval
  - Handles data preprocessing and augmentation
- `HomeMatch.ipynb`
  - Collects user preferences through an interactive form
  - Performs semantic search using embedding similarity
  - Refines property descriptions to reflect user preferences
  - Presents personalized property recommendations
  - Implements a user-friendly interface using IPython widgets
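
The semantic search step can be illustrated with a minimal, dependency-free sketch. It is not the project's implementation: the project uses OpenAI embeddings stored in ChromaDB, whereas the vectors, listing names, and the `top_matches` helper below are hypothetical stand-ins, and cosine similarity is assumed as the metric because this README does not say which one is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_matches(query_vec, listings, k=2):
    """Return the ids of the k listings most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), listing_id)
              for listing_id, vec in listings.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [listing_id for _, listing_id in scored[:k]]


# Toy 3-dimensional "embeddings" (hypothetical axes: garden, city centre, size).
# Real embeddings come from an embedding model and have many more dimensions.
listings = {
    "suburban_family_home": [0.9, 0.1, 0.8],
    "downtown_studio": [0.0, 1.0, 0.1],
    "country_cottage": [1.0, 0.0, 0.4],
}
buyer_preferences = [0.8, 0.0, 0.9]  # wants a garden and a lot of space

print(top_matches(buyer_preferences, listings))
```

In the actual notebooks the vector database would perform this ranking. The sketch only shows why listings whose embeddings point in a similar direction to the buyer's preferences rank first.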
# Clone the repository
git clone https://github.com/GretaGalliani/HomeMatch.git
cd HomeMatch
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
export OPENAI_API_KEY="your-api-key"
export OPENAI_API_BASE="your-api-base-url"
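
Before running the notebooks, it can help to confirm that both variables are visible to Python. The check below is a small sketch and not part of the repository; the `missing_env_vars` helper is a hypothetical name introduced here for illustration.

```python
import os

# The two variables exported in the setup commands above.
REQUIRED_VARS = ("OPENAI_API_KEY", "OPENAI_API_BASE")


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("OpenAI environment variables are set.")
```

Running this in the first notebook cell surfaces a missing key immediately, before any API call is made.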
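First run `SyntheticDataGenerator.ipynb` to generate the property listings and store them in ChromaDB.
Then run `HomeMatch.ipynb` to enter your preferences and view the personalized recommendations.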
- Python 3.12+: Core programming language
- OpenAI API: For LLM-based text generation and embeddings
- LangChain: Framework for LLM application development
- ChromaDB: Vector database for similarity search
- IPython Widgets: Interactive user interface components
- Pandas: Data manipulation and analysis
- NumPy: Numerical computing and array operations
- Pydantic: Data validation and settings management
This project was developed as part of the Udacity Generative AI Nanodegree program and serves as a demonstration of applying modern AI techniques to real-world problems.