Welcome to the Multimodal Search Engine project! It uses Jina-CLIP v2 and LanceDB to provide robust search across both text and image data in 89 languages.
- Multimodal Search: Query with either text or image inputs.
- Multilingual Support: Supports 89 languages for text queries and captions.
- Efficient Retrieval: Powered by LanceDB for low-latency, high-throughput vector search.
- Matryoshka Representations: Embeddings can be truncated to smaller dimensions with little quality loss, trading accuracy for speed and storage.
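Matryoshka-style models are trained so that a leading slice of each embedding is itself a usable embedding. A minimal sketch of the truncation step, using a random unit vector in place of a real model output (the 1024 and 256 dimensions here are illustrative, not taken from the project):

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    head = vec[:dim]
    return head / np.linalg.norm(head)

# Stand-in for a full-size embedding from the encoder.
rng = np.random.default_rng(0)
full = rng.standard_normal(1024)
full /= np.linalg.norm(full)

# Truncated vectors approximate the similarity rankings of the
# full vectors at a fraction of the storage and compute cost.
small = truncate_embedding(full, 256)
```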
- Input: Accepts either a text query or an image.
- Encoding: Jina-CLIP v2 converts text and images into a shared embedding space.
- Storage: Embeddings are stored in LanceDB for efficient retrieval.
- Search: Queries are matched against the most relevant embeddings in the database and the results are returned.
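The pipeline above can be sketched end to end. To keep the example self-contained and runnable, a deterministic hash-based function stands in for the Jina-CLIP v2 encoder and a plain in-memory array stands in for the LanceDB table; both substitutions are illustrative only:

```python
import numpy as np

def encode(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in for the Jina-CLIP v2 encoder: a deterministic,
    hash-seeded unit vector. Illustrative only."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

# "Storage": in the real project these vectors go into a LanceDB
# table; a stacked array stands in for it here.
captions = ["a cat on a sofa", "a red bicycle", "mountains at sunset"]
index = np.stack([encode(c) for c in captions])

def search(query: str, k: int = 1) -> list[str]:
    """Cosine-similarity search; with unit vectors this is a dot product."""
    scores = index @ encode(query)
    top = np.argsort(scores)[::-1][:k]
    return [captions[i] for i in top]

print(search("a cat on a sofa"))  # ['a cat on a sofa']
```

In the actual project, `encode` would call the shared text/image encoder and `search` would issue an approximate-nearest-neighbor query against LanceDB rather than a brute-force scan.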