Namespace VisioForge.Core.AI.Clip

ClipEmbeddingEngine: A CLIP dual-tower embedding engine. It owns two ONNX sessions: a vision tower that maps an image to an embedding, and a text tower that maps text to an embedding in the same space, so an image and a natural-language query can be compared by cosine similarity. Both towers include the CLIP projection head, so their outputs share the embedding dimension exposed by ClipEmbeddingEngine.Dimension. All outputs are L2-normalized.
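
Because both towers return L2-normalized vectors of the same length, the cosine similarity between an image embedding and a text embedding reduces to a plain dot product. The sketch below illustrates only that arithmetic. EmbeddingMath, L2Normalize, and Dot are hypothetical helpers written for this example and are not part of VisioForge.Core.AI.Clip. The float arrays stand in for embeddings the engine would produce, since the engine's embedding methods are not named in this summary.

```csharp
using System;

// Hypothetical helper for illustration only; not part of VisioForge.Core.AI.Clip.
public static class EmbeddingMath
{
    // Returns a copy of v scaled to unit length (L2 norm of 1).
    public static float[] L2Normalize(float[] v)
    {
        double sumSquares = 0.0;
        foreach (float x in v)
        {
            sumSquares += (double)x * x;
        }

        double norm = Math.Sqrt(sumSquares);
        if (norm == 0.0)
        {
            throw new ArgumentException("Cannot normalize a zero vector.", nameof(v));
        }

        var result = new float[v.Length];
        for (int i = 0; i < v.Length; i++)
        {
            result[i] = (float)(v[i] / norm);
        }

        return result;
    }

    // Dot product of two vectors. When both inputs are unit length,
    // this value equals their cosine similarity.
    public static float Dot(float[] a, float[] b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Embedding dimensions must match.");
        }

        double sum = 0.0;
        for (int i = 0; i < a.Length; i++)
        {
            sum += (double)a[i] * b[i];
        }

        return (float)sum;
    }
}
```

As a small worked case, normalizing the stand-in vectors (3, 4) and (4, 3) gives (0.6, 0.8) and (0.8, 0.6), and their dot product is 0.96. Real embeddings from the engine should already be unit length, so calling L2Normalize on them would be redundant; it appears here only to build the stand-in inputs.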