
Python: Search and Filtering

Basic search

python
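# Assumes `client` is an already-connected QdrantClient instance.
query = [0.05] * 384  # placeholder; in practice, the embedding of the query text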

hits = client.search(
    collection_name="kb_zh",
    query_vector=query,
    limit=5,
    with_payload=True,
)
for h in hits:
    print(f"id={h.id} score={h.score:.4f} {h.payload}")

Search with a filter

python
from qdrant_client.models import Filter, FieldCondition, MatchValue

hits = client.search(
    collection_name="kb_zh",
    query_vector=query,
    limit=5,
    query_filter=Filter(
        must=[FieldCondition(key="doc", match=MatchValue(value="a"))]
    ),
)

Create a payload index (recommended before the first filtered search)

python
from qdrant_client.models import PayloadSchemaType

client.create_payload_index(
    collection_name="kb_zh",
    field_name="doc",
    field_schema=PayloadSchemaType.KEYWORD,
)

For integer fields use INTEGER, and for floating-point fields use FLOAT; see the official documentation for the full list of types.

Batch search (multiple queries)

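Some client versions provide search_batch; otherwise, call search in a loop. At high QPS, keep an eye on the connection pool and on the load placed on Qdrant.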

Complete small example: text + sentence-transformers

python
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
dim = model.get_sentence_embedding_dimension()

client = QdrantClient(url="http://localhost:6333")
col = "demo_st"

if not client.collection_exists(col):
    client.create_collection(
        collection_name=col,
        vectors_config=VectorParams(size=dim, distance=Distance.COSINE),
    )

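# Sample sentences: "An apple is a kind of fruit", "Vector databases are used
# for semantic search", "The weather is nice today"
texts = ["苹果是一种水果", "向量数据库用于语义搜索", "今天天气不错"]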
emb = model.encode(texts).tolist()
client.upsert(
    collection_name=col,
    points=[
        PointStruct(id=i, vector=emb[i], payload={"text": texts[i]})
        for i in range(len(texts))
    ],
)

q = model.encode("语义检索").tolist()  # query text: "semantic retrieval"
for h in client.search(collection_name=col, query_vector=q, limit=2):
    print(h.payload["text"], h.score)

The first run downloads the model; treat the model's Hugging Face page as the authority on its name and embedding dimension.

Summary

Python-side flow: encode → upsert → search; create a payload index before filtering.