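Deploying Vector Databases: A Hands-On Guide to Getting Them Running

What You'll Learn

You have read plenty of database overviews but still don't know how to actually run one? Or you have one deployed and aren't sure how to tune it? This article shows how to deploy the mainstream vector databases with Docker and Kubernetes, add monitoring, and avoid the common pitfalls.

Preface: Deployment Is Less Scary Than It Looks

Many people flinch at deployment docs: Docker, Kubernetes, configuration files, a pile of moving parts. But with one computer and the steps below, you can have a vector database running in about ten minutes.

Let's go through it step by step.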

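1. Docker Deployment (the Simplest Option)

1.1 Install Docker First

If you don't have Docker yet, download and install it from https://docker.com. Then open Docker Desktop and make sure it is running.

1.2 Milvus Standalone (Good for Testing)

# Create a working directory for the deployment
mkdir -p milvus-demo && cd milvus-demo

# Download the docker-compose configuration file
curl -sL https://github.com/milvus-io/milvus/releases/download/v2.4.0/milvus-standalone-docker-compose.yml \
  -o docker-compose.yml

# Start the services
docker-compose up -d

# Check the status
docker-compose ps

# If all three services show "Up", the deployment succeeded:
# milvus-etcd, milvus-minio, milvus-standalone

Once the services are up, Milvus serves its API on port 19530. To verify:

# Connection test with pymilvus
pip install pymilvus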
 
python << 'EOF'
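from pymilvus import connections, utility

# Connect
connections.connect(
    alias="default",
    host="localhost",
    port="19530"
)

# Verify by asking the server for its version
print("Connected! Milvus is running, version:", utility.get_server_version())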
 
connections.disconnect("default")
EOF

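1.3 Qdrant (Even Simpler)

Qdrant is even easier to deploy than Milvus because it needs no separate etcd or MinIO:

# Create a data directory
mkdir -p qdrant-demo && cd qdrant-demo

# A single command does it
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v "$(pwd)/storage:/qdrant/storage" \
  qdrant/qdrant:v1.7.0

# Verify (Qdrant exposes /healthz, /livez and /readyz)
curl http://localhost:6333/healthz
# An HTTP 200 response with a short "check passed" message means it is running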

1.4 Weaviate

# Create a directory
mkdir -p weaviate-demo && cd weaviate-demo

# Create docker-compose.yml
cat > docker-compose.yml << 'EOF'
version: '3.8'
services:
  weaviate:
    image: semitechnologies/weaviate:1.24.0
    ports:
      - "8080:8080"
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: "/var/lib/weaviate"
      ENABLE_MODULES: "text2vec-openai"
      OPENAI_APIKEY: "${OPENAI_API_KEY}"
    volumes:
      - weaviate_data:/var/lib/weaviate
volumes:
  weaviate_data:
EOF
 
# Start
docker-compose up -d

# Verify (HTTP 200 means the instance is ready to serve requests)
curl -i http://localhost:8080/v1/.well-known/ready

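1.5 Stopping and Cleaning Up

# Stop the services (run in the directory that holds docker-compose.yml)
docker-compose down

# Stop and delete the named volumes too (use with care!)
# Data that a compose file bind-mounts into a host directory, such as ./volumes
# in the Milvus standalone setup, is not removed and has to be deleted by hand
docker-compose down -v

# Force-remove the containers
docker rm -f qdrant milvus-etcd milvus-minio milvus-standalone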

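2. Production Deployment on Kubernetes

2.1 When Do You Need K8s?

If your application needs:

  • High availability (no downtime)
  • Automatic scaling
  • A multi-node cluster
  • A production-grade environment

then it is time for Kubernetes. For a personal project or small-scale testing, a single-machine Docker deployment is enough.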

2.2 Milvus Cluster

# Add the Helm repository
helm repo add milvus https://milvus-io.github.io/milvus-helm/
helm repo update

# Create the namespace
kubectl create namespace milvus-system

# Deploy
helm install milvus milvus/milvus \
  -n milvus-system \
  --set cluster.enabled=true \
  --set etcd.replicaCount=3 \
  --set minio.replicas=4

# Check the status
kubectl get pods -n milvus-system

# Wait until every pod is Running
# This can take a few minutes

2.3 Qdrant Cluster

# qdrant-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: qdrant
  namespace: vector-system
spec:
  serviceName: qdrant-headless
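  # 3 replicas for availability. Note: the pods only form a single Qdrant cluster
  # when distributed mode is enabled (e.g. QDRANT__CLUSTER__ENABLED=true), which this manifest omits
  replicas: 3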
  selector:
    matchLabels:
      app: qdrant
  template:
    metadata:
      labels:
        app: qdrant
    spec:
      containers:
        - name: qdrant
          image: qdrant/qdrant:v1.7.0
          ports:
            - containerPort: 6333
              name: http
            - containerPort: 6334
              name: grpc
          resources:
            requests:
              cpu: "2"
              memory: 8Gi
            limits:
              cpu: "4"
              memory: 16Gi
          volumeMounts:
            - name: qdrant-storage
              mountPath: /qdrant/storage
          readinessProbe:
            httpGet:
              path: /readyz
              port: 6333
            initialDelaySeconds: 5
            periodSeconds: 5
  volumeClaimTemplates:
    - metadata:
        name: qdrant-storage
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
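            storage: 100Gi  # adjust to your data size

# Create the namespace, then apply the manifest
kubectl create namespace vector-system
kubectl apply -f qdrant-statefulset.yaml

# Create the Service (kubectl expose does not accept a StatefulSet, so declare it explicitly)
kubectl apply -n vector-system -f - << 'EOF'
apiVersion: v1
kind: Service
metadata:
  name: qdrant-service
spec:
  type: LoadBalancer
  selector:
    app: qdrant
  ports:
    - port: 6333
      targetPort: 6333
EOF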

3. Python Client Setup

3.1 Milvus Client

from pymilvus import connections, Collection, FieldSchema, CollectionSchema, DataType
 
class MilvusDB:
    def __init__(self, host="localhost", port="19530"):
        connections.connect(
            alias="default",
            host=host,
            port=port,
        )
    
    def create_collection(self, name, dim=768):
        """Create a collection with an id, an embedding and a text field."""
        fields = [
            FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
            FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=dim),
            FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=65535),
        ]
        
        schema = CollectionSchema(fields=fields, description=name)
        collection = Collection(name=name, schema=schema)
        
        # Create the index (essential: the collection cannot be loaded or searched without one)
        collection.create_index(
            field_name="embedding",
            index_params={
                "index_type": "HNSW",
                "metric_type": "COSINE",
                "params": {"M": 16, "efConstruction": 200}
            }
        )
        
        collection.load()
        return collection
    
    def insert(self, name, vectors, texts):
        """Insert vectors with their texts (ids are generated because auto_id=True)."""
        collection = Collection(name)
        collection.insert([vectors, texts])
        collection.flush()
    
    def search(self, name, query_vector, top_k=10):
        """Return the top_k nearest hits for one query vector."""
        collection = Collection(name)
        
        results = collection.search(
            data=[query_vector],
            anns_field="embedding",
            param={"metric_type": "COSINE", "params": {"ef": 128}},
            limit=top_k,
            output_fields=["text"]
        )
        
        return results[0]
 
# Usage
db = MilvusDB()
db.create_collection("my_kb", dim=768)

# Insert data (random vectors stand in for real embeddings)
import numpy as np
vectors = np.random.rand(100, 768).tolist()
texts = [f"Document {i} content" for i in range(100)]
db.insert("my_kb", vectors, texts)

# Search
query = np.random.rand(768).tolist()
results = db.search("my_kb", query)
print(f"Found {len(results)} results")

3.2 Qdrant Client

from qdrant_client import QdrantClient, models
import numpy as np
 
class QdrantDB:
    def __init__(self, host="localhost", port=6333):
        self.client = QdrantClient(host=host, port=port)
    
    def create_collection(self, name, dim=768):
        """Create a collection that uses cosine distance."""
        self.client.create_collection(
            collection_name=name,
            vectors_config=models.VectorParams(
                size=dim,
                distance=models.Distance.COSINE
            )
        )
    
    def upsert(self, name, vectors, texts, ids=None):
        """Insert or update points."""
        from qdrant_client.models import PointStruct
        
        if ids is None:
            # Qdrant point IDs must be unsigned integers or UUIDs, not arbitrary strings
            ids = list(range(len(vectors)))
        
        points = [
            PointStruct(
                id=ids[i],
                vector=vectors[i],
                payload={"text": texts[i]}
            )
            for i in range(len(vectors))
        ]
        
        self.client.upsert(collection_name=name, points=points)
    
    def search(self, name, query_vector, top_k=10, filter_text=None):
        """Search, optionally keeping only points whose text payload equals filter_text."""
        from qdrant_client.models import Filter, FieldCondition, MatchValue
        
        search_params = models.SearchParams(hnsw_ef=128)
        
        # Build the filter condition
        query_filter = None
        if filter_text:
            query_filter = Filter(
                must=[
                    FieldCondition(
                        key="text",
                        match=MatchValue(value=filter_text)
                    )
                ]
            )
        
        results = self.client.search(
            collection_name=name,
            query_vector=query_vector,
            query_filter=query_filter,
            search_params=search_params,
            limit=top_k
        )
        
        return results
 
# Usage
db = QdrantDB()
db.create_collection("my_kb", dim=768)

# Insert (random vectors stand in for real embeddings)
vectors = np.random.rand(100, 768).tolist()
texts = [f"Document {i} content" for i in range(100)]
db.upsert("my_kb", vectors, texts)

# Search
query = np.random.rand(768).tolist()
results = db.search("my_kb", query)
print(f"Found {len(results)} results")

3.3 Chroma Client (the Simplest)

import chromadb
 
class ChromaDB:
    def __init__(self, persist_dir="./chroma_data"):
        self.client = chromadb.PersistentClient(path=persist_dir)
    
    def create_collection(self, name):
        """Create a collection."""
        return self.client.create_collection(name)
    
    def add(self, name, documents, embeddings, metadatas=None, ids=None):
        """Add documents with their embeddings."""
        collection = self.client.get_collection(name)
        
        if ids is None:
            ids = [f"doc_{i}" for i in range(len(documents))]
        
        collection.add(
            documents=documents,
            embeddings=embeddings,
            metadatas=metadatas,
            ids=ids
        )
    
    def query(self, name, query_embeddings, n_results=10, where=None):
        """Query by embedding, with an optional metadata filter."""
        collection = self.client.get_collection(name)
        
        return collection.query(
            query_embeddings=query_embeddings,
            n_results=n_results,
            where=where
        )
 
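# Usage
db = ChromaDB("./my_kb")
collection = db.create_collection("articles")

# Add data
documents = ["Introduction to Deep Learning", "Machine Learning Basics", "Python Tutorial"]
embeddings = [[0.1]*128, [0.2]*128, [0.3]*128]  # simplified placeholder embeddings
db.add("articles", documents, embeddings)

# Query (query_embeddings is a list of vectors, hence the nested list)
results = db.query("articles", [[0.15]*128], n_results=2)
print(results)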

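4. Performance Tuning

4.1 Tuning Index Parameters

HNSW is the most commonly used index for vector search. It has a few key parameters:

# Milvus HNSW parameters
collection.create_index(
    field_name="embedding",
    index_params={
        "index_type": "HNSW",
        "metric_type": "COSINE",  # or L2, IP
        "params": {
            "M": 16,           # max links per node: larger improves recall but costs memory and build time
            "efConstruction": 200  # build-time candidate list: larger improves index quality but slows the build
        }
    }
)

# Adjust ef at search time
results = collection.search(
    data=[query_vector],
    anns_field="embedding",
    param={"metric_type": "COSINE", "params": {"ef": 256}},  # raise for better recall at the cost of latency
    limit=10
)

Parameter guidelines

  • M: 4-64; 16 is a typical starting value
  • efConstruction: 100-400; 200 is a typical starting value
  • Tight on memory: M=4-8
  • Recall matters most: M=32-64, efConstruction=400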
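
One way to judge whether a given ef is high enough is to compare the index's answers with exact brute-force answers on a small sample and compute recall@k. The following is a minimal sketch in plain Python under stated assumptions: `brute_force_top_k` and `recall_at_k` are hypothetical helper names (not part of pymilvus or any other client), vectors are non-zero lists of floats, and ids are simply list positions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_top_k(query, vectors, k):
    """Exact top-k neighbours: score every vector, return the k best positions."""
    order = sorted(
        range(len(vectors)),
        key=lambda i: cosine_similarity(query, vectors[i]),
        reverse=True,
    )
    return order[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact neighbours that the approximate search also returned."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Tiny illustration: the approximate result missed one of the two true neighbours
sample = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
exact = brute_force_top_k([1.0, 0.0], sample, k=2)
print(recall_at_k([0, 2], exact))
```

In practice the approximate ids would come from the database's search call on the same sample. If the recall is lower than you need, raise ef (or M and efConstruction, followed by an index rebuild) and measure again.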

4.2 Batch Operations

# Batch inserts (often 10-100x faster than inserting one point at a time)
def batch_upsert(client, collection_name, vectors, texts, batch_size=1000):
    from qdrant_client.models import PointStruct
    
    for i in range(0, len(vectors), batch_size):
        batch_vectors = vectors[i:i+batch_size]
        batch_texts = texts[i:i+batch_size]
        
        points = [
            PointStruct(
                id=i + j,  # Qdrant point IDs must be unsigned integers or UUIDs
                vector=batch_vectors[j],
                payload={"text": batch_texts[j]}
            )
            for j in range(len(batch_vectors))
        ]
        
        client.upsert(collection_name=collection_name, points=points)
        
        # Report progress every 10,000 points
        inserted = i + len(batch_vectors)
        if inserted % 10000 == 0:
            print(f"Inserted {inserted} points")

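4.3 Memory

Vector databases are memory-hungry, so provision generously:

Vector count                   Recommended memory
1 million vectors (768-dim)    8-16 GB
10 million vectors             64-128 GB
100 million vectors            256 GB+

If memory is short, consider:

  • Enabling vector compression (quantization)
  • Reducing the vector dimension
  • Sharding the data across nodes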
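
To see where such numbers come from, a rough back-of-the-envelope estimate helps. The sketch below assumes float32 vectors (4 bytes per value) and an HNSW index whose bottom layer stores up to 2 × M neighbour ids of 4 bytes each per vector. It ignores upper graph layers, payloads and engine overhead, so treat the result as a lower bound rather than a sizing rule. `estimate_hnsw_memory_gib` is a hypothetical helper, not an API of any of the databases.

```python
def estimate_hnsw_memory_gib(num_vectors, dim, m=16, bytes_per_value=4):
    """Rough lower bound on RAM for raw vectors plus HNSW bottom-layer links, in GiB."""
    vector_bytes = num_vectors * dim * bytes_per_value
    # Assumption: up to 2*M neighbour ids per vector, 4 bytes per id
    link_bytes = num_vectors * 2 * m * 4
    return (vector_bytes + link_bytes) / 1024**3

for count in (1_000_000, 10_000_000, 100_000_000):
    print(f"{count:>11,} x 768-dim: about {estimate_hnsw_memory_gib(count, 768):.1f} GiB")
```

For 1 million 768-dimensional vectors this gives roughly 3 GiB, so the 8-16 GB recommendation leaves room for the operating system, query buffers and growth. At 100 million vectors of the same dimension the raw data alone approaches 300 GiB, which is why compression, smaller dimensions or sharding become necessary at that scale.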

5. Monitoring and Operations

5.1 Basic Health Checks

# Milvus (health endpoint on the metrics port)
curl http://localhost:9091/healthz

# Qdrant
curl http://localhost:6333/healthz

# Weaviate
curl -i http://localhost:8080/v1/.well-known/ready

5.2 Health Checks in Python

import time
 
def check_milvus_health(host="localhost", port="19530", timeout=30):
    """Return True once a Milvus connection succeeds, retrying for up to `timeout` seconds."""
    from pymilvus import connections
    
    start = time.time()
    while time.time() - start < timeout:
        try:
            connections.connect(
                alias="default",
                host=host,
                port=port
            )
            connections.disconnect("default")
            return True
        except Exception:
            time.sleep(1)
    
    return False
 
def check_qdrant_health(host="localhost", port=6333):
    """Return True if Qdrant's /healthz endpoint answers with HTTP 200."""
    import requests
    try:
        resp = requests.get(f"http://{host}:{port}/healthz", timeout=5)
        return resp.status_code == 200
    except requests.RequestException:
        return False
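
A single check like the Qdrant one is most useful right after startup when it is wrapped in a retry loop. The helper below is a minimal, generic sketch using only the standard library; `wait_until` and its parameters are hypothetical names chosen for illustration. It treats any exception raised by the check as "not ready yet".

```python
import time

def wait_until(check, timeout=30.0, interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns a truthy value or the timeout expires.

    check    -- zero-argument callable; exceptions count as "not ready yet"
    timeout  -- maximum number of seconds to keep trying
    interval -- pause between attempts, in seconds
    Returns True if check() succeeded in time, otherwise False.
    """
    deadline = clock() + timeout
    while True:
        try:
            if check():
                return True
        except Exception:
            pass  # service still starting; try again
        if clock() >= deadline:
            return False
        sleep(interval)

# Example with a stand-in check that succeeds on the third attempt
attempts = []
def flaky_check():
    attempts.append(1)
    return len(attempts) >= 3

print(wait_until(flaky_check, timeout=5, interval=0.01))
```

With the functions defined above, `wait_until(check_qdrant_health, timeout=60)` would block a startup script until Qdrant answers. The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting.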

5.3 Everyday Operations Commands

# Follow container logs
docker logs -f milvus-standalone
docker logs -f qdrant

# Open a shell inside a container
docker exec -it milvus-standalone bash

# Show resource usage
docker stats

# Restart the services
docker-compose restart

# Check disk usage
df -h
du -sh /var/lib/docker/volumes/*

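6. Common Problems

6.1 Port Conflicts

If startup fails because a port is already in use:

# Find out what is using the port
lsof -i :6333

# Or map the service to a different host port
docker run -d --name qdrant -p 6335:6333 qdrant/qdrant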
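
The same check can be scripted before starting a container. This is a small sketch using only Python's standard `socket` module; `is_port_free` and `find_free_port` are hypothetical helper names. It tests the port by trying to bind to it on the local machine, so it has to run on the Docker host itself.

```python
import socket

def is_port_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False  # address already in use (or not permitted)
    return True

def find_free_port(candidates, host="127.0.0.1"):
    """Return the first free port from candidates, or None if all are taken."""
    for port in candidates:
        if is_port_free(port, host):
            return port
    return None

# Pick a host port for Qdrant, preferring the default
port = find_free_port([6333, 6335, 6336])
print(f"docker run -d --name qdrant -p {port}:6333 qdrant/qdrant")
```

There is a short window between the check and the container start in which another process could take the port, so this is a convenience for local setups rather than a guarantee.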

6.2 Out of Memory

Increase the memory allocation in the Docker Desktop settings:

  • Mac: Docker Desktop → Settings (Preferences in older versions) → Resources → Memory
  • Windows: Docker Desktop → Settings → Resources → Memory

Allocate at least 8 GB.

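6.3 Data Persistence

Data lives outside the container, either in a Docker volume or in a host directory, so restarting a container does not lose it. Which one applies depends on the compose file: the Weaviate example above declares the named volume weaviate_data, while the Milvus standalone compose file typically bind-mounts a ./volumes directory next to docker-compose.yml.

# List volumes and inspect one (Compose prefixes the name with the project directory)
docker volume ls
docker volume inspect weaviate-demo_weaviate_data

# Back up a named volume
docker run --rm -v weaviate-demo_weaviate_data:/data -v "$(pwd)":/backup alpine tar czf /backup/weaviate_backup.tar.gz -C /data .

# Back up bind-mounted Milvus data (run in milvus-demo, ideally with the services stopped)
tar czf milvus_backup.tar.gz volumes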

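6.4 Remote Connections

# If Docker runs on a remote server, the database ports must be reachable from outside

# Milvus: the standalone docker-compose.yml publishes port 19530 on the host,
# so no compose change is needed. Open 19530 in the firewall and connect with
# host="<server-ip>". (An extra_hosts entry for host.docker.internal only lets a
# container reach the host; it has no effect on inbound access.)

# Qdrant: listens on 0.0.0.0 by default, so it is directly reachable from outside
# Make sure the firewall allows port 6333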

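7. Summary

Deploying a vector database is not that complicated:

  1. Test environments: Docker Compose gets you there with one command
  2. Production: Kubernetes provides high availability
  3. Performance: tune the HNSW parameters and provision enough memory
  4. Operations: check service health regularly

The key is to understand the principles, choose the configuration you actually need, and avoid over-engineering.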

Changelog

  • 2026-04-24: Rewrite completed, writing style polished
  • Added detailed deployment steps and answers to common problems