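Deploying Vector Databases: A Hands-On Guide to Getting One Running
What you will learn from this article
You have read the database overviews but do not know how to actually run one? Or you have deployed one but are not sure how to tune it? This article shows how to deploy the mainstream vector databases with Docker and Kubernetes, set up basic health monitoring, and avoid the common pitfalls.
Preface: deployment is less scary than it looks
Many people tense up the moment they open deployment docs: Docker, Kubernetes, configuration files, a whole pile of things. In practice, if you have a computer and follow the steps below, you can have a vector database running in about ten minutes.
Let's go through it step by step.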
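1. Docker Deployment (the simplest approach)
1.1 Install Docker first
If you have not installed Docker yet, download it from https://docker.com and install it. Then open Docker Desktop and make sure it is running.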
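Before moving on, it can help to confirm from a script that the Docker CLI is installed and the daemon is actually reachable. The following is a minimal sketch, not part of any official tooling: the function name `docker_is_running` is made up for illustration, and it simply checks whether `docker version` can report a server version.

```python
import shutil
import subprocess


def docker_is_running(timeout: float = 10.0) -> bool:
    """Return True if the docker CLI exists and the daemon answers."""
    # No docker binary on PATH means Docker is not installed (or not on PATH).
    if shutil.which("docker") is None:
        return False
    try:
        # Asking for the server version fails when the daemon is not reachable.
        result = subprocess.run(
            ["docker", "version", "--format", "{{.Server.Version}}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if docker_is_running():
        print("Docker is running")
    else:
        print("Docker is not available; start Docker Desktop first")
```

If this reports that Docker is not available, the commands in the following sections will fail, so it is worth fixing first.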
1.2 Milvus standalone deployment (good for testing)
# Create a directory to hold the data
mkdir -p milvus-demo && cd milvus-demo
# Download the docker-compose configuration file
curl -sL https://github.com/milvus-io/milvus/releases/download/v2.4.0/milvus-standalone-docker-compose.yml \
  -o docker-compose.yml
# Start the services
docker-compose up -d
# Check status
docker-compose ps
# If all three services show "Up", the deployment succeeded:
# milvus-etcd, milvus-minio, milvus-standalone
Once the services are up, Milvus listens on port 19530. To verify:
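# Test the connection with pymilvus
pip install pymilvus
python << 'EOF'
from pymilvus import connections

# Connect (raises an exception if the server is unreachable)
connections.connect(
    alias="default",
    host="localhost",
    port="19530"
)
# Reaching this line means the connection succeeded
print("Connected! Milvus is up and running")
connections.disconnect("default")
EOF
1.3 Qdrant deployment (even simpler)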
Qdrant is even simpler to deploy than Milvus because it does not need separate etcd and MinIO services:
# Create a data directory
mkdir -p qdrant-demo && cd qdrant-demo
# A single command does it
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v "$(pwd)/storage:/qdrant/storage" \
  qdrant/qdrant:v1.7.0
# Verify
curl http://localhost:6333/healthz
# An HTTP 200 response means Qdrant is up
1.4 Weaviate deployment
# Create a directory
mkdir -p weaviate-demo && cd weaviate-demo
# Create docker-compose.yml
cat > docker-compose.yml << 'EOF'
version: '3.8'
services:
  weaviate:
    image: semitechnologies/weaviate:1.24.0
    ports:
      - "8080:8080"
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: "/var/lib/weaviate"
      ENABLE_MODULES: "text2vec-openai"
      OPENAI_APIKEY: "${OPENAI_API_KEY}"
    volumes:
      - weaviate_data:/var/lib/weaviate
volumes:
  weaviate_data:
EOF
# Start (export OPENAI_API_KEY first so Compose can substitute it)
docker-compose up -d
# Verify
curl http://localhost:8080/v1/.well-known/ready
1.5 Stopping and cleaning up
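# Stop the services (run inside the directory that holds docker-compose.yml)
docker-compose down
# Stop and delete the data in named volumes (use with caution!)
docker-compose down -v
# Remove the containers
docker rm -f qdrant milvus-etcd milvus-minio milvus-standalone
2. Kubernetes Production Deployment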
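2.1 When do you need K8s?
If your application needs:
- High availability (downtime is not acceptable)
- Automatic scaling
- A multi-node cluster
- A production environment
then it is time to move to Kubernetes. For a personal project or small-scale testing, a single-machine Docker deployment is enough.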
2.2 Milvus cluster deployment
# Add the Helm repository
# (the older milvus-io.github.io/milvus-helm repository has been archived)
helm repo add milvus https://zilliztech.github.io/milvus-helm/
helm repo update
# Create the namespace
kubectl create namespace milvus-system
# Deploy
helm install milvus milvus/milvus \
  -n milvus-system \
  --set cluster.enabled=true \
  --set etcd.replicaCount=3 \
  --set minio.replicas=4
# Check status
kubectl get pods -n milvus-system
# Wait until all pods are Running
# This can take a few minutes
2.3 Qdrant cluster deployment
# qdrant-statefulset.yaml
# Note: create the vector-system namespace and a qdrant-headless Service beforehand.
# Three replicas only form one cluster when Qdrant's distributed mode is enabled
# (for example QDRANT__CLUSTER__ENABLED=true plus peer bootstrapping); without it,
# each pod runs as an independent instance.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: qdrant
  namespace: vector-system
spec:
  serviceName: qdrant-headless
  replicas: 3  # three replicas for high availability
  selector:
    matchLabels:
      app: qdrant
  template:
    metadata:
      labels:
        app: qdrant
    spec:
      containers:
        - name: qdrant
          image: qdrant/qdrant:v1.7.0
          ports:
            - containerPort: 6333
              name: http
            - containerPort: 6334
              name: grpc
          resources:
            requests:
              cpu: "2"
              memory: 8Gi
            limits:
              cpu: "4"
              memory: 16Gi
          volumeMounts:
            - name: qdrant-storage
              mountPath: /qdrant/storage
          readinessProbe:
            httpGet:
              path: /readyz
              port: 6333
            initialDelaySeconds: 5
            periodSeconds: 5
  volumeClaimTemplates:
    - metadata:
        name: qdrant-storage
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi  # adjust as needed
# Apply the manifest
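kubectl apply -f qdrant-statefulset.yaml
# Create a Service. `kubectl expose` does not support StatefulSets, so create the
# Service directly; `kubectl create service` names it "qdrant" and selects pods
# labeled app=qdrant, which matches the StatefulSet's pod labels
kubectl create service loadbalancer qdrant \
  --tcp=6333:6333 \
  -n vector-system
3. Python Client Configuration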
3.1 Milvus client
from pymilvus import connections, Collection, FieldSchema, CollectionSchema, DataType

class MilvusDB:
    def __init__(self, host="localhost", port="19530"):
        connections.connect(
            alias="default",
            host=host,
            port=port
        )

    def create_collection(self, name, dim=768):
        """Create a collection, build an HNSW index, and load it."""
        fields = [
            FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
            FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=dim),
            FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=65535),
        ]
        schema = CollectionSchema(fields=fields, description=name)
        collection = Collection(name=name, schema=schema)
        # Create the index (important!)
        collection.create_index(
            field_name="embedding",
            index_params={
                "index_type": "HNSW",
                "metric_type": "COSINE",
                "params": {"M": 16, "efConstruction": 200}
            }
        )
        collection.load()
        return collection

    def insert(self, name, vectors, texts):
        """Insert data; columns follow the schema order, minus the auto-generated id."""
        collection = Collection(name)
        collection.insert([vectors, texts])
        collection.flush()

    def search(self, name, query_vector, top_k=10):
        """Search for the top_k nearest vectors."""
        collection = Collection(name)
        results = collection.search(
            data=[query_vector],
            anns_field="embedding",
            param={"metric_type": "COSINE", "params": {"ef": 128}},
            limit=top_k,
            output_fields=["text"]
        )
        return results[0]

# Usage
db = MilvusDB()
db.create_collection("my_kb", dim=768)
# Insert data
import numpy as np
vectors = np.random.rand(100, 768).tolist()
texts = [f"Document {i} content" for i in range(100)]
db.insert("my_kb", vectors, texts)
# Search
query = np.random.rand(768).tolist()
results = db.search("my_kb", query)
print(f"Found {len(results)} results")
3.2 Qdrant client
from qdrant_client import QdrantClient, models
import numpy as np

class QdrantDB:
    def __init__(self, host="localhost", port=6333):
        self.client = QdrantClient(host=host, port=port)

    def create_collection(self, name, dim=768):
        """Create a collection."""
        self.client.create_collection(
            collection_name=name,
            vectors_config=models.VectorParams(
                size=dim,
                distance=models.Distance.COSINE
            )
        )

    def upsert(self, name, vectors, texts, ids=None):
        """Insert or update points."""
        if ids is None:
            # Qdrant point IDs must be unsigned integers or UUIDs, not arbitrary strings
            ids = list(range(len(vectors)))
        points = [
            models.PointStruct(
                id=ids[i],
                vector=vectors[i],
                payload={"text": texts[i]}
            )
            for i in range(len(vectors))
        ]
        self.client.upsert(collection_name=name, points=points)

    def search(self, name, query_vector, top_k=10, filter_text=None):
        """Search, optionally keeping only points whose "text" payload equals filter_text."""
        search_params = models.SearchParams(hnsw_ef=128)
        # Build the filter condition
        query_filter = None
        if filter_text:
            query_filter = models.Filter(
                must=[
                    models.FieldCondition(
                        key="text",
                        match=models.MatchValue(value=filter_text)
                    )
                ]
            )
        results = self.client.search(
            collection_name=name,
            query_vector=query_vector,
            query_filter=query_filter,
            search_params=search_params,
            limit=top_k
        )
        return results

# Usage
db = QdrantDB()
db.create_collection("my_kb", dim=768)
# Insert
vectors = np.random.rand(100, 768).tolist()
texts = [f"Document {i} content" for i in range(100)]
db.upsert("my_kb", vectors, texts)
# Search
query = np.random.rand(768).tolist()
results = db.search("my_kb", query)
print(f"Found {len(results)} results")
3.3 Chroma client (the simplest)
import chromadb

class ChromaDB:
    def __init__(self, persist_dir="./chroma_data"):
        self.client = chromadb.PersistentClient(path=persist_dir)

    def create_collection(self, name):
        """Create a collection."""
        return self.client.create_collection(name)

    def add(self, name, documents, embeddings, metadatas=None, ids=None):
        """Add documents with their embeddings."""
        collection = self.client.get_collection(name)
        if ids is None:
            ids = [f"doc_{i}" for i in range(len(documents))]
        collection.add(
            documents=documents,
            embeddings=embeddings,
            metadatas=metadatas,
            ids=ids
        )

    def query(self, name, query_embeddings, n_results=10, where=None):
        """Query by embedding, with an optional metadata filter."""
        collection = self.client.get_collection(name)
        return collection.query(
            query_embeddings=query_embeddings,
            n_results=n_results,
            where=where
        )

# Usage
db = ChromaDB("./my_kb")
collection = db.create_collection("articles")
# Add data
documents = ["Introduction to deep learning", "Machine learning basics", "Python tutorial"]
embeddings = [[0.1]*128, [0.2]*128, [0.3]*128]  # simplified example
db.add("articles", documents, embeddings)
# Query
results = db.query("articles", [[0.15]*128], n_results=2)
print(results)
4. Performance Optimization
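4.1 Tuning index parameters
HNSW is the most commonly used index for vector search. It has a few key parameters:
# Milvus HNSW parameters (assumes `collection` and `query_vector` from section 3.1)
collection.create_index(
    field_name="embedding",
    index_params={
        "index_type": "HNSW",
        "metric_type": "COSINE",  # or L2, IP
        "params": {
            "M": 16,               # graph connections per node: larger improves recall but uses more memory and builds slower
            "efConstruction": 200  # build-time search width: larger gives a better graph but slower index builds
        }
    }
)
# Adjust ef at search time
results = collection.search(
    data=[query_vector],
    anns_field="embedding",
    param={"metric_type": "COSINE", "params": {"ef": 256}},  # larger ef improves recall at the cost of latency
    limit=10
)
Parameter selection guidelines:
- M: 4-64, default 16
- efConstruction: 100-400, default 200
- Tight on memory: M=4-8
- Prioritizing accuracy: M=32-64, efConstruction=400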
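The guidelines above can be captured in a small helper. This is a minimal sketch under stated assumptions: the profile names (`low_memory`, `balanced`, `high_recall`) and the function `hnsw_index_params` are hypothetical, and each profile uses just one value picked from the recommended range. The returned dictionary has the same shape as the `index_params` argument used in the Milvus example.

```python
# Hypothetical profiles: one concrete pick from each recommended range.
HNSW_PROFILES = {
    "low_memory": {"M": 8, "efConstruction": 200},
    "balanced": {"M": 16, "efConstruction": 200},
    "high_recall": {"M": 32, "efConstruction": 400},
}


def hnsw_index_params(profile="balanced", metric_type="COSINE"):
    """Build an index_params dict for the chosen (hypothetical) profile."""
    if profile not in HNSW_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    return {
        "index_type": "HNSW",
        "metric_type": metric_type,
        # Copy so callers can tweak the result without mutating the table.
        "params": dict(HNSW_PROFILES[profile]),
    }


if __name__ == "__main__":
    for name in HNSW_PROFILES:
        print(name, hnsw_index_params(name))
```

Treat these values as starting points only; the right setting depends on measuring recall and latency on your own data.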
4.2 Batch operations
# Batch inserts (usually far faster than inserting points one at a time,
# because each request amortizes the network round trip)
from qdrant_client.models import PointStruct

def batch_upsert(client, collection_name, vectors, texts, batch_size=1000):
    for i in range(0, len(vectors), batch_size):
        batch_vectors = vectors[i:i+batch_size]
        batch_texts = texts[i:i+batch_size]
        points = [
            PointStruct(
                id=i + j,  # Qdrant point IDs must be unsigned integers or UUIDs
                vector=batch_vectors[j],
                payload={"text": batch_texts[j]}
            )
            for j in range(len(batch_vectors))
        ]
        client.upsert(collection_name=collection_name, points=points)
        inserted = i + len(batch_vectors)
        if inserted % 10000 == 0:
            print(f"Inserted {inserted} points")
4.3 Memory optimization
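Vector databases are memory-hungry, so provision memory generously:
| Vector count | Recommended memory |
|---|---|
| 1 million vectors (768 dimensions) | 8-16 GB |
| 10 million vectors | 64-128 GB |
| 100 million vectors | 256 GB+ |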
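As a rough cross-check on the table, the raw footprint of an in-memory HNSW index can be estimated from the vector count and dimension. The sketch below is a back-of-the-envelope approximation, not a formula taken from any particular database: it assumes float32 components and counts only the bottom-layer graph links (up to 2×M neighbours of 4 bytes each per vector). The function names are made up for illustration.

```python
def estimate_hnsw_memory_bytes(num_vectors, dim, m=16, bytes_per_component=4):
    """Approximate bytes for raw vectors plus bottom-layer HNSW links."""
    vector_bytes = num_vectors * dim * bytes_per_component
    # Assumption: each vector keeps up to 2*M neighbour ids of 4 bytes each.
    link_bytes = num_vectors * 2 * m * 4
    return vector_bytes + link_bytes


def to_gib(num_bytes):
    """Convert bytes to GiB."""
    return num_bytes / 1024 ** 3


if __name__ == "__main__":
    for count in (1_000_000, 10_000_000, 100_000_000):
        size = estimate_hnsw_memory_bytes(count, 768)
        print(f"{count:>11,} vectors x 768 dims: about {to_gib(size):.1f} GiB")
```

The raw estimate for 1 million 768-dimensional vectors comes out at roughly 3 GiB, well below the 8-16 GB in the table, so the recommendations presumably include headroom for the operating system, query buffers, and background tasks such as index builds.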
If memory is insufficient, consider:
- Enabling vector compression
- Reducing the vector dimension
- Sharding the data
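To illustrate what "vector compression" buys you, here is a minimal sketch of scalar quantization, which stores each component in one byte instead of a 4-byte float32, cutting vector memory to roughly a quarter. Real databases implement this internally (Qdrant, for example, offers a scalar quantization option), so this pure-Python version only shows the idea; the function names are hypothetical.

```python
from array import array


def quantize_int8(vector):
    """Map floats to one-byte codes (0..255) using the vector's own min/max range."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero step for constant vectors
    codes = array("B", (min(255, round((x - lo) / scale)) for x in vector))
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Reconstruct approximate floats from the one-byte codes."""
    return [lo + c * scale for c in codes]


if __name__ == "__main__":
    original = [0.12, -0.40, 0.88, 0.05]
    codes, lo, scale = quantize_int8(original)
    restored = dequantize(codes, lo, scale)
    worst = max(abs(a - b) for a, b in zip(original, restored))
    print(f"bytes per component: {codes.itemsize}, worst error: {worst:.4f}")
```

The reconstruction error per component is bounded by half a quantization step, which is usually a small price in recall for the memory saved, but it is worth measuring on your own data.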
5. Monitoring and Operations
5.1 Basic health checks
# Milvus
curl http://localhost:9091/healthz
# Qdrant
curl http://localhost:6333/healthz
# Weaviate
curl http://localhost:8080/v1/.well-known/ready
5.2 Health checks in Python
import time
import requests
from pymilvus import connections

def check_milvus_health(host="localhost", port="19530", timeout=30):
    """Retry connecting to Milvus until it succeeds or `timeout` seconds pass."""
    start = time.time()
    while time.time() - start < timeout:
        try:
            # Use a dedicated alias so the application's "default" connection is untouched
            connections.connect(
                alias="healthcheck",
                host=host,
                port=port
            )
            connections.disconnect("healthcheck")
            return True
        except Exception:
            time.sleep(1)
    return False

def check_qdrant_health(host="localhost", port=6333):
    """Return True if Qdrant's health endpoint answers with HTTP 200."""
    try:
        resp = requests.get(f"http://{host}:{port}/healthz", timeout=5)
        return resp.status_code == 200
    except requests.RequestException:
        return False
5.3 Common operations commands
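# View Docker container logs
docker logs -f milvus-standalone
docker logs -f qdrant
# Open a shell inside a container
docker exec -it milvus-standalone bash
# Check resource usage
docker stats
# Restart the services
docker-compose restart
# Check disk usage (the volumes path applies to Linux hosts)
df -h
du -sh /var/lib/docker/volumes/*
6. Common Problems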
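6.1 Port conflicts
If startup fails because a port is already in use:
# See what is using the port
lsof -i :6333
# Or map a different host port
docker run -d --name qdrant -p 6335:6333 qdrant/qdrant
6.2 Out of memory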
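Increase the memory allocation in Docker Desktop's settings:
- Mac: Docker Desktop → Settings (Preferences in older versions) → Resources → Memory
- Windows: Docker Desktop → Settings → Resources → Memory (with the WSL 2 backend, the limit is set in .wslconfig instead)
Allocate at least 8 GB.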
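To confirm the setting took effect, you can ask the Docker daemon how much memory it sees. The sketch below assumes the `docker` CLI is on your PATH and reads the `MemTotal` field from `docker info`; the function names and the 10% slack are illustrative choices, since the Docker Desktop VM typically reports slightly less than the configured value.

```python
import shutil
import subprocess


def docker_memory_bytes(timeout=10.0):
    """Return the memory visible to the Docker daemon in bytes, or None if unknown."""
    if shutil.which("docker") is None:
        return None
    try:
        result = subprocess.run(
            ["docker", "info", "--format", "{{.MemTotal}}"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    value = result.stdout.strip()
    if result.returncode != 0 or not value.isdigit():
        return None
    return int(value)


def has_enough_memory(mem_bytes, minimum_gib=8.0, slack=0.9):
    """Check mem_bytes against minimum_gib, tolerating a little VM overhead."""
    if mem_bytes is None:
        return False
    return mem_bytes >= minimum_gib * slack * 1024 ** 3


if __name__ == "__main__":
    mem = docker_memory_bytes()
    if mem is None:
        print("Could not query Docker; is it running?")
    else:
        verdict = "OK" if has_enough_memory(mem) else "too low"
        print(f"Docker sees {mem / 1024 ** 3:.1f} GiB: {verdict}")
```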
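6.3 Data persistence
Data lives outside the container's writable layer, so restarting a container does not lose it. In the examples above, Milvus and Qdrant use bind-mounted directories (./volumes and ./storage), while Weaviate uses a named Docker volume:
# Inspect the named volume (Compose prefixes it with the project directory name)
docker volume inspect weaviate-demo_weaviate_data
# Back up the named volume
docker run --rm -v weaviate-demo_weaviate_data:/data -v "$(pwd)":/backup alpine tar czf /backup/weaviate_backup.tar.gz -C /data .
# Bind-mounted data can be archived as an ordinary directory (run from the parent of milvus-demo)
tar czf milvus_backup.tar.gz -C milvus-demo volumes
6.4 Remote connections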
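# If Docker runs on a remote server:
# Milvus: the standalone compose file already publishes port 19530 on the host,
#   so no compose change is needed; connect with host="<server-ip>" instead of "localhost".
#   (An extra_hosts entry such as "host.docker.internal:host-gateway" only lets a
#   container reach the host; it does not open inbound access.)
# Qdrant: listens on 0.0.0.0 by default, so it is reachable from outside directly
# Make sure the firewall allows ports 19530 (Milvus) and 6333 (Qdrant)
7. Summary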
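Deploying a vector database is not that complicated:
- Test environments: Docker Compose gets you running with one command
- Production environments: Kubernetes provides high availability
- Performance tuning: choose sensible HNSW parameters and leave enough memory
- Monitoring and operations: check health status regularly
The key is to understand the principles, pick the configuration you actually need, and avoid over-engineering.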
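Related topics
- Vector database comparison: feature comparison across databases
- Embedding model selection: models for generating vectors
- Knowledge base management: overall knowledge base architecture
Changelog
- 2026-04-24: Rewrite completed, language and style polished
- Added detailed deployment steps and answers to common problems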