这是本节的多页打印视图。
点击此处打印.
返回本页常规视图.
性能测试
Proxima Bilin Engine 的性能测试结果和简单分析
1 - 性能测试
1. 测试数据
5 份公开测试数据集分别从 ann-benchmarks
与 big-ann-benchmarks 中选取(为了测试不同数据量级下的性能表现,部分数据集只选取了其中一部分):
dataset |
dimension |
type |
distance |
datasize |
sift-128-euclidean |
128 |
FP32 |
Euclidean |
1,000,000 |
gist-960-euclidean |
960 |
FP32 |
Euclidean |
1,000,000 |
deep-image-96-angular |
96 |
FP32 |
Cosine |
10,000,000 |
yandex-text-to-image |
200 |
FP32 |
InnerProduct |
100,000,000 |
yandex-deep |
96 |
FP32 |
Euclidean |
1,000,000,000 |
2. 测试环境
- CPU:84 Core(Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz)
- 内存:464 GiB
- 磁盘:SSD
- GCC:8.3.0
- 指令集:Intel skylake-avx512
3. 测试说明
- 数据集分别采用 fp32/fp16/int8/int4 四类类型量化方式进行测试
- 索引构建采用 index_builder 工具来进行索引的构建
- 性能测试采用 bench_client 测试工具进行性能测试
- 构建采用图索引类型,相关参数为:max_neighbor_count=64,ef_search=200,ef_construction=500。例如如采用 Int8 量化,索引构建时使用的 schema 如下:
{
"collection_name": "yandex",
"index_column_params": [
{
"column_name": "feature",
"index_type": "IT_PROXIMA_GRAPH_INDEX",
"extra_params": [
{
"key": "quantize_type",
"value": "DT_VECTOR_INT8"
},
{
"key": "engine",
"value": "HNSW"
},
{
"key": "max_neighbor_count",
"value": "64"
}
],
"data_type": "DT_VECTOR_FP32",
"dimension": 96
}
]
}
3. 测试结果
3.1 sift-128-euclidean
3.1.1 构建信息
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
索引大小 |
734M |
490M |
399M |
329M |
构建时间 |
66s |
57s |
50s |
48s |
3.1.2 查询性能
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
1 并发 QPS/RT |
811/s 1132us |
922/s 980us |
982/s 932us |
1058/s 879us |
8 并发 QPS/RT |
6533/s 1212us |
7326/s 1084us |
7868/s 1011us |
8636/s 924us |
16 并发 QPS/RT |
12273/s 1339us |
14579/s 1165us |
15860/s 1077us |
15997/s 963us |
24 并发 QPS/RT |
15846/s 1454us |
18820/s 1261us |
20171/s 1140us |
22213/s 1029us |
32 并发 QPS/RT |
20671/s 1562us |
23389/s 1343us |
26223/s 1192us |
28353/s 1089us |
48 并发 QPS/RT |
24579/s 1898us |
29864/s 1547us |
34159/s 1364us |
38135/s 1231us |
64 并发 QPS/RT |
26829/s 2301us |
35575/s 1723us |
41072/s 1512us |
45778/s 1379us |
3.1.3 召回率
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
Top 1 召回率 |
99.99% |
99.99% |
96.33% |
61.46% |
Top 10 召回率 |
99.94% |
99.94% |
97.25% |
71.88% |
Top 20 召回率 |
99.89% |
99.89% |
97.54% |
74.48% |
Top 30 召回率 |
99.84% |
99.84% |
97.68% |
75.76% |
Top 40 召回率 |
99.79% |
99.79% |
97.82% |
76.58% |
Top 50 召回率 |
99.74% |
99.74% |
97.85% |
77.28% |
Top 60 召回率 |
99.68% |
99.68% |
97.88% |
77.82% |
Top 70 召回率 |
99.61% |
99.61% |
97.90% |
78.21% |
Top 80 召回率 |
99.54% |
99.54% |
97.91% |
78.58% |
Top 90 召回率 |
99.47% |
99.46% |
97.91% |
78.90% |
Top 100 召回率 |
99.39% |
99.39% |
97.92% |
79.19% |
3.2 gist-960-euclidean
3.2.1 构建信息
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
索引大小 |
3.8G |
2.0G |
1.2G |
686M |
构建时间 |
411s |
257s |
164s |
157s |
3.2.2 查询性能
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
1 并发 QPS/RT |
221/s 4230us |
235/s 3978us |
391/s 2518us |
400/s 2398us |
8 并发 QPS/RT |
1722/s 4625us |
1868/s 4257us |
3051/s 2593us |
3204/s 2483us |
16 并发 QPS/RT |
2948/s 5179us |
3676/s 4410us |
5920/s 2694us |
6224/s 2550us |
24 并发 QPS/RT |
3835/s 6333us |
4876/s 4951us |
7975/s 3046us |
8689/s 2744us |
32 并发 QPS/RT |
4218/s 7652us |
6135/s 5242us |
10005/s 3179us |
11013/s 2870us |
48 并发 QPS/RT |
4308/s 11180us |
6935/s 6956us |
11896/s 4054us |
14549/s 3302us |
64 并发 QPS/RT |
4396/s 14622us |
7295/s 8817us |
12418/s 5077us |
17216/s 3761us |
3.2.3 召回率
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
Top 1 召回率 |
99.68% |
99.66% |
96.06% |
55.64% |
Top 10 召回率 |
99.43% |
99.34% |
96.76% |
66.81% |
Top 20 召回率 |
99.19% |
99.10% |
96.96% |
69.53% |
Top 30 召回率 |
98.98% |
98.91% |
97.00% |
70.96% |
Top 40 召回率 |
98.78% |
98.70% |
97.02% |
71.91% |
Top 50 召回率 |
98.57% |
98.50% |
96.94% |
72.48% |
Top 60 召回率 |
98.38% |
98.32% |
96.86% |
73.02% |
Top 70 召回率 |
98.18% |
98.12% |
96.76% |
73.47% |
Top 80 召回率 |
97.98% |
97.92% |
96.68% |
73.82% |
Top 90 召回率 |
97.79% |
97.73% |
96.57% |
74.11% |
Top 100 召回率 |
97.63% |
97.57% |
96.48% |
74.40% |
3.3 deep-image-96-angular
3.3.1 构建信息
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
索引大小 |
6.3G |
4.6G |
4.0G |
3.3G |
构建时间 |
854s |
671s |
597s |
550s |
3.3.2 查询性能
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
1 并发 QPS/RT |
646/s 1547us |
750/s 1333us |
729/s 1352us |
736/s 1345us |
8 并发 QPS/RT |
4910/s 1602us |
5891/s 1349us |
5756/s 1410us |
5817/s 1367us |
16 并发 QPS/RT |
9100/s 1696us |
11055/s 1408us |
10474/s 1462us |
11222/s 1449us |
24 并发 QPS/RT |
12915/s 1860us |
15766/s 1520us |
15379/s 1560us |
15679/s 1538us |
32 并发 QPS/RT |
15917/s 1998us |
19746/s 1611us |
19252/s 1659us |
19545/s 1625us |
48 并发 QPS/RT |
19863/s 2423us |
25792/s 1852us |
24753/s 1929us |
25421/s 1883us |
64 并发 QPS/RT |
22562/s 2825us |
30531/s 2101us |
29083/s 2184us |
30160/s 2107us |
3.3.3 召回率
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
Top 1 召回率 |
99.68% |
99.66% |
96.06% |
55.64% |
Top 10 召回率 |
99.43% |
99.34% |
96.76% |
66.81% |
Top 20 召回率 |
99.19% |
99.10% |
96.96% |
69.53% |
Top 30 召回率 |
98.98% |
98.91% |
97.00% |
70.96% |
Top 40 召回率 |
98.78% |
98.70% |
97.02% |
71.91% |
Top 50 召回率 |
98.57% |
98.50% |
96.94% |
72.48% |
Top 60 召回率 |
98.38% |
98.32% |
96.86% |
73.02% |
Top 70 召回率 |
98.18% |
98.12% |
96.76% |
73.47% |
Top 80 召回率 |
97.98% |
97.92% |
96.68% |
73.82% |
Top 90 召回率 |
97.79% |
97.73% |
96.57% |
74.11% |
Top 100 召回率 |
97.63% |
97.57% |
96.48% |
74.40% |
3.4 yandex-text-to-image
3.4.1 构建信息
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
索引大小 |
100G |
64G |
46G |
37G |
构建时间 |
4h |
2.7h |
2.5h |
2.1h |
3.4.2 查询性能
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
1 并发 QPS/RT |
500/s 1999us |
530/s 1942us |
565/s 1821us |
572/s 1765us |
8 并发 QPS/RT |
3839/s 2151us |
4175/s 1967us |
4251/s 1846us |
4366/s 1832us |
16 并发 QPS/RT |
6606/s 2275us |
7207/s 2067us |
8023/s 1915us |
8404/s 1881us |
24 并发 QPS/RT |
9746/s 2534us |
10172/s 2280us |
11246/s 2082us |
11887/s 2060us |
32 并发 QPS/RT |
11368/s 2847us |
13136/s 2375us |
14709/s 2189us |
15081/s 2140us |
48 并发 QPS/RT |
12888/s 3708us |
16914/s 2894us |
18466/s 2591us |
19356/s 2467us |
64 并发 QPS/RT |
13523/s 4711us |
18051/s 3517us |
20718/s 3018us |
22637/s 2823us |
3.4.3 召回率
量化类型 |
FP32 |
FP16 |
INT8 |
INT4 |
Top 1 召回率 |
99.80% |
99.75% |
98.07% |
69.50% |
Top 10 召回率 |
99.80% |
99.69% |
97.78% |
74.08% |
Top 20 召回率 |
99.72% |
99.59% |
97.73% |
75.07% |
Top 30 召回率 |
99.61% |
99.48% |
97.91% |
75.83% |
Top 40 召回率 |
99.52% |
99.39% |
97.84% |
76.58% |
Top 50 召回率 |
99.44% |
99.31% |
97.92% |
77.11% |
Top 60 召回率 |
99.35% |
99.22% |
97.90% |
77.56% |
Top 70 召回率 |
99.26% |
99.13% |
97.87% |
77.95% |
Top 80 召回率 |
99.15% |
99.03% |
97.85% |
78.31% |
Top 90 召回率 |
99.08% |
98.96% |
97.85% |
78.62% |
Top 100 召回率 |
98.99% |
98.87% |
97.77% |
78.83% |
![Recall](/images/yandex-text-to-image-recall.png
3.5 yandex-deep
3.5.1 构建信息
量化类型 |
INT8 |
索引大小 |
386G |
构建时间 |
71h |
3.5.2 查询性能
量化类型 |
INT8 |
1 并发 QPS/RT |
417/s 2398us |
8 并发 QPS/RT |
3393/s 2574us |
16 并发 QPS/RT |
5886/s 2793us |
24 并发 QPS/RT |
7368/s 3257us |
32 并发 QPS/RT |
8085/s 3918us |
48 并发 QPS/RT |
8933/s 5556us |
64 并发 QPS/RT |
8536/s 7554us |
3.5.3 召回率
量化类型 |
INT8 |
Top 1 召回率 |
92.07% |
Top 10 召回率 |
93.11% |
Top 20 召回率 |
93.05% |
Top 30 召回率 |
92.88% |
Top 40 召回率 |
92.59% |
Top 50 召回率 |
92.27% |
Top 60 召回率 |
91.96% |
Top 70 召回率 |
91.63% |
Top 80 召回率 |
91.30% |
Top 90 召回率 |
91.00% |
Top 100 召回率 |
90.64% |