笔记列表

《提高C++性能的编程技术》笔记:总结
《提高C++性能的编程技术》笔记:跟踪
《提高C++性能的编程技术》笔记:构造函数、析构函数
《提高C++性能的编程技术》笔记:临时对象
《提高C++性能的编程技术》笔记:内存池(单线程、多线程)
《提高C++性能的编程技术》笔记:内联
《提高C++性能的编程技术》笔记:STL
《提高C++性能的编程技术》笔记:引用计数
《提高C++性能的编程技术》笔记:编码优化
《提高C++性能的编程技术》笔记:设计优化/可扩展性/系统体系结构

标准模板库(Standard Template Library, STL)是容器和通用算法的强效组合。

渐近复杂度:算法的渐近复杂度是对算法性能的近似估计。它是算法集到特定性能标准集的映射。如果需要对包含N个整数的向量的所有元素求和,那么每个整数必须且仅需检查一次,因此该算法的复杂度约为N,我们将其称为O(N)。另一方面,假设需要创建一个包含N个元素的向量,由于某种原因,你需要将这些元素插入到该向量的前端。然而,在向量的前端每插入一个元素便会迫使所有已存在的元素移动1个位置。这就导致了(1+2+3+…+N)次向量元素的移动,也即(N/2)(N+1)次移动。即使需要移动(NN/2)+(N/2)次,但我们仍说这个算法的复杂度是O(NN)。这是因为渐近性能标准集将忽略常量倍数和低阶因子。因而NN和7NN的复杂度相同:均为O(N*N)。STL对其算法渐近复杂度的保证是一个良好的开端。它告诉我们,所使用的算法是同类算法中最好的。

插入:在集合大小事先可知的情况下,在尾部进行元素插入操作,速度,数组>向量>双向列表>多重集。数组和向量均为占用连续内存块的顺序容器。当无法事先确定集合大小时,向量将会更有用。向量是可以动态增长的,这样程序员就不用考虑集合的大小。向量的大小是指向量当前拥有元素的数量。向量的容量是指向量能够容纳元素数量的最大值,超过这个值,向量就必须分配更多的内存以适应元素的增长。当往向量中插入第一个元素时,为了使向量的容量超过它的初始大小(值为1),典型的STL实现将会分配一大块内存。后续的插入将会增加向量的大小,而容量保持不变。如果集合持续增长,那么向量大小最终将会达到它的容量。下一次插入操作将会迫使向量的实现扩大自身容量。这必须包含以下几个步骤:(1). 分配更大的内存块,以便为增加的元素留出空间;(2). 将当前集合中现存的所有元素复制到新分配的内存中。在该过程中,会为旧集合的每一个元素调用拷贝构造函数;(3). 销毁旧集合并释放它所占用的空间。在该过程中,会为旧集合中的每一个元素调用析构函数。这些步骤可能会引起巨额开销。因此,我们希望尽可能少地出现向量大小超过其容量的情况。列表容器并未将元素存储在连续的内存空间中,所以也就无须应付容量问题和相关的性能开销问题。向量容器在进行前端插入时的性能极其槽糕。而列表在进行前端插入或后端插入时性能几乎一致。

删除:删除操作的性能情况在很多方面都与插入操作很类似。很多关于插入效率的结论同样适用于删除。例如:(1). 向量擅长于在尾部对元素进行插入(或删除)操作。因为这种操作与集合大小无关,所以它是一种固定时间的操作。(2). 除了尾部操作之外,采用向量进行其它任何位置的插入(或删除)操作都是一种极为糟糕的选择。这种插入(或删除)的代价与插入(或删除)点到向量的最后一个元素之间的距离成正比例。(3). 双向队列在集合的前端和尾部插入(或删除)元素时效率很高(常数时间),在其它任何位置插入(或删除)元素时效率都很低。(4). 列表在集合的任何位置插入(或删除)元素时效率都很高(固定时间)。

遍历:对于容器的遍历操作来说,向量和数组的性能是相同的。它们均远胜过列表。容器遍历的关键因素似乎为容器内存布局和系统缓存之间的交互作用。

查找:当进行元素查找时,使用成员find方法,多重容器可以胜过其它的所有竞争对手。而多重集是有序集,这使其在插入和删除操作上遭受了一些性能上的损失。但是,从另一方面来看,有序特性也为多重集容器在查找方面带来了巨大的优势。

函数对象:默认情况下,accumulate()函数将operator+操作应用到容器中所有元素,然后返回这些元素相加的累计结果。对于整数集合来说,如果提供给accumulate()函数的初始值为0,那么该函数返回的结果就是集合中所有元素的和。accumulate()函数不仅可以进行对象相加,而且可以对容器元素进行任何操作(前提是元素支持这种操作),然后返回累计结果。函数指针直到运行时才能被解析,这就使得它们无法被内联。而函数对象是在编译时被确定的,这就使得编译器可以自由地内联operator()函数并显著地提升效率。

STL实现在以下几个方面形成了自己的优势:(1). STL实现使用最好的算法;(2). STL实现的设计者非常有可能是领域内的专家;(3). 这些领域内的专家完全地致力于提供一个灵活、强大并且高效的库。这是他们的首要任务。

STL是抽象、灵活性和效率的一种罕见的结合。对于某种特定的应用模式,一些容器比其它的更加高效,这都要随着实际应用而定。除非了解一些相关领域内STL所忽略的问题,否则你是不可能超过STL实现的。不过,在一些特定的情况下,还是有可能超越STL实现的性能的。

以下是测试代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
#include <cstdlib>
#include <iostream>
#include <chrono>
#include <string>
#include <vector>
#include <list>
#include <set>
#include <algorithm>
#include <ctime>
#include <memory>
#include <numeric>
#include <functional>

namespace standard_template_library_ {

// reference: 《提高C++性能的编程技术》:第十一章:标准模板库
namespace {
template<class T>
void arrayInsert(T* a, const T* collection, int size)
{
for (int k = 0; k < size; ++k) {
a[k] = collection[k];
}
}

template<class T>
void arrayTraverse(const T* a, int size)
{
T sum = std::accumulate(&a[0], &a[size], 0);
}

template<class T>
void arrayFind(const T* a, const T* collection, int size)
{
T const value = collection[size/2];
T* tmp = const_cast<T*>(a);
T* p = std::find(&tmp[0], &tmp[size], value);
}

template<class T>
void vectorInsert(std::vector<T>* v, const T* collection, int size)
{
for (int k = 0; k < size; ++k) {
v->push_back(collection[k]);
}
}

template<class T>
void vectorInsertFront(std::vector<T>* v, const T* collection, int size)
{
for (int k = 0; k < size; ++k) {
v->insert(v->begin(), collection[k]);
}
}

template<class T>
void vectorDelete(std::vector<T>* v)
{
while (!v->empty()) {
v->pop_back();
}
}

template<class T>
void vectorDeleteFront(std::vector<T>* v)
{
while (!v->empty()) {
v->erase(v->begin());
}
}

template<class T>
void vectorTraverse(const std::vector<T>* v, int size)
{
T sum = std::accumulate(v->cbegin(), v->cend(), 0);
}

template<class T>
void vectorFind(const std::vector<T>* v, const T* collection, int size)
{
T const value = collection[size/2];
auto it = std::find(v->cbegin(), v->cend(), value);
}

template<class T>
void listInsert(std::list<T>* l, const T* collection, int size)
{
for (int k = 0; k < size; ++k) {
l->push_back(collection[k]);
}
}

template<class T>
void listInsertFront(std::list<T>* l, const T* collection, int size)
{
for (int k = 0; k < size; ++k) {
l->push_front(collection[k]);
}
}

template<class T>
void listDelete(std::list<T>* l)
{
while (!l->empty()) {
l->pop_back();
}
}

template<class T>
void listDeleteFront(std::list<T>* l)
{
while (!l->empty()) {
l->pop_front();
}
}

template<class T>
void listTraverse(const std::list<T>* l, int size)
{
T sum = std::accumulate(l->cbegin(), l->cend(), 0);
}

template<class T>
void listFind(const std::list<T>*l, const T* collection, int size)
{
T const value = collection[size/2];
auto it = std::find(l->cbegin(), l->cend(), value);
}

template<class T>
void multisetInsert(std::multiset<T>* s, const T* collection, int size)
{
for (int k = 0; k < size; ++k) {
s->insert(collection[k]);
}
}

template<class T>
void multisetFind(const std::multiset<T>* s, const T* collection, int size)
{
T const value = collection[size/2];

// 当查找多重集容器时,使用std::find并不是最佳的选择
//auto it = std::find(s->cbegin(), s->cend(), value);
auto it = s->find(value);
}

int* genIntData(int size)
{
std::srand(std::time(nullptr));
int* data = new int[size];
std::generate(&data[0], &data[size], std::rand);
return data;
}

int mult(int x, int y)
{
return (x*y);
}

class Mult {
public:
int operator()(int x, int y) const { return (x*y); }
};

} // namespace

} // namespace standard_template_library_
using namespace standard_template_library_;

int main()
{

std::chrono::high_resolution_clock::time_point time_start, time_end;
const int count{400000}, count2{10000}, count3{200000000};
int* data = genIntData(count);

{ // normal array insert back operation
std::unique_ptr<int[]> arr(new int[count]);
time_start = std::chrono::high_resolution_clock::now();
arrayInsert(arr.get(), data, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<arr.get()[0]<<", array insert back time spend: "<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
}

{ // vector insert back operation
std::vector<int> vec;
time_start = std::chrono::high_resolution_clock::now();
vectorInsert(&vec, data, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<vec[0]<<", vector insert back time spend: "<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
}

{ // list insert back operation
std::list<int> l;
time_start = std::chrono::high_resolution_clock::now();
listInsert(&l, data, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<*(l.cbegin())<<", list insert back time spend: "<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// sprintf(stdout, "first value: %d, list insert back time spend: %f seconds\n",
// *(l.cbegin()), (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // multiset insert back operation
std::multiset<int> s;
time_start = std::chrono::high_resolution_clock::now();
multisetInsert(&s, data, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<*(s.cbegin())<<", multiset insert back time spend: "<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "first value: %d, multiset insert back time spend: %f seconds\n",
// *(s.cbegin()), (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

// { // vector insert front operation
// std::vector<int> vec;
// time_start = std::chrono::high_resolution_clock::now();
// vectorInsertFront(&vec, data, count);
// time_end = std::chrono::high_resolution_clock::now();
// std::cout<<"first value: "<<vec[0]<<", vector insert front time spend: "
// <<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// // fprintf(stdout, "first value: %d, vector insert front time spend: %f seconds\n",
// // vec[0], (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
// }

{ // list insert front operation
std::list<int> l;
time_start = std::chrono::high_resolution_clock::now();
listInsertFront(&l, data, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<*(l.cbegin())<<", list insert front time spend: "<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "first value: %d, list insert front time spend: %f seconds\n",
// *(l.cbegin()), (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // vector delete back operation
std::vector<int> vec(data, data+count);
time_start = std::chrono::high_resolution_clock::now();
vectorDelete(&vec);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"vector size: "<<vec.size()<<", vector delete back time spend: "<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "vector size: %d, vector delete back time spend: %f seconds\n",
// vec.size(), (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // list delete back operation
std::list<int> l(data, data+count);
time_start = std::chrono::high_resolution_clock::now();
listDelete(&l);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"list size: "<<l.size()<<", list delete back time spend: "<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "list size: %d, list delete back time spend: %f seconds\n",
// l.size(), (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

// { // vector delete front operation
// std::vector<int> vec(data, data+count);
// time_start = std::chrono::high_resolution_clock::now();
// vectorDeleteFront(&vec);
// time_end = std::chrono::high_resolution_clock::now();
// std::cout<<"vector size: "<<vec.size()<<", vector delete front time spend: "<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// // fprintf(stdout, "vector size: %d, vector delete front time spend: %f seconds\n",
// // vec.size(), (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
// }

{ // list delete front operation
std::list<int> l(data, data+count);
time_start = std::chrono::high_resolution_clock::now();
listDeleteFront(&l);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"list size: "<<l.size()<<", list delete front time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "list size: %d, list delete front time spend: %f seconds\n",
// l.size(), (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // normal array traverse operation
time_start = std::chrono::high_resolution_clock::now();
for (int i = 0; i < count2; ++i)
arrayTraverse(data, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<data[0]<<", array traverse time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "first value: %d, array traverse time spend: %f seconds\n",
// data[0], (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // vector traverse operation
std::vector<int> vec(data, data+count);
time_start = std::chrono::high_resolution_clock::now();
for (int i = 0; i < count2; ++i)
vectorTraverse(&vec, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<vec[0]<<", vector traverse time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";

// fprintf(stdout, "first value: %d, vector traverse time spend: %f seconds\n",
// vec[0], (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // list traverse operation
std::list<int> l(data, data+count);
time_start = std::chrono::high_resolution_clock::now();
for (int i = 0; i < count2; ++i)
listTraverse(&l, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<*l.cbegin()<<", list traverse time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
fprintf(stdout, "first value: %d, list traverse time spend: %f seconds\n",
*l.cbegin(), (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // normal array find operation
time_start = std::chrono::high_resolution_clock::now();
for (int i = 0; i < count2; ++i)
arrayFind(data, data, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<data[0]<<", array find time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "first value: %d, array find time spend: %f seconds\n",
// data[0], (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // vector find operation
std::vector<int> vec(data, data+count);
time_start = std::chrono::high_resolution_clock::now();
for (int i = 0; i < count2; ++i)
vectorFind(&vec, data, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<data[0]<<", vector find time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "first value: %d, vector find time spend: %f seconds\n",
// vec[0], (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // list find operation
std::list<int> l(data, data+count);
time_start = std::chrono::high_resolution_clock::now();
for (int i = 0; i< count2; ++i)
listFind(&l, data, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<*l.cbegin()<<", list find time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "first value: %d, list find time spend: %f seconds\n",
// *l.cbegin(), (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // multiset find operation
std::multiset<int> s;
time_start = std::chrono::high_resolution_clock::now();
for (int i = 0; i < count2; ++i)
multisetFind(&s, data, count);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"first value: "<<*(s.cbegin())<<", multiset find time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "first value: %d, multiset find time spend: %f seconds\n",
// *(s.cbegin()), (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

int a[10] = {1, 2, 3, 5, 7, 11, 13, 17, 19, 23};
int init = 1, product = 0;

{ // accumulate operation: function pointer
time_start = std::chrono::high_resolution_clock::now();
for (int i = 0; i < count3; ++i)
product = std::accumulate(&a[0], &a[10], init, mult);
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"value: "<<product<<", accumulate function pointer time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "value: %d, accumulate function pointer time spend: %f seconds\n",
// product, (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // accumulate operation: function object
time_start = std::chrono::high_resolution_clock::now();
for (int i = 0; i < count3; ++i)
product = std::accumulate(&a[0], &a[10], init, Mult());
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"value: "<<product<<", accumulate function object time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "value: %d, accumulate function object time spend: %f seconds\n",
// product, (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

{ // accumulate operation: function object(std)
time_start = std::chrono::high_resolution_clock::now();
for (int i = 0; i < count3; ++i)
product = std::accumulate(&a[0], &a[10], init, std::multiplies<int>());
time_end = std::chrono::high_resolution_clock::now();
std::cout<<"value: "<<product<<", accumulate function object(std) time spend: "
<<(std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count() <<" seconds\n";
// fprintf(stdout, "value: %d, accumulate function object(std) time spend: %f seconds\n",
// product, (std::chrono::duration_cast<std::chrono::duration<double>>(time_end-time_start)).count());
}

delete [] data;
return 0;
}


利用 godbolt 执行,相应代码见 点击这里 测试结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
first value: 426745319, array insert back time spend: 0.00107669 seconds
first value: 426745319, vector insert back time spend: 0.00295537 seconds
first value: 426745319, list insert back time spend: 0.0211792 seconds
first value: 7053, multiset insert back time spend: 0.165407 seconds
first value: 498119490, list insert front time spend: 0.0477667 seconds
vector size: 0, vector delete back time spend: 0.000233846 seconds
list size: 0, list delete back time spend: 0.00776743 seconds
list size: 0, list delete front time spend: 0.00843209 seconds
first value: 426745319, array traverse time spend: 8.8e-08 seconds
first value: 426745319, vector traverse time spend: 1.25e-07 seconds
first value: 426745319, list traverse time spend: 1.15e-07 seconds
first value: 426745319, list traverse time spend: 0.000000 seconds
first value: 426745319, array find time spend: 9.8e-08 seconds
first value: 426745319, vector find time spend: 1.17e-07 seconds
first value: 426745319, list find time spend: 8.7e-08 seconds
first value: 0, multiset find time spend: 1.09e-07 seconds
value: 223092870, accumulate function pointer time spend: 1.88715 seconds
value: 223092870, accumulate function object time spend: 1.80509 seconds
value: 223092870, accumulate function object(std) time spend: 2.01622 seconds