Sean::function

Authored by 0xseantasker on Sep 14 2024, 11:49 PM.
Given the following classes:
class Base
{
public:
    virtual ~Base() = default;
    virtual void Call() = 0;
    virtual void CallParams(int a, int b) = 0;
};

class A : public Base
{
public:
    void Call() override
    {
        calls++;
        position += calls % 10;
    }
    void CallParams(int a, int b) override
    {
        calls++;
        position += calls % b;
    }
    int calls = 0;
    int position = 0;
};

class B : public Base
{
public:
    void Call() override final
    {
        calls++;
        position += calls % 10;
    }
    void CallParams(int a, int b) override
    {
        calls++;
        position += calls % b;
    }
    int calls = 0;
    int position = 0;
};
I wanted to compare the performance of different ways of calling class methods in a hierarchy:
- via virtual
- "direct" member calls
- std::function
- boost::function
- my implementation of a std::function-like object that is intended to bypass the vtable
In my implementation I essentially just take the member function pointer and the object pointer, then cast the member function pointer to what GCC internally expects: func(void*, params...).
I tried building with clang, but it doesn't accept my member-function-to-void* cast.
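The func(void*, params...) call shape described above can also be approximated portably. The sketch below is not the cast used in Sean::function. It swaps in a different technique: a captureless lambda that decays to a plain function pointer and forwards to a member function chosen at compile time. The names ThunkFn, Bind, Counter and DemoThunk are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical wrapper (not the author's Sean::function): stores an object
// pointer plus a plain function pointer of the form func(void*, params...).
template <typename Signature>
class ThunkFn;

template <typename R, typename... Args>
class ThunkFn<R(Args...)>
{
public:
    // Bind a member function, selected at compile time, to an object.
    template <typename T, R (T::*Method)(Args...)>
    static ThunkFn Bind(T* object)
    {
        ThunkFn fn;
        fn.object_ = object;
        // A captureless lambda converts to an ordinary function pointer.
        fn.thunk_ = [](void* self, Args... args) -> R {
            return (static_cast<T*>(self)->*Method)(args...);
        };
        return fn;
    }

    R operator()(Args... args) const { return thunk_(object_, args...); }

private:
    void* object_ = nullptr;
    R (*thunk_)(void*, Args...) = nullptr;
};

// Hypothetical non-virtual stand-in for class A.
struct Counter
{
    int calls = 0;
    int position = 0;
    void CallParams(int a, int b)
    {
        (void)a; // unused, as in the original classes
        calls++;
        position += calls % b;
    }
};

// Usage: two calls with b = 10 add 1 and then 2 to position.
int DemoThunk()
{
    Counter c;
    auto fn = ThunkFn<void(int, int)>::Bind<Counter, &Counter::CallParams>(&c);
    fn(0, 10);
    fn(0, 10);
    assert(c.calls == 2);
    return c.position;
}
```

Unlike the raw cast, this version keeps the argument types checked at compile time. Note that if Method names a virtual function, the call inside the lambda still goes through normal virtual dispatch unless the compiler can devirtualize it (for example when the method is final), so this sketch does not by itself bypass the vtable.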
Results:
Debug
g++-12 (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0
`-g -std=gnu++20`
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 30.49 | 32,793,129.79 | 2.0% | 8.57 | `base->Call()`
| 23.01 | 43,458,913.64 | 1.9% | 6.48 | `a->Call()`
| 21.77 | 45,926,564.46 | 2.3% | 6.22 | `b->Call()`
| 99.60 | 10,040,487.24 | 1.6% | 27.72 | `function &Base::Call, base`
| 90.66 | 11,030,408.01 | 1.8% | 25.21 | `function &A::Call, a`
| 91.46 | 10,934,136.38 | 1.4% | 25.37 | `function &B::Call, b`
| 83.14 | 12,027,721.56 | 1.7% | 23.06 | `boost function &Base::Call, base`
| 75.94 | 13,168,925.84 | 2.3% | 21.04 | `boost function &A::Call, a`
| 72.89 | 13,718,884.93 | 2.3% | 20.33 | `boost function &B::Call, b`
| 31.40 | 31,847,168.31 | 2.7% | 8.71 | `function pointer direct base, &Base::Call `
| 24.96 | 40,071,270.39 | 2.9% | 7.04 | `function pointer direct a, &A::Call`
| 26.24 | 38,110,580.64 | 1.8% | 7.36 | `function pointer direct b, &B::Call`
| 37.16 | 26,907,735.45 | 2.0% | 10.19 | `Sean::function a|b, &A|B::Call`
| 33.42 | 29,924,713.87 | 1.8% | 9.23 | `Sean::function a, &A::Call`
| 31.93 | 31,318,966.50 | 2.3% | 8.83 | `Sean::function b, &B::Call`
| 26.97 | 37,078,341.54 | 2.8% | 7.52 | `Sean::function direct a|b, &A|B::Call`
| 21.92 | 45,629,125.73 | 2.6% | 6.13 | `Sean::function direct a, &A::Call`
| 21.73 | 46,022,102.98 | 2.9% | 6.07 | `Sean::function direct b, &B::Call`
| 33.24 | 30,086,109.45 | 3.0% | 9.23 | `Sean::function a|b, &A|B::CallParams`
| 31.82 | 31,428,888.00 | 2.9% | 8.80 | `Sean::function a, &A::CallParams`
| 32.14 | 31,111,968.50 | 2.5% | 8.86 | `Sean::function b, &B::CallParams`
| 28.15 | 35,519,154.34 | 3.2% | 7.86 | `Sean::function direct a|b, &A|B::CallParams`
| 22.08 | 45,284,728.74 | 3.7% | 6.16 | `Sean::function direct a, &A::CallParams`
| 21.74 | 45,999,870.49 | 3.3% | 6.10 | `Sean::function direct b, &B::CallParams`
| 33.97 | 29,440,925.19 | 2.9% | 9.44 | `template sean::function a|b, &A|B::CallParams`
| 31.43 | 31,813,589.58 | 2.2% | 8.70 | `template sean::function a, &A::CallParams`
| 31.62 | 31,620,852.76 | 2.3% | 8.76 | `template sean::function b, &B::CallParams`
| 33.81 | 29,576,909.20 | 2.4% | 9.39 | `template check sean::function a|b, &A|B::CallParams`
| 30.79 | 32,481,714.89 | 2.9% | 8.54 | `template check sean::function a, &A::CallParams`
| 29.19 | 34,256,903.56 | 3.1% | 8.16 | `template check sean::function b, &B::CallParams`


Results from Release:

g++-12 (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0:

-O3 -DNDEBUG -std=gnu++20

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|                8.97 |      111,505,243.88 |    3.2% |      2.53 | `base->Call()`
|                5.35 |      187,073,255.20 |    4.4% |      1.52 | `a->Call()`
|                4.85 |      206,146,018.92 |    2.9% |      1.36 | `b->Call()`
|               11.91 |       83,951,922.46 |    2.5% |      3.34 | `function &Base::Call, base`
|                9.40 |      106,408,670.05 |    3.0% |      2.64 | `function &A::Call, a`
|                9.29 |      107,610,932.40 |    2.8% |      2.62 | `function &B::Call, b`
|               11.10 |       90,091,261.95 |    2.2% |      3.12 | `boost function &Base::Call, base`
|                7.88 |      126,858,949.14 |    3.2% |      2.23 | `boost function &A::Call, a`
|                7.36 |      135,790,316.78 |    2.7% |      2.06 | `boost function &B::Call, b`
|               10.48 |       95,408,295.17 |    2.1% |      2.92 | `function pointer direct base, &Base::Call `
|                6.71 |      149,076,013.32 |    2.8% |      1.89 | `function pointer direct a, &A::Call`
|                6.92 |      144,541,102.14 |    3.0% |      1.96 | `function pointer direct b, &B::Call`
|                5.89 |      169,756,950.49 |    3.5% |      1.65 | `Sean::function a|b, &A|B::Call`
|                2.81 |      356,323,824.69 |    3.3% |      1.09 | `Sean::function a, &A::Call`
|                2.72 |      366,998,189.34 |    3.1% |      1.10 | `Sean::function b, &B::Call`
|                5.71 |      175,196,322.94 |    3.1% |      1.61 | `Sean::function direct a|b, &A|B::Call`
|                2.81 |      355,563,444.66 |    3.8% |      1.10 | `Sean::function direct a, &A::Call`
|                2.71 |      368,476,521.26 |    3.2% |      1.10 | `Sean::function direct b, &B::Call`
|                6.95 |      143,792,394.89 |    3.1% |      1.96 | `Sean::function a|b, &A|B::CallParams`
|                2.87 |      347,939,831.24 |    3.7% |      1.09 | `Sean::function a, &A::CallParams`
|                2.87 |      348,645,958.79 |    3.0% |      1.10 | `Sean::function b, &B::CallParams`
|                7.01 |      142,639,204.54 |    2.9% |      1.96 | `Sean::function direct a|b, &A|B::CallParams`
|                2.88 |      347,443,044.31 |    2.9% |      1.10 | `Sean::function direct a, &A::CallParams`
|                2.89 |      346,163,931.96 |    3.5% |      1.09 | `Sean::function direct b, &B::CallParams`
|                7.04 |      142,023,712.12 |    3.1% |      1.97 | `template sean::function a|b, &A|B::CallParams`
|                2.86 |      349,374,179.85 |    2.9% |      1.10 | `template sean::function a, &A::CallParams`
|                2.87 |      348,555,217.87 |    3.8% |      1.10 | `template sean::function b, &B::CallParams`
|                7.36 |      135,807,392.68 |    2.8% |      2.07 | `template check sean::function a|b, &A|B::CallParams`
|                3.07 |      325,381,962.27 |    3.1% |      1.09 | `template check sean::function a, &A::CallParams`
|                2.98 |      336,103,318.78 |    3.6% |      1.09 | `template check sean::function b, &B::CallParams`

MinSizeRel

-Os -DNDEBUG -std=gnu++20

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|               10.60 |       94,344,931.82 |    2.8% |      2.97 | `base->Call()`
|                5.97 |      167,477,592.03 |    3.4% |      1.68 | `a->Call()`
|                5.41 |      184,901,294.35 |    3.0% |      1.53 | `b->Call()`
|               13.68 |       73,124,846.95 |    2.7% |      3.82 | `function &Base::Call, base`
|                9.46 |      105,669,404.37 |    3.2% |      2.64 | `function &A::Call, a`
|                9.70 |      103,085,126.35 |    3.1% |      2.67 | `function &B::Call, b`
|               14.48 |       69,048,782.30 |    3.1% |      4.05 | `boost function &Base::Call, base`
|                8.19 |      122,141,120.01 |    3.2% |      2.33 | `boost function &A::Call, a`
|                8.29 |      120,667,180.75 |    3.7% |      2.35 | `boost function &B::Call, b`
|               11.60 |       86,224,941.90 |    3.6% |      3.26 | `function pointer direct base, &Base::Call `
|                6.92 |      144,414,111.03 |    2.6% |      1.94 | `function pointer direct a, &A::Call`
|                6.68 |      149,690,679.93 |    2.2% |      1.84 | `function pointer direct b, &B::Call`
|                8.03 |      124,592,437.34 |    2.5% |      2.24 | `Sean::function a|b, &A|B::Call`
|                5.30 |      188,772,307.34 |    3.1% |      1.48 | `Sean::function a, &A::Call`
|                5.27 |      189,694,989.33 |    4.4% |      1.47 | `Sean::function b, &B::Call`
|                7.28 |      137,307,480.61 |    3.4% |      2.03 | `Sean::function direct a|b, &A|B::Call`
|                5.10 |      195,962,106.63 |    3.2% |      1.43 | `Sean::function direct a, &A::Call`
|                5.19 |      192,672,407.85 |    3.1% |      1.45 | `Sean::function direct b, &B::Call`
|                7.34 |      136,232,012.08 |    3.0% |      2.04 | `Sean::function a|b, &A|B::CallParams`
|                5.21 |      191,782,099.69 |    2.7% |      1.45 | `Sean::function a, &A::CallParams`
|                5.27 |      189,619,729.88 |    2.7% |      1.48 | `Sean::function b, &B::CallParams`
|                7.33 |      136,398,281.55 |    3.9% |      2.05 | `Sean::function direct a|b, &A|B::CallParams`
|                5.21 |      192,095,062.80 |    2.7% |      1.45 | `Sean::function direct a, &A::CallParams`
|                5.32 |      188,061,229.03 |    2.6% |      1.47 | `Sean::function direct b, &B::CallParams`
|                7.77 |      128,682,149.45 |    2.6% |      2.17 | `template sean::function a|b, &A|B::CallParams`
|                5.21 |      191,782,392.03 |    2.7% |      1.47 | `template sean::function a, &A::CallParams`
|                5.33 |      187,496,921.86 |    3.6% |      1.51 | `template sean::function b, &B::CallParams`
|                8.90 |      112,357,224.57 |    2.7% |      2.50 | `template check sean::function a|b, &A|B::CallParams`
|                6.01 |      166,347,689.09 |    2.8% |      1.69 | `template check sean::function a, &A::CallParams`
|                5.80 |      172,298,426.51 |    3.3% |      1.64 | `template check sean::function b, &B::CallParams`

RelWithDebInfo

-O2 -DNDEBUG -std=gnu++20

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|                9.03 |      110,738,754.20 |    3.0% |      2.54 | `base->Call()`
|                5.28 |      189,366,926.98 |    3.7% |      1.50 | `a->Call()`
|                4.83 |      207,105,530.39 |    2.4% |      1.36 | `b->Call()`
|               11.85 |       84,358,606.01 |    2.2% |      3.33 | `function &Base::Call, base`
|                9.00 |      111,051,549.44 |    2.7% |      2.53 | `function &A::Call, a`
|                9.00 |      111,074,910.28 |    2.3% |      2.53 | `function &B::Call, b`
|               11.37 |       87,932,375.07 |    2.4% |      3.19 | `boost function &Base::Call, base`
|                7.31 |      136,868,309.00 |    2.3% |      2.05 | `boost function &A::Call, a`
|                7.33 |      136,417,043.66 |    2.9% |      2.08 | `boost function &B::Call, b`
|               10.21 |       97,977,654.63 |    2.3% |      2.85 | `function pointer direct base, &Base::Call `
|                6.63 |      150,734,756.40 |    2.6% |      1.87 | `function pointer direct a, &A::Call`
|                6.88 |      145,364,737.50 |    3.1% |      1.93 | `function pointer direct b, &B::Call`
|                6.19 |      161,458,499.84 |    3.0% |      1.73 | `Sean::function a|b, &A|B::Call`
|                2.89 |      346,152,377.07 |    2.7% |      1.09 | `Sean::function a, &A::Call`
|                2.93 |      341,304,113.06 |    2.7% |      1.10 | `Sean::function b, &B::Call`
|                6.00 |      166,796,329.13 |    3.0% |      1.69 | `Sean::function direct a|b, &A|B::Call`
|                2.90 |      344,292,303.07 |    2.6% |      1.07 | `Sean::function direct a, &A::Call`
|                2.90 |      345,391,845.52 |    2.3% |      1.09 | `Sean::function direct b, &B::Call`
|                7.22 |      138,497,211.74 |    1.8% |      2.00 | `Sean::function a|b, &A|B::CallParams`
|                2.90 |      344,818,582.23 |    2.6% |      1.09 | `Sean::function a, &A::CallParams`
|                3.02 |      331,334,448.80 |    2.5% |      1.12 | `Sean::function b, &B::CallParams`
|                7.34 |      136,228,235.17 |    2.0% |      2.04 | `Sean::function direct a|b, &A|B::CallParams`
|                3.05 |      327,705,146.85 |    3.0% |      1.11 | `Sean::function direct a, &A::CallParams`
|                3.00 |      332,837,039.23 |    2.0% |      1.10 | `Sean::function direct b, &B::CallParams`
|                7.38 |      135,533,804.72 |    1.9% |      2.06 | `template sean::function a|b, &A|B::CallParams`
|                3.03 |      330,020,313.85 |    1.8% |      1.11 | `template sean::function a, &A::CallParams`
|                2.97 |      336,151,301.10 |    2.6% |      1.12 | `template sean::function b, &B::CallParams`
|                7.62 |      131,292,731.68 |    2.6% |      2.11 | `template check sean::function a|b, &A|B::CallParams`
|                3.09 |      323,415,838.11 |    3.0% |      1.09 | `template check sean::function a, &A::CallParams`
|                3.11 |      321,256,824.05 |    3.3% |      1.13 | `template check sean::function b, &B::CallParams`

I ran these tests because I use std::function a fair bit and was curious about its performance overhead. I had always assumed it did some compile-time switching that would make a call through it cost nearly the same as a regular call, but quite obviously that isn't what is happening. This might be because bind gets in the way and adds some remapping overhead. But for functions that are mapped 1:1, I would expect the implementation to end up close to what I have done.
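The benchmark source isn't shown here, so exactly how the `function &A::Call, a` cases were constructed is an assumption. As a rough illustration of where bind might enter the picture, here are three common ways to wrap a member call in std::function. Widget and DemoStdFunction are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for class A, kept non-virtual for brevity.
struct Widget
{
    int calls = 0;
    void Call() { calls++; }
};

int DemoStdFunction()
{
    Widget w;

    // 1. std::bind packs the member pointer and the object pointer together.
    std::function<void()> viaBind = std::bind(&Widget::Call, &w);

    // 2. A lambda that captures the object and calls the member directly.
    std::function<void()> viaLambda = [&w] { w.Call(); };

    // 3. The member pointer stored on its own; the object is passed per call.
    std::function<void(Widget&)> viaMemPtr = &Widget::Call;

    viaBind();
    viaLambda();
    viaMemPtr(w);

    return w.calls; // each wrapper invoked Call() once
}
```

In all three cases the call goes through std::function's type erasure, which is typically an indirect call that the optimizer cannot always see through, so some overhead compared with a direct member call is to be expected regardless of whether bind is involved.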

My implementation is pretty inflexible (it requires a 1:1 binding) and isn't portable. It depends on the vtable being implemented in a particular way. It also isn't type safe, but you could add a layer of type safety on top without incurring overhead.

Okay, the above tests are invalid. My function pointers weren't bound to the correct functions, so the benchmarks ran different workloads. I'm glad this was the case, because otherwise I wouldn't have a good explanation for the significant difference in performance.

Here are the corrected test results. I changed the code to calculate the function's location from &X::function, where function is virtual. GCC calculates this for you, but clang doesn't. They both have the same vtable layout.

Interestingly, my implementation sits roughly midway between direct member calls and the std and boost functions.

[image.png: corrected test results]
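The comments above don't say exactly how the function pointers ended up bound to the wrong functions, so the following is only background on why taking the address of a virtual member needs care. In standard C++, a pointer to a virtual member function does not name one fixed function body: calling through it still performs virtual dispatch, while a qualified call selects a specific implementation. Shape, Circle and DemoDispatch are hypothetical names for this sketch.

```cpp
#include <cassert>

// Hypothetical two-level hierarchy, not the classes from the paste.
struct Shape
{
    virtual ~Shape() = default;
    virtual int Id() { return 1; }
};

struct Circle : Shape
{
    int Id() override { return 2; }
};

int DemoDispatch()
{
    Circle c;
    Shape* s = &c;

    // Pointer to a virtual member: the target is resolved at call time
    // from the dynamic type of the object, so this reaches Circle::Id.
    int (Shape::*pmf)() = &Shape::Id;
    int viaPointer = (s->*pmf)();

    // Qualified call: no virtual dispatch, Shape::Id runs directly.
    int viaQualified = s->Shape::Id();

    assert(viaPointer == 2);
    assert(viaQualified == 1);
    return viaPointer * 10 + viaQualified;
}
```

Any scheme that turns &X::function into a raw code address therefore has to decide which override it means, which is one place a benchmark like this could end up measuring different workloads.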