Issue about unsigned integer overflow and underflow

Posted on 2022-01-30 Edited on 2025-10-06

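I wanted to produce something before the Lunar New Year; it has been too long since I last wrote a post
I ran into the problems caused by unsigned integer overflow and underflow, and I think Rust's answer is really good: the risky arithmetic can be flagged before the program ever runs, here by a Clippy lint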

#![deny(clippy::integer_arithmetic)]

use std::env;
const PAGE_SIZE: u64 = 4096;
fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    let size: u64 = args[0].parse().unwrap();
    println!("({} - 2 - {}) => {}", PAGE_SIZE, size, PAGE_SIZE - 2 - size);
}

Then install Clippy as a cargo subcommand and run it

$ cargo clippy
error: integer arithmetic detected
 --> src/main.rs:8:54
  |
8 |     println!("({} - 2 - {}) => {}", PAGE_SIZE, size, PAGE_SIZE - 2 - size);
  |                                                      ^^^^^^^^^^^^^^^^^^^^
  |

Handling this correctly looks roughly like this

fn foo(len: u64, size: u64) {
  match (PAGE_SIZE - 2).checked_sub(size) {
      Some(capacity) if len > capacity => {
          println!("no capacity left");
      }
      Some(capacity) => {
          println!("sufficient capacity {}", capacity);
      }
      None => {
          println!("underflow! bad user input!");
      }
  }
}

The equivalent in C

#include <stdio.h>

#define IS_UINT_SUB_UNDERFLOW(x, y) ((x) - (y) > (x))
#define IS_UINT_ADD_OVERFLOW(x, y) ((x) + (y) < (x))
#define IS_UINT_MUL_OVERFLOW(x, y, size_max) ((x) && (y) > (size_max) / (x))
#define PAGE_SIZE 4096u

void foo(unsigned int len, unsigned int size) {
  if (IS_UINT_SUB_UNDERFLOW(PAGE_SIZE - 2, size)) {
    printf("underflow! bad user input!\n");
  } else {
    unsigned int capacity = PAGE_SIZE - 2 - size;
    if (len > capacity) {
      printf("no capacity left\n");
    } else {
      printf("sufficient capacity %u\n", capacity);
    }
  }
}

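Note that these macros rely on modular (wrap-around) arithmetic rather than saturation arithmetic. The C standard guarantees wrap-around for unsigned int and wider unsigned types, so the checks hold for those operands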
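For comparison, the same wrap-around checks can be packaged as small C++ functions that return std::optional, which mirrors the shape of Rust's checked_sub. This is only a sketch: checked_sub, checked_add, remaining_capacity and kPageSize are hypothetical names of my own, not from any library, and C++17 is assumed.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: returns a - b, or std::nullopt if the result would underflow.
constexpr std::optional<std::uint64_t> checked_sub(std::uint64_t a, std::uint64_t b) {
    if (b > a) {
        return std::nullopt;
    }
    return a - b;
}

// Hypothetical helper: returns a + b, or std::nullopt if the result would overflow.
constexpr std::optional<std::uint64_t> checked_add(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t sum = a + b;  // unsigned addition wraps, which is well defined
    if (sum < a) {
        return std::nullopt;
    }
    return sum;
}

constexpr std::uint64_t kPageSize = 4096;

// Returns the capacity left after reserving 2 bytes and `size` bytes,
// or std::nullopt when `size` is too large (bad user input).
constexpr std::optional<std::uint64_t> remaining_capacity(std::uint64_t size) {
    return checked_sub(kPageSize - 2, size);
}
```

The caller is forced to look at has_value() before using the result, which is the part of the Rust version that the C macros cannot enforce.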

Otherwise there are third-party libraries such as Integers, but they are not as easy to use as Rust

Reference

– Rust: detect unsigned integer underflow

Reflection in C++

Posted on 2021-11-20 Edited on 2025-10-06

Why Reflection

Sometimes we need to iterate over the members of a struct/class; the most common uses are print/serialization/deserialization

struct obj {
    int a;
};

void print(const obj& o)
{
    printf("%d\n", o.a);
}

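This approach is straightforward, but it has a few problems

Whenever the structure changes, your implementation has to change with it
Every new structure we want to support needs another overloaded function

Sometimes we also need the struct's field names, for example to print the name of each field. The code the compiler emits carries no field information for a struct/class, so we end up hard-coding it by hand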

void print(const obj& o)
{
    printf("a: %d\n", o.a);
}

If we rename a to a1, the code again has to be maintained by hand. So is there a better option?

Compiler dependent solution

clang's __builtin_dump_struct
It only supports dumping, nothing else, and only clang has it

#include <stdio.h>

struct obj1 {
    int a;
    int b;
};

int main() {
    struct obj1 o = { .a=1, .b=2 };
    __builtin_dump_struct(&o, &printf);
    return 0;
}

Wrong Idea

The most intuitive idea is of course to write it like this

template <typename T>
void print(const T& o)
{
    for (auto& field : { field of o })
        std::cout << field << "\n";
}

But as everyone knows, a for loop cannot iterate over the fields of an object like this
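What that loop is reaching for can be approximated in plain C++17 if the type itself exposes its fields as a tuple. The sketch below assumes a hand-written member function as_tuple() (a hypothetical convention, not a standard facility) and a helper I named print_fields, so it still has to be updated whenever the struct changes.

```cpp
#include <sstream>
#include <string>
#include <tuple>

struct obj {
    int a;
    double b;

    // Hypothetical convention: list every field once, by reference.
    auto as_tuple() const { return std::tie(a, b); }
};

// Writes each field on its own line by expanding the tuple with a fold expression.
template <typename T>
std::string print_fields(const T& o) {
    std::ostringstream out;
    std::apply([&out](const auto&... field) { ((out << field << "\n"), ...); },
               o.as_tuple());
    return out.str();
}
```

The "loop" here happens at compile time through pack expansion, which is why an ordinary runtime for loop cannot express it.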

Boost pfr to the rescue

When one road is blocked, take another: countless clever people have come up with ways around this, and the best known is boost pfr

#include <boost/pfr/core.hpp>
#include <iostream>
template <typename T>
void print(const T& o)
{
    boost::pfr::for_each_field(o, [&](const auto& v) {
        std::cout << v << "\n";
    });
}

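But this approach has its limitations too

It adds a dependency on boost pfr
It only works on aggregate types
It does not solve the field name problem

nameof

A library borrowed from C#
Usage looks roughly like this

NAMEOF(somevar) -> "somevar"
NAMEOF(person.address.zip_code) -> "zip_code"

It works reasonably well for a single variable, but it is still powerless for the field names inside a struct/class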

Macro based Solution

Taking Boost Hana as an example

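#include <boost/hana.hpp>
#include <boost/json.hpp>
#include <cstdint>
#include <string>

struct OrderedItem {
    BOOST_HANA_DEFINE_STRUCT(
            OrderedItem,
            (std::string, item_name),
            (int64_t, quantity),
            (int64_t, price_cents)
    );
};

// Declared up front because FormatStructure recurses into it for every member
template<typename T>
boost::json::value FormatObject(const T &t);

template<typename T>
boost::json::value FormatStructure(const T &t) {
    boost::json::object result;
    // Visit each (name, member) pair that BOOST_HANA_DEFINE_STRUCT recorded
    boost::hana::for_each(t, boost::hana::fuse([&result](auto name, auto member) {
        result.emplace(boost::hana::to<const char *>(name), FormatObject(member));
    }));
    return result;
}

template<typename T>
boost::json::value FormatObject(const T &t) {
    if constexpr (boost::hana::Struct<T>::value) {
        return FormatStructure(t);
    } else {
        // FormatValue handles leaf values such as strings and integers (not shown in this excerpt)
        return FormatValue(t);
    }
}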

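Just from reading the code you can guess that BOOST_HANA_DEFINE_STRUCT does a lot of work: besides the original field declarations, it also keeps track of each field's name
Macros are black magic and a pain to maintain, but for now there is no better way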
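To give a feel for what such a macro has to do, here is a much smaller sketch that uses the classic X-macro trick: one field list drives both the declarations and the names. It is not how Boost Hana is implemented; ORDERED_ITEM_FIELDS, for_each_field and dump are hypothetical names made up for this illustration.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical X-macro: the single source of truth for the field list.
#define ORDERED_ITEM_FIELDS(X)      \
    X(std::string, item_name)       \
    X(std::int64_t, quantity)       \
    X(std::int64_t, price_cents)

struct OrderedItem {
#define DECLARE_FIELD(type, name) type name;
    ORDERED_ITEM_FIELDS(DECLARE_FIELD)
#undef DECLARE_FIELD
};

// Calls f(field_name, field_value) once per field, in declaration order.
template <typename F>
void for_each_field(const OrderedItem& item, F&& f) {
#define VISIT_FIELD(type, name) f(#name, item.name);
    ORDERED_ITEM_FIELDS(VISIT_FIELD)
#undef VISIT_FIELD
}

// Formats every field as "name=value;" without naming any field explicitly.
std::string dump(const OrderedItem& item) {
    std::ostringstream out;
    for_each_field(item, [&out](const char* name, const auto& value) {
        out << name << "=" << value << ";";
    });
    return out.str();
}
```

Renaming a field in ORDERED_ITEM_FIELDS updates both the member and the printed name, which is exactly the maintenance problem from the earlier sections, at the cost of the macro noise complained about above.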

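Runtime Reflection

Everything above is compile-time reflection. There is also a school that does reflection at runtime, which can ignore compiler differences and offer more metadata than the compiler does, but all of it has to be registered by hand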
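A minimal sketch of what "registered by hand" means in practice, assuming a made-up registry keyed by type name. FieldInfo, registry, register_obj and dump are hypothetical names, and only int fields are handled to keep it short; real runtime reflection libraries are far more general.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct obj {
    int a;
    int b;
};

// One manually registered field: its name plus a getter that works on a type-erased pointer.
struct FieldInfo {
    std::string name;
    std::function<int(const void*)> get;
};

// Hypothetical runtime registry, keyed by type name. Everything in it is filled in by hand.
std::map<std::string, std::vector<FieldInfo>>& registry() {
    static std::map<std::string, std::vector<FieldInfo>> instance;
    return instance;
}

// The manual step: someone has to write (or generate with macros) one entry per field.
void register_obj() {
    std::vector<FieldInfo> fields{
        {"a", [](const void* p) { return static_cast<const obj*>(p)->a; }},
        {"b", [](const void* p) { return static_cast<const obj*>(p)->b; }},
    };
    registry()["obj"] = std::move(fields);
}

// Looks the fields up by type name at runtime and formats them as "name: value" lines.
std::string dump(const std::string& type_name, const void* instance) {
    std::string out;
    for (const FieldInfo& field : registry()[type_name]) {
        out += field.name + ": " + std::to_string(field.get(instance)) + "\n";
    }
    return out;
}
```

Because the lookup is by string at runtime, the metadata can be richer than what the compiler keeps, but nothing checks that the registration still matches the struct.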

Whether it is compile-time reflection or runtime reflection, neither can break free of macros and templates

Future

There is an experimental Reflection TS

struct S {
    int b;
    std::string s;
    std::vector<std::string> v;
};
 
// Reflection TS
#include <experimental/reflect>
using meta_S = reflexpr(S);
using mem = std::experimental::reflect::get_data_members_t<meta_S>;
using first_member = std::experimental::reflect::get_element_t<0, mem>;

But its future is uncertain; it may well be torn down and redone like the Networking TS. C++23 is out of the question

Structure of Array in C++

Posted on 2021-10-25 Edited on 2025-10-06

What is array of structure

This is the pattern we normally use

struct Obj {
	int a, b;
};
std::array<Obj, 100> objs;

What is structure of array

It is the opposite idea: each member of the object is grouped together across all objects. For the example above, it can be written like this

std::tuple<std::array<int, 100>, std::array<int, 100>> objs;

Why structure of array

Judging from the two forms above, array of structure is more natural and easier to understand
So why does structure of array exist? It is all about performance
Take code like this

int sum = 0;
for (auto v : objs)
	sum += v.a;

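Because of how CPU locality works, consecutive a values are sizeof(Obj) bytes apart, so every cache line that is fetched also carries b values the loop never reads, and the CPU cache is used poorly
But if we write it like this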

int sum = 0;
auto &as = std::get<0>(objs);
for (auto v : as)
	sum += v;

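Since std::array<int, 100> is a contiguous memory area, it has better CPU locality than the version above
But you cannot have it both ways
The drawbacks of structure of array include

The code is harder to read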
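Putting the two layouts side by side, as a sketch: the data is the same, only the grouping differs. make_objs, to_soa and the two sum_a overloads are hypothetical helpers written for this comparison, and the sample values are arbitrary.

```cpp
#include <array>
#include <cstddef>
#include <tuple>

constexpr std::size_t kCount = 100;

struct Obj {
    int a, b;
};

using ArrayOfStructs = std::array<Obj, kCount>;
using StructOfArrays = std::tuple<std::array<int, kCount>, std::array<int, kCount>>;

// Builds sample data where a = i and b = 2 * i (arbitrary values for the demo).
ArrayOfStructs make_objs() {
    ArrayOfStructs objs{};
    for (std::size_t i = 0; i < kCount; ++i) {
        objs[i] = Obj{static_cast<int>(i), static_cast<int>(2 * i)};
    }
    return objs;
}

// Regroups the same data so that all a values, and all b values, are contiguous.
StructOfArrays to_soa(const ArrayOfStructs& objs) {
    StructOfArrays soa{};
    for (std::size_t i = 0; i < kCount; ++i) {
        std::get<0>(soa)[i] = objs[i].a;
        std::get<1>(soa)[i] = objs[i].b;
    }
    return soa;
}

// AoS: every step strides over the unused b field.
int sum_a(const ArrayOfStructs& objs) {
    int sum = 0;
    for (const Obj& v : objs) {
        sum += v.a;
    }
    return sum;
}

// SoA: walks one dense array of int.
int sum_a(const StructOfArrays& soa) {
    int sum = 0;
    for (int v : std::get<0>(soa)) {
        sum += v;
    }
    return sum;
}
```

Both overloads return the same sum; whether the SoA version is measurably faster depends on the element size and the access pattern, so it is worth benchmarking before committing to the less readable layout.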

How to use struct of array in C++

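Since C++ has no native SOA support, there are third-party libraries to use

struct_array
It is quite awkward to use, though; unless performance really is everything, stick with the original form
ahsohtoa - automatic AoS-to-SoA
This one is nice, but it needs Boost PFR and C++20
Boost Describe based Solution
It does not need C++20, but it still requires Boost Describe

Still, when will C++ reflection finally land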

Projection in C++

Posted on 2021-09-30 Edited on 2025-10-06

I have read about this before, but I forget whatever I only read, so I had better write it down

What’s projection

Let's start with an example that has been written to death

struct Person {
	std::string name;
	int age;
};
std::vector<Person> persons;
std::sort(begin(persons), end(persons), [](const auto& p1, const auto& p2) {
	return p1.name < p2.name;
});

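I am sure everyone is sick of writing code like this
With C++20 Ranges it can be written like this

std::ranges::sort(persons, std::ranges::less{}, &Person::name);

It is now obvious that what we compare is name, and this style is called projection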
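To spell out what the third argument does, here is a C++17 sketch of a hypothetical helper, sort_by, that applies the projection to both sides before comparing. It is not part of the standard library, and sample_persons is made-up data for the illustration.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

struct Person {
    std::string name;
    int age;
};

// Hypothetical helper: sorts by the value that proj extracts from each element.
// proj can be a pointer to member, a lambda, or any other callable.
template <typename Range, typename Projection>
void sort_by(Range& range, Projection proj) {
    std::sort(std::begin(range), std::end(range),
              [&proj](const auto& lhs, const auto& rhs) {
                  return std::invoke(proj, lhs) < std::invoke(proj, rhs);
              });
}

// Example data (made up for illustration).
std::vector<Person> sample_persons() {
    return {{"Carol", 41}, {"Alice", 30}, {"Bob", 25}};
}
```

With this, sort_by(persons, &Person::name) and sort_by(persons, &Person::age) differ only in the projection, while the comparison itself stays the same.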

Backport to C++17

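Backporting to even earlier versions is possible too, as long as you have an invoke, either third-party or hand-written
Then write a projecting_fn functor that composes the operations below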

#include <functional>
#include <utility>

template <typename Function, typename Projection>
class projecting_fn {
public:
    projecting_fn(Function function, Projection projection)
        : m_function{ std::move(function) }
        , m_projection{ std::move(projection) }
    {
    }

    template <typename... Args>
    decltype(auto) operator() (Args&&... args) const
    {
        return std::invoke(
            m_function,
            std::invoke(m_projection, std::forward<decltype(args)>(args))...);
    }

private:
    Function m_function;
    Projection m_projection;
};
std::sort(begin(persons), end(persons),
     projecting_fn{ std::less{}, &Person::name });

Projection and view filter

Source code like this does not compile

template<typename T>
struct LessThan
{
    bool operator()(const T& x){
        return x < value;
    }
    T value;
};

struct Apple
{
    int weight;
    int size;
};

int main()
{
    auto apples = std::vector<Apple>{{1,2}, {2,3}, {3,4}};

    auto smallApples = apples | views::filter(LessThan{3}, &Apple::size);
}

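There are two ways to solve this

Not using a projection is also a solution

apples | views::filter([](Apple& a) { return a.size < 3; })

But that approach is beside the point of this post

Boost HOF

apples | views::filter(hof::proj(&Apple::size, LessThan{3}));

This approach is similar to the C++17 projecting_fn above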

Cross Platform Testing on single machine

Posted on 2021-09-12 Edited on 2025-10-06

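When writing network programs, big/little endian conversion is a problem you cannot escape, but big endian machines are hard to find

The material I found was dated, too. Recently, though, I read Test cross-architecture without leaving home and was amazed, so I wrote a template
It saves a lot of trouble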

Introduction to C++20 Module

Posted on 2021-08-02 Edited on 2025-10-06 In programming

The simplest module

Let's start with an example, the module version of Hello World

export module hello;
export void hello_world() {}
void non_export_func() {}

The consumer of the module is written like this

import hello;
int main()
{
    hello_world(); // OK
    non_export_func(); // Cannot compile
    return 0;
}

Description

In this example, the consumer side needs no special explanation
What needs explaining is how to write a module

Module Unit

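C++20 introduces a new kind of translation unit, the module unit. Every module unit has a top-level module declaration containing the module keyword
Whether module is preceded by export determines which kind of module unit it is

With export, it is a Module Interface Unit
Without export, it is a Module Implementation Unit

Module Implementation Units are covered later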

The content of a module

A module has

One or more Module Interface Units
Zero or more Module Implementation Units

And every module has exactly one Primary Module Interface Unit

The Hello World example of course contains only the Primary Module Interface Unit. What a Primary Module Interface Unit is will also be explained later

export

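In the example above we defined two functions

export void hello_world() {}
void non_export_func() {}

This differs from the traditional header file approach: with a header, both functions would be visible to the outside world, whereas in a module unit only exported symbols are visible to the consumer
export can also be used in these other ways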

// export entire namespace
export namespace hello {}

// export the symbols in the block
export {
	int e = 1;
	void test() {}
}

Module Implementation Unit

Just as with the traditional header/implementation split, we can separate declaration from implementation, which is why the Module Implementation Unit exists
Let's rewrite our example and move the implementation out
Our Module Interface Unit becomes

export module hello;
export void hello_world();
void non_export_func();

And the Module Implementation Unit is

module hello;
void hello_world() {}
void non_export_func() {}

As said before, a module declaration without export in front marks a Module Implementation Unit, and the function implementations carry no export either, which looks much like the traditional approach

My thought on Module Implementation Unit

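One long-standing complaint about the declaration/implementation split is that you have to maintain two copies of the same state. If the declaration changes and the implementation does not, the consequences are unpredictable: with luck it fails to compile, without luck it produces a deep bug that is even harder to track down

As said before, a module does not need to have a Module Implementation Unit
So why does it exist at all?

I think it is there as a way to migrate existing source code to C++ modules
Just like the header-only libraries that are popular today, future modules should consist of Module Interface Units only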

Import other module

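When writing a module you will inevitably use other modules, so let's define a new one

export module world;
export struct obj {};

And our hello module becomes

export module hello;
import world;
export void hello_world(obj o) {}

Note that import must appear after the top-level module declaration (and before any other declarations); the order cannot be swapped

Now back to the consumer side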

Visibility control

At this point our consumer looks like this

import hello;
import world;
int main()
{
        obj o;
        hello_world(o);
        return 0;
}

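The point to note here is that although the hello module imports world, it does not re-export world's symbols through the hello module's metadata
So if the consumer leaves out import world, obj cannot be found

But if we change hello to this

export module hello;
export import world;
export void hello_world(obj o) {}

Here we export the imported module again, which is also the basis for splitting a module into smaller pieces
Then the consumer works fine even without import world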

Divide module into small parts

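Once a module grows large, it needs to be split into smaller blocks to keep complexity down, and there are two ways to do that

Submodule

We split hello_world into two functions
One goes into hello.sub_a and the other into hello.sub_b
Straight to the code

export module hello.sub_a;
export void hello() {}

The other one is omitted; here is the definition of our hello module

export module hello;
export import hello.sub_a;
export import hello.sub_b;

This re-exports the exported symbols of hello.sub_a and hello.sub_b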

Note

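hello.sub_a and hello.sub_b are each complete, independent modules; the submodule mechanism is only a logical grouping that makes them look like one module
So a consumer can also be written like this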

import hello.sub_a;
import hello.sub_b;
int main()
{
        hello();
        world();
        return 0;
}

Module partition

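Unlike submodules, the partitions of a module cannot exist on their own
Again, straight to the code

export module hello:part_a;
export void hello() {}

It looks much like the above, except that . became :
And our hello module is

export module hello;
export import :part_a;
export import :part_b;

A few points to note here

A module interface unit whose module name contains no : is the Primary Module Interface Unit. As stated earlier, a module has one or more Module Interface Units and exactly one Primary Module Interface Unit
This example has three Module Interface Units, and only hello is the Primary Module Interface Unit
hello.sub_a, by contrast, is an independent module that only looks like part of the same module logically
A partition can only be imported with the import :part_a syntax, from inside its own module; import hello:part_a is not valid
Consumers can only write import hello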

Global Module Fragment

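The Global Module Fragment is a space provided for the preprocessor: here you can define macros or include header files that have not been modularized. Symbols defined here are not exported through the module interface, so they do not pollute the global environment

The Global Module Fragment must come before export module, like this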

module;
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
#include <string>
#include <vector>
export module hello;

Write a synchronous API based on ASIO asynchronous operations

Posted on 2021-07-15 Edited on 2025-10-06

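Preface

Admittedly this is an odd requirement, since ASIO does provide a synchronous API. But some things can only be done with the asynchronous API
For example, I want to connect within five seconds and give up if the connection cannot be established in that time. With the synchronous API, the timeout is decided by the operating system
In that case you have to write it yourself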

use_future

ASIO has a feature that turns an async operation into a sync operation
Normally our code looks like this

socket.async_connect(endpoint, [](std::error_code ec) {
	// blablabla
});

But if we use use_future, ASIO converts it internally to the promise/future pattern
This suits scenarios that need thread synchronization
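Conceptually, a completion token like use_future swaps the callback for a promise whose future is handed back to the caller. The sketch below imitates that idea with the standard library only. TinyLoop, fake_async_connect and fake_async_connect_future are hypothetical stand-ins I made up for asio::io_context and async_connect; this is not how ASIO implements it internally.

```cpp
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <system_error>
#include <utility>

// Hypothetical stand-in for asio::io_context: queued handlers run inside run().
class TinyLoop {
public:
    void post(std::function<void()> task) { tasks_.push_back(std::move(task)); }

    void run() {
        while (!tasks_.empty()) {
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop_front();
            task();
        }
    }

private:
    std::deque<std::function<void()>> tasks_;
};

// Hypothetical callback-style operation; it always "succeeds" with an empty error_code.
void fake_async_connect(TinyLoop& loop, std::function<void(std::error_code)> handler) {
    loop.post([handler = std::move(handler)] { handler(std::error_code{}); });
}

// The adapter: the callback fulfils a promise, and the caller gets the matching future.
std::future<std::error_code> fake_async_connect_future(TinyLoop& loop) {
    auto promise = std::make_shared<std::promise<std::error_code>>();
    std::future<std::error_code> future = promise->get_future();
    fake_async_connect(loop, [promise](std::error_code ec) { promise->set_value(ec); });
    return future;
}
```

The future only becomes ready once something runs the loop, which is why the real code needs a thread calling run() while another thread waits on get().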

asio::io_context ctx;
asio::ip::tcp::socket socket(ctx);
auto future = socket.async_connect(endpoint, asio::use_future);
std::thread t([&] {
	ctx.run();
});
future.get();
t.join(); // join before t is destroyed, otherwise std::terminate is called

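Combine with C++20 Coroutine

If our conditions are more complex, such as the five-second timeout mentioned at the start, the code above is no longer enough
Writing it with the original function callback style would probably kill a lot of brain cells, while coroutines greatly reduce the mental burden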

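// Completion token that returns the error_code in a tuple instead of throwing
// (as defined in ASIO's own timeout examples; newer releases spell it asio::as_tuple)
constexpr auto use_nothrow_awaitable = asio::experimental::as_tuple(asio::use_awaitable);

asio::awaitable<void> timeout(std::chrono::seconds seconds)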
{
	asio::steady_timer timer(co_await asio::this_coro::executor);
	timer.expires_after(seconds);
	co_await timer.async_wait(use_nothrow_awaitable);
}

asio::awaitable<std::error_code> connect_with_timeout(
	asio::ip::tcp::socket& socket,
	const asio::ip::tcp::endpoint& endpoint)
{
	using namespace asio::experimental::awaitable_operators;
	auto result = co_await(
		socket.async_connect(endpoint, use_nothrow_awaitable) ||
		timeout(std::chrono::seconds(5))
	);
	if (result.index() == 1) {
		co_return asio::error::timed_out; // timed out
	}
	auto [r] = std::get<0>(result);
	co_return r;
}

asio::io_context io_context;
asio::ip::tcp::socket socket(io_context); // an lvalue that outlives the coroutine
auto connect_future = asio::co_spawn(
	io_context.get_executor(),
	connect_with_timeout(socket, endpoint),
	asio::use_future);
io_context.run();
return connect_future.get();

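As the code above shows
connect_with_timeout has two possible outcomes: the result of the socket connect, or the timeout
The last argument of asio::co_spawn is not the detached used in tutorials but the use_future discussed earlier
This way coroutines and promise/future can be used together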

eBPF and bcc

Posted on 2021-06-05 Edited on 2025-10-06

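There are already plenty of introductions to eBPF and bcc, so writing another one would be a waste of resources
I will just record the architecture and how to use it, to have the general idea first and study it in depth later if needed

The architecture of eBPF

A picture is worth a thousand words

What is bcc?

Writing eBPF directly is hard, so bcc provides a Python library that simplifies eBPF development
bcc also ships many applications that can be used as-is

Below is a diagram of the bcc tracing tools

Write a bcc program

Just a Hello World example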

#!/usr/bin/python3

from bcc import BPF
from bcc.utils import printb

# define BPF program
prog = """
int hello(void *ctx) {
    bpf_trace_printk("Hello, World!\\n");
    return 0;
}
"""

# load BPF program
b = BPF(text=prog)
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="hello")

# header
print("%-18s %-16s %-6s %s" % ("TIME(s)", "COMM", "PID", "MESSAGE"))

# format output
while 1:
    try:
        (task, pid, cpu, flags, ts, msg) = b.trace_fields()
    except ValueError:
        continue
    except KeyboardInterrupt:
        exit()
    printb(b"%-18.9f %-16s %-6d %s" % (ts, task, pid, msg))

Empty struct, empty_value and [[no_unique_address]]

Posted on 2021-06-01 Edited on 2025-10-06

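I hit some turbulence a while back and had to look for a new job. Now that things have settled, I can write again

Let's talk about the empty struct problem

Empty struct

struct empty {
};
printf("%zu\n", sizeof(struct empty)); // ????

The answer differs
In C it prints 0 (as a GCC extension; ISO C does not allow an empty struct)
In C++ it prints 1: to guarantee that distinct objects have distinct addresses, even an empty struct has a nonzero sizeof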

Embedded empty struct

What about this, then

struct empty {
};

struct non_empty {
        int v;
        struct empty e;
};
printf("%zu\n", sizeof(struct non_empty)); // ????

Likewise
In C it prints 4
In C++ it is of course not 4; on my 64-bit Ubuntu it prints 8

Why use empty struct

In C, an empty struct has no real use
But in the C++ world an empty struct can be a functor
such as std::less or std::equal_to
By making a class a template, a functor can be passed into it, which extends what the class can do

How to reduce the size

Since empty structs do have their uses and nobody wants to waste space on them, someone came up with this

struct non_empty : empty {
    int v;
};
printf("%zu\n", sizeof(struct non_empty)); // ????

Now it is 4, as we expected
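A sketch of why this matters for functors: the two hypothetical templates below hold the same comparator, once as a member and once as a base class. less_than, with_member and with_base are names made up for this illustration, and the exact size of the member version depends on the platform's alignment rules.

```cpp
// A stateless functor: an empty class with only a call operator.
struct less_than {
    bool operator()(int lhs, int rhs) const { return lhs < rhs; }
};

// Comparator stored as a member: the empty functor still needs its own address,
// so the struct grows beyond sizeof(int).
template <typename Compare>
struct with_member {
    int v;
    Compare cmp;
};

// Comparator as a base class: the empty base optimization lets it occupy no space.
template <typename Compare>
struct with_base : Compare {
    int v;
};

static_assert(sizeof(with_member<less_than>) > sizeof(int),
              "an empty member still takes up space");
static_assert(sizeof(with_base<less_than>) == sizeof(int),
              "an empty base adds nothing");
```

Note that with_base also inherits the call operator publicly, which leads straight into the interface problem discussed next.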

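The problem of Inheritance

Leak Interface

Inheritance is highly contagious: every public API of the parent class can be freely used through the child class, so code like this can be written

class empty {
public:
        int f() { return 42; }
};
class non_empty : public empty {
        int v;
};
non_empty obj;
obj.f();

But what if I do not want obj to call f directly? How can that be done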

class empty {
public:
        int f() { return 42; }
};
class X : empty {
public:
         empty& get() { return *this; };
};
class non_empty : public X {
        int v;
};
non_empty obj;
obj.get().f();

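Now f can only be used through get
This is also what boost empty_value does

Hard to reason sometimes

This is something I only came to appreciate after reading real code
The following is an excerpt from boost intrusive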

template<class ValueTraits, class VoidOrKeyOfValue, class VoidOrKeyHash, class VoidOrKeyEqual, class BucketTraits, class SizeType, std::size_t BoolFlags>
struct hashdata_internal
   : public hashtable_size_traits_wrapper
      < bucket_hash_equal_t
         < ValueTraits, VoidOrKeyOfValue, VoidOrKeyHash, VoidOrKeyEqual
         , BucketTraits
         , 0 != (BoolFlags & hash_bool_flags::cache_begin_pos)
         >   //2
      , SizeType
      , (BoolFlags & hash_bool_flags::incremental_pos) != 0
      >
{
   typedef hashtable_size_traits_wrapper
      < bucket_hash_equal_t
         < ValueTraits, VoidOrKeyOfValue, VoidOrKeyHash, VoidOrKeyEqual
         , BucketTraits
         , 0 != (BoolFlags & hash_bool_flags::cache_begin_pos)
         >   //2
      , SizeType
      , (BoolFlags & hash_bool_flags::incremental_pos) != 0
      > internal_type;;
};

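Depending on the configuration, hashtable_size_traits_wrapper may be an empty struct
In the excerpt above the same code appears twice; it is long, ugly and hard to follow

[[no_unique_address]]

C++20 introduced a very useful attribute that tells the compiler it does not need to give this object an address of its own, which opens up a lot of possibilities

No need for inheritance

What used to require inheritance can now be written like this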

class empty {
public:
        int f() { return 42; }
};

class non_empty {
        int v;
        [[no_unique_address]] empty e;
public:
        empty& get() { return e; }
};
non_empty obj;
obj.get().f();

Fix hard to reason issue

Taking the code above as an example

template<class ValueTraits, class VoidOrKeyOfValue, class VoidOrKeyHash, class VoidOrKeyEqual, class BucketTraits, class SizeType, std::size_t BoolFlags>
struct hashdata_internal
{
   typedef hashtable_size_traits_wrapper
      < bucket_hash_equal_t
         < ValueTraits, VoidOrKeyOfValue, VoidOrKeyHash, VoidOrKeyEqual
         , BucketTraits
         , 0 != (BoolFlags & hash_bool_flags::cache_begin_pos)
         >   //2
      , SizeType
      , (BoolFlags & hash_bool_flags::incremental_pos) != 0
      > internal_type;;
    [[no_unique_address]] internal_type size_traits_;
};

It is still long and ugly, but it is much improved

Trap on [[no_unique_address]]

What does the following code print

class empty {
};
class non_empty {
        int v;
        [[no_unique_address]] empty e;
        [[no_unique_address]] empty e1;
};
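printf("%zu\n", sizeof(non_empty));

The answer is of course not 4. e and e1 have the same type, and two objects of the same type must have distinct addresses, so only one of them can actually benefit from [[no_unique_address]]
Fixing this is easy, too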

template <int>
class empty {
};

class non_empty {
        int v;
        [[no_unique_address]] empty<0> e;
        [[no_unique_address]] empty<1> e1;
};

Another issue on [[no_unique_address]]

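Currently [[no_unique_address]] has no effect in MSVC, because honoring it would cause an ABI break; MSVC offers [[msvc::no_unique_address]] as an opt-in instead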

Introduction to asynchronous programming

Posted on 2021-03-17 Edited on 2025-10-06

Asynchronous programming

Why asynchronous programming

Asynchronous programming runs against the way humans think. Whatever the programming language, the most widely agreed-upon network programming model looks like this: one thread per connection

listen(socket_fd, 20);

/* Looooop */
while (1) {
  newsocket_fd = accept(socket_fd, 
                          (struct sockaddr *) &client_addr, 
                          &client_len);
  pthread_t thread;
  pthread_create(&thread, NULL, run_thread, (void *)(intptr_t) newsocket_fd);
  pthread_detach(thread); /* joining here would block accept() until this client finishes */
}

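This model solves 95% of problems, but the hardest part of life is always the "but": this programming model does not scale

C10K Problem (1999)

This is the famous C10K problem, and it is an operating system problem: the OS cannot have as many threads as there are connections, and even if it could, that would cost a large amount of memory and frequent context switches
When one road is blocked, take another: IO multiplexing appeared, better known as select/poll/epoll

The early stage of asynchronous programming

Early asynchronous programming, whether with libuv, asio or nodejs, required a callback as a parameter, and as the code grows it ends up as deeply nested callbacks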
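A minimal sketch of what that nesting tends to look like. The three async_* functions are hypothetical stand-ins, not the API of libuv, asio or nodejs, and to keep the sketch self-contained they invoke their callback immediately; a real library would call it later from its event loop.

```cpp
#include <functional>
#include <string>

// Hypothetical step 1: resolve a host name to an address.
void async_resolve(const std::string& host, std::function<void(std::string)> done) {
    done("ip-of-" + host);
}

// Hypothetical step 2: connect and hand back a fake socket handle.
void async_connect(const std::string& address, std::function<void(int)> done) {
    done(static_cast<int>(address.size()));
}

// Hypothetical step 3: read a payload from the socket.
void async_read(int socket, std::function<void(std::string)> done) {
    done("payload-from-" + std::to_string(socket));
}

// Three sequential steps already need three levels of nesting, and error handling
// (omitted here) would have to be repeated at every level.
void fetch(const std::string& host, std::function<void(std::string)> done) {
    async_resolve(host, [done](std::string address) {
        async_connect(address, [done](int socket) {
            async_read(socket, [done](std::string payload) {
                done(payload);
            });
        });
    });
}
```

Each additional step adds another level of indentation and another place where the final callback has to be forwarded by hand.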

The problem of callback

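Counterintuitive

Thread-based solutions are favored because people tend to think linearly, whereas a callback-based solution chops the steps into scattered fragments, which is painful to look at

Hard to write, easy to get wrong

If the task is simple enough that one or two levels of callbacks suffice, things are manageable. Once the logic reaches a certain complexity, the chance of getting it wrong becomes far too high

Source code is written for people to read, so we need a tool to manage complexity, and that tool is the coroutine

How system languages approach coroutines

C: not my problem, figure it out yourself
C++: as of 2021 there is still no standard network library, which feels rather behind
Rust: set its standard earlier than C++, but bet on the wrong horse; the standard is fixed and cannot be changed. More on the wrong bet later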

What is coroutine

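There is nothing new under the sun: coroutines were proposed back in 1963 and were remembered again some fifty years later
A coroutine has the following four operations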

Invoke
Return
Yield
Resume

Whereas the function we normally know only has

Invoke
Return

In other words, a function is just a special case of a coroutine

Another property of coroutines

Cooperative multitasking

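Again, there is nothing new under the sun
You may have heard that programs on Windows 3.1 often froze the system while Windows 95 did not; that is because cooperative multitasking was replaced with pre-emptive multitasking

The simplest example on coroutine

This example is not very useful, but it shows the essence of a coroutine: being able to yield and resume
A switch's case label can sit inside a for loop, but that is not the point of this post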

int counter(void) {
    static int i, state = 0;
    switch (state) {
        case 0: /* start of function */
        for (i = 0; i < 10; i++) {
            state = 1; /* so we will come back to "case 1" */
            return i;
            case 1:; /* resume control straight after the return */
        }
    }
    return -1; /* sequence exhausted */
}
int main()
{
    for (int i = 0; i < 10; i++)
        printf("%d\n", counter());
}

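The above is only a toy coroutine; the ones that are actually usable fall into several categories
As for how they are implemented, everyone has their own tricks

Two different models of Coroutine

Even coroutines can be divided into two kinds

Stackful Coroutine
Stackless Coroutine

As the names suggest, the difference lies in how the stack is handled

Stackless keeps its state on the heap, while stackful keeps it on a stack
The size of a stackless coroutine's state is allocated dynamically, while a stackful coroutine's stack has a fixed size
Stackless is essentially a state machine, while stackful is a user mode thread
So stackless has a lower runtime cost, and stackful is the opposite
Stackful can be combined with existing synchronous code; stackless cannot
Stackless needs compiler support; stackful can be done with just a library
Stackless is contagious, as you can see in JavaScript

async function func1() {
    await func2();
}

async/await come in pairs, and not using them that way is an error; stackful has no such restriction
Stackful code is easy to write; stackless demands a certain level of skill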
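To make the "stackless is essentially a state machine" point concrete, here is a sketch of the earlier counter rewritten as an explicit state object, roughly the shape a stackless coroutine is lowered into. CounterCoroutine and resume are hypothetical names for this illustration; a real C++20 coroutine would use a compiler-generated frame instead of a hand-written class.

```cpp
#include <optional>

// All state that must survive a suspension lives in the object, not on the call stack.
class CounterCoroutine {
public:
    explicit CounterCoroutine(int limit) : limit_(limit) {}

    // Runs until the next "yield". Returns the yielded value,
    // or std::nullopt once the coroutine has finished.
    std::optional<int> resume() {
        switch (state_) {
        case State::Start:
            i_ = 0;
            state_ = State::Running;
            break;
        case State::Running:
            ++i_;  // continue right after the previous yield
            break;
        case State::Done:
            return std::nullopt;
        }
        if (i_ >= limit_) {
            state_ = State::Done;
            return std::nullopt;
        }
        return i_;  // yield
    }

private:
    enum class State { Start, Running, Done };
    State state_ = State::Start;
    int i_ = 0;
    int limit_;
};
```

Unlike the static variables in the C version, each CounterCoroutine object carries its own state, so several of them can be suspended and resumed independently.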

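Picking sides

Because the two models differ so much, and because of language characteristics and historical reasons, different languages made different choices

Stackless: C# (the first mainstream language to use async/await), JavaScript, Python, C++, Rust, Kotlin (a JVM language, but it chose differently from Java)
Stackful: Golang (actually a variant of coroutines), Java (copying Golang's approach, not yet released at the time of writing), PHP (in the future)

Goroutine

As mentioned above, a goroutine is a variant of the stackful coroutine. The main differences are

Coroutines execute sequentially
Goroutines can execute in parallel on multiple CPUs
This creates a further divergence
Suppose we have coroutines A, B and C
C waits for data from B, and B waits for data from A
With traditional coroutines, A transfers to B when it finishes and B transfers to C; since they all run on the same CPU, the data needs no locking
With goroutines, A, B and C may run on different CPUs, so data has to be passed through channels
Golang implements a runtime that makes effective use of the CPUs and defines a coroutine as a lightweight thread, so the Golang runtime has to do part of the OS's job, such as scheduling coroutines
Goroutines are mandatory; even a hello world cannot avoid them
Goroutines are not fast, and the maximum number of network connections falls short of stackless coroutines (C++/Rust)
But the code is far easier to write, and that strength is the main reason goroutines have taken users away from PHP/Python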

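Betting on the wrong horse

Here is the story of how Rust bet on the wrong horse

There are two IO models

Just as there are two kinds of coroutines, there are two kinds of IO event loops

Proactor: the best known is Windows IOCP
Reactor: select/poll/epoll all belong here
Rust uses the epoll-based reactor model, but epoll is not the future of Linux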

CPU Spectre and Meltdown

Just like COVID-19, Spectre and Meltdown changed the direction of programming
Because of these CPU bugs, Linux changed course: io_uring is the future of Linux, and io_uring, like IOCP, is a proactor model

Influence

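Since the standard is set, it cannot be changed
If it is to change, there are only two options

Redo the standard as a v2. The first version alone took four years; this time should be quicker
The two models can be converted into each other, but with a performance loss, and how much is lost once the Spectre and Meltdown patches are applied is even harder to estimate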

Conclusion

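If you are among the 95%, you have no need for asynchronous programming at all; just use the thread model, which is also harder to get wrong
If you are unlucky enough to be in the 5%, consider golang first; however many flaws golang has, goroutines make up for them
golang is suited to writing network services, and only network services
If you are handling tens of millions of requests per second, sooner or later the bill comes due, and there is no escaping writing code in C/C++/Rust
Here is a real-world case
Why Discord is switching from Go to Rust
There is no best solution, only a suitable one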
