第十三號艦隊

A Real Usage for C++26 Reflection

Posted on 2025-08-10

C++26 Reflection has made it into the standard, so here is a practical example showing what it is good for.

Suppose we have a struct like this:

struct NetworkAddress {
        std::string ip;
        uint16_t port;
};

To use it with the std::format family of functions, we have to define a formatter ourselves:

template <>
struct std::formatter<NetworkAddress> : std::formatter<std::string_view> {
        auto format(const NetworkAddress& addr, std::format_context& ctx) const {
                return std::format_to(ctx.out(), "{{ip={},port={}}}", addr.ip, addr.port);
        }
};

After that it can be used directly:

NetworkAddress addr { "127.0.0.1", 80 };
std::println("{}", addr);

But once you have many custom structures, writing and maintaining formatters by hand becomes an engineering problem,
so we need an automated approach.

Borrowing from Others

Rust is a good reference point. Thanks to the black magic of proc macros, you can write code like the following, which is one of Rust's killer features.

#[derive(Debug)]
struct NetworkAddress {
    ip: String,
    port: u16
}

fn main() {
    let addr = NetworkAddress {
        ip: "127.0.0.1".to_string(),
        port: 80
    };
    println!("{:#?}", addr);
}

Hopefully the C++ version can be this clean one day.

Fighting Magic with Magic

Before C++26 Reflection there was already a solution; see the Reference section for the complete code.
With the help of Boost Hana, we can write:

namespace hana = boost::hana;

template <typename T>
constexpr auto CalculateFormatStringLength() {
        auto keys = hana::accessors<T>();
        auto length = hana::fold(keys, size_t{0}, [](auto sum, auto pair) {
                        return sum + hana::length(hana::first(pair));
        });
        length += 4 + 4 * (hana::length(keys).value);
        return length;
}

template <typename T, std::size_t... Is>
constexpr auto GenerateFormatStringImpl(std::index_sequence<Is...>) {
        auto keys = hana::accessors<T>();
        std::array<std::string_view, sizeof...(Is)> key_strings = { hana::to<char const*>(hana::first(keys[hana::size_c<Is>]))... };
        std::array<char, CalculateFormatStringLength<T>()> result{};
        std::size_t pos = 0;

        result[pos++] = '{';
        result[pos++] = '{';

        auto append = [&](std::string_view str) {
                for (char c : str) {
                        result[pos++] = c;
                }
        };

        auto append_key_value = [&](std::string_view key) {
                append(key);
                result[pos++] = '=';
                result[pos++] = '{';
                result[pos++] = '}';
                result[pos++] = ',';
        };

        (append_key_value(key_strings[Is]), ...);

        if (pos > 2) {
                pos--; // Remove the last comma
        }

        result[pos++] = '}';
        result[pos++] = '}';
        result[pos++] = '\0';
        return result;
}

template <typename T>
constexpr auto GenerateFormatString() {
        return GenerateFormatStringImpl<T>(std::make_index_sequence<hana::length(hana::accessors<T>()).value>{});
}

template <typename T>
class FormatStringImpl {
public:
        constexpr FormatStringImpl() : str(GenerateFormatString<T>()) {}
        std::array<char, CalculateFormatStringLength<T>()> str;
};

template <typename T>
struct FormatString {
        static constexpr FormatStringImpl<T> data{};
        static constexpr const char* value() { return data.str.data(); }
};

template <typename T>
constexpr FormatStringImpl<T> FormatString<T>::data;

template <typename T>
struct std::formatter<T, std::enable_if_t<hana::Struct<T>::value, char>> : std::formatter<std::string> {
        auto format(const T& t, std::format_context& ctx) const {
                auto members = hana::members(t);
                return hana::unpack(members, [&ctx](auto&&... args) {
                        return std::format_to(ctx.out(), FormatString<T>::value(), args...);
                });
        }
};

BOOST_HANA_ADAPT_STRUCT(NetworkAddress, ip, port);

It works, but with C++26 around the corner it is time to try a better approach.

Generate string literal in compile time

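Let's set reflection aside for a moment and solve a smaller problem first:
how to process strings at compile time.
I want to write a compile-time function; in pseudocode it looks roughly like this: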

consteval const char* make_greeting(std::string_view name) {
    std::string str = "Hello " + std::string(name);
    return ????; 
}

constexpr const char* greeting = make_greeting("ChatGPT");

The ???? above is the hard part.
Before C++26, the solutions you could find looked roughly like this:

template <size_t N>
consteval auto make_greeting(const char (&name)[N]) {
    constexpr const char prefix[] = "Hello ";
    constexpr size_t prefix_len = sizeof(prefix) - 1;
    std::array<char, prefix_len + N> result{};
    for (size_t i = 0; i < prefix_len; ++i) {
        result[i] = prefix[i];
    }
    for (size_t i = 0; i < N; ++i) {
        result[prefix_len + i] = name[i];
    }
    return result;
}

static constexpr auto greeting_ = make_greeting("world!");
static constexpr const char* greeting = greeting_.data();

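The principle is much the same as GenerateFormatStringImpl above,
but it has its own problems:
- std::string cannot be used directly, which makes more complex string processing painful
- The constexpr two-step: both greeting_ and greeting above must exist
The Reference section has further optimizations of the above, but they are not the focus of this post; explore them on your own if you are interested.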

Along with C++26 Reflection, a small feature also made it into the standard, so now we can write:

consteval const char* make_greeting(std::string_view name) {
    std::string str = "Hello " + std::string(name);
    return std::define_static_string(str);
}

constexpr const char* greeting = make_greeting("world!");

This is the foundation for generating the struct layout description later.

C++26 Reflection Revisited

With reflection, we can get the name of each field of a struct directly:

struct NetworkAddress {
        std::string ip;
        uint16_t port;
};

template <typename T>
consteval const char* FormatString() {
        std::string result = "{{";
        auto no_check = std::meta::access_context::unchecked();
        bool first = true;
        for (auto info : std::meta::nonstatic_data_members_of(^^T, no_check)) {
                if (!first) result += ",";
                result += std::meta::identifier_of(info);
                result += "={}";
                first = false;
        }
        result += "}}";
        return std::define_static_string(result);
}

That solves the first part. Now let's look at the rest.

Revisited Hana’s implementation

template <typename T>
struct std::formatter<T, std::enable_if_t<hana::Struct<T>::value, char>> : std::formatter<std::string> {
        auto format(const T& t, std::format_context& ctx) const {
                auto members = hana::members(t);
                return hana::unpack(members, [&ctx](auto&&... args) {
                        return std::format_to(ctx.out(), FormatString<T>::value(), args...);
                });
        }
};

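So this is what it does:

- Pack the members of t into a tuple
- Use hana::unpack to expand all the elements of the tuple and feed them to std::format_to as arguments

Since we already have std::apply, we no longer need hana::unpack. What remains is the problem of packing a struct into a tuple.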
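As a reminder of what std::apply does, here is a minimal C++17 sketch that is independent of reflection. The helper name join_tuple is hypothetical; it only shows a tuple being expanded into a variadic call, which is the role hana::unpack played above.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <tuple>

// Hypothetical helper: joins every tuple element into "a,b,c".
// std::apply expands the tuple into the lambda's parameter pack.
template <typename Tuple>
std::string join_tuple(const Tuple& tuple) {
    return std::apply([](const auto&... args) {
        std::ostringstream out;
        bool first = true;
        // Fold over the comma operator: print a separator before every element but the first.
        ((out << (first ? "" : ",") << args, first = false), ...);
        return out.str();
    }, tuple);
}
```

Calling join_tuple(std::make_tuple(std::string("127.0.0.1"), 80)) yields the text 127.0.0.1,80. In the formatter, the lambda body would call std::format_to instead of writing to a stream.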

struct_to_tuple

The C++ Reflection paper already has a ready-made implementation, so let's just use it:

consteval auto type_struct_to_tuple(std::meta::info type) -> std::meta::info {
        constexpr auto ctx = std::meta::access_context::current();
        return substitute(^^std::tuple,
                        nonstatic_data_members_of(type, ctx)
                        | std::views::transform(std::meta::type_of)
                        | std::views::transform(std::meta::remove_cvref)
                        | std::ranges::to<std::vector>());
}

template <typename To, typename From, std::meta::info ... members>
constexpr auto struct_to_tuple_helper(From const& from) -> To {
        return To(from.[:members:]...);
}

template <typename From>
consteval auto get_struct_to_tuple_helper() {
        using To = [: type_struct_to_tuple(^^From) :];
        auto ctx = std::meta::access_context::current();

        std::vector args = {^^To, ^^From};
        for (auto mem : nonstatic_data_members_of(^^From, ctx)) {
                args.push_back(reflect_constant(mem));
        }

        return extract<To(*)(From const&)>(
                        substitute(^^struct_to_tuple_helper, args));
}

template <typename From>
constexpr auto struct_to_tuple(From const& from) {
        return get_struct_to_tuple_helper<From>()(from);
}

At this point, the following code works:

template <>
struct std::formatter<NetworkAddress> : std::formatter<std::string_view> {
        auto format(const NetworkAddress& t, std::format_context& ctx) const {
                auto tuple = struct_to_tuple(t);
                return std::apply([&](auto&&... args) {
                        return std::format_to(ctx.out(), FormatString<NetworkAddress>(), args...);
                }, tuple);
        }
};

Little issue

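The code above works, but it is far from generic. Code like the following is not acceptable, because it would claim every type T rather than only the types we opt in: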

template <typename T>
struct std::formatter<T> : std::formatter<std::string_view> {
        auto format(const T& t, std::format_context& ctx) const {
                auto tuple = struct_to_tuple(t);
                return std::apply([&](auto&&... args) {
                        return std::format_to(ctx.out(), FormatString<T>(), args...);
                }, tuple);
        }
};

Look at Hana's signature:

template <typename T>
struct std::formatter<T, std::enable_if_t<hana::Struct<T>::value, char>>

Following the same pattern, we can reach this goal with a variable template and a concept-style constraint:

template <typename T>
constexpr bool can_be_formatter = false;

template <>
constexpr bool can_be_formatter<NetworkAddress> = true;

template <typename T> requires(can_be_formatter<T>)
struct std::formatter<T> : std::formatter<std::string_view> {
	// ignore
};

But to mark a type the way Rust does

#[derive(Debug)]
struct NetworkAddress {
    ip: String,
    port: u16
}

we need another C++26 feature: Annotations.

Annotations

I'll skip the definition and API of annotations and pull out only the parts we need:

template <auto V> struct Derive { };
template <auto V> inline constexpr Derive<V> derive;

inline constexpr struct{} Debug;

template <typename T>
consteval auto has_annotation(std::meta::info r, T const& value) -> bool {
        auto expected = std::meta::reflect_constant(value);
        for (std::meta::info a : annotations_of(r))
                if (std::meta::constant_of(a) == expected)
                        return true;
        return false;
}

Next, modify our std::formatter:

template <typename T> requires (has_annotation(^^T, derive<Debug>))
struct std::formatter<T> : std::formatter<std::string_view> {
	// ignore
};

Finally, modify NetworkAddress:


struct [[=derive<Debug>]] NetworkAddress {
	// ignore
};

And that's it. We can now say goodbye to Boost Hana.

Reference

C++ Reflection in under 100 lines of code
c++ 模板元编程简化format
C++26 反射元编程：Spec API 注入模型
如何保存constexpr string的值在运行期使用？
Reflection for C++26
Annotations for Reflection
Reflection for C++26!!!
Code Generation in Rust vs C++26

SIMD in C++ 26

Posted on 2025-06-15 Edited on 2025-08-10

Single Instruction Multiple Data is nothing new, but it took more than twenty years to reach standardization. Here are some of my thoughts.

Stone Age: Inline Assembly

When I started working we were still using MMX; SSE had not yet caught on, let alone the later AVX.
Typical assembly code looks like this:

add_AVX:
  // size <= 0 --> return
  testq %rdi, %rdi
  jle end

  // i = 0
  movl $0, %eax

start_loop:
  // __m256i b_part = _mm256_loadu_si256((__m256i*) &b[i]);
    // compiles into two instructions, each of which loads 128 bits
  vmovdqu (%rdx,%rax,2), %xmm0
  vinserti128 $0x1, 16(%rdx,%rax,2), %ymm0, %ymm0

  // __m256i a_part = _mm256_loadu_si256((__m256i*) &a[i]);
  vmovdqu (%rsi,%rax,2), %xmm1
  vinserti128 $0x1, 16(%rsi,%rax,2), %ymm1, %ymm1

  // a_part = _mm256_add_epi16(a_part, b_part);
  vpaddw %ymm1, %ymm0, %ymm0

  // _mm256_storeu_si256((__m256i*) &a[i], a_part)
  vmovups %xmm0, (%rsi,%rax,2)
  vextracti128 $0x1, %ymm0, 16(%rsi,%rax,2)

  // i += 16
  addq $16, %rax
  
  // i < size --> keep looping
  cmpq %rax, %rdi
  jg start_loop
end:
  ret

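It works, but there are quite a few problems:

1. MSVC and GCC use different inline assembly syntax, and MSVC dropped inline assembly support altogether for 64-bit targets.
2. Each architecture needs its own assembly code: MMX, SSE1/2/3/4, and the AVX family each need a separate version. In other words, a portability issue.
3. The biggest problem: people who can write assembly code are probably rarer than Japanese compressors.

So we moved from the Stone Age to the Bronze Age.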

Bronze Age: Intrinsic function

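An intrinsic function is a special function maintained by the compiler. Because the compiler can optimize intrinsic functions further, they are commonly used for vectorization and parallelization.
Here is an example: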

/* vectorized version */
void add_AVX(long size, unsigned short * a, const unsigned short *b) {
    for (long i = 0; i < size; i += 16) {
        /* load 256 bits from a */
        /* a_part = {a[i], a[i+1], a[i+2], ..., a[i+15]} */
        __m256i a_part = _mm256_loadu_si256((__m256i*) &a[i]);
        /* load 256 bits from b */
        /* b_part = {b[i], b[i+1], b[i+2], ..., b[i+15]} */
        __m256i b_part = _mm256_loadu_si256((__m256i*) &b[i]);
        /* a_part = {a[i] + b[i], a[i+1] + b[i+1], ...,
                     a[i+15] + b[i+15]}
         */
        a_part = _mm256_add_epi16(a_part, b_part);
        _mm256_storeu_si256((__m256i*) &a[i], a_part);
    }
}

Now it looks like normal C code. This solves problems 1 and 3 above, but problem 2 remains.
Take a look at a snippet from the well-known llama.cpp:

void quantize_row_q8_1(const float * GGML_RESTRICT x, void * GGML_RESTRICT vy, int64_t k) {
    assert(k % QK8_1 == 0);
    const int nb = k / QK8_1;

    block_q8_1 * GGML_RESTRICT y = vy;

#if defined(__ARM_NEON)
    // Ignore
#elif defined __wasm_simd128__
    // Ignore
#elif defined(__AVX2__) || defined(__AVX__)
    // Ignore
#elif defined(__riscv_v_intrinsic)
    // Ignore
#elif defined(__POWER9_VECTOR__)
    // Ignore
#elif defined(__loongarch_asx)
    // Ignore
#elif defined(__VXE__) || defined(__VXE2__)
    // Ignore
#else
    // fallback
#endif
}

As you can see, each architecture has its own set of intrinsic functions that cannot be reused, so several copies of the code have to be maintained.

Iron Age

Later, one camp wrapped the different intrinsic functions of each architecture and exposed them as algorithms.
For example:

highway
EVE
xsimd
std-simd

The details differ, but the code looks roughly like this:

using Vec3D = std::array<float, 3>;
float scalar_product(Vec3D a, Vec3D b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

This is much closer to ordinary C++ code, and std-simd is the predecessor of C++26 SIMD.
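To make the idea of "wrapping intrinsics behind a value type" more concrete, here is a toy sketch in plain C++17. Pack and dot are hypothetical names and this is not the API of any of the libraries listed above; a real library would map each Pack to one SIMD register and each operator to one instruction, while the calling code would keep this shape.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-width pack; the loops stand in for single SIMD instructions.
template <typename T, std::size_t N>
struct Pack {
    std::array<T, N> lanes{};

    static Pack load(const T* src) {
        Pack p;
        for (std::size_t i = 0; i < N; ++i) p.lanes[i] = src[i];
        return p;
    }

    friend Pack operator+(Pack a, const Pack& b) {
        for (std::size_t i = 0; i < N; ++i) a.lanes[i] += b.lanes[i];
        return a;
    }

    friend Pack operator*(Pack a, const Pack& b) {
        for (std::size_t i = 0; i < N; ++i) a.lanes[i] *= b.lanes[i];
        return a;
    }

    // Horizontal sum of all lanes.
    T sum() const {
        T total{};
        for (T lane : lanes) total += lane;
        return total;
    }
};

// Dot product: full packs first, then a scalar tail for the leftover elements.
inline float dot(const float* a, const float* b, std::size_t n) {
    constexpr std::size_t width = 4; // assumed lane count, chosen only for illustration
    using pack = Pack<float, width>;
    pack acc{};
    std::size_t i = 0;
    for (; i + width <= n; i += width)
        acc = acc + pack::load(a + i) * pack::load(b + i);
    float result = acc.sum();
    for (; i < n; ++i) result += a[i] * b[i];
    return result;
}
```

The caller never mentions an architecture; only the inside of Pack would change per target.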

Simd in Rust

Here is how Rust does it. I can read it, but I can't say I like it.

fn reduce(x: &[i32]) -> i32 {
    assert!(x.len() % 4 == 0);
    let mut sum = i32x4::splat(0); // [0, 0, 0, 0]
    for i in (0..x.len()).step_by(4) {
        sum += i32x4::from_slice_unaligned(&x[i..]);
    }
    sum.wrapping_sum()
}

let x = [0, 1, 2, 3, 4, 5, 6, 7];
assert_eq!(reduce(&x), 28);

A possible SIMD reduce implementation in C++26

Based on the current SIMD TS, future SIMD code might look like this, and it works well with existing C++ tools:

#include <algorithm>
#include <array>
#include <functional>
#include <memory>
#include <print>
#include <ranges>
#include <experimental/simd>

namespace stdx = std::experimental;
namespace stdv = std::views;
namespace stdr = std::ranges;

template <typename T, auto N>
constexpr auto reduce(const std::array<T, N> &arr) {
        using simd_t = stdx::native_simd<T>;

        constexpr auto step = simd_t::size();
        constexpr auto tile = N / step;
        constexpr auto left = N % step;

        T sum {};

        for (const auto &batch : arr | stdv::stride(step) | stdv::take(tile)) {
                simd_t temp(std::addressof(batch), stdx::element_aligned);
                sum += stdx::reduce(temp, std::plus{});
        }

        if constexpr (left) {
                auto left_view = arr | stdv::drop(tile * step);
                std::ranges::for_each(left_view, [&](auto v) { sum += v; });
        }

        return sum;
}


int main() {
        std::array<int, 7> arr {1, 2, 3, 4, 5, 6, 7};
        std::println("{}", reduce(arr));
}

One point people question is whether the performance will eventually match intrinsic functions, but at least the portability pain point is solved.

Reference

c-c++ assembly inline x86-64 128-bit SIMD - brief summary.md
SIMD for C++ Developers
Intrinsic function

Condition compilation and C++20 Module

Posted on 2025-04-26 Edited on 2025-08-10

Conditional compilation should be familiar to everyone, so it needs little introduction. Here is an example:

struct Proxy {
#ifdef FEATURE_ENABLE
	int v; 
#endif
	void do_something() {
#ifdef FEATURE_ENABLE
		v = 42; 
#endif
	} 
};

However, the world of C++20 Modules demands a stable, consistent ABI,
and the preprocessor does not fit in. We need a way to convert it; here is my line of thinking.

Step1: Create static constexpr variable

#ifdef FEATURE_ENABLE
static constexpr bool enable_feature_v = true;
#else
static constexpr bool enable_feature_v = false;
#endif

This step turns the definition the compiler receives into a constexpr variable.

Step2: Failed attempt

Let's try translating the code above:

template <bool>
struct ProxyImpl {};

template <>
struct ProxyImpl<true> {
	int v;
};

struct Proxy : public ProxyImpl<enable_feature_v> {
	void do_something() {
		if constexpr (enable_feature_v) {
			v = 42;
		}
	} 
};

When enable_feature_v is false, compilation fails:

<source>: In member function 'void Proxy::do_something()':
<source>:14:25: error: 'v' was not declared in this scope
   14 |                         v = 42;
      |

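Regardless of the value of enable_feature_v, Proxy must have a member v, because Proxy is not a template and the discarded branch of if constexpr is still fully checked. Changing the inheritance to a data member leads to the same situation.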
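For contrast, here is a minimal sketch of the case where the discarded branch really is skipped: the condition depends on a template parameter. Storage and set_or_default are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical storage type: only the enabled variant has the member v.
template <bool>
struct Storage {};

template <>
struct Storage<true> {
    int v = 0;
};

// Inside a template, the branch that is not taken is never instantiated,
// so s.v is only looked up when Enabled is true.
template <bool Enabled>
int set_or_default(Storage<Enabled>& s) {
    if constexpr (Enabled) {
        s.v = 42;
        return s.v;
    } else {
        return -1;
    }
}
```

With Storage<false> the call compiles and returns -1 even though there is no member v.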

Step3: std::optional

Decide the state at runtime. This adds a little runtime overhead, but at least it works.

struct ProxyImpl {
    int v;
};

struct Proxy {
    std::optional<ProxyImpl> v;
    Proxy() {
        if constexpr (enable_feature_v) v.emplace();
    }

    void do_something() {
        if constexpr (enable_feature_v) {
            (*v).v = 42;
        }
    }
};

Step4: Revisit template specialization

Back to the failed Step 2, but this time the unit of operation is a function:

template <bool>
struct ProxyImpl {
    void update(int) {}
};

template <>
struct ProxyImpl<true> {
    int v;
    void update(int newV) { v = newV; }
};

struct Proxy : public ProxyImpl<enable_feature_v> {
    void do_something() {
        if constexpr (enable_feature_v) {
            update(42);
        }
    }
};

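According to the generated assembly, the compiler sees through the empty update and removes it completely.
Still, this approach is tedious: several function sets have to be maintained at once, even the empty ones.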

Step5: Conditional Proxy

Carrying the feature flag in the type solves the problem above, but it brings other problems:

template <bool>
struct ProxyImpl {};

template <>
struct ProxyImpl<true> {
    int v;
};

template <bool enable_feature>
struct Proxy {
    ProxyImpl<enable_feature> v;
    void do_something() {
        if constexpr (enable_feature) {
            v.v = 42;
        }
    }
};

int main()
{
	Proxy<enable_feature_v> p;
	p.do_something();
}

Because it is a class template, in traditional C++ it can only live in a header; in the Module world it can only live in a module interface unit, not in a module implementation unit.

Conclusion

There is no perfect approach at the moment; further exploration is probably needed.

simulate constexpr for in C++

Posted on 2025-04-23 Edited on 2025-08-10

This is a common code pattern:

template <size_t length>
void for_loop_do() {
	for (size_t i = 0; i < length; i++) {
		do(i);
	}
}

Since we know the value of length at compile time, we naturally want to unroll the loop body into something like this:

void unroll_do() {
	do(0);
	do(1);
	do(2);
	....
}

The intuitive idea is a constexpr for:

template <size_t length>
void for_loop_do() {
	constexpr for (size_t i = 0; i < length; i++) {
		do(i);
	}
}

Of course no such syntax exists, so we have to work around it. There is more than one way to do so; here are two of the more common ones.

Version 1: Recursive + if constexpr

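If you have studied data structures, you know that a loop can be simulated with recursion.
Combined with C++17's if constexpr, the problem is easy to solve.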

template <size_t index, size_t boundary, typename Func>

void constexpr_for_impl(Func func)
{
    if constexpr (index == boundary) {
        return;
    } else {
        func(index);
        constexpr_for_impl<index + 1, boundary>(func);
    }
}

template <size_t boundary, typename Func>
void constexpr_for(Func func)
{
    constexpr_for_impl<0, boundary>(func);
}

int main()
{
    constexpr_for<3>([](size_t index) {
        printf("%zu\n", index);
    });
}

Version 2: integer_sequence

C++14 introduced a helper class, integer_sequence, which encodes constants in a type.
It is rarely used directly, usually only when manipulating tuples, but it is indispensable for simulating constexpr for.

#include <iostream>
#include <utility>

template <typename T, T... ints>
void print_sequence(std::integer_sequence<T, ints...> int_seq)
{
    std::cout << "The sequence of size " << int_seq.size() << ": ";
    ((std::cout << ints << ' '), ...);
    std::cout << '\n';
}

int main()
{
    print_sequence(std::integer_sequence<unsigned, 9, 2, 5, 1, 9, 1, 6>{});
}

What we want is a sequence like std::integer_sequence<size_t, 0, 1, ..., n-1>, and this is where another helper type, make_integer_sequence, comes in.
Here is the previous example, modified:

#include <iostream>
#include <utility>

template <typename T, T... ints>
void print_sequence(std::integer_sequence<T, ints...> int_seq)
{
    std::cout << "The sequence of size " << int_seq.size() << ": ";
    ((std::cout << ints << ' '), ...);
    std::cout << '\n';
}

int main()
{
    print_sequence(std::make_integer_sequence<int, 12>{});
}

This is the core idea behind constexpr for:

template <std::size_t... Is, typename Func>
void constexpr_for_impl(std::index_sequence<Is...>, Func func)
{
    (func(Is), ...);
}

template <size_t boundary, typename Func>
void constexpr_for(Func func)
{
    constexpr_for_impl(std::make_index_sequence<boundary>{}, func);
}

int main()
{
    constexpr_for<3>([](size_t index) {
        printf("%zu\n", index);
    });
}

Since C++20 we have template lambdas, so we can go one step further:

template <size_t boundary, typename Func>
void constexpr_for(Func func)
{
    [&] <size_t... Is> (std::index_sequence<Is...>) {
        ([&] <size_t I> (std::integral_constant<size_t, I>) {
            func(I);
        } (std::integral_constant<size_t, Is>{}), ...);
    } (std::make_index_sequence<boundary>{});
}

int main()
{
    constexpr_for<3>([](size_t index) {
        printf("%zu\n", index);
    });
}
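
One variation worth sketching: all the versions above pass the index as a runtime size_t, so the callback cannot use it as a template argument. Passing std::integral_constant instead keeps the index usable at compile time, for example with std::get. The helper sum_tuple is a hypothetical name used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

template <std::size_t... Is, typename Func>
void constexpr_for_impl(std::index_sequence<Is...>, Func& func)
{
    // Each call receives a distinct type, std::integral_constant<std::size_t, I>.
    (func(std::integral_constant<std::size_t, Is>{}), ...);
}

template <std::size_t boundary, typename Func>
void constexpr_for(Func func)
{
    constexpr_for_impl(std::make_index_sequence<boundary>{}, func);
}

// Hypothetical example: add up every element of a tuple of arithmetic types.
template <typename... Ts>
double sum_tuple(const std::tuple<Ts...>& values)
{
    double total = 0.0;
    constexpr_for<sizeof...(Ts)>([&](auto index) {
        // decltype(index)::value is a constant expression, so it can index std::get.
        total += static_cast<double>(std::get<decltype(index)::value>(values));
    });
    return total;
}
```

The callback has to be a generic lambda here, because every iteration passes a different type.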

Conclusion

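A simulation is only a simulation. Once the for loop involves continue, break, or return, the approaches above become very complicated, so they only suit simple uses.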
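
To give a feel for that extra complexity, here is one possible sketch of emulating break, assuming the callback reports "keep going" through a bool return value. The names constexpr_for_until and sum_below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

template <std::size_t... Is, typename Func>
bool constexpr_for_until_impl(std::index_sequence<Is...>, Func& func)
{
    // A fold over && short-circuits: after the first false, no further call is made.
    return (static_cast<bool>(func(Is)) && ...);
}

// Calls func(0), func(1), ... and stops at the first call that returns false.
template <std::size_t boundary, typename Func>
bool constexpr_for_until(Func func)
{
    return constexpr_for_until_impl(std::make_index_sequence<boundary>{}, func);
}

// Hypothetical usage: sum the indices 0..9 but "break" once index reaches limit.
inline std::size_t sum_below(std::size_t limit)
{
    std::size_t sum = 0;
    constexpr_for_until<10>([&](std::size_t index) {
        if (index >= limit) return false; // plays the role of break
        sum += index;
        return true;                      // keep going
    });
    return sum;
}
```

Every callback now has to return a bool, and continue or an early return with a value would each need yet another convention, which is exactly why this only scales to simple cases.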

Contract in C++26

Posted on 2025-02-21 Edited on 2025-08-10

Reflection has not passed yet, but Contracts have already made it into the next C++ standard.
In short, Contracts are roughly a souped-up version of C's assert.

Before Contract

Open assert.h and you will see code roughly like this:

#ifdef  NDEBUG
# define assert(expr)           (__ASSERT_VOID_CAST (0))
#else
#  define assert(expr)                                                  \
  ((void) sizeof ((expr) ? 1 : 0), __extension__ ({                     \
      if (expr)                                                         \
        ; /* empty */                                                   \
      else                                                              \
        __assert_fail (#expr, __FILE__, __LINE__, __ASSERT_FUNCTION);   \
    }))
#endif

Even well-known open source projects roll their own assert mechanism:

#if SLANG_ASSERT_ENABLED
# define SLANG_ASSERT(cond) \
    do { \
      if (!(cond)) \
        assertFailed(...); \
    } while (false)
#else
# define SLANG_ASSERT(cond) \
    do { \
      (void)sizeof(cond); \
    } while (false)
#endif

Because this mechanism is macro-based, macro pollution is unavoidable.
Bringing Contracts into the standard has the following benefits:

- Avoids macro pollution
- More flexible handling strategies
- Because it is part of the language itself, better tools can be built to analyze it

How to Use

The simplest way is to replace assert with contract_assert:


auto w = getWidget(); 
contract_assert(w.isValid());
processWidget(w);

The basic behavior matches assert, but the contract semantics can be controlled through the compiler.

Evaluation semantics

There are four different ways a contract can be evaluated:

Ignore
Enforce
Observe
Quick-Enforce
I won't go into detail here; see the links in the Reference section if needed.
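
As a rough mental model only, the four semantics can be sketched in plain C++ without any contract syntax. Everything below (Semantic, Outcome, Model, evaluate) is a hypothetical model written for this post, not the standard library API, and it records "would terminate" instead of actually terminating.

```cpp
#include <cassert>

enum class Semantic { ignore, observe, enforce, quick_enforce };
enum class Outcome { continues, terminates };

// Bookkeeping for the model: how often the predicate and the violation handler ran.
struct Model {
    int predicate_evaluations = 0;
    int handler_calls = 0;
};

template <typename Predicate>
Outcome evaluate(Semantic semantic, Predicate predicate, Model& model) {
    if (semantic == Semantic::ignore)
        return Outcome::continues;      // the predicate is not even evaluated
    ++model.predicate_evaluations;
    if (predicate())
        return Outcome::continues;      // no violation
    switch (semantic) {
    case Semantic::observe:
        ++model.handler_calls;          // handler runs, then execution continues
        return Outcome::continues;
    case Semantic::enforce:
        ++model.handler_calls;          // handler runs, then the program terminates
        return Outcome::terminates;
    default:
        return Outcome::terminates;     // quick_enforce: terminate without calling the handler
    }
}
```

Read this way, observe is the only checking semantic under which a violated contract lets the program keep running.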

Pre and Post

This is also much like the old practice of using assert inside a function to check the input and the output result:

int f(const int x)  {
    assert(x != 0 && x != -1);
    int r = x + 1;
    assert(r > x);
    return r;
}

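That basically works, except that nothing can be checked from the function signature; you have to dig into the function body to find out what went wrong.
With Contracts, pre and post can be added to the function declaration to express preconditions and postconditions: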

int f(const int x) 
    pre(x != 0 && x != -1)
    post(r : r > x);

int f(const int x) {
    return x + 1;
}

It simply splits the use cases of assert more finely.

handle_contract_violation

Sometimes, beyond the compiler's built-in behavior, we need to handle contract violations ourselves.
This is where the global function handle_contract_violation comes in:

// Try overriding the violation handler
void handle_contract_violation(std::contracts::contract_violation const& violation) {
    if (violation.semantic() == std::contracts::evaluation_semantic::observe) {
        std::cerr << violation.location().function_name() << ":"
            << violation.location().line()
            << ": observing violation my way: "
            << violation.comment()
            << std::endl;
        return;
    }
    std::contracts::invoke_default_contract_violation_handler(violation);
}

The last line falls back to the built-in handling.

Status

MSVC has not implemented it yet, so it is left out.
gcc and clang differ slightly in usage.
The current usage for gcc:

-fcontracts -fcontracts-nonattr -fcontract-evaluation-semantic=enforce

And for clang:

-fcontracts -fcontract-evaluation-semantic=enforce

Reference

Contracts for C++ explained in 5 minutes
C++26启航：Safe C++的破晓时刻

Troubleshooting between C++ Module and NVCC

Posted on 2025-01-26 Edited on 2025-08-10

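While developing my own toy project I made many bold moves, one of which was using C++20 Modules. But walk at night often enough and you will eventually meet a ghost, so here is the trail of blood and tears so far.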

NVCC doesn’t support C++20 Module

I did not plan to use CUDA at first; only when I introduced CUDA did I find out that this is a huge pit.

Because it requires complex interaction between the CUDA device compiler and the host compiler, modules are not supported in CUDA C++
When one road is blocked, take another. The original code looked roughly like this:

export module A;

export {
	enum Flag {
		// blabla
	};
}

Downgrade it to syntax that C++17 accepts:

#pragma once
enum Flag {
	// blabla
};

and

module;
#include "header.h"

export module A;

export {
	using ::Flag;
}

plus the CUDA header file:


#pragma  once
#include "header.h"
void doSomething(Flag);

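At least that is the best solution I have come up with so far.
At first glance the problem is solved, but it is not that simple. Suppose there is an implementation unit: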

module;
#include "header.h"

module A;

void work()
{
	doSomething(Flag{});
}

The compiler will complain that Flag is defined more than once. So I came up with a workaround, a forward declaration, and rewrote the CUDA header:


#pragma  once
enum Flag : int;
void doSomething(Flag);

Then split out a separate CUDA implementation unit:

#include "header.h"
void doSomething(Flag) {}

It is ugly, but it runs without problems for now.

location on extern function

The approach above splits the CUDA function into two files, a header and an implementation.
What if we want to drop the header?

module A;
extern void funcFromCuda();

void work()
{
	funcFromCuda();
}

It compiles fine on MSVC but fails with Clang on Linux.
The fix is simple: move the extern function into the global module fragment.

module;
extern void funcFromCuda();

module A;

void work()
{
	funcFromCuda();
}

One more thing

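This actually has nothing to do with C++20 Modules. It is just that Clang currently supports Modules while GCC is not ready yet, so development is Clang-centric, and that is how I ran into this problem: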

module;
#include <string>
extern void funcFromCuda(std::string);

module A;

void work()
{
	funcFromCuda("Hello World");
}

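NVCC will tell you it cannot find the funcFromCuda function, because our outer compiler is clang, which uses libc++, while NVCC uses libstdc++, and the two are not compatible.
So I had to fall back to the old way and pass a pointer directly: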

module;
#include <string>
extern void funcFromCuda(const char*);

module A;

void work()
{
	funcFromCuda("Hello World");
}
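
If the raw pointer feels too bare, one option is to keep only built-in types on the boundary and add a thin host-side wrapper. The sketch below is an assumption-laden illustration: funcFromCudaRaw is a hypothetical name, and its body is a stand-in (it counts spaces) so that the example runs on its own without NVCC.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Stand-in for the function that NVCC would compile (hypothetical body).
// Only a pointer and a length cross the boundary, so neither side depends on
// the other standard library's std::string layout or mangled name.
std::size_t funcFromCudaRaw(const char* data, std::size_t size) {
    std::size_t spaces = 0;
    for (std::size_t i = 0; i < size; ++i)
        if (data[i] == ' ') ++spaces;
    return spaces;
}

// Host-side convenience wrapper: accepts string literals and std::string alike.
inline std::size_t funcFromCuda(std::string_view text) {
    return funcFromCudaRaw(text.data(), text.size());
}
```

Passing the length explicitly also means the data does not have to be null-terminated.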

Conclusion

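For now, NVCC support for C++20 Modules looks a long way off; otherwise problems one and two could probably both be solved.
Problem three is more troublesome: even with Module support it would still not work, so the old approach is the only option.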

Self-Reference Type process between C++ and Rust

Posted on 2024-11-15 Edited on 2025-08-10

I spent a lot of effort figuring out what Rust's Pin does. It really is hard to understand.

About Self-Reference Type

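A self-referential type is a data structure containing a pointer that points to the structure itself or to one of its fields.
For example: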

struct Test {
protected:
        std::string a_;
        const std::string* b_;
public:
        Test(std::string text) : a_(std::move(text)), b_(&a_) {}
        const std::string& a() const { return a_; }
        const std::string& b() const { return *b_; }
};

Here b_ points to the address of a_. The same thing in Rust looks like this:

struct Test {
    a: String,
    b: *const String,
}

impl Test {
    fn new(txt: &str) -> Self {
        Test {
            a: String::from(txt),
            b: std::ptr::null(),
        }
    }

    fn init(&mut self) {
        let self_ref: *const String = &self.a;
        self.b = self_ref;
    }

    fn a(&self) -> &str {
        &self.a
    }

    fn b(&self) -> &String {
        unsafe {&*(self.b)}
    }
}

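Setting aside the differences caused by Rust's safety mechanisms, the principle is the same.
The question now is: if the object is moved, what happens to the pointer into part of the structure?
For example: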

std::swap(test1, test2);

Let me start with C++, which I know better.

Solution1: Keep invariant

There are many ways to reach the goal, but the principle is always the same: just maintain the invariant.

void swap(Test& lhs, Test& rhs) {
    std::swap(lhs.a_, rhs.a_);
}
swap(test1, test2);

Obviously, this approach does not apply to Rust.
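
On the C++ side, the same invariant can also be kept by the copy and move operations themselves, so that even the unqualified std::swap stays correct. This is a minimal sketch of one way to do it, not the only one; invariant_holds is a hypothetical helper added only to make the invariant checkable.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Test {
    std::string a_;
    const std::string* b_;
public:
    explicit Test(std::string text) : a_(std::move(text)), b_(&a_) {}

    // Every constructor re-points b_ at this object's own a_.
    Test(const Test& other) : a_(other.a_), b_(&a_) {}
    Test(Test&& other) noexcept : a_(std::move(other.a_)), b_(&a_) {}

    // Assignment only transfers the string; b_ already points at our a_.
    Test& operator=(const Test& other) { a_ = other.a_; return *this; }
    Test& operator=(Test&& other) noexcept { a_ = std::move(other.a_); return *this; }

    const std::string& a() const { return a_; }
    const std::string& b() const { return *b_; }

    // Hypothetical helper: true while the self-reference is intact.
    bool invariant_holds() const { return b_ == &a_; }
};
```

std::swap is built from one move construction and two move assignments, so with these definitions both objects still point at their own a_ afterwards.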

Solution2: Don’t move the object

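This is all Pin is about: once an object settles at a location in memory it never moves again, so for an object of a self-referential type, all pointers and references stay valid until the end of its lifetime.
In C++ there is more than one way to forbid moving; this is one of them: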

template <typename T>
void swap(T&, T&) {}
struct Test {
protected:
        std::string a_;
        const std::string* b_;
public:
        Test(std::string text) : a_(std::move(text)), b_(&a_) {}
        friend void swap(Test&, Test&) = delete;
        const std::string& a() const { return a_; }
        const std::string& b() const { return *b_; }
};
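
Another way, sketched here under the assumption that the object may simply never be copied or moved, is to delete the copy operations and hand the object out through a heap allocation, which is loosely what Box::pin does in Rust. Pinned and make_pinned are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

class Pinned {
    std::string a_;
    const std::string* b_;
public:
    explicit Pinned(std::string text) : a_(std::move(text)), b_(&a_) {}

    // Deleting the copy operations also suppresses the implicit move operations.
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;

    const std::string& a() const { return a_; }
    const std::string& b() const { return *b_; }
};

static_assert(!std::is_copy_constructible_v<Pinned>);
static_assert(!std::is_move_constructible_v<Pinned>);
static_assert(!std::is_swappable_v<Pinned>);

// The object is created in place on the heap and never relocated;
// only the owning pointer moves around.
inline std::unique_ptr<Pinned> make_pinned(std::string text) {
    return std::make_unique<Pinned>(std::move(text));
}
```

Moving the unique_ptr transfers ownership without touching the object's address, so b_ stays valid.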

Because Rust cares about safety, though, it lays down a whole set of rules.

About Pin in Rust

For self-referential types in Rust, this is the only thing we need to forbid:

pub fn swap<T>(x: &mut T, y: &mut T) {
    // SAFETY: the raw pointers have been created from safe mutable references satisfying all the
    // constraints on `ptr::swap_nonoverlapping_one`
    unsafe {
        ptr::swap_nonoverlapping_one(x, y);
    }
}

We need to stop code from obtaining a &mut T. A plain &mut T obviously will not do, and Box<T> cannot enforce this either, so this is where Pin<T> comes in.

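Rust types fall into two categories:

Default types: types that can be moved safely in Rust
- Default types all implement the auto trait Unpin, so there is nothing to do
Self-referential types: the ones discussed above
- They must be !Unpin
- Using PhantomPinned is enough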

Here is a sample program:

use std::pin::Pin;
use std::marker::PhantomPinned;

#[derive(Debug)]
struct Test {
    _marker: PhantomPinned,
}

impl Test {
    fn new() -> Self {
        Test {
             _marker: PhantomPinned
        }
    }
}
pub fn main() {
    let mut test1 = Box::pin(Test::new());
    let mut test2 = Box::pin(Test::new());
    // compile failed
    std::mem::swap(test1.as_mut().get_mut(), test2.as_mut().get_mut());
}

Comment out the PhantomPinned above and the program works.
Pin has many more details; I will study them when I really become a full-time Rust engineer.

Reference

Rust 的 Pin 与 Unpin

About Safety on C/C++

Posted on 2024-10-30 Edited on 2025-08-10

The recently hotly debated Safe C/C++ is mainly about having no Undefined Behavior.

What’s Undefined Behavior

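It is basically an escape hatch: the compiler may skip certain logical reasoning. Correct logic never triggers UB; incorrect logic (possibly) triggers UB.

For example: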

int f(bool init) {
    int a;
    if (init) {
        a = 6;
    }
    return a / 2;

}

This triggers UB, so with optimization enabled gcc/clang produce:


f(bool):
        mov     eax, 3
        ret

However, Undefined Behavior is a runtime concept, so the following code only exhibits UB if the offending operation is actually executed at runtime:

int nervous(bool is_scary, int n)
{
    if (is_scary) {
        return 100 / n;
    } else {
        return 0;
    }
}

int main()
{
    return nervous(false, 0);
}

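That is why the Undefined Behavior Sanitizer only works at runtime and cannot guarantee catching every UB. As long as the UB code is never executed, the behavior of the whole program is still governed by the standard.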
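
One way to keep a function like nervous defined for every input is to make the risky operation total by checking its operands first. The sketch below is only an illustration of that idea; checked_div is a hypothetical helper, and it covers the two inputs for which integer division is undefined.

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Returns std::nullopt instead of executing a division whose behavior is undefined.
inline std::optional<int> checked_div(int numerator, int denominator) {
    if (denominator == 0)
        return std::nullopt; // division by zero
    if (numerator == std::numeric_limits<int>::min() && denominator == -1)
        return std::nullopt; // the quotient does not fit in int
    return numerator / denominator;
}
```

The caller is then forced to decide what an empty result means, instead of relying on that branch never being reached.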

How to achieve safety

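There are basically two schools:

The language itself has no undefined operations, as in Java and Python, at the cost of some speed and superpowers. Nothing is perfect.
The language is split into Safe and Unsafe parts, as in Rust
- The Safe part is guaranteed by the compiler; no UB can occur
- The Unsafe part has no special powers either; the compiler simply trusts that it contains no problems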

Future of safety C++

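Honestly, I don't know.
Safecpp offers a mechanism similar to Rust's, but that is no longer really C++.
The Profiles mechanism only aims for 80% of safety. If it reaches that goal I think it is good enough, but convincing the NSA and the US government is another matter.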

Reference

The difference between undefined behavior and ill-formed C++ programs

Writing a book

Posted on 2024-09-19 Edited on 2025-08-10

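I could not find a suitable job, so I am trying different things to see whether I have any talent for writing books.
I wrote a book about the C/C++ ecosystem: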
C/C++編譯器與它的快樂夥伴

Revisit C++20 Module and CMake

Posted on 2024-09-06 Edited on 2025-08-10

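Since I plan to write an e-book, I revisited C++20 Modules.
Syntax is not the focus of this post; the focus is how to use Modules together with CMake.

The test environment is Linux + Clang 20 + CMake 3.28.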

Prerequisites

First, clone the git repo. Every variation starts from the legacy example, which is how things were done before Modules.

CMake

CMake is already the de facto standard.

Case1 Normal case

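See the module_1 directory for the details; I only cover what I consider important here.
This case is legacy translated directly into a Module version.
First, look at the CMakeLists.txt of MathFunctions: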

target_sources(MathFunctions
  PUBLIC
    FILE_SET primary_interface
    TYPE CXX_MODULES
    FILES
      MathFunctions.cppm
  PRIVATE
    FILE_SET implementaion_units
    TYPE CXX_MODULES
    FILES
      src/mysqrt.cppm
  PRIVATE
    src/MathFunctions.cxx
)

There are two FILE_SETs here:

primary_interface: the primary module interface unit that we provide to the outside
implementaion_units: internal partition units, not exported
So at install time, only MathFunctions.cppm is copied to the install directory.

Case2: Multiple Primary Module Interface Units

Next, we slightly modify the contents of MathFunctions.cppm:

module;

export module Math;

export import :detail;
export namespace mathfunctions
{
double sqrt(double);
}

We now export the contents of detail as well, so the following changes are needed.

The detail module and namespace need to be marked export:

module;
#include <math.h>

export module Math:detail;

export namespace mathfunctions::detail {
        double sqrt(double x) {
                return ::sqrt(x);
        }
}

Modify our CMakeLists.txt:

target_sources(MathFunctions
  PUBLIC
    FILE_SET primary_interface
    TYPE CXX_MODULES
    FILES
      MathFunctions.cppm
      src/mysqrt.cppm
  PRIVATE
    src/MathFunctions.cxx
)

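Now we have two module interface units (the primary one and the :detail partition), and both files must be copied at install time.
Math.detail is similar to Math:detail, so I will skip it.

Next, let's look at migration, based on the "Transitioning to modules" section of the clang documentation.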

Case3: Migrate legacy to module (Part1)

Look at the transform_1 directory.
The main difference here is in CMakeLists.txt:

target_sources(MathFunctions
  PUBLIC
    FILE_SET export_headers
    TYPE HEADERS
    BASE_DIRS include/
    FILES include/MathFunctions.h
  PUBLIC
    FILE_SET primary_interface
    TYPE CXX_MODULES
    FILES
      MathFunctions.cppm
  PRIVATE
    src/MathFunctions.cxx
    src/mysqrt.h
    src/mysqrt.cxx
)
install(TARGETS MathFunctions
  EXPORT MathFunctionsTargets
  ARCHIVE
  FILE_SET export_headers
  FILE_SET primary_interface
    DESTINATION lib/cmake/MathFunctions/src
)

This keeps the original legacy code and adds a primary module interface unit.
The contents of MathFunctions.cppm are:

module;
#include "MathFunctions.h"

export module Math;

export namespace mathfunctions {
        using mathfunctions::sqrt;
}

This exports the contents of the global module fragment into the module.
This approach does not break the existing legacy code and does the least damage.

Case4: Migrate legacy to module (Part2)

Look at the transform_21 directory; CMakeLists.txt is the same as above.
What changes is MathFunctions.cppm and MathFunctions.h.
MathFunctions.cppm now looks like this:

module;

export module Math;

#define IN_MODULE_INTERFACE

extern "C++" {
        #include "MathFunctions.h"
}

And the contents of MathFunctions.h are:

#pragma once

#ifdef IN_MODULE_INTERFACE
#define EXPORT export
#else
#define EXPORT
#endif

namespace mathfunctions {
        EXPORT double sqrt(double x);
}

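Because IN_MODULE_INTERFACE only takes effect in module mode, legacy code stays unchanged.
This approach is more work than the previous one, but it allows a smooth migration to the next stage.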

Case5: Migrate legacy to module (Part3)

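This is the most troublesome of all the options.
The main idea is to separate the legacy and module implementations in the implementation unit, forcing consumers to use only one of them. For example, the original header may need export added: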

#pragma once

#ifdef IN_MODULE_INTERFACE
#define EXPORT export
#else
#define EXPORT
#endif

namespace mathfunctions {
        EXPORT double sqrt(double x);
}

The implementation also has to be split:

#ifndef IN_MODULE_INTERFACE
#include "MathFunctions.h"
#include "mysqrt.h"
#else
module Math;
#endif

namespace mathfunctions {
        double sqrt(double x) {
                return detail::sqrt(x);
        }
}

Here I chose to do the trick in CMakeLists.txt:

if (ENABLE_MODULE_BUILD)
target_sources(MathFunctions
  PUBLIC
    FILE_SET export_headers
    TYPE HEADERS
    BASE_DIRS include/
    FILES
      include/MathFunctions.h
      include/mysqrt.h
  PUBLIC
    FILE_SET primary_interface
    TYPE CXX_MODULES
    FILES
      MathFunctions.cppm
  PRIVATE
    src/MathFunctions.cxx
    src/mysqrt.cxx
)
else()
target_sources(MathFunctions
  PUBLIC
    FILE_SET export_headers
    TYPE HEADERS
    BASE_DIRS include/
    FILES
      include/MathFunctions.h
  PRIVATE
    include/mysqrt.h
    src/MathFunctions.cxx
    src/mysqrt.cxx
)
endif()

target_compile_definitions(MathFunctions
  PRIVATE
    $<$<BOOL:${ENABLE_MODULE_BUILD}>:IN_MODULE_INTERFACE>
)

if (ENABLE_MODULE_BUILD)
install(TARGETS MathFunctions
  EXPORT MathFunctionsTargets
  ARCHIVE
  FILE_SET export_headers
  FILE_SET primary_interface
    DESTINATION lib/cmake/MathFunctions/src
)
else()
install(TARGETS MathFunctions
  EXPORT MathFunctionsTargets
  ARCHIVE
  FILE_SET export_headers
)
endif()

When ENABLE_MODULE_BUILD is specified, the details are handled automatically.
However, here I also ran into the problem described in the clang documentation.

Minor issue

mysqrt.h used to be included via src/MathFunctions.cxx; after switching to Modules, this dependency is cut.
So we have to force it into MathFunctions.cppm:

module;

export module Math;
#include "MathFunctions.h"

module :private;
#include "mysqrt.h"

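This works, but:

- mysqrt.h did not need to be public before, and now it is forced to be
- A better approach is to use a module partition unit directly, which means rewriting the code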
