第十三號艦隊

Story about type punning

Posted on 2023-02-26 Edited on 2025-08-10

What is type punning

Type punning means accessing the same memory address through pointers of different types. This is undefined behavior and may cause unpredictable errors.
For example:

#include <iostream>

int main() {
	float f = 3.14;
	int* pi = (int*)&f; 
	*pi = 42; 
	std::cout << "f = " << f << std::endl; 
	return 0; 
}

Type punning violates the strict aliasing rule.

Example

This comes up often in network programming: allocate a block of memory, then cast it to a pointer of another type and fill in the values.

#include <stdint.h>
#include <stdlib.h>

typedef struct Msg
{
    unsigned int a;
    unsigned int b;
} Msg;

void SendWord(uint32_t);

int main(void)
{
    // Get a 32-bit buffer from the system
    uint32_t* buff = malloc(sizeof(Msg));
    
    // Alias that buffer through message
    Msg* msg = (Msg*)(buff);
    
    // Send a bunch of messages    
    for (int i = 0; i < 10; ++i)
    {
        msg->a = i;
        msg->b = i+1;
        SendWord(buff[0]);
        SendWord(buff[1]);   
    }
}

Solution

C Solution

union

In C, you can use a union.

union {
  Msg msg;
  unsigned int asBuffer[sizeof(Msg)/sizeof(unsigned int)];
};

char*

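Alternatively, use an (unsigned / signed) char * in place of the uint32_t * above.
A char * may alias an object of any type, so inspecting the bytes of a Msg through a char * is legal; the reverse, reading a declared char array through a Msg *, is not.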

memcpy


int x = 42; 
float y; 
std::memcpy(&y, &x, sizeof(x));

This is legal. The downside is the extra copy, although compilers usually optimize it away for small objects.

C++ Solution

bit_cast

Introduced in C++20. A possible implementation is just a wrapper around the memcpy above.

template <class To, class From>
To bit_cast(const From& src) noexcept
{
    To dst;
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
}
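
As a quick check of the wrapper above, the following sketch adds the compile-time constraints that std::bit_cast also imposes (equal sizes, trivially copyable types). The names `my_bit_cast` and `float_bits` are made up for this illustration, and the bit pattern mentioned in the note below assumes IEEE 754 single-precision floats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for std::bit_cast, usable before C++20.
template <class To, class From>
To my_bit_cast(const From& src) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable_v<From>, "From must be trivially copyable");
    static_assert(std::is_trivially_copyable_v<To>, "To must be trivially copyable");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));  // copies the object representation
    return dst;
}

// Returns the raw bits of a float without violating strict aliasing.
inline std::uint32_t float_bits(float f) noexcept {
    return my_bit_cast<std::uint32_t>(f);
}
```

On an IEEE 754 platform, `float_bits(1.0f)` yields `0x3F800000`, and casting the bits back returns the original value.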

std::start_lifetime_as

Introduced in C++23. It resembles reinterpret_cast, but without the undefined behavior.

struct ProtocolHeader {
  unsigned char version;
  unsigned char msg_type;
  unsigned char chunks_count;
};

void ReceiveData(std::span<std::byte> data_from_net) {
    if (data_from_net.size() < sizeof(ProtocolHeader)) throw SomeException();
    const auto* header = std::start_lifetime_as<ProtocolHeader>(
        data_from_net.data()
    );
    switch (header->msg_type) {
        // ...
    }
}

Reference

GAT equivalence in C++

Posted on 2023-02-11 Edited on 2025-08-10

When I first saw GATs I had no idea what they were for; it only clicked after reading "Could someone explain the GATs like I was 5?".
The simplest example: suppose we have a struct

struct Foo {
    bar: Rc<String>,
}

Suppose you need to support both an Rc version and an Arc version.
How would you do it?

Naive solution

struct FooRc {
    bar: Rc<String>,
}
struct FooArc {
    bar: Arc<String>,
}

There is not much to say about this one.

Macro solution

Doable in theory, but it offers no real advantage.

GAT Solution

What I would like to write is this:

struct Foo<P: Pointer> {
    bar: P<String>,
}

This does not compile. With GATs, it can be written like this:

trait PointerFamily { type Pointer<T>; }
struct RcFamily;  // Just a marker type; could also use e.g. an empty enum 
struct ArcFamily; // Just a marker type; could also use e.g. an empty enum 
impl PointerFamily for RcFamily { type Pointer<T> = Rc<T>; }
impl PointerFamily for ArcFamily { type Pointer<T> = Arc<T>; }

struct Foo<P: PointerFamily> { bar: P::Pointer<String>, }

C++ Solution

For me it is actually easier to understand in C++: just use nested templates.
First, the equivalent version:

template <typename T>
struct Rc {};
template <typename T>
struct Arc {};

struct RcFamily {
    template <typename T>
    using type = Rc<T>;
};

struct ArcFamily {
    template <typename T>
    using type = Arc<T>;
};

template <typename T>
struct PointerFamily {
    template <typename U>
    using type = typename T::template type<U>;
};

template <typename T>
struct Foo {
    typename PointerFamily<T>::template type<std::string> bar;
};

For this particular problem, though, there is a simpler way:
a template template parameter is enough.

template <typename T>
struct Rc {};
template <typename T>
struct Arc {};

template <template <typename> class T>
struct Foo {
    T<std::string> bar;
};
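
To see the template template parameter version do real work, here is a small sketch that plugs in std::shared_ptr and a raw-pointer alias instead of the empty Rc/Arc placeholders. `Holder`, `RawPtr`, and `read_through_shared` are names invented for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>

// Same shape as Foo above: P is a template, not a type.
template <template <typename> class P>
struct Holder {
    P<std::string> bar;
};

// Hypothetical alias playing the role of a second "pointer family".
template <typename T>
using RawPtr = T*;

// The member type follows whichever template was plugged in.
static_assert(std::is_same_v<decltype(Holder<std::shared_ptr>::bar),
                             std::shared_ptr<std::string>>);
static_assert(std::is_same_v<decltype(Holder<RawPtr>::bar), std::string*>);

inline std::string read_through_shared() {
    Holder<std::shared_ptr> h{std::make_shared<std::string>("hello")};
    return *h.bar;
}
```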

More complicated example

trait Mappable {
    type Item;
    type Result<U>;
    fn map<U, P: FnMut(Self::Item) -> U>(self, f: P) -> Self::Result<U>;
}

impl<T> Mappable for Option<T> {
    type Item = T;
    type Result<U> = Option<U>;
    fn map<U, P: FnMut(Self::Item) -> U>(self, f: P) -> Option<U> {
        self.map(f)
    }
}

impl<T, E> Mappable for Result<T, E> {
    type Item = T;
    type Result<U> = Result<U, E>;
    fn map<U, P: FnMut(Self::Item) -> U>(self, f: P) -> Result<U, E> {
        self.map(f)
    }
}

The equivalent C++ version is roughly:

template <class T>
struct Option {
    // Option implementation...
    // The "GAT":

    template <class U>
    using MapResult = Option<U>;

    template <class U, class F>
    Option<U> map(F f) {
        // Apply f to the contents of `this`
    }
};

template <class T>
concept Mappable = requires {
    typename T::template MapResult<int>;
};

template <Mappable T>
typename T::template MapResult<int> zero_to_42(T t) {
    return t.template map<int>([](int x) {
        return x == 0 ? 42 : 0 ;
    });
}

C resource management, defer, unique_ptr, out_ptr, template auto and coroutine

Posted on 2022-12-24 Edited on 2025-08-10

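"Resource" here means more than memory: it could be a FILE, or a handle like the ones ffmpeg uses.
Resource management has always been a hot topic, and there are many ways to handle C resources from C++.
Let's use FILE as the example.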

Do nothing


FILE *fp = fopen(...);
// Do something
fclose(fp);

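This is the most direct approach and requires learning nothing extra, but as the code changes it is easy to forget to release the resource, which is why the other schools of thought exist.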

defer

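The code looks roughly like this. In C++ it is not necessarily called defer; it may be called ScopeGuard or something similar, but the principle is the same.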

FILE *fp = fopen(...);
defer([&]() { fclose(fp); });
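
The defer helper itself is not shown, so here is a minimal sketch of what such a guard might look like. `ScopeGuard` and `run_with_guard` are hypothetical names, and C++17 is assumed for class template argument deduction.

```cpp
#include <cassert>
#include <utility>

// Hypothetical minimal scope guard: runs the stored callable on scope exit.
template <typename F>
class ScopeGuard {
public:
    explicit ScopeGuard(F f) : f_(std::move(f)) {}
    ~ScopeGuard() { f_(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    F f_;
};

// Returns how many times the cleanup ran, to make the guard's effect visible.
inline int run_with_guard() {
    int cleaned = 0;
    {
        ScopeGuard guard([&] { ++cleaned; });
        // ... use the resource here ...
    }  // guard's destructor runs the lambda at this point
    return cleaned;
}
```

The cleanup runs exactly once, when the inner scope ends, regardless of how the scope is left.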

This is fine at small scale, but once there are many resources it becomes verbose, for example:

FILE *fp1 = fopen(...);
defer([&]() { fclose(fp1); });
FILE *fp2 = fopen(...);
defer([&]() { fclose(fp2); });
FILE *fp3 = fopen(...);
defer([&]() { fclose(fp3); });

This is where C++ RAII comes in. Since shared_ptr costs more resources, the solutions here are all based on unique_ptr.

naive unique_ptr solution

Write a wrapper for each resource

struct FILEWrapper {
	FILE* f;
	FILEWrapper(FILE *file) : f(file) {}
	~FILEWrapper() { if (f) fclose(f); }
};
std::unique_ptr<FILEWrapper> fp;

Nothing wrong with it, except that it is a lot of work: every new kind of resource needs its own wrapper. Is there another option?

unique_ptr with custom destruction

Again using FILE as the example, add a function object

#include <stdio.h>
#include <memory>
struct FileCloser {
        void operator()(FILE *f) {
                if (f) fclose(f);
        };
};
std::unique_ptr<FILE, FileCloser> fp;

This does not look much different from the above.
Another way is

std::unique_ptr<FILE, int (*)(FILE *)> fp(fopen(...), fclose);

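This approach is even worse than the previous one: the deleter is stored as a function pointer inside every unique_ptr and has to be passed again at each construction.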

out_ptr

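Although unrelated to the above, this is also part of the unique_ptr story, so I mention it here.
Because of how some APIs are designed, they take a pointer to a pointer as an output parameter,
so code may end up looking like this: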

std::unique_ptr<ITEMIDLIST_ABSOLUTE, CoTaskMemFreeDeleter> pidl;
ITEMIDLIST_ABSOLUTE* rawPidl;
hr = SHGetIDListFromObject(item, &rawPidl);
pidl.reset(rawPidl); 
if (FAILED(hr)) 
	return hr;

This is exactly the use case for out_ptr:

std::unique_ptr<ITEMIDLIST_ABSOLUTE, CoTaskMemFreeDeleter> pidl;
hr = SHGetIDListFromObject(item, std::out_ptr(pidl));
if (FAILED(hr)) 
	return hr;

Although this only entered the standard library in C++23, you can already try it today with
GitHub - soasis/out_ptr: Repository for a C++11 implementation of std::out_ptr (p1132), as a standalone library!

template auto

Since C++17, the rules for non-type template parameters have been relaxed (template <auto>),
which makes code like this possible:

template <auto destroy>
struct c_resource {
};
c_resource<fclose> fp;
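
To make the empty c_resource skeleton above concrete, the sketch below uses template <auto> to turn a plain C-style destroy function into a stateless unique_ptr deleter. `Thing`, `thing_create`, `thing_destroy`, `FnDeleter`, and `destroyed_in_scope` are all made up for this illustration; a real C API such as fopen/fclose would plug in the same way.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style API used only for this illustration.
struct Thing { int value; };
inline int g_destroyed = 0;
inline Thing* thing_create(int v) { return new Thing{v}; }
inline void thing_destroy(Thing* t) { ++g_destroyed; delete t; }

// The destroy function is a template argument, so the deleter carries no state.
template <auto Fn>
struct FnDeleter {
    template <typename T>
    void operator()(T* p) const { Fn(p); }
};

using ThingPtr = std::unique_ptr<Thing, FnDeleter<thing_destroy>>;

// Returns how many Things were destroyed while the inner scope was left.
inline int destroyed_in_scope() {
    const int before = g_destroyed;
    {
        ThingPtr p(thing_create(7));
    }  // thing_destroy is called here through FnDeleter
    return g_destroyed - before;
}
```

Unlike the function-pointer deleter shown earlier, nothing has to be passed at construction time, because the function is part of the type.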

Combined with C++20 concepts, it becomes a powerful tool.
The following is excerpted from Meeting C++ 2022:

template <typename T, auto * ConstructFunction, auto * DestructFunction>
struct c_resource {
	using pointer       = T *;
	using const_pointer = std::add_const_t<T> *;
	using element_type  = T;

private:
	using Constructor = decltype(ConstructFunction);
	using Destructor  = decltype(DestructFunction);

	static_assert(std::is_function_v<std::remove_pointer_t<Constructor>>,
	              "I need a C function");
	static_assert(std::is_function_v<std::remove_pointer_t<Destructor>>,
	              "I need a C function");

	static constexpr Constructor construct = ConstructFunction;
	static constexpr Destructor destruct   = DestructFunction;
	static constexpr T * null              = c_resource_null_value<T>;

	struct construct_t {};

public:
	static constexpr construct_t constructed = {};

	[[nodiscard]] constexpr c_resource() noexcept = default;
	[[nodiscard]] constexpr explicit c_resource(construct_t) noexcept
	    requires std::is_invocable_r_v<T *, Constructor>
	: ptr_{ construct() } {}

	template <typename... Ts>
	    requires(sizeof...(Ts) > 0 && std::is_invocable_r_v<T *, Constructor, Ts...>)
	[[nodiscard]] constexpr explicit(sizeof...(Ts) == 1)
	    c_resource(Ts &&... Args) noexcept
	: ptr_{ construct(static_cast<Ts &&>(Args)...) } {}

	template <typename... Ts>
	    requires(sizeof...(Ts) > 0 &&
	             requires(T * p, Ts... Args) {
		             { construct(&p, Args...) } -> std::same_as<void>;
	             })
	[[nodiscard]] constexpr explicit(sizeof...(Ts) == 1)
	    c_resource(Ts &&... Args) noexcept
	: ptr_{ null } {
		construct(&ptr_, static_cast<Ts &&>(Args)...);
	}

	template <typename... Ts>
	    requires(std::is_invocable_v<Constructor, T **, Ts...>)
	[[nodiscard]] constexpr auto emplace(Ts &&... Args) noexcept {
		_destruct(ptr_);
		ptr_ = null;
		return construct(&ptr_, static_cast<Ts &&>(Args)...);
	}

	[[nodiscard]] constexpr c_resource(c_resource && other) noexcept {
		ptr_       = other.ptr_;
		other.ptr_ = null;
	};
	constexpr c_resource & operator=(c_resource && rhs) noexcept {
		if (this != &rhs) {
			_destruct(ptr_);
			ptr_     = rhs.ptr_;
			rhs.ptr_ = null;
		}
		return *this;
	};
	constexpr void swap(c_resource & other) noexcept {
		auto ptr   = ptr_;
		ptr_       = other.ptr_;
		other.ptr_ = ptr;
	}

	static constexpr bool destructible =
	    std::is_invocable_v<Destructor, T *> || std::is_invocable_v<Destructor, T **>;

	constexpr ~c_resource() noexcept = delete;
	constexpr ~c_resource() noexcept
	    requires destructible
	{
		_destruct(ptr_);
	}
	constexpr void clear() noexcept
	    requires destructible
	{
		_destruct(ptr_);
		ptr_ = null;
	}
	constexpr c_resource & operator=(std::nullptr_t) noexcept {
		clear();
		return *this;
	}

	[[nodiscard]] constexpr explicit operator bool() const noexcept {
		return ptr_ != null;
	}
	[[nodiscard]] constexpr bool empty() const noexcept { return ptr_ == null; }
	[[nodiscard]] constexpr friend bool have(const c_resource & r) noexcept {
		return r.ptr_ != null;
	}

	auto operator<=>(const c_resource &) = delete;
	[[nodiscard]] bool operator==(const c_resource & rhs) const noexcept {
		return 0 == std::memcmp(ptr_, rhs.ptr_, sizeof(T));
	}

#if defined(__cpp_explicit_this_parameter)
	template <typename U, typename V>
	static constexpr bool less_const = std::is_const_v<U> < std::is_const_v<V>;
	template <typename U, typename V>
	static constexpr bool similar = std::is_same_v<std::remove_const_t<U>, T>;

	template <typename U, typename Self>
	    requires(similar<U, T> && !less_const<U, Self>)
	[[nodiscard]] constexpr operator U *(this Self && self) noexcept {
		return std::forward_like<Self>(self.ptr_);
	}
	[[nodiscard]] constexpr auto operator->(this auto && self) noexcept {
		return std::forward_like<decltype(self)>(self.ptr_);
	}
	[[nodiscard]] constexpr auto get(this auto && self) noexcept {
		return std::forward_like<decltype(self)>(self.ptr_);
	}
#else
	[[nodiscard]] constexpr operator pointer() noexcept { return like(*this); }
	[[nodiscard]] constexpr operator const_pointer() const noexcept {
		return like(*this);
	}
	[[nodiscard]] constexpr pointer operator->() noexcept { return like(*this); }
	[[nodiscard]] constexpr const_pointer operator->() const noexcept {
		return like(*this);
	}
	[[nodiscard]] constexpr pointer get() noexcept { return like(*this); }
	[[nodiscard]] constexpr const_pointer get() const noexcept { return like(*this); }

private:
	static constexpr auto like(c_resource & self) noexcept { return self.ptr_; }
	static constexpr auto like(const c_resource & self) noexcept {
		return static_cast<const_pointer>(self.ptr_);
	}

public:
#endif

	constexpr void reset(pointer ptr = null) noexcept {
		_destruct(ptr_);
		ptr_ = ptr;
	}

	constexpr pointer release() noexcept {
		auto ptr = ptr_;
		ptr_     = null;
		return ptr;
	}

	template <auto * CleanupFunction>
	struct guard {
		using cleaner = decltype(CleanupFunction);

		static_assert(std::is_function_v<std::remove_pointer_t<cleaner>>,
		              "I need a C function");
		static_assert(std::is_invocable_v<cleaner, pointer>, "Please check the function");

		constexpr guard(c_resource & Obj) noexcept
		: ptr_{ Obj.ptr_ } {}
		constexpr ~guard() noexcept {
			if (ptr_ != null)
				CleanupFunction(ptr_);
		}

	private:
		pointer ptr_;
	};

private:
	constexpr static void _destruct(pointer & p) noexcept
	    requires std::is_invocable_v<Destructor, T *>
	{
		if (p != null)
			destruct(p);
	}
	constexpr static void _destruct(pointer & p) noexcept
	    requires std::is_invocable_v<Destructor, T **>
	{
		if (p != null)
			destruct(&p);
	}

	pointer ptr_ = null;
};

This fixes almost all the pain points mentioned above.
Using it only takes

c_resource<FILE, fopen, fclose> fp;

This is the most general solution I have seen so far.

Coroutine solution

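This approach takes a different path. The RAII solutions all put the resource release in a destructor.
Since C++20 introduced coroutines, a new possibility has opened up.
Usage looks roughly like this: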

co_resource<FILE*> usage() {
  FILE *fp = fopen(...);
  co_yield fp;
  fclose(fp);
}

void foo() {
  co_resource<FILE*> r = usage();
  // Do something
}

Reference

Allocator in C++

Posted on 2022-11-06 Edited on 2025-08-10

Allocator for C++11

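The simplest allocator one can write that satisfies the C++11 Allocator requirements.
Note

allocate and deallocate here do not call constructors/destructors; they only allocate memory. For simplicity, malloc/free are used directly.
Two allocators can be compared. If they are equal, memory allocated through A can be deallocated through B.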

#include <cstdlib>

template <typename T>
class Minallocator {
public:
  using value_type = T;

  T* allocate(size_t num) { return allocate(num, nullptr); }
  T* allocate(size_t num, const void* hint) { return reinterpret_cast<T*>(std::malloc(sizeof(T) * num)); }
  void deallocate(T* ptr, size_t num) { std::free(ptr); }
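  Minallocator() = default;
  // Converting constructor so containers can rebind the allocator to other types
  template <typename U>
  Minallocator(const Minallocator<U>&) noexcept {}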
  ~Minallocator() = default;
  Minallocator(const Minallocator&)            = default;
  Minallocator(Minallocator&&)                 = default;
  Minallocator& operator=(const Minallocator&) = default;
  Minallocator& operator=(Minallocator&&)      = default;
};

template <typename T1, typename T2>
bool operator==(const Minallocator<T1>& lhs,const Minallocator<T2>& rhs)
{
        return true;
}

template <typename T1, typename T2>
bool operator!=(const Minallocator<T1>& lhs, const Minallocator<T2>& rhs)
{
        return false;
}

To use your own allocator, write:

std::vector<int, Minallocator<int>> v;
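
To confirm that the container really goes through a custom allocator, here is a self-contained sketch that counts allocations. `CountingAllocator` and `sum_with_custom_allocator` are names invented for this example; it follows the same malloc/free approach as Minallocator but is not the author's class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical allocator that counts how often allocate() is called.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        void* p = std::malloc(sizeof(T) * n);
        if (!p) throw std::bad_alloc();
        ++allocations();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) noexcept { std::free(p); }

    // One counter per element type.
    static int& allocations() {
        static int count = 0;
        return count;
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

inline int sum_with_custom_allocator() {
    std::vector<int, CountingAllocator<int>> v;
    for (int i = 1; i <= 4; ++i) v.push_back(i);
    int sum = 0;
    for (int x : v) sum += x;
    return sum;
}
```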

std::scoped_allocator_adaptor

Rarely used; I will cover it when I actually need it.

rebind

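Given an allocator for type T, we want an allocator for type U that follows the same strategy.
In other words, we want U to be allocated in the same way.
It can be obtained through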

allocator<U> = allocator<T>::rebind<U>::other

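Therefore

std::allocator<T>::rebind<U>::other is equivalent to std::allocator<U>
Myallocator<T>::rebind<U>::other is equivalent to Myallocator<U>

Note that the rebind member of std::allocator was deprecated in C++17 and removed in C++20; std::allocator_traits<A>::rebind_alloc<U> is the portable spelling.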

The implementation in libstdc++:

template <typename _Tp1>
    struct rebind
    {
      typedef allocator<_Tp1> other;
    };
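
A short sketch of the same rebinding done through std::allocator_traits, which also works for allocators that define no rebind member at all. `PlainAllocator` and the `Rebound` alias are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical allocator with no rebind member of its own.
template <typename T>
struct PlainAllocator {
    using value_type = T;
    T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>{}.deallocate(p, n); }
};

// allocator_traits falls back to swapping the first template argument
// when the allocator does not provide rebind itself.
template <typename Alloc, typename U>
using Rebound = typename std::allocator_traits<Alloc>::template rebind_alloc<U>;

static_assert(std::is_same_v<Rebound<std::allocator<int>, double>,
                             std::allocator<double>>);
static_assert(std::is_same_v<Rebound<PlainAllocator<int>, double>,
                             PlainAllocator<double>>);
```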

Problem with allocators and containers

Code like this is a problem:

vector<int, Minallocator<int>>  pool_vec  { 1, 2, 3, 4 };
vector<int, Other_allocator<int>> other_vec { };

other_vec = pool_vec;    // ERROR!

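Because the two have different allocator types, the containers themselves are different types and cannot be assigned directly. If both use the same allocator type it works, which is the original motivation for C++17 PMR.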

C++17 Polymorphic Memory Resource

The newly introduced memory_resource is an abstract class; different instances can behave differently.
So you can do this:

// define allocation behaviour via a custom "memory_resource"
class my_memory_resource : public std::pmr::memory_resource { ... };
my_memory_resource mem_res;
auto my_vector = std::pmr::vector<int>(0, &mem_res);

// define a second memory resource
class other_memory_resource : public std::pmr::memory_resource { ... };
other_memory_resource mem_res_other;
auto my_other_vector = std::pmr::vector<int>(0, &mem_res_other);

auto vec = my_vector; // type is std::pmr::vector<int>
vec = my_other_vector; // this is ok -
      // my_vector and my_other_vector have same type
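
Since the custom memory_resource classes above are elided, here is a runnable sketch that uses two standard std::pmr::monotonic_buffer_resource instances instead. The function name `pmr_vectors_share_a_type` and the 256-byte buffers are choices made only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

inline bool pmr_vectors_share_a_type() {
    std::array<std::byte, 256> buf_a{};
    std::array<std::byte, 256> buf_b{};
    std::pmr::monotonic_buffer_resource res_a(buf_a.data(), buf_a.size());
    std::pmr::monotonic_buffer_resource res_b(buf_b.data(), buf_b.size());

    std::pmr::vector<int> a(&res_a);
    a.assign({1, 2, 3});
    std::pmr::vector<int> b(&res_b);

    // Both are std::pmr::vector<int>, so assignment compiles even though
    // the two vectors draw their memory from different resources.
    b = a;

    // The elements are copied; b keeps allocating from its own resource,
    // because polymorphic_allocator does not propagate on copy assignment.
    return b == a && b.get_allocator().resource() == &res_b;
}
```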

Reference

Reflection in C++

Posted on 2022-10-15 Edited on 2025-08-10

Principle

#include <iostream>
#include <string>
struct Test {
        int index;
        std::string name;
        void printInfo() const {
                std::cout << "index: " << index << ", name: " << name << "\n";
        }
};
int main()
{
        Test test;
        test.index = 1;
        test.name = "test_1";
        test.printInfo();
        auto index_addr = &Test::index;
        auto name_addr = &Test::name;
        auto fun_print_addr = &Test::printInfo;
        test.*index_addr = 2;
        test.*name_addr = "test_2";
        (test.*fun_print_addr)();
        return 0;
};

Through the pointers to members above (index_addr, name_addr, fun_print_addr), we can operate on the object.
Reflection mainly consists of two parts:

Metadata generation
Information about a C++ object is called metadata, as in the example above. The hard part here is reducing the amount of work.
Metadata Reflection
Once we have the metadata, how do we connect it to real-world usage?
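
The two parts can be sketched by hand with pointers to members: a table of (name, member pointer) pairs is the metadata, and a generic function that walks the table is the reflection step. `Record`, `Field`, `record_fields`, and `dump` are hypothetical names; real libraries generate the table with macros or tooling instead of writing it manually.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <tuple>

struct Record {
    int index;
    std::string name;
};

// One metadata entry: a field name plus a pointer to that member.
template <typename Class, typename Member>
struct Field {
    const char* name;
    Member Class::* ptr;
};

// Metadata generation, written by hand for this sketch.
inline constexpr auto record_fields = std::make_tuple(
    Field<Record, int>{"index", &Record::index},
    Field<Record, std::string>{"name", &Record::name});

// Metadata reflection: visit every field without naming it in this function.
inline std::string dump(const Record& r) {
    std::ostringstream out;
    std::apply(
        [&](const auto&... f) { ((out << f.name << "=" << r.*(f.ptr) << ";"), ...); },
        record_fields);
    return out.str();
}
```

Calling `dump(Record{1, "a"})` produces `index=1;name=a;`.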

Although there is no official standard yet, there are currently two major schools

Handcrafted

Whatever cannot be done otherwise, do it with a macro.
Take Boost Describe as an example:

struct X
{
    int m1;
    int m2;
};
BOOST_DESCRIBE_STRUCT(X, (), (m1, m2))

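Other macro-based solutions are similar: define another macro that automatically generates metadata like the above.
The problems here are:

You have to keep two copies of the data consistent
Macros everywhere
Hard to modify (it is all macro black magic, so adding a feature means operating on the macros)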

libclang

The other school relies on libclang for generation: parse the C++ AST and generate the required API.
For example:

class MyClass
{
public:
	int field = 0;
	static int static_field;
	void method();
	static void static_method();
};

The generated metadata can be used like this:

reflang::Class<MyClass> metadata;
MyClass c;

// Modify / use c's 'field'.
reflang::Reference ref = metadata.GetField(c, "field");
ref.GetT<int>() = 10;

// Modify / use 'static_field'.
ref = metadata.GetStaticField("static_field");
ref.GetT<int>() = 10;

// Execute 'method()'.
auto methods = metadata.GetMethod("method");
(*methods[0])(c);

// Execute 'static_method()'.
auto static_methods = metadata.GetStaticMethod("static_method");
(*static_methods[0])();

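The problems with this approach are:

It requires libclang
It adds a build step: all files must be scanned to generate the needed headers/sources, and the Makefile/CMakeLists.txt must be modified to adjust the build flow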

Reflection API in the future

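Although existing reflection libraries are as numerous as mountains, it is hard to please everyone: some are designed for a specific purpose and cannot cover other uses, while others are feature-complete but hard to use.
So some people want to tackle it at the language level and make it part of the C++ standard: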

template <class T>
void print_type() {
    std::cout << "void "
              << get_name_v<reflexpr(print_type<T>)> // guaranteed "print_type"
              << "() [with T = "
              << get_display_name_v<reflexpr(T)> 
              << "]" << std::endl;
}

reflexpr, like decltype, is type-based, so it can be used in type-based metaprogramming.
Whether it becomes standard is another matter.
As with the networking library, use a mature solution until it is standardized.

Reference

How to write comparison operator for custom type in C++

Posted on 2022-09-09 Edited on 2025-08-10

How to write comparison operator for custom type

The simple case

Suppose we have a class


struct Value {
	int v;
};

How do we make the following code work?

Value v1, v2;
v1 < v2;

There are several ways

Naive solution

One is to write them as member functions,
spelling out every comparison operator by hand

struct Value {
	int v;
	bool operator<(const Value &rhs) const { return v < rhs.v; }
	bool operator==(const Value &rhs) const { return v == rhs.v; }
	// ... remaining operators omitted
};

The other is to write them as free functions

bool operator<(const Value &lhs, const Value &rhs) { return lhs.v < rhs.v; }
bool operator==(const Value &lhs, const Value &rhs) { return lhs.v == rhs.v; }

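Both work on the same principle; pick whichever fits the situation. What we want to discuss here is a different problem.
When we need to support more operators, we have to write more functions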


bool operator>(const Value &lhs, const Value &rhs);
bool operator==(const Value &lhs, const Value &rhs);
bool operator!=(const Value &lhs, const Value &rhs);

If we need to support another type

struct Value1 {
    int v;
	int v1;
};

then we get yet another pile of copy-paste-and-hand-edit

bool operator<(const Value1 &lhs, const Value1 &rhs);
bool operator>(const Value1 &lhs, const Value1 &rhs);
bool operator==(const Value1 &lhs, const Value1 &rhs);
bool operator!=(const Value1 &lhs, const Value1 &rhs);

Tedious to write, and there is nothing clever about it

CRTP solution

Some operators can be expressed in terms of others; for example, not-equal is just not + equal.
So we can use the CRTP technique to reduce our code

template<class Derived>
struct Equality {
        bool operator!=(const Equality &rhs) const {
                return !(static_cast<const Derived&>(*this) == static_cast<const Derived&>(rhs));
        }
};

struct Value : Equality<Value> {
        int v;
        bool operator==(const Value &rhs) const { return v == rhs.v; }
};

struct Value1 : Equality<Value1> {
        int v;
        int v1;
        bool operator==(const Value1 &rhs) const { return v == rhs.v && v1 == rhs.v1; }
};

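The other operators can be handled the same way; many C++ graphics/math libraries use this technique.
Implementing only < and == is enough to derive the other four comparisons.
It is not very intuitive, though; CRTP is a kind of hack. Is there a better way?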
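
As a sketch of deriving the remaining ordering operators from < with CRTP, the base below injects >, <= and >= as friend functions that are found through ADL. `Ordered` and `Version` are hypothetical names used only here.

```cpp
#include <cassert>

// CRTP base: every operator below is expressed through Derived's operator<.
template <class Derived>
struct Ordered {
    friend bool operator>(const Derived& a, const Derived& b) { return b < a; }
    friend bool operator<=(const Derived& a, const Derived& b) { return !(b < a); }
    friend bool operator>=(const Derived& a, const Derived& b) { return !(a < b); }
};

struct Version : Ordered<Version> {
    int major_no;
    int minor_no;
    Version(int ma, int mi) : major_no(ma), minor_no(mi) {}

    // The only ordering operator written by hand.
    friend bool operator<(const Version& a, const Version& b) {
        if (a.major_no != b.major_no) return a.major_no < b.major_no;
        return a.minor_no < b.minor_no;
    }
};
```

Any other type gets the same three operators by deriving from `Ordered<ThatType>` and supplying its own `operator<`.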

C++20 spaceship operator

The spaceship operator is also called the three-way comparison operator.
It is a C++20 feature; let's go straight to code to explain it

#include <compare>
struct Value {
        int v;
        auto operator<=>(const Value&) const = default;
};

The compiler then generates the comparison code for you. The original comparisons are treated like this:


(a <=> b) < 0  //true if a < b
(a <=> b) > 0  //true if a > b
(a <=> b) == 0 //true if a is equal/equivalent to b

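This is similar to strcmp, which returns one of three outcomes: <0, >0, or 0.
This basically covers 80% of needs, but in life the hard part is always that "but".
If you need a custom comparison, you can define the comparison operator yourself: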

struct Value1 {
	int v;
	int v1;
public:
	auto operator<=>(const Value1& rhs) const {
		if (auto cmp = v <=> rhs.v; cmp != 0)
			return cmp;
		return v1 <=> rhs.v1;
	}
};

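Note that the spaceship operator is expected to return one of std::strong_ordering, std::weak_ordering, or std::partial_ordering.
The differences between the three orderings are not explored here; see the references if needed. In most cases std::strong_ordering is all you need.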

Reference

Customization Point Object, tag_invoke and future

Posted on 2022-08-06 Edited on 2025-08-10

namespace

Because of the C heritage, we run into problems like this:

// my_std.h
void foo(int);
void bar(void);
// other_lib.h
int foo(void);
int baz(int, int);

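These come from different libraries and provide different implementations, so the names collide when both are used.
The C-era solution was to decorate the function names: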

// my_std.h
void my_std_foo(int);
void my_std_bar(void);
// other_lib.h
int other_lib_foo(void);
int other_lib_baz(int, int);

C++ does much the same thing, separating them with namespaces:

// my_std.h
namespace my_std {
	void foo(int);
	void bar(void);
}
// other_lib.h
namespace other_lib {
	int foo(void);
	int baz(int, int);
}

ADL

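The full name is Argument-Dependent Lookup.
As long as one argument's type lives in the function's namespace, the namespace prefix can be omitted at the call site.
ADL only considers functions, not function objects; this will matter later.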

namespace A
{
    struct Empty {};
    void foo(int) {}
    void bar(Empty, int) {}
}

void func()
{
    A::foo(2);
    bar(A::Empty{}, 1);
    std::cout << 1; // operator<< (std::cout, 1) Due to ADL
}

Without ADL, the last line could only be written like this:

std::operator<<(std::cout, 1);

Nobody would enjoy that.

Example for std::swap

This is the best example of the extension problem.

namespace std {
	template<typename T> 
	void swap(T& a, T& b) { 
		T temp(a); 
		a = b; 
		b = temp; 
	}
}

If we want to swap our own class, what should we do?

namespace My {
	class A {
	public:
		void swap(A&) {}
	};
}

The intuitive way is this:

namespace std
{
    template<>
    void swap<::My::A>(::My::A& a, ::My::A& b) {a.swap(b);}
}

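Written this way it is undefined behavior (since C++20, specializing standard library function templates is no longer allowed).
Another way to write it is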

template<>
void std::swap<My::A>(My::A& a, My::A& b) { a.swap(b); }

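But this does not work for My::A<T>, because function templates cannot be partially specialized.
The more common technique is to use ADL: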

void fun(...); // 1 
namespace My { 
	struct A{}; 
	void fun(const A&); // 2 
}
namespace Code { 
	void fun(int); // 3 
	void use() {
		::My::A a;
		fun(a); // HERE 
	} 
}

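When fun(a) is called, 2 and 3 are considered. 1 is not, because a fun has already been found in namespace Code, so lookup does not continue to the enclosing scope.
We use the ADL two-step technique to extend std::swap: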

#include <utility>
namespace My
{
    struct A
    {
        void swap(A&) {}
        friend void swap(A& a, A& b) { a.swap(b); }
    };
    template<typename T>
    struct B
    {
        void swap(B&) {}
        friend void swap(B& a, B& b) { a.swap(b); }
    };
}
namespace Code
{
    void use()
    {
        using std::swap;
        ::My::A a1, a2;
        swap(a1, a2); // HERE #1
        ::My::B<int> b1, b2;
        swap(b1, b2); // HERE #2
        int i1, i2;
        swap(i1, i2); // NOPE
    }
}

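In this example, swap is called without a namespace qualifier, and std::swap is brought into the current scope. If a matching function can be found through ADL, that customized function is used; otherwise std::swap serves as the default.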
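
The two-step can be packaged into a generic helper so the effect is observable. In this sketch `shapes::Box`, `exchange_values`, and the two checking functions are hypothetical names; the counter exists only to show which swap was chosen.

```cpp
#include <cassert>
#include <utility>

namespace shapes {
struct Box {
    int size = 0;
    int custom_swaps = 0;  // counts calls to the ADL-found swap

    friend void swap(Box& a, Box& b) noexcept {
        std::swap(a.size, b.size);
        ++a.custom_swaps;
        ++b.custom_swaps;
    }
};
}  // namespace shapes

template <typename T>
void exchange_values(T& a, T& b) {
    using std::swap;  // step 1: make the default visible
    swap(a, b);       // step 2: unqualified call, so ADL can pick a better match
}

// Returns how often Box's own swap ran, or -1 if the values were not exchanged.
inline int custom_swap_calls() {
    shapes::Box a{1, 0}, b{2, 0};
    exchange_values(a, b);
    return (a.size == 2 && b.size == 1) ? a.custom_swaps : -1;
}

// int has no ADL swap, so the std::swap default is used.
inline bool ints_fall_back_to_std_swap() {
    int x = 1, y = 2;
    exchange_values(x, y);
    return x == 2 && y == 1;
}
```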

Drawback on ADL two-step

The biggest problem is that

using std::swap;
swap(a1, a2);

can easily be miswritten as

std::swap(a1, a2);

This compiles without error; at worst the performance is worse.
Another, bigger problem is this:

namespace __my_std_impl
{
    template<typename T>
    auto __distance_impl(T first, T last) {/* ... */}
    template<typename T>
    auto distance(T first, T last) {return __distance_impl(first, last);}
}
struct incomplete;
template<typename T> struct box {T value;};
void use()
{
    incomplete* i = nullptr; // fine
    __my_std_impl::distance(i, i); // fine
    box<incomplete>* b = nullptr; // fine
    __my_std_impl::distance(b, b); // !!!
}

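An error is reported at __my_std_impl::distance(b, b).
The reason is that the unqualified call to __distance_impl performs ADL, which looks inside box for a function named __distance_impl. That requires instantiating box<incomplete>, whose member has an incomplete type, hence the error.
One possible fix is to qualify the call with the namespace: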

template<typename T>
auto distance(T first, T last) {return __my_std_impl::__distance_impl(first, last);}

Customization Point Object

The biggest problem with two-step ADL is that it is easy to misuse,
so let the standard library do it for you.
The simplest CPO looks like this:

namespace std::ranges {
	inline constexpr auto swap = [](auto& a, auto& b) { 
		using std::swap;
		swap(a, b); 
	};
}

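Here swap is a constexpr object rather than a function, but it is a functor, so it can be used anywhere std::swap would be called.
A CPO has another advantage: because it is an object, it can be used like this: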

some_ranges | views::transform(ranges::begin)

whereas

some_ranges | views::transform(std::begin)

is not legal, because std::begin is a function template, and the name of a function template cannot be passed as a value.
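
The same distinction can be reproduced without ranges. In the sketch below, `mylib::twice_fn`, `mylib::twice`, and `double_all` are hypothetical names: the function template cannot be handed to an algorithm by name, while the CPO-style function object can.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

namespace mylib {
// A function template: the bare name twice_fn is an overload set, not a value.
template <typename T>
T twice_fn(T x) { return x + x; }

// A CPO-style function object: a single value with a templated call operator.
struct twice_t {
    template <typename T>
    T operator()(T x) const { return x + x; }
};
inline constexpr twice_t twice{};
}  // namespace mylib

inline std::vector<int> double_all(std::vector<int> v) {
    // Passing mylib::twice_fn here would not compile, because the template
    // argument cannot be deduced from the bare name. The object works.
    std::transform(v.begin(), v.end(), v.begin(), mylib::twice);
    return v;
}
```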

Niebloids

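Niebloids solve a different problem: removing unwanted ADL candidates.
The way to disable ADL is to make the function a CPO (a function object).
The following is an example from StackOverflow: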

#include <iostream>
#include <type_traits>
namespace mystd
{
    class B{};
    class A{};
    template<typename T>
    void swap(T &a, T &b)
    {
        std::cout << "mystd::swap\n";
    }
}

namespace sx
{
    namespace impl {
       //our functor, the niebloid
        struct __swap {
            template<typename R, typename = std::enable_if_t< std::is_same<R, mystd::A>::value >  >
            void operator()(R &a, R &b) const
            {
                std::cout << "in sx::swap()\n";
                // swap(a, b); 
            }
        };
    }
    inline constexpr impl::__swap swap{};
}

int main()
{
    mystd::B a, b;
    swap(a, b); // calls mystd::swap()

    using namespace sx;
    mystd::A c, d;
    swap(c, d); //No ADL!, calls sx::swap!

    return 0;
}

If name lookup finds a function object, ADL is not performed.

tag_invoke

According to the description in libunifex, it also works through ADL and aims to solve the following two problems:

Each one internally dispatches via ADL to a free function of the same name, which has the effect of globally reserving that identifier (within some constraints). Two independent libraries that pick the same name for an ADL customization point still risk collision.

There is occasionally a need to write wrapper types that ought to be transparent to customization. (Type-erasing wrappers are one such example.) With C++20’s CPOs, there is no way to generically forward customizations through the transparent wrap
The bigger problem is the first one: because functions are found through ADL, the function name effectively becomes reserved in every namespace.

namespace std::ranges {
	inline constexpr auto swap = [](auto& a, auto& b) { 
		using std::swap;
		swap(a, b); 
	};
}
namespace A {
	void swap(...);
}
namespace B {
	void swap(...);
}

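In other words, once you use swap as a CPO, swap has to be treated as a reserved name everywhere else. tag_invoke was created to address exactly this.
Here is duck_invoke, an implementation of tag_invoke for C++11, as a reference: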

#include <bfg/tag_invoke.h>
namespace compute {
BFG_TAG_INVOKE_DEF(formula);
} // namespace compute

template <typename Compute>
float do_compute(const Compute & c, float a, float b)
{
	return compute::formula(c, a, b);
}

struct custom_compute
{
private:
	friend float
		tag_invoke(compute::formula_t, const custom_compute &, float a, float b)
	{
		return a * b;
	}
};

int main()
{
	do_compute(custom_compute{}, 2, 3);
}

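The main idea is:

A CPO is passed as a tag argument; in the example above it is formula
Only one function name, tag_invoke, is needed, but it can be overloaded to handle different CPOs differently
However, tag_invoke creates problems of its own: it is hard to understand and verbose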
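
What the BFG_TAG_INVOKE_DEF macro generates is not shown, so the following is only a hand-written sketch of the mechanism under that assumption, not the library's actual expansion. `compute_lib`, `multiplier`, and `adder` are hypothetical names. The CPO forwards to an unqualified tag_invoke call and passes itself as the tag, so tag_invoke is the only name that needs ADL.

```cpp
#include <cassert>

namespace compute_lib {
// The CPO type: calling it dispatches to tag_invoke found through ADL.
struct formula_t {
    template <typename T>
    auto operator()(const T& c, float a, float b) const
        -> decltype(tag_invoke(*this, c, a, b)) {
        return tag_invoke(*this, c, a, b);
    }
};
inline constexpr formula_t formula{};
}  // namespace compute_lib

// Two user types customize the same CPO through overloads of tag_invoke.
struct multiplier {
    friend float tag_invoke(compute_lib::formula_t, const multiplier&, float a, float b) {
        return a * b;
    }
};

struct adder {
    friend float tag_invoke(compute_lib::formula_t, const adder&, float a, float b) {
        return a + b;
    }
};
```

Because the tag type is part of the overload signature, a second library can define its own CPO with any name without colliding with this one.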

Future

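Because Executors slipped, tag_invoke is not necessarily the final solution either.
There are other proposals, but whether they will be accepted is still up in the air.
For details, look up P2547R0.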

Reference

如何理解 C++ 中的定制点对象这一概念？为什么要这样设计？
c++ execution 与 coroutine （一) : CPO与tag_invoke
C++特殊定制：揭秘cpo与tag_invoke！
Customization Points
Argument-dependent lookup - cppreference.com
Why tag_invoke is not the solution I want (brevzin.github.io)
What is a niebloid?
ADL，Concepts与扩展C++类库带来的思考
 Duck Invoke — tag_invoke for C++11

Macro in Rust

Posted on 2022-06-04 Edited on 2025-08-10

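After writing a pile of C/C++ articles, it is time for a change of pace.
Rust has macros too, but they are not the textual replacement of C/C++.
Rust macros are extremely powerful; along the way I will also compare them with C++ templates.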

Declarative Macros

Starting with min

I will skip the C macro and C++ template versions of min; I have written them too many times.
Let's go straight to Rust:

macro_rules! min {
    ($a:ident, $b:ident) => {
        if ($a < $b) {
            $a
        } else {
            $b
        }
    }
}
fn main() {
    let a = 2u32;
    let b = 3u32;
    println!("{}",  min!(a, b));
}

Nothing special so far.
What if we add one more argument?

min version2

macro_rules! min {
    ($a:ident, $b:ident) => {
        if ($a < $b) {
            $a
        } else {
            $b
        }
    };
    ($a:ident, $b:ident, $c:ident) => {
        if ($a < $b) {
            if ($a < $c) {
                $a
            } else {
                $c
            }
        } else {
            if ($b < $c) {
                $b
            } else {
                $c
            }
        }
    }
}
fn main() {
    let a = 3u32;
    let b = 2u32;
    let c = 1u32;
    println!("{}",  min!(a, b, c));
}

The same macro can be used in two different ways.
The C macro version looks like this:

#define min_2(a, b) (((a) < (b)) ? (a) : (b))
#define min_3(a, b, c) (((a) < (b)) ? (((a) < (c)) ? (a) : (c)) : (((b) < (c)) ? (b) : (c)))
#define GET_MACRO(_1,_2,_3,NAME,...) NAME
#define min(...) GET_MACRO(__VA_ARGS__, min_3, min_2)(__VA_ARGS__)
printf("%d\n", min(3, 2, 1));

It looks like a cobbled-together monster.
Now the template version:

template <typename T>
T min(T a, T b)
{
        return (a < b) ? a : b;
}
template <typename T>
T min(T a, T b, T c)
{
        return (a < b) ? (a < c) ? a : c : (b < c) ? b : c;
}

Thanks to function overloading, readability is much better; the only annoyance is writing the template function declaration twice.

min version3

Now a variadic version. First, one that looks fine but does not actually compile:

macro_rules! min {
    ($a:ident) => { $a };
    ($a:ident, $($b:ident),+) => {
        let minV = min!($($b),+)
        if ($a < minV) {
            $a
        } else {
            minV
        }
    };
}

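I later found that the macro cannot introduce a local variable this way (the expansion would need to be wrapped in its own block, and the let is missing a semicolon), so I changed it to this: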

macro_rules! min {
    ($a:ident) => { $a };
    ($a:ident, $($b:ident),+) => {
        std::cmp::min($a, min!($($b),+))
    };
}

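I then found another difference from the C/C++ preprocessor: because the macro operates directly on the syntax rather than on text, you define yourself how the incoming tokens are parsed.
So as an experiment, separate the arguments with ; instead of , and it is still legal: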

macro_rules! min {
    ($a:ident) => { $a };
    ($a:ident; $($b:ident);+) => {
        std::cmp::min($a, min!($($b);+))
    };
}
fn main() {
    let a = 3u32;
    let b = 2u32;
    let c = 1u32;
    println!("{}",  min!(a; b; c));
}

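Not being able to use a local variable in that form is a bit of a pity.
I could not write the C macro version, so let's go straight to the variadic template version: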

template <typename T, typename... Args>
T min(const T& first, const Args&... args)
{
        if constexpr (sizeof...(Args) == 0) {
                return first;
        } else {
                const auto minV = min(args...);
                return (first < minV) ? first : minV;
        }
}
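
As a memory aid for where the ellipsis goes, here is an alternative sketch of the same function written with a C++17 fold expression instead of recursion. The name `min_of` is made up to avoid clashing with the min above, and the numbered comments mark the three positions of `...`.

```cpp
#include <cassert>

template <typename T, typename... Rest>      // (1) before the pack name when declaring it
constexpr T min_of(T first, Rest... rest) {  // (2) after the pack type in the parameter list
    T result = first;
    // (3) after the pattern when expanding: a fold over the comma operator
    ((result = rest < result ? rest : result), ...);
    return result;
}

static_assert(min_of(3, 2, 1) == 1);
static_assert(min_of(5) == 5);
```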

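More variations are possible, but the biggest problem with variadic templates is that I can never remember where the ... is supposed to go.
What is really impressive about Rust, though, is the second kind of macro.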

Procedural Macros

Basically, it is the process of transforming an input TokenStream into another TokenStream.

There are three kinds

$ cargo new macro-demo --lib

Add the following two lines to Cargo.toml

[lib]
proc-macro = true

Attribute macros

#[proc_macro_attribute]
pub fn sorted(args: TokenStream, input: TokenStream) -> TokenStream {
    let _ = args;
    let _ = input;

    unimplemented!()
}

How to use attribute macro

#[sorted]
enum Letter {
    A,
    B,
    C,
}

Function-like procedural macros

#[proc_macro]
pub fn seq(input: TokenStream) -> TokenStream {
    let _ = input;

    unimplemented!()
}

How to use function-like macro


seq! { n in 0..10 {
    /* ... */
}}

Derive macro helper attributes

#[proc_macro_derive(Builder)]
pub fn derive_builder(input: TokenStream) -> TokenStream {
    let _ = input;

    unimplemented!()
}

How to use derived macro

#[derive(Builder)]
struct Command {
    // ...
}

Caution

Unlike declarative macros, procedural macros must live in their own crate. IDE support for proc macros is currently poor, and debugging them is also a hassle; the most common technique is still print debugging.

Simple example

Learned from someone else's example; here we implement an attribute macro

$ mkdir rust_proc_macro_demo && cd rust_proc_macro_demo
$ mkdir rust_proc_macro_guide && cd rust_proc_macro_guide
$ cargo init --bin
$ cd ..
$ mkdir proc_macro_define_crate && cd proc_macro_define_crate
$ cargo init --lib
$ cd ..

Edit proc_macro_define_crate/Cargo.toml
and add

[lib]
proc-macro = true

[dependencies]
quote = "1"
syn = { version = "1", features = ["full", "extra-traits"] }

Next, edit rust_proc_macro_guide/Cargo.toml

[dependencies]
proc_macro_define_crate = {path="../proc_macro_define_crate"}

Replace the contents of proc_macro_define_crate/src/lib.rs

use proc_macro::TokenStream;

#[proc_macro_attribute]
pub fn mytest_proc_macro(attr: TokenStream, item: TokenStream) -> TokenStream {
    eprintln!("Attr {:#?}", attr);
    eprintln!("Item {:#?}", item);
    item
}

Likewise, replace the contents of rust_proc_macro_guide/src/main.rs

use proc_macro_define_crate::mytest_proc_macro;

#[mytest_proc_macro(HungMingWu)]
fn foo(a:i32){
    println!("hello world");
}

Then check with cargo check

$ cd rust_proc_macro_guide/
$ cargo check

You should see output similar to this:

Attr TokenStream [
    Ident {
        ident: "HungMingWu",
        span: #0 bytes(69..79),
    },
]
Item TokenStream [
    Ident {
        ident: "fn",
        span: #0 bytes(82..84),
    },
    Ident {
        ident: "foo",
        span: #0 bytes(85..88),
    },
    Group {
        delimiter: Parenthesis,
        stream: TokenStream [
            Ident {
                ident: "a",
                span: #0 bytes(89..90),
            },
            Punct {
                ch: ':',
                spacing: Alone,
                span: #0 bytes(90..91),
            },
            Ident {
                ident: "i32",
                span: #0 bytes(91..94),
            },
        ],
        span: #0 bytes(88..95),
    },
    Group {
        delimiter: Brace,
        stream: TokenStream [
            Ident {
                ident: "println",
                span: #0 bytes(101..108),
            },
            Punct {
                ch: '!',
                spacing: Alone,
                span: #0 bytes(108..109),
            },
            Group {
                delimiter: Parenthesis,
                stream: TokenStream [
                    Literal {
                        kind: Str,
                        symbol: "hello world",
                        suffix: None,
                        span: #0 bytes(110..123),
                    },
                ],
                span: #0 bytes(109..124),
            },
            Punct {
                ch: ';',
                spacing: Alone,
                span: #0 bytes(124..125),
            },
        ],
        span: #0 bytes(95..127),
    },
]

This shows which TokenStream corresponds to Attr and which to Item.

From TokenStream to Syntax Tree

Sometimes the lexer-level TokenStream alone is not enough to solve the problem, and we need a syntax tree.
So we modify mytest_proc_macro:

use proc_macro::TokenStream;
use syn::{parse_macro_input, AttributeArgs, Item};
use quote::quote;

#[proc_macro_attribute]
pub fn mytest_proc_macro(attr: TokenStream, item: TokenStream) -> TokenStream {
    eprintln!("Attr {:#?}", parse_macro_input!(attr as AttributeArgs));
    let body_ast = parse_macro_input!(item as Item);
    eprintln!("Item {:#?}", body_ast);
    quote!(#body_ast).into()
}

This produces the following output:

Attr [
    Meta(
        Path(
            Path {
                leading_colon: None,
                segments: [
                    PathSegment {
                        ident: Ident {
                            ident: "HungMingWu",
                            span: #0 bytes(69..79),
                        },
                        arguments: None,
                    },
                ],
            },
        ),
    ),
]
Item Fn(
    ItemFn {
        attrs: [],
        vis: Inherited,
        sig: Signature {
            constness: None,
            asyncness: None,
            unsafety: None,
            abi: None,
            fn_token: Fn,
            ident: Ident {
                ident: "foo",
                span: #0 bytes(85..88),
            },
            generics: Generics {
                lt_token: None,
                params: [],
                gt_token: None,
                where_clause: None,
            },
            paren_token: Paren,
            inputs: [
                Typed(
                    PatType {
                        attrs: [],
                        pat: Ident(
                            PatIdent {
                                attrs: [],
                                by_ref: None,
                                mutability: None,
                                ident: Ident {
                                    ident: "a",
                                    span: #0 bytes(89..90),
                                },
                                subpat: None,
                            },
                        ),
                        colon_token: Colon,
                        ty: Path(
                            TypePath {
                                qself: None,
                                path: Path {
                                    leading_colon: None,
                                    segments: [
                                        PathSegment {
                                            ident: Ident {
                                                ident: "i32",
                                                span: #0 bytes(91..94),
                                            },
                                            arguments: None,
                                        },
                                    ],
                                },
                            },
                        ),
                    },
                ),
            ],
            variadic: None,
            output: Default,
        },
        block: Block {
            brace_token: Brace,
            stmts: [
                Semi(
                    Macro(
                        ExprMacro {
                            attrs: [],
                            mac: Macro {
                                path: Path {
                                    leading_colon: None,
                                    segments: [
                                        PathSegment {
                                            ident: Ident {
                                                ident: "println",
                                                span: #0 bytes(101..108),
                                            },
                                            arguments: None,
                                        },
                                    ],
                                },
                                bang_token: Bang,
                                delimiter: Paren(
                                    Paren,
                                ),
                                tokens: TokenStream [
                                    Literal {
                                        kind: Str,
                                        symbol: "hello world",
                                        suffix: None,
                                        span: #0 bytes(110..123),
                                    },
                                ],
                            },
                        },
                    ),
                    Semi,
                ),
            ],
        },
    },
)

Comparison with C/C++

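To get something similar in C/C++, I can't think of anything other than X-Macros.
But X-Macros are not only ugly, they are also limited in what they can do and even harder to debug.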
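For comparison, here is a minimal X-Macro sketch. The field list COMMAND_FIELDS and its two members are hypothetical, chosen only to echo the Command struct from the derive example. The same list is expanded twice, once to declare members and once to collect their names, which is roughly the extent of what the technique can offer.

```cpp
#include <cassert>
#include <string>

// Hypothetical field list: each X(type, name) entry describes one member.
#define COMMAND_FIELDS(X)      \
    X(std::string, executable) \
    X(int, retries)

// First expansion: declare the struct members.
struct Command {
#define X(type, name) type name{};
    COMMAND_FIELDS(X)
#undef X
};

// Second expansion: collect the member names in declaration order.
inline std::string command_field_names()
{
    std::string out;
#define X(type, name) out += #name; out += ' ';
    COMMAND_FIELDS(X)
#undef X
    return out;
}
```

Unlike a proc macro, the preprocessor never sees a syntax tree: it can only paste the tokens of each entry into a fixed template.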

Reference

– Rust Macro 手册
– Rust宏编程新手指南【Macro】
– Rust 过程宏 101
– The Little Book of Rust Macros
– Rust Latam: procedural macros workshop
– Macros in Rust: A tutorial with examples - LogRocket Blog
– Overloading Macro on Number of Arguments

ScopeExit and Higher Order Function in C++

Posted on 2022-02-27 Edited on 2025-08-10

Story

This story began when I came across code like this:

#define VL_RESTORER(var) \
    const VRestorer<typename std::decay<decltype(var)>::type> restorer_##var(var);

template <typename T> class VRestorer {
    T& m_ref;
    const T m_saved;
public:
    explicit VRestorer(T& permr)
        : m_ref{permr}
        , m_saved{permr} {}
    ~VRestorer() { m_ref = m_saved; }
};

It uses RAII to save the current value of a variable and restore it when the scope ends.
Usage looks like this:

int a = 1, b = 2;
VL_RESTORER(a);
VL_RESTORER(b);
a = 3;
b = 4;

It works fine as it is, but I wanted a problem to practice on.

ScopeExit

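This is basically a variation on RAII: the destructor runs whatever function we need. A quick GitHub search turns up plenty of implementations. Here is about the simplest one: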

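#include <type_traits>
#include <utility>

// Runs the stored callable when the object goes out of scope.
template <typename F>
struct scope_exit
{
    F f;
    ~scope_exit() { f(); }
};

// Stores a decayed copy of the callable, so passing an lvalue does not leave a dangling reference.
// Relies on C++17 guaranteed copy elision; otherwise a temporary's destructor could call f a second time.
template <typename F>
inline scope_exit<std::decay_t<F>> make_scope_exit(F&& f)
{
    return scope_exit<std::decay_t<F>>{std::forward<F>(f)};
}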

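With C++17 class template argument deduction (CTAD), the make_scope_exit helper above is not strictly necessary. Because scope_exit is an aggregate, C++17 needs an explicit deduction guide, while C++20 can deduce aggregates directly.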
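A minimal sketch of the CTAD variant, assuming a C++17 compiler. The names scope_exit_ctad and run_with_guard are hypothetical and exist only so the example stands on its own.

```cpp
#include <cassert>

// Same shape as scope_exit, under a hypothetical name.
template <typename F>
struct scope_exit_ctad
{
    F f;
    ~scope_exit_ctad() { f(); }
};

// Deduction guide: needed in C++17, where aggregates are not deduced automatically.
template <typename F>
scope_exit_ctad(F) -> scope_exit_ctad<F>;

inline int run_with_guard()
{
    int counter = 0;
    {
        scope_exit_ctad guard{[&counter] { ++counter; }};
        assert(counter == 0);  // the callback has not run yet
    }                          // guard's destructor runs the callback here
    return counter;
}
```

The guard is constructed without naming the lambda's type and without a factory function.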

So the goal is this: when the scope ends, restore the saved variables to their original values.
The question is how to do it.

Higher Order Function

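C++ is not a typical functional programming language, but with a few tricks it can be made to work.
The problem becomes: pass in the variables whose state should be saved and get back a function that restores them when called. This uses a variadic template and tuples.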

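#include <tuple>
#include <type_traits>
#include <utility>

// Returns a callable that writes the values captured now back into the variables.
template <typename... Ts>
inline auto restore(Ts&&... ts)
{
    return [restore_ref = std::tuple<std::add_lvalue_reference_t<std::decay_t<Ts>>...>(std::forward<Ts>(ts)...),
            store = std::tuple<std::add_const_t<std::decay_t<Ts>>...>(ts...)]() mutable noexcept
    {
        restore_ref = store;  // element-wise assignment through the stored references
    };
}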

There are two tuples here: restore_ref holds references to all the variables, and store holds their values at this point in time.
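The mechanism can be seen in isolation with the standard helpers std::tie and std::make_tuple. This is only an illustration of how assigning a tuple of values to a tuple of references writes through to the variables; the function name tie_restores_values is hypothetical.

```cpp
#include <cassert>
#include <tuple>

inline bool tie_restores_values()
{
    int a = 1, b = 2;
    const auto saved = std::make_tuple(a, b);  // snapshot: std::tuple<int, int>
    auto refs = std::tie(a, b);                // references: std::tuple<int&, int&>

    a = 3;
    b = 4;
    assert(a == 3 && b == 4);

    refs = saved;  // element-wise assignment through the references
    return a == 1 && b == 2;
}
```

In restore, the lambda captures play the roles of saved and refs, and calling the lambda performs the assignment.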

Combo

With the pieces above, the original example can be written as:

int a = 1, b = 2;
auto _ = make_scope_exit(restore(a, b));
a = 3;
b = 4;

Whether this is better or worse is a matter of taste.

Issue about unsigned integer overflow and underflow

Posted on 2022-01-30 Edited on 2025-08-10

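I wanted to get something written before the Lunar New Year, since it has been too long since my last post.
After seeing the problems caused by unsigned integer overflow and underflow, I think Rust's approach is really good: with a Clippy lint, risky arithmetic can be flagged before the program ever runs.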

#![deny(clippy::integer_arithmetic)]

use std::env;
const PAGE_SIZE: u64 = 4096;
fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    let size: u64 = args[0].parse().unwrap();
    println!("({} - 2 - {}) => {}", PAGE_SIZE, size, PAGE_SIZE - 2 - size);
}

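Then install clippy as a cargo subcommand and run it (recent Clippy releases rename this lint to clippy::arithmetic_side_effects):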

$ cargo clippy
error: integer arithmetic detected
 --> src/main.rs:8:54
  |
8 |     println!("({} - 2 - {}) => {}", PAGE_SIZE, size, PAGE_SIZE - 2 - size);
  |                                                      ^^^^^^^^^^^^^^^^^^^^
  |

Handling it correctly looks roughly like this:

fn foo(len: u64, size: u64) {
  match (PAGE_SIZE - 2).checked_sub(size) {
      Some(capacity) if len > capacity => {
          println!("no capacity left");
      }
      Some(capacity) => {
          println!("sufficient capacity {}", capacity);
      }
      None => {
          println!("underflow! bad user input!");
      }
  }
}

The equivalent in C:

#include <stdio.h>

#define IS_UINT_SUB_UNDERFLOW(x, y) ((x) - (y) > (x))
#define IS_UINT_ADD_OVERFLOW(x, y) ((x) + (y) < (x))
#define IS_UINT_MUL_OVERFLOW(x, y, size_max) ((x) && (y) > (size_max) / (x))
#define PAGE_SIZE 4096u

void foo(unsigned int len, unsigned int size) {
  if (IS_UINT_SUB_UNDERFLOW(PAGE_SIZE - 2, size)) {
    printf("underflow! bad user input!\n");
  } else {
    unsigned int capacity = PAGE_SIZE - 2 - size;
    if (len > capacity) {
      printf("no capacity left\n");
    } else {
      printf("sufficient capacity %u\n", capacity);
    }
  }
}

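However, these macros rely on modular (wrap-around) arithmetic rather than saturating arithmetic. The C standard guarantees wrap-around for unsigned types, so the checks hold for unsigned int and wider types; operands narrower than int are promoted to int first, and the checks no longer work.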
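A check that compares the operands before subtracting avoids depending on wrap-around at all. The sketch below assumes C++17 and mirrors the Rust version; checked_sub, Capacity, classify and kPageSize are hypothetical names modeled on the code above, not part of any library.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

constexpr std::uint64_t kPageSize = 4096;  // mirrors PAGE_SIZE above

// Hypothetical helper modeled on Rust's u64::checked_sub:
// returns std::nullopt instead of wrapping around on underflow.
constexpr std::optional<std::uint64_t> checked_sub(std::uint64_t x, std::uint64_t y)
{
    if (y > x) return std::nullopt;
    return x - y;
}

enum class Capacity { Underflow, Insufficient, Sufficient };

// Same three outcomes as the match expression in the Rust version.
inline Capacity classify(std::uint64_t len, std::uint64_t size)
{
    const auto capacity = checked_sub(kPageSize - 2, size);
    if (!capacity) return Capacity::Underflow;
    return len > *capacity ? Capacity::Insufficient : Capacity::Sufficient;
}
```

The caller still has to remember to use the helper; nothing here stops a plain `-` from being written, which is the gap the Clippy lint closes in Rust.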

Otherwise there are third-party libraries such as Integers, though they are not as easy to use as what Rust offers.

Reference

– Rust: detect unsigned integer underflow