RUST Part2

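Error handling

Error handling covers capturing, propagating, and handling errors. Common approaches are return values (Golang), exceptions (Python, C++), and the type system (Rust, Haskell).

With return values, an error must be handled immediately or passed on explicitly. With exceptions, developers need a good understanding of and control over them, and must keep the cost of exception handling in balance.

Rust follows a type design similar to Haskell's: Option and Result are composite types that wrap the normal return value together with the alternative case, which is the absence of a value for Option and an error value for Result.

The Result type is declared with the must_use attribute: if a value of this type is not explicitly used, the compiler emits a warning.

If you only want to propagate an error rather than handle it on the spot, use the ? operator.

Internally, the ? operator expands to code similar to this: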

match result {
  Ok(v) => v,
  Err(e) => return Err(e.into())
}
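
The expansion above can be seen in action in a minimal sketch. The names ConfigError and parse_port are hypothetical and exist only for this illustration; the From impl is what allows ? to call e.into() on the way out.

```rust
use std::num::ParseIntError;

// Hypothetical error type used only for this illustration.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    InvalidNumber(String),
    OutOfRange(u32),
}

// `?` converts the error through this impl before returning it.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::InvalidNumber(e.to_string())
    }
}

pub fn parse_port(s: &str) -> Result<u16, ConfigError> {
    // On failure, the ParseIntError is turned into a ConfigError and returned.
    let n: u32 = s.trim().parse()?;
    if n == 0 || n > 65535 {
        return Err(ConfigError::OutOfRange(n));
    }
    Ok(n as u16)
}
```

Calling parse_port("8080") yields Ok(8080), while a non-numeric input comes back as ConfigError::InvalidNumber without any explicit match in the function body.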

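Rust also provides many helper functions on Option and Result, such as map (transforms Ok and Some), map_err (transforms Err), and and_then (processes Ok and Some with a step that may itself produce a new Err or None).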
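
As a rough sketch of how these helpers chain together, consider the following. The functions double_checked and first_char_upper are hypothetical names made up for this example.

```rust
// Parse a number, double it, and reject results above a limit.
pub fn double_checked(input: &str) -> Result<i32, String> {
    input
        .parse::<i32>()
        // map_err: transform only the Err side.
        .map_err(|e| format!("parse error: {}", e))
        // map: transform only the Ok side.
        .map(|n| n * 2)
        // and_then: continue with a step that may itself fail.
        .and_then(|n| {
            if n > 100 {
                Err(format!("{} is too large", n))
            } else {
                Ok(n)
            }
        })
}

// The same helpers exist on Option: map touches only Some.
pub fn first_char_upper(s: &str) -> Option<char> {
    s.chars().next().map(|c| c.to_ascii_uppercase())
}
```

Each step runs only if the previous one succeeded, so the first failure short-circuits the rest of the chain.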

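Severe errors

Use panic! and catch_unwind for errors that cannot be recovered from, or that you do not want to recover from. For example, if a protocol variable is set incorrectly, the best approach is to panic! right away (though you still need to consider whether the system should tolerate some errors), so that the error surfaces immediately and can be fixed.

catch_unwind plays a role similar to try {…} catch {…} in other languages, although it only works when panics unwind (it cannot catch anything when the program is built with panic = "abort").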

use std::panic;

fn main() {
    let result = panic::catch_unwind(|| {
        println!("hello!");
    });
    assert!(result.is_ok());
    let result = panic::catch_unwind(|| {
        panic!("oh no!");  // exception
    });
    assert!(result.is_err());
    println!("panic captured: {:#?}", result);
}

If you wrap your entire Rust code in the closure passed to catch_unwind(), any panic! raised by that code, including code in third-party crates, is caught and converted into a Result.

Error type conversion

Rust defines the Error trait:

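pub trait Error: Debug + Display {
    fn source(&self) -> Option<&(dyn Error + 'static)> { ... }
    // Nightly-only when these notes were written; not available on stable Rust
    fn backtrace(&self) -> Option<&Backtrace> { ... }
    // Deprecated: use the Display implementation instead
    fn description(&self) -> &str { ... }
    // Deprecated: use source() instead
    fn cause(&self) -> Option<&dyn Error> { ... }
}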
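
To make the trait concrete, here is a minimal std-only sketch of a hand-written error type. LoadError and load are hypothetical names for illustration; only Display, Debug, and (optionally) source() need to be provided.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type wrapping a lower-level error.
#[derive(Debug)]
pub enum LoadError {
    BadNumber(ParseIntError),
    Empty,
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LoadError::BadNumber(_) => write!(f, "input is not a valid number"),
            LoadError::Empty => write!(f, "input is empty"),
        }
    }
}

impl Error for LoadError {
    // Expose the underlying cause so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            LoadError::BadNumber(e) => Some(e),
            LoadError::Empty => None,
        }
    }
}

// Lets `?` convert a ParseIntError into a LoadError.
impl From<ParseIntError> for LoadError {
    fn from(e: ParseIntError) -> Self {
        LoadError::BadNumber(e)
    }
}

pub fn load(input: &str) -> Result<i64, LoadError> {
    if input.is_empty() {
        return Err(LoadError::Empty);
    }
    Ok(input.parse::<i64>()?)
}
```

The Display, Error, and From impls here are the kind of boilerplate that error-handling crates are typically used to generate.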

thiserror and anyhow can simplify working with custom error types.

For example, with thiserror:

use thiserror::Error;
#[derive(Error, Debug)]
#[non_exhaustive]
pub enum DataStoreError {
    #[error("data store disconnected")]
    Disconnect(#[from] std::io::Error),
    #[error("the data for key `{0}` is not available")]
    Redaction(String),
    #[error("invalid header (expected {expected:?}, found {found:?})")]
    InvalidHeader {
        expected: String,
        found: String,
    },
    #[error("unknown data store error")]
    Unknown,
}

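anyhow, in turn, implements conversion from any type that satisfies the Error trait into anyhow::Error, so you can use the ? operator without converting error types by hand. anyhow also makes it easy to raise ad hoc errors without defining an error type, but overusing this ability is discouraged.

Closures

A closure is an anonymous type: declaring one creates a new type that cannot be named anywhere else. The type behaves like a struct containing all captured variables. Its size depends only on what it captures, not on its parameters or on the local variables inside its body.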
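
The size claim can be checked with std::mem::size_of_val. The sketch below uses a hypothetical helper, closure_sizes; the exact layout of closures is not formally guaranteed by the language, so treat the numbers as what current compilers produce in practice.

```rust
use std::mem::size_of_val;

pub fn closure_sizes() -> (usize, usize, usize, usize) {
    let name = String::from("rust");
    let n: u64 = 7;

    // Captures nothing: zero-sized, regardless of its parameters.
    let c0 = |a: u64, b: u64| a + b;
    // Captures `name` by reference: holds one reference.
    let c1 = || name.len();
    // `move` captures `n` by value: holds one u64.
    let c2 = move || n + 1;
    // A large local variable in the body does not affect the closure's size.
    let c3 = || {
        let buf = [0u8; 1024];
        buf.len()
    };

    let sizes = (
        size_of_val(&c0),
        size_of_val(&c1),
        size_of_val(&c2),
        size_of_val(&c3),
    );
    // Call each closure once so that all of them are actually used.
    let _ = (c0(1, 2), c1(), c2(), c3());
    sizes
}
```

On a typical 64-bit target the four sizes are 0, 8, 8, and 0 bytes.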

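The order of capture affects the order in which the closure stores its variables. However, the Rust compiler optimizes struct memory layout, so the actual order in memory is not necessarily the order shown in the debugger.

Closures are stored on the stack (unlike in Golang/Python/Java, which allocate them on the heap), and apart from the captured data, the closure itself contains no extra function pointer to its code.

Closures in Golang/Python/Java incur extra heap allocation, potential dynamic dispatch (the closure is handled as a function pointer), and extra memory reclamation.

A closure handed to an API such as std::thread::spawn typically uses move, and the data moved into it must satisfy Send. Because the closure then owns its data rather than borrowing it, it can meet the 'static lifetime bound.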
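
A small sketch of this with std::thread::spawn, whose closure parameter is bounded by FnOnce() -> T + Send + 'static. The function name sum_in_thread is hypothetical.

```rust
use std::thread;

pub fn sum_in_thread(data: Vec<u32>) -> u32 {
    // `move` transfers ownership of `data` into the closure, so the closure
    // borrows nothing from this stack frame and can outlive it.
    let handle = thread::spawn(move || data.iter().sum::<u32>());
    handle.join().expect("worker thread panicked")
}
```

Without move, the closure would borrow data from the caller's stack frame and the compiler would reject it, since the spawned thread may outlive that frame.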

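Rust nevertheless performs almost the same as C written in an imperative style. Besides compiler optimizations, this is because Rust closures perform about as well as plain functions.

Rust closures are very efficient. First, the variables a closure captures are stored on the stack, with no heap allocation. Second, a closure implicitly creates its own type when it is created, so every closure is a new type. Because each closure has its own unique type, Rust needs no extra function pointer to run it, and calling a closure costs almost the same as calling a function.

Closure types

FnOnce

It is defined as follows: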

pub trait FnOnce<Args> {
    type Output;
    extern "rust-call" fn call_once(self, args: Args) -> Self::Output;
}

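FnOnce has an associated type Output, which is the closure's return type, and a method call_once. Note that the first parameter of call_once is self, which moves ownership of self into call_once, so the closure can be called only once.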
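
A minimal sketch of a closure that is only FnOnce. The helper names call_once_with and demo_fn_once are hypothetical.

```rust
// Accepts any closure that can be called at least once.
pub fn call_once_with<F: FnOnce() -> String>(f: F) -> String {
    f()
}

pub fn demo_fn_once() -> String {
    let greeting = String::from("hello");
    // The body moves `greeting` out of the closure, so this closure
    // implements only FnOnce; calling it a second time would not compile.
    let consume = move || greeting;
    call_once_with(consume)
}
```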

FnMut

It is defined as follows:

pub trait FnMut<Args>: FnOnce<Args> {
    extern "rust-call" fn call_mut(
        &mut self, 
        args: Args
    ) -> Self::Output;
}

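FnMut "inherits" from FnOnce; in other words, FnOnce is a supertrait of FnMut. FnMut therefore also has the Output associated type and the call_once method. In addition, it has a call_mut() method, which takes &mut self and does not move self, so an FnMut can be called multiple times.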
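
A short sketch of a closure that mutates its captured state and is therefore FnMut. The names repeat and demo_fn_mut are hypothetical.

```rust
// Calls the closure `n` times; `mut f` is required because calling an
// FnMut closure needs mutable access to it.
pub fn repeat<F: FnMut() -> u32>(mut f: F, n: usize) -> Vec<u32> {
    (0..n).map(|_| f()).collect()
}

pub fn demo_fn_mut() -> Vec<u32> {
    let mut counter = 0;
    // Mutates captured state on every call, so it is FnMut but not Fn.
    let next = || {
        counter += 1;
        counter
    };
    repeat(next, 3)
}
```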

Fn

It is defined as follows:

pub trait Fn<Args>: FnMut<Args> {
    extern "rust-call" fn call(&self, args: Args) -> Self::Output;
}

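As you can see, it "inherits" from FnMut; in other words, FnMut is a supertrait of Fn. This means a closure that satisfies Fn can be passed wherever FnOnce or FnMut is required. Fn does not allow modifying the closure's internal data, and it can also be called multiple times.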
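
The substitution rule can be sketched as follows. The three apply_* helpers and demo_fn are hypothetical names; a single read-only closure is accepted by all of them.

```rust
pub fn apply_once<F: FnOnce(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

pub fn apply_mut<F: FnMut(i32) -> i32>(mut f: F, x: i32) -> i32 {
    f(x)
}

pub fn apply<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

pub fn demo_fn() -> (i32, i32, i32) {
    let offset = 10;
    // Only reads captured data, so it implements Fn, and with it
    // FnMut and FnOnce as well.
    let add = |x: i32| x + offset;
    // A shared reference to an Fn closure is itself Fn, so the same
    // closure can be handed to all three functions.
    (apply_once(&add, 1), apply_mut(&add, 2), apply(&add, 3))
}
```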

Example

An example from tonic (a gRPC library for Rust):

pub trait Interceptor {
    /// Intercept a request before it is sent, optionally cancelling it.
    fn call(&mut self, request: crate::Request<()>) -> Result<crate::Request<()>, Status>;
}

impl<F> Interceptor for F
where
    F: FnMut(crate::Request<()>) -> Result<crate::Request<()>, Status>,
{
    fn call(&mut self, request: crate::Request<()>) -> Result<crate::Request<()>, Status> {
        self(request)
    }
}

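Here Interceptor is implemented for F, so any closure with the matching FnMut signature can be used wherever an Interceptor is expected, and both are called the same way.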
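
The same pattern can be reproduced with the standard library alone. The sketch below is a hypothetical analogue, not tonic's API: Handler, RejectEmpty, and run are invented names, and String stands in for the request type.

```rust
// Hypothetical std-only analogue of the pattern above.
pub trait Handler {
    fn call(&mut self, request: String) -> Result<String, String>;
}

// Blanket impl: every closure with a matching signature is a Handler.
impl<F> Handler for F
where
    F: FnMut(String) -> Result<String, String>,
{
    fn call(&mut self, request: String) -> Result<String, String> {
        self(request)
    }
}

// A hand-written Handler can coexist with the blanket impl.
pub struct RejectEmpty;

impl Handler for RejectEmpty {
    fn call(&mut self, request: String) -> Result<String, String> {
        if request.is_empty() {
            Err("empty request".to_string())
        } else {
            Ok(request)
        }
    }
}

// Callers depend only on the trait, not on whether they got a closure
// or a struct.
pub fn run(handler: &mut impl Handler, request: &str) -> Result<String, String> {
    handler.call(request.to_string())
}
```

With this in place, run accepts both a closure such as |r: String| Ok(r.to_uppercase()) and a RejectEmpty value.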

Using generic data structures

BufReader code example:

pub struct BufReader<R: ?Sized> {
    buf: Buffer,
    inner: R,
}

impl<R: Read> BufReader<R> {
    pub fn new(inner: R) -> BufReader<R> {
        BufReader::with_capacity(DEFAULT_BUF_SIZE, inner)
    }
    
    pub fn with_capacity(capacity: usize, inner: R) -> BufReader<R> {
        BufReader { inner, buf: Buffer::with_capacity(capacity) }
    }
}

impl<R: Read + ?Sized> BufReader<R> {
    pub fn peek(&mut self, n: usize) -> io::Result<&[u8]> {...}
}

impl<R: ?Sized> BufReader<R> {
    pub fn get_ref(&self) -> &R {
        &self.inner
    }
    
    pub fn get_mut(&mut self) -> &mut R {
        &mut self.inner
    }
    
    pub fn buffer(&self) -> &[u8] {
        self.buf.buffer()
    }
    
    pub fn into_inner(self) -> R
    where
        R: Sized,
    {
        self.inner
    }
    
    pub(in crate::io) fn discard_buffer(&mut self) {
        self.buf.discard_buffer()
    }
}

impl<R: ?Sized + Read> Read for BufReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {...}
    
    ...
}

The implementation is split into separate impl blocks according to their constraints.
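
A much smaller sketch of the same layout, using a hypothetical Labeled type: methods that need no bound live in one impl block, and methods that need Display live in another.

```rust
use std::fmt::Display;

pub struct Labeled<T> {
    inner: T,
}

// Available for every T.
impl<T> Labeled<T> {
    pub fn new(inner: T) -> Self {
        Labeled { inner }
    }

    pub fn get_ref(&self) -> &T {
        &self.inner
    }
}

// Available only when T can be displayed.
impl<T: Display> Labeled<T> {
    pub fn label(&self) -> String {
        format!("<{}>", self.inner)
    }
}
```

A Labeled<Vec<i32>> can still call get_ref, but label is simply absent for it because Vec<i32> does not implement Display.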

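Three common use cases

Use generic parameters to defer binding the data structure to a concrete type;
Use generic parameters with PhantomData to declare types that the data structure does not use directly but that its implementation needs. PhantomData has size zero; it is a ZST (Zero-Sized Type), as if it did not exist, and its only role is to mark the type;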

#[derive(Debug, Default, PartialEq, Eq)]
pub struct Identifier<T> {
    inner: u64,
    _tag: PhantomData<T>,
}

#[derive(Debug, Default, PartialEq, Eq)]
pub struct User {
    id: Identifier<Self>,
}

#[derive(Debug, Default, PartialEq, Eq)]
pub struct Product {
    id: Identifier<Self>,
}

Use generic parameters so that the same data structure can have different implementations of the same trait.

#[derive(Debug, Default)]
pub struct Equation<IterMethod> {
    current: u32,
    _method: PhantomData<IterMethod>,
}

#[derive(Debug, Default)]
pub struct Linear;

#[derive(Debug, Default)]
pub struct Quadratic;

impl Iterator for Equation<Linear> {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        self.current += 1;
        if self.current >= u32::MAX {
            return None;
        }

        Some(self.current)
    }
}

impl Iterator for Equation<Quadratic> {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        self.current += 1;
        if self.current >= u16::MAX as u32 {
            return None;
        }

        Some(self.current * self.current)
    }
}
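
Returning to the PhantomData use case, the following self-contained sketch shows that the tag costs nothing at run time while still separating types at compile time. Id, user_key, and id_size_matches_u64 are hypothetical names for illustration.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Marker types; they carry no data.
pub struct User;
pub struct Product;

pub struct Id<T> {
    raw: u64,
    _tag: PhantomData<T>,
}

impl<T> Id<T> {
    pub fn new(raw: u64) -> Self {
        Id { raw, _tag: PhantomData }
    }

    pub fn raw(&self) -> u64 {
        self.raw
    }
}

// Accepts only user ids; passing an Id<Product> is a compile-time error.
pub fn user_key(id: &Id<User>) -> String {
    format!("user:{}", id.raw())
}

pub fn id_size_matches_u64() -> bool {
    // PhantomData is zero-sized, so the tag adds no bytes.
    size_of::<Id<User>>() == size_of::<u64>() && size_of::<PhantomData<Product>>() == 0
}
```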

In addition:

Generic parameters support default values;

impl Trait in argument position is equivalent to a generic parameter <T: Trait>;
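
Both points can be sketched briefly. Wrapper, default_wrapper, show_impl, and show_generic are hypothetical names.

```rust
use std::fmt::Display;

// Generic parameter with a default: `Wrapper` alone means `Wrapper<String>`.
pub struct Wrapper<T = String> {
    pub value: T,
}

pub fn default_wrapper() -> Wrapper {
    Wrapper { value: String::from("default") }
}

// Argument-position impl Trait ...
pub fn show_impl(x: impl Display) -> String {
    format!("[{}]", x)
}

// ... is shorthand for a generic parameter with the same bound.
pub fn show_generic<T: Display>(x: T) -> String {
    format!("[{}]", x)
}
```

One practical difference remains: the named form allows an explicit type argument such as show_generic::<u8>(2), which the impl Trait form does not.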

Dynamically dispatched trait objects

Trait objects are how Rust handles run-time polymorphism.

// Return a trait object
pub fn trait_object_as_return_working(i: u32) -> Box<dyn Iterator<Item = u32>> {
    Box::new(std::iter::once(i))
}

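Trait objects come at an extra cost: here there is one additional heap allocation, and dynamic dispatch brings some performance loss.

When, at run time, we want a concrete type to expose only the behavior of some trait, we can assign it to a dyn T, whether &dyn T, Box<dyn T>, or Arc<dyn T>. The original type is then erased; Rust creates a trait object and associates it with the vtable for that trait.

When compiling code that uses dyn T, Rust generates a vtable for each trait implementation used as a trait object and places it in the executable (usually in the TEXT or RODATA section).

When a trait object calls a trait method, it first follows its vptr to the corresponding vtable, and from there finds the method to execute.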
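
A small sketch of dynamic dispatch in practice, using hypothetical Shape, Square, and Rect types. It also checks that a reference to a trait object is two machine words wide: one pointer to the data and one to the vtable.

```rust
use std::mem::size_of;

pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Square(pub f64);
pub struct Rect(pub f64, pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Each element has had its concrete type erased; `area` is looked up
// through the vtable at run time.
pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

// Number of machine words in a reference to a trait object.
pub fn fat_pointer_words() -> usize {
    size_of::<&dyn Shape>() / size_of::<usize>()
}
```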

pub type BoxedError = Box<dyn Error + Send + Sync>;

pub trait Executor {
    fn run(&self) -> Result<Option<i32>, BoxedError>;
}

/// Using a generic parameter
pub fn execute_generics(cmd: &impl Executor) -> Result<Option<i32>, BoxedError> {
    cmd.run()
}

/// Using a trait object: &dyn T
pub fn execute_trait_object(cmd: &dyn Executor) -> Result<Option<i32>, BoxedError> {
    cmd.run()
}

/// Using a trait object: Box<dyn T>
pub fn execute_boxed_trait_object(cmd: Box<dyn Executor>) -> Result<Option<i32>, BoxedError> {
    cmd.run()
}

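&dyn Executor and Box<dyn Executor> are both trait objects. The former is a reference that can live on the stack, while the latter allocates on the heap.

An open-source library for learning trait design: snow (snow-doc)

Using traits as a bridge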

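// Engine trait: more engines can be added in the future; the main flow only needs to swap the engine
pub trait Engine {
    // Apply a series of ordered operations to the engine according to specs
    fn apply(&mut self, specs: &[Spec]);
    // Generate the target image from the engine; note that this takes self, not a reference to self
    fn generate(self, format: ImageOutputFormat) -> Vec<u8>;
}

// Process with the image engine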
let mut engine: Photon = data
    .try_into()
    .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
engine.apply(&spec.specs);
let image = engine.generate(ImageOutputFormat::Jpeg(85));

Improving readability

// Convert Bytes into the Photon struct
impl TryFrom<Bytes> for Photon {
    type Error = anyhow::Error;

    fn try_from(data: Bytes) -> Result<Self, Self::Error> {
        Ok(Self(open_image_from_bytes(&data)?))
    }
}

// Engine trait: more engines can be added in the future; the main flow only needs to swap the engine
pub trait Engine {
    // Create a new engine
    fn create<T>(data: T) -> Result<Self>
    where
        Self: Sized,
        T: TryInto<Self>,
    {
        data.try_into()
            .map_err(|_| anyhow!("failed to create engine"))
    }
    // Apply a series of ordered operations to the engine according to specs
    fn apply(&mut self, specs: &[Spec]);
    // Generate the target image from the engine; note that this takes self, not a reference to self
    fn generate(self, format: ImageOutputFormat) -> Vec<u8>;
}

// Process with the image engine
let mut engine = Photon::create(data)
    .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
engine.apply(&spec.specs);
let image = engine.generate(ImageOutputFormat::Jpeg(85));

A bridge can also be used to hide implementation details:

let secret_api = api_with_user_token(&user, params);
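// reqwest::get is async in current releases; the blocking client is used here so that ? applies directly
let data: Vec<Status> = reqwest::blocking::get(secret_api)?.json()?;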

// A trait that hides the details of the API call
pub trait FriendCircle {
    fn get_published(&self, user: &User) -> Result<Vec<Status>, FriendCircleError>;
}

Using traits as constraints that communicate requirements to callers

pub trait Engine {
    // Create a new engine
    fn create<T>(data: T) -> Result<Self>
    where
        Self: Sized,
        T: TryInto<Self>,  // T only needs to implement TryInto<Self>
    {
        data.try_into()
            .map_err(|_| anyhow!("failed to create engine"))
    }
    ...
}
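
A self-contained sketch of this constraint at work, using only the standard library. TextEngine is a hypothetical toy engine that edits text instead of images, String replaces anyhow's error type, and the single-character operations are invented for the example.

```rust
use std::convert::{TryFrom, TryInto};

pub trait Engine {
    // Default constructor: works for any input that converts into Self.
    fn create<T>(data: T) -> Result<Self, String>
    where
        Self: Sized,
        T: TryInto<Self>,
    {
        data.try_into()
            .map_err(|_| "failed to create engine".to_string())
    }
    fn apply(&mut self, ops: &[char]);
    fn generate(self) -> String;
}

// Hypothetical toy engine.
pub struct TextEngine(String);

// Implementing TryFrom is all a new input type needs in order to be
// accepted by `create`.
impl<'a> TryFrom<&'a [u8]> for TextEngine {
    type Error = std::str::Utf8Error;

    fn try_from(data: &'a [u8]) -> Result<Self, Self::Error> {
        Ok(TextEngine(std::str::from_utf8(data)?.to_string()))
    }
}

impl Engine for TextEngine {
    fn apply(&mut self, ops: &[char]) {
        for op in ops {
            match op {
                'u' => self.0 = self.0.to_uppercase(),
                'r' => self.0 = self.0.chars().rev().collect(),
                _ => {} // unknown operations are ignored
            }
        }
    }

    fn generate(self) -> String {
        self.0
    }
}
```

The bound T: TryInto<Self> tells callers exactly what they must provide, and supporting another input type only requires one more TryFrom impl.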

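Implementing SOLID with traits

SRP: the single responsibility principle says each module should be responsible for a single function. Multiple functions should not be coupled together; they should be composed instead.
OCP: the open-closed principle says a software system should be closed for modification but open for extension. In Rust this is done through different implementations of a trait, or by extending a trait through supertraits.
LSP: the Liskov substitution principle says that if components are replaceable, the replaceable components should obey the same constraints, that is, the same interface. For example, any engine that implements the Engine trait above can be swapped in.
ISP: the interface segregation principle says users only need to know the methods they care about, and should not be forced to learn or use methods or features that are useless to them. In general, when a trait satisfies SRP, it also satisfies ISP.
DIP: the dependency inversion principle says that in some situations low-level code should depend on high-level code, rather than high-level code depending on low-level code.

Example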

pub trait IntoIterator {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;
	
    // Consumes self and returns an iterator that owns it; when self is a reference, only the reference is consumed
    fn into_iter(self) -> Self::IntoIter;
}
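
A sketch of implementing IntoIterator both by value and by reference for a hypothetical Playlist type; total_len is likewise an invented helper.

```rust
pub struct Playlist {
    tracks: Vec<String>,
}

impl Playlist {
    pub fn new(tracks: &[&str]) -> Self {
        Playlist {
            tracks: tracks.iter().map(|t| t.to_string()).collect(),
        }
    }
}

// By value: consumes the playlist and yields owned Strings.
impl IntoIterator for Playlist {
    type Item = String;
    type IntoIter = std::vec::IntoIter<String>;

    fn into_iter(self) -> Self::IntoIter {
        self.tracks.into_iter()
    }
}

// By reference: here `self` is `&Playlist`, so the items are borrowed.
impl<'a> IntoIterator for &'a Playlist {
    type Item = &'a String;
    type IntoIter = std::slice::Iter<'a, String>;

    fn into_iter(self) -> Self::IntoIter {
        self.tracks.iter()
    }
}

pub fn total_len(p: &Playlist) -> usize {
    let mut n = 0;
    // The for loop calls into_iter on `&Playlist`, leaving the playlist intact.
    for t in p {
        n += t.len();
    }
    n
}
```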

Service registration:

pub struct ServiceInner<Store> {
    store: Store,
    on_received: Vec<fn(&CommandRequest)>,
    on_executed: Vec<fn(&CommandResponse)>,
    on_before_send: Vec<fn(&mut CommandResponse)>,
    on_after_send: Vec<fn()>,
}

impl<Store: Storage> ServiceInner<Store> {
    pub fn new(store: Store) -> Self {
        Self {
            store,
            on_received: Vec::new(),
            on_executed: Vec::new(),
            on_before_send: Vec::new(),
            on_after_send: Vec::new(),
        }
    }

    pub fn fn_received(mut self, f: fn(&CommandRequest)) -> Self {
        self.on_received.push(f);
        self
    }

    pub fn fn_executed(mut self, f: fn(&CommandResponse)) -> Self {
        self.on_executed.push(f);
        self
    }

    pub fn fn_before_send(mut self, f: fn(&mut CommandResponse)) -> Self {
        self.on_before_send.push(f);
        self
    }

    pub fn fn_after_send(mut self, f: fn()) -> Self {
        self.on_after_send.push(f);
        self
    }
}

impl<Store: Storage> From<ServiceInner<Store>> for Service<Store> {
    fn from(inner: ServiceInner<Store>) -> Self {
        Self {
            inner: Arc::new(inner),
        }
    }
}

let service: Service = ServiceInner::new(MemTable::default())
        .fn_received(|_: &CommandRequest| {})
        .fn_received(b)
        .fn_executed(c)
        .fn_before_send(d)
        .fn_after_send(e)
        .into();

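Generic notification traits for immutable and mutable arguments:

/// Event notification (the event cannot be modified)
pub trait Notify<Arg> {
    fn notify(&self, arg: &Arg);
}

/// Event notification (the event can be modified)
pub trait NotifyMut<Arg> {
    fn notify(&self, arg: &mut Arg);
}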

impl<Arg> Notify<Arg> for Vec<fn(&Arg)> {
    #[inline]
    fn notify(&self, arg: &Arg) {
        for f in self {
            f(arg)
        }
    }
}

impl<Arg> NotifyMut<Arg> for Vec<fn(&mut Arg)> {
    #[inline]
    fn notify(&self, arg: &mut Arg) {
        for f in self {
            f(arg)
        }
    }
}

impl<Store: Storage> Service<Store> {
    pub fn execute(&self, cmd: CommandRequest) -> CommandResponse {
        self.inner.on_received.notify(&cmd);

        let mut res = dispatch(cmd, &self.inner.store);

        self.inner.on_executed.notify(&res);
        self.inner.on_before_send.notify(&mut res);

        res
    }
}
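
The notification pattern can be tried in isolation. In this sketch the Response type, set_ok, and run_hooks are hypothetical stand-ins for the service types above; only the mutable variant is shown.

```rust
pub trait NotifyMut<Arg> {
    fn notify(&self, arg: &mut Arg);
}

// A list of plain function pointers acts as the set of registered hooks.
impl<Arg> NotifyMut<Arg> for Vec<fn(&mut Arg)> {
    fn notify(&self, arg: &mut Arg) {
        for f in self {
            f(arg)
        }
    }
}

// Hypothetical response type used only for this example.
#[derive(Debug, Default, PartialEq)]
pub struct Response {
    pub status: u32,
    pub headers: Vec<String>,
}

fn set_ok(res: &mut Response) {
    res.status = 200;
}

pub fn run_hooks() -> Response {
    let mut hooks: Vec<fn(&mut Response)> = Vec::new();
    hooks.push(set_ok);
    // A closure that captures nothing coerces to a function pointer.
    hooks.push(|res: &mut Response| res.headers.push("x-traced".to_string()));

    let mut res = Response::default();
    // Hooks run in registration order.
    hooks.notify(&mut res);
    res
}
```

Because the hooks are fn pointers rather than boxed closures, they cannot capture state; that keeps registration free of allocation per hook at the price of flexibility.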

Recommended third-party crates

Networking and async I/O crates: tonic / axum / tokio-uring / tokio-rustls / tokio-stream / tokio-util, among others

Foundational crates: bytes / tracing / mio / slab / serde / clap / structopt / indicatif / dialoguer / crossbeam / nom

hyper for HTTP/1 and HTTP/2, quinn / quiche for QUIC and HTTP/3, tonic for gRPC, and tungstenite / tokio-tungstenite for WebSocket

avro-rs for Apache Avro, capnp for Cap’n Proto, prost for protobuf, flatbuffers for Google FlatBuffers, thrift for Apache Thrift

Web frameworks: actix-web / rocket / axum

ORMs: diesel / sea-orm; SQL: sqlx

Template engines: askama, which supports Jinja syntax; tera, which resembles Jinja2; and comrak for processing Markdown

Pure front end: yew and seed; full stack: MoonZoon

Web testing: headless_chrome, thirtyfour, and fantoccini

Static site generation: zola, a counterpart to hugo, and mdbook, a counterpart to gitbook

Cloud native: kube-rs

WebAssembly: wasm-pack, wasm-bindgen, wasmtime, and rustwasm

Embedded development: embedded WG, Awesome embedded rust

Machine learning: bindings for tensorflow; tch-rs, bindings for libtorch (PyTorch); and linfa, a counterpart to scikit-learn

Unless otherwise stated, all articles on this blog are licensed under CC BY-SA 4.0. Please credit the source when reposting!
