In Chapter 8, we introduced value function approximation, that is, replacing the tabular representation of state/action values with a function. Similarly, in Chapter 9 we used a function to represent the policy instead of a table, turning to policy-based methods. In this chapter we combine the two: representing both the value and the policy with functions, and incorporating both value-based and policy-based methods.

In all previous chapters, the methods were value-based. The difference between value-based and policy-based methods lies in their approach. Value-based methods generate policies implicitly and indirectly: the algorithm itself does not maintain a policy function. Instead, it solves for the values (state or action values) by model-free or model-based methods, and determines the action greedily (or $\epsilon$-greedily) by maximizing the value function at each step (e.g., $a^* = \arg\max_a q(s, a)$), thereby deriving the policy from the values. In contrast, policy-based methods directly represent the policy as a parameterized function $\pi(a \mid s, \theta)$, where $\theta$ is a parameter vector (instead of the previous tabular representation). The probability distribution of the policy is obtained by directly optimizing the parameter $\theta$.
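The contrast can be sketched in a few lines of code; the table sizes, the random q-values, and the softmax parameterization below are illustrative assumptions rather than a specific algorithm from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 3

# Value-based: no explicit policy is stored. The action comes from
# maximizing an action-value table q(s, a) (filled randomly here).
q = rng.normal(size=(n_states, n_actions))
greedy_action = int(np.argmax(q[0]))  # a* = argmax_a q(s=0, a)

# Policy-based: the policy pi(a | s, theta) is an explicit parameterized
# function of theta; here, a softmax over linear scores theta[s, a].
theta = np.zeros((n_states, n_actions))

def pi(s, theta):
    scores = theta[s]
    shifted = np.exp(scores - scores.max())  # numerically stable softmax
    return shifted / shifted.sum()

probs = pi(0, theta)  # uniform over actions at initialization
```

In the first case the policy only changes when the values change; in the second, gradient steps on $\theta$ change the action distribution directly.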
In this chapter we move from the previous tabular representation of state/action values to a function representation. That is to say, we use a function to fit the true state/action value function. Such a function can be predefined, e.g., a linear function, or a neural network.
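To make "fitting the value function" concrete, here is a minimal sketch that fits a predefined linear function of features to hypothetical target values by stochastic gradient descent; the feature map, the targets $v(s) = 2 + 3s$, and the step size are all assumptions for illustration:

```python
import numpy as np

# Linear value approximation: v_hat(s, w) = phi(s)^T w.
def phi(s):
    # polynomial features of a scalar state s (an illustrative choice)
    return np.array([1.0, s, s**2])

w = np.zeros(3)
alpha = 0.1  # step size

# Minimize (v_target - v_hat(s, w))^2 / 2 by stochastic gradient descent,
# using hypothetical targets v(s) = 2 + 3s for demonstration.
rng = np.random.default_rng(0)
for _ in range(2000):
    s = rng.uniform(-1, 1)
    target = 2 + 3 * s
    w += alpha * (target - phi(s) @ w) * phi(s)
```

Since the target happens to lie in the span of the features, `w` converges to roughly `[2, 3, 0]`; with a neural network, only the form of `phi` and the gradient computation would change.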
The reasons why we move from a tabular representation to a function-based one are:
- Storage efficiency: when the state space is large or continuous, storing one entry per state (or state-action pair) is infeasible, whereas a function only needs to store a small number of parameters.
- Generalization: updating the parameters based on one state also improves the value estimates of similar, possibly unvisited states, which a table cannot do.
This blog post is a brief summary of the book MODERN ROBOTICS: MECHANICS, PLANNING, AND CONTROL by Kevin M. Lynch and Frank C. Park. It covers only the core material; details are omitted.
Chapter 1: Introduction (Preliminary)
A robot is essentially a system composed of rigid bodies.
- Links: the rigid bodies in a robot system.
- Joints: components that connect adjacent links and allow relative motion between them.
Chapter 2: Configuration Space
2.1 Basic Concepts
- Configuration: a set of parameters specifying the position and orientation of every point on the robot.
- Configuration space (C-space): the set of all possible configurations.
- Degrees of freedom (dof): the dimension of the C-space, i.e., the minimum number of real-valued parameters needed to represent the robot's configuration.
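The dof of a mechanism assembled from links and joints can be counted with Grubler's formula, which this chapter of the book develops; the four-bar linkage below is an illustrative example:

```python
def grubler_dof(m, N, joints):
    """Grubler's formula: dof = m * (N - 1 - J) + sum of joint freedoms.

    m      -- dof of a single rigid body (3 for planar, 6 for spatial)
    N      -- number of links, including the fixed ground link
    joints -- list of freedoms f_i, one entry per joint
    """
    J = len(joints)
    return m * (N - 1 - J) + sum(joints)

# Planar four-bar linkage: 4 links, 4 revolute joints (f = 1 each)
dof = grubler_dof(m=3, N=4, joints=[1, 1, 1, 1])  # -> 1
```

The formula holds only when the joint constraints are independent; otherwise it gives a lower bound on the dof.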
Understanding Data
Machine learning is the process of modeling data from an unknown distribution. Whichever school of machine learning one follows, all agree that observed data does not arise out of nothing; it is produced by an underlying, objectively existing data-generating process, which can be described by a probability distribution.
For example, when tossing a coin, the outcome is heads or tails. If we toss it $N$ times, we obtain $N$ data points, and this result can be viewed as generated (sampled) from a Bernoulli distribution.
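The generative view of the coin toss can be sketched directly; the bias p = 0.5 and the sample size N below are illustrative choices:

```python
import numpy as np

# Simulate the data-generating process: N draws from Bernoulli(p).
rng = np.random.default_rng(42)
p, N = 0.5, 10_000
data = rng.binomial(1, p, size=N)  # 1 = heads, 0 = tails

# The maximum-likelihood estimate of p is simply the sample mean,
# which concentrates around the true p as N grows.
p_hat = data.mean()
```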
Latent Variables
Here is an example (from 【隐变量(潜在变量)模型】硬核介绍):
Looking at the figure below, on the surface the observed data is just a collection of points $x$, but in fact we can intuitively see that these points are sampled, with certain probabilities, from four different distributions (assume all are Gaussian). The latent variable $z$ controls which distribution $x$ is sampled from: $x \mid z = k \sim \mathcal{N}(\mu_k, \Sigma_k)$, where $k \in \{1, 2, 3, 4\}$. Suppose the component distributions are known. Thus the latent variable $z$ indicates the index of the class to which the observed variable $x$ belongs.
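The two-stage generative process (first draw $z$, then draw $x$ given $z$) can be sketched as follows; the component means, the uniform mixture weights, and the unit covariances are illustrative assumptions:

```python
import numpy as np

# Mixture of four 2-D Gaussians: the latent class z is drawn first,
# then x is drawn from the z-th component.
rng = np.random.default_rng(0)
means = np.array([[0, 0], [5, 0], [0, 5], [5, 5]], dtype=float)
weights = np.array([0.25, 0.25, 0.25, 0.25])

def sample(n):
    z = rng.choice(4, size=n, p=weights)    # latent class index
    x = means[z] + rng.normal(size=(n, 2))  # x | z ~ N(mu_z, I)
    return x, z

x, z = sample(1000)
```

In real data only `x` is observed; `z` is hidden, which is exactly what makes it a latent variable.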
This blog post and its sequels will mainly cover diffusion models: first some background knowledge, then the diffusion model itself, and finally, hopefully, the application of Diffusion Policy to motion planning for robot arms.
In this section we will first introduce TD learning, which refers to a broad family of algorithms that can solve the Bellman equation of a given policy without a model. In this section, TD learning refers specifically to the classic algorithm for estimating state values; in the next section we will introduce other algorithms belonging to this broad family.
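As a taste of the classic algorithm, here is a minimal TD(0) sketch for estimating the state values of a fixed policy from samples alone; the deterministic two-state chain, rewards, and step size are illustrative assumptions:

```python
import numpy as np

# TD(0) update: v(s_t) <- v(s_t) + alpha * (r_{t+1} + gamma * v(s_{t+1}) - v(s_t))
gamma, alpha = 0.9, 0.05
v = np.zeros(2)

# Toy chain: from either state, the policy moves to the other state;
# reward +1 when entering state 1, and 0 otherwise.
for episode in range(5000):
    s = 0
    for _ in range(20):
        s_next = 1 - s
        r = 1.0 if s_next == 1 else 0.0
        v[s] += alpha * (r + gamma * v[s_next] - v[s])
        s = s_next
```

The estimates approach the Bellman fixed point $v(0) = 1 + \gamma v(1)$, $v(1) = \gamma v(0)$, i.e., about 5.26 and 4.74, using only sampled transitions, never the transition model itself.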
Stochastic Approximation (SA) refers to a broad class of stochastic iterative algorithms for solving root-finding or optimization problems. Compared with many other root-finding algorithms, such as gradient-based methods, SA is powerful in that it requires neither the expression of the objective function nor its derivative.
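A minimal Robbins-Monro-style sketch: the algorithm sees only noisy evaluations of a hypothetical function g (an illustrative choice with root at w = 1), never its expression or derivative:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_g(w):
    # Black box from the algorithm's point of view: a noisy
    # measurement of g(w); the true root is at w = 1.
    return np.tanh(w - 1.0) + rng.normal(scale=0.1)

w = 3.0
for k in range(1, 5001):
    a_k = 1.0 / k          # step sizes with sum a_k = inf, sum a_k^2 < inf
    w = w - a_k * noisy_g(w)
```

The diminishing step sizes average out the measurement noise while still letting the iterate travel to the root, which is the core idea behind the Robbins-Monro conditions.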
