作業系統 Ch3 – Process (下）

本章節承接 Process（上），會繼續談到下列這些東西：

process scheduling
process scheduling queue
Inter-process Communication
- Share Memory
- Message Passing

Process Scheduling 的概念

目的

Multi-programming: 最大化 CPU 的產能（utilization）
Time sharing: 因為程式是輪流去使用 CPU 的，所以誰先誰後就是 scheduler要處理的事情。

Process Scheduling Queue

Job queue: 發生在 New State
Ready queue: 發生在 Ready state，等著使用 CPU
Device queue: 發生在 Wait State，等著使用 I/O

process scheduling diagram

scheduler

scheduler 根據「即將被處理」的對象不同會有不同的 scheduler，像是 CPU scheduler、Job scheduler 之類的。

short-term scheduler
- CPU scheduler 又被稱為 short-term scheduler，因為執行的頻率較高
- Ready State -> Run State
long-term scheduler
- Job scheduler。相較於 CPU 的頻率來看，Job scheduler 的執行的頻率較低
- 有新的程式被執行，要產生 process 時才會執行的 scheduler。
- 主要是人的行為觸發。
- New State -> Ready State
Medium-term scheduler
- 把 program 從 memory swap 回 disk 成 memory content（和之後會講到的 virtual memory 有關）
- Ready State -> Wait State

short-term scheduler

執行頻率很高（e.g., once per 100ms ）
- 所以執行的效率就必須要很高
要想演算法來縮短每個 process 的等待時間（還要在意 fairness）
又因為效率要高，所以也不能太複雜（不然 CPU overhead 就會太高）
- overehad = (選的時間) / (執行 job 的時間 + 選的時間)

long-term scheduler

控制 degree of multi-programming
- degree of multi-programming 指的是「目前電腦上，有多少程式在 memory 中」
- 太低的話，CPU 就會有很大的時間在 idle
- 太高的話，就會發生 trashing。很多程式在搶 memory，會一直做 swap、不斷地做 I/O。
負責讓 CPU & I/O 有最高的產能：所以就要混著讓做 CPU & 做 I/O 的程式進來

medium-term scheduler

處理程式在 memory & disk 之間切換的 scheduler，現在電腦因為 memory 比較夠、且有 virtual memory 的概念，所以 long-term scheduler 比較式微，其工作相較過往，都會比較落在 medium-term scheduler 身上。

swap out: 把 process 從 memory 搬回 disk 上來減少 degree of multi-programming
swap in: 發現 memory 有空間了，就把 process 從 disk 抓回 memory
propose:
- improve process mix：控制 CPU & I/O bound
- free up memory

Inter-process Communication (IPC)

定義：在一或多個 Process 或 Thread 之間，用來溝通的方法
種類
- 不溝通的：independent process
- 會溝通的：cooperating process
目的
- 資訊共享
- 加速運算（平行運算）
- 方便
- modularity（e.g. micro-kernel）

How-to IPC

要做到 IPC 有兩種方式：

share memory
- user synchronization 要更小心謹慎（處理這個最痛苦啊）
- 好處就是快
- 透過 memory 的 address
message passing
- 需要 system call，所以通常較慢
- 但若資料小，用這個方法的話可以省去 share memory 的 lock 之類的額外資源消耗，反而會略快（且不用在意 sync 的問題）

Share Memory

要知道怎麼 create 一個可以共用的 memory 區塊
- 預設是不會發生，因為 OS 要確保 process 的 protection
開空間要呼叫 system call
但資料格式都是一堆 bytes，由 process 自行定義、處理
必須確保不會同時有兩個對同一塊 memory 進行寫入的 access 行為

這邊周老師舉了個 consumer & producer 的例子，不使用 locking 的話會需要犧牲一塊空間來做到 synchronization。

Message Passing

好處

通常內含 synchronize 的機制，programmer 不用擔心這件事
可以自行掌控各個 variable，不用擔心被別人亂竄改

方式有兩種：

Sockets
- network 中的 IP/port，IP 就是電腦，port 就是 process
- 資料格式就是 stream of bytes
Remote Procedure Calls
- 使用遠端提供的 function call 去做到資料的傳輸
- 輸入輸出會有預先定義好的資料格式

一般來說，OS 會提供兩種 function 讓程式之間可以 Call，並在背後做各種 memory copy 的處理。

send(Message)
receive(Message)

而整個 communication system 還會需要提供一個 link，以便可以溝通，這個 Link 又會分為兩類：

Physical
- 是透過 network 還是 HW bus，甚至是 shared memory 等等的做法
Logical
- direct / indirect：要否指定接收對象
- 方向性：只能 A 傳 B，B 不能傳 A
- 角色是否相對
- Blocking / non-blocking：行為做完才會回傳 or not
- send by copy or by reference
- 固定大小 / 可變大小的 message

Direct V.S Indirect Message Passing

Direct

send & receive 的 params 中就需要有對方的 identity
可以不用 init
一定是 1 to 1
modularity 有限

Indirect

會有一個 mailbox，要傳的人會把東西丟進去，要拿的人再跟 mailbox 拿
sender & receiver 都不用在意對方是誰
connection 需要由 programmer 建立
可以多對多
處理 message 的管理的人會是另一個程式，而非收送雙方，Mailbox 可以是 by OS 或 by process
Issue: 可能會有個人同時跟 mailbox 拿資料，理論上一個 msg 應該只有一個人拿得到，所以下面就有幾個解法
- 只允許一個 receiver（最爛，根本就是 direct）
- 用 Locking 的方式，確保同一時間只有一個 receiver
- 交給 mailbox 決定，會分為兩步驟。receiver 會先要求要取值，而 mailbox 決定要給誰之後再去跟 receiver 說有沒有拿到。

Blocking V.S non-Blocking

blocking -> synchronous
non-blocking -> asynchronous

Blocking

non-Blocking

Send

確定對方接到後才會 return

送出後就不管

Receive

拿到 Message 才會 return

Call 完後就不管。
通常會拿到一個 token，
之後確認有拿到後
再用 callback function 去處理它

Buffer 與此兩種之間的關係

Blocking 通常會有 size=1 的 buffer，用來 check 是否完成行為。（未完成就一個擋著）
Non-blocking 則有兩種情況
- bounded: 送到滿了之後就會 blocking，或是 throw 一個 error
- unbounded: 不管，就是送到整個 OS 都滿了才會不給送

Socket

會有 client & server
server 要先打開 client 才能連
client 進行 connect 之後，就可以盡情 read / write
補充：
- server 在 accept 之後，會動態開 thread，如此才能一次處理多個 request

Remote Procedure Calls (RPC)

Library 通常會有 stubs，在 client 端和 server 端會各有一隻小程式，（server 端的叫 skeleton）會負責 implement 細節
其實通常下面也可能用 socket，只是幫 programmer 包起來而已
parameter mashaling
- 把 params 包進 message 就叫
- 要把兩台電腦串在一起其實有許多細節要處理，RPC 就要負責幫使用者把這些繁瑣的事情解決掉

Wildsky F.