且构网

Understanding the threadedness of actors in Scala

Updated: 2022-10-15 15:15:06

I've been told that (Scala) Actors never actually perform two operations at the same time, which suggests that the act (or react? or receive?) method is inherently synchronized. I know a long operation in an act method can cause blocking issues, and I assume that access to the message queue must be synchronized in some way... but...

What was suggested is that an actor receiving messages telling it to increment an internal counter would increment the counter in a threadsafe way. That no two update messages would be processed simultaneously, and so no two messages could attempt to update the counter at the same time.

A counter attribute in an actor sounds like "shared state."

Is it really true that such an operation would be completely threadsafe? If so, how does an actor make efficient use of multi-core machines? How is an actor multi-threaded at all?

If not, what's an appropriate idiomatic way to count messages in a threadsafe way without needing some synchronized/volatile variable?

The Actor model can be used to isolate mutable state from the outside world. When you have mutable state (for example a global registry of IDs assigned to multiple concurrent processes), you can wrap that state up inside an Actor and make the clients communicate with the Actor via message-passing. That way only the actor accesses the mutable state directly, and as you say, the client messages queue up to be read and processed one-by-one. It is important for messages to be immutable.
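As a rough illustration of the ID registry idea, here is a minimal sketch. It does not use the scala.actors API; it assumes a hand-rolled stand-in (one mailbox drained by one thread) built only on the standard library, and every name in it (`IdRegistry`, `Allocate`, `Lookup`, `Shutdown`, `IdRegistryDemo`) is hypothetical. The `Promise` in each message is just a one-shot reply channel.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.annotation.tailrec
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._

// Messages are immutable case classes (hypothetical names).
sealed trait RegistryMsg
final case class Allocate(owner: String, reply: Promise[Int]) extends RegistryMsg
final case class Lookup(id: Int, reply: Promise[Option[String]]) extends RegistryMsg
case object Shutdown extends RegistryMsg

// Stand-in for an actor: one mailbox, one consumer thread.
final class IdRegistry {
  private val mailbox = new LinkedBlockingQueue[RegistryMsg]()

  // Mutable state, read and written only by the consumer thread.
  private var nextId = 0
  private var owners = Map.empty[Int, String]

  private val worker = new Thread(() => loop())
  worker.setDaemon(true)
  worker.start()

  // Clients never touch the state; they only enqueue messages.
  def !(msg: RegistryMsg): Unit = mailbox.put(msg)

  @tailrec
  private def loop(): Unit = mailbox.take() match {
    case Allocate(owner, reply) =>
      nextId += 1
      owners += nextId -> owner
      reply.success(nextId)
      loop()
    case Lookup(id, reply) =>
      reply.success(owners.get(id))
      loop()
    case Shutdown => () // stop draining the mailbox
  }
}

object IdRegistryDemo {
  // Many client threads ask for an ID at once; each gets a distinct one.
  def allocateConcurrently(clients: Int): Seq[Int] = {
    val registry = new IdRegistry
    val replies = (1 to clients).map { i =>
      val p = Promise[Int]()
      new Thread(() => registry ! Allocate(s"client-$i", p)).start()
      p.future
    }
    val ids = replies.map(f => Await.result(f, 5.seconds))
    registry ! Shutdown
    ids
  }
}
```

Calling `IdRegistryDemo.allocateConcurrently(50)` should hand out the IDs 1 to 50 exactly once each, in whatever order the requests happened to reach the mailbox, without any lock around `nextId` or `owners`.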

To avoid the queue getting full, it is important that the message processing (react, receive, etc.) be as short as possible. Long-running tasks should be handed off to another actor:

1.  Actor A receives a message M from sender S
2.  A spawns a new actor C
3.  A sends (S, f(M)) to C
4.  In parallel:
4a. A starts processing the next message.
4b. C does the long-running or dangerous (IO) task;
    when finished, it sends the result to S,
    and C terminates.

Some alternatives in the process:

  • C sends (S, result) back to A, which forwards the result to S
  • A keeps a mapping ActorRef C => (Sender S, Message M) so in case it sees C fail, it can retry processing M with a new Actor.
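
The basic hand-off in steps 1 to 4 can be sketched roughly as follows. This is an illustration under assumptions, not the scala.actors API: `MiniActor` is a hypothetical stdlib-only stand-in (a mailbox drained by one daemon thread), C is modelled as a one-shot thread that handles a single job and ends, and `slowSquare` stands in for the long-running task. The retry alternative is not shown.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingQueue, TimeUnit}

object HandOffDemo {
  // Hypothetical stand-in for an actor: one mailbox, one consumer thread.
  final class MiniActor[M](handler: M => Unit) {
    private val mailbox = new LinkedBlockingQueue[M]()
    private val worker = new Thread(() => while (true) handler(mailbox.take()))
    worker.setDaemon(true)
    worker.start()
    def !(msg: M): Unit = mailbox.put(msg)
  }

  // Immutable messages (hypothetical names).
  final case class Result(m: Int, value: Int)
  final case class Work(m: Int, sender: MiniActor[Result])

  // Stand-in for the long-running or dangerous task.
  def slowSquare(m: Int): Int = { Thread.sleep(50); m * m }

  def run(inputs: Seq[Int]): Map[Int, Int] = {
    val done = new CountDownLatch(inputs.size)
    var results = Map.empty[Int, Int] // touched only by S's mailbox thread

    // S: the original sender, which eventually receives the results.
    val s = new MiniActor[Result]({ r =>
      results += r.m -> r.value
      done.countDown()
    })

    // A: receives M from S, spawns C, and returns to its mailbox at once.
    val a = new MiniActor[Work]({ case Work(m, sender) =>
      // C: a short-lived worker that does the slow part, replies to S, ends.
      val c = new Thread(() => sender ! Result(m, slowSquare(m)))
      c.setDaemon(true)
      c.start()
    })

    inputs.foreach(m => a ! Work(m, s))
    done.await(5, TimeUnit.SECONDS)
    results
  }
}
```

With distinct inputs, `HandOffDemo.run(Seq(1, 2, 3, 4))` should yield a map from each input to its square. Because A only spawns the worker and moves on, the slow tasks overlap instead of queuing behind each other in A's mailbox.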

So to recap, an Actor is multi-threaded to the extent that multiple clients can send it multiple messages from various threads, and the Actor is guaranteed to process all of these messages serially (although the order in which they are processed is subject only to fairly loose constraints).

Note that while the Actor's react code may be executed on various threads, at any given point in time it is executed on one thread only (you can picture the Actor jumping from thread to thread as the scheduler sees fit, but this is a technical detail). Note: the internal state still doesn't need synchronization, since Actors guarantee happens-before semantics between the processing of consecutive messages.
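
To make the "jumps from thread to thread, but one at a time" picture concrete, here is a sketch of one common way such scheduling can be built. It is an assumption about the general technique, not a description of how scala.actors is implemented: `PooledActor` is a hypothetical class that runs its handler on a shared thread pool, with an atomic flag ensuring that at most one pool thread processes its mailbox at any moment. The counter is a plain `var` with no lock and no `@volatile`.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, ExecutorService, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

object SerialOnPoolDemo {
  // Hypothetical actor stand-in scheduled on a shared pool.
  final class PooledActor[M](pool: ExecutorService)(handler: M => Unit) {
    private val mailbox = new ConcurrentLinkedQueue[M]()
    private val scheduled = new AtomicBoolean(false)

    def !(msg: M): Unit = { mailbox.add(msg); trySchedule() }

    // Submit a task only if none is pending or running for this actor.
    private def trySchedule(): Unit =
      if (!mailbox.isEmpty && scheduled.compareAndSet(false, true))
        pool.execute(() => processOne())

    // Handle one message, then release the flag and reschedule if needed.
    // The flag's set/compareAndSet pair also gives the happens-before edge
    // between consecutive messages, even when they run on different threads.
    private def processOne(): Unit =
      try Option(mailbox.poll()).foreach(handler)
      finally { scheduled.set(false); trySchedule() }
  }

  case object Increment

  // Returns the final count and the names of the pool threads that ran the handler.
  def run(senders: Int, perSender: Int): (Int, Set[String]) = {
    val pool = Executors.newFixedThreadPool(4)
    val done = new CountDownLatch(senders * perSender)
    var count = 0                     // unsynchronized on purpose
    var threads = Set.empty[String]   // unsynchronized on purpose

    val counter = new PooledActor[Increment.type](pool)({ _ =>
      count += 1
      threads += Thread.currentThread().getName
      done.countDown()
    })

    val clients = (1 to senders).map { _ =>
      new Thread(() => (1 to perSender).foreach(_ => counter ! Increment))
    }
    clients.foreach(_.start())
    done.await(10, TimeUnit.SECONDS)
    pool.shutdown()
    (count, threads)
  }
}
```

`SerialOnPoolDemo.run(8, 1000)` should report a count of exactly 8000, and the returned set will typically contain more than one pool thread name, which is the thread-hopping described above.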

Parallelism is achieved by having multiple Actors working in parallel, usually forming supervisor hierarchies or balancing workload.

Note that if all you need is concurrent/asynchronous computation, and you don't have global state or can get rid of it, Futures compose better and are an easier concept.
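
For comparison, a small sketch of the Future-based style using `scala.concurrent` from the standard library. The word-counting task and the names `FutureDemo`, `wordCount`, `totalWords` and `totalWordsBlocking` are made up for illustration; the point is only that independent computations with no shared state combine without any mailbox or message protocol.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureDemo {
  // Each call runs asynchronously on the global execution context.
  def wordCount(text: String): Future[Int] =
    Future(text.split("\\s+").count(_.nonEmpty))

  // Futures compose: run all counts concurrently, then sum the results.
  def totalWords(docs: Seq[String]): Future[Int] =
    Future.traverse(docs)(wordCount).map(_.sum)

  // Blocking is only for the demo; real code would keep composing Futures.
  def totalWordsBlocking(docs: Seq[String]): Int =
    Await.result(totalWords(docs), 5.seconds)
}
```

For example, `FutureDemo.totalWordsBlocking(Seq("a b c", "d e", ""))` evaluates to 5.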