
06 Inter-Process Communication

A process has its own virtual address space.

  • Can’t see each other’s memory
  • Can’t directly call each other’s functions
  • Only interact via the OS

But in real systems, processes need to:

  • Exchange data
  • Coordinate and synchronize their actions

IPC = all the OS mechanisms that make that possible.

  1. Message-based IPC
    • Data is sent as messages via some OS-managed channel
    • pipes, message queues, sockets
  2. Memory-based IPC
    • Processes share part of memory
    • shared memory segments, memory-mapped files

Plus:
  • Higher-level: Remote Procedure Call (RPC)
  • Synchronization primitives: mutexes, semaphores, condition variables, etc.

Message-based IPC

Processes do:
  • send/write a message to a port or handle
  • recv/read a message from a port

The OS:
  • Creates and maintains the channel
  • Manages buffers, queues, scheduling, synchronization

Cost: user/kernel crossings + copies

Each send/receive usually does:

  1. User -> kernel (system call)
  2. Copy data from process memory -> kernel buffer
  3. Kernel maybe moves data around internally
  4. Kernel -> user (on receive)
  5. Copy data from kernel buffer -> process memory

For a request-response round trip (A->B, then B->A):
  • 4 system calls total (send, recv, send, recv)
  • 4 copies total (two each way)

Pros:
  • Simple to use: the OS hides the details
  • Synchronization largely handled by the OS
  • Works between unrelated processes
  • Often works across machines

Cons:
  • Repeated system call overhead
  • Repeated data copying
  • Can be slow for large data

Pipes: a pipe is a unidirectional data channel: writer -> pipe -> reader.

Two endpoints (file descriptors): fd[0] is the read end, fd[1] the write end.
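A minimal sketch of this in C, assuming a parent that writes and a forked child that reads (the "hello" payload is illustrative):

    #include <stdio.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int fds[2];                /* fds[0] = read end, fds[1] = write end */
        if (pipe(fds) == -1) { perror("pipe"); return 1; }

        if (fork() == 0) {         /* child: reader */
            close(fds[1]);         /* close the unused write end */
            char buf[64];
            ssize_t n = read(fds[0], buf, sizeof(buf) - 1);
            if (n > 0) { buf[n] = '\0'; printf("child got: %s\n", buf); }
            close(fds[0]);
            return 0;
        }

        close(fds[0]);             /* parent: writer; close unused read end */
        const char *msg = "hello"; /* illustrative payload */
        write(fds[1], msg, strlen(msg));
        close(fds[1]);             /* closing delivers EOF to the reader */
        wait(NULL);
        return 0;
    }

Note how each transfer is a system call (write, read) plus a copy through the kernel's pipe buffer, exactly the cost model above.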

Message Queues: the channel understands messages, not just bytes.

APIs in Unix:
  • SysV message queues
  • POSIX message queues
These often provide:
  • Blocking/non-blocking send/recv
  • Priority-based ordering
  • Flags for different behaviors
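A minimal sketch of the POSIX variant, sending and receiving one prioritized message within a single process (the queue name "/demo_q" is illustrative; link with -lrt on Linux):

    #include <fcntl.h>
    #include <mqueue.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>

    int main(void) {
        struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = 128 };
        mqd_t q = mq_open("/demo_q", O_CREAT | O_RDWR, 0600, &attr);
        if (q == (mqd_t)-1) { perror("mq_open"); return 1; }

        const char *msg = "hello";
        mq_send(q, msg, strlen(msg) + 1, 5);    /* send with priority 5 */

        char buf[128];                          /* must be >= mq_msgsize */
        unsigned prio;
        ssize_t n = mq_receive(q, buf, sizeof(buf), &prio);
        if (n >= 0) printf("got \"%s\" at priority %u\n", buf, prio);

        mq_close(q);
        mq_unlink("/demo_q");
        return 0;
    }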

Sockets: sockets generalize IPC to local + network communication. A socket is an endpoint: think “file descriptor + protocol”. You get a socket with socket(…), which:
  • Creates a kernel buffer for that socket
  • Associates a protocol stack
Local vs Remote: with local (Unix domain, AF_UNIX) sockets both endpoints are on the same machine and the kernel just moves bytes between buffers; with remote (network, e.g. AF_INET) sockets the data also traverses the protocol stack (e.g. TCP/IP) on its way to another host.
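A minimal sketch of the local case using socketpair(), which hands back two already-connected Unix-domain endpoints (the "ping" payload is illustrative):

    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int sv[2];  /* sv[0] and sv[1] are two connected AF_UNIX endpoints */
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
            perror("socketpair");
            return 1;
        }
        if (fork() == 0) {              /* child: uses sv[1] */
            close(sv[0]);
            char buf[64];
            ssize_t n = recv(sv[1], buf, sizeof(buf) - 1, 0);
            if (n > 0) { buf[n] = '\0'; printf("child got: %s\n", buf); }
            close(sv[1]);
            return 0;
        }
        close(sv[1]);                   /* parent: uses sv[0] */
        send(sv[0], "ping", 4, 0);
        close(sv[0]);
        wait(NULL);
        return 0;
    }

Unrelated processes would instead use socket() plus bind()/connect(), with a filesystem path (AF_UNIX) or an address and port (AF_INET).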

Shared Memory IPC (memory-based)

Instead of copying data through the kernel each time, the OS maps the same physical pages into the virtual address spaces of multiple processes. So: process A and process B both have some addresses (maybe different virtual addresses) that refer to the same physical memory.

After the mapping is set up:
  • Each process just loads/stores to that region as if it were normal memory
  • No system call needed per access

Pros:
  • Extremely fast after setup
  • No user/kernel crossings per access
  • Zero-copy data sharing is possible
  • Very good for large data and frequent communication

Cons:
  • The OS only sets up the mapping; you must:
    • Handle synchronization (avoiding races)
    • Define a protocol (where to put data, when it’s ready)
  • Harder to get right than message-based IPC

  • Physical pages do not need to be contiguous
  • Virtual addresses in each process can be different
  • The OS sets up the mapping in page tables
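A minimal sketch of the idea, assuming a Linux-style MAP_ANONYMOUS flag (widely available, though historically not required by POSIX): a parent maps one shared page, forks, and sees the child's store through its own mapping.

    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        /* One shared mapping: stores by either process are visible to the
           other, because both page tables point at the same physical page. */
        int *p = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        *p = 0;
        if (fork() == 0) {    /* child writes through its own mapping */
            *p = 42;
            return 0;
        }
        wait(NULL);           /* parent then reads the child's store */
        printf("parent reads: %d\n", *p);   /* prints 42 */
        munmap(p, sizeof(int));
        return 0;
    }

Anonymous mappings only work between related processes (the child inherits the mapping across fork); unrelated processes need a named segment, covered next.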

Copy vs Map trade-offs

Copy (message-based IPC):
  • CPU cycles are spent copying data between user and kernel space on every message
  • Cost grows with the size and frequency of the data

Map (shared memory):
  • CPU cycles are spent setting up the page-table mapping, but that cost can be amortized over many uses
  • Data copies might still happen: each process may still copy data between its private memory and the shared region

OS example: Windows Local Procedure Calls (LPC):
  • For small messages, it just copies via a port-like mechanism
  • For large messages, it uses mapping semantics (shared memory)

SysV Shared Memory

SysV shared memory is an older Unix API based on segments.

Segments as resources: segments are persistent. A segment exists until it is explicitly removed (or the system reboots); it does not go away just because the process that created it exits.

Getting a key: ftok. Different processes need to agree on a segment identifier, and you don’t want to hardcode numeric IDs.

    key_t ftok(const char *pathname, int proj_id);

Given the same (existing) pathname and the same proj_id, ftok returns the same key in every process.

Creating/looking up a segment: shmget turns a key into a segment id (shmid), creating the segment if needed.

    int shmget(key_t key, size_t size, int shmflg);

Attaching: shmat maps the segment into the caller’s address space.

    void *shmat(int shmid, const void *shmaddr, int shmflg);

Detaching: shmdt unmaps it again (the segment itself lives on).

    int shmdt(const void *shmaddr);
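Putting the pieces together, a single-process sketch of the full SysV lifecycle (the pathname /tmp/shm_demo, which must exist, and proj_id 'A' are illustrative; a second process would repeat ftok/shmget/shmat to reach the same segment):

    #include <stdio.h>
    #include <string.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void) {
        /* Every cooperating process must use the same pathname/proj_id pair. */
        key_t key = ftok("/tmp/shm_demo", 'A');
        if (key == -1) { perror("ftok"); return 1; }

        int shmid = shmget(key, 4096, IPC_CREAT | 0600);  /* create or look up */
        if (shmid == -1) { perror("shmget"); return 1; }

        char *p = shmat(shmid, NULL, 0);   /* let the kernel pick the address */
        if (p == (void *)-1) { perror("shmat"); return 1; }

        strcpy(p, "hello via SysV shm");   /* ordinary loads/stores from here on */
        printf("%s\n", p);

        shmdt(p);                          /* detach; the segment persists */
        shmctl(shmid, IPC_RMID, NULL);     /* remove it explicitly */
        return 0;
    }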

POSIX Shared Memory

POSIX takes a more file-like approach. Files in tmpfs: POSIX shared memory objects look like files, but they:
  • Live in a tmpfs (memory-backed pseudo-filesystem)
  • Represent chunks of physical memory

  1. Create/open the object (returns a file descriptor):

    int shm_open(const char *name, int oflag, mode_t mode);

  2. Size the object:

    ftruncate(fd, size);

  3. Map it into your address space:

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

  4. Unmap when done:

    munmap(addr, size);

  5. Remove the shared memory object:

    shm_unlink(name);
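The same flow as a runnable sketch (the object name "/demo_shm" is illustrative; on Linux it appears as /dev/shm/demo_shm, and some systems need linking with -lrt):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
        if (fd == -1) { perror("shm_open"); return 1; }

        size_t size = 4096;
        if (ftruncate(fd, size) == -1) { perror("ftruncate"); return 1; }

        char *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (addr == MAP_FAILED) { perror("mmap"); return 1; }
        close(fd);                      /* the mapping remains valid */

        strcpy(addr, "hello via POSIX shm");
        printf("%s\n", addr);

        munmap(addr, size);
        shm_unlink("/demo_shm");        /* memory freed once all unmap */
        return 0;
    }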

Synchronization for Shared Memory

Once multiple processes share memory, you get the same race condition problems as with multithreading, plus some extra complexity. Rules: the synchronization objects themselves must be visible to every participating process (in practice, they live in the shared region), and all processes must follow the same locking protocol.

Pthreads sync across processes

Pthreads can be used across processes if:

  1. The synchronization objects (mutexes, cond vars) live in shared memory.
  2. They are initialized with the PTHREAD_PROCESS_SHARED attribute.

Steps:

  1. Create a shared segment (SysV or POSIX).
  2. Define a struct in that shared region, e.g.:

    typedef struct {
        pthread_mutex_t lock;
        char buffer[BUF_SIZE];
    } shm_data_t;
  3. Initialize the mutex attributes as process-shared, then the mutex itself:

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&shm_ptr->lock, &attr);
  4. Now processes can pthread_mutex_lock(&shm_ptr->lock) and coordinate.
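A sketch combining these steps with a POSIX shm object (the name "/demo_sync" and the run-once "init" argument are illustrative conventions, not part of any API; compile with -pthread). Exactly one process must initialize the mutex before any other locks it:

    #include <fcntl.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define BUF_SIZE 256

    typedef struct {
        pthread_mutex_t lock;
        char buffer[BUF_SIZE];
    } shm_data_t;

    int main(int argc, char **argv) {
        int fd = shm_open("/demo_sync", O_CREAT | O_RDWR, 0600);
        if (fd == -1) { perror("shm_open"); return 1; }
        ftruncate(fd, sizeof(shm_data_t));

        shm_data_t *shm_ptr = mmap(NULL, sizeof(shm_data_t),
                                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (shm_ptr == MAP_FAILED) { perror("mmap"); return 1; }
        close(fd);

        /* Run once with argument "init" so exactly one process initializes. */
        if (argc > 1 && strcmp(argv[1], "init") == 0) {
            pthread_mutexattr_t attr;
            pthread_mutexattr_init(&attr);
            pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
            pthread_mutex_init(&shm_ptr->lock, &attr);
            pthread_mutexattr_destroy(&attr);
        }

        /* Any process that maps the object can now take the same lock. */
        pthread_mutex_lock(&shm_ptr->lock);
        snprintf(shm_ptr->buffer, BUF_SIZE, "written by pid %d", (int)getpid());
        pthread_mutex_unlock(&shm_ptr->lock);

        munmap(shm_ptr, sizeof(shm_data_t));
        return 0;
    }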

When PTHREAD_PROCESS_SHARED is not supported or is inconvenient, you can use POSIX semaphores instead: named semaphores (sem_open) are shared between any processes that open the same name, and unnamed semaphores (sem_init with a nonzero pshared argument) work when placed in shared memory.
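For example, a named semaphore is kernel-backed and shared purely by name, so it needs no process-shared attribute (the name "/demo_sem" is illustrative):

    #include <fcntl.h>
    #include <semaphore.h>
    #include <stdio.h>

    int main(void) {
        /* Every process that opens the same name gets the same semaphore. */
        sem_t *sem = sem_open("/demo_sem", O_CREAT, 0600, 1);  /* value 1 = mutex-like */
        if (sem == SEM_FAILED) { perror("sem_open"); return 1; }

        sem_wait(sem);             /* enter critical section */
        /* ... read/write the shared region here ... */
        sem_post(sem);             /* leave critical section */

        sem_close(sem);
        sem_unlink("/demo_sem");   /* remove the name when done */
        return 0;
    }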

Design considerations for shared-memory IPC

Imagine two multithreaded processes communicating via shared memory. You have to design: how many shared segments?

Option A: One big segment
  • One-time setup cost
  • But you must manage allocation within the segment yourself (effectively writing a small allocator for it)

Option B: Multiple segments
  • Each segment stays simple, e.g. one per request or per peer
  • But you need a way to track which segments exist and tell the other process which one to use

Often a good hybrid is: pre-allocate a pool of fixed-size segments up front and pass segment identifiers through a small control channel (e.g. a message queue).


How big should segments be?

Question: is the size of the data known and bounded?
  • If yes: size the segment to fit, once, up front.
  • If no: pick a fixed segment size and move large data through it in rounds (chunks).

This requires: a protocol on top of the raw memory, so both sides can track how far the transfer has progressed and know when each chunk is ready to be read or safely overwritten.
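One hypothetical shape for such a protocol: a fixed header at the start of the segment that both sides use to track progress (all field names here are illustrative, and access still has to be guarded by a mutex or semaphore as above):

    #include <stddef.h>

    #define CHUNK_SIZE 4096   /* illustrative chunk size */

    /* The producer fills payload, sets chunk_len and ready; the consumer
       copies the chunk out, clears ready, and both advance offset until
       offset == total_len. */
    typedef struct {
        size_t total_len;     /* full size of the object being moved */
        size_t offset;        /* progress through the transfer so far */
        size_t chunk_len;     /* bytes valid in payload this round */
        int    ready;         /* producer sets, consumer clears */
        char   payload[CHUNK_SIZE];
    } shm_chunk_t;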


RPC

RPC is higher level than raw IPC. Instead of “send message” / “recv message”, you say: “call this procedure in that other process with these arguments, and give me back the result.”

RPC describes: how the procedure interface is specified (often via an interface definition language), how arguments and results are marshalled into messages and unmarshalled on the other side, and how the underlying send/receive, binding, and failure handling are hidden behind an ordinary-looking function call.