Postgresql Architecture, Basic Info
PostgreSQL Reference Guide
General Database Limits
- Maximum Database Size: Unlimited
- Maximum Table Size: 32 TB
- Maximum Row Size: 1.6 TB
- Maximum Field Size: 1 GB
- Maximum Rows per Table: Unlimited
- Maximum Columns per Table: 250–1600 (depends on column types)
- Maximum Indexes per Table: Unlimited
Common Database Object Names
| Industry Term | PostgreSQL Term |
|---|---|
| Table or Index | Relation |
| Row | Tuple |
| Column | Attribute |
| Data Block | Page (on disk) |
| Page | Buffer (in memory) |
Process & Memory Architecture
1. Top‑Level Process
postmaster (also postgres) is the parent daemon. It listens on 5432 (by default), accepts client connections, and forks a dedicated backend for each session.
2. Shared Memory Areas
| Component | Purpose |
|---|---|
| Shared Buffers | Main buffer cache holding data pages read from disk. |
| WAL Buffers | Staging area for write‑ahead log (WAL) records before flushing. |
| Process Array | Metadata slot for every active backend process. |
3. Background & Utility Processes
| Process | Key Function |
|---|---|
| Background writer | Trickle‑writes dirty buffers to disk, smoothing I/O. |
| WAL writer | Flushes WAL buffers to segment files. |
| Checkpointer | Creates periodic checkpoints for crash recovery. |
| Logging collector | Routes server logs to syslog, Event Log, or files. |
| Autovacuum launcher / workers | VACUUM & ANALYZE tables automatically. |
| Archiver | Copies completed WAL segments (archive_command). |
| Logical replication launcher | Spawns workers that stream changes to subscribers. |
4. Disk Structures
- Data files: Heap & index storage.
- WAL segments: Default 16 MB sequential log files.
- Archived WAL: Off‑site copies for PITR & standby sync.
- Error logs: Text diagnostics.
Backend Connection & Session Lifecycle
| Phase | What Happens |
|---|---|
| Connection request | Client contacts postmaster. |
| Process fork | postmaster forks a backend (postgres). |
| Authentication | Evaluated via pg_hba.conf. |
| Authorization | Catalog privilege checks on every stmt. |
| IPC | Shared memory, semaphores, LWLocks. |
| Per‑query memory | Each backend uses its own work_mem. |
| Termination | Backend exits, PID freed. |
Shared Buffer Read & Write Path
Read Buffering
The first access triggers a disk read into shared_buffers; subsequent reads hit RAM only.
Write Buffering
- Backends modify cached blocks (marking them dirty).
- bgwriter flushes dirty pages gradually.
- Checkpointer forces all older dirty pages to disk at a checkpoint.
Tuning tip: Raiseshared_buffers; balancecheckpoint_timeout,max_wal_size, andbgwriter_*.
Background Writer Cleaning Scan
bgwriter keeps a pool of clean pages ready; governed by bgwriter_lru_maxpages & related settings.
Write‑Ahead Logging (WAL) Mechanics
- Modify page → WAL record generated.
- Group commit collects commits into one
fsync. - Flush via WAL writer,
COMMIT, or buffer wrap‑around.
Key GUCs: wal_buffers, wal_writer_delay, commit_delay, synchronous_commit.
Transaction Log Archiving
With archive_mode = on, archiver executes archive_command for each completed WAL segment to enable PITR and replica creation.
Commit & Checkpoint Sequence
- Pre‑commit: Dirty buffers only in memory.
- Commit: WAL
fsync→ durable; buffers flagged committed. - Checkpoint: Checkpointer flushes all dirty pages ≤ current LSN.
Tune: lower checkpoint_timeout for faster recovery; raise max_wal_size to reduce checkpoint frequency.
SQL Statement Processing Pipeline
- Parse – syntax check, tokenise, traffic‑cop accepts or rejects.
- Optimise – planner explores candidate plans, estimates cost via stats, chooses best.
- Execute – executor walks plan tree, produces tuples to client.
Prepared statements skip parse/plan on reuse; planner can re‑optimise with parameter sniffing.
Physical Database Architecture
- One cluster = data directory + unique port + background procs.
- Multiple databases live in a single cluster, sharing WAL.
Installation Layout (RPM example)
/usr/pgsql‑16/bin– executables/usr/pgsql‑16/lib– shared libraries/usr/pgsql‑16/share– extension / tzdata- Data dir:
/var/lib/pgsql/16/data
Key Sub‑directories
global/– cluster‑wide catalogsbase/– one subdir per database OIDpg_tblspc/– symlinks to external tablespacespg_wal/– active WALpg_xact/,pg_multixact/– commit status- … plus numerous runtime & config files
Tablespaces & File‑per‑Relation Storage
Every table or index lives in its own file. When it grows beyond 1 GB, PostgreSQL appends .1, .2, … segment files (each another 1 GB). Auxiliary files (_fsm, _vm) track free & visibility info.
-- Locate the physical files for a given relation
SELECT pg_relation_filepath('public.my_table');
Sample Data‑Directory Tree (OID Walk‑Through)
$PGDATA/
├─ base/14307/14297 -- first 0‑1 GB of a relation
├─ base/14307/14297.1 -- second GB
├─ pg_tblspc/16650 → /storage/pg_tab
└─ …
Heap / Index Page Layout (8 kB)
| Section | Size | Purpose |
|---|---|---|
| Page header | 24 B | Flags, LSN, checksum, free‑space pointers |
| Line‑pointer array | 4 B × n | Offsets to tuples / index items (grows downward) |
| Free space | variable | Unallocated bytes |
| Tuple / Index entries | variable | Actual row or index data (grows upward) |
| Special area | AM‑specific | Index metadata; empty for heap pages |
Note: Custom builds can use 4–32 kB pages; the 1 GB segment limit scales proportionally.
Comments
Post a Comment