
RESTful API & DevOps Pipeline

A cloud data logger that pulls industrial OPC UA telemetry into a versioned REST API, shipped end-to-end through GitHub Actions and Docker. Industrial sources on one end, scripts and dashboards on the other, with a clean contract in the middle.

  • Backend Architecture
  • REST API Design
  • CI/CD Pipeline
  • Docker & Containerization
opcua-cloud-datalogger · openapi
version: v1.4.0 · spec: OpenAPI 3.1 · contract: stable
  • GET /v1/devices: List every device the logger is polling.
  • GET /v1/devices/{id}: Single-device metadata and last seen.
  • GET /v1/devices/{id}/samples: Time-windowed samples for one device.
  • POST /v1/devices/{id}/poll: Force an out-of-cycle poll. Idempotent.
  • GET /v1/runs: Pipeline run history with status and duration.
  • GET /healthz: Liveness probe. Returns 200 if the logger is alive.
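
For a client script, calling the read endpoints mostly comes down to building the right URL. A minimal sketch for the samples endpoint, assuming a hypothetical base URL and hypothetical `start`, `end`, and `limit` query parameters (the endpoint list does not name the query parameters):

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode

# Hypothetical base URL; the real deployment address is not given here.
BASE_URL = "https://logger.example.com"


def samples_url(device_id: str, start: datetime, end: datetime, limit: int = 1000) -> str:
    """Build a GET /v1/devices/{id}/samples URL for a time window.

    The query parameter names (start, end, limit) are assumptions.
    """
    query = urlencode({
        "start": start.astimezone(timezone.utc).isoformat(),
        "end": end.astimezone(timezone.utc).isoformat(),
        "limit": limit,
    })
    # Percent-encode the device id so it is safe as a single path segment.
    return f"{BASE_URL}/v1/devices/{quote(device_id, safe='')}/samples?{query}"


url = samples_url(
    "plc-01",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 2, tzinfo=timezone.utc),
)
print(url)
```

Only the path shape comes from the endpoint list; a real client would read the parameter names from the OpenAPI spec.
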
  • v1: Versioned REST surface
  • 6: Endpoints, OpenAPI-described
  • 2m: Push-to-deploy time
  • 1: Docker image, every layer

The problem

Industrial PLCs speak OPC UA. The applications that need that data — dashboards, BI tools, automation scripts — speak HTTP. Stitching the two together every time you onboard a new device is exhausting, and exposing the OPC UA server directly to the network is a non-starter for security and contract stability reasons.

The logger sits between the two. It owns the OPC UA session, persists what it pulls, and presents a small, versioned REST surface that downstream applications can rely on. The whole thing ships as a single Docker image, deployed by GitHub Actions on every push to main.

What it does

  • OPC UA session, owned
    The logger keeps a single OPC UA session per device. It subscribes where the server supports it and falls back to polling where it does not.
  • Versioned REST surface
    All data endpoints live under /v1; only the /healthz liveness probe is unversioned. Breaking changes get a new prefix; additive changes ship under the existing one. OpenAPI is the contract.
  • Auto-generated docs
    Swagger UI ships in the same image. The docs and the API can never disagree because the spec is generated from the route definitions.
  • At-least-once delivery
    Samples are written idempotently on (device, ts, tag). Backfills, retries, and restarts are all safe.
  • Observable runs
    Every poll cycle is a row in the runs table. Duration, sample count, status, and stack trace on failure — readable through /v1/runs.
  • One image, every layer
    Storage, service, API. Shipped as one container; configuration is environment variables. No multi-service orchestration in v1.
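
The "observable runs" idea above can be sketched as a wrapper that turns every poll cycle into one row, whether it succeeds or fails. This is a minimal illustration, not the logger's actual code: `RunRow`, `record_run`, and the two fake poll functions are hypothetical names, and the row is returned rather than written to a database.

```python
import asyncio
import time
import traceback
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class RunRow:
    """One row of a hypothetical runs table."""
    device_id: str
    status: str                      # "ok" or "failed"
    duration_s: float
    sample_count: int
    stack_trace: Optional[str] = None


async def record_run(device_id: str, poll: Callable[[], Awaitable[int]]) -> RunRow:
    """Run one poll cycle and describe it as a row, success or failure."""
    started = time.monotonic()
    try:
        count = await poll()
        return RunRow(device_id, "ok", time.monotonic() - started, count)
    except Exception:
        # Keep the traceback so the failure stays readable after the fact.
        return RunRow(device_id, "failed", time.monotonic() - started, 0,
                      stack_trace=traceback.format_exc())


async def good_poll() -> int:
    return 3  # pretend three samples were read


async def bad_poll() -> int:
    raise ConnectionError("session dropped")


ok_row = asyncio.run(record_run("plc-01", good_poll))
failed_row = asyncio.run(record_run("plc-01", bad_poll))
print(ok_row.status, ok_row.sample_count)
print(failed_row.status, failed_row.stack_trace.splitlines()[-1])
```

Catching the exception inside the wrapper means a failed poll still produces a row, which is what makes the run history useful for diagnosis.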

Browsing the address space

OPC UA exposes the PLC’s data as a tree — folders, objects, and variables, each with a node-id, type, and access level. The logger walks the tree on connect, learns the shape, and only persists what the device config says to persist.

opc ua · address-space browser
session: connected · poll: 500 ms
  • opc.tcp://plc-01:4840
    • Objects
      • Production
        • Line 1
          • Speed · Int32 · 1240 rpm
          • Status · Boolean · true
          • Temperature · Float · 62.4 °C
        • Line 2
      • Diagnostics
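
The walk-then-filter step can be illustrated without a live server. The sketch below uses a plain in-memory `Node` class as a stand-in (it is not the asyncua node type), assumes the tree shape suggested by the browser view above, and uses a hypothetical set of configured paths to decide what gets persisted.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Set, Tuple


@dataclass
class Node:
    """In-memory stand-in for an OPC UA node."""
    name: str
    data_type: Optional[str] = None          # set only on variables
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, Node]]:
    """Depth-first walk yielding (path, node) for every variable in the tree."""
    here = path + (node.name,)
    if node.data_type is not None:
        yield "/".join(here), node
    for child in node.children:
        yield from walk(child, here)


tree = Node("Objects", children=[
    Node("Production", children=[
        Node("Line 1", children=[
            Node("Speed", "Int32"),
            Node("Status", "Boolean"),
            Node("Temperature", "Float"),
        ]),
        Node("Line 2"),
    ]),
    Node("Diagnostics"),
])

# Hypothetical device config: only these variable paths are persisted.
persist: Set[str] = {
    "Objects/Production/Line 1/Speed",
    "Objects/Production/Line 1/Temperature",
}

discovered = [p for p, _ in walk(tree)]
selected = [p for p in discovered if p in persist]
print(selected)
```

Discovery and selection stay separate here: the walk learns everything the server exposes, and the config narrows it to what is stored.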

End to end, in one diagram

Each hop has one job. The logger owns the OPC UA session, Postgres owns the samples, the REST surface owns the contract, and clients never need to know the upstream protocol existed.

pipeline · plc → cloud → client
delivery: at-least-once · contract: versioned
  • PLC / OPC UA: industrial source
  • Logger service: asyncua · async
  • Postgres: samples · runs · devices
  • REST v1: OpenAPI · versioned
  • Clients: web · BI · scripts
subscribe + backfill · schema migrations on deploy · API surface is the only contract
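
The first hop's "subscribe when possible, poll otherwise" behaviour can be sketched with a stand-in server. Everything here is hypothetical scaffolding (`FakeServer`, `SubscriptionsUnsupported`, `collect`); it does not use the real OPC UA client API and only illustrates the fallback decision.

```python
import asyncio
from typing import Callable, Dict, List, Tuple


class SubscriptionsUnsupported(Exception):
    """Raised by the stand-in server when it cannot serve subscriptions."""


class FakeServer:
    """Stand-in for an OPC UA server; not a real client library API."""

    def __init__(self, values: Dict[str, float], supports_subscriptions: bool):
        self.values = values
        self.supports_subscriptions = supports_subscriptions

    async def subscribe(self, node_ids: List[str],
                        on_change: Callable[[str, float], None]) -> None:
        if not self.supports_subscriptions:
            raise SubscriptionsUnsupported()
        for node_id in node_ids:              # deliver each node's initial value
            on_change(node_id, self.values[node_id])

    async def read_values(self, node_ids: List[str]) -> List[float]:
        return [self.values[n] for n in node_ids]


async def collect(server: FakeServer,
                  node_ids: List[str]) -> Tuple[str, List[Tuple[str, float]]]:
    """Prefer a subscription; fall back to a single poll if the server refuses."""
    received: List[Tuple[str, float]] = []
    try:
        await server.subscribe(node_ids, lambda n, v: received.append((n, v)))
        return "subscribe", received
    except SubscriptionsUnsupported:
        values = await server.read_values(node_ids)
        received.extend(zip(node_ids, values))
        return "poll", received


values = {"Speed": 1240.0, "Temperature": 62.4}
nodes = ["Speed", "Temperature"]
mode_a, data_a = asyncio.run(collect(FakeServer(values, True), nodes))
mode_b, data_b = asyncio.run(collect(FakeServer(values, False), nodes))
print(mode_a, mode_b, data_a == data_b)
```

Both paths produce the same (node, value) pairs, so everything downstream of this hop does not need to know which mode was used.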

The poll loop

The hot path is intentionally small. Subscribe when the server supports it, poll when it does not, write idempotently, and let the schema enforce uniqueness rather than the application.

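logger/poll.py · python
# opc_session, samples, now, and the Device / Sample / RunResult models
# are defined elsewhere in the project and are not shown here.
async def poll_once(device: Device) -> RunResult:
    # Hold the OPC UA session only for the duration of the read.
    async with opc_session(device.endpoint) as client:
        values = await client.read_values(device.node_ids)

    # One timestamp per poll cycle, shared by every tag read in that cycle.
    ts = now()
    rows = [
        Sample(device_id=device.id, ts=ts, tag=tag, value=val)
        for tag, val in zip(device.tags, values)
    ]

    # ON CONFLICT (device_id, ts, tag) DO NOTHING
    inserted = await samples.bulk_insert_idempotent(rows)

    return RunResult(
        device_id=device.id,
        polled=len(rows),
        inserted=inserted,
        ok=True,
    )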
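
To see why letting the schema enforce uniqueness makes retries safe, here is a small self-contained sketch. It swaps in SQLite (standard library) for Postgres so it runs anywhere; the table layout and this `bulk_insert_idempotent` are illustrative assumptions, not the project's real schema or helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE samples (
        device_id TEXT NOT NULL,
        ts        TEXT NOT NULL,
        tag       TEXT NOT NULL,
        value     REAL,
        UNIQUE (device_id, ts, tag)
    )
""")


def row_count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]


def bulk_insert_idempotent(conn: sqlite3.Connection, rows) -> int:
    """Insert rows, skipping duplicates; return how many were actually new."""
    before = row_count(conn)
    conn.executemany(
        "INSERT INTO samples (device_id, ts, tag, value) VALUES (?, ?, ?, ?) "
        "ON CONFLICT (device_id, ts, tag) DO NOTHING",
        rows,
    )
    conn.commit()
    return row_count(conn) - before


rows = [
    ("plc-01", "2024-01-01T00:00:00Z", "Speed", 1240.0),
    ("plc-01", "2024-01-01T00:00:00Z", "Temperature", 62.4),
]
first = bulk_insert_idempotent(conn, rows)
second = bulk_insert_idempotent(conn, rows)   # retry of the same batch
total = row_count(conn)
print(first, second, total)
```

The application never checks for duplicates itself; the unique constraint does, so a replayed batch simply inserts nothing.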

From push to deploy

Every push to main runs the same pipeline: install, lint, test, build, push the image, deploy. Nothing about the production deploy is a one-off — the workflow file is the deploy.

.github/workflows/release.yml
runner: ubuntu-24.04 · trigger: push · main
this run: a8c1f3e · total: 2m 15s
  1. install · 7s
  2. lint · 4s
  3. test · 38s
  4. build image · 52s
  5. push :sha · 11s
  6. deploy · 23s
recent runs
  • a8c1f3e · main · fix: backfill window inclusive · 2m ago
  • 4d2bf91 · main · feat: add /v1/runs filter · 38m ago
  • 7f014a0 · feat/opc-resub · wip: resubscribe on dc · 1h ago
  • 9bb22e1 · main · chore: bump pyopcua · 3h ago

The workflow file

One workflow, two jobs, six stages, one image. Caching keeps the typical run under three minutes. Git tags trigger a separate workflow that pins the image to the tag name instead of using :sha.

.github/workflows/release.yml · yaml
name: release
on:
  push:
    branches: [main]

jobs:
  ci:
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.12', cache: pip }
      - run: pip install -r requirements.txt
      - run: ruff check .
      - run: pytest -q

  ship:
    needs: ci
    runs-on: ubuntu-24.04
    permissions: { packages: write, contents: read }
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
      - uses: ./.github/actions/deploy
        with:
          tag: ${{ github.sha }}

Stack, by layer

Each layer has one boundary and one contract. Swap the runtime, storage, or registry without touching the layer above or below.

stack · four layers · one image
shipped as: single container
  1. Layer 04 · Delivery
    Docker images on a registry, deployed by GitHub Actions to the cloud target.
    • GitHub Actions
    • Docker
    • GHCR
    • Cloud Run / Fly.io
  2. Layer 03 · API
    Versioned REST. OpenAPI is the contract; breaking changes get a new prefix.
    • FastAPI
    • Pydantic
    • OpenAPI 3.1
    • Uvicorn
  3. Layer 02 · Service
    Async OPC UA client. Subscribes when it can, polls when it must, idempotent on retry.
    • Python 3.12
    • asyncua
    • APScheduler
    • tenacity
  4. Layer 01 · Storage
    Postgres for samples, runs, and devices. Migrations versioned with the code.
    • PostgreSQL
    • SQLAlchemy
    • Alembic
plc → service → storage → api → client · swap storage or runtime with one env var
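
Configuration through environment variables can be sketched as a single settings object read once at startup. The variable names (`DATABASE_URL`, `POLL_INTERVAL_MS`, `API_PREFIX`) and the defaults below are assumptions for illustration; the real names are not listed here.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    poll_interval_ms: int
    api_prefix: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from environment variables; all names are hypothetical."""
    return Settings(
        database_url=env.get("DATABASE_URL", "postgresql://localhost/logger"),
        poll_interval_ms=int(env.get("POLL_INTERVAL_MS", "500")),
        api_prefix=env.get("API_PREFIX", "/v1"),
    )


defaults = load_settings({})
overridden = load_settings({
    "DATABASE_URL": "postgresql://db.internal/logger",
    "POLL_INTERVAL_MS": "1000",
})
print(defaults.poll_interval_ms, overridden.database_url)
```

Passing the environment in as a mapping keeps the loader testable, and pointing the container at different storage becomes a change to one variable rather than to code.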

How a sample becomes a JSON row

  • 01 · source

    OPC UA value

    A PLC variable updates. The logger subscription receives it, or the next poll cycle picks it up.

  • 02 · normalize

    Schema-aware row

    Tagged with device id, timestamp, and node path. Type-coerced into the storage schema.

  • 03 · persist

    Idempotent write

    INSERT ... ON CONFLICT (device_id, ts, tag) DO NOTHING. Backfills and retries are safe by construction.

  • 04 · serve

    REST v1 contract

    Read endpoints expose the same row through a stable JSON shape. No client ever sees OPC UA.

  • 05 · ship

    Container in production

    Image built and deployed on every push to main. Configuration is environment variables. Rolling restart on update.
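
Steps 02 and 04 above, normalizing a raw value and serving it as JSON, can be sketched as follows. The field names and the choice to store every value as a float are assumptions made for this illustration; the actual storage schema and response shape are not spelled out here.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Union

Value = Union[bool, int, float]


@dataclass(frozen=True)
class SampleRow:
    device_id: str
    ts: str        # ISO 8601, UTC
    tag: str       # node path
    value: float


def normalize(device_id: str, tag: str, raw: Value, ts: datetime) -> SampleRow:
    """Coerce a raw OPC UA value into the (assumed) storage schema."""
    return SampleRow(
        device_id=device_id,
        ts=ts.astimezone(timezone.utc).isoformat(),
        tag=tag,
        value=float(raw),   # Boolean becomes 0.0/1.0, Int32 becomes a float
    )


ts = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
row = normalize("plc-01", "Production/Line 1/Speed", 1240, ts)
flag = normalize("plc-01", "Production/Line 1/Status", True, ts)
body = json.dumps(asdict(row))
print(body)
```

The JSON body carries only device, timestamp, tag, and value, so nothing about the upstream protocol leaks to the client.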

Where it stands

The logger is in production for the OPC UA fleets it was built for. Adding a new device is two lines of YAML and a push; the rest of the stack carries it from there.