
Data Collection

This guide explains how edge devices collect and upload data to Acusight, and how that data flows through the MLOps pipeline.

Overview

Acusight supports two data upload modes depending on your network architecture:
| Mode | When to Use | How It Works |
|------|-------------|--------------|
| LAN Mode | Edge devices can reach MinIO directly | Devices upload to MinIO, notify core service |
| Cloud Mode | Edge devices behind NAT/firewall | Devices upload through core API (proxy to MinIO) |

Cloud Mode (Default)

In cloud mode, the edge agent sends images directly to the core API, which proxies them to MinIO.
Edge Device                    Acusight Platform
┌─────────────┐    HTTP POST   ┌─────────────┐    S3 PUT    ┌─────────┐
│ Agent       │───────────────▶│ Core Service│─────────────▶│  MinIO  │
│             │                │             │              │         │
└─────────────┘                └──────┬──────┘              └─────────┘
                                      │
                                      │ Notify
                                      ▼
                               ┌─────────────┐
                               │Data Service │
                               └─────────────┘

Upload Endpoint

POST /api/device/images/upload
Headers:
  • X-Device-Key: adk_<your-api-key> (required)
  • Content-Type: multipart/form-data
Form Fields:
  • image: The image file (required)
  • metadata: JSON object with additional info (optional)

Example Upload

curl -X POST http://<SERVER>:8080/api/device/images/upload \
  -H "X-Device-Key: adk_your_api_key" \
  -F "image=@/path/to/image.jpg" \
  -F 'metadata={"source":"camera-1","timestamp":"2024-01-15T10:30:00Z"}'
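For scripted uploads without curl, the same request can be assembled by hand. The following is a minimal sketch using only the Python standard library: it builds the multipart/form-data body with the image and metadata fields described above, and constructs the request without sending it. The helper name build_upload_request and the image/jpeg part type are assumptions for illustration, not part of the Acusight API.

```python
import json
import uuid
import urllib.request


def build_upload_request(server, api_key, filename, image_bytes, metadata=None):
    """Build (but do not send) a POST to /api/device/images/upload."""
    boundary = uuid.uuid4().hex
    parts = []

    # Required "image" field carrying the file contents.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
            "Content-Type: image/jpeg\r\n\r\n"  # assumed; adjust for other formats
        ).encode("utf-8")
        + image_bytes
        + b"\r\n"
    )

    # Optional "metadata" field, serialized as a JSON object.
    if metadata is not None:
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="metadata"\r\n\r\n'
                f"{json.dumps(metadata)}\r\n"
            ).encode("utf-8")
        )

    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{server}/api/device/images/upload",
        data=body,
        method="POST",
        headers={
            "X-Device-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to urllib.request.urlopen sends the upload; the response format is not described in this guide, so handle it according to what your server returns.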

Agent Automatic Upload

The Acusight agent automatically handles uploads when configured. Place images in the watch directory:
# Default watch directory inside agent container
/data/uploads/

# The agent will:
# 1. Detect new images
# 2. Upload to core API
# 3. Move processed images to /data/uploaded/
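The detect, upload, and move steps above can be sketched as a single pass over the watch directory. This is an illustration of the idea, not the agent's actual implementation: the function name process_watch_dir, the set of watched extensions, and the upload callback are all hypothetical.

```python
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of watched extensions


def process_watch_dir(watch_dir, done_dir, upload):
    """Upload each image in watch_dir once, then move it to done_dir.

    `upload` is a caller-supplied function taking a Path and returning True
    on success. Files that fail to upload stay in place for the next pass.
    """
    watch_dir, done_dir = Path(watch_dir), Path(done_dir)
    done_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(watch_dir.iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if upload(path):
            shutil.move(str(path), str(done_dir / path.name))
            moved.append(path.name)
    return moved
```

Moving a file only after a successful upload means an interrupted pass leaves unsent images in the watch directory, where the next pass picks them up again.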

LAN Mode

In LAN mode, devices upload directly to MinIO and then notify the core service. This is faster than cloud mode and puts less load on the core service.
Edge Device                                  Acusight Platform
┌─────────────┐    S3 PUT      ┌─────────┐
│ Agent       │───────────────▶│  MinIO  │
│             │                │         │
└──────┬──────┘                └─────────┘
       │
       │ Notify                ┌─────────────┐
       └──────────────────────▶│ Core Service│
                               └──────┬──────┘
                                      │ Notify
                                      ▼
                               ┌─────────────┐
                               │Data Service │
                               └─────────────┘

To enable LAN mode, configure the agent with MinIO credentials:
# /etc/acusight/agent.conf
ACUSIGHT_DEVICE_ID=...
ACUSIGHT_API_KEY=...
ACUSIGHT_API_ENDPOINT=http://<SERVER>:8080
ACUSIGHT_MINIO_ENDPOINT=http://<SERVER>:9010
ACUSIGHT_MINIO_ACCESS_KEY=minioadmin
ACUSIGHT_MINIO_SECRET_KEY=minioadmin
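To check a config file before deploying it, a small script can parse the KEY=VALUE lines and report which mode the settings imply. The sketch below assumes that LAN mode applies when all three MinIO settings are present; the real agent may use a different rule, and the function names are hypothetical.

```python
def parse_agent_conf(text):
    """Parse KEY=VALUE lines, ignoring blanks and # comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf


LAN_KEYS = (
    "ACUSIGHT_MINIO_ENDPOINT",
    "ACUSIGHT_MINIO_ACCESS_KEY",
    "ACUSIGHT_MINIO_SECRET_KEY",
)


def upload_mode(conf):
    """Return "lan" when all MinIO settings are present, else "cloud".

    Assumption: the real agent may use a different rule to pick the mode.
    """
    return "lan" if all(conf.get(k) for k in LAN_KEYS) else "cloud"
```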

Data Flow Pipeline

Once images are uploaded, they flow through the MLOps pipeline:
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Upload    │───▶│   Batch     │───▶│  Annotate   │───▶│  Dataset    │
│  (raw-data) │    │ (collecting)│    │ (annotating)│    │  Version    │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘

1. Raw Data Pool

  • Images land in MinIO bucket: raw-data/<uuid>_<filename>
  • Automatically grouped into batches based on device and time window
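As an illustration of the raw-data/<uuid>_<filename> naming pattern, the sketch below builds such a key for an incoming file. It assumes a hyphenated random UUID and treats the whole pattern as a single path string; the platform's exact UUID format and bucket/key split are not specified here, and make_object_key is a hypothetical helper.

```python
import uuid
from pathlib import PurePosixPath


def make_object_key(filename):
    """Return a key of the form raw-data/<uuid>_<filename>."""
    # Keep only the final path component so a client path cannot leak into the key.
    name = PurePosixPath(filename.replace("\\", "/")).name
    return f"raw-data/{uuid.uuid4()}_{name}"
```

The UUID prefix keeps two uploads with the same filename from overwriting each other.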

2. Batch Collection

  • Collecting: Active batch receiving new images
  • Complete: Batch closed (time window ended or manual close)
  • Unassigned: Ready to be assigned to a project

3. Annotation

Once assigned to a project:
  • Annotating: Images being labeled
  • Annotators add bounding boxes, classes, etc.

4. Dataset Version

  • Create immutable snapshots with train/val/test splits
  • Export in various formats (YOLO, COCO, etc.)
  • Use for model training
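Because a dataset version is an immutable snapshot, each image should land in the same split every time the assignment is computed. One common way to get that property is to hash a stable image identifier. The sketch below shows the general technique only; it is not necessarily how Acusight assigns splits, and the function name and default percentages are assumptions.

```python
import hashlib


def assign_split(image_id, val_pct=10, test_pct=10):
    """Map an image id to "train", "val", or "test" deterministically."""
    digest = hashlib.sha256(image_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable value in 0..99
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "val"
    return "train"
```

Since the result depends only on the identifier, adding new images to a later version never moves an existing image from one split to another.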

Batch Configuration

Batches are automatically created based on configurable rules:
| Setting | Description | Default |
|---------|-------------|---------|
| batch_timeout | Time window for collecting images | 30 minutes |
| batch_max_images | Maximum images per batch | 1000 |
| auto_close | Automatically close when timeout reached | true |
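How the three batch settings might combine into a close decision can be sketched as follows. This is a reading of the batch rules, not the data service's actual logic. It assumes the timeout is measured from when the batch opened and that reaching batch_max_images closes the batch regardless of auto_close. The BatchSettings class and should_close function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BatchSettings:
    batch_timeout: timedelta = timedelta(minutes=30)
    batch_max_images: int = 1000
    auto_close: bool = True


def should_close(opened_at, image_count, now, settings=BatchSettings()):
    """Return True when a collecting batch should move to complete."""
    # Assumed: a full batch always closes, independent of auto_close.
    if image_count >= settings.batch_max_images:
        return True
    # Assumed: the time window starts when the batch was opened.
    timed_out = now - opened_at >= settings.batch_timeout
    return settings.auto_close and timed_out
```

Under these assumptions, a batch with auto_close disabled stays in the collecting state after the time window ends until it fills up or is closed manually.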

Monitoring Uploads

Check Recent Batches

curl http://<SERVER>:8080/api/data/batches \
  -H "Authorization: Bearer <TOKEN>"

Check Batch Images

curl http://<SERVER>:8080/api/data/batches/<BATCH_ID>/images \
  -H "Authorization: Bearer <TOKEN>"

View Upload Metrics

The core service exposes Prometheus metrics:
curl http://<SERVER>:8080/metrics | grep acusight_images
Key metrics:
  • acusight_images_uploaded_total - Total images uploaded
  • acusight_upload_duration_seconds - Upload latency histogram
  • acusight_upload_size_bytes - Image size histogram
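For scripted checks, the text returned by /metrics can be filtered and parsed rather than read by eye. The sketch below is a minimal parser for the Prometheus text format that keeps only samples whose names start with a given prefix. It assumes sample lines carry no trailing timestamp, and parse_metric_samples is a hypothetical helper rather than part of any Acusight tooling.

```python
def parse_metric_samples(text, prefix="acusight_"):
    """Return {sample_name_with_labels: float} for lines starting with prefix."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        if not line.startswith(prefix):
            continue
        # Assumed: no timestamp, so the value is the last space-separated token.
        name, _, value = line.rpartition(" ")
        samples[name.strip()] = float(value)
    return samples
```

Histogram metrics such as acusight_upload_duration_seconds appear as several samples (buckets, sum, and count), so each one becomes its own entry in the result.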

Troubleshooting

| Problem | Cause | Solution |
|---------|-------|----------|
| Upload returns 401 | Invalid or expired API key | Check agent config |
| Upload returns 413 | Image too large | Check size limits |
| Images not appearing in batches | Data service not notified | Check core service logs |
| Batch stuck in “collecting” | Timeout not triggered | Manually close batch or check settings |

Next Steps