# Data Collection

> How edge devices send data to Acusight

This guide explains how edge devices collect and upload data to Acusight, and how that data flows through the MLOps pipeline.

## Overview

Acusight supports two data upload modes depending on your network architecture:

| Mode           | When to Use                           | How It Works                                     |
| -------------- | ------------------------------------- | ------------------------------------------------ |
| **LAN Mode**   | Edge devices can reach MinIO directly | Devices upload to MinIO, notify core service     |
| **Cloud Mode** | Edge devices behind NAT/firewall      | Devices upload through core API (proxy to MinIO) |

## Cloud Mode (Default)

In cloud mode, the edge agent sends each image to the core API over HTTP. The core service proxies the image to MinIO and then notifies the data service.

```
Edge Device                    Acusight Platform
┌─────────────┐    HTTP POST   ┌─────────────┐    S3 PUT    ┌─────────┐
│ Agent       │───────────────▶│ Core Service│─────────────▶│  MinIO  │
│             │                │             │              │         │
└─────────────┘                └──────┬──────┘              └─────────┘
                                      │
                                      │ Notify
                                      ▼
                               ┌─────────────┐
                               │Data Service │
                               └─────────────┘
```

### Upload Endpoint

```
POST /api/device/images/upload
```

**Headers:**

* `X-Device-Key: adk_<your-api-key>` (required)
* `Content-Type: multipart/form-data`

**Form Fields:**

* `image`: The image file (required)
* `metadata`: JSON object with additional info, sent as a string form field (optional)

### Example Upload

```bash theme={null}
curl -X POST http://<SERVER>:8080/api/device/images/upload \
  -H "X-Device-Key: adk_your_api_key" \
  -F "image=@/path/to/image.jpg" \
  -F 'metadata={"source":"camera-1","timestamp":"2024-01-15T10:30:00Z"}'
```
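For scripted uploads, a small wrapper can stamp each image with the current UTC time. The following is a minimal sketch, assuming the `source` and `timestamp` metadata keys from the example above. The function names `build_metadata` and `upload_image`, and the `ACUSIGHT_SERVER` variable, are hypothetical and not part of the Acusight tooling.

```bash
# Build the metadata JSON for one image.
# $1 = source name, $2 = optional ISO 8601 timestamp (defaults to now, UTC).
# The source name is inserted verbatim, so it must not contain quotes or backslashes.
build_metadata() {
  local source="$1"
  local timestamp="${2:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}"
  printf '{"source":"%s","timestamp":"%s"}' "$source" "$timestamp"
}

# Upload one image through the core API (cloud mode).
# $1 = image path, $2 = source name.
# ACUSIGHT_SERVER and ACUSIGHT_API_KEY are assumed to be set by the caller.
upload_image() {
  local image="$1" source="$2"
  # --form-string sends the JSON literally, without curl interpreting ';' or '@'.
  curl --fail --silent --show-error -X POST \
    "http://${ACUSIGHT_SERVER}:8080/api/device/images/upload" \
    -H "X-Device-Key: ${ACUSIGHT_API_KEY}" \
    -F "image=@${image}" \
    --form-string "metadata=$(build_metadata "$source")"
}
```

With both variables exported, `upload_image /path/to/image.jpg camera-1` would send the same request as the curl example, with the timestamp filled in automatically.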

### Agent Automatic Upload

The Acusight agent automatically handles uploads when configured. Place images in the watch directory:

```bash theme={null}
# Default watch directory inside agent container
/data/uploads/

# The agent will:
# 1. Detect new images
# 2. Upload to core API
# 3. Move processed images to /data/uploaded/
```
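The three steps above can be approximated with a simple polling loop. This is an illustrative sketch only, not the agent's actual implementation: how the agent detects new files is not documented here, and the function name `process_uploads`, the file extensions, and the pluggable upload command are assumptions.

```bash
# Process every image currently in the watch directory once.
# $1 = watch dir, $2 = done dir,
# $3 = command that uploads one file (called as: $3 <path>; exits 0 on success).
process_uploads() {
  local watch_dir="$1" done_dir="$2" upload_cmd="$3" file
  mkdir -p "$done_dir"
  # The extension list is an assumption; adjust it to the formats you capture.
  for file in "$watch_dir"/*.jpg "$watch_dir"/*.jpeg "$watch_dir"/*.png; do
    [ -f "$file" ] || continue          # skip glob patterns that matched nothing
    if "$upload_cmd" "$file"; then
      mv "$file" "$done_dir"/           # move only after a successful upload
    else
      echo "upload failed, will retry: $file" >&2
    fi
  done
}

# Example polling loop (my_upload is a hypothetical function that uploads one file):
#   while true; do process_uploads /data/uploads /data/uploaded my_upload; sleep 5; done
```

Failed uploads stay in the watch directory, so the next pass retries them.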

## LAN Mode

In LAN mode, devices upload directly to MinIO and then notify the core service. This is faster and puts less load on the core service.

```
Edge Device                    Acusight Platform
┌─────────────┐    S3 PUT      ┌─────────┐
│ Agent       │───────────────▶│  MinIO  │
│             │                │         │
└──────┬──────┘                └─────────┘
       │
       │ Notify                ┌─────────────┐
       └──────────────────────▶│ Core Service│
                               └──────┬──────┘
                                      │ Notify
                                      ▼
                               ┌─────────────┐
                               │Data Service │
                               └─────────────┘
```

To enable LAN mode, configure the agent with MinIO credentials:

```bash theme={null}
# /etc/acusight/agent.conf
ACUSIGHT_DEVICE_ID=...
ACUSIGHT_API_KEY=...
ACUSIGHT_API_ENDPOINT=http://<SERVER>:8080
ACUSIGHT_MINIO_URL=http://<SERVER>:9010
ACUSIGHT_MINIO_ACCESS_KEY=minioadmin
ACUSIGHT_MINIO_SECRET_KEY=minioadmin
```
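Before restarting the agent, it can help to confirm that the file defines every setting shown above. The sketch below assumes all six variables are needed for LAN mode and that the file uses plain `KEY=value` lines; `check_lan_config` is a hypothetical helper, not an Acusight command.

```bash
# Report any LAN-mode setting that is missing or empty in an agent config file.
# $1 = path to the config file (KEY=value lines).
# Prints one missing key per line; exits non-zero if any are missing.
check_lan_config() {
  local conf="$1" key missing=0
  for key in ACUSIGHT_DEVICE_ID ACUSIGHT_API_KEY ACUSIGHT_API_ENDPOINT \
             ACUSIGHT_MINIO_URL ACUSIGHT_MINIO_ACCESS_KEY ACUSIGHT_MINIO_SECRET_KEY; do
    if ! grep -Eq "^${key}=.+" "$conf"; then
      echo "$key"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_lan_config /etc/acusight/agent.conf` prints nothing and exits 0 when all six settings are present.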

## Data Flow Pipeline

Once images are uploaded, they flow through the MLOps pipeline:

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Upload    │───▶│   Batch     │───▶│  Annotate   │───▶│  Dataset    │
│  (raw-data) │    │ (collecting)│    │ (annotating)│    │  Version    │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
```

### 1. Raw Data Pool

* Images land in MinIO bucket: `raw-data/<uuid>_<filename>`
* Automatically grouped into batches based on device and time window
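The object naming and grouping described above can be illustrated as follows. This is a sketch under stated assumptions: the key layout follows the `raw-data/<uuid>_<filename>` pattern, but the grouping rule (fixed windows aligned to the Unix epoch, 30 minutes by default) is a guess at one possible scheme, and the real service may instead start a window at a batch's first image. Both function names are hypothetical.

```bash
# Build the object key for an uploaded file: raw-data/<uuid>_<filename>.
# $1 = UUID, $2 = file path (only the base name is used).
raw_data_key() {
  local uuid="$1" filename="$2"
  printf 'raw-data/%s_%s' "$uuid" "$(basename "$filename")"
}

# Grouping key for batching: device ID plus the index of the time window
# that contains the upload time (epoch seconds divided by the window length).
# $1 = device ID, $2 = upload time in epoch seconds, $3 = window in seconds (default 1800).
batch_group() {
  local device_id="$1" epoch="$2" window_secs="${3:-1800}"
  printf '%s:%d' "$device_id" "$((epoch / window_secs))"
}
```

Under this scheme, two images from the same device share a batch only when they produce the same grouping key.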

### 2. Batch Collection

Batch states at this stage:

* **Collecting**: Active batch receiving new images
* **Complete**: Batch closed (time window ended or manual close)
* **Unassigned**: Ready to be assigned to a project

### 3. Annotation

Once assigned to a project:

* **Annotating**: Images being labeled
* Annotators add bounding boxes, classes, etc.

### 4. Dataset Version

* Create immutable snapshots with `train`/`val`/`test` splits
* Export in various formats (YOLO, COCO, etc.)
* Use for model training

## Batch Configuration

Batches are automatically created based on configurable rules:

| Setting            | Description                              | Default    |
| ------------------ | ---------------------------------------- | ---------- |
| `batch_timeout`    | Time window for collecting images        | 30 minutes |
| `batch_max_images` | Maximum images per batch                 | 1000       |
| `auto_close`       | Automatically close when timeout reached | true       |
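To make the interaction between these settings concrete, the sketch below shows one plausible closing rule. It assumes that a batch closes when it reaches `batch_max_images`, or when `auto_close` is true and the batch is at least `batch_timeout` old; the actual service logic may differ, and `should_close_batch` is a hypothetical name.

```bash
# Decide whether a collecting batch should be closed.
# $1 = batch age in seconds, $2 = image count,
# $3 = batch_timeout in seconds (default 1800, i.e. 30 minutes),
# $4 = batch_max_images (default 1000), $5 = auto_close (default true).
# Exits 0 if the batch should close, 1 otherwise.
should_close_batch() {
  local age="$1" count="$2" timeout="${3:-1800}" max_images="${4:-1000}" auto_close="${5:-true}"
  if [ "$count" -ge "$max_images" ]; then
    return 0                              # batch is full (assumed rule)
  fi
  if [ "$auto_close" = "true" ] && [ "$age" -ge "$timeout" ]; then
    return 0                              # time window ended
  fi
  return 1
}
```

For example, `should_close_batch 4000 10 1800 1000 false` exits 1: with `auto_close` disabled, an old batch stays open until it is closed manually or fills up.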

## Monitoring Uploads

### Check Recent Batches

```bash theme={null}
curl http://<SERVER>:8080/api/data/batches \
  -H "Authorization: Bearer <TOKEN>"
```

### Check Batch Images

```bash theme={null}
curl http://<SERVER>:8080/api/data/batches/<BATCH_ID>/images \
  -H "Authorization: Bearer <TOKEN>"
```

### View Upload Metrics

The core service exposes Prometheus metrics:

```bash theme={null}
curl -s http://<SERVER>:8080/metrics | grep acusight_
```

Key metrics:

* `acusight_images_uploaded_total` - Total images uploaded
* `acusight_upload_duration_seconds` - Upload latency histogram
* `acusight_upload_size_bytes` - Image size histogram
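To pull a single number out of the metrics output, for instance in a health check, the text can be filtered with `awk`. This is a minimal sketch that assumes the standard Prometheus text exposition format with no trailing timestamps; `metric_total` is a hypothetical helper.

```bash
# Print the total of one Prometheus counter or gauge from exposition text on stdin.
# $1 = metric name. Values are summed across label sets; '#' comment lines are skipped.
# Assumes each sample line ends with its value (no optional timestamp).
metric_total() {
  awk -v name="$1" '
    /^#/ { next }
    {
      i = index($1, "{")                          # strip any {label="..."} suffix
      metric = (i > 0) ? substr($1, 1, i - 1) : $1
      if (metric == name) total += $NF
    }
    END { print total + 0 }
  '
}

# Example:
#   curl -s http://<SERVER>:8080/metrics | metric_total acusight_images_uploaded_total
```

A metric that does not appear in the input yields `0`.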

## Troubleshooting

| Problem                         | Cause                      | Solution                                                                    |
| ------------------------------- | -------------------------- | --------------------------------------------------------------------------- |
| Upload returns 401              | Invalid or expired API key | Verify `ACUSIGHT_API_KEY` in the agent config and the `X-Device-Key` header |
| Upload returns 413              | Image too large            | Reduce the image size or check the server's upload size limit               |
| Images not appearing in batches | Data service not notified  | Check core service logs                                                     |
| Batch stuck in "collecting"     | Timeout not triggered      | Manually close the batch, or check `batch_timeout` and `auto_close`         |
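In upload scripts, the status codes from the table above can be turned into readable hints. The sketch below is illustrative: `explain_upload_status` is a hypothetical helper, and the fallback hint for other status codes is an assumption rather than documented behavior.

```bash
# Map an upload HTTP status code to a troubleshooting hint.
# $1 = HTTP status code returned by the upload endpoint.
explain_upload_status() {
  case "$1" in
    2??) echo "ok" ;;
    401) echo "invalid or expired API key: check agent config" ;;
    413) echo "image too large: check size limits" ;;
    *)   echo "unexpected status $1: check core service logs" ;;  # assumed fallback
  esac
}

# Example: capture only the status code of an upload, then explain it.
#   status=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
#     http://<SERVER>:8080/api/device/images/upload \
#     -H "X-Device-Key: adk_your_api_key" -F "image=@/path/to/image.jpg")
#   explain_upload_status "$status"
```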

## Next Steps

<CardGroup cols={2}>
  <Card title="API Reference" icon="code" href="/api-reference/introduction">
    Full API documentation
  </Card>

  <Card title="Device Provisioning" icon="plus" href="/guides/deploy/device-provisioning">
    Add more devices
  </Card>
</CardGroup>
