Question 1

Is my image uploaded anywhere?

Accepted Answer

No. Whichever model you pick runs entirely in your browser, using your GPU via WebGPU when available and WebAssembly otherwise. The image is processed on your device and never uploaded. Only the model itself is downloaded, once, then cached for instant reuse.
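For readers curious how a "WebGPU when available, WebAssembly otherwise" choice can be made, here is a minimal sketch. It is not the tool's actual code: `pickBackend`, `Backend`, and `GpuLike` are hypothetical names, and the only browser call assumed is the standard `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable GPU is found.

```typescript
// Hypothetical names for illustration only; not taken from the tool's source.
type Backend = "webgpu" | "wasm";

// Minimal structural type so the sketch compiles without WebGPU type definitions.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

// Returns "webgpu" only when an adapter can actually be obtained;
// in every other case it falls back to WebAssembly.
async function pickBackend(gpu: GpuLike | undefined): Promise<Backend> {
  if (!gpu) return "wasm"; // the browser does not expose WebGPU at all
  try {
    const adapter = await gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // null means no usable GPU
  } catch {
    return "wasm"; // treat any failure as "not available"
  }
}
```

In a browser this would be called as `pickBackend((navigator as any).gpu)`. Either way the image stays on the device; only the execution backend differs.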

Question 2

What kinds of objects can it detect?

Accepted Answer

It detects 80 common object types from the COCO dataset: people, animals (cats, dogs, birds, horses), vehicles (cars, bikes, buses, planes), and everyday objects (chairs, laptops, bottles, cups, phones, TV remotes, and more).

Question 3

Which model powers the detection?

Accepted Answer

You pick the tier. Fast is YOLO11n, a real-time detector that ships as a roughly 10 MB ONNX model, so it loads quickly. Best is D-FINE-S, an Apache-2.0 transformer-based detector that is noticeably more accurate (about 48.5 COCO AP versus 39.5 for YOLO11n) for a similar, roughly 11 MB download. Both predict a labeled box and a confidence score for each object and run locally in your browser (WebGPU-accelerated when available, WebAssembly otherwise).

Question 4

What does the confidence slider do?

Accepted Answer

Each detection has a score from 0 to 1 for how sure the model is. The slider hides any box below the threshold you pick, so you can trim weak or spurious guesses. It re-filters the existing results instantly without re-running the model.
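The reason the slider responds instantly is that filtering is a cheap pass over results the model has already produced. A minimal sketch of that idea follows; the `Detection` shape and the `filterByConfidence` name are assumptions made for illustration, not the tool's actual data model.

```typescript
// Hypothetical result shape: one labeled box with a confidence score.
interface Detection {
  label: string;
  score: number; // confidence between 0 and 1
  box: [number, number, number, number]; // assumed layout: x, y, width, height
}

// Keeps only detections at or above the threshold. The input array is not
// modified, so moving the slider back down can show the hidden boxes again.
function filterByConfidence(
  detections: readonly Detection[],
  threshold: number,
): Detection[] {
  return detections.filter((d) => d.score >= threshold);
}
```

Because the full list is kept in memory, each slider move only re-runs this filter and redraws the boxes; the model itself is not invoked again.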

Question 5

Why does the first run take a moment?

Accepted Answer

The model you pick (roughly 10 MB for Fast, 11 MB for Best) is downloaded the first time you use it and then cached, so later runs start without a download. Larger images also take a little longer because there are more pixels to process.
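The "download once, then reuse" behavior is a cache-first lookup. The sketch below shows the pattern under stated assumptions: `ModelStore`, `loadModel`, and the `download` callback are hypothetical names, and the answer above does not say which browser storage the tool uses (the Cache API or IndexedDB would both fit behind this interface).

```typescript
// Hypothetical storage interface; in a browser it could wrap the Cache API
// or IndexedDB. Which one the tool uses is not stated in the FAQ.
interface ModelStore {
  get(url: string): Promise<ArrayBuffer | undefined>;
  set(url: string, bytes: ArrayBuffer): Promise<void>;
}

// Cache-first load: return the stored bytes if present, otherwise download
// them once, store them, and return them.
async function loadModel(
  url: string,
  store: ModelStore,
  download: (url: string) => Promise<ArrayBuffer>,
): Promise<ArrayBuffer> {
  const cached = await store.get(url);
  if (cached) return cached; // later runs: no network request
  const bytes = await download(url); // first run: fetch the model file
  await store.set(url, bytes);
  return bytes;
}
```

With this pattern only the first call for a given model URL pays the download cost, which matches the delay described above.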

Object Detector
