Automatic Photo Augmentation

Overview

Many photos fail not because the scene is bad, but because the chosen framing, focus, or style does not match the subject a user actually cares about. Our proposal reframes the task as perceptual post-capture optimization: automatically identify the most important region, emphasize it, and render a more intentional image.

We frame the task as perceptual post-capture optimization: detect the visually important region, organize the composition around it, and render a stronger final image automatically.

Input: one RGB photograph
Automatic subject emphasis: depth, contrast, saturation, and center priors
Composition adjustment: adaptive crop guided by subject placement
Output: a refocused and photo-enhanced image with clearer visual hierarchy

Pipeline

The demo pipeline below shows the full CS766 flow from the original photo to the final augmented result.

Depth Estimation

We estimate monocular depth with Depth Anything V2 when the environment supports it, and fall back to a heuristic pseudo-depth prior when local torch and torchvision versions are mismatched.

Robust demo path

Automatic Subject Prior

We combine near-depth preference, local contrast, saturation, edge strength, and a soft center bias to infer which region is most likely to be the intended subject.

No manual click required

Adaptive Focus and Crop

The subject prior drives automatic focus distance estimation and a crop box that tries to keep the subject comfortably framed while improving balance.

Composition aware

Style Enhancement

We apply lightweight local contrast, color, and detail enhancement, with stronger sharpening concentrated around the automatically selected subject.

Presentation ready

Method Details

Depth and Geometry

Dense depth estimation gives us a rough geometric layout. We smooth and remap the depth field before using it for focus simulation and subject ranking.

Depth-Aware Rendering

We compute a Circle of Confusion map and blend multiple blur layers to produce depth-aware focus effects that strengthen subject separation before the final enhancement step.

Automatic Augmentation

The detected subject prior drives a crop and enhancement pass that directly improves framing, emphasis, and presentation quality in the final image.

Interactive Results

These examples compare the input photo, depth-aware rendering outputs, and the final automatic augmentation stage.

Original vs. Depth Prior

Drag the handle to compare the input photograph against the estimated depth visualization.

Original Depth prior

Original vs. Refocused

This component uses the estimated focus distance and depth map to synthesize a shallower depth of field.

Original Refocused

CoC Visualization

The Circle of Confusion map highlights where the renderer assigns stronger blur.

Automatic Augmentation Result

The left image is the adaptive crop before enhancement. The right image is the final augmented result.

Adaptive crop Final augment

Subject Prior and Crop

The overlay shows the saliency prior, the detected subject box, and the crop selected for final enhancement.

Summary Strip

Depth-of-Field Sweep Across Apertures

These samples show how the renderer responds when the aperture changes.

f/1.0
Extreme background blur

f/1.4
Very shallow depth of field

f/2.8
Strong portrait-style emphasis

f/5.6
Moderate blur

f/8.0
Deeper focus

f/11.0
Minimal synthetic blur

Automatic Focus Sweep

This animation now shows two synchronized views: the current focus render on the left and the final augmented result on the right, so the change in focus and the enhancement pass are both visible.

Implementation

Core Components

Depth Anything V2 or pseudo-depth fallback for robust demo execution
Depth-aware Circle of Confusion renderer for controllable focus effects
Automatic subject saliency from depth, contrast, saturation, and center priors
Adaptive crop and subject-aware enhancement for final photo augmentation

Command Examples

python CS766_Project/CS766_AutoPhoto_Core/auto_augment.py Image_Folder/pic-24.jpg --output-dir Auto_Augment_Results

python CS766_Project/CS766_AutoPhoto_Webpage/prepare_webpage_images.py Image_Folder/pic-24.jpg --output-dir CS766_Project/CS766_AutoPhoto_Webpage/webpage_images

python CS766_Project/CS766_AutoPhoto_Webpage/update_html_paths.py webpage_images

Film Style Presets

Beyond depth-of-field simulation, the pipeline supports personalised colour grading inspired by classic film stocks and cinematic looks. Each preset adjusts tone curves, saturation, shadow and highlight tints, spatially-correlated film grain, and a subject-anchored vignette — all applied after the subject-aware augmentation stage so the look is consistent regardless of composition.

Kodak Portra 400
Warm skin tones · lifted shadows · fine grain

Fuji Velvia 50
Punchy saturation · cool shadows · landscape look

Fuji Pro 400H
Pastel · airy · cool cast · portrait favourite

CineStill 800T
Tungsten balanced · halation glow · neon & night

Ilford HP5
Classic B&W · medium grain · wide exposure latitude

Kodak Tri-X
High contrast · coarse grain · gritty documentary

Teal & Orange
Hollywood blockbuster · warm highlights · cool shadows

Vintage Faded
Lifted blacks · warm cast · worn film feel

Challenges and Next Steps

Current Challenges

Monocular depth still introduces edge errors around thin structures and cluttered scenes.
The subject heuristic is effective for demos, but it is not yet learned end-to-end from user preference data.
Automatic cropping can fail on unusual compositions where the intended subject is not the most salient region.

Future Work

Train a learned importance predictor for subject emphasis rather than using heuristics alone.
Add multiple target aspect ratios and platform-specific crop presets.
Introduce stronger style controls so users can choose portrait, editorial, or cinematic output modes.
Connect the pipeline to a simple web upload interface for live demos.