Zero-Shot Image Classification
Transformers
Safetensors
English
clip
image-geolocation
geolocation
geography
geoguessr
multi-modal
How to use jrheiner/thesis-clip-geoloc-continent with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("zero-shot-image-classification", model="jrheiner/thesis-clip-geoloc-continent")
pipe(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png",
    candidate_labels=["animals", "humans", "landscape"],
)

# Load model directly
from transformers import AutoProcessor, AutoModelForZeroShotImageClassification

processor = AutoProcessor.from_pretrained("jrheiner/thesis-clip-geoloc-continent")
model = AutoModelForZeroShotImageClassification.from_pretrained("jrheiner/thesis-clip-geoloc-continent")
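Because the model is tuned for continent-level queries, the pipeline is most useful with continent names as candidate labels. The following is a minimal sketch, not the author's code: `CONTINENTS` and `classify_continent` are hypothetical names introduced here, and the helper assumes it receives a Transformers `zero-shot-image-classification` pipeline, which returns a list of dictionaries with `label` and `score` keys.

```python
# Hypothetical helper: query a zero-shot image classification pipeline
# with continent names and return the best-scoring one.
CONTINENTS = ["North America", "Africa", "Asia", "Oceania", "South America", "Europe"]


def classify_continent(pipe, image):
    """Return (label, score) for the highest-scoring continent.

    `pipe` is assumed to behave like a Transformers
    zero-shot-image-classification pipeline: it is called with an image
    (URL, path, or PIL image) plus `candidate_labels`, and returns a list
    of {"label": str, "score": float} dictionaries.
    """
    results = pipe(image, candidate_labels=CONTINENTS)
    best = max(results, key=lambda result: result["score"])
    return best["label"], best["score"]
```

With the `pipe` object created above, this would be called as `classify_continent(pipe, "path/or/url/to/image.jpg")`.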
Model Card for Thesis-CLIP-geoloc-continent
A CLIP-ViT model fine-tuned for image geolocation and optimized for continent-level queries.
Model Details
Model Description
- Developed by: jrheiner
- Model type: CLIP-ViT
- Language(s) (NLP): English
- License: Creative Commons Attribution Non Commercial 4.0
- Finetuned from model: openai/clip-vit-large-patch14-336
Model Sources
- Repository: https://github.com/jrheiner/thesis-appendix
- Demo: Image Geolocation Demo Space
How to Get Started with the Model
from PIL import Image
import requests
import torch
from transformers import CLIPProcessor, CLIPModel

# Load the fine-tuned model and its matching processor
model = CLIPModel.from_pretrained("jrheiner/thesis-clip-geoloc-continent")
processor = CLIPProcessor.from_pretrained("jrheiner/thesis-clip-geoloc-continent")

# Test image from the demo Space (the filename marks it as Oceania, Australia)
url = "https://huggingface.co/spaces/jrheiner/thesis-demo/resolve/main/kerger-test-images/Oceania_Australia_-32.947127313081_151.47903359833_kerger.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Candidate labels: one text query per continent
choices = ["North America", "Africa", "Asia", "Oceania", "South America", "Europe"]
inputs = processor(text=choices, images=image, return_tensors="pt", padding=True)

# Inference only, so gradient tracking is not needed
with torch.no_grad():
    outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # image-text similarity scores, shape (1, 6)
probs = logits_per_image.softmax(dim=1)  # softmax over the labels gives label probabilities
print(choices[probs.argmax(dim=1).item()])  # most likely continent
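
To make the last two steps concrete, here is a small dependency-free sketch of what the softmax does to the similarity scores and how the result can be turned into a ranked list of continents. The function names `softmax` and `rank_labels` are introduced here for illustration, and the logit values are made up; they are not real model output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtract the maximum to avoid overflow in exp
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def rank_labels(scores, labels):
    """Pair each label with its score and sort from most to least likely."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)


choices = ["North America", "Africa", "Asia", "Oceania", "South America", "Europe"]
# Hypothetical similarity scores, one per label (not real model output)
example_logits = [18.2, 17.5, 19.0, 24.1, 18.8, 17.9]

ranking = rank_labels(softmax(example_logits), choices)
for label, prob in ranking:
    print(f"{label}: {prob:.3f}")
```

With the real model output from the snippet above, the same ranking could be obtained with `rank_labels(probs[0].tolist(), choices)`.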
Training Details
The model was fine-tuned on 177 270 images sourced from Mapillary, balanced at 29 545 images for each of six continents.
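
The dataset figures can be checked with a line of arithmetic. The sketch below assumes the six continents are the ones used as labels in the example above. It also computes the accuracy of uniform random guessing over those labels, which is a reference point added here, not a result reported by the author.

```python
# Continent labels as used in the getting-started example (assumed to match
# the six training classes).
CONTINENTS = ["North America", "Africa", "Asia", "Oceania", "South America", "Europe"]
IMAGES_PER_CONTINENT = 29_545

# Total training set size implied by a balanced split
total_images = IMAGES_PER_CONTINENT * len(CONTINENTS)

# Accuracy expected from guessing uniformly at random among the labels
chance_accuracy = 1 / len(CONTINENTS)

print(f"total images: {total_images}")
print(f"chance accuracy: {chance_accuracy:.1%}")
```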