Commit dd515664 authored by camille garcin

updating readme

parent 0451fd34
# The plantnet dataset
This dataset is a subsample of the full dataset used to train the models of the Pl@ntNet application (https://plantnet.org/). It comprises 1081 plant species and a total of 306293 images, split into training, validation and test sets in 80/10/10% proportions, respectively. It was created by randomly sampling the full Pl@ntNet dataset at the genus level (i.e. the taxonomic level directly above the species level). This preserves two essential properties of the full Pl@ntNet dataset: (i) the heavy-tailed, imbalanced distribution of the classes and (ii) the strong ambiguity between some species of the same genus. The dataset is intended to facilitate research on these two fundamental problems occurring jointly (long-tailed distribution and class ambiguity).
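
The exact sampling and splitting procedure is not detailed in this README. As a rough illustration only, genus-level sampling followed by an 80/10/10 split could look like the sketch below. The function names `subsample_by_genus` and `split_indices`, the `genus_fraction` parameter, and the toy species list are all assumptions made for this example; they are not part of the dataset tooling.

```python
import random


def subsample_by_genus(species_to_genus, genus_fraction, seed=0):
    """Pick a random subset of genera and keep ALL species of each picked genus.

    Hypothetical helper: keeping whole genera is what lets ambiguous
    sibling species stay together in the subsample.
    """
    rng = random.Random(seed)
    genera = sorted(set(species_to_genus.values()))
    n_keep = max(1, round(genus_fraction * len(genera)))
    kept_genera = set(rng.sample(genera, n_keep))
    return sorted(s for s, g in species_to_genus.items() if g in kept_genera)


def split_indices(n_items, percentages=(80, 10, 10), seed=0):
    """Shuffle item indices and cut them into train/validation/test parts.

    Assumption: a plain random split; the real split may differ
    (for example, it could be done per species).
    """
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_train = n_items * percentages[0] // 100
    n_val = n_items * percentages[1] // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Toy mapping for illustration only (not taken from the dataset files).
toy_species_to_genus = {
    "Quercus robur": "Quercus",
    "Quercus ilex": "Quercus",
    "Acer campestre": "Acer",
    "Acer platanoides": "Acer",
    "Rosa canina": "Rosa",
}

kept_species = subsample_by_genus(toy_species_to_genus, genus_fraction=0.5)
train_idx, val_idx, test_idx = split_indices(1000)
```

Sampling whole genera, rather than individual species, is what keeps groups of visually similar species intact, while the class imbalance of the retained species is left untouched.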
## Installation
......