https://www.nist.gov/itl/products-and-services/emnist-dataset
The EMNIST Dataset
What is it? The EMNIST dataset is a set of handwritten character digits derived from
www.nist.gov
28x28 pixel image format and dataset structure that directly matches the MNIST dataset.
- EMNIST ByClass: 814,255 characters. 62 unbalanced classes.
- EMNIST ByMerge: 814,255 characters. 47 unbalanced classes.
- EMNIST Balanced: 131,600 characters. 47 balanced classes.
- EMNIST Letters: 145,600 characters. 26 balanced classes.
- EMNIST Digits: 280,000 characters. 10 balanced classes.
- EMNIST MNIST: 70,000 characters. 10 balanced classes.
The full complement of the NIST Special Database 19 is available in the ByClass and ByMerge splits. The EMNIST Balanced dataset contains a set of characters with an equal number of samples per class. The EMNIST Letters dataset merges a balanced set of the uppercase and lowercase letters into a single 26-class task. The EMNIST Digits and EMNIST MNIST dataset provide balanced handwritten digit datasets directly compatible with the original MNIST dataset.
https://arxiv.org/abs/1702.05373v1
EMNIST: an extension of MNIST to handwritten letters
The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements
arxiv.org
https://www.tensorflow.org/datasets/catalog/emnist
emnist | TensorFlow Datasets
The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset. Note: Like the original EMNIST data, images pro
www.tensorflow.org