ICDAR 2019


by linguana 2021. 6. 8. 16:41

Downloads - ICDAR 2019 Robust Reading Challenge on Multi-lingual scene text detection and recognition - Robust Reading Competition (uab.es)

Note that this task only requires localization results (as indicated in results format in the tasks page), but the ground truth also provides the script id of each bounding box and the transcription. This extra information will be needed in Tasks 3 and 4.

Extra information about the training set (may be useful for researchers who focus on one or only few languages, not all of the multi-lingual set):

The 10,000 images are ordered in the training set such that: each consecutive 1000 images contain text of one main language (and it may of course contain additional text from 1 or 2 other languages, all from the set of the 10 languages)
00001 - 01000:  Arabic
01001 - 02000:  English
02001 - 03000:  French
03001 - 04000:  Chinese
04001 - 05000:  German
05001 - 06000:  Korean
06001 - 07000:  Japanese
07001 - 08000:  Italian
08001 - 09000:  Bangla
09001 - 10000:  Hindi
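The block layout above can be turned into a small lookup, which is handy for pulling out a single-language subset. This is a minimal sketch: the names `MAIN_LANGUAGES`, `main_language`, and `index_range` are hypothetical helpers, not part of any official toolkit, and the result is only the main language of an image, since an image may also contain text in 1 or 2 other languages.

```python
# Main language of each block of 1000 consecutive training images,
# in the order listed above (block 0 = images 00001-01000, and so on).
MAIN_LANGUAGES = [
    "Arabic", "English", "French", "Chinese", "German",
    "Korean", "Japanese", "Italian", "Bangla", "Hindi",
]


def main_language(image_index: int) -> str:
    """Return the main language for a 1-based training image index."""
    if not 1 <= image_index <= 10000:
        raise ValueError(f"image index out of range: {image_index}")
    # Indices 1..1000 map to block 0, 1001..2000 to block 1, etc.
    return MAIN_LANGUAGES[(image_index - 1) // 1000]


def index_range(language: str) -> range:
    """Return the 1-based image indices whose main language is `language`."""
    block = MAIN_LANGUAGES.index(language)
    return range(block * 1000 + 1, (block + 1) * 1000 + 1)
```

For example, `index_range("Korean")` covers images 05001 through 06000, which matches the Korean block in the list.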

Synth data: about 40k

Real scene data: 1k

Every annotation has four corner points (x1, y1, x2, y2, x3, y3, x4, y4) + the script (Latin/Korean) + the text content.
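As a rough illustration of that annotation layout, the sketch below parses one ground-truth line into corner points, script, and text. It assumes the fields are comma-separated in the order given above and that the coordinates are integers; check both against the actual files. `TextBox` and `parse_gt_line` are hypothetical names chosen for this example.

```python
from typing import List, NamedTuple, Tuple


class TextBox(NamedTuple):
    points: List[Tuple[int, int]]  # four (x, y) corner points
    script: str                    # e.g. "Latin" or "Korean"
    text: str                      # transcription


def parse_gt_line(line: str) -> TextBox:
    """Parse one line: x1,y1,x2,y2,x3,y3,x4,y4,script,text (assumed layout)."""
    # Split at most 9 times so commas inside the transcription are preserved.
    fields = line.rstrip("\r\n").split(",", 9)
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
    coords = [int(value) for value in fields[:8]]
    # Pair up the flat list as (x1, y1), (x2, y2), (x3, y3), (x4, y4).
    points = list(zip(coords[0::2], coords[1::2]))
    return TextBox(points=points, script=fields[8], text=fields[9])
```

Limiting the split to nine separators is a deliberate choice: the transcription is the last field, so any commas in the text itself stay intact.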

EDA

A Jupyter notebook that analyzes only the Korean portion of the ICDAR data, split out from the rest.

icdar_mlt_kor_eda.ipynb (3.66MB)

License: Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND)
