Before analyzing the code, I'd like to start by introducing an article I found on Quora by searching "how to read big source code." Here it is:
www.quora.com/How-can-you-read-and-study-a-large-software-project-source-code
To make a long story short, it suggests that you
Well, then. Let's get down to business without further ado.
Goal: understand and implement the GitHub code below over 4/9 - 4/13 (5 days).
chenyuntc/simple-faster-rcnn-pytorch
A simplified implemention of Faster R-CNN that replicate performance from origin paper - chenyuntc/simple-faster-rcnn-pytorch
github.com
First, let's look at the file structure:
# Shift + right-click => "Open PowerShell window here"
# tree /F
.
│  .gitattributes
│  demo.ipynb
│  LICENSE
│  README.MD
│  requirements.txt
│  train.py
│  trainer.py
│
├─data
│      dataset.py
│      util.py
│      voc_dataset.py
│      __init__.py
│
├─imgs
│      faster-speed.jpg
│      model_all.png
│      visdom-fasterrcnn.png
│
├─misc
│      convert_caffe_pretrain.py
│      demo.jpg
│      train_fast.py
│
├─model
│  │  faster_rcnn.py
│  │  faster_rcnn_vgg16.py
│  │  region_proposal_network.py
│  │  __init__.py
│  │
│  └─utils
│          bbox_tools.py
│          creator_tool.py
│          __init__.py
│
└─utils
        array_tool.py
        config.py
        eval_tool.py
        vis_tool.py
        __init__.py
The first file to check is train.py.
Looking only at the overall structure, it is a file that imports several dependencies, evaluates the model (eval), and trains it (train).
data/voc_dataset.py, VOCBboxDataset, lines 102-124
The XML format must be as follows:
<result>
19.7. xml.etree.ElementTree — The ElementTree XML API — Python 2.7.18 documentation
Official Python documentation on XML files
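To make the XML requirement concrete for myself, here is a minimal sketch of reading bounding boxes from a Pascal VOC-style annotation with xml.etree.ElementTree. The sample XML, its values, and the function name read_boxes are my own assumptions for illustration; this is not the repository's code, only my understanding of the kind of parsing VOCBboxDataset needs to do.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC style (all values are made up).
SAMPLE_XML = """
<annotation>
    <filename>000001.jpg</filename>
    <object>
        <name>Dog</name>
        <difficult>0</difficult>
        <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
    </object>
    <object>
        <name>person</name>
        <difficult>1</difficult>
        <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
    </object>
</annotation>
"""


def read_boxes(xml_text, use_difficult=False):
    """Return (boxes, labels); each box is [ymin, xmin, ymax, xmax]."""
    root = ET.fromstring(xml_text)
    boxes, labels = [], []
    for obj in root.findall("object"):
        # Skip objects flagged as difficult unless explicitly requested.
        if int(obj.find("difficult").text) and not use_difficult:
            continue
        bndbox = obj.find("bndbox")
        boxes.append(
            [int(bndbox.find(tag).text) for tag in ("ymin", "xmin", "ymax", "xmax")]
        )
        labels.append(obj.find("name").text.lower().strip())
    return boxes, labels


boxes, labels = read_boxes(SAMPLE_XML)
print(boxes, labels)
```

If an annotation file is missing any of these tags (object, name, difficult, bndbox), a parser like this fails with an AttributeError, which is why the XML format has to match.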
Python XML Parser Tutorial | ElementTree and Minidom Parsing | Edureka
Search for 'add' and learn how to modify XML
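As a note to self on modifying XML, here is a small sketch that adds a new object entry to an annotation using ElementTree's SubElement. The helper name add_object, the file name page_001.jpg, and the "text" label are hypothetical; only the ElementTree calls themselves come from the standard library.

```python
import xml.etree.ElementTree as ET


def add_object(root, name, xmin, ymin, xmax, ymax, difficult=0):
    """Append one VOC-style <object> element to the annotation root."""
    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "name").text = name
    ET.SubElement(obj, "difficult").text = str(difficult)
    bndbox = ET.SubElement(obj, "bndbox")
    for tag, value in (("xmin", xmin), ("ymin", ymin), ("xmax", xmax), ("ymax", ymax)):
        ET.SubElement(bndbox, tag).text = str(value)
    return obj


# Hypothetical starting annotation with no objects yet.
root = ET.fromstring("<annotation><filename>page_001.jpg</filename></annotation>")
add_object(root, "text", 10, 20, 110, 60)

xml_out = ET.tostring(root, encoding="unicode")
print(xml_out)
```

To save the result to disk, wrapping the root in ET.ElementTree(root) and calling its write method should work the same way.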
# Miscellaneous
github.com/rbgirshick/py-faster-rcnn
Deepayan137/Text-detection
A faster-RCNN approach toeards text detction and it subsequent recognition - Deepayan137/Text-detection
github.com
rbgirshick/py-faster-rcnn
Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version - rbgirshick/py-faster-rcnn
github.com
github.com/kbardool/keras-frcnn
How Keras-based F-RCNN works (inspace4u.github.io)
www.youtube.com/watch?v=HmJWvwIpW5g
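Namuwiki: OCR
Naver Clova recently launched an OCR service; it reportedly swept four categories of the 'ICDAR Robust Reading Competition', an OCR challenge, in 2019, earning recognition for its accuracy and technical strength.
Introduction to Naver Cloud Platform's OCR and how to use it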
www.youtube.com/watch?v=9SSJEPgwMXs&t=28s
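Create a domain (a novel, for this project) -> run the template builder -> create a template -> configure samples -> read the image -> check the results
Pre-process: image adjustment / curvature correction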
www.cvc.uab.es/icdar2011competition/?com=results
Int. Conference on Document Analysis and Recognition
Textual content (handwritten or typewritten), non-textual elements (marks, tick boxes, separators, diagrams), layout (page structure, forms, tables), and style (font, colours, highlighting)
CRAFT, by the Naver Clova AI team
m.blog.naver.com/n_cloudplatform/222201068226
github.com/clovaai/CRAFT-pytorch
www.youtube.com/watch?v=gsZxtO7Unyg
The bounding box of texts are obtained by simply finding minimum bounding rectangles on binary map after thresholding character region and affinity scores.
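The sentence above can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not CRAFT's implementation: the function name boxes_from_scores and the 0.4 thresholds are hypothetical, connected regions are found with a plain breadth-first search, and the boxes are axis-aligned rather than rotated minimum-area rectangles (for rotated boxes, something like OpenCV's cv2.minAreaRect would be the usual tool).

```python
from collections import deque

import numpy as np


def boxes_from_scores(region, affinity, region_thr=0.4, affinity_thr=0.4):
    """Threshold two score maps and return (xmin, ymin, xmax, ymax) per blob."""
    # Binary map: a pixel counts as text if either score passes its threshold.
    binary = (region >= region_thr) | (affinity >= affinity_thr)
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y, x] or seen[y, x]:
                continue
            # Breadth-first search over one 4-connected component.
            queue = deque([(y, x)])
            seen[y, x] = True
            ymin = ymax = y
            xmin = xmax = x
            while queue:
                cy, cx = queue.popleft()
                ymin, ymax = min(ymin, cy), max(ymax, cy)
                xmin, xmax = min(xmin, cx), max(xmax, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            boxes.append((xmin, ymin, xmax, ymax))
    return boxes


# Toy score maps: two characters joined by an affinity link, plus a separate blob.
region = np.zeros((6, 10))
affinity = np.zeros((6, 10))
region[1:3, 1:3] = 0.9   # first character
region[1:3, 4:6] = 0.9   # second character
affinity[1:3, 3] = 0.8   # link between the two characters
region[4:6, 8:10] = 0.7  # unrelated blob elsewhere

print(boxes_from_scores(region, affinity))
```

In the toy example the affinity link is what merges the two characters into a single word-level box; without it they would come out as two separate boxes.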
Samsung SDS Techtonic
https://youtu.be/pECt2rXbpTk
Faster R-CNN implementation
PR-012: Faster R-CNN : Towards Real-Time Object Detection with Region Proposal Networks - YouTube
The PASCAL Visual Object Classes Challenge 2012 (VOC2012) (ox.ac.uk)
github.com/small-yellow-duck/keras-frcnn/blob/master/train_frcnn.py
PyTorch implementation
Guide to build Faster RCNN in PyTorch | by Machine-Vision Research Group | Medium