RCNN Miscellaneous Repository


by linguana 2021. 4. 15. 22:48

Before analyzing the code, I would like to start by introducing an article that I found on Quora by searching for "how to read big source code." Here it is:

www.quora.com/How-can-you-read-and-study-a-large-software-project-source-code

To make a long story short, it suggests that you

skim through the whole project to get the big picture of how it works,
dig into the specifics of a module only when the need arises,
and let the whole mechanism sink in by repeating the above steps in a "divide and conquer" fashion.

Well, then. Let's get down to business without further ado.

Goal: understand and implement the GitHub code below over 4/9 - 4/13 (5 days).

GitHub - chenyuntc/simple-faster-rcnn-pytorch: A simplified implementation of Faster R-CNN that replicates performance from the original paper

github.com/chenyuntc/simple-faster-rcnn-pytorch

First, let's look at the file structure:

# Shift + right-click => "Open PowerShell window here"
# tree /F

.
│  .gitattributes
│  demo.ipynb
│  LICENSE
│  README.MD
│  requirements.txt
│  train.py
│  trainer.py
│
├─data
│      dataset.py
│      util.py
│      voc_dataset.py
│      __init__.py
│
├─imgs
│      faster-speed.jpg
│      model_all.png
│      visdom-fasterrcnn.png
│
├─misc
│      convert_caffe_pretrain.py
│      demo.jpg
│      train_fast.py
│
├─model
│  │  faster_rcnn.py
│  │  faster_rcnn_vgg16.py
│  │  region_proposal_network.py
│  │  __init__.py
│  │
│  └─utils
│          bbox_tools.py
│          creator_tool.py
│          __init__.py
│
└─utils
        array_tool.py
        config.py
        eval_tool.py
        vis_tool.py
        __init__.py

The first file to check is train.py.

Looking only at the overall structure, it is a file that imports various dependencies, evaluates the model (eval), and trains it (train).

data/voc_dataset.py VOCBboxDataset line 102-124

The XML format must look like this:

<result>
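For reference, here is a minimal sketch of reading a PASCAL VOC-style annotation with xml.etree.ElementTree. The sample XML is made up for illustration (the file name, classes, and coordinates are hypothetical), and parse_voc_xml is a hypothetical helper rather than the repository's code; it only assumes the usual object / name / difficult / bndbox layout of VOC annotations.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the PASCAL VOC layout (all values are made up).
SAMPLE_XML = """
<annotation>
    <filename>000001.jpg</filename>
    <size><width>353</width><height>500</height><depth>3</depth></size>
    <object>
        <name>dog</name>
        <difficult>0</difficult>
        <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
    </object>
    <object>
        <name>person</name>
        <difficult>1</difficult>
        <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
    </object>
</annotation>
"""


def parse_voc_xml(xml_text, use_difficult=False):
    """Return (boxes, labels); each box is (ymin, xmin, ymax, xmax)."""
    root = ET.fromstring(xml_text.strip())
    boxes, labels = [], []
    for obj in root.findall("object"):
        # Skip objects flagged as difficult unless explicitly requested.
        if int(obj.find("difficult").text) and not use_difficult:
            continue
        bndbox = obj.find("bndbox")
        box = tuple(int(bndbox.find(tag).text)
                    for tag in ("ymin", "xmin", "ymax", "xmax"))
        boxes.append(box)
        labels.append(obj.find("name").text.strip().lower())
    return boxes, labels


boxes, labels = parse_voc_xml(SAMPLE_XML)
print(boxes, labels)
```

With the sample above, only the "dog" object is returned by default, because the "person" object is marked difficult.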

19.7. xml.etree.ElementTree — The ElementTree XML API — Python 2.7.18 documentation

Official Python documentation on XML files

Python XML Parser Tutorial | ElementTree and Minidom Parsing | Edureka

Search for 'add' and learn how to modify XML
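As a minimal sketch of modifying XML with ElementTree (the tag names and values below are hypothetical, chosen only for illustration), an element's text can be changed by assigning to .text, new elements can be added with ET.SubElement, and elements can be dropped with Element.remove:

```python
import xml.etree.ElementTree as ET

# Hypothetical starting document.
root = ET.fromstring(
    "<annotation><filename>old.jpg</filename><segmented>0</segmented></annotation>"
)

# Change the text of an existing element.
root.find("filename").text = "new.jpg"

# Add a new child element with an attribute and a nested child.
obj = ET.SubElement(root, "object")
obj.set("id", "0")
ET.SubElement(obj, "name").text = "text"

# Remove an element that is no longer needed.
root.remove(root.find("segmented"))

# Serialize back to a string (ElementTree.write would save to a file instead).
xml_string = ET.tostring(root, encoding="unicode")
print(xml_string)
```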

# Miscellaneous

github.com/rbgirshick/py-faster-rcnn

Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version

github.com/Deepayan137/Text-detection

A faster-RCNN approach towards text detection and its subsequent recognition

Faster_RCNN(2016).pdf

6.59MB

github.com/kbardool/keras-frcnn

Principles of Keras-based F-RCNN (inspace4u.github.io)

www.youtube.com/watch?v=HmJWvwIpW5g

RCNN.hwp

1.52MB

Namuwiki: OCR

namu.wiki/w/OCR

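Naver Clova recently launched an OCR service; it reportedly swept four categories of the 2019 'ICDAR Robust Reading Competition', an OCR challenge, earning recognition for its accuracy and technology.

Introduction to and usage of OCR on Naver Cloud Platform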

www.youtube.com/watch?v=9SSJEPgwMXs&t=28s

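Create a domain (a novel, in this project's case) -> run the template builder -> create a template -> configure samples -> read the image -> check the results

Pre-process: image adjustment / curvature correction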

rrc.cvc.uab.es/

www.cvc.uab.es/icdar2011competition/?com=results

Int. Conference on Document Analysis and Recognition

Textual content (handwritten or typewritten), non-textual elements (marks, tick boxes, separators, diagrams), layout (page structure, forms, tables), and style (font, colours, highlighting)

CRAFT, by the Naver Clova AI team

youtu.be/NQeaLc2X8vk

m.blog.naver.com/n_cloudplatform/222201068226

github.com/clovaai/CRAFT-pytorch

www.youtube.com/watch?v=gsZxtO7Unyg

CharacterRegionAwarenessforTextDetection.pdf

4.50MB

The bounding boxes of text are obtained by simply finding minimum bounding rectangles on the binary map after thresholding the character region and affinity scores.
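The idea can be sketched in plain Python as follows. This is a simplified illustration, not the CRAFT implementation: it assumes a pixel belongs to the binary map when either score exceeds its threshold, it groups pixels with 4-connectivity, and it returns axis-aligned rectangles instead of rotated minimum-area rectangles. The function name text_boxes, the threshold values, and the toy score maps are all hypothetical.

```python
from collections import deque


def text_boxes(region, affinity, region_thr=0.5, affinity_thr=0.5):
    """Return axis-aligned boxes (xmin, ymin, xmax, ymax), one per blob."""
    h, w = len(region), len(region[0])
    # Binary map: a pixel is "text" if either score passes its threshold.
    binary = [[region[y][x] > region_thr or affinity[y][x] > affinity_thr
               for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first search over one 4-connected component.
            queue = deque([(y, x)])
            seen[y][x] = True
            xmin = xmax = x
            ymin = ymax = y
            while queue:
                cy, cx = queue.popleft()
                xmin, xmax = min(xmin, cx), max(xmax, cx)
                ymin, ymax = min(ymin, cy), max(ymax, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((xmin, ymin, xmax, ymax))
    return boxes


# Toy score maps: two characters linked by affinity, plus one isolated character.
region = [
    [0.0, 0.9, 0.0, 0.8, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.8, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.7, 0.0],
]
affinity = [
    [0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
boxes = text_boxes(region, affinity)
print(boxes)
```

In this toy input the affinity column joins the two left-hand characters into a single box, while the right-hand character stays separate.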

ICDAR2019_Robust_Reading_Challenge_on_Multilingual_Scene_Text_Detection_and_Recognition.pdf

0.17MB

Samsung SDS Techtonic
https://youtu.be/pECt2rXbpTk

Faster R-CNN implementation

Faster R-CNN (object detection) implemented by Keras for custom data from Google’s Open Images Dataset V4 | by Yinghan Xu | Towards Data Science

PR-012: Faster R-CNN : Towards Real-Time Object Detection with Region Proposal Networks - YouTube

The PASCAL Visual Object Classes Challenge 2012 (VOC2012) (ox.ac.uk)

github.com/small-yellow-duck/keras-frcnn/blob/master/train_frcnn.py

PyTorch implementation

Guide to build Faster RCNN in PyTorch | by Machine-Vision Research Group | Medium

Pytorch_tutorial object detection model_.. : Naver Blog (naver.com)

