RCNN Miscellaneous Repository

by linguana 2021. 4. 15. 22:48

Before analyzing the code, I would like to start by introducing an article that I found on Quora by searching for "how to read big source code." Here it is:

www.quora.com/How-can-you-read-and-study-a-large-software-project-source-code

To make a long story short, it suggests that you

  1. skim through the whole project and get the big picture of how it works,
  2. understand the specifics of modules only when the need arises,
  3. and let the whole mechanism sink in by repeating the above steps in a "divide and conquer" fashion.

Well, then. Let's get down to business without further ado.


Goal: understand and implement the GitHub code below during 4/9 - 4/13 (5 days).

GitHub - chenyuntc/simple-faster-rcnn-pytorch: A simplified implemention of Faster R-CNN that replicate performance from origin paper

 


 

Let's look at the file structure first:

# Shift + right-click => "Open PowerShell window here"
# tree /F

.
│  .gitattributes
│  demo.ipynb
│  LICENSE
│  README.MD
│  requirements.txt
│  train.py
│  trainer.py
│
├─data
│      dataset.py
│      util.py
│      voc_dataset.py
│      __init__.py
│
├─imgs
│      faster-speed.jpg
│      model_all.png
│      visdom-fasterrcnn.png
│
├─misc
│      convert_caffe_pretrain.py
│      demo.jpg
│      train_fast.py
│
├─model
│  │  faster_rcnn.py
│  │  faster_rcnn_vgg16.py
│  │  region_proposal_network.py
│  │  __init__.py
│  │
│  └─utils
│          bbox_tools.py
│          creator_tool.py
│          __init__.py
│
└─utils
        array_tool.py
        config.py
        eval_tool.py
        vis_tool.py
        __init__.py

The first file to check is train.py.

 

 

train.py

 

 

Looking only at the overall structure, this file imports a number of dependencies, evaluates the model (eval), and trains it (train).

 

 

data/voc_dataset.py VOCBboxDataset lines 102-124

The XML format must be as follows:

<result>
	

 

 

 

 

 

 

 

 

19.7. xml.etree.ElementTree — The ElementTree XML API — Python 2.7.18 documentation

Official Python documentation on XML files

Python XML Parser Tutorial | ElementTree and Minidom Parsing | Edureka

Search for 'add' there and learn how to modify XML
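
As a minimal sketch of reading bounding boxes from an annotation file and of adding an element with ElementTree, the example below assumes a Pascal VOC-style layout. The sample XML, its `<annotation>` root, and the helper names `parse_objects` and `add_object` are made up for illustration and are not taken from the repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical VOC-style annotation, made up for illustration only.
SAMPLE_XML = """
<annotation>
  <filename>sample.jpg</filename>
  <object>
    <name>Dog</name>
    <difficult>0</difficult>
    <bndbox>
      <xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax>
    </bndbox>
  </object>
</annotation>
"""


def parse_objects(xml_text):
    """Return a list of (label, [ymin, xmin, ymax, xmax]) tuples."""
    root = ET.fromstring(xml_text.strip())
    results = []
    for obj in root.findall("object"):
        label = obj.find("name").text.lower().strip()
        box = obj.find("bndbox")
        # Assumption: pixel coordinates are 1-based, so shift them to 0-based.
        coords = [int(box.find(tag).text) - 1
                  for tag in ("ymin", "xmin", "ymax", "xmax")]
        results.append((label, coords))
    return results


def add_object(xml_text, label, ymin, xmin, ymax, xmax):
    """Append a new <object> element and return the modified XML as a string."""
    root = ET.fromstring(xml_text.strip())
    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "name").text = label
    ET.SubElement(obj, "difficult").text = "0"
    box = ET.SubElement(obj, "bndbox")
    for tag, value in (("xmin", xmin), ("ymin", ymin),
                       ("xmax", xmax), ("ymax", ymax)):
        ET.SubElement(box, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


objects = parse_objects(SAMPLE_XML)
updated = add_object(SAMPLE_XML, "cat", 10, 20, 110, 120)
print(objects)
print(parse_objects(updated))
```

`ET.SubElement` is the usual way to add a child node; to change an existing value instead, assign to the `.text` attribute of the element returned by `find`.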

 


# Miscellaneous

Namuwiki: OCR

namu.wiki/w/OCR

Naver Clova recently launched an OCR service; it reportedly swept four categories in 2019 at the OCR challenge 'ICDAR Robust Reading Competition', earning recognition for its accuracy and technical strength.


Introduction to Naver Cloud Platform's OCR and how to use it

www.youtube.com/watch?v=9SSJEPgwMXs&t=28s

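Create a domain (a novel, for this project) -> run the template builder -> create a template -> configure samples -> read the image -> check the results

Pre-process: image adjustment / curvature correction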

 

rrc.cvc.uab.es/

www.cvc.uab.es/icdar2011competition/?com=results

International Conference on Document Analysis and Recognition (ICDAR)

Textual content (handwritten or typewritten), non-textual elements (marks, tick boxes, separators, diagrams), layout (page structure, forms, tables), and style (font, colours, highlighting)

 

CRAFT, by the Naver Clova AI team

youtu.be/NQeaLc2X8vk

m.blog.naver.com/n_cloudplatform/222201068226

github.com/clovaai/CRAFT-pytorch

www.youtube.com/watch?v=gsZxtO7Unyg

CharacterRegionAwarenessforTextDetection.pdf

 

The bounding boxes of texts are obtained by simply finding minimum bounding rectangles on the binary map after thresholding the character region and affinity scores.
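
The post-processing described above can be sketched roughly as follows. This is a simplified, pure-Python illustration and not the CRAFT implementation: it thresholds the two score maps into a binary map, groups connected pixels with a breadth-first search, and returns axis-aligned boxes rather than true minimum-area (possibly rotated) rectangles. The function name `text_boxes` and the threshold values are assumptions chosen for the example.

```python
from collections import deque


def text_boxes(region, affinity, region_thr=0.4, affinity_thr=0.4):
    """Return axis-aligned boxes (xmin, ymin, xmax, ymax), one per connected blob.

    `region` and `affinity` are 2D lists of scores with the same shape.
    The threshold defaults are hypothetical values for illustration.
    """
    h, w = len(region), len(region[0])
    # Binary map: a pixel counts as text if either score passes its threshold.
    binary = [[region[y][x] >= region_thr or affinity[y][x] >= affinity_thr
               for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first search over 4-connected neighbours.
            queue = deque([(y, x)])
            seen[y][x] = True
            ymin = ymax = y
            xmin = xmax = x
            while queue:
                cy, cx = queue.popleft()
                ymin, ymax = min(ymin, cy), max(ymax, cy)
                xmin, xmax = min(xmin, cx), max(xmax, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((xmin, ymin, xmax, ymax))
    return boxes


# Toy score maps: two characters joined by an affinity pixel, plus a separate blob.
region = [
    [0.9, 0.8, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.7],
    [0.0, 0.0, 0.0, 0.0, 0.9],
]
affinity = [
    [0.0, 0.0, 0.6, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
boxes = text_boxes(region, affinity)
print(boxes)
```

With real score maps, a library routine for connected components and minimum-area rectangles would normally replace the hand-written search; the sketch only shows the order of the steps.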

ICDAR2019_Robust_Reading_Challenge_on_Multilingual_Scene_Text_Detection_and_Recognition.pdf


Samsung SDS Techtonic
https://youtu.be/pECt2rXbpTk

 

 


Faster R-CNN Implementation

Faster R-CNN (object detection) implemented by Keras for custom data from Google’s Open Images Dataset V4 | by Yinghan Xu | Towards Data Science

PR-012: Faster R-CNN : Towards Real-Time Object Detection with Region Proposal Networks - YouTube

The PASCAL Visual Object Classes Challenge 2012 (VOC2012) (ox.ac.uk)

github.com/small-yellow-duck/keras-frcnn/blob/master/train_frcnn.py


PyTorch implementation

Guide to build Faster RCNN in PyTorch | by Machine-Vision Research Group | Medium

Pytorch_tutorial object detection model_.. : Naver Blog (naver.com)
