[7] Nginx Server Setup - Building a Text Detection API using Drogon Server with OpenCV

This post sets up text detection on RHEL and integrates it with a Drogon server, exposing an API that extracts text-location information from an image and returns it to clients.

Note: The purpose is Text Detection, not OCR (Optical Character Recognition). OCR will be integrated later.


0. Search for Text Detection Frameworks

Framework | Language | Pros | Cons
Tesseract OCR | C++ (Python bindings, etc.) | Open source, supports many languages, customizable | Performance can degrade without preprocessing; requires OpenCV integration for detection
OpenCV + EAST model | C++, Python | Fast text detection, supports both CPU and GPU, easy integration with OpenCV | Detection only, so OCR is needed separately; performance limits with high-resolution images
EasyOCR | Python | PyTorch-based, supports many languages and styles, easy to install and use | Can be slow on CPU; requires a GPU for high performance
Amazon Textract (API) | Multi-language | Serverless deployment on AWS, advanced features (table and form recognition), stable performance | Potentially high cost for large datasets, cloud-only, security considerations
Google Vision API (API) | Multi-language | Comprehensive text detection and image analysis, serverless deployment on GCP | High cost for large datasets, cloud-only, data security concerns
PaddleOCR | Python | Excellent performance on Asian languages, lightweight model with fast CPU processing | Based on PaddlePaddle, a barrier for PyTorch/TensorFlow users; requires Python for integration
  • External API calls are excluded.
  • Since the objective is detection, proceed with OpenCV + EAST model.
  • The API will return the coordinates of detected text within the image and generate a new image with frames drawn around the detected text ($filename_detection.jpg); a sample response is sketched below.
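For illustration only, a response could look like the following. The "detections", "x", "y", "width", and "height" field names match the controller code further below; the values here are made up:

{
  "detections": [
    { "x": 152, "y": 88, "width": 210, "height": 42 },
    { "x": 40, "y": 310, "width": 96, "height": 38 }
  ]
}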

1. OpenCV Setup and Required Library Installation

Proceed with the following steps (initially implemented via shell):

#!/bin/sh
sudo yum update -y
sudo yum groupinstall -y "Development Tools"
sudo yum install -y epel-release
sudo yum install -y cmake git gtk2-devel boost-devel
sudo yum install -y libjpeg-turbo-devel libpng-devel libtiff-devel
sudo yum install -y libdc1394-devel gstreamer1-devel gstreamer1-plugins-base-devel
sudo yum install -y tbb-devel

# OpenCV 4.10.0
wget https://github.com/opencv/opencv/archive/refs/tags/4.10.0.tar.gz
mv 4.10.0.tar.gz opencv.tar
tar -xvf opencv.tar
mv opencv-4.10.0 opencv

# OpenCV Contrib 4.10.0
wget https://github.com/opencv/opencv_contrib/archive/refs/tags/4.10.0.tar.gz
mv 4.10.0.tar.gz opencv_contrib.tar
tar -xvf opencv_contrib.tar
mv opencv_contrib-4.10.0 opencv_contrib

# OpenCV Build
cd ./opencv
mkdir build
cd build

# CMake
cmake -DOPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules \
      -DCMAKE_BUILD_TYPE=Release \
      -DBUILD_SHARED_LIBS=ON \
      -DWITH_IPP=ON \
      -DWITH_TBB=ON \
      -DWITH_OPENMP=ON \
      -DENABLE_FAST_MATH=ON \
      -DCMAKE_INSTALL_PREFIX=/usr/local \
      -DBUILD_EXAMPLES=OFF \
      -DBUILD_TESTS=OFF \
      -DBUILD_PERF_TESTS=OFF \
          ..

# Build and Install
make -j$(nproc)  # Builds using all CPU cores
sudo make install
sudo ldconfig  # Updates the library cache

With CMAKE_INSTALL_PREFIX set to /usr/local, this installs OpenCV into the system path by default.
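As an optional sanity check (not part of the original write-up), a minimal program such as the one below can confirm that the installed headers and libraries are reachable. check_opencv.cpp and the compile flags are illustrative; with the prefix above, headers normally land under /usr/local/include/opencv4 and libraries under /usr/local/lib or /usr/local/lib64 depending on the platform.

// check_opencv.cpp -- minimal sketch to verify the OpenCV install
// Example build (adjust the library directory to lib or lib64 as needed):
//   g++ check_opencv.cpp -o check_opencv -I/usr/local/include/opencv4 -L/usr/local/lib64 -lopencv_core
#include <opencv2/core.hpp>
#include <iostream>

int main() {
    // CV_VERSION is defined by the OpenCV headers; it should report 4.10.0 for this build
    std::cout << "OpenCV version: " << CV_VERSION << std::endl;
    return 0;
}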


2. API Development

Continue using the previously set up server.

2-1. Create the Controller with drogon_ctl

cd /root/drogon2/drogon/build/drogon_ctl 
drogon_ctl create controller TextDetectionController

2-2. Implement TextDetection (source location: /root/drogon2/drogon/build/drogon_ctl/testAPI/controllers)

TextDetectionController.h

#pragma once
#include <drogon/HttpController.h>
#include <opencv2/opencv.hpp>
#include <functional>
#include <string>
#include <vector>

using namespace drogon;

void initializeModel();

class TextDetectionController : public HttpController<TextDetectionController>
{
public:
    std::string _storagePath = "/root/storage/";
    METHOD_LIST_BEGIN
    // Process text detection for images located in src and save to dst
    ADD_METHOD_TO(TextDetectionController::handleTextDetection, "/text-detection", Get);
    METHOD_LIST_END

    // Method declarations
    void handleTextDetection(const HttpRequestPtr& req, std::function<void(const HttpResponsePtr&)>&& callback);
    std::vector<cv::RotatedRect> decodeBoundingBoxes(const cv::Mat& scores, const cv::Mat& geometry, float scoreThresh);
};

TextDetectionController.cc

#include "TextDetectionController.h"
#include <opencv2/opencv.hpp>
#include <filesystem>

using namespace cv;
using namespace cv::dnn;
namespace {
    cv::dnn::Net eastNet;
    const std::string eastModelPath = "./frozen_east_text_detection.pb";
}

// OpenCV Initialization
// Since model initialization takes time, set globally for efficiency
// Note: this is thread-unsafe and would need improvement for production server logic
void initializeModel() {
    if (eastNet.empty()) {
        eastNet = cv::dnn::readNet(eastModelPath);
        if (eastNet.empty()) {
            throw std::runtime_error("Failed to load EAST model");
        }
    }
}

std::vector<cv::RotatedRect> TextDetectionController::decodeBoundingBoxes(const cv::Mat& scores, const cv::Mat& geometry, float scoreThresh)
{
    std::vector<cv::RotatedRect> detections;
    const int numRows = scores.size[2];
    const int numCols = scores.size[3];

    for (int y = 0; y < numRows; y++) {
        const float* scoresData = scores.ptr<float>(0, 0, y);
        const float* x0_data = geometry.ptr<float>(0, 0, y);
        const float* x1_data = geometry.ptr<float>(0, 1, y);
        const float* x2_data = geometry.ptr<float>(0, 2, y);
        const float* x3_data = geometry.ptr<float>(0, 3, y);
        const float* anglesData = geometry.ptr<float>(0, 4, y);

        for (int x = 0; x < numCols; x++) {
            float score = scoresData[x];
            if (score < scoreThresh)
                continue;

            float offsetX = x * 4.0;
            float offsetY = y * 4.0;
            float angle = anglesData[x];
            float cosA = cos(angle);
            float sinA = sin(angle);
            float h = x0_data[x] + x2_data[x];
            float w = x1_data[x] + x3_data[x];

            Point2f offset(offsetX + cosA * x1_data[x] + sinA * x2_data[x],
                           offsetY - sinA * x1_data[x] + cosA * x2_data[x]);
            Point2f p1 = Point2f(-sinA * h, -cosA * h) + offset;
            Point2f p3 = Point2f(-cosA * w, sinA * w) + offset;
            RotatedRect rrect(0.5f * (p1 + p3), Size2f(w, h), -angle * 180.0f / CV_PI);
            detections.push_back(rrect);
        }
    }

    return detections;
}

void TextDetectionController::handleTextDetection(const HttpRequestPtr& req, std::function<void(const HttpResponsePtr&)>&& callback)
{
    HttpStatusCode code = k200OK;

    do
    {
        auto filename = req->getParameter("filename");
        if (filename.empty())
        {
            code = k404NotFound;
            break;
        }

        std::string path = _storagePath + filename;
        // Read the image from the server's local storage
        Mat image = imread(path);
        if (image.empty())
        {
            code = k404NotFound;
            break;
        }

        // Set image dimensions and resize ratio
        int origH = image.rows;
        int origW = image.cols;
        int newW = 320;
        int newH = 320;
        float rW = static_cast<float>(origW) / newW;
        float rH = static_cast<float>(origH) / newH;

        // Create blob and set input for the network
        Mat blob = blobFromImage(image, 1.0, Size(newW, newH), Scalar(123.68, 116.78, 103.94), true, false);
        eastNet.setInput(blob);

        // Set output layers for the EAST model
        std::vector<String> outputLayers = {"feature_fusion/Conv_7/Sigmoid", "feature_fusion/concat_3"};
        std::vector<Mat> outs;
        eastNet.forward(outs, outputLayers);

        // Perform text detection
        std::vector<RotatedRect> detections = decodeBoundingBoxes(outs[0], outs[1], 0.7);
        if (detections.empty())
        {
            // Return 404 if no text is detected
            code = k404NotFound;
            break;
        }

        // Return results in JSON format
        Json::Value jsonResponse;
        for (const auto& detection : detections) {
            Rect boundingBox = detection.boundingRect();
            boundingBox.x *= rW;
            boundingBox.y *= rH;
            boundingBox.width *= rW;
            boundingBox.height *= rH;

            Json::Value box;
            box["x"] = boundingBox.x;
            box["y"] = boundingBox.y;
            box["width"] = boundingBox.width;
            box["height"] = boundingBox.height;
            jsonResponse["detections"].append(box);
        }

        // Generate the response
        auto response = HttpResponse::newHttpJsonResponse(jsonResponse);

        // Process Text Area Rectangle
        // Draw rectangles around detected text areas
        for (const auto& detection : detections)
        {
            Rect boundingBox = detection.boundingRect();
            boundingBox.x *= rW;
            boundingBox.y *= rH;
            boundingBox.width *= rW;
            boundingBox.height *= rH;

            rectangle(image, boundingBox, Scalar(0, 255, 0), 2);  // Draw green rectangle
        }
        // Save the image
        std::filesystem::path outputPath = _storagePath + (std::filesystem::path(path).stem().string() + "_detection.jpg");
        imwrite(outputPath.string(), image);

        callback(response);
        return ;
    }
    while(false);

    auto resp = HttpResponse::newHttpResponse();
    resp->setStatusCode(code);
    callback(resp);
}
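One optional refinement, not included in the controller above: decodeBoundingBoxes returns every box above the score threshold, so overlapping detections of the same word are common. OpenCV's dnn module provides cv::dnn::NMSBoxes (the overload taking RotatedRect boxes, as used by the official EAST sample) to suppress such overlaps. The sketch below assumes decodeBoundingBoxes is extended to also collect a confidence value per box; suppressOverlaps and the confidences parameter are illustrative names, not part of the original code.

// Sketch only: merge overlapping EAST detections with non-maximum suppression.
// Assumes a parallel vector of confidences gathered while decoding (hypothetical extension).
#include <opencv2/dnn.hpp>
#include <vector>

static std::vector<cv::RotatedRect> suppressOverlaps(const std::vector<cv::RotatedRect>& boxes,
                                                     const std::vector<float>& confidences,
                                                     float scoreThresh = 0.7f,
                                                     float nmsThresh = 0.4f)
{
    std::vector<int> keptIndices;
    // NMSBoxes keeps the highest-scoring box among heavily overlapping candidates
    cv::dnn::NMSBoxes(boxes, confidences, scoreThresh, nmsThresh, keptIndices);

    std::vector<cv::RotatedRect> kept;
    kept.reserve(keptIndices.size());
    for (int idx : keptIndices)
        kept.push_back(boxes[idx]);
    return kept;
}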

main.cc

#include <drogon/drogon.h>
#include <iostream>
#include "controllers/TextDetectionController.h"

int main() {
    try
    {
        // Load model during server initialization
        initializeModel();
    }
    catch (const std::exception &e)
    {
        std::cerr << "Error initializing model: "
                << e.what() << std::endl;
        return 1;  // Stop server if model loading fails
    }
    drogon::app().loadConfigFile("../config.json");

    LOG_INFO << "Server RUN";
    drogon::app().run();

    return 0;
}
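As noted in the comment inside initializeModel(), the lazy-loading pattern there is not thread-safe. In this setup it is harmless because main() loads the model once before drogon::app().run() starts the worker threads, but if the model were ever loaded lazily from a request handler, a std::call_once guard would be a minimal fix. The sketch below is an illustration only (initializeModelSafely and the names inside it are not part of the original code); note also that cv::dnn::Net is not guaranteed to be safe for concurrent forward() calls, so a production setup might keep one Net per worker thread or serialize inference with a mutex.

// Sketch: thread-safe, one-time model loading with std::call_once (illustrative only)
#include <mutex>
#include <stdexcept>
#include <string>
#include <opencv2/dnn.hpp>

namespace {
    std::once_flag eastOnceFlag;
    cv::dnn::Net eastNetShared;   // stands in for the file-scope eastNet used above
}

void initializeModelSafely(const std::string& modelPath) {
    std::call_once(eastOnceFlag, [&modelPath]() {
        eastNetShared = cv::dnn::readNet(modelPath);
        if (eastNetShared.empty())
            throw std::runtime_error("Failed to load EAST model: " + modelPath);
    });
}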

2-3. Edit drogon CMake

cd /root/drogon2/drogon/build/drogon_ctl/testAPI
vi CMakeLists.txt
  • Modify as shown below to add TextDetectionController.cc after FileController.cc:
# Necessary changes:
# Add find_package(OpenCV CONFIG REQUIRED)
# Add ${OpenCV_LIBS} to target_link_libraries

# If OpenCV is not in the default path, specify the path (not needed if previous steps are followed)
# set(OpenCV_DIR "/path/to/opencv")  # Example: /usr/local/opencv
---

add_executable(${PROJECT_NAME} main.cc controllers/FileController.cc controllers/TextDetectionController.cc)

find_package(Drogon CONFIG REQUIRED)
find_package(OpenCV CONFIG REQUIRED) # Added
target_link_libraries(${PROJECT_NAME} PRIVATE Drogon::Drogon ${OpenCV_LIBS}) # Added

2-4. Build drogon

  • Run cmake followed by make
cd /root/drogon2/drogon/build/drogon_ctl/testAPI/build
cmake .. 
make

3. Execution

3-1. Download Model Before Running

cd /root/drogon2/drogon/build/drogon_ctl/testAPI/build

# Detection model (EAST)
wget -O frozen_east_text_detection.tar.gz "https://www.dropbox.com/s/r2ingd0l3zt8hxs/frozen_east_text_detection.tar.gz?dl=1"
tar -xzf frozen_east_text_detection.tar.gz  # the controller expects ./frozen_east_text_detection.pb in this directory

./testAPI

3-2. Execution and Upload

  • Use a sample text image as shown below.
(screenshot: sample text image)

  • Upload the file to the server via the upload endpoint (http://127.0.0.1:10099/doUpload/). (Refer to the previous page for the upload implementation.)
(screenshot: upload request)

  • After the upload, check the file list at http://127.0.0.1:10099/list/
(screenshot: file list)

  • Call the detection API: http://127.0.0.1:10099/text-detection?filename=image_in_text.PNG
(screenshot: text-detection API call)

  • After detection, check the updated file list at http://127.0.0.1:10099/list/
(screenshot: file list after detection)

  • Download and verify the detected image: http://127.0.0.1:10099/download/image_in_text_detection.jpg
(screenshot: OpenCV detection result)


