CrazyEngineers
  • Efficient text detection and extraction Method/Framework

    s_athya

    @s-athya-VCPq4K
    Updated: Oct 26, 2024
    Views: 1.1K
    Hi friends, I am interested in video content analysis and would like some ideas on text detection and extraction from video. (Note: "text" here means overlay text, not scene text.) The task is somewhat similar to video OCR. The approach is based on the observation that transient colors exist between inserted text and its adjacent background.

    Overview of the scenario: [IMG]

    We propose a new overlay text detection and extraction method that uses the transition region between the overlay text and the background.
    First, we generate a transition map, based on the observation that transient colors exist between overlay text and its adjacent background. The overlay text regions are then roughly detected by computing the density of transition pixels and the consistency of texture around them. Finally, the detected regions are localized accurately using the projection of the transition map, and an improved color-based thresholding method extracts the text strings.
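The localization step mentions a projection of the transition map. As a rough illustration only, the sketch below sums a binary transition map along each row and groups consecutive rows whose count reaches a cut-off into candidate text bands. The function names and the `min_count` parameter are hypothetical and are not taken from the method described above.

```python
def row_projection(tmap):
    """Count the transition pixels in each row of a binary map."""
    return [sum(row) for row in tmap]


def text_row_bands(tmap, min_count):
    """Group consecutive rows with at least min_count transition pixels.

    Returns a list of (first_row, last_row) tuples. min_count is a
    hypothetical tuning parameter, not a value from the post.
    """
    bands = []
    start = None
    for y, count in enumerate(row_projection(tmap)):
        if count >= min_count and start is None:
            start = y                      # a band begins here
        elif count < min_count and start is not None:
            bands.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the last row
        bands.append((start, len(tmap) - 1))
    return bands


tmap = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(text_row_bands(tmap, 3))  # [(1, 2), (4, 4)]
```

The same idea can be applied column-wise inside each band to find the left and right edges of a text string.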

    Since the change of intensity at the boundary of overlay text may be small in a low-contrast image, the modified saturation is first introduced as a weight value to determine effectively whether a pixel lies within a transition region. This relies on the fact that overlay text takes the form of overlay graphics. The modified saturation is defined as follows:

    S(x, y) = 1 - (3 / (R + G + B)) * min(R, G, B)          (1)
    ~S(x, y) = S(x, y) / max(S(x, y))
    max(S(x, y)) = 2 * (0.5 - ~I(x, y)),  if ~I(x, y) > 0.5
    max(S(x, y)) = ~I(x, y),              otherwise          (2)

    S(x, y) and max(S(x, y)) denote the saturation value and the maximum saturation value at the corresponding intensity level, respectively. ~I(x, y) denotes the intensity at (x, y), normalized to [0, 1]. Based on the conical HSI color model, the maximum saturation value is normalized according to whether ~I(x, y) is above or below 0.5, as in (2). The transition can thus be defined by combining the change of intensity and the modified saturation as follows:
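A minimal per-pixel sketch of the modified saturation may make the definitions easier to follow. Three things here are assumptions and not part of the post: RGB channels are taken to be in 0..255; the maximum saturation uses the usual double-cone bound, 2 * (1 - ~I) above 0.5 and 2 * ~I otherwise, because the first branch of (2) as posted is non-positive whenever ~I > 0.5; and the result is clamped to [0, 1]. The function name is hypothetical.

```python
def modified_saturation(r, g, b):
    """Return a normalized saturation ~S for one RGB pixel (channels 0..255)."""
    total = r + g + b
    if total == 0:
        return 0.0  # black pixel: saturation treated as 0 (assumption)

    s = 1.0 - 3.0 * min(r, g, b) / total   # equation (1)
    i_norm = total / (3.0 * 255.0)         # ~I, intensity normalized to [0, 1]

    # Assumed double-cone bound on saturation at this intensity level.
    if i_norm > 0.5:
        max_s = 2.0 * (1.0 - i_norm)
    else:
        max_s = 2.0 * i_norm
    if max_s <= 0.0:
        return 0.0  # pure white: no room for saturation

    return min(s / max_s, 1.0)             # clamp is an assumption


print(modified_saturation(128, 128, 128))  # 0.0 for an achromatic pixel
print(modified_saturation(255, 0, 0))      # 1.0 after clamping
```

Achromatic pixels (gray, black, white) all map to 0, which is why the transition weight in the next step is built as 1 plus a saturation difference.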

    DL(x, y) = (1 + dSL(x, y)) * |I(x-1, y) - I(x, y)|
    DH(x, y) = (1 + dSH(x, y)) * |I(x, y) - I(x+1, y)|
    where dSL(x, y) = |~S(x-1, y) - ~S(x, y)| and
    dSH(x, y) = |~S(x, y) - ~S(x+1, y)|                      (3)

    Since the weights dSL(x, y) and dSH(x, y) can be zero when the overlay text and background are achromatic, we add 1 to the weight in (3). If a pixel satisfies the logarithmical change constraint given in (4), the three consecutive pixels centered on the current pixel are detected as transition pixels, and the transition map is generated:

    T(x, y) = 1, if DH(x, y) > DL(x, y) + TH
    T(x, y) = 0, otherwise                                   (4)

    The thresholding value TH is empirically set to 80 in consideration of the logarithmical change.
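Putting (3) and (4) together, a transition map for one row could be sketched as below. The function name is hypothetical, intensities are assumed to be on a 0..255 scale so that TH = 80 is meaningful, and `sat` is assumed to contain the modified saturation of each pixel.

```python
TH = 80  # threshold value quoted in the post


def transition_row(intensity, sat, th=TH):
    """Return the binary transition map T for one image row."""
    n = len(intensity)
    t = [0] * n
    for x in range(1, n - 1):
        d_l = (1.0 + abs(sat[x - 1] - sat[x])) * abs(intensity[x - 1] - intensity[x])
        d_h = (1.0 + abs(sat[x] - sat[x + 1])) * abs(intensity[x] - intensity[x + 1])
        if d_h > d_l + th:                 # constraint (4)
            # Mark the three consecutive pixels centered on x.
            t[x - 1] = t[x] = t[x + 1] = 1
    return t


row_intensity = [20, 20, 20, 200, 200, 200]
row_sat = [0.0] * 6
print(transition_row(row_intensity, row_sat))  # [0, 1, 1, 1, 0, 0]
```

A low-contrast edge such as an intensity step of 20 leaves the row unmarked, which is the case the saturation weight is meant to help with when the colors differ.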
Replies
  • just2rock

    Member, Nov 11, 2009

    Will the logic work if the threshold value TH is set to 80, as you mentioned? Anyway, a wonderful attempt.
  • s_athya

    Member, Nov 12, 2009

    This TH value is meant for an image size of 320 x 240.

    I am not an expert in image processing, @just2rock. Can you implement it? 😕
  • just2rock

    Member, Nov 12, 2009

    I suggest you check TH for 640 and 1024 frame resolutions, as these are the normal resolution scales. To implement it properly, recheck the logic as TH changes and see what kind of curve it produces. To be stable, it must have certain high and low peak levels. Keep this in mind as you work on it. For image processing, you should always take such inputs into account when working out the process. Also, is there any ROC for it?
  • s_athya

    Member, Nov 12, 2009

    Actually, that TH value is chosen to minimize false positives and false negatives in overlay text detection. Shall I send you the paper I have? We are going to perform this operation on a video database, so the frame size will be as I said before.
  • just2rock

    Member, Nov 12, 2009

    Yes, you can send it to my CE mail ID.