CrazyEngineers
  • Suppose you are presented with the following xkcd webcomic.

    [Image: xkcd comic "Tasks"]

    What will you do? Chuckle and move on? Maybe. Think it over and move on? Maybe. Whatever the initial reaction, the final step in this two-part process will always be to move on, unless you work at Flickr. The Flickr team took the comic seriously and created parkorbird.flickr.com. As the comic says, checking whether a photo (with GPS data embedded in it) was taken in a US national park is straightforward. Deciding whether the picture contains a bird was the tough part. The problem wasn't easy, but it definitely did not take 5 years to solve, and it certainly wasn't 'virtually impossible'.
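    To see why the park half of the problem is the easy one, here is a minimal sketch of a location check. The post does not say how Flickr actually does this, so treat it as one possible approach: a ray-casting point-in-polygon test. The names `point_in_polygon`, `photo_in_park` and `EXAMPLE_PARK` are hypothetical, and the boundary is a made-up rectangle, not a real park outline.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test: True if (lat, lon) lies inside the polygon.

    polygon is a list of (lat, lon) vertices given in order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside  # each crossing flips inside/outside
    return inside


# Made-up rectangular boundary for illustration only (assumption).
EXAMPLE_PARK = [(44.0, -111.0), (44.0, -110.0), (45.0, -110.0), (45.0, -111.0)]


def photo_in_park(lat, lon, park=EXAMPLE_PARK):
    """Return True if the photo's GPS position falls inside the boundary."""
    return point_in_polygon(lat, lon, park)
```

    A real service would need actual boundary data for every park and would read the coordinates from the photo's metadata, but the geometry itself stays this simple.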

    Last year, the Flickr team started using a computer vision technique called a deep convolutional neural network, which lets a computer recognize more than a thousand kinds of things in an image, one of them being 'birds'.

    [Image: parkorbird]
    The Deep Convolutional Neural Network Model
    The network transforms an input image into a representation in which different objects and scenes are easily distinguishable by a simple binary classification algorithm, such as a Support Vector Machine or a Bayesian network. It does this by passing the image through a series of layers, where each layer computes a function of the output of the layer below it. The layers are trained on millions of images, and they learn to recognize image features in ascending order of complexity. For example, the first layer may recognize simple edges or lines, and the next few layers may recognize various shapes. Further layers might recognize higher-level concepts, like eyes and beaks, and even further ones might recognize heads and wings.
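    The "each layer computes a function of the layer below it" idea can be sketched in a few lines. This is a toy, assuming small fully connected layers with hand-picked weights instead of the convolutional layers and learned weights a real network would have; `dense_layer`, `forward` and the numbers are hypothetical.

```python
def relu(values):
    """Keep positive values, replace negative ones with zero."""
    return [max(0.0, v) for v in values]


def dense_layer(weights, biases):
    """Return a function that maps an input vector to relu(W x + b)."""
    def apply(inputs):
        sums = [sum(w * x for w, x in zip(row, inputs)) + b
                for row, b in zip(weights, biases)]
        return relu(sums)
    return apply


def forward(layers, inputs):
    """Feed inputs through the layers in order and keep every layer's output."""
    activations = []
    current = inputs
    for layer in layers:
        current = layer(current)  # each layer only sees the previous output
        activations.append(current)
    return activations


# Hand-picked toy weights (assumption): in practice these are learned.
layers = [
    dense_layer([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    dense_layer([[1.0, 1.0]], [-0.5]),
]
activations = forward(layers, [3.0, 1.0])
print(activations)  # [[2.0, 0.0], [1.5]]
```

    Stacking more of these functions is what lets later layers respond to combinations of whatever the earlier layers picked up.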

    Each layer is then 'activated' according to how strongly the features it detects are present in the input image, and a short floating-point vector summarizing the various activations at each layer is passed to a binary classifier. The classifier, as mentioned, is trained using a million images and gives a yes/no answer for a specific object or scene class, one of the classes being birds.
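    The last step, a binary classifier that turns a feature vector into a yes/no answer, can be sketched as follows. This uses a plain perceptron as a simpler stand-in for the SVM mentioned above, and the two-number feature vectors are made up for illustration; real vectors would come out of the network and be much longer. `train_perceptron` and `predict` are hypothetical names.

```python
def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Fit a linear yes/no classifier; labels are +1 (yes) or -1 (no)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            score = sum(w * f for w, f in zip(weights, features)) + bias
            if label * score <= 0:  # misclassified: nudge the boundary
                weights = [w + lr * label * f
                           for w, f in zip(weights, features)]
                bias += lr * label
    return weights, bias


def predict(weights, bias, features):
    """Return True for 'yes' (for example, bird) and False for 'no'."""
    return sum(w * f for w, f in zip(weights, features)) + bias > 0


# Made-up feature vectors (assumption), standing in for network output.
bird_like = [[0.9, 0.8], [0.8, 0.9], [1.0, 0.7]]
not_bird = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]
weights, bias = train_perceptron(bird_like + not_bird, [1] * 3 + [-1] * 3)

print(predict(weights, bias, [0.95, 0.85]))  # True
print(predict(weights, bias, [0.05, 0.10]))  # False
```

    One classifier like this is trained per class, so the same feature vector can answer "bird?", "park?" and so on independently.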

    TL/DR: xkcd: Tasks (kind of indirectly), Introducing: Flickr PARK or BIRD | code.flickr.com, "#-Link-Snipped-#" is born

    Source: Introducing: Flickr PARK or BIRD | code.flickr.com
Replies
  • Shreyas Sule

    Member · Oct 24, 2014

    Nice article. It's amazing how neural networks can be used for such classification problems.

    If someone wants to learn more about neural networks, here is a book I personally recommend: Neural Network Design by Martin T. Hagan, Howard B. Demuth, Mark H. Beale and Orlando De Jesus.

    The 2nd edition can be downloaded from here: https://hagan.okstate.edu/NNDesign.pdf
    There are also MATLAB examples on the Neural Network Design page: https://hagan.okstate.edu/nnd.html
  • avii

    Member · Oct 27, 2014

    this proves that XKCD was right all along ;-)