Monthly Archives: April 2016

Google Vision API Quick Test on Gym Schedule

While driving back with next week's gym schedule, a startup idea struck me. Wouldn’t it be nice if I could just snap a picture of the gym schedule and have all the classes added to my calendar? Even though the schedule is available online, I felt it would be convenient for some people to add it to their calendar with a snap. I had also been looking into the Google Vision API, which Google released at the recently concluded GCP Next. I thought this would be a nice test of the API's usability.

To use the Google Cloud APIs, one needs to sign up for the free trial (which requires a credit card). To get started with the Vision API, here is a quick tutorial.

So I tried it out. I took a snap of the gym schedule and uploaded the image to a Cloud Platform storage bucket. Then I changed the image URL in the tutorial's request to point to my uploaded image, and changed the request type to TEXT_DETECTION. The request took less than a second to complete and returned a JSON response. The response has two parts. The first entry lists all the text that was detected. The remaining entries give the coordinates of the bounding rectangle for each detected word. I was specifically looking to see whether it detected the Bodypump class at 7:30pm on Thursday, apr 21. My hope was that, using the coordinates of the Bodypump block, I could determine the date from the column header and the time from the row header with some geometry/math. Here is what I found:

  • It was not able to detect all the times ending in p.m. in the row headers.
  • It did detect the Bodypump blocks even though they were in white font on a black background, which I felt was smart work by the API.
  • It was not able to detect the smaller text under Bodypump, e.g. Studio 1, 45 mins, Nancy.
  • It was able to detect some variations of the fonts. Notice that it detects Fitness, but not the 24 Hour at the top of the page.

The Google Vision API does a nice job detecting some of the text, but the accuracy is not good enough for me to base my app idea on it, i.e. capture the text, use the coordinates to find the time and date of the class, and automatically add it to my calendar. I will have to wait until the API is more accurate.
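For what it's worth, the geometry part of the idea could look roughly like the Python sketch below. It is only an illustration under several assumptions: every detected word has already been reduced to an axis-aligned box, the date labels (column headers) and time labels (row headers) have been picked out of the detected words, and a class block sits below its date and to the right of its time. The `Box` type, the `find_headers` function and all the sample coordinates are made up for this example and are not taken from the actual response.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Box(NamedTuple):
    """Axis-aligned bounding box of one detected word (hypothetical helper)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def overlap(a_lo: int, a_hi: int, b_lo: int, b_hi: int) -> int:
    """Length of the overlap between two intervals, 0 if they are disjoint."""
    return max(0, min(a_hi, b_hi) - max(a_lo, b_lo))


def best_header(cell: Box, headers: Sequence[Box], axis: str) -> Optional[Box]:
    """Return the header that overlaps the cell the most along one axis."""
    best, best_len = None, 0
    for h in headers:
        if axis == "x":
            length = overlap(cell.left, cell.right, h.left, h.right)
        else:
            length = overlap(cell.top, cell.bottom, h.top, h.bottom)
        if length > best_len:
            best, best_len = h, length
    return best


def find_headers(
    cell: Box, column_headers: Sequence[Box], row_headers: Sequence[Box]
) -> Tuple[Optional[str], Optional[str]]:
    """Look up the date (column header) and time (row header) for a class block."""
    date = best_header(cell, column_headers, "x")  # same column: x-ranges overlap
    time = best_header(cell, row_headers, "y")     # same row: y-ranges overlap
    return (date.text if date else None, time.text if time else None)


# Made-up coordinates, for illustration only.
dates = [Box("wed apr 20", 700, 200, 900, 240), Box("thu apr 21", 950, 200, 1150, 240)]
times = [Box("9:00am", 60, 400, 180, 440), Box("7:30pm", 60, 900, 180, 940)]
cell = Box("BODYPUMP", 980, 905, 1120, 935)
print(find_headers(cell, dates, times))
```

A lookup like this only works if both headers are detected, which is exactly where the missing p.m. times in the row headers get in the way: `find_headers` would return `None` for the time.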

See schedule image, request and response below.

24ScheduleMine

I was specifically interested in whether the response included the Bodypump class at 7:30pm on Thursday, apr 21.

The following is the API request

POST https://vision.googleapis.com/v1/images:annotate?key={YOUR_API_KEY}
{
 "requests": [
  {
   "features": [
    {
     "type": "TEXT_DETECTION"
    }
   ],
   "image": {
    "source": {
     "gcsImageUri": "gs://emailclassification/24ScheduleMine.JPG"
    }
   }
  }
 ]
}
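The same request can also be sent from a script. Below is a minimal Python sketch using only the standard library. The endpoint and the request body are the ones shown above; the function names `build_text_detection_request` and `annotate` are my own for this example, and the API key is a placeholder you would have to supply.

```python
import json
import urllib.parse
import urllib.request

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_text_detection_request(gcs_uri: str) -> dict:
    """Build the request body shown above for an image stored in a bucket."""
    return {
        "requests": [
            {
                "features": [{"type": "TEXT_DETECTION"}],
                "image": {"source": {"gcsImageUri": gcs_uri}},
            }
        ]
    }


def annotate(gcs_uri: str, api_key: str) -> dict:
    """POST the request and return the parsed JSON response (needs network access)."""
    body = json.dumps(build_text_detection_request(gcs_uri)).encode("utf-8")
    url = VISION_URL + "?" + urllib.parse.urlencode({"key": api_key})
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Building the body needs no network access or key.
print(json.dumps(build_text_detection_request(
    "gs://emailclassification/24ScheduleMine.JPG"), indent=1))
```

Calling `annotate("gs://emailclassification/24ScheduleMine.JPG", "YOUR_API_KEY")` would perform the actual request, assuming the key is valid and the image is readable by the API.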

The following is the API response

{
 "responses": [
  {
   "textAnnotations": [
    {
     "locale": "en",
     "description": "FITITESS\nHOUR\nGROUP X\nweek of april 18, 2016\nmon apr 18\ntue apr-19\nwed apr 20\nthu apr 21\nfri apr 22\nsat apr 23\nsun apr 24\n5:30am\n8:00am\n9:00am\nBODYPUMP\nBODYCOMBAT BODYPUMP BODYPUMP\nBODYCOMBAT\n0 M\n9:30am\nPop\n30 M\nBODYPUMP\nZUMBA\n10:00am\nOZVMSA\nBODYFLOW\nZUMBA\nStudio\nZUMBA\nBODY FLOW\n11:00am\nBODYPUMP\nPLYO\nGRIT STRENGTH 30M\nStudeo 2\nBODYPUMP\nchedule\n8 AM and 9 PM only. The\nmembers may attend ca\ny Aq\nd Zumba Gold\nd by S\n",
     "boundingPoly": {
      "vertices": [
       {
        "x": 66,
        "y": 142
       },
       {
        "x": 1962,
        "y": 142
       },
       {
        "x": 1962,
        "y": 1449
       },
       {
        "x": 66,
        "y": 1449
       }
      ]
     }
    },
    {
     "description": "FITITESS",
     "boundingPoly": {
      "vertices": [
       {
        "x": 66,
        "y": 921
       },
       {
        "x": 67,
        "y": 535
       },
       {
        "x": 143,
        "y": 535
       },
       {
        "x": 142,
        "y": 921
       }
      ]
     }
    },
    ...
   ]
  }
 ]
}
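To work with a response like this, one could split it into the full text and the per-word boxes. The Python sketch below assumes the layout seen in the excerpt above: the first `textAnnotations` entry holds all the text, and each later entry holds one word with its `boundingPoly`. The function name `split_annotations` is hypothetical, and treating a missing `x` or `y` as 0 is an assumption on my part.

```python
from typing import List, Tuple

WordBox = Tuple[str, int, int, int, int]  # (text, left, top, right, bottom)


def split_annotations(response: dict) -> Tuple[str, List[WordBox]]:
    """Return the full detected text and an axis-aligned box for each word."""
    annotations = response["responses"][0].get("textAnnotations", [])
    if not annotations:
        return "", []
    full_text = annotations[0]["description"]
    words = []
    for ann in annotations[1:]:
        vertices = ann["boundingPoly"]["vertices"]
        # Assumption: a coordinate missing from a vertex is treated as 0.
        xs = [v.get("x", 0) for v in vertices]
        ys = [v.get("y", 0) for v in vertices]
        # Rotated words (like FITITESS here) are flattened to their outer box.
        words.append((ann["description"], min(xs), min(ys), max(xs), max(ys)))
    return full_text, words


# Shortened copy of the response above: full-text entry plus the first word.
sample = {
    "responses": [{
        "textAnnotations": [
            {"locale": "en", "description": "FITITESS\nHOUR\n",
             "boundingPoly": {"vertices": [
                 {"x": 66, "y": 142}, {"x": 1962, "y": 142},
                 {"x": 1962, "y": 1449}, {"x": 66, "y": 1449}]}},
            {"description": "FITITESS",
             "boundingPoly": {"vertices": [
                 {"x": 66, "y": 921}, {"x": 67, "y": 535},
                 {"x": 143, "y": 535}, {"x": 142, "y": 921}]}},
        ]
    }]
}

full_text, words = split_annotations(sample)
print(words)
```

The word boxes are what a header lookup by coordinates would consume; the full text alone is not enough, because it loses the row and column positions.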

Full API Response Text File