Category Archives: Artificial Intelligence

Google Vision API Quick Test on Gym Schedule

While driving back with next week’s gym schedule, a startup idea struck me. Wouldn’t it be nice if I could just snap a picture of the gym schedule and have all the classes added to my calendar? The schedule is available online, but I felt some people would find it convenient to add it to their calendar with a snap. I had also been looking into the Google Vision API, which Google released at the recently concluded GCP Next, and thought this would be a nice test of the API’s usability.

To use the Google Cloud APIs, one needs to sign up for the free trial (which requires a credit card). To get started with the Vision API, here is a quick tutorial. So I tried it out. I took a snap of the gym schedule and uploaded the image to a Cloud Storage bucket, then changed the image URL in the tutorial request to point to my uploaded image and changed the request type to TEXT_DETECTION. The request took less than a second to complete and returned a JSON response.

The JSON response is divided into two sections. The first is a list of all the text detected. The second has the coordinates of the bounding rectangle for each word detected. I was specifically checking whether it detected the Bodypump class at 7:30pm on Thursday, apr 21. My hope was that, using the coordinates of the Bodypump box, I could determine the date from the column header and the time from the row header with some geometry/math.

  • It was not able to detect all the times ending in p.m. in the row headers.
  • It did detect the Bodypump blobs even though they were in white font on a black background, which I felt was smart work by the API.
  • It was not able to detect the smaller text under Bodypump, e.g. Studio 1, 45 mins, Nancy.
  • It was able to detect some variations of the fonts. Notice it detects Fitness, but not the 24 Hour at the top of the page.

Google Vision does a nice job detecting some of the text, but the accuracy is not good enough to base my app idea on, i.e. capture the text, use the coordinates to find the time and date for each class, and automatically add it to my calendar. I will have to wait until the API is more accurate.
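The geometry I had in mind can be sketched roughly as below. This is only an illustration: the box format (a dict with text, left, top, right, bottom) and all the pixel coordinates are made up, and matching a class to the nearest header center is one simple heuristic, not something the API provides.

```python
def center(box):
    """Return the (x, y) center of a box given as a dict with left/top/right/bottom."""
    return ((box["left"] + box["right"]) / 2, (box["top"] + box["bottom"]) / 2)


def nearest_header(target, headers, axis):
    """Pick the header whose center is closest to the target's center.

    axis 0 compares x centers (column headers such as dates),
    axis 1 compares y centers (row headers such as times).
    """
    return min(headers, key=lambda h: abs(center(h)[axis] - center(target)[axis]))


def locate_class(class_box, date_headers, time_headers):
    """Guess the (date, time) of a class box from the column and row headers."""
    date = nearest_header(class_box, date_headers, 0)["text"]
    time = nearest_header(class_box, time_headers, 1)["text"]
    return date, time


# Hypothetical coordinates, not taken from the real response.
DATE_HEADERS = [
    {"text": "wed apr 20", "left": 800, "top": 100, "right": 1000, "bottom": 140},
    {"text": "thu apr 21", "left": 1050, "top": 100, "right": 1250, "bottom": 140},
]
TIME_HEADERS = [
    {"text": "9:00am", "left": 60, "top": 400, "right": 160, "bottom": 440},
    {"text": "7:30pm", "left": 60, "top": 900, "right": 160, "bottom": 940},
]
BODYPUMP = {"text": "BODYPUMP", "left": 1080, "top": 905, "right": 1220, "bottom": 935}

if __name__ == "__main__":
    print(locate_class(BODYPUMP, DATE_HEADERS, TIME_HEADERS))
```

The catch is that this only works when the row and column headers themselves are detected, which is exactly where the API fell short on the p.m. times.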

See schedule image, request and response below.

24ScheduleMine

I was specifically interested in whether the response included the Bodypump class at 7:30pm on Thursday, apr 21.

The following is the API request

POST https://vision.googleapis.com/v1/images:annotate?key={YOUR_API_KEY}
{
 "requests": [
  {
   "features": [
    {
     "type": "TEXT_DETECTION"
    }
   ],
   "image": {
    "source": {
     "gcsImageUri": "gs://emailclassification/24ScheduleMine.JPG"
    }
   }
  }
 ]
}
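For anyone who prefers to send the same request from a script, here is a minimal Python sketch using only the standard library. The function names are my own, and the API key is whatever you got from the cloud console. I used the API explorer from the tutorial rather than this code, so treat it as an untested-against-the-service illustration.

```python
import json
import urllib.request

# Endpoint copied from the request shown above.
ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_text_detection_request(gcs_uri):
    """Return the request body for one TEXT_DETECTION call on a Cloud Storage image."""
    return {
        "requests": [
            {
                "features": [{"type": "TEXT_DETECTION"}],
                "image": {"source": {"gcsImageUri": gcs_uri}},
            }
        ]
    }


def annotate(gcs_uri, api_key):
    """POST the request and return the decoded JSON response.

    Needs network access and a valid API key, so it is not called here.
    """
    body = json.dumps(build_text_detection_request(gcs_uri)).encode("utf-8")
    request = urllib.request.Request(
        ENDPOINT + "?key=" + api_key,
        data=body,  # supplying data makes urllib send a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    body = build_text_detection_request("gs://emailclassification/24ScheduleMine.JPG")
    print(json.dumps(body, indent=1))
```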

The following is the API response

{
 "responses": [
  {
   "textAnnotations": [
    {
     "locale": "en",
     "description": "FITITESS\nHOUR\nGROUP X\nweek of april 18, 2016\nmon apr 18\ntue apr-19\nwed apr 20\nthu apr 21\nfri apr 22\nsat apr 23\nsun apr 24\n5:30am\n8:00am\n9:00am\nBODYPUMP\nBODYCOMBAT BODYPUMP BODYPUMP\nBODYCOMBAT\n0 M\n9:30am\nPop\n30 M\nBODYPUMP\nZUMBA\n10:00am\nOZVMSA\nBODYFLOW\nZUMBA\nStudio\nZUMBA\nBODY FLOW\n11:00am\nBODYPUMP\nPLYO\nGRIT STRENGTH 30M\nStudeo 2\nBODYPUMP\nchedule\n8 AM and 9 PM only. The\nmembers may attend ca\ny Aq\nd Zumba Gold\nd by S\n",
     "boundingPoly": {
      "vertices": [
       {
        "x": 66,
        "y": 142
       },
       {
        "x": 1962,
        "y": 142
       },
       {
        "x": 1962,
        "y": 1449
       },
       {
        "x": 66,
        "y": 1449
       }
      ]
     }
    },
    {
     "description": "FITITESS",
     "boundingPoly": {
      "vertices": [
       {
        "x": 66,
        "y": 921
       },
       {
        "x": 67,
        "y": 535
       },
       {
        "x": 143,
        "y": 535
       },
       {
        "x": 142,
        "y": 921
       }
      ]
     }
    },
    ...
   ]
  }
 ]
}

Full API Response Text File
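To work with the two sections of the response, something like the following sketch would do. It assumes the layout shown above: the first textAnnotations entry holds the whole text and each later entry holds one word with its bounding polygon. The sample below is a trimmed copy of the response, and treating a missing x or y as 0 is a defensive assumption on my part.

```python
def split_annotations(response):
    """Return (full_text, words) from a TEXT_DETECTION response dict."""
    annotations = response["responses"][0].get("textAnnotations", [])
    if not annotations:
        return "", []
    full_text = annotations[0]["description"]
    words = []
    for entry in annotations[1:]:
        vertices = entry["boundingPoly"]["vertices"]
        # Assumption: a coordinate missing from a vertex is treated as 0.
        xs = [v.get("x", 0) for v in vertices]
        ys = [v.get("y", 0) for v in vertices]
        words.append({
            "text": entry["description"],
            "left": min(xs), "top": min(ys),
            "right": max(xs), "bottom": max(ys),
        })
    return full_text, words


# Trimmed copy of the response above (full text shortened to two lines).
SAMPLE = {
    "responses": [{
        "textAnnotations": [
            {"locale": "en", "description": "FITITESS\nHOUR\n",
             "boundingPoly": {"vertices": [{"x": 66, "y": 142}, {"x": 1962, "y": 142},
                                           {"x": 1962, "y": 1449}, {"x": 66, "y": 1449}]}},
            {"description": "FITITESS",
             "boundingPoly": {"vertices": [{"x": 66, "y": 921}, {"x": 67, "y": 535},
                                           {"x": 143, "y": 535}, {"x": 142, "y": 921}]}},
        ]
    }]
}

if __name__ == "__main__":
    text, words = split_annotations(SAMPLE)
    print(text.splitlines())
    print(words)
```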

Deep Learning – My Painting Picasso Style – Neural Algorithm of Artistic Style

 

 Naveed Portrait Deep Art

appendsigns

This was a weekend project spent reading through this interesting paper. The gist of the paper is that it is possible to extract the content of one image and the style of another, and produce a third image that mixes that content and style. This is enabled by deep convolutional neural networks. Convolutional networks learn features in a hierarchy: lower layers learn features such as line segments, and each layer above learns higher abstractions, such as a nose, a face, or a scenery. In the example above, my own school picture plus a Picasso artwork produced a fascinating painting of me.
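As I understand the paper, content is compared directly on a layer's feature maps, while style is compared through the correlations between feature maps (the Gram matrix), which throws away where things are in the image. Here is a toy sketch of those two losses in plain Python. The tiny lists stand in for real network activations, and nothing here reproduces the network, the layers, or the weights the linked implementation uses.

```python
def gram_matrix(features):
    """features: list of feature maps, each flattened to a list of numbers.

    Entry [i][j] is the dot product of feature map i with feature map j.
    """
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features] for fi in features]


def content_loss(generated, content):
    """Half the squared difference between two sets of feature maps."""
    return 0.5 * sum(
        (g - c) ** 2
        for fg, fc in zip(generated, content)
        for g, c in zip(fg, fc)
    )


def style_layer_loss(generated, style):
    """Squared difference between Gram matrices, scaled as in the paper."""
    n = len(generated)      # number of feature maps
    m = len(generated[0])   # size of each flattened feature map
    g = gram_matrix(generated)
    a = gram_matrix(style)
    total = sum((g[i][j] - a[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (4.0 * n ** 2 * m ** 2)


# Toy activations: the second set is the first with its positions swapped.
ORIGINAL = [[1, 2], [3, 4]]
SHUFFLED = [[2, 1], [4, 3]]

if __name__ == "__main__":
    print(gram_matrix(ORIGINAL))
    print(content_loss(SHUFFLED, ORIGINAL), style_layer_loss(SHUFFLED, ORIGINAL))
```

Swapping the positions changes the content loss but leaves the style loss at zero, which is the intuition for why style can be lifted from one image and laid over the content of another.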

The paper is available here, A Neural Algorithm of Artistic Style

The source code for it is available here GitHub Source

The fastest way to get started on a Mac or PC is to use a Docker image with precompiled dependencies. It took about 10 hours to generate the masterpiece on the CPU. On a GPU-based machine it would take 15 or so minutes.

Get Docker and start the Docker terminal. Raise the memory of the Docker VirtualBox machine to a higher number, such as 8GB: stop the virtual machine and change its system settings in VirtualBox.

docker pull kchentw/neural-style

docker run -i -t kchentw/neural-style /bin/bash

#To exit shell without terminating container do CTRL P, Q

docker ps  # will show container id

docker cp picaso.jpg <container id>:/tmp

docker cp kid.jpg <container id>:/tmp

docker attach <container id>

cd ~/neural-style

th neural_style.lua -gpu -1 -style_image /tmp/picaso.jpg -content_image /tmp/kid.jpg

Wait for about 10 or so hours and it will produce out.png, the Picasso masterpiece!

Found an article I wrote about Cog, the humanoid robot, exactly 10 years ago.

The Humanoid Robot Cog for ACM

The ending of the article was optimistic that humanoids would become commonplace. Aibo was promising, but Sony discontinued it many years ago. In fact, I could not find a single telepresence robot for home use. I feel that in 10 years no significant progress has been made. When I search for robots I still find rovers and toy-like robots. I hope bigger companies get into home automation and robotics for the masses. A decent vacuum cleaner/telepresence robot hooked up to home sensors would be something useful for the home. Turtlebot seems a step in the right direction, but it is expensive. The iPhone robot Romo is interesting, but it is too small and more toyish.

PriceThinker.com first version launched!

My first weekend .com startup initiative has launched. It has taken exactly 2 weekends to hack up this site. See logo

pricethinker100x100

 

Pricethinker is a very useful machine learning application that can predict prices based on numeric feature data. For example, when one is out shopping and looking at many similar products, it is hard to figure out the best-value product, i.e. the one that gives the most features for the price. This tool helps you do complicated price analysis.

A simple example is house price prediction based on one feature, say square feet. Which one of the following is the best value? What should the price of a 2000 sq ft home be? Pricethinker is exactly the tool to help you. Imagine how complicated it becomes if you also put bathrooms, garage, etc. into the equation. It’s much easier to let PriceThinker do the number crunching for you.

Sq Ft – Price
----------------
2200 – 340000
2100 – 310000
1900 – 260000
2500 – 370000

Pricethinker will tell you that the home with 1900 sq ft is the best value, and that a home with 2000 sq ft should cost 288459. The power of machine learning is at your service, simply and beautifully!
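To give a feel for the kind of number crunching involved, here is a minimal sketch that fits a straight line to the table above with ordinary least squares and calls the home furthest below the line the best value. This is a stand-in technique for illustration only; it is not the code running on the site, and the function names are made up.

```python
def fit_line(points):
    """Ordinary least squares fit of price = slope * sqft + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, sqft):
    """Predicted price for a given size under the fitted line."""
    slope, intercept = model
    return slope * sqft + intercept


def best_value(points, model):
    """The point whose actual price is furthest below its predicted price."""
    return min(points, key=lambda p: p[1] - predict(model, p[0]))


# (sq ft, price) rows from the table above.
HOMES = [(2200, 340000), (2100, 310000), (1900, 260000), (2500, 370000)]
MODEL = fit_line(HOMES)

if __name__ == "__main__":
    print(best_value(HOMES, MODEL))
    print(round(predict(MODEL, 2000)))
```

On this table the straight-line fit also picks the 1900 sq ft home, and it prices 2000 sq ft at roughly 288,267. That is close to the 288459 quoted above but not identical, so the site evidently fits its model somewhat differently.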

pricethinker.com domain registered.

I will be moving my test app over from the Google apps host to here. I cannot live without a MySQL db. This is my weekend .com startup initiative. It is an AI application that will allow users to compare prices based on quantitative features. The application will be able to predict prices given training data. It will also identify the best-value item in a group. I plan to release the first fully functional version by the end of this year!

Programming Collective Intelligence Review

I have been reading Programming Collective Intelligence on my daily commute to NY. This is a gem of a book. It covers many topics in machine learning that can be applied to projects, all with practical examples and code in Python. I had no idea that current machine learning algorithms were this useful. This book puts so much AI jargon to practical use. A typical AI book will confuse the heck out of you with math proofs and statistics, but this book is purely practical, with juicy examples one can easily understand and put to use! Many hip words in AI made sense to me after reading this book, e.g. genetic programming, genetic algorithms (yes, they are different), Bayesian classifiers, hierarchical clustering, k-means clustering, optimization, annealing, hill climbing, decision trees, support vector machines.

Some practical algorithms covered, with code and examples:
a) Making recommendations for movies (e.g. similar to Netflix movie recommendations, or matchmaking).
b) Writing a search engine and ranking its search results.
c) Optimization algorithms (optimization is finding a good enough solution when the optimal solution is expensive to find).
d) Document filtering, e.g. a spam filter.
e) Discovering groups in a large set of data (clustering).
f) Building price models, e.g. determining the price of a home given a large data set with features.
g) Evolving intelligence: genetic programming.
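To give a flavor of the optimization chapter, here is a tiny hill climbing sketch in Python. It is my own toy version, not code from the book: keep moving to the cheapest neighbor until no neighbor is cheaper. The cost function and the neighbor rule are made-up examples.

```python
def hill_climb(cost, start, neighbors):
    """Greedy descent: move to the cheapest neighbor until none improves the cost."""
    current = start
    while True:
        best = min(neighbors(current), key=cost)
        if cost(best) >= cost(current):
            return current  # no neighbor is better, so stop here
        current = best


def example_cost(x):
    """Made-up cost with its single minimum at x = 7."""
    return (x - 7) ** 2


def step_neighbors(x):
    """Neighbors of an integer are the integers one step away."""
    return [x - 1, x + 1]


if __name__ == "__main__":
    print(hill_climb(example_cost, 0, step_neighbors))
```

With a single valley like this it walks straight to the bottom. On a bumpier cost function it can get stuck in a local minimum, which is where techniques such as annealing and genetic algorithms come in.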

On Intelligence Review

So I have finally finished reading On Intelligence. I mostly read it on my bus commute to NY.

I think the book could have been written in less than 100, or maybe 50, pages. It is filled with the author’s personal views about intelligence, and it is full of analogies, pages and pages of analogies. The gist of the book is that knowledge in our brain is stored in hierarchies, and that the brain is always making predictions, e.g. when you are listening to a song, your brain has already predicted the next note, and the brain does that for everything we do.

I thought the book revealed nothing new. The tree is a universal (hierarchical) data structure, and CS people already know about it but have not discovered true AI. Nor is prediction a new thing: programming language compilers implicitly do the same thing, i.e. they look for the next expected token and, if it is not found, throw a compiler error.

So it was interesting to learn that the neocortex behaves in a certain fashion (in places the author filled the gaps in our knowledge of the brain with his own assumptions, e.g. this part of the brain may possibly work like this), but the book is not a breakthrough, i.e. you cannot write an intelligent program after reading it. It will only give you inspiration to dig deeper.

Alice Story Telling

A few years ago I came across this wonderful cartoon/animation building software called Alice. It is very innovative software for learning the concepts of object programming. I wanted to teach it to my daughter Sabah, but she was only 4 at the time. Now she is 6 and able to read, so I installed the Alice Story Telling application. She was amazingly quick to go through the 3 built-in tutorials. I am working with her to learn it more and be able to make simple stories in it. This software was developed by the late CMU professor Randy Pausch. See this inspirational video called “The Last Lecture”.