Technology – Page 2

ICML 2016 – International Conference on Machine Learning Notes

Notes from ICML 2016 Held in New York

David Silver (DeepMind), Yoshua Bengio (Univ. of Montreal)

Summary

Attended the biggest machine learning conference ever, by number of participants and papers. Red-hot interest in deep learning and reinforcement learning. Great advancements in vision (Microsoft's deep residual networks, neural networks around 1000 layers deep), speech to text (Baidu Deep Speech 2), and reinforcement learning (DeepMind's A3C algorithm, where an AI player learns to explore and play in a 3D Labyrinth maze, from the folks who developed AlphaGo). Image captioning and understanding is getting even more sophisticated (dense captioning work by Fei-Fei Li and team). Language understanding is still lagging and needs a breakthrough; however, a couple of papers from MetaMind on question answering over text, and especially over images, seemed promising.

Active areas that need more digging

Memory / attention
Ways to teach machines with less data. Deep learning is currently data hungry and needs lots of annotated data.
Understanding the story in an image (Dr. Fei-Fei Li's work)
Text understanding, which lags behind image and speech

My personal conclusion is that there is still a long way to go towards the goal of strong AI. Though AlphaGo (the DeepMind system that beat top human players at Go) and DeepQ are great strides in AI, these systems only learn by intuition encoded in neural network weights, backed by huge compute resources, and this learning seems to be different from the way humans learn. A true AI system should be able to use the same architecture for driving a car, learning to play chess, learning a new language, or cooking. I feel that if breakthroughs are not made in a few more years, there could be another AI winter coming. At the same time, it also feels like we are almost there in the quest for true AI!

Industry

MetaMind – acquired by Salesforce. Worth watching the Salesforce conference announcements to see how they intend to use deep learning technologies.
NVIDIA and NYU – partnering to develop an end-to-end neural network for autonomous cars.
Clarifai – NY-based startup for image captioning. Interesting use cases for CMS and for accessibility.
Netflix – Patterns for machine learning. Netflix uses Time Machine, an interesting architecture for training models using production data.
Maluuba – Up-and-coming Canadian startup that specializes in natural language processing. Claimed that their results are better than Google's and Facebook's.

Reading List For Papers presented

All papers presented at ICML 2016

My synthesized list to read over

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
Pixel Recurrent Neural Networks (Best paper award)
Dueling Network Architectures for Deep Reinforcement Learning (Best paper award)
Control of Memory, Active Perception, and Action in Minecraft
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing (MetaMind question answering)
Dynamic Memory Networks for Visual and Textual Question Answering (MetaMind question answering on images, e.g. asking "what sport is he playing?" about an image!)
Asynchronous Methods for Deep Reinforcement Learning
Learning Simple Algorithms from Examples

Important List for Papers Referenced From Previous Conferences

People Met

Dr. Fei-Fei Li (Stanford), after her keynote. Her work on image captioning was covered in the NYTimes. Interesting talk about dense captioning, her latest work on understanding the story.
Yann LeCun (NYU). After his workshop discussion I asked about meta thinking, learning to think. Also asked if he will be teaching the deep learning course at NYU next spring, which he confirmed.
David Silver (Google DeepMind). Excellent tutorial on deep reinforcement learning, covering the system that learnt to play arcade games just from raw pixel data, and AlphaGo. I asked him what the limitations are, and he told me the challenges are in robotics, where decisions have to be made quicker, and with rewards that are far in the future, e.g. needle-in-a-haystack rewards.
Richard Socher (MetaMind CEO; MetaMind was bought by Salesforce). Chat at the poster session about his paper on a question answering system for text and images. Am curious to know how Salesforce intends to use deep learning. Wonder if SugarCRM is diving into machine learning.
Matthew Zeiler (Clarifai CEO). Met at the Intrepid after-party. Clarifai provides an API for image analysis. Discussion on interesting use cases for the news industry.
Justin Basilico (Machine Learning, Netflix). Movie recommendations, which row and position a movie appears in, etc., are all driven by machine learning. Netflix has a catalog of machine learning design patterns. Discussion about the Time Machine design pattern.
Adam Trischler (Maluuba researcher). Talk about their question answering system. Maluuba is a Canadian startup that is soon to release products, and claims to have better results than Facebook and Google on public datasets.
Howard Mansell (Facebook AI). Chat about Torch usage at Facebook. His talk was about Torch as a deep learning tool for research.
James Zhang (Bloomberg machine learning researcher). Discussion about how to use news in time series prediction.
Yan Xu (SAS). Talk about how deep learning can be used in marketing automation. SAS is working on predictive modeling.

Pictures

Google Arm Robot Researcher (the Google IO one)

Google Vision API Quick Test on Gym Schedule

While driving back with next week’s gym schedule, a startup idea struck me. Wouldn’t it be nice if I could just snap a picture of the gym schedule and have all the classes added to my calendar? Even though the schedule is available online, I felt it would be convenient for some people to add it to their calendar with a snap. I had also been looking into the Google Vision API, which Google released at the recently concluded GCP Next, and thought this would be a nice test of the API’s usability.

To use the Google Cloud APIs, one needs to sign up for the free trial (needs a credit card). To get started with the Vision API, here is a quick tutorial. So I tried it out. I took a snap of the gym schedule and uploaded the image to a Cloud Storage bucket. Then I changed the image URL in the request to point to my uploaded image, and changed the request type to TEXT_DETECTION.

The request took less than a second to complete and returned a JSON response. The response is divided into two sections. The first has a list of all the text it has detected. The second has the coordinates of the bounding rectangle for each word it has detected. I was specifically looking at whether it detected the Bodypump class at 7:30pm for Thursday apr-21. My hope was that, using the coordinates of the Bodypump blocks, I could determine the date from the column header and the time from the row header, using some geometry/math.

It was not able to detect all the times ending in p.m. in the row headers.
It did detect the Bodypump blobs even though they were in a white font on a black background, which I felt was smart work by the API.
It was not able to detect the smaller text under Bodypump, e.g. Studio 1, 45 mins, Nancy.
It was able to detect some variations of the fonts. Notice it detects Fitness, but not the 24 Hour on the top of the page.

The Google Vision API does a nice job detecting some of the text. But the accuracy is not good enough for me to base my app idea on it, i.e. capture the text, use the coordinates to find the time and date for each class, and automatically add it to my calendar. Will have to wait till the API is more accurate.

See schedule image, request and response below.

I was specifically interested if the response from

The following is the API request

POST https://vision.googleapis.com/v1/images:annotate?key={YOUR_API_KEY}
{
 "requests": [
  {
   "features": [
    {
     "type": "TEXT_DETECTION"
    }
   ],
   "image": {
    "source": {
     "gcsImageUri": "gs://emailclassification/24ScheduleMine.JPG"
    }
   }
  }
 ]
}
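
For anyone scripting this, the same request body can be assembled in code. Below is a minimal Python sketch using only the standard library. The helper name build_annotate_request is my own (hypothetical), and the sketch only builds the JSON shown above; it does not call the API.

```python
import json


def build_annotate_request(gcs_uri, feature_type="TEXT_DETECTION"):
    """Build the JSON body for an images:annotate call (hypothetical helper)."""
    return {
        "requests": [
            {
                "features": [{"type": feature_type}],
                "image": {"source": {"gcsImageUri": gcs_uri}},
            }
        ]
    }


# Same bucket path as in the request above.
body = build_annotate_request("gs://emailclassification/24ScheduleMine.JPG")
print(json.dumps(body, indent=1))
```

The printed JSON can then be sent as the POST body with whatever HTTP client you prefer.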

The following is the API response

{
 "responses": [
  {
   "textAnnotations": [
    {
     "locale": "en",
     "description": "FITITESS\nHOUR\nGROUP X\nweek of april 18, 2016\nmon apr 18\ntue apr-19\nwed apr 20\nthu apr 21\nfri apr 22\nsat apr 23\nsun apr 24\n5:30am\n8:00am\n9:00am\nBODYPUMP\nBODYCOMBAT BODYPUMP BODYPUMP\nBODYCOMBAT\n0 M\n9:30am\nPop\n30 M\nBODYPUMP\nZUMBA\n10:00am\nOZVMSA\nBODYFLOW\nZUMBA\nStudio\nZUMBA\nBODY FLOW\n11:00am\nBODYPUMP\nPLYO\nGRIT STRENGTH 30M\nStudeo 2\nBODYPUMP\nchedule\n8 AM and 9 PM only. The\nmembers may attend ca\ny Aq\nd Zumba Gold\nd by S\n",
     "boundingPoly": {
      "vertices": [
       {
        "x": 66,
        "y": 142
       },
       {
        "x": 1962,
        "y": 142
       },
       {
        "x": 1962,
        "y": 1449
       },
       {
        "x": 66,
        "y": 1449
       }
      ]
     }
    },
    {
     "description": "FITITESS",
     "boundingPoly": {
      "vertices": [
       {
        "x": 66,
        "y": 921
       },
       {
        "x": 67,
        "y": 535
       },
       {
        "x": 143,
        "y": 535
       },
       {
        "x": 142,
        "y": 921
       }
      ]
     }
    },
    ...
   ]
  }
 ]
}

Full API Response Text File
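
The geometry idea I had in mind (match a class block to its column and row headers) could look roughly like the Python sketch below. It assumes annotations shaped like the textAnnotations entries in the response above. The coordinates in the example are invented for illustration and are not taken from the real response, and the helper names (center, nearest, box) are my own.

```python
def center(annotation):
    """Return the (x, y) center of an annotation's bounding polygon."""
    vertices = annotation["boundingPoly"]["vertices"]
    # Assumption: treat a missing coordinate as 0.
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def nearest(target, candidates, axis):
    """Pick the candidate whose center is closest to the target along one axis (0 = x, 1 = y)."""
    t = center(target)[axis]
    return min(candidates, key=lambda c: abs(center(c)[axis] - t))


def box(text, x, y, w=100, h=30):
    """Hypothetical helper: build an annotation shaped like a textAnnotations entry."""
    return {
        "description": text,
        "boundingPoly": {"vertices": [
            {"x": x, "y": y}, {"x": x + w, "y": y},
            {"x": x + w, "y": y + h}, {"x": x, "y": y + h},
        ]},
    }


# Invented coordinates for illustration only; not taken from the real response.
column_headers = [box("wed apr 20", 700, 100), box("thu apr 21", 900, 100)]
row_headers = [box("9:00am", 50, 300), box("7:30pm", 50, 900)]
bodypump = box("BODYPUMP", 905, 895)

day = nearest(bodypump, column_headers, axis=0)["description"]   # match on x for the column
slot = nearest(bodypump, row_headers, axis=1)["description"]     # match on y for the row
print(day, slot)
```

This only works if the row and column headers are detected in the first place, which is exactly where the API fell short in my test.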

Deep Learning – My Painting Picasso Style – Neural Algorithm of Artistic Style

Weekend project reading through this interesting paper. The gist of the paper is that it is possible to extract the content of one image and the style of another, and produce a third image with the content and style mixed. This is enabled by deep convolutional neural networks. Convolutional networks learn features in a hierarchy, where lower layers learn features such as line segments, and each layer above learns higher abstractions, such as a nose, a face or a scenery. In the example above, my own school picture plus a Picasso artwork produced a fascinating painting of me.

The paper is available here, A Neural Algorithm of Artistic Style

The source code for it is available here GitHub Source
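
A rough sketch of the core idea, as I understand it from the paper: content is compared directly on a layer's feature maps, while style is compared through the Gram matrix (filter-to-filter correlations) of the feature maps, which throws away where things are in the image. The toy Python below uses plain lists in place of real network activations. The numbers are invented, it covers a single layer only, and the default weights alpha and beta are placeholders, not values from the paper or from the linked source code.

```python
def gram_matrix(features):
    """Gram matrix G[i][j] = sum_k F[i][k] * F[j][k], with one row of F per filter."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features] for fi in features]


def content_loss(generated, content):
    """Half the squared difference between two sets of feature maps."""
    return 0.5 * sum((g - c) ** 2
                     for g_row, c_row in zip(generated, content)
                     for g, c in zip(g_row, c_row))


def style_loss(generated, style):
    """Squared difference between Gram matrices, scaled by 1 / (4 N^2 M^2)."""
    n, m = len(generated), len(generated[0])  # N filters, M positions per filter
    g, a = gram_matrix(generated), gram_matrix(style)
    diff = sum((gij - aij) ** 2
               for g_row, a_row in zip(g, a)
               for gij, aij in zip(g_row, a_row))
    return diff / (4 * n ** 2 * m ** 2)


def total_loss(generated, content, style, alpha=1.0, beta=1.0):
    """Weighted mix of content and style terms (placeholder weights)."""
    return alpha * content_loss(generated, content) + beta * style_loss(generated, style)


# Toy feature maps: two filters, two spatial positions each (invented values).
feats_a = [[1.0, 0.0], [0.0, 1.0]]
feats_b = [[0.0, 1.0], [1.0, 0.0]]  # same activations, positions swapped

print(content_loss(feats_a, feats_b))  # the positions differ, so this is non-zero
print(style_loss(feats_a, feats_b))    # the Gram matrices match, so this is zero
```

In the real thing the feature maps come from a pretrained convolutional network, and the generated image is adjusted by gradient descent to lower the total loss.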

The fastest way to get started on a Mac or PC is to use a Docker image with precompiled dependencies. It took about 10 hours to generate the masterpiece. On a GPU-based machine it would take 15 or so minutes.

Get Docker and start the Docker terminal. Give the Docker VirtualBox VM more memory, such as 8GB: stop the virtual machine, then change its system settings in VirtualBox.

docker pull kchentw/neural-style

docker run -i -t kchentw/neural-style /bin/bash

# To exit the shell without terminating the container, press Ctrl+P then Ctrl+Q

docker ps  # will show container id

docker cp picaso.jpg <container_id>:/tmp

docker cp kid.jpg <container_id>:/tmp

docker attach <container_id>

cd ~/neural-style

th neural_style.lua -gpu -1 -style_image /tmp/picaso.jpg -content_image /tmp/kid.jpg

Wait for about 10 or so hours and it will produce out.png, the Picasso masterpiece!

Stanford Machine Learning Certificate Coursera

Just finished the Stanford Machine Learning Course offered by Coursera.

https://www.coursera.org/account/accomplishments/certificate/EDTD253XUN9J

Sublime Text 3 is my favorite text editor

http://www.sublimetext.com/

This text editor is by far my favorite! It picks up where TextMate left off. It has the ease of use of a simple text editor, and can be enhanced to mimic a full-fledged IDE. It has great plugin support. I especially love the SublimeREPL plugin, which makes REPL-style scripting very easy (easier than Emacs). I have tried the REPL with Python, Octave and shell, and it works really well. Some useful plugins are

SublimeREPL – for REPL integration with Python, shell, Octave, OCaml, Scheme etc.
GoSublime – for Go development
Package Control – the first plugin to install, to manage all other plugins
Git – Can do most Git commands from Sublime
Terminal – Open path of file in xterm

Command+P or Command+Shift+P gives you most of the power: search, or run most of your plugins.

Added a new page for all my robots

http://navacron.com/robots/

Waiting for the Pepper and Jibo robots next year.

Man or robot?

“Am I a man dreaming I am a robot, or a robot dreaming I am a man?”

Roger Zelazny (1965)

At JavaOne 2013

Went to JavaOne 2013, held in San Francisco. It was a great learning experience. However, I felt that the JavaOne website for scheduling conference sessions was a bit outdated. Java seems to be an aging language, though still the most used. As the architect for Java said, Java is not dead yet, referring to the new features of Java 8, i.e. closures (lambda expressions) and streams.

Some of the stuff I learnt

Java 8 closures and streams
Programming with Java on Raspberry Pi. See my project, Domotix, mentioned on the Java site.
MongoDB
Elasticsearch
New features for Spring
Introduction to R

Also saw Maroon 5 at the Oracle party at Treasure Island.

With the humanoid robot Nao at the conference. Nao is the humanoid platform for the RoboCup tournament.

Synology NAS – my home cloud

Recently I purchased a Synology DS213 NAS server. This is to archive all my digital content

Pictures
Documents
Videos
Music

into a central repository. The NAS is

Accessible from all my devices: Mac, iPhone, iPad, Xbox. Running 24×7.
Durable and redundant: it uses Synology's RAID system with 2 x 4TB disks.
Extra redundancy with external drives taking backups of documents and pictures: one weekly backup and one monthly backup. The monthly drive is stored in a separate place.
VPN server, though I have disabled it for now. This lets me VPN into my home network and, for example, access my local machines, or watch YouTube on networks that block it 😛
Picture and video server.
Git over ssh server.
Time Machine Backup
Can SSH in and program in Python, or optionally install Java on the server as well.
iPhone/iPad apps to look at pictures and videos

I have put all my images there, and digitized all my home videos from tape and put them there too. My kids now enjoy watching their childhood videos, which had been sitting in drawers for about a decade, on their iPads.

Future Projects of the NAS

Digitize all the older paper pictures and documents, and store them on the NAS.
Explore the Asterisk server and make it a home-based VoIP server.
Could make it a web and wiki server, but am hesitant to expose it to the internet.
Integrate home security camera system.

Found an article I wrote about Cog, the humanoid robot, exactly 10 years ago.

The Humanoid Robot Cog for ACM

The ending of the article was optimistic, saying that humanoids would become commonplace. Aibo was promising, but it was discontinued by Sony many years ago. In fact, I could not find a single telepresence robot for home usage. I feel that in 10 years no significant progress has been made. When I search for robots I still find rovers and toy-like robots. I hope bigger companies get into home automation and robotics for the masses. A decent vacuum cleaner/telepresence robot, hooked up to home sensors, would be something useful for the home. Turtlebot seems like a step in the right direction, but it is expensive. The iPhone robot Romo is interesting, but it is too small and more toyish.

Implementing Prolog in Python

Very interesting site that shows a minimal Prolog interpreter implemented in Python.

http://openbookproject.net/py4fun/

Post after a long time. Tried to upgrade b2evolution. (Hence migrating to WordPress)

Tried it but started getting PHP errors. Thankfully I had saved my backup instructions.
Please save following information. You will need it in order to restore if something went wrong

If you don’t have SSH access, ask support to help you:
– Remove the directory /home/account/public_html/blog
– Untar /home/account/fantastico_backups/blog.backup.tgz
– Empty the database dbname
– Import the file /home/account/fantastico_backups/blog/backup.sql into the database dbname
– Move /home/account/fantastico_backups/blog to /home/account/public_html/blog

Fun Places For Kids

http://www.funplaceskids.com

FunPlacesKids.com is our new project. It’s a highly customized WordPress-powered website listing fun and exciting places for kids. The home page displays a Google map with flags pointing to different kids’ locations. The flag images are based on categories. I am really excited about this new .com project, and hope to list all the fun places for kids in the US soon.

Hungry Ants First Version Approved for Sale by Apple Store Today!

It was rejected yesterday because of a call to a non-public API called terminate. I fixed it in the evening and resubmitted. And today it was all clear for the public, ready to be sold! Well, it’s going to be free for now.
