Category Archives: Machine Learning

Types of Problems Machine Learning Can Solve

What types of problems can machine learning solve?

  • Problems the human brain solves easily, though we aren’t sure how it does so, e.g. 3D object recognition
  • Problems without simple and reliable rules, where the answer might be a combination of a large number of weak rules, e.g. detecting credit card fraud
  • Moving targets, where programs need to change because methods change in the real world, e.g. credit card fraud again

Since we don’t know the exact methods needed to hand-code solutions to these types of problems, the machine learning approach is to give a generic algorithm a large number of correct examples and let the machine figure out how to get from input to answer.

Done well, this approach should perform as well on new data fed into the program as it does on the examples given to it during training.

Also, if the data changes over time (as in the credit card fraud detection problem) the program can change by being trained on new data.

This boils down to:

  • Recognizing patterns
  • Recognizing anomalies
  • Making predictions


Summarized and adapted from

Review of Tesla’s Short Self-driving Proof of Concept

Autopilot Full Self-Driving Hardware (Neighborhood Short) from Tesla Motors on Vimeo.

The views they provide are (left to right, top to bottom):

  1. Interior cabin & through windshield
  2. The vehicle’s left rearward camera
  3. The vehicle’s medium range (forward) camera
  4. The vehicle’s right rearward camera

Related: Tesla’s HW2 (Hardware 2) sensor suite

Object detection:

  • Motion flow
  • Lane lines (left of vehicle)
  • Lane lines (right of vehicle)
  • Road flow
  • In-path objects
  • Road lights
  • Objects
  • Road signs

The following are my observations.  These are not necessarily errors, but they are worth mentioning.

My general observations:

  • On two-lane roads, the far left line is typically not detected by the medium range camera
  • Multiple bounding boxes around the same object
  • Rearward objects labeled as “in-path”
  • Brake pedal moves, but accelerator pedal does not appear to move
  • Slowed down for a crosswalk (0:12)
  • Detected a pedestrian near the road, but did not consider them to be in-path (0:41)
  • Stopped/slowed down for walkers/joggers near road (0:55)
  • Stopped during right turn (1:02)
  • Detects the back of road signs as signs (1:23)
  • Detected road sign as road light (1:25)
  • Stopped after right turn (1:33)
  • The camera angles not included are: forward narrow, forward wide, left and right side forward facing, and rear facing
  • From the information provided here, we cannot determine whether pedestrians are treated as any other object or separately as a “pedestrian type” object

TensorFlow and deep learning, without a PhD

Excellent introductory video from Google’s Martin Gorner covering the handwritten digit (0-9) recognition problem using the MNIST data set.  The code is Python using the TensorFlow library.

He begins with a single layer network, progresses into a multi-layer network and ends with a convolutional neural network, showing how improvements in techniques correspond to better test accuracy.

Covers softmax, sigmoid, ReLU, matrix transformation, convolutional networks, over-fitting, test and training accuracy.
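
For readers who have not met these functions yet, here is a minimal plain-Python sketch of the three activation functions named above. This is not the TensorFlow code from the talk; it only shows what each function computes.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Pass positive values through unchanged and clamp negatives to zero."""
    return max(0.0, x)

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the largest score first avoids overflow in exp().
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Ten made-up scores, one per digit 0-9; the largest score wins.
scores = [0.1, 0.3, 2.5, 0.2, 0.0, 0.4, 0.1, 0.9, 0.3, 0.2]
probabilities = softmax(scores)
print(probabilities.index(max(probabilities)))  # index of the most likely digit
```

In a digit classifier, softmax typically sits on the output layer (one probability per digit), while sigmoid or ReLU is applied to the hidden layers.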

Well worth a watch for beginners and has better explanations than some full courses online.

Note: they figure out the microphone feedback situation around minute 15 or 16.


Exploring Udacity’s 1st 40GB driving data set

I read about the release of their second data set yesterday and wanted to check it out.  For convenience, I downloaded the original, smaller, data set.

Preface: ROS is only officially supported on Ubuntu and Debian, and is experimental on OS X (Homebrew), Gentoo, and OpenEmbedded/Yocto.

Getting the data

Download the data yourself: Driving Data

The data set, which is linked to from the page above, was served up from Amazon S3 and actually seemed quite slow to download, so I let it run late last night and started exploring today.

The compressed download is dataset.bag.tar.gz and, after extracting, is a 42.3 GB file, dataset.bag.

.bag is a file type associated with the Robot Operating System (ROS).

Data overview

To get an overview of the file, use the rosbag info <filename> command.


There are 28 data topics from on-board sensors, including 3 color cameras.  The camera topics are:

  • /center_camera/image_color
  • /left_camera/image_color
  • /right_camera/image_color

Each camera topic has 15212 messages.   Doing the math on 15212 messages / 760 seconds works out to roughly 20 frames per second.
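
The same arithmetic as a short Python check, using only the two figures quoted above (the variable names are just for illustration):

```python
messages = 15212      # messages per camera topic
duration_s = 760      # recording length in seconds

fps = messages / duration_s
frame_interval_s = duration_s / messages

print(f"{fps:.2f} frames per second")
print(f"{frame_interval_s:.3f} seconds between frames")
```

At roughly 20 frames per second, a new frame arrives about every 0.05 seconds.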

Viewing the video streams

Converting a camera topic to a standalone video is a two step process:

  1. export jpegs from the bag file
  2. convert the jpegs to video

Exporting the jpegs

To export the image topic to jpegs, the bag needs to be played back and the frames extracted.  This can be done with a launch script.  The default filename pattern (frame%04d.jpg) allows for only 4 digits, and each camera topic has 15212 frames, so the launch script needs a line that changes the file name pattern to one that allows for 5 digits (frame%05d.jpg).

The entire script below that launches the player and extractor:

The number of resulting frames should match the number of topic messages seen from info.

If it does not, as in our case, the default seconds-per-frame setting should be changed.  It seems counter-intuitive, but when I slowed the rate down by trying “0.11” and “0.2”, the number of frames extracted went down as well.  I settled on “0.02” seconds per frame, which resulted in the correct number of frames.  Add that setting as a line in the launch script.
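
To make the two settings concrete, here is a rough sketch that generates a launch file from Python. It is patterned on the documented usage of the image_view extract_images node, not copied from the scripts used for this post. The parameter names `filename_format` and `sec_per_frame` are the ones I believe that node documents; the bag path, the node names and the helper `build_launch` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def build_launch(bag_path, topic,
                 filename_format="frame%05d.jpg", sec_per_frame=0.02):
    """Return launch-file XML that plays a bag and saves one topic as jpegs."""
    launch = ET.Element("launch")

    # Player node: replays the recorded messages from the bag file.
    ET.SubElement(launch, "node", {
        "pkg": "rosbag", "type": "play", "name": "rosbag", "args": bag_path,
    })

    # Extractor node: listens on "image" and writes each frame to disk.
    extract = ET.SubElement(launch, "node", {
        "name": "extract", "pkg": "image_view", "type": "extract_images",
        "respawn": "false", "output": "screen", "cwd": "ROS_HOME",
    })
    ET.SubElement(extract, "remap", {"from": "image", "to": topic})

    # Assumed parameter names: 5-digit file names and a shorter frame interval.
    ET.SubElement(extract, "param",
                  {"name": "filename_format", "value": filename_format})
    ET.SubElement(extract, "param",
                  {"name": "sec_per_frame", "value": str(sec_per_frame)})

    return ET.tostring(launch, encoding="unicode")

# Hypothetical bag location; the topic is one of the camera topics listed earlier.
xml_text = build_launch("/data/dataset.bag", "/left_camera/image_color")
print(xml_text)
```

One possible reading of the frame counts, offered as an assumption: if the setting acts as a minimum interval between saved frames, then with frames arriving about every 0.05 seconds any value above that would skip frames, and only a value below it would keep them all.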

The working launch script now looks like this:

Download working Left, Center, and Right jpeg export launch scripts on GitHub

The result should be the correct number of frames saved (frame numbering starts at 00000) and the message “[rosbag-1] process has finished cleanly”.

Hit Ctrl + C to exit

frame00000.jpg (640×480), extracted from topic /left_camera/image_color

Convert the jpegs to video
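
One way to do this step is with ffmpeg, which is my substitution here and not necessarily the tool used for this post. The sketch below only builds the command line, assuming 5-digit frame names and the roughly 20 frames per second worked out earlier; the output file name and the helper `ffmpeg_command` are hypothetical.

```python
import shlex

def ffmpeg_command(pattern="frame%05d.jpg", fps=20, output="left_camera.mp4"):
    """Build an ffmpeg argument list that joins numbered jpegs into a video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),   # input rate: one jpeg per 1/fps seconds
        "-i", pattern,            # numbered input files, e.g. frame00000.jpg
        "-c:v", "libx264",        # encode as H.264
        "-pix_fmt", "yuv420p",    # pixel format most players can handle
        output,
    ]

command = ffmpeg_command()
print(shlex.join(command))  # paste into a shell, or pass the list to subprocess.run
```

Run the printed command from the directory that holds the extracted frames.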


License: The data referenced in this post is available under the MIT license.  This post is available under CC BY 3.0

Where to next?

Udacity open sources 223GB of driving data

Following on the heels of another self-driving car developer releasing driving data, Udacity open sourced two data sets from their self-driving Lincoln MKZ.

Udacity’s data is over 70 minutes of driving spread over two days in Mountain View, Calif.  You can read more in the TechCrunch article or their Medium post.

The data is available under the MIT License.

We downloaded the first, smaller data set and started exploring the data.

We also have a page tracking available data sets.

License: this post is available under CC BY 3.0

MLBD Machine Learning is Fun! Part 3: Deep Learning and Convolutional Neural Networks

Here we follow along with the 3rd part of Adam Geitgey’s excellent introductory series Machine Learning is Fun!, titled “Deep Learning and Convolutional Neural Networks”.

For the original article, click here.

The example presented is an object recognition classifier to determine whether or not something is a bird.  To complete this tutorial you will need:

  • TFLearn
  • TensorFlow
  • Python 2.7
  • Birds and not birds data set


Original test script

Machine Learning by Doing

Machine Learning by Doing is a series that follows the best machine learning tutorials and examples posted around the internet and makes sure they are 100% repeatable.  Each post provides clearly defined version numbers for programming languages and packages, links to data sets, and explanations of some of the lesser known functions we will encounter.

These examples are some of the best found on the web, but it is incredibly frustrating to find you are missing one small piece to re-create their results.

For the first post we follow along with Adam Geitgey in Part 3 of his “Machine Learning is Fun!” series.

TensorFlow on Windows

TensorFlow is an open source (Apache 2.0) software library for Machine Intelligence created by Google.

Windows 7

The two options are:

  • Run in Docker
  • Run in a Linux Virtual Machine (VM)

Windows 10

With the introduction of the Windows Subsystem for Linux (WSL), Windows 10 users have an additional option:

  • Run on Windows 10 (via WSL)
  • Run in Docker
  • Run in a Linux VM


To stay up to date with Machine Learning, go here:


Hacker News