ELI5: What is machine learning and how does it work?
Machine learning is, in general, a way of pushing a large amount of data into a system and getting out a model that you can then apply to more data.
In supervised machine learning, you are trying to get the data to produce a particular type of model. For example, you might want to be able to identify whether a given picture is a picture of a cat. So you get a huge collection of pictures, noting which ones are of cats, and you push those through your system and tweak its model depending on whether it's correctly identifying the pictures or not. If you give it a picture of a cat and it guesses it is a cat, you try to strengthen the bits of the model that helped it guess right; if it guesses it isn't a cat, you weaken the bits of the model that made it get it wrong. This is called training the model.
After a while, it should be doing pretty well at correctly identifying the cats in the training pictures. You then test it on a new set of pictures (where you still know the right answer of whether they're of cats or not) and see how well it does. If it's still doing a good job, you can finally show it pictures that you don't know anything about, and let the model tell you whether each one is a cat or not.
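The train-then-test routine described above can be sketched in a few lines. This is only a toy under stated assumptions: each "picture" is reduced to two made-up numeric features, the data is synthetic, and the model is a simple perceptron swapped in as a stand-in for whatever a real image classifier would use. The names `make_example`, `train`, and `accuracy` are hypothetical.

```python
import random

def make_example(rng):
    """Return (features, label) for one hypothetical 'picture'.
    Assumption: two made-up features; label 1 (cat) when they sum to more than 1."""
    while True:
        x = (rng.random(), rng.random())
        score = x[0] + x[1] - 1.0
        if abs(score) >= 0.1:  # keep a gap so the two classes are cleanly separable
            return x, (1 if score > 0 else 0)

def predict(weights, bias, x):
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0

def train(examples, max_epochs=1000, learning_rate=0.1):
    """Perceptron training: adjust the model only when it guesses wrong."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in examples:
            error = label - predict(weights, bias, x)  # +1, 0, or -1
            if error != 0:
                # Nudge the weights toward the right answer for this example.
                weights[0] += learning_rate * error * x[0]
                weights[1] += learning_rate * error * x[1]
                bias += learning_rate * error
                mistakes += 1
        if mistakes == 0:  # every training picture is classified correctly
            break
    return weights, bias

def accuracy(weights, bias, examples):
    correct = sum(predict(weights, bias, x) == label for x, label in examples)
    return correct / len(examples)

rng = random.Random(0)
training_set = [make_example(rng) for _ in range(200)]
test_set = [make_example(rng) for _ in range(100)]  # never seen during training

weights, bias = train(training_set)
train_accuracy = accuracy(weights, bias, training_set)
test_accuracy = accuracy(weights, bias, test_set)
print("training accuracy:", train_accuracy)
print("test accuracy:", test_accuracy)
```

The point of keeping `test_set` separate is that a good score on pictures the model was tuned on says little; the score on held-out pictures is the one that tells you whether it is ready for pictures with unknown answers.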
In unsupervised learning, you're not looking to model a particular thing, but you are trying to see if there is any structure in the data. For example, you might have data from a service like Netflix on what movies each customer watches. By putting that data through a machine learning system, you might find that there are certain movies that are very likely to be watched by the same people - maybe a lot of people watch Terminator AND Predator AND Rocky AND Rambo - so if someone starts watching some of the movies in that group, you might be able to suggest other movies in that group since there's a good chance they will enjoy those ones too.
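One very simple way to look for that kind of structure is to count how often two movies are watched by the same person and suggest the most frequent companions. The sketch below assumes a tiny made-up viewing history; the viewer names, the extra movie titles, and the `recommend` helper are all hypothetical, and real recommendation systems use far more elaborate methods.

```python
from collections import Counter
from itertools import combinations

# Hypothetical viewing history: viewer -> set of movies watched.
watch_history = {
    "ann": {"Terminator", "Predator", "Rocky", "Rambo"},
    "bob": {"Terminator", "Predator", "Rambo"},
    "cat": {"Terminator", "Predator", "Rocky"},
    "dan": {"Notting Hill", "Love Actually"},
    "eve": {"Notting Hill", "Love Actually", "Rocky"},
}

# Count how many viewers watched each pair of movies.
co_counts = Counter()
for movies in watch_history.values():
    for pair in combinations(sorted(movies), 2):
        co_counts[pair] += 1

all_movies = set().union(*watch_history.values())

def recommend(seen, top_n=2):
    """Suggest unseen movies that are most often co-watched with `seen`."""
    scores = {}
    for candidate in all_movies - set(seen):
        score = sum(co_counts[tuple(sorted((candidate, s)))] for s in seen)
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores, key=lambda m: (-scores[m], m))
    return ranked[:top_n]

print(recommend({"Terminator"}))
print(recommend({"Notting Hill"}))
```

Nobody told the code which movies belong together; the grouping falls out of the viewing data alone, which is what makes this unsupervised.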
What is hard negative mining, and how does it help when training classifiers?
Let's say I give you a bunch of images that contain one or more people, and I give you bounding boxes for each one. Your classifier will need both positive training examples (person) and negative training examples (not person).
For each person, you create a positive training example by looking inside that bounding box. But how do you create useful negative examples?
A good way to start is to generate a bunch of random bounding boxes, and for each that doesn't overlap with any of your positives, keep that new box as a negative.
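The random-box step can be sketched as follows. The assumptions here are mine, not the original poster's: boxes are `(x1, y1, x2, y2)` tuples in pixels, every negative has the same fixed size, and the image size, person boxes, and the `sample_negatives` helper are made up for illustration.

```python
import random

def overlaps(a, b):
    """True if boxes a and b, given as (x1, y1, x2, y2), share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def sample_negatives(image_size, positives, count,
                     box_size=(64, 128), seed=0, max_tries=10_000):
    """Draw random boxes and keep those that touch no positive box."""
    rng = random.Random(seed)
    width, height = image_size
    box_w, box_h = box_size
    negatives = []
    for _ in range(max_tries):  # bounded so a crowded image cannot loop forever
        if len(negatives) == count:
            break
        x1 = rng.randint(0, width - box_w)
        y1 = rng.randint(0, height - box_h)
        candidate = (x1, y1, x1 + box_w, y1 + box_h)
        if not any(overlaps(candidate, p) for p in positives):
            negatives.append(candidate)
    return negatives

# Hypothetical 640x480 image with two labelled people.
person_boxes = [(100, 150, 180, 400), (400, 120, 470, 380)]
negative_boxes = sample_negatives((640, 480), person_boxes, count=10)
print(negative_boxes)
```

A stricter or looser rule is possible, for example allowing a small overlap ratio instead of none at all; the zero-overlap test simply mirrors the description above.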
Ok, so you have positives and negatives, so you train a classifier, and to test it out, you run it on your training images again with a sliding window. But it turns out that your classifier isn't very good, because it throws a bunch of false positives (people detected where there aren't actually people).
A hard negative is one of those falsely detected patches: a patch with no person in it that your classifier still calls a person. Hard negative mining means explicitly turning each such patch into a negative example and adding it to your training set. When you retrain your classifier, it should perform better with this extra knowledge and produce fewer false positives.
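The retraining loop can be illustrated with a deliberately tiny stand-in. Assume each patch has already been boiled down to a single made-up "person-likeness" number, and the classifier is just a threshold halfway between the lowest positive and the highest negative. Real detectors use image features and a proper classifier; every value and name below is hypothetical.

```python
def fit_threshold(positives, negatives):
    """Toy 1-D classifier: cut halfway between the closest positive and negative."""
    return (min(positives) + max(negatives)) / 2

def false_positives(threshold, background_patches):
    """Background patches (known to contain no person) that get detected anyway."""
    return [p for p in background_patches if p > threshold]

positives = [0.80, 0.85, 0.90, 0.95]   # patches inside person boxes
negatives = [0.05, 0.10, 0.15, 0.20]   # easy random negatives (sky, road, ...)
# Sliding-window patches from the training images that contain no person.
background = [0.05, 0.12, 0.30, 0.55, 0.60, 0.65, 0.18]

false_positive_history = []
for round_number in range(5):
    threshold = fit_threshold(positives, negatives)
    hard_negatives = false_positives(threshold, background)
    false_positive_history.append(len(hard_negatives))
    print(f"round {round_number}: threshold={threshold:.3f}, "
          f"false positives={len(hard_negatives)}")
    if not hard_negatives:
        break
    # Mining step: add the mistakes to the negative set and retrain.
    negatives = negatives + hard_negatives
```

In this toy the first classifier fires on the three background patches that look a bit person-like; once those are added as negatives, the retrained threshold moves up and the false detections disappear. Real data is rarely this tidy, so several mining rounds are common.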