Sat, 21 Mar 2015
I took a class in AI in late 2013, but I only started looking at practical engineering implementations for ML in the past few months.
In looking at things like scikit-learn, I saw that a lot of the algorithms are already coded. You can even automatically test which classifier/model will be best for the data. In looking at the package and examples, I suspected that the hard part was wrangling the in-the-field data into an acceptable form for the algorithms.
I was graciously invited to an event a few months ago by a fellow named Scott, at which there were several people with good field knowledge of AI and ML. I talked to two of them about algorithms and data. Both of them made the point that getting the data wrangled into a suitable form was the hard part. I then went onto the net and read about this more carefully, and others with experience seemed to agree. So it is like other programming, where getting the data structures and data input right is usually the hard part, since if that is done well, implementing the algorithms is usually not much of a chore.
So I began working on my ML project. What does it do? Sometimes I go to local supermarkets, and what I am looking for is out of stock. So this ML project predicts whether the supermarket will have the item I'm looking for in stock.
I architected the data structures (which consist of purchases, and observations that certain products are missing) and programmed the inputs. Then I added Google Maps so I could see where the local supermarkets were. The program would prefer close supermarkets to far ones.
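A minimal sketch of how such records could look in Python. All names here (Store, Observation, haversine_km, stores_by_distance) are hypothetical illustrations, not the project's actual code, and straight-line distance is an assumption standing in for whatever the map provides.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Store:
    name: str
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


@dataclass(frozen=True)
class Observation:
    store: str       # name of the store visited
    product: str     # what was being looked for
    when: datetime   # time of the visit
    in_stock: bool   # True for a purchase, False for a "missing" sighting


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def stores_by_distance(home, stores):
    """Return stores sorted nearest-first from home, a (lat, lon) pair."""
    return sorted(
        stores,
        key=lambda s: haversine_km(home[0], home[1], s.lat, s.lon),
    )
```

Treating a purchase and a "missing" sighting as the same kind of record with an in_stock flag keeps the input simple: every visit produces one row per product looked for.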
Now I have run into a problem/non-problem. In architecting the solution so that the ML models and algorithms could better understand the problem, I architected a solution so that I could better understand the problem as well. Before, I would pretty much go to my closest supermarket; if they were out of stock, then on to the next closest one, and so forth. Now I have all that data available on my Android, including a map, and deciding which supermarket to go to is trivial. I don't need the ML so much any more. I wonder how often this happens - you build a solution so that AI/ML can be used, but once all the data is recorded in an understandable way, you don't need the AI/ML any more. There can still be situations, though, where there is too much data for someone to keep in their head, but not too much for an ML solution.
Anyhow, I went through enough trouble putting all of this together that I will still go through with writing a program that predicts if the items I want are in stock. I'll also make a map with time/distance information between my home and the supermarkets, and between the supermarkets themselves. Then my program will give me advice on which supermarkets to try first.
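One way the "which supermarket first" advice could work, as a rough sketch only: estimate each store's in-stock rate from past observations, then rank stores by that rate divided by travel time from home. The function names, the (store, product, in_stock) tuple layout, and the rate-per-minute score are all assumptions made for illustration. It is a simple counting heuristic, not a trained ML model, and it ignores travel times between stores.

```python
from collections import defaultdict


def in_stock_rate(observations, product, prior_hits=1, prior_misses=1):
    """Smoothed fraction of visits on which each store had the product.

    observations: iterable of (store, product, in_stock) tuples.
    The priors act like one imaginary hit and one imaginary miss, so a
    store with very few visits is not scored as exactly 0 or 1.
    """
    counts = defaultdict(lambda: [0, 0])  # store -> [hits, misses]
    for store, item, in_stock in observations:
        if item == product:
            counts[store][0 if in_stock else 1] += 1
    return {
        store: (hits + prior_hits) / (hits + misses + prior_hits + prior_misses)
        for store, (hits, misses) in counts.items()
    }


def rank_stores(observations, product, minutes_from_home):
    """Order stores best-first by in-stock rate per minute of travel.

    minutes_from_home: dict of store -> travel time in minutes (must be > 0).
    Stores with no observations of the product are left out.
    """
    rates = in_stock_rate(observations, product)
    candidates = [s for s in rates if s in minutes_from_home]
    return sorted(
        candidates,
        key=lambda s: rates[s] / minutes_from_home[s],
        reverse=True,
    )
```

For example, rank_stores(history, "milk", {"A": 10, "B": 5}) would return the store names in the order to try them. A real predictor would probably also want features such as day of week and time of day, which is where a scikit-learn classifier could replace in_stock_rate.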