Next Steps: When You Have Data, It's Time to Gain Knowledge
What do the 16th-century astronomer Taqi ad-Din, the well-known science communicator Neil deGrasse Tyson, and one of the fathers of modern statistics, Sir Francis Galton, have in common? Their lives are tightly bound to data analysis, as almost everything in our "data explosion" times should be. The era of guessing is rapidly ending, while data-driven paradigms are on the rise. Gnucoop, a data-oriented software company from the very beginning, stays true to its origins. During the last Gnumeeting, an entire day was dedicated to an extended overview of the Data Science discipline for our whole team. I was delighted to introduce my colleagues to the nuts and bolts of this fascinating, almost magical activity called Machine Learning.
We started with some historical anecdotes on how a wrong interpretation of data could lead, and in fact did lead, to disastrous outcomes, then moved on to experience how fascinating data visualization can be. I always suggest taking a peek at David McCandless's marvelous project, informationisbeautiful.net. It is the best place to quickly understand how much information simple visualization techniques can deliver compared with plain language.
Next we moved to the real core of data analysis, taking a look at data cleaning and at some basic models such as Linear Regression, applied to standard datasets from Kaggle. Kaggle is the de facto place to go for data enthusiasts: a leading platform for both education and competitions in data science, where many of the reference benchmarks for the best models are set. Our data-garden was then enriched by Decision Trees, creating a Random Forest (pun intended). Going forward, all our products and projects will expand on existing decision support systems, leveraging our data-gathering tools and Machine Learning techniques to keep pace with this rapidly developing field.
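For readers who were not in the room, the Linear Regression step mentioned above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the function names `fit_line` and `predict` and the toy numbers are made up for this post, and it is not the code or the Kaggle data used during the session.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Hypothetical helper written for this post, not a library API.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Spread of x around its mean; zero means every x is identical.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical, the slope is undefined")
    # How x and y vary together around their means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at a new x."""
    return slope * x + intercept


# Toy data invented for illustration (not a Kaggle dataset).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(hours, scores)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"prediction for 6 hours: {predict(6.0, slope, intercept):.2f}")
```

In practice one would more likely reach for a library such as scikit-learn, but writing the formula out by hand shows that the "learning" here is nothing more than finding the line that minimizes the squared error on the data.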
Data analysis has made huge leaps in recent years, allowing software not only to see but also to understand what it is looking at. We analyzed some datasets for Classification problems, making our model actually distinguish between entities, whether lexical notions, as in Natural Language Processing, or plain pictures containing subjects. There are several tools at our disposal for the task, from simple Probability Models to more advanced Convolutional Neural Networks. Finally, we tested some other techniques for Time Series analysis and pattern recognition. As always, one can reliably predict only events generated by strong, discoverable patterns; otherwise we are back to guessing.
Artificial intelligence should and will enhance the social interventions already carried out by thousands of people in the most critical areas of the globe, and I believe this is the field where AI should be deployed above all. Using learning networks in video games is fun. Selling the right burger to the right person is nice. But what about those situations where there is no time to spare and errors are not allowed? Situations where life itself depends on the fast and precise reaction of those who can help. Here at Gnucoop our daily effort is committed to deploying every useful technology, from computer vision to neural networks, from language processing to blockchain, in order to promote equality, collaboration and justice.
ai dataanalysis datacollection