ML / AI
NLP
See also Speech
- Metacat - a computer model of analogy-making and perception that builds on the foundations of an earlier model called Copycat. Copycat was originally developed by Douglas Hofstadter and Melanie Mitchell as part of a research program aimed at computationally modeling the fundamental mechanisms underlying human thought processes. Central to the philosophy of this research is the belief that the mind's ability to perceive connections between apparently dissimilar things, and to make analogies based on these connections, lies at the heart of intelligence. According to this view, to understand the analogical mechanisms of thinking and perception is to understand the source of the remarkable fluidity of the human mind, including its hidden wellsprings of creativity.
Like Copycat, Metacat operates in an idealized world of analogy problems involving short strings of letters. Although the program understands only a limited set of concepts about its letter-string world, its emergent processing mechanisms give it considerable flexibility in recognizing and applying these concepts in a wide variety of situations. The program's high-level behavior emerges in a bottom-up manner from the collective actions of many small nondeterministic processing agents (called codelets) working in parallel, in much the same way that an ant colony's high-level behavior emerges from the individual behaviors of the underlying ants, without any central executive directing the course of events.
Metacat focuses on the issue of self-watching: the ability of a system to perceive and respond to patterns that arise not only in its immediate perceptions of the world, but also in its own processing of those perceptions. Copycat lacked such an "introspective" capacity, and consequently lacked insight into how it arrived at its answers. It was unable to notice similarities between analogies, to explain the differences between them, or to say why one answer might be considered better or worse than another. In contrast, Metacat's self-watching mechanisms enable it to create much richer representations of analogies, allowing it to compare and contrast answers in an insightful way. Furthermore, it is able to recognize, remember, and recall patterns that occur in its own "train of thought" as it makes analogies. For instance, by monitoring its own processing, Metacat can often recognize when it has fallen into a repetitive cycle of behavior, enabling it to break out of its "rut" and try something else. [3]
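The codelet-driven control flow described above can be made concrete with a toy sketch: many small actions, each with an urgency that biases (but does not determine) when it runs, chosen repeatedly at random from a shared pool, with no central executive. This is only an illustration in Python with hypothetical codelet actions, not Metacat's actual code.

```python
import random

# Toy "coderack": a pool of small, independent actions ("codelets"), each
# with an urgency that biases when it runs. Hypothetical actions only;
# an illustration of the control style, not Metacat's implementation.

workspace = {"string": "abc", "bonds": []}   # shared structures being built
coderack = []                                 # list of (urgency, action) pairs

def post(urgency, action):
    coderack.append((urgency, action))

def scout_bond(ws):
    """Propose a successor bond between two adjacent letters."""
    i = random.randrange(len(ws["string"]) - 1)
    pair = ws["string"][i:i + 2]
    if ord(pair[1]) == ord(pair[0]) + 1 and pair not in ws["bonds"]:
        ws["bonds"].append(pair)
    post(urgency=5, action=scout_bond)        # keep exploring

def run(steps=50):
    for _ in range(3):
        post(urgency=5, action=scout_bond)    # seed the coderack
    for _ in range(steps):
        if not coderack:
            break
        # choose a codelet stochastically, weighted by urgency
        weights = [u for u, _ in coderack]
        idx = random.choices(range(len(coderack)), weights=weights)[0]
        _, action = coderack.pop(idx)
        action(workspace)
    return workspace["bonds"]

print(run())   # e.g. ['ab', 'bc']; order and timing vary from run to run
```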
Machine learning
- YouTube: Neural Network Architectures
- https://github.com/rasbt/python-machine-learning-book/blob/master/faq/difference-deep-and-normal-learning.md [5]
"In applications of "usual" machine learning, there is typically a strong focus on the feature engineering part; the model learned by an algorithm can only be as good as its input data. Of course, there must be sufficient discriminatory information in the dataset; however, the performance of machine learning algorithms can suffer substantially when the information is buried in meaningless features. The goal behind deep learning is to automatically learn the features from (somewhat) noisy data; it's about algorithms that do the feature engineering for us, providing deep neural network structures with meaningful information so that they can learn more effectively. We can think of deep learning as algorithms for automatic "feature engineering," or we could simply call them "feature detectors," which help us to overcome the vanishing gradient challenge and facilitate learning in neural networks with many layers." A small sketch contrasting hand-engineered and learned features follows this list.
- http://googleresearch.blogspot.be/2015/06/inceptionism-going-deeper-into-neural.html
- http://www.theguardian.com/technology/2015/jun/18/google-image-recognition-neural-network-androids-dream-electric-sheep
- https://317070.github.io/LSD/
- https://github.com/google/deepdream [10]
- http://ryankennedy.io/running-the-deep-dream/
- https://github.com/graphific/DeepDreamVideo [11]
- Caffe - a deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI Research (BAIR) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license.
- Torch - a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.
- https://github.com/karpathy/char-rnn - Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch
- YouTube: Connections between physics and deep learning [https://news.ycombinator.com/item?id=13062139]
- Data Science Machine - an end-to-end software system that is able to automatically develop predictive models from relational data. The Machine was created by Max Kanter and Kalyan Veeramachaneni at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. The system automates two of the most human-intensive components of a data science endeavor: feature engineering, and selection and tuning of the machine learning methods that build predictive models from those features. First, an algorithm called Deep Feature Synthesis automatically engineers features. Next, through an approach called Deep Mining, the Machine composes a generalized machine learning pipeline that includes dimensionality reduction methods, feature selection methods, clustering, and classifier design. Finally, it tunes the parameters through a Gaussian Copula Process. A rough scikit-learn analogue of such a pipeline is sketched after this list.
- TensorFlow - an open source software library for high performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. Originally developed by researchers and engineers from the Google Brain team within Google’s AI organization, it comes with strong support for machine learning and deep learning, and its flexible numerical computation core is used across many other scientific domains. A minimal usage sketch follows this list.
- https://github.com/neo-ai/neo-ai-dlr - a compiler and runtime for machine learning models. The compiler optimizes machine learning models for various target hardware, and the runtime executes the model on that hardware. DLR is a stand-alone, lightweight and portable runtime for CNN and decision-tree models. Built on top of the TVM and Treelite runtimes, it provides simple and unified Python/C++ APIs for loading and running TVM/Treelite-compiled models on a wide range of devices, including x86, TRT-enabled GPU, and Arm devices.
- https://github.com/nihalpasham/fingerprinting_radios_w_ML - The key idea behind radio fingerprinting is to extract unique patterns (or features) from a radio signal and use them as signatures to identify devices (or, more precisely, the radio embedded within a device). A toy illustration of such feature extraction follows this list.
- NuPIC - the Numenta Platform for Intelligent Computing, comprises a set of learning algorithms that were first described in a white paper published by Numenta in 2009. The learning algorithms faithfully capture how layers of neurons in the neocortex learn.
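As a concrete contrast for the quoted FAQ passage above: in "usual" machine learning the practitioner hand-crafts the features a model sees, while a neural network with hidden layers learns its own feature detectors from raw input. A minimal sketch, assuming scikit-learn and its bundled digits dataset; the particular hand-crafted features here are hypothetical.

```python
# Contrast sketch for the quoted FAQ entry above: hand-engineered features
# feeding a linear model vs. a neural network learning features from raw
# inputs. Illustrative only; the feature choices are hypothetical.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)          # 8x8 grey-scale digit images
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Usual" ML: we decide what the informative features are (here, crude
# row/column intensity profiles) and hope they carry enough signal.
def hand_crafted(X):
    imgs = X.reshape(-1, 8, 8)
    return np.hstack([imgs.mean(axis=1), imgs.mean(axis=2)])  # 16 features

linear = LogisticRegression(max_iter=1000).fit(hand_crafted(X_train), y_train)
print("hand-crafted features:", linear.score(hand_crafted(X_test), y_test))

# Deep(er) learning: the hidden layers learn their own feature detectors
# directly from the raw 64-pixel input.
net = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000,
                    random_state=0).fit(X_train, y_train)
print("learned features:     ", net.score(X_test, y_test))
```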
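For the Data Science Machine entry above, a rough scikit-learn analogue of the "generalized machine learning pipeline" stage: dimensionality reduction, feature selection, and classifier design tuned jointly. This is not the actual system; plain grid search stands in for its Gaussian Copula Process tuning, and Deep Feature Synthesis is not represented at all.

```python
# Rough analogue of the "generalized pipeline" stage described above:
# dimensionality reduction -> feature selection -> classifier, with the
# whole chain tuned jointly over a small parameter grid.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = load_breast_cancer(return_X_y=True)

pipeline = Pipeline([
    ("reduce", PCA()),                                 # dimensionality reduction
    ("select", SelectKBest(score_func=f_classif)),     # feature selection
    ("clf", RandomForestClassifier(random_state=0)),   # classifier design
])

param_grid = {
    "reduce__n_components": [10, 20],
    "select__k": [5, 10],
    "clf__n_estimators": [100, 300],
}

search = GridSearchCV(pipeline, param_grid, cv=5).fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```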
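For the TensorFlow entry above, a minimal sketch of its numerical core in the TensorFlow 2.x eager style: a tensor computation and an automatic gradient, nothing more.

```python
# Minimal TensorFlow 2.x sketch: define a computation over tensors and let
# the library compute gradients on whichever device (CPU/GPU/TPU) it runs on.
import tensorflow as tf

w = tf.Variable([[1.0], [2.0]])            # parameters
x = tf.constant([[3.0, 4.0]])              # input

with tf.GradientTape() as tape:
    y = tf.matmul(x, w)                    # y = x @ w
    loss = tf.reduce_sum(tf.square(y - 5.0))

grad = tape.gradient(loss, w)              # d(loss)/d(w)
print(y.numpy(), loss.numpy(), grad.numpy())
```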
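For the radio-fingerprinting entry above, a toy illustration of the signature idea: reduce a complex-valued IQ capture to a few statistical features and train a classifier on them. The features and the simulated "devices" here are hypothetical; this is not the linked repository's pipeline.

```python
# Generic illustration of the radio-fingerprinting idea: summarize a raw
# complex IQ capture as a small feature vector and use it as a signature.
# Hypothetical features and simulated devices; not the repo's actual method.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def iq_features(iq: np.ndarray) -> np.ndarray:
    """Summarize one complex IQ capture as a small feature vector."""
    amp, phase = np.abs(iq), np.unwrap(np.angle(iq))
    return np.array([amp.mean(), amp.std(),
                     np.diff(phase).mean(), np.diff(phase).std()])

# Simulated captures from two "devices" whose hardware impairments (gain and
# frequency offset) differ slightly, standing in for real recordings.
rng = np.random.default_rng(0)
def capture(gain, freq_offset, n=1024):
    t = np.arange(n)
    signal = gain * np.exp(1j * 2 * np.pi * freq_offset * t)
    return signal + 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

X = np.array([iq_features(capture(*dev))
              for dev in [(1.0, 0.010), (1.2, 0.012)]
              for _ in range(50)])
y = np.array([0] * 50 + [1] * 50)

clf = RandomForestClassifier(random_state=0).fit(X, y)
print(clf.score(X, y))   # separability of the toy signatures
```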
AI
- OpenAI - a non-profit artificial intelligence research company. Our goal is to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return. Since our research is free from financial obligations, we can better focus on a positive human impact. We believe AI should be an extension of individual human wills and, in the spirit of liberty, as broadly and evenly distributed as is possible safely. The outcome of this venture is uncertain and the work is difficult, but we believe the goal and the structure are right. We hope this is what matters most to the best in the field. [33]