Processing ever-larger mountains of data requires new tools. These include programs for machine learning. Microsoft has now released such a toolkit for free use.
Researchers at Microsoft's Asian research institute in Beijing have made the Distributed Machine Learning Toolkit (DMTK) available to developers on GitHub. They announced the release in a research blog post. Such self-learning software uses many computers simultaneously to solve complex problems. It is used mainly in big data applications, such as image, text and speech recognition.
With the SDK, Microsoft wants to help developers create better and faster programs, for example for text recognition.
The toolkit is a framework with numerous programming interfaces that is meant to let developers focus on the essentials: data, models and training. According to Microsoft's researchers, the toolkit works considerably faster and needs significantly fewer networked computers than its predecessors. For example, one could train a model with a million topics, 200 billion properties and a vocabulary of 20 million words on a cluster of 24 machines. Such applications previously required thousands of computers.
The key components include a parameter server (DMTK Framework), an algorithm for training topic models (LightLDA) and a tool for natural language processing (Distributed Word Embedding). By opening up the toolkit, the researchers hope to start a collaboration with other researchers and developers who will expand its capabilities.
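To give a rough idea of what a parameter server does, here is a minimal single-process sketch in Python. It is not DMTK's API: the class `ParameterServer`, the functions `worker_gradient` and `train`, and the toy data are all hypothetical names made up for illustration. The assumed pattern is that workers pull the shared model, compute an update on their own slice of the data, and push that update back.

```python
# Illustrative sketch only: every name here is hypothetical, not part of DMTK.

class ParameterServer:
    """Holds the shared model parameters and applies updates pushed by workers."""

    def __init__(self, dim, learning_rate=0.1):
        self.weights = [0.0] * dim
        self.learning_rate = learning_rate

    def pull(self):
        # Workers receive a copy of the current parameters.
        return list(self.weights)

    def push(self, gradient):
        # Apply one gradient-descent step with the worker's gradient.
        for i, g in enumerate(gradient):
            self.weights[i] -= self.learning_rate * g


def worker_gradient(weights, shard):
    """Mean squared-error gradient of a linear model on one data shard."""
    grad = [0.0] * len(weights)
    for features, target in shard:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - target
        for i, x in enumerate(features):
            grad[i] += 2.0 * error * x / len(shard)
    return grad


def train(shards, dim, rounds=200):
    server = ParameterServer(dim)
    for _ in range(rounds):
        # In a real cluster each shard would live on its own machine;
        # here the "workers" simply take turns in one process.
        for shard in shards:
            weights = server.pull()
            server.push(worker_gradient(weights, shard))
    return server.weights


# Toy data generated from y = 2*x1 + 3*x2, split across two "workers".
shards = [
    [((1.0, 0.0), 2.0), ((0.0, 1.0), 3.0)],
    [((1.0, 1.0), 5.0), ((2.0, 1.0), 7.0)],
]
weights = train(shards, dim=2)
print(weights)
```

On this toy data the learned weights should end up close to 2 and 3. A real system would additionally have to handle network communication, workers updating at the same time, and models too large for a single machine, none of which this sketch attempts.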