Slide 1: Navigate Machine Learning with Intel® Distribution for Python*
Victoriya Fedotova

Slide 2: Machine Learning: Your Path to Deeper Insight
Driving increasing innovation and competitive advantage across industries
Strategy provides the foundation for success using AI

Solutions for reference across industries
Tools/Platforms to accelerate deployment
Optimized Frameworks to simplify development
Libraries/Languages featuring optimized building blocks
Hardware Technology portfolio that is broad and cross-compatible

Products and components shown on the slide:
Intel® Deep Learning SDK for Training & Deployment
Intel® Math Kernel Library (Intel® MKL & MKL-DNN)
Intel® Data Analytics Acceleration Library (Intel® DAAL)
Intel® Distribution for Python*
Datacenter, Endpoint, +Network, +Memory, +Storage

Slide 3: Motivation
Python is among the most popular programming languages

Challenge #1:
Domain specialists are not professional software programmers

Challenge #2:
Python performance limits migration to production systems
Hire a team of Java/C++ programmers …
OR
Have a team of Python programmers deploy optimized Python in production

* L.Prechelt, An empirical comparison of seven programming languages, IEEE Computer, 2000, Vol. 33, Issue 10, pp. 23-29
** RedMonk - D.Berkholz, Programming languages ranked by expressiveness

Slide 4: Intel® Distribution for Python*
Advancing Python performance closer to native speeds

Slide 5: Performance Gain from MKL (Compared to “vanilla” SciPy)
[Charts: speedup callouts of up to 100x, up to 10x, up to 10x, and up to 60x faster]
Configuration info: Versions: Intel® Distribution for Python 2017 Beta, icc 15.0; Hardware: Intel® Xeon® CPU E5-2698 v3 @ 2.30GHz (2 sockets, 16 cores each, HT=OFF), 64 GB of RAM, 8 DIMMS of 8GB@2133MHz; Operating System: Ubuntu 14.04 LTS.

Slide 6: Out-of-the-box Performance with Intel® Distribution for Python*
Mature AVX2 instructions based product
Configuration Info: apt/atlas: installed with apt-get, Ubuntu 16.10, python 3.5.2, numpy 1.11.0, scipy 0.17.0; pip/openblas: installed with pip, Ubuntu 16.10, python 3.5.2, numpy 1.11.1, scipy 0.18.0; Intel Python: Intel Distribution for Python 2017
Hardware: Xeon: Intel Xeon CPU E5-2698 v3 @ 2.30 GHz (2 sockets, 16 cores each, HT=off), 64 GB of RAM, 8 DIMMS of 8GB@2133MHz

Slide 7: Out-of-the-box Performance with Intel® Distribution for Python*
New AVX512 instructions based product
Configuration Info: apt/atlas: installed with apt-get, Ubuntu 16.10, python 3.5.2, numpy 1.11.0, scipy 0.17.0; pip/openblas: installed with pip, Ubuntu 16.10, python 3.5.2, numpy 1.11.1, scipy 0.18.0; Intel Python: Intel Distribution for Python 2017
Hardware: Intel® Xeon Phi™ CPU 7210 1.30 GHz, 96 GB of RAM, 6 DIMMS of 16GB@1200MHz

Slide 8: Workshop: Basic Functions

Slide 9: Examples of Basic Functions
NumPy, SciPy:
Matrix multiplication
Random number generation
Vector math
Linear algebra decompositions

Not so basic functions
Scikit-learn:
Linear regression

NOTE: Only Python 2.7 and 3.5 are supported for now
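
The NumPy operations listed on this slide might look as follows. This is a minimal sketch, not the workshop material from the presentation: the array sizes, the seed, and all variable names are assumptions chosen for illustration. The same code runs on any NumPy build; how much an MKL-backed build speeds it up depends on the installation and hardware.

```python
import numpy as np

# Random number generation (the seed is arbitrary, used for reproducibility).
rng = np.random.RandomState(42)
a = rng.rand(200, 200)
b = rng.rand(200, 200)

# Matrix multiplication; NumPy hands this to the underlying BLAS library.
c = a.dot(b)

# Vector math: element-wise functions applied to whole arrays at once.
v = np.exp(-a) + np.sqrt(b)

# Linear algebra decompositions: Cholesky factorization of a symmetric
# positive-definite matrix, and a singular value decomposition.
spd = a.dot(a.T) + 200 * np.eye(200)
chol = np.linalg.cholesky(spd)
u, s, vt = np.linalg.svd(a)
```

Each of these calls delegates the heavy work to compiled library code, which is why the choice of backend (ATLAS, OpenBLAS, or MKL in the comparisons above) matters for performance.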

Slide 10: Intel Python Landscape
Intel® Distribution for Python*: Numpy*, Scipy*, Pandas*, Scikit-learn*, Mpi4py*, pyDAAL, …
Intel® Performance Libraries: Intel® MKL, Intel® DAAL, Intel® IPP, Intel® MPI Library, Intel® TBB

Slide 11: Scikit-Learn* optimizations with Intel® MKL
Speedups of Scikit-Learn* Benchmarks (2017 Update 1)
[Chart: speedup per benchmark]
System info: 32x Intel® Xeon® CPU E5-2698 v3 @ 2.30GHz, disabled HT, 64GB RAM; Intel® Distribution for Python* 2017 Gold; Intel® MKL 2017.0.0; Ubuntu 14.04.4 LTS; Numpy 1.11.1; scikit-learn 0.17.1. See Optimization Notice.

Slide 12: More Scikit-Learn* optimizations with Intel® DAAL
Speedups of Scikit-Learn* Benchmarks (2017 Update 2)
Accelerated key Machine Learning algorithms with Intel® DAAL: Distances, K-means, Linear & Ridge Regression, PCA
Up to 160x speedup on top of MKL initial optimizations
[Chart: speedup per benchmark]

Slide 13: Intel® DAAL: Heterogeneous Analytics
Targets both data centers (Intel® Xeon® and Intel® Xeon Phi™) and edge-devices (Intel® Atom™)
Perform analysis close to data source (sensor/client/server) to optimize response latency, decrease network bandwidth utilization, and maximize security
Offload data to server/cluster for complex and large-scale analytics

(De-)Compression, (De-)Serialization
PCA, Outlier detection, Normalization, Math functions, Sorting
Statistical moments, Quantiles, Variance matrix, Distances, QR, SVD, Cholesky, Apriori, Optimization solvers
Regression: Linear, Ridge
Classification: Naïve Bayes, SVM, Classifier boosting, kNN, Decision Forest
Clustering: Kmeans, EM GMM
Collaborative filtering: ALS
Neural Networks
Quality metrics

Available also in open source: https://software.intel.com/en-us/articles/opendaal

Slide 14: Performance Example: Read and Compute
SVM Classification with RBF kernel
Training dataset: CSV file (PCA-preprocessed MNIST, 40 principal components) n=42000, p=40
Testing dataset: CSV file (PCA-preprocessed MNIST, 40 principal components) n=28000, p=40
[Chart callouts: 2.2x; 66x; Balanced read and compute; 60% faster CSV read]
System Info: Intel® Xeon® CPU E5-2680 v3 @ 2.50GHz, 504GB, 2x24 cores, HT=on, OS RH7.2 x86_64, Intel® Distribution for Python* 2017 Update 1 (Python* 3.5)

Slide 15: Workshop: PyDAAL

Slide 16: pyDAAL Getting Started
https://github.com/daaltces/pydaal-getting-started

DAAL4PY: Tech Preview
https://software.intel.com/en-us/articles/daal4py-overview-a-high-level-python-api-to-the-intel-data-analytics-acceleration-library

Slide 17: Intel® TBB: parallelism orchestration in Python ecosystem
Software components are built from smaller ones
If each component is threaded, there can be too many threads!
Intel TBB dynamically balances thread loads and effectively manages oversubscription

> python -m TBB application.py
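
To make the oversubscription problem concrete, here is a minimal sketch of what a file such as `application.py` could contain. Only the file name comes from the slide; the pool sizes and the task functions are hypothetical. The code uses only the Python standard library and runs without TBB; which pool types the TBB module takes over when the script is launched through it depends on the installed version and is not shown here.

```python
from multiprocessing.pool import ThreadPool


def inner_task(x):
    """Stand-in for a small unit of work inside a threaded component."""
    return x * x


def outer_task(n):
    """A 'component' that is itself threaded: it opens its own pool."""
    with ThreadPool(4) as inner:
        return sum(inner.map(inner_task, range(n)))


def run():
    """Four outer workers, each starting four inner workers:
    up to 16 worker threads compete for the available cores."""
    with ThreadPool(4) as outer:
        return outer.map(outer_task, [10, 20, 30, 40])


if __name__ == "__main__":
    print(run())
```

Each level looks reasonable in isolation, but the thread counts multiply when components are nested. That multiplication is the situation the slide describes as "too many threads".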

Slide 18: Profiling Python* code with Intel® VTune™ Amplifier
Right tool for high performance application profiling at all levels
Function-level and line-level hotspot analysis, down to disassembly
Call stack analysis
Low overhead
Mixed-language, multi-threaded application analysis

Slide 19: Installing Intel® Distribution for Python* 2017
Stand-alone installer and anaconda.org/intel
Linux, Windows*, OS X*

Download full installer from
https://software.intel.com/en-us/intel-distribution-for-python
OR
> conda config --add channels intel
> conda install intelpython3_full
> conda install intelpython3_core

docker pull intelpython/intelpython3_full

Slide 20: Intel® Distribution for Python
https://software.intel.com/en-us/distribution-for-python

Slide 22: Collaborative Filtering
Processes users’ past behavior, their activities and ratings
Predicts what a user might want to buy depending on his/her preferences
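
One common way to realize this idea is item-based collaborative filtering: items are compared by the ratings users gave them, and a missing rating is estimated from the user's ratings of similar items. The sketch below is an illustration only, not the code profiled in the presentation. The toy ratings, the function names, and the choice of cosine similarity are all assumptions.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "alice": {"book": 5, "film": 3, "game": 4},
    "bob": {"book": 4, "film": 2, "game": 5},
    "carol": {"book": 1, "film": 5},
}


def item_vector(item):
    """Ratings given to one item, keyed by user."""
    return {u: r[item] for u, r in ratings.items() if item in r}


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    common = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in common)
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0


def predict(user, item):
    """Similarity-weighted average of the user's ratings of other items."""
    target = item_vector(item)
    num = den = 0.0
    for other, rating in ratings[user].items():
        if other == item:
            continue
        sim = cosine(target, item_vector(other))
        num += sim * rating
        den += abs(sim)
    return num / den if den else 0.0


# Estimate how carol would rate an item she has not rated yet.
score = predict("carol", "game")
```

Computing the similarity for every pair of items is the expensive step in this scheme, since it grows with the square of the number of items.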

Slide 23: Training: Profiling pure Python*
Items similarity assessment (similarity matrix computation) is the main hotspot
Configuration Info: Versions: Red Hat Enterprise Linux* built Python*: Python 2.7.5 (default, Feb 11 2014), NumPy 1.7.1, SciPy 0.12.1, multiprocessing 0.70a1 built with gcc 4.8.2; Hardware: 24 CPUs (HT ON), 2 Sockets (6 cores/socket), 2 NUMA nodes, Intel(R) Xeon(R) X5680@3.33GHz, RAM 24GB, Operating System: Red Hat Enterprise Linux Server release 7.0 (Maipo)

Slide 24: Training: Profiling pure Python*
This loop is the major bottleneck. Use appropriate technologies (NumPy/SciPy/Scikit-Learn or Cython/Numba) to accelerate it
Configuration Info: Versions: Red Hat Enterprise Linux* built Python*: Python 2.7.5 (default, Feb 11 2014), NumPy 1.7.1, SciPy 0.12.1, multiprocessing 0.70a1 built with gcc 4.8.2; Hardware: 24 CPUs (HT ON), 2 Sockets (6 cores/socket), 2 NUMA nodes, Intel(R) Xeon(R) X5680@3.33GHz, RAM 24GB, Operating System: Red Hat Enterprise Linux Server release 7.0 (Maipo)

Slide 25: Training: Python + Numpy (MKL)
Much faster!
The most compute-intensive part takes ~5% of all the execution time
Configuration info: 96 CPUs (HT ON), 4 Sockets (12 cores/socket), 1 NUMA nodes, Intel(R) Xeon(R) E5-4657L v2@2.40GHz, RAM 64GB, Operating System: Fedora release 23 (Twenty Three)
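
The move from an interpreted loop to NumPy can be sketched as follows. This is not the code profiled in the presentation: the function names, the synthetic rating matrix, and the use of cosine similarity are assumptions. It shows the same item-item similarity matrix computed once with explicit Python loops and once as a single matrix product, which NumPy passes to the underlying BLAS library.

```python
import numpy as np


def similarity_loops(r):
    """Item-item cosine similarity with explicit Python loops."""
    n_users, n_items = r.shape
    sim = np.zeros((n_items, n_items))
    for i in range(n_items):
        for j in range(n_items):
            dot = 0.0
            for u in range(n_users):
                dot += r[u, i] * r[u, j]
            ni = sum(x * x for x in r[:, i]) ** 0.5
            nj = sum(x * x for x in r[:, j]) ** 0.5
            sim[i, j] = dot / (ni * nj) if ni and nj else 0.0
    return sim


def similarity_numpy(r):
    """Same result from one matrix product plus a normalization."""
    gram = r.T.dot(r)                  # all pairwise dot products at once
    norms = np.sqrt(np.diag(gram))     # column norms sit on the diagonal
    denom = np.outer(norms, norms)
    out = np.zeros_like(gram)
    np.divide(gram, denom, out=out, where=denom > 0)
    return out


# Synthetic ratings: 50 users x 20 items, values 0..5 (seed is arbitrary).
rng = np.random.RandomState(0)
ratings = rng.randint(0, 6, size=(50, 20)).astype(float)
```

Both functions return the same matrix; only the second keeps the inner loops out of the interpreter. Timing them on larger inputs (for example with `timeit`) is a quick way to see the gap on a given machine.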

Slide 26: Legal Disclaimer & Optimization Notice
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

For more complete information about compiler optimizations, see our Optimization Notice at https://software.intel.com/en-us/articles/optimization-notice#opt-en.

Copyright © 2017, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others.
